自動販売機のシムミューレーション問題

この問題に固定されているチケットを使って自動販売機の最適な期待値を計算する事が目的です。This problem is non-trivial, as the probability of winning at any given machine is determined by the number of tickets inserted since a win last occurred on the machine, namely P(win) = min(K^2,L^2)/L^2, where K is the number of tickets inserted and L is the "luck" value of the machine. An added constraint is that the player will not leave a machine until winning, upon beginning to insert tickets and, upon winning, must move to another machine. The K value of the machine resets upon a win. As there are O(15) machines, a search of all permutations, O(15!), would require more than the allotted time. In addition, the order visited might vary with each cycle, and the same machine can be visited multiple times before all other machines have been visited in an iteration (as long as the same machine is not used immediately after a win).
The author's solution was to compute the expected number of tickets required for a win on each machine. Then, visit the machines in descending order, based on "expected value per ticket", adding this value to the total, if the number of tickets expected to be required were less than the current ticket balance. If so, the expected number of tickets would be subtracted from the balance. However, this strategy ignores two aspects of the problem; that the same machine X can be visited before another machine Z (say, XYX), and that there may be enough tickets to visit all of the machines more than once (i.e. a single for loop over the machines is inadequate).
他のアルゴリズムはｎ進ゲーム木を使う。この場合にチケットの数が決定点数の上限です。However, as the number of tickets is O(40000), the depth of the tree is also O(40000), making the complexity prohibitive.
この問題を解くため二つな方法があるらしいです。一つは自分のアルゴリズムを確立を使用するように訂正する事です。それ以上以前に尋ねた販売機を覚えて次に最適な機械に進むロジックを書くのが必要です。
もう一つは動的計画の方法を使う事です。
The dynamic programming approach seems to be premised on caching the expected value given the last machine visited and the number of tickets allocated. The correct solution by Petr Mitrichev seems to employ a Fourier Transform and inverse Fourier Transform to speed up the lattice computation. It also utilizes a bifurcation search to limit the number of recursive calls (the depth of the simulation tree. How the lattice computation relates to convolution, and how the convolution relates to FFT, are theoretical questions unanswered.
The key point seems to be that a dynamic programming data structure on the last machine visited and the number of tickets remaining seems in order, as, if possible, the complexity is a tractable O(40000*15) operations. Call the data structure "E". Given E(i,j) as the expected value ending at machine i with j tickets, E(k,j-1), where k!=i might be found, if a win occurred in the last play at i, and E(i,j-1) might be found, if a loss occurred.
正解を見てもよくわからないからまず簡単なケースを考えます。Consider initially the case of n machines and two tickets. If at machine k with 2 tickets, a win occurs with P(win) and loss with P(loss). The optimal value given a win from k is P(win)*value[2]+max(x!=k)(E(x,1)). If a loss, the optimal value is E(k,1). So the total for E(k,2) = P(win)*value[2]+max(x!=k)(E(x,1)) + E(k,1). A similar formula exists for E(x!=k,2), and we can take the max E(x,2) over all machines x as the outcome. Thus, it would seem that, given we start at numTickets=1 and move up, processing all machines for each number of tickets, the problem might be a simple dynamic programming implementation. However, a problem with this solution is that the probability of winning or losing is based on the number of tickets inserted into a machine. Thus, a third dimension to the dynamic programming array, k, might be needed. However, as k can be up to 40000, the time and space complexity will then exceed 10^9, making the problem intractable.