Let X1, ... , Xn be independent, identically distributed random variables, uniform on [0, 1]. We observe the Xk's sequentially and must stop on exactly one of them. No recall of preceding observations is permitted. What stopping rule minimizes the expected rank of the selected observation, and what is its corresponding value?
The general solution to this full-information expected rank problem is unknown. The major difficulty is that the problem is fully history-dependent: the optimal rule depends at every stage on all preceding values, and not only on simpler sufficient statistics of these. Only bounds are known for the limiting value v as n goes to infinity, namely 1.908 < v < 2.329. These bounds are obtained by studying so-called memoryless strategies, that is, strategies in which the decision to stop on $X_k$ depends only on the value of $X_k$ and not on the history of observations $X_1, \cdots, X_{k-1}$. It is known that there is some room to improve the lower bound by further computations for a truncated version of the problem within the class of memoryless strategies. It is still not known how to improve on the upper bound for the limiting value, whatever strategy is used.[3][4][5]
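The idea of a memoryless strategy can be illustrated with a small Monte Carlo sketch. It assumes a hypothetical threshold family $p_k = c/(n-k+c)$ (stop on the first $X_k \le p_k$), chosen only for illustration; it is not claimed to be the rule behind the bounds quoted above. The function name and parameters are likewise assumptions.

```python
import numpy as np


def simulate_memoryless_rule(n, c=2.0, trials=20000, seed=0):
    """Estimate the expected rank of a memoryless threshold rule by simulation.

    Assumption: thresholds p_k = c / (n - k + c), k = 1..n, are an
    illustrative choice, not taken from the cited literature.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((trials, n))            # each row is one sequence X_1..X_n

    k = np.arange(1, n + 1)
    thresholds = c / (n - k + c)           # increasing in k, equal to 1 at k = n

    # Stop at the first k with X_k <= p_k; the last threshold is 1,
    # so every row stops at the latest on X_n.
    hits = x <= thresholds
    stop_idx = hits.argmax(axis=1)         # index of the first True in each row
    selected = x[np.arange(trials), stop_idx]

    # Rank among all n observations: 1 + number of strictly smaller values.
    ranks = 1 + (x < selected[:, None]).sum(axis=1)
    return ranks.mean()


if __name__ == "__main__":
    print(simulate_memoryless_rule(50))
```

The decision at step k uses only $X_k$ and the fixed threshold, never the earlier observations, which is exactly what makes the rule memoryless. Earlier values enter only when the final rank is computed.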
Another attempt proposed to make progress on the problem is a continuous time version of the problem where the observations follow a Poisson arrival process of homogeneous rate 1. Under some mild assumptions, the corresponding value function is bounded and Lipschitz continuous, and the differential equation for this value function is derived.[6] The limiting value of presents the solution of Robbins’ problem. It is shown that for large , . This estimation coincides with the bounds mentioned above.
The advantage of the continuous time version lies in the fact that the answer can be expressed in terms of the solution of a differential equation, i.e. the answer appears in a closed form. However, since the obtained differential equation contains, apart from the "objective function", another (small) unknown function, the approach does not seem so far to give a decisive advantage for finding the optimal limiting value.
A simple suboptimal rule, which performs almost as well as the optimal rule within the class of memoryless stopping rules, was proposed by Krieger & Samuel-Cahn.[7] The rule stops with the smallest such that for a given constant c, where is the relative rank of the ith observation and n is the total number of items. This rule has added flexibility. A curtailed version thereof can be used to select an item with a given probability , . The rule can be used to select two or more items. The problem of selecting a fixed percentage , , of n, is also treated.
Importance
One of the motivations to study Robbins' problem is that with its solution all classical (four) secretary problems would be solved. But the major reason is to understand how to cope with full history dependence in a (deceptively easy-looking) problem.
At the Ester's Book International Conference in Israel (2006), Robbins' problem was accordingly named one of the four most important problems in the field of optimal stopping and sequential analysis.
History
Herbert Robbins presented the problem described above at the International Conference on Search and Selection in Real Time[note 1] in Amherst, 1990. He concluded his address with the words "I should like to see this problem solved before I die." Scientists working in the field of optimal stopping have since called this problem Robbins' problem. Robbins' wish did not come true: he died in 2001.
Chow–Robbins game
Another optimal stopping problem bearing Robbins' name (and not to be confused with Robbins' problem) is the Chow–Robbins game:[8][9]
Given an infinite sequence of IID random variables with distribution , how to decide when to stop, in order to maximize the sample average where is the stopping time?
The probability of eventually stopping must be 1 (that is, you are not allowed to keep sampling and never stop).
For any distribution with finite second moment, there exists an optimal strategy, defined by a sequence of numbers . The strategy is to keep sampling until .[10][11]
Optimal strategy for very large n
If has finite second moment, then after subtracting the mean and dividing by the standard deviation, we get a distribution with mean zero and variance one. Consequently it suffices to study the case of with mean zero and variance one.
With this, , where $\alpha$ is the solution to the equation[note 2] $\alpha = (1-\alpha^2)\int_0^\infty e^{\lambda\alpha - \lambda^2/2}\,d\lambda$, which can be proved by solving the same problem in continuous time, with a Wiener process. In the limit $n \to \infty$, the discrete-time problem becomes the same as the continuous-time problem.
When the game is a fair coin toss game, with heads being +1 and tails being -1, then there is a sharper result[9] where is the Riemann zeta function.
Optimal strategy for small n
When n is small, the asymptotic bound does not apply, and finding the value of is much more difficult. Even the simplest case, where are fair coin tosses, is not fully solved.
For the fair coin toss, a strategy is a binary decision: after n tosses, with k heads and (n-k) tails, should one continue or should one stop? Since the one-dimensional random walk is recurrent, starting at any , the probability of eventually having more heads than tails is 1. So, if , one should always continue. However, if , it is tricky to decide whether to stop or continue.[16]
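The stop-or-continue decision for the fair coin can be explored numerically by backward induction over a truncated horizon. The following is a minimal sketch under stated assumptions, not the method of the cited references: the game is cut off at a hypothetical `horizon`, where the payoff is taken to be max(current average, 0). The value 0 is attainable because, by recurrence, the player can wait until heads and tails are equal. The result is therefore only a lower bound on the value of the game, and the reported decisions are approximate. The function name and the `report_depth` parameter are illustrative.

```python
import numpy as np


def chow_robbins_coin_lower_bound(horizon=1000, report_depth=5):
    """Backward induction for the +1/-1 fair coin game, truncated at `horizon`.

    Returns (lower bound on the game value, dict mapping (n, heads) to
    True if stopping is at least as good as continuing in the truncated game).
    """
    # Terminal level: stop now, or wait for a return to zero (average 0).
    heads = np.arange(horizon + 1)
    values = np.maximum((2 * heads - horizon) / horizon, 0.0)

    stop = {}
    for n in range(horizon - 1, 0, -1):
        heads = np.arange(n + 1)
        stop_now = (2 * heads - n) / n              # current sample average
        # From (n, h) the next state is (n+1, h) or (n+1, h+1), each w.p. 1/2.
        cont = 0.5 * (values[:-1] + values[1:])
        if n <= report_depth:
            for h in range(n + 1):
                stop[(n, h)] = bool(stop_now[h] >= cont[h])
        values = np.maximum(stop_now, cont)

    # At least one toss must be made, so the start has no stopping option.
    return 0.5 * (values[0] + values[1]), stop


if __name__ == "__main__":
    value, stop = chow_robbins_coin_lower_bound()
    print(value)
    print(stop[(5, 3)])  # decision after 3 heads and 2 tails in the truncated game
```

Raising `horizon` tightens the lower bound at a cost that grows quadratically, which gives a rough feel for why exact stop-or-continue decisions near the boundary are hard to certify.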
^ Bruss, F. Thomas; Ferguson, Thomas S. (1996). "Half-Prophets and Robbins' Problem of Minimizing the expected rank". Lecture Notes in Statistics (LNS). Athens Conference on Applied Probability and Time Series Analysis. Vol. 114. New York, NY: Springer New York. pp. 1–17. doi:10.1007/978-1-4612-0749-8_1. ISBN 978-0-387-94788-4.
^ a b c Elton, John H. (2023-06-06). "Exact Solution to the Chow-Robbins Game for almost all n, using the Catalan Triangle". arXiv:2205.13499 [math].
^ Dvoretzky, Aryeh. "Existence and properties of certain optimal stopping rules." Proc. Fifth Berkeley Symp. Math. Statist. Prob. Vol. 1. 1967.
^ The Joint Summer Research Conferences in the Mathematical Sciences were held at the University of Massachusetts from June 7 to July 4, 1990. These were sponsored by the AMS, SIAM, and the Institute for Mathematical Statistics (IMS). Topics in 1990 were: Probability models and statistical analysis for ranking data, Inverse scattering on the line, Deformation theory of algebras and quantization with applications to physics, Strategies for sequential search and selection in real time, Schottky problems, and Logic, fields, and subanalytic sets.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import root


def f(lambda_, alpha):
    return np.exp(lambda_ * alpha - lambda_**2 / 2)


def equation(alpha):
    integral, error = quad(f, 0, np.inf, args=(alpha,))
    return integral * (1 - alpha**2) - alpha


solution = root(equation, 0.83992, tol=1e-15)

# Print the solution
if solution.success:
    print(f"Solved α = {solution.x[0]} with a residual of {solution.fun[0]}")
else:
    print("Solution did not converge")