Threshold Rules for the Classical Prophet Inequality

Jiechen Zhang

Problem setting. Let $X_1,\ldots,X_n$ be independent nonnegative random variables, and let $M:=\max_{1\le i\le n}X_i$ denote the prophet's payoff. The classical prophet inequality is usually attributed to Krengel and Sucheston, who proved that there exists a nonanticipating stopping rule with expected payoff at least one half of $\E[M]$ [KS77, KS78]. The sharp constant $1/2$ is often attributed to Garling in this early line of work. Samuel-Cahn later showed that the same guarantee can be achieved by a single deterministic threshold rule [SC84].

We use the notation

F_i(x):=\Prob(X_i\le x), \qquad F(x):=\prod_{i=1}^n F_i(x)=\Prob(M\le x).

For a deterministic threshold $\tau\ge 0$, define

p(\tau):=\Prob(M>\tau), \qquad R(\tau):=\sum_{i=1}^n \E[(X_i-\tau)^+].

Throughout, we first assume that ties occur with probability zero; the general case follows by independent infinitesimal tie-breaking perturbations.

A common threshold decomposition. Let $\tau\ge 0$ be a threshold, possibly random, independent of $X_1,\ldots,X_n$. The algorithm stops at the first index $i$ such that $X_i>\tau$. Define

I_i:=\mathbf{1}\{X_i>\tau\}\prod_{j<i}\mathbf{1}\{X_j\le \tau\}, \qquad \ALG:=\sum_{i=1}^n X_iI_i.

Since $I_i=1$ implies $X_i>\tau$, we have $X_iI_i=\tau I_i+(X_i-\tau)^+I_i$. Summing over $i$ and using $\sum_i I_i=\mathbf{1}\{M>\tau\}$ gives

\begin{aligned} \E[\ALG] &= \underbrace{\E[\tau\mathbf{1}\{M>\tau\}]}_{\text{threshold part}} + \underbrace{\E\left[\sum_i (X_i-\tau)^+I_i\right]}_{\text{surplus part}}. \end{aligned} \tag{1}\label{eq:decomposition}

This threshold/surplus decomposition reveals the central tradeoff: the threshold part benefits from a larger value of $\tau$ on the event $\{M>\tau\}$, while the surplus part depends on how much value remains above $\tau$ and on the probability of reaching the corresponding item. All threshold rules below use the same decomposition; they differ only in how the surplus term is certified.

Deterministic thresholds. Assume in this paragraph that $\tau$ is deterministic. Conditioning on whether the algorithm reaches item $i$, independence gives

\E[\ALG] = p(\tau)\tau+ \sum_{i=1}^n \left(\prod_{j<i}F_j(\tau)\right) \E[(X_i-\tau)^+].

Since

\prod_{j<i}F_j(\tau) \ge \prod_{j=1}^n F_j(\tau) = F(\tau) = 1-p(\tau),

we obtain the deterministic surplus certificate

\E[\ALG] \ge p(\tau)\tau+(1-p(\tau))R(\tau). \tag{2}\label{eq:det-certificate}

Moreover,

R(\tau) = \sum_{i=1}^n\E[(X_i-\tau)^+] \ge \E[(M-\tau)^+] \ge \E[M]-\tau.

Therefore

\E[\ALG] \ge p(\tau)\tau+(1-p(\tau))(\E[M]-\tau).

Equivalently,

\E[\ALG] \ge \frac12\E[M] + (1-2p(\tau))\left(\frac12\E[M]-\tau\right). \tag{3}\label{eq:median-halfmean-certificate}

Median and half-mean thresholds. Equation $\eqref{eq:median-halfmean-certificate}$ immediately gives two standard choices. If $\tau_{\mathrm{med}}$ is a median threshold of $M$ [SC84], chosen so that

p(\tau_{\mathrm{med}}) = \Prob(M>\tau_{\mathrm{med}}) = \frac12,

then the first factor in $\eqref{eq:median-halfmean-certificate}$ vanishes. Likewise, for the half-mean threshold [Wit95, KW12],

\tau=\frac12\E[M],

the second factor vanishes. Either threshold satisfies $\E[\ALG]\ge \frac12\E[M]$.

More generally, every deterministic threshold between $\tau_{\mathrm{med}}$ and $\frac12\E[M]$ also works. Indeed, since $p(\tau)$ is nonincreasing in $\tau$, the two factors $1-2p(\tau)$ and $\frac12\E[M]-\tau$ have the same sign throughout this interval. Therefore

(1-2p(\tau))\left(\frac12\E[M]-\tau\right)\ge 0,

and hence $\E[\ALG]\ge \frac12\E[M]$.

Balanced surplus threshold. Another useful deterministic threshold [SC84] is the unique fixed point $\tau^\star$ of

\tau^\star = R(\tau^\star) = \sum_{i=1}^n \E[(X_i-\tau^\star)^+].

Indeed, $R$ is continuous and nonincreasing, with $R(0)=\sum_{i=1}^n \E[X_i]$ and $R(\tau)\to 0$ as $\tau\to\infty$. Moreover, $R(\tau)-\tau$ is strictly decreasing: if $0 \le s < t$, then $R(t)-t\le R(s)-t < R(s)-s$. Hence $R(\tau)-\tau$ has exactly one zero.

Using $\eqref{eq:det-certificate}$,

\E[\ALG] \ge p(\tau^\star)\tau^\star+(1-p(\tau^\star))R(\tau^\star) = \tau^\star.

On the other hand,

\E[M] \le \tau^\star+\E[(M-\tau^\star)^+] \le \tau^\star+\sum_{i=1}^n\E[(X_i-\tau^\star)^+] = 2\tau^\star.

Thus $\E[\ALG]\ge \tau^\star\ge \frac12\E[M]$.

The same argument shows that every

\tau\in\left[\frac12\E[M],\tau^\star\right]

also works. Since $R$ is nonincreasing and $R(\tau^\star)=\tau^\star$, every such $\tau$ satisfies $R(\tau)\ge R(\tau^\star)=\tau^\star\ge \tau$. Therefore $\eqref{eq:det-certificate}$ gives

\E[\ALG] \ge p(\tau)\tau+(1-p(\tau))R(\tau) \ge p(\tau)\tau+(1-p(\tau))\tau = \tau \ge \frac12\E[M].

A certified interval of deterministic thresholds. Combining the two deterministic families above, the present certificates guarantee the $1/2$ bound for every

\tau\in \left[ \min\left\{\tau_{\mathrm{med}},\frac12\E[M]\right\}, \max\left\{\tau_{\mathrm{med}},\tau^\star\right\} \right].

That is, every threshold in this interval satisfies $\E[\ALG]\ge \frac12\E[M]$.

A surplus formula for random thresholds. Let $\tau$ have CDF $G$, independent of $X_1,\ldots,X_n$. Conditional on $\tau=t$, the rule becomes the deterministic threshold-$t$ rule. For item $i$ to contribute surplus, all previous values must be at most $t$, while $(X_i-t)^+$ already enforces $X_i>t$. Thus

\E[(X_i-\tau)^+I_i\mid \tau=t] = \E[(X_i-t)^+]\prod_{j<i}F_j(t).

Using the tail identity

\E[(X_i-t)^+]=\int_t^\infty (1-F_i(x))\,dx

and applying Tonelli's theorem, we obtain

\begin{aligned} \E\left[\sum_{i=1}^n (X_i-\tau)^+I_i\right] &= \int_0^\infty \sum_{i=1}^n \left(\int_t^\infty (1-F_i(x))\,dx\right) \prod_{j<i}F_j(t)\,dG(t) \\ &= \int_0^\infty \sum_{i=1}^n (1-F_i(x)) \left( \int_0^x \prod_{j<i}F_j(t)\,dG(t) \right)dx. \end{aligned} \tag{4}\label{eq:random-surplus}

This form is useful when the threshold itself is randomized.

A randomized analogue of the median threshold. The deterministic median threshold chooses a fixed value $\tau_{\mathrm{med}}$ such that $\Prob(M>\tau_{\mathrm{med}})=1/2$. We now consider a randomized analogue of this idea. Let $\tau$ be an independent draw from the distribution of $M$; equivalently, $\tau\overset{d}{=}M$, with $\tau$ independent of the online instance. In the notation above, this means $G=F$. Under the no-ties assumption, $\Prob(M>\tau)=1/2$, so the random threshold has the same median-type exceedance probability as the deterministic median threshold. We prove the averaged surplus bound

\E\left[\sum_{i=1}^n (X_i-\tau)^+I_i\right] \ge \frac12\E[(M-\tau)^+]. \tag{5}\label{eq:random-surplus-bound}

Using $\eqref{eq:random-surplus}$ with $G=F$, fix $x\ge 0$. For $t\le x$,

F(t) = \left(\prod_{j<i}F_j(t)\right) \left(\prod_{\ell\ge i}F_\ell(t)\right) \le \left(\prod_{j<i}F_j(t)\right) \left(\prod_{\ell\ge i}F_\ell(x)\right).

Hence

\prod_{j<i}F_j(t) \ge \frac{F(t)}{\prod_{\ell\ge i}F_\ell(x)}.

Assuming $F$ is continuous, this gives

\int_0^x \prod_{j<i}F_j(t)\,dF(t) \ge \frac{1}{\prod_{\ell\ge i}F_\ell(x)} \int_0^x F(t)\,dF(t) = \frac{F(x)^2}{2\prod_{\ell\ge i}F_\ell(x)}.

Substituting this lower bound into $\eqref{eq:random-surplus}$ yields

\E\left[\sum_{i=1}^n (X_i-\tau)^+I_i\right] \ge \frac12 \int_0^\infty F(x) \sum_{i=1}^n (1-F_i(x))\prod_{j<i}F_j(x)\,dx.

The sum telescopes:

\sum_{i=1}^n (1-F_i(x))\prod_{j<i}F_j(x) = 1-F(x).

Therefore

\E\left[\sum_{i=1}^n (X_i-\tau)^+I_i\right] \ge \frac12\int_0^\infty F(x)(1-F(x))\,dx.

Since $M$ and $\tau$ are independent with common CDF $F$,

\E[(M-\tau)^+] = \int_0^\infty \Prob(\tau<x<M)\,dx = \int_0^\infty \Prob[\tau<x]\Prob[M>x]\,dx = \int_0^\infty F(x)(1-F(x))\,dx.

This proves $\eqref{eq:random-surplus-bound}$.

Combining $\eqref{eq:decomposition}$ and $\eqref{eq:random-surplus-bound}$,

\begin{aligned} \E[\ALG] &\ge \E[\tau\mathbf{1}\{M>\tau\}] + \frac12\E[(M-\tau)^+] \\ &= \E[\tau\mathbf{1}\{M>\tau\}] + \frac12\E[(M-\tau)\mathbf{1}\{M>\tau\}] \\ &= \frac12\E[(M+\tau)\mathbf{1}\{M>\tau\}]. \end{aligned}

By exchangeability of the i.i.d. pair $(M,\tau)$ and the no-ties assumption,

\E[(M+\tau)\mathbf{1}\{M>\tau\}] = \E[(M+\tau)\mathbf{1}\{\tau>M\}] = \frac12\E[M+\tau] = \E[M].

Thus $\E[\ALG]\ge \frac12\E[M]$.

Other randomized thresholds. The argument above also suggests a sufficient condition for other randomized thresholds. Let $\tau$ have CDF $G$, independent of $X_1,\ldots,X_n$. Suppose first that, for every $i$ and every $x\ge 0$,

\int_0^x \prod_{j<i}F_j(t)\,dG(t) \ge \frac12 F(x)\prod_{j<i}F_j(x). \tag{6}\label{eq:general-random-surplus-certificate}

Then $\eqref{eq:random-surplus}$ gives

\E\left[\sum_{i=1}^n (X_i-\tau)^+I_i\right] \ge \frac12 \int_0^\infty F(x) \sum_{i=1}^n (1-F_i(x))\prod_{j<i}F_j(x)\,dx = \frac12\int_0^\infty F(x)(1-F(x))\,dx.

It remains to compare the threshold part. If, in addition,

\int_0^\infty t(1-F(t))\,dG(t) \ge \int_0^\infty t(1-F(t))\,dF(t), \tag{7}\label{eq:general-threshold-certificate}

then the two terms in the decomposition are at least the corresponding two terms in the proof above. Hence the same conclusion follows:

\E[\ALG]\ge \frac12\E[M].

Thus any threshold distribution $G$ satisfying $\eqref{eq:general-random-surplus-certificate}$ and $\eqref{eq:general-threshold-certificate}$ is also certified by the same threshold/surplus argument.

A simple source of additional randomized thresholds comes from convexity. Let $\tau_0$ be any deterministic threshold already certified above, for example any threshold in the certified deterministic interval. For $\alpha\in[0,1]$, draw $\tau$ from the distribution of $M$ with probability $\alpha$, and set $\tau=\tau_0$ with probability $1-\alpha$, independently of the online instance. Equivalently, the law of $\tau$ is the convex combination

\alpha F+(1-\alpha)\delta_{\tau_0}.

Since the expected payoff is affine in the law of the independent threshold,

\E[\ALG] = \alpha\,\E[\ALG\mid \tau\overset{d}{=}M] + (1-\alpha)\,\E[\ALG\mid \tau=\tau_0] \ge \frac12\E[M].

Implementing the randomized analogue. The randomized analogue only requires drawing an independent threshold $\tau\overset{d}{=}M$. There are several equivalent ways to implement this.

First, if we have sample access to each distribution $D_i$, draw independent samples $S_i$ from $D_i$ for $i=1,\ldots,n$, and set

\tau:=\max_i S_i.

Then

\tau\overset{d}{=}\max_i X_i=M,

so this is exactly the sample-max threshold rule [RWW20].

Second, if the CDFs $F_i$ are known, then the CDF of $M$ is

F(x)=\prod_{i=1}^n F_i(x).

Draw $U$ uniformly from $[0,1]$ and set

\tau:=F^{-1}(U), \qquad F^{-1}(u):=\inf\{x:F(x)\ge u\}.

Then $\tau\overset{d}{=}M$.

Finally, if an independent simulator for the full instance is available, draw $(\widetilde X_1,\ldots,\widetilde X_n)$ independently and set

\tau:=\max_i \widetilde X_i.

If only historical data are available, one may approximate this by drawing uniformly from empirical past maxima. The exact theorem applies whenever the resulting threshold is an independent draw from the distribution of $M$.

Takeaway. The deterministic and randomized threshold rules above are all consequences of the same decomposition

\E[\ALG] = \underbrace{\E[\tau\mathbf{1}\{M>\tau\}]}_{\text{threshold part}} + \underbrace{\E\left[\sum_i (X_i-\tau)^+I_i\right]}_{\text{surplus part}}.

For deterministic thresholds, the surplus part is $\sum_{i=1}^n \left(\prod_{j<i}F_j(\tau)\right) \E[(X_i-\tau)^+].$ The factor $\prod_{j<i}F_j(\tau)$ is the probability that the rule reaches item $i$, and it is bounded below by $F(\tau)=\prod_{j=1}^n F_j(\tau)$. This gives the pointwise surplus bound $\E\left[\sum_i (X_i-\tau)^+I_i\right]\ge F(\tau)R(\tau)$.
For randomized thresholds, the same surplus term is averaged over the possible threshold values. After the tail-integral rewriting, it becomes $\int_0^\infty \sum_{i=1}^n (1-F_i(x)) \left( \int_0^x \prod_{j<i}F_j(t)\,dG(t) \right)dx.$ Thus the inner quantity $\int_0^x \prod_{j<i}F_j(t)\,dG(t)$ is the randomized analogue of the deterministic reach probability: it averages the probability of reaching item $i$ over all threshold values $t\le x$. The randomized surplus certificate is therefore no longer a pointwise comparison at a single threshold value. Instead, it is a lower bound on the averaged reach term $\int_0^x \prod_{j<i}F_j(t)\,dG(t)$. For example, one sufficient bound is $\int_0^x \prod_{j<i}F_j(t)\,dG(t) \ge \frac12 F(x)\prod_{j<i}F_j(x).$

The distinction is similar in spirit to the difference between pointwise dominance and integrated dominance conditions: deterministic thresholds compare reach probabilities at one threshold value, while randomized thresholds compare their averages over the threshold distribution.

This pointwise certificate is what certifies the deterministic threshold rules above, including the median threshold $\tau_{\mathrm{med}}$, the half-mean threshold $\frac12\E[M]$, the balanced surplus threshold $\tau^\star$, and every threshold in the deterministic interval.

The randomized analogue $\tau\overset{d}{=}M$ is the cleanest averaged certificate, since taking $G=F$ makes the surplus term telescope and lets the threshold part close by exchangeability. The same viewpoint also certifies any threshold distribution $G$ satisfying the two sufficient conditions above, as well as convex mixtures of the law of $M$ with any certified deterministic threshold.

Therefore, the median rule, half-mean rule, balanced surplus rule, sample-max rule, and these mixture rules are different ways of certifying the same threshold/surplus tradeoff.

References

[KS77] Ulrich Krengel and Louis Sucheston. Semiamarts and Finite Values. Bulletin of the American Mathematical Society, 83(4):745-747, 1977.
[KS78] Ulrich Krengel and Louis Sucheston. On Semiamarts, Amarts, and Processes with Finite Value. Advances in Probability and Related Topics, 4:197-266, Marcel Dekker, 1978.
[SC84] Ester Samuel-Cahn. Comparison of Threshold Stop Rules and Maximum for Independent Nonnegative Random Variables. The Annals of Probability, 12(4):1213-1216, 1984.
[Wit95] Rainer Wittmann. Prophet Inequalities for Dependent Random Variables. Stochastics: An International Journal of Probability and Stochastic Processes, 52(3-4):283-293, 1995.
[KW12] Robert Kleinberg and S. Matthew Weinberg. Matroid Prophet Inequalities. Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pages 123-136, ACM, 2012.
[RWW20] Aviad Rubinstein, Jack Z. Wang, and S. Matthew Weinberg. Optimal Single-Choice Prophet Inequalities from Samples. 11th Innovations in Theoretical Computer Science Conference, Leibniz International Proceedings in Informatics 151, pages 60:1-60:10, 2020.