A useful trick…

I came across some references to “Bernstein’s trick” in some papers I was reading, but had to do a little digging to find out what it really meant. Suppose you have independent and identically distributed random variables *X _{i}* for

*i = 1, 2, … n*, taking values in

*[-1, 1]*, and with

*E[X*. Our goal is to bound

_{i}] =…

Pr[ (1/n) ∑ X._{i}> a ]

The Bernstein trick is to multiply both sides by a dummy variable *t* and exponentiate both sides:

Pr[ (1/n) ∑ X,_{i}> a ] = Pr[ exp(t ∑ X_{i}) > exp(a n t) ]

Now we use the Markov inequality on the random variable *exp(t ∑ X_i)*:

Pr[ exp(t ∑ X,_{i}) > exp(a n t) ] ≤ exp(- a n t) E[ exp(t ∑ X_{i}) ]

View original post 384 more words