A useful trick…
I came across some references to “Bernstein’s trick” in some papers I was reading, but had to do a little digging to find out what it really meant. Suppose you have independent and identically distributed random variables Xi for i = 1, 2, … n, taking values in [-1, 1], and with E[Xi] =…. Our goal is to bound
Pr[ (1/n) ∑ Xi > a ] .
The Bernstein trick is to multiply both sides by a dummy variable t and exponentiate both sides:
Pr[ (1/n) ∑ Xi > a ] = Pr[ exp(t ∑ Xi) > exp(a n t) ] ,
Now we use the Markov inequality on the random variable exp(t ∑ X_i):
Pr[ exp(t ∑ Xi) > exp(a n t) ] ≤ exp(- a n t) E[ exp(t ∑ Xi) ],
View original post 384 more words