## Survivorship Bias

The Misconception: You should focus on the successful if you wish to become successful.

The Truth: When failure becomes invisible, the difference between failure and success may also become invisible.

In New York City, in an apartment a few streets away from the center of Harlem, above trees reaching out over sidewalks and dogs pulling at leashes and conversations cut short to avoid parking tickets, a group of professional thinkers once gathered and completed equations that would both snuff and spare several hundred thousand human lives.


Photograph by Jake Slagle @ flickr

“A swami puts m dollars in one envelope and 2m dollars in another. You and your opponent each get one of the envelopes (at random). You open your envelope and find x dollars, and then the swami asks you if you want to trade envelopes. You reason that if you switch, you will get either x/2 or 2x dollars, each with probability 1/2. This makes the expected value of a switch equal to (1/2)(x/2) + (1/2)(2x)=5x/4, which is greater than the x dollars that you hold in your hand. So you offer to trade.
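One way to probe the swami's argument is to simulate the game. The sketch below is not part of the quoted passage. It assumes, purely for illustration, that the swami draws m uniformly between 1 and 100; the function name `play` and the number of rounds are likewise hypothetical choices.

```python
import random

def play(rounds, seed=0):
    """Average payoff of always keeping vs. always switching envelopes."""
    rng = random.Random(seed)
    keep_total = 0.0
    switch_total = 0.0
    for _ in range(rounds):
        m = rng.uniform(1.0, 100.0)   # assumed distribution of the swami's m
        envelopes = [m, 2 * m]
        rng.shuffle(envelopes)        # you and your opponent get one each at random
        mine, other = envelopes
        keep_total += mine            # payoff if you never trade
        switch_total += other         # payoff if you always trade
    return keep_total / rounds, switch_total / rounds

keep_avg, switch_avg = play(200_000)
print(f"keep: {keep_avg:.2f}  switch: {switch_avg:.2f}")
```

Under these assumptions the two averages agree up to sampling noise, which suggests that the flaw lies in treating x/2 and 2x as equally likely for every observed x.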


## Guest Post: Larry Laudan. Why Presuming Innocence is Not a Bayesian Prior

"Why presuming innocence has nothing to do with assigning low prior probabilities to the proposition that defendant didn't commit the crime"

by Professor Larry Laudan
Philosopher of Science*

Several of the comments to the July 17 post about the presumption of innocence suppose that jurors are asked to believe, at the outset of a trial, that the defendant did not commit the crime and that they can legitimately convict him if and only if they are eventually persuaded that it is highly likely (pursuant to the prevailing standard of proof) that he did in fact commit it.

"...the presumption (of innocence) is not (or at least should not be) an instruction about whether jurors believe defendant did or did not commit the crime. It is, rather, an instruction about their probative attitudes."

## Lakatos, Popper, and Feyerabend: Some Personal Reminiscences | Donald Gillies

On 28 February 2011, Donald Gillies presented memories of meeting and working with some of the heroic personalities in philosophy of science, including Karl Popper, Imre Lakatos and Paul Feyerabend. This podcast records his presentation.

## Random Refutations

“(1) Parmenides-Leucippus: Leucippus takes the existence of motion as a partial refutation of Parmenides’s theory that the world is full and motionless. This leads to the theory of ‘atoms and the void’. It is the foundation of atomic theory.

(2) Galileo refutes Aristotle’s theory of motion: this leads to the foundation of the theory of acceleration, and later of Newtonian forces. Also, Galileo takes the moons of Jupiter and the phases of Venus as a refutation of Ptolemy, and thus as empirical support of the rival theory of Copernicus.

(3) Torricelli (and predecessors): the refutation of ‘nature abhors a vacuum’. This prepares for a mechanistic world view.

(4) Kepler’s refutation of the hypothesis of circular motion upheld till then (even by Tycho and Galileo), leads to Kepler’s laws and so to Newton’s theory.

(5) Lavoisier’s refutation of the phlogiston theory leads to modern chemistry.

(6) The falsification of Newton’s theory of light (Young’s two-slit experiment). This leads to the Young-Fresnel theory of light. The velocity of light in moving water is another refutation. It prepares for special relativity.

(7) Oersted’s experiment is interpreted by Faraday as a refutation of the universal theory of Newtonian central forces and thus leads to the Faraday-Maxwell field theory.

(8) Atomic theory: the atomicity of the atom is refuted by the Thomson electron. This leads to the electromagnetic theory of matter, and, in time, to the rise of electronics. See Einstein’s and Weyl’s attempts at a monistic (‘unified’) theory of gravitation and electromagnetics.

(9) Michelson’s experiment (1881-1887-1902, etc.) leads to Lorentz’s Versuch einer Theorie der electrischen und optischen Erscheinungen in bewegten Körpern (1895: see §89). Lorentz’s book was crucially important to Einstein, who alluded to it twice in §9 of his relativity paper of 1905. (Einstein himself did not regard the Michelson experiment as very important.) Einstein’s special relativity theory is (a) a development of the formalism founded by Lorentz and (b) a different—that is, relativistic—interpretation of that formalism. There is no crucial experiment so far to decide between Lorentz’s and Einstein’s interpretations; but if we have to adopt action at a distance (non-locality: see Quantum Theory and the Schism in Physics, Vol. III of the Postscript, Preface 1982), then we would have to return to Lorentz.

Incidentally, it took years before physicists began to come to some agreement about the importance of Michelson’s experiments: I do not contend that falsifications are usually accepted at once (see the preceding section) not even that they are immediately recognised as potential falsifications.

(10) The ‘chance-discoveries’ of Roentgen and of Becquerel refuted certain (unconsciously held) expectations; especially Becquerel’s expectations. They had, of course, revolutionary consequences.

(11) Wilhelm Wien’s (partially) successful theory of black body radiation conflicted with the (partially) also very successful theories of Sir James Jeans and Lord Rayleigh. The refutation by Lummer and Pringsheim of the radiation formula of Rayleigh and Jeans, together with Wien’s work, leads to Planck’s quantum theory (see L.Sc.D., p. 108). In this, Planck refutes his own theory, the absolutistic interpretation of the entropy law, as opposed to a probabilistic interpretation similar to Boltzmann’s.

(12) Philipp Lenard’s experiments concerning the photoelectric effect conflicted, as Lenard himself insisted, with what was to be expected from Maxwell’s theory. They led to Einstein’s theory of light-quanta or photons (which were of course also in conflict with Maxwell), and thus, much later, to particle-wave dualism.

(13) The refutation of the Mach-Ostwald anti-atomistic and phenomenalistic theory of matter: Einstein’s great paper on Brownian motion of 1905 suggested that Brownian motion may be interpreted as a refutation of this theory. Thus this paper did much to establish the reality of molecules and atoms.

(14) Rutherford’s refutation of the vortex model of the atom. This leads directly to Bohr’s 1913 theory of the hydrogen atom, and thus, in the end, to quantum mechanics.

(15) Rutherford’s refutation (in 1919) of the theory that chemical elements cannot be changed artificially (though they may disintegrate spontaneously).

(16) The theory of Bohr, Kramers and Slater (see L.Sc.D., pp. 250, 243): this theory was refuted by Compton and Simon. The refutation leads almost at once to the Heisenberg-Born-Jordan quantum mechanics.

(17) Schrödinger’s interpretation of his (and de Broglie’s) theory is refuted by the statistical interpretation of matter waves (experiments of Davisson and Germer, and of George Thomson, for instance). This leads to Born’s statistical interpretation.

(18) Anderson’s discovery of the positron (1932) refutes a lot: the theory of two elementary particles — protons and electrons — is refuted; conservation of particles is refuted; and Dirac’s own original interpretation of his predicted positive particles (he thought they were protons) is refuted. Some theoretical work of about 1930-31 is thereby corroborated.

(19) The electrical theory of matter elaborated by Einstein and Weyl, and held implicitly, and at any rate pursued, by Einstein to the end of his life (since he interpreted the unified field theory as a theory of two fields, gravitation and electromagnetics), is refuted by the neutron and by Yukawa’s theory of nuclear forces: the Yukawa Meson. This gives rise to the theory of the nucleus.

(20) The refutation of parity conservation. (See Allan Franklin, Stud. Hist. Philos. Sci. 10, 1979, p. 201.)”

That is an interesting list of scientific refutations provided by Popper himself. Popper was right to suggest that the new theories highlighted above were not direct results of the refutations. The refutations merely created new problem situations, which stimulated imaginative and critical thought by thinking men. But this initial stage of conceiving a new theory is not susceptible to logical analysis. “The question how it happens that a new idea occurs to a man … may be of great interest to empirical psychology; but it is irrelevant to the logical analysis of scientific knowledge” (see Popper, K., The Logic of Scientific Discovery, 1934, p. 7). That is because logical analysis is concerned not with questions of fact (quid facti) but with questions of justification or validity (quid juris).

## Akaike Information Criterion Statistics

Consider a distribution ${(q_1, q_2, ...,q_k)}$ with ${q_i >0}$ and ${ q_1 + q_2 + ...+ q_k=1}$. Suppose ${N }$ independent drawings are made from the distribution and the resulting frequencies are given by ${ (N_1,N_2,...,N_k)}$, where ${N_1+N_2+...+N_k=N}$. Then the probability of getting the same frequencies by sampling from ${(q_1, q_2, ...,q_k)}$ is given by

$\displaystyle W = \frac{N!}{N_1!...N_k!} q_1^{N_1} q_2^{N_2}... q_k^{N_k}$

and thus

$\displaystyle \ln W \approx - N \sum\limits_{i=1}^{k}\frac{N_i}{N} \ln \left( \frac{N_i}{N q_i} \right)$

since ${\ln N! \approx N \ln N - N}$. Set ${p_i = N_i/N}$. Then

$\displaystyle \begin{array}{rcl} \ln W &\approx& - N \sum\limits_{i=1}^{k} p_i \ln (p_i / q_i) \\ &=& N B(p;q) \end{array}$

where ${B(p;q) = - \sum_{i=1}^{k} p_i \ln (p_i / q_i)}$ is the entropy of the distribution ${\{p_i \}}$ with respect to the distribution ${\{q_i \}}$. Up to the factor ${N}$, the entropy can be interpreted as the logarithm of the probability of getting the distribution ${\{ p_i \}}$ (which could asymptotically be the true distribution) by sampling from a hypothetical distribution ${\{q_i\}}$.
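The quality of the Stirling approximation can be checked numerically. The following is a minimal sketch, assuming an illustrative three-cell example (the values of ${p}$ and ${q}$ and the helper names are hypothetical, not from the text); it compares the exact ${\ln W / N}$ with ${B(p;q)}$ for growing ${N}$.

```python
import math

def log_multinomial_prob(counts, q):
    """Exact ln W for observed cell counts under cell probabilities q."""
    n = sum(counts)
    log_w = math.lgamma(n + 1)  # ln N!
    for n_i, q_i in zip(counts, q):
        log_w += n_i * math.log(q_i) - math.lgamma(n_i + 1)
    return log_w

def entropy_b(p, q):
    """B(p;q) = -sum_i p_i ln(p_i / q_i); cells with p_i = 0 contribute 0."""
    return -sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)

def per_observation_gap(n):
    """ln W / N minus B(p;q) for the illustrative example below."""
    q = [0.5, 0.3, 0.2]                              # hypothetical distribution
    counts = [2 * n // 10, 3 * n // 10, 5 * n // 10]  # observed frequencies
    p = [c / n for c in counts]
    return log_multinomial_prob(counts, q) / n - entropy_b(p, q)

for n in (100, 10_000, 1_000_000):
    print(n, per_observation_gap(n))
```

The gap shrinks as ${N}$ grows because the terms dropped from Stirling's formula are only of order ${\ln N}$, so they vanish after division by ${N}$.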

Based on Sanov’s result (1961) the above discussion may be extended to more general distributions. Let ${f}$ and ${g}$ be the pdfs of the true and hypothetical distributions respectively, and ${f_N}$ the pdf estimate based on the random sampling of ${N}$ observations from ${g}$. Then

$\displaystyle B(f;g) = - \int f(z) \ln \left( \frac{f(z)}{g(z)} \right) dz$

is obtained as the limit ${ \lim\limits_{\epsilon \downarrow 0} \lim\limits_{N \rightarrow \infty} N^{-1} \ln P(\sup_x |f_N(x)- f(x)| < \epsilon)}$. Note that ${- B(f;g) }$ equals ${\mathbb{E}_f [ \ln (f(z)/g(z))] }$, which is the Kullback-Leibler divergence between ${f}$ and ${g}$. Note also that ${B(f;g) \leq 0 }$, because by Jensen's inequality

$\displaystyle \begin{array}{rcl} - \mathbb{E}_f \left[ \ln \frac{f(z)}{g(z)}\right] &=& \mathbb{E}_f \left[ \ln \frac{g(z)}{f(z)} \right] \\ &\leq& \ln \mathbb{E}_f \left[\frac{g(z)}{f(z)}\right] = \ln \int \frac{g(z)}{f(z)}f(z) dz = 0 \end{array}$
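The sign of ${B(f;g)}$ can be illustrated numerically. The sketch below assumes, for illustration only, that ${f}$ and ${g}$ are both normal densities (the parameter values and function names are hypothetical). It estimates ${B(f;g)}$ by averaging ${\ln (g(z)/f(z))}$ over draws from ${f}$ and compares the result with the standard closed form for the Kullback-Leibler divergence between two normals.

```python
import math
import random

def log_normal_pdf(z, mu, sigma):
    """Log density of N(mu, sigma^2) at z."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (z - mu) ** 2 / (2 * sigma ** 2)

def entropy_b_monte_carlo(mu_f, sigma_f, mu_g, sigma_g, draws=200_000, seed=1):
    """Estimate B(f;g) = -E_f[ln(f/g)] by sampling z from f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        z = rng.gauss(mu_f, sigma_f)
        total += log_normal_pdf(z, mu_f, sigma_f) - log_normal_pdf(z, mu_g, sigma_g)
    return -total / draws

def entropy_b_exact(mu_f, sigma_f, mu_g, sigma_g):
    """Closed form: B(f;g) is minus the KL divergence between two normals."""
    kl = (math.log(sigma_g / sigma_f)
          + (sigma_f ** 2 + (mu_f - mu_g) ** 2) / (2 * sigma_g ** 2)
          - 0.5)
    return -kl

b_mc = entropy_b_monte_carlo(0.0, 1.0, 1.0, 2.0)   # f = N(0, 1), g = N(1, 4)
b_exact = entropy_b_exact(0.0, 1.0, 1.0, 2.0)
b_same = entropy_b_exact(0.0, 1.0, 0.0, 1.0)       # g identical to f
print(b_mc, b_exact, b_same)
```

In this example the entropy is strictly negative when ${g}$ differs from ${f}$ and zero when the two coincide, as the inequality requires.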

Suppose that we observe a data set ${\mathbf{x}}$ of ${N}$ elements. We could predict the future observations ${\mathbf{y}}$ whose distribution is identical to that of ${\mathbf{x}}$ by specifying a predictive distribution ${ g(\mathbf{y} | \mathbf{x}) }$ which is a function of the given dataset ${ \mathbf{x}}$. The “closeness” of ${ g(\mathbf{y} | \mathbf{x}) }$ to the true distribution of the future observations ${f(\mathbf{y})}$ is measured by the entropy

$\displaystyle \begin{array}{rcl} B(f(.); g(.| \mathbf{x})) &=& -\int \left( \frac{f(\mathbf{y})}{ g(\mathbf{y} | \mathbf{x})} \right) \ln \left( \frac{f(\mathbf{y})}{ g(\mathbf{y} | \mathbf{x})} \right) g(\mathbf{y} | \mathbf{x}) d \mathbf{y}\\ &=& \int f(\mathbf{y}) \ln g(\mathbf{y} | \mathbf{x}) d \mathbf{y} - \int f(\mathbf{y}) \ln f(\mathbf{y}) d \mathbf{y} \\ &=& \mathbb{E}_y \ln g(\mathbf{y} | \mathbf{x}) - c \end{array}$

Hence the entropy is equivalent to the expected log-likelihood with respect to a future observation, apart from the constant ${c = \int f(\mathbf{y}) \ln f(\mathbf{y}) d \mathbf{y}}$, which does not depend on the model. The goodness of the estimation procedure specified by ${ g(\mathbf{y} | \mathbf{x}) }$ is measured by ${\mathbb{E}_x \mathbb{E}_y \ln g(\mathbf{y} | \mathbf{x})}$, which is the average over the observed data of the expected log-likelihood of the model ${ g(\mathbf{y} | \mathbf{x}) }$ with respect to a future observation.

Suppose ${\mathbf{x}}$ and ${\mathbf{y}}$ are independent and that the distribution ${g(.|\mathbf{x})}$ is specified by a fixed parameter vector ${\mathbf{\theta}}$ (i.e. ${ g(.|\mathbf{x}) = g(.|\mathbf{\theta})}$). Then ${\ln g(\mathbf{x}|\mathbf{x})=\ln g(\mathbf{x}|\mathbf{\theta})}$, and hence the conventional ML estimation procedure is justified, since

$\displaystyle \mathbb{E}_x \ln g(\mathbf{x}|\mathbf{\theta}) = \mathbb{E}_x \mathbb{E}_y \ln g(\mathbf{y}|\mathbf{x})$

However, in general,

$\displaystyle \mathbb{E}_x \ln g(\mathbf{x}|\mathbf{x}) \neq \mathbb{E}_x \mathbb{E}_y \ln g(\mathbf{y}|\mathbf{x})$
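The inequality can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a derivation from the text: the true distribution is taken to be standard normal, the model is a normal with mean and variance fitted by maximum likelihood, and the function name `optimism_gap` is hypothetical. It averages the difference between the in-sample log-likelihood ${\ln g(\mathbf{x}|\mathbf{x})}$ and the expected log-likelihood ${\mathbb{E}_y \ln g(\mathbf{y}|\mathbf{x})}$ of a future sample of the same size.

```python
import math
import random

def optimism_gap(n, trials, seed=2):
    """Average of ln g(x|x) - E_y ln g(y|x) for a normal model fitted by ML.

    Assumes the true distribution is N(0, 1), so that for a future point y
    E[(y - mu_hat)^2] = 1 + mu_hat^2 and E_y can be computed in closed form.
    """
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mu_hat = sum(x) / n                              # ML estimate of the mean
        var_hat = sum((v - mu_hat) ** 2 for v in x) / n  # ML estimate of the variance
        log_norm = math.log(2 * math.pi * var_hat)
        # Log-likelihood of the data that were used to fit the model.
        in_sample = -0.5 * n * (log_norm + 1.0)
        # Expected log-likelihood of an independent future sample of size n.
        future = -0.5 * n * (log_norm + (1.0 + mu_hat ** 2) / var_hat)
        total_gap += in_sample - future
    return total_gap / trials

gap = optimism_gap(n=50, trials=5000)
print(gap)
```

In this setting the in-sample log-likelihood is, on average, larger than the expected log-likelihood of future data, and the average gap is close to the number of fitted parameters (two here).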