Posted on April 12, 2013 @ 12:18:00 PM by Paul Meagher
In this blog, I'll be doing a bit of algebra to show you that our conditional probability formula P(H|E) = P(H & E) / P(E) is equivalent to P(H|E) = P(E|H) * P(H) / P(E). This latter form of the equation is the version that people most often refer to as Bayes theorem. The two forms are mathematically equivalent, but in different circumstances one is easier to work with than the other. A Bayesian Angel Investor will need to master this Bayes theorem version of the conditional probability equation. It includes a term, P(E|H), called the likelihood, which is also critical for a Bayesian Angel Investor to understand. We will briefly discuss this term here, leaving a more detailed discussion until next week, when I will dedicate a blog to the likelihood concept.
The derivation of Bayes theorem follows naturally from the definition of conditional probability:
P(H|E) = P(H & E) / P(E)
Using some simple algebra (multiplying both sides by P(E)), this equation can be rewritten as:
P(H & E) = P(H | E) * P(E)
The same joint probability can also be computed using H as the conditioning variable in the right-hand part of the equation:
P(H & E) = P(E | H) * P(H)
Given this equivalence, you can write:
P(H|E) * P(E) = P(E|H) * P(H)
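We can now substitute P(E|H) * P(H) for P(H & E) in the definition of conditional probability (or, equivalently, divide both sides of the previous equation by P(E)) and arrive at Bayes theorem: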
P(H|E) = P(E|H) * P(H) / P(E)
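To see that the two forms give the same answer, here is a minimal Python sketch. The counts are hypothetical (invented purely for illustration, not real startup data), and the variable names are my own. H stands for "the startup succeeded" and E for "the startup passed a diagnostic test".

```python
import math

# Hypothetical counts for 100 startups (illustrative only, not real data).
# H = startup succeeded, E = startup passed a diagnostic test.
counts = {
    ("H", "E"): 9,
    ("H", "not E"): 3,
    ("not H", "E"): 22,
    ("not H", "not E"): 66,
}
total = sum(counts.values())

# Joint and marginal probabilities read straight off the table.
p_h_and_e = counts[("H", "E")] / total
p_h = (counts[("H", "E")] + counts[("H", "not E")]) / total
p_e = (counts[("H", "E")] + counts[("not H", "E")]) / total

# Likelihood, using the conditional probability formula with H as the condition.
p_e_given_h = p_h_and_e / p_h

# Form 1: the definition of conditional probability.
posterior_from_joint = p_h_and_e / p_e

# Form 2: Bayes theorem.
posterior_from_bayes = p_e_given_h * p_h / p_e

print(f"P(H|E) from joint probability: {posterior_from_joint:.4f}")
print(f"P(H|E) from Bayes theorem:     {posterior_from_bayes:.4f}")
print("Forms agree:", math.isclose(posterior_from_joint, posterior_from_bayes))
```

Both calculations return the same value because the numerator P(E|H) * P(H) is just another way of writing the joint probability P(H & E).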
Notice that this formula for computing a conditional probability is similar to the original formula with the exception that the joint probability P(H & E) that used to appear in the numerator has been replaced with an equivalent expression P(E|H) * P(H).
We can simplify this equation further by pointing out that P(E), the probability of the evidence, does not depend on which hypothesis we are evaluating. It acts as a normalizing constant that ensures that when we compute the conditional probabilities P(H|E) for all of our competing hypotheses, they collectively sum to 1. Conceptually, we can eliminate it from our equation by making the weaker claim that P(H|E) is proportional to P(E|H) * P(H):
P(H|E) ~ P(E|H) * P(H)
What this simplified equation is saying is that the probability of a hypothesis (e.g., startup success) given the evidence (e.g., tests diagnostic of startup success) is proportional to the likelihood of the evidence P(E|H) times the prior probability of the hypothesis P(H). When making decisions, we don't necessarily need to know the probability of success exactly, just that the success probability is quite a bit bigger than the failure probability. This is why this simpler version of Bayes theorem is still useful even though it only expresses a proportional relationship and not a full equality.
In order to update our prior probability of first-time startup success from .12 (or 12%) given the evidence of some diagnostic tests, we need to multiply our prior assessment of first-time startup success P(H) by a factor called the likelihood P(E|H). The likelihood term is obviously doing a lot of the heavy lifting in terms of updating our prior beliefs.
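As a rough illustration of this update, the sketch below starts from the .12 prior and applies the proportional form of Bayes theorem to both outcomes (success and failure), then normalizes so the two results sum to 1. The two likelihood values are assumptions made up for the example, and the function name `posterior` is my own, not part of any library.

```python
def posterior(prior_h, lik_e_given_h, lik_e_given_not_h):
    """Return P(H|E) when there are only two outcomes: H or not H."""
    score_h = lik_e_given_h * prior_h                # P(E|H) * P(H)
    score_not_h = lik_e_given_not_h * (1 - prior_h)  # P(E|not H) * P(not H)
    p_e = score_h + score_not_h                      # P(E), the normalizing constant
    return score_h / p_e


prior_success = 0.12          # prior probability of first-time startup success
p_pass_given_success = 0.75   # assumed likelihood P(E|H), hypothetical
p_pass_given_failure = 0.25   # assumed likelihood P(E|not H), hypothetical

updated = posterior(prior_success, p_pass_given_success, p_pass_given_failure)
print(f"Prior P(H):       {prior_success:.2f}")
print(f"Posterior P(H|E): {updated:.2f}")
```

With these made-up likelihoods, the probability of success rises from .12 to about .29. The unnormalized scores (.09 for success versus .22 for failure) already show which outcome is more probable, which is all the proportional form promises. Dividing by their sum, which is P(E), only rescales them into probabilities.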
In my next blog, I will discuss how likelihoods can be computed from a data table using the conditional probability equation P(E|H) = P(E & H)/P(H) and other techniques. Some statisticians argue that likelihoods are good enough for decision making, that you don't have to incorporate prior probabilities P(H) into calculations to figure out the most probable outcome. These statisticians are afraid of introducing a subjective element (e.g., your prior assessment P(H) of the relative probability of different outcomes) into decision making. Bayesians argue that this subjective element makes the probability calculations more intelligent and contextually sensitive. An angel investor with lots of business experience should have at their disposal a mathematical tool that allows them to use their experience in making startup investment decisions. Bayesian inference techniques offer the promise of being that tool.