Tuesday, August 09, 2011

Probability of a union of independent events algorithm

Lately I was looking for a copy 'n paste algorithm to calculate the probability of a union of independent events that are not mutually exclusive (aka inclusion-exclusion principle in probability). Unfortunately I couldn't find any algorithm for such a basic problem.
Therefore, I decided to write the following naive algorithm which is fast enough for my purposes (O(n2) in time and space, where n is the number of events), and share with everyone:

You can find the code snippet here, sorry for not embedding it in the blog post but blogger is really boring me with snippets having broken layout.

The idea behind the dynamic programming approach starts from this observation:

Let A, B and C be independent non mutually exclusive events. Then:

P(A or B or C) = P(A) + P(B) + P(C) - P(A)P(B) - P(A)P(C) - P(B)P(C) + P(A)P(B)P(C)

Let me simplify the expression using A instead of P(A):

P(A or B or C) = A+B+C - AB - AC - BC + ABC

Notice that AB+AC+BC = A(B+C) + BC. In general:

AB+AC+AD+...+AZ + BC+BD+....+BZ = A(B+C+D+...+Z) + B(C+D+...+Z)

ABC+...+ABZ + ACD+...+ACZ+...+AYZ + BCD+...+BYZ = A(B(C+D+...+Z)+C(D+...+Z)+...+YZ) + B(C(D+...+Z)+...+YZ)

That's exactly where we exploit the dynamic programming to avoid recalculating the same expressions twice.

Edit: My effort was totally useless given that for independent events this is equivalent to 1 - (1 - pA)(1 - pB)..., which can be calculated in linear time. I even used this formula once and forgot about it :-(