Products and expectation values
by Willie Wong
Let us start with an instructive example (modified from one I learned from Steven Landsburg). Let us play a game:
I show you three identical-looking boxes. In the first box there are 3 red marbles and 1 blue one. In the second box there are 2 red marbles and 1 blue one. In the last box there is 1 red marble and 4 blue ones. You choose one at random. What is …
- The expected number of red marbles you will find?
- The expected number of blue marbles you will find?
- The expected number of marbles, regardless of colour, you will find?
- The expected percentage of red marbles you will find?
- The expected percentage of blue marbles you will find?
Answer below the cut…
You have equal chances (1/3) of choosing each of the three boxes. So the expected number of red marbles is $\frac{1}{3}(3 + 2 + 1) = 2$. The expected number of blue marbles is $\frac{1}{3}(1 + 1 + 4) = 2$. So far so good. The expected number of marbles you get is $\frac{1}{3}(4 + 3 + 5) = 4$.
So the expected numbers of red and blue marbles are the same. This means that we expect half of the marbles to be red, on average, right?
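Wrong! The expected percentage of red marbles is
$$\frac{1}{3}\left(\frac{3}{4} + \frac{2}{3} + \frac{1}{5}\right) = \frac{97}{180} \approx 53.9\%$$
and the expected percentage of blue marbles is
$$\frac{1}{3}\left(\frac{1}{4} + \frac{1}{3} + \frac{4}{5}\right) = \frac{83}{180} \approx 46.1\%.$$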
So what gives?
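The five answers can be checked with exact rational arithmetic. The following is only a small sketch; the variable names are made up for illustration, and the only inputs are the box contents stated in the game.

```python
from fractions import Fraction

# (red, blue) counts for the three boxes in the game
boxes = [(3, 1), (2, 1), (1, 4)]
p = Fraction(1, len(boxes))  # each box is equally likely

# Expected counts: average the counts over the boxes
expected_red = sum(p * r for r, b in boxes)
expected_blue = sum(p * b for r, b in boxes)
expected_total = sum(p * (r + b) for r, b in boxes)

# Expected proportions: average the per-box proportions
expected_red_fraction = sum(p * Fraction(r, r + b) for r, b in boxes)
expected_blue_fraction = sum(p * Fraction(b, r + b) for r, b in boxes)

print(expected_red, expected_blue, expected_total)    # 2 2 4
print(expected_red_fraction, expected_blue_fraction)  # 97/180 83/180
```

Note that the two proportions still add up to 1, even though neither equals 1/2.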
Expectation values and random variables
Let us formalize the discussion a little bit. Let us suppose that in the game I give you $N$ identical boxes, which for the purpose of computation we’ll label $1, 2, \ldots, N$. Box number $i$ contains $r(i)$ red marbles and $b(i)$ blue marbles, so $r$ and $b$ are functions whose domain is the set $\{1, \ldots, N\}$ and which take values in the non-negative integers. (Of course, we can generalize even further by having more than two colours and fractional pieces of marbles.) The strategy by which you play the game determines a probability measure $p$, with the properties that
- The domain of $p$ is $\{1, \ldots, N\}$, the set of possible boxes to choose from.
- The value of $p(i)$ is between 0 and 1.
- The total probability, that is to say $\sum_{i=1}^{N} p(i)$, adds up to 1.
In the random selection case each $p(i) = 1/N$.
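Given a set of outcomes $\Omega$ and a probability measure $p$ on it, a function whose domain is $\Omega$ is called a random variable. In our example $\Omega$ is the set of boxes, and both the number of red marbles $r$ and the number of blue marbles $b$ are random variables. The expected value of a random variable $X$ is called its expectation (value); it is written, and computed, as follows:
$$E[X] = \sum_{\omega \in \Omega} p(\omega)\, X(\omega).$$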
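From the elementary rules of arithmetic, using that the expectation is a sum and hence linear in its argument, we have that
$$E[\alpha X + \beta Y] = \alpha\, E[X] + \beta\, E[Y],$$
where $X, Y$ are random variables and $\alpha, \beta$ are real numbers.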
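For example, the total number of marbles in a box, which we can write $t$, satisfies $t = r + b$, the sum of the red and blue counts. So we have that $E[t] = E[r] + E[b]$; in the game above this reads $4 = 2 + 2$.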
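The definition of the expectation and its linearity translate almost word for word into code. Here is a minimal sketch, assuming a finite set of outcomes stored as dictionary keys; the helper `expectation` and the dictionaries `p`, `r`, `b`, `t` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction

def expectation(p, X):
    """E[X]: sum over outcomes w of p(w) * X(w)."""
    return sum(p[w] * X[w] for w in p)

# Outcomes are the box labels; p is the uniform measure from the game.
p = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
r = {1: 3, 2: 2, 3: 1}            # red marbles in each box
b = {1: 1, 2: 1, 3: 4}            # blue marbles in each box
t = {w: r[w] + b[w] for w in p}   # total marbles, t = r + b

# p is a probability measure: values in [0, 1], total mass 1
assert all(0 <= p[w] <= 1 for w in p) and sum(p.values()) == 1

# E[t] = E[r] + E[b]
lhs = expectation(p, t)
rhs = expectation(p, r) + expectation(p, b)
print(lhs, rhs)  # 4 4

# Linearity with arbitrary real coefficients (here 5 and -2)
alpha, beta = 5, -2
combo = {w: alpha * r[w] + beta * b[w] for w in p}
linear_ok = expectation(p, combo) == alpha * expectation(p, r) + beta * expectation(p, b)
print(linear_ok)  # True
```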
Products of random variables
Unlike sums and differences of random variables, products of random variables do not, in general, commute with the operation of taking an expectation. That is to say, in general the expressions $E[XY]$ and $E[X]\,E[Y]$ take different values. There are, however, cases where the two do agree. First recall that $E[Y]$ is a real number. So
$$E\big[X\,E[Y]\big] = E[X]\,E[Y]$$
by the linearity of expectations. So we can compute
$$E\big[(X - E[X])(Y - E[Y])\big] = E[XY] + E\big[E[X]\,E[Y]\big] - E\big[X\,E[Y]\big] - E\big[Y\,E[X]\big].$$
The third and fourth terms on the RHS both evaluate to $E[X]\,E[Y]$ (each entering with a minus sign), as does the second term (the expectation value of a constant is the constant itself). So we conclude that
$$E[XY] = E[X]\,E[Y] + E\big[(X - E[X])(Y - E[Y])\big].$$
The quantity $E[(X - E[X])(Y - E[Y])]$ in the above equation is known as the covariance of the two random variables $X$ and $Y$, and provides a measure of how the two variables change together. A tautological statement is then: “the expectation of a product of random variables is equal to the product of their expectations, if and only if their covariance is zero”.
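As a sanity check, the identity relating the expectation of a product to the covariance can be verified on the marble game, taking the two random variables to be the red and blue counts. This is only an illustrative sketch; `expectation` and the dictionary names are hypothetical helpers, not part of any library.

```python
from fractions import Fraction

def expectation(p, X):
    """E[X]: sum over outcomes w of p(w) * X(w)."""
    return sum(p[w] * X[w] for w in p)

p = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
X = {1: 3, 2: 2, 3: 1}   # red marbles in each box
Y = {1: 1, 2: 1, 3: 4}   # blue marbles in each box

EX, EY = expectation(p, X), expectation(p, Y)

# E[XY]: expectation of the pointwise product
E_XY = expectation(p, {w: X[w] * Y[w] for w in p})

# Covariance: expectation of the product of the centred variables
cov = expectation(p, {w: (X[w] - EX) * (Y[w] - EY) for w in p})

print(E_XY, EX * EY, cov)  # 3 4 -1
identity_ok = E_XY == EX * EY + cov
print(identity_ok)  # True
```

The covariance is negative here: boxes with more red marbles tend to have fewer blue ones, so the expectation of the product is smaller than the product of the expectations.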
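An immediate consequence is the following. Let $X$ be a non-vanishing random variable whose values all have the same sign (for instance a strictly positive one, such as the total number of marbles in a box). Then $E[1/X] = 1/E[X]$ if and only if $X$ is constant over all outcomes with positive probability.
Proof: If $X$ is a constant, then it is clear that the desired equality holds. So it suffices to prove the other implication. The desired equality is equivalent to $E[X]\,E[1/X] = 1$. Assuming this holds, we have that the covariance
$$E\big[(X - E[X])(1/X - E[1/X])\big] = E[X \cdot 1/X] - E[X]\,E[1/X] = 1 - 1 = 0.$$
On the other hand, observe that $\frac{1}{X} - E[1/X] = \frac{1}{X} - \frac{1}{E[X]} = \frac{E[X] - X}{X\,E[X]}$. Since $X$ and $E[X]$ have the same sign, this implies that the random variable
$$(X - E[X])(1/X - E[1/X]) = -\frac{(X - E[X])^2}{X\,E[X]}$$
is non-positive. For its expectation to be exactly 0 requires that the random variable itself vanishes on all outcomes with positive probability, that is, $X = E[X]$ there, which shows the claim.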
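To see the statement in action, one can compare the expectation of the reciprocal with the reciprocal of the expectation for the total marble count in the game, and then for a constant random variable. This is a rough sketch under the same finite-outcome setup; all names (`expectation`, `t`, `const`) are hypothetical.

```python
from fractions import Fraction

def expectation(p, X):
    """E[X]: sum over outcomes w of p(w) * X(w)."""
    return sum(p[w] * X[w] for w in p)

p = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Total number of marbles in each box: positive but not constant
t = {1: 4, 2: 3, 3: 5}
E_inv_t = expectation(p, {w: Fraction(1, t[w]) for w in p})
inv_E_t = 1 / expectation(p, t)
print(E_inv_t, inv_E_t)  # 47/180 1/4

# A constant random variable: the two quantities agree
const = {w: 4 for w in p}
E_inv_const = expectation(p, {w: Fraction(1, const[w]) for w in p})
inv_E_const = 1 / expectation(p, const)
print(E_inv_const, inv_E_const)  # 1/4 1/4
```

For the non-constant total, the expectation of the reciprocal (47/180) is strictly larger than the reciprocal of the expectation (45/180), in line with the non-positive covariance found in the proof.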
Moral of the story
In general,
$$E\left[\frac{X}{Y}\right] \neq \frac{E[X]}{E[Y]},$$
and so one should be very careful when trying to draw conclusions about the expected “proportion” from the expected “numbers”.