Probability is the likelihood of an event occurring. This site discusses the following topics:
Random variables
Joint and
conditional distributions
Invariance
Bayes’ rule
A probability is writen in the
following manner: P(R). This is read as
the probability of the event R.
Random variables – an aspect of the
world that is uncertain, denoted by capital letters.
Random variable
can have a domain, or range of possible values.
For true/ false,
r = true, ¬r = false or not true
Basic Properties of
Probability
: event A or B occurs, or both
: both events A and B
occur
: event A happens,
but not B
: event A does not
happen, S is the space that contains all possibilities
Additional
Properties
![]()
If
then![]()
![]()
![]()
![]()
![]()
Joint and
conditional distributions
Probabilistic models will have a variety of likely
outcomes. Joint distributions are
likelihoods of those outcomes occurring.
Joint distributions of probabilistic models are normalized to 1. In other words, the probabilities of the
different outcomes add up to 1.
An example of this is as following: A survey is performed over a group of people
over what time of food they like. From
the survey, if a person is picked at random he is likely to prefer the
following type of food an any given night with the following distribution:

Should a probability have multiple factors, it is written as P(F, G). This is read as the probability of F and G.

Many probabilities have multiple conditions or factors. An example of this is as follows: it is more
likely to be good skiing if it snowed heavily last night, it is clear, sunny,
and a regular work day today. A
distribution of probabilities that contains multiple factors or conditions is
called a conditional distribution.
Conditional Distributions are probability distributions that have
dependencies. Continuing the previous example:

One can query the table for a conditional probability like P(p|m). This is read as the probability of pizza given
the person is male. In order to do so,
the following equation is used:

The last part of the equation,
, is the probabilities summed over the domain of A. To apply this equation to the given example, P(p, m) = .36 will be divided by the sum over the domain of
F, P(p, m) + P(h, m). Or, P(p, m) will be divided by the sum of P(p, m) = .36 and P(h,
m) = .24.

Marginalization
Joint distributions can be marginalized, or broken up into
smaller joint distributions. This is done my summing out unwanted variables.

Normalization
Should a table of conditional probabilities be desired, do the
following:

Normalization works by the same equation as querying over conditional
probabilities,

Inference
Inference with joint distributions is simply looking up the
probabilities in the distributions table then summing, multiplying and dividing
the numbers to achieve the desired result. Finding P(p
| m) from above, repeated below, is an example of inference.

Another example is P(h | f):

Product Rule
Should one have a table of conditional probability but desires
joint distributions, do the reverse as above.
![]()

When multiplying joint distributions together the different
variables have to matched up,
P(male)*P(pizza | male) = P(pizza, male),
P(male)*P(hamburgers | male) = P(hamburgers,
male), etc.
Bayes’ Rule
Should one conditional distribution be known and another desired, Bayes’ Rule is an effective
method of obtaining the desired conditional distribution.
![]()
Two variables are independent if they obey
. If this is
the case then,
, A is independent of B. In the given example, Food is independent of
Gender. It does not matter if the person
is a male or not, pizza was more likely to be chosen over hamburgers.
True independence of variables is rare. When independence
exists, it can be used with marginalization to store several smaller
distributions. This speeds up
calculation time and shrinks the necessary storage needed.
While true independence of variables may be rare, conditional independence
is a little more common. A conditional
independence is where two variables are independent if a third variable is
known. Mathematically this is ![]()
Extending the given example, Food and Gender are independent if
it is the weekend.

Chain Rule
A joint distribution can be
represented as a product of conditional distributions

The following is how the chain rule can be used to express the
given example:


For more information please see the following
http://www.cs.utah.edu/~hal/courses/2009S_AI/cs5300-day14-probability.pdf
Please also see Chapter 13
sections 1 through 6 of Artificial Intelligence: a Modern Approach
(Second Edition) by Stuart Russell and Peter Norvig.