Alpha-Beta Pruning is a method for speeding up the Minimax Algorithm. It works by pruning away the
branches of the search tree that cannot possibly influence the final
decision, so it has no effect on the result that the Minimax Algorithm returns.
More specifically, for every node n, if a player has a better choice at the
parent of n or at any choice point further up, then n will never be reached in
actual play. So once enough information about n has been determined by examining
its descendants, n is pruned.
Alpha-Beta Pruning gets its name from the following two terms:
alpha (α) = the value of the best choice found so far for the MAX player at any choice point along the current path
beta (β) = the value of the best choice found so far for the MIN player at any choice point along the current path
These terms are passed along in the code as a means of bookkeeping.
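To make the bookkeeping concrete, here is a minimal Python sketch that compares plain minimax with an alpha-beta version on a small tree. The nested-list tree representation, the function names, and the leaf values are assumptions made up for this illustration; they are not taken from the figures in this article.

```python
import math

def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are plain numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta bookkeeping; returns the same value as minimax."""
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            if value >= beta:      # MIN already has a better option higher up
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        if value <= alpha:         # MAX already has a better option higher up
            break
        beta = min(beta, value)
    return value

# Assumed toy tree: a MAX root with three MIN children, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
plain_result = minimax(tree, True)
pruned_result = alphabeta(tree, True)
print(plain_result, pruned_result)
```

Both calls return the same root value for this tree; the alpha-beta version simply stops looking at a node's remaining children once alpha and beta show they cannot matter.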
Example 1:
As an example, consider the following tree:

The triangles pointing up represent maximizing nodes. The triangles pointing down
represent minimizing nodes. Each [-inf, +inf] pair is a placeholder for the alpha-beta
bookkeeping: alpha is initialized to negative infinity (-inf) and beta is initialized
to positive infinity (+inf).
When Alpha-Beta Pruning is performed with the Minimax Algorithm,
the following steps are carried out:
Step 1: Node E is explored and found to have a value of 8. The minimizing node B
can now do no worse than 8, so 8 becomes the upper bound on the value that B
passes up to the maximizing node A.

Step 2: The minimax algorithm continues to the next node, F, which has a value of 2.
The minimizing node B prefers 2 over 8, so the upper bound on the value that B
passes up to A drops to 2.

Step 3: In the next step, node G is revealed to have a value of
1. This is the last child of B and its best option, so both bookkeeping numbers
at B are now 1. The maximizing node A can now be sure of getting at least
the minimizing node B's best choice, 1.

Step 4: As the minimax algorithm continues, node H is explored. H has
a value of 7, which is greater than the 1 that A can already get through B, so nothing can be pruned.
Furthermore, 7 becomes the worst that the minimizing node C can do.

Step 5: Node I is explored
to reveal a better option for node C.

Step 6: Node J is revealed to hold an option that C will not take.
Although node J is not a better option for the minimizing node C, it
could not be pruned, because nodes H and I both held better options for the maximizing node
A than what the minimizing node B provided.

Step 7: Node K is revealed to hold a value of 2. This
value is smaller than the value of node C, 6.
This means that nodes L and M cannot possibly influence node A, and they are
subsequently pruned. This is because D
is a minimizing node, so L and M would not change D unless they were smaller
than 2, and D would not influence A unless K had been larger than 6.
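The walkthrough above can be replayed with a short Python sketch that records which leaves are actually visited. The values of E, F, G, H and K come from the steps; I = 6 is inferred from step 7; the values of J, L and M are not given in the text, so the ones below are assumed placeholders (J only needs to be larger than 6, and L and M are never read). The dictionary layout and the function name are likewise assumptions for this illustration.

```python
import math

# Tree shape from Example 1: A is a MAX node; B, C and D are MIN nodes.
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F", "G"],
    "C": ["H", "I", "J"],
    "D": ["K", "L", "M"],
}
# J, L and M are ASSUMED values; the others follow from the steps above.
LEAVES = {"E": 8, "F": 2, "G": 1, "H": 7, "I": 6, "J": 9, "K": 2, "L": 4, "M": 5}

visited = []  # leaves in the order they are evaluated

def search(name, maximizing, alpha, beta):
    if name in LEAVES:
        visited.append(name)
        return LEAVES[name]
    value = -math.inf if maximizing else math.inf
    for child in TREE[name]:
        child_value = search(child, not maximizing, alpha, beta)
        if maximizing:
            value = max(value, child_value)
            if value >= beta:
                break              # remaining children are pruned
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            if value <= alpha:
                break              # remaining children are pruned
            beta = min(beta, value)
    return value

root_value = search("A", True, -math.inf, math.inf)
pruned = [leaf for leaf in LEAVES if leaf not in visited]
print(root_value, visited, pruned)
```

Under these assumptions the search evaluates E through K, skips L and M, and gives A a value of 6, which matches step 7.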

Example 2:
Consider the following example:

Step 1: Explore Node E

Step 2: Continue to explore node E. This allows for the worst case of 2 for Nodes
B and A

Step 3: Explore Node F.
Because the first part of Node F is greater than 2, there is no need to
continue exploring F. No matter what the next part of the node is, F will be
worth more than 2, so the minimizing Node B will always
pick E before it will pick F.

Step 4: Explore the first part of G and prune the second. The reasons for pruning the second part of G
are the same as the reasons the second part of F was pruned in step 3.

Step 5: Explore node H.
The alpha of 2 is passed along because 2 is currently
the best that A can do. If C finds a
value equal to or less than 2, then it can stop exploring, because nothing else
it finds will be able to affect A.

Step 6: Continue to
explore node H. 10 becomes the best that H can do, and the worst that C can do.

Step 7: Continue exploring.
Because Node H contained a value that could influence A to
choose a path other than B, C continues to explore.

Step 8: Continue exploring to Node J. Node C is still trying to find something
better than 10, that is, a smaller value.

Step 9: Continue exploring.
The worst choice that A now has will have a
utility of 5.

Step 10: Continue exploring.
As the value of 5 has become the worst that A can choose, the alpha
value changes to 5. 5 is also now the
value to match or beat. If D, a minimizing
node, can find a value that is equal to or less than 5, then it can stop
exploring.

Step 11: Because K is a
maximizing node, it continues to explore after it has found the 2. Once its children have been explored, and the
best that it can do is to choose a value of 3, all other nodes can be
pruned. This is because no matter what
the values of the other nodes are, the minimizing node D is not going to pick a
value that is greater than 3, and in
order for the maximizing node A to pick a path other than Node C, D
would have to pick a value greater than 5.

Pseudocode
One of the great advantages of Alpha-Beta Pruning is its ease of
implementation. It can be added to a Minimax Agent with only a few extra lines
of code.
Please see the following pseudocode, also
found on page 170 of Artificial Intelligence: A Modern Approach (Second
Edition) by Stuart Russell and Peter Norvig. Prentice Hall, 2002.
function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, -inf, +inf)
    return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -inf
    for a, s in SUCCESSORS(state) do
        v ← MAX(v, MIN-VALUE(s, α, β))
        if v ≥ β then return v
        α ← MAX(α, v)
    return v

function MIN-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +inf
    for a, s in SUCCESSORS(state) do
        v ← MIN(v, MAX-VALUE(s, α, β))
        if v ≤ α then return v
        β ← MIN(β, v)
    return v
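The pseudocode above could be translated into Python roughly as follows. This is only a sketch: the `game` object with `successors`, `terminal_test` and `utility` methods is a hypothetical interface chosen to mirror the pseudocode's names, and `TreeGame` is a made-up toy game used to exercise it. One small difference from the pseudocode is that the root loop remembers the best action as it goes instead of looking it up afterwards.

```python
import math

def alpha_beta_search(state, game):
    """Return the action for MAX with the highest backed-up value."""
    best_action, alpha = None, -math.inf
    for action, successor in game.successors(state):
        value = min_value(successor, game, alpha, math.inf)
        if value > alpha:          # strictly better than anything seen so far
            best_action, alpha = action, value
    return best_action

def max_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = -math.inf
    for _action, successor in game.successors(state):
        v = max(v, min_value(successor, game, alpha, beta))
        if v >= beta:              # MIN will never let play reach this node
            return v
        alpha = max(alpha, v)
    return v

def min_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = math.inf
    for _action, successor in game.successors(state):
        v = min(v, max_value(successor, game, alpha, beta))
        if v <= alpha:             # MAX will never let play reach this node
            return v
        beta = min(beta, v)
    return v

class TreeGame:
    """Hypothetical toy game: a state is a dict of action -> child, or a numeric leaf."""
    def successors(self, state):
        return list(state.items())
    def terminal_test(self, state):
        return not isinstance(state, dict)
    def utility(self, state):
        return state

# Assumed example position: MAX chooses "left" or "right", then MIN chooses "a" or "b".
game_tree = {"left": {"a": 3, "b": 12}, "right": {"a": 2, "b": 4}}
best = alpha_beta_search(game_tree, TreeGame())
print(best)
```

In this toy position MIN would hold "left" to 3 and "right" to 2, so the search returns "left"; the leaf under "right"/"b" is never evaluated because the 2 already rules that branch out.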
Effectiveness
The effectiveness of Alpha-Beta Pruning depends upon the order in which nodes are
explored. An example of this can be seen
in step 11 of the second example presented here. If Nodes K and L were
switched, the algorithm would not have been able to prune as much as it did. With optimal ordering, Alpha-Beta Pruning
reduces the running time of the Minimax Algorithm from O(b^m) to O(b^(m/2)), where b is the
branching factor and m is the maximum depth of the search tree.
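The effect of ordering can be seen on a small scale by counting leaf evaluations. The sketch below runs the same alpha-beta search on two orderings of one tree; the nested-list representation, the function names and the leaf values are assumptions invented for this illustration, and a tree this small only hints at the asymptotic difference.

```python
import math

def count_leaves(node, maximizing, alpha, beta, counter):
    """Alpha-beta search that also counts how many leaves it evaluates."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = count_leaves(child, not maximizing, alpha, beta, counter)
        if maximizing:
            value = max(value, child_value)
            if value >= beta:
                break
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            if value <= alpha:
                break
            beta = min(beta, value)
    return value

def leaves_visited(tree):
    """Return (root value, number of leaves evaluated) for a MAX root."""
    counter = [0]
    value = count_leaves(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]

# Same leaves in both trees; only the order of children differs (assumed values).
good_order = [[6, 7, 9], [2, 4, 5], [1, 8, 3]]   # best subtree and best replies first
bad_order = [[8, 3, 1], [5, 4, 2], [9, 7, 6]]    # best subtree and best replies last

print(leaves_visited(good_order))  # (6, 5)
print(leaves_visited(bad_order))   # (6, 9)
```

Both orderings give the root the same value, but the favorable ordering evaluates 5 of the 9 leaves while the unfavorable one evaluates all 9.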
More Information
For more information, please see Chapter 6,
Section 2 of Artificial Intelligence: A Modern Approach (Second Edition)
by Stuart Russell and Peter Norvig, Prentice Hall,
2002, or slides 18 through 22 of the class lectures from day 6: http://www.cs.utah.edu/~hal/courses/2009S_AI/cs5300-day06-adversarial-search.pdf