Kalman Filters
Overview
This is a very brief introduction to Kalman Filters. Kalman Filters are the continuous-space equivalent of Markov Chains. Both have the same two steps: applying transitions between time steps and applying evidence. Markov Chains have discrete states, so specific probabilities can be assigned in a transition table and an evidence table. The Kalman Filter handles continuous probabilities by working with Gaussian probability curves. So while a Markov Chain moves from one state to another using discrete probabilities, the Kalman Filter shifts and scales a Gaussian distribution of probabilities in continuous space.
Gaussian Distribution
Since Kalman Filters deal with Gaussians, you have to understand some basic concepts about them. A Gaussian distribution can be characterized with two numbers: the average value (the mean) and the standard deviation. The average value sets the location of the curve, and the standard deviation sets how wide and flat it is.
Gaussians can also be multidimensional. In 2 dimensions, for example, a Gaussian can have different standard deviations in x and y.
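The 1D equation is:
$$N(x, \mu, \sigma) = \alpha\, e^{-\frac{1}{2} \frac{(x-\mu)^2}{\sigma^2}}$$
where x is the location at which to evaluate the probability density, μ is the average location, σ is the standard deviation, and α is a normalizing constant that makes the total probability equal to 1. The multidimensional equation is formed with matrices and is:
$$N(x, \mu, \Sigma) = \alpha\, e^{-\frac{1}{2} (x-\mu)'\, \Sigma^{-1} (x-\mu)}$$
Here x and μ are vectors, Σ is the covariance matrix, and ' denotes the transpose.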
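As a rough illustration, the two densities can be evaluated in a few lines of Python. This is only a sketch: the function names `gaussian_1d` and `gaussian_2d` are made up for this example, the multidimensional case is restricted to 2 dimensions so that the matrix inverse can be written out by hand, and α is filled in with the standard normalizing constants.

```python
import math

def gaussian_1d(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    alpha = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # normalizing constant
    return alpha * math.exp(-0.5 * (x - mu) ** 2 / sigma ** 2)

def gaussian_2d(x, mu, cov):
    """Density of a 2D Gaussian; cov is a 2x2 covariance matrix (nested sequences)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Quadratic form (x - mu)' * inv(cov) * (x - mu).
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    alpha = 1.0 / (2.0 * math.pi * math.sqrt(det))  # normalizing constant in 2D
    return alpha * math.exp(-0.5 * q)

if __name__ == "__main__":
    # Peak of a standard 1D Gaussian.
    print(gaussian_1d(0.0, 0.0, 1.0))
    # 2D Gaussian with standard deviation 2 in x and 3 in y (no correlation).
    print(gaussian_2d((1.0, 2.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 9.0))))
```

When the covariance matrix is diagonal, as in the second call, the 2D density is simply the product of two 1D densities, one per axis.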
How it Works
In a Kalman Filter, just as with Markov Chains, there are two parts:
- Applying Transitions
- Applying Evidence
A transition for both Kalman Filters and Markov Chains uses the probability P(X(t+1)|X(t)). The evidence is applied using P(E(t)|X(t)). For Kalman Filters these probabilities are represented using Gaussians, so the transition has an associated μ and σ, and so does the evidence. Physically, this means that when you change states you have a location where you believe you are (μ) and some uncertainty in that estimate (σ). The same holds for observations: each observation gives a value μ, and the uncertainty of the measurement is σ.
As an example, suppose there is a robot we believe to be at x=0, but we could be wrong by σ=1, so we say μ_x=0, σ_x=1. When we apply the transition we must find a new μ_x and σ_x. The transition model has an uncertainty associated with how the state changes, σ_T. The sensors tell us where we should be and how sure the sensor readings are: μ_z, σ_z.
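One way to make the two models concrete in one dimension is to write each as a Gaussian. The form below is an assumption, not something stated above: it supposes that the robot is expected to stay where it is between steps (so the transition only adds uncertainty) and that the sensor measures the position directly.

$$P(x_{t+1} \mid x_t) = \alpha\, e^{-\frac{1}{2} \frac{(x_{t+1} - x_t)^2}{\sigma_T^2}}, \qquad P(z_t \mid x_t) = \alpha\, e^{-\frac{1}{2} \frac{(z_t - x_t)^2}{\sigma_z^2}}$$

Under that assumption, applying the transition to a belief with variance σ_x^2 leaves the mean unchanged and widens the variance to σ_x^2 + σ_T^2, and the evidence step then narrows it again.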
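We will jump right to the solution for the new estimate, μ_xnew and σ_xnew:
$$\mu_{x,\mathrm{new}} = \frac{(\sigma_x^2 + \sigma_T^2)\,\mu_{z,\mathrm{new}} + \sigma_z^2\,\mu_x}{\sigma_x^2 + \sigma_T^2 + \sigma_z^2}$$
$$\sigma_{x,\mathrm{new}}^2 = \frac{(\sigma_x^2 + \sigma_T^2)\,\sigma_z^2}{\sigma_x^2 + \sigma_T^2 + \sigma_z^2}$$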
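The 1D solution can be sketched in Python by splitting it into the two parts named earlier, a transition step followed by an evidence step. The function names are invented for this sketch, and the values chosen for σ_T, μ_z and σ_z in the usage lines are made-up numbers; only μ_x=0 and σ_x=1 come from the robot example.

```python
import math

def predict_1d(mu_x, var_x, var_t):
    """Transition step: the mean is kept and the variance grows by the transition variance."""
    return mu_x, var_x + var_t

def update_1d(mu_pred, var_pred, mu_z, var_z):
    """Evidence step: blend the predicted Gaussian with the measurement Gaussian."""
    mu_new = (var_pred * mu_z + var_z * mu_pred) / (var_pred + var_z)
    var_new = (var_pred * var_z) / (var_pred + var_z)
    return mu_new, var_new

def kalman_step_1d(mu_x, sigma_x, sigma_t, mu_z, sigma_z):
    """One full step; takes and returns standard deviations, not variances."""
    mu_pred, var_pred = predict_1d(mu_x, sigma_x ** 2, sigma_t ** 2)
    mu_new, var_new = update_1d(mu_pred, var_pred, mu_z, sigma_z ** 2)
    return mu_new, math.sqrt(var_new)

if __name__ == "__main__":
    # Robot believed to be at 0 with sigma 1; hypothetical sigma_T = 1,
    # hypothetical sensor reading 3 with sigma_z = 1.
    mu_new, sigma_new = kalman_step_1d(0.0, 1.0, 1.0, 3.0, 1.0)
    print(mu_new, sigma_new ** 2)
```

With these numbers the new mean is 2.0 and the new variance is 2/3: the estimate moves two thirds of the way toward the sensor reading, and the result is more certain than either the prediction (variance 2) or the sensor (variance 1) alone.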
If we want the general case for multiple dimensions we get the equations:
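$$\mu_{x,\mathrm{new}} = F\mu_x + K_{\mathrm{new}}\,(\mu_{z,\mathrm{new}} - HF\mu_x)$$
$$\Sigma_{x,\mathrm{new}} = (I - K_{\mathrm{new}}H)(F\Sigma_xF' + \Sigma_T)$$
where
$$K_{\mathrm{new}} = (F\Sigma_xF' + \Sigma_T)\,H'\,\left(H(F\Sigma_xF' + \Sigma_T)H' + \Sigma_z\right)^{-1}$$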
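A minimal NumPy sketch of one multidimensional step is shown below. The function name `kalman_step`, the intermediate names `P` and `S`, and the constant-velocity model in the usage section (state = position and velocity, sensor = position only) are all assumptions made for illustration, as are the measurement values.

```python
import numpy as np

def kalman_step(mu_x, sigma_x, z, F, H, sigma_t, sigma_z):
    """One transition + evidence step for the general (matrix) case."""
    mu_pred = F @ mu_x                    # predicted state, F * mu_x
    P = F @ sigma_x @ F.T + sigma_t       # predicted covariance, F * Sigma_x * F' + Sigma_T
    S = H @ P @ H.T + sigma_z             # covariance of the expected measurement
    K = P @ H.T @ np.linalg.inv(S)        # gain
    mu_new = mu_pred + K @ (z - H @ mu_pred)
    sigma_new = (np.eye(len(mu_x)) - K @ H) @ P
    return mu_new, sigma_new, K

if __name__ == "__main__":
    # Hypothetical model: state is [position, velocity], time step of 1.
    F = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    H = np.array([[1.0, 0.0]])            # the sensor reads position only
    sigma_t = 0.01 * np.eye(2)            # made-up transition noise
    sigma_z = np.array([[0.5]])           # made-up sensor noise

    mu = np.array([0.0, 1.0])             # start at 0, moving at 1 per step
    sigma = np.eye(2)
    for reading in (1.1, 1.9, 3.2):       # made-up position readings
        mu, sigma, K = kalman_step(mu, sigma, np.array([reading]), F, H, sigma_t, sigma_z)
        print(mu, np.diag(sigma))
```

Using `np.linalg.inv` mirrors the equations directly; for larger or badly conditioned problems, solving the linear system instead of forming the inverse is usually the safer choice.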
The x subscripts represent properties of the state, the T subscripts represent properties of the transition, and the z subscripts represent properties of the observations. F controls how transitions occur, so Fμ_x is the predicted new state given where we think we are. H controls how observations occur, so HFμ_x gives a value that can be compared directly to the sensor reading μ_znew. In other words, if we were at μ_xnew, the sensor would ideally give us the reading μ_znew = Hμ_xnew. K_new, usually called the Kalman gain, acts as a trust factor that weighs whether our model or our sensors are more accurate. Simplifying the notation, our new belief about where we are is:
Xnew = Xpredicted + K*(Measurement - ExpectedMeasurement)
where Xpredicted is the old estimate after the transition has been applied (F*Xold) and ExpectedMeasurement is the reading we would expect at Xpredicted (H*Xpredicted).
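To see the trust factor at work, here is a small 1D sketch in which F and H are both taken to be 1, so the gain reduces to a ratio of variances. The function names and the noise values are assumptions chosen for this example.

```python
def gain_1d(sigma_x, sigma_t, sigma_z):
    """1D gain: predicted variance divided by predicted variance plus sensor variance."""
    var_pred = sigma_x ** 2 + sigma_t ** 2
    return var_pred / (var_pred + sigma_z ** 2)

def correct_1d(x_pred, k, measurement, expected_measurement):
    """Shift the predicted estimate toward the measurement by a fraction k of the mismatch."""
    return x_pred + k * (measurement - expected_measurement)

if __name__ == "__main__":
    x_pred = 0.0          # estimate before the correction
    measurement = 3.0     # made-up sensor reading
    for sigma_z in (0.1, 1.0, 10.0):
        k = gain_1d(1.0, 1.0, sigma_z)
        # With H = 1 the expected measurement equals the predicted estimate.
        print(sigma_z, k, correct_1d(x_pred, k, measurement, x_pred))
```

A precise sensor (small σ_z) pushes K toward 1, so the new estimate lands almost on the measurement. A noisy sensor (large σ_z) pushes K toward 0, so the estimate stays close to what the model predicted.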
[Figure: a plot of our expected probabilities of where a robot moving in a straight line is]
References
Artificial Intelligence: A Modern Approach, Second Edition, Chapter 15, Section 4, pp. 551-559.
University of Utah Class Lecture Slides cs5300-Kalman Filters