Talk:MLRG/fall08



29 Aug

Hal

This is just an example of a question?
And another example?
Hal's alter-ego: The answer is blah.

Seth

What is the general difference between the MCMC methods and the variational methods presented in this paper?
(Piyush): On a high level, both MCMC and variational methods are approximate inference techniques whose goal is to approximate some target probability distribution. The difference lies in the kind of approximation they provide. MCMC gives a sampling-based stochastic approximation (i.e., the result can vary across runs), whereas variational inference gives an optimization-based deterministic approximation (i.e., the same across runs). The important thing to remember, however, is that in the limit of infinite computational resources MCMC can produce exact samples (i.e., if the chain is run long enough), so the approximation arises only from the finite limit on resources. Variational methods, on the other hand, solve an optimization problem: minimizing the difference between the true distribution and a candidate distribution. Exact optimization is computationally intractable, so we usually solve a relaxed problem, thereby yielding an approximate solution.
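To make the stochastic-vs-deterministic contrast concrete, here is a minimal toy sketch (not from the paper; the target density p(x) proportional to exp(-x^4/4), the sampler settings, and all names are illustrative). The Metropolis estimate of E[x^2] changes with the random seed, while the variational answer, obtained by minimizing KL(q||p) over Gaussians q = N(0, s2), is the same closed-form number on every run:
<pre>
import numpy as np

def log_p(x):
    # Unnormalized log-density of the toy target p(x) ~ exp(-x**4 / 4).
    return -x ** 4 / 4.0

def mcmc_estimate(n_samples=50000, step=1.0, seed=0):
    # Random-walk Metropolis: a *stochastic* approximation of E_p[x^2].
    # Different seeds (i.e. different runs) give different estimates.
    rng = np.random.default_rng(seed)
    x, total = 0.0, 0.0
    for _ in range(n_samples):
        proposal = x + step * rng.standard_normal()
        if np.log(rng.random()) < log_p(proposal) - log_p(x):
            x = proposal  # accept the proposed move
        total += x ** 2
    return total / n_samples

def variational_estimate():
    # Fit q = N(0, s2) by minimizing KL(q || p).  Since E_q[x^4] = 3*s2^2,
    # the KL reduces (up to constants) to -log(s2)/2 + 3*s2**2/4, whose
    # minimizer is s2 = 1/sqrt(3): a *deterministic*, run-independent answer.
    return 1.0 / np.sqrt(3.0)

print(mcmc_estimate(seed=1), mcmc_estimate(seed=2))  # close but not equal
print(variational_estimate())                        # always 0.5773...
</pre>
Note also that the Gaussian family cannot represent this target exactly, so the variational answer stays biased even with unlimited computation, whereas the MCMC estimate would converge to the true value; this mirrors the point above about where each method's approximation comes from.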
On page 3 it mentions a compatibility function <math>\psi_C : \mathcal{X}^n \rightarrow \mathbb{R}_{+}</math> when describing the factorization of a probability distribution in an undirected graphical model. What does this function represent?
(Piyush): The compatibility function for each clique can be thought of as an (unnormalized) probability distribution over the configurations of the nodes in that clique. A high value of the compatibility function for a certain configuration of the nodes indicates that such a configuration is highly likely.
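For concreteness (a standard illustration, not specific to this page): the undirected model's distribution factorizes over the set of cliques <math>\mathcal{C}</math> as
<math>p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(x_C), \qquad Z = \sum_{x \in \mathcal{X}^n} \prod_{C \in \mathcal{C}} \psi_C(x_C),</math>
and in a pairwise Ising model with <math>x_s \in \{-1,+1\}</math> the edge compatibility functions are <math>\psi_{st}(x_s, x_t) = \exp(\theta_{st} x_s x_t)</math>, so for <math>\theta_{st} > 0</math> configurations in which neighbors agree receive a higher score, i.e. are more likely.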

Suresh

Why is the undirected graphical model formulated in terms of cliques instead of in terms of neighbors, as the directed model is?
(Piyush): In the parent-child relationships of directed graphical models, each term in the factorized distribution represents the probability of a node given all its parents. In undirected graphical models, however, we don't have such parent-child relationships. Thus we define the distribution in terms of a set of cliques, with each clique's compatibility function scoring how likely a particular configuration of the nodes in that clique is (see the side-by-side factorizations below).
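Side by side (standard forms, consistent with the notation above): a directed model factorizes as
<math>p(x) = \prod_{i=1}^{n} p(x_i \mid x_{\pi(i)}),</math>
where <math>\pi(i)</math> denotes the parents of node <math>i</math> and each factor is itself a normalized conditional distribution, while an undirected model factorizes as
<math>p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(x_C),</math>
where the <math>\psi_C</math> are arbitrary nonnegative functions, which is why a global normalization constant <math>Z</math> is needed.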