MLRG/fall10
From ResearchWiki
Revision as of 22:49, 24 September 2010
Semisupervised and Active Learning
Fri 2:00-3:20pm
MEB 3105
Synopsis
Supervised learning algorithms usually require a substantial amount of labeled data to learn a reliable model. Since large quantities of labeled data can be expensive and/or difficult to obtain, much effort in machine learning has been devoted to finding ways to learn from a limited amount of labeled data. There are many ways of doing this. Two very important paradigms we will look at in this seminar are (1) semi-supervised learning, which augments a small amount of available labeled data with a large amount of additional unlabeled data (which is usually very easy to obtain), and (2) active learning, which judiciously selects the most informative/useful examples to be labeled and given to a supervised learning algorithm. In this seminar, we will look at some representative papers from both paradigms. Since it will not be possible to cover all the important papers in a single seminar, additional papers will be listed under the suggested readings for those interested.
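To make the two paradigms concrete, here is a minimal Python sketch on a toy one-dimensional problem. It is not taken from any of the seminar papers. The classifier (a threshold halfway between the two class means), the data, and all function names are assumptions chosen purely for illustration. Self-training stands in for semi-supervised learning, and closest-to-boundary querying stands in for pool-based active learning.

```python
def midpoint_threshold(labeled):
    """Fit a 1-D threshold halfway between the two class means."""
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2.0


def predict(threshold, x):
    """Predict class 1 for points strictly above the threshold."""
    return 1 if x > threshold else 0


def self_train(labeled, unlabeled, rounds=10):
    """Semi-supervised sketch: refit on labeled data plus pseudo-labels."""
    threshold = midpoint_threshold(labeled)
    for _ in range(rounds):
        # Label every unlabeled point with the current model's prediction.
        pseudo = [(x, predict(threshold, x)) for x in unlabeled]
        new_threshold = midpoint_threshold(labeled + pseudo)
        if new_threshold == threshold:  # pseudo-labels have stabilized
            break
        threshold = new_threshold
    return threshold


def active_learn(labeled, pool, oracle, budget):
    """Active sketch: query the pool point closest to the current boundary."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(budget):
        threshold = midpoint_threshold(labeled)
        x = min(pool, key=lambda p: abs(p - threshold))  # most uncertain point
        pool.remove(x)
        labeled.append((x, oracle(x)))  # pay for exactly one true label
    return midpoint_threshold(labeled), labeled


def accuracy(threshold, xs, oracle):
    """Fraction of points in xs whose prediction matches the oracle."""
    return sum(predict(threshold, x) == oracle(x) for x in xs) / len(xs)


# Hypothetical toy data: a wide negative cluster and a narrow positive one.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 8.0, 9.0, 10.0]


def oracle(x):
    """Stand-in for a human annotator (assumed true labeling rule)."""
    return 1 if x >= 8.0 else 0


seed = [(0.0, 0), (8.0, 1)]  # the only labels available up front
unlabeled = [x for x in xs if x not in (0.0, 8.0)]

supervised = midpoint_threshold(seed)
semi = self_train(seed, unlabeled)
active, queried = active_learn(seed, unlabeled, oracle, budget=2)

for name, t in [("supervised", supervised), ("self-training", semi), ("active", active)]:
    print(name, t, accuracy(t, xs, oracle))
```

On this toy data the two seed labels alone place the threshold below x = 5, which is then misclassified. Self-training moves the threshold using only the unlabeled points, while the active learner fixes it by asking the oracle for two additional labels. Real methods, including those in the papers on the schedule, are considerably more careful about when such updates can be trusted.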
Participants
- Piyush Rai, PhD Student, School of Computing
- Suresh Venkat, Asst. Prof, School of Computing
- Ruihong Huang, PhD Student, School of Computing
- Lalindra De Silva, PhD Student, School of Computing
- Sandeep P, MS Student, School of Computing
- Neal Richter, PhD (Montana State U), Working at the Rubicon Project
Schedule
(subject to change; * means the session will probably need to be rescheduled)
| Date | Topic | Outline and Paper(s) | Presenter |
|---|---|---|---|
| Sep 3 | Outline, Motivation | Seminar logistics. Introduction to semisupervised learning and active learning | Piyush |
| Semisupervised Learning | | | |
| Sep 10 | Bootstrapping/weak-supervision | Combining Labeled and Unlabeled Data with Co-Training (for some theoretical results, also see PAC Generalization Bounds for Co-training) | Piyush |
| Sep 17 | Low density regions and the cluster assumption for SSL | Semi-Supervised Classification by Low Density Separation (also see section 5 of the SSL survey for other methods and further references) | Ruihong |
| Sep 24 | Imposing function smoothness: Graph based SSL | Manifold Regularization for Semi-supervised Learning (also see section 6 of the SSL survey for other methods and further references) | Sandeep |
| Oct 1 | Probabilistic approaches: Expectation Maximization for SSL | Semi-Supervised Text Classification Using EM | |
| Oct 8 | Using unlabeled data to learn predictive functional structures | A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data | |
| Oct 22 | Semi-supervised Learning for Ranking | Learning to Rank with Partially-Labeled Data | |
| Oct 29 | Semi-supervised Learning Theory | A PAC-style Model for Learning from Labeled and Unlabeled Data | |
| Nov 5 | Semi-unsupervised Learning (Clustering/Dimensionality Reduction) | Integrating constraints and metric learning in semi-supervised clustering, Semi-Supervised Dimensionality Reduction | |
| Active Learning | | | |
| Nov 12 | Pool-based active learning, Query by committee, Query by uncertainty | Support Vector Machine Active Learning with Applications to Text Classification, Active Learning survey (sections 3.1 and 3.2) | |
| Nov 19 | Stream-based active learning | Worst-Case Analysis of Selective Sampling for Linear Classification | |
| *Nov 26 | Dealing with sampling bias and using cluster-structure for active learning | Hierarchical Sampling for Active Learning | |
| Dec 3 | Semi-supervised learning and active learning | Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions | |
| Dec 10 | Multiview active learning | Active Learning with Multiple Views | |
Suggested Readings
Some survey papers:
- Semisupervised Learning Literature Survey
- Learning with Labeled and Unlabeled Data
- Active Learning Literature Survey
Other papers:
- On Co-Training: Co-Training and Expansion: Towards Bridging Theory and Practice, Bayesian Co-training, A New Analysis of Co-Training
This list will be updated with more papers.