MLRG/fall10

Revision as of 21:53, 26 August 2010

Semisupervised and Active Learning

Fri 2:00-3:20pm

MEB 3105


Synopsis

Supervised learning algorithms usually require a substantial amount of labeled data to learn a reliable model. Since obtaining large quantities of labeled data can be expensive and/or difficult, much effort in machine learning has been devoted to methods that learn from a limited amount of labeled data. Two important paradigms we will study in this seminar are (1) semi-supervised learning, which augments a small amount of available labeled data with a large amount of additional unlabeled data (usually much easier to obtain), and (2) active learning, which judiciously selects the most informative/useful examples to be labeled and given to a supervised learning algorithm. We will look at representative papers from both paradigms. Since it will not be possible to cover all the important papers in a single seminar, additional papers for interested readers will be listed under Suggested Readings.
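To make the second paradigm concrete, here is a minimal sketch of pool-based active learning with uncertainty sampling: the learner repeatedly trains on its labeled set and queries the unlabeled pool point it is least confident about. The toy 1-D logistic model, data, and learning-rate settings below are illustrative assumptions, not taken from any of the papers on the schedule.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(labeled):
    """Fit a 1-D logistic model p(y=1|x) = sigmoid(w*x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(500):
        gw = gb = 0.0
        for x, y in labeled:
            p = sigmoid(w * x + b)
            gw += (p - y) * x
            gb += (p - y)
        w -= 0.1 * gw
        b -= 0.1 * gb
    return w, b

def most_uncertain(pool, w, b):
    """Pick the pool point whose predicted probability is closest to 0.5."""
    return min(pool, key=lambda x: abs(sigmoid(w * x + b) - 0.5))

# Toy 1-D problem: the true label is 1 iff x > 0.
labeled = [(-2.0, 0), (2.0, 1)]        # small seed set
pool = [-1.5, -0.4, 0.1, 0.6, 1.8]     # unlabeled pool

for _ in range(3):                     # three query rounds
    w, b = train(labeled)
    x = most_uncertain(pool, w, b)
    pool.remove(x)
    labeled.append((x, 1 if x > 0 else 0))  # oracle labels the queried point

w, b = train(labeled)
```

Note that the queries concentrate near the decision boundary (the first point queried is 0.1, the pool point closest to x = 0), which is exactly the "most informative examples" intuition described above; semi-supervised learning instead exploits the unlabeled points without ever querying their labels.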

Participants

Schedule

(subject to change; * indicates a date that will probably need to be rescheduled)

{|  border="1" style="width: 100%; text-align:left" class="content"
|-
! Date !! Topic !! Outline and Paper(s) !! Presenter
|-
| Sep 3 || Outline, Motivation || Seminar logistics. Brief introduction to semisupervised learning (section 1 - FAQ - of this survey) and active learning (section 1 of this survey) || Piyush
|-
! colspan="4" | Semisupervised Learning
|-
| Sep 10 || Bootstrapping/weak-supervision || Combining Labeled and Unlabeled Data with Co-Training ||
|-
| Sep 17 || Cluster assumption for SSL || Semi-Supervised Classification by Low Density Separation ||
|-
| Sep 24 || Imposing function smoothness: Graph-based SSL || A geometric framework for learning from labeled and unlabeled examples ||
|-
| Oct 1 || Probabilistic approaches: Expectation Maximization for SSL || Semi-Supervised Text Classification Using EM ||
|-
| Oct 8 || Using unlabeled data to create supervised learning problems || A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data ||
|-
| Oct 22 || Harnessing unlabeled test data: Application to ranking || Learning to Rank with Partially-Labeled Data ||
|-
| Oct 29 || Hybrid models for SSL || Principled Hybrids of Generative and Discriminative Models ||
|-
| Nov 5 || Semi-unsupervised Learning (Clustering/Dimensionality Reduction) || Integrating constraints and metric learning in semi-supervised clustering, Semi-Supervised Dimensionality Reduction ||
|-
! colspan="4" | Active Learning
|-
| Nov 12 || Pool-based active learning, Query by committee, Query by uncertainty || Support Vector Machine Active Learning with Applications to Text Classification, Active Learning survey (sections 3.1 and 3.2) ||
|-
| Nov 19 || Stream-based active learning || [http://www.jmlr.org/papers/volume7/cesa-bianchi06b/cesa-bianchi06b.pdf Worst-Case Analysis of Selective Sampling for Linear Classification] ||
|-
| *Nov 26 || Dealing with sampling bias and using cluster structure for active learning || [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.149.5701&rep=rep1&type=pdf Hierarchical Sampling for Active Learning] ||
|-
| Dec 3 || Semi-supervised learning and active learning || [http://pages.cs.wisc.edu/~jerryzhu/pub/zglactive.pdf Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions] ||
|-
| Dec 10 || Multiview active learning || Active Learning with Multiple Views ||
|}

Suggested Readings

Will be updated with more papers.
