DR+Clustering

From ResearchWiki


Revision as of 06:51, 5 October 2012

CS 6150: Graduate Algorithms Project

High dimensions are weird.

A mathematician and his best friend, an engineer, attend a public lecture on geometry in thirteen-dimensional space.

"How did you like it?" the mathematician wants to know after the talk.

"My head's spinning", the engineer confesses. "How can you develop any intuition for thirteen-dimensional space?"

"Well, it's not even difficult. All I do is visualize the situation in arbitrary N-dimensional space and then set N = 13."


And clustering is hard.

Although Amit Daniely, Nati Linial, and Michael Saks say it's only hard when it does not matter!

Goal

Understand the impact of dimensionality reduction methods on clustering. Try to uncover the relationship, if one exists, between a dimensionality reduction method and a clustering technique of your choice.
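As a starting point, here is a minimal sketch of the kind of pipeline the goal describes: reduce the dimension, then cluster. Gaussian random projection and Lloyd's k-means are picked here only for illustration; the project leaves both choices to you. The function names, the toy two-blob data, and every parameter value below are assumptions, not part of the assignment.

```python
import math
import random


def random_projection(points, target_dim, seed=0):
    """Project points to target_dim dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    # One random direction per target dimension.
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(w * x for w, x in zip(row, p)) for row in matrix]
            for p in points]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers are k distinct input points
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_blobs(n_per_cluster, dim, separation, seed=0):
    """Toy data (hypothetical): two Gaussian blobs shifted apart along every axis."""
    rng = random.Random(seed)
    points, truth = [], []
    for label in (0, 1):
        for _ in range(n_per_cluster):
            points.append([label * separation + rng.gauss(0.0, 1.0)
                           for _ in range(dim)])
            truth.append(label)
    return points, truth


points, truth = make_blobs(n_per_cluster=20, dim=30, separation=5.0)
reduced = random_projection(points, target_dim=5)
labels = kmeans(reduced, k=2)
print(len(reduced[0]), "dimensions after projection")
print("cluster sizes:", [labels.count(c) for c in range(2)])
```

Swapping `random_projection` for another reduction method, or `kmeans` for another clustering technique, while holding the data fixed is one way to start probing how the two interact.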

Data

1. MNIST Digits data on Sam Roweis's data page

2. Gisette on UCI repository

3. Olivetti Faces (http://www.cs.nyu.edu/~roweis/data/olivettifaces.mat) on Sam Roweis's data page (http://www.cs.nyu.edu/~roweis/data.html)

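4. Dorothea (http://archive.ics.uci.edu/ml/machine-learning-databases/dorothea/) on UCI repository (http://archive.ics.uci.edu/ml/datasets/Dorothea)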

5. A fifth dataset will only be revealed later to add spice to the contest.

Leader Board

Data | # Data points | # Dimensions | Team Name | # Target Dimensions | Dimensionality Reduction Method | Clustering Technique | Rand Index | NMI | Accuracy
MNIST | | | | | | | | |
Gisette | | | | | | | | |
Olivetti Faces | | | | | | | | |
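
For reference, two of the leader board scores can be computed from a list of true class labels and a list of predicted cluster labels, as in the sketch below. It assumes the plain (unadjusted) Rand Index and an NMI normalized by the arithmetic mean of the two entropies; the page does not say which normalization the contest uses, so treat that choice as an assumption. Accuracy is left out because the page does not say how clusters are matched to classes.

```python
import math
from collections import Counter


def rand_index(truth, predicted):
    """Fraction of point pairs on which the two labelings agree."""
    n = len(truth)
    pairs = math.comb(n, 2)
    # Pairs placed together by both labelings.
    both = sum(math.comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    same_truth = sum(math.comb(c, 2) for c in Counter(truth).values())
    same_pred = sum(math.comb(c, 2) for c in Counter(predicted).values())
    # Pairs placed apart by both labelings (inclusion-exclusion).
    neither = pairs - same_truth - same_pred + both
    return (both + neither) / pairs


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(truth, predicted):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(truth)
    count_t = Counter(truth)
    count_p = Counter(predicted)
    mutual = 0.0
    for (t, p), c in Counter(zip(truth, predicted)).items():
        mutual += (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
    mean_entropy = (entropy(truth) + entropy(predicted)) / 2
    # Both labelings are a single cluster: treat them as identical.
    return 1.0 if mean_entropy == 0 else mutual / mean_entropy


truth = [0, 0, 0, 1, 1, 1]
predicted = [1, 1, 0, 0, 0, 0]
print("Rand Index:", rand_index(truth, predicted))
print("NMI:", normalized_mutual_information(truth, predicted))
```

Both scores ignore the names of the clusters, so relabeling cluster 0 as 1 and vice versa leaves them unchanged.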