Algorithms Seminar/Fall10

From ResearchWiki

(Difference between revisions)
Jump to: navigation, search
(Schedule)
m (Schedule)
 
(70 intermediate revisions not shown)
Line 7: Line 7:
==Synopsis==
==Synopsis==
-
We will cover many recent developments in the modeling and processing of uncertain data in computer science.   
+
We will cover many recent developments in the modeling and processing of uncertain data in computer science.  The seminar will focus and the modeling, algorithmic, and data-structural foundations needed for understanding, processing, and visualizing uncertain data.   
First we will overview different models of uncertainty, tracing their origins and motivation (Many-world models in databases, imprecision models in geometry, probabilistic models in databases).
First we will overview different models of uncertainty, tracing their origins and motivation (Many-world models in databases, imprecision models in geometry, probabilistic models in databases).
-
Then we will proceed to survey recent developments in computational geometry, databases, and machine learning and statistics.
+
Then we will proceed to survey recent developments in computational geometry, databases, visualization, and machine learning and statistics.
-
Highlights in computational geometry will describe algorithmic analysis to answer basic geometry problems, either with respect to worst case bounds or by formally approximating the distributions.
+
* The computational geometry section will describe basic algorithmic tools and analysis several models of uncertain data.
-
Work in databases has focused on data structures to quickly retrieve summary queries of large data sets with respect to the underlying data uncertainty.
+
* The databases section will focus on data structures to manage and process large amounts of uncertain data concisely.
-
This topic has been studied for longer in machine learning and statistics so we will summarize classic techniques and see some recent developments.   
+
* The visualization section will look at models for data uncertainty for surfaces and volumes, and how the resulting data is visualized.
-
Through this seminar participants will gain an understanding of the state-of-the-art in modeling and processing uncertain data and will be exposed to several important open problems and exciting research directions.
+
* The machine learning and statistics section will cover classic techniques for learning and detecting uncertainty.   
 +
Through this seminar, participants will gain an understanding of the state-of-the-art in modeling and processing uncertain data and will be exposed to several important open problems and exciting research directions.
 +
 
 +
== Participants ==
 +
* [http://www.cs.utah.edu/~jeffp Jeff Phillips], CI Postdoctoral Fellow, School of Computing
 +
* [http://www.cs.utah.edu/~suresh Suresh Venkatasubramanian], Assistant Professor, School of Computing
 +
* [http://www.cs.utah.edu/~praman Parasaran Raman], PhD Student, School of Computing
 +
* [http://www.cs.utah.edu/~moeller John Moeller], PhD Student, School of Computing
 +
* [mailto:avishek@cs.utah.edu Avishek Saha], PhD Student, School of Computing
 +
* [http://www.cs.utah.edu/~seth Seth Juarez], PhD Student, School of Computing
 +
* [http://www.sci.utah.edu/~kpotter Kristi Potter], Post-Doc, SCI
 +
* [http://www.sci.utah.edu/~jlevine Josh Levine], Post-Doc, SCI
 +
* [http://www.cs.utah.edu/~piyush Piyush Rai], PhD Student, School of Computing
 +
* [http://www.sci.utah.edu/~hbhatia Harsh Bhatia], PhD Student, School of Computing
 +
* [http://www.cs.utah.edu/~bowang/ Bo Wang], PhD Student, School of Computing
==Schedule==  
==Schedule==  
Line 20: Line 34:
{|  border="1" style="width: 100%; text-align:left" class="content"
{|  border="1" style="width: 100%; text-align:left" class="content"
|-
|-
-
! Date || Topic || Paper(s) || Presenter  
+
! Date || Topic || Outline and Paper(s) || Presenter  
|-
|-
-
| Aug 25 || Models of Data Uncertainty || || [http://www.cs.utah.edu/~jeffp Jeff Phillips]  
+
| Aug 25 || Models of Data Uncertainty || Motivation behind modeling Uncertainty.  Survey different models and where they arise.  || [http://www.cs.utah.edu/~jeffp Jeff Phillips]  
|-
|-
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Geometry'''  
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Geometry'''  
|-
|-
-
| Sep 1 || Range Spaces and epsilon-Samples || [??? others][http://valis.cs.uiuc.edu/~sariel/teach/notes/aprx/lec/05_vc_dim.pdf Sariel's Notes] ||
+
| Sep 1 || Range Spaces and epsilon-Samples || How well does random sampling work? Define important geometric-combinatorial notions:  Range spaces, eps-nets, eps-samples.  Prove random sampling bound for eps-samples ([http://valis.cs.uiuc.edu/~sariel/teach/notes/aprx/lec/05_vc_dim.pdf Sariel's Notes] read 5.1 and 5.3.1 and 5.3.2 and the "naive proof").  Better deterministic and [http://arXiv.org/abs/0801.2793 from continuous to discrete] (also a different way of defining concepts).  || [http://www.cs.utah.edu/~moeller John Moeller]
|-
|-
-
| Sep 8 || Epsilon-Quantizations and Epsilon-SIPs || [http://www.cs.utah.edu/~jeffp/papers/uncertaintyESA09.pdf Shape Fitting on Point Sets with Probability Distributions] ||
+
| Sep 8 || Epsilon-Quantizations and Epsilon-SIPs || Modeling questions on uncertain data with probability distributions.  Review specific applications to smallest enclosing ball. ([http://www.cs.utah.edu/~jeffp/papers/uncertaintyESA09.pdf Shape Fitting on Point Sets with Probability Distributions], [http://www.cs.utah.edu/~jeffp/online.pdf Jeff's Slides]) || [http://www.cs.utah.edu/~seth Seth Juarez]
|-
|-
-
| Sep 15 || Geometry on Imprecise Points || [http://www.ics.uci.edu/~mloffler/publications/Basic_Geometric_Measures_on_Imprecise_Points.html Basic Geometry Measures on Imprecise Points] ||  
+
| Sep 15 || Geometry on Imprecise Points || Imprecision in data.  eps-geometry for rounding errors ([http://www.springerlink.com/content/517vg0158073g787/ eps-Geometry]).  More general geometry problems: [http://www.ics.uci.edu/~mloffler/publications/Basic_Geometric_Measures_on_Imprecise_Points.html Basic Geometry Measures on Imprecise Points] and maybe [http://www.ics.uci.edu/~mloffler/publications/Largest_and_Smallest_Convex_Hulls_for_Imprecise_Points.html Largest and Smallest Convex Hulls on Imprecise Points] || Parasaran
|-
|-
-
| Sep 22 || Imprecise Convex Hulls || [http://www.ics.uci.edu/~mloffler/publications/Largest_and_Smallest_Convex_Hulls_for_Imprecise_Points.html Largest and Smallest Convex Hulls on Imprecise Points] ||  
+
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Databases'''
|-
|-
-
| Sep 29 || Hardness of Uncertainty Problems || [http://www.cs.washington.edu/homes/suciu/vldbj-probdb.pdf Efficient Query Evaluation on Probabilistic Databases][see Jeff for preprint] ||  
+
| Sep 22 || Sketching and Streaming || How to maintain a concise summary of uncertain data on-the-fly.  [http://dimacs.rutgers.edu/~graham/pubs/papers/probstreams.pdf Sketching probabilistic data streams] || [http://www.cs.utah.edu/~moeller John Moeller]
|-
|-
-
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Databases'''
+
| Sep 29 || Beyond Random Sampling || Can we do better than random sampling?  Theoretically (high level overview): [http://www.cs.princeton.edu/~chazelle/pubs/OptimizationFixedDim.pdf deterministic eps-samples] and for [http://arXiv.org/abs/0801.2793 continuous domains].  Practically (using insights from theory): low discrepancy sets i.e. [http://www.cs.ucsb.edu/~suri/psdir/elephants.pdf eps-Sentinels] and [http://photon.poly.edu/~hbr/publi/ease-kdd03/kdd.03.pdf EASE].  If time histogram results on uncertain probability distributions.  [http://dimacs.rutgers.edu/~graham/pubs/html/CormodeGarofalakis09.html Histograms and wavelets on probabilistic data]  [http://www.cs.umass.edu/~mcgregor/papers/09-vldb.pdf Probabilistic Histograms for Probabilistic Data] || Avishek
|-
|-
-
| Oct 6 || Histograms || [http://dimacs.rutgers.edu/~graham/pubs/html/CormodeGarofalakis09.html Histograms and wavelets on probabilistic data]  [http://www.cs.umass.edu/~mcgregor/papers/09-vldb.pdf Probabilistic Histograms for Probabilistic Data] ||  
+
| Oct 6 || Ranking || Retrieving the top (most important) data points, when all values are uncertain.  Focus on [http://dimacs.rutgers.edu/~graham/pubs/papers/exprank.pdf Expected Ranks]. Note there are more general frameworks [http://arxiv.org/abs/0904.1366 A Unified Approach to Ranking in Probabilistic Databases] with [http://www.cs.umd.edu/~lijian/paper/vldb09_long.pdf slides]. || Shantharaju
|-
|-
-
| Oct 20 || Ranking || [http://arxiv.org/abs/0904.1366 A Unified Approach to Ranking in Probabilistic Databases] with [http://www.cs.umd.edu/~lijian/paper/vldb09_long.pdf slides] and [http://dimacs.rutgers.edu/~graham/pubs/papers/exprank.pdf Semantics of ranking queries for probabilistic data and expected ranks] ||  
+
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Clustering'''
|-
|-
-
| Oct 27 || Clustering ||[http://dimacs.rutgers.edu/~graham/pubs/papers/pclust.pdf Approximation Algorithms for Clustering Uncertain Data] ||  
+
| Oct 20 || Extracting Clusters || Extracting clusters when some data points are the noise.  [http://hal.inria.fr/inria-00389390/en/ Clustering Noisy Data with Persistence] and current work. || [http://www.sci.utah.edu/~jlevine Josh Levine]
|-
|-
-
| Nov 3 || Sketching and Streaming || [http://dimacs.rutgers.edu/~graham/pubs/papers/probstreams.pdf Sketching probabilistic data streams] ||  
+
| Nov 22 (10:45 am)  || Finding Clusters in Uncertain Data || Building clusters on full (but uncertain) data sets.  [http://dimacs.rutgers.edu/~graham/pubs/papers/pclust.pdf Approximation Algorithms for Clustering Uncertain Data] (postponed from Oct 27) || [http://www.cs.utah.edu/~praman Parasaran Raman]
|-
|-
-
| Nov 10 || Indexing /  Range Searching || [http://www.cse.ust.hk/~yike/pods09-urange.pdf Indexing Uncertain Data] with [http://www.cse.ust.hk/~yike/urange-shorttalk.pdf slides]||  
+
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Visualization''' (see page on [http://www.sci.utah.edu/~kpotter/library/uncertainVis/index.html Uncertainty Visualization])
 +
|-
 +
| Nov 3 || Uncertainty Visualization || State of the Art ||[http://www.sci.utah.edu/~kpotter Kristi Potter]
 +
|-
 +
| Nov 10 || Uncertain Vector Fields || Visualizing Uncertainty in Vector Fields. I will be discussing (a) [http://isgwww.cs.uni-magdeburg.de/~germer/publications/uncertainTopo.pdf Uncertain 2D Vector Field Topology], (b) my current project. || [http://www.sci.utah.edu/~hbhatia Harsh Bhatia]
|-
|-
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Machine Learning And Statistics'''  
| colspan="4" bgcolor="#dddddd" style = "text-align:center" | '''Machine Learning And Statistics'''  
|-
|-
-
| Nov 17 || Particle Filters || ||  
+
| Nov 17 || Support Vector Machines || Classifying uncertain data. [http://books.nips.cc/papers/files/nips17/NIPS2004_0195.pdf Support Vector Classification with Input Data Uncertainty], [http://arxiv.org/pdf/0803.3490 Robustness and Regularization of Support Vector Machines] || [http://www.cs.utah.edu/~piyush Piyush Rai]
|-
|-
-
| Nov 24 || Support Vector Machines || [http://books.nips.cc/papers/files/nips17/NIPS2004_0195.pdf Support Vector Classification with Input Data Uncertainty] ||  
+
| Nov 24 || Multi-Armed Bandits || Trading off investigation of uncertainty and rewards.  [http://en.wikipedia.org/wiki/Multi-armed_bandit Multi-armed Bandits] via the [http://en.wikipedia.org/wiki/Gittins_index Gittens Index Strategy].  The [http://arxiv.org/pdf/0711.3861 BalancedIndex for Restless Multi-armed Bandits].  (Another older reference survey [http://eecs.umich.edu/~teneket/pubs/MAB-Survey.pdf MAB-survey].) || Avishek
|-
|-
-
| Dec 1 || Markov Random Fields || ||  
+
| Dec 1 || Particle Filters || Maintaining uncertain estimates. [http://faculty.chicagobooth.edu/nicholas.polson/teaching/41900/pf-intro.pdf Particle Filtering Introduction], [http://www.cs.sfu.ca/~mori/courses/cmpt882/fall05/slides/particle_filtering.pdf Particle Filters (another short note)] [http://www.cs.washington.edu/ai/Mobile_Robotics/mcl/animations/global-floor.gif Particle Filter Animation] and [http://www.cs.duke.edu/courses/fall06/cps270/pf.pdf helpful slides]|| [http://www.cs.utah.edu/~piyush Piyush Rai]
|-
|-
-
| Dec 8 || Multi-Armed Bandits || ||  
+
| Dec 8 || Markov Random Fields || Harnessing Locality.  [https://engineering.purdue.edu/~bouman/publications/tutorials/mrf_tutorial/view.pdf MRF tutorial].  An application to image tracking [http://mplab.ucsd.edu/wp-content/uploads/CVPR2008/Conference/data/papers/222.pdf Optical Flow Estimation with Uncertainties through Dynamic MRFs]|| [http://www.cs.utah.edu/~bowang/ Bo Wang]
|}
|}
-
 
-
==Participants==
 
==Readings==
==Readings==

Latest revision as of 08:47, 1 December 2010

Modelling Data With Uncertainty

Wed 1:25-2:45pm

MEB 3147 (LCR)

Contents

Synopsis

We will cover many recent developments in the modeling and processing of uncertain data in computer science. The seminar will focus and the modeling, algorithmic, and data-structural foundations needed for understanding, processing, and visualizing uncertain data. First we will overview different models of uncertainty, tracing their origins and motivation (Many-world models in databases, imprecision models in geometry, probabilistic models in databases). Then we will proceed to survey recent developments in computational geometry, databases, visualization, and machine learning and statistics.

  • The computational geometry section will describe basic algorithmic tools and analysis several models of uncertain data.
  • The databases section will focus on data structures to manage and process large amounts of uncertain data concisely.
  • The visualization section will look at models for data uncertainty for surfaces and volumes, and how the resulting data is visualized.
  • The machine learning and statistics section will cover classic techniques for learning and detecting uncertainty.

Through this seminar, participants will gain an understanding of the state-of-the-art in modeling and processing uncertain data and will be exposed to several important open problems and exciting research directions.

Participants

Schedule

(subject to change)

Date Topic Outline and Paper(s) Presenter
Aug 25 Models of Data Uncertainty Motivation behind modeling Uncertainty. Survey different models and where they arise. Jeff Phillips
Geometry
Sep 1 Range Spaces and epsilon-Samples How well does random sampling work? Define important geometric-combinatorial notions: Range spaces, eps-nets, eps-samples. Prove random sampling bound for eps-samples (Sariel's Notes read 5.1 and 5.3.1 and 5.3.2 and the "naive proof"). Better deterministic and from continuous to discrete (also a different way of defining concepts). John Moeller
Sep 8 Epsilon-Quantizations and Epsilon-SIPs Modeling questions on uncertain data with probability distributions. Review specific applications to smallest enclosing ball. (Shape Fitting on Point Sets with Probability Distributions, Jeff's Slides) Seth Juarez
Sep 15 Geometry on Imprecise Points Imprecision in data. eps-geometry for rounding errors (eps-Geometry). More general geometry problems: Basic Geometry Measures on Imprecise Points and maybe Largest and Smallest Convex Hulls on Imprecise Points Parasaran
Databases
Sep 22 Sketching and Streaming How to maintain a concise summary of uncertain data on-the-fly. Sketching probabilistic data streams John Moeller
Sep 29 Beyond Random Sampling Can we do better than random sampling? Theoretically (high level overview): deterministic eps-samples and for continuous domains. Practically (using insights from theory): low discrepancy sets i.e. eps-Sentinels and EASE. If time histogram results on uncertain probability distributions. Histograms and wavelets on probabilistic data Probabilistic Histograms for Probabilistic Data Avishek
Oct 6 Ranking Retrieving the top (most important) data points, when all values are uncertain. Focus on Expected Ranks. Note there are more general frameworks A Unified Approach to Ranking in Probabilistic Databases with slides. Shantharaju
Clustering
Oct 20 Extracting Clusters Extracting clusters when some data points are the noise. Clustering Noisy Data with Persistence and current work. Josh Levine
Nov 22 (10:45 am) Finding Clusters in Uncertain Data Building clusters on full (but uncertain) data sets. Approximation Algorithms for Clustering Uncertain Data (postponed from Oct 27) Parasaran Raman
Visualization (see page on Uncertainty Visualization)
Nov 3 Uncertainty Visualization State of the Art Kristi Potter
Nov 10 Uncertain Vector Fields Visualizing Uncertainty in Vector Fields. I will be discussing (a) Uncertain 2D Vector Field Topology, (b) my current project. Harsh Bhatia
Machine Learning And Statistics
Nov 17 Support Vector Machines Classifying uncertain data. Support Vector Classification with Input Data Uncertainty, Robustness and Regularization of Support Vector Machines Piyush Rai
Nov 24 Multi-Armed Bandits Trading off investigation of uncertainty and rewards. Multi-armed Bandits via the Gittens Index Strategy. The BalancedIndex for Restless Multi-armed Bandits. (Another older reference survey MAB-survey.) Avishek
Dec 1 Particle Filters Maintaining uncertain estimates. Particle Filtering Introduction, Particle Filters (another short note) Particle Filter Animation and helpful slides Piyush Rai
Dec 8 Markov Random Fields Harnessing Locality. MRF tutorial. An application to image tracking Optical Flow Estimation with Uncertainties through Dynamic MRFs Bo Wang

Readings

Personal tools