Modelling Data With Uncertainty
MEB 3147 (LCR)
We will cover many recent developments in the modeling and processing of uncertain data in computer science. The seminar will focus and the modeling, algorithmic, and data-structural foundations needed for understanding, processing, and visualizing uncertain data. First we will overview different models of uncertainty, tracing their origins and motivation (Many-world models in databases, imprecision models in geometry, probabilistic models in databases). Then we will proceed to survey recent developments in computational geometry, databases, visualization, and machine learning and statistics.
- The computational geometry section will describe basic algorithmic tools and analysis several models of uncertain data.
- The databases section will focus on data structures to manage and process large amounts of uncertain data concisely.
- The visualization section will look at models for data uncertainty for surfaces and volumes, and how the resulting data is visualized.
- The machine learning and statistics section will cover classic techniques for learning and detecting uncertainty.
Through this seminar, participants will gain an understanding of the state-of-the-art in modeling and processing uncertain data and will be exposed to several important open problems and exciting research directions.
- Jeff Phillips, CI Postdoctoral Fellow, School of Computing
- Suresh Venkatasubramanian, Assistant Professor, School of Computing
- Parasaran Raman, PhD Student, School of Computing
- John Moeller, PhD Student, School of Computing
- Avishek Saha, PhD Student, School of Computing
- Seth Juarez, PhD Student, School of Computing
- Kristi Potter, Post-Doc, SCI
- Josh Levine, Post-Doc, SCI
- Piyush Rai, PhD Student, School of Computing
- Harsh Bhatia, PhD Student, School of Computing
- Bo Wang, PhD Student, School of Computing
(subject to change)
|Date||Topic||Outline and Paper(s)||Presenter|
|Aug 25||Models of Data Uncertainty||Motivation behind modeling Uncertainty. Survey different models and where they arise.||Jeff Phillips|
|Sep 1||Range Spaces and epsilon-Samples||How well does random sampling work? Define important geometric-combinatorial notions: Range spaces, eps-nets, eps-samples. Prove random sampling bound for eps-samples (Sariel's Notes read 5.1 and 5.3.1 and 5.3.2 and the "naive proof"). Better deterministic and from continuous to discrete (also a different way of defining concepts).||John Moeller|
|Sep 8||Epsilon-Quantizations and Epsilon-SIPs||Modeling questions on uncertain data with probability distributions. Review specific applications to smallest enclosing ball. (Shape Fitting on Point Sets with Probability Distributions, Jeff's Slides)||Seth Juarez|
|Sep 15||Geometry on Imprecise Points||Imprecision in data. eps-geometry for rounding errors (eps-Geometry). More general geometry problems: Basic Geometry Measures on Imprecise Points and maybe Largest and Smallest Convex Hulls on Imprecise Points||Parasaran|
|Sep 22||Sketching and Streaming||How to maintain a concise summary of uncertain data on-the-fly. Sketching probabilistic data streams||John Moeller|
|Sep 29||Beyond Random Sampling||Can we do better than random sampling? Theoretically (high level overview): deterministic eps-samples and for continuous domains. Practically (using insights from theory): low discrepancy sets i.e. eps-Sentinels and EASE. If time histogram results on uncertain probability distributions. Histograms and wavelets on probabilistic data Probabilistic Histograms for Probabilistic Data||Avishek|
|Oct 6||Ranking||Retrieving the top (most important) data points, when all values are uncertain. Focus on Expected Ranks. Note there are more general frameworks A Unified Approach to Ranking in Probabilistic Databases with slides.||Shantharaju|
|Oct 20||Extracting Clusters||Extracting clusters when some data points are the noise. Clustering Noisy Data with Persistence and current work.||Josh Levine|
|Nov 22 (10:45 am)||Finding Clusters in Uncertain Data||Building clusters on full (but uncertain) data sets. Approximation Algorithms for Clustering Uncertain Data (postponed from Oct 27)||Parasaran Raman|
|Visualization (see page on Uncertainty Visualization)|
|Nov 3||Uncertainty Visualization||State of the Art||Kristi Potter|
|Nov 10||Uncertain Vector Fields||Visualizing Uncertainty in Vector Fields. I will be discussing (a) Uncertain 2D Vector Field Topology, (b) my current project.||Harsh Bhatia|
|Machine Learning And Statistics|
|Nov 17||Support Vector Machines||Classifying uncertain data. Support Vector Classification with Input Data Uncertainty, Robustness and Regularization of Support Vector Machines||Piyush Rai|
|Nov 24||Multi-Armed Bandits||Trading off investigation of uncertainty and rewards. Multi-armed Bandits via the Gittens Index Strategy. The BalancedIndex for Restless Multi-armed Bandits. (Another older reference survey MAB-survey.)||Avishek|
|Dec 1||Particle Filters||Maintaining uncertain estimates. Particle Filtering Introduction, Particle Filters (another short note) Particle Filter Animation and helpful slides||Piyush Rai|
|Dec 8||Markov Random Fields||Harnessing Locality. MRF tutorial. An application to image tracking Optical Flow Estimation with Uncertainties through Dynamic MRFs||Bo Wang|