3rd Workshop on Geometry and Machine Learning
Monday, June 11, 2018 | Budapest, Hungary
Organized by Jinhui Xu, Anne Driemel, and Jeff Phillips.
Machine learning concerns techniques that can learn from and make predictions on data. Algorithms for these tasks are built to uncover useful patterns in data, which can often be stated in geometric terms (e.g., as problems in a high-dimensional feature space). Hence computational geometry can play a crucial and natural role in machine learning, especially since geometric algorithms often come with quality guarantees when dealing with high-dimensional data. The computational geometry community has many researchers with unique expertise in the relevant geometric tasks, expertise that could have a great impact on machine learning and other data-related fields.

This workshop is intended to provide a forum for those working in computational geometry, machine learning, and the associated theoretical and algorithmic challenges, and to promote interaction between these fields. To this end, the workshop will consist of several invited talks that serve as tutorials on specific applications of geometric algorithms in machine learning. We hope this interaction will stimulate researchers in both fields, and that the resulting synergy will give rise to many new and interesting geometric problems, concepts, and ideas, opening up new vistas for the computational geometry community.


This workshop is being held as part of CG Week 2018 (June 11-14, 2018 in Budapest, Hungary) which also includes the International Symposium on Computational Geometry (SoCG).

It will be held in room GS (Gólyavár, smaller room).

Invited Speakers
2:30-3:15 Bei Wang (University of Utah) Stratification Learning with Computational Topology: Overview, Challenges, and Opportunities
3:15-4:00 Brittany Fasy (Montana State University) Topological Data Analysis for Roads and Histology
4:30-5:15 Melanie Schmidt (University of Bonn) Practical Theory for Geometric Center Based Clustering
5:15-6:00 Ioannis Z. Emiris (NKU Athens, and ATHENA Research Center) Randomized Projections for Geometric Search in High Dimension


Schedule
2:30-3:15
Stratification Learning with Computational Topology: Overview, Challenges, and Opportunities
Bei Wang
University of Utah
A fundamental question in the study of high-dimensional data is as follows: Given high-dimensional point cloud samples, how can we infer the structures of the underlying data? In manifold learning, we assume the data is supported by a low-dimensional space with a manifold structure. However, such an assumption may be too restrictive in practice when we are given point cloud samples not of a manifold but of a stratified space, which contain singularities and mixed dimensionality.

Stratification learning can be considered as an unsupervised, exploratory, clustering process that infers a decomposition of data sampled from a stratified space into disjoint subsets that capture recognizable and meaningful structural information. In recent years, there have been significant efforts in computational topology that are relevant to stratification learning. In this talk, I will give an overview of these efforts and discuss challenges and opportunities. In particular, I will focus on stratification learning using local homology, persistent homology, sheaves, and discrete stratified Morse theory.
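
As a small illustration of the persistent-homology ingredient mentioned in the abstract (not the speaker's code), the following sketch computes persistence diagrams of a noisy circle sample; it assumes the Python packages numpy and ripser are installed.

# Minimal persistent homology sketch (assumes numpy and ripser; illustrative only).
import numpy as np
from ripser import ripser

# Sample a noisy circle: a point cloud whose one loop persistent homology should recover.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.column_stack([np.cos(theta), np.sin(theta)])
points += 0.05 * rng.standard_normal(points.shape)

# Persistence diagrams in dimensions 0 and 1 from the Vietoris-Rips filtration.
diagrams = ripser(points, maxdim=1)["dgms"]
h1 = diagrams[1]

# The most persistent 1-dimensional feature should correspond to the circle.
longest = max(h1, key=lambda bd: bd[1] - bd[0])
print("most persistent H1 feature (birth, death):", longest)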

Bio: Dr. Bei Wang is an assistant professor at the School of Computing, and a faculty member at the Scientific Computing and Imaging (SCI) Institute, University of Utah. She received her Ph.D. in Computer Science from Duke University. Her research interests include topological data analysis, data visualization, machine learning and data mining.

3:15-4:00
Topological Data Analysis for Roads and Histology
Brittany Fasy
Montana State University
Topological (and geometric) descriptors of data have the potential to be rich sources of information for analysis and comparison. While we aim for methods that are parameter-free in theory, in practice there are tuning parameters and decisions to make among the various topological and geometric descriptors. We motivate this talk with two problems: (1) road network comparison; and (2) prostate cancer biopsy analysis. In both problems, we see that connection with domain experts is important for motivating and advancing the theory.

Bio: Dr. Brittany Terese Fasy is an assistant professor in the School of Computing, and affiliated with the Department of Mathematical Sciences at Montana State University. She received her Ph.D. in Computer Science from Duke University in 2012, and did two post-docs: at Carnegie Mellon University and at Tulane University. Her research is in computational topology.

4:00-4:30 : Coffee Break

4:30-5:15
Practical Theory for Geometric Center Based Clustering
Melanie Schmidt
University of Bonn
Clustering is an important unsupervised learning task. In this talk we discuss geometric center-based clustering problems like facility location, k-median, and k-means. These are well studied in their standard form, yet our focus lies on extending them to more practical scenarios. Firstly, we are interested in adding constraints to the standard formulations of these problems. While capacitated clustering and clustering with outliers have received considerable attention, constraints like fairness (where points have colors and clusters should be balanced) and anonymity (where centers have a lower bound on the number of assigned points) are less studied. Secondly, we review results on big data clustering, in particular in data streams. It is possible to reduce the number of points in a data set (by so-called coresets) and the dimension of the data set (by dimensionality reduction) to quantities that depend polynomially only on k and the precision of the summary. We conclude by posing some open questions that arise when combining constraints and the big data scenario.
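
As a toy illustration of the dimension-reduction idea mentioned above (a hedged sketch with assumed parameters, not the talk's construction), the following code projects high-dimensional points with a random Gaussian map and compares the k-means cost before and after; it assumes numpy and scikit-learn.

# Sketch: random projection roughly preserves k-means cost (assumes numpy and scikit-learn).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(1)
k, n, d, m = 5, 1000, 500, 20   # clusters, points, ambient dimension, reduced dimension
centers = 3 * rng.standard_normal((k, d))
X = centers[rng.integers(0, k, n)] + rng.standard_normal((n, d))

# k-means cost in the original 500-dimensional space.
cost_full = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_

# Project to a dimension independent of d; squared distances are preserved in expectation.
Y = GaussianRandomProjection(n_components=m, random_state=0).fit_transform(X)
cost_proj = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Y).inertia_

print(f"cost in R^{d}: {cost_full:.1f}   cost after projection to R^{m}: {cost_proj:.1f}")

The two reported costs should be of comparable magnitude, which is the informal content of the dimensionality-reduction results the abstract refers to.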

Bio: Melanie Schmidt received her PhD from TU Dortmund in Germany. After visiting Carnegie Mellon University in Pittsburgh, USA, for a year, she is now a postdoc in the theoretical computer science group at the University of Bonn. Her work focuses on clustering problems.

5:15-6:00
Randomized Projections for Geometric Search in High Dimension
Ioannis Z. Emiris
NKU Athens, and ATHENA Research Center
Several successful algorithms for approximate geometric search, typically relying on tree-based data structures, exhibit complexity exponential in the ambient dimension, a manifestation of the curse of dimensionality. This has been addressed by Locality Sensitive Hashing and randomized hash-tables. We follow a different tack by employing randomized projections, which are simpler to define and implement, but have long been overlooked since they did not seem powerful enough. We manage to show that they achieve competitive construction time and query time complexities as well as smaller space consumption for the fundamental question of approximate nearest-neighbor search in Euclidean space. We extend our approach to more general topologies, including all LSH-able metrics, as well as products of Euclidean metrics, which generalize the discrete Fréchet and dynamic time warping distances. Our work is motivated by H2020 project LAMBDA which focuses on expressive, yet long, vector descriptors of complex shapes, and on manipulating one-dimensional objects such as trajectories or time-series.
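
As a rough, hedged illustration of the randomized-projection idea for approximate nearest-neighbor search (a toy sketch with assumed parameters, not the method analyzed in the talk), the following numpy-only code projects points to a low dimension, shortlists candidates by projected distance, and re-ranks them in the original space.

# Toy approximate nearest-neighbor search via one random projection (numpy only).
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 10000, 256, 16          # points, ambient dimension, projected dimension
data = rng.standard_normal((n, d))
query = rng.standard_normal(d)

# Random Gaussian projection scaled so squared norms are preserved in expectation.
G = rng.standard_normal((d, m)) / np.sqrt(m)
proj_data = data @ G
proj_query = query @ G

# Shortlist the closest points in the projected space ...
proj_dists = np.linalg.norm(proj_data - proj_query, axis=1)
candidates = np.argsort(proj_dists)[:50]

# ... then rank the shortlist by true distance in the original space.
true_dists = np.linalg.norm(data[candidates] - query, axis=1)
approx_nn = candidates[np.argmin(true_dists)]

exact_nn = np.argmin(np.linalg.norm(data - query, axis=1))
print("approximate NN:", approx_nn, "exact NN:", exact_nn)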

Bio: Ioannis Emiris received his PhD in 1994 from the University of California at Berkeley. Since 2007 he has been a Professor of Theoretical Computer Science at the University of Athens, where he heads the Laboratory of Algebraic and Geometric Algorithms. The lab focuses on problems in scientific and geometric computing, in particular computational algebra, the study and solution of polynomial equations, as well as computational, convex, and metric geometry.

The previous versions of this workshop were:
  • Workshop on Geometry and Machine Learning
  • 2nd Workshop on Geometry and Machine Learning