School of Computing UofU calendar UofU index UofU directory Map About Salt Lake SoC Calendar University of Utah University of Utah
Research Buffet

Wednesday, September 21, 2011
3147 MEB
Refreshments 3:40 p.m.
Lecture 4:00 p.m.


Feifei Li
Assistant Professor

Title: Efficient Parallel kNN Joins Using MapReduce

Abstract
Abs: In data mining applications and spatial and multimedia databases, a useful tool is the kNN join, which is to produce the k nearest neighbors (NN), from a dataset P, of every point in a dataset R. Since it involves both the join and the NN search, performing kNN joins efficiently is a challenging task. Meanwhile, applications continue to witness a quick (exponential in some cases) increase in the amount of data to be processed. A popular model nowadays for large-scale data processing is the shared-nothing cluster on a number of commodity machines using MapReduce [6]. Hence, how to execute kNN joins efficiently on large data that are stored in a MapReduce cluster is an intriguing problem that meets many practical needs. This work proposes novel (exact and approximate) algorithms in MapReduce to perform efficient parallel kNN joins on large data. We demonstrate our ideas using Hadoop, an open source MapReduce framework. Extensive experiments using large real datasets, at least tens of millions of records, have convincingly demonstrated the efficiency, effectiveness, and the scalability of our methods.



Return to 2011 Events Calendar


School of Computing • 50 S. Central Campus Dr. Rm. 3190 • Salt Lake City, UT 84112
801-581-8224 • Fax: 801-581-5843 • Send comments to webmaster@cs.utah.edu
Disclaimer