Refreshments 10:50 a.m.
Abstract
How can we automatically build knowledge repositories that contain not only high-level information such as that found in Wikipedia, but also particular facts such as "Who appeared in a concert at the Hollywood Bowl last night?" This is a challenging problem that remains unsolved even though many have worked on it. In this talk, I will present novel algorithms for information gathering, sifting and organization that can rapidly, accurately and completely cover any area of interest by mining unstructured text on the Web. I will describe a semi-supervised bootstrapping procedure, guided by graph-based algorithms, that scans billions of Web documents in order to automatically harvest and taxonomize thousands to millions of new arguments, supertypes and semantic relations. Finally, I will show that the algorithm enriches existing knowledge repositories such as Yago.
Bio
Zornitsa Kozareva is a Research Scientist in the Natural Language group at the Information Sciences Institute, University of Southern California (USC/ISI). She received her PhD cum laude from the University of Alicante, Spain. Her research interests lie in Web-based knowledge acquisition, text mining, lexical semantics, ontology population and multilingual information extraction. In 2010, Zornitsa co-organized the SemEval challenge on Multi-Way Classification of Semantic Relations Between Pairs of Nominals [nlp.csie.ncnu.edu.tw]. She also co-organized the CCIACADA/VACCINE Reconnect Conference. She led the team that won the answer validation challenge (AVE-2006) for French and Italian, and was a member of the team that won the Spanish Geographic Information Retrieval (GeoClef-2006) challenge.