The variety and complexity of potentially related data resources available for querying (webpages, databases, data warehouses) have been growing ever more rapidly. With this comes an increasing need to pose integrative queries across multiple such sources, exploiting foreign keys and other means of interlinking data to merge information from diverse sources. This has traditionally been the focus of research within the Information Extraction (IE) and Information Integration (II) communities, with IE focusing on converting unstructured sources into structured ones, and II focusing on providing a unified view of diverse structured data sources. However, most current IE and II methods that could be applied to integration across sources require large amounts of human supervision, often in the form of annotated data. This need for extensive supervision makes existing methods expensive to deploy and difficult to maintain. To address this challenge, in this talk I shall present an overview of my research into graph-based weakly-supervised methods for IE and II.

Joint work with Koby Crammer, Sudipto Guha, Zack Ives, Marie Jacob, Marius Pasca, Fernando Pereira, and Joseph Reisinger.
