Nate Chambers, Department of Computer Science, US Naval Academy
Given at Heidelberg University, Germany, Oct 2015.
This tutorial introduces and reviews current research on learning event schemas from text with minimal human supervision. Several lines of work over the past 5-10 years have focused on learning broader event schemas from text. Schemas are a generalized representation of the events and entities that make up a typical scenario in the world. For instance, atomic events like sneeze, take your temperature, visit a doctor, and fill a prescription are all inherently related to a broader 'illness scenario' that humans naturally understand. Doctors, medicine, and symptoms are all entities that fill particular roles in the series of events. These representations were famously called scripts in the 1970s, and recent research has focused on new computational methods for learning them. Many methods have been proposed, and they differ in both their event representations and their learning approaches.
The tutorial has four main goals: (1) generally expose the audience to script/schema research in recent years, including both formal and informal probabilistic models, (2) focus in depth on generative models for the unsupervised learning task, (3) give a practical overview of working code for one such generative model, and (4) review the diverse set of evaluations of schema knowledge.
Day 1: Motivation and Background
Day 1: Clustering Models
Day 2: Generative Models
Day 3: Code Workshop
Also available is a one-page highlight of recent work with schema representations.
Code for the entity-based generative model is publicly available on my GitHub probschemas page.
See the Code Workshop slides above for an overview of the code, a high-level description of how it works, and a practical programming challenge. The code itself includes a "workshop" Java file with code redacted. You are encouraged to try to complete the code for a better understanding of how the Gibbs sampler works.
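For readers who have not seen a Gibbs sampler before, the following is a minimal sketch of collapsed Gibbs sampling on a toy mixture model. It is not the probschemas code and does not reproduce the entity-driven model. The class name GibbsSketch, the method sample, and all parameters are hypothetical names chosen for illustration. Each observed word (think of an entity's head word) is assigned to one latent slot, and each assignment is resampled in turn, conditioned on the counts from all other assignments.

```java
import java.util.Random;

public class GibbsSketch {
    /**
     * Toy collapsed Gibbs sampler (illustrative only, not the probschemas model).
     * words[i] is a word id in [0, vocabSize); the result gives a slot id in
     * [0, numSlots) for every word. alpha and beta are symmetric Dirichlet
     * smoothing parameters for the slot and word distributions.
     */
    public static int[] sample(int[] words, int numSlots, int vocabSize,
                               double alpha, double beta,
                               int iterations, long seed) {
        Random rng = new Random(seed);
        int[] z = new int[words.length];            // current slot of each word
        int[] slotCount = new int[numSlots];        // words assigned to each slot
        int[][] slotWordCount = new int[numSlots][vocabSize];

        // Start from a random assignment and record its counts.
        for (int i = 0; i < words.length; i++) {
            z[i] = rng.nextInt(numSlots);
            slotCount[z[i]]++;
            slotWordCount[z[i]][words[i]]++;
        }

        double[] weights = new double[numSlots];
        for (int iter = 0; iter < iterations; iter++) {
            for (int i = 0; i < words.length; i++) {
                int w = words[i];

                // Remove word i from the counts so we condition on the rest.
                slotCount[z[i]]--;
                slotWordCount[z[i]][w]--;

                // Unnormalized P(slot = k | all other assignments, word w).
                double total = 0.0;
                for (int k = 0; k < numSlots; k++) {
                    weights[k] = (slotCount[k] + alpha)
                            * (slotWordCount[k][w] + beta)
                            / (slotCount[k] + vocabSize * beta);
                    total += weights[k];
                }

                // Draw a new slot in proportion to the weights.
                double u = rng.nextDouble() * total;
                int newSlot = numSlots - 1;         // fallback for rounding error
                double cumulative = 0.0;
                for (int k = 0; k < numSlots; k++) {
                    cumulative += weights[k];
                    if (u < cumulative) {
                        newSlot = k;
                        break;
                    }
                }

                // Add word i back under its new slot.
                z[i] = newSlot;
                slotCount[newSlot]++;
                slotWordCount[newSlot][w]++;
            }
        }
        return z;
    }

    public static void main(String[] args) {
        int[] words = {0, 0, 1, 1, 0, 1, 2, 2};     // toy word ids
        int[] slots = sample(words, 2, 3, 0.1, 0.1, 100, 42L);
        for (int i = 0; i < words.length; i++) {
            System.out.println("word " + words[i] + " -> slot " + slots[i]);
        }
    }
}
```

The remove, reweight, resample, restore loop is the core pattern. A full schema model would condition on more than a single word per entity, but the bookkeeping of decrementing counts before sampling and incrementing them afterward stays the same.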
Niranjan Balasubramanian, Stephen Soderland, Mausam, and Oren Etzioni. Generating Coherent Event Schemas at Scale. EMNLP 2013.
David Bamman, Brendan O'Connor, Noah Smith. Learning Latent Personas of Film Characters. ACL 2013.
Nathanael Chambers. Event Schema Induction with a Probabilistic Entity-Driven Model. EMNLP 2013.
Nathanael Chambers and Dan Jurafsky. Template-Based Information Extraction without the Templates. ACL 2011.
Harr Chen, Edward Benson, Tahira Naseem, and Regina Barzilay. In-domain Relation Discovery with Meta-constraints via Posterior Regularization. ACL 2011.
Jackie Cheung, Hoifung Poon, Lucy Vanderwende. Probabilistic Frame Induction. ACL 2013.
Bram Jans, Ivan Vulic, and Marie Francine Moens. Skip N-grams and Ranking Functions for Predicting Script Events. EACL 2012.
Kiem-Hieu Nguyen, Xavier Tannier, Olivier Ferret and Romaric Besançon. Generative Event Schema Induction with Entity Disambiguation. ACL 2015.
Karl Pichotta and Raymond J. Mooney. Statistical Script Learning with Multi-Argument Events. EACL 2014.
Michaela Regneri, Alexander Koller, Manfred Pinkal. Learning Script Knowledge with Web Experiments.
Rachel Rudinger, Pushpendre Rastogi, Francis Ferraro, Benjamin Van Durme. Script Induction as Language Modeling. EMNLP 2015.