The Brandeis CL Seminar Series hosts Jibo:
Roberto Pieraccini, Head of Conversational Technologies, Jibo Inc.
Friday, December 1, 3pm, Volen 101
Jibo is a robot that understands speech and sees. He has a moving body that complements his verbal communication and expresses his emotions, and cameras and microphones to make sense of the world around him. He detects where sounds come from and can track and recognize people’s faces. He has a display to show images, an eye that follows you, and touch sensors. With this array of technologies, Jibo encompasses the ultimate human-machine interface.
In this talk we will give an overview of the technological complexity we took on when, more than 4 years ago, we started the journey of building the first consumer social robot. We will describe some of the solutions we adopted and give a demo of the product that started shipping a few weeks ago. We will conclude with a discussion of the future challenges for short- and long-term research.
ABOUT THE SPEAKER:
Roberto Pieraccini, a scientist, technologist, and the author of “The Voice in the Machine” (MIT Press, 2012), has been at the forefront of speech, language, and machine learning innovation for more than 30 years. He is widely known as a pioneer in the fields of statistical natural language understanding and machine learning for automatic dialog systems, and their practical application to industrial solutions. As a researcher he worked at CSELT (Italy), Bell Laboratories, AT&T Labs, and IBM T.J. Watson. He led the dialog technology team at SpeechWorks International, was the CTO of SpeechCycle, and was the CEO of the International Computer Science Institute (ICSI) in Berkeley. He now leads the Conversational Technologies team at Jibo. http://robertopieraccini.com
Professor of Language and Artificial Intelligence
Thursday Oct. 26 at 3:30
Quantification is ubiquitous in natural language: it occurs in every sentence. It occurs whenever a predicate P is applied to a set S of objects, where it gives rise to such questions as (1) To how many members of S is P applied? (2) Is P applied to individual members of S, or to S as a whole, or to certain subsets of S? (3) What is the size of S? (4) How is S determined by lexical, syntactic and contextual information? Moreover, if P is applied to combinations of members from different sets, issues of relative scope arise.
Quantification is a complex phenomenon, both from a semantic point of view and because of the complexity of the relation between the syntax and the semantics of quantification, and has been studied extensively by logicians, linguists, and computational semanticists. Nowadays it is generally agreed that quantifier expressions in natural language are noun phrases, which is why quantification arises in every sentence.
The International Organization for Standardization (ISO) has in recent years started to develop annotation schemes for semantic phenomena, both in support of linguistic research in semantics and for building semantically more advanced NLP systems. The ISO-TimeML scheme (ISO 24617-1), based on Pustejovsky’s TimeML, was the first ISO standard established in this area; others concern the annotation of dialogue acts, discourse relations, semantic roles, and spatial information. Quantification is currently being considered as a next candidate for an ISO standard annotation scheme. In this talk I will discuss some of the issues involved in developing such an annotation scheme, including the definition of an abstract syntax of the annotations, of concrete XML representations, and of the semantics of the annotations.
Harry Bunt is professor of Linguistics and Computer Science at Tilburg University, The Netherlands. Before that he worked at Philips Research Labs. He studied physics and mathematics at the University of Utrecht and obtained a doctorate (cum laude) in Linguistics at the University of Amsterdam. His main areas of interest are computational semantics and pragmatics, especially in relation to (spoken) dialogue. He developed a framework for dialogue analysis called Dynamic Interpretation Theory, which has been the basis of an international standard for dialogue annotation (ISO 24617-2).
Lexicography from Scratch: Quantifying meaning descriptions with feature engineering
Friday, October 20 at 3pm
When computational linguistics wishes to engage with the meaning of words, it asks the experts: lexicographers, who analyze evidence of usage and then record judgments in dictionaries, in the form of definitions. A definition is a finely-wrought piece of natural language, whose nuances are as elusive to computational processes as any other unstructured data. Computational linguists nevertheless squeeze as much utility as they can out of dictionaries of every stripe, from Webster’s 1913 to Wordnet. None of these resources had computational analysis of lexical meaning in mind when they were conceived or created. Despite the immense human cognitive effort that went into making them, most lexical resources constrain their computational users to a few simplistic lookup tasks.
If a lexical resource is designed, from its origins, to serve all the diverse human and computational applications for which dictionaries have been repurposed in the digital era, it might yield significant improvements both theoretically and practically. But who wants to make a dictionary from scratch? The theme of the 2017 Electronic Lexicography conference (Leiden, September 19-21: http://elex.link/elex2017/) was “Lexicography From Scratch”. This talk assembles a number of isolated recent innovations in lexicographical practice (often corpus-driven retrofits onto existing dictionary data) and attempts to map out a lexicographical process that would connect them all.
Such a process would yield meaning descriptions that are quantified, linked to corpus data, decomposable into individual semantic factors, and conducive to insightful comparison of lexicalized concepts in pairs and in groups. We describe a cluster-analysis framework that shows promise for automating the fussier parts of this process by reducing the cognitive load on the lexical analyst. If aspects of lexical analysis can be automated through feature engineering, we may produce computational models of lexical meaning that are more useful for NLP tasks and more maintainable by lexicographers.
Bio: Orion Montoya graduated from the Brandeis CL MA program in 2017, with the thesis “Lexicography as feature engineering: automatic discovery of distinguishing semantic factors for synonyms”. Before coming to Brandeis, he spent fifteen years in and around the lexicography industry, computing with lexical data in all of its manifestations: digitizing old print dictionaries, managing lexicographical corpora, and linking old lexical data to new corpus data. He also has a BA in Classics from the University of Chicago.
The Brandeis CL Seminar series brings speakers from research and industry related to computational linguistics and is open to all. If you’d like to be a speaker or suggest a speaker, contact Marie Meteer, mmeteer at brandeis dot com.
The Language Application Grid as a Platform for NLP Research
Wednesday, February 1 at 3pm
The LAPPS Grid project (Vassar, Brandeis, CMU, LDC) has developed a platform providing access to a vast array of language processing tools and resources for research and development in natural language processing (NLP). It has recently expanded to enhance its usability by non-technical users such as those in the Digital Humanities (DH) community. We provide a live demonstration of LAPPS Grid use, ranging from “from scratch” construction of a workflow using atomic tools to a pre-configured Docker image that can be run off-the-shelf on a laptop or in the cloud, for several tasks of relevance to the NLP and DH communities.
Keith Suderman is a Research Assistant with the Department of Computer Science at Vassar College in Poughkeepsie, New York. Keith works full time on the development of the LAPPS Grid API, architecture, and tool integrations.