Language Technology Seminar Series
Question Answering R&D at Microsoft
TJ Hazen, Microsoft Research
Friday, February 9 at 3:30pm
Natural language processing technology for open-ended question answering tasks is now readily available to anyone with internet access through websites such as Bing or Google. This talk will present a general overview of how question answering works inside Microsoft Bing and discuss techniques used to expand and improve Bing’s question answering capabilities. The talk will also discuss recent advances in deep learning modeling techniques for open-ended machine reading comprehension and question answering tasks.
TJ Hazen is a Principal Research Manager with Microsoft Research, where his current work focuses on machine reading comprehension and question answering. Prior to joining Microsoft in 2013, TJ was at MIT, where he spent six years as a member of the Human Language Technology Group at MIT Lincoln Laboratory and nine years as a Research Scientist at the MIT Computer Science and Artificial Intelligence Laboratory. TJ holds S.B., S.M., and Ph.D. degrees in Electrical Engineering and Computer Science from MIT.
Translation Divergence between Chinese-English Machine Translation: An Empirical Investigation
Nianwen Xue, Brandeis University
Friday, February 2 at 3:30pm
Alignment is an important part of building a parallel corpus that is useful for both linguistic analysis and NLP applications. We propose a hierarchical alignment scheme where word-level and phrase-level alignments are carefully coordinated to eliminate conflicts and redundancies. Using a parallel Chinese-English Treebank annotated with this scheme, we show that some high-profile translation divergences that motivated previous research are in fact very rare in our data, whereas other translation divergences that have previously received little attention occur in large numbers. We also show that translation divergences can to a large extent be captured by the syntax-based translation rules extracted from the parallel treebank, a result that supports the contention that semantic representations may be impractical and unnecessary for bridging translation divergences in Chinese-English MT.
Nianwen Xue is an Associate Professor in the Computer Science Department and the Language & Linguistics Program at Brandeis University. He has devoted substantial effort to developing linguistically annotated resources for natural language processing. The other thread of his research involves using statistical and machine learning techniques to solve natural language processing problems. His research has received support from the National Science Foundation (NSF), IARPA, and DARPA. He is currently the editor-in-chief of the ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), and he also serves on the editorial boards of Language Resources and Evaluation and Lingua Sinica. He is currently the Vice Chair/Chair-Elect of SIGHAN, an ACL special interest group on Chinese language processing.