Named Entity Recognition from a Data-Driven Perspective

Jingbo Shang
 


Assistant Professor, UCSD CSE & HDSI

 

Monday, October 14, 2019 @ 11:00am
CSE 1202, EBU-3B

Faculty Hosts
Pavel Pevzner and Yuanyuan Zhou

 

Abstract:

Named entity recognition (NER) is one of the core tasks in natural language processing (NLP) and has numerous applications across domains. Recent advances in neural NER models (e.g., LSTM-CRF) have freed humans from handcrafting features. In this talk, we will briefly revisit these models and discuss how we can improve upon them from a data-driven perspective. The key philosophy of "data-driven" here is to enhance NER performance without introducing any additional human annotations. We will attack this problem from different angles, including pre-training and co-training language models, introducing dictionaries for distant supervision, detecting and re-weighting noisy training data, and removing the dependency on the tokenizer (especially for social media posts).

Bio:

Jingbo Shang will start as an Assistant Professor in Computer Science and Engineering and the Halıcıoğlu Data Science Institute at UC San Diego in January 2020. He is now a Ph.D. candidate in the Department of Computer Science, University of Illinois at Urbana-Champaign. He received his B.E. from the Computer Science Department, Shanghai Jiao Tong University, China. His research focuses on mining and constructing structured knowledge from massive text corpora with minimal human effort. His research has been recognized by many prestigious awards, including the Grand Prize of the Yelp Dataset Challenge in 2015 and a Google Ph.D. Fellowship in Structured Data and Database Management in 2017.