CSE Colloquia: Towards Universal Natural Language Processing
Abstract:
Natural language processing (NLP) plays an increasingly important role in everyday life, and many people are familiar with products such as Google Translate, Alexa, Siri, or ChatGPT. However, NLP systems currently exist for only a small fraction of the world's approximately 7,000 languages. This is undesirable for many reasons. For instance, only speakers of high-resource languages are able to benefit from the abundance of information available on the internet, which reinforces existing inequalities. The limited coverage also restricts the ability of NLP to support language documentation and revitalization efforts. Furthermore, NLP systems perform poorly in many domains, which limits their applicability, e.g., in healthcare or for educational purposes.
In this talk, I will present a couple of my recent lines of work: I will first discuss how we can leverage NLP systems to speed up language documentation efforts. I will then talk about how models can be trained for or adapted to various high-level NLP tasks. I will end with an outlook on remaining challenges and questions for both low-resource languages and low-resource domains.
Bio:
Katharina von der Wense is an Assistant Professor of Computer Science at the University of Colorado Boulder, USA, and a junior professor at the Johannes Gutenberg University Mainz, Germany. She leads the VDW Natural Language Processing Group (NALA). She received her PhD from LMU Munich in 2019 and was a postdoc at New York University until she moved to Boulder in 2020. Her work is centered around deep learning for NLP, with a special focus on multilingual NLP and transfer learning, computational morphology, language grounding, and NLP for medical and educational applications.
Event Contact: Timothy Zhu