Ryan H. – BASIS Oro Valley

BASIS Oro Valley Senior Ryan H.

Multilingual Parsing from Raw Text to Universal Dependencies

Project/blog link: Multilingual Parsing from Raw Text to Universal Dependencies
BASIS Advisor: Chester Clark
Internship location: University of Arizona
Onsite Mentor: Dr. Jungyeul Park, Visiting Professor, Department of Linguistics, University of Arizona

Project Abstract

Many programs, such as search engines, translation tools, and voice assistants like Siri on the iPhone, use language processing techniques to understand human language so they can better fulfill user requests. One way a computer can do this is by breaking a user's sentence down into a format called Universal Dependencies, which labels each word's grammatical role and its relationship to the other words in the sentence. This format is especially useful because it describes many languages with the same set of grammar labels. The Conference on Computational Natural Language Learning (CoNLL) has announced a shared task to build a program capable of producing this analysis from raw text. The goal of my project is to have our program, or the paper describing our methods, accepted by the conference. Starting from a base model that can parse sentences into this format, I will train it on data sets from different languages so that it can parse those languages. From there, I will combine aspects of certain languages to better analyze languages that have no training data.
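To give a feel for what a Universal Dependencies analysis looks like, here is a minimal Python sketch. It reads one sentence written in CoNLL-U, the ten-column, tab-separated text format used for Universal Dependencies data, and prints each word with its grammar label and the word it depends on. The example sentence and its annotation were written by hand for illustration and are not data from the project, and the helper names `parse_conllu` and `describe` are made up for this sketch.

```python
# Hand-written illustration in CoNLL-U format (not data from the project).
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
ROWS = [
    "# text = The dog barks.",
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
]


def parse_conllu(lines):
    """Turn CoNLL-U token lines into a list of dictionaries.

    Simplification: assumes plain integer IDs, so multiword-token
    ranges (such as "1-2") and empty nodes (such as "1.1") are not handled.
    """
    tokens = []
    for line in lines:
        # Skip blank lines and comment lines such as "# text = ...".
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],      # universal part-of-speech tag
            "head": int(cols[6]), # ID of the governing word, 0 means root
            "deprel": cols[7],    # universal dependency relation
        })
    return tokens


def describe(tokens):
    """Render each token as 'word --relation--> head word'."""
    forms = {t["id"]: t["form"] for t in tokens}
    forms[0] = "ROOT"
    return [f'{t["form"]} --{t["deprel"]}--> {forms[t["head"]]}' for t in tokens]


if __name__ == "__main__":
    for line in describe(parse_conllu(ROWS)):
        print(line)
```

Because labels such as `NOUN`, `nsubj`, and `root` are shared across languages, the same reader works unchanged on a sentence annotated in any other language that follows the format.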
