Şaziye Betül Özateş defended her PhD thesis: Deep Learning-based Dependency Parsing for Turkish

Title: Deep Learning-based Dependency Parsing for Turkish

Co-advised by: Arzucan Özgür and Tunga Güngör

Abstract

Dependency parsing is an important step for many natural language processing (NLP) systems, such as question answering and machine translation. Turkish, a morphologically rich language with a complex grammar, is challenging to process automatically, and the limited NLP tools and resources available for it make the task harder still. Data-driven deep learning models show promising performance in dependency parsing. Yet the amount of training data directly affects the performance of a data-driven parser, and deep learning-based systems require large amounts of data to perform well. In this thesis, we focused on Turkish dependency parsing and proposed two solutions to the challenges it poses. First, we increased the amount and quality of labeled data for Turkish dependency parsing. To this end, we created the BOUN Treebank by manually annotating 9,761 sentences. In addition, we re-annotated the IMST and PUD treebanks using the same annotation scheme. As a result, we presented the largest collection of Turkish treebanks with consistent annotation. Second, we developed novel state-of-the-art dependency parsing models for Turkish as well as other low-resource languages. As our first parsing approach, we introduced a hybrid dependency parsing architecture in which Turkish grammar rules and morphological features of words are integrated into the deep learning model. Despite the limited training data, the proposed hybrid parser outperformed current methods for Turkish dependency parsing. As our second parsing approach, we proposed a deep dependency parser with semi-supervised enhancement. In experiments on Turkish and a number of other low-resource languages, we achieved state-of-the-art results on all datasets. We have shown that deep learning-based models can be improved not only by additional training data but also by the integration of intelligently extracted information.

Contact us

Department of Computer Engineering, Boğaziçi University,
34342 Bebek, Istanbul, Turkey

  • Phone: +90 212 359 45 23/24
  • Fax: +90 212 287 24 61
 
