Onur Güngör defended his PhD Thesis

Title: Neural Named Entity Recognition for Morphologically Rich Languages
 
Abstract:
Named entity recognition (NER) is an important task in natural language processing (NLP). Until the revival of neural network-based models for NLP, NER taggers employed traditional machine learning approaches or finite-state transducers to detect the entities in a given sentence. Neural models improved the state-of-the-art performance with sequence-based models and word embeddings. These approaches neglect the morphological information embedded in the surface forms of the words. In this thesis, we introduce two NER taggers that utilize such information, which we show to be significant for morphologically rich languages. Using these taggers, we improve the state-of-the-art performance levels for Turkish, Czech, Hungarian, Finnish, and Spanish. The ablation studies show that these improvements result from the inclusion of morphological information. We also show that it is possible for the neural network to also learn how to disambiguate morphological analyses, thereby, eliminating the dependence on external morphological disambiguators that are not always available. In the second part of this thesis, we propose a model agnostic approach for explaining any sequence-based NLP task by extending a well-known feature-attribution method. We assess the plausibility of the explanations for our NER tagger for Turkish and Finnish through several novel experiments.
 

Contact us

Department of Computer Engineering, Boğaziçi University,
34342 Bebek, Istanbul, Turkey

  • Phone: +90 212 359 45 23/24
  • Fax: +90 212 2872461
 

Connect with us

We're on Social Networks. Follow us & get in touch.