Title: Disentangled Representation Learning in Isolated Sign Language Recognition
Advisor: İnci M. Baytaş
Abstract: Representation learning is an essential part of deep learning tasks, and it is vital to obtain informative representations that are unaffected by irrelevant details. Sign Language Recognition (SLR) is one of the areas where deep learning models have been used successfully, and Convolutional Neural Networks (CNNs) are a common component of deep learning-based SLR frameworks. However, CNN-based recognition frameworks tend to capture the identity characteristics of the signer in the foreground, such as facial attributes, hand and body shape, and skin color. The same challenge is often encountered in face and gait recognition, image manipulation, and person re-identification. This thesis proposes a disentangled representation learning framework that separates the latent factors in the sign and signer representations and eliminates the irrelevant identity information to improve sign recognition performance. Various disentanglement techniques, including regularized adversarial training, are investigated. Experiments are conducted on two isolated Turkish sign language benchmark datasets. The effect of feature disentanglement and its potential to improve recognition performance are discussed through qualitative and quantitative analysis.