Crawling and Classification Strategies for Generating a Multi-Language Corpus of Sign Language Video
Additional Document Info
2019 IEEE. Although there is considerable sign language content available online, it can be hard to locate content in a specific sign language on a particular topic. The Sign Language Digital Library (SLaDL) aims to improve access through the generation of a multi-language corpus of sign language video. SLaDL uses a combination of crawling to collect potential sign language content and applying multimodal sign language detection and identification classifiers to winnow the collected videos to those believed to be in a particular sign language. Here we compare the quantity and variety of sign language videos located via breadth-first, depth-first, and focused crawling strategies. Then we examine the accuracy of different approaches to combining textual metadata and video features for the 3-way classification task of identifying videos in American Sign Language (ASL), British Sign Language (BSL), and without-sign language. Finally, due to the high computational cost of generating the video features used for classification, we explore the tradeoffs when using a cascading classifier and when generating features based on motion in sampled frames on classifier accuracy.
name of conference
2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL)