Crawling and Classification Strategies for Generating a Multi-Language Corpus of Sign Language Video

abstract

© 2019 IEEE. Although there is considerable sign language content available online, it can be hard to locate content in a specific sign language on a particular topic. The Sign Language Digital Library (SLaDL) aims to improve access through the generation of a multi-language corpus of sign language video. SLaDL uses a combination of crawling to collect potential sign language content and applying multimodal sign language detection and identification classifiers to winnow the collected videos to those believed to be in a particular sign language. Here we compare the quantity and variety of sign language videos located via breadth-first, depth-first, and focused crawling strategies. Then we examine the accuracy of different approaches to combining textual metadata and video features for the 3-way classification task of identifying videos in American Sign Language (ASL), British Sign Language (BSL), and without-sign language. Finally, due to the high computational cost of generating the video features used for classification, we explore the tradeoffs when using a cascading classifier and when generating features based on motion in sampled frames on classifier accuracy.

name of conference

2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL)

authors

Shipman, Frank

published proceedings

2019 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2019)

author list (cited authors)

Shipman, F. M., & Monteiro, C.

citation count

1

complete list of authors

Shipman, Frank M; Monteiro, Caio DD

publication date

June 2019

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

ISSN 2575-7865

keywords

Collection Generation
Crawling Techniques
Metadata Extraction
Multimodal Classification
Sign Language
Video Sharing

Digital Object Identifier (DOI)

10.1109/JCDL.2019.00023

International Standard Book Number (ISBN) 13

9781728115474

start page

97

end page

106

volume

00

URL

http://dx.doi.org/10.1109/jcdl.2019.00023

Crawling and Classification Strategies for Generating a Multi-Language Corpus of Sign Language Video Conference Paper
