Neural Architecture Search for Portrait Parsing.
Academic Article
Overview
abstract
This work proposes a neural architecture search (NAS) method for portrait parsing, a new higher-level task that builds on portrait segmentation and face labeling. NAS has recently become an effective approach to automated machine learning, but its most notable successes have been limited to image classification and natural language processing (NLP). Meanwhile, state-of-the-art portrait segmentation and face labeling models are all manually designed, and few of them strike a good tradeoff between efficiency and performance. We therefore aim to improve existing NAS methods for dense per-pixel prediction tasks on portrait datasets. To this end, we adopt a cell-based encoder-decoder architecture with a carefully designed connectivity structure and search space. Our approach achieves state-of-the-art performance on three portrait tasks: 96.8% mIoU on EG1800 (portrait segmentation), 91.2% overall F1-score on HELEN (face labeling), and 95.1% overall F1-score on CelebAMask-HQ (portrait parsing) with only 2.29M model parameters. It thus compares favorably with all previous works on portrait datasets. More importantly, we show empirically that even a basic encoder-decoder architecture can achieve outstanding results on these tasks with the help of NAS. To the best of our knowledge, our work is also the first to report the successful application of NAS to these portrait tasks.
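The results above are reported as mIoU and overall F1-score, so a small illustration of how such per-pixel metrics are commonly computed may help. This is only a sketch under stated assumptions: the abstract does not describe the evaluation protocol, so the function names (`confusion_counts`, `mean_iou`, `overall_f1`), the choice to average IoU over classes, and the choice to pool counts over a selected set of classes for the F1-score are all hypothetical and may differ from what the authors used.

```python
from typing import List, Sequence, Tuple


def confusion_counts(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> Tuple[List[int], List[int], List[int]]:
    """Return per-class (tp, fp, fn) pixel counts for flattened label maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1  # pixel wrongly assigned to class p
            fn[t] += 1  # pixel of class t that was missed
    return tp, fp, fn


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union, averaged over classes that occur."""
    tp, fp, fn = confusion_counts(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp[c] / denom)
    return sum(ious) / len(ious) if ious else 0.0


def overall_f1(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    classes: Sequence[int],
) -> float:
    """F1-score with tp/fp/fn pooled over the given classes (assumed protocol)."""
    tp, fp, fn = confusion_counts(pred, target, num_classes)
    tp_sum = sum(tp[c] for c in classes)
    fp_sum = sum(fp[c] for c in classes)
    fn_sum = sum(fn[c] for c in classes)
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom > 0 else 0.0


if __name__ == "__main__":
    # Toy 6-pixel example with 3 classes; class 0 plays the role of background.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
    print(overall_f1(pred, target, num_classes=3, classes=[1, 2]))
```

In practice the label maps would be flattened image-sized arrays, and benchmark-specific conventions (for example, which classes count toward the overall score, or whether IoU is averaged per image) would have to be taken from each dataset's evaluation protocol.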