Lu, Yan (2015-08). Visual Navigation for Robots in Urban and Indoor Environments. Doctoral Dissertation. Thesis uri icon

abstract

  • As a fundamental capability for mobile robots, navigation involves multiple tasks including localization, mapping, motion planning, and obstacle avoidance. In unknown environments, a robot has to construct a map of the environment while simultaneously keeping track of its own location within the map. This is known as simultaneous localization and mapping (SLAM). For urban and indoor environments, SLAM is especially important since GPS signals are often unavailable. Visual SLAM uses cameras as the primary sensor and is a highly attractive but challenging research topic. The major challenge lies in the robustness to lighting variation and uneven feature distribution. Another challenge is to build semantic maps composed of high-level landmarks. To meet these challenges, we investigate feature fusion approaches for visual SLAM. The basic rationale is that since urban and indoor environments contain various feature types such points and lines, in combination these features should improve the robustness, and meanwhile, high-level landmarks can be defined as or derived from these combinations. We design a novel data structure, multilayer feature graph (MFG), to organize five types of features and their inner geometric relationships. Building upon a two view-based MFG prototype, we extend the application of MFG to image sequence-based mapping by using EKF. We model and analyze how errors are generated and propagated through the construction of a two view-based MFG. This enables us to treat each MFG as an observation in the EKF update step. We apply the MFG-EKF method to a building exterior mapping task and demonstrate its efficacy. Two view based MFG requires sufficient baseline to be successfully constructed, which is not always feasible. Therefore, we further devise a multiple view based algorithm to construct MFG as a global map. Our proposed algorithm takes a video stream as input, initializes and iteratively updates MFG based on extracted key frames; it also refines robot localization and MFG landmarks using local bundle adjustment. We show the advantage of our method by comparing it with state-of-the-art methods on multiple indoor and outdoor datasets. To avoid the scale ambiguity in monocular vision, we investigate the application of RGB-D for SLAM.We propose an algorithm by fusing point and line features. We extract 3D points and lines from RGB-D data, analyze their measurement uncertainties, and compute camera motion using maximum likelihood estimation. We validate our method using both uncertainty analysis and physical experiments, where it outperforms the counterparts under both constant and varying lighting conditions. Besides visual SLAM, we also study specular object avoidance, which is a great challenge for range sensors. We propose a vision-based algorithm to detect planar mirrors. We derive geometric constraints for corresponding real-virtual features across images and employ RANSAC to develop a robust detection algorithm. Our algorithm achieves a detection accuracy of 91.0%.
  • As a fundamental capability for mobile robots, navigation involves multiple tasks including localization, mapping, motion planning, and obstacle avoidance. In unknown environments, a robot has to construct a map of the environment while simultaneously keeping track of its own location within the map. This is known as simultaneous localization and mapping (SLAM). For urban and indoor environments, SLAM is especially important since GPS signals are often unavailable. Visual SLAM uses cameras as the primary sensor and is a highly attractive but challenging research topic. The major challenge lies in the robustness to lighting variation and uneven feature distribution. Another challenge is to build semantic maps composed of high-level landmarks. To meet these challenges, we investigate feature fusion approaches for visual SLAM. The basic rationale is that since urban and indoor environments contain various feature types such points and lines, in combination these features should improve the robustness, and meanwhile, high-level landmarks can be defined as or derived from these combinations.

    We design a novel data structure, multilayer feature graph (MFG), to organize five types of features and their inner geometric relationships. Building upon a two view-based MFG prototype, we extend the application of MFG to image sequence-based mapping by using EKF. We model and analyze how errors are generated and propagated through the construction of a two view-based MFG. This enables us to treat each MFG as an observation in the EKF update step. We apply the MFG-EKF method to a building exterior mapping task and demonstrate its efficacy.

    Two view based MFG requires sufficient baseline to be successfully constructed, which is not always feasible. Therefore, we further devise a multiple view based algorithm to construct MFG as a global map. Our proposed algorithm takes a video stream as input, initializes and iteratively updates MFG based on extracted key frames; it also refines robot localization and MFG landmarks using local bundle adjustment. We show the advantage of our method by comparing it with state-of-the-art methods on multiple indoor and outdoor datasets.

    To avoid the scale ambiguity in monocular vision, we investigate the application of RGB-D for SLAM.We propose an algorithm by fusing point and line features. We extract 3D points and lines from RGB-D data, analyze their measurement uncertainties, and compute camera motion using maximum likelihood estimation. We validate our method using both uncertainty analysis and physical experiments, where it outperforms the counterparts under both constant and varying lighting conditions.

    Besides visual SLAM, we also study specular object avoidance, which is a great challenge for range sensors. We propose a vision-based algorithm to detect planar mirrors. We derive geometric constraints for corresponding real-virtual features across images and employ RANSAC to develop a robust detection algorithm. Our algorithm achieves a detection accuracy of 91.0%.

publication date

  • August 2015