Characterizing forest canopy structure with lidar composite metrics and machine learning

Academic Article


  • A lack of reliable observations for canopy science research is being partly overcome by the growing use of lidar remote sensing. This study aims to improve lidar-based canopy characterization with airborne laser scanners through the combined use of lidar composite metrics and machine learning models. Our so-called composite metrics comprise a relatively large number of lidar predictors that retain as much information as possible when raw lidar point clouds are reduced to a format suitable as input to predictive models of canopy structural variables. The information-rich property of such composite metrics is complemented by machine learning, which offers an array of supervised learning models capable of relating canopy characteristics to high-dimensional lidar metrics via complex, potentially nonlinear functional relationships. Using coincident lidar and field data over a forest in eastern Texas, USA, we conducted a case study to demonstrate the broad utility of the lidar composite metrics in predicting multiple forest attributes, and we illustrated the use of two kernel machines, namely the support vector machine and Gaussian processes (GP). Results show that the two machine learning models, in conjunction with the lidar composite metrics, outperformed traditional approaches such as the maximum likelihood classifier and linear regression models. For example, five-fold cross-validation of the GP regression models (vs. linear/log-linear models, in parentheses) yielded a root mean squared error of 1.06 (2.36) m for Lorey's height, 0.95 (3.43) m for dominant height, 5.34 (8.51) m²/ha for basal area, 21.4 (40.5) Mg/ha for aboveground biomass, 6.54 (9.88) Mg/ha for belowground biomass, 0.75 (2.76) m for canopy base height, 2.2 (2.76) m for canopy ceiling height, 0.015 (0.02) kg/m³ for canopy bulk density, 0.068 (0.133) kg/m² for available canopy fuel, and 0.33 (0.39) m²/m² for leaf area index. Moreover, uncertainty estimates from the GP regression were more indicative of the true errors in the predicted canopy variables than those from their linear counterparts. With the ever-increasing accessibility of multisource remote sensing data, we envision a concomitant expansion in the use of advanced statistical methods, such as machine learning, to explore the potentially complex relationships between canopy characteristics and remotely sensed predictors, accompanied by a need for improved error analysis. © 2011 Elsevier Inc.
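The abstract does not list the specific predictors that make up the composite metrics, so the following is only a minimal sketch of the general idea: reducing the return heights of one plot's point cloud to a flat, information-rich feature vector. The function name `composite_metrics`, the 2 m canopy cutoff, the decile percentiles, and the five-bin density profile are all illustrative assumptions, not the authors' definitions.

```python
from statistics import quantiles


def composite_metrics(heights, canopy_cutoff=2.0, n_bins=5):
    """Summarize lidar return heights (m above ground) for one plot.

    Hypothetical feature set: canopy cover, height deciles, mean and
    maximum height, and a vertical density profile. The cutoff and bin
    count are assumed values chosen for illustration.
    """
    canopy = [h for h in heights if h > canopy_cutoff]
    if len(canopy) < 2:
        raise ValueError("need at least two canopy returns above the cutoff")

    metrics = {
        # Fraction of all returns that come from above the cutoff.
        "cover": len(canopy) / len(heights),
        "hmax": max(canopy),
        "hmean": sum(canopy) / len(canopy),
    }

    # Height percentiles (10th to 90th) of the canopy returns.
    deciles = quantiles(canopy, n=10, method="inclusive")
    for i, value in enumerate(deciles, start=1):
        metrics[f"p{i * 10}"] = value

    # Vertical density profile: share of canopy returns falling in each of
    # n_bins equal-height layers between the cutoff and the maximum height.
    width = (metrics["hmax"] - canopy_cutoff) / n_bins
    counts = [0] * n_bins
    for h in canopy:
        layer = min(int((h - canopy_cutoff) / width), n_bins - 1)
        counts[layer] += 1
    for i, count in enumerate(counts, start=1):
        metrics[f"d{i}"] = count / len(canopy)

    return metrics
```

A feature dictionary of this kind, computed per plot, would then serve as the predictor vector for a supervised model such as a support vector machine or GP regression, with the field-measured canopy variable as the target.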

published proceedings

  • Remote Sensing of Environment

citation count

  • 123

complete list of authors

  • Zhao, Kaiguang; Popescu, Sorin; Meng, Xuelian; Pang, Yong; Agca, Muge

publication date

  • August 2011