Tracy, James L. (2018-10). Random Subset Feature Selection for Ecological Niche Modeling of Wildfire Activity and the Monarch Butterfly. Doctoral Dissertation.

abstract

  • Correlative ecological niche models (ENMs) are essential for investigating distributions of species and natural phenomena via environmental correlates across a broad range of fields, including entomology and pyrogeography, the two featured in this study. Feature (variable) selection is critical for producing more robust ENMs with greater transferability across space and time, but few studies evaluate formal feature selection algorithms (FSAs) for producing higher-performance ENMs. Variability of ENMs arising from feature subsets is also seldom represented. A novel FSA, the random subset feature selection algorithm (RSFSA), is developed and evaluated. The RSFSA generates an ensemble of higher-accuracy ENMs from different feature subsets, producing a feature subset ensemble (FSE). In a novel application, RSFSA-selected FSEs are used to represent ENM variability. Wildfire activity presence/absence databases for the western US prove ideal for evaluating RSFSA-selected MaxEnt ENMs. The RSFSA was effective in identifying FSEs of 15 of 90 variables with higher accuracy and information content than random FSEs. Selected FSEs were used to identify severe contemporary wildfire deficits and significant future increases in wildfire activity for many ecoregions. Migratory roosting localities of declining eastern North American monarch butterflies (Danaus plexippus) were used to spatially model migratory pathways, comparing RSFSA-selected MaxEnt ENMs and kernel density estimate models (KDEMs). The higher-information-content ENMs best correlated migratory pathways with nectar resources in grasslands. Higher-accuracy KDEMs best revealed migratory pathways through less suitable desert environments. Monarch butterfly roadkill was surveyed in Texas within the main Oklahoma to Mexico Central Funnel migratory pathway. A random FSE of MaxEnt roadkill ENMs was used to estimate a 2-3% loss of migrants to roadkill. Hotspots of roadkill in west Texas and Mexico were recommended for assessing roadkill mitigation to assist in monarch population recovery. The RSFSA effectively produces higher-performance ENM FSEs for estimating optimal feature subset sizes and for comparing ENM algorithms, parameters, and environmental scenarios. The RSFSA also performed comparably to expert variable selection, confirming its value in the absence of expert information. The RSFSA should be compared with other FSAs for developing ENMs and in data mining applications across other disciplines, such as image classification and molecular bioinformatics.
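
The general idea of selecting a feature subset ensemble from random subsets can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the abstract does not give the RSFSA's steps, so the procedure below (draw random subsets of 15 of 90 variables, score one model per subset on held-out data, keep the top-scoring subsets) is an assumption. A nearest-centroid classifier stands in for MaxEnt, a single holdout accuracy stands in for the study's accuracy and information-content measures, the data are synthetic, and every function and variable name is hypothetical.

```python
import random
from statistics import mean

def make_data(n_rows, n_features, informative, rng):
    """Synthetic presence/absence rows; only `informative` columns carry signal."""
    rows, labels = [], []
    for i in range(n_rows):
        y = i % 2  # alternate absence (0) and presence (1) so both classes occur
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 1.5 * y  # shift informative features for presences
        rows.append(x)
        labels.append(y)
    return rows, labels

def centroid_accuracy(train, test, subset):
    """Holdout accuracy of a nearest-centroid classifier using only `subset` columns.
    Assumption: this simple classifier is a stand-in for a MaxEnt model."""
    (x_train, y_train), (x_test, y_test) = train, test
    centroids = {}
    for c in (0, 1):
        members = [x for x, y in zip(x_train, y_train) if y == c]
        centroids[c] = [mean(x[j] for x in members) for j in subset]
    correct = 0
    for x, y in zip(x_test, y_test):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(subset, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == y
    return correct / len(y_test)

def select_subset_ensemble(train, test, n_features, subset_size, n_subsets, keep, rng):
    """Score `n_subsets` random feature subsets and return the `keep` best,
    plus the full scored list for comparison against a random ensemble."""
    scored = []
    for _ in range(n_subsets):
        subset = sorted(rng.sample(range(n_features), subset_size))
        scored.append((centroid_accuracy(train, test, subset), subset))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:keep], scored

rng = random.Random(0)
informative = [0, 1, 2, 3, 4]           # hypothetical signal-carrying variables
train = make_data(200, 90, informative, rng)
test = make_data(100, 90, informative, rng)

# 15 of 90 variables per subset, matching the subset size quoted in the abstract.
selected, all_scored = select_subset_ensemble(train, test, 90, 15, 200, 10, rng)

print("mean accuracy, selected ensemble:", mean(a for a, _ in selected))
print("mean accuracy, all random subsets:", mean(a for a, _ in all_scored))
```

Because the kept subsets are the top scorers on the same holdout data used to rank them, their mean accuracy is optimistic; a real evaluation would presumably score the selected ensemble on data not used for selection.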

publication date

  • December 2018