IMPUTING SOCIAL DEMOGRAPHIC INFORMATION BASED ON PASSIVELY COLLECTED LOCATION DATA AND MACHINE LEARNING METHODS
Abstract
Multiple types of passively collected location data (PCLD) have emerged during the past 20 years, and their capability in travel demand analysis has been studied and demonstrated. Unlike traditional surveys, whose samples are designed carefully and efficiently, PCLD features a non-probabilistic sample of dramatically larger size. However, PCLD contains almost no ground truth about either the human subjects involved or the movements they produce. The imputation of such missing information, including origin and destination, travel mode, and trip purpose, has been studied for years. This research intends to advance the utilization of PCLD by imputing social demographic information, which can help to create a panorama of the large volume of travel behaviors observed and to further develop a rational weighting procedure for PCLD. The Conditional Inference Tree model has been employed to address the problem because of its ability to avoid biased variable selection and overfitting.