National-Level Origin-Destination Estimation Based on Passively Collected Location Data and Machine Learning Methods



Pan_umd_0117E_19152.pdf (2.48 MB)






With the development of information and positioning technologies, passively collected location data have emerged: time-stamped location observations from various types of mobile devices. Passive location data are known for their large sample sizes and continuous observation of behavior. However, they also require careful and comprehensive data processing and modeling algorithms, both for privacy protection and for practical applications.

Meanwhile, estimating travel demand in the form of origin-destination tables is fundamental to transportation planning. A national origin-destination estimate that captures time-dependent travel behavior for all travel modes is lacking. Passively collected location data appeal to researchers as a potential data source for large-scale multimodal travel demand estimation and monitoring. This research proposes a comprehensive set of methods for passive location data processing, including data cleaning, activity location and purpose identification, trip-level information identification, sociodemographic imputation, sample weighting and expansion, and demand validation. For each task, the thesis evaluates state-of-the-practice and state-of-the-art algorithms and develops an applicable method that jointly considers the differing features of various passive location data sources and the imputation accuracy. The thesis further examines the viability of this set of methods in a national-level case study and successfully derives national-level origin-destination estimates along with additional data products, such as trip rates and vehicle miles traveled, at different geographic levels and temporal resolutions.