Synthetic Trajectory Generation via Clustering-Based Semi-supervised Generative Adversarial Networks
Open Access

Minxing Zhang (Fall 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/qr46r228c?locale=en
Published

Abstract

Analyzing human mobility data and gaining insights from it is crucial for city planning and epidemic modeling. Synthetic trajectory generation is the task of generating large-scale synthetic trajectories that mimic real ones by preserving all their essential properties. As one of the fundamental problems in geoscience, it plays a vital role in studying population flow. Analyzing individual movements helps in understanding traffic and public transportation systems and may be used to predict the future position of a moving object. However, several challenges remain. First, open real-life human mobility data is limited and has complicated properties, which undermines existing approaches. Second, effectively capturing the modality patterns (moving purpose, transportation mode, etc.) of real-life trajectories and generating synthetic trajectories that preserve these patterns is a critical issue. Third, there is a lack of a systematic way to measure whether the transitional information from one location to another has been effectively captured in the generated trajectories. Fourth, existing generation models have yet to be tasked with predicting the next few locations given a user's incomplete sequence of visits. To address these challenges, we propose a Clustering-based Semi-supervised Generative Adversarial Network (CS-GAN) that, based on limited actual trajectories reported by users, can generate synthetic trajectories that mimic the real ones by preserving all their essential properties. Our proposed model leverages clustering and semi-supervised GANs to capture real-life modality patterns. Moreover, we develop a novel transitional probability-related metric to measure whether the synthetic trajectories capture the transitional information. We also conduct an ablation study to verify the effectiveness of our proposed generation model in predicting the next few visits given an incomplete sequence of visits. Extensive experiments on real-world datasets demonstrate our model's superiority in performance over state-of-the-art models.
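
The abstract does not define the transitional probability-related metric, so the following is only an illustrative sketch of what a comparison of transition behavior between real and synthetic trajectories might look like, not the thesis's actual metric. It assumes trajectories are lists of discrete location IDs, estimates first-order transition probabilities from each set, and averages the total-variation distance between the next-location distributions. The function names `transition_probs` and `transition_gap` are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probs(trajectories):
    """Estimate first-order transition probabilities P(next | current).

    Assumption: each trajectory is a sequence of hashable location IDs.
    """
    counts = defaultdict(Counter)
    for traj in trajectories:
        for cur, nxt in zip(traj, traj[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for cur, nxts in counts.items()
    }


def transition_gap(real, synthetic):
    """Mean total-variation distance between next-location distributions.

    Illustrative stand-in, not the metric from the thesis. A value of 0
    means both sets have identical first-order transition probabilities.
    An origin that has outgoing transitions in only one of the two sets
    contributes 0.5 under this formula.
    """
    p, q = transition_probs(real), transition_probs(synthetic)
    origins = set(p) | set(q)
    if not origins:
        return 0.0
    total = 0.0
    for origin in origins:
        p_o, q_o = p.get(origin, {}), q.get(origin, {})
        dests = set(p_o) | set(q_o)
        total += 0.5 * sum(abs(p_o.get(d, 0.0) - q_o.get(d, 0.0)) for d in dests)
    return total / len(origins)


# Toy data, invented purely for illustration.
real = [["A", "B", "C"], ["A", "B", "A"]]
synthetic = [["A", "C", "B"]]
gap = transition_gap(real, synthetic)
```

A lower `gap` would indicate that the synthetic set reproduces the location-to-location transitions of the real set more closely; a real evaluation would likely need smoothing and a principled treatment of locations seen in only one set.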

Table of Contents

Introduction, Related Work, Problem Setting, Proposed Framework, Evaluation, Conclusion and Discussion

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
