Driver Identification Using Multivariate In-vehicle Time Series Data 2018-01-1198
Every driver exhibits a characteristic driving signature. When enough data are aggregated from multiple driving sessions of the same driver, those data capture the driver's behavior, and driver identification can be treated as a supervised machine learning classification problem. In this paper, we use a random forest classifier for the classification task. We collected multiple time series signals from 60 driving sessions (4 sessions for each of 15 drivers) via the Controller Area Network (CAN). To reduce redundant information, we propose a method for signal pre-selection. We also propose a parameter tuning strategy that covers signal refinement, interval feature extraction and selection, and signal segmentation. In addition, we explore how different arrangements of features and samples affect performance. Following the proposed tuning strategy, the random forest classifier achieved an accuracy of 89.14% when identifying four drivers and 60.36% when identifying fifteen drivers.
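To make the segmentation and interval feature extraction steps concrete, the following is a minimal sketch and not the paper's implementation. The window length, step, number of intervals, the four statistics per interval, and the signal names `speed` and `throttle` are all illustrative assumptions; the abstract does not specify them.

```python
from statistics import mean, pstdev


def segment(signal, window, step):
    """Split one signal into fixed-length windows; a trailing remainder is dropped."""
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, step)]


def interval_features(values, n_intervals):
    """Mean, standard deviation, min, and max over each equal-length interval.

    The choice of statistics is an assumption made for illustration.
    """
    size = len(values) // n_intervals
    feats = []
    for k in range(n_intervals):
        chunk = values[k * size:(k + 1) * size]
        feats.extend([mean(chunk), pstdev(chunk), min(chunk), max(chunk)])
    return feats


def build_samples(signals, window, step, n_intervals):
    """Return one feature row per window, concatenating features across signals.

    `signals` maps a (hypothetical) signal name to a list of equally long readings.
    """
    per_signal = [
        [interval_features(seg, n_intervals) for seg in segment(values, window, step)]
        for values in signals.values()
    ]
    # zip(*) aligns the k-th window of every signal; sum(..., []) flattens them.
    return [sum(rows, []) for rows in zip(*per_signal)]


# Toy session with two made-up signals of eight readings each.
session = {
    "speed": [0, 1, 2, 3, 4, 5, 6, 7],
    "throttle": [1, 1, 1, 1, 2, 2, 2, 2],
}
rows = build_samples(session, window=4, step=4, n_intervals=2)
print(len(rows), len(rows[0]))
```

Each row, paired with the label of the driver who produced the session, could then serve as one training sample for a random forest classifier, for example scikit-learn's `RandomForestClassifier`.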