A Load Spectrum Data based Data Mining System for Identifying Different Types of Vehicle Usage of a Hybrid Electric Vehicle Fleet
To achieve high customer satisfaction and to avoid the high warranty costs caused by failures of power-train components in hybrid electric vehicles (HEVs), car manufacturers have to optimize the dimensioning of these components. To do so, they need to know which types of vehicle usage predominate around the world. In this paper we therefore present a data mining system that uses a Random Forest (RF) based dissimilarity measure within the dimensionality reduction technique t-Distributed Stochastic Neighbor Embedding (t-SNE) to automatically identify and visualize different types of vehicle usage from aggregated, logged on-board data, i.e., load spectrum data. This kind of data is calculated and recorded directly on the vehicles' control units and consists of aggregated numerical data, such as the histogram of the velocity signal or the distance traveled by a vehicle.
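To illustrate the idea of feeding an RF-based dissimilarity into t-SNE, the following is a minimal sketch and not the system described in this paper. It assumes an unsupervised Random Forest in the style of Breiman and of Shi and Horvath: the forest is trained to separate the observed load spectrum vectors from a synthetic copy whose columns are permuted independently, and the proximity of two vehicles is the fraction of trees in which they fall into the same leaf. The function names `rf_dissimilarity` and `embed_usage`, the choice of `sqrt(1 - proximity)` as the dissimilarity, and all parameter values are assumptions made for this example. It uses scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE


def rf_dissimilarity(X, n_trees=200, seed=0):
    """Pairwise dissimilarity from an unsupervised Random Forest.

    Assumption: the forest is trained to tell real rows from synthetic rows
    obtained by permuting every column independently.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Synthetic contrast data: same marginals, dependencies between columns destroyed.
    X_synth = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Leaf index of every real sample in every tree: shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # Proximity: fraction of trees in which two samples share a leaf.
    prox = np.zeros((len(X), len(X)))
    for t in range(leaves.shape[1]):
        col = leaves[:, t]
        prox += col[:, None] == col[None, :]
    prox /= leaves.shape[1]

    # Turn proximity into a dissimilarity in [0, 1] with a zero diagonal.
    return np.sqrt(np.clip(1.0 - prox, 0.0, 1.0))


def embed_usage(X, perplexity=30.0, n_trees=200, seed=0):
    """Two-dimensional t-SNE embedding based on the RF dissimilarity."""
    D = rf_dissimilarity(X, n_trees=n_trees, seed=seed)
    # A precomputed distance matrix requires a random initialization.
    tsne = TSNE(
        n_components=2,
        metric="precomputed",
        init="random",
        perplexity=perplexity,
        random_state=seed,
    )
    return tsne.fit_transform(D)


if __name__ == "__main__":
    # Purely synthetic stand-in for load spectrum data: two hypothetical
    # usage types, each row being five aggregated features of one vehicle.
    rng = np.random.default_rng(42)
    X_demo = np.vstack(
        [rng.normal(0.0, 1.0, (40, 5)), rng.normal(4.0, 1.0, (40, 5))]
    )
    embedding = embed_usage(X_demo, perplexity=15.0)
    print(embedding.shape)
```

In such a sketch, each row of `X` would hold the aggregated features of one vehicle, for example the bins of its velocity histogram together with its traveled distance. Groups of points in the resulting two-dimensional embedding would then be candidates for distinct types of vehicle usage.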