2017-03-28

Statistical Characterization, Pattern Identification, and Analysis of Big Data 2017-01-0236

In the Big Data era, capability in statistical and probabilistic data characterization, data pattern identification, and data modeling and analysis is critical for understanding the data, finding trends in it, and making better use of it. This paper first describes the fundamental probability concepts and several commonly used probability distribution functions, such as the Weibull for spectrum events and the Pareto for extreme/rare events. An event quadrant is then established based on the commonality/rarity and impact/effect of probabilistic events. Level of measurement, which is key to the quantitative measurement of data, is also discussed within the framework of probability. The damage density function, a measure of the relative damage contribution of each constituent, is proposed. The new measure demonstrates its capability to distinguish between extreme/rare events and spectrum events. Several case studies, including vehicle reliability, vehicle road test scores, warranty, the salary distribution of an institution, the city population distributions of three countries, and the earthquake distribution worldwide and in the USA, are provided to demonstrate the role of statistical and probabilistic approaches in the characterization and analysis of big data.
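
The contrast the abstract draws between spectrum events (Weibull) and extreme/rare events (Pareto) can be illustrated with a minimal sketch. It does not reproduce the paper's damage density function, which is not defined in the abstract. As a stand-in, the hypothetical helper `top_share` reports what fraction of the total is contributed by the largest 1% of sampled events. The distribution parameters, the sample size, and the 1% cutoff are all assumptions chosen for illustration only.

```python
import random


def top_share(values, fraction=0.01):
    """Share of the total sum contributed by the largest `fraction` of values.

    Hypothetical stand-in for a relative-contribution measure; it is NOT
    the damage density function proposed in the paper.
    """
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000             # assumed sample size

# random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
weibull_events = [rng.weibullvariate(1.0, 2.0) for _ in range(n)]

# random.paretovariate(alpha): alpha is the shape; smaller means a heavier tail.
pareto_events = [rng.paretovariate(1.5) for _ in range(n)]

weibull_share = top_share(weibull_events)
pareto_share = top_share(pareto_events)

print(f"Top 1% share, Weibull sample: {weibull_share:.3f}")
print(f"Top 1% share, Pareto sample:  {pareto_share:.3f}")
```

Under these assumed parameters, the largest 1% of the Pareto sample should account for a much larger share of the total than the largest 1% of the Weibull sample. This is the qualitative difference between rare, high-impact events and ordinary spectrum events.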
