Analyzing the combustion characteristics, engine performance, and emissions pathways of the internal combustion (IC) engine requires the management of increasingly large and complex data sets. Extracting more knowledge from these data over shorter timescales is therefore a priority for development engineers. This paper describes how this can be achieved by combining conventional engine research methods with the latest developments in process informatics and statistical analysis. Process informatics enables engineers to combine data with instrumental and application models in order to carry out automated model development, including optimization and validation against large repositories of experimental data. This is complemented by the inclusion of experimental error and model parameter uncertainty to yield confidence regimes on the final model result, so that the impact of specific shortcomings of the model and/or the experimental data set can be identified in a systematic manner. A methodology for model implementation is described, including an extensible data model for storing engine experimental data in a consistent format. Finally, a working example of an application model is presented through the development of a semi-empirical soot model for diesel engines.