Prediction of NOx Emissions from Compression Ignition Engines Using Ensemble Learning-Based Models with Physical Interpretability
On-board diagnostics (OBD) data contain valuable information, including real-world measurements of vehicle powertrain parameters. These data can be used to gain a richer, data-driven understanding of complex physical phenomena such as emissions formation during combustion. In this study, we develop a physics-based machine learning framework to predict and analyze trends in engine-out NOx emissions from diesel and diesel-hybrid heavy-duty vehicles. This model differs from the black-box machine learning models presented in previous literature because it incorporates engine combustion parameters that allow physical interpretation of the results. Based on chemical kinetics and the characteristics of diffusive combustion, NOx emissions from compression ignition engines depend primarily, and non-linearly, on three parameters: the adiabatic flame temperature, the in-cylinder oxygen concentration at intake valve closing, and the combustion duration. Here, these parameters were calculated from available OBD data. Linearizing a physics-based NOx emissions prediction model provides an opportunity to evaluate several machine learning regression techniques. The results show that a bagging-type ensemble learning model such as random forest regression (RFR) is highly effective in predicting the engine-out NOx emissions measured by the on-board NOx sensor. We also show that real-world OBD data are highly heterogeneous, with clustered co-occurrences of vehicle parameters. In terms of accuracy, the developed model achieves an average R² value of 0.72 and a mean absolute error (MAE) of 78 ppm across different vehicle OBD datasets, improvements of 53% and 42%, respectively, over non-linear regression models, and its linkage to physical parameters makes the results interpretable. We also perform a drop-column feature sensitivity analysis for the RFR model and compare its prediction results with those of black-box deep neural network and non-linear regression models.
Based on its high accuracy and interpretability, the developed RFR model has potential for use in on-board NOx prediction in engines of varying displacement and design.
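
The workflow summarized above (linearized physical features, a random forest regressor, and a drop-column sensitivity analysis) can be illustrated with a minimal sketch. It is not the implementation used in this study. The data are synthetic, the log-linear form of the NOx relation and its coefficients are assumptions chosen only for illustration, and the names `make_synthetic_data`, `drop_column_importance`, and the entries of `FEATURES` are hypothetical. The sketch assumes NumPy and scikit-learn are available.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical names for the three linearized physical features.
FEATURES = ["inv_T_ad", "ln_O2", "ln_t_comb"]


def make_synthetic_data(n=2000, seed=0):
    """Generate synthetic samples; NOT real OBD data."""
    rng = np.random.default_rng(seed)
    T_ad = rng.uniform(2200.0, 2700.0, n)   # adiabatic flame temperature, K
    O2 = rng.uniform(0.12, 0.21, n)         # in-cylinder O2 mole fraction
    t_comb = rng.uniform(1e-3, 5e-3, n)     # combustion duration, s
    X = np.column_stack([1.0 / T_ad, np.log(O2), np.log(t_comb)])
    # Assumed log-linear relation with arbitrary illustrative coefficients:
    # ln(NOx) = c0 - c1 / T_ad + c2 * ln(O2) + c3 * ln(t_comb) + noise
    ln_nox = (20.0 - 38000.0 / T_ad + 0.5 * np.log(O2)
              + 1.0 * np.log(t_comb) + rng.normal(0.0, 0.1, n))
    return X, ln_nox


def drop_column_importance(X_train, y_train, X_test, y_test, seed=0):
    """Retrain without each feature and record the loss in test R^2."""
    def fit_score(cols):
        model = RandomForestRegressor(n_estimators=100, random_state=seed)
        model.fit(X_train[:, cols], y_train)
        return r2_score(y_test, model.predict(X_test[:, cols]))

    all_cols = list(range(X_train.shape[1]))
    baseline = fit_score(all_cols)
    importance = {}
    for j, name in enumerate(FEATURES):
        kept = [c for c in all_cols if c != j]
        importance[name] = baseline - fit_score(kept)
    return baseline, importance


X, y = make_synthetic_data()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Full model on all three features; metrics are in ln(NOx) units here.
rfr = RandomForestRegressor(n_estimators=100, random_state=0)
rfr.fit(X_train, y_train)
mae = mean_absolute_error(y_test, rfr.predict(X_test))

baseline_r2, importance = drop_column_importance(
    X_train, y_train, X_test, y_test)

print(f"test R^2 = {baseline_r2:.3f}, MAE = {mae:.3f} (ln units)")
for name, drop in importance.items():
    print(f"drop {name}: R^2 decreases by {drop:.3f}")
```

Drop-column importance retrains the model once per feature, so it costs more than impurity-based importance, but it measures directly how much predictive skill is lost when a physical parameter is withheld. In this synthetic setup, the ranking only reflects the coefficients assumed above and says nothing about real engines.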