Abstract:
This study demonstrates how a fitting graph can enhance explainability and robustness during the model development phase of a machine learning project. The approach is illustrated with a ridge regression task, where the goal is to select the best-fitting regularization parameter, λ, from a range of candidate values. A simple scatterplot of λ (indexing model complexity) against average mean squared error (MSE, representing predictive accuracy) gives the model developer a visual check on whether sufficient repetitions of k-fold cross-validation have been performed. In addition, this study shows how fitting graph curves can be estimated from noisy scatterplots using regression splines. Rather than increasing the number of cross-validation repetitions, a regression spline can estimate the fitting graph from far fewer iterations, saving computation time.
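As a minimal sketch of this workflow, assuming a Python setting: scikit-learn's `Ridge`, `RepeatedKFold`, and `cross_val_score` supply the repeated k-fold MSE estimates, and scipy's `LSQUnivariateSpline` (a least-squares spline with fixed knots) stands in for the regression spline. The synthetic data, the λ grid, and the knot placement are illustrative assumptions, not details taken from the study.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Hypothetical data stand-in; the study's actual dataset is not given here.
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

lambdas = np.logspace(-3, 3, 30)                              # candidate lambda values
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)   # deliberately few repeats

# Average MSE across all folds and repeats, one value per candidate lambda.
avg_mse = [
    -cross_val_score(Ridge(alpha=lam), X, y,
                     scoring="neg_mean_squared_error", cv=cv).mean()
    for lam in lambdas
]

# Estimate the fitting graph from the noisy scatterplot with a regression
# spline: a least-squares cubic spline with fixed interior knots.
log_lam = np.log10(lambdas)                  # log scale gives a better-behaved x-axis
knots = np.linspace(-2, 2, 4)                # interior knot placement is an assumption
spline = LSQUnivariateSpline(log_lam, avg_mse, knots, k=3)

best_lam = lambdas[np.argmin(spline(log_lam))]   # minimum of the estimated curve
print(f"lambda at the fitting-graph minimum: {best_lam:.4g}")
```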
The fitting graph is also presented as a tool to promote model robustness, defined as the model's ability to maintain performance across variations in the hyperparameter λ. This concept is demonstrated through a case study on an unstable polynomial regression model. The simulation study reveals that standard k-fold cross-validation, even when repeated 5 or 10 times, selects an incorrect and unstable λ in the overwhelming majority of cases. In contrast, the fitting graph method reliably selects a λ that is both well-fitting and stable. Without the fitting graph, the model developer is more likely to choose a highly unstable λ, leading to suboptimal model performance.
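One plausible way to operationalize this stability criterion, sketched below under stated assumptions: among λ values whose fitted MSE lies within a tolerance of the estimated curve's minimum, prefer the one where the fitting graph is flattest. The 5% tolerance and the flatness measure (smallest absolute slope of the smoothed curve) are assumptions for illustration; the abstract does not specify the study's exact selection rule.

```python
import numpy as np

def select_stable_lambda(lambdas, fitted_mse, tol=0.05):
    """Pick a lambda that is both near-optimal and on a flat stretch
    of the estimated fitting graph.

    `fitted_mse` is the spline-smoothed MSE curve evaluated at `lambdas`.
    The 5% tolerance and the flatness criterion are illustrative
    assumptions, not the study's stated rule.
    """
    fitted_mse = np.asarray(fitted_mse, dtype=float)
    # Indices whose smoothed MSE is within `tol` of the curve's minimum.
    near_opt = np.where(fitted_mse <= (1.0 + tol) * fitted_mse.min())[0]
    # Numerical slope of the curve with respect to log10(lambda).
    slopes = np.abs(np.gradient(fitted_mse, np.log10(lambdas)))
    # Among near-optimal candidates, take the flattest (most stable) one.
    return lambdas[near_opt[np.argmin(slopes[near_opt])]]

# Usage with a synthetic U-shaped stand-in for the smoothed MSE curve:
lams = np.logspace(-3, 3, 30)
curve = (np.log10(lams) - 0.5) ** 2 + 1.0
print(select_stable_lambda(lams, curve))
```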