The 5 Stats That Actually Matter for NIR Model Evaluation (R² is Not One of Them)
Stop relying on R² to evaluate your NIR calibration. Learn RMSEP, RPD, Bias, RMSEC/RMSECV ratio, and what thresholds to actually use.
Every time I see a calibration report that leads with R² = 0.98, I brace myself. Not because 0.98 is necessarily wrong — but because R² alone tells you almost nothing useful about whether a model will actually perform in production. I've seen R² = 0.99 models that fell apart on new samples and R² = 0.91 models that ran a grain testing program reliably for eight years. The difference is in the statistics that actually measure what matters.
Here are the five statistics I use when evaluating any NIR calibration. I'll also explain why R² doesn't make the list — and why you should be suspicious of any report that leads with it.
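RMSEP is the most important single number in your calibration report. It measures the typical prediction error on an independent validation set, meaning samples the model never saw during development. Because it is a root mean square rather than a plain average, large misses weigh more heavily than small ones. The formula is: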
RMSEP = √(Σ(predicted − reference)² / n)
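Here n is the number of validation samples. This is the number that tells you, in the same units as your analyte, how wrong your instrument is likely to be on a real production sample. If your protein RMSEP is 0.45%, a typical prediction misses the reference value by about 0.45 percentage points of protein. If the errors are roughly normal and unbiased, about two-thirds of predictions land within that band and about 95% within twice it.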
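As a minimal sketch, the RMSEP formula above can be computed in a few lines of Python. The function name and the protein values are hypothetical, made up purely for illustration, and the validation samples are assumed to be truly independent of the calibration set.

```python
import math

def rmsep(predicted, reference):
    """Root mean square error of prediction: sqrt(sum((pred - ref)^2) / n)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("need at least one validation sample")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical protein values (%), not real data.
reference = [11.2, 12.5, 13.1, 10.8, 12.0]  # lab reference results
predicted = [11.5, 12.1, 13.4, 11.0, 11.6]  # NIR predictions for the same samples

print(f"RMSEP = {rmsep(predicted, reference):.2f}% protein")
```

The result is in the same units as the inputs, so it can be compared directly against the tolerance your application requires.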
What's acceptable? It depends entirely on your application — which is why RMSEP alone isn't enough. A 0.45% protein RMSEP might be excellent for a rough screening application and completely unacceptable for a high-value ingredient specification. You need context, which is where RPD comes in.
RPD = SD of the reference values ÷ RMSEP. It expresses your model's prediction error relative to the natural variation in your sample population. Here are the thresholds I use (based on published chemometrics literature and my own field experience in feed and grain analysis):
RPD contextualizes RMSEP. A model predicting a narrow-range ingredient might have a small RMSEP but a poor RPD, because the natural variation is also small and the model has to be proportionally more precise to tell samples apart. RPD captures this relationship.
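To make the narrow-range case concrete, here is a small sketch that applies the same prediction errors to two hypothetical validation sets, one wide and one narrow. All values are invented for illustration, and the sample standard deviation (n − 1 denominator) is an assumption on my part; use whichever SD convention your own reports follow.

```python
import math
import statistics

def rmsep(predicted, reference):
    """Root mean square error of prediction on a validation set."""
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rpd(predicted, reference):
    """RPD = SD of the reference values / RMSEP (sample SD assumed)."""
    return statistics.stdev(reference) / rmsep(predicted, reference)

# Identical hypothetical prediction errors applied to both sets.
errors = [0.3, -0.4, 0.3, 0.2, -0.4]

wide_ref = [8.0, 10.0, 12.0, 14.0, 16.0]    # wide natural variation
narrow_ref = [11.6, 11.8, 12.0, 12.2, 12.4]  # narrow natural variation

wide_pred = [r + e for r, e in zip(wide_ref, errors)]
narrow_pred = [r + e for r, e in zip(narrow_ref, errors)]

for name, pred, ref in [("wide", wide_pred, wide_ref),
                        ("narrow", narrow_pred, narrow_ref)]:
    print(f"{name}: RMSEP={rmsep(pred, ref):.2f}, "
          f"SD={statistics.stdev(ref):.2f}, RPD={rpd(pred, ref):.2f}")
```

Both sets come out with the same RMSEP (about 0.33), yet the wide-range set has an RPD of about 9.6 while the narrow-range set falls below 1. Identical error, very different usefulness.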