SLU publication database (SLUpub)

Research article, 2006, Peer reviewed

Hyperspectral NIR image regression part II: Dataset preprocessing diagnostics

Burger J, Geladi P

Abstract

When known reference values such as concentrations are available, the spectra from near infrared (NIR) hyperspectral images can be used for building regression models. The sets of spectra must be corrected for errors, transformed to reflectance or absorbance values, and trimmed of bad pixel outliers in order to build robust models and minimize prediction errors. Calibration models can be computed from small (< 100) sets of spectra, where each spectrum summarizes an individual image or spatial region of interest (ROI), and then used to predict large (> 20 000) test sets of spectra. When the distributions of these large populations of predicted values are viewed as histograms, they provide mean sample concentrations (peak centers) as well as uniformity (peak widths) and purity (peak shape) information. The same predicted values can also be viewed as concentration maps or images, adding spatial information to the uniformity or purity presentations. Estimates of large population statistics enable a new metric for determining the optimal number of model components, based on a combination of global bias and pooled standard deviation values computed from multiple test images or ROIs. Two example datasets are presented: an artificial mixture design of three chemicals with distinct NIR spectra, and samples of different cheeses. In some cases, baseline correction by taking first derivatives gave more useful prediction results by reducing optical problems. Other data pretreatments resulted in negligible changes in prediction errors, which were overshadowed by the variance associated with sample preparation or presentation and other physical phenomena. Copyright © 2007 John Wiley & Sons, Ltd.
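The abstract names the two ingredients of the model-selection metric (global bias and pooled standard deviation over multiple test images or ROIs) but not how they are combined. The Python sketch below is only an illustration under stated assumptions: each test image is a list of per-pixel predicted values with a single reference value, and the two terms are combined by root-sum-of-squares. That combination rule and the function names `global_bias`, `pooled_std`, and `combined_error` are hypothetical and are not taken from the paper.

```python
import math


def global_bias(predictions, references):
    """Mean of (predicted - reference) over every pixel of every test image."""
    residuals = [
        value - ref
        for image_values, ref in zip(predictions, references)
        for value in image_values
    ]
    return sum(residuals) / len(residuals)


def pooled_std(predictions):
    """Pooled within-image standard deviation of the predicted values."""
    sum_squares = 0.0
    degrees_of_freedom = 0
    for image_values in predictions:
        mean = sum(image_values) / len(image_values)
        sum_squares += sum((value - mean) ** 2 for value in image_values)
        degrees_of_freedom += len(image_values) - 1
    return math.sqrt(sum_squares / degrees_of_freedom)


def combined_error(predictions, references):
    # ASSUMPTION: root-sum-of-squares combination; the abstract does not
    # specify how the paper combines the two terms.
    return math.hypot(global_bias(predictions, references), pooled_std(predictions))


# Toy data: two "images" of two pixels each, with one reference value per image.
predictions = [[1.0, 3.0], [4.0, 6.0]]
references = [2.0, 4.0]
print(global_bias(predictions, references))
print(pooled_std(predictions))
print(combined_error(predictions, references))
```

Under these assumptions, one would evaluate `combined_error` for models with increasing numbers of components and pick the number that minimizes it.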

Published in

Journal of Chemometrics
2006, Volume: 20, Number: 3-4, Pages: 106-119
Publisher: JOHN WILEY & SONS LTD