Background: Visible-near-infrared spectrometry is a technique suitable for assessing chemical and physiological properties of fruit. Several calibration/prediction models were tested to assess the feasibility of using a visible-near-infrared sensor to monitor persimmon fruit colour, firmness, soluble solids, titratable acidity and soluble tannins.
Results: Five regression models were investigated: principal component, partial least squares, stepwise, support vector machines and ensembles of trees. These models were assessed by 10-fold cross-validation combined with a new strategy for both outlier removal and wavelength reduction; furthermore, their statistical significance was evaluated over 100 Monte Carlo simulation runs. Principal component regression produced excellent or very good fit/prediction models. The results, expressed as RPD (the ratio of the standard deviation to the standard error of performance), are: 9.23 (±0.26) for colour index, 10.18 (±0.37) for firmness, 7.15 (±0.28) for soluble solids content, 7.87 (±0.31) for titratable acidity and 8.91 (±0.33) for soluble tannins content.
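As an illustration of the RPD figure of merit reported above, the following minimal sketch computes it from reference and predicted values. It assumes the error term is the bias-corrected standard deviation of the residuals; some authors use the root mean square error of prediction instead, and the abstract does not state which was used. The function name `rpd` and the sample values are hypothetical and are not taken from the study.

```python
import statistics


def rpd(reference, predicted):
    """Ratio of the SD of the reference values to the standard error of performance.

    Assumption: the standard error is taken as the sample standard deviation
    of the residuals (bias-corrected), which may differ from the study's choice.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    residuals = [p - r for r, p in zip(reference, predicted)]
    sd = statistics.stdev(reference)    # spread of the measured values
    sep = statistics.stdev(residuals)   # spread of the prediction errors
    return sd / sep


# Made-up values for illustration only (not data from the study).
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [10.5, 11.5, 14.5, 15.5, 18.5]
rpd_value = rpd(reference, predicted)
print(round(rpd_value, 2))  # about 5.77 for these made-up values
```

In a cross-validated setting, `predicted` would hold the out-of-fold predictions, so that the ratio reflects prediction error rather than fit error.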
Conclusion: The proposed strategy for outlier removal and wavelength reduction yielded useful results. Principal component regression showed excellent fit/prediction capability. Conversely, partial least squares regression showed fair/poor results, and the remaining tested models performed badly on real data. © 2017 Society of Chemical Industry.
Keywords: PCR; PLS; RPD; SVM; ensemble trees; regression.