H Martens and M Martens 2001 Meas. Sci. Technol. 12 1746 doi:10.1088/0957-0233/12/10/708
H Martens and M Martens
The book provides an introduction to multivariate data analysis using linear modelling and its applications to quality assessment, with special emphasis on chemometrics and sensory science. The aim of the book is to help students and researchers (the problem and data `owners') to analyse their large empirical data sets with minimal prior knowledge of algebra and mathematical statistics.
The text (445 pages) is divided into four parts: overview (76 pp), methodology (156 pp), applications (122 pp) and appendices (72 pp). In the first part, the authors motivate the reader to use multivariate methods, explain concepts of quality assessment and provide `a layman's guide to multivariate data analysis'. The importance of background knowledge about the problem and of good-quality input data is emphasized. A research project with the aim of obtaining new facts from an empirical data set is divided into six steps, from the original question to the unfolded answer. These steps are then followed formally in all the examples presented. Finally, the principles of soft bilinear modelling (the basic data-analytic tool used in the book) are described at a glance. In the second part, the method of partial least squares regression, including its individual variants and applications (multivariate calibration, prediction, discrimination and classification), is derived from the methods of linear least squares regression and principal component analysis. Individual chapters are dedicated to validation of results and experimental planning. Part three describes five specific experiments: analysis of NIR spectra, analysis of questionnaire data on the quality of the working environment, prediction of toxicity from chemical structure, quality monitoring of a sugar production process, and exploratory search for optimal conditions preventing loss of quality in stored food. Appendices in the final part provide additional information for each chapter. The text is supported by 97 figures, 19 tables and 114 references.
Most derivations and explanations in the book are based on examples that are described in almost every detail except the calculations: the main emphasis is on experimental design, organization of input data tables and, above all, interpretation and validation of results. The computation itself is expected to be performed by available data-analytic software such as The Unscrambler (http://www.camo.no) or PLS_Toolbox in Matlab (http://www.mathworks.com). However, the book provides a rather general guide independent of any specific software.
The beginner will probably benefit most from the great experience of the authors: the description of advantages and risks in multivariate data analysis is well balanced and well documented. Minimal abstraction and mathematical formalism may also bring the book closer to a wider readership. At the same time, the authors' approach is fair and responsible: they encourage readers to work independently, but they clearly mark the limits beyond which the reader should seek help from a professional statistician. The absence of abstraction does not necessarily prevent some generalization either: while the reader will probably solve new problems by analogy with those described in the book, the examples presented are accompanied by lists of related problems in order to demonstrate the wide spectrum of possible applications.
The strengths of the book also determine some of its limitations: although it is intended for `laymen' and contains minimal mathematics, it is not easy to read. The many references to other parts of the book, descriptions of the authors' intentions and directions on how to read the book, though often useful, sometimes distract the reader from the main problem rather than focusing attention on it. A diligent reader, however, will find the book useful: beginners and users of data-analytic software as a good start and introduction, more advanced users and teachers as a possible source of inspiration and a good teaching aid.
Martin Samal
Issue 10 (October 2001)