pcr
Purpose
Principal components regression: multivariate inverse least squares regression.
Synopsis
model = pcr(x,y,ncomp,options) %calibration
pred = pcr(x,model,options) %prediction
valid = pcr(x,y,model,options) %validation
options = pcr('options')
Description
PCR calculates a single principal components regression model using the given number of components ncomp to predict y from measurements x.
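The idea behind the calculation can be sketched in base MATLAB without any toolbox calls. This is only an illustration of textbook PCR via the singular value decomposition, not the code that pcr itself runs. The synthetic data, the mean-centering step, and all variable names are assumptions made for the example.

```matlab
% Textbook PCR via SVD on synthetic data (illustrative only).
n = 30; nx = 8; ncomp = 3;
tt = randn(n,ncomp);                        % hypothetical latent scores
pp = randn(nx,ncomp);                       % hypothetical latent loadings
x  = tt*pp' + 0.01*randn(n,nx);             % x-block with rank-3 structure plus noise
y  = tt*[1; -2; 0.5] + 0.01*randn(n,1);     % y-block driven by the same scores

% Mean-center both blocks (assumed preprocessing for this sketch).
xmean = mean(x,1);
ymean = mean(y,1);
xc = bsxfun(@minus, x, xmean);
yc = bsxfun(@minus, y, ymean);

[u,s,v] = svd(xc,'econ');                   % xc = u*s*v'
p   = v(:,1:ncomp);                         % x-block loadings for the kept components
t   = xc*p;                                 % x-block scores
q   = (t'*t)\(t'*yc);                       % least-squares regression of y on the scores
reg = p*q;                                  % regression vector in x-variable space

ypred = bsxfun(@plus, xc*reg, ymean);       % predictions on the original y scale
```

Because the scores are mutually orthogonal, t'*t is diagonal and the regression on the scores is well conditioned even when the columns of x are strongly collinear.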
To construct a PCR model, the inputs are x, the predictor x-block (2-way array, class “double” or “dataset”); y, the predicted y-block (2-way array, class “double” or “dataset”); ncomp, the number of components to be calculated (positive integer scalar); and the optional structure options. The output is a standard model structure, model, with the following fields (see MODELSTRUCT):
modeltype: 'PCR',
datasource: structure array with information about input data,
date: date of creation,
time: time of creation,
info: additional model information,
reg: regression vector,
loads: cell array with model loadings for each mode/dimension,
pred: 2-element cell array with model predictions for each input block: the x-block predictions (not saved when options.blockdetails = 'standard', in which case this element is an empty array) and the y-block predictions,
tsqs: cell array with T^2 values for each mode,
ssqresiduals: cell array with sum of squares residuals for each mode,
description: cell array with text description of model, and
detail: sub-structure with additional model details and results.
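A minimal calibration sketch follows, assuming PLS_Toolbox is on the MATLAB path. It uses only the call forms and field names documented on this page. The data are synthetic stand-ins, and display and plotting are switched off so the example runs quietly.

```matlab
% Requires PLS_Toolbox; x and y are synthetic stand-ins for real data.
x = randn(40,10);                               % 40 samples, 10 x-variables
y = x(:,1:3)*[1; -2; 0.5] + 0.05*randn(40,1);   % single y-variable

options = pcr('options');                       % default options structure
options.display = 'off';                        % no command-window output
options.plots   = 'none';                       % no plots

ncomp = 3;
model = pcr(x,y,ncomp,options);                 % calibration call from the Synopsis

disp(model.modeltype)                           % documented as 'PCR'
reg   = model.reg;                              % regression vector
yfit  = model.pred{2};                          % y-block predictions (second cell)
xfit  = model.pred{1};                          % x-block predictions (may be empty)
```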
To make predictions, the inputs are x, the new predictor x-block (2-way array, class “double” or “dataset”), and model, the PCR model. The output pred is a structure, similar to model, that contains scores, predictions, etc. for the new data.
If new y-block measurements are also available, then the inputs are x, the new predictor x-block (2-way array, class “double” or “dataset”); y, the new predicted block (2-way array, class “double” or “dataset”); and model, the PCR model. The output valid is a structure, similar to model, that contains scores, predictions, and additional y-block statistics for the new data.
In prediction and validation modes, the same model structure
is used but predictions are provided in the model.detail.pred field.
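The prediction and validation calls can be sketched as follows, assuming PLS_Toolbox is installed. The calibration and test data are synthetic, and the variable names xcal, ycal, xnew, and ynew are hypothetical.

```matlab
% Requires PLS_Toolbox; all data below are synthetic stand-ins.
c    = [1; -2; 0.5];                            % illustrative coefficients
xcal = randn(40,10);  ycal = xcal(:,1:3)*c + 0.05*randn(40,1);
xnew = randn(10,10);  ynew = xnew(:,1:3)*c + 0.05*randn(10,1);

options = pcr('options');
options.display = 'off';                        % keep the example quiet
options.plots   = 'none';

model = pcr(xcal,ycal,3,options);               % calibration
pred  = pcr(xnew,model,options);                % prediction: new x-block only
valid = pcr(xnew,ynew,model,options);           % validation: new x- and y-blocks
```

The only difference between the two calls is whether the new y-block is supplied. When it is, the returned structure also carries y-block statistics for the new data.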
Note: Calling pcr with no inputs starts the graphical user interface (GUI) for this analysis method.
Options
options = a structure array with the following fields:
display: [ 'off' | {'on'} ], governs level of display to command window,
plots: [ 'none' | {'final'} ], governs level of plotting,
outputversion: [ 2 | {3} ], governs output format (discussed below),
preprocessing: { [] [] }, two-element cell array containing preprocessing structures (see PREPROCESS) defining preprocessing to use on the x- and y-blocks (first and second elements, respectively),
algorithm: [ {'svd'} | 'robustpcr' | 'correlationpcr' ], governs which algorithm to use. 'svd' is the standard algorithm, 'robustpcr' is a robust algorithm with automatic outlier detection, and 'correlationpcr' is standard PCR with factors re-ordered by y-variance captured,
blockdetails: [ 'compact' | {'standard'} | 'all' ], extent of predictions and raw residuals included in model. 'standard' = only y-block, 'all' = x- and y-blocks,
confidencelimit: [ {0.95} ], confidence level for Q and T^2 limits. A value of zero (0) disables calculation of confidence limits,
roptions: structure of options to pass to rpcr (robust PCR engine from the Libra Toolbox). Only used when algorithm is 'robustpcr'. Its fields are:
  alpha: [ {0.75} ], (1-alpha) measures the fraction of outliers the algorithm should resist. Any value between 0.5 and 1 may be specified.
  intadjust: [ {0} ], if equal to one, the intercept adjustment for the LTS-regression will be calculated. See ltsregres.m for details (Libra Toolbox).
The default options can be retrieved using: options = pcr('options');.
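A typical pattern is to retrieve the defaults and override only the fields of interest. The sketch below assumes PLS_Toolbox is installed and uses only field names and values listed above. The particular choices are illustrative, not recommendations.

```matlab
% Requires PLS_Toolbox; the values chosen here are illustrative only.
options = pcr('options');                 % start from the defaults

options.display         = 'off';          % suppress command-window output
options.plots           = 'none';         % suppress plotting
options.algorithm       = 'correlationpcr'; % re-order factors by y-variance captured
options.blockdetails    = 'all';          % keep x- and y-block predictions and residuals
options.confidencelimit = 0.99;           % confidence level for Q and T^2 limits

% The robust settings apply only when options.algorithm is 'robustpcr'.
options.roptions.alpha  = 0.8;            % must lie between 0.5 and 1
```

The modified structure is then passed as the last input of any of the calls in the Synopsis.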
OUTPUTVERSION
By default (options.outputversion = 3) the output of the function is a standard model structure, model. If options.outputversion = 2, the output format is:
[b,ssq,t,p] = pcr(x,y,ncomp,options)
where the outputs are
b = matrix of regression vectors or matrices for
each number of principal components up to ncomp,
ssq = the sum of squares information,
t = x-block scores, and
p = x-block loadings.
Note: The regression matrices are ordered in b such that each Ny
(number of y-block variables) rows correspond to the regression matrix for that
particular number of principal components.
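The row ordering described in the note can be illustrated with plain indexing. The sketch uses a stand-in matrix in place of a real b so that it runs without the toolbox. The assumption that b has one column per x-variable is not stated on this page and is made only for the example.

```matlab
% Stand-in for the b output of outputversion 2 (not a real regression result).
ny    = 2;                                   % number of y-block variables (Ny)
nx    = 5;                                   % number of x-variables (assumed column count)
ncomp = 3;                                   % largest number of components
b = reshape(1:(ny*ncomp*nx), ny*ncomp, nx);  % ncomp stacked blocks of ny rows each

k    = 2;                                    % pick the model with k principal components
rows = (k-1)*ny + (1:ny);                    % rows belonging to the k-component block
bk   = b(rows,:);                            % regression matrix for k components
```

With ny = 2 and k = 2 the selected rows are 3 and 4, that is, the second block of Ny rows.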
See Also
analysis, crossval, frpcr, modelstruct, pca, pls, preprocess, ridge