Predict Directive

Prediction process

ASReml parses the predict statement before fitting the model. If any syntax problems are encountered, these are reported in the .pvs file, after which the statement is ignored: the job is completed as if the erroneous prediction statement did not exist. The predictions are formed as an extra process in the final iteration and are reported to the .pvs file. Consequently, aborting a run by creating the ABORTASR.NOW file will cause any predict statements to be ignored, whereas using FINALASR.NOW will allow any predict statements to be honoured.

By default, factors are predicted at each level, simple covariates are predicted at their overall mean and covariates used as a basis for splines or orthogonal polynomials are predicted at their design points. Model terms mv and units are always ignored.

Prediction at particular values of a covariate or particular levels of a factor is achieved by listing the values after the variate/factor name. Where there is a sequence of values, use the notation a b ... n to represent the sequence of values from a to n with step size b-a. The default step size is 1 (in which case b may be omitted). A colon (:) may replace the ellipsis (...). An increasing sequence is assumed. When giving particular values for factors, the default is to use the coded level (1:n) rather than the label (alphabetical or integer). To use the label, precede it with a quote (").

The second step is to specify the averaging set. The default averaging set is those explanatory variables involved in fixed effect model terms that are not in the classifying set. By default, variables that only define random model terms are ignored. The qualifier !AVERAGE allows these variables to be added to the default averaging set.
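To make the sequence notation for covariate values concrete, the following is a minimal Python sketch of how a b ... n could be expanded. It is an illustration only, not ASReml code: the function name expand_sequence and the choice to stop at the last value not exceeding n are assumptions made for this example.

```python
def expand_sequence(a, n, b=None):
    """Expand 'a b ... n' into values from a up to n with step b - a.

    If b is omitted the step size defaults to 1, as described in the text.
    """
    step = 1 if b is None else b - a
    if step <= 0:
        # The text states that an increasing sequence is assumed.
        raise ValueError("an increasing sequence is assumed")
    values = []
    k = 0
    # A small tolerance keeps the end point when the step is a float.
    while a + k * step <= n + 1e-9 * abs(step):
        values.append(a + k * step)
        k += 1
    return values


print(expand_sequence(1, 5))          # a ... n with the default step of 1
print(expand_sequence(0, 10, b=2.5))  # a b ... n with step b - a = 2.5
```

Computing each value as a + k * step, rather than adding the step repeatedly, avoids accumulating rounding error over long sequences.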

The third step is to select the linear model terms to use in prediction. The default is that all model terms based entirely on variables in the classifying and averaging sets are used. Two qualifiers allow this default to be modified by adding (!USE) or removing (!IGNORE) model terms. The qualifier !ONLYUSE explicitly specifies the model terms to use, ignoring all others. The qualifier !EXCEPT explicitly specifies the model terms not to use, including all others. These qualifiers may implicitly modify the averaging set by including variables defining terms in the predicted model not in the classify set. It is sometimes easier to specify the classify set and the model terms to use and allow ASReml to construct the averaging set.
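The default rules for the averaging set and for term selection can be sketched in Python as simple set operations. This is an illustration under stated assumptions, not ASReml's implementation: the model (variety, nitrogen and block), the representation of a term as a tuple of variable names, and both function names are hypothetical.

```python
def default_averaging_set(fixed_terms, classify):
    """Variables appearing in fixed terms that are not in the classify set."""
    in_fixed = set()
    for term in fixed_terms:
        in_fixed |= set(term)
    return in_fixed - set(classify)


def default_terms(all_terms, classify, averaging):
    """Terms based entirely on variables in the classify and averaging sets."""
    allowed = set(classify) | set(averaging)
    return [term for term in all_terms if set(term) <= allowed]


# Hypothetical model: each term is a tuple of the variables that define it.
fixed_terms = [("variety",), ("nitrogen",), ("variety", "nitrogen")]
random_terms = [("block",)]
classify = ["variety"]

averaging = default_averaging_set(fixed_terms, classify)
used = default_terms(fixed_terms + random_terms, classify, averaging)

print(averaging)  # only nitrogen: block defines random terms only
print(used)       # the three fixed terms; the block term is left out
```

In this sketch, adding block to the averaging set (the effect the text attributes to !AVERAGE) would also bring the block term into the default set of terms used.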

The fourth step is to choose the weights to use when averaging over dimensions in the hyper-table. The default is to simply average over the specified levels but the qualifier !AVERAGE factor weights allows other weights to be specified.
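The difference between the default simple average and a weighted average can be illustrated with a short Python sketch. The cell values and weights are invented for illustration, and normalising the weights by their sum is an assumption of this sketch rather than a documented ASReml behaviour.

```python
def average_over(cells, weights=None):
    """Average cell predictions over one dimension of the hyper-table.

    With no weights every cell counts equally; otherwise the weights are
    normalised by their sum (an assumption made in this sketch).
    """
    if weights is None:
        weights = [1.0] * len(cells)
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, cells)) / total


# Hypothetical predictions for one variety at three nitrogen levels.
cells = [10.0, 12.0, 17.0]

print(average_over(cells))             # simple average over the three levels
print(average_over(cells, [2, 1, 1]))  # first level given twice the weight
```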

There are often situations in which the fixed effects design matrix X is not of full column rank. These can be classified according to the cause of aliasing.
1. linear dependencies among the model terms due to over-parameterisation of the model,
2. no data present for some factor combinations so that the corresponding effects cannot be estimated,
3. linear dependencies due to other, usually unexpected, structure in the data.

The first type of aliasing is imposed by the parameterisation chosen and can be determined from the model. The second type of aliasing can be detected when setting up the design matrix for parameter estimation (which may require revision of imposed constraints). The third type can then be detected during the absorption of the mixed model equations. Dependencies (aliasing) can be dealt with in several ways and ASReml checks that predictions are of estimable functions in the sense defined by Searle (1971, p160) and are invariant to the constraint method used.
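As an illustration of estimability, a linear function k'b of the fixed effects is estimable exactly when k lies in the row space of X, which can be tested by checking that appending k to X does not increase the rank. The Python sketch below applies that test to a small over-parameterised one-way layout. It is not a description of how ASReml performs its check; the design matrix and function names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry here.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def is_estimable(X, k):
    """k'b is estimable iff adding k as a row leaves the rank of X unchanged."""
    return rank(X + [k]) == rank(X)


# Over-parameterised one-way layout: columns are mu, a1, a2.
X = [[1, 1, 0],
     [1, 1, 0],
     [1, 0, 1]]

print(is_estimable(X, [1, 1, 0]))   # mu + a1: a cell mean
print(is_estimable(X, [0, 1, 0]))   # a1 alone
print(is_estimable(X, [0, 1, -1]))  # a1 - a2: a contrast
```

Exact rational arithmetic is used so that the rank comparison does not depend on a floating-point tolerance.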

ASReml does not print predictions of non-estimable functions unless the !PRINTALL qualifier is specified. However, using !PRINTALL is rarely a satisfactory solution. Failure to report predicted values normally means that the predict statement is averaging over some cells of the hyper-table that have no information and therefore cannot be averaged in a meaningful way. Appropriate use of the !AVERAGE and/or !PRESENT qualifiers will usually resolve the problem. The !PRESENT qualifier enables the construction of means by averaging only the estimable cells of the hyper-table. It is regularly used for nested factors, for example locations nested in regions.
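The idea of averaging only the estimable cells can be sketched in Python for locations nested in regions. The table values are invented, None marks a cell with no information, and both function names are hypothetical. The sketch shows the concept only and does not reproduce ASReml's !PRESENT processing.

```python
def mean_all(cells):
    """Mean over every cell; not available if any cell has no information."""
    if any(c is None for c in cells):
        return None
    return sum(cells) / len(cells)


def mean_present(cells):
    """Mean over only the cells that carry information."""
    present = [c for c in cells if c is not None]
    return sum(present) / len(present) if present else None


# Hypothetical region by location table. Locations are nested in regions,
# so each region has values only for its own locations.
table = {
    "north": [4.0, 6.0, None, None],
    "south": [None, None, 5.0, 9.0],
}

for region, cells in table.items():
    print(region, mean_all(cells), mean_present(cells))
```

Averaging over all four locations gives no usable region mean, whereas averaging over the present cells gives one mean per region.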
