What's New in Release 2

Introduction

These notes on ASReml 2 are intended for those familiar with ASReml 1 and describe additions to the syntax. New users should read the User Guide, which provides a comprehensive guide to fitting mixed models in ASReml.

The licensing module has been upgraded and you will need to acquire a new license to use ASReml 2. Generally, this license will be perpetual for all releases during the currency of the license but may vary with your particular contract with VSN. If the license is not renewed, you will not be able to run subsequent releases.

VSN has developed a user front end, WinASReml, to facilitate running ASReml jobs. Many users like the ConText ASCII editor for running ASReml.

An ASReml Discussion list is hosted by NSW Dept of Primary Industries. To join the list or change your email address, email arthur.gilmour@dpi.nsw.gov.au. You may then direct your comments/queries to ASREML-L@dpi.nsw.gov.au.

Critical Changes to Behaviour

Generally we seek to maintain upward compatibility so that ASReml 1 code will continue to run. However, to deliver improved facilities, some changes to behaviour are unavoidable.

Expanded ANOVA options.

The Analysis of Variance table has been expanded to optionally include a conditional Wald F-statistic (!FCON qualifier). In most cases, denominator degrees of freedom can also be calculated (!DDF qualifier), allowing significance testing of the F-statistics.

Command Line

Template from data file

The facility to generate a template .as file has been moved from the MENU mode to the command line, and extended.

Command line Options

Command line options (some with arguments) are presented as a single concatenated string with a leading - as the first program argument. Remember that [ ] in the guide is used to indicate optional input and such square brackets are not to be typed into the command file.

Most command line options can now alternatively be given on the top job control line in qualifier form. The new command line options are:
Option  Qualifier form  Description
A       !ASK a          Prompt for options and arguments
B       !BRIEF b        Reduce output to .asr file
H       !HARDCOPY g     Graphics written to file but not to screen
J       !JOIN           Concatenate output from !CYCLE qualifier
Q       !QUIET          No screen interaction
O       !ONERUN         Do only one analysis (override !RENAME cycling)
R       !RENAME r       Build r arguments into output filename and cycle through extra arguments
W       !WORKSPACE w    Set workspace to w Mbyte
Z       !LICENSE        Display current license information

Top job control line, a new optional header line.

The options can be set either as a concatenated string in the same format as expected on the command line, or as a list of qualifiers. In the former case, the syntax is
     !-s a
where s is the option letter string as defined for the command line options and a is a list of command line arguments. For example
     !-h22r 1 2 3
on the first line is equivalent to running ASReml with the command line
     ASReml -h22r jobname 1 2 3

Alternatively, !ASK prompts for an options string and arguments (like the A command line option). It is assumed that no other qualifiers are set on this line when !ASK is specified. For example,
     -h22r 1 2 3
might be the response. The allowed options are -BbCDEFGgHgIJLNORrSsWwYy

There are top job control line qualifiers for most command line options. They include qualifiers to specify the graphics format.

Paths and loops in the .as file

ASReml is designed to analyse just one model per run. However, the analysis of a data set typically requires many runs, fitting different models to different traits. It is often convenient to have all these runs coded into a single .as file and control the details from the command line (or job control line) using arguments. This is possible using !CYCLE and !PATH.

Field Definition qualifiers

Storage of alphabetic factor labels

The storage of factor level labels has changed. Previously there was space for 5000 labels of 20 characters each. Now space is allocated dynamically, with the default allocation being 2000 labels of 16 characters each.

!PRUNE on a field definition line means that if fewer levels are actually present in the factor than were declared, ASReml will reduce the factor size to the actual number of levels. Use !PRUNEALL for this action to be taken on the current and subsequent factors up to (but not including) a factor with the !PRUNEOFF qualifier.

Reordering the factor levels

!SORT declared after !A or !I on a field definition line will cause ASReml to sort the levels so that labels occur in alphabetic/numeric order for the analysis. !SORTALL means that the levels for the current and subsequent factors are to be sorted.

Skipping input fields

!SKIP f will skip f data fields before reading this field. It is particularly useful in large files with alphabetic fields which are not needed, as it saves ASReml the time required to classify the alphabetic labels. For example,
     Sire !I !skip 1
would skip the field before the field which is read as 'Sire'.
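
The effect of !SKIP on field reading can be sketched in Python. This only illustrates the rule stated above and is not ASReml's own reader; the function name read_record, the sample record and the assumption of whitespace-separated fields are all made up for the example.

```python
def read_record(line, skip_counts):
    """Return the values kept from one whitespace-separated data line.

    skip_counts[i] is the !SKIP value on the i-th field definition
    (0 when the qualifier is absent). Hypothetical helper, not part of ASReml.
    """
    tokens = line.split()
    values = []
    position = 0
    for skip in skip_counts:
        position += skip              # pass over the unwanted fields first
        values.append(tokens[position])
        position += 1                 # then consume the field just read
    return values


# An unwanted label sits ahead of Sire: 'Sire !I !skip 1' followed by a variate.
print(read_record("Plot17 S042 3.85", [1, 0]))
```

The unwanted label is never stored or classified, which is where the time saving described above comes from.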

Reading date and time fields

!DATE indicates the field has one of the date formats dd/mm/yy, dd/mm/ccyy, dd-Mon-yy, dd-Mon-ccyy.
!DMY indicates the field has one of the date formats dd/mm/yy or dd/mm/ccyy.
!MDY indicates the field has one of the date formats mm/dd/yy or mm/dd/ccyy.
!TIME indicates the field has a time format hh:mm:ss.
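
As a rough illustration of which input strings each qualifier accepts, the formats can be written as Python strptime patterns. This is a sketch under assumptions: parse_field and the FORMATS table are hypothetical names, the two-digit year pivot is Python's rather than ASReml's, and the numeric value ASReml stores for a date or time is not described in these notes.

```python
from datetime import datetime

# ccyy is read as a four-digit year (%Y) and yy as a two-digit year (%y).
FORMATS = {
    "DATE": ["%d/%m/%y", "%d/%m/%Y", "%d-%b-%y", "%d-%b-%Y"],
    "DMY": ["%d/%m/%y", "%d/%m/%Y"],
    "MDY": ["%m/%d/%y", "%m/%d/%Y"],
    "TIME": ["%H:%M:%S"],
}


def parse_field(text, qualifier):
    """Try each format listed for the qualifier and return the first match."""
    for fmt in FORMATS[qualifier]:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue                  # wrong layout, try the next format
        return parsed.time() if qualifier == "TIME" else parsed.date()
    raise ValueError(f"{text!r} does not match any !{qualifier} format")


print(parse_field("25/12/2005", "DMY"))
print(parse_field("25-Dec-05", "DATE"))
print(parse_field("13:05:09", "TIME"))
```

Note that the same string can mean different days under !DMY and !MDY (for example 03/04/05), so the qualifier has to match the way the file was written.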

New transformations

  • !NORMAL v replaces the variate with normal random variables having variance v. For example,
         Ndat !=0. !Normal 4.5
    creates a new variable (!=0.) and fills it with Normal(0,4.5) random values. These two transformations can be collapsed into one, viz.
         Ndat !=Normal 4.5
  • !REPLACE o n replaces data values o with n in the current variable.
  • !RESCALE o s rescales the column(s) in the current variable (!G group of variables) using Y=(Y+o)*s.
  • !SEED n sets the seed for the random number generator. For example,
         !SEED 848586
    sets the seed for the random number generator to 848586.
  • !SETN v n replaces data values 1:n with normal random variables having variance v. Data values outside the range 1:n are set to 0. For example,
         Anorm !=A !SETN 2.5 10
    replaces data values of 1:10 (copied from variable A) with 10 Normal(0,2.5) random values.
  • !SETU v n replaces data values 1:n with uniform random variables having range 0:v. Data values outside the range 1:n are set to 0. For example,
         Aeff !=A !SETU 5 10
    replaces data values of 1:10 (copied from variable A) with 10 Uniform(0,5) random values.
  • !UNIFORM v replaces the variate with uniform random variables having range 0:v. For example,
         Udat !=0. !Uniform 4.5
    creates a new variable (!=0.) and fills it with Uniform(0,4.5) random values. These two transformations can be collapsed into one, viz.
         Udat !=Uniform 4.5
  • !MM s associates marker positions in the vector s (based on the Haldane mapping function) with marker variables and replaces missing values in a vector of marker states with expected values calculated using distances to non-missing flanking markers.
  • !DOM A is used to form dominance covariables from a set of additive marker covariables previously declared with the !MM marker map qualifier.
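
The arithmetic behind !RESCALE, !SETN and !SETU in the list above can be sketched in Python. The function names are hypothetical, and the sketch assumes that !SETN and !SETU draw one random value per level 1:n so that records sharing a level share a value, which is one reading of "10 Normal(0,2.5) random values". Python's random number generator is not ASReml's, so a given seed will not reproduce ASReml's numbers.

```python
import random


def rescale(values, offset, scale):
    """!RESCALE o s: Y = (Y + o) * s."""
    return [(y + offset) * scale for y in values]


def set_random(values, n, draw):
    """Shared logic for !SETN / !SETU under the one-draw-per-level assumption."""
    effects = [draw() for _ in range(n)]      # one value for each level 1..n
    return [
        effects[int(v) - 1] if 1 <= v <= n and v == int(v) else 0.0
        for v in values                        # anything outside 1:n becomes 0
    ]


rng = random.Random(848586)                    # plays the role of !SEED 848586
levels = [1, 2, 1, 10, 11]

# !SETN 2.5 10: variance 2.5, so the standard deviation is sqrt(2.5).
normal_effects = set_random(levels, 10, lambda: rng.gauss(0.0, 2.5 ** 0.5))
# !SETU 5 10: uniform on the range 0:5.
uniform_effects = set_random(levels, 10, lambda: rng.uniform(0.0, 5.0))

print(rescale([1.0, 2.0], 1.0, 2.0))
print(normal_effects)
print(uniform_effects)
```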

    Pedigree and GIV files

    GIV Files

    The standard .giv file procedure expects the user will supply an inverse matrix. In some situations, it is easier to form the uninverted matrix and not very convenient for the user to invert it outside of ASReml to create the .giv file. In this case, supply the uninverted matrix in the sparse format file but with a file extension .grm. ASReml will then invert the matrix itself before it uses it.

    Pedigree file line qualifiers

    Formation of the A-inverse is now faster, with a substantial gain when many animals have no progeny.

    New pedigree processing options include:
  • !MGS is now formed directly rather than by inserting dummy DAMs.
  • !SELF s allows partial selfing when 'Dam'=='Male parent' unspecified.
  • !INBRED v generates pedigree for inbred lines.
  • The !DIAG qualifier used to return the diagonal of the A-inverse matrix in AINVERSE.DIA. Now it also returns the inbreeding coefficients for the individuals in this file (calculated as the diagonal of A-I).
  • !SORT causes ASReml to sort the pedigree into an acceptable order, that is, parents before offspring, before forming the A-inverse. The sorted pedigree is written to a file whose name has .srt appended to it. ASReml then forms the A-inverse from this new file.
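
The ordering that !SORT aims for, parents before offspring, is a topological sort of the pedigree. The Python sketch below shows one way to obtain such an order; it is not ASReml's algorithm. The record layout (animal, sire, dam), the use of "0" for an unknown parent and the function name sort_pedigree are assumptions made for the example.

```python
def sort_pedigree(records, unknown="0"):
    """Return the records reordered so that every parent precedes its offspring."""
    parents = {animal: (sire, dam) for animal, sire, dam in records}
    ordered = []
    state = {}                                  # animal -> "visiting" or "done"

    def visit(animal):
        if state.get(animal) == "done":
            return
        if state.get(animal) == "visiting":
            raise ValueError(f"{animal} is its own ancestor")
        state[animal] = "visiting"
        for parent in parents[animal]:
            # Parents with no record of their own need no reordering.
            if parent != unknown and parent in parents:
                visit(parent)
        state[animal] = "done"
        ordered.append((animal, *parents[animal]))

    for animal, _, _ in records:
        visit(animal)
    return ordered


pedigree = [("C", "A", "B"), ("A", "0", "0"), ("B", "A", "0")]
for row in sort_pedigree(pedigree):
    print(*row)
```

The recursion depth equals the number of generations, so a very deep pedigree would need an iterative version.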

    Reading Data

    Combining columns from separate files

    !MERGE c filename [ !SKIP n ] [ !MATCH a b ]
    qualifiers may be specified on a line following the data filename line. The purpose is to combine data fields from the (primary) data file with data fields from the secondary (!MERGE) file.

    Combining rows from separate files

    ASReml can read data from multiple files provided the files have the same layout. The file specified as the data file can contain lines of the form
    INCLUDE filename [ !SKIP n]
    where filename is the (path)name of the data subfile and !SKIP n is an optional qualifier indicating that the first n lines of the subfile are to be skipped.
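
The row-combining behaviour described above can be mimicked in Python to check what a master file will expand to. This is a sketch, not ASReml's reader. It assumes that filenames contain no spaces, that lines not starting with INCLUDE are ordinary data rows, and that subfiles are supplied through a caller-provided read_subfile function; both function names are hypothetical.

```python
def expand_includes(master_lines, read_subfile):
    """Replace each 'INCLUDE filename [!SKIP n]' line by the subfile's rows."""
    rows = []
    for line in master_lines:
        tokens = line.split()
        if tokens and tokens[0] == "INCLUDE":
            skip = 0
            if "!SKIP" in tokens:
                skip = int(tokens[tokens.index("!SKIP") + 1])
            rows.extend(read_subfile(tokens[1])[skip:])   # drop the first n lines
        else:
            rows.append(line)                              # assumed to be a data row
    return rows


# In-memory stand-ins for two subfiles with the same layout.
subfiles = {
    "site1.dat": ["plot yield", "1 2.0", "2 3.0"],
    "site2.dat": ["3 4.0"],
}
master = ["INCLUDE site1.dat !SKIP 1", "INCLUDE site2.dat"]
print(expand_includes(master, subfiles.__getitem__))
```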

    Datafile line qualifiers

    Datafile line qualifiers may also be defined using an environment variable called ASREML_QUAL. The environment variable is processed immediately after the data file name line is processed. All qualifier settings are reported in the .asr file.
  • !AILOADINGS i controls modification to AI updates of loadings in factor analytic variance models.
  • !AISINGULARITIES can be specified to force a job to continue even though a singularity was detected in the AI matrix.
  • !BMP sets hardcopy graphics file type to .bmp.
  • !BRIEF suppresses some of the information written to the .asr file.
  • !CONTRAST label ref values
    provides a convenient way to define contrasts among treatment levels.
  • !DDF [ i ]
    requests the calculation of the denominator degrees of freedom needed to assess the significance of F-statistics in the Analysis of Variance.
  • !DENSE n
    has been modified to accept an argument up to 5000. The upper limit in ASReml 1 was 800 which is still the default.
  • !EPS sets hardcopy graphics file type to .eps.
  • !EQORDER o modifies the algorithm used for choosing the order for solving the mixed model equations.
  • !FCON adds a 'conditional' F-statistic column to the Analysis of Variance table.
  • !GKRIGE p controls the expansion of !PVAL lists for fac(X,Y) model terms.
  • !HPGL [2] sets hardcopy graphics file type to HP GL. An argument of 2 sets the hardcopy graphics file type to HP GL 2.
  • !LAST factor_1 lev_1 [ fac_2 lev_2 fac_3 lev_3]
    limits the order in which equations are solved in ASReml by forcing equations in the sparse partition involving the first lev_i equations of factor_i to be solved after all other equations in the sparse partition. It is intended for use when there are multiple fixed terms in the sparse equations so that ASReml will be consistent in which effects are identified as singular.
  • !MBF mbf(X,m) filename [ !SKIP n ]
    specified on a separate line after the data file name line predefines the model term mbf(X,m) as a set of m covariates indexed by the data values.
  • !PS sets hardcopy graphics file type to .ps.
  • !PVSFORM [ f ] modifies the format of the tables in the .pvs file and changes the file extension of the file to reflect the format.
  • !RREC [ n ]
    causes ASReml to read n records or to read up to a data reading error if n is omitted, and then process the records it has.
  • !RSKIP n [s] instructs ASReml to skip the leading lines of the data file up to and including the nth instance of the character string s (default value Ecode).
  • !SCORE requests ASReml write the SCORE vector and the Average Information matrix to files basename .SCO and basename .AIM. The values written are from the last iteration.
  • !SCREEN [ n ] [ !SMX m ]
    performs a 'Regression Screen', a form of all subsets regression.
  • !TOLERANCE [s_1 [ s_2]] modifies the ability of ASReml to detect singularities in the mixed model equations.
  • !SLNFORM [ i ] modifies the format of the .sln file.
  • !SPATIAL increases the amount of information reported on the residuals obtained from the analysis of a two dimensional regular grid field trial. The information is written to the .res file.
  • !SUBSET label factor list
    forms a new factor (label) derived from an existing factor (factor) by selecting a subset (list) of its levels. The qualifier occupies its own line after the data filename line but before the linear model.
  • !SUM causes ASReml to report a general description of the distribution of the data variables and factors and simple correlations among the variables for those records included in the analysis.
  • !TABFORM [ f ] controls form of the .tab file.
  • !TXTFORM [ i ] sets the default argument for !PVSFORM, !SLNFORM, !TABFORM and !YHTFORM if these are not explicitly set.
  • !TWOWAY modifies the appearance of the variogram calculated from the residuals obtained when the sampling coordinates of the spatial process are defined on a lattice.
  • !VRB requests writing of the .vrb file. Previously, the default was to write this file.
  • !VGSECTORS [ s ]
    The sample variogram reported by ASReml now has two forms depending on whether the spatial coordinates represent a complete rectangular lattice (as typical of a field trial) or are irregularly arranged. For the irregular case, !VGSECTORS sets the number of sectors (4, 6 or 8) to use when forming the variogram.
  • !WMF sets hardcopy graphics file type to .wmf.
  • !YHTFORM [ f ] controls the form of the .yht file.
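
Of the qualifiers listed above, !RSKIP n [s] is the most procedural, and its rule can be sketched in a few lines of Python. This is an illustration, not ASReml's implementation. It assumes that an "instance" means a line containing the string s, and it returns no rows when fewer than n such lines exist; the notes do not say what ASReml does in that case. The function name rskip is hypothetical.

```python
def rskip(lines, n, marker="Ecode"):
    """Drop leading lines up to and including the n-th line containing marker."""
    seen = 0
    for index, line in enumerate(lines):
        if marker in line:
            seen += 1
            if seen == n:
                return lines[index + 1:]       # data starts on the next line
    return []                                   # assumption: marker not found n times


data_file = [
    "Trial 1 summary",
    "Ecode Rep Yield",          # first header line containing the default marker
    "Notes for trial 2",
    "Ecode Rep Yield",          # second header line
    "A 1 3.2",
    "B 1 2.9",
]
print(rskip(data_file, 2))
```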

    TABULATE

    TABULATE statements may now appear before the model line, as this is logically a better place for them. If a linear (mixed) model is not supplied, ASReml will generate a simple model as it does not actually read the data until it has read a linear model line. Tabulation is of the data records included in the analysis (i.e. leaving out records eliminated from the analysis model because of missing values in the variate or in the design factors).

    The qualifiers for optional output are: !COUNT, !SD, !RANGE and !STATS. !STATS is shorthand for !COUNT !SD !RANGE.

    The requested statistics are reported for each cell in the table. The tabulation includes the same records as are analysed in the subsequent linear model.

    The qualifier !DECIMALS [d] (1<= d<= 7) requests means be reported with d decimal places. If omitted, ASReml reports 5 significant digits; if specified without an argument, 2 is assumed.
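
The per-cell statistics requested by !COUNT, !SD and !RANGE can be sketched in Python as follows. This is an illustration only. It assumes the sample (n-1) standard deviation and a range reported as (minimum, maximum), neither of which is stated in these notes, and the function name tabulate and the field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def tabulate(records, factor, variate, decimals=2):
    """Mean, count, SD and range of variate for each level of factor."""
    cells = defaultdict(list)
    for record in records:
        # Records with a missing factor or variate are left out of the table.
        if record[factor] is None or record[variate] is None:
            continue
        cells[record[factor]].append(record[variate])
    table = {}
    for level, values in sorted(cells.items()):
        table[level] = {
            "mean": round(mean(values), decimals),
            "count": len(values),
            "sd": stdev(values) if len(values) > 1 else None,
            "range": (min(values), max(values)),
        }
    return table


records = [
    {"variety": "A", "yield": 1.0},
    {"variety": "A", "yield": 3.0},
    {"variety": "B", "yield": 5.0},
    {"variety": "B", "yield": None},    # missing variate, excluded
]
for level, stats in tabulate(records, "variety", "yield").items():
    print(level, stats)
```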

    Linear Model Specification

    Generalized Linear Models

    ASReml includes facilities for fitting the family of Generalized Linear Models (GLMs; Nelder and Wedderburn, 1972; McCullagh and Nelder, 1994).

    Generalized Linear Mixed Models

    This section was written by Damian Collins. There is the capacity to fit a wider class of models which include additional random effects for non-normal error distributions. A GLM that includes random terms is usually referred to as a Generalized Linear Mixed Model (GLMM).

    New Model terms.

  • at(F,n) is extended so that at(F,i).X at(F,j).X at(F,k).X can be written as at(F,i,j,k).X. NB: the at(F,i,j,k) term must be the first component of the interaction. Any number of levels may be listed.
  • at(F,i) is extended so that at(F) generates at(F,i) for all levels of F. NB: since this command is interpreted before the data is read, it is necessary to declare the number of levels correctly in the field definition. This extended form may only be used as the first term in an interaction.
  • ge(F,n), gt(F,n), le(F,n) and lt(F,n) create binary covariates indicating whether the level code for factor F is, respectively, greater than or equal to, greater than, less than or equal to, or less than the argument n. They are similar to at(F,n), which indicates whether the level code equals n.
  • h(F) requests ASReml to fit the model term for factor F using Helmert constraints; these are the standard default constraints used by S-plus.
  • out(i), out(i,t) establishes a binary variable which is:
         out(i) 1 if data relates to observation i, (trait 1), else is 0
         out(i,t) 1 if data relates to observation i, (trait t), else is 0
  • qtl(M,s) calculates an expected marker state from flanking marker information at position s of the linkage group M (see !MMAP to define marker locations). s should be given in Morgans.
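
The binary covariates produced by at(), ge(), gt(), le() and lt(), and the expansion of the multi-level at() shorthand, can be sketched in Python. The helper names indicator and expand_at are hypothetical, and the sketch assumes the usual reading of ge and le as "greater than or equal to" and "less than or equal to".

```python
import operator

# Comparison applied to the level code for each model function.
COMPARE = {
    "at": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
    "le": operator.le,
    "lt": operator.lt,
}


def indicator(term, level_codes, n):
    """Return the 0/1 covariate that term(F,n) would represent."""
    return [1 if COMPARE[term](code, n) else 0 for code in level_codes]


def expand_at(factor, levels, covariate):
    """Spell out at(F,i,j,k).X as the separate at(F,i).X terms."""
    return " ".join(f"at({factor},{i}).{covariate}" for i in levels)


codes = [1, 2, 3, 4]
for term in ("at", "ge", "gt", "le", "lt"):
    print(f"{term}(F,3):", indicator(term, codes, 3))
print(expand_at("F", [1, 2, 4], "X"))
```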

    PREDICT

    General

    The order in which terms appear in the predict table is now controlled by the user: they appear in the order in which the user specifies them on the PREDICT directive.
  • The syntax is extended to allow specific levels to be specified by name as well as by position. For example
         PREDICT Sex 'male'
  • The !DEC n qualifier gives the user control of the number of decimal places reported in the predict table where n is 0...9. The default is 4. G15.9 format is used if n exceeds 9.
  • A second !PRESENT qualifier is allowed on a PREDICT statement (but not with !PRWTS). This is needed when there are two nested factors such as sites within regions and genotype within family. The two lists must not overlap.
  • Complicated weighting is possible with the !PRWTS qualifier.
  • !TWOSTAGEWEIGHTS is intended for use with variety trials which will subsequently be combined in a meta analysis. It forms the variance matrix for the predictions, inverts it and writes the predicted variety means with the corresponding diagonal elements of this matrix to the .pvs file. These values are used in some variety testing programs in Australia for a subsequent second stage analysis across many trials. A database is used to collect the results from the individual trials and write out the combined data set. The diagonal elements are used as weights in the combined analysis.
  • !PLOT graphic control qualifiers allow predicted values to be plotted. These were developed by Damian Collins.

    Variance structures

    Models

  • Autoregressive models have been extended to include AR3 and SAR2. SAR2 is a constrained form of AR3 which represents the situation of competition from neighbouring plots plus general spatial correlation.
  • CHOLnC is an alternative zeroed form of CHOLn defining a reduced Cholesky factorization. CHOLn models use the structure V=LDL' where L has 1.0 on the diagonal and zeros above the diagonal, and D is diagonal. CHOL1 restricts this by setting the lower triangle elements of L to zero except for the first off-diagonal band. CHOL1C instead sets the lower triangle elements of L to zero except for the first column of L. The CHOLnC form is somewhat similar to a Factor Analytic model.
  • The Matern class of isotropic covariance models is now available.

    Variance model qualifiers

    !=list has extended syntax as follows:
  • list specifies one character per parameter. 1-9 are different from a-z which are different from A-Z so that 61 equalities can be specified. 0 and . mean unconstrained.
  • A colon generates a sequence, viz. a:e is the same as abcde.
  • Putting % as the first character in list makes the interpretation of codes absolute (so that they apply across structures).
  • Putting * as the first character in list indicates that numbers are repeat counts, A-Z and a-z are equality codes and only . is unconstrained. Thus !=*.3A2. is equivalent to !=0AAA00 (or !=0aaa00).
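
The colon and * shorthands can be checked by expanding a list to one code per parameter. The Python sketch below follows the rules in the list above; it is not ASReml's parser. The function name is hypothetical, multi-digit repeat counts are assumed to be allowed, and the % prefix is not modelled.

```python
def expand_equality_list(spec):
    """Expand a !=list argument to one equality code per parameter."""
    if spec.startswith("*"):
        # Repeat-count mode: digits are counts, '.' is the unconstrained code.
        codes, count = [], ""
        for ch in spec[1:]:
            if ch.isdigit():
                count += ch
            else:
                code = "0" if ch == "." else ch
                codes.append(code * (int(count) if count else 1))
                count = ""
        return "".join(codes)

    # Plain mode: only the colon sequence needs expanding.
    codes, i = [], 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == ":":
            first, last = spec[i], spec[i + 2]
            codes.extend(chr(c) for c in range(ord(first), ord(last) + 1))
            i += 3
        else:
            codes.append(spec[i])
            i += 1
    return "".join(codes)


print(expand_equality_list("*.3A2."))
print(expand_equality_list("a:e"))
```

The first call reproduces the equivalence given above, and the second reproduces the a:e example.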

    Output

  • A hardcopy Line printer plot of Residuals vs Predicted values is now printed to the .res file with other residual statistics.
  • A variogram is now reported for non-regular spatial data residuals. The variogram is computed for residuals when positions are indexed by two variables, and for fac(x,y) factors which have a spatial variance structure fitted. It is written to file with name tag _V.

    Timing a job

    Overall timing of a job can be obtained by comparing the start and finish times reported in the .asr file. It is possible to break the timing down into particular tasks by using the command line options DL to produce a .asl file containing, among other diagnostic information, times for operational components. Use the Unix system command grep ">>>" to extract the timing information.
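
Where grep is not available, for example on a Windows installation, the same filtering can be done with a few lines of Python. This is a minimal sketch: the function name timing_lines is hypothetical, and it simply keeps every line of the .asl file that contains ">>>", as the grep command above does.

```python
def timing_lines(asl_path):
    """Return the lines of a .asl file that contain the '>>>' timing marker."""
    with open(asl_path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if ">>>" in line]
```

A call such as timing_lines("myjob.asl"), where myjob.asl stands for the file produced by your own job, returns the timing lines in the order they were written.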
