The example presents a ‘typical’ complete block design and then adds spatial components to it.
Examples for spatial analysis tree !P family 50 !I row 19 col 84 rep 5 iblock 10 plot 400 dbh radiata.ped !make !alpha radiata.csv !csv !dopart $A # Analysis as random incomplete blocks # includes additive (tree) and SCA (family) effects !part 1 dbh ~ mu rep !r rep.iblock plot tree family !f mv # Analysis as random incomplete blocks # but adding spatial row and col autoregressive # processes and independent error (units) !part 2 dbh ~ mu rep !r rep.iblock plot tree family units !f mv repeated ar1(row !INIT 0.9).ar1(col 0.9) # Spatial analysis dropping non-significant design # features (see text) !part 3 dbh ~ mu rep !r tree family units !f mv repeated ar1(row !INIT 0.9).ar1(col 0.9) # Analysis as random complete blocks # with spatial component. However, spatial is # represented in the G matrix rather than the # R matrix !part 4 dbh ~ mu !r ar1v(row !INIT 0.9 10).ar1(col !INIT 0.9) tree family repeated idv(units)
Part 1 of the code is simply an incomplete block design with random plot, additive (tree) and SCA (Family) effects. You can see explanations for this code in the Univariate Analysis section of the cookbook. Part 2 introduces a few additional elements:
- Separate autoregressive of order 1 (AR) errors for rows and columns. The first term,
ar1(row), states that the factor row is used to fit the AR residual and similarly for the second term. In both cases the values of the factors row or column are used to specify the ‘distance’ measure. The 0.9 is the starting value for each of the autoregressive coefficients.
unitsrefers to the uncorrelated (non spatial) residual or Nugget. Thus, our model includes spatially associated residuals (explained in the previous two points) and spatially independent residuals. Do not forget to include units, because if it is not in the model you will probably get inflated estimates of heritability and think that spatial analysis is the best thing after sliced bread (and a long warm shower and …). Well, it is not. It can help you with patchy situations (for example disease scores) or to discover assessor effects, but in well-designed experiments and standard traits (e.g. volume, basic density) will not make much of a difference.
- You will also need to fit missing values as fixed effects (
!f mv). Most likely you do not have a perfectly rectangular experiment, but you can create a rectangular matrix with missing values and then fitting the missing values in the model. This will speed up (or even make possible) the use of spatial analysis in ASReml. Remember: to use part 2 or part 3 (modelling spatial through the residual matrix R) you must have a rectangular distribution of trees.
Most of the time, when you fit spatial and independent residuals (Part 2), the experimental design features will become non significant and you will drop them from the model (the exception, if any, will probably be the plot effect). Part 3 shows the likely form of the model without incomplete block and plot effects. Part 4 is a bit more intriguing, although it is equivalent to Part 3. There, rather than modelling the spatial residuals in the R residual matrix, we are modelling them as additional random effects, that is in the G matrix. Thus,
- The additional covariance structure for the random term
row.colis the Kronecker product of TWO matrices, which have the same structure as the autoregressive residual matrices of Parts 2 and 3. The difference is that we use
ar1vto request a variance associated with this model term.
- In this case, the random term
unitsis not included as this is now described by the residual variance.
- Modelling spatial correlation through G does not require a rectangular distribution of trees. Part 4 is faster, but it may not run for large arrays.
You can revisit a longer explanation of Covariance structures.
Spatial analysis for irregularly spaced trees
The above code works well with equally spaced individuals; however, there are examples when the distances between trees are variable (for example when using GPS measurements of position). In this case, the code can be changed to something like (Thanks to Arthur Gilmour):
... # Scale the coordinates so a typical distance # is around 1. y !/10 x !/10 filename.dat !dopart $A !part 1 dbh ~ mu # With Nugget variance fitted as residual !part 2 dbh ~ mu !r iexpv(fac(x,y) !INIT 0.2 1) # with Nugget variance in G structure, # spatial in residual !part 3 dbh ~ mu x y !r units 1 residual ieuc(fac(x,y) !INIT 0.2)
In these jobs we are not including any genetic components, just to highlight the spatial syntax. Part 1 obviously fits only the overall mean. Part 2 creates a factor based on the combinations of two continuous variables, and allows for the spatial association in the G structure (in the
fac(x,y) model term). Part 3 fits the distances as covariates and allow for the spatial association at the R structure level. In the code
ieuc uses variables x and y as the distance coordinates. Parts 2 and 3 are memory hogs, so try to stay clear of them if you can. An option would be to create an equally spaced grid and then assign each tree to the closest point in that grid. This would convert the problem to something that can be treated with the code used at the beginning of this page.
Copyright (1997–2021) by Luis Apiolaza, some rights are reserved.