Unreplicated early generation variety trial - Wheat

Introduction

To further illustrate the approaches presented in the previous section, we consider an unreplicated field experiment conducted at Tullibigeal in south-western NSW. The trial was an S1 (early stage) wheat variety evaluation trial and consisted of 525 test lines which were randomly assigned to plots in a 67 by 10 array. There was a check plot every 6 plots within each column; that is, the check variety was sown on rows 1,7,13,...,67 of each column. This variety was numbered 526. A further 6 replicated commercially available varieties (numbered 527 to 532) were also randomly assigned to plots, with between 3 and 5 plots of each.

The aim of these trials is to identify and retain the top, say, 20% of lines for further testing. Cullis et al. (1989) considered the analysis of early generation variety trials, and presented a one-dimensional spatial analysis which was an extension of the approach developed by Gleeson and Cullis (1987). In that analysis the test line effects are assumed random, while the check variety effects are considered fixed. This may not be sensible or justifiable for most trials and can lead to inconsistent comparisons between check varieties and test lines. Moreover, given the large amount of replication afforded to check varieties, there will be very little shrinkage of their effects if they are fitted as random, irrespective of the realised heritability.
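As a quick check of the layout described above, the following Python sketch counts the plots implied by the stated design. It uses only the figures given in the text (67 rows, 10 columns, a check plot every 6th row starting at row 1, 525 test lines); the variable names are illustrative and not part of the ASReml job.

```python
# Plot counts implied by the trial description (illustrative names).
N_ROWS, N_COLS = 67, 10
N_TEST_LINES = 525

# Check variety 526 is sown on rows 1, 7, 13, ..., 67 of every column.
check_rows = list(range(1, N_ROWS + 1, 6))
n_check_plots = len(check_rows) * N_COLS

total_plots = N_ROWS * N_COLS
# Plots left over for the six replicated commercial varieties (527 to 532).
n_commercial_plots = total_plots - N_TEST_LINES - n_check_plots

print("check rows:", check_rows)
print("total plots:", total_plots)
print("check plots:", n_check_plots)
print("commercial variety plots:", n_commercial_plots)
```

The 25 plots left over are consistent with six commercial varieties on 3 to 5 plots each.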

We begin with an analysis that models spatial correlation in one direction and fits all variety effects (check, replicated and unreplicated lines) as random. We present three further spatial models for comparison. The ASReml input file is

 Tullibigeal trial
   linenum
   yield
   weed
   column 10
   row 67
   variety 532   # test lines 1:525, check 526, replicated commercial 527:532
 wheat.asd !skip 1 !DOPATH 1
 !PATH 1                       # AR1 x I
 y ~ mu  weed mv !r variety
 residual ar1(row).column

 !PATH 2                       # AR1 x AR1
 y ~ mu  weed mv !r variety
 residual ar1(row).ar1(column)

 !PATH 3                       # AR1 x AR1 + column trend
 y ~ mu weed pol(column,-1) mv !r variety
 residual ar1(row).ar1(column)

 !PATH 4                       # AR1 x AR1 + Nugget + column trend
 y ~ mu weed pol(column,-1) mv !r variety units
 residual ar1(row).ar1(column)
 predict var

The data fields represent the factors variety, row and column, a covariate weed and the plot yield (yield). There are four paths in the ASReml file. We begin with the one-dimensional spatial model, which assumes the variance model for the plot effects within columns is described by a first order autoregressive process. The abbreviated output file is

    1 LogL=-4280.75     S2= 0.12850E+06    666 df   0.1000      1.000     0.1000
    2 LogL=-4268.57     S2= 0.12138E+06    666 df   0.1516      1.000     0.1798
    3 LogL=-4255.89     S2= 0.10968E+06    666 df   0.2977      1.000     0.2980
    4 LogL=-4243.76     S2=  88033.        666 df   0.7398      1.000     0.4939
    5 LogL=-4240.59     S2=  84420.        666 df   0.9125      1.000     0.6016
    6 LogL=-4240.01     S2=  85617.        666 df   0.9344      1.000     0.6428
    7 LogL=-4239.91     S2=  86032.        666 df   0.9474      1.000     0.6596
    8 LogL=-4239.88     S2=  86189.        666 df   0.9540      1.000     0.6668
    9 LogL=-4239.88     S2=  86253.        666 df   0.9571      1.000     0.6700
   10 LogL=-4239.88     S2=  86280.        666 df   0.9585      1.000     0.6714
 Final parameter values                        0.9592     1.0000     0.6721

          - - - Results from analysis of yield - - -
 Akaike Information Criterion     8485.76 (assuming 3 parameters).
 Bayesian Information Criterion   8499.26

 Model_Term                             Gamma         Sigma   Sigma/SE   % C
 variety                 IDV_V  532  0.959184       82758.6       8.98   0 P
 ar1(row).column                670 effects
 Residual                SCA_V  670  1.000000       86280.2       9.12   0 P
 row                      AR_R    1  0.672052      0.672052      16.04   1 P

                                   Wald F statistics
     Source of Variation           NumDF     DenDF    F-inc            P-inc
   7 mu                                1      83.6  9799.20            <.001
   3 weed                              1     477.0   109.33            <.001
 Notice: The DenDF values are calculated ignoring fixed/boundary/singular
             variance parameters using algebraic derivatives.

                     Solution       Standard Error    T-value     T-prev
   3 weed
                    1   -217.481        20.7995        -10.46
   7 mu
                    1    2893.05        29.9404         96.63
   8 mv_estimates                          2 effects fitted
   6 variety                             532 effects fitted
 Residual [section 11, column 10 (of 10), row 13 (of 67)] is -4.26 SD
 Finished: 24 Jan 2014 15:06:51.854    Warning: LogL not converged
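The earlier remark that heavily replicated check varieties are shrunk very little can be illustrated with the variance components in the output above. The sketch below is a deliberate simplification: it treats plot errors as independent, ignores the weed covariate, and uses the textbook shrinkage factor n*vg / (n*vg + ve) for a variety observed on n plots. It is therefore indicative only and is not what ASReml computes under the autoregressive model. The 120 check plots follow from 12 check rows in each of 10 columns; the 4 plots for a commercial variety is an illustrative value within the stated range of 3 to 5.

```python
def shrinkage(n_plots, var_genetic, var_error):
    """Shrinkage factor for a random variety effect observed on n_plots plots,
    assuming independent plot errors (a simplification of the spatial model)."""
    return n_plots * var_genetic / (n_plots * var_genetic + var_error)

VAR_VARIETY = 82758.6   # 'variety' Sigma from the AR1 x I output
VAR_RESIDUAL = 86280.2  # 'Residual' Sigma from the same output

# Plot counts: 1 per test line, 4 as an illustrative commercial variety, 120 for the check.
for label, n in [("test line", 1), ("commercial variety", 4), ("check 526", 120)]:
    factor = shrinkage(n, VAR_VARIETY, VAR_RESIDUAL)
    print(f"{label:20s} n = {n:3d}  shrinkage factor = {factor:.3f}")
```

Under these assumptions a single-plot test line effect is shrunk to roughly half its raw value, while the check variety is left almost unshrunk.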

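The LogL is stable at -4239.88 from iteration 8, although the run ends with a 'LogL not converged' warning after ten iterations, so the final parameter values should be read as approximate. The REML estimate of the autoregressive parameter (0.67) indicates substantial within-column heterogeneity. The abbreviated output from the two-dimensional AR1 cross AR1 spatial model is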

    1 LogL=-4277.99     S2= 0.12850E+06    666 df
    2 LogL=-4266.13     S2= 0.12097E+06    666 df
    3 LogL=-4253.05     S2= 0.10777E+06    666 df
    4 LogL=-4238.72     S2=  83156.        666 df
    5 LogL=-4234.53     S2=  79868.        666 df
    6 LogL=-4233.78     S2=  82024.        666 df
    7 LogL=-4233.67     S2=  82725.        666 df
    8 LogL=-4233.65     S2=  82975.        666 df
    9 LogL=-4233.65     S2=  83065.        666 df
   10 LogL=-4233.65     S2=  83100.        666 df


          - - - Results from analysis of yield - - -
 Akaike Information Criterion     8475.29 (assuming 4 parameters).
 Bayesian Information Criterion   8493.30

 Model_Term                             Gamma         Sigma   Sigma/SE   % C
 variety                 IDV_V  532   1.06038       88117.5       9.92   0 P
 ar1(row).ar1(column)           670 effects
 Residual                SCA_V  670  1.000000       83100.1       8.90   0 P
 row                      AR_R    1  0.685387      0.685387      16.65   0 P
 column                   AR_R    1  0.285909      0.285909       3.87   0 P

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      41.7  6248.65            <.001
    3 weed                              1     491.2    85.84            <.001
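Under a separable AR1 cross AR1 model, the correlation between the residuals of two plots depends only on their row and column separations, and is the product of the two one-dimensional autoregressive correlations. A minimal sketch using the estimates from the output above (the function name is illustrative):

```python
RHO_ROW = 0.685387     # 'row' AR_R estimate from the AR1 x AR1 output
RHO_COLUMN = 0.285909  # 'column' AR_R estimate from the same output

def ar1xar1_corr(row_lag, col_lag, rho_row=RHO_ROW, rho_col=RHO_COLUMN):
    """Correlation between two plots row_lag rows and col_lag columns apart
    under a separable AR1 x AR1 process."""
    return rho_row ** abs(row_lag) * rho_col ** abs(col_lag)

# A few separations, including lag 6 in rows (the spacing between check plots).
for row_lag, col_lag in [(1, 0), (0, 1), (1, 1), (3, 0), (6, 0)]:
    corr = ar1xar1_corr(row_lag, col_lag)
    print(f"row lag {row_lag}, column lag {col_lag}: correlation = {corr:.3f}")
```

The correlation decays much faster across columns than along rows, and by six rows apart (the spacing between successive check plots) it has fallen to about 0.10.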

The change in REML LogL is significant (χ²₁ = 12.46, p<.001) with the inclusion of the autoregressive parameter for columns. Figure 1 presents the sample variogram of the residuals for the AR1 cross AR1 model. There is an indication that a linear drift from column 1 to column 10 is present. We include a linear regression coefficient pol(column,-1) in the model to account for this. Note we use the '-1' option in the pol term to exclude the overall constant in the regression, as it is already fitted. The linear regression of yield on column number is significant (t=-2.96). The sample variogram (Figure 2) is more satisfactory, though interpretation of variograms is often difficult, particularly for unreplicated trials. This is an issue for further research.
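The likelihood ratio statistic quoted above can be reproduced from the final LogL values of the two fits, which share the same fixed effects. The sketch below is a hand calculation, not ASReml output; the function name is illustrative, and the p-value uses the identity that the upper tail area of a chi-square on 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math

def reml_lrt_1df(logl_reduced, logl_full):
    """REML likelihood ratio statistic and chi-square(1) p-value for two
    nested variance models fitted with the same fixed effects."""
    stat = 2.0 * (logl_full - logl_reduced)
    # For 1 df the chi-square upper tail area is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# AR1 x I (reduced) against AR1 x AR1 (full), final LogL values from the listings.
stat, p = reml_lrt_1df(logl_reduced=-4239.88, logl_full=-4233.65)
print(f"LRT = {stat:.2f}, p = {p:.5f}")
```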

Figure 1. Sample variogram of the residuals from the AR1 cross AR1 model for the Tullibigeal data

Figure 2. Sample variogram of the residuals from the AR1 cross AR1 + pol(column,-1) model for the Tullibigeal data

The abbreviated output for this model and the final model in which a nugget effect has been included is

 #AR1xAR1 + pol(column,-1)
    1 LogL=-4270.99     S2= 0.12730E+06    665 df
    2 LogL=-4258.95     S2= 0.11961E+06    665 df
    3 LogL=-4245.27     S2= 0.10545E+06    665 df
    4 LogL=-4229.50     S2=  78387.        665 df
    5 LogL=-4226.02     S2=  75375.        665 df
    6 LogL=-4225.64     S2=  77373.        665 df
    7 LogL=-4225.60     S2=  77710.        665 df
    8 LogL=-4225.60     S2=  77786.        665 df
    9 LogL=-4225.60     S2=  77806.        665 df

  Source                Model  terms     Gamma     Component    Comp/SE   % C
  variety                 532    532   1.14370       88986.3       9.91   0 P
  Variance                670    665   1.00000       77806.0       8.79   0 P
  Residual            AR=AutoR    67  0.671436      0.671436      15.66   0 U
  Residual            AR=AutoR    10  0.266088      0.266088       3.53   0 U

                                    Wald F statistics
      Source of Variation           NumDF     DenDF    F-inc             Prob
    7 mu                                1      42.5  7073.70            <.001
    3 weed                              1     457.4    91.91            <.001
    8 pol(column,-1)                    1      50.8     8.73            0.005

 #
 #AR1xAR1 + units + pol(column,-1)
 #
   1 LogL=-4272.85     S2= 0.11684E+06    665 df
   2 LogL=-4265.70     S2=  83872.        665 df    :   1 components restrained
   3 LogL=-4240.99     S2=  80942.        665 df
   4 LogL=-4227.44     S2=  53712.        665 df
   5 LogL=-4221.09     S2=  52201.        665 df
   6 LogL=-4220.94     S2=  54803.        665 df
   7 LogL=-4220.94     S2=  54935.        665 df
   8 LogL=-4220.94     S2=  54934.        665 df

          - - - Results from analysis of yield - - -
 Akaike Information Criterion     8451.88 (assuming 5 parameters).
 Bayesian Information Criterion   8474.37

 Model_Term                             Gamma         Sigma   Sigma/SE   % C
 variety                 IDV_V  532   1.32827       72967.0       6.99   0 P
 units                   IDV_V  670  0.562308       30889.9       3.78   0 P
 ar1(row).ar1(column)           670 effects
 Residual                SCA_V  670  1.000000       54934.0       5.15   0 P
 row                      AR_R    1  0.835396      0.835396      18.38   0 P
 column                   AR_R    1  0.375499      0.375499       3.25   0 P

                                   Wald F statistics
     Source of Variation           NumDF     DenDF    F-inc            P-inc
   7 mu                                1      13.6  4272.13            <.001
   3 weed                              1     470.3    86.31            <.001
   8 pol(column,-1)                    1      27.4     3.69            0.065
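The information criteria printed in the listings can be reproduced from the final LogL values. The sketch assumes AIC = -2 LogL + 2k and BIC = -2 LogL + k log(df), with k the number of variance parameters quoted in each listing and df the residual degrees of freedom shown in the iteration log. That the BIC penalty uses the residual df is inferred from the reported numbers, not taken from ASReml documentation.

```python
import math

def info_criteria(logl, n_params, resid_df):
    """AIC and BIC from a REML log-likelihood (assumed definitions, see text)."""
    aic = -2.0 * logl + 2.0 * n_params
    bic = -2.0 * logl + n_params * math.log(resid_df)
    return aic, bic

# name: (final LogL, variance parameters, residual df, reported AIC, reported BIC)
fits = {
    "AR1 x I": (-4239.88, 3, 666, 8485.76, 8499.26),
    "AR1 x AR1": (-4233.65, 4, 666, 8475.29, 8493.30),
    "AR1 x AR1 + units + pol(column,-1)": (-4220.94, 5, 665, 8451.88, 8474.37),
}

for name, (logl, k, df, aic_rep, bic_rep) in fits.items():
    aic, bic = info_criteria(logl, k, df)
    print(f"{name}: AIC {aic:.2f} (reported {aic_rep}), BIC {bic:.2f} (reported {bic_rep})")
```

Agreement is to within the rounding of the printed LogL. Because a REML likelihood depends on the fixed-effects model, the criteria for the last fit, which includes pol(column,-1), are not directly comparable with those of the first two.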

The change in LogL from adding units (from -4225.60 to -4220.94) is not large but is significant. However, adding units reduces the significance of the linear column trend (its F-inc probability rises from 0.005 to 0.065), as that trend is now picked up better by the ar1(column) term.
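The significance of units can be checked with a REML likelihood ratio test between the last two fits, which share the same fixed effects. Because a variance component is being tested against its boundary value of zero, a common convention (assumed here, not stated in the source) is to refer the statistic to a 50:50 mixture of chi-square distributions on 0 and 1 degrees of freedom, which halves the usual p-value.

```python
import math

LOGL_WITHOUT_UNITS = -4225.60  # AR1 x AR1 + pol(column,-1)
LOGL_WITH_UNITS = -4220.94     # AR1 x AR1 + units + pol(column,-1)

stat = 2.0 * (LOGL_WITH_UNITS - LOGL_WITHOUT_UNITS)
p_chisq1 = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail area
p_boundary = 0.5 * p_chisq1                  # 50:50 mixture for a boundary test

print(f"LRT = {stat:.2f}")
print(f"chi-square(1) p = {p_chisq1:.4f}")
print(f"boundary-adjusted p = {p_boundary:.4f}")
```

Either way the p-value is below 0.01, in line with the statement that the change is modest but significant.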

The abbreviated output from the predict statement for this final model is

  Warning: mv_estimates         is ignored for prediction
  Warning: units                is ignored for prediction

  ---- ---- ---- ---- ---- ---- ----   1 ---- ---- ---- ---- ---- ---- ---- ----
  column               evaluated at       5.5000
  weed                 is evaluated at average value of       0.4597
  Predicted values of yield

  variety             Predicted_Value Standard_Error Ecode
        1.0000              2917.1782       179.2881 E
        2.0000              2957.7405       178.7688 E
        3.0000              2872.7615       176.9880 E
        4.0000              2986.4725       178.7424 E
          .                     .               .
      522.0000              2784.7683       179.1541 E
      523.0000              2904.9421       179.5383 E
      524.0000              2740.0330       178.8465 E
      525.0000              2669.9565       179.2444 E
      526.0000              2385.9806        44.2159 E
      527.0000              2697.0670       133.4406 E
      528.0000              2727.0324       112.2650 E
      529.0000              2699.8243       103.9062 E
      530.0000              3010.3907       112.3080 E
      531.0000              3020.0720       112.2553 E
      532.0000              3067.4479       112.6645 E
  SED: Overall Standard Error of Difference   245.8

Note that the (replicated) check lines have lower standard errors than the (unreplicated) test lines. There will also be large differences in SEDs. Rather than obtaining the large table of all SEDs, you could do the prediction in parts

 predict var 1:525 column 5.5
 predict var 526:532 column 5.5 !SED

to examine the matrix of pairwise prediction errors of variety differences for varieties 526 to 532.
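To see why SEDs differ so much between kinds of comparison, the sketch below combines pairs of standard errors from the prediction table under the simplifying assumption that the prediction errors are uncorrelated, so that SED = sqrt(SE1^2 + SE2^2). This is an approximation for illustration only; the SEDs that ASReml reports with !SED account for the covariances between predictions and will generally differ from these values.

```python
import math

# Standard errors copied from the prediction table above (variety number: SE).
SE = {1: 179.2881, 2: 178.7688, 526: 44.2159, 527: 133.4406, 530: 112.3080}

def approx_sed(variety_a, variety_b, se=SE):
    """Approximate SED assuming zero covariance between the two prediction errors."""
    return math.hypot(se[variety_a], se[variety_b])

# test vs test, test vs check, commercial vs commercial, check vs commercial
for a, b in [(1, 2), (1, 526), (527, 530), (526, 530)]:
    print(f"variety {a} vs {b}: approximate SED = {approx_sed(a, b):.1f}")
```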
