\name{wheatLines}
\alias{wheatLines}
\title{Wheat lines dataset}
\description{

Information from a collection of 599 historical CIMMYT wheat lines.  The wheat data set is from 
CIMMYT's Global Wheat Program. Historically, this program has conducted numerous international 
trials across a wide variety of wheat-producing environments. The environments represented in 
these trials were grouped into four target sets of environments (mega-environments), corresponding to four 
main agroclimatic regions previously defined and widely used by CIMMYT's Global Wheat Breeding Program. 
The phenotypic trait considered here was the average grain yield (GY) of the 599 wheat lines evaluated 
in each of these four mega-environments. 

A pedigree tracing back many generations was available, and the Browse application of 
the International Crop Information System (ICIS) (McLaren \emph{et al.} 2000, 2005) was used 
to derive the relationship matrix A among the 599 lines; this matrix accounts for selection and inbreeding.

Wheat lines were genotyped using 1447 Diversity Array Technology (DArT) markers generated by 
Triticarte Pty. Ltd. (Canberra, Australia; \url{http://www.triticarte.com.au}). Each DArT marker 
takes one of two values, denoting its presence or absence. Markers with a minor allele frequency 
lower than 0.05 were removed, and missing genotypes were imputed with samples from the marginal 
distribution of marker genotypes, that is, \eqn{x_{ij} \sim Bernoulli(\hat p_j)}, where \eqn{\hat p_j} 
is the estimated allele frequency computed from the non-missing genotypes. The number of DArT 
markers remaining after editing was 1279.

}

\usage{
  data(wheatLines)
}

\format{
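 The phenotype matrix (\code{wheatPheno}, called Y in the examples) contains the average grain yield: 
 column 1 is the grain yield for environment 1, and so on. The marker matrix (\code{wheatGeno}, 
 called X in the examples) contains the DArT genotypes.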
}

\source{
  International Maize and Wheat Improvement Center (CIMMYT), Mexico.
}

\references{
McLaren, C. G., L. Ramos, C. Lopez, and W. Eusebio. 2000. ``Applications of the genealogy management system.'' 
In \emph{International Crop Information System. Technical Development Manual, version VI}, edited by McLaren, C. G., J.W. White 
and P.N. Fox. pp. 5.8-5.13. CIMMYT, Mexico: CIMMYT and IRRI. 

McLaren, C. G., R. Bruskiewich, A.M. Portugal, and A.B. Cosico. 2005. The International Rice Information System. 
A platform for meta-analysis of rice crop data. \emph{Plant Physiology} \bold{139}: 637-642.
}
\examples{
####=========================================####
#### Due to CRAN time limitations, most lines in the 
#### examples are commented out with a single '#'; 
#### remove it to run the examples
####=========================================####
data(wheatLines)
X <- wheatLines$wheatGeno; X[1:5,1:5]; dim(X)
Y <- wheatLines$wheatPheno
rownames(X) <- rownames(Y)
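
####=========================================####
#### Illustrative sketch only (not the original CIMMYT
#### pipeline): the marker editing outlined in the
#### description, applied to a small simulated 0/1 matrix.
#### 'toy', 'p.hat' and 'maf' are made-up names and the
#### simulated data are an assumption for illustration
####=========================================####
set.seed(1)
toy <- matrix(rbinom(200, 1, 0.3), nrow=20, ncol=10) # 20 lines, 10 markers
toy[sample(length(toy), 15)] <- NA # introduce some missing calls
p.hat <- colMeans(toy, na.rm=TRUE) # allele frequency from non-missing calls
maf <- pmin(p.hat, 1 - p.hat) # minor allele frequency
toy <- toy[, maf >= 0.05, drop=FALSE]; p.hat <- p.hat[maf >= 0.05]
for(j in seq_len(ncol(toy))){
  miss <- is.na(toy[,j])
  toy[miss,j] <- rbinom(sum(miss), 1, p.hat[j]) # draw from Bernoulli(p.hat[j])
}
anyNA(toy) # no missing genotypes should remain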

####=========================================####
#### select environment 1
####=========================================####
#y <- Y[,1] # response grain yield
#Z1 <- diag(length(y)) # incidence matrix
#K <- A.mat(X) # additive relationship matrix

####=========================================####
#### GBLUP using the marker-derived relationship matrix K
####=========================================####
#ETA <- list( list(Z=Z1, K=K))
#ans <- mmer(y=y, Z=ETA, method="EMMA") # kinship based
#summary(ans)

####=========================================####
#### equivalent marker-based approach (random marker effects)
####=========================================####
#ETA2 <- list( list(Z=X))
#ans2 <- mmer(y=y, Z=ETA2, method="EMMA") # marker based
#summary(ans2)

####=========================================####
#### compare and check that both approaches give the same result
####=========================================####
#plot(ans$u.hat, (X\%*\%ans2$u.hat), xaxt="n", yaxt="n", 
#     ylab="Marker-based GBLUP", xlab="Kinship-based GBLUP")

####=========================================####
#### PREDICT PROGENY 
####=========================================####
#GEBV.pb <- ans$u.hat # these are the breeding values (BV)
#rownames(GEBV.pb) <- rownames(Y)

####=========================================####
#### all possible crosses = 179,101
####=========================================####
#crosses <- do.call(expand.grid, list(rownames(Y),rownames(Y))); dim(crosses)
#cross2 <- duplicated(t(apply(crosses, 1, sort)))
#crosses2 <- crosses[cross2,]; head(crosses2); dim(crosses2)
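
####=========================================####
#### Illustrative sketch (an assumption, not part of the
#### original workflow): the count of unique crosses
#### without selfs can be checked with choose(), and the
#### pairs themselves listed with combn(); 'ids' and
#### 'toy.crosses' are made-up names on a toy vector
####=========================================####
choose(599, 2) # number of unordered pairs among 599 lines
ids <- c("L1","L2","L3","L4")
toy.crosses <- t(combn(ids, 2)); toy.crosses; dim(toy.crosses)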

####=========================================####
#### match the possible crosses with the parental BV
####=========================================####
#GCA1 = GEBV.pb[match(crosses2[,1], rownames(GEBV.pb))] # BV of the first parent of each cross
#GCA2 = GEBV.pb[match(crosses2[,2], rownames(GEBV.pb))] # BV of the second parent of each cross

####=========================================####
#### join everything
####=========================================####
#BV <- data.frame(crosses2,GCA1,GCA2); head(BV)
#BV$BVcross <- rowMeans(BV[,3:4]); head(BV) # mid-parent breeding value
#plot(BV$BVcross)

}
\keyword{datasets}
