
"To do" list for R/qtl
----------------------------------------------------------------------
This file is intended to contain a list of many of the additions and
revisions that are planned for the R/qtl package.  

If you have any additions or revisions to suggest, please send an email to
Karl Broman, <kbroman@jhsph.edu>.
----------------------------------------------------------------------
 
SHORT TERM:

o Include Bjarke's code on eHK for scanone and scantwo.

o read.cross for "qtx" sometimes doesn't seem to interpret the
  genotype pattern correctly; it read in a backcross as if it
  were an F2. 

o plot.pxg for X chr and autosome: results can be messed
  up, depending on the order of the markers.  

o Fix the help file for fitqtl() to emphasize that interactions among
  covariates are not allowed, but must be set up in advance.
 
o Add an explanation regarding the coding in the coefficient 
  estimates in fitqtl().  Add text to the help file for
  summary.fitqtl(). 

o write tools for converting the output from scanqtl() to the format
  for scanone() or scantwo(), according to whether it's a 1-d or
  2-d search, or print a warning otherwise.

o Documentation for scanone() and scantwo() regarding the X chromosome.

o scanone() with model="2part" gives NAs as LOD scores if there
  is complete penetrance (one of the p's goes to 1).  This doesn't
  happen with model="binary".

  The problem is that if everyone with a certain genotype survives,
  then you have great segregation distortion if you look at only
  the dead individuals, and so you can't estimate both means.

o In scanone (and possibly in scantwo), when a factor is used as
  a covariate, a statement about "must be a matrix" is given, but 
  it should say that the covariate needs to be numeric.

o Finish off the work to get coefficient estimates by imputation in
  fitqtl for the X chromosome in BC and F2. 

o geno.table(): the X chromosome needs special treatment.

o P-values from geno.table() when an intercross has some dominant
  markers.  

o summary.scantwo doesn't work if there are just two positions on a
  chromosome (or maybe it's for one position).

o Changed default for plot.scantwo to lower="joint"; check that the
  tutorial reflects this.  

o Fix plot.scantwo for the case that incl.markers=TRUE, so that positions
  are not equally spaced, but are according to the genetic map.

o Add an example regarding the X chromosome to the help files for
  fake.f2 and scanone.

o Effectscan for X chromosome.

o An NA in the mapmaker data file caused an error in read.cross;
  the line became too long.  Maybe this is true whenever an item
  doesn't match what is expected.

o Add sample data to the web site.

o Speed up read.cross.mm; deliver meaningful errors if map/genotypes
  don't match, and if too many genotypes in a row.

o Documentation for makeqtl/fitqtl/scanqtl (especially summary.fitqtl).

o Deal more cleanly with missing values in the output of scantwo.
  Make the warning message clearer, and perhaps don't automatically 
  set them to 0.

o Add attributes regarding degrees of freedom and null log10lik
  to the output of scanone and scantwo.

o Revise the a.starting.point a bit.

o Revise c.cross so that you can combine crosses even if there are
  different numbers of chromosomes

o Turn the tutorial into a Sweave vignette, to conform to the
  Bioconductor requirements?

o Allow no p-values in summary.scantwo

o LOD thresholds for X chromosome in scanone/two.

o Ensure consistency in use of chromosome numbers vs names/IDs when
  plotting results of genome scans, subsetting crosses, and so forth
  (sometimes #'s taken as indices and sometimes as names).

o Documentation on RI lines.


MEDIUM TERM:

o MIM for a set of QTLs at specified locations with specified
  interactions (as a new method in fitqtl and scanqtl).

o Incorporate the DIRECT algorithm stuff from Hao

o refineqtl -- like fitqtl, but refines the locations of the QTLs as
  in Zhao-Bang's MIM

o max.scanqtl, summary.scanqtl, print.summary.scanqtl, plot.scanqtl

o In MIM: allow return of SEs of effects.  Write coef.mim, resid.mim,
  and dev.mim to pull out the est'd coefficients, the residuals, and
  the "deviance" (2 * ln likelihood).

o In MIM, refinement of QTL location and plots of that.

o scanone with additive alleles at QTL

o Pull out results for an interval.

o Function to calculate variance due to QTL

o Modify the map expansion for RI lines for the X chromosome.
  [really need to add additional functions for mapping, as
   the marginal genotype distribution is 2:1 rather than 1:1]
  --the transition matrix is not symmetric 
  Pr(BB|AA) = 2r/(1+4r) and Pr(AA|BB) = 4r/(1+4r)

o Allow tailored allele labels (e.g. CC/CB/BB rather than AA/AB/BB)

o Allow use of individual IDs; a column "id" or "ID" in the phenotype
  data.  We could refer to these in functions like top.errorlod.

o Ability to get at the individual contributions to the LOD score? 

o Incorporate the code from Brian Yandell, Fei Zou and Amy Jin on
  semi-parametric QTL mapping methods.

o Have effectscan give output (silently)

o Allow plot.missing to give results color coded by marker genotype
  (like Saunak's cool plots).

o effectscan and effectplot: SEs and so forth using imputations

o effectscan: if one chromosome, plot map positions on the x-axis 
  rather than the chromosome ID.

o Add appropriate functions to analyze advanced intercrosses (AILs). 

o Analysis of binary traits by imputation

o Include widgets for getting easier access to the data.

o Calculate pairwise QTL probabilities by the simpler method,
  assuming independence, in scantwo.

o Permutation tests with scantwo should include the results comparing
  2 vs 1 QTL.

o Modify plot.rf and plot.errorlod to allow plot of a color scale, as
  in plot.scantwo.  

o Add a FAQ document to the R/qtl web page
    - Reading in data
    - Haley-Knott vs non-parametric vs EM
    - Do you plan to incorporate _____?
    - Extremely large LOD scores by EM
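
Note on the RI-line X chromosome item above: the two conditional
probabilities listed there define an asymmetric two-state transition
matrix between adjacent markers.  The sketch below is only an arithmetic
check, not R/qtl code; the function names are hypothetical, and it
assumes AA is the more frequent genotype (2/3 vs 1/3), which is what the
two formulas imply.  It builds the matrix for a recombination fraction r
and confirms that the 2:1 marginal distribution is preserved from one
marker to the next.

```python
from fractions import Fraction


def ri_x_transition(r):
    """Transition matrix between adjacent X-linked markers in RI lines.

    Rows index the genotype at the first marker and columns the genotype
    at the second, in the order (AA, BB).  The off-diagonal entries are
    the formulas from the item above:
        Pr(BB|AA) = 2r/(1+4r),  Pr(AA|BB) = 4r/(1+4r).
    """
    r = Fraction(r)
    p_bb_given_aa = 2 * r / (1 + 4 * r)
    p_aa_given_bb = 4 * r / (1 + 4 * r)
    return [
        [1 - p_bb_given_aa, p_bb_given_aa],
        [p_aa_given_bb, 1 - p_aa_given_bb],
    ]


def marginal_is_preserved(r, marginal=(Fraction(2, 3), Fraction(1, 3))):
    """Check that the (AA, BB) marginal is unchanged by one transition.

    The default 2:1 marginal (AA twice as frequent as BB) is an
    assumption inferred from the two formulas, not stated in the item.
    """
    t = ri_x_transition(r)
    after = (
        marginal[0] * t[0][0] + marginal[1] * t[1][0],
        marginal[0] * t[0][1] + marginal[1] * t[1][1],
    )
    return after == tuple(marginal)


def recombinant_probability(r, marginal=(Fraction(2, 3), Fraction(1, 3))):
    """Probability that the two markers differ, under the same formulas."""
    t = ri_x_transition(r)
    return marginal[0] * t[0][1] + marginal[1] * t[1][0]


if __name__ == "__main__":
    r = Fraction(1, 10)
    print(ri_x_transition(r))
    print(marginal_is_preserved(r))
    print(recombinant_probability(r))
```

Exact fractions are used rather than floating point so that the
equality check on the marginal distribution is not affected by rounding.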


LONG TERM:

o Allow phenotypes on multiple individuals (esp for recombinant inbred
  lines). 

o "Embarrassingly parallel" processing for permutation tests (Rmpi, snow)

o Composite interval mapping, in an automated way.

o Imprinting/parent-of-origin effects.

o Treating a covariate as a random effect.

o Multiple phenotypes (esp. regarding pleiotropy).

o Model search for MIM etc...forward and stepwise selection.

o Function to plot, for a specified q1, LOD{q2|q1} vs q2 (using the
  output from scantwo).

o Take the fit of the null model outside of the C code for
  the imputation method in scanone and scantwo, so that it
  only has to be done once (rather than for each chr or chr pair).

o Starting values for EM for the two-part model (and more generally).
  Allow the option of an automatic selection of multiple starting
  points. 

o Generalized linear models in scanone and scantwo.

o Analysis functions such as scanone and scantwo might assign an
  attribute to their output which identifies the input data and/or
  function call.

o Re-write the C code for EM underneath scanone and scantwo so that it
  is not so tedious.

----------------------------------------------------------------------
end of TODO.txt
