GenABEL: an R package for Genome Wide Association Analysis...

GenABEL-package {GenABEL}    R Documentation

Description

GenABEL: an R package for Genome Wide Association Analysis

Details

Genome-wide association (GWA) analysis is a tool of choice for identifying genes underlying complex traits. Effective storage, handling and analysis of GWA data are a challenge for modern computational genetics. GWA studies generate large amounts of data: hundreds of thousands of single nucleotide polymorphisms (SNPs) are genotyped in hundreds or thousands of patients and controls. Data on each SNP undergo several types of analysis: characterization of the frequency distribution, testing of Hardy-Weinberg equilibrium, analysis of association between single SNPs or haplotypes and different traits, and so on. Because SNP genotypes in dense marker sets are correlated, significance testing in GWA analysis is preferably performed using computationally intensive permutation test procedures, which further increases the computational burden.
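The permutation idea mentioned above can be sketched in base R. This is only an illustration under stated assumptions, not GenABEL's implementation: the function name perm_maxstat, the simulated data, and the choice of n * r^2 as the per-SNP statistic are all hypothetical. Shuffling the trait preserves the correlation between SNPs, and comparing each observed statistic with the permutation distribution of the maximum gives p-values adjusted for testing many correlated markers.

```r
# Hypothetical helper (not part of GenABEL): max-statistic permutation test.
# trait: numeric vector; geno: numeric matrix of allele counts (ids x SNPs).
perm_maxstat <- function(trait, geno, n_perm = 200) {
  # Per-SNP score statistic n * r^2, approximately chi-squared with 1 df
  score <- function(y) as.vector(length(y) * cor(y, geno)^2)
  observed <- score(trait)
  # Null distribution of the largest statistic across all SNPs
  null_max <- replicate(n_perm, max(score(sample(trait))))
  # Adjusted p-value: share of permutations whose maximum reaches the statistic
  p_adj <- sapply(observed, function(s) (1 + sum(null_max >= s)) / (n_perm + 1))
  data.frame(snp = colnames(geno), stat = observed, p_adj = p_adj)
}

# Simulated example: only snp1 affects the trait
set.seed(42)
n <- 200
geno <- matrix(rbinom(n * 5, size = 2, prob = 0.3), nrow = n,
               dimnames = list(NULL, paste0("snp", 1:5)))
trait <- 0.8 * geno[, 1] + rnorm(n)
res <- perm_maxstat(trait, geno)
print(res)
```

The cost grows linearly with the number of permutations, which is why fast per-SNP tests matter for this kind of analysis.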

To make GWA analysis possible on standard desktop computers, we developed the GenABEL library, which addresses the following objectives:

(1) Minimization of the amount of random-access memory (RAM) used and the time required for data transactions. For this, we developed an effective data storage and manipulation model.

(2) Maximization of the throughput of GWA analysis. For this, we designed optimal fast procedures for specific genetic tests.

Embedding GenABEL into the R environment allows for easy data characterization, exploration and presentation of the results, and gives access to a wide range of standard and special statistical analysis functions available in base R and in specific R packages, such as "haplo.stats", "genetics", etc.

To see the (more or less complete) functionality of GenABEL, try running

demo(ge03d2).

Another demo of interest can be run with demo(srdta). Depending on your user privileges in Windows, it may not run; in that case, try demo(srdtawin).

The most important functions and classes are:

For converting data from other formats, see

convert.snp.illumina (Illumina/Affymetrix-like format). This is our preferred conversion function and has been tested very extensively. Other conversion functions include: convert.snp.text (conversion from human-readable GenABEL format), convert.snp.ped (Linkage, Merlin, Mach, and similar files), convert.snp.mach (Mach format), convert.snp.tped (PLINK TPED format), convert.snp.affymetrix (BRLMM-style files).

For converting GenABEL data to other formats, see export.merlin (MERLIN and MACH formats), export.impute (IMPUTE, SNPTEST and CHIAMO formats), export.plink (PLINK format, also exports phenotypic data).

To load the data, see load.gwaa.data.

For data management and manipulation, see merge.gwaa.data, merge.snp.data, gwaa.data-class, snp.data-class, snp.names, snp.subset.

For merging extra data into the phenotypic part of a gwaa.data-class object, see add.phdata.

For trait manipulation, see ztransform (transformation to standard Normal), rntransform (rank-transformation to normality), npsubtreated (non-parametric routine to "impute" trait values in those medicated).
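To make the two transformations concrete, here is a minimal base R sketch of a z-transformation and a rank-based transformation to normality. The function names z_transform and rank_normalize are hypothetical, and the (rank - 0.5) / n offset is one common convention assumed here; ztransform and rntransform may differ in details such as covariate adjustment.

```r
# Hypothetical helpers (not part of GenABEL).

# Centre to mean 0 and scale to standard deviation 1, ignoring missing values
z_transform <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Map ranks to standard Normal quantiles; missing values stay missing
rank_normalize <- function(x) {
  r <- rank(x, na.last = "keep")   # NA ranks are kept as NA
  n <- sum(!is.na(x))
  qnorm((r - 0.5) / n)             # assumed offset convention
}

# Skewed example trait with one missing value
set.seed(1)
trait <- c(rexp(20), NA)
z <- z_transform(trait)
rn <- rank_normalize(trait)
```

The z-transformation keeps the shape of the distribution, so a skewed trait stays skewed; the rank-based version forces an approximately Normal shape at the cost of discarding the original spacing between values.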

For quality control, see check.trait, check.marker, HWE.show, summary.snp.data, perid.summary, ibs, hom.
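As an illustration of one of these checks, the sketch below tests Hardy-Weinberg equilibrium from genotype counts with a 1-df chi-squared test in base R. This is a deliberately simpler stand-in: the references below indicate that summary.snp.data uses an exact test, which this sketch does not reproduce. The function name hwe_chisq is hypothetical.

```r
# Hypothetical helper (not part of GenABEL): chi-squared test of HWE.
# n_aa, n_ab, n_bb: observed counts of the three genotypes at one SNP.
# Note: a monomorphic SNP gives zero expected counts and a NaN statistic.
hwe_chisq <- function(n_aa, n_ab, n_bb) {
  n <- n_aa + n_ab + n_bb
  p <- (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
  expected <- n * c(p^2, 2 * p * (1 - p), (1 - p)^2)
  observed <- c(n_aa, n_ab, n_bb)
  stat <- sum((observed - expected)^2 / expected)
  c(chisq = stat, p.value = pchisq(stat, df = 1, lower.tail = FALSE))
}

in_hwe  <- hwe_chisq(25, 50, 25)   # counts match HWE expectations exactly
out_hwe <- hwe_chisq(50, 0, 50)    # no heterozygotes at all
```

The chi-squared approximation becomes unreliable for rare alleles, which is one reason an exact test is preferred in practice.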

For fast analysis functions, see scan.gwaa-class, ccfast, qtscore, mmscore, egscore, ibs, r2fast (estimate linkage disequilibrium using R2), dprfast (estimate linkage disequilibrium using D'), rhofast (estimate linkage disequilibrium using 'rho').
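For orientation, a simple genotype-based estimate of r2 between two SNPs is the squared Pearson correlation of their allele counts. The base R sketch below shows that estimate only; it is an assumption for illustration and not necessarily the estimator that r2fast implements. The function name ld_r2 is hypothetical.

```r
# Hypothetical helper (not part of GenABEL).
# g1, g2: allele counts (0, 1, 2) for two SNPs in the same individuals.
ld_r2 <- function(g1, g2) {
  ok <- !is.na(g1) & !is.na(g2)   # use individuals typed at both SNPs
  cor(g1[ok], g2[ok])^2
}

snp_a <- c(0, 0, 0, 1, 1, 1, 2, 2, 2)
snp_b <- c(0, 1, 2, 0, 1, 2, 0, 1, 2)   # balanced against snp_a

r2_same <- ld_r2(snp_a, snp_a)        # a SNP with itself
r2_flip <- ld_r2(snp_a, 2 - snp_a)    # same SNP, other allele counted
r2_none <- ld_r2(snp_a, snp_b)        # no association in this design
```

Because the correlation is squared, the result does not depend on which allele is counted at either SNP.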

For specific tools facilitating analysis of the data with stratification (population stratification or (possibly unknown) pedigree structure), see qtscore (implements basic Genomic Control), ibs (computations of IBS / genomic IBD), egscore (stratification adjustment following Price et al.), polygenic (heritability analysis), polygenic_hglm (another function for heritability analysis), mmscore (score test of Chen and Abecasis), grammar (grammar test of Aulchenko et al.).
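The basic Genomic Control idea can be sketched in base R as follows: estimate an inflation factor lambda from the median of the genome-wide 1-df chi-squared statistics and divide every statistic by it. This is a minimal sketch assuming the median-based estimator; the function name genomic_control is hypothetical, and qtscore may estimate lambda differently.

```r
# Hypothetical helper (not part of GenABEL).
# chisq: vector of 1-df chi-squared association statistics, one per SNP.
genomic_control <- function(chisq) {
  # Observed median relative to the median expected under the null
  lambda <- median(chisq, na.rm = TRUE) / qchisq(0.5, df = 1)
  # Assumed convention: never inflate statistics when lambda is below 1
  lambda <- max(lambda, 1)
  corrected <- chisq / lambda
  list(lambda = lambda,
       chisq = corrected,
       p = pchisq(corrected, df = 1, lower.tail = FALSE))
}

# Null statistics inflated by a factor of 1.2
null_stats <- qchisq(ppoints(1001), df = 1)
gc <- genomic_control(1.2 * null_stats)
gc$lambda
```

The median is used because a small number of truly associated SNPs has little influence on it, so lambda mostly reflects genome-wide inflation rather than real signals.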

For functions facilitating construction of tables for your manuscript, see descriptives.marker, descriptives.trait, descriptives.scan.

For functions reconstructing relationships from genomic data, see findRelatives, reconstructNPs.

For meta-analysis and related tasks, see help on formetascore.

For links to web databases, see show.ncbi.

For interfaces to other packages and standard R functions, also for 2D scans, see scan.glm, scan.glm.2D, scan.haplo, scan.haplo.2D, scan.gwaa-class, scan.gwaa.2D-class.

For graphical facilities, see plot.scan.gwaa, plot.check.marker.

Author(s)

Yurii Aulchenko et al. (see help pages for specific functions)

References

If you use the GenABEL package in your analysis, please cite the following work:

Aulchenko Y.S., Ripke S., Isaacs A., van Duijn C.M. GenABEL: an R package for genome-wide association analysis. Bioinformatics. 2007 23(10):1294-6.

If you used polygenic, please cite

Thompson EA, Shaw RG (1990) Pedigree analysis for quantitative traits: variance components without matrix inversion. Biometrics 46, 399-413.

If you used environmental residuals from polygenic for qtscore, or used GRAMMAR and/or GRAMMAS analysis, please cite

Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid association using mixed model and regression: a fast and simple method for genome-wide pedigree-based quantitative trait loci association analysis. Genetics. 2007 177(1):577-85.

Amin N, van Duijn CM, Aulchenko YS. A genomic background based method for association analysis in related individuals. PLoS ONE. 2007 Dec 5;2(12):e1274.

If you used mmscore, please cite

Chen WM, Abecasis GR. Family-based association tests for genome-wide association scans. Am J Hum Genet. 2007 Nov;81(5):913-26.

For exact HWE (used in summary.snp.data), please cite:

Wigginton J.E., Cutler D.J., Abecasis G.R. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet. 2005 76: 887-893.

For haplo.stats (scan.haplo, scan.haplo.2D), please cite:

Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet. 2002 70:425-434.

For fast LD computations (functions dprfast, r2fast), please cite:

Hao K, Di X, Cawley S. LdCompare: rapid computation of single- and multiple-marker r2 and genetic coverage. Bioinformatics. 2006 23:252-254.

If you used npsubtreated, please cite

Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, Cupples LA, Myers RH. Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension. 2000 Oct;36(4):477-83.

See Also

DatABEL, genetics, haplo.stats, qvalue

Examples

## Not run: 
demo(ge03d2)
demo(srdta)
demo(srdtawin)

## End(Not run)

[Package GenABEL version 1.6-7 Index]