
Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy

Overview of attention for article published in Human Genetics, January 2013

About this Attention Score

  • Average Attention Score compared to outputs of the same age

Mentioned by

Twitter: 1 tweeter

Citations

Dimensions: 29 citations

Readers on

Mendeley: 64 readers
Title
Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy
Published in
Human Genetics, January 2013
DOI 10.1007/s00439-013-1266-7
Pubmed ID
Authors

Eric O. Johnson, Dana B. Hancock, Joshua L. Levy, Nathan C. Gaddis, Nancy L. Saccone, Laura J. Bierut, Grier P. Page

Abstract

A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
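The union and intersection of genotyped SNP sets described in the abstract can be illustrated with a minimal sketch. The array names and rsIDs below are hypothetical placeholders, not data from the study, and measuring the intersection as a share of the union is an assumption about how "the total SNPs genotyped" is counted. The last line only restates the abstract's arithmetic (0.2% of one million imputed SNPs).

```python
def compare_snp_sets(arrays):
    """Return (union, intersection, intersection share of the union).

    `arrays` maps a hypothetical array name to the SNP IDs it genotypes.
    """
    snp_sets = [set(snps) for snps in arrays.values()]
    if not snp_sets:
        return set(), set(), 0.0
    union = set().union(*snp_sets)              # SNPs on one or more arrays
    intersection = set.intersection(*snp_sets)  # SNPs on all arrays
    share = len(intersection) / len(union) if union else 0.0
    return union, intersection, share


# Hypothetical toy input: two arrays with partially overlapping SNP content.
arrays = {
    "array_a": ["rs1", "rs2", "rs3", "rs4", "rs5"],
    "array_b": ["rs2", "rs3", "rs5", "rs6"],
}
union, intersection, share = compare_snp_sets(arrays)

# Arithmetic from the abstract: 0.2% spurious associations per million SNPs.
expected_false_positives = round(0.002 * 1_000_000)
```

In the strategy the abstract reports, only the SNPs in `intersection` would be passed to the imputation software as input for every cohort, so that all samples are imputed from the same genotyped SNP set.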

Twitter Demographics

The data shown below were collected from the profile of 1 tweeter who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 64 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States     3       5%
United Kingdom    1       2%
Germany           1       2%
Brazil            1       2%
Unknown           58      91%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                      23      36%
Student > Ph. D. Student                        13      20%
Student > Doctoral Student                      5       8%
Student > Master                                5       8%
Student > Bachelor                              4       6%
Other                                           10      16%
Unknown                                         4       6%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences            26      41%
Biochemistry, Genetics and Molecular Biology    12      19%
Medicine and Dentistry                          11      17%
Mathematics                                     4       6%
Psychology                                      1       2%
Other                                           2       3%
Unknown                                         8       13%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 19 February 2013.

All research outputs: #7,634,359 of 12,215,443 outputs
Outputs from Human Genetics: #2,131 of 2,462 outputs
Outputs of similar age: #74,082 of 136,064 outputs
Outputs of similar age from Human Genetics: #17 of 28 outputs

Altmetric has tracked 12,215,443 research outputs across all sources so far. This one is in the 23rd percentile – i.e., 23% of other outputs scored the same or lower than it.
So far Altmetric has tracked 2,462 research outputs from this source. They receive a mean Attention Score of 4.6. This one is in the 10th percentile – i.e., 10% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 136,064 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 34th percentile – i.e., 34% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 28 others from the same source and published within six weeks on either side of this one. This one is in the 25th percentile – i.e., 25% of its contemporaries scored the same or lower than it.
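
The percentile wording used above ("x% of other outputs scored the same or lower than it") can be read literally as in the sketch below. This is only an illustration of that sentence, not Altmetric's actual method, which this page does not describe; the function name and the example scores are hypothetical.

```python
def percentile_same_or_lower(score, other_scores):
    """Percentage of the other outputs whose score is the same as or lower than `score`."""
    if not other_scores:
        return 0.0
    same_or_lower = sum(1 for s in other_scores if s <= score)
    return 100.0 * same_or_lower / len(other_scores)


# Hypothetical scores for eight other outputs; the output of interest scores 1.
others = [0.25, 1, 1, 3, 5, 8, 12, 20]
pct = percentile_same_or_lower(1, others)
```

Because tied scores count as "the same or lower", a percentile computed this way does not have to match what the rank alone would suggest.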