Build Summary

Release 2 Version: 20201027095038

January 6, 2021

Please visit the ALFA project page for data access and additional information.


Input and Output Counts


Input                 Release 1            Release 2
Studies               42                   79
Subjects              98,494               192,710
Genotypes             551,345,630,054      5,762,708,417,353
Genotypes Excluded    689,601,274 (0.1%)   70,905,472,368 (1.2%)

Output                Count
Total RefSNPs         904,167,063
Exist in dbSNP 153    587,734,538
Novel                 316,432,525

Input Assay Source    Subjects*
Array                 109,697
Exome                 29,931
Genome                25,478

* Subject counts for different assay sources can overlap.
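
The percentages and totals in the tables above can be re-derived from the raw counts. The following is a minimal sketch, assuming the percentage shown next to "Genotypes Excluded" is excluded genotypes divided by input genotypes, and that "Exist in dbSNP 153" and "Novel" partition "Total RefSNPs". The variable and function names are illustrative only and are not part of any ALFA tool.

```python
# Counts copied from the Input and Output tables above.
releases = {
    "Release 1": {"genotypes": 551_345_630_054, "excluded": 689_601_274},
    "Release 2": {"genotypes": 5_762_708_417_353, "excluded": 70_905_472_368},
}


def excluded_percent(genotypes: int, excluded: int) -> float:
    """Excluded genotypes as a percentage of input genotypes (assumed definition)."""
    return 100.0 * excluded / genotypes


for name, counts in releases.items():
    pct = excluded_percent(counts["genotypes"], counts["excluded"])
    print(f"{name}: {pct:.1f}% of genotypes excluded")

# RefSNPs already in dbSNP 153 plus novel RefSNPs should give the total.
in_dbsnp_153 = 587_734_538
novel = 316_432_525
total_refsnps = in_dbsnp_153 + novel
print(f"Total RefSNPs: {total_refsnps:,}")  # Total RefSNPs: 904,167,063
```

Under these assumptions the computed values round to the 0.1% and 1.2% shown in the table.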

Population        Biosample ID  Subjects  Total Site Count  MAF = 0      MAF >= 0.01  0.01 > MAF >= 0.001  MAF < 0.001  Singleton
European          SAMN10492695  163,190   897,674,770       790,647,710  12,681,797   10,209,207           874,783,766  55,326,962
African American  SAMN10492698  5,989     890,636,714       824,003,273  17,196,729   17,457,964           855,982,021  25,207,366
African Others    SAMN10492696  211       889,747,124       867,277,986  16,180,269   6,288,869            867,277,986  6,669,925
African (Note 1)  SAMN10492703  6,200     890,637,200       823,004,460  17,231,538   17,793,146           855,612,516  25,718,386
South Asian       SAMN10492702  2,619     889,129,503       875,443,975  13,537,787   140,039              875,451,677  4,209,516
East Asian        SAMN10492697  2,515     889,348,478       877,811,054  11,378,577   133,556              877,836,345  3,529,940
Other Asian       SAMN10492701  1,000     889,147,573       880,606,177  8,492,026    41,240               880,614,307  2,584,569
Asian (Note 2)    SAMN10492704  3,515     889,368,170       876,465,327  9,012,224    3,858,982            876,496,964  4,096,666
Latin American 1  SAMN10492699  817       889,286,501       869,906,237  12,609,493   6,770,469            869,906,539  6,683,030
Latin American 2  SAMN10492700  4,703     889,328,426       862,575,116  9,597,908    17,148,719           862,581,799  11,064,135
Other             SAMN11605645  11,666    897,694,388       859,254,160  14,970,456   22,452,177           860,271,755  14,013,627
Total (Note 3)    SAMN10492705  192,710   897,734,484       737,043,490  15,177,885   17,568,436           864,988,163  80,931,234

Notes:

  1. Total of African American and African Others; see population descriptions.

  2. Total of East Asian and Other Asian; see population descriptions.

  3. Total of unique subjects, excluding the redundant African and Asian combined counts above.
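
The three notes can be checked against the Subjects column. This is a small sketch using the subject counts from the population table; the dictionary layout and variable names are illustrative assumptions.

```python
# Subject counts for the non-aggregate populations, copied from the table.
subjects = {
    "European": 163_190,
    "African American": 5_989,
    "African Others": 211,
    "South Asian": 2_619,
    "East Asian": 2_515,
    "Other Asian": 1_000,
    "Latin American 1": 817,
    "Latin American 2": 4_703,
    "Other": 11_666,
}

# Note 1: African = African American + African Others.
african = subjects["African American"] + subjects["African Others"]

# Note 2: Asian = East Asian + Other Asian.
asian = subjects["East Asian"] + subjects["Other Asian"]

# Note 3: the total counts each subject once, so the aggregate
# African and Asian rows are not added a second time.
total = sum(subjects.values())

print(f"African: {african:,}")  # African: 6,200
print(f"Asian:   {asian:,}")    # Asian:   3,515
print(f"Total:   {total:,}")    # Total:   192,710
```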

Column descriptions:

Population - see ALFA computed populations

BioSample ID - population BioSample accession ID

Subjects - unique subject count by population

Total Site Count - total unique variant sites reported

MAF = 0 - site is homozygous for the reference allele, with no variant allele detected at the current subject sample size; the variant is possibly rare if the subject count is > 100

MAF >= 0.01 - common variants

0.01 > MAF >= 0.001 - rare variants

MAF < 0.001 - ultra rare variants

Singleton - minor allele is found once
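
The MAF columns can be read as a binning rule on minor allele frequency. The sketch below is one possible reading, not the ALFA pipeline's code: it assumes MAF is the minor allele count divided by the total allele count observed at a site, and it assumes that MAF = 0 sites are counted within the MAF < 0.001 column, because in the European and Total rows the three MAF bins add up exactly to Total Site Count. The function names and the 211-subject example are hypothetical.

```python
def maf_bin(minor_allele_count: int, total_allele_count: int) -> str:
    """Assign a site to one of the three MAF bins used as column headers."""
    if total_allele_count <= 0:
        raise ValueError("total_allele_count must be positive")
    maf = minor_allele_count / total_allele_count
    if maf >= 0.01:
        return "MAF >= 0.01"          # common
    if maf >= 0.001:
        return "0.01 > MAF >= 0.001"  # rare
    return "MAF < 0.001"              # ultra rare (assumed to include MAF = 0)


def is_singleton(minor_allele_count: int) -> bool:
    """A singleton is a site whose minor allele is observed exactly once."""
    return minor_allele_count == 1


# Hypothetical example: 211 diploid subjects, all genotyped -> 422 alleles.
alleles = 2 * 211
print(maf_bin(1, alleles))  # 1/422 is about 0.0024 -> "0.01 > MAF >= 0.001"
print(maf_bin(0, alleles))  # no variant allele seen -> "MAF < 0.001"

# Consistency check on the European row: the three bins sum to the total.
european_bins = [12_681_797, 10_209_207, 874_783_766]
european_total_sites = 897_674_770
print(sum(european_bins) == european_total_sites)  # True
```

With only 422 alleles, the smallest non-zero MAF is about 0.0024, so under this reading a small population cannot have non-zero sites in the MAF < 0.001 bin. This agrees with the African Others row, where the MAF = 0 and MAF < 0.001 counts are identical.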

Data Subsets in ClinVar, GTR, dbGaP, and PubMed

Counts of RefSNPs with ALFA frequency (ALFA RS Count) and their percentage (%) of all RefSNPs (Total RS) in ClinVar with clinical significance, in GTR as genetic markers, in dbGaP with association p-values, and cited in PubMed. VCF files containing these RS subsets are available on the FTP site.


Attributes                        ALFA RS Count   Percent (%)   Total RS
ClinVar                           333,721         86            387,427
GTR                               398             84            474
GWAS Catalog (p-value < 10^-5)    21,827          99            21,972
PubMed Cited                      202,156         87            231,502
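
The Percent (%) column appears to be ALFA RS Count divided by Total RS, rounded to a whole percent. A short sketch under that assumption, with illustrative names:

```python
# (ALFA RS Count, Total RS) pairs copied from the table above.
subsets = {
    "ClinVar": (333_721, 387_427),
    "GTR": (398, 474),
    "GWAS Catalog (p-value < 10^-5)": (21_827, 21_972),
    "PubMed Cited": (202_156, 231_502),
}


def coverage_percent(alfa_rs_count: int, total_rs: int) -> int:
    """Percent of RefSNPs in a subset that have ALFA frequency, as a whole number."""
    return round(100 * alfa_rs_count / total_rs)


coverage = {name: coverage_percent(a, t) for name, (a, t) in subsets.items()}
for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```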

Last updated: 2021-01-07T16:17:52Z