Genotype Data Version 1 (2006-2008 Samples)

Sample Collection

In 2006, saliva was collected using a mouthwash collection method. In 2008, saliva was collected using the Oragene DNA collection kit (OGR-250). Saliva completion rates were 83% in 2006 and 84% in 2008.

Genotyping was performed by the NIH Center for Inherited Disease Research (CIDR, X01HG005770-01, http://www.cidr.jhmi.edu/) using the Illumina Human Omni-2.5 Quad beadchip, which covers approximately 2.5 million single nucleotide polymorphisms (SNPs). For more information about the specific SNPs included on the beadchip, please refer to or download the files in the links below. Genotyping quality control was performed by the Genetics Coordinating Center at the University of Washington, Seattle, WA. A copy of the QC report is available here.

The initial data product is available through dbGaP. Specific information on the data can be found at www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000428.v1.p1


Current dbGaP data products also include imputation of approximately 21 million DNA variants from the 1000 Genomes Project (http://www.1000genomes.org/). Imputation increases the number of available markers and makes possible comparisons across platforms that do not assay the same genome-wide SNP panel. The imputation analyses were performed and documented by the Genetics Coordinating Center at the University of Washington, Seattle, WA. A copy of the imputation report is available here.

Flipped Strand Issues and Correction Method

In mid-2014, we became aware of an annotation issue with the Illumina HumanOmni2.5-v1_D manifest that caused strand flip errors in the annotation for approximately 20,000 SNPs genotyped in the Health and Retirement Study 2006/2008 samples. This manifest also served as the foundation for the 1000 Genomes imputation for those samples. We systematically investigated the problem and identified two issues with the manifest that led to the strand flip errors. The issues are described in detail in HRS1-2_dbGaPUserInfo_v3.pdf.

To correct the flipped strand issue, users should switch the coded and non-coded allele annotation for the affected SNPs in the annotation files. Affected SNPs are flagged as problem.type2.SNP = TRUE in HRS1-2_HumanOnmi2.5v1_D_flaggedSNPs.zip. For example, if an affected SNP is annotated with coded_allele=A and noncoded_allele=T, then after correction coded_allele should be T and noncoded_allele should be A.
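
The allele swap described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by HRS or the Genetics Coordinating Center. The column names coded_allele, noncoded_allele, and problem.type2.SNP come from the text above; the snp_id column name, the CSV layout, and the SNP identifiers are hypothetical placeholders, so check them against the actual flagged-SNP and annotation files before adapting the code.

```python
import csv
import io


def flagged_snp_ids(flag_rows, id_col="snp_id", flag_col="problem.type2.SNP"):
    """Collect the IDs of SNPs whose flag column reads TRUE.

    id_col is a hypothetical column name; flag_col is the flag named in the text.
    """
    return {
        row[id_col]
        for row in flag_rows
        if str(row[flag_col]).strip().upper() == "TRUE"
    }


def swap_flagged_alleles(annotation_rows, flagged, id_col="snp_id"):
    """Return copies of the annotation rows with alleles swapped for flagged SNPs."""
    corrected = []
    for row in annotation_rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row[id_col] in flagged:
            row["coded_allele"], row["noncoded_allele"] = (
                row["noncoded_allele"],
                row["coded_allele"],
            )
        corrected.append(row)
    return corrected


# Tiny in-memory stand-ins for the real files (made-up SNP IDs and layout).
flags_csv = "snp_id,problem.type2.SNP\nsnpA,TRUE\nsnpB,FALSE\n"
annot_csv = "snp_id,coded_allele,noncoded_allele\nsnpA,A,T\nsnpB,C,G\n"

flags = list(csv.DictReader(io.StringIO(flags_csv)))
annot = list(csv.DictReader(io.StringIO(annot_csv)))

corrected = swap_flagged_alleles(annot, flagged_snp_ids(flags))
for row in corrected:
    print(row["snp_id"], row["coded_allele"], row["noncoded_allele"])
```

In this toy input only snpA is flagged, so its alleles change from A/T to T/A while snpB is left as C/G. The function returns new rows instead of editing in place, which keeps the original annotation available for comparison.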

How to Apply

The genotype data and a limited set of phenotype measures have been deposited in the NIH GWAS repository (dbGaP), which provides a convenient method of distribution to researchers who meet NIH requirements for access. Researchers wishing to use the HRS genetic data must first apply to dbGaP for access to the genotype data. Access to any dbGaP study is requested through the dbGaP authorized access system. Please see the YouTube video dbGaP: Apply for Controlled Access Data for a step-by-step demonstration of how the process works.

Cross-Reference File

Once access to dbGaP has been granted, researchers who wish to link to HRS phenotype measures not in dbGaP may apply for access to the HRS-dbGaP Cross-Reference File by submitting a Genetic Data Access Use Agreement.

  1. Visit the HRS User Registration/File Download Web site. Note: If you do not already have a username and password, you must register in order to enter the site and download HRS public (phenotype) data.
  2. Download and complete the Genetic Data Access Use Agreement.
  3. Download and complete the Genetic Data Order Form.
  4. Send a signed copy of these two documents...
    1. via email (PDF) to:
      HRS Data Requests (hrsdatareq@umich.edu)
    2. via surface mail to:
      Health and Retirement Study
      DUA Review Committee
      426 Thompson Street
      Ann Arbor, Michigan 48104-2321

HRS Public Data

Researchers can view the HRS public data (phenotypes) at any time by registering as an HRS data user. The concordance tool is a convenient way to search for survey content by topic/keyword.

Restricted Data Linkages

Users wishing to link to HRS restricted data products must submit a restricted data application.


The online course Six Weeks to Genomic Awareness is now available. This free, self-paced online course is designed to provide a foundation for understanding genomic advances and identifying the relevance of genomics to public health. The course units focus on:

  1. Introduction to Genomics
  2. Genes in Populations
  3. Genetic Testing
  4. Gene-Environment Interactions
  5. Ethical, Legal and Social Issues
  6. An Overview of State and National Resources

This course is offered by the Michigan Public Health Training Center.

Additional Information: