New Data Releases

To apply for any of these restricted-use data files:

  • New users will need to apply for restricted-use data. Please download and complete the restricted-use data contract using the CPC Data Portal.
  • Current Add Health users can log in to their CPC Data Portal application to request additional data.

For more information, please visit the Data Portal’s Frequently Asked Questions page.

Data Released (August 11, 2023)

Individual Vital Status and Underlying Cause of Death File, 2021

This file contains one record for each of the 20,745 Add Health sample members from Wave I. It provides the vital status of each sample member through 2021 as well as the National Death Index-provided underlying cause of death code in ICD-10 format for each decedent. The month and year of the most recent Add Health interview are provided for living sample members, while the month and year of death are provided for decedents. N=20,745

Ordered Cause of Death File, 2021

This file contains entity- and record-axis codes reported by the National Death Index (NDI) for each decedent in the Add Health sample through 2021. The file is arranged hierarchically, by axis code; therefore, each decedent may have multiple records depending on the maximum number of entity- and record-axis codes recorded by NDI. The sequence of the decedent’s records reflects the order in which the entity- and record-axis codes were reported in the NDI record. N=2,123
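Because each decedent can have several records in this file, a common first step is to collect them per person while keeping the reported order. The sketch below shows one way to do that in Python. The row layout (respondent ID, axis type, sequence number, ICD-10 code) and the values are hypothetical and are not the file's actual variable names or contents.

```python
from collections import defaultdict

# Hypothetical rows: (respondent id, axis type, sequence number, ICD-10 code).
# Field names and values are illustrative only, not the file's real layout.
rows = [
    ("A001", "entity", 1, "I219"),
    ("A001", "entity", 2, "E119"),
    ("A001", "record", 1, "I219"),
    ("A002", "entity", 1, "C349"),
]


def codes_by_decedent(rows):
    """Collect each decedent's codes per axis, keeping the reported order."""
    grouped = defaultdict(lambda: defaultdict(list))
    # Sort by id, axis, then sequence so codes are appended in reported order.
    for aid, axis, _seq, code in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        grouped[aid][axis].append(code)
    return {aid: dict(axes) for aid, axes in grouped.items()}


ordered = codes_by_decedent(rows)
```

With the sample rows, `ordered["A001"]["entity"]` holds that decedent's entity-axis codes in the order they were reported.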

All Coded Causes of Death File, Including Entity-Axis Codes, 2021

This file contains all underlying cause of death and entity-axis codes appearing in the National Death Index (NDI) source file through 2021. Each code functions as a dummy variable: zero represents the absence of the code on the decedent’s death certificate, and one denotes its presence. N=647

All Coded Causes of Death File, Including Record-Axis Codes, 2021

This file contains all underlying cause of death and record-axis codes appearing in the National Death Index (NDI) source file through 2021. Each code functions as a dummy variable: zero represents the absence of the code on the decedent’s death certificate, and one denotes its presence. N=647
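The 0/1 coding used in the two "All Coded Causes of Death" files makes simple tabulations straightforward. The following Python sketch assumes one record per decedent with one indicator column per code. The ID field `aid`, the column names, and the values are hypothetical stand-ins, not the files' actual variables.

```python
# Hypothetical records: one dict per decedent, one 0/1 indicator per code.
# Column names and values are illustrative only.
records = [
    {"aid": "A001", "I219": 1, "E119": 1, "C349": 0},
    {"aid": "A002", "I219": 0, "E119": 0, "C349": 1},
    {"aid": "A003", "I219": 1, "E119": 0, "C349": 0},
]


def count_with_code(records, code):
    """Count decedents whose indicator for `code` equals one."""
    return sum(1 for r in records if r.get(code, 0) == 1)


def codes_present(record, id_field="aid"):
    """List the codes flagged as present (value 1) for one decedent."""
    return sorted(k for k, v in record.items() if k != id_field and v == 1)
```

For example, `count_with_code(records, "I219")` counts the decedents with that code flagged, and `codes_present(records[0])` lists every code flagged for the first decedent.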

Data Released (May 31, 2023)

Polygenic Index Inventories – Release 2

This data file is a 2022 update of the polygenic scores computed by the SSGAC consortium for anthropometric traits, cognition/education, fertility/sexual development, health/health behaviors, and personality/well-being. N=5,689

Data Released (May 12, 2023)

Sexual Orientation/Gender Identity, Socioeconomic Status, and Health across the Life Course (SOGI-SES)

This file contains new survey data to support exploration of the relationships among sexual orientation, gender identity, gender expression, romantic and sexual behaviors, socioeconomic status, and health. It contains social, demographic, behavioral, and health data collected in 2020-2021 on a sample of Add Health Wave V participants. N=2,614

Sexual Orientation/Gender Identity, Socioeconomic Status, and Health across the Life Course (SOGI-SES) – Sensitive

This file contains the SOGI-SES study’s sensitive data variables related to gender identity, in vitro fertilization, and HIV status. N=2,614

Data Released (March 20, 2023)

Wave I & II School District Grouping Data

To facilitate clustering by school district, the school district identifiers in this file are based on the Local Education Agency identification numbers (LEAID) of the school districts in which the Wave I school, Wave I residence, and Wave II residence were situated. The first two characters of each LEAID represent the state and use the same state codes that Add Health assigns in other disseminated data intended for clustering at different geographic levels. N=84,166
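As a small illustration of how such identifiers might be used for clustering, the Python sketch below groups respondents by district identifier and reads the state prefix from its first two characters. The identifiers, respondent IDs, and function names are hypothetical, and the real identifiers may differ in length and format.

```python
from collections import defaultdict


def state_prefix(district_id):
    """Return the first two characters, which represent the state code."""
    return district_id[:2]


def cluster_by_district(pairs):
    """Group respondent ids by district identifier.

    `pairs` is an iterable of (respondent id, district id) tuples.
    """
    clusters = defaultdict(list)
    for aid, district_id in pairs:
        clusters[district_id].append(aid)
    return dict(clusters)


# Hypothetical identifiers for illustration only.
pairs = [("A001", "0712345"), ("A002", "0712345"), ("A003", "1200042")]
clusters = cluster_by_district(pairs)
```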

Data Released (February 17, 2023)

Historical Neighborhood Redlining

This contextual database allows researchers to identify potential long-term consequences of redlining for contemporary inequities in neighborhood environments and for individual health and socioeconomic attainment over the life course. N=20,706

Waves III-V Multi-year Air Pollution Exposure Estimates

The air pollution data described here provide longer-term estimates of air pollution exposure that can be used to address a broad range of research questions related to how air pollution exposure over time may relate to a variety of health outcomes. N=20,745

Wave I & II School Desegregation Disparities

This file contains data on the levels of school racial segregation experienced by Add Health respondents during their school-age years, related school district characteristics, and measures of tract-level residential segregation present in adulthood (Waves III-V). N=84,166

Data Released (November 3, 2022)

Wave V Hepatic Injury

This file contains constructed measures designed to facilitate analysis and interpretation of hepatic injury based on venous blood collected via phlebotomy at the Wave V home exam. Assay results for aspartate aminotransferase (AST) and alanine aminotransferase (ALT) are available, as well as three semi-quantitative serum index assays (lipemia, hemolysis, and icterus) to evaluate the possibility of interference with the AST or ALT assays. Two constructed measures are also available: the AST/ALT ratio and its classification. N=5,381
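The AST/ALT ratio is a simple quotient of the two assay results. The Python sketch below shows one way to compute it with basic handling of missing or non-positive values. The function name, the U/L units, and the missing-value handling are assumptions, and the file's own constructed measure and classification cutoffs may be defined differently.

```python
def ast_alt_ratio(ast_u_per_l, alt_u_per_l):
    """Return AST divided by ALT.

    Returns None when either value is missing or ALT is not positive.
    Units (U/L) and missing-value handling are assumptions for this sketch.
    """
    if ast_u_per_l is None or alt_u_per_l is None or alt_u_per_l <= 0:
        return None
    return ast_u_per_l / alt_u_per_l
```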

Wave V Renal Function Addendum

This file contains constructed measures designed to facilitate analysis and interpretation of renal function based on venous blood collected via phlebotomy at the Wave V home exam. Updated estimates of the glomerular filtration rate (GFR), calculated according to the 2021 NIDDK CKD-EPI guidelines, are available using either the creatinine concentration alone or both the creatinine and cystatin C concentrations. Classifications of the new estimates according to both clinical and KDIGO guidelines are available as well. N=5,381
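For orientation, the published 2021 CKD-EPI creatinine equation and the standard KDIGO GFR categories can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to build the file: the function names are hypothetical, serum creatinine is assumed to be in mg/dL, and the file's own constructed variables may handle rounding, missing values, and classification differently. The combined creatinine and cystatin C equation is not shown.

```python
def egfr_creatinine_2021(scr_mg_dl, age_years, female):
    """Estimate GFR (mL/min/1.73 m^2) with the 2021 CKD-EPI creatinine equation.

    Assumes serum creatinine in mg/dL and age in years.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (
        142
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


def kdigo_gfr_category(egfr):
    """Map an eGFR value to the KDIGO GFR categories G1 through G5."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"
```

A call such as `kdigo_gfr_category(egfr_creatinine_2021(0.9, 50, female=False))` chains the two steps for a single respondent.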

Data Released (September 9, 2022)

Wave IV dbGaP GWAS Sample Weight

A weight component for the dbGaP GWAS Sample. N=9,975

Data Released (June 21, 2022)

Individual Vital Status and Underlying Cause of Death File, 2019

This file contains one record for each of the 20,745 Add Health sample members from Wave I. It provides the vital status of each sample member as well as the National Death Index-provided underlying cause of death code in ICD-10 format for each decedent. The month and year of the most recent Add Health interview are provided for living sample members, while the month and year of death are provided for decedents. N=20,745

Ordered Cause of Death File, 2019

This file contains entity- and record-axis codes reported by the National Death Index (NDI) for each decedent in the Add Health sample. The file is arranged hierarchically, by axis code; therefore, each decedent may have multiple records depending on the maximum number of entity- and record-axis codes recorded by NDI. The sequence of the decedent’s records reflects the order in which the entity- and record-axis codes were reported in the NDI record. N=1,745

All Coded Causes of Death File, Including Entity-Axis Codes, 2019

This file contains all underlying cause of death and entity-axis codes appearing in the National Death Index (NDI) source file. Each code functions as a dummy variable: zero represents the absence of the code on the decedent’s death certificate, and one denotes its presence. N=540

All Coded Causes of Death File, Including Record-Axis Codes, 2019

This file contains all underlying cause of death and record-axis codes appearing in the National Death Index (NDI) source file. Each code functions as a dummy variable: zero represents the absence of the code on the decedent’s death certificate, and one denotes its presence. N=540