
PDF codebooks for the Add Health restricted-use data

Wave I In-Home Interview Data
A merged file containing the Wave I In-Home Interview data, Parent Questionnaire data and the Add Health Picture Vocabulary data, collected in 1994–1995.
Wave I School Administrator Questionnaire Data
Information from the Wave I (self-administered) questionnaires answered by administrators at the sampled schools. 
School Information Data
Additional information about the individual schools.
Wave I In-School Questionnaire Data
Adolescent responses to the In-School Questionnaire administered September 1994 through April 1995.
Wave I In-Home Weights
Variables needed to correct for design effects and weight the Wave I in-home data.
Wave I School Administrator Weights
Variables needed to correct for design effects and weight the Wave I school administrator data.
Wave I Grand Sample Weights
Variables needed to correct for design effects and weight the Wave I in-school data.
Wave II In-Home Interview Data
Data collected during the 1996 in-home interview.
Wave II School Administrator Questionnaire Data
Information from the Wave II (phone-administered) questionnaires answered by administrators at the sampled schools.
Wave II Grand Sample Weights
Variables needed to correct for design effects and weight the Wave II in-home data.
Wave III In-Home Interview Data
Respondent-level data collected during the 2001–2002 in-home interview. Includes field interviewer characteristics and AHPVT data.
Wave III Grand Sample Weights
Variables needed to correct for design effects and weight the Wave III in-home data, including longitudinal and cross-sectional weights.
Wave IV In-Home Interview Data
Data collected during the 2008 in-home interview. Includes flow charts for interview sections.
Wave IV Grand Sample Weights
Variables needed to correct for design effects and weight the Wave IV in-home data, including longitudinal and cross-sectional weights.
Wave V Mixed-Mode Survey Data
Respondent data from the Wave V mixed-mode survey (comparable to the Wave I-IV In-Home Interviews). Includes flow charts for survey sections. This release includes the survey, weights, disposition, survey medications, and Section 16b data sets. N=12,300.
Wave V Sample 2B
Sample 2B (S2B) is part of the Wave V survey and was administered from 2 June 2017 to 8 July 2018. While most variables were released in December 2019 as Wave V data, two sections were not. The S2B cases specifically are a representative sample (the questions were asked of everyone in the sample). Web survey content for the other samples (1, 2A, 3) was edited mid-data collection, and not all respondents in each sample had the opportunity to answer the questions.
Wave I In-School Friendship Nominations
Identification numbers of the friends that the respondent nominated during the in-school interview.
Wave I In-Home Friendship Nominations
Identification numbers of the friends that the respondent nominated during the Wave I in-home interview.
Wave II In-Home Friendship Nominations
Identification numbers of the friends that the respondent nominated during the Wave II in-home interview.
Wave III Friend IDs
Add Health respondents who were in the 7th or 8th grade at Wave I were asked at Wave III to identify, from a list of 10 computer-generated names, which ones were current friends or which ones were their friends when they were in school together. This dataset contains the altered identification numbers (AIDs) of the 10 computer-generated names.

Adolescent Pairs Data
Information that links and describes the sibling pairs identified at the Wave I in-home interview.
Wave III Sibling IDs
At Wave III, Add Health respondents were asked questions about their siblings who also participated in the Wave I or II in-home interviews. This dataset contains the AIDs for these siblings.

Wave I Contextual Data
Community contextual variables based on state, county, tract, and block group levels derived from the Wave I addresses.
Wave II Contextual Data
Community contextual variables based on state, county, tract, and block group levels derived from the Wave II addresses.
Wave III Contextual Data
Community contextual variables based on state, county, tract, and block group levels derived from the Wave III addresses.
Wave I Spatial Analysis Data
Pseudo coordinates that can be used to calculate distances between friends in a school community.
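As a rough illustration of the distance calculation mentioned above, the sketch below computes a straight-line distance between two respondents. It assumes the pseudo coordinates are planar (x, y) pairs; the sample values are made up, and the codebook defines the actual variable names and units.

```python
import math

def pseudo_distance(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical pseudo coordinates for a respondent and a nominated friend.
respondent = (3.0, 4.0)
friend = (0.0, 0.0)
print(pseudo_distance(respondent, friend))
```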
Wave I Neighborhood Data
Pseudo state, county, tract, and block group variables that allow grouping of Add Health respondents geographically (based on Wave I addresses).
Wave II Neighborhood Data
Pseudo state, county, tract, and block group variables that allow grouping of Add Health respondents geographically (based on Wave II addresses).
Wave III Grouping File Data
Pseudo state, county, tract, and block group variables in FIPS code format that allow grouping of Add Health respondents geographically (based on Wave III addresses).
Wave IV Grouping Data
The pseudo FIPS codes in this file allow you to geographically group respondents by their Wave IV locations.
Wave III Census Region
This file contains the Census region codes for the respondents’ Wave III residential locations.
Wave IV Census Region
This file contains the Census region codes for the respondents’ Wave IV residential locations.
Wave III Supplemental Tract-Level Contextual Data
This file contains supplemental Wave III contextual data that include transportation and commuting measures, climate descriptors, amenities, and state-level tobacco control influences. These variables are available at the census tract-level unless otherwise specified.
Wave IV Supplemental Tract-Level Contextual Data
This file contains tract-level measures, based on the Wave IV respondent locations, reported by the U.S. Census Bureau’s 2009 American Community Survey (ACS), the Climate Atlas of the United States, the USDA Economics Research Service, Esri Data and Maps, ImpacTeen Tobacco Control policy and Prevalence Data, and the Uniform Crime Reports. When tract level measures were not available or appropriate, state and county level variables were used.

Wave I, II, III Political Context Data
The Add Health Political Context files provide an array of measures that describe the political environments in which Add Health respondents reside. These contextual variables include measures of commuting, election results for gubernatorial, presidential, and senatorial races, and voter registration law.
Variables: WI: 17; WII: 13. N: WI: 20,745; WII: 14,738; WIII: 15,197.
Wave III Sex Ratio Data
This file contains constructed county-level variables: sex ratios of males to females for ages 18 to 29, 30 to 34, and 18 to 34; the proportions of males and females ages 18 to 34; and the ratio of employed males 16 years and older to the total number of females 16 years and older for the white population and the black or African American population.
Wave III Alcohol Outlet Density Data
This Add Health data file measures the prevalence of alcohol outlets in respondent communities by reporting the tract-level density of establishments possessing on- and/or off-premise alcohol licenses.
Wave IV Modified Retail Food Environment Index (mRFEI) Data
The Wave IV mRFEI data file includes the mRFEI for each respondent based on their Wave IV residential location.
Wave IV Ambient Air Pollutants Data
These files include 365 daily exposure estimates of ambient air pollutants (individual pollutants/particulate matter/gases) for each Add Health study participant in Wave IV.
Variables: varies by file. N: varies by file.
Wave III & IV Sexual Minority Policy Data
This database includes indicators of the policy context for sexual minorities (a population typically defined based on co-residence with a same-sex partner or self-identification as gay, lesbian, or bisexual) in Waves III & IV of Add Health. These data will enable researchers to examine how state-level policies in young adulthood are associated with a wide range of indicators of health and well-being among sexual minorities.
Variables: WIV: 4. N: WIII: 15,197; WIV: 15,701.
Wave IV College Characteristics Data
At Wave IV of the Add Health survey, respondents were asked if they had received a bachelor’s degree. This is a degree-level file. College-level data were linked to each degree based on the institution from which the respondent reported receiving each degree.
Wave III & IV College Mobility Data
These data were collected from a sample of college students who were born between 1980 and 1982 and who attended a college or university in the early 2000s. At Wave III of the Add Health survey, respondents were asked if they were currently enrolled in a postsecondary institution. At Wave IV of the Add Health survey, respondents were asked if they had received a bachelor’s degree. Respondents who answered in the affirmative were then asked to report the institution in which they were currently enrolled or from which they received this degree.
Variables: WIII: 50; WIV: 52. N: WIII: 15,197; WIV: 15,810.
Wave I & IV County Health, Mobility, and Tobacco Tax Data
The Wave I & IV County Health and Mobility database summarizes the socioeconomic, health, and mobility characteristics of the environments in which Add Health participants were living at the time of their Wave I & IV interview. These variables are available at the county or state level. Wave I – N: 20,745 and Wave IV – N: 15,701
Variables: WI: 45. N: WI: 20,745; WIV: 15,701.
Wave II & III Tobacco Tax Data
The Wave II & III Tobacco Tax Data file supplements the County Health and Mobility database available for Waves I & IV. Together, these data files provide the tobacco tax for each Add Health respondent's state of residence from Wave I to Wave IV. Wave II – N: 14,738 and Wave III – N: 15,197
Variables: WII: 2. N: WII: 14,738; WIII: 15,197.
Wave I, III & IV Sunset Data
These data present average sunset time over the course of the year for the respondent’s Census block group. Users can input latitude and longitude coordinates and time zone information to retrieve the sunset time (along with other solar and lunar information) for any date between 1700 and 2100.
Variables: WI: 2; WIV: 2. N: WI: 20,745; WIII: 15,197; WIV: 15,701.
Wave I, II, III, IV & V Grouping Data
These location identifiers are based on 2010 Census geographic boundaries and are longitudinally consistent across all waves. Location identifiers released with previous Add Health waves were based on the most recent Census for that wave and are therefore not comparable through time. These identifiers are based on Census block group FIPS codes.
HUD-Assisted Housing Supplementary Data
The supplementary datafile identifies Add Health respondents who lived in HUD-assisted housing at any point between 1995 and 2017. For these Add Health respondents, the supplementary datafile provides unique Add Health respondent identifiers (AID) and household-level information about the characteristics of their HUD housing residence. The supplementary datafile is a hierarchical (i.e., long-format) data file, with each row representing a unique HUD administrative record that is linked to an AID. In total, the hierarchical file includes 8,587 HUD records on 1,159 unique Add Health respondents identified through the linkage.
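Because the file is long-format (one row per HUD record, several rows per respondent), analyses at the respondent level usually start by collapsing rows by AID. The sketch below shows one way to count records per respondent. Only the `AID` key comes from the description above; `record_year` and all sample values are made up for illustration.

```python
from collections import defaultdict

def records_per_respondent(rows):
    """Count long-format rows per respondent.

    `rows` is an iterable of dicts; the key "AID" is assumed to hold the
    respondent identifier. All other keys are ignored here.
    """
    counts = defaultdict(int)
    for row in rows:
        counts[row["AID"]] += 1
    return dict(counts)

# Made-up records: one row per HUD administrative record.
sample = [
    {"AID": "A001", "record_year": 1998},
    {"AID": "A001", "record_year": 1999},
    {"AID": "A002", "record_year": 2005},
]
print(records_per_respondent(sample))
```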
Wave I State Demographic Characteristics, Exclusionary Indices, and Inclusionary Indices
These data provide measures of punishment regime variation in state-based policies, practices, and programs, both in their punitive and non-punitive forms, and some additional state demographic control variables. These data were gathered to use with Add Health for multilevel analyses.
Contextual Wave IV Database
The data provide an important update to the contextual variables already available in Wave IV by including information at the state- and county-levels. Additionally, there are a few new variables in the present database that are absent from other waves.
Contextual Wave V Database
This contextual database further expands the extensive contextual data currently available to users of the National Longitudinal Study of Adolescent to Adult Health (Add Health) through the provision of numerous measures reported by the U.S. Census Bureau’s American Community Survey (ACS), rural-urban commuting area codes, U.S. Climate Atlas, and Uniform Crime Report.
Wave V County Health and Mobility Data
The Wave V County Health and Mobility database summarizes the socioeconomic, health, and mobility characteristics of the environments in which Add Health participants were living at the time of their Wave V interview. County-level data describe (1) levels of and trends in chronic disease (hypertension, type-2 diabetes) and health risk behaviors (obesity, smoking, alcohol use); and (2) economic opportunity and inequality.
Wave V – ACA Medicaid Expansion Data
This data file provides the year in which states expanded Medicaid under the Affordable Care Act. These dates are attached to the location in which Add Health participants were living at Wave V.
Waves I, IV, and V – The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility Data
This file enhances the existing Add Health contextual database through the addition of measures essential to understanding the determinants and sequelae of socioeconomic mobility. Specifically, it aims to characterize the socioeconomic mobility of Add Health participants at Wave I, IV, and V.
Variables: WI: 211; WIV: 211; WV: 211. N: WI: 20,745; WIV: 15,701; WV: 12,300.
Wave IV City Crime Rate Data
This Crime Rate Data file facilitates research examining the impact of community violence on the health trajectories of Wave IV Add Health participants by providing police department crime data from 13 U.S. cities with high crime rates.
Wave V City Crime Rate Data
This Crime Rate Data file facilitates research examining the impact of community violence on the health trajectories of Wave V Add Health participants by providing police department crime data from 13 U.S. cities with high crime rates.
Historical Neighborhood Redlining
This contextual database allows researchers to identify potential long-term consequences of redlining for contemporary inequities in neighborhood environments, and individual health and socioeconomic attainment over the life course.
Waves III-V Multi-year Air Pollution Exposure Estimates
The air pollution data described here provide longer-term estimates of air pollution exposure that can be used to address a broad range of research questions related to how air pollution exposure over time may relate to a variety of health outcomes.
Wave I & II School Desegregation Disparities
This file contains data on the levels of school racial segregation experienced by Add Health respondents during their school-age years, related school district characteristics, and measures of tract-level residential segregation present in adulthood (Waves III-V).
Wave I & II School District Grouping Data
To facilitate clustering by school district, the school district identifiers in this file are based on the Local Education Agency identification numbers (LEAID) of the school districts in which the Wave I school, Wave I residence, and Wave II residence were situated. The first two characters of this LEAID represent the state and reflect the state codes assigned by Add Health in other disseminated data similarly intended for clustering at different geographic levels.
Wave V Contextual Despair
This contextual data set focuses on the social, political, and resource environments of Add Health respondents at the tract, county, and state levels that are relevant to the prevailing causes of death in midlife, namely alcohol-related diseases, drug overdoses and accidental poisonings, and suicide and self-inflicted harm. Most measures are specific to Wave V residential location, though several measures span multiple waves. Measures include the sociodemographic and segregation context, proximity to firearms distributors and alcohol outlets, opioid dispensing, and policies related to alcohol, drugs, and firearms. N=20,745, v=266
Contextual Heterosexism Database-Phase 1
This contextual database, Contextual Heterosexism Database-Phase 1 (CHD1), further expands the collection of contextual data available to users of The National Longitudinal Study of Adolescent to Adult Health (Add Health) through the provision of state, county, and tract level measures from the Decennial Census of Population and Housing, American Community Survey (ACS), the Movement Advancement Project (MAP), Lax and Phillips (2009), Public Religion Research Institute (PRRI), Cooperative Election Study (CES), U.S. Religion Census, and Massachusetts Institute of Technology (MIT) Election Lab. These data include indicators of social policies, social climate, and confounding factors related to the study/measurement of structural heterosexism that correspond to Waves 3, 4, and 5. Some of these indicators are new to the Add Health contextual database and others were previously not available at all three of these waves. 

Contextual Heterosexism Database, Phase 2
The Contextual Heterosexism Database, Phase 2 (CHD2) further expands the collection of contextual data available to users of The National Longitudinal Study of Adolescent to Adult Health (Add Health) through the provision of state-level measures from the Harvard Implicit Associations Test (IAT) database and city-level measures from the Human Rights Campaign’s (HRC) Municipal Equality Index. These measures correspond to Waves IV and V of Add Health.

Sexual Minority Policy
These data will enable researchers to examine how state-level policies in young adulthood are associated with a wide range of indicators of health and well-being among sexual minorities. The variables included in this contextual database measure context at the state level. Wave V measures correspond to the time period following Wave IV through to the start of Wave V (2010-2016).


Aircraft Noise Measures

These data contain estimates of aircraft noise measures around ninety major airports and aircraft noise proxies for approximately 900 additional airports, merged with geopositioned/geocoded Add Health respondent locations over Waves I-VI. The documentation also describes how the aircraft noise source data were acquired, as well as the protocol for quality controlling their assignment across waves.


Rural-Urban Commuting Area (RUCA) Codes

Rural-urban commuting area (RUCA) codes classify U.S. census tracts using measures of population density, urbanization, and daily commuting. The data file containing them is based on RUCA codes for census years 1990, 2000, and 2010. The accompanying documentation describes the rationale for and utility of acquiring RUCA codes and assigning them to the census geographies in which Add Health respondents have resided over three decades.

Roadway Proximity / Density

These measures include distances to and summed lengths of primary and secondary roads in geocoded respondent residence-centric buffers of varying size. The data file is based on national-level data on major roads from ESRI StreetMap and Data & Maps. The User Guide documents how the roadway source data were acquired and the protocol for quality controlling their measurement and classification across waves. Whenever possible, construction, measurement, and classification were harmonized to ensure temporal comparability.

National Land Cover Database (NLCD) Neighborhood Land Cover Measures

These land use measures include the land areas (in square meters) surrounding geocoded respondent residences that are classified as developed, forested, etc. The data file containing them is based on data from the National Land Cover Database (NLCD).

Wave III ASHA Call Data
To receive the results of their STD assays, Wave III respondents called an Add Health dedicated number at the American Social Health Association. This dataset provides information on who called the results hotline and the date and time of the call.
Wave III BEM Scores Data
The masculinity and femininity raw and standard scores from the 30 item short form BEM Sex-Role Inventory are available in this file.
Wave III Mentor Data
For Wave III respondents who reported having a mentor, the open-ended responses to the question “How did {HE/SHE} help you?” have been coded and are available in this file.
Wave III Academic Courses
These files contain academic status and/or performance indicators for math, science, foreign language, English, history, social sciences, physical education, and a combined overall category.
Wave III Academic Networks
The Network files provide information on social networks based on the respondents’ course-taking patterns.
3; varies by file
Wave III Context
School-level contextual data are from the Common Core of Data (CCD), Private School Survey (PSS), the 1990 and 2000 Census, and the Office of Civil Rights.
6; varies by file
Wave III Course-Level
The data in this file are needed for merging the course-level curriculum data with other Education Files.
Wave III Curriculum
These math and science curriculum data are derived from coding the textbooks schools reported using for each course offered in these two subjects.
8; varies by file
Wave III Linking
This file contains variables designed to link transcript data to academic or school years and to Add Health.
Wave III Primary
The Primary Component contains several types of indicators based on information collected from participating schools and listed directly on student transcripts such as student exit or graduation status and materials gathered from schools during the data collection process.
5; varies by file
Wave III Transition
This file contains variables explaining the respondents’ movement through the educational system.
Wave III Weights
Files contain weights for the education data along with the school weights needed for HLM analysis.
2; varies by file
Wave III Academic Transcript Social Studies and Civic Coursework (ATRCVC) Data
The Wave III ATRCVC data is a course-by-student-level file that includes academic transcript data related to social studies and civic coursework.
Wave I In-Home Weight Components
A weight component for each level of sampling (school and adolescents) has been created for each wave of data collection. This file contains the weight components needed for computing multilevel weights for Wave I.
Wave II In-Home Weight Components
A weight component for each level of sampling (school and adolescents) has been created for each wave of data collection. This file contains the weight components needed for computing multilevel weights for Wave II.
Wave III In-Home Weight Components
A weight component for each level of sampling (school and adolescents) has been created for each wave of data collection. This file contains the weight components needed for computing multilevel weights for Wave III.
Wave IV Weight Components
A weight component for each level of sampling (school and adolescents) has been created for each wave of data collection. This file contains the weight components needed for computing multilevel weights for Wave IV.
In-School Weight Components
A weight component for each level of sampling (school and adolescents) has been created for each wave of data collection. This file contains the weight components needed for computing multilevel weights.
Add Health School Weights
The initial weights for the school are in this file.
Wave V Mixed-Mode Weights
The Wave V Mixed-Mode Weights were released with the Wave V Mixed-Mode Survey.
Wave I School Network Data
Network variables constructed from the in-school questionnaire data and friendship nominations.

Wave I and II Disposition File
This file contains the types of data available for the Wave I respondents along with the outcome of the 16,706 respondents selected for Wave II.
Wave III Disposition File
This file contains the final outcome of the 20,058 cases fielded at Wave III.
Wave IV Disposition File
This file contains the final outcome of the 19,962 cases fielded at Wave IV.
National Death Index File
This file contains the underlying cause of death and days alive after Wave I interview.
Wave V Disposition File
The Wave V Disposition file was released with the Wave V Mixed-Mode Survey Data.
Overview of ONE Files
This file contains an overview of the Obesity and Neighborhood Environment (ONE) data. Codebooks for each specific topic are found below. Data are available on: climate, street connectivity, crime, geocode source, land cover, parks, physical activity resources, urban distances, weather, population density, school distances measures, FIPS code grouping, mobility, MSA pseudo codes, cost of living, employment, length of day, road type length, and rural-urban commuting characteristics.
Wave I and III ACCRA Data
These data files report ACCRA Cost of Living Index data for Wave I and Wave III based on respondent location and the year and quarter of the Add Health interview.
Variables: WI: 38; WIII: 79. N: WI: 20,745; WIII: 15,197.
Wave I and III Climate Files
This file contains the climate data for Wave I and III respondents based on the nearest climate station. Information is available on precipitation, total snowfall, sky cover, temperature, and total hours of sunshine.
Variables: WI: 25; WIII: 25. N: WI: 20,745; WIII: 15,197.
Wave I and III Crime Files
The county-level crime data in these files are based on the Wave I and III respondent locations.
Variables: WI: 8. N: WI: 20,745; WIII: 15,197.
Wave I and III Employment Files
Certain county-level employment data from the U.S. Bureau of Labor Statistics are attached to Wave I and Wave III respondent locations.
Variables: WI: 47; WIII: 36. N: WI: 20,745; WIII: 15,197.
Wave I and III Geocode Source
The data sources of the Wave I and III respondent residential geocodes (latitude and longitude) are provided in these files.
Variables: WI: 2. N: WI: 20,745; WIII: 15,197.
Wave I and III Land Cover Data
These files contain land cover metrics within 1, 3, 5, and 8.05 km (5 miles) of Wave I and III respondent locations.
Variables: WI: 237; WIII: 237. N: WI: 20,745; WIII: 15,197.
Wave I and III Length of Day
These data files contain the number of hours of daylight at each Wave I and Wave III respondent location based on that respondent’s latitude and survey date.
Variables: WI: 2. N: WI: 20,745; WIII: 15,197.
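To show how hours of daylight can follow from latitude and date alone, the sketch below uses a common textbook approximation (a cosine model of solar declination and the sunrise hour-angle formula). It is not the method used to build the Add Health files, and the function name and arguments are illustrative.

```python
import math

def daylight_hours(latitude_deg, day_of_year):
    """Approximate hours of daylight for a latitude and day of year (1-365)."""
    # Approximate solar declination in degrees for the given day.
    decl = -23.44 * math.cos(2.0 * math.pi * (day_of_year + 10) / 365.0)
    # Cosine of the sunrise/sunset hour angle; clamp to handle polar day/night.
    x = -math.tan(math.radians(latitude_deg)) * math.tan(math.radians(decl))
    x = max(-1.0, min(1.0, x))
    hour_angle_deg = math.degrees(math.acos(x))
    # The sun moves 15 degrees of hour angle per hour, on both sides of noon.
    return 2.0 * hour_angle_deg / 15.0

# Example: a location at 40 degrees north near the June solstice.
print(round(daylight_hours(40.0, 172), 1))
```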
Wave I and III Parks Data
The counts of public parks within a Euclidean distance of 1, 3, 5, and 8.05 kilometers (5 miles) of each respondent at Wave I and III are in these files.
Variables: WI: 43; WIII: 43. N: WI: 20,745; WIII: 15,197.
Wave I and III Population Density Data
The Wave I population density file contains the proportion of 1990 U.S. Census block group population and area (in square meters) within 1, 3, 5, and 8.04672 km (5 mi) of each Wave I respondent. The Wave III population density file contains the proportion of 2000 U.S. Census block group population and area (in square meters) within 1, 3, 5, and 8.04672 km (5 mi) of each Wave III respondent.
Variables: WI: 9. N: WI: 20,745; WIII: 15,197.
Wave I and III Resources Data
These Add Health files provide data on the presence of various physical activity (PA) resources situated near respondent residences at Wave I and III.
Variables: WI: 749; WIII: 3,741. N: WI: 20,745; WIII: 15,197.
Wave I and III Road Type Length Data
Road type length calculations within radii of 1, 3, 5, and 8.05 kilometers (5 miles) of Wave I and Wave III respondent locations.
Variables: WI: 53; WIII: 53. N: WI: 20,745; WIII: 15,197.
Wave I and III Rural-Urban Commuting Area (RUCA) Codes
These data files define the rural-urban commuting characteristics of Wave I and Wave III respondent locations at the U.S. Census tract level using the 1990 and 2000 RUCA codes developed by the U.S. Department of Agriculture’s Economic Research Service.
Variables: WI: 3. N: WI: 20,745; WIII: 15,197.
Wave I and III Street Connectivity Files
These files contain road network connectivity measures within 1, 3, 5, and 8.05 km (5 miles) of the Wave I and III respondent locations.
Variables: WI: 57; WIII: 57. N: WI: 20,745; WIII: 15,197.
Wave I and III Urban Distances
Contains Euclidean distances to both 1990 and 2000 U.S. Census Urbanized Areas (UAs) for each Wave I respondent. Contains the Euclidean distance to 2000 U.S. Census Bureau-defined urbanized areas (UAs) for each Wave III respondent.
Variables: WI: 3. N: WI: 20,745; WIII: 15,197.
Wave I and III Weather Data
This file contains weather data for Wave I and III respondents based on the nearest weather station reporting data for the corresponding survey month and year.
Variables: WI: 8. N: WI: 20,745; WIII: 15,197.
Wave I Grouping File
This file is for use with the Obesity and Neighborhood Environment (ONE) data only. It is based on recoded Wave I addresses and contains pseudo state, county, tract, and block group variables in FIPS code format that allow for grouping of Add Health respondents geographically.
Wave I School Distance Measures
This file contains the distance between the geocoded point locations of each respondent’s Wave I location and that respondent’s school.
Wave III Mobility Data
Reports the distance between each respondent’s geocoded point location for each survey wave and that respondent’s school location, along with the respondent’s move distance between each survey wave.
Wave III MSA Pseudo Codes
The MSA pseudo code created for each respondent’s Wave III location is in this file.
Wave III Cotinine Assay Data
This file contains the cotinine and 3-hydroxycotinine assay values for 963 Wave III respondents.
Wave III HPV-MGEN Assay Data
Assay results for human papillomavirus and mycoplasma genitalium are available for a subset of the Wave III respondents who provided a urine sample.
Wave III HPV-MGEN Assay Weights
Sample weights for respondents with HPV and MGEN assay results are in this file.
Wave III Urinalysis Data
This file contains nitrate, specific gravity, pH level, white blood cells, protein, glucose, ketone, urobilinogen, bilirubin, microalbumin, urine creatinine, and blood values from the Wave III urine specimens.
Wave IV Medication File Data
This file provides the therapeutic classification codes for the medications reported at Wave IV.
Wave IV Glucose – HbA1c
This file contains two measures of glucose homeostasis based on the assay of the Wave IV dried blood spots.
The results of the assays for CRP (C-reactive protein) and EBV (Epstein-Barr virus) are in this data file.
Wave IV Lipids
This file contains constructed measures designed to facilitate analysis and interpretation of lipids results.
Wave IV Baroreceptor Sensitivity
This file contains constructed measures for baroreflex sensitivity, heart rate recovery, and systolic blood pressure recovery for the Wave IV respondents.
Wave IV Narcotic Drug Flag
This file contains a flag that identifies Wave IV respondents who report taking a medication that contains a narcotic.
Wave IV Consent
This file contains variables indicating the types of consent (archive, no archive, refused, incarcerated) obtained for the Wave IV blood spot and saliva DNA collections.

Wave V Demographics – Home Exam
This file contains demographic variables for the Wave V home exam, including the date of the home exam, the number of days between the Wave V survey and the home exam, the home exam completion status, and the respondent’s age, biological sex, pregnancy status, medication use, and blood draw status.
Wave V Anthropometrics  
This file contains anthropometric variables constructed from the measurements taken at the Wave V Home Exam.  The measurements include arm circumference, height, weight and waist circumference.  The file also contains BMI, as well as classification variables for BMI and waist circumference.
Wave V Cardiovascular Measures
This file contains cardiovascular measures constructed from the three serial measurements of blood pressure and pulse rate collected at the Wave V Home Exam.  The measures include systolic and diastolic blood pressure, pulse rate, pulse pressure and mean arterial pressure.  Other variables included in the file are two classifications of blood pressure, flags based on self-reported medical history, an antihypertensive medication flag and a joint classification variable.
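As a rough illustration of how measures like these can be derived from serial readings, the sketch below averages the readings and computes pulse pressure and mean arterial pressure with common textbook formulas. The function name, the choice to average all supplied readings, and the MAP approximation are assumptions made for illustration; the codebook documents how the released variables were actually constructed.

```python
from statistics import mean


def summarize_bp(systolic, diastolic):
    """Summarize serial blood pressure readings (mmHg).

    Assumption: all supplied readings are averaged. The released
    Add Health variables may follow a different rule.
    """
    sbp = mean(systolic)
    dbp = mean(diastolic)
    pulse_pressure = sbp - dbp  # PP = SBP - DBP
    # Common approximation: MAP = DBP + PP / 3
    mean_arterial_pressure = dbp + pulse_pressure / 3
    return {
        "sbp": sbp,
        "dbp": dbp,
        "pp": pulse_pressure,
        "map": mean_arterial_pressure,
    }


if __name__ == "__main__":
    # Three hypothetical serial readings for one respondent.
    print(summarize_bp([122, 118, 120], [82, 78, 80]))
```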
Wave V Medications-Home Exam 
This file provides the therapeutic classification codes, as well as numerous medication flag variables, to identify the types of medications (both prescription and over-the-counter) that respondents reported at the Wave V Home Exam.
Wave V Glucose Homeostasis
This file contains assay results for glucose and hemoglobin A1c (HbA1c) based on venous blood collected via phlebotomy at the Wave V home exam. There are classifications for fasting glucose and non-fasting glucose, as well as an HbA1c classification. Other variables included are flags based on a self-reported diabetes medical history, anti-diabetic medication use, and a diabetes joint classification variable.
Wave V Lipids
This file contains constructed measures designed to facilitate analysis and interpretation of lipids results based on venous blood collected via phlebotomy at the Wave V home exam. In addition to the lipid assay results, there are classifications according to both the NCEP/ATP III and AHA/ACC guidelines, a flag for antihyperlipidemic medication use, and two hyperlipidemia joint classification variables.
Wave V Renal Function
This file contains constructed measures designed to facilitate analysis and interpretation of renal function based on venous blood collected via phlebotomy at the Wave V home exam. Assay results for creatinine and cystatin C are available, as well as three different estimations of the glomerular filtration rate (GFR) using the creatinine concentration, the cystatin C concentration, or both concentrations. Classifications according to both clinical and KDIGO guidelines are available as well.
Wave V Inflammation and Immune Function
This file contains constructed measures designed to facilitate analysis and interpretation of inflammation and immune function based on venous blood collected via phlebotomy at the Wave V home exam. In addition to the assay results for high sensitivity C-reactive protein (hsCRP), there is an AHA/CDC classification, counts of subclinical symptoms and common infectious or inflammatory diseases, and various anti-inflammatory medication use flags.
Wave V Biomarker Weight
This file contains the Wave V biomarker sample weight.
Wave V Hepatic Injury
This file contains constructed measures designed to facilitate analysis and interpretation of hepatic injury based on venous blood collected via phlebotomy at the Wave V home exam.  Assay results for aspartate aminotransferase (AST) and alanine aminotransferase (ALT) are available, as well as three semi-quantitative serum index assays – lipemia, hemolysis and icterus – to evaluate the possibility of interference with the AST or ALT assays.  Moreover, two constructed measures are available – the AST/ALT ratio and its classification.
Wave V Renal Function Addendum
This file contains constructed measures designed to facilitate analysis and interpretation of renal function based on venous blood collected via phlebotomy at the Wave V home exam. Updated estimations of the glomerular filtration rate (GFR) based on 2021 guidelines are available using either the creatinine concentration or using both the creatinine and cystatin C concentrations.  The estimations of GFR have been calculated according to the 2021 NIDDK CKD-EPI guidelines.  Classifications of the new estimations according to both clinical and KDIGO guidelines are available as well.
Wave V Baroreflex Sensitivity and Hemodynamic Recovery
This file contains constructed measures for baroreflex sensitivity, heart rate recovery, and systolic blood pressure recovery for the Wave V respondents.
Wave V Measures of Inflammation and Immune Function
This file contains additional measures of inflammation and immune function based on venous blood collected via phlebotomy at the Wave V home exam and then assayed for several cytokines.
This file contains two measures of neurodegeneration based on venous blood collected via phlebotomy at the Wave V home exam and then assayed for neurofilament light (NfL) and tau.
Birth Records Database Codebook
Birth Records Database User Guide

Birth record data were collected from participating states for Add Health sample member (AHSM) birth years, 1974-83. When these states provided birth data for all recorded births occurring during that time interval, an AHSM-specific subset was created using Link Plus, statistical linkage software developed by the U.S. Centers for Disease Control and Prevention (CDC), Cancer Division. One participating state performed its own AHSM linkages and provided Add Health with the linked subset of births. Add Health then performed transformations on all of the original data from the participating states to create the categorical variables present in this release.
Individual Vital Status and Underlying Cause of Death File, 2019
This file contains one record for each of the 20,745 Add Health sample members from Wave I. It provides the vital status of each sample member as well as the National Death Index-provided underlying cause of death code in ICD-10 format for each decedent. The month and year of the most recent Add Health interview are provided for living sample members, while the month and year of death are provided for decedents.
Ordered Cause of Death File, 2019
This file contains entity- and record-axis codes reported by the National Death Index (NDI) for each decedent in the Add Health sample. The file is arranged hierarchically, by axis code; therefore, each decedent may have multiple records depending on the maximum number of entity- and record-axis codes recorded by NDI. The sequence of the decedent’s records reflects the order in which the entity- and record-axis codes were reported in the NDI record.
All Coded Causes of Death File, Including Entity-Axis Codes, 2019
This file contains all underlying cause of death and entity-axis codes appearing in the National Death Index (NDI) source file. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one.
All Coded Causes of Death File, Including Record-Axis Codes, 2019
This file contains all underlying cause of death and record-axis codes appearing in the National Death Index (NDI) source file. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one.
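The dummy-variable layout described for the two All Coded Causes of Death files can be pictured with a short sketch: one 0/1 indicator per code, per decedent. The function name and the example ICD-10 codes below are hypothetical and are not taken from the Add Health files.

```python
def code_indicators(decedent_codes, all_codes):
    """Return a 0/1 indicator for every code in all_codes.

    1 means the code appears among the decedent's codes, 0 means it does not.
    """
    present = set(decedent_codes)
    return {code: int(code in present) for code in all_codes}


if __name__ == "__main__":
    # Hypothetical code list for the file and for one decedent.
    file_codes = ["E119", "I219", "X42"]
    print(code_indicators(["I219", "X42"], file_codes))
```

In this layout each decedent occupies a single record, so the width of the file grows with the number of distinct codes.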
Individual Vital Status and Underlying Cause of Death File, 2021
This file contains one record for each of the 20,745 Add Health sample members from Wave I. It provides the vital status of each sample member through 2021 as well as the National Death Index-provided underlying cause of death code in ICD-10 format for each decedent. The month and year of the most recent Add Health interview are provided for living sample members, while the month and year of death are provided for decedents. N=20,745
Ordered Cause of Death File, 2021
This file contains entity- and record-axis codes reported by the National Death Index (NDI) for each decedent in the Add Health sample through 2021. The file is arranged hierarchically, by axis code; therefore, each decedent may have multiple records depending on the maximum number of entity- and record-axis codes recorded by NDI. The sequence of the decedent’s records reflects the order in which the entity- and record-axis codes were reported in the NDI record. N=2,123
All Coded Causes of Death File, Including Entity-Axis Codes, 2021
This file contains all underlying cause of death and entity-axis codes appearing in the National Death Index (NDI) source file through 2021. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one. N=647
All Coded Causes of Death File, Including Record-Axis Codes, 2021
This file contains all underlying cause of death and record-axis codes appearing in the National Death Index (NDI) source file through 2021. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one. N=647
Individual Vital Status and Underlying Cause of Death File, 2022
This file contains one record for each of the 20,745 Add Health sample members from Wave I. It provides the vital status of each sample member through 2022 as well as the National Death Index-provided underlying cause of death code in ICD-10 format for each decedent. The month and year of the most recent Add Health interview are provided for living sample members, while the month and year of death are provided for decedents.


Ordered Cause of Death File, 2022
This file contains entity- and record-axis codes reported by the National Death Index (NDI) for each decedent in the Add Health sample through 2022. The file is arranged hierarchically, by axis code; therefore, each decedent may have multiple records depending on the maximum number of entity- and record-axis codes recorded by NDI. The sequence of the decedent’s records reflects the order in which the entity- and record-axis codes were reported in the NDI record.


All Coded Causes of Death File, Including Entity-Axis Codes, 2022
This file contains all underlying cause of death and entity-axis codes appearing in the National Death Index (NDI) source file through 2022. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one.


All Coded Causes of Death File, Including Record-Axis Codes, 2022
This file contains all underlying cause of death and record-axis codes appearing in the National Death Index (NDI) source file through 2022. Functioning as dummy variables, zero represents the absence of a code on the decedent’s death certificate, while one denotes the presence of one.


Mortality Outcomes Surveillance: Death Certificate Data
This file contains data abstracted from decedent death certificates.


Mortality Outcomes Surveillance: Coroner/Medical Examiner Report Data
This file contains data abstracted from coroner/medical examiner reports.


Wave IV Genetic Risk Score* – BMI
This file contains the BMI genetic risk score* for Add Health twin and full sibling respondents who provided saliva samples at Wave IV.
Wave IV Genetic Risk Score* – Education
An education genetic risk score* (GRS_EDU) is available for Add Health twin and full sibling respondents who provided saliva samples at Wave IV. This variable is the weighted sum of risk alleles identified in the Rietveld et al. (2013) genome-wide association study.
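A weighted sum of risk alleles can be sketched as follows. This is a minimal illustration only: the SNP identifiers, allele counts, and weights are hypothetical, and the actual SNPs and weights used for GRS_EDU are documented in the codebook, not here.

```python
def genetic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts across SNPs.

    allele_counts: SNP id -> number of risk alleles carried (0, 1, or 2)
    weights:       SNP id -> weight for that SNP (e.g., a GWAS effect size)
    """
    if allele_counts.keys() != weights.keys():
        raise ValueError("allele_counts and weights must cover the same SNPs")
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


if __name__ == "__main__":
    # Hypothetical respondent with two SNPs.
    counts = {"snp_a": 2, "snp_b": 1}
    effect_sizes = {"snp_a": 0.5, "snp_b": 0.25}
    print(genetic_risk_score(counts, effect_sizes))
```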
Wave III DNA Data
Twins and full siblings interviewed at Wave III were asked to provide saliva samples for DNA analysis. This file contains the genotype values for DAT1 (dopamine transporter), DRD4 (dopamine receptor), SLC6A4 (serotonin transporter), MAOA_V (monoamine oxidase A-uVNTR), DRD2 (dopamine D2 receptor), and CYP2A6 (cytochrome P450 2A6) from these samples. Also included are values for the following SNPs: rs2304297, rs892413, rs4950, rs13280604.
Wave IV DNA Data
Contains genotyping results for all Wave IV respondents who agreed to provide a saliva sample for DNA testing. This dataset has values for DAT1 (dopamine transporter), DRD4 (dopamine receptor), MAOA (monoamine oxidase A-uVNTR), 5HTTLPR (serotonin transporter), HTTLPR La-Lg-S, triallelic activity bins for the serotonin transporter 5HTTLPR adjusted for rs25531, DRD2, s000005, s000006, DRD5, and MAOCA1.
Wave IV Polygenic Scores – Release 1
Thirty constructed polygenic scores (PGS) are available for Add Health respondents who provided archival saliva samples for genetic testing at Wave IV. Scores are available for coronary artery disease, myocardial infarction, plasma cortisol, LDL cholesterol, HDL cholesterol, total cholesterol, triglycerides, type II diabetes (2 measures), BMI, waist circumference, waist-to-hip ratio, height, age at menarche, age at menopause, number of children, age at first birth, ever/current smoker, number of cigarettes per day, extraversion, attention deficit disorder (2 measures), bipolar disorder, major depressive disorder (2 measures), schizophrenia, mental health cross disorder, Alzheimer’s disease, and educational attainment (2 measures).
Wave IV Polygenic Scores – Release 2
Seventy-four constructed polygenic scores (PGSs) are available for Add Health respondents who provided archival saliva samples for genetic testing at Wave IV. Scores are available for various anthropometric, health, and behavioral outcomes. See the documentation for a list of all PGSs.
Wave IV Polygenic Scores – SSGAC
Polygenic scores (PGS) constructed by the Social Science Genetic Association Consortium (SSGAC) are available for Add Health respondents who provided archival saliva samples for genetic testing at Wave IV. This data file contains educational attainment, cognitive performance, depression, neuroticism, and subjective well-being scores based on standard GWAS summary statistics and multilevel analysis (2 scores for each construct). Additional multivariate analysis scores for highest level math taken and math ability are also included.
Wave IV PGS Risk-Tolerance
Contains polygenic scores for general risk tolerance, adventurousness, and risky behaviors in the driving, drinking, smoking, and sexual domains. Polygenic scores were created for unrelated Add Health participants of European ancestry. Score metrics were generated with Plink, LDPred, or MTAG software using the UK Biobank GWAS.
Polygenic Index Inventories
Polygenic scores computed by the SSGAC for anthropometric traits, cognition/education, fertility/sexual development, health/health behaviors, and personality/well-being.
Polygenic Index Inventories – Release 2
This data file is a 2022 update of the polygenic scores computed by the SSGAC for anthropometric traits, cognition/education, fertility/sexual development, health/health behaviors, and personality/well-being.
Wave IV dbGaP GWAS Sample Weight
A weight component for the dbGaP GWAS Sample. This dataset is only needed if you have been approved to use the dbGaP data. More information can be found on the CPC Data Portal.


Wave I-II: Family Structure Array
Nineteen constructed family structure array variables are available for Add Health respondents (Wave I or II) from birth to age at latest adolescent follow-up, with a maximum age of 18 years old. The family structure array variables can be used to construct measures such as the number of family structure changes experienced from birth to a given age (family instability), or the amount of time spent in different family structures.
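The two kinds of derived measure mentioned above can be sketched as follows. The sketch assumes one family structure value per year of age, indexed from birth (age 0), with None marking a missing year; the category labels and this layout are hypothetical, and the actual array variables are defined in the codebook.

```python
from collections import Counter


def count_structure_changes(structures, through_age=None):
    """Count changes between consecutive known family structures.

    structures: list where index = age in years and value = a structure
                label, or None when the structure is unknown for that age.
    through_age: if given, only ages 0..through_age are considered.
    """
    seq = structures if through_age is None else structures[: through_age + 1]
    known = [s for s in seq if s is not None]  # skip missing years
    return sum(1 for prev, cur in zip(known, known[1:]) if prev != cur)


def years_in_each_structure(structures):
    """Tally the number of years spent in each known family structure."""
    return Counter(s for s in structures if s is not None)


if __name__ == "__main__":
    # Hypothetical respondent observed from birth through age 4.
    history = ["two_bio", "two_bio", "single_mother", "step", "step"]
    print(count_structure_changes(history))
    print(years_in_each_structure(history))
```

Skipping missing years is one possible convention; treating a gap as its own category would yield different counts.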
This file contains four variables, utilizing five, seven, eight, and fourteen categories, to describe the household parental structure at Wave I.

Included in this file are a single race variable and a multiracial variable constructed from the Wave I race questions. 

Wave IV Constructed Variables
This file contains constructed variables on stress, depression, mastery, personality, arrest history, sexual behavior, smoking, and substance abuse created by Wave IV collaborators.
Constructed SES Variables
These data include measures of social origins, based on family information collected from Add Health participants’ parents at the Wave I interview; social attainments, based on the occupations participants reported on the Wave IV survey; and neighborhood-level socioeconomic disadvantage, based on Census-tract-level data linked to participants’ addresses at Wave I and Wave IV.
Wave IV Constructed Current Relationship Status
This dataset contains variables that describe each respondent’s current relationship at Wave IV.
Wave V Constructed Age
The Wave V survey did not ask respondents their age, though age can be calculated from the interview date and date of birth. To guard against deductive disclosure, complete interview dates and dates of birth were collected but not released. To provide calculated ages, two variables have been constructed from the complete interview date and the month and year of birth, with 15 assigned as a universal day of birth.
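The calculation can be sketched as below, using the interview date, the birth month and year, and day 15 as the assigned day of birth. The exact form of the two released variables is not stated here, so whole years and decimal years are shown as assumed possibilities, and the function names are hypothetical.

```python
from datetime import date


def constructed_age_years(interview_date, birth_year, birth_month, assigned_day=15):
    """Age in completed years, with the day of birth fixed at assigned_day."""
    birth = date(birth_year, birth_month, assigned_day)
    years = interview_date.year - birth.year
    # Subtract one year if the assigned birthday has not yet occurred.
    if (interview_date.month, interview_date.day) < (birth.month, birth.day):
        years -= 1
    return years


def constructed_age_decimal(interview_date, birth_year, birth_month, assigned_day=15):
    """Age in decimal years: days elapsed divided by 365.25 (an assumption)."""
    birth = date(birth_year, birth_month, assigned_day)
    return (interview_date - birth).days / 365.25


if __name__ == "__main__":
    # Hypothetical respondent born in March 1980, interviewed 14 March 2017.
    print(constructed_age_years(date(2017, 3, 14), 1980, 3))
    print(round(constructed_age_decimal(date(2017, 3, 14), 1980, 3), 2))
```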
Wave V Mover Distance
The dataset provides the distance, in meters, between each respondent’s home residence in each of the previous waves and their Wave V residence.
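For readers unfamiliar with distances between geocoded points, the sketch below computes a great-circle distance in meters with the haversine formula. This is an assumption made for illustration: the method actually used to construct the Add Health distances is not stated here, and the coordinates in the example are arbitrary.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Two arbitrary points one degree of longitude apart on the equator.
    print(round(haversine_m(0.0, 0.0, 0.0, 1.0)))
```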

Waves I and II Romantic/Friend Nominations
These files contain the identification numbers to link the Wave I and II romantic, non-romantic, and friend data from the Wave I and II in-home interview. N varies by file.
Wave III Partner In-home
In Wave III, 1,507 partners of Add Health respondents were interviewed. These files contain the partner data for the Wave III in-home interview, AHPVT, field interviewer characteristics, and links to the Add Health respondents. N=1,507
Wave III Partner ASHA Call
To receive the results of their STD assays, partners of Wave III respondents called an Add Health dedicated number at the American Social Health Association. This file provides information on who called the results hotline and the date and time of the call. N=332
Wave III Partner Urinalysis
This file contains nitrate, specific gravity, pH level, white blood cells, protein, glucose, ketone, urobilinogen, bilirubin, and blood values from the Wave III partner urine specimens. N=1,507
Parents (2015-2017)
The parent data files contain social, demographic, behavioral, and health data collected in 2015-2017, coinciding with Wave V of Add Health, on a probability sample of Add Health parents who were originally interviewed in 1995. Data for 2,013 Wave I parents, connected to 2,244 Add Health respondents, are available. Additionally, 988 current spouse/partner interviews are available. These data can be linked with Wave I parent data and with the corresponding Add Health respondents at Waves I – V. Includes weight files.
varies by file
Parents (2015-2017) – Family Health History
These two files contain data from the Wave I parent and spouse/partner who returned the Parents Phase 2 family health history leave-behind forms.
fhhp2: 131
fhhsp2: 131
fhhp2: 2,244
fhhsp2: 1,116
Parents (2015-2017) – Disposition File
This file contains the final disposition codes for Wave I parents selected for the Parents Phase 2 interview.
Parents (2015-2017) – National Death Index File
This file contains the underlying cause of death and days alive after the Wave I interview for Wave I parents selected for the Parents Phase 2 interview who are coded as deceased in the disposition file.
Parents (2015-2017) – Medication Files
These files contain the therapeutic classifications for medication data collected from the Wave I Parent and Spouse/Partner.
p2meds: 7
sp2meds: 7
p2meds: 1,286
sp2meds: 260
Parents (1995)
Data for the 1995 parents are combined with the Wave I data file. In this file are data on 17,669 parents.

Sexual Orientation/Gender Identity, Socioeconomic Status, and Health across the Life Course (SOGI-SES)
This file contains new survey data to support exploration of the relationships among sexual orientation, gender identity, gender expression, romantic and sexual behaviors, socioeconomic status, and health. It contains social, demographic, behavioral, and health data collected in 2020-2021 on a sample of Add Health Wave V participants.
Sexual Orientation/Gender Identity, Socioeconomic Status, and Health across the Life Course (SOGI-SES) – Sensitive
This file contains the SOGI-SES study’s sensitive data variables related to gender identity, in vitro fertilization, and HIV status.

  Wave I Indexes of Questions and Variables
These four indexes list all the questions and variable names in each instrument in Wave I.
School Administrator Index: List of questions and variable names from the Wave I School Administrator Questionnaire 19,812 6
Adolescent In-School Index: List of questions and variable names from the Adolescent In-School Questionnaire 19,842 6
Parent In-Home Index: List of questions and variables from the Parent In-Home Questionnaire 33,343 12
In-Home Index: List of questions and variable names from the Wave I Adolescent In-Home Interview 221,729 85
  Wave II Indexes of Questions and Variables
These two indexes list all the questions and variable names in each instrument in Wave II.
School Administrator Index: List of questions and variable names from the Wave II School Administrator Questionnaire 11,308 3
In-Home Index: List of questions and variable names from the Wave II Adolescent In-Home Interview 257,762 79
  Wave III Index of Questions and Variables
List of questions and variable names from the Wave III young adult interview 243KB 62
  Wave IV Index of Questions and Variables
List of questions and variable names from the Wave IV interview 315KB 28
  Wave V Index of Questions and Variables
List of questions and variable names from the Wave V mixed-mode survey 477KB 46
