Mortality Linked Files

Data from the National Death Index (NDI) has been linked to several NCHS datasets, including:

  • the National Health Interview Survey (NHIS)
  • the National Health and Nutrition Examination Surveys (NHANES)
  • the Second Longitudinal Study on Aging (LSOA II)
  • the National Nursing Home Survey (NNHS)

Linked mortality records provide rich data for epidemiological studies and supplementary data for projects using NCHS datasets.

Participants are eligible for matching if NDI records contain sufficient information to identify them. (Different combinations of social security number, last name, first initial, date of birth, and sex are needed for identification.) Observations are then matched using different combinations of social security number, first and last names, approximate date of birth, and father’s surname. A list of potential matches is created; if no match exists, the respondent is presumed to still be alive. Potential matches are scored for probability of accuracy, and the accuracy scores and weights are used to select the single best match. Where matches are unclear, data may be reviewed by hand.
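The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the NCHS procedure: the `Candidate` class, the `score` field, and the `threshold` cutoff are all hypothetical names and assumptions, and the actual scoring, weighting, and hand-review rules are not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A potential NDI match for one survey participant (hypothetical structure)."""
    ndi_record_id: str
    score: float  # assumed: higher means more likely to be a true match


def select_best_match(candidates: List[Candidate], threshold: float = 0.0) -> Optional[Candidate]:
    """Return the highest-scoring candidate at or above the cutoff, or None.

    The threshold is an assumption made for illustration only.
    """
    acceptable = [c for c in candidates if c.score >= threshold]
    if not acceptable:
        return None
    return max(acceptable, key=lambda c: c.score)


def vital_status(candidates: List[Candidate], threshold: float = 0.0) -> str:
    """With no acceptable match, the participant is presumed to still be alive."""
    best = select_best_match(candidates, threshold)
    return "presumed alive" if best is None else "presumed deceased"
```

In this sketch a participant with an empty candidate list, or with only low-scoring candidates, is classified as presumed alive, mirroring the rule stated in the paragraph above.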

Match rates and sample sizes vary by survey and survey year. All datasets have been matched to NDI records current through 2006.

Limited public-use files are available for some datasets. Data in these files has been censored or statistically masked to protect confidentiality, and includes variables for:

  • linkage eligibility
  • mortality status
  • grouped codes for underlying cause of death
  • multiple cause of death flags
  • indicator variables for the presence of diabetes, hypertension, or hip fracture in records with multiple cause of death
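As a rough illustration of how the eligibility and mortality status variables are typically used together, the sketch below restricts a set of records to linkage-eligible participants before computing the share who are deceased. The field names `eligible` and `deceased` are hypothetical placeholders, not the variable names in the public-use files, and the result is an unweighted proportion; population estimates would need the survey weights.

```python
from typing import Dict, List, Optional


def proportion_deceased(records: List[Dict[str, bool]]) -> Optional[float]:
    """Unweighted share of linkage-eligible records flagged as deceased.

    Each record is a dict with hypothetical boolean keys "eligible" and "deceased".
    Returns None when no record is eligible for linkage.
    """
    # Ineligible participants have no usable mortality status, so drop them first.
    eligible = [r for r in records if r["eligible"]]
    if not eligible:
        return None
    deaths = sum(1 for r in eligible if r["deceased"])
    return deaths / len(eligible)
```

Excluding ineligible records first matters because their vital status was never determined; counting them as alive would understate mortality.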

Restricted files are also available through the Research Data Center (RDC) and include:

  • eligibility for linkage
  • mortality status
  • age at death
  • source of mortality information
  • age last presumed alive
  • exact dates of death and birth
  • underlying cause of death (ICD-9 or ICD-10 codes) and multiple cause of death codes
  • interview dates

All data is collected by the National Center for Health Statistics and is made available through Census RDCs under arrangement with NCHS.  Interested users must submit a proposal to the RDC at NCHS for access to restricted data. Availability of data is subject to the discretion of NCHS.  For up-to-date and detailed information on data, please visit the page for mortality files.

Restricted Data

Both restricted RDC datasets and limited public-use datasets are available for the 1986-2004 NHIS, NHANES III, 1999-2004 NHANES, LSOA II, and the 2004 NNHS.

The linked NHEFS, NHANES II, and 1985, 1995, and 1997 NNHS files are available only through the RDC, and no public-use versions of these files are planned.

Data in publicly available datasets have been statistically masked to protect confidentiality. Mortality status is not censored or perturbed, but synthetic data have been substituted for date of death and underlying cause of death. The NHIS and NNHS files contain quarter and year of death, while the NHANES III, 1999-2004 NHANES, and LSOA II files contain the time between interview and death.

All public-use files include data only for participants 18 years of age or older. In the restricted files, NHANES III, the 1999-2004 NHANES, and all years of the NHIS also contain information for respondents under 18.

For NHANES III, the 1999-2004 NHANES, the LSOA II, and the NHIS, additional restricted variables are available by special request. Special request variables include additional information from death certificates, such as place of death, place of residence, and race and ethnicity. All results from the NDI matching process are also available by request. These results are useful for researchers who want to apply their own criteria for matching mortality data.
