Citation: Conde, Eugenia & Poston, Dudley (2016). Approaches for addressing missing data in statistical analyses of female and male adolescent fertility. 2016 Add Health Users Conference. Bethesda, MD.
Abstract: This paper uses data from Wave I and Wave III of the National Longitudinal Study of Adolescent to Adult Health (Add Health) to address the importance of handling missing data. We undertake two separate analyses, one for females and one for males, to predict the likelihood of the respondent having had a teen birth. Six theoretically relevant independent variables are used, including household income and parental education, which respectively have 26 and 15 percent of the data missing. Eight approaches to handling missing data are used in separate models: (1) listwise deletion, (2) overall mean substitution, (3) mean substitution based on race/ethnicity, (4) the proxy method, where mother’s education is used as a proxy for income, (5) dropping the two variables with excessive amounts of missing data, (6) multiple imputation using the fully conditional specification iterative method, (7) multiple imputation using the Markov chain Monte Carlo iterative method with three auxiliary variables, and (8) multiple imputation using the Markov chain Monte Carlo iterative method without auxiliary variables. We show that, depending on the method used, many of the independent variables in our models vary in whether or not they are statistically significant in predicting the log odds of a person having a teen birth, and many of the independent variables that are statistically significant vary in the ranking of the magnitude of their relative effects on the outcome. We discuss the implications of the findings from scientific and social policy perspectives and make recommendations for how to handle models with large amounts of missing data.
Reference Type: Conference proceeding
Book Title: 2016 Add Health Users Conference
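
The three simplest approaches named in the abstract, listwise deletion (1), overall mean substitution (2), and mean substitution within a grouping variable (3), can be illustrated with a minimal sketch. It is not the authors' code: the toy records, the field names `group` and `income`, and the helper function names are all hypothetical, and the values are not Add Health data.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value. Not Add Health data.
rows = [
    {"group": "A", "income": 40.0},
    {"group": "A", "income": None},
    {"group": "A", "income": 60.0},
    {"group": "B", "income": 20.0},
    {"group": "B", "income": None},
]


def listwise_delete(rows, var):
    """Keep only the records where `var` is observed."""
    return [r for r in rows if r[var] is not None]


def mean_substitute(rows, var):
    """Replace each missing `var` with the mean of all observed values."""
    overall = mean(r[var] for r in rows if r[var] is not None)
    return [{**r, var: overall if r[var] is None else r[var]} for r in rows]


def group_mean_substitute(rows, var, by):
    """Replace each missing `var` with the observed mean of the record's group.

    Assumes every group has at least one observed value of `var`.
    """
    observed = {}
    for r in rows:
        if r[var] is not None:
            observed.setdefault(r[by], []).append(r[var])
    group_means = {g: mean(values) for g, values in observed.items()}
    return [
        {**r, var: group_means[r[by]] if r[var] is None else r[var]}
        for r in rows
    ]


complete = listwise_delete(rows, "income")
overall_filled = mean_substitute(rows, "income")
group_filled = group_mean_substitute(rows, "income", "group")
```

Listwise deletion shrinks the sample, while both substitution methods keep every record but fill each gap with a single value, which understates the variable's variance. The multiple imputation approaches (6) through (8) are beyond this sketch.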