A data-driven approach to cell ratio imputation for item nonresponse in data linkage problem

Wednesday, February 6, 2019 - 12:10pm to 1:00pm

Danhyang Lee

Abstract: With the improved availability of administrative data sources, data linkage, which integrates administrative records with survey data, is increasingly used to improve the quality of official statistics. However, the linked data cannot always be used directly for statistical analysis: when the administrative data do not cover the whole population of interest, the linkage between the survey data and the administrative data is imperfect. The challenge is that there may be systematic differences between the linked and unlinked respondents in the survey, which leads to a bias called imperfect linkage bias. By treating the unlinked respondents as missing data, we develop a data-driven approach to cell ratio imputation to adjust for the imperfect linkage bias. The number of imputation cells and the cell formation are determined using a model-based EM algorithm. The proposed method is applied to a real problem, the 2017 Household Income and Expenditure Survey (HIES) conducted by Statistics Korea. The performance of the proposed method is also confirmed in a limited simulation study.
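
For readers unfamiliar with the term, the basic idea of cell ratio imputation can be sketched in a few lines of Python. This is only an illustration of the textbook technique, not the speaker's method: the cells here are fixed by hand rather than chosen by a model-based EM algorithm, survey weights are ignored, and the function name `cell_ratio_impute`, the field names `cell`, `x`, `y`, and the toy numbers are all hypothetical. The sketch assumes `x` is a survey variable observed for every respondent and `y` is an administrative variable available only for linked respondents.

```python
from collections import defaultdict


def cell_ratio_impute(records):
    """Impute missing y within each cell as (sum of y / sum of x) * x.

    records: list of dicts with keys 'cell', 'x', 'y'.
    y is None for unlinked respondents (treated as missing).
    Hypothetical helper for illustration only; unweighted.
    """
    sum_x = defaultdict(float)
    sum_y = defaultdict(float)

    # Accumulate totals over linked respondents only.
    for r in records:
        if r["y"] is not None:
            sum_x[r["cell"]] += r["x"]
            sum_y[r["cell"]] += r["y"]

    # One ratio per cell that has linked respondents with nonzero x total.
    ratios = {c: sum_y[c] / sum_x[c] for c in sum_x if sum_x[c] != 0}

    completed = []
    for r in records:
        y = r["y"]
        if y is None:
            # Raises KeyError if the cell has no usable linked respondents.
            y = ratios[r["cell"]] * r["x"]
        completed.append({**r, "y": y, "imputed": r["y"] is None})
    return completed, ratios


# Toy data (made up): two cells, one unlinked respondent in each.
records = [
    {"cell": "A", "x": 10.0, "y": 20.0},
    {"cell": "A", "x": 30.0, "y": 60.0},
    {"cell": "A", "x": 5.0, "y": None},
    {"cell": "B", "x": 4.0, "y": 2.0},
    {"cell": "B", "x": 8.0, "y": None},
]

completed, ratios = cell_ratio_impute(records)
for row in completed:
    print(row)
```

In this sketch the quality of the imputation depends entirely on how the cells are formed, which is the part the talk proposes to determine from the data.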