Speaker: Matthew Stuart
Abstract: Split questionnaire design (SQD) is a relatively new survey tool for reducing response burden and improving response quality. Among a set of candidate SQDs, a design is considered the best if it leads to the least information loss, as quantified by the Kullback-Leibler divergence (KLD). Calculating this divergence requires the distribution function of the observed data, obtained by integrating out all the variables that are missing under a particular SQD. For a typical survey questionnaire with a large number of categorical variables, this computation can become practically infeasible. Motivated by the Horvitz-Thompson estimator, we propose an approximation to the distribution function of the observed data that greatly reduces computation time while losing little information when comparing different SQDs. We conduct a thorough simulation study to test whether the proposed approximation can correctly identify the best SQD under several scenarios designed to cover different distribution shapes of continuous variables and different correlation structures among the variables. Finally, the proposed approach is applied to the 2012 Pet Demographic Survey data. Both the simulation study and the empirical study demonstrate that the proposed method is computationally efficient and can accurately select the best SQD.
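For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch of the Kullback-Leibler divergence for categorical data and of "integrating out" a variable by summing over it. It is not the speaker's approximation method. The two binary survey items, their joint probabilities, and the helper name `kl_divergence` are all hypothetical, chosen only for illustration.

```python
import math

def kl_divergence(p, q):
    """D(p || q) for two discrete distributions listed over the same cells."""
    # Cells with p = 0 contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical joint distribution of two binary items A and B, keyed by (a, b).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Integrating out B (summing over its categories) gives the marginal of A,
# and likewise for B.
marginal_a = [sum(v for (a, _), v in joint.items() if a == k) for k in (0, 1)]
marginal_b = [sum(v for (_, b), v in joint.items() if b == k) for k in (0, 1)]

# Compare the full joint with the product of its marginals, i.e. the
# distribution that keeps each item's marginal but drops their dependence.
cells = sorted(joint)
p = [joint[c] for c in cells]
q = [marginal_a[a] * marginal_b[b] for a, b in cells]

info_loss = kl_divergence(p, q)
print(f"KLD between joint and product of marginals: {info_loss:.4f}")
```

With many categorical variables the number of cells grows multiplicatively, which is why exact sums of this kind become impractical and an approximation is attractive.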