Abstract
Polly Phipps and Daniell Toth "Analyzing Establishment Nonresponse Using an
Interpretable Regression Tree Model with Linked Administrative Data"
To gain insight into how characteristics of an establishment are associated with nonresponse, a recursive partitioning algorithm is applied to the
Occupational Employment Statistics May 2006 survey data to build a regression tree. The tree models an establishment’s propensity to respond to the
survey given certain establishment characteristics. It partitions establishments into mutually exclusive cells, defined by those characteristics, with homogeneous response propensities.
This makes it easy to identify interpretable associations between the characteristic variables and an establishment’s propensity to respond, something
not easily done using a logistic regression propensity model. We test the model obtained using the May data against data from the November 2006
Occupational Employment Statistics survey. Testing the model on a disjoint set of establishment data with a very large sample size (n = 179,360) offers
evidence that the regression tree model accurately describes the association between the establishment characteristics and the response propensity for the
OES survey. The accuracy of this modeling approach is compared to that of logistic regression through simulation. This representation is then used along
with frame-level administrative wage data linked to sample data to investigate the possibility of nonresponse bias. We show that, without proper adjustments,
nonresponse poses a risk of bias and is possibly nonignorable.
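
As a rough illustration of the recursive partitioning idea described above, the following minimal Python sketch grows a small regression tree on a 0/1 response indicator, so that each leaf's mean is an estimated response propensity for one cell. It is not the authors' algorithm or data: the variable names (`employment`, `multi_unit`), the toy records, the squared-error split criterion, and the stopping rules (`min_leaf`, `max_depth`) are all assumptions made for the example.

```python
# Toy regression tree for response propensity (illustrative assumptions only).

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, ys, min_leaf):
    """Return the (feature, threshold) pair that most reduces SSE, or None."""
    best, best_cost = None, sse(ys)
    for f in rows[0]:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # keep every cell large enough to estimate a rate
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (f, t), cost
    return best

def grow(rows, ys, min_leaf=2, depth=0, max_depth=3):
    """Recursively partition the records; leaves store the mean response."""
    split = best_split(rows, ys, min_leaf) if depth < max_depth else None
    if split is None:
        return {"propensity": sum(ys) / len(ys), "n": len(ys)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left],
                     min_leaf, depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right],
                      min_leaf, depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the splits to a leaf and return that cell's propensity."""
    while "propensity" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["propensity"]

# Hypothetical establishments: employment size and a multi-unit flag.
rows = [
    {"employment": 5, "multi_unit": 0},
    {"employment": 8, "multi_unit": 0},
    {"employment": 12, "multi_unit": 0},
    {"employment": 20, "multi_unit": 0},
    {"employment": 150, "multi_unit": 1},
    {"employment": 300, "multi_unit": 1},
    {"employment": 500, "multi_unit": 1},
    {"employment": 900, "multi_unit": 1},
]
responded = [1, 1, 1, 1, 0, 0, 0, 1]  # 1 = responded, 0 = did not respond

tree = grow(rows, responded)
```

Each root-to-leaf path in `tree` reads as a rule on the establishment characteristics, which is the interpretability advantage the abstract contrasts with a logistic regression propensity model. Calling `predict(tree, row)` on records from a separate sample and comparing the predicted rates with the observed response rates within each cell would mimic, in miniature, the kind of out-of-sample check described above.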