Optimizing variance-bias trade-off in the TWANG package for estimation of propensity scores

Overview of attention for article published in Health Services and Outcomes Research Methodology, December 2016

Citations

Dimensions: 16

Readers on

Mendeley: 28

Title	Optimizing variance-bias trade-off in the TWANG package for estimation of propensity scores
Published in	Health Services and Outcomes Research Methodology, December 2016
DOI	10.1007/s10742-016-0168-2
PubMed ID	29104450
Authors	Layla Parast, Daniel F. McCaffrey, Lane F. Burgette, Fernando Hoces de la Guardia, Daniela Golinelli, Jeremy N. V. Miles, Beth Ann Griffin
Abstract	While propensity score weighting has been shown to reduce bias in treatment effect estimation when selection bias is present, it has also been shown that such weighting can perform poorly if the estimated propensity score weights are highly variable. Various approaches have been proposed which can reduce the variability of the weights and the risk of poor performance, particularly those based on machine learning methods. In this study, we closely examine approaches to fine-tune one machine learning technique (generalized boosted models [GBM]) to select propensity scores that seek to optimize the variance-bias trade-off that is inherent in most propensity score analyses. Specifically, we propose and evaluate three approaches for selecting the optimal number of trees for the GBM in the twang package in R. Normally, the twang package in R iteratively selects the optimal number of trees as that which maximizes balance between the treatment groups being considered. Because the selected number of trees may lead to highly variable propensity score weights, we examine alternative ways to tune the number of trees used in the estimation of propensity score weights such that we sacrifice some balance on the pre-treatment covariates in exchange for less variable weights. We use simulation studies to illustrate these methods and to describe the potential advantages and disadvantages of each method. We apply these methods to two case studies: one examining the effect of dog ownership on the owner's general health using data from a large, population-based survey in California, and a second investigating the relationship between abstinence and a long-term economic outcome among a sample of high-risk youth.
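
The trade-off described in the abstract can be illustrated with a small, self-contained Python sketch. This is not the twang implementation and does not reproduce any of the three approaches proposed in the article. It only compares two made-up sets of propensity scores, standing in for GBM fits with different numbers of trees, on a balance measure (absolute standardized mean difference) and on the effective sample size of the weighted comparison group. The function names, the toy data, and the choice of ATT-style weights are all assumptions made for illustration.

```python
import statistics


def att_weights(treated, propensity):
    """Illustrative ATT-style weights: 1 for treated units, p / (1 - p) otherwise."""
    return [1.0 if t else p / (1.0 - p) for t, p in zip(treated, propensity)]


def effective_sample_size(weights):
    """Kish effective sample size: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def balance_and_ess(x, treated, propensity):
    """Return (absolute standardized mean difference, comparison-group ESS)."""
    weights = att_weights(treated, propensity)
    x_t = [v for v, t in zip(x, treated) if t]
    x_c = [v for v, t in zip(x, treated) if not t]
    w_c = [w for w, t in zip(weights, treated) if not t]
    mean_t = statistics.mean(x_t)
    mean_c = sum(v * w for v, w in zip(x_c, w_c)) / sum(w_c)
    # Scale the difference by the treated-group standard deviation.
    smd = abs(mean_t - mean_c) / statistics.stdev(x_t)
    return smd, effective_sample_size(w_c)


# Toy data (hypothetical): three treated units followed by four comparison units.
treated = [1, 1, 1, 0, 0, 0, 0]
x = [2.0, 3.0, 4.0, 0.0, 1.0, 2.0, 3.0]

# Two hypothetical candidate fits: a smoother one and a sharper one.
ps_smooth = [0.6, 0.6, 0.6, 0.3, 0.4, 0.5, 0.6]
ps_sharp = [0.6, 0.6, 0.6, 0.05, 0.1, 0.5, 0.9]

smd_smooth, ess_smooth = balance_and_ess(x, treated, ps_smooth)
smd_sharp, ess_sharp = balance_and_ess(x, treated, ps_sharp)

print("smooth fit: SMD =", round(smd_smooth, 3), "ESS =", round(ess_smooth, 2))
print("sharp fit:  SMD =", round(smd_sharp, 3), "ESS =", round(ess_sharp, 2))
```

In this toy setup the sharper scores balance the covariate better but concentrate most of the weight on a single comparison unit, which lowers the effective sample size. That is the kind of tension the article's tuning approaches are meant to manage.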

Mendeley readers

The data shown below were compiled from readership statistics for 28 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Unknown	28	100%

Demographic breakdown

Readers by professional status	Count	As %
Student > Ph. D. Student	6	21%
Researcher	4	14%
Student > Master	4	14%
Student > Doctoral Student	2	7%
Lecturer	1	4%
Other	4	14%
Unknown	7	25%

Readers by discipline	Count	As %
Medicine and Dentistry	4	14%
Mathematics	3	11%
Social Sciences	3	11%
Computer Science	2	7%
Psychology	2	7%
Other	4	14%
Unknown	10	36%