Published on Tue Jun 29 2021

Two-Stage TMLE to Reduce Bias and Improve Efficiency in Cluster Randomized Trials

Laura B. Balzer, Mark van der Laan, James Ayieko, Moses Kamya, Gabriel Chamie, Joshua Schwab, Diane V. Havlir, Maya L. Petersen


Abstract

Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals (e.g., clinics or communities), and measure outcomes on individuals in those groups. While offering many advantages, this experimental design introduces challenges that are only partially addressed by existing analytic approaches. First, outcomes are often missing for some individuals within clusters. Failing to appropriately adjust for differential outcome measurement can result in biased estimates and inference. Second, CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms. Failing to adaptively adjust for these imbalances and other predictive covariates can result in efficiency losses. To address these methodological gaps, we propose and evaluate a novel two-stage targeted minimum loss-based estimator (TMLE) to adjust for baseline covariates in a manner that optimizes precision, after controlling for baseline and post-baseline causes of missing outcomes. Finite sample simulations illustrate that our approach can nearly eliminate bias due to differential outcome measurement, while other common CRT estimators yield misleading results and inferences. Application to real data from the SEARCH community randomized trial demonstrates the gains in efficiency afforded through adaptive adjustment for cluster-level covariates, after controlling for missingness on individual-level outcomes.
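As a rough illustration of the two-stage idea, the sketch below uses a hypothetical simulated CRT (not the SEARCH data): stage one computes a missingness-adjusted outcome mean within each cluster, and stage two compares arms across clusters while adjusting for a cluster-level covariate. A plain inverse-probability-weighted mean and ordinary least squares stand in for the full TMLE steps; all parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical CRT: J clusters randomized 1:1, n individuals per cluster.
J, n = 30, 200
A = rng.permutation(np.repeat([0, 1], J // 2))  # cluster-level treatment arm
E = rng.normal(size=J)                          # cluster-level covariate

def stage1_mean(a, e):
    """Stage 1: missingness-adjusted (IPW) outcome mean for one cluster."""
    W = rng.normal(size=n)                      # individual-level covariate
    Y = rng.binomial(1, sigmoid(-1.0 + 0.8 * a + 0.5 * e + 0.4 * W))
    pi = sigmoid(1.0 - 1.2 * W)                 # P(measured | W); assumed known here,
    R = rng.binomial(1, pi)                     # estimated in practice
    return np.mean(R * Y / pi)                  # IPW mean; a complete-case mean
                                                # would be biased by differential
                                                # measurement
Yc = np.array([stage1_mean(a, e) for a, e in zip(A, E)])

# Stage 2: across clusters, compare arms with adjustment for E
# (OLS stands in for the TMLE targeting step).
X = np.column_stack([np.ones(J), A, E])
beta, *_ = np.linalg.lstsq(X, Yc, rcond=None)
effect = beta[1]                                # adjusted effect estimate
```

The point of the sketch is the separation of concerns: individual-level missingness is handled entirely within stage one, so stage two can adaptively adjust for cluster-level covariates without revisiting the measurement process.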

Sun Aug 08 2021
Machine Learning
Improving Inference from Simple Instruments through Compliance Estimation
Instrumental variables (IV) regression is widely used to estimate causal treatment effects. While IV regression can recover consistent treatment effect estimates, those estimates are often noisy. We study how to exploit the predictable variation in the strength of the instrument. We weight each observation according to its estimated compliance.
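The compliance-weighting idea can be sketched in a few lines. The simulation below is hypothetical and uses a simple weighted Wald estimator rather than the authors' procedure: compliance with the instrument varies with an observed covariate, and observations with higher estimated compliance receive more weight.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical setup: compliance with instrument Z varies with covariate X.
X = rng.uniform(size=n)
Z = rng.binomial(1, 0.5, size=n)        # randomized instrument
p_comply = 0.2 + 0.6 * X                # instrument is strong only for large X
C = rng.binomial(1, p_comply)           # complier indicator (latent in practice)
D = Z * C                               # treatment taken only by compliers
Y = 2.0 * D + X + rng.normal(size=n)    # true effect of D on Y is 2.0

def iv_wald(y, d, z, w):
    """Weighted Wald estimator: cov_w(Z, Y) / cov_w(Z, D)."""
    zbar = np.average(z, weights=w)
    num = np.average(w * (z - zbar) * y)
    den = np.average(w * (z - zbar) * d)
    return num / den

unweighted = iv_wald(Y, D, Z, np.ones(n))
# Weight by estimated compliance; here the true p_comply stands in for a
# first-stage prediction of how strongly Z moves D given X.
weighted = iv_wald(Y, D, Z, p_comply)
```

Both estimators are consistent for the (here constant) treatment effect; the gain from weighting is in precision, which would show up across repeated simulation draws.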
Mon Jun 18 2018
Machine Learning
Interpretable Almost Matching Exactly for Causal Inference
We aim to create the highest possible quality of treatment-control matches. The algorithm uses a single dynamic program to solve all of the optimization problems simultaneously. Notable advantages of our method over existing matching procedures include its high-quality matches.
Mon Aug 09 2021
Artificial Intelligence
The External Validity of Combinatorial Samples and Populations
The widely used 'Counterfactual' definition of Causal Effects was derived for unbiasedness and accuracy, not generalizability. We propose a simple definition for the External Validity (EV) of Interventions, Counterfactual statements and Samples.
Tue Sep 24 2019
Machine Learning
Trimmed Constrained Mixed Effects Models: Formulations and Algorithms
Mixed effects (ME) models inform a vast array of problems in the physical and social sciences. We consider ME models where the random effects component is linear. We then develop an efficient approach for a broad problem class that allows nonlinear measurements, priors, and constraints.
Fri Oct 05 2018
Machine Learning
Adaptive Clinical Trials: Exploiting Sequential Patient Recruitment and Allocation
Most RCTs allocate the patients to the treatment group and the control group by uniform randomization. We show that this procedure can be highly sub-optimal (in terms of learning) if -- as is often the case -- patients can be recruited in cohorts. We propose an algorithm that yields an approximate solution.
Fri Jun 12 2020
Machine Learning
Targeting Learning: Robust Statistics for Reproducible Research
Targeted Learning is a subfield of statistics that unifies advances in causal inference, machine learning and statistical theory. Targeted Learning establishes a principled standard for statistical estimation and inference (i.e., confidence intervals and p-values).