
Max Cytrynbaum

Research Papers

Designing Representative and Balanced Experiments by Local Randomization (Job Market Paper)

Abstract: This paper studies treatment effect estimation in a novel two-stage model of experimentation. In the first stage, using baseline covariates, the researcher selects units to participate in the experiment from a sample of eligible units. Next, they assign each selected unit to one of two treatment arms. We relate estimator efficiency to representative selection of participants and balanced assignment of treatments. We define a new family of local randomization procedures, which can be used for both selection and assignment. This family nests stratified block randomization and matched pairs, the designs most commonly used in practice in development economics, but also produces many useful new designs, embedding them in a unified framework. When used to select representative units into the experiment, local randomization boosts effective sample size, making estimators behave as if they came from a larger experiment. When used for treatment assignment, local randomization does model-free nonparametric regression adjustment by design. We give novel exact inference methods for locally randomized selection and assignment, allowing experimenters to report smaller confidence intervals if they designed a representative experiment. We apply our methods to the two-wave design setting, where the researcher has access to a pilot study when designing the main experiment. We use local randomization methods to give the first fully efficient solution to this problem.
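The sketch below is a minimal, purely illustrative Python implementation of matched-pairs treatment assignment, the simplest member of the local randomization family described above. The function name, the crude one-dimensional pairing rule, and all defaults are assumptions made for illustration; they are not taken from the paper.

```python
# Illustrative sketch only (not the paper's code): matched-pairs assignment,
# a special case of local randomization -- pair units with similar baseline
# covariates, then randomize treatment within each pair.
import numpy as np

def matched_pairs_assignment(X, rng=None):
    """X: (n, d) array of baseline covariates, n assumed even.
    Returns a length-n 0/1 treatment vector with one treated unit per pair."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    # Crude pairing: sort units by the sum of their covariates and pair
    # neighbors. The paper's designs use more general, data-adaptive local
    # neighborhoods; this rule is only a stand-in.
    order = np.argsort(X.sum(axis=1))
    treat = np.zeros(n, dtype=int)
    for i in range(0, n, 2):
        pair = order[i:i + 2]
        treat[rng.choice(pair)] = 1  # coin flip within the pair
    return treat

# Example: 10 eligible units with 2 baseline covariates
X = np.random.default_rng(0).normal(size=(10, 2))
print(matched_pairs_assignment(X, rng=1))
```

Because treatment is randomized within pairs of similar units, covariate imbalance between the arms shrinks mechanically with the sample, which is one way to read the "regression adjustment by design" claim in the abstract.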

Local Randomization Dominates Rerandomization: Implications for Table One

Abstract: As evidence of balance, experimentalists often present a “Table 1” showing that average baseline covariates are similar across treatment and control units. This can be achieved by rerandomization: resampling candidate treatment assignments until Table 1 looks balanced. This paper shows that such rerandomization procedures, based on global balance criteria, are generically inefficient. Because of this, the information about covariate balance, and thus estimator precision, conveyed by Table 1 is limited. As a result, we suggest that experimenters present alternative “local” measures of balance. To show our inefficiency result, we study rerandomization with candidate treatment assignments drawn from a large family of local randomization distributions, introduced in Cytrynbaum (2021). Local randomization refers to stratified designs with fine, data-adaptive strata, and strictly nests classical designs like matched pairs. We show that rerandomizing candidate treatment assignments drawn by local randomization does semiparametric regression adjustment by design. The nonparametric component of the regression is produced by the set of locally balanced covariates, while the linear component is due to the covariates in the rerandomization criterion. By contrast, a design without rerandomization that instead locally balances both sets of covariates gives joint nonparametric control, with (generically) strictly smaller asymptotic variance.
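For reference, here is a minimal Python sketch of the kind of rerandomization procedure the abstract critiques: redraw a complete randomization until a global Mahalanobis balance statistic on the “Table 1” covariates falls below a cutoff. The function name, threshold, and fallback behavior are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: textbook rerandomization with a global
# Mahalanobis balance criterion on the baseline covariates.
import numpy as np

def rerandomize(X, threshold=1.0, max_draws=10_000, rng=None):
    """Redraw a 50/50 complete randomization until covariate means balance."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    for _ in range(max_draws):
        treat = np.zeros(n, dtype=int)
        treat[rng.choice(n, size=n // 2, replace=False)] = 1
        diff = X[treat == 1].mean(axis=0) - X[treat == 0].mean(axis=0)
        # One-number summary of "Table 1" balance
        mahalanobis = (n / 4) * diff @ cov_inv @ diff
        if mahalanobis < threshold:
            return treat
    return treat  # fall back to the last draw if no candidate passed

X = np.random.default_rng(0).normal(size=(40, 3))
print(rerandomize(X, threshold=2.0, rng=1).sum())  # 20 treated units
```

Note that this design only controls the global mean differences entering the criterion; the abstract's point is that locally balancing both covariate sets instead gives joint nonparametric control with generically smaller asymptotic variance.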

Blocked Clusterwise Regression

Abstract: A recent literature in econometrics models unobserved cross-sectional heterogeneity in panel data by assigning each cross-sectional unit a one-dimensional, discrete latent type. Such models have been shown to allow estimation and inference by regression clustering methods. This paper is motivated by the finding that the clustered heterogeneity models studied in this literature can be badly misspecified, even when the panel has significant discrete cross-sectional structure. To address this issue, we generalize previous approaches to modeling discrete unobserved heterogeneity by allowing each unit to have multiple, imperfectly correlated latent variables that describe its response type to different covariates. We give inference results for a k-means-style estimator of our model and develop information criteria to jointly select the number of clusters for each latent variable. Monte Carlo simulations confirm our theoretical results and give intuition about the finite-sample performance of estimation and model selection. We also contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting. Our results suggest that over-fitting can be severe in k-means-style estimators when the number of clusters is over-specified.
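To fix ideas, the sketch below implements the single-type regression clustering (k-means-style) estimator that the blocked model generalizes: each unit carries one discrete type governing all of its coefficients, estimated by alternating cluster assignment with within-cluster pooled OLS. The function name, random initialization, and stopping rule are illustrative assumptions rather than the paper's estimator.

```python
# Illustrative sketch only: one-type regression clustering by a k-means-style
# alternation between cluster assignment and within-cluster pooled OLS.
import numpy as np

def clusterwise_regression(Y, X, k, n_iter=50, rng=None):
    """Y: (N, T) panel outcomes; X: (N, T, d) covariates; k: number of clusters."""
    rng = np.random.default_rng(rng)
    N, T, d = X.shape
    labels = rng.integers(k, size=N)  # random initial types
    beta = np.zeros((k, d))
    for _ in range(n_iter):
        # Update step: pooled OLS within each current cluster
        for g in range(k):
            members = labels == g
            if members.any():
                Xg, Yg = X[members].reshape(-1, d), Y[members].reshape(-1)
                beta[g] = np.linalg.lstsq(Xg, Yg, rcond=None)[0]
        # Assignment step: move each unit to its best-fitting cluster
        ssr = np.stack([((Y - X @ beta[g]) ** 2).sum(axis=1) for g in range(k)])
        new_labels = ssr.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, beta

# Example: two true types with opposite slopes on a single covariate
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8, 1))
slopes = np.where(np.arange(30) < 15, 2.0, -2.0)
Y = slopes[:, None] * X[:, :, 0] + 0.1 * rng.normal(size=(30, 8))
labels, beta = clusterwise_regression(Y, X, k=2, rng=1)
print(labels, beta.ravel())
```

In the blocked model of the abstract, a unit would instead carry a separate label for each block of covariates, with the labels across blocks allowed to be imperfectly correlated.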

Using Lattice Geometry to Find all Stable Allocations

Abstract: In this paper, we give an algorithm to find all core allocations in a general model of multilateral many-to-many matching with contracts. We develop a notion of information sharing in lattices, showing how lattice geometry can be exploited to produce a relatively fast algorithm that returns the full set of core outcomes. We show how to apply our technique to more general economic problems and, as an application, construct the first algorithm to find all stable allocations in bilateral matching-with-contracts markets when agents have substitutable preferences.

Research in Progress

Efficient Revelation in Two-wave Experiments

Abstract: This paper studies the optimal experimental design for a fixed weighting estimator of the average treatment effect (ATE), asking whether one can use a pilot experiment to “estimate what to balance.” We show that there exists a one-dimensional moment condition on the treatments that is the minimal sufficient condition for unbiasedness of the estimator. This moment condition captures the exact (unknown) function of the covariates that we need to balance at a fixed data-generating process (DGP). The degrees of freedom left after imposing this condition can be used to exploit efficient revelation effects: optimally revealing the less noisy potential outcomes. Thus, the oracle design solves a linear program: optimizing the revelation effect subject to the minimal, unconditional moment constraint. By contrast, inverse propensity weighting imposes a stronger conditional moment restriction, wasting these degrees of freedom. Because of this, the optimal propensity score is not equal to the estimator weights. We propose a feasible linearly constrained quadratic program (LCQP) that uses pilot data to mimic the oracle design LP. The extra quadratic term regularizes our estimate of the oracle LP, accounting for the additional variance due to pilot estimation of the oracle moment condition. We give asymptotics showing that such a strategy generically improves on the naive propensity weighting approach.
