Jaume Vives-i-Bastida

Email Address: vives@mit.edu

Personal Website: jvivesb.github.io

References: Alberto Abadie, Anna Mikusheva, Tobias Salz

Synthetic IV estimation in panels with Ahmet Gulek

Co-winner of the Best Student Paper Award of the International Applied Econometrics Association (2024)

Special Mention (runner-up) for the Best Job Market Paper Award of the European Economic Association and UniCredit Foundation (2024)

Abstract

We propose a Synthetic Instrumental Variables (SIV) estimator for panel data that combines the strengths of instrumental variables and synthetic controls to address unmeasured confounding. We derive conditions under which SIV is consistent and asymptotically normal, even when the standard IV estimator is not. Motivated by the finite sample properties of our estimator, we introduce an ensemble estimator that simultaneously addresses multiple sources of bias and provide a permutation-based inference procedure. We demonstrate the effectiveness of our methods through a calibrated simulation exercise, two shift-share empirical applications, and an application in digital economics that includes both observational data and data from a randomized controlled trial. In our primary empirical application, we examine the impact of the Syrian refugee crisis on Turkish labor markets. Here, the SIV estimator reveals significant effects that the standard IV estimator does not capture. Similarly, in our digital economics application, the SIV estimator successfully recovers the experimental estimates, whereas the standard IV estimator does not.

Stretching the Net: Multidimensional Regularization (Econometric Theory, 2023)

This paper derives asymptotic risk (expected loss) results for shrinkage estimators with multidimensional regularization in high-dimensional settings. We introduce a class of multidimensional shrinkage estimators (MuSEs), which includes the elastic net, and show that, as the number of parameters to estimate grows, the empirical loss converges to the oracle-optimal risk. This result holds when the regularization parameters are estimated empirically via cross-validation or Stein’s unbiased risk estimate. To help guide applied researchers in their choice of estimator, we compare the empirical Bayes risk of the lasso, ridge, and elastic net in a spike-and-normal setting. Of the three estimators, we find that the elastic net performs best when the data are moderately sparse and the lasso performs best when the data are highly sparse. Our analysis suggests that applied researchers who are unsure about the level of sparsity in their data might benefit from using MuSEs such as the elastic net. We exploit these insights to propose a new estimator, the cubic net, and demonstrate through simulations that it outperforms the three other estimators for any sparsity level.
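As a rough illustration of how the lasso, ridge, and elastic net relate as shrinkage rules, the sketch below assumes a simple normal means setup in which each coordinate is shrunk separately. The penalty parameterization, the function names, and the toy inputs are assumptions made for this example and may differ from the paper's definitions.

```python
import numpy as np

def elastic_net_shrink(x, lam1, lam2):
    """Coordinate-wise minimizer of 0.5*(x - t)**2 + lam1*|t| + 0.5*lam2*t**2.

    Assumed parameterization (not taken from the paper):
    lam2 = 0 gives the lasso (soft thresholding), lam1 = 0 gives ridge.
    """
    x = np.asarray(x, dtype=float)
    # Soft-threshold by lam1, then rescale by the ridge factor 1 / (1 + lam2).
    soft = np.sign(x) * np.maximum(np.abs(x) - lam1, 0.0)
    return soft / (1.0 + lam2)

def empirical_loss(theta_hat, theta):
    """Average squared error between an estimate and the true means."""
    return float(np.mean((np.asarray(theta_hat) - np.asarray(theta)) ** 2))

# Hypothetical observations, one per parameter.
x = np.array([3.0, -0.5, 1.0])
lasso = elastic_net_shrink(x, lam1=1.0, lam2=0.0)
ridge = elastic_net_shrink(x, lam1=0.0, lam2=1.0)
enet = elastic_net_shrink(x, lam1=1.0, lam2=1.0)
```

In this parameterization the elastic net has two tuning parameters rather than one, which is the sense in which it regularizes along more than one dimension.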

Synthetic Controls in Action with Alberto Abadie (Econometric Society Monographs, forthcoming)

In this article we propose a set of simple principles to guide empirical practice in synthetic control studies. The proposed principles follow from formal properties of synthetic control estimators, and pertain to the nature, implications, and prevention of over-fitting biases within a synthetic control framework, to the interpretability of the results, and to the availability of validation exercises. We discuss and visually demonstrate the relevance of the proposed principles under a variety of data configurations.

Bayesian and Frequentist Inference for Synthetic Controls with Ignacio Martinez

The synthetic control method has become a widely used tool to estimate causal effects with observational data. Despite this, inference for synthetic control methods remains challenging. Often, inferential results rely on linear factor model data generating processes. In this paper, we characterize the conditions on the factor model primitives (the factor loadings) for which the statistical risk minimizers are synthetic controls (in the simplex). Then, we propose a Bayesian alternative to the synthetic control method that preserves the main features of the standard method and provides a new way of doing valid inference. We explore a Bernstein-von Mises style result to link our Bayesian inference to frequentist inference. For linear factor model frameworks, we show that a maximum likelihood estimator (MLE) of the synthetic control weights can consistently estimate the predictive function of the potential outcomes for the treated unit and that our Bayes estimator is asymptotically close to the MLE in the total variation sense. Through simulations, we show that the Bayes and frequentist approaches converge even in sparse settings. Finally, we apply the method to revisit the study of the economic costs of the German reunification and the Catalan secession movement. The Bayesian synthetic control method is available in the bsynth R package.
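For readers unfamiliar with what "synthetic controls (in the simplex)" means, the sketch below computes standard synthetic control weights: non-negative weights that sum to one and are chosen so that a weighted average of control units tracks the treated unit before treatment. It is not the Bayesian method described above and does not use the bsynth package. The function names, the projected gradient solver, and the simulated data are assumptions made for this illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def synthetic_control_weights(Y0_pre, y1_pre, n_iter=5000):
    """Minimize ||y1_pre - Y0_pre @ w||^2 over the simplex.

    Y0_pre: (T0, J) pre-treatment outcomes of the J control units.
    y1_pre: (T0,) pre-treatment outcomes of the treated unit.
    Solved by projected gradient descent with step size 1/L.
    """
    Y0_pre = np.asarray(Y0_pre, dtype=float)
    y1_pre = np.asarray(y1_pre, dtype=float)
    J = Y0_pre.shape[1]
    w = np.full(J, 1.0 / J)                   # start from equal weights
    # L is the Lipschitz constant of the gradient of the squared error.
    L = 2.0 * np.linalg.norm(Y0_pre, 2) ** 2
    for _ in range(n_iter):
        grad = 2.0 * Y0_pre.T @ (Y0_pre @ w - y1_pre)
        w = project_to_simplex(w - grad / L)
    return w

# Simulated example: the treated unit is an exact mix of controls 0 and 1.
rng = np.random.default_rng(0)
Y0 = rng.normal(size=(20, 3))
w_true = np.array([0.7, 0.3, 0.0])
w_hat = synthetic_control_weights(Y0, Y0 @ w_true)
```

Given post-treatment control outcomes `Y0_post`, the counterfactual path for the treated unit would be `Y0_post @ w_hat`.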

Predictor Selection for Synthetic Controls (revise and resubmit at Journal of Econometrics)

Updated draft coming soon!

Synthetic control methods often rely on matching pre-treatment characteristics (called predictors) of the treated unit. The choice of predictors and how they are weighted plays a key role in the performance and interpretability of synthetic control estimators. This paper proposes the use of a sparse synthetic control procedure that penalizes the number of predictors used in generating the counterfactual to select the most important predictors. We derive, in a linear factor model framework, a new model selection consistency result and show that the penalized procedure has a faster mean squared error convergence rate. Through a simulation study, we then show that the sparse synthetic control achieves lower bias and has better post-treatment performance than the unpenalized synthetic control. Finally, we apply the method to revisit the study of the passage of Proposition 99 in California in an augmented setting with a large number of predictors available.

Bagged Polynomial Regression and Neural Networks with Sylvia Klosin

Series and polynomial regression are able to approximate the same function classes as neural networks. However, these methods are rarely used in practice, although they offer more interpretability than neural networks. In this paper, we show that a potential reason for this is the slow convergence rate of polynomial regression estimators and propose the use of bagged polynomial regression (BPR) as an attractive alternative to neural networks. Theoretically, we derive new finite sample and asymptotic L2 convergence rates for series estimators. We show that the rates can be improved in smooth settings by splitting the feature space and generating polynomial features separately for each partition. Empirically, we show that our proposed estimator, the BPR, can perform as well as more complex models with more parameters. Our estimator also performs close to state-of-the-art prediction methods on the benchmark MNIST handwritten digit dataset. We demonstrate that BPR performs as well as neural networks in crop classification using satellite data, a setting where prediction accuracy is critical and interpretability is often required for addressing research questions.
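The general idea of bagging a polynomial regression can be sketched as follows: fit a polynomial by least squares on many bootstrap resamples and average the fitted predictions. This is a simplified one-dimensional illustration without the feature-space partitioning described above, not the authors' BPR estimator. The function names, defaults, and toy data are assumptions.

```python
import numpy as np

def bagged_polynomial_fit(x, y, degree=3, n_bags=50, seed=0):
    """Fit one polynomial per bootstrap resample; return an (n_bags, degree + 1) array."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coefs = []
    for _ in range(n_bags):
        # Resample observations with replacement.
        idx = rng.integers(0, x.size, size=x.size)
        # Polynomial features 1, x, x^2, ..., x^degree.
        X = np.vander(x[idx], degree + 1, increasing=True)
        beta, *_ = np.linalg.lstsq(X, y[idx], rcond=None)
        coefs.append(beta)
    return np.array(coefs)

def bagged_polynomial_predict(coefs, x_new):
    """Average the predictions of all bootstrap fits at the points x_new."""
    x_new = np.asarray(x_new, dtype=float)
    X_new = np.vander(x_new, coefs.shape[1], increasing=True)
    return (X_new @ coefs.T).mean(axis=1)

# Toy data from a noiseless quadratic, so the bagged fit should recover it.
x = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x - x ** 2
coefs = bagged_polynomial_fit(x, y, degree=2)
pred = bagged_polynomial_predict(coefs, [0.0, 0.5])
```

Because each bootstrap fit is an ordinary polynomial, the averaged model is itself a polynomial with coefficients `coefs.mean(axis=0)`, which keeps the fitted function easy to inspect.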

Synthetic Experimental Design for a UBI Pilot Study [Policy Technical Report: Ivalua]

The Effects of Regulating Food Delivery Platform Design with Alejandro Sabal

[slides]

Abstract: There is rising interest amongst regulators in understanding how different platform design choices affect welfare. In this project, we focus on two important mechanisms platforms have at their disposal: (1) offering preferential treatment to producers in consumer search and (2) determining producer payments (commission fees). We study the welfare implications of different platform choices in the empirical setting of a food delivery platform that bargains with producers to set commission fees and to allocate fixed ranking slots in the consumer search wall. Using transaction-level data and clickstream search data from a large food delivery platform, we show that both mechanisms are important in practice. Producers with low commission fees are preferred by consumers, and low commission fees are used to attract valuable “anchor” producers that draw consumers onto the platform. Search preferencing is also important. Using an A/B test in which rank was randomized, we show that search rank is a driver of consumption. To evaluate the impact of different platform designs on consumer and restaurant welfare, we develop a structural model featuring consumer demand with search frictions, bargaining between restaurants and the platform over ranks and commission fees, and restaurant and consumer entry into the platform. Using the model, we compute counterfactual experiments to assess the impact of regulations forbidding platforms from providing preferential rank to larger restaurants and from setting differential commission fees across restaurants.

Pushing Back Against Private Practice: the Spanish Physician Public Exclusivity Bonus with Jon Gruber, Nuria Mas and Judit Vall

Draft coming soon!
