Job market paper
This paper studies a non-parametric discrete instrumental variable model that embeds a conditional statistical independence condition: heterogeneity is statistically independent of instruments conditional upon covariates. The model is applicable when the data comprise a dichotomous response, a dichotomous treatment, and a discrete instrumental variable; the model also allows covariates to influence both the response decision and the treatment decision. This paper shows that the model partially identifies the conditional average treatment effect and its aggregation (i.e., linear combinations of this parameter for different conditioning levels), and that it is falsifiable. This paper also adopts a robust Bayesian approach (Giacomini & Kitagawa, 2021) to provide credible regions for the effect of additional children on female labour force participation using U.S. data from 1980 and 2008–2018 under weak restrictions on behaviour.
Published in Japanese Economic Review (with T. Kitagawa)
Static supervised learning, in which experimental data serve as a training sample for the estimation of an optimal treatment assignment policy, is a commonly assumed framework for policy learning. An arguably more realistic but challenging scenario is a dynamic setting in which the planner performs experimentation and exploitation simultaneously with subjects that arrive sequentially. This paper studies bandit algorithms for learning an optimal individualised treatment assignment policy. Specifically, we study the applicability of the EXP4.P (Exponential weighting for Exploration and Exploitation with Experts) algorithm developed by Beygelzimer et al. (2011) to policy learning. Assuming that the class of policies has a finite Vapnik-Chervonenkis dimension and that the number of subjects to be allocated is known, we present a high-probability welfare-regret bound for the algorithm. To implement the algorithm, we use an incremental enumeration algorithm for hyperplane arrangements. We perform extensive numerical analysis to assess the algorithm's sensitivity to its tuning parameters and its welfare-regret performance. Further simulation exercises are calibrated to the National Job Training Partnership Act (JTPA) Study sample to determine how the algorithm performs when applied to economic data. Our findings highlight various computational challenges and suggest that the limited welfare gain from the algorithm is due to substantial heterogeneity in causal effects in the JTPA data.
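As an illustration of the kind of procedure described above, the following is a minimal sketch of an EXP4.P-style loop for two treatment arms, written from the general form of the algorithm in Beygelzimer et al. (2011) rather than from the paper's own implementation. The function names, the threshold policies on a scalar covariate, and the simulated rewards are all hypothetical; the paper instead enumerates policies via hyperplane arrangements, which is not reproduced here.

```python
import math
import random


def exp4p(policies, draw_subject, horizon, delta=0.1, seed=0):
    """EXP4.P-style loop over a finite list of deterministic policies.

    policies:     functions x -> arm in {0, 1} (the "experts"; hypothetical)
    draw_subject: function rng -> (x, rewards) with rewards[arm] in [0, 1]
    horizon:      number of subjects T, assumed known in advance
    """
    rng = random.Random(seed)
    n_arms, n_experts = 2, len(policies)
    # Exploration floor, capped so that 1 - K * p_min stays non-negative.
    p_min = min(1.0 / n_arms,
                math.sqrt(math.log(n_experts) / (n_arms * horizon)))
    # Scale of the confidence bonus added to each expert's estimated reward.
    bonus = math.sqrt(math.log(n_experts / delta) / (n_arms * horizon))
    log_w = [0.0] * n_experts  # log-weights, kept in logs for stability
    total_reward = 0.0
    for _ in range(horizon):
        x, rewards = draw_subject(rng)
        advice = [policy(x) for policy in policies]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        w_sum = sum(w)
        # Mix the weighted expert advice with the uniform exploration floor.
        probs = [p_min] * n_arms
        for w_i, arm_i in zip(w, advice):
            probs[arm_i] += (1.0 - n_arms * p_min) * w_i / w_sum
        arm = 1 if rng.random() < probs[1] else 0
        reward = rewards[arm]  # only the chosen arm's reward is used
        total_reward += reward
        r_hat = reward / probs[arm]  # importance-weighted reward estimate
        for i, arm_i in enumerate(advice):
            y_hat = r_hat if arm_i == arm else 0.0
            v_hat = 1.0 / probs[arm_i]  # variance proxy for expert i
            log_w[i] += (p_min / 2.0) * (y_hat + v_hat * bonus)
    return log_w, total_reward / horizon


def draw_subject(rng):
    # Hypothetical design: treatment pays off only when x > 0.5.
    x = rng.random()
    return x, (0.5, 1.0 if x > 0.5 else 0.0)


thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
policies = [lambda x, c=c: int(x > c) for c in thresholds]
log_weights, avg_reward = exp4p(policies, draw_subject, horizon=2000)
```

In this toy design the rule that treats when x > 0.5 has the highest welfare, so one would expect its log-weight to grow fastest, although the exploration floor and the confidence bonus keep every policy in play.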
Submitted (with T. Kitagawa & Hugo Lopez)
This paper proposes a novel method to estimate individualised treatment assignment rules. The method is designed to find rules that are stochastic, reflecting uncertainty in estimation of an assignment rule and about its welfare performance. Our approach is to form a prior distribution over assignment rules, not over data generating processes, and to update this prior based upon an empirical welfare criterion, not likelihood. The social planner then assigns treatment by drawing a policy from the resulting posterior. We show analytically a welfare-optimal way of updating the prior using empirical welfare; this posterior is not feasible to compute, so we propose a variational Bayes approximation for the optimal posterior. We characterise the welfare regret convergence of the assignment rule based upon this variational Bayes approximation, showing that it converges to zero at a rate of ln(n)/sqrt(n). We apply our methods to experimental data from the Job Training Partnership Act Study to illustrate the implementation of our methods.
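To make the idea of updating a prior over rules with empirical welfare concrete, here is a small sketch on a finite class of rules. It assumes an exponential-tilting (Gibbs) update, which is common in the PAC-Bayes literature but is not stated in the abstract, so it should not be read as the paper's optimal posterior or its variational approximation. The inverse-propensity welfare estimator, the threshold rules, the simulated sample, and the temperature choice are all hypothetical.

```python
import math
import random


def empirical_welfare(rule, sample):
    """Inverse-propensity-weighted mean outcome under `rule`.

    Each observation is (x, d, y, e): covariate, treatment received,
    outcome, and known propensity score e = P(d = 1 | x).
    """
    total = 0.0
    for x, d, y, e in sample:
        if rule(x) == d:
            total += y / (e if d == 1 else 1.0 - e)
    return total / len(sample)


def gibbs_posterior(rules, prior, sample, temperature):
    """Tilt the prior over rules by exp(temperature * empirical welfare)."""
    scores = [math.log(p) + temperature * empirical_welfare(rule, sample)
              for rule, p in zip(rules, prior)]
    top = max(scores)  # subtract the maximum before exponentiating
    unnorm = [math.exp(s - top) for s in scores]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Hypothetical experimental sample with propensity score 0.5.
rng = random.Random(0)
sample = []
for _ in range(500):
    x = rng.random()
    d = int(rng.random() < 0.5)
    y = x if d == 1 else 0.5
    sample.append((x, d, y, 0.5))

thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
rules = [lambda x, c=c: int(x > c) for c in thresholds]
prior = [1.0 / len(rules)] * len(rules)
# The temperature sqrt(n) is an arbitrary illustrative choice.
posterior = gibbs_posterior(rules, prior, sample,
                            temperature=math.sqrt(len(sample)))
# Stochastic assignment: the planner draws one rule from the posterior.
drawn_rule = rng.choices(rules, weights=posterior, k=1)[0]
```

Setting the temperature to zero returns the prior unchanged, while larger values concentrate the posterior on rules with high empirical welfare.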
Submitted (with T. Kitagawa)
The von Mises-Fisher family is a parametric family of distributions on the surface of the unit ball, summarised by a concentration parameter and a mean direction. As a quasi-Bayesian prior, the von Mises-Fisher distribution is a convenient and parsimonious choice when parameter spaces are isomorphic to the hypersphere (e.g., maximum score estimation in semi-parametric discrete choice, estimation of single-index treatment assignment rules via empirical welfare maximisation, under-identifying linear simultaneous equation models). Despite a long history of application, measures of statistical divergence have not been analytically characterised for von Mises-Fisher distributions. This paper provides analytical expressions for the f-divergence of a von Mises-Fisher distribution in R^p from another, distinct, von Mises-Fisher distribution and from the uniform distribution over the hypersphere. This paper also collects several other results pertaining to the von Mises-Fisher family of distributions, and characterises the limiting behaviour of the measures of divergence that we consider.
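For orientation, the Kullback-Leibler divergence, which is one member of the f-divergence family, can be worked out directly from the density. The expressions below are a standard calculation given here as a sketch, not a restatement of the paper's results; the notation (C_p for the normalising constant, A_p for the mean resultant length, I_ν for the modified Bessel function of the first kind, u for the uniform density on the hypersphere) is an assumption of this illustration.

```latex
\begin{align*}
f_p(x;\mu,\kappa) &= C_p(\kappa)\exp\!\left(\kappa\,\mu^{\top}x\right),
  \qquad
  C_p(\kappa) = \frac{\kappa^{p/2-1}}{(2\pi)^{p/2}\,I_{p/2-1}(\kappa)},
  \qquad
  A_p(\kappa) = \frac{I_{p/2}(\kappa)}{I_{p/2-1}(\kappa)}, \\[4pt]
% Uses E_1[x] = A_p(\kappa_1)\,\mu_1 under f_1 = f_p(\cdot;\mu_1,\kappa_1).
\mathrm{KL}(f_1 \,\|\, f_2)
  &= \log\frac{C_p(\kappa_1)}{C_p(\kappa_2)}
     + A_p(\kappa_1)\left(\kappa_1 - \kappa_2\,\mu_1^{\top}\mu_2\right), \\[4pt]
% The uniform density on the hypersphere is \Gamma(p/2) / (2\pi^{p/2}).
\mathrm{KL}(f_1 \,\|\, u)
  &= \log C_p(\kappa_1) + \kappa_1 A_p(\kappa_1)
     + \log\frac{2\pi^{p/2}}{\Gamma(p/2)}.
\end{align*}
```

Both expressions follow from taking the expectation of the log density ratio under f_1, since the log density is linear in x and the mean of x under f_1 is A_p(κ_1) μ_1.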
In progress
There are substantial medical costs associated with the delivery and care of low and very low birthweight infants. The consumption of alcohol and tobacco during pregnancy is known to adversely affect foetal development, but is this effect uniform across the birthweight distribution? Moreover, does it matter when (i.e., in which trimester of their pregnancy) mothers stop their consumption of these goods? Evans & Ringel (1999) propose using variation in state (and federal) taxes on alcohol and cigarettes to identify the causal effect of alcohol and tobacco on infant birthweight. A key identifying assumption is that taxes are exogenously set, meaning that the legislative process by which nominal taxes on units of alcohol or packets of cigarettes are set acts independently of the prevailing market conditions for these goods; this assumption is incompatible with ad valorem taxes. By excluding births from Hawaii, whose taxes on units of alcohol or packets of cigarettes depend upon their price, we find that the statistically insignificant but positive effect of reduced consumption of these goods upon infant birthweight disappears, even in a far larger sample that compensates for the weak correlation between the price and consumption of what are inherently addictive goods.