Counterfactual Analysis Based on Grouped Data: Application to Poverty and Material Deprivation

We propose inferences on the counterfactual mean: the mean of some population outcome, assuming that the related characteristics are distributed according to the distribution of another population. The purpose is to compare the mean of an outcome in two populations, holding constant other related factors that may distort the comparison.
Once the data are partitioned into a number of classes, the counterfactual mean outcome is identified as a weighted sum of conditional expectations: the expected outcome in one population, given that the related characteristics take values in each class, weighted by the probability that the characteristics take values in that class in the other population. The corresponding counterfactual mean estimator follows by the analogy principle. The procedure can be applied using data of any kind, without specifying a model for the conditional expectation. The asymptotic properties of the proposed estimator are unaffected when the partitions are data-dependent, provided the classes converge to a fixed limit. The main application of the procedure consists of decomposing the difference between sample means into a composition term, assuming that the conditional expectation functions are identical in each population, and a structural term, assuming that the distribution of the related characteristics is identical in the two populations. The proposed methodology is applied to decompose the effects of the Great Recession on Spanish poverty indices. The results suggest that differences in the distribution of characteristics over time, resulting from labor market disruptions and demographic changes, were responsible for the increase in poverty rates. While job losses pushed the rates higher, the out-migration of poor foreigners decreased the rates.
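The weighted-sum identification and the resulting decomposition can be illustrated with a minimal sketch. It assumes that the class labels are already available as integers 0, ..., K-1 for each observation, and the function names `grouped_counterfactual` and `decompose` are hypothetical, introduced only for illustration. The order of the decomposition (population 0 supplies the conditional means in the counterfactual) is one possible convention, not necessarily the one used in the paper.

```python
import numpy as np


def grouped_counterfactual(y0, class0, class1, n_classes):
    """Sample analogue of the grouped counterfactual mean.

    Class means of the outcome in population 0 are weighted by the
    class shares observed in population 1.
    """
    y0 = np.asarray(y0, dtype=float)
    class0 = np.asarray(class0)
    class1 = np.asarray(class1)

    # Outcome totals and observation counts per class in population 0.
    sums0 = np.bincount(class0, weights=y0, minlength=n_classes)
    counts0 = np.bincount(class0, minlength=n_classes)

    # Share of population 1 falling in each class (the weights).
    shares1 = np.bincount(class1, minlength=n_classes) / class1.size

    # A class with positive weight but no observations in population 0
    # has no estimable conditional mean.
    if np.any((counts0 == 0) & (shares1 > 0)):
        raise ValueError("a class with positive weight is empty in population 0")

    means0 = np.divide(sums0, counts0, out=np.zeros(n_classes), where=counts0 > 0)
    return float(means0 @ shares1)


def decompose(y0, class0, y1, class1, n_classes):
    """Split mean(y1) - mean(y0) into (composition, structural) terms.

    Assumed convention: the counterfactual uses population 0 class means
    with population 1 class shares.
    """
    cf = grouped_counterfactual(y0, class0, class1, n_classes)
    composition = cf - float(np.mean(y0))   # class shares change, class means fixed
    structural = float(np.mean(y1)) - cf    # class means change, class shares fixed
    return composition, structural
```

By construction the two terms add up to the raw difference in sample means, so calling `decompose` on two survey waves (say, a pre-recession and a post-recession sample with a common class definition) returns a pair whose sum equals `mean(y1) - mean(y0)`.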