Collinearity in mover event studies
A note for economists
How do you get an event study whose confidence intervals suddenly jump in width? By having almost no treatment variation. This post covers an issue I ran into while trying to fix the event study for my comment on Moretti (2021) (M21). M21 includes city fixed effects, and it turns out that these create collinearity, which produces the jump in confidence interval width.
The setting is a 'mover' event study, where individuals move across cities; the treatment is the change in city environment. In M21, this is the change in cluster size, defined as the number of inventors in the same research field and city. The treatment variable is Size_diff = Size_post - Size_pre. The event study interacts Size_diff with event time indicators.
The sample consists of movers: inventors who change cities exactly once. Stayers are excluded (i.e., there are no never-treated observations). It's a staggered rollout design, with inventors moving in different calendar years. The regression behind the original graph includes individual, year, and city fixed effects. Seems fine, right?
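Written out, the specification described above looks roughly like the following. This is a sketch based on the description in this post, not M21's exact equation: the symbols \alpha_i (individual FE), \gamma_t (year FE), \delta_{c(i,t)} (FE for the city where inventor i lives in year t), E_i (move year), and \beta_k (event-time coefficients) are notation assumed here for illustration, and the omitted period is t=-1.

```latex
y_{it} = \alpha_i + \gamma_t + \delta_{c(i,t)}
       + \sum_{k \neq -1} \beta_k \, \mathbf{1}[t - E_i = k] \times \text{Size\_diff}_i
       + \varepsilon_{it}
```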
When I simulate data for N=1000 individuals moving between C=20 cities, with a true effect of B=1, the event study with the same fixed effects looks as expected, with no jump in confidence interval width.
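A minimal sketch of what such a simulated mover panel could look like is below. It is not the simulation code linked from this post: the number of years, the uniform city sizes, the noise distribution, and the function name `simulate_movers` are all assumptions made for illustration.

```python
import random

def simulate_movers(n_individuals=1000, n_cities=20, n_years=10, beta=1.0, seed=0):
    """Simulate a balanced panel of one-time movers (illustrative DGP only)."""
    rng = random.Random(seed)
    # Hypothetical time-invariant "cluster size" for each city.
    size = [rng.uniform(0.0, 10.0) for _ in range(n_cities)]
    rows = []
    for i in range(n_individuals):
        origin, dest = rng.sample(range(n_cities), 2)  # two distinct cities
        move_year = rng.randrange(1, n_years)          # first post-move year (staggered)
        size_diff = size[dest] - size[origin]          # Size_post - Size_pre
        for year in range(n_years):
            post = int(year >= move_year)
            city = dest if post else origin
            # Outcome: true effect beta on Size_diff after the move, plus noise.
            y = beta * size_diff * post + rng.gauss(0.0, 1.0)
            rows.append({
                "id": i, "year": year, "city": city, "post": post,
                "event_time": year - move_year, "size_diff": size_diff, "y": y,
            })
    return rows

panel = simulate_movers()  # N=1000, C=20, B=1
```

Because `move_year` is drawn from years 1 to `n_years - 1`, every individual has at least one pre-move and one post-move observation, and there are no stayers.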
With N=1000 individuals and C=1000 cities, the graph looks very similar to the original one! The only thing that changed is the number of cities, so the city fixed effects are the key.
The core issue is that many cities are observed only as origins or only as destinations. By definition, an individual is in their origin city exactly when Post=0, so origin-only cities have Post=0 for all observations. Similarly, destination-only cities always have Post=1. Within these cities Post never varies, so for their observations the city fixed effect is collinear with Post.
In the event study, with t=-1 omitted, this means the post-move indicators are collinear with the city fixed effects. In the extreme case where no city is both an origin and a destination, the collinearity is exact and the regression is unidentified.
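The extreme case can be checked on a toy example. The sketch below uses a single Post indicator rather than the full set of event-time interactions, and the helper names `build_rows` and `post_spanned_by_city_fe` are hypothetical. It relies on one fact: a full set of city dummies spans exactly the variables that are constant within each city, so Post is a linear combination of the city dummies if and only if it never varies within a city.

```python
from collections import defaultdict

def build_rows(moves):
    """Two observations per mover: pre-move in the origin, post-move in the destination."""
    rows = []
    for origin, dest in moves:
        rows.append({"city": origin, "post": 0})
        rows.append({"city": dest, "post": 1})
    return rows

def post_spanned_by_city_fe(rows):
    """Return True if Post is constant within every city.

    This is equivalent to Post being an exact linear combination
    of the city dummies (here: the sum of the destination-city dummies).
    """
    values = defaultdict(set)
    for r in rows:
        values[r["city"]].add(r["post"])
    return all(len(v) == 1 for v in values.values())

# Cities 0-1 are origin-only, cities 2-3 are destination-only.
disjoint = build_rows([(0, 2), (0, 3), (1, 2), (1, 3)])
# Here cities 0, 1, and 2 each appear as both an origin and a destination.
overlapping = build_rows([(0, 2), (2, 1), (1, 0), (1, 3)])

print(post_spanned_by_city_fe(disjoint))     # True: exact collinearity
print(post_spanned_by_city_fe(overlapping))  # False: Post varies within some cities
```

In the overlapping case, only the cities that appear on both sides of a move contribute within-city variation in Post, which is why a small overlap leaves very little to identify the coefficients.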
With C=1000 cities, only 5% of city-year pairs are observed as both origin and destination, so once city fixed effects are included the effective sample is very small and the confidence intervals jump in width. With C=20 cities, the share is 78%, so the confidence intervals are fine.
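One way to compute an overlap share of this kind is sketched below. It takes one plausible reading of the statistic: the share of observed city-year cells that contain both pre-move and post-move observations. The panel length, the uniform move years, and the function name `overlap_share` are assumptions, so the numbers it produces need not match the 5% and 78% reported above.

```python
import random
from collections import defaultdict

def overlap_share(n_individuals, n_cities, n_years=10, seed=0):
    """Share of observed city-year cells with both pre- and post-move observations."""
    rng = random.Random(seed)
    cell_status = defaultdict(set)  # (city, year) -> set of Post values seen in that cell
    for _ in range(n_individuals):
        origin, dest = rng.sample(range(n_cities), 2)  # two distinct cities
        move_year = rng.randrange(1, n_years)          # first post-move year
        for year in range(n_years):
            post = int(year >= move_year)
            city = dest if post else origin
            cell_status[(city, year)].add(post)
    both = sum(1 for seen in cell_status.values() if len(seen) == 2)
    return both / len(cell_status)

share_few = overlap_share(1000, 20)     # few cities: most cells mix pre and post
share_many = overlap_share(1000, 1000)  # many cities: cells are mostly singletons
print(f"C=20: {share_few:.2f}, C=1000: {share_many:.2f}")
```

With few cities, each city-year cell holds many inventors, so almost every cell outside the first and last year mixes pre-move and post-move observations. With as many cities as individuals, most cells hold a single inventor and the overlap collapses.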
Note that where the jump in confidence interval width occurs is just a function of which period is omitted. When I omit t=0 instead of t=-1, the pre-move indicators become nearly collinear with the city fixed effects, and the confidence intervals are wider pre-move instead.
Final note: in principle, the pattern of large standard errors could also be explained by treatment effect heterogeneity. In this case, however, it is driven by the collinearity from the city fixed effects. When I redo the original graph using the real data and remove the city fixed effects, the jump in width disappears.
Link to simulation code.





