
Chapter 20 - Regression Discontinuity

A drawing of someone peering over a cliff.

20.1 How Does It Work?

San Diego is a large city in the United States that covers a fairly wide geographic area of more than 300 square miles. It’s pretty well-off too, with an average annual household income above $85,000 as of 2019, about 50% above the national average. As you move south through the area, the scene changes a bit. There’s a little less money in some southern parts of the city. By the time you get to San Ysidro, the district of the city just on the Mexican border, household income has dropped to more like $50,000 or $55,000. The further south you go, the lower you can expect incomes to be.

But our trip through the city has nothing on what happens when we cross that border into Tijuana, Mexico. It took us 16 miles to drive from downtown to San Ysidro and watch incomes drop by 25%. But you could get out of the car in an area with $50,000 household income and just walk a few feet over the border into Tijuana. Suddenly and sharply, household income drops to $20,000.

Sure, there may be something about the people, the opportunities, or the geography that is different in south San Diego as opposed to the downtown area, and that could explain some of the difference in incomes. But there’s no way that just going south explains what happens when we take one step over the border and see incomes crash. From downtown to San Ysidro there’s continuous change. But at the border it jumps. Something is fundamentally different. We have a cutoff.

Whenever some treatment is assigned discontinuously - people just on one side of a line get it, and people just on the other side of the line don’t - we can be pretty sure that the differences we see aren’t just because there’s something different about how far south they are. Without the line they might be a little different, but not that different. The big change we can attribute to that line separating Mexico and the US.

This is the idea behind regression discontinuity. Compare people just on either side of a cutoff. Without the cutoff they’d likely be pretty similar. So if they’re different, we can probably attribute the difference to whatever it is that happens at that cutoff.

Regression discontinuity focuses on treatment that is assigned at a cutoff. Just to one side, no treatment. Just to the other, treatment!584 Regression discontinuity is almost always used in reference to a binary treatment variable. This chapter will not bother hedging its wording to imply any other use, although regression discontinuity with non-binary treatments is out there, and you’ll see a tiny hint of it in the Regression Kink section in How the Pros Do It below.

It’s worth establishing some terminology here:

  • Running variable. The running variable, also known as a forcing variable, is the variable that determines whether you’re treated or not. For example, if a doctor takes your blood pressure and will assign you a blood-pressure reducing medicine if your systolic blood pressure is above 135, then your blood pressure is the running variable.
  • Cutoff. The cutoff is the value of the running variable that determines whether you get treatment. Using the same blood pressure example, the cutoff in this case is a systolic blood pressure of 135. If you’re above 135, you get the medicine. If you’re below 135, you don’t.585 In this example, and many examples, you get treatment if you’re above the cutoff, and not if you’re below. But in many scenarios you would get treatment if you were below the cutoff. The same logic works.
  • Bandwidth. It’s reasonable to think that people just barely to either side of the cutoff are basically the same other than for the cutoff. But people farther away (say, downtown San Diego vs. further inside of Mexico) might be different for reasons other than the cutoff. The bandwidth is how much area around the cutoff you’re willing to consider comparable. Ten feet to either side of the US/Mexico border? 1000 feet? 80 miles?

The core of the research design of regression discontinuity is to (a) account for how the running variable normally affects the outcome, (b) focus on observations right around the cutoff, inside the bandwidth, and (c) compare the just-barely-treated against the just-barely-didn’t to get the effect of treatment.

There are plenty of cutoffs like this in the world. Quite a few policies are designed with explicit cutoffs in mind. Make just slightly too much money? You might not qualify for some means-tested program. Or have a test score just slightly too low? You might not qualify for the gifted-and-talented program.

These cutoffs could be geographic - we already had the San Diego/Tijuana example. But also, live just to one side of a time zone border or another? You might be getting up an hour earlier. Or if you’re just to one side of a police jurisdiction’s border, you may be experiencing different policing policies.

Politics is another place where we have hard cutoffs. Run in an election and win 49.999% of the vote among the top two candidates? You don’t get the office. But win 50.001%? Enjoy your office! Now we can see what the effect of you is.

In each of these cases, we can pretty reasonably imagine that cases that are just barely to either side of the cutoff are comparable, and any differences between them are really the fault of treatment. In other words, we’ve identified the effect of treatment.

The goal of this approach is to carefully select the variation that lets us ignore all sorts of back doors without actually having to control for them.

When can we apply regression discontinuity? What we’re looking for is some sort of treatment that is assigned based on a cutoff. There’s the running variable that determines treatment. And there’s some cutoff value. If you’re just to one side of the cutoff value, you don’t get treated. If you’re to the other side, you do get treated.586 Or at least your chances of treatment change dramatically - we’ll get to this variant when we talk about fuzzy regression discontinuity. Plus, since our strategy is going to be assuming that people close to the cutoff are effectively randomly assigned,587 More precisely, we need continuity of the relationship between the outcome and the running variable. If, for example, there’s a positive relationship between the outcome and the running variable, we’d expect a higher outcome to the right of the cutoff than to the left. That’s totally fine, even though the two groups aren’t exactly comparable. As long as there wouldn’t be a discontinuity in that positive relationship without the treatment. Roughly, you can think of this as “after we adjust for the overall relationship between the outcome and the running variable, the two sides are basically randomly assigned.” there shouldn’t be any obvious impediments to that randomness - people shouldn’t be able to manipulate the running variable so as to choose their own treatment, and also the people who choose what the cutoff is shouldn’t be able to make that choice in response to finding out who has which running variable values.

This gives us a causal diagram that looks like Figure 20.1. Here we have a \(RunningVariable\) with a back door through \(Z\) (which perhaps we can’t control for). \(RunningVariable\) also has a direct effect on the \(Outcome\).

Figure 20.1: A Causal Diagram that Regression Discontinuity Works For

A causal diagram in which Z causes Running Variable and Outcome, Running Variable causes Outcome, and Running Variable causes Above Cutoff causes Treatment causes Outcome

Now, we aren’t actually interested in the effect of \(RunningVariable\). We’re interested in the effect of \(Treatment\). \(Treatment\) has a back door through \(RunningVariable\). However, if we control for all the variation in \(RunningVariable\) except for being above the cutoff, then we can close that back door, identifying the effect of \(Treatment\) on \(Outcome\).588 As has popped up a few times in this book, this diagram isn’t exactly a kosher causal diagram. An arrow allows any sort of shape for the relationship, and so “\(RunningVariable\) affects \(Treatment\) only at the cutoff” could be handled by a \(RunningVariable \rightarrow Treatment\) arrow, with the cutoff incorporated using a “limiting diagram” where \(RunningVariable\) is replaced by \(RunningVariable\rightarrow Cutoff\) as a node in itself. I think this is more complex than the way I’ve done it, though, so here we are. If you keep working with causal diagrams, the more proper version will pop up.

Based on the diagram it sort of seems like regression discontinuity is just a case of controlling for a variable (\(RunningVariable\)) and saying we’ve identified the effect. But it’s a bit more interesting than that. We’re actually isolating a front-door path, as in Chapter 9 (Finding Front Doors) or Chapter 19 (Instrumental Variables), not closing back doors. By only looking right around the cutoff we are getting rid of any variation that doesn’t lie on the \(AboveCutoff \rightarrow Treatment \rightarrow Outcome\) path. Nothing else should really be varying once we limit ourselves to that cutoff, so all the other variables go away!

So, sure, in Figure 20.1 we could identify the effect by doing regular ol’ statistical adjustment for \(RunningVariable\). But what if there were other back doors? Perhaps even back doors for \(Treatment\) itself, for example if \(Z \rightarrow Treatment\) were on the graph? Controlling for \(RunningVariable\) wouldn’t solve the problem, but regression discontinuity would, by isolating \(AboveCutoff \rightarrow Treatment \rightarrow Outcome\) and letting us ignore any other arrows heading into \(Treatment\).589 You may have noticed the similarity between Figure 20.1 and the causal diagram in Chapter 17 for event studies. They are, indeed, nearly the same, except event studies have \(Time\) as a running variable. You will notice other similarities to event studies, especially interrupted time series, through the chapter. The main difference is that a lot of the necessary assumptions become more plausible with a non-\(Time\) running variable, for reasons that hopefully will become evident.

So how does regression discontinuity do all this cool stuff, exactly? We can perform regression discontinuity by making only a few basic choices. Once we’ve done that, the process is pretty straightforward.

  1. Choose a method for predicting the outcome on each side of the cutoff
  2. Choose a bandwidth (optional)

That’s it!590 Although the estimation will usually be more complex than I have it here - we’ll get to it in the How Is It Performed section. Let’s put these choices to work on some simulated data. You can see the process all worked out in Figure 20.2.

Four graphs. The first shows raw data with a cutoff on the $x$-axis, where the data clearly jumps a bit after that cutoff. The second shows a smooth line fitted to the data, with a separate line to the left and right of the cutoff. The third is the same as the second except that it's limited to just data right around the cutoff. The fourth shows the jump from the line on the left and the line on the right at the point of the cutoff, which is the regression discontinuity estimate.

Figure 20.2: Regression Discontinuity, Step by Step

In part (a), we see a setup that’s pretty typical for regression discontinuity: a running variable along the \(x\)-axis. The running variable does seem to be related to the outcome even when we’re not around the cutoff, which is fine. We have a clear upward slope here. We also have a cutoff value at which treatment was applied. And we see a jump in the outcome at that cutoff! That’s exactly what we want to see.

Moving on to (b), we need a way of predicting what the outcome will be just to either side of the cutoff. There are a bunch of different prediction methods we could use - regression, local means, and so on.591 Shoot, we could just guess what we think the values will be based on gut instinct. Journals tend to frown on this practice though. Ivory tower snobs, the lot of them. In this case, we’re using local means with LOESS, which is just a smoothed line based on the prediction from a regression that uses only points within a certain range of each \(x\)-axis value.

In (c) we can see our bandwidth being applied (which will modify the predictions that our method in (b) makes). We are really only interested in the values right around the cutoff. Values far away from the cutoff still have value - we can use them to figure out trends and lines that let us better predict the values at the cutoff. But get too far away and you’re allowing in a bunch of back-door variation that we’re doing regression discontinuity to avoid in the first place. But maybe we plan to account for that problem by adjusting for the relationship between the outcome and the running variable in the regression, in which case we can skip the bandwidth step. We’ll get to the details of this later,592 The basic struggle of the choice of bandwidth is this: the farther out your bandwidth, the less certain you can be about your identifying assumption that people are comparable on either side of the cutoff. But the closer-in your bandwidth, the fewer observations you have, making your estimate less precise. but for now let’s opt to use a bandwidth. Here we’re using only data that’s really close to the cutoff to calculate our prediction around the cutoff. Everything else gets ignored.

Finally, based on our bandwidth and prediction method, we have our predictions for what the outcome is just on either side of the cutoff. Our regression discontinuity estimate is then just how much the prediction jumps at the cutoff. We can see that in (d), where the estimate is the vertical line going from the prediction just-on-the-left to the prediction just-on-the-right.
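
If you want to try these steps yourself, here is a minimal sketch of the whole process on made-up data (not the simulated data behind Figure 20.2). It uses straight-line fits on each side of the cutoff rather than LOESS, and the cutoff of .5, the bandwidth of .05, and the true jump of .2 are all arbitrary choices for illustration.

R Code

library(tidyverse)

set.seed(1000)

# Made-up data: a running variable, a cutoff at .5, an upward trend,
# and a true jump of .2 in the outcome at the cutoff
df <- tibble(running = runif(500)) %>%
    mutate(treated = running >= .5,
           outcome = running + .2*treated + rnorm(500, sd = .2))

# Choice 1: a prediction method (here, a straight line on each side)
# Choice 2: a bandwidth (here, .05 on either side of the cutoff)
bw <- .05
near <- df %>% filter(abs(running - .5) <= bw)

m_left  <- lm(outcome ~ running, data = near %>% filter(!treated))
m_right <- lm(outcome ~ running, data = near %>% filter(treated))

# The regression discontinuity estimate is the jump between the
# two predictions at the cutoff itself
predict(m_right, newdata = tibble(running = .5)) -
    predict(m_left, newdata = tibble(running = .5))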

So far, I’ve shown examples where the cutoff is the only determinant of treatment. Often this isn’t the case. In a lot of regression discontinuity applications, being on one side or another of the cutoff only changes the probability of treatment. In these cases we have what’s called a fuzzy regression discontinuity, as opposed to a “sharp” regression discontinuity where treatment rates jump from 0% to 100%.

Figure 20.3 shows how treatment rates might change in sharp and fuzzy designs.593 Pay attention to the fact that the \(y\)-axis on these graphs is the proportion of observations that are treated, not the average outcome. While each shows a jump in the rate of treatment at the cutoff, the sharp design shows no action at all anywhere but the cutoff.

Two graphs. The left, panel (a), shows a sharp regression discontinuity where treatment rates are 0% to one side of the cutoff and 100% to the other. The right, panel (b), shows a fuzzy regression discontinuity where treatment rates are nonzero and steadily rising to one side of the cutoff, then jump sharply at the cutoff, and then continue to rise on the other side of the cutoff.

Figure 20.3: Proportion Treated in Sharp and Fuzzy Regression Discontinuity Designs

One example of a topic where fuzzy regression discontinuity is applied a lot is in looking at the impact of retirement. In many occupations and countries, there are certain ages, or number of years you’ve been at your job, where a new policy kicks in - pension income, access to retirement funds - and the retirement rate jumps significantly at these points. Not everyone retires as soon as they’re able to, though, and some retire earlier, making this a fuzzy discontinuity.

Battistin et al. (2009Battistin, Erich, Agar Brugiavini, Enrico Rettore, and Guglielmo Weber. 2009. “The Retirement Consumption Puzzle: Evidence from a Regression Discontinuity Approach.” American Economic Review 99 (5): 2209–26.) are one paper of many that apply fuzzy regression discontinuity to retirement. In their paper, they look at consumption and how it changes at the point of retirement. Specifically, they want to know if retirement causes consumption to immediately drop.594 Finding such a drop is a “puzzle” to economists, as standard models predict that people would rather not experience big sudden drops in their consumption, and retirement is a case where they should be able to avoid that drop if they want to, by adjusting consumption before retirement. This is the kind of thing that feels like a very silly and unrealistic prediction on the part of economists but makes more sense the more you think about it. And if it does drop, why?

They look at data from Italy in 1993-2004, where they have information on when people become eligible for their pension. At the exact age of eligibility there’s a more than 30 percentage point jump in the proportion of men who are retired. That’s not a 0% to 100% jump (and thus is fuzzy), but a 30 percentage point jump is something we can work with! We can see the jump in one of the years they studied in Figure 20.4. There’s clearly something going on at that cutoff.

Figure 20.4: The Proportion of Household Heads Retired by Years to Pension Eligibility in 2000, from Battistin, Brugiavini, Rettore, and Weber (2009)

Graph showing the proportion of people who are retired by how far they are from pension eligibility. While some people retire before eligibility and fewer than 100% are retired afterwards, there's a jump of about 50 percentage points in the retirement rate from the year before eligibility to the year after.

They want to look at how consumption changes at that cutoff, just as you normally would with a non-fuzzy RDD. But if they just treated it as normal, their results would be way off. They’d be acting as though the cutoff increased the retirement rate by 100 percentage points, when it only did it by 30 percentage points. So they’d be expecting everyone to change their consumption habits when only 30 percent of people would, and would way understate the effects.

So they scale by the effect of the cutoff. Roughly,595 And we’ll get into more specifics in the How Is It Performed? section. they scale the effects they see by dividing them by the 30 percentage point jump (dividing by .3) to see how big the change would have been if everyone got treated.
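
To see how that scaling works with purely hypothetical numbers: if consumption fell by 3 percentage points at the eligibility cutoff, and the retirement rate jumped by 30 percentage points at that same cutoff, the implied effect of actually retiring would be

\[ \frac{-0.03}{0.30} = -0.10, \]

that is, a 10 percentage point drop for those nudged into retirement by crossing the cutoff.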

What do they find? They do indeed find a drop in consumption at the moment someone retires. But conveniently their regression discontinuity design doesn’t just identify the effect of retirement on overall consumption, it identifies the effect of retirement on just about anything, in this case specific types of consumption and the number of kids at home. They find that consumption drops resulting from retirement are largely due to things like using your extra leisure time to cook rather than order from a restaurant, not needing nice work clothes any more, and adult children tending to move out of the house. Not so bad as a drop in consumption might seem.

Surely there’s a catch. Well, sort of. As with anything we need to ask ourselves what assumptions need to be made for all of this to work, what the pros and cons are, and what exactly we end up estimating.

First, as with any other method where we isolate just the part of the variation where we can identify the effect, we need to assume that nothing fishy is going on in that variation. We’re estimating the effect of treatment, sure, but really we’re estimating the effect of the cutoff and attributing that (or part of that, if it’s fuzzy) to the treatment. So if anything else is changing at the cutoff, we’re in trouble. This assumption is the assumption that the outcome is smooth at the cutoff. That is, if treatment status actually hadn’t changed at the cutoff (if nobody near the cutoff had gotten treated, or everyone had, or everyone had the same chance of being treated), then there would be no jump or discontinuity to speak of.596 This is an assumption about a counterfactual - we can’t really check it. We just need to understand the context we’re studying very well and think carefully about what else might have changed at the cutoff.

Say we want to know the effect of a gifted-and-talented school program that requires a score of 95 out of 100 on a math test to get in. Does getting a 95 or more on your math test get you into both the gifted-and-talented program and the science-museum-field-trip program? Well, then you can’t use regression discontinuity to get just the effect of the gifted-and-talented program. No way to separate it from those field trips.

This is a problem even if there isn’t another explicit treatment happening at the cutoff. Maybe gifted-and-talented is the only program assigned with a 95-score cutoff, but it might just so happen that the proportion of tall people is way different for people with 94s as opposed to people with 96s. Even if getting a 94 vs. a 96 is completely random and the height difference was happenstance, it’s hard to distinguish the effect of gifted-and-talented from the effect of being tall.

Second, the whole crux of this thing is that people are basically randomly assigned to either side of the cutoff if we zoom in close enough.597 After correcting for “trend.” \(^,\)598 There are ways to think about regression discontinuity without it actually needing to be “random around the cutoff.” But this is such a simple way to think about it that it will remain the focus in this chapter. So we need to assume that the running variable is randomly assigned around the cutoff and isn’t manipulated. If teachers want more students in gifted-and-talented, so they take kids with scores of 94 and secretly regrade their tests so they get 96s instead, then suddenly it’s not really random whether you get a 94 or a 96. We can’t say that a comparison between the 94s and 96s identifies the effect any more!

Plus, since this is based on being able to zoom in far enough that assignment to either side of the cutoff is random, we need to have a running variable that’s precisely measured enough to let us zoom in to the necessary level. Sure, maybe the 94s and 96s are comparable. But what if the test data is only reported in 5-point bins, so we have to compare the 90-94s against the 95-100s? We might trust our random-assignment assumption a lot less then.

Moving on from the key assumptions, our third concern is about statistical power. Regression discontinuity by necessity focuses in on a tiny slice of the data - the people just around the cutoff. There simply aren’t that many people with a 94, 95, or 96. Alternately, in Figure 20.2 (c) you can see how much data we’re graying and throwing out. So, in comparing just the people on either side, we’re severely limiting our sample size and so making our estimate noisier. We can fix this by incorporating more data from people farther away from the cutoff, but this reintroduces the bias we were trying to get rid of.599 It’s a good idea to do a power analysis (as in Chapter 15) before doing a regression discontinuity to see if you’ll have sufficient observations to estimate anything precisely enough to care. In R and Stata, the rdpower package does this specifically for regression discontinuity. In R you can install the package as normal, but in Stata there’s a different rdpower package that does something else, so you’ll want to visit https://rdpackages.github.io/rdpower/ to get installation instructions.

Finally, if we do this all, what kind of treatment effect are we estimating (Chapter 10)? Regression discontinuity is fairly easy to figure out on this front. We’re using variation only from just around the cutoff. So we get the effect of treatment for people who are just around the cutoff. This is a local average treatment effect, getting the weighted average treatment effect for those just on the margin of being given treatment.

There are pros and cons to getting the local average treatment effect in regression discontinuity. As a pro, if we were to use the information from our study to expand treatment to more people, we’d probably do it by shifting the cutoff a bit to allow for more people. The newly-treated people are very similar to the at-the-cutoff people we just estimated the effect for. After all, knowing the effect of a gifted-and-talented program on someone who gets a 5 out of 100 on their math test isn’t too useful - they’re unlikely to ever be put in the program anyway. Knowing the local effect could be more useful than knowing the overall average treatment effect.

Of course, in other situations we might rather have the average treatment effect. In plenty of cases we’re looking at the effect of a treatment that could be applied to everyone, but we just got lucky enough to be able to identify the effect in a place where it was assigned based on a cutoff. For example, imagine a relief program that sends $1,600 relief checks to major portions of the populace, with an income cutoff of $75,000. If we were to analyze the effect of these checks using regression discontinuity, we’d get the effect for people who earn very close to $75,000. But we’ll never know what the effect would have been for people who earn $100,000. More concerning, we’d have no idea what the effect was for people who earn closer to $20,000 - they’re the ones who would probably need the most help. But even though the $20,000-earners got checks, we can’t use regression discontinuity to see the effect of checks for them since they’re not near the cutoff.

20.2 How Is It Performed?

20.2.1 Regression Discontinuity with Ordinary Least Squares

Thus far, I’ve written about regression discontinuity in terms of discontinuities but have left out the regression. Seems inappropriate, so let’s fix it. There are many ways to predict the outcome just on either side of the cutoff, but we’ll start with a very simple regression-based approach:

\[\begin{multline} \tag{20.1} Y = \beta_0 + \beta_1(Running-Cutoff) + \beta_2Treated + \\ \beta_3(Running-Cutoff)\times Treated + \varepsilon \end{multline}\]

This is a simple linear approach to regression discontinuity, where \(Running\) is the running variable, which we’ve centered around the cutoff by using \((Running - Cutoff)\). \((Running - Cutoff)\) takes a negative value to the left of the cutoff, zero at the cutoff, and a positive value to the right. We’re talking about a sharp regression discontinuity here, so \(Treated\) is both an indicator for being treated and an indicator for being above the cutoff - these are the same thing,600 They no longer will be once we get to estimating a fuzzy regression discontinuity. and you could easily write the equation with \(AboveCutoff\) instead of \(Treated\). The model is generally estimated using heteroskedasticity-robust standard errors, as one might expect the discontinuity and general shape of the line we’re fitting to exhibit heteroskedasticity in most cases.601 How exactly to do heteroskedasticity-robust standard errors does get a bit trickier here. You might see papers out in the wild using standard errors that are clustered on the running variable - we used to think this was a good idea but no longer do. I’ll get to that one in the How the Pros Do It section. We’ll ignore the issue of a bandwidth for now and come back to it later - this regression approach can be applied whether you use all the data or limit yourself to a bandwidth around the cutoff.

Notice the lack of control variables in Equation (20.1). In most other chapters, when I do this it’s to help focus your attention on the design. Here, it’s very intentional. The whole idea of regression discontinuity is that you have nearly random assignment on either side of the cutoff. You shouldn’t need control variables because the design itself should close any back doors. No open back doors? No need for controls. Adding controls implies you don’t believe the assumptions necessary for the regression discontinuity method to work, and makes the whole thing a bit suspicious to any readers.602 That said, sometimes the addition of controls can help improve the precision of the estimate by reducing the amount of unexplained variation. The process of adding controls can get a bit tricky if you’re doing local regression, though, as I’ll describe soon. There are methods for it, though, such as Calonico et al. (2019Calonico, Sebastian, Matias D Cattaneo, Max H Farrell, and Rocio Titiunik. 2019. “Regression Discontinuity Designs Using Covariates.” Review of Economics and Statistics 101 (3): 442–51.), which are implemented in pre-packaged regression discontinuity commands.

It’s not that adding controls is wrong, but you should think carefully about whether you need them if you think that treatment is completely assigned by the cutoff - where’s the back door you’re closing in that case? You’d have to think there are some back doors just around the cutoff (which isn’t impossible, just not a given). Controls may be more necessary in fuzzy regression discontinuity, where there definitely are determinants of treatment other than the cutoff, so maybe there are back doors to be worried about. And as always, you could still add controls to try to reduce the size of the residuals and so shrink standard errors.

We end up with two straight lines - one with intercept \(\beta_0\) and slope \(\beta_1\) to the untreated side of the cutoff, and another with intercept \(\beta_0 + \beta_2\) and slope \(\beta_1 + \beta_3\) to the treated side of the cutoff. Applying this regression to some simulated data gives us Figure 20.5.

Figure 20.5: Regression Discontinuity Estimated with Linear Regression with an Interaction

Scatterplot of data with a cutoff on the x-axis, with a clear jump at the cutoff. The data follows a parabola to the left of the cutoff and a slight upward curve to the right, but both sides are fit with a straight line. The fit on the left is bad.

Ah, but don’t forget - since we centered the running variable to be 0 at the cutoff, the intercept of each of these lines is our prediction at the cutoff. So our estimate of how big the jump is from one line to the other at the cutoff is simply the change in intercepts, or \((\beta_0 + \beta_2) - (\beta_0) = \beta_2\). \(\beta_2\) gives us our estimate of the regression discontinuity effect.
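
To make this concrete, here is a minimal sketch of Equation (20.1) estimated by OLS on made-up data. The variable names, the cutoff of 50, and the true jump of 10 are all invented for illustration - this is not the simulation behind Figure 20.5.

R Code

library(tidyverse); library(modelsummary)

set.seed(2000)

# Made-up data: running variable from 0 to 100, cutoff at 50,
# an underlying slope of .5, and a true jump of 10 at the cutoff
df <- tibble(running = runif(300, 0, 100)) %>%
    mutate(treated = running >= 50,
           centered = running - 50,
           outcome = .5*running + 10*treated + rnorm(300, sd = 5))

# Equation (20.1): an interacted regression on the centered running variable.
# The coefficient on treatedTRUE is the estimated jump at the cutoff
m <- lm(outcome ~ centered*treated, data = df)
msummary(m, stars = c('*' = .1, '**' = .05, '***' = .01), vcov = 'robust')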

What are some of the pros and cons of this approach? An obvious pro is that this is easy to do, but that’s probably not a great reason to pick an estimation method.

The real pros and cons are sort of wrapped up together, and they have to do with the fact that we’re using a fitted shape to predict the outcome at the cutoff. There are some clear benefits to this, especially in the form of statistical precision. Using a shape lets us incorporate information from points further from the cutoff to inform the trajectory that the outcome is taking as you get closer and closer to the cutoff. See that clear positive trend to the right of the cutoff in Figure 20.5? There’s a bit of a flat, or even downward-sloping, area in the data before the trend becomes obvious - it would be difficult to pick that trend up cleanly without incorporating the data further from the cutoff, which could lead us to overestimate the value at the cutoff coming from the right, as shown in Figure 20.6.

Figure 20.6: Regression Discontinuity Estimated with Linear Regression with an Interaction, both Without and With a Bandwidth Restriction

Scatterplot of data with a cutoff on the x-axis, with a clear jump at the cutoff. The data follows a parabola to the left of the cutoff and a slight upward curve to the right, but both sides are fit with a straight line. The fit on the left is bad. There are also straight lines fit using only the data near the cutoff, for which the fit is much better.

However, applying a fitted shape only works if we are applying the right fitted shape. If we pick the wrong one, as we clearly have in Figure 20.5 to the left of the cutoff, the prediction we get from our shape will be wrong.

That’s always the case when we use ordinary least squares and fitted shapes, of course. But the problem is especially bad here because fitted shapes tend to be at their most wrong at the edges of the available data. As you get to the very edge of your data, the prediction may just veer off in a strange direction because the shape you’ve chosen forces it to keep going in that direction. Again, we see this in Figure 20.6 on the left.

An obvious instinct is to just try a more flexible shape. We can do this using the exact same idea as Equation (20.1), but use some more flexible function for \((Running-Cutoff)\):

\[\begin{equation} \tag{20.2} Y = \beta_0 + f(Running-Cutoff, Treated) + \beta_2Treated + \varepsilon \end{equation}\]

where \(f()\) is some nonlinear function of \(Running-Cutoff\), with one version for \(Treated = 0\) and another for \(Treated = 1\). For example, this might be a second-order polynomial:

\[\begin{multline} \tag{20.3} Y = \beta_0 + \beta_1(Running-Cutoff) + \beta_2(Running-Cutoff)^2 + \\ \beta_3Treated + \beta_4(Running-Cutoff)\times Treated + \\ \beta_5(Running-Cutoff)^2\times Treated + \varepsilon \end{multline}\]

Equation (20.3) again produces two lines - a parabola to the left of the cutoff, and a parabola to the right. Like before, the coefficient on \(Treated\) (here \(\beta_3\)) shows us how the prediction jumps at the cutoff, giving us the regression discontinuity estimate.

However, we need to be careful with more flexible OLS lines. It’s hard to fix the “wrong shape” problem in RDD by making the shape more flexible, because more-flexible shapes make even weirder predictions and hard-left-turns-to-infinity than simpler shapes near the edges of the data. Figure 20.7 shows regression discontinuity estimated for the same data using (a) the second-order polynomial from Equation (20.3), as well as (b) a sixth-order polynomial and (c) a 25th-order polynomial. Adding all those polynomial terms, we get stranger and stranger results, with huge swings near the cutoff that don’t even seem to track the data all that well.

Figure 20.7: Regression Discontinuity with Different Polynomials

Four graphs using the same data from the previous two graphs. This time, the lines on each side of the cutoff are fit in four different ways. On the first graph, a second-order polynomial, which fits well. On the second, a sixth-order polynomial, which fits slightly better but also is a bit wiggly. On the third, an order-25 polynomial, which behaves erratically. On the fourth, straight lines estimated only using data near the cutoff.

Because the data is simulated I can tell you that the true effect is .15 and the true model is an order-2 polynomial. The linear model gives an effect of .29. The second, sixth, and 25th-order polynomial models give .17, .26, and… even if that 25th-order polynomial happened to get close to the true effect it would just be luck. Look how wild that curve is near the cutoff! Flexibility isn’t always so good.

Because of this problem, it’s not a great idea to go above the second-order term when performing regression discontinuity with ordinary least squares (Gelman and Imbens 2019Gelman, Andrew, and Guido W Imbens. 2019. “Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs.” Journal of Business & Economic Statistics 37 (3): 447–56.). If there’s a complex shape that needs fitting, instead try limiting the range of the data with a bandwidth and use a simpler shape. All those twists and curves tend to go away when you zoom in close with a bandwidth, anyway, and a straight line may be just fine. This can be seen in Figure 20.7 (d), which gives an effect of .19, as compared to the true effect of .15. It did just about as well as the order-2 polynomial, but without having to use the knowledge that the true underlying model is order-2.
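
One quick way to see this instability for yourself is to re-estimate the same interacted model with polynomials of different orders and watch how the estimate at the cutoff moves around. The sketch below uses made-up data with a true jump of .15 built in - it is not the simulation behind Figure 20.7.

R Code

library(tidyverse)

set.seed(3000)

# Made-up data with some curvature and a true jump of .15 at a cutoff of 0
df <- tibble(centered = runif(1000, -1, 1)) %>%
    mutate(treated = centered >= 0,
           outcome = centered + .5*centered^2 +
               .15*treated + rnorm(1000, sd = .3))

# Re-estimate the jump at the cutoff with different polynomial orders.
# raw = TRUE keeps the coefficient on treatedTRUE interpretable as
# the jump at centered = 0
rd_poly <- function(order) {
    m <- lm(outcome ~ poly(centered, order, raw = TRUE)*treated, data = df)
    coef(m)['treatedTRUE']
}
sapply(c(1, 2, 6), rd_poly)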

We can think of this “zoom-in-and-go-simple” approach as a local regression. A local regression is one that estimates the relationship between some predictor \(X\) and some outcome \(Y\) as normal, but allows that relationship to vary freely across the range of \(X\).603 Local regression is also known as a form of “non-parametric regression” since it does not force a given shape on the entire range of data. However, I will point out that it does force a shape on subsets of the data. So it’s more “less-parametric regression” if you ask me. Local regression of some kind is how most researchers choose to implement their regression discontinuity design, at least if they have a large sample.604 The flexibility and focus on the cutoff is the upside, but (as with nearly any time you toss out data or drop assumptions), by imposing less structure on the data you get less precision in your estimates. If the assumptions you’d need to impose would be very wrong, that’s a great tradeoff! But you’d better have lots of observations near the cutoff if you want to use local regression, or else the estimate isn’t going to be precise enough to use.

What local regression does in particular is: for each value of \(X\), say, \(X = 5\), it gets its estimate of \(Y\) by running its own special \(X = 5\) regression. This \(X = 5\) regression fits some sort of regression shape specific to \(X = 5\). How is it specific? When estimating the regression, it weights observations more heavily the closer they are to \(X = 5\). The specific form of weighting is called the kernel. Sometimes the kernel is a cutoff - the weight is 1 if you’re close and 0 if you’re too far away. Other times it’s more gradual - you’re weighted less and less the farther you get until finally, slowly, reaching zero. A very commonly-used kernel in regression discontinuity is the triangular kernel, which looks in Figure 20.8 just like it sounds - a full weight right at the value you’re looking at, and a linearly declining weight as you move away, until you get to 0 at a certain point. That point is the “bandwidth” - i.e., how far away you can get and still count. This draws a neat lil’ triangle.

Figure 20.8: Example of a Triangular Kernel Weight Function

Graph showing a triangular distribution, which takes the shape of a triangle with a peak when the x-axis is at 0, and which slopes down gradually on either side until it hits zero at +Bandwidth and -Bandwidth

Local regression takes a lot of computing firepower, but conceptually it is very simple. Just roll along that \(x\)-axis, making a bunch of new regressions along the way based on the observations closest at hand at the time. Then predict values from those regressions and that’s your curve.
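
To make "roll along the \(x\)-axis" concrete, here is a bare-bones, by-hand sketch of local linear regression with a triangular kernel. Everything here - the data, the .1 bandwidth, the evaluation grid - is made up for illustration; in practice you would use LOESS or the prepackaged regression discontinuity commands discussed later.

R Code

library(tidyverse)

set.seed(4000)

# Made-up data with a curvy relationship between x and y
df <- tibble(x = runif(500)) %>%
    mutate(y = sin(3*x) + rnorm(500, sd = .2))

# At each evaluation point x0, run a weighted regression in which
# observations near x0 count for more (triangular kernel weights)
local_fit <- function(x0, data, bw) {
    w <- pmax(0, 1 - abs(data$x - x0)/bw)
    m <- lm(y ~ x, data = data, weights = w)
    predict(m, newdata = tibble(x = x0))
}

# Roll along the x-axis, fitting a new little regression at each point
grid <- tibble(x = seq(.05, .95, .01)) %>%
    rowwise() %>%
    mutate(yhat = local_fit(x, df, bw = .1)) %>%
    ungroup()

# The string of predictions traces out the local regression curve
ggplot(df, aes(x, y)) + geom_point(alpha = .3) +
    geom_line(data = grid, aes(x, yhat))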

What kind of regression should be run at each point, though?

We might just take the average, in which case we end up with kernel regression. Kernel regression can give us problems in the case of regression discontinuity, especially if there’s any sort of trend in the outcome in the lead-up to the cutoff. Since it’s just taking the average, it tends to flatten out the trend, leading to an at-the-cutoff prediction that attributes some of the trend that’s really there to the jump at the cutoff. Figure 20.9 shows an exaggerated example of how this can occur. Notice that the prediction to the left of the cutoff is way lower than all the actual data, because it’s missing the trend upwards.

Figure 20.9: Regression Discontinuity with Means at the Cutoff

Graph showing raw data which follows a parabola on the left of a cutoff, and a slight curve on the right. The mean at the cutoff is estimated using the means near the cutoff, which misses the upward trajectory of the data on the left badly and the downward trajectory on the right.

More commonly, in application to regression discontinuity, we’ll do local regression, the most common form of which is LOESS, which we already discussed in Chapter 4 on Describing Relationships, and earlier in this chapter in Figure 20.2. Instead of taking a local mean, LOESS estimates a linear (“local linear regression”) or second-order polynomial (“local polynomial regression” or “local quadratic regression”) regression at each point.

The local regression approach allows for the relationship between \(X\) and \(Y\) to not be totally flat. Depending on how curvy it is, the linear or second-order polynomial function we’ve chosen might not be quite enough to capture the relationship. However, since we’ve zoomed in so far, it’s unlikely that anything will be that highly nonlinear over such a narrow range of the \(x\)-axis.

Of course, given our plans to use local regression, we don’t actually have to do anything special. While applying a local regression estimator across the whole range of the data will give us a great idea of what the overall relationship between running variable and outcome is, if we’re just interested in estimating regression discontinuity, well… we’re only really interested in two local points - just to the left of the cutoff, and just to the right. So if we estimate a regular-ol’ interacted linear regression like in Equation (20.1) (for local linear regression) or Equation (20.3) (for local polynomial regression), but with a bandwidth (or perhaps some weighting for a slow-fade-out kernel), we’ve got it.

Let’s code up some regression discontinuity! I’m going to do this in two ways. First I’m going to run a plain-ol’ ordinary least squares model with a bandwidth and kernel weight applied, with heteroskedasticity-robust standard errors. However, there are a number of other adjustments we’ll be talking about in this chapter, and it can be a good idea to pass this task off to a package that knows what it’s doing. I see no point in showing you code you’re unlikely to use just because we haven’t gotten to the relevant point in the chapter yet. So we’ll be using those specialized commands as well, and I’ll talk later about some of the extra stuff they’re doing for us.

For this example we’re going to use data from Government Transfers and Political Support by Manacorda, Miguel, and Vigorito (2011Manacorda, Marco, Edward Miguel, and Andrea Vigorito. 2011. “Government Transfers and Political Support.” American Economic Journal: Applied Economics 3 (3): 1–28.). This paper looks at a large poverty alleviation program in Uruguay which cut a sizeable check to a large portion of the population.605 The Plan de Atención Nacional a la Emergencia Social, or PANES. These were big payments, too - about $70 per month, which is about half what the typical beneficiary had been earning beforehand. On top of that, anyone with kids got a food card. There were some other goodies too. They are interested in whether receiving those funds made people more likely to support the newly-installed center-left government that sent them.

Who got the payments? You had to have an income low enough. But they didn’t exactly use income as a running variable; that might have been too easy to manipulate. Instead, the government used a bunch of factors - housing, work, reported income, schooling - and predicted what your income would be from that. Then, the predicted income was the running variable, and treatment was assigned based on being below a cutoff. About 14% of the population ended up getting payments.

The researchers polled a bunch of people near the income cutoff to check their support for the government afterwards. Did people just below the income cutoff support the government more than those just above?

The data set we have on government transfers comes with the predicted-income variable pre-centered so the cutoff is at zero. Then, support for the government takes three values: you think they’re better than the previous government (1), the same (1/2) or worse (0). The data only includes individuals near the cutoff - the centered-income variable in the data only goes from -.02 to .02.606 Some of the results here will differ slightly from the original paper so as to simplify the data-preparation process or highlight some potential things you could do that they didn’t.

Before we estimate our model, let’s do a nearly-compulsory graphical regression discontinuity check so we can confirm that there does appear to be some sort of discontinuity where we expect it.607 Have you noticed how many graphs there are in this chapter? RDD is a lovingly visual design. This is a “plot of binned means.” There are some preprogrammed ways to do this, like rdplot from the rdrobust package in R or Stata, or binscatter from the binscatter package in Stata, but this is easy enough that we may as well do it by hand and flex those graphing muscles.

For one of these graphs, you generally want to (1) slice the running variable up into bins (making sure the cutoff is the edge between two bins), (2) take the mean of the outcome within each of the bins, and (3) plot the result, with a vertical line at the cutoff so you can see where the cutoff is. Then you’d generally repeat the process with treatment instead of the outcome to produce a graph like Figure 20.3.

R Code

library(tidyverse)
gt <- causaldata::gov_transfers

# Use cut() to create bins, using breaks to make sure it breaks at 0
# (-15:15)*.02/15 gives 15 breaks from -.02 to .02
binned <- gt %>%
    mutate(Inc_Bins = cut(Income_Centered,
           breaks = (-15:15)*(.02/15))) %>%
    group_by(Inc_Bins) %>%
    summarize(Support = mean(Support),
    Income = mean(Income_Centered))
# Taking the mean of Income lets us plot data roughly at the bin midpoints

ggplot(binned, aes(x = Income, y = Support)) + 
    geom_line() + 
    # Add a cutoff line
    geom_vline(aes(xintercept = 0), linetype = 'dashed')

Stata Code

causaldata gov_transfers.dta, use clear download

* Create bins. 15 bins on either side of the cutoff from 
* -.02 to .02, plus 0, means we want "steps" of...
local step = (.02)/15
egen bins = cut(income_centered), at(-.02(`step').02)

* Means within bins
collapse (mean) support, by(bins)

* And graph with a cutoff line
twoway (line support bins) || (function y = 0, horiz range(support)), ///
 xti("Centered Income") yti("Support")

Python Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from causaldata import gov_transfers
d = gov_transfers.load_pandas().data

# cut at 0, and 15 places on either side
edges = np.linspace(-.02,.02,31)
d['Bins'] = pd.cut(d['Income_Centered'], bins = edges)

# Mean within bins
binned = d.groupby(['Bins']).agg('mean')

# And plot
sns.lineplot(x = binned['Income_Centered'],
          y = binned['Support'])
# Add vertical line at cutoff
plt.axvline(0, 0, 1)

A little extra theming gets us Figure 20.10.608 Specifically from the R version, as with all the graphs in this book I added the theme_pubr() theme from ggpubr, and some additional text options in the theme function. It’s very much worth your time and effort to learn how to use the graphing package of your choice to make graphs look nice. There’s definitely a break at the cutoff there, and some higher support to the left of the cutoff (treatment is to the left here since you need low income to get paid) that isn’t what we’d expect by following the trend on the right of the cutoff.

Figure 20.10: Government Support by Income with Policy Cutoff

Graph showing support for the government by income, with a cutoff at the income eligible for the individual subsidy. There's a sharp drop at the cutoff.

Now that we have our graph in mind, we can actually estimate our model. We’ll start doing it with OLS and a second-order polynomial, and then we’ll do a linear model with a triangular kernel weight, limiting the bandwidth around the cutoff to .01 on either side. The bandwidth in this case isn’t super necessary - the data is already limited to .02 on either side around the cutoff - but this is just a demonstration.

It’s important to note that the first step of doing regression discontinuity with OLS is to center the running variable around the cutoff. This is as simple as making the variable \(RunningVariable - Cutoff\), translated into whatever language you use. The second step would then be to create a “treated” variable \(RunningVariable < Cutoff\) (since treatment is applied below the cutoff in this instance). But in this case, the running variable comes pre-centered (\(IncomeCentered\)) and the below-cutoff treatment variable is already in the data (\(Participation\)) so I’ll leave those parts out.
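
In case your data doesn't arrive pre-centered, those two steps are only a couple of lines. Here is a tiny sketch with made-up data and the 135 blood-pressure cutoff from earlier in the chapter, just to show the idea before we turn to the actual estimation code.

R Code

library(tidyverse)

# Made-up data with an uncentered running variable
d <- tibble(running_var = c(120, 130, 134, 136, 140, 150))
cutoff <- 135

d <- d %>%
    mutate(centered = running_var - cutoff,
           # In the blood-pressure example treatment is above the cutoff;
           # flip the inequality if treatment is assigned below instead
           treated = running_var >= cutoff)
d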

R Code

library(tidyverse); library(modelsummary)
gt <- causaldata::gov_transfers

# Linear term and a squared term with "treated" interactions
m <- lm(Support ~ Income_Centered*Participation +
       I(Income_Centered^2)*Participation, data = gt)

# Add a triangular kernel weight
kweight <- function(x) {
    # We want a weight of 1 at x = 0 that falls to 0 at the bandwidth
    # of .01, so the weight drops at a rate of 1/.01 = 100
    # per unit of distance from the cutoff
    # (use the absolute value to go in either direction)
    w <- 1 - 100*abs(x)
    # if further away than .01, the weight is 0, not negative
    w <- ifelse(w < 0, 0, w)
    return(w)
}

# Run the same model but with the weight
mw <- lm(Support ~ Income_Centered*Participation, data = gt,
         weights = kweight(Income_Centered))

# See the results with heteroskedasticity-robust SEs
msummary(list('Quadratic' = m, 'Linear with Kernel Weight' = mw), 
    stars = c('*' = .1, '**' = .05, '***' = .01), vcov = 'robust')

Stata Code

causaldata gov_transfers.dta, use clear download

* Include the running variable, its square (by interaction with itself)
* and interactions of both with treatment, and robust standard errors
reg support i.participation##c.income_centered##c.income_centered, robust

* Create the triangular kernel weight. We want a weight of 1 at x = 0
* that falls to 0 at the bandwidth of .01, so the weight drops at a
* rate of 1/.01 = 100 per unit of distance from the cutoff
* (use the absolute value to go in either direction)
g w = 1 - 100*abs(income_centered)
* if further away than .01, the weight is 0, not negative
replace w = 0 if w < 0

reg support i.participation##c.income_centered [aw = w], robust

Python Code

import numpy as np
import statsmodels.formula.api as smf
from causaldata import gov_transfers
d = gov_transfers.load_pandas().data

# Run the polynomial model
m1 = smf.ols('''Support~Income_Centered*Participation + 
I(Income_Centered**2)*Participation''', d).fit()

# Create the kernel function
def kernel(x):
    # We want a weight of 1 at x = 0 that falls to 0 at the bandwidth
    # of .01, so the weight drops at a rate of 1/.01 = 100
    # per unit of distance from the cutoff
    # (use the absolute value to go in either direction)
    w = 1 - 100*np.abs(x)
    # if further away than .01, the weight is 0, not negative
    w = np.maximum(0,w)
    return w

# Run the linear model with weights using wls
m2 = smf.wls('Support~Income_Centered*Participation', d,
weights = kernel(d['Income_Centered'])).fit()

m1.summary()
m2.summary()

From this we get the results in Table 20.1. From the second-order polynomial model our estimate is that receiving payments increased support for the government by 9.3 percentage points - not bad!609 Using the specific methods from the original paper, their estimates were more like 11 to 13 percentage points. The version with the kernel weight shows a statistically insignificant effect of 3.3 percentage points. The first model is more plausible to me - the bandwidth was already fairly narrow since the data only included observations near the cutoff anyway, so limiting the data further isn’t necessary. You can also see the reduction in sample size from 1948 to 937 - this occurs because it’s not counting anyone who gets a weight of 0 as being a part of the sample. The potential for reduced bias (by taking fewer people farther from the cutoff) comes with the price of a reduced sample and less precision.

Table 20.1: The Effect of Government Payments on Support, Regression Discontinuity Estimates
                                        Quadratic     Linear with Kernel Weight
(Intercept)                             0.769***      0.819***
                                        (0.034)       (0.015)
Income_Centered                         −11.567       −23.697***
                                        (8.101)       (3.219)
Participation                           0.093**       0.033
                                        (0.044)       (0.021)
I(Income_Centered^2)                    562.247
                                        (401.982)
Income_Centered × Participation         19.300*       26.594***
                                        (10.322)      (4.433)
Participation × I(Income_Centered^2)    −101.103
                                        (502.789)
Num.Obs.                                1948          937
R2                                      0.036         0.041
Std.Errors                              Robust        Robust
* p < 0.1, ** p < 0.05, *** p < 0.01

Now that we have our by-hand version down, we can try things out using some prepackaged commands, both for estimating the model effects and for making regression discontinuity-style plots. For both R and Stata, there is the rdrobust package, which is a part of the “RD packages” universe that contains a bunch of useful tools I’ll be including throughout this chapter.610 See https://rdpackages.github.io/, and the many econometrics papers cited therein. Python versions of rdrobust and related packages are now available, but are newer than this book, which has not been fully updated to reflect them. Check the rdrobust website for updates and code examples.

The rdrobust function in the rdrobust package in both R and Stata does a few things for us. First, we don’t have to choose the bandwidth. It will do an optimal bandwidth selection procedure for us. More on how that works in the How the Pros Do It section.

Second, it will properly implement local regression with your polynomial order of choice (linear by default) and a triangular kernel.

Third, as implied by the name, it will apply heteroskedasticity-robust standard errors. Specifically, it uses a robust standard error estimator that takes into account the structure of the regression discontinuity problem. It looks for each observation’s nearest neighbors along the running variable, and uses that information to figure out the amount of heteroskedasticity and how to correct for it.

Fourth, it applies a bias correction. Methods for optimal bandwidth choice tend to select bandwidths that are wide, so as to help increase the effective sample size and improve precision. However, estimates of standard errors rely on our idea of what the distribution of the error term is. And those wide bandwidths futz up what those distributions look like. So there is a method here to correct for that bias, improving the estimate of the standard error. More on this in the How the Pros Do It section.

R Code

library(tidyverse); library(rdrobust)
gt <- causaldata::gov_transfers

# Estimate regression discontinuity and plot it
m <- rdrobust(gt$Support, gt$Income_Centered, c = 0)
summary(m)
# Note, by default, rdrobust and rdplot use different numbers
# of polynomial terms. You can set the p option to standardize them.
rdplot(gt$Support, gt$Income_Centered)

Stata Code

* If necessary: ssc install rdrobust
causaldata gov_transfers.dta, use clear download

* Run the RDD model and plot it. Note, by default, 
* rdrobust and rdplot use different
* numbers of polynomial terms. 
* You can set the p() option to standardize them.
rdrobust support income_centered, c(0)
rdplot support income_centered, c(0)

If you run this code you’ll find a fairly different rdrobust output as opposed to when we did it by hand. Results are insignificant and in the opposite direction of the original paper.611 Coefficients are positive, but it’s assuming that treatment is to the right of the cutoff, so this translates into a negative effect of being treated, which occurs here on the left of the cutoff. This comes down to rdrobust selecting a narrow bandwidth. If you force the bandwidth to be the entire range - which might be reasonable given how narrow the data’s range already is - you get results that are nearly identical to the quadratic-model results from Table 20.1. Making these decisions is hard! Just because we have a premade package doesn’t mean we can let it make all the decisions. Use that head of yours, and explore the options in the functions you use by reading the documentation.
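
As one example of taking those decisions back, rdrobust lets you set the bandwidth and the polynomial order yourself rather than accepting the defaults. In the sketch below, the bandwidth of .02 is simply the full range of this particular data - my choice for this check, not something the package recommends - and p = 2 matches the by-hand quadratic model.

R Code

library(rdrobust)
gt <- causaldata::gov_transfers

# Force the bandwidth to cover the full +/- .02 range of the data
# and use a second-order (quadratic) local polynomial
m_wide <- rdrobust(gt$Support, gt$Income_Centered, c = 0, h = .02, p = 2)
summary(m_wide)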

20.2.2 Fuzzy Regression Discontinuity

How many treatments are there really that are completely determined by a cutoff in a running variable? Some, certainly. But we live in the real world. Even in cases where a policy is supposed to be assigned based on a cutoff, surely there might be a few people above or below who manage to get the “wrong” treatment. Forgetting that, there are plenty of scenarios where the policy isn’t necessarily supposed to be assigned based on a cutoff, but still for some reason the chances of treatment jump significantly at a cutoff. Perhaps you wanted to evaluate the effect of holding a high school degree on some outcome. The chances that you hold a high school degree jump significantly in the summer following your 18th birthday. But there are quite a few people who graduate at 17, or 19, or not in the summer, or never graduate at all. There’s a jump in the treatment rate at that summer cutoff, but it’s not from 0% to 100%. We can see how the treatment rate might look across the running variable in Figure 20.11.

Figure 20.11: An Example of Treatment Rates in a Fuzzy Regression Discontinuity

Two graphs. The left, panel (a), shows a sharp regression discontinuity where treatment rates are 0% to one side of the cutoff and 100% to the other. The right, panel (b), shows a fuzzy regression discontinuity where treatment rates are nonzero and steadily rising to one side of the cutoff, then jump sharply at the cutoff, and then continue to rise on the other side of the cutoff.

I already covered the basic concept here in the How Does It Work section. But how can we actually implement a fuzzy regression discontinuity design?

We can figure it out by modifying the causal diagram we had in Figure 20.1. The only changes we need to account for are that there’s some way for \(RunningVariable\) and \(Treatment\) to be related besides the cutoff, which is why we see changes in the treatment rate at non-cutoff points in Figure 20.11, and that there might be some non-\(RunningVariable\) determinants of \(Treatment\).

Figure 20.12: A Causal Diagram that Fuzzy Regression Discontinuity Works For

A causal diagram in which Z causes Running Variable and Treatment and Outcome, Running Variable causes Outcome and Treatment, and Running Variable causes Above Cutoff causes Treatment causes Outcome

We get our modified causal diagram in Figure 20.12.612 In this diagram, \(Z\) might represent a bunch of different variables - it might even be different variables with arrows to \(RunningVariable\) and to \(Treatment\). The logic still holds up. What can we get out of this diagram?

Well, one thing to note is that the idea for identifying the effect is still basically the same. If we control for \(RunningVariable\), then there is no back door from \(AboveCutoff\) to \(Outcome\). By cutting out any variation driven by \(RunningVariable\) anywhere but the cutoff, we’re left only with variation that has no back doors and identifies the effect of \(Treatment\) on \(Outcome\).

The only real change is this: we can no longer simply limit the data to the area around the cutoff and call that a day on controlling for \(RunningVariable\). Doing that would lead us to understate the effect. If we’re only getting a jump in treatment rates of, say, 15 percentage points, then we should only expect to see 15% of the effect. We have to scale it back to the full size.

Thankfully this can be done by applying instrumental variables, as in Chapter 19, using basically the same regression discontinuity equations as for the sharp RDD, like Equation (20.1) or Equation (20.3). Except these equations now become our second-stage equations. Our first stage uses \(AboveCutoff\) as an instrument for \(Treated\) (and interactions with \(AboveCutoff\) as instruments for the interactions with \(Treated\)).

This scales the estimate exactly as we want it to. In its basic form, instrumental variables divides the effect of the instrument on the outcome by the effect of the instrument on the endogenous/treatment variable.613 Because of those interactions we’re doing something not quite the same as the basic form, but the logic carries over very well. In other words, we take the effect of being above the cutoff on the outcome (i.e., what we’d get from a typical sharp-regression-discontinuity model) and divide it by the jump in treatment rates, to account for the fact that being above the cutoff only led to a partial increase in the chances of treatment.
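As a sketch of that scaling logic, ignoring the interaction terms for a moment, the fuzzy estimate is just one jump divided by another:

\[\text{Fuzzy RDD estimate} = \frac{\text{Jump in } E[Outcome] \text{ at the cutoff}}{\text{Jump in } P(Treated) \text{ at the cutoff}}\]

So, to use made-up numbers, if the outcome jumps by 3 at the cutoff and the treatment rate jumps by .15, the scaled-up effect of treatment itself is 3/.15 = 20.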

Just like with sharp regression discontinuity, we can implement fuzzy regression discontinuity in code either using our standard tools (the same instrumental variables estimators we used in Chapter 19), or specialized regression-discontinuity specific tools. Let’s do both here.

For this, we’ll be using data from Fetter (2013Fetter, Daniel K. 2013. “How Do Mortgage Subsidies Affect Home Ownership? Evidence from the Mid-Century GI Bills.” American Economic Journal: Economic Policy 5 (2): 111–47.). Fetter also gives us an opportunity to see regression discontinuity performed in a slightly less-clean setting. Fetter’s main question of interest is how much of the increase in the home ownership rate in the midcentury US was due to mortgage subsidies given out by the government.

How does he get at this question? He looks at people who were about the right age to be veterans of major wars like World War II or the Korean War. Anyone who was a veteran of these wars received special mortgage subsidies.

What does this have to do with regression discontinuity? There’s an age requirement to join the military. If you were born one year too late to join the military to fight in the Korean War, then you won’t get these mortgage subsidies (or at least far fewer veterans were eligible). Born just in time? You can join up and get subsidies! So there’s a regression discontinuity here based on birth year. Born just in time, or a bit too late?

Of course, not everybody born in a given year joins the military.614 Some can’t, or wouldn’t be allowed. They weren’t taking a whole lot of women at the time, for example. But even among eligible men, not everyone would choose to or end up drafted. So the “treatment” of being eligible for mortgage subsidies would only apply to some people born at the right time. Treatment rates jump from 0% to… above 0% but not 100%. That’s fuzzy!

And why do I say this is “less-clean” than some of the other examples? Because this isn’t an explicitly designed cutoff. There’s a lot more choice in terms of who gets treated, and, heck, some people were probably getting into the military underage. We have to think very carefully here about whether we believe the assumptions necessary to use regression discontinuity. I think the design here is pretty reasonable, but my opinion on that is not as rock-solid as it might be for something for which the assignment of treatment had fewer moving parts.615 But limiting ourselves only to the cleanest analyses would leave us with a lot of stuff we couldn’t study at all.

The author also has to go through some checks and tests that someone with a simpler story might not have to. For example, while I won’t show it in the code below, he performs a placebo test by repeating the analysis using data from other eras, where veteran status wouldn’t have an effect on receiving mortgage subsidies, finding no effect at those times. You’re about to see the analysis include control variables for season of birth, race, and the US state in which they were born - controls generally aren’t necessary for regression discontinuity, but it makes sense to include them here. If the design itself won’t do all the work for you, then geez I guess you’re gonna have to work a little harder.

Let’s code up the analysis. First we’ll do it ourselves using the instrumental variables code we used in Chapter 19. We’re using the running variable quarter of birth (qob), which has been centered on the quarter of birth you’d need to have been born by to be eligible for a mortgage subsidy for fighting in the Korean War (qob_minus_kw). Crossing that cutoff affects whether you were a veteran of either the Korean War or World War II (vet_wwko). All of this is to see if veteran status (and its accompanying mortgage subsidies) affects your home ownership status (home_ownership) - remember, the goal here is looking for the effect of those mortgage subsidies. There are also controls for being white or nonwhite, and your birth state.

To keep things simple, we’ll apply a bandwidth before doing estimation, we’ll skip the kernel weighting, and we’ll just do a linear model. The code we did before for incorporating these things still works here, though, if you want to add them.

R Code

library(tidyverse); library(fixest); library(modelsummary)
vet <- causaldata::mortgages

# Create an "above-cutoff" variable as the instrument
vet <- vet %>% mutate(above = qob_minus_kw > 0)

# Impose a bandwidth of 12 quarters on either side
vet <- vet %>%  filter(abs(qob_minus_kw) < 12)

m <- feols(home_ownership ~
    nonwhite  | # Control for race
    bpl + qob | # fixed effect controls
    qob_minus_kw*vet_wwko ~ # Instrument our standard RDD
    qob_minus_kw*above, # with being above the cutoff
    se = 'hetero', # heteroskedasticity-robust SEs
    data = vet) 

# And look at the results
msummary(m, stars = c('*' = .1, '**' = .05, '***' = .01))

Stata Code

* We will be using ivreghdfe, 
* which must be installed with ssc install ivreghdfe
* along with ftools, ranktest, reghdfe, and ivreg2
causaldata mortgages.dta, use clear download

* Create an above variable as an instrument
g above = qob_minus_kw > 0

* Impose a bandwidth of 12 quarters on either side
keep if abs(qob_minus_kw) < 12

* Regress, using above as an instrument for veteran status.  
* Note that qob_minus_kw by itself doesn't need instrumentation, 
* so we separate it out. Usually I advise against creating 
* interaction terms by hand rather than with factor notation, 
* but in this case it's easiest to just make our own with g (generate)
g interaction_vet = vet_wwko*qob_minus_kw
g interaction_above = above*qob_minus_kw
ivreghdfe home_ownership nonwhite qob_minus_kw ///
    (vet_wwko interaction_vet = above interaction_above),  ///
    absorb(bpl qob) robust

* We can also use regular ol' ivregress but it will be slower
encode bpl, g(bpl_num)
ivregress 2sls home_ownership nonwhite qob_minus_kw i.bpl_num i.qob ///
    (vet_wwko interaction_vet = above interaction_above), robust

Python Code

import pandas as pd
from linearmodels.iv import IV2SLS
from causaldata import mortgages
d = mortgages.load_pandas().data

# Create an above variable as an instrument
d['above'] = d['qob_minus_kw'] > 0
# Apply a bandwidth of 12 quarters on either side
d = d.query('abs(qob_minus_kw) < 12')

# Create a control-variable DataFrame
# including dummies for bpl and qob
controls = pd.concat([d[['nonwhite']], 
                  pd.get_dummies(d[['bpl']])], axis = 1)
                  
d['qob'] = pd.Categorical(d['qob'])
# Drop one since we already have full rank with bpl
# (we'd also drop_first with bpl if we did add_constant)
controls = pd.concat([controls,
                  pd.get_dummies(d[['qob']], 
                  drop_first = True)], axis = 1)
# the RDD terms:
# qob_minus_kw by itself is a control
controls = pd.concat([controls, d[['qob_minus_kw']]], axis = 1)
                  
# we need interactions for the second stage
d['interaction_vet'] = d['vet_wwko']*d['qob_minus_kw']
# and the first
d['interaction_above'] = d['above']*d['qob_minus_kw']

# Now we estimate!
m = IV2SLS(d['home_ownership'], controls,
        d[['vet_wwko','interaction_vet']],
        d[['above','interaction_above']])

# With robust standard errors
m.fit(cov_type = 'robust')

These show that veteran status at this margin increases home ownership rates by 17 percentage points. Not bad. We could also apply the exact same regression discontinuity plot code from last time (there’s no special “fuzzy” plot, although now we’d call the typical regression discontinuity plot the “second stage” plot), and also apply that plot code using the treatment variable as an outcome (the “first stage”), to make sure that the assignment variable is working as expected. This gives us Figure 20.13.616 Another way to think about it is grabbing hold of both graphs at the top and bottom and pulling them taller until the jump in Figure 20.13 (a) goes from 0% to 100%. However big the resulting jump in Figure 20.13 (b) ends up being, that’s the fuzzy RDD estimate - it’s how much of a jump you’d expect if everybody went from untreated to treated (0% to 100%). It looks good! We definitely see a sharp change in the rate of eligible veterans around the cutoff, and we see a discontinuity in the rate of home ownership at the same point. The fuzzy RDD estimate is the jump at the cutoff in Figure 20.13 (b) divided by the jump at the cutoff in Figure 20.13 (a).

Two graphs, the first showing the relationship between the difference between birth quarter and the birth quarter necessary to be a veteran eligible for mortgage subsidies, and the rate of being a veteran eligible for subsidy. There is a small downward relationship before the cutoff, then a sharp drop, then it's near zero and declining after the cutoff. The graph on the right has a similar shape but with a smaller drop, and has home ownership rates on the y-axis.

Figure 20.13: Eligibility for Mortgage Subsidies for being a Korean War Veteran and Home Ownership from Fetter (2013)
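If you want to produce something like Figure 20.13 yourself, here’s a minimal sketch using the same data and 12-quarter window as above - the only difference between the two panels is which variable plays the role of the outcome:

R Code

library(tidyverse); library(rdrobust)
vet <- causaldata::mortgages %>%
    filter(abs(qob_minus_kw) <= 12)

# First stage: does the rate of subsidy-eligible veteran status jump at the cutoff?
rdplot(vet$vet_wwko, vet$qob_minus_kw, c = 0)
# Second stage: does home ownership jump at the same point?
rdplot(vet$home_ownership, vet$qob_minus_kw, c = 0)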

Next we can, as before, use the rdrobust function in R and Stata, to do local polynomial regression, kernel weighting, easy graphs, all that good stuff. We just need to use vet_wwko as our “treatment status indicator” and give that to the fuzzy option.

R Code

library(tidyverse); library(rdrobust)
vet <- causaldata::mortgages

# It will apply a bandwidth anyway, but having it
# check the whole bandwidth space will be slow. So let's
# pre-limit it to a reasonable range of 12 quarters
vet <- vet %>%
    filter(abs(qob_minus_kw) <= 12)

# Create our matrix of controls
controls <- vet %>%
    select(nonwhite, bpl, qob) %>%
    mutate(qob = factor(qob))
# and make it a matrix with dummies
conmatrix <- model.matrix(~., data = controls)

# This is fairly slow due to the controls, beware!
m <- rdrobust(vet$home_ownership,
              vet$qob_minus_kw,
              fuzzy = vet$vet_wwko,
              c = 0,
              covs = conmatrix)

summary(m)

Stata Code

* This uses rdrobust, which must be installed with ssc install rdrobust
causaldata mortgages.dta, use clear download

* It will apply a bandwidth anyway, but having it
* check the whole bandwidth space will be slow. So let's
* pre-limit it to a reasonable range of 12 quarters
keep if abs(qob_minus_kw) <= 12

* And run rdrobust
rdrobust home_ownership qob_minus_kw, c(0) fuzzy(vet_wwko) covs(nonwhite bpl qob)

It’s not included in the code, but we could easily apply rdplot just like we did last time (and maybe apply it with the treatment as the dependent variable as well, just to make sure it looks like we expect).

The rdrobust estimate is quite different from the linear instrumental variables estimate we did earlier, with veteran status increasing home ownership rates by 51.8 percentage points, as opposed to 17.0! That’s quite a difference. It’s useful to know when estimation methods differ like this. Which should we trust? Well, let’s think through why they differ: rdrobust uses a second-order polynomial rather than a linear regression, it uses local regression, and it uses a narrower bandwidth (it picks 3.39 birth quarters on either side of the cutoff, rather than 12).

Our first step in deciding which to trust would be to ask whether one of the models is making assumptions that are much more likely to be accurate. These are the sorts of things you’d have to defend when describing your research design. If we think linearity is reasonable, for example, then the linear model might be less noisy and give a more precise estimate than the second-order polynomial. Does Figure 20.13 look linear? If we think that the wider bandwidth is likely to contain too many non-comparable people, then we might think the narrower bandwidth is better. Do we think 12 quarters on either side of the cutoff is too far away to be comparable?

Often these days, researchers performing regression discontinuity will see how sensitive the estimate is to different bandwidths, running the model many times and reporting how the effect changes, letting the reader of the study think about how close to the cutoff things should be limited.617 In general, for any parameter you have to pick for your analysis, like a bandwidth, it’s not a terrible idea to try a few different ones and report how the results change. Maybe it doesn’t change much, which is reassuring. Or maybe it does, and that’s still good to know, and honest to report. The Bana, Bedard, and Rossin-Slater (2020) paper I discuss in the Regression Kink section does this.

20.2.3 Placebo Tests

The astonishing thing about regression discontinuity is that it closes all back doors, even the ones that go through variables we can’t measure. That’s the whole idea - since we’re isolating variation in such a narrow window of the running variable, it’s pretty plausible to claim that the only thing changing at the cutoff is treatment, and by extension anything that treatment affects (like the outcome).

Even more convenient is that this extremely-nice result from regression discontinuity gives us a fairly straightforward way to test if it’s true. Simply run your regression discontinuity model on something that the treatment shouldn’t affect. Anything we might normally use as a control variable should serve for this purpose.618 As long as it’s not something that treatment should affect. But then, we know from Chapter 8 that we wouldn’t usually want to control for something treatment would affect anyway. If we do find an effect, our original regression discontinuity design might not have been quite right. Perhaps things weren’t as random at the cutoff as we’d like, for some reason.

There’s not too much to say here. The equations, code, everything is the exact same for this test as for regular regression discontinuity. We just swap out the actual outcome variable for some other variable that the treatment shouldn’t affect. In many cases we can try it with a long list of variables that might serve as control variables.
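As a minimal sketch of what that looks like in R - the outcome name Age here is just a placeholder, so swap in whichever control-type variables your data actually contains:

R Code

library(rdrobust)
gt <- causaldata::gov_transfers

# Same running variable and cutoff as the real analysis, but with a
# variable that treatment shouldn't affect standing in as the "outcome."
# Age is a placeholder name; substitute your own control variables.
placebo <- rdrobust(gt$Age, gt$Income_Centered, c = 0)
summary(placebo)
# We're hoping for an estimate near zero and statistically insignificant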

Figure 20.14 shows this test with two variables from the Government Transfers and Political Support paper. No effects here! Exactly what we were hoping for.

Two graphs showing the mean age in the household and the mean education in household by income, relative to the income necessary to be eligible for subsidy. Neither jumps at the cutoff.

Figure 20.14: Performing Regression Discontinuity with Controls as Outcomes for a Placebo Test

One important thing to keep in mind with these placebo tests is that, since you can run them on a long list of potential placebo outcomes, you’re likely to find a few nonzero effects just by random chance. It happens. If you test a long list of variables and find a few differences, that’s not a fatal problem with your design. In these cases, you may want to add the variables with the failed placebo tests to the model as control variables. Adding controls can be tricky if you’re using local regression, but there are methods for doing it (Calonico et al. 2019Calonico, Sebastian, Matias D Cattaneo, Max H Farrell, and Rocio Titiunik. 2019. “Regression Discontinuity Designs Using Covariates.” Review of Economics and Statistics 101 (3): 442–51.), and pre-packaged regression discontinuity commands often incorporate them.

20.2.4 The Density Discontinuity Test

For regression discontinuity to work, we need a situation where people just on either side of the cutoff are effectively randomly assigned to treatment. That won’t happen if someone’s thumb is on the scale!

There are two ways manipulation could happen. First, whoever (or whatever) is in charge of setting the cutoff value might do so with the knowledge of exactly who it will lead to getting treated. Imagine the track and field coach running tryouts who conveniently decides that anyone on the team must run a mile in 5 minutes and 37 seconds or better just after his son clocks in at 5 minutes and 35 seconds.

Second, individuals themselves likely have some control over their running variable. Sometimes they have direct control - if my local supermarket offers me a discounted price on tuna if I buy 12 or more cans, then I can choose exactly what my running variable (number of cans of tuna I buy) is, and it’s not really plausible to say that someone buying 12 cans is comparable to someone buying 11 so I can identify the effect of the discount.

More often, we have indirect control - if I’m trying out for that track and field team, I can decide to take it easy or push myself really hard to run faster. Or maybe the person keeping time really likes me and decides to shave a few seconds off of the time they write down for me. This is a problem if we have indirect control and also know what the cutoff is. If I know the cutoff is 5:37 and I usually run a 5:40, I know I have to push extra hard to get a chance at the team and so will try very hard, whereas someone who normally runs 4:30 would just take it easy. Or maybe the person keeping time might see that I actually ran a 5:38, and figure I came so close it wouldn’t hurt to cheat a little and write down 5:36 instead so I can get on the team. The person who runs 6:30 isn’t likely to get a whole minute chopped off, though, that would be too much to ask.

In all of these cases, we want to be sure we understand how the running variable is determined. Are we in a cans-of-tuna situation? A slyly-choose-the-cutoff situation? A situation where we know the cutoff and may manipulate our behavior for it? For any of these, we may be worried that manipulation of the cutoff, or of the running variable near the cutoff, makes people on either side incomparable. If my friendly run-timer shaves off a few seconds for me so I get on the team, but not for Denny who also runs a 5:38 but the run-timer doesn’t like,619 Denny probably had it coming. then treatment assignment wasn’t random, it was based on who the run-timer liked, which has all sorts of back doors.

In the case of indirect control we do have a test we can perform to check whether manipulation seems to be occurring at the cutoff. Conceptually it’s very simple. We just look at the distribution of the running variable around the cutoff. It should be smooth, as we might expect if the running variable were being randomly distributed without regard for the cutoff.

But if we see a distribution that seems to have a dip just to one side of the cutoff, with those observations sitting just on the other side, that looks a lot like manipulation. If you looked at the run times at track and field tryouts and there was only Denny with a time anywhere from 5:38 to 5:45, but ten people just a second or two below the cutoff, that looks pretty fishy! That run-timer seems to be up to no good.

The idea of putting this intuition into action comes from McCrary (2008McCrary, Justin. 2008. “Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test.” Journal of Econometrics 142 (2): 698–714.), although there have been improvements to the basic idea since then, with better density estimation and power. The instructions in the code below are for Cattaneo, Jansson, and Ma (2020Cattaneo, Matias D, Michael Jansson, and Xinwei Ma. 2020. “Simple Local Polynomial Density Estimators.” Journal of the American Statistical Association 115 (531): 1449–55.).

The plan is this:

  1. Calculate the density distribution of the running variable
  2. Allow that density to have a discontinuity at the cutoff
  3. See if there is a significant discontinuity at the cutoff
  4. Also, plot the density distribution and look at it, and see if it looks like there’s a discontinuity

A big discontinuity is not a good sign. It implies manipulation.

We can code up a density test using pre-packaged commands, at least in R and Stata. For Python we will be a bit more on our own. The code will still use the example data from Government Transfers and Political Support.

Before running our code we can ask ourselves about the potential for manipulation here. The government assigned treatment on the basis of not just one characteristic, but a sum of a whole bunch of things. It determined that process before knowing exactly who would qualify (making it hard for the government to manipulate). The whole process wasn’t fully obvious (making it hard for people to manipulate), and it would be difficult to know beforehand exactly how close to the cutoff you were (making it hard to know if you should bother trying to manipulate).

So it seems unlikely that this particular study will show evidence of manipulation. But you never know! Let’s see. This test uses a slightly expanded version of the data including people who weren’t surveyed at the right time to be included in the study, and those outside the bandwidth used in estimation (so we limit to the bandwidth ourselves).

R Code

library(tidyverse); library(rddensity)
gt <- causaldata::gov_transfers_density  %>%
   filter(abs(Income_Centered) < .02)

# Estimate the discontinuity
gt %>%
    pull(Income_Centered) %>%
    rddensity(c = 0) %>%
    summary()

Stata Code

* If necessary, findit rddensity and install the rddensity package
causaldata gov_transfers_density.dta, use clear download

* Limit to the bandwidth ourselves
keep if abs(income_centered) < .02
* Run the discontinuity check
rddensity income_centered, c(0)

As expected, we find no statistically significant break in the distribution of income at the cutoff. Hooray!

Of course, we are more interested in seeing whether our assumption of no manipulation is plausible than we are in seeing whether we can reject it to a statistically significant degree. Does it pass the sniff test? We can check this by plotting the distribution of the (binned) running variable, as in Figure 20.15. It certainly doesn’t look like there’s any sort of discontinuity in the distribution at the cutoff. You can make a very similar graph to this one with rdplotdensity in the rddensity package in R. In Stata or Python you can follow the steps for the binned scatterplot from before, except instead of getting the mean of the outcome for each bin, get the count of observations within each bin. Then plot with twoway bar in Stata or seaborn.barplot in Python.
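Here’s a minimal sketch of that graph in R; rdplotdensity takes the rddensity result plus the raw running variable:

R Code

library(tidyverse); library(rddensity)
gt <- causaldata::gov_transfers_density %>%
    filter(abs(Income_Centered) < .02)

# Estimate the density discontinuity, then plot the binned density
# on either side of the cutoff
dens <- rddensity(gt$Income_Centered, c = 0)
rdplotdensity(dens, gt$Income_Centered)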

Figure 20.15: Distribution of the Binned Running Variable

A histogram of the density of the running variable, with a line indicating the treatment cutoff. There does not appear to be a jump in density at the cutoff.

20.3 How the Pros Do It

20.3.1 Regression Kink

Regression discontinuity looks for a change in the mean outcome at the cutoff. Why not look for a change in something else? Why not indeed! You could theoretically look for a change in just about anything. Why not a change in the standard deviation of the outcome? Or the median? Or an \(R^2\) of a regression of the outcome on a whole bunch of predictors?620 Each of these would require some careful thought about how exactly to estimate them, but conceptually they should work. I went looking to see if a finance paper has ever used the standard deviation version of regression discontinuity and was surprised not to find one. There’s an opportunity for you finance students out there. (Or perhaps I just didn’t look hard enough.)

The most common “other thing” to look at, though, is a change in the slope of the relationship between the outcome and the running variable. The treatment administered at the cutoff doesn’t make the outcome \(Y\) itself change/jump - instead, the treatment changes the strength of the relationship between \(Y\) and \(RunningVariable\). The resulting graph might look something like Figure 20.16 - no jump at the cutoff but there’s sure something happening.

Figure 20.16: A Regression Kink on Simulated Data

A scatterplot with a line fit on top, and a break. There's a clear upward slope to the left of the cutoff, and a clear downward slope to the right, but the fit lines meet each other at the cutoff.

Why might a treatment change the relationship? There are two main ways it might pop up. One might be that a treatment literally changes the relationship. For example, say we were running a regression discontinuity with time as our running variable,621 In other words, an event study as in Chapter 17. The design mentioned here would be ill-advised as it doesn’t really satisfy all the event study assumptions from Chapter 17, but it still works to get the idea across. and the amount of employment in tech companies as the outcome. Then, at a particular time (our cutoff), a new policy goes into place that encourages more investment in tech companies. We wouldn’t see tech employment rise right away (no jump), but we’d expect tech employment to start increasing more quickly than it did before (increase in the slope).

A more common application of regression kink is when treatment itself has a kink, like Figure 20.16 but with “Treatment level/rate” on the \(y\)-axis instead of “Outcome.” Lots of government policies work this way. Take unemployment insurance, for example, which is a common regression kink application. I’ll cite Card et al. (2015Card, David, David S Lee, Zhuan Pei, and Andrea Weber. 2015. “Inference on Causal Effects in a Generalized Regression Kink Design.” Econometrica 83 (6): 2453–83.) as one example of many, picking it largely because I was already going to cite it for another reason below. They look at data from Austria, where unemployment insurance benefits are 55% of your regular earnings, up to the point where you hit the maximum amount of generosity, and then you get no more. So your regular earnings (running variable) positively affect the amount of unemployment insurance you get (treatment), up to the cutoff, at which point the effect of regular earnings on your unemployment insurance payment becomes zero.

So if you think, for example, that more-generous unemployment benefits make people take longer to find a new job, then you should expect to find a positive relationship between your regular earnings (running variable) and time-to-find-a-new-job (outcome) up to the point of the cutoff, and then it should be flat after the cutoff. That’s exactly what they found.

One example of treatment-has-a-kink regression kink design is in Bana, Bedard, and Rossin-Slater (2020Bana, Sarah H, Kelly Bedard, and Maya Rossin-Slater. 2020. “The Impacts of Paid Family Leave Benefits: Regression Kink Evidence from California Administrative Data.” Journal of Policy Analysis and Management 39 (4): 888–929.). They look at the impact of paid family leave in California. Paid family leave is a policy where pregnant mothers can take an extended paid time off of work so they can, you know, give birth and care for a newborn without having to reply to emails. In California, the state pays 55% of regular earnings up to a maximum benefit amount. So the paid family leave payment (treatment) increases with regular earnings (running variable), until you get to the level of earnings (cutoff) where you’re being paid the maximum benefit amount. After that, additional regular earnings don’t increase your family leave payment. The setup is very similar to the unemployment insurance case I discussed from Card et al. (2015Card, David, David S Lee, Zhuan Pei, and Andrea Weber. 2015. “Inference on Causal Effects in a Generalized Regression Kink Design.” Econometrica 83 (6): 2453–83.), down to the 55% reimbursement rate, coincidentally.

The first thing to check is whether or not they actually see a change in the slope of treatment at the cutoff, and they definitely do. Figure 20.17 shows the average amount of paid family leave received in the sample based on the base-period earnings that their benefits were figured on. Even without a regression line slapped on top of it, that looks like a change in slope to me.

Figure 20.17: Paid Family Leave Benefits (Treatment) and Pre-Leave Earnings (Running Variable) from Bana, Bedard, and Rossin-Slater (2020)

Graph showing log weekly maternal leave benefits by pre-leave income. Benefits climb linearly until they hit a cutoff and then are flat.

Next we can replace treatment on the \(y\)-axis with some outcomes to see whether those outcomes also change slope. If they do, that’s evidence of an effect.622 Then, you can divide the change in slope you see in the outcome by the change in slope you see in treatment, i.e., run instrumental variables, to get the effect of treatment. They look at a number of outcomes in their paper, but I’ve specifically picked out two: how long the mothers stay on family leave, and whether they use family leave again in the next three years, conditional on going back to work. I’ve picked these two because you can see an effect on one of them, but no effect on the other, shown in Figure 20.18.

Two graphs with earnings relative to the maximum-benefits cutoff on the x-axis, a cutoff at the maximum, and straight lines fit on either side of the cutoff. The first graph has length of leave on the y-axis and shows no break or change in slope at the cutoff. The second has probability of claiming leave on the y-axis, and does show a change in slope from positive to negative at the cutoff.

Figure 20.18: Log Leave Duration or Proportion Claiming Leave Again (Outcomes) and Pre-Leave Earnings (Running Variable) from Bana, Bedard, and Rossin-Slater (2020)

What do we see in Figure 20.18? On the left we see basically nothing. No change in slope at the cutoff at all! This tells us that additional family leave payments didn’t affect how long the family leave ends up being - if it did, we’d expect the slope to change when the payments stopped increasing. In contrast, on the right, there’s a big change in slope at the cutoff. This tells us that bigger leave payments do increase the chances of claiming family leave again (i.e., having another kid), since the chance of doing so is increasing while benefits are increasing, but then that stops as soon as the benefit increases stop.

The graph on the right doesn’t just change slope though, it jumps! What can we make of that? Uh-oh, that might be a cause for concern. Treatment didn’t jump, it just changed slope. So if the outcome jumps, rather than just changing slope, that implies that an assumption may be wrong.

Or at least it would be if that’s what the original paper actually found. In the original paper there was no jump at all, just a nice change in slope. I just did regular ol’ straight-lines-on-either side, but they did some stuff that was a little fancier.623 Their model doesn’t actually allow for the jump, forcing it away. But they include several other details in estimation that make that assumption reasonable. In any case, the jump isn’t big enough to be convincingly nonzero. I just included it myself to have a reason to talk about this, and to emphasize the importance of gut-checking your results.

Actually performing a regression kink isn’t too hard once you have the concept of regression discontinuity down. You’re just adding one layer of complexity on top of that. Most of the stuff I already covered still applies here - kernels, bandwidths, polynomials, local regression, and whatnot. The same placebos largely apply - there shouldn’t be a change in the slope if you estimate the regression kink model with a control variable as the outcome, for example, and you’ll want to make sure the running variable isn’t discontinuous at the cutoff. The code doesn’t change much - just set the deriv (which “deriv”ative of the regression equation do you want) option to 1 in your rdrobust code in either R or Stata.
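Here’s a bare-bones sketch on simulated data, just to show where the deriv option goes. The data-generating process is made up, with a slope of 1 below the cutoff and -0.5 above it (so the true change in slope is -1.5):

R Code

library(rdrobust)
set.seed(1000)

# Simulate a kink at 0: slope of 1 to the left, -0.5 to the right, no jump
running <- runif(2000, -1, 1)
outcome <- running - 1.5*running*(running > 0) + rnorm(2000, 0, .3)

# deriv = 1 targets the change in the first derivative (the slope) at the cutoff;
# p = 2 fits a local quadratic, one order above the derivative we care about
m_kink <- rdrobust(outcome, running, c = 0, deriv = 1, p = 2)
summary(m_kink)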

Even the underlying regression equation isn’t changing much. The idea is that you’re still just estimating a line to the left and right of the cutoff, just like way back in Equation (20.1):

\[\begin{multline} \tag{20.4} Y = \beta_0 + \beta_1(Running-Cutoff) + \beta_2Treated + \\ \beta_3(Running-Cutoff)\times Treated + \varepsilon \end{multline}\]

There is sometimes a difference in the equation: some researchers will allow a slope change but not allow any sort of jump, i.e., dropping \(Treated\) by itself from the model, setting \(\beta_2 = 0\). Dropping \(Treated\) helps reduce the influence of noise around the cutoff (and regression kink can use the noise-reducing help), although if there actually should be a jump there you wouldn’t notice it.624 You could estimate it once allowing for the jump to see if it’s there, and if it’s not, estimate it without the jump to improve precision. The other difference is that now, instead of focusing on the coefficient representing the jump, \(\beta_2\) (if it’s even still in the model), you’re instead focusing on the coefficient representing the change in slope, \(\beta_3\).

Of course, if you’re using a setting where you’ve got a kink because the treatment level has a kink, you’d be using the cutoff as an instrument for treatment just like you did before with fuzzy regression discontinuity.625 Although now there’s an additional level of “fuzziness” to consider - what if the kink in treatment itself isn’t perfectly applied? There are notes on this in Card et al. (2015Card, David, David S Lee, Zhuan Pei, and Andrea Weber. 2015. “Inference on Causal Effects in a Generalized Regression Kink Design.” Econometrica 83 (6): 2453–83.), cited above. Told you I’d cite it down here.

So is anything else different? Some things, sure. Optimal bandwidth selection procedures are different, for one. You also might be more likely to use a uniform kernel rather than a triangular one (which conveniently makes calculation way easier - just drop observations outside the bandwidth and go!) (Card et al. 2017Card, David, David S Lee, Zhuan Pei, and Andrea Weber. 2017. “Regression Kink Design: Theory and Practice.” In Advances in Econometrics, 38:341–82. Bingley: Emerald Publishing Limited.). See the “uniform” option for kernel in rdrobust. Also, as you might expect, since we’re interested in the slope now instead of just a jump in the outcome, failing to properly model a nonlinear relationship is more of a problem here than for regression discontinuity (Ando 2017Ando, Michihito. 2017. “How Much Should We Trust Regression-Kink-Design Estimates?” Empirical Economics 53 (3): 1287–1322.).

One big difference to watch out for is that the statistical power issue, which was already there with regression discontinuity, gets a lot more pressing. Estimating a change in relationship (slope) precisely takes a lot more observations than estimating a jump in a mean. You’re trying to estimate that slope only using the observations you have right around the cutoff. You’d better hope you have a lot of observations near that cutoff.

20.3.2 Cutoffs Cut Off

Sharp cutoffs, fuzzy cutoffs, slope-changing cutoffs. Who can handle all these cutoffs? I hope it’s you because you’re about to be buried in cutoffs. Of course there’s such a thing as a multi-cutoff design.

And if that wasn’t enough, there are actually several different kinds of multi-cutoff designs. Multi multi-cutoff designs. This is exhausting. Let’s talk about them, at least briefly.

Sometimes, cutoffs are a moving target. A given treatment might have a cutoff that’s different depending on where it is, who you are… you name it! This is fairly common. For example, a college admissions office might have one entrance-exam cutoff score for most students, and a different one for students applying for positions on the major sports teams. Or, speaking of exams, in the US if you want a GED (an alternative to a high school degree) you have to take an exam and get above a certain minimum score. But the cutoff score is different for each state. Or maybe you want to use regression discontinuity to analyze elections where third parties are prevalent. In some elections you might need 50.1% of the vote to win, but in others you might be able to win with, say, 42.7%.

Or maybe there is only one cutoff, but it changes over time. The Bana, Bedard, and Rossin-Slater (2020) paper on paid family leave from the last section is an example of this. The quarterly income at which you max out your family leave payments went up over time, from a little under $20,000 in 2005 to a little over $25,000 in 2014.

So the cutoff changes by group, or across regions, or across time, but presumably we want to gather all that data together - observations are at a premium in low-powered regression discontinuity, after all. What can we do?

The most common approach is to simply center the data and pretend there’s nothing different. As you normally would, estimate your model using \(RunningVariable - Cutoff\), but simply allow each person’s \(Cutoff\) to be different depending on their scenario. Easy fix. This is exactly what Bana, Bedard, and Rossin-Slater did.
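A minimal sketch of that centering step on simulated data (the groups, cutoffs, and variable names here are made up): give each observation its own cutoff, subtract it, and then run your usual regression discontinuity on the centered running variable.

R Code

library(tidyverse); library(rdrobust)
set.seed(1000)

# Two groups facing different cutoffs on the same running variable
df <- tibble(group = sample(c('A','B'), 5000, replace = TRUE),
             running = runif(5000, 0, 100)) %>%
    mutate(cutoff = ifelse(group == 'A', 40, 60),
           treated = running >= cutoff,
           outcome = .02*running + 2*treated + rnorm(5000))

# Center on each observation's own cutoff, pool, and estimate as usual
df <- df %>% mutate(centered = running - cutoff)
m <- rdrobust(df$outcome, df$centered, c = 0)
summary(m)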

But is that all we can do? No. Centering the data works, but what exactly it’s working on is a little different, especially once you consider treatment effect averages (Cattaneo et al. 2016Cattaneo, Matias D, Rocío Titiunik, Gonzalo Vazquez-Bare, and Luke Keele. 2016. “Interpreting Regression Discontinuity Designs with Multiple Cutoffs.” The Journal of Politics 78 (4): 1229–48.). Regular regression discontinuity gives us a local average treatment effect - the average treatment effect among those just around the cutoff. So what do we get if we have multiple cutoffs? It’s not exactly the average of those local effects. Instead you get weighted averages of those effects, weighted by how many observations are close to the respective cutoffs. That’s a bit more complex than just “average of those at the cutoff,” which may impact how useful your results would actually be for policy.

Additionally, by lumping everything together, you’re throwing out some useful information. Why settle for some weird average when you can use the different cutoffs to try to get at treatment effects away from the cutoff? Perhaps you have group A with a low cutoff for treatment, and group B with a high cutoff. With a few assumptions about how much extrapolating we can do, we can produce an estimate for the treatment effect for people in group A who are far away from the group A cutoff, but near the group B cutoff (Cattaneo et al. 2020Cattaneo, Matias D, Luke Keele, Rocío Titiunik, and Gonzalo Vazquez-Bare. 2020. “Extrapolating Treatment Effects in Multi-Cutoff Regression Discontinuity Designs.” Journal of the American Statistical Association, 1–12.).626 For another approach to the same idea, see Bertanha (2020Bertanha, Marinho. 2020. “Regression Discontinuity Design with Many Thresholds.” Journal of Econometrics 218 (1): 216–41.). Now, instead of the multiple cutoffs giving us a treatment effect average the equivalent of mashed potatoes, we get estimates that are even more informative than the straightforward single-cutoff version. Is this french fries in this analogy? I’ve lost the thread.

Conveniently, these approaches to multiple cutoffs make it into the family of regression-discontinuity software packages alongside rdrobust, rddensity, and rdpower, in the form of rdmulti in both R and Stata.627 Notice how this “Cattaneo” guy keeps popping up? The rdmulti package contains the rdmc function, which can be used to evaluate multi-cutoff settings.628 The rap-themed causal inference textbook Causal Inference: The Mixtape by Scott Cunningham does cover the use of the rdrobust and rddensity packages, but skips rdmulti and its rdmc function, missing the opportunity for an excellent Run-DMC reference.

The rdmulti package also offers the rdms function, which can be used for yet another way in which you can have multiple cutoffs - when everyone is subject to the same cutoffs, but different cutoffs of the same running variable give different kinds or levels of treatment. For example, consider a blood pressure medication, where you’re given a low dose if your blood pressure rises into a range of “potential concern,” but a strong dose if your blood pressure becomes “dangerously high.” The intuition here isn’t wildly different from the idea of just running regression discontinuity twice, once at each cutoff.

Of course, that’s all if your multiple cutoffs come from only one running variable. How dinky! Over here, us cool kids have TWO running variables. There are a fair number of cases where a treatment is assigned not on the basis of a cutoff of one running variable, but a cutoff on each of two (or more) running variables.

One place this is common is in geographic regression discontinuity. In a regression discontinuity based on geography, treatment occurs on one side of the border or another (like in our San Diego/Tijuana example from the start of this chapter). In many applications of geographic regression discontinuity, to cross from an untreated region into a treated one, you have to cross both a certain latitude and a certain longitude. Two running variables!629 Keele and Titiunik (2015Keele, Luke J, and Rocio Titiunik. 2015. “Geographic Boundaries as Regression Discontinuities.” Political Analysis 23 (1): 127–55.) discuss the two-running-variable nature of geographic regression discontinuity (which could be a whole section of this chapter on its own), as well as some other features of geographic regression discontinuity. These include the very high chance of manipulation in the running variable (people tend to sort themselves by choice on one side of a border or another), and the fact that crossing a border almost always applies multiple treatments, not just one (the effects of crossing from San Diego to Mexico can’t be pinned on, say, switching from dollars to pesos. There’s plenty going on there).

While there are applications of multiple running variables everywhere, one place where this seems to be fairly common is in education (Reardon and Robinson 2012Reardon, Sean F, and Joseph P Robinson. 2012. “Regression Discontinuity Designs with Multiple Rating-Score Variables.” Journal of Research on Educational Effectiveness 5 (1): 83–104.). For example, perhaps to progress to the next grade a student needs to meet minimum scores on the math exams and on the English exams. Or to get a bonus, a teacher needs their students to meet a certain standard in both of the classes they teach. Or a school may be designated as failing to meet state standards only if it falls below multiple performance thresholds.

This is a problem, because now we have, in effect, infinite cutoffs. Or at least multidimensional ones. Say you need a 60 in both math and English to go to the next grade. It’s not just that there’s a cutoff at 60 for math and a cutoff at 60 for English. There’s a cutoff at 60 for math when English is 61, and a cutoff at 60 for math when English is 61.1, and a cutoff at 60 for math when English is 73.7, and a cutoff…

Two graphs. The first shows values of a single running variable relative to its cutoff, and shows that the bandwidth captures those in a certain range of the variable. The second shows two running variables, and shows that the bandwidth captures an 'L' shape based on being above both cutoffs at once, which is more complex.

Figure 20.19: A Comparison of Single vs. Multiple Running Variables

The problem is illustrated in Figure 20.19. With a single running variable, it’s easy to figure out who you’re comparing - those within the bandwidth near the cutoff. But with two running variables, you still want to compare people near the cutoff, but who are you comparing them to? The cutoff is L-shaped! You probably don’t want to compare people at the top-left of the L to people in the bottom-right - they’re probably not that similar. And if some parts of the L are more populated than others, just going for it may not give you what you want. So you have to figure out how to do an appropriate comparison.

Not only might the groups not be comparable, but the effect of treatment is probably not the same for all parts of the L. Being held back a grade may have very different effects for someone failing math as opposed to someone failing English. So you have to decide which part(s) of the L you want to estimate the effect at.

There are a few approaches to estimation in these cases, with Reardon and Robinson (2012Reardon, Sean F, and Joseph P Robinson. 2012. “Regression Discontinuity Designs with Multiple Rating-Score Variables.” Journal of Research on Educational Effectiveness 5 (1): 83–104.) covering five of them. There are also versions, such as the one discussed in Papay, Willett, and Murnane (2011Papay, John P, John B Willett, and Richard J Murnane. 2011. “Extending the Regression-Discontinuity Approach to Multiple Assignment Variables.” Journal of Econometrics 161 (2): 203–7.), specifically designed for cases where the treatment you get differs depending on which part of the L you cross - perhaps instead of being held back a grade, kids failing math take remedial math while kids failing English take remedial English.

One potential approach, and the one taken by the rdms function in rdmulti, is to pick specific points on the L and estimate what amounts to a fairly normal regression discontinuity using the other running variable. Do this with enough points on the L and you can get an idea of the effect and how it varies depending on where you cross the border.

20.3.3 When Running Variables Misbehave

Regression discontinuity is nice. It’s like a vacation. The treatment variable has few or no arrows heading into it on the causal diagram in the region around the cutoff. Things are simple, results are believable. We’re laying back, sipping vodka from a coconut, and idly wondering whether to use a linear model or a second-order polynomial.

Then the rain clouds roll in. Vacation ruined. The running variable isn’t behaving.

The main thing that a running variable needs to do in a regression discontinuity, other than be responsible for assigning treatment, is be nice and continuous and smooth, like a gently ebbing tide on the beach.

We’ve already talked about the problems that can come from manipulation of the running variable. But two other potential issues are the running variable being too granular - measured at a very coarse level, or the running variable exhibiting heaping - having some values that are suspiciously much more common than others. Either of these can cause problems.

Granularity is when a variable is measured at a coarse level. For example, if we were measuring annual income, we could say that you earned $40,231.36 last year, or we could say that you earned $40,231, or we could say that you earned $40,000. Or we could even say that you earned “$40-50,000.” Or, heck, we could say “less than $100,000.” These are measurements of your income in decreasing order of granularity.

You can imagine why this might be a problem for regression discontinuity. The whole idea of regression discontinuity is that people just on each side of the cutoff should be pretty easily comparable, with treatment the only real difference between them. That might be pretty believable if the cutoff is $40,000 and we’re comparing you at $40,231.36 against someone at $39,990.32. But what if we only have you measured as $40-50,000? Then we’re comparing not only you against the $39,990.32 person. We’re also comparing someone who earns $49,999 against someone who earns $30,001. Not so easy to say that treatment is the only thing separating them.

How granular is granular enough? It’s subjective, of course, but as long as you can measure the variable precisely enough that you can distinguish who you think is comparable from who you think isn’t, you’re good to go.

That said, even if you can distinguish well enough to do that, but still don’t have a variable that’s finely measured, you might not be able to properly model the shape of the relationship between the outcome and the running variable leading up to the cutoff. If the outcome is increasing with income from $30,000 to $40,000, but you’ve only got the bin $30-40,000, then you’ll get the predicted value at the cutoff wrong, and so your estimate will be wrong too. At the very least, your inability to model that trend will increase your sampling variation.

The real solution here is “find a running variable that’s granular enough.” But no variable is infinitely granular. So if you’re worried about granularity, you may want to pick an estimator that will account for that granularity.

A few papers springing from Kolesár and Rothe (2018Kolesár, Michal, and Christoph Rothe. 2018. “Inference in Regression Discontinuity Designs with a Discrete Running Variable.” American Economic Review 108 (8): 2277–2304.) do this. Kolesár and Rothe build better estimates of the sampling variation of the regression discontinuity estimator by making some assumptions about the relationship between the outcome and the running variable within that big coarse $30-40,000 bin we can’t look inside. One of their estimates assumes that the relationship isn’t too nonlinear, and the other assumes that you won’t get the shape any more wrong than you do at the cutoff. These estimators are available in R in the RDHonest package.
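As a rough sketch of what using it looks like, applied to the coarse quarter-of-birth running variable from earlier in the chapter. This estimates the sharp, reduced-form jump rather than the fuzzy IV-scaled effect, and you should check the RDHonest documentation for the options controlling the smoothness bound and kernel, since argument details have shifted across package versions:

R Code

# install.packages('RDHonest') if necessary
library(RDHonest)
vet <- causaldata::mortgages

# Sharp RDD of home ownership on the discrete running variable, cutoff at 0.
# This is the reduced-form jump, not the IV-scaled fuzzy estimate.
m <- RDHonest(home_ownership ~ qob_minus_kw, data = vet, cutoff = 0)
m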

One thing Kolesár and Rothe (2018) point out is that you shouldn’t try to solve the problem by clustering your standard errors on the different values of the running variable. This was standard practice for a long time, but it turns out to be worse in many cases than not addressing the problem at all. Oops! Don’t do it.

Non-random heaping is when the running variable seems to be much more likely to take certain values than others. Often this can come in the form of rounding. Ask people how old they are and the vast majority of them are going to give you a round number, like “36 years old.” However, some of them will be more precise and say “36 years, eight months, and two days.” The canonical example is from Almond, Doyle, Kowalski, and Williams (2010), where they found that baby birth weights were highly likely to be reported as being rounded to the nearest 100 grams.

Either of these processes might result in the distribution of your running variable looking like the simulated data in Figure 20.20 (a). You can see how, every time the running variable hits a certain value (here, a round number), the proportion of observations with that value jumps sky-high. These are the “heaps.”

Two graphs. The left shows the distribution of a running variable, where certain values are far more common than others, in a pattern that suggests some values are rounded. The right shows the average outcome by values of the running variable, with the rounded extra-common values showing higher averages.

Figure 20.20: Regression Discontinuity Data that Shows Non-Random Heaping

Now, this might not be a huge problem by itself. Having huge numbers of observations at certain points on the \(x\)-axis will heavily influence the regression line you fit. But on average, no harm no foul. The real problem comes when the heaping is non-random. Notice in Figure 20.20 (b) that not only are the heaps very common, they’re also different. The heaps are the non-filled circles, and they have on average higher outcome values than the others.

Non-random heaping is likely to occur whenever there’s heaping. After all, is it really the same kind of person (or hospital) who rounds their answer as opposed to not rounding it?

With non-random heaping, we have highly influential values of the running variable (because they’re way more common than the other values) that are also not representative. This can pull our estimate away from the true effect, and can even flip its sign! You can imagine how this might be especially bad if the cutoff is at one of the heaps - one side of the cutoff would be a heap, and the other side wouldn’t. When heaps are inherently different, that’s a recipe for bias.

A common approach to this is “donut hole regression discontinuity,” where you simply drop observations just around the cutoff so as to clear out heaps near the cutoff. Seems counterintuitive to drop near-cutoff observations in regression discontinuity, which is all about really wanting those near-cutoff observations. But that’s the problem we face.
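Implementing the donut is just a matter of dropping the observations nearest the cutoff before estimating. A minimal sketch of the mechanics using the mortgages data from earlier - not because that data has a heaping problem, just to show the code:

R Code

library(tidyverse); library(rdrobust)
vet <- causaldata::mortgages

# Drop the "donut hole": observations within one quarter of the cutoff
donut <- vet %>% filter(abs(qob_minus_kw) > 1)

# Then estimate the regression discontinuity as usual
m <- rdrobust(donut$home_ownership, donut$qob_minus_kw, c = 0)
summary(m)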

However, Barreca, Lindo, and Waddell (2016Barreca, Alan I, Jason M Lindo, and Glen R Waddell. 2016. “Heaping-Induced Bias in Regression-Discontinuity Designs.” Economic Inquiry 54 (1): 268–93.) examine the problem of heaping and the donut-hole fix. They show that heaping is a problem even when it’s away from the cutoff. So donut-hole won’t fix the problem unless you’ve gotten rid of the only heap. Instead, they recommend splitting the sample by whether a value is a heap or not, and analyzing each side separately. This gets rid of the problem of there being a clear back door between “being in a heap” and the outcome.

20.3.4 Dealing with Bandwidths

How far away from the cutoff can you get and still have comparable observations on either side of it? It’s an important question, and there’s a real tradeoff at play. Pick a bandwidth around the cutoff that’s too wide, and you bring in observations that aren’t comparable (and rely more on the idea that you’ve picked the correct functional form), making your estimates less believable and more biased. Pick a bandwidth that’s too narrow, and you’ll end up estimating your effect on hardly any data, resulting in a noisy estimate. This is the “efficiency-robustness tradeoff.” Do you want a precise (efficient) estimate with lots of observations? Well, you’re inviting in some bias which will make your estimate less robust to the presence of back doors or poorly-chosen functional form.

What should the bandwidth be? There are three main approaches you can take:

1: Just pick a bandwidth. This is a pretty common approach, although perhaps one that is becoming less common over time. For this one, you just look at your data and think about how far from the cutoff you can go without feeling iffy about the comparison. Ideally you can justify why that bandwidth makes sense. Or maybe you just take all the data you have and figure that the range you’ve got is good enough.

2: Pick a bunch of bandwidths. This sensitivity test-based approach is also fairly common - both the Fetter (2013) and the Bana, Bedard, and Rossin-Slater (2020) papers mentioned earlier in this chapter do it (in addition to a little of option 3 below). For this approach, you pick the biggest bandwidth you think makes any sense for the research design, and then you start shrinking it. Each time, estimate your regression discontinuity model.

Once you’ve got your estimates from each bandwidth, you’d present all the results, potentially in a graph like Figure 20.21, which I made using a very basic non-local linear regression discontinuity with the Fetter (2013) data, varying the size of the bandwidth. We can see the risks of the noise that comes from a too-small bandwidth on the graph. Actually, you can’t… I had to cut the tiniest bandwidths (1-3 quarters) off of the graph because their estimates were so wild, and the confidence intervals so large, that you couldn’t really see the other estimates. What you can see is that the confidence interval shrinks as the bandwidth expands. Makes sense: a bigger bandwidth means more observations, which means smaller standard errors.

Figure 20.21: Estimating the Effect of Veteran Mortgage Subsidy Eligibility on Home Ownership Rate with Different Bandwidths

Graph showing the estimated effect of mortgage subsidies on home ownership rates, estimated with different bandwidths. The confidence intervals are much larger for narrower bandwidths. After about 8 quarters of bandwidth, the estimated effect is fairly consistent.

Figure 20.21 is largely what you want to see when using this method. The estimate doesn’t seem to change that much as I use a wider and wider bandwidth, within reason. So it probably doesn’t matter that much which one we choose (in the original paper, the primary analysis used a bandwidth of 12 quarters).
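If you want to produce something like Figure 20.21 yourself, the loop is not complicated. Here is a rough sketch in R using a basic linear regression discontinuity, again assuming a data frame df with a centered running variable and an outcome; the 4-to-20 range of candidate bandwidths is just for illustration.

```r
# Sketch of estimating the same basic RD model at a range of bandwidths.
# Assumes df has a running variable centered at the cutoff and an outcome;
# the candidate bandwidths are illustrative.
library(dplyr)
library(purrr)
library(broom)

bandwidths <- 4:20

bw_results <- map_dfr(bandwidths, function(h) {
  m <- lm(outcome ~ running * I(running >= 0),
          data = filter(df, abs(running) <= h))
  tidy(m, conf.int = TRUE) %>%
    filter(term == "I(running >= 0)TRUE") %>%  # the jump at the cutoff
    mutate(bandwidth = h)
})

# bw_results now holds one estimate and confidence interval per bandwidth,
# ready to plot against bandwidth (for example with ggplot2)
```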

3: Let the data pick the bandwidth for you. Or, rather, pick an objective of “what a good bandwidth looks like” and then use your data to figure out which bandwidth is best by that criterion.

There are a few ways to do this. The first, discussed in Imbens and Lemieux (2008Imbens, Guido W, and Thomas Lemieux. 2008. “Regression Discontinuity Designs: A Guide to Practice.” Journal of Econometrics 142 (2): 615–35.), is cross-validation. I introduced cross-validation in Chapter 13. The basic idea is that you’re trying to get the best out-of-sample predictive power. So you split your data into random chunks. For each chunk, you estimate your model using all the other chunks, and then make predictions for the now out-of-sample chunk you left out. Repeat this for every chunk, and then see overall how well your out-of-sample prediction went. Then, repeat that whole process for each potential bandwidth and see which one does the best.

The cross-validation approach was very popular and you still see it often. This is what the Fetter (2013) paper did. However, what we really want is good prediction at the cutoff. The standard cross-validation method can give a little too much attention to trying to fit points farther away from the cutoff.
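To give a flavor of the mechanics (and only a flavor; the actual Imbens and Lemieux procedure differs in its details, for instance in how it evaluates prediction on each side of the cutoff), here is a simplified k-fold version in R. It again assumes a data frame df with a centered running variable and an outcome. Note that because the estimation sample changes with the bandwidth, this naive comparison of out-of-sample errors across bandwidths is cruder than what you would want in practice.

```r
# Simplified k-fold cross-validation sketch for bandwidth choice. This is an
# illustration of the idea only, not the Imbens and Lemieux procedure.
# Assumes df with a centered running variable and an outcome.
library(dplyr)
library(purrr)

set.seed(1000)
k <- 10
bandwidths <- 4:20

cv_error <- map_dbl(bandwidths, function(h) {
  d <- df %>%
    filter(abs(running) <= h) %>%
    mutate(fold = sample(rep(1:k, length.out = n())))  # random chunks

  fold_errors <- map_dbl(1:k, function(f) {
    train <- filter(d, fold != f)
    test  <- filter(d, fold == f)
    m <- lm(outcome ~ running * I(running >= 0), data = train)
    mean((test$outcome - predict(m, newdata = test))^2)
  })
  mean(fold_errors)
})

# The "best" bandwidth by this criterion is the one with the lowest
# out-of-sample prediction error
bandwidths[which.min(cv_error)]
```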

The second approach is the “optimal bandwidth choice” approach, originating in Imbens and Kalyanaraman (2012Imbens, Guido W, and Karthik Kalyanaraman. 2012. “Optimal Bandwidth Choice for the Regression Discontinuity Estimator.” The Review of Economic Studies 79 (3): 933–59.). This approach takes as its goal getting the best prediction right at the cutoff, just on either side. Of course, we don’t know what the true value is right at the cutoff, so we estimate it the best we can.

The processes for several different methods of picking the optimal bandwidth are described in Cattaneo and Vazquez-Bare (2016Cattaneo, Matias D, and Gonzalo Vazquez-Bare. 2016. “The Choice of Neighborhood in Regression Discontinuity Designs.” Observational Studies 3 (2): 134–46.), which is not too difficult a read as these sorts of papers go. In short, by pursuing the goal of getting the best prediction at the cutoff, you end up wanting a bandwidth that is smaller for bigger samples (since you can afford to lose the observations), bigger the more polynomial terms you use (since you can better fit any nonlinearities you introduce by going wider), and beyond that is based on the extent of bias that is introduced from having the wrong functional form, as well as plenty of other details.

The Cattaneo and Vazquez-Bare (2016) paper also points out that using the optimal bandwidth choice approach messes up the standard errors and confidence intervals. The fix for this is to apply a bias correction, estimating how big the bias from having the wrong functional form is, and adjusting both the estimate of the effect and the estimate of the variance for it.

They go even further, to talk about bandwidth selection procedures that, instead of trying to get the best prediction at the cutoff, try to minimize coverage error - the mismatch between the estimated sampling distribution and the actual sampling distribution. This process tends to lead to narrower bandwidths, and is what you’ll get as of this writing from running rdrobust from the rdrobust package in R or Stata.
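If you want to see what these data-driven selectors actually choose in your data, the rdbwselect function in the rdrobust package will report them. A brief sketch, assuming a data frame df with the raw (uncentered) running variable rawvar, an outcome, and a cutoff value stored in cut (all placeholder names); the all = TRUE option asks for the full menu of MSE-based and coverage-error-based bandwidths.

```r
# Sketch of reporting data-driven bandwidth choices with rdbwselect.
# df, rawvar, outcome, and cut are placeholder names for your own data.
library(rdrobust)

bw <- rdbwselect(y = df$outcome, x = df$rawvar, c = cut, all = TRUE)
summary(bw)  # lists the MSE-optimal and coverage-error-optimal bandwidths
```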

These bandwidth selection procedures matter a fair amount. Taking Fetter (2013) again as an example, the cross-validation approach in the original paper led to choosing a bandwidth of 12 quarters, which with our method gave an estimate of .170 (.177 using the methods in the original paper). We also saw that the rdrobust approach chose a bandwidth of 3.39 quarters, and gave an estimate of .518 - very different. And in this case, it really is the bandwidth selection that makes the difference. Forcing rdrobust to use the 12-quarter bandwidth brings the estimate to .200, very close to the original paper.
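As a final sketch of how that comparison works in code: rdrobust will pick a bandwidth for you by default, and its h argument lets you force a specific bandwidth instead. The names df, rawvar, outcome, and cut are placeholders, as before, and the exact defaults depend on your version of the package.

```r
# Sketch comparing rdrobust's automatic bandwidth to a forced 12-quarter one.
# df, rawvar, outcome, and cut are placeholders for your own data.
library(rdrobust)

# Let rdrobust choose the bandwidth (bwselect can be set explicitly,
# e.g., "mserd" or "cerrd", if you don't want to rely on the default)
m_auto <- rdrobust(y = df$outcome, x = df$rawvar, c = cut)
summary(m_auto)

# Force a 12-quarter bandwidth on each side of the cutoff
m_h12 <- rdrobust(y = df$outcome, x = df$rawvar, c = cut, h = 12)
summary(m_h12)
```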
