Propensity Score Matching

Enter your treatment and control group data in CSV format to estimate the Average Treatment Effect on the Treated (ATT).

Examples

Use these sample datasets to see how the calculator works.

Effect of a New Drug

Medical Study

Evaluating the effect of a new drug on blood pressure (outcome), controlling for age and BMI (covariates).

Treatment Data:

blood_pressure,age,bmi
140,55,25.1
135,62,28.3
138,58,26.5
145,65,30.1
142,59,27.8...

Control Data:

blood_pressure,age,bmi
150,56,26.2
155,60,29.1
148,61,27.3
160,68,31.0
152,57,28.1...

Impact of a Job Training Program

Economic Policy

Assessing the impact of a job training program on weekly income (outcome), controlling for education level and years of experience.

Treatment Data:

income,education,experience
850,16,5
900,18,8
880,16,7
920,19,10
860,14,6...

Control Data:

income,education,experience
750,14,4
780,16,6
800,12,5
770,14,7
790,16,9...

Effect of a Promotion on Sales

Marketing

Measuring the effect of a marketing promotion on customer spending (outcome), controlling for customer loyalty score and visit frequency.

Treatment Data:

spending,loyalty,frequency
120,85,10
150,90,15
130,88,12
145,92,18
125,80,9...

Control Data:

spending,loyalty,frequency
90,70,8
100,75,11
95,72,9
110,80,14
105,78,13...

Impact of a Tutoring Program

Education

Analyzing the impact of a tutoring program on test scores (outcome), controlling for prior year's grades and attendance.

Treatment Data:

test_score,prior_grade,attendance
88,80,95
92,85,98
85,78,92
95,90,99
89,82,96...

Control Data:

test_score,prior_grade,attendance
75,70,90
80,75,94
78,72,88
82,80,96
79,74,91...
Understanding Propensity Score Matching: A Comprehensive Guide
A deep dive into the theory, application, and interpretation of Propensity Score Matching (PSM) for causal inference.

What is Propensity Score Matching?

  • The Challenge of Causal Inference in Observational Studies
  • Introducing the Propensity Score
  • The Core Idea of Matching
In many fields, like medicine, economics, and social sciences, we want to understand the causal effect of an intervention—a new drug, a government policy, a teaching method. The gold standard for this is the Randomized Controlled Trial (RCT), where subjects are randomly assigned to a treatment or control group. This randomness ensures that, on average, the two groups are similar in all aspects, both observed and unobserved. Therefore, any difference in outcomes can be confidently attributed to the treatment. However, RCTs are often unethical, impractical, or too expensive. We must then rely on observational data, where subjects self-select or are selected into treatment and control groups based on certain characteristics. This creates selection bias, as the groups may not be comparable to begin with.
The Propensity Score as a Balancing Score
Propensity Score Matching (PSM) is a statistical method designed to address this problem. It aims to mimic an RCT by creating a control group that is as similar as possible to the treatment group on observed characteristics (covariates). The central concept is the 'propensity score': the probability that a subject is assigned to the treatment group, given their observed covariates. The theory, developed by Rosenbaum and Rubin (1983), shows that the propensity score is a balancing score: among treated and control subjects with the same propensity score, the observed covariates have the same distribution in expectation. Matching on this single number therefore allows a more apples-to-apples comparison and reduces the selection bias in the estimated treatment effect.

Step-by-Step Guide to Using the PSM Calculator

  • Preparing and Inputting Your Data
  • Executing the Analysis and Choosing a Method
  • Interpreting the Results
1. Data Preparation
Your data must be structured in a specific way. You need two separate datasets: one for the treatment group and one for the control group. Both datasets must be in CSV format. The very first line of your data must be a header row containing the names of your variables. The first column must always be your outcome variable (the one you are measuring the effect on). All subsequent columns are your covariates (the characteristics you want to control for). Crucially, the header names and column order must be identical in both the treatment and control data files.
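The layout rules above can be checked before any analysis is run. The following is a minimal sketch, not the calculator's actual code: the function names `parse_group` and `stack_groups` are hypothetical, and the data is limited to the visible rows of the truncated drug example. It parses both CSV inputs, verifies that the headers match, and stacks the groups with a treatment indicator.

```python
import csv
import io

def parse_group(csv_text):
    """Parse one group's CSV text into (header, outcomes, covariate_rows)."""
    rows = [row for row in csv.reader(io.StringIO(csv_text.strip())) if row]
    header = [name.strip() for name in rows[0]]       # first line is the header
    data = [[float(value) for value in row] for row in rows[1:]]
    outcomes = [row[0] for row in data]               # first column: outcome
    covariates = [row[1:] for row in data]            # remaining columns: covariates
    return header, outcomes, covariates

def stack_groups(treatment_text, control_text):
    """Combine both groups and add a treatment indicator (1 = treated, 0 = control)."""
    t_header, t_y, t_x = parse_group(treatment_text)
    c_header, c_y, c_x = parse_group(control_text)
    if t_header != c_header:
        raise ValueError(f"Header mismatch: {t_header} vs {c_header}")
    y = t_y + c_y
    X = t_x + c_x
    t = [1] * len(t_y) + [0] * len(c_y)
    return t_header, y, X, t

# Only the visible rows of the sample drug dataset (the original is truncated).
treatment_csv = """blood_pressure,age,bmi
140,55,25.1
135,62,28.3
138,58,26.5
145,65,30.1
142,59,27.8
"""
control_csv = """blood_pressure,age,bmi
150,56,26.2
155,60,29.1
148,61,27.3
160,68,31.0
152,57,28.1
"""

header, y, X, t = stack_groups(treatment_csv, control_csv)
print(header, len(y), t)
```

Stacking the two groups into one sample with an indicator column is the shape that the propensity score model needs, since it predicts treatment status from the covariates.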
2. Calculation
Paste your prepared CSV data into the respective 'Treatment Group' and 'Control Group' text boxes. The calculator will perform three main steps internally: 1) It will run a logistic regression to calculate the propensity score for every individual in your dataset. 2) It will use a matching algorithm (like Nearest Neighbor) to pair each individual in the treatment group with an individual in the control group who has the closest propensity score. 3) It will calculate the treatment effect and balance statistics based on this newly matched sample.
3. Interpreting the Output
The primary output is the Average Treatment Effect on the Treated (ATT), which tells you the average impact of the intervention on those who received it. You will also see a Standard Error and a P-value to assess the statistical significance of this effect. Just as important is the 'Covariate Balance' table. It shows the Standardized Mean Difference (SMD) for each covariate before and after matching. An SMD whose absolute value exceeds roughly 0.1 to 0.2 indicates that the groups differ meaningfully on that covariate. After matching, you want to see the absolute SMDs drop below 0.1, which suggests the matching was successful in creating comparable groups on the observed covariates.

Real-World Applications of Propensity Score Matching

  • Healthcare and Medicine
  • Economics and Public Policy
  • Education and Social Programs
Evaluating Medical Treatments
A common use case is evaluating the effectiveness of a new surgical procedure compared to a traditional one using patient records. Since surgeons might choose the new procedure for younger or healthier patients, a simple comparison would be biased. PSM can be used to match patients who received the new surgery with similar patients (in terms of age, disease severity, comorbidities) who received the traditional one, providing a fairer comparison of outcomes like recovery time or survival rates.
Assessing Policy Impact
Governments often implement policies like job training programs for the unemployed. To see if the program works, analysts can't just compare the incomes of those who participated with those who didn't, as participants might have been more motivated to begin with. PSM can match program participants with non-participants who had similar characteristics (e.g., age, education, prior work history) before the program began to get a less biased estimate of the program's impact on income.

Common Misconceptions and Correct Methods

  • PSM Only Balances Observed Covariates
  • The Importance of Covariate Selection
  • Matching is Not a Magic Bullet
The 'Unobserved' Covariate Problem
The single most important limitation of PSM is that it can only balance the covariates that you can observe and measure. If there are unobserved characteristics (e.g., patient motivation, innate talent) that influence both the selection into the treatment group and the outcome, PSM cannot account for them, and the resulting estimate may still be biased. This is where an RCT has its key advantage: randomization balances both observed and unobserved factors. The results of PSM should therefore always be interpreted with this caveat in mind.
Choosing the Right Variables
The validity of PSM depends heavily on the 'conditional independence assumption' (also called unconfoundedness): after controlling for the selected covariates, treatment assignment is independent of the potential outcomes, that is, as good as random. You must therefore include all covariates that are thought to influence both treatment selection and the outcome. Omitting such confounders can lead to biased results. Including variables that predict treatment selection but are unrelated to the outcome can increase the variance of your estimates without reducing bias, whereas variables related only to the outcome generally do not hurt precision and can improve it.

Mathematical Derivation and Examples

  • The Logistic Regression Model for Propensity Scores
  • The Nearest Neighbor Algorithm
  • Calculating the Standardized Mean Difference (SMD)
1. Estimating Propensity Scores
Let T be the treatment indicator (1 if treated, 0 if control) and X be the vector of observed covariates. The propensity score e(X) is defined as e(X) = P(T=1 | X). This probability is typically estimated using a logistic regression model: log(p / (1-p)) = β₀ + β₁X₁ + ... + βₖXₖ, where p = e(X), so that e(X) = 1 / (1 + exp(-(β₀ + β₁X₁ + ... + βₖXₖ))). The model is fit on the entire sample (both treatment and control groups) to find the coefficients (β's) that best predict treatment status from the covariates.
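To make the estimation step concrete, here is a minimal sketch that fits the logistic model by plain gradient ascent on the log-likelihood. This is an illustrative stand-in, not the calculator's fitting routine (a statistics library would normally be used). The function names, the learning rate, the iteration count, and the choice to standardize covariates first are all assumptions, and the data is limited to the visible rows of the truncated drug example.

```python
import math

def standardize(rows):
    """Rescale each covariate column to mean 0 and standard deviation 1."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(k)]
    sds = [math.sqrt(sum((r[c] - means[c]) ** 2 for r in rows) / n) for c in range(k)]
    return [[(r[c] - means[c]) / sds[c] for c in range(k)] for r in rows]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(X, t, learning_rate=0.5, iterations=5000):
    """Fit log(p / (1 - p)) = b0 + b1*x1 + ... by gradient ascent.

    Returns (beta, scores). beta is on the standardized covariate scale;
    scores[i] is the estimated propensity score e(X_i).
    """
    Z = [[1.0] + row for row in standardize(X)]   # prepend intercept column
    n, k = len(Z), len(Z[0])
    beta = [0.0] * k
    for _ in range(iterations):
        p = [sigmoid(sum(b * z for b, z in zip(beta, row))) for row in Z]
        # Gradient of the mean log-likelihood: average of (t_i - p_i) * z_i
        grad = [sum((t[i] - p[i]) * Z[i][c] for i in range(n)) / n for c in range(k)]
        beta = [b + learning_rate * g for b, g in zip(beta, grad)]
    scores = [sigmoid(sum(b * z for b, z in zip(beta, row))) for row in Z]
    return beta, scores

# Covariates (age, bmi): five treated rows followed by five control rows.
X = [[55, 25.1], [62, 28.3], [58, 26.5], [65, 30.1], [59, 27.8],
     [56, 26.2], [60, 29.1], [61, 27.3], [68, 31.0], [57, 28.1]]
t = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

beta, scores = fit_propensity(X, t)
for indicator, score in zip(t, scores):
    print(indicator, round(score, 3))
```

Standardizing only changes the scale of the coefficients, not the fitted probabilities, and it keeps a fixed step size stable when covariates are measured in very different units.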
2. Matching
After estimating the propensity score e(Xᵢ) for every subject i, a matching algorithm is applied. The simplest is 1-to-1 Nearest Neighbor matching. For each treated subject i, we find a control subject j that minimizes the distance |e(Xᵢ) - e(Xⱼ)|. Once a control subject is matched, they are removed from the pool of potential matches for other treated subjects (matching without replacement), so the resulting pairs can depend on the order in which treated subjects are processed.
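The matching rule described above can be sketched as a short greedy loop. The function name `nearest_neighbor_match` and the propensity scores below are hypothetical, chosen only for illustration. Treated subjects are processed in the order given, and ties are broken by the lower control index (an assumption, since no tie-breaking rule is stated).

```python
def nearest_neighbor_match(treated_scores, control_scores):
    """Greedy 1-to-1 nearest neighbor matching without replacement.

    Returns a list of (treated_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(treated_scores):
        if not available:
            break  # more treated than controls: remaining treated stay unmatched
        # Closest remaining control; ties go to the lower index.
        j = min(available, key=lambda k: (abs(score - control_scores[k]), k))
        pairs.append((i, j))
        available.remove(j)  # each control can be used only once
    return pairs

# Hypothetical propensity scores, for illustration only.
treated_scores = [0.62, 0.55, 0.71]
control_scores = [0.50, 0.60, 0.58, 0.90]

pairs = nearest_neighbor_match(treated_scores, control_scores)
print(pairs)  # [(0, 1), (1, 2), (2, 3)]
```

The last pair in this example has a fairly large score gap (0.71 versus 0.90) because the closer controls were already taken, which is the main weakness of greedy matching without replacement.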
3. Assessing Balance
To check if matching worked, we calculate the Standardized Mean Difference (SMD) for each covariate before and after matching. The formula is: SMD = (mean(X_treat) - mean(X_control)) / √((var(X_treat) + var(X_control)) / 2). After matching, this is recalculated using only the matched sample. A successful match will result in post-match SMDs close to zero.
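As a worked instance of the SMD formula, the sketch below computes the pre-matching SMD for the age covariate, using only the visible rows of the truncated drug example. The function name is hypothetical. It uses the sample variance (n - 1 denominator), which is an assumption because the formula does not say which variance estimator is meant.

```python
import math
import statistics

def standardized_mean_difference(treated_values, control_values):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = math.sqrt(
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2
    )
    return mean_diff / pooled_sd

treated_age = [55, 62, 58, 65, 59]
control_age = [56, 60, 61, 68, 57]

smd_age = standardized_mean_difference(treated_age, control_age)
print(round(smd_age, 3))  # -0.139
```

Here the means are 59.8 and 60.4 and the sample variances are 14.7 and 22.3, so the SMD is -0.6 / √18.5, about -0.14. The sign only shows which group has the larger mean; balance is judged on the absolute value.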
4. Estimating the ATT
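The Average Treatment Effect on the Treated is then calculated as the difference in mean outcomes between the matched treated subjects and their matched controls: ATT = (1/Nₜ) Σ(Yᵢ_treat) - (1/Nₜ) Σ(Yⱼ_control), where Nₜ is the number of matched pairs, the first sum runs over the matched treated subjects i, and the second runs over the control subjects j matched to them. Equivalently, the ATT is the average of the within-pair differences Yᵢ_treat - Yⱼ_control.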
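The ATT formula reduces to averaging the outcome differences within matched pairs. The sketch below uses the visible blood pressure values from the truncated drug example. The pairing itself is hypothetical (it was not produced by a fitted propensity model) and serves only to show the arithmetic; the function name `estimate_att` is also an assumption.

```python
def estimate_att(treated_outcomes, control_outcomes, pairs):
    """Average of (treated outcome - matched control outcome) over matched pairs."""
    if not pairs:
        raise ValueError("No matched pairs to average over")
    differences = [treated_outcomes[i] - control_outcomes[j] for i, j in pairs]
    return sum(differences) / len(differences)

treated_bp = [140, 135, 138, 145, 142]
control_bp = [150, 155, 148, 160, 152]

# Hypothetical (treated_index, control_index) pairs, for illustration only.
pairs = [(0, 0), (1, 2), (2, 4), (3, 3), (4, 1)]

att = estimate_att(treated_bp, control_bp, pairs)
print(att)  # -13.0
```

The within-pair differences are -10, -13, -14, -15 and -13, which average to -13. That equals the difference between the treated mean (140) and the matched control mean (153), as the formula states.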