---
title: "double"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{double}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
Here we show how to deal with confounding events, see [Tsai (2024)](https://arxiv.org/abs/2409.05184) for the full treatment. We first simulate some data with confounding events.
```{r setup}
library(fastdid)
library(ggplot2)
library(data.table)
simdt <- sim_did(1e+04, 5, cov = "cont", hetero = "all", balanced = TRUE, seed = 1,
second_cohort = TRUE, second_het = "no") #comfounding event
dt <- simdt$dt #dataset
#ground truth att
att <- simdt$att |> merge(dt[,.(w = .N),by = "G"], by = "G")
att[, event_time := time-G]
att <- att[event == 1,.(att = weighted.mean(attgt, w)), by = "event_time"]
```
Using the default estimator, the estimates (black line) is biased from the ground truth (red line).
```{r, naive}
naive_result <-fastdid(data = dt,
timevar = "time", cohortvar = "G", unitvar = "unit",
outcomevar = "y", result_type = "dynamic")
plot_did_dynamics(naive_result) +
geom_line(aes(y = att, x = event_time),
data = att, color = "red") + theme_bw()
```
Diagnostics can be obtained by using `time >= G2` as outcome. As we set the effect of the confounding event at 10 constantly, we can see that the bias of the default estimator is roughly 10 times the diagnostics.
```{r, diag}
dt[, D2 := time >= G2]
diag <- fastdid(data = dt,
timevar = "time", cohortvar = "G", unitvar = "unit",
outcomevar = "D2", result_type = "dynamic")
plot_did_dynamics(diag) + theme_bw()
```
Using the double did estimator with `cohortvar2`, the estimator recovers the ground truth.
```{r, double}
double_result <-fastdid(data = dt,
timevar = "time", cohortvar = "G", unitvar = "unit",
outcomevar = "y", result_type = "dynamic",
cohortvar2 = "G2", event_specific = TRUE)
plot_did_dynamics(double_result) +
geom_line(aes(y = att, x = event_time),
data = att, color = "red") + theme_bw()
```
Double DiD also allow for two additional aggregation scheme: group-group-time ("group_group_time") and dynamic-staggered ("dynamic_stagger", event time by event stagger, G1-G2).
```{r, agg}
double_result_ds <-fastdid(data = dt,
timevar = "time", cohortvar = "G", unitvar = "unit",
outcomevar = "y", result_type = "dynamic_stagger",
cohortvar2 = "G2", event_specific = TRUE)
double_result_ggt <-fastdid(data = dt,
timevar = "time", cohortvar = "G", unitvar = "unit",
outcomevar = "y", result_type = "group_group_time",
cohortvar2 = "G2", event_specific = TRUE)
```