---
title: "double"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{double}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette shows how to deal with confounding events; see [Tsai (2024)](https://arxiv.org/abs/2409.05184) for the full treatment. We first simulate data that contains a confounding event.

```{r setup}
library(fastdid)
library(ggplot2)
library(data.table)
simdt <- sim_did(1e+04, 5, cov = "cont", hetero = "all", balanced = TRUE, seed = 1, 
                 second_cohort = TRUE, second_het = "no") # confounding event
dt <- simdt$dt #dataset

#ground truth att
att <- simdt$att |> merge(dt[,.(w = .N),by = "G"], by = "G")
att[, event_time := time-G]
att <- att[event == 1,.(att = weighted.mean(attgt, w)), by = "event_time"]
```
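
The ground-truth aggregation above averages cohort-time effects within each event time, weighting each cohort by its size. As a minimal sketch of that weighting step, the toy table below uses made-up numbers (the object `toy` and all of its values are hypothetical, not output from `sim_did`):

```{r, toy-weights}
library(data.table)

# Hypothetical cohort-time effects: cohorts G = 2 and G = 3, observed at times 3 and 4.
toy <- data.table(
  G     = c(2, 2, 3, 3),
  time  = c(3, 4, 3, 4),
  attgt = c(1, 2, 3, 4),
  w     = c(100, 100, 300, 300)  # hypothetical cohort sizes used as weights
)
toy[, event_time := time - G]

# Average within each event time, weighting cohorts by size.
toy_att <- toy[, .(att = weighted.mean(attgt, w)), by = "event_time"]
toy_att
```

At event time 1 both cohorts contribute, so the larger cohort pulls the average toward its own effect.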

With the default estimator, the estimates (black line) are biased relative to the ground truth (red line).

```{r, naive}
naive_result <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit", 
                 outcomevar = "y", result_type = "dynamic")
plot_did_dynamics(naive_result) + 
  geom_line(aes(y = att, x = event_time), 
            data = att, color = "red") + theme_bw()
```

Diagnostics can be obtained by using the indicator `time >= G2` as the outcome. Because the simulation sets the effect of the confounding event to a constant 10, the bias of the default estimator is roughly 10 times the diagnostic estimates.

```{r, diag}
dt[, D2 := time >= G2]
diag <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "D2", result_type = "dynamic")
plot_did_dynamics(diag) + theme_bw()
```
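
The relationship between the bias and the diagnostics can also be checked numerically. The sketch below is one possible way to do so, not part of `fastdid`: the helper `compare_bias()` and the column names `att_naive`, `att_diag`, and `att_true` are hypothetical, and it assumes each input table has `event_time` and `att` columns.

```{r, bias-check}
library(data.table)

# Hypothetical helper: line up naive estimates, diagnostics, and the truth by event time.
# Assumes every input is a data.table with `event_time` and `att` columns.
compare_bias <- function(naive, diagnostic, truth, effect2) {
  cmp <- merge(naive[, .(event_time, att_naive = att)],
               diagnostic[, .(event_time, att_diag = att)],
               by = "event_time")
  cmp <- merge(cmp, truth[, .(event_time, att_true = att)], by = "event_time")
  cmp[, bias := att_naive - att_true]          # observed bias of the naive estimate
  cmp[, implied_bias := effect2 * att_diag]    # bias implied by the diagnostics
  cmp[]
}

# Possible usage with the objects above (10 is the simulated confounding effect):
# compare_bias(naive_result, diag, att, effect2 = 10)
```

If the two columns `bias` and `implied_bias` are close at each event time, the bias is consistent with the confounding event being its source.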

Using the double DiD estimator with `cohortvar2`, the estimates recover the ground truth.

```{r, double}
double_result <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "dynamic",
                 cohortvar2 = "G2", event_specific = TRUE)
plot_did_dynamics(double_result) + 
  geom_line(aes(y = att, x = event_time), 
            data = att, color = "red") + theme_bw()
```

Double DiD also allows two additional aggregation schemes: group-group-time (`"group_group_time"`) and dynamic-staggered (`"dynamic_stagger"`, event time by event stagger, G1-G2).

```{r, agg}
double_result_ds <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "dynamic_stagger",
                 cohortvar2 = "G2", event_specific = TRUE)

double_result_ggt <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "group_group_time",
                 cohortvar2 = "G2", event_specific = TRUE)

```