This is an extension of a previous post on statistical power. In that post we derived the formula for statistical power using
- The effect size
- The standard error of the effect. Alternatively, this can be stated as the variance and the sample size.
- The significance level, \(\alpha\).
However, there is an assumption of perfect compliance. That is, 100% of the treatment units are actually treated, and 100% of the control units are actually withheld.
In this post we discuss the case where only a fraction \(p_1\) of treatment units are actually treated, and a fraction \(p_0\) of control units are actually withheld.
Effect size
Say that treatment assignment is determined by the variable \(Z\), so \(Z = 1\) means we intend to provide treatment, and \(Z = 0\) means we intend to withhold. Let \(X\) be the treatment that was actually received.
\[\begin{align} p_1 &= P(X = 1 | Z = 1) \\ p_0 &= P(X = 0 | Z = 0) \end{align}\]
The effect size we care about is \(\Delta = \mu_1 - \mu_0 = E[y | X = 1] - E[y | X = 0]\). But there is a difference between what we care about and what we will observe: in the data, noncompliance dilutes the treatment effect, which in turn decreases the power.
\[\begin{align} E[y | Z = 1] &= p_1 \mu_1 + (1 - p_1) \mu_0 \\ E[y | Z = 0] &= p_0 \mu_0 + (1 - p_0) \mu_1 \\ E[y | Z = 1] - E[y | Z = 0] &= (p_1 + p_0 - 1) \Delta \end{align}\]
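As a sanity check, a short simulation (the means, compliance rates, and noise scale below are illustrative choices, not values from the derivation) recovers the dilution factor:

```python
import random

random.seed(0)
mu0, mu1 = 1.0, 1.5        # control / treatment means (illustrative)
p1, p0 = 0.9, 0.85         # compliance rates (illustrative)
n = 200_000

# Assigned to treatment (Z = 1): actually treated with probability p1
y_z1 = [(mu1 if random.random() < p1 else mu0) + random.gauss(0, 1) for _ in range(n)]
# Assigned to control (Z = 0): actually withheld with probability p0
y_z0 = [(mu0 if random.random() < p0 else mu1) + random.gauss(0, 1) for _ in range(n)]

observed = sum(y_z1) / n - sum(y_z0) / n          # E[y | Z = 1] - E[y | Z = 0]
predicted = (p1 + p0 - 1) * (mu1 - mu0)           # (p1 + p0 - 1) * Delta
```

The observed difference in means concentrates around \((p_1 + p_0 - 1)\Delta\), not around \(\Delta\).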
Variance
The variance of \(y\) within each assignment arm is broken down using the law of total variance,
\[Var(y | Z) = E[Var(y | X) | Z] + Var(E[y | X] | Z).\]
The first term reduces to just \(\sigma^2\): the variance of \(y\) is not a function of the artificially generated assignment variable \(Z\), only of whether or not the unit actually received the treatment, and we further assume \(Var(y | X = 1) = Var(y | X = 0) = \sigma^2\). This is a homoskedasticity assumption.
The second term is the variance of a mixture. The mixture has two parts, \(E[y | X = 0] = \mu_0\) and \(E[y | X = 1] = \mu_1 = \mu_0 + \Delta\), and which part applies is governed by the Bernoulli random variable \(X\). We leverage the identity
\[Var(f(X) | Z = 1) = p_1 (1 - p_1) \left(f(1) - f(0)\right)^2,\]
so with \(f(X) = E[y | X]\) the second term is \(p_1 (1 - p_1) \Delta^2\).
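This identity is just the variance of a two-point distribution, so it can be checked exactly rather than by simulation (the numbers below are illustrative):

```python
mu0, delta, p1 = 1.0, 0.5, 0.9   # illustrative values
f0, f1 = mu0, mu0 + delta        # f(0) = mu_0, f(1) = mu_1 = mu_0 + Delta

# Direct Var(f(X)) for X ~ Bernoulli(p1): E[f(X)^2] - E[f(X)]^2
ef = p1 * f1 + (1 - p1) * f0
ef2 = p1 * f1**2 + (1 - p1) * f0**2
direct = ef2 - ef**2

identity = p1 * (1 - p1) * (f1 - f0)**2   # = p1 (1 - p1) Delta^2
```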
We have the key pieces we need to express statistical power in terms of variables that we can plan for, namely the assignment \(Z\) and approximations for the compliance rate.
\[\begin{align} E[y | Z = 1] &= p_1 \mu_1 + (1 - p_1) \mu_0 \\ E[y | Z = 0] &= p_0 \mu_0 + (1 - p_0) \mu_1 \\ E[y | Z = 1] - E[y | Z = 0] &= (p_1 + p_0 - 1) \Delta \\ Var(y | Z = 1) &= \sigma^2 + p_1 (1 - p_1) \Delta^2 \approx \sigma^2 \\ Var(y | Z = 0) &= \sigma^2 + p_0 (1 - p_0) \Delta^2 \approx \sigma^2 \end{align}\]
The approximation \(Var(y | Z = 1) \approx \sigma^2\) comes from the pattern that the variance (not normalized by \(n\)) tends to be large while the effect size tends to be small, so \(p_1 (1 - p_1) \Delta^2\) is negligible next to \(\sigma^2\). Thus the variance under noncompliance is not much different from the variance under full compliance, and the change in power will largely come from the dilution in the treatment effect.
\[ \boxed{\text{Power} = \Phi(\delta' - z_{1-\alpha/2}) + \Phi(-\delta' - z_{1-\alpha/2})} \]
where \(\delta' = \delta (p_1 + p_0 - 1)\) is the diluted standardized treatment effect.
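The boxed formula is easy to compute with the standard normal CDF; here is a small helper (a sketch: the function name, defaults, and example \(\delta\) value are my own choices):

```python
from statistics import NormalDist

def power(delta, p1=1.0, p0=1.0, alpha=0.05):
    """Two-sided power for standardized effect delta, diluted by compliance rates."""
    N = NormalDist()
    z = N.inv_cdf(1 - alpha / 2)       # z_{1 - alpha/2}
    d = delta * (p1 + p0 - 1)          # diluted effect delta'
    return N.cdf(d - z) + N.cdf(-d - z)

full = power(2.8)                       # ~0.80 at full compliance
diluted = power(2.8, p1=0.9, p0=0.9)    # drops to ~0.61
```

With \(\delta = 2.8\) (roughly the 80%-power regime at \(\alpha = 0.05\)), 90% compliance on both arms pulls power down to about 61%.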
Using some simple numbers: if \(p_1 = p_0 = 0.9\), then \(\delta' = 0.8 \delta\). Shrinking the effect size by a factor of 0.8 inflates the required sample size by a factor of \(\frac{1}{0.8^2} = 1.5625\), i.e. about 56% more samples!
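Since the required sample size scales as \(1/\delta^2\) for a fixed power target, the inflation factor is a one-liner:

```python
p1 = p0 = 0.9
dilution = p1 + p0 - 1        # delta' / delta = 0.8
inflation = 1 / dilution**2   # 1.5625 -> 56.25% more samples needed
```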