The absolute (delta) and relative (delta%) changes are different random variables, so you should try to calculate the ratio-based standard errors, CIs, and p-values if you care about the latter. This will not change your decision most of the time, but you will come across examples where it does matter (wider CIs, higher p-values, etc.). Ratios can be tricky like that.
Here's a toy example to make things clearer. Consider a two-sample test with a binary outcome where $N_T=N_C=1,359$. There are 163 successes in treatment and 136 in control. The p-value on the absolute difference is 0.098, so you would reject the null that the two groups are the same at $\alpha=.10$. However, the p-value on the relative difference is 0.101, so you would fail to reject. In some sense, this is an artifact of using a fixed threshold for significance and of the approximation inherent in the delta method, but it means the same data and the same decision rule can lead to different decisions depending on how the difference is defined.
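One way to arrive at p-values like these is sketched below in Python. The choices are assumptions made for illustration, not a record of how the figures above were produced: a pooled residual variance (as in a linear regression of the outcome on a treatment dummy), the relative effect measured as the log-difference of the two success rates with a delta-method standard error, and a normal reference distribution. The function name `toy_p_values` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def toy_p_values(n_t, x_t, n_c, x_c):
    """Two-sided p-values for the absolute and the relative (log) difference."""
    p_t, p_c = x_t / n_t, x_c / n_c

    # Pooled residual variance, as in OLS of the outcome on a treatment dummy.
    # For a binary outcome the within-group sum of squares is x * (1 - p).
    rss = x_t * (1 - p_t) + x_c * (1 - p_c)
    s2 = rss / (n_t + n_c - 2)
    var_t, var_c = s2 / n_t, s2 / n_c

    # Absolute difference: independent groups, so the variances add.
    z_abs = (p_t - p_c) / sqrt(var_t + var_c)

    # Relative difference as ln(p_t / p_c); delta method: Var(ln m) ~ Var(m) / m^2.
    z_rel = log(p_t / p_c) / sqrt(var_t / p_t**2 + var_c / p_c**2)

    def two_sided(z):
        return 2 * (1 - NormalDist().cdf(abs(z)))

    return two_sided(z_abs), two_sided(z_rel)


p_abs, p_rel = toy_p_values(1359, 163, 1359, 136)
print(f"absolute: {p_abs:.3f}, relative: {p_rel:.3f}")
```

Under these assumptions the two p-values land on opposite sides of 0.10. Other reasonable choices (unpooled variances, or the ratio itself instead of its log) shift the relative p-value, which is the point: the answer depends on how the relative difference is defined and approximated.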
Now on to your second question. There are many ways to calculate the variance, with varying complexity. It depends on what tools you have access to, features of your data and experiments, and your company's level of statistical sophistication.
These methods are:
- Delta method (either with correlated means or uncorrelated means)
- Fieller's method (with correlated or uncorrelated means or regression version)
- Regression (either transforming the outcome, or transforming the coefficients or using a GLM and then using the delta method or Fieller's method)
- Bootstrapping (relative difference itself or regression), permutation tests
- Some combinations of the above, like bootstrapped GLM regression
If you are willing to assume that the two means are uncorrelated (which usually makes sense in an A/B test), there are simple formulas linked above you can use (either delta method or based on Fieller's method). There are also canned commands/packages/online calculators.
If you are not willing to assume that, you can use regression, since that returns the covariances pretty easily. Then you can either use the more complicated formulas that include the covariance term or have a stats package handle that for you. Another option is to log the outcome or use a GLM with a log link, so that the effects come out in (approximately) percent terms.
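With a logged outcome or a log link, the treatment coefficient is in log points, and turning it into a percent change takes one more step. The short Python sketch below reuses the counts from the toy example to show the conversion, and how far log points drift from the percent change when the effect is not small. Variable names are illustrative.

```python
from math import expm1, log

mean_c = 136 / 1359   # control success rate from the toy example
mean_t = 163 / 1359   # treatment success rate from the toy example

log_diff = log(mean_t) - log(mean_c)      # log of the ratio of the two means
rel_change = (mean_t - mean_c) / mean_c   # the relative change itself

# exp(log difference) - 1 recovers the relative change exactly when the
# log difference is the log of the ratio of the two group means.
converted = expm1(log_diff)

print(f"log points: {log_diff:.4f}, relative change: {rel_change:.4f}")
print(f"converted back: {converted:.4f}")
```

A regression on the logged outcome estimates a difference in mean logs (a ratio of geometric means) instead, so there the conversion is only approximately the relative change in arithmetic means.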
Personally, I find some version of regression easiest, and that still works even if there is no correlation since in that case the covariance will be close to zero.
You can also bootstrap easily, either the relative change itself or using regression coefficients. There is no formula since this is a resampling method. Make sure to set the seed so that you can replicate your work each time.
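As a rough illustration of bootstrapping the relative change directly, here is a minimal Python sketch using only the standard library. The function name, the number of replications, and the use of a percentile interval are illustrative choices; the data are the binary outcomes rebuilt from the toy example's counts.

```python
import random


def bootstrap_relative_change(control, treatment, reps=1000, seed=10122020):
    """Percentile bootstrap CI for (mean(treatment) - mean(control)) / mean(control)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = []
    for _ in range(reps):
        # Resample each group with replacement, keeping group sizes fixed.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        mean_c = sum(c) / len(c)
        mean_t = sum(t) / len(t)
        draws.append((mean_t - mean_c) / mean_c)
    draws.sort()
    lower = draws[int(0.025 * reps)]
    upper = draws[int(0.975 * reps) - 1]
    return lower, upper


# Binary outcomes rebuilt from the toy example's counts.
control = [1] * 136 + [0] * (1359 - 136)
treatment = [1] * 163 + [0] * (1359 - 163)
lo, hi = bootstrap_relative_change(control, treatment)
print(f"95% percentile CI for the relative change: [{lo:.3f}, {hi:.3f}]")
```

This resamples within each group, which keeps the group sizes fixed. Resampling whole rows and recomputing the group means, which is what Stata's `bootstrap` prefix does unless you specify `strata()`, is the other common choice.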
None of these approaches is exact; they are all approximations. In the blood pressure example below, they align pretty closely.
For example, the delta method formula for the standard error of the relative change is
$$SE \left( \frac{B-A}{A} \right) \approx \sqrt{\frac{Var(B)\cdot A^2 - 2 \cdot Cov(A,B)\cdot A \cdot B + Var(A)\cdot B^2}{A^4}},$$
where $A$ is the mean in the control group, $B$ is the mean in the treatment group, and the variances and covariance are those of the two sample means (so $Var(A)$ is the squared standard error of $A$). Assuming uncorrelated means sets the covariance term to zero, which simplifies the formula. Otherwise, regression is the easiest way to get the covariance between the two means.
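To make the delta method concrete, the Python sketch below computes the standard error from the gradient of $g(A,B)=B/A$, that is, $Var(B/A) \approx Var(B)/A^2 - 2B\,Cov(A,B)/A^3 + B^2\,Var(A)/A^4$. The function name and arguments are hypothetical, and the inputs are the toy counts from earlier with binomial variances for the two means.

```python
from math import sqrt


def delta_se_relative_change(a, b, var_a, var_b, cov_ab=0.0):
    """First-order (delta method) SE of (b - a) / a = b / a - 1.

    a, b         : control and treatment means
    var_a, var_b : variances of those means (squared standard errors)
    cov_ab       : covariance between the two means (0 if independent)
    """
    # Gradient of g(a, b) = b / a with respect to (a, b).
    d_a = -b / a**2
    d_b = 1 / a
    var_g = d_a**2 * var_a + d_b**2 * var_b + 2 * d_a * d_b * cov_ab
    return sqrt(var_g)


# Toy example from earlier: independent groups, binomial variances of the means.
n = 1359
a, b = 136 / n, 163 / n
var_a, var_b = a * (1 - a) / n, b * (1 - b) / n
se = delta_se_relative_change(a, b, var_a, var_b)
print(f"relative change = {(b - a) / a:.4f}, SE = {se:.4f}")
```

Passing a positive `cov_ab` shrinks the standard error when both means are positive, which is why ignoring a positive correlation between the means is conservative.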
Below I will compare blood pressure between men and women (analogous to a treated and control group) using Stata. I annotated the Stata code with some brief explanations.
You can find some regression-based examples of Stata and R code in Lye, J., & Hirschberg, J. (2018). Ratios of Parameters: Some Econometric Examples. Australian Economic Review, 51(4), 578–602. doi:10.1111/1467-8462.12300.
In this dataset, women have 5% lower BP relative to men, which is about 8 mmHg in absolute terms. All of the relative difference CIs are roughly in the [-8%,-2%] range:
. sysuse bplong, clear
(fictional blood-pressure data)
. keep if when=="After":when
(120 observations deleted)
. isid patient
. /* summary stats */
. table sex, c(mean bp semean bp sd bp N bp)
----------------------------------------------------------
Sex | mean(bp) sem(bp) sd(bp) N(bp)
----------+-----------------------------------------------
Male | 155.5167 1.967891 15.24322 60
Female | 147.2 1.515979 11.74272 60
----------------------------------------------------------
. label list sex
sex:
0 Male
1 Female
. set seed 10122020
.
. /* (A) Absolute effect */
. ttest bp, by(sex) reverse
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Female | 60 147.2 1.515979 11.74272 144.1665 150.2335
Male | 60 155.5167 1.967891 15.24322 151.5789 159.4544
---------+--------------------------------------------------------------------
combined | 120 151.3583 1.294234 14.17762 148.7956 153.921
---------+--------------------------------------------------------------------
diff | -8.316667 2.484107 -13.23587 -3.397459
------------------------------------------------------------------------------
diff = mean(Female) - mean(Male) t = -3.3480
Ho: diff = 0 degrees of freedom = 118
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0005 Pr(|T| > |t|) = 0.0011 Pr(T > t) = 0.9995
. regress bp i.sex
Source | SS df MS Number of obs = 120
-------------+---------------------------------- F(1, 118) = 11.21
Model | 2075.00833 1 2075.00833 Prob > F = 0.0011
Residual | 21844.5833 118 185.123588 R-squared = 0.0867
-------------+---------------------------------- Adj R-squared = 0.0790
Total | 23919.5917 119 201.004972 Root MSE = 13.606
------------------------------------------------------------------------------
bp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -8.316667 2.484107 -3.35 0.001 -13.23587 -3.397459
_cons | 155.5167 1.756529 88.54 0.000 152.0383 158.9951
------------------------------------------------------------------------------
.
. /* (B) Relative Effect */
.
. /* (-1) logged outcome t-test (works for strictly positive data and small relative differences) */
. generate ln_bp = ln(bp)
. ttest ln_bp, by(sex) reverse
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Female | 60 4.988759 .0100682 .0779881 4.968612 5.008905
Male | 60 5.041963 .0127957 .0991153 5.016358 5.067567
---------+--------------------------------------------------------------------
combined | 120 5.015361 .0084655 .092735 4.998598 5.032123
---------+--------------------------------------------------------------------
diff | -.053204 .0162819 -.0854466 -.0209615
------------------------------------------------------------------------------
diff = mean(Female) - mean(Male) t = -3.2677
Ho: diff = 0 degrees of freedom = 118
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0007 Pr(|T| > |t|) = 0.0014 Pr(T > t) = 0.9993
.
. /* (0) bootstrap means */
. capture program drop mybs
. program define mybs, rclass
1. quietly summarize bp if sex=="Female":sex
2. scalar female_avg_bp = r(mean)
3. quietly summarize bp if sex=="Male":sex
4. scalar male_avg_bp = r(mean)
5. return scalar ratio = (female_avg_bp - male_avg_bp)/male_avg_bp
6. end
.
. bootstrap ratio = r(ratio), reps(500) nodots nowarn: mybs
Bootstrap results Number of obs = 120
Replications = 500
command: mybs
ratio: r(ratio)
------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ratio | -.0534777 .0153194 -3.49 0.000 -.0835031 -.0234522
------------------------------------------------------------------------------
.
.
. /* (1b) delta method using regression and ratio of predictions by hand */
. regress bp i.sex
Source | SS df MS Number of obs = 120
-------------+---------------------------------- F(1, 118) = 11.21
Model | 2075.00833 1 2075.00833 Prob > F = 0.0011
Residual | 21844.5833 118 185.123588 R-squared = 0.0867
-------------+---------------------------------- Adj R-squared = 0.0790
Total | 23919.5917 119 201.004972 Root MSE = 13.606
------------------------------------------------------------------------------
bp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -8.316667 2.484107 -3.35 0.001 -13.23587 -3.397459
_cons | 155.5167 1.756529 88.54 0.000 152.0383 158.9951
------------------------------------------------------------------------------
. nlcom ratio:(_b[1.sex])/_b[_cons]
ratio: (_b[1.sex])/_b[_cons]
------------------------------------------------------------------------------
bp | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ratio | -.0534777 .015552 -3.44 0.001 -.083959 -.0229963
------------------------------------------------------------------------------
. margins, eydx(sex) // another way: the semi-elasticity d(ln y)/dx, in log points
Conditional marginal effects Number of obs = 120
Model VCE : OLS
Expression : Linear prediction, predict()
ey/dx w.r.t. : 1.sex
------------------------------------------------------------------------------
| Delta-method
| ey/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -.0549607 .0164307 -3.35 0.001 -.0874979 -.0224235
------------------------------------------------------------------------------
Note: ey/dx for factor levels is the discrete change from the base level.
.
. /* (2a) logged outcome regression */
. /* works for strictly positive data and small relative differences */
. regress ln_bp i.sex
Source | SS df MS Number of obs = 120
-------------+---------------------------------- F(1, 118) = 10.68
Model | .084920032 1 .084920032 Prob > F = 0.0014
Residual | .938452791 118 .00795299 R-squared = 0.0830
-------------+---------------------------------- Adj R-squared = 0.0752
Total | 1.02337282 119 .008599772 Root MSE = .08918
------------------------------------------------------------------------------
ln_bp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -.053204 .0162819 -3.27 0.001 -.0854466 -.0209615
_cons | 5.041963 .011513 437.94 0.000 5.019164 5.064762
------------------------------------------------------------------------------
.
. /* (2b) GLM with log link; exp(coef) - 1 is the relative change */
. glm bp i.sex, family(gaussian) link(log) nolog
Generalized linear models Number of obs = 120
Optimization : ML Residual df = 118
Scale parameter = 185.1236
Deviance = 21844.58333 (1/df) Deviance = 185.1236
Pearson = 21844.58333 (1/df) Pearson = 185.1236
Variance function: V(u) = 1 [Gaussian]
Link function : g(u) = ln(u) [Log]
AIC = 8.075427
Log likelihood = -482.5256155 BIC = 21279.66
------------------------------------------------------------------------------
| OIM
bp | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -.0549607 .0164307 -3.35 0.001 -.0871643 -.0227571
_cons | 5.046753 .0112948 446.82 0.000 5.024616 5.06889
------------------------------------------------------------------------------
.
. /* (3) bootstrap ratio of predictions from regression by hand */
. bootstrap ratio = (_b[1.sex]/_b[_cons]), reps(500) nodots: regress bp i.sex
Linear regression Number of obs = 120
Replications = 500
command: regress bp i.sex
ratio: _b[1.sex]/_b[_cons]
------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ratio | -.0534777 .0147228 -3.63 0.000 -.0823338 -.0246215
------------------------------------------------------------------------------
.
. /* (4) Fieller's method (uncorrelated means) */
. /* there is also a correlated means version */
. fieller bp, by(sex) reverse
Confidence Interval for a Quotient by Fieller's Method (Unpaired Data)
Numerator Mean: 147.2
Denominator Mean: 155.51667
Quotient: .94652234
95% CI: .91652092 - .97771318
.
. /* (5) delta method by hand (uncorrelated means) */
. /* there is also a correlated means version */
. table sex, c(mean bp sd bp N bp)
----------------------------------------------
Sex | mean(bp) sd(bp) N(bp)
----------+-----------------------------------
Male | 155.5167 15.24322 60
Female | 147.2 11.74272 60
----------------------------------------------
. display "SE(ratio) = " sqrt(((11.74272^2/60)*(155.5167)^2+(15.24322^2/60)*(147.2)^2)/(155.5167^4))
SE(ratio) = .01544269
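Fieller's method above calculates the quotient $\frac{\bar Y_{female}}{\bar Y_{male}}$ rather than the relative change, but the two are equivalent: the relative change is the quotient minus one, so subtracting 1 from each endpoint gives a CI of roughly [-8.3%, -2.2%]. The paper cited above has R and Stata code to calculate the relative change with regression.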
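For readers without the `fieller` command, here is a minimal Python sketch of the uncorrelated-means interval, obtained by solving the usual quadratic in the quotient. The critical value (a t quantile with 118 degrees of freedom, hard-coded) and the use of separate group variances are assumptions about what the command does; under them the sketch lands on the interval printed above.

```python
from math import sqrt


def fieller_ci(num, den, se_num, se_den, crit):
    """Fieller interval for num / den with independent numerator and denominator.

    Solves (num - theta * den)^2 = crit^2 * (se_num^2 + theta^2 * se_den^2)
    for theta, which is a quadratic in theta.
    """
    c2 = crit**2
    qa = den**2 - c2 * se_den**2
    qb = -2 * num * den
    qc = num**2 - c2 * se_num**2
    if qa <= 0:
        # Denominator not significantly different from zero: no bounded interval.
        raise ValueError("denominator too noisy for a bounded Fieller interval")
    root = sqrt(qb**2 - 4 * qa * qc)
    return (-qb - root) / (2 * qa), (-qb + root) / (2 * qa)


# Means and standard errors from the Stata output; 1.980272 is the two-sided
# 5% t critical value with 118 degrees of freedom (hard-coded assumption).
lo_q, hi_q = fieller_ci(147.2, 155.51667, 1.515979, 1.967891, 1.980272)
print(f"quotient CI:        [{lo_q:.5f}, {hi_q:.5f}]")
print(f"relative change CI: [{lo_q - 1:.5f}, {hi_q - 1:.5f}]")
```

Unlike the delta method interval, this one is not symmetric around the point estimate, and it becomes unbounded when the denominator mean is not clearly different from zero.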
Here is some code showing that the p-values can also differ depending on whether absolute or relative change is used with Wald and Wald-type tests:
. sysuse bplong, clear
(fictional blood-pressure data)
. keep if when=="After":when
(120 observations deleted)
. estimates clear
. qui regress bp i.sex
. /* Absolute effect Wald-type test */
. testnl _b[1.sex] = 0
(1) _b[1.sex] = 0
chi2(1) = 11.21
Prob > chi2 = 0.0008
. display r(p)
.00081412
. /* Relative effect Wald-type test */
. testnl _b[1.sex]/_b[_cons] = 0
(1) _b[1.sex]/_b[_cons] = 0
chi2(1) = 11.82
Prob > chi2 = 0.0006
. di r(p)
.00058466
. /* Absolute effect Wald test */
. test _b[1.sex] = 0
( 1) 1.sex = 0
F( 1, 118) = 11.21
Prob > F = 0.0011
. display r(p)
.00109302
. /* Relative effect Wald test */
. margins, eydx(sex) post
Conditional marginal effects Number of obs = 120
Model VCE : OLS
Expression : Linear prediction, predict()
ey/dx w.r.t. : 1.sex
------------------------------------------------------------------------------
| Delta-method
| ey/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
Female | -.0549607 .0164307 -3.35 0.001 -.0874979 -.0224235
------------------------------------------------------------------------------
Note: ey/dx for factor levels is the discrete change from the base level.
. test _b[1.sex] = 0
( 1) 1.sex = 0
F( 1, 118) = 11.19
Prob > F = 0.0011
. di r(p)
.00110368
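The two Wald-type `testnl` results can be traced by hand from the regression output. The Python sketch below does this under two assumptions: the statistic is the squared ratio of the estimate to its delta-method standard error, compared against a chi-squared distribution with one degree of freedom (equivalently, a two-sided normal test), and the covariance between `_cons` and `1.sex` equals minus the variance of `_cons`, which holds for a regression on an intercept and a single dummy. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Estimates copied from the regression output above.
b_cons, se_cons = 155.5167, 1.756529   # _cons: mean for men
b_sex, se_sex = -8.316667, 2.484107    # 1.sex: women minus men
# With an intercept and a single dummy, Cov(_cons, 1.sex) = -Var(_cons).
cov = -se_cons**2


def two_sided_p(z):
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Absolute effect: H0 is _b[1.sex] = 0.
z_abs = b_sex / se_sex

# Relative effect: H0 is _b[1.sex] / _b[_cons] = 0, with a delta-method SE.
ratio = b_sex / b_cons
d_cons = -b_sex / b_cons**2   # derivative of the ratio w.r.t. _b[_cons]
d_sex = 1 / b_cons            # derivative of the ratio w.r.t. _b[1.sex]
se_ratio = sqrt(d_cons**2 * se_cons**2 + d_sex**2 * se_sex**2
                + 2 * d_cons * d_sex * cov)
z_rel = ratio / se_ratio

print(f"absolute: chi2 = {z_abs**2:.2f}, p = {two_sided_p(z_abs):.6f}")
print(f"relative: chi2 = {z_rel**2:.2f}, p = {two_sided_p(z_rel):.6f}")
```

The relative-effect standard error here is the same quantity `nlcom` reports for the ratio, so the two statistics differ only because the ratio's standard error is not a fixed multiple of the standard error of the absolute difference.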