# #StackBounty: #r #t-test #ab-test Sampling A/B test results (revenue per visitor vectors)

### Bounty: 100

I have two vectors, one for the control version A and one for the test version B.
Each vector contains the revenue per visitor. Version A had 3020 visitors who didn’t purchase, and version B had 2811. The revenue data comes from a different source:

```r
A <- c(rep(0, 3020), revenue_A[, 2])
B <- c(rep(0, 2811), revenue_B[, 2])
```

These aren’t normally distributed; they have heavy right tails. `length(revenue_A[, 2])` and `length(revenue_B[, 2])` are both around 700, and the revenue values lie between 20 and 100.
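To make the shape of the data concrete, here is a minimal sketch that builds stand-in vectors and looks at the share of zeros and the basic summaries. The real `revenue_A` and `revenue_B` are not shown, so the purchase amounts are simulated; `sim_revenue` and its shifted-exponential shape are assumptions for illustration only.

```r
# Hypothetical stand-in data: the real revenue_A / revenue_B are not available,
# so purchase amounts are simulated purely for illustration.
set.seed(1)
sim_revenue <- function(n) pmin(20 + rexp(n, rate = 1 / 15), 100)  # values in [20, 100]

A <- c(rep(0, 3020), sim_revenue(700))
B <- c(rep(0, 2811), sim_revenue(700))

# Share of non-purchasing visitors in each variant
zero_share <- c(A = mean(A == 0), B = mean(B == 0))
print(zero_share)

# Five-number summaries show the mass at zero and the right tail
print(summary(A))
print(summary(B))
```

With the real data, the same calls on `A` and `B` would show how strongly the zeros dominate each vector.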

My approach was to resample each vector 1000 times with replacement, drawing 10% of its length each time, calculate the mean revenue of each resample, and then run a `t.test` on the two sets of means:

```r
aSS <- round(0.1 * length(A))
bSS <- round(0.1 * length(B))
bootA <- numeric(1000)  # preallocated instead of growing with c()
bootB <- numeric(1000)
for (i in 1:1000) {
  tempA <- sample(A, aSS, replace = TRUE)  # 10% samples of the original data
  tempB <- sample(B, bSS, replace = TRUE)
  bootA[i] <- mean(tempA)  # Calculate mean of the sample
  bootB[i] <- mean(tempB)
}
hist(bootA)
hist(bootB)
# --> Seem to have normal distribution, let's do t.test
t.test(bootA, bootB)
```

Is this the right statistical approach? I had a hard time finding tutorials that cover this kind of statistical calculation.
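For comparison, here is a sketch of a more conventional bootstrap. It resamples each vector at its full length and summarises the difference in means with a percentile interval, instead of running a `t.test` on the bootstrap means. This is not the code from the question. The data are simulated stand-ins, and `sim_revenue`, `n_boot`, and `boot_diff` are hypothetical names.

```r
# Hypothetical stand-in data (the real revenue vectors are not available here)
set.seed(42)
sim_revenue <- function(n) pmin(20 + rexp(n, rate = 1 / 15), 100)
A <- c(rep(0, 3020), sim_revenue(700))
B <- c(rep(0, 2811), sim_revenue(700))

n_boot <- 1000

# Each replicate resamples every visitor vector at its full length
# and records the difference in mean revenue per visitor (B minus A)
boot_diff <- replicate(n_boot, {
  mean(sample(B, replace = TRUE)) - mean(sample(A, replace = TRUE))
})

observed_diff <- mean(B) - mean(A)

# 95% percentile interval for the difference in means
ci <- quantile(boot_diff, probs = c(0.025, 0.975))
print(observed_diff)
print(ci)
```

In this variant the number of replicates only controls how smoothly the interval is estimated. In a `t.test` on the bootstrap means, the 1000 replicates are treated as the sample size.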
