TEDSF Interview Skills Q&A Platform
How can you resample statistical tests in R language?
in Data Science by Platinum (104k points) | 16 views

1 Answer

Best answer

 

BOOTSTRAPPING

Bootstrapping is the process of resampling with replacement: every value in the sample has an equal probability of being selected on each draw, so a value can appear more than once in a resample. Resample, calculate a statistic (e.g. the mean), and repeat this hundreds or thousands of times; the spread of the resulting statistics estimates the uncertainty of that statistic (e.g. its standard error or a confidence interval). The bootstrap makes fewer assumptions about the underlying distribution than calculating the standard error directly from a formula.
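As a rough illustration of the resample / recompute / repeat loop, here is a minimal base R sketch of a percentile bootstrap for the mean. The sample x, the seed, and the number of resamples (n_boot) are assumptions made up for this example, and the percentile interval is only one of several ways to build a bootstrap confidence interval.

```r
# Minimal percentile-bootstrap sketch for the mean (base R only).
set.seed(42)      # assumption: seed chosen only so the example is reproducible
x <- runif(10)    # hypothetical sample of 10 values
n_boot <- 2000    # illustrative number of bootstrap resamples

# Resample with replacement and recompute the mean each time
boot_means <- replicate(n_boot, mean(sample(x, replace = TRUE)))

# Bootstrap estimate of the standard error of the mean
boot_se <- sd(boot_means)

# 95% percentile confidence interval for the mean
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

With only 10 observations the interval will be wide; the point is the mechanics, not the numbers.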

PERMUTATION TESTING

Permutation testing is similar to bootstrapping, except that it resamples without replacement: once a value is selected it cannot be selected again, so no value is duplicated. In effect, this simply shuffles the values. For a univariate statistic (e.g. the mean), shuffling changes nothing. However, when there are two or more variables, reshuffling one of them breaks its pairing with the others and so changes the test statistic (e.g. a correlation or regression coefficient). Usually the response/outcome/y variable is the one reshuffled, and usually for tests that use discrete or categorical (“yes”, “no”) variables.

ds <- data.frame(y = runif(10), x = runif(10))
ds
##             y          x
## 1  0.26248353 0.60789022
## 2  0.27608331 0.81612978
## 3  0.35928985 0.47067467
## 4  0.34308694 0.86670014
## 5  0.61019344 0.23596764
## 6  0.31907683 0.53597891
## 7  0.51808777 0.77629172
## 8  0.39792031 0.09387856
## 9  0.04529551 0.75595895
## 10 0.56524394 0.44996910
# Using spearman correlation to be consistent with the next example
cor(ds$y, ds$x, method = "spearman") 
## [1] -0.5393939
# Resampled the y only (reshuffled the order)
ds$resample_y <- sample(ds$y)
cor(ds$resample_y, ds$x, method = "spearman")
## [1] -0.07878788
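
A single reshuffle like the one above is only one draw from the null distribution. Repeating it many times and comparing the observed correlation against the reshuffled ones gives a permutation p-value. The following is a minimal base R sketch of that idea, not the method used by any particular package; the seed, the simulated data, and the number of permutations (n_perm) are assumptions for illustration.

```r
# Minimal permutation-test sketch for a Spearman correlation (base R only).
set.seed(1)                                     # assumption: seed for reproducibility
ds <- data.frame(y = runif(10), x = runif(10))  # hypothetical data, same shape as above
n_perm <- 5000                                  # illustrative number of reshuffles

# Observed test statistic
obs_rho <- cor(ds$y, ds$x, method = "spearman")

# Null distribution: reshuffle y only, keep x fixed, recompute the correlation
perm_rho <- replicate(n_perm, cor(sample(ds$y), ds$x, method = "spearman"))

# Two-sided p-value: share of reshuffles at least as extreme as the observed value.
# The +1 counts the observed arrangement itself; the small tolerance guards
# against floating-point ties between equal rank correlations.
p_value <- (sum(abs(perm_rho) >= abs(obs_rho) - 1e-12) + 1) / (n_perm + 1)

obs_rho
p_value
```

Because the data are simulated independently here, a large p-value is the expected outcome most of the time.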

 

library(coin)
## Loading required package: methods
## Loading required package: survival
## 
## Attaching package: 'survival'
## The following object is masked from 'package:boot':
## 
##     aml
# Reuse the ds data frame created above; y and x are its columns
spearman_test(y ~ x, data = ds)
## 
## 	Asymptotic Spearman Correlation Test
## 
## data:  y by x
## Z = -1.6182, p-value = 0.1056
## alternative hypothesis: true rho is not equal to 0

There are many more tests found within the coin package. Check them out in the vignette:

vignette("coin", package = "coin")
by Platinum (104k points)
