---
title: "Lab 3 review"
---

Last lab, we learned how to bootstrap confidence intervals. Reminder of how we did that:

```{r, message=FALSE}
library(tidyverse)
library(languageR)
theme_set(theme_bw())

# resample with replacement, take the mean
resample_mean <- function(values) {
  samples <- sample(values, replace = TRUE)
  return(mean(samples))
}

# repeat resampling num_samples times to get distribution of means
resample_means <- function(values, num_samples) {
  replicate(num_samples, resample_mean(values))
}

# use empirical quantiles of the distribution of means to get 95% CI
ci_lower <- function(values) quantile(values, 0.025)
ci_upper <- function(values) quantile(values, 0.975)

# for each class, compute mean and bootstrap CI
weight_summary <- weightRatings %>%
  group_by(Class) %>%
  summarise(mean_rating = mean(Rating),
            ci_lower_mean = ci_lower(resample_means(Rating, 1000)),
            ci_upper_mean = ci_upper(resample_means(Rating, 1000)))

ggplot(weight_summary, aes(x = Class)) +
  geom_pointrange(aes(y = mean_rating, ymin = ci_lower_mean, ymax = ci_upper_mean))
```

1. Your new advisor has a super yolo attitude and tells you he's okay with your confidence intervals containing the true value of the mean 70% of the time rather than the usual 95%. What would you need to do differently to make this change? Do you expect your new CIs to be narrower or wider? Why? Change the functions above to reflect this, compute new CIs, and compare your new and old CIs.

```{r}
```

2. You decide means are too boring and want to look at another property of the data. Pick a different statistic of interest, compute its value for the data, and bootstrap a confidence interval of your estimate.

```{r}
```

3. A sudden heatwave wipes out 80% of your data. Subset the dataset to the remaining portion (hint: `sample_frac()`) and recompute CIs. Are the new CIs narrower or wider? Why?

```{r}
```

4. Recall that earlier we computed CIs in a different way, from a parametric model -- we used the fact that the sampling distribution of the mean of any distribution should be approximately normal given a large enough sample size (by the Central Limit Theorem). Compute CIs using that method and compare them to bootstrapped CIs.

```{r}
```

5. In what situations might it be better to use parametric methods or better to use bootstrapping?
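
As a possible starting point for comparing the two approaches, here is a minimal sketch on made-up data rather than `weightRatings`. The helpers `parametric_ci()` and `bootstrap_ci()` and the vector `toy` are hypothetical names introduced only for illustration; they are not part of the lab code. The parametric version assumes the normal approximation to the sampling distribution of the mean, and the bootstrap version assumes the percentile method with both endpoints taken from a single set of resampled means.

```{r}
# Hypothetical helper (not from the lab): normal-approximation CI for the mean
parametric_ci <- function(values, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)          # roughly 1.96 when level = 0.95
  se <- sd(values) / sqrt(length(values))  # standard error of the mean
  c(lower = mean(values) - z * se, upper = mean(values) + z * se)
}

# Hypothetical helper: percentile bootstrap CI for the mean,
# with both endpoints computed from the same resampled means
bootstrap_ci <- function(values, level = 0.95, num_samples = 1000) {
  means <- replicate(num_samples, mean(sample(values, replace = TRUE)))
  alpha <- 1 - level
  ci <- quantile(means, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  c(lower = ci[1], upper = ci[2])
}

set.seed(1)                # make the toy example reproducible
toy <- rexp(50, rate = 1)  # made-up skewed data, not weightRatings
parametric_ci(toy)
bootstrap_ci(toy)
```

Trying this with different sample sizes, or with data that is more or less skewed, is one way to see where the two intervals agree and where they start to diverge.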