class: center, middle, inverse, title-slide

# Hypothesis Testing
### Sebastian Hoyos-Torres

---

# What is a Statistical Hypothesis?

- An Example from the United States Criminal Justice System.

- What does "not guilty" really mean in the Criminal Justice System?

- Typically, it means that there is not enough evidence to convict someone beyond a reasonable doubt.

- Reasonable doubt, in an ideal world, means that there isn't enough information to convict an individual. Quantitatively, the demarcation is usually noted as being 95 percent confident that a person committed the crime, given the evidence.

- In terms we've covered thus far, the statement would look like:
`$$P(\text{Guilty} \mid \text{Evidence})$$`

---

# Statistical Hypotheses:

- Often we are interested in formulating statistical hypotheses about some characteristic of the population.

- We will focus on hypotheses that are statements about a parameter of the distribution.

- Some examples of these hypotheses are:

  - Average height of female college students equals 63 inches.

  - Average height of female college students is at least 63 inches.

  - The percentage of people with type B blood is 30%.

  - The percentage of people with type B blood is not equal to 30%.

- The differences between a composite and simple hypothesis.

---

# Statistical Hypotheses continued:

- In our application of statistical hypotheses, we specify two competing claims about the population.

- The null hypothesis indicates that there is no effect, no difference, or that the sample is drawn from the same population. You will usually see it denoted as:
`$$H_0: \mu = 0$$`

- The alternative hypothesis indicates that there is an effect, a difference, or that the sample is not drawn from the same population. In this section, we will see this denoted as:
`$$H_a: \mu \neq 0$$`

- To decide between these claims, we traditionally turn to hypothesis testing. A hypothesis test indicates whether or not the sample data provide enough evidence to reject the null hypothesis. It does not prove that the null hypothesis is true.
---

# Types of errors:

<img src="" width="800">

---

# Type I and II errors example:

<img src="http://marginalrevolution.com/wp-content/uploads/2014/05/Type-I-and-II-errors1-625x468.jpg" width="800">

---

# Statistical Hypotheses:

- When conducting a statistical test, we typically pick a certain `\(\alpha\)` value, which is referred to as the level of significance.

- The level of significance is the probability of rejecting the null when it is actually true (this is referred to as a type I error).

- When we fail to reject the null hypothesis when it is actually false, we refer to that as a type II error.

---

# Rejection regions:

- Let's take a look at what a rejection region is with an app: https://hselab.shinyapps.io/critvalues/

- Important note!!! Just because we fail to reject the null does not mean that we accept the null hypothesis!!!

---

# Statistical Hypotheses: A further look

- As noted previously, a null hypothesis takes the form of
`$$H_0: \mu = \mu_0$$`
where `\(\mu_0\)` is referred to as the null value. The alternative hypotheses are as follows:
`$$H_a:\mu>\mu_0$$`
`$$H_a:\mu\neq\mu_0$$`
`$$H_a:\mu<\mu_0$$`

---

# The Z-Test: When we know the population variance

- Reminder: in statistics we are always concerned with estimating the values of a certain population parameter.

- If we were ever in a situation where we knew the population variance but not the population mean, we would use the Z-test.

- To conduct a Z-test, we calculate a test statistic, Z:
`$$Z = \frac{\bar{X}- \mu_0}{\sigma/\sqrt{n}}$$`

- In R:

```r
zstat <- function(xbar, mu, sigma, n){
  (xbar - mu)/(sigma/sqrt(n))
}
```

---

# The Z-Test continued:

- So what are we doing in a z-test?

- We are testing whether or not we reject a null hypothesis. The Z-statistic standardizes the sample mean so that, if the null hypothesis is true, it follows the standard normal distribution.

- Thus, our z-statistic simply indicates how many standard errors our sample mean is from the null value `\(\mu_0\)`. A z-statistic of 0 means the sample mean equals `\(\mu_0\)`.
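To make the "distance in standard errors" reading concrete, here is a minimal sketch of a two-sided z-test decision. The helper name `z_decision` and every input value are hypothetical, chosen only so the arithmetic is easy to follow; they do not come from the course data.

```r
# Sketch of a two-sided, one-sample z-test decision.
# All inputs used below are hypothetical illustration values.
z_decision <- function(xbar, mu0, sigma, n, alpha = 0.05) {
  se <- sigma / sqrt(n)                        # standard error of the sample mean
  z <- (xbar - mu0) / se                       # distance from mu0 in standard errors
  crit <- qnorm(1 - alpha / 2)                 # two-sided critical value
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)   # two-sided p-value
  list(z = z, critical = crit, p_value = p, reject = abs(z) > crit)
}

res <- z_decision(xbar = 52, mu0 = 50, sigma = 10, n = 100)
res$z       # 2: the sample mean sits 2 standard errors above mu0
res$reject  # TRUE, because |z| exceeds qnorm(0.975), about 1.96
```

Returning the statistic, the critical value, and the p-value together makes it easier to see that the rejection-region rule and the p-value rule lead to the same decision.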
---

# Example of the Z-Test:

Adapted from Penn State: Boys of a certain age are known to have a mean weight of μ = 89 pounds. A complaint is made that the boys living in a municipal children's home are underfed. As one bit of evidence, n = 25 boys (of the same age) are weighed and found to have a mean weight of `\(\bar{x}\)` = 80.9 pounds. It is known that the population standard deviation σ is 11.8 pounds.

- What is the null hypothesis in this case?

- What are the alternatives?

- What is the Z-statistic?

---

# Example of Z-Test continued:

- In our case, we are interested in testing the null hypothesis of
`$$H_0:\mu = 89 \text{ lbs}$$`

- The possible alternative hypotheses are as follows:
`$$H_a:\mu > 89 \text{ lbs}$$`
`$$H_a:\mu < 89 \text{ lbs}$$`
`$$H_a:\mu \neq 89 \text{ lbs}$$`

---

# Example continued

In the prior problem, we have identified all of the necessary elements, so all it takes is plugging them into our function:

```r
zstat(80.9,89,11.8, 25)
```

```
## [1] -3.432203
```

Remember when we were talking about `\(z_{\alpha/2}\)`?

```r
qnorm(.05/2)
```

```
## [1] -1.959964
```

Since our test statistic (-3.43) is below the critical value (-1.96), it falls within the rejection region and we can reject the null hypothesis. What if we were interested in only one side?

```r
qnorm(.05)
```

```
## [1] -1.644854
```

The first critical value belongs to a two-tailed Z-test and the second to a one-tailed Z-test. In both cases, we can reject the null, but which alternative hypothesis do the data support?

---

# Visual of the z-test

![](week9_files/figure-html/plot1-1.png)<!-- -->

---

# T-test: When we don't know the population variance

- We usually will not know the population variance or any of the other characteristics of the population.

- Thus, we rely on the t-test and the t-statistic, which is calculated as follows:
`$$T = \frac{\bar{X}- \mu_0}{S/\sqrt{n}}$$`

- In R:

```r
tstat <- function(xbar,mu,s,n){
  (xbar - mu)/(s/sqrt(n))
}
```

---

# T-test example:

Adapted from Penn State: It is assumed that the mean systolic blood pressure is μ = 120 mm Hg.
In the Honolulu Heart Study, a sample of n = 110 people had an average systolic blood pressure of 130.3 mm Hg with a standard deviation of 21.2 mm Hg. Answer the following:

- What is the null hypothesis?

- What are the possible alternative hypotheses?

- What is the t-statistic?

---

# The t-test example, continued

- At this point, we should be able to identify the null hypothesis and the possible alternatives pretty easily.
`$$H_0:\mu = 120$$`
`$$H_a:\mu > 120$$`
`$$H_a:\mu < 120$$`
`$$H_a:\mu \neq 120$$`

---

# Example continued:

Since we defined the t-statistic as a function, why not use it?

```r
tstat(130.3,120,21.2,110)
```

```
## [1] 5.095628
```

The critical values that bound the rejection region are the following:

```r
qt(.05/2,110-1)
```

```
## [1] -1.981967
```

```r
qt(.05,110-1)
```

```
## [1] -1.658953
```

Therefore, for both the one-sided and two-sided t-tests we can reject the null. Note: `qt` returns the lower-tail critical values here. Because our t-statistic is positive, we compare it against their positive counterparts (1.981967 and 1.658953), and 5.095628 exceeds both.

---

# Visual of the t-test

![](week9_files/figure-html/plot2-1.png)<!-- -->

---

# P-Values

- Often, we will hear talk of "p-values". [For example:](https://fivethirtyeight.com/features/not-even-scientists-can-easily-explain-p-values/)

- From the article: "What I learned by asking all these very smart people to explain p-values is that I was on a fool’s errand. Try to distill the p-value down to an intuitive concept and it loses all its nuances and complexity, said science journalist Regina Nuzzo, a statistics professor at Gallaudet University. “Then people get it wrong, and this is why statisticians are upset and scientists are confused.” You can get it right, or you can make it intuitive, but it’s all but impossible to do both."

- So we are going to try to get p-values right.

---

# P-values Continued.

- For the t-test, R's `t.test()` will calculate a p-value for us if we input a vector of data (base R does not ship a comparable z-test function).
For example:

```r
x <- rnorm(100, 10,2)
t.test(x, mu = 0)
```

```
## 
## 	One Sample t-test
## 
## data:  x
## t = 53.043, df = 99, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   9.458632 10.193781
## sample estimates:
## mean of x 
##  9.826207
```

---

# P-Value Cont.

- The American Statistical Association's Statement on [p-values](https://amstat.tandfonline.com/doi/pdf/10.1080/00031305.2016.1154108?needAccess=true)

- From the article: "a p-value is the probability under a specified statistical model that a sample summary of the data would be equal to or more extreme than its observed value".

- A p-value is also tied to type I error: if we reject the null whenever the p-value is at most `\(\alpha\)`, the probability of a type I error is `\(\alpha\)`.

---

# P-Values in hypothesis testing:

- Looking back at the example with the test statistic we computed, we found the following:

  - `\(t = 5.095628\)`

  - `\(t^*_{\alpha/2} = 1.981967\)`

  - `\(t^*_{\alpha} = 1.658953\)`

- With these values, we can compute the one-sided p-value as follows:

```r
pt(5.095628,df = 109,lower.tail = FALSE)
```

```
## [1] 7.355408e-07
```

where `\(df = n-1\)`. If the test were two-sided, we would double it:

```r
pt(5.095628,df = 109,lower.tail = FALSE)*2
```

```
## [1] 1.471082e-06
```

- If we had the z-statistic, we would be doing something similar with `pnorm`.

---

# Type II error

- So far, we've focused in depth on type I error, but what about type II error?

- Just a refresher: a type II error is failing to reject the null hypothesis when it is false, and its probability is denoted `\(\beta\)`.
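If the population is normally distributed and `\(\sigma\)` is known, the probability of a type II error for the upper-tailed alternative `\(H_a:\mu>\mu_0\)`, when the true mean is `\(\mu_1\)`, is computed as:
`$$\beta(\mu_1) = \Phi\left(z_\alpha + \frac{\mu_0-\mu_1}{\sigma/\sqrt{n}}\right)$$`

- In R:

```r
normtype2 <- function(alpha,mu0,mu1,sigma,n){
  pnorm(qnorm(1-alpha) + (mu0 - mu1)/(sigma/sqrt(n)))
}
```

---

# A Simulation of null hypothesis testing:

- Simulations are fun and illustrative, so let's look at an example:

```r
k <- 5000      # number of simulated samples
n <- 17        # sample size
mu <- 413      # true mean, so the null hypothesis is true
sigma <- 10    # known population standard deviation
alpha <- .05
mns1 <- numeric(k)  # 1 when the null is rejected
mns2 <- numeric(k)  # lower-tail p-value when it is not rejected
for(i in 1:k){
  xbar <- mean(rnorm(n,mu,sigma))
  z <- zstat(xbar,mu,sigma,n)  # sigma is known, so this is a z-statistic
  if(z < qnorm(alpha)) mns1[i] <- 1 else mns2[i] <- pnorm(z)
}
(rejected <- mean(mns1))
```

```
## [1] 0.0502
```

Because the null hypothesis is true in this simulation, the share of rejections estimates the type I error rate, and it sits close to `\(\alpha = .05\)`.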
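The type II error formula above can be checked the same way. The sketch below reuses the simulation's `n`, `sigma`, and `alpha`, but draws samples from an alternative with true mean `mu1 = 418`. That value, the seed, and the name `type2_formula` (a restatement of `normtype2` so the block runs on its own) are assumptions made for illustration only.

```r
set.seed(1)  # hypothetical seed so the sketch is repeatable

# Same formula as normtype2 in the slides, restated so the block runs alone.
type2_formula <- function(alpha, mu0, mu1, sigma, n) {
  pnorm(qnorm(1 - alpha) + (mu0 - mu1) / (sigma / sqrt(n)))
}

k <- 5000; n <- 17; mu0 <- 413; sigma <- 10; alpha <- 0.05
mu1 <- 418   # hypothetical true mean, chosen only for illustration

# Draw k samples from the alternative and record failures to reject H0.
not_rejected <- replicate(k, {
  xbar <- mean(rnorm(n, mu1, sigma))
  z <- (xbar - mu0) / (sigma / sqrt(n))
  z < qnorm(1 - alpha)   # TRUE when the upper-tailed test fails to reject
})

beta_sim <- mean(not_rejected)
beta_formula <- type2_formula(alpha, mu0, mu1, sigma, n)
c(simulated = beta_sim, formula = beta_formula)
```

If the formula is right, the two numbers should agree up to simulation noise. Moving `mu1` further above `mu0` should shrink both, since a larger true effect is harder to miss.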