Data Analysis and Statistical Inference (1)





A 2005 survey found that 7% of teenagers (ages 13 to 17) suffer from an extreme fear of spiders (arachnophobia). At a summer camp there are 10 teenagers sleeping in each tent. Assume that these 10 teenagers are independent of each other. What is the probability that at least one of them suffers from arachnophobia?

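The probability that any one teenager is not afraid is 0.93, so by independence the probability that none of the 10 is afraid is 0.93^10. The probability that at least one is afraid is therefore: 1 - 0.93^10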

[1] 0.516
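As a quick check of the calculation above, the same number can be obtained two ways in R: directly from the complement rule, and from the binomial distribution. The variable names here are my own and not part of the course material.

```r
# Probability that a single teenager has arachnophobia, and the tent size
p_fear <- 0.07
n <- 10

# Complement rule: 1 - P(nobody in the tent is afraid)
p_direct <- 1 - (1 - p_fear)^n

# Same quantity via the binomial distribution: 1 - P(X = 0)
p_binom <- 1 - dbinom(0, size = n, prob = p_fear)

round(c(direct = p_direct, binomial = p_binom), 3)
```

Both approaches rely on the independence assumption stated in the question.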


Last semester, out of 170 students taking a particular statistics class, 71 students were “majoring” in social sciences and 53 students were majoring in pre-medical studies. There were 6 students who were majoring in both pre-medical studies and social sciences. What is the probability that a randomly chosen student is majoring in social sciences, given that s/he is majoring in pre-medical studies?

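I had always assumed the answer should be 6/71, that is, (pre-medical & social sciences) / (social sciences). That ratio is actually P(M|S), the probability of majoring in pre-medical studies given a social sciences major.

But the answer is: If M is the event a student is majoring in pre-medical studies and S is the event s/he is majoring in social sciences, then calculate P(S|M) = P(S & M) / P(M) = (6/170) / (53/170) = 6/53. The condition is "majoring in pre-medical studies", so the denominator is the 53 pre-medical students, not the 71 social sciences students.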
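To make the difference between the two conditional probabilities concrete, here is a small R sketch using the counts from the question. The variable names are illustrative only.

```r
# Counts from the question
n_total  <- 170
n_social <- 71
n_premed <- 53
n_both   <- 6

# P(S | M) = P(S & M) / P(M): condition on pre-medical students
p_both   <- n_both / n_total
p_premed <- n_premed / n_total
p_social_given_premed <- p_both / p_premed   # equals 6/53

# P(M | S): the quantity I mistakenly computed, conditioning on social sciences
p_premed_given_social <- n_both / n_social   # equals 6/71

round(c(S_given_M = p_social_given_premed,
        M_given_S = p_premed_given_social), 3)
```

The total of 170 cancels in P(S|M), which is why the answer reduces to a ratio of the two counts.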



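The custom function calc_streak, which was loaded in with the data, can be used to calculate the lengths of all shooting streaks. Roughly, it takes a sequence of the characters H (hit) and M (miss), splits the sequence at each M, and counts how many H's fall in each resulting segment.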




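calc_streak <- function(x){
  y <- rep(0, length(x))   # start with every shot marked as a miss (0)
  y[x == "H"] <- 1         # mark hits as 1
  y <- c(0, y, 0)          # pad both ends with a miss
  wz <- which(y == 0)      # positions of all misses
  streak <- diff(wz) - 1   # number of hits between consecutive misses
  return(streak)
}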
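To see what the function returns, here is a self-contained sketch that repeats the definition (completed with a return statement) and applies it to a short made-up sequence. The input vector shots is hypothetical and not from the course data.

```r
# calc_streak as described above, repeated so this example runs on its own
calc_streak <- function(x) {
  y <- rep(0, length(x))
  y[x == "H"] <- 1         # hits become 1, misses stay 0
  y <- c(0, y, 0)          # pad both ends with a miss
  wz <- which(y == 0)      # positions of all misses
  diff(wz) - 1             # hits between consecutive misses
}

# Hypothetical shot sequence: H = hit, M = miss
shots <- c("H", "M", "M", "H", "H", "M", "H", "H", "H")
streaks <- calc_streak(shots)
streaks          # streak lengths 1, 0, 2, 3
table(streaks)   # how often each streak length occurs
```

Two misses in a row produce a streak of length 0, which is why 0 appears in the result.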





In a simulation, you set the ground rules of a random process and then the computer uses random numbers to generate an outcome that adheres to those rules. As a simple example, you can simulate flipping a fair coin with the following.

outcomes <- c("heads", "tails")
sample(outcomes, size = 1, replace = TRUE)

The vector outcomes can be thought of as a hat with two slips of paper in it: one slip says "heads" and the other says "tails". The function sample draws one slip from the hat and tells us if it was a head or a tail. Run the second command listed above several times. Just like when flipping a coin, sometimes you'll get a heads, sometimes you'll get a tails, but in the long run, you'd expect to get roughly equal numbers of each.

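# an unfair coin: heads with probability 0.2, tails with probability 0.8
sample(outcomes, size = 100, replace = TRUE, prob = c(0.2, 0.8))

# draw three values from 0 to 3, with replacement
outcomes <- c(0, 1, 2, 3)
sample(outcomes, size = 3, replace = TRUE)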
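The "long run" claim can be checked with a larger simulation. The sketch below is my own addition: it fixes a seed with set.seed so the run is reproducible, flips a fair and an unfair coin 10,000 times each, and compares the observed share of heads with the stated probabilities. The sample size and seed are arbitrary choices.

```r
set.seed(1)   # arbitrary seed, only so the results can be reproduced

outcomes <- c("heads", "tails")
n_flips <- 10000

# Fair coin: both outcomes equally likely (the default for sample)
fair <- sample(outcomes, size = n_flips, replace = TRUE)

# Unfair coin: heads with probability 0.2, tails with probability 0.8
unfair <- sample(outcomes, size = n_flips, replace = TRUE, prob = c(0.2, 0.8))

# Observed proportion of heads in each simulation
prop_fair   <- mean(fair == "heads")
prop_unfair <- mean(unfair == "heads")

c(fair = prop_fair, unfair = prop_unfair)
```

With this many flips the proportions should land close to 0.5 and 0.2, though any single run will differ slightly from those values.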

