---
title: "Distributions&inference"
author: "Zhou"
date: "1/27/2021"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Binomial in R

First, we use the `rbinom` function to draw random numbers from a binomial distribution.

```{r}
p=.5
n=5 # number of layers/trials
k=10000 # number of balls (random draws)
x=rbinom(k, n, p)
hist(x)
```

Second, we try a different success probability and number of trials.

```{r}
p=.4
n=200 # number of layers/trials
k=10000 # number of balls (random draws)
x=rbinom(k, n, p)
hist(x)
```

Third, we standardize the vector `x` from the second simulation: subtract the binomial mean `n*p` and divide by the standard deviation `sqrt(n*p*(1-p))`.

```{r}
mu=n*p # theoretical mean of Binomial(n, p)
sigma2=n*p*(1-p) # theoretical variance of Binomial(n, p)
z=(x-mu)/sqrt(sigma2)
hist(z)
```

Fourth, let's plot the estimated probability density of `z`.

```{r}
d=density(z)
par(mfcol=c(2,1),mar = c(3,4,1,1)) # define the graphical parameters (canvas)
plot(d) # top panel: density curve only
plot(d) # bottom panel: same curve, filled by polygon()
polygon(d, col="red", border="blue")
# dev.off() # close the canvas
```

## Normal distribution in R

First, draw from a normal distribution with the same mean and variance as the binomial above, then plot its histogram and estimated density.

```{r}
# Normal distribution
# reuses k = 10000, n = 200, p = 0.4 from the chunks above
mu=n*p
sigma2=n*p*(1-p)
x=rnorm(k, mean=mu, sd=sqrt(sigma2))
hist(x)
d=density(x)
plot(d)
```

Second, add the squares of two independent standard normal variables. The sum follows a chi-square distribution with 2 degrees of freedom, so its mean should be close to 2 and its variance close to 4.

```{r}
par(mfrow=c(1,3),mar = c(3,4,1,1))
k=10000
x1=rnorm(k,0,1)
x2=rnorm(k,0,1)
y=x1^2 + x2^2 # a new distribution derived from 2 standard normal variables
mean(y)
var(y)
hist(x1)
hist(x2)
hist(y)
```

## Poisson, chi-square, F and t distributions in R

```{r}
par(mfrow=c(1,4),mar = c(3,4,1,1))
lambda=.5
x_Poisson=rpois(k, lambda)
hist(x_Poisson)
x_Chi=rchisq(k,2)
hist(x_Chi)
x_F=rf(k,1, 100)
hist(x_F)
x_t=rt(k,2)
hist(x_t)
```

## Central Limit Theorem (CLT)

Averages of large samples are approximately normally distributed. First, let's see how to sample from different distributions.

```{r}
par(mfrow=c(5,1),mar = c(3,4,1,1))
# Binomial
p=.05
n=100 # number of layers/trials
k=10000 # number of balls (random draws)
x=rbinom(k, n,p)
d=density(x)
plot(d,main="Binomial")
# Poisson
lambda=10
x=rpois(k, lambda)
d=density(x)
plot(d,main="Poisson")
# Chi-square
x=rchisq(k,5)
d=density(x)
plot(d,main="Chi-square")
# F
x=rf(k,10, 10000)
d=density(x)
plot(d,main="F dist")
# t
x=rt(k,5)
d=density(x)
plot(d,main="t dist")
```

Second, write a function that returns the means of groups of ten values sampled from a distribution, then plot the density of those means.

```{r}
i2mean = function(x,n=10){
  k=length(x)
  nobs=k/n # number of groups; assumes k is a multiple of n
  # get matrix: nobs (1000) rows and n (10) cols
  xm=matrix(x,nobs,n)
  # get the mean of each row
  y=rowMeans(xm)
  # with k = 10000 and n = 10 we get 1000 means
  return (y)
}

par(mfrow=c(5,1),mar = c(3,4,1,1))
# Binomial
p=.05
n=100 # number of layers/trials
k=10000 # number of balls (random draws)
x=i2mean(rbinom(k, n,p))
d=density(x) # density of the means
plot(d,main="Binomial")
# Poisson
lambda=10
x=i2mean(rpois(k, lambda))
d=density(x) # density of the means
plot(d,main="Poisson")
# Chi-square
x=i2mean(rchisq(k,5))
d=density(x) # density of the means
plot(d,main="Chi-square")
# F
x=i2mean(rf(k,10, 10000))
d=density(x) # density of the means
plot(d,main="F dist")
# t
x=i2mean(rt(k,5))
d=density(x) # density of the means
plot(d,main="t dist")
```
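
The density plots show the shape of the means but not how close they are to what the CLT predicts. The chunk below is a minimal sketch of a numerical check, under some assumptions that are not in the original: the helper `group_means`, the seed, and the choice of the chi-square case with 5 degrees of freedom are all illustrative. A chi-square variable with `df` degrees of freedom has mean `df` and variance `2*df`, so means of `n` independent draws should have mean `df` and standard deviation close to `sqrt(2*df/n)`.

```{r}
# Hypothetical helper (not part of the original chunks):
# means of consecutive groups of n values; leftover values are dropped.
group_means = function(x, n = 10) {
  nobs = length(x) %/% n                     # number of complete groups
  xm = matrix(x[seq_len(nobs * n)], nrow = nobs, ncol = n, byrow = TRUE)
  rowMeans(xm)
}

set.seed(1)   # assumed seed, only so that the run is reproducible
k = 10000     # number of random draws
n = 10        # group size
df = 5        # degrees of freedom of the chi-square distribution

x = rchisq(k, df)
m = group_means(x, n)

# Compare observed and theoretical moments of the means
sd_theory = sqrt(2 * df / n)
c(mean_obs = mean(m), mean_theory = df,
  sd_obs = sd(m), sd_theory = sd_theory)

# Normal Q-Q plot of the standardized means:
# points near the line indicate approximate normality
z = (m - df) / sd_theory
qqnorm(z)
qqline(z)
```

With groups of only ten values the Q-Q plot may still bend away from the line in the right tail, because the chi-square distribution is skewed. Increasing `n` should bring the points closer to the line.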