9.1 Degrees of Freedom of the Chi-Square Distribution

9.1.1 Definition of the Chi-Square Distribution

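As already shown in the previous chapter, a random variable that follows the chi-square distribution can be generated directly from its definition, as follows.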

\[Z_i \stackrel{\text{i.i.d.}}{\sim} N(0, 1^2), \quad i = 1, \ldots, k\]
\[V = Z_1^2 + Z_2^2 + \cdots + Z_k^2 \sim \chi^2(k)\]

 

The \(Z_i\) are independent and identically distributed (i.i.d.).
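As a quick sanity check of this definition, the following minimal sketch sums k squared standard normal draws and compares the sample mean and variance with the theoretical values E(V) = k and Var(V) = 2k. The seed, `k`, `n`, and the variable names are illustrative choices, not part of the original.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
k = 4                       # illustrative degrees of freedom
n = 20000                   # illustrative number of simulated Vs

Z = matrix(rnorm(n * k), nrow = n, ncol = k)  # each row holds k i.i.d. N(0, 1) draws
V = rowSums(Z^2)                               # V = Z_1^2 + ... + Z_k^2

c(mean = mean(V), var = var(V))  # should be close to k and 2k
```

The same check can be repeated with other values of `k` to see both moments scale with the number of summed terms.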

9.1.2 Setting

Df = 10   # Degrees of freedom: number of squared Zs summed to form each chi-square variable V
nV = 5000 # Number of chi-square variables (Vs) to simulate

 

9.1.3 Generate Zs (standard normal distribution) and Vs (chi-square distribution)

9.1.3.1 First V variable (rV): sum of 10 squared Z variables

mZ = matrix(rnorm(nV*Df), nrow=nV, ncol=Df) # nV x Df matrix of standard normal Zs
rV = rowSums(mZ^2) # each row: sum of squares of 10 independent Zs
length(rV) # Should be equal to nV
[1] 5000

 

9.1.3.2 Second V variable (rV2): 9 free Zs and 1 derived Z

mZ2 = mZ
mZ2[, Df] = -rowSums(mZ[, 1:(Df - 1)])/(Df - 1) # 10th Z: minus the mean of the first 9, so it is not free
rV2 = rowSums(mZ2^2) # sum of squares of 9 free Zs and 1 derived Z
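For comparison, a standard construction that loses exactly one degree of freedom subtracts each row's mean before squaring: for k i.i.d. standard normals, \(\sum_i (Z_i - \bar{Z})^2 \sim \chi^2(k-1)\). The sketch below is self-contained and is offered only as an illustration; the seed and the names with an `ex` prefix are assumptions, not the author's code.

```r
set.seed(2)                      # arbitrary seed
exDf = 10
exN = 5000
exZ = matrix(rnorm(exN * exDf), nrow = exN, ncol = exDf)

exC = exZ - rowMeans(exZ)        # center each row; rows of exC sum to zero
exV = rowSums(exC^2)             # theoretically chi-square with exDf - 1 df

max(abs(rowSums(exC)))           # numerically zero: one linear constraint per row
mean(exV)                        # should be close to exDf - 1
```

Here all ten centered values are used, but because they must sum to zero only nine of them can vary freely.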

 

9.1.4 Get theoretical values

x = seq(0, 30, length.out=101)
y1 = dchisq(x, df=Df)
y2 = dchisq(x, df=Df - 1)
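
For k > 2, the chi-square density has its mode at k - 2, and the peak gets lower as k grows, which is why the curve with Df - 1 is the taller of the two. A small sketch checking this numerically (the finer grid and the variable names are illustrative assumptions):

```r
k = 10
xs = seq(0, 30, length.out = 3001)      # finer grid than above, step 0.01
d10 = dchisq(xs, df = k)
d9 = dchisq(xs, df = k - 1)

xs[which.max(d10)]   # location of the peak for df = 10, approximately 8
xs[which.max(d9)]    # location of the peak for df = 9, approximately 7
max(d9) > max(d10)   # TRUE: the curve with fewer df peaks higher
```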

9.1.5 Plot

# dev.new(width=14, height=7)
plot(x, y2, type="l", col="red", ylab="Density") # This is taller, so plot first
lines(density(rV2), lty=2, col="red")
lines(x, y1)
lines(density(rV), lty=2)
legend(17, 0.1,
       c("Df=10 Theoretical", "Df=10 Simulation", "Df=9 Theoretical", "Df=9 Simulation"),
       lty=c(1,2,1,2), col=c(1,1,2,2))

Figure 9.1: Degrees of freedom and the chi-square distribution

9.1.6 Fit and get Df using MASS::fitdistr()

#### Degree of freedom of rV

mean(rV) # E(V) = k (df), so the sample mean estimates the df
[1] 9.973559

 

9.1.6.1 Degree of freedom of rV2

mean(rV2)
[1] 9.086426
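
The sample mean of rV2 lands slightly above 9, which is consistent with how the tenth column was built in 9.1.3.2. Writing \(V_2\) for the quantity simulated as rV2 (a label introduced here for convenience), and using \(Z_{10} = -\frac{1}{9}\sum_{i=1}^{9} Z_i\), a short calculation gives its expectation:

\[E(Z_{10}) = 0, \qquad \operatorname{Var}(Z_{10}) = \frac{1}{9^2}\sum_{i=1}^{9}\operatorname{Var}(Z_i) = \frac{1}{9}, \qquad \text{so } E(Z_{10}^2) = \frac{1}{9}\]
\[E(V_2) = \sum_{i=1}^{9} E(Z_i^2) + E(Z_{10}^2) = 9 + \frac{1}{9} \approx 9.11\]

So rV2 behaves very much like a chi-square variable with 9 degrees of freedom, although under this construction its mean is not exactly 9.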

 

9.1.6.2 Another way: MASS::fitdistr()

library(MASS) # provides fitdistr()
fitdistr(rV, dchisq, start=list(df=mean(rV)), lower=0)
       df     
  10.00869288 
 ( 0.06015057)
fitdistr(rV2, dchisq, start=list(df=mean(rV2)), lower=0)
       df    
  9.12664244 
 (0.05715603)
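
To make the fitting step less of a black box: `fitdistr()` estimates `df` by maximum likelihood, that is, it searches for the value that maximizes the sum of log densities. A minimal sketch of the same idea with base R's `optimize()` follows. It is not how `fitdistr()` is implemented internally, and the seed, the sample, the search interval, and the names `exSample` and `negLogLik` are illustrative assumptions.

```r
set.seed(3)                                # arbitrary seed
exSample = rchisq(5000, df = 10)           # illustrative chi-square(10) sample

# Negative log-likelihood of the sample as a function of df
negLogLik = function(df) -sum(dchisq(exSample, df = df, log = TRUE))

# One-dimensional search over a plausible range of df
fit = optimize(negLogLik, interval = c(0.1, 50))
fit$minimum                                # ML estimate of df, should be near 10
mean(exSample)                             # moment estimate, for comparison
```

The two estimates need not coincide, since the sample mean is a moment estimate and `fit$minimum` is a likelihood-based one, but both should be close to the true df for a sample of this size.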