13.1 Classification

Classification of statistics

Descriptive statistics:
- mean, variance, skewness, kurtosis, median, mode, quartiles (Q1, Q3), percentiles
Inferential statistics:
- hypothesis testing
- parameter estimation: point estimation, interval estimation
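
The descriptive measures listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch: the data values are hypothetical, the variance is the population form (divide by n), and skewness and kurtosis are the simple moment-based definitions (kurtosis is not excess kurtosis), which is one of several conventions in use.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

mean = statistics.mean(data)
variance = statistics.pvariance(data)  # population variance (divide by n)
median = statistics.median(data)
mode = statistics.mode(data)

# statistics.quantiles returns the cut points; with n=4 these are Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Central moments, used for moment-based skewness and kurtosis.
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n
skewness = m3 / m2 ** 1.5
kurtosis = m4 / m2 ** 2  # a normal distribution gives 3 under this definition

print("mean:", mean, "variance:", variance)
print("median:", median, "mode:", mode)
print("Q1:", q1, "Q3:", q3)
print("skewness:", skewness, "kurtosis:", kurtosis)
```

Software packages differ in how they interpolate quartiles and in whether they report excess kurtosis, so results from other tools may not match these values exactly.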

Distribution

Population, sample, sample space, sample size
Representing a distribution: frequency table, plot, formula, representative values
Types of distribution: discrete, continuous
Common distributions: normal, binomial, t-distribution, F-distribution, chi-square distribution
Normal distribution and the central limit theorem
Random variable
Statistic:
- test statistic
- estimator
  - unbiasedness, consistency, minimum variance (efficiency), sufficiency, robustness
Symbols and notation conventions
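
The central limit theorem named in this list can be illustrated with a small simulation. This is only a sketch under assumed settings: the source distribution (uniform on [0, 1)), the sample size, the number of repetitions, and the seed are all illustrative choices. The means of repeated samples should cluster around the population mean 0.5 with a standard deviation close to sqrt((1/12)/n).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30     # observations per sample (assumed)
REPETITIONS = 2000   # number of samples drawn (assumed)

# Draw many samples from a uniform distribution and record each sample mean.
sample_means = []
for _ in range(REPETITIONS):
    sample = [random.random() for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the sample mean
# has standard deviation sqrt((1/12) / SAMPLE_SIZE).
theoretical_sd = ((1 / 12) / SAMPLE_SIZE) ** 0.5

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)

print("mean of sample means:", observed_mean)
print("sd of sample means:", observed_sd, "theory:", theoretical_sd)
```

Plotting a histogram of `sample_means` would show the roughly bell-shaped form even though the underlying data are uniform.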

Test

Null hypothesis and alternative hypothesis
Type I error (alpha error): rejecting a true null hypothesis
Type II error (beta error): failing to reject a false null hypothesis
p-value, significance level, acceptance region, rejection region, degrees of freedom
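
To show how the p-value, significance level, and rejection decision fit together, here is a minimal sketch of a two-sided one-sample z-test. All numbers are hypothetical, and the population standard deviation is assumed known, which is what allows the normal distribution to be used instead of the t-distribution.

```python
from statistics import NormalDist

# Hypothetical setup: H0: mu = 50 versus H1: mu != 50.
mu0 = 50.0          # mean under the null hypothesis
sigma = 10.0        # population standard deviation (assumed known)
n = 100             # sample size
sample_mean = 52.0  # observed sample mean
alpha = 0.05        # significance level

# Test statistic: standardized distance of the sample mean from mu0.
z = (sample_mean - mu0) / (sigma / n ** 0.5)

# Two-sided p-value: probability of a statistic at least this extreme under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Reject H0 when the p-value falls below the significance level.
reject_null = p_value < alpha

print("z =", z, "p-value =", round(p_value, 4), "reject H0:", reject_null)
```

Here the rejection region is |z| greater than about 1.96, so a statistic of 2.0 falls just inside it.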

Estimation
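
As an illustration of the point and interval estimation named earlier in this outline, the sketch below builds a 95% confidence interval for a population mean. The numbers are hypothetical, and the population standard deviation is again assumed known so that a normal critical value applies.

```python
from statistics import NormalDist

# Hypothetical inputs.
sigma = 10.0        # population standard deviation (assumed known)
n = 100             # sample size
sample_mean = 52.0  # point estimate of the population mean
confidence = 0.95

# Critical value z such that the central `confidence` mass lies within +/- z.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Interval estimate: point estimate plus or minus the margin of error.
margin = z_crit * sigma / n ** 0.5
lower = sample_mean - margin
upper = sample_mean + margin

print("point estimate:", sample_mean)
print("95% CI:", (round(lower, 2), round(upper, 2)))
```

When the standard deviation has to be estimated from a small sample, a t-distribution critical value would normally replace `z_crit`.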

2x2 table

Sensitivity, specificity, positive predictive value, negative predictive value
Odds ratio, relative risk
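
The measures above all derive from the four cells of a 2x2 table. The sketch below uses hypothetical counts and one common cell layout (rows: test positive / test negative, or exposed / unexposed; columns: disease present / disease absent). The same counts are read first as a diagnostic table and then as an exposure table, purely to keep the example short.

```python
# Hypothetical 2x2 counts:
#                  outcome present   outcome absent
# row 1 (+ / exp)        a                 b
# row 2 (- / unexp)      c                 d
a, b, c, d = 90, 30, 10, 170

# Diagnostic-test reading: row 1 = test positive, row 2 = test negative.
sensitivity = a / (a + c)   # true positives among all with the disease
specificity = d / (b + d)   # true negatives among all without the disease
ppv = a / (a + b)           # disease present among test positives
npv = d / (c + d)           # disease absent among test negatives

# Exposure reading: row 1 = exposed, row 2 = unexposed.
odds_ratio = (a * d) / (b * c)
relative_risk = (a / (a + b)) / (c / (c + d))

print("sensitivity:", sensitivity, "specificity:", specificity)
print("PPV:", ppv, "NPV:", round(npv, 4))
print("odds ratio:", odds_ratio, "relative risk:", round(relative_risk, 2))
```

The predictive values depend on how common the outcome is in the table, while sensitivity and specificity do not.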

Parametric vs. non-parametric methods