TheBook

4. Partition the following data, then fit a logistic regression model

>>> df_credit = read.csv("C:/creditset.csv")   # read the creditset.csv file
>>> dim(df_credit)
[1] 2000   6

# Partition the data: a random 70% of the rows for training, the remaining 30% for testing
>>> idx = sample(1:nrow(df_credit), 0.7*nrow(df_credit))
>>> train = df_credit[ idx, ]
>>> test = df_credit[-idx, ]
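
Because sample() draws a different random subset on every run, the fitted coefficients will vary slightly from one session to the next. The following is a minimal sketch of a reproducible 70/30 split, assuming a fixed seed; the seed value 123 and the helper names n, n_train, and n_test are illustrative assumptions and do not come from the book. It uses only the row count reported by dim(df_credit), so it runs without the CSV file.

```r
set.seed(123)                              # arbitrary seed (an assumption), fixes the random draw
n <- 2000                                  # row count reported by dim(df_credit)

# Draw 70% of the row indices without replacement (the default for sample())
idx <- sample(seq_len(n), round(0.7 * n))

n_train <- length(idx)                     # rows that would go into train
n_test  <- n - n_train                     # rows left over for test
print(c(train = n_train, test = n_test))   # 1400 training rows, 600 test rows

# No index appears twice, so train and test never share a row
print(anyDuplicated(idx))                  # 0
```

The 1400 training rows agree with the 1399 degrees of freedom reported for the null deviance in the model summary.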

>>> model = glm( default10yr~income+age+loan, family="binomial", data=train)
>>> summary(model)
Call:
glm(formula = default10yr ~ income + age + loan, family = "binomial",
    data = train)

Deviance Residuals:
     Min        1Q    Median        3Q      Max
-2.21267  -0.07319  -0.00783  -0.00031  2.77476

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.033e+01  1.029e+00   10.047   <2e-16 ***
income      -2.554e-04  2.581e-05   -9.896   <2e-16 ***
age         -3.579e-01  3.208e-02  -11.157   <2e-16 ***
loan         1.787e-03  1.623e-04   11.012   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1148.33 on 1399 degrees of freedom
Residual deviance:  304.31 on 1396 degrees of freedom
AIC: 312.31

Number of Fisher Scoring iterations: 9
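
The test partition created above is not used in the fitting step, so a natural follow-up is to score the model on those held-out rows. The sketch below is one possible way to do this, not the book's own procedure. Since creditset.csv is not available here, it simulates a stand-in data frame with the same column names; the simulated values, the seed, the 0.5 cutoff, and the names prob, pred, conf_mat, and accuracy are all assumptions made for illustration.

```r
# Simulated stand-in for creditset.csv (hypothetical values, NOT the book's data)
set.seed(1)
n <- 2000
df_credit <- data.frame(
  income = runif(n, 20000, 70000),
  age    = runif(n, 18, 64),
  loan   = runif(n, 0, 14000)
)
# Generate a 0/1 outcome from an assumed logistic relationship
eta <- 10 - 0.00025 * df_credit$income - 0.35 * df_credit$age + 0.0018 * df_credit$loan
df_credit$default10yr <- rbinom(n, 1, plogis(eta))

# Same 70/30 partitioning as in the transcript
idx   <- sample(seq_len(n), round(0.7 * n))
train <- df_credit[idx, ]
test  <- df_credit[-idx, ]

# Same model formula as in the transcript
model <- glm(default10yr ~ income + age + loan, family = "binomial", data = train)

# Predicted default probabilities for the held-out rows
prob <- predict(model, newdata = test, type = "response")

# Convert probabilities to class labels; the 0.5 cutoff is an assumption
pred <- ifelse(prob > 0.5, 1, 0)

# Confusion matrix and overall accuracy on the test set
conf_mat <- table(actual = test$default10yr, predicted = pred)
accuracy <- mean(pred == test$default10yr)
print(conf_mat)
print(accuracy)
```

With the real data, only the lines from predict() onward would be needed, applied to the model and test objects already created in the transcript. type = "response" returns probabilities rather than log-odds, which is why a probability cutoff can be applied directly.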