Kaggle Know-How from Kaggle Medalists: 7.3 EDA

# average length of the words in the text
df_train["mean_word_len"] = df_train["comment_text"].apply(
    lambda x: np.mean([len(w) for w in str(x).split()])
)
df_test["mean_word_len"] = df_test["comment_text"].apply(
    lambda x: np.mean([len(w) for w in str(x).split()])
)

df_train.describe()

▲ Figure 7-8 Text statistics results
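
To make the mean_word_len feature concrete, here is a minimal sketch that applies the same per-comment calculation to a small toy DataFrame. The toy comments and the helper name mean_word_len are assumptions for illustration and are not from the competition data. One detail to watch: with the lambda above, an empty comment yields NaN (np.mean of an empty list), so this sketch returns 0.0 in that case instead. That is one possible choice, not necessarily what the book intends.

```python
import numpy as np
import pandas as pd


def mean_word_len(text):
    """Average length of whitespace-separated words in text.

    Returns 0.0 when the text has no words (an assumption made here;
    the original lambda would produce NaN for this case).
    """
    words = str(text).split()
    if not words:
        return 0.0
    return float(np.mean([len(w) for w in words]))


# Hypothetical toy data standing in for df_train["comment_text"]
toy = pd.DataFrame({"comment_text": ["hello world", "a bb ccc", ""]})
toy["mean_word_len"] = toy["comment_text"].apply(mean_word_len)

print(toy)
```

Whether to keep NaN or substitute a value such as 0.0 depends on the downstream model: tree-based libraries often handle NaN directly, while many linear models require it to be filled first.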
