딥러닝 파이토치 교과서: 10.3 한국어 임베딩

slide 1 of 18, currently active
slide 2 of 18
slide 3 of 18
slide 4 of 18
slide 5 of 18
slide 6 of 18
slide 7 of 18
slide 8 of 18
slide 9 of 18
slide 10 of 18
slide 11 of 18
slide 12 of 18
slide 13 of 18
slide 14 of 18
slide 15 of 18
slide 16 of 18
slide 17 of 18
slide 18 of 18

① from_pretrained를 사용하여 인터넷에서 사전 훈련된 모델을 내려받습니다. ‘bert-base-multilingual-cased’ 모델은 12개의 계층(임베딩 계층을 포함한다면 13개의 계층)으로 구성된 심층 신경망입니다.

ⓐ bert-base-multilingual-cased: 영어 외에 다국어에 대한 임베딩 처리를 할 때 사용하는 모델입니다.

ⓑ output_hidden_states: 버트 모델에서 은닉 상태의 값을 가져오기 위해 사용합니다.

다음은 ‘bert-base-multilingual-cased’ 모델을 내려받은 결과와 모델의 네트워크 형태를 보여 줍니다.

Downloading:                                             714M/714M [01:01<00:00,
100%                                                      11.8MB/s]

BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(119547, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (token_type_embeddings): Embedding(2, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        ... 중간 생략 ...
)
    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dropout): Dropout(p=0.1, inplace=False)
  (classifier): Linear(in_features=768, out_features=2, bias=True)
)

신간 소식 구독하기

뉴스레터에 가입하시고 이메일로 신간 소식을 받아 보세요.