Deep Learning TensorFlow Textbook: 10.2.1 seq2seq

Next, we define the function that performs one training step of the model on a single batch.

Code 10-36 Defining the model training function

import tensorflow as tf

# encoder, decoder, targ_lang, loss_function, optimizer, and BATCH_SIZE
# are assumed to be defined in earlier listings.
def train_step(inp, targ, enc_hidden):
  loss = 0

  with tf.GradientTape() as tape:
    enc_output, enc_hidden = encoder(inp, enc_hidden)
    dec_hidden = enc_hidden  # the encoder's hidden state initializes the decoder
    # Start every sequence in the batch with the <start> token
    dec_input = tf.expand_dims([targ_lang.word_index['<start>']] * BATCH_SIZE, 1)

    for t in range(1, targ.shape[1]):  # use the target words as the decoder input
      # Send the encoder output (enc_output) to the decoder
      predictions, dec_hidden, _ = decoder(dec_input, dec_hidden, enc_output)
      loss += loss_function(targ[:, t], predictions)
      dec_input = tf.expand_dims(targ[:, t], 1)  # feed the target word as the next input

  batch_loss = (loss / int(targ.shape[1]))  # compute the loss/error
  variables = encoder.trainable_variables + decoder.trainable_variables
  gradients = tape.gradient(loss, variables)
  optimizer.apply_gradients(zip(gradients, variables))
  return batch_loss
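
The loop in train_step feeds the ground-truth target word, not the decoder's own prediction, as the next decoder input. This technique is commonly called teacher forcing. The following is a minimal, framework-free sketch of the idea and not the book's code: toy_decoder and sequence_loss are hypothetical names, and the toy decoder is an assumed stand-in that simply favors the token after the one it receives. It compares the averaged loss when the target is fed back with the loss when the model's own prediction is fed back.

```python
import math


def toy_decoder(prev_token, vocab_size):
    """Hypothetical stand-in for one decoder step.

    Returns a probability distribution over the vocabulary that puts
    0.9 on (prev_token + 1) % vocab_size and spreads 0.1 over the rest.
    """
    probs = [0.1 / (vocab_size - 1)] * vocab_size
    probs[(prev_token + 1) % vocab_size] = 0.9
    return probs


def sequence_loss(target, vocab_size, teacher_forcing=True):
    """Sum the cross-entropy over steps 1..T-1 and divide by T,
    mirroring batch_loss = loss / targ.shape[1] in train_step."""
    loss = 0.0
    dec_input = target[0]  # plays the role of the <start> token
    for t in range(1, len(target)):
        probs = toy_decoder(dec_input, vocab_size)
        loss += -math.log(probs[target[t]])
        if teacher_forcing:
            dec_input = target[t]  # feed the ground-truth token
        else:
            dec_input = probs.index(max(probs))  # feed the model's own prediction
    return loss / len(target)


# The toy decoder cannot predict the jump from token 2 to token 4.
target = [0, 1, 2, 4, 0]
forced = sequence_loss(target, vocab_size=5, teacher_forcing=True)
free_running = sequence_loss(target, vocab_size=5, teacher_forcing=False)
print(forced, free_running)
```

With teacher forcing, the mistake at the third step does not carry over, because the next input is reset to the correct token. In the free-running case the wrong prediction becomes the next input, so the following step is penalized as well and the averaged loss is higher.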
