Deep Learning TensorFlow Textbook: 10.2.1 seq2seq

② The score has shape (batch size, max sequence length, 1). The last axis is 1 because the score is passed through self.W1. For reference, the tensor has shape (batch size, max sequence length, units) before self.W1 is applied.
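The shape change described above can be checked with a small NumPy sketch. It assumes a common additive (Bahdanau-style) formulation of the score; the weight names `W_values`, `W_query`, and `v_proj` are hypothetical stand-ins for Dense layers and are not taken from the book's EDAttention class, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, max_len, hidden, units = 4, 7, 16, 10  # arbitrary illustrative sizes

enc_output = rng.normal(size=(batch, max_len, hidden))  # encoder outputs
dec_hidden = rng.normal(size=(batch, hidden))           # decoder hidden state

# Hypothetical weight matrices standing in for Dense layers (assumption).
W_values = rng.normal(size=(hidden, units))
W_query = rng.normal(size=(hidden, units))
v_proj = rng.normal(size=(units, 1))  # final projection down to one unit

# Add a time axis so the query broadcasts over every encoder position.
query = dec_hidden[:, np.newaxis, :]                    # (batch, 1, hidden)

# Before the final projection: (batch, max_len, units)
pre_score = np.tanh(enc_output @ W_values + query @ W_query)

# After the final projection the last axis collapses to 1: (batch, max_len, 1)
score = pre_score @ v_proj

# Softmax over the sequence axis gives one weight per encoder position.
exp = np.exp(score - score.max(axis=1, keepdims=True))
attention_weights = exp / exp.sum(axis=1, keepdims=True)  # (batch, max_len, 1)

# Weighted sum of encoder outputs: (batch, hidden)
context_vector = (attention_weights * enc_output).sum(axis=1)

print(pre_score.shape, score.shape, context_vector.shape)
```

Because the softmax runs over the sequence axis, the weights for each sentence sum to 1, and the context vector keeps the encoder's hidden size.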

Next, build the decoder network.

Code 10-33 Building the decoder network

class Decoder(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
    super(Decoder, self).__init__()
    self.batch_sz = batch_sz
    self.dec_units = dec_units
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(self.dec_units,
                                   return_sequences=True,
                                   return_state=True,
                                   recurrent_initializer='glorot_uniform')
    self.fc = tf.keras.layers.Dense(vocab_size)
    self.attention = EDAttention(self.dec_units)  # apply attention

  def call(self, x, hidden, enc_output):
    # enc_output has shape (batch size, max sequence length, hidden size).
    context_vector, attention_weights = self.attention(hidden, enc_output)
    # After the embedding layer, x has shape (batch size, 1, embedding dimension).
    x = self.embedding(x)
    # After concatenation, x has shape (batch size, 1, hidden size + embedding dimension).
    x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)
    # Pass the merged vector to the GRU.
    output, state = self.gru(x)
    # After the reshape, output has shape (batch size × 1, hidden size).
    output = tf.reshape(output, (-1, output.shape[2]))
    # The dense layer produces logits of shape (batch size, vocabulary size).
    x = self.fc(output)
    return x, state, attention_weights

decoder = Decoder(vocab_tar_size, embedding_dim, units, BATCH_SIZE)
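To follow the tensor shapes through one decoding step, the sketch below mirrors the operations in call using NumPy only. It is a shape walk-through under stated assumptions, not the book's implementation: the GRU is replaced by a single tanh projection (`W_rnn`, hypothetical), the embedding and dense layers are plain random matrices, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, enc_hidden = 4, 16                        # arbitrary illustrative sizes
vocab_size, embedding_dim, dec_units = 50, 8, 16

# Inputs to one decoding step.
x = rng.integers(0, vocab_size, size=(batch, 1))       # one token id per sentence
context_vector = rng.normal(size=(batch, enc_hidden))  # as returned by attention

# Embedding lookup: (batch, 1) -> (batch, 1, embedding_dim)
embedding_table = rng.normal(size=(vocab_size, embedding_dim))
embedded = embedding_table[x]

# Give the context vector a time axis, then concatenate on the last axis:
# (batch, 1, enc_hidden + embedding_dim)
merged = np.concatenate([context_vector[:, np.newaxis, :], embedded], axis=-1)

# Stand-in for the GRU (assumption: NOT a real GRU, only shape-compatible).
W_rnn = rng.normal(size=(enc_hidden + embedding_dim, dec_units))
output = np.tanh(merged @ W_rnn)                 # (batch, 1, dec_units)

# Drop the length-1 time axis: (batch * 1, dec_units)
flat = output.reshape(-1, output.shape[2])

# Dense layer to vocabulary logits: (batch, vocab_size)
W_fc = rng.normal(size=(dec_units, vocab_size))
logits = flat @ W_fc

print(embedded.shape, merged.shape, flat.shape, logits.shape)
```

The reshape is what lets the final dense layer emit one row of vocabulary logits per sentence in the batch.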
