Deep Learning TensorFlow Textbook: 10.2.1 seq2seq

Now that the network is built, we will implement attention.

Code 10-32 Building the attention layer

import tensorflow as tf

class EDAttention(tf.keras.layers.Layer):
  def __init__(self, units):
    super().__init__()
    self.W1 = tf.keras.layers.Dense(units)  # applied to values
    self.W2 = tf.keras.layers.Dense(units)  # applied to the query (hidden state)
    self.V = tf.keras.layers.Dense(1)       # reduces each time step to one score

  def call(self, query, values):
    hidden_with_time_axis = tf.expand_dims(query, 1)  # ①
    score = self.V(tf.nn.tanh(
        self.W1(values) + self.W2(hidden_with_time_axis)))  # ②
    # attention_weights has shape (batch size, max sequence length, 1).
    attention_weights = tf.nn.softmax(score, axis=1)
    context_vector = attention_weights * values
    # context_vector has shape (batch size, hidden size).
    context_vector = tf.reduce_sum(context_vector, axis=1)
    return context_vector, attention_weights

attention_layer = EDAttention(10)

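① tf.expand_dims inserts a new axis of size 1 at the specified position of a tensor. Here it inserts an axis at position 1 of the query tensor, which adds one dimension to the hidden state: a query of shape (batch size, hidden size) becomes (batch size, 1, hidden size), so it can be added to the projected values by broadcasting.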
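The shape changes described above can be traced step by step. The following is a minimal sketch, not the book's code: it uses NumPy in place of TensorFlow (np.expand_dims plays the role of tf.expand_dims), plain random matrices stand in for the Dense layers W1, W2, and V with the bias terms omitted, and the sizes (batch of 2, sequence length 5, hidden size 4, 10 units) are illustrative assumptions.

```python
import numpy as np

# Illustrative sizes (assumptions, not values from the text).
batch, seq_len, hidden, units = 2, 5, 4, 10

rng = np.random.default_rng(0)
query = rng.normal(size=(batch, hidden))            # hidden state used as the query
values = rng.normal(size=(batch, seq_len, hidden))  # sequence that is attended over

# Random stand-ins for the Dense layers W1, W2, and V (biases omitted).
W1 = rng.normal(size=(hidden, units))
W2 = rng.normal(size=(hidden, units))
V = rng.normal(size=(units, 1))

# Insert a time axis at position 1: (batch, hidden) -> (batch, 1, hidden).
hidden_with_time_axis = np.expand_dims(query, 1)

# The size-1 time axis broadcasts against seq_len when the projections are added.
score = np.tanh(values @ W1 + hidden_with_time_axis @ W2) @ V  # (batch, seq_len, 1)

# Softmax over the time axis (axis=1), shifted by the max for numerical stability.
exp = np.exp(score - score.max(axis=1, keepdims=True))
attention_weights = exp / exp.sum(axis=1, keepdims=True)

# Weighted sum of values over the time axis.
context_vector = (attention_weights * values).sum(axis=1)  # (batch, hidden)

print(hidden_with_time_axis.shape, score.shape,
      attention_weights.shape, context_vector.shape)
```

With these sizes the script prints (2, 1, 4) (2, 5, 1) (2, 5, 1) (2, 4), which matches the shapes noted in the comments of Code 10-32. Without the expand_dims step, the (batch, units) query projection could not in general be added to the (batch, seq_len, units) values projection.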
