This is an implementation of the attention mechanism for Keras (only Bahdanau Attention is supported right now).
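For reference, Bahdanau (additive) attention computes an energy for each encoder position, normalizes the energies with a softmax, and uses the resulting weights to form a context vector. The notation below follows Bahdanau et al. (2015); the variable names in `layers/attention.py` may differ:

$$
e_{ij} = v_a^\top \tanh(W_a\, s_{i-1} + U_a\, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_k \exp(e_{ik})}, \qquad
c_i = \sum_j \alpha_{ij}\, h_j
$$

where $h_j$ are the encoder outputs, $s_{i-1}$ is the previous decoder state, and $W_a$, $U_a$, $v_a$ are learned parameters.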
```
data (Download data and place it here)
 |--- small_vocab_en.txt
 |--- small_vocab_fr.txt
layers
 |--- attention.py (Attention implementation)
examples
 |--- nmt
 |     |--- model.py (NMT model defined with Attention)
 |     |--- train.py (Code for training/inferring/plotting attention with NMT model)
 |     |--- train_variable_length_seq.py (Code for training/inferring with variable length sequences)
 |--- nmt_bidirectional
 |     |--- model.py (NMT bidirectional model defined with Attention)
 |     |--- train.py (Code for training/inferring/plotting attention with NMT model)
models (created by train_nmt.py to store the model)
results (created by train_nmt.py to store results)
```
Just like you would use any other layer, you can plug in the attention layer as follows:
```python
from attention_keras.layers.attention import AttentionLayer

attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_outputs, decoder_outputs])
```
- `encoder_outputs`: Sequence of encoder outputs returned by the RNN/LSTM/GRU (i.e. with `return_sequences=True`)
- `decoder_outputs`: The above for the decoder
- `attn_out`: Output context vector sequence for the decoder. This is to be concatenated with the output of the decoder (refer to `examples/nmt/model.py` for more details)
- `attn_states`: Energy values if you would like to generate the heat map of attention (refer to `examples/nmt/train.py` for usage)
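The snippet below is a minimal, hypothetical sketch of how the pieces above can be wired into an encoder-decoder model, assuming a TensorFlow/Keras backend. The sequence lengths, vocabulary sizes, and hidden size (`en_timesteps`, `fr_timesteps`, `en_vsize`, `fr_vsize`, `hidden`) are made up for illustration; this is not the repository's exact model (see `examples/nmt/model.py` for that).

```python
# Minimal sketch: encoder-decoder with AttentionLayer (hypothetical sizes).
from tensorflow.keras.layers import Input, GRU, Dense, Concatenate, TimeDistributed
from tensorflow.keras.models import Model
from attention_keras.layers.attention import AttentionLayer

en_timesteps, fr_timesteps = 20, 20   # hypothetical sequence lengths
en_vsize, fr_vsize = 200, 200         # hypothetical vocabulary sizes
hidden = 96                           # hypothetical hidden size

# Encoder: return the full output sequence so attention can attend over it
encoder_inputs = Input(shape=(en_timesteps, en_vsize), name='encoder_inputs')
encoder_gru = GRU(hidden, return_sequences=True, return_state=True, name='encoder_gru')
encoder_out, encoder_state = encoder_gru(encoder_inputs)

# Decoder: also returns a sequence of outputs
decoder_inputs = Input(shape=(fr_timesteps, fr_vsize), name='decoder_inputs')
decoder_gru = GRU(hidden, return_sequences=True, return_state=True, name='decoder_gru')
decoder_out, decoder_state = decoder_gru(decoder_inputs, initial_state=encoder_state)

# Attention: context vectors (attn_out) and energies (attn_states)
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([encoder_out, decoder_out])

# Concatenate the context vectors with the decoder outputs before the softmax
decoder_concat = Concatenate(axis=-1, name='concat_layer')([decoder_out, attn_out])
decoder_pred = TimeDistributed(Dense(fr_vsize, activation='softmax'))(decoder_concat)

model = Model(inputs=[encoder_inputs, decoder_inputs], outputs=decoder_pred)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```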
An example of plotting attention weights is included in `examples/nmt/train.py`. After the model is trained, the attention heat map should look like the one below.
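Below is a hedged sketch of how such a heat map could be produced with matplotlib from the energies returned in `attn_states`. The names `attention_energies`, `input_tokens`, and `output_tokens` are hypothetical placeholders for one example's attention matrix and its source/target tokens; the repository's own plotting code lives in `examples/nmt/train.py`.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_attention(attention_energies, input_tokens, output_tokens):
    """Draw one example's attention energies as a heat map.

    attention_energies: 2D array of shape (decoder_timesteps, encoder_timesteps)
    input_tokens / output_tokens: the corresponding source / target tokens
    """
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.imshow(attention_energies)  # one cell per (target, source) pair
    ax.set_xticks(np.arange(len(input_tokens)))
    ax.set_xticklabels(input_tokens, rotation=90)
    ax.set_yticks(np.arange(len(output_tokens)))
    ax.set_yticklabels(output_tokens)
    ax.set_xlabel('Encoder (source) tokens')
    ax.set_ylabel('Decoder (target) tokens')
    plt.show()
```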
Download `small_vocab_en.txt` and `small_vocab_fr.txt` from the Udacity deep learning repository and place them in the `data` folder.
Running `run.sh` will take you inside the Docker container.
Update `run.sh` appropriately depending on whether you need the GPU version or the CPU version.
- For CPU: `pip install -r requirements.txt -r requirements_tf_cpu.txt`
- For GPU: `pip install -r requirements.txt -r requirements_tf_gpu.txt`
Run `python3 src/examples/nmt/train.py`. Set `debug=True` if you need a simpler and faster run.
If you have improvements (e.g. other attention mechanisms), contributions are welcome!