Visualize: Figures 5 & 6. Description: "Hypotheses: Attention (S3), Prediction (S4), or Beam Search (S5) Error: encoder words and decoder words (E/D), attention (S3), top-k predictions for each time step in the decoder (S4), and beam search tree (S5)"
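
To make the S4 view concrete, the following is a minimal sketch of how the top-k predictions for each decoder time step might be computed from raw decoder scores. It is not taken from the figures or the tool's code: the function names `softmax` and `top_k_per_step`, the toy vocabulary, and the logit values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_per_step(step_logits, vocab, k=3):
    """For each decoder time step, return the k most probable (token, prob) pairs."""
    result = []
    for logits in step_logits:
        probs = softmax(logits)
        # Rank vocabulary indices by probability, highest first, and keep k.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        result.append([(vocab[i], probs[i]) for i in ranked])
    return result


# Hypothetical toy data: 4-word vocabulary, 2 decoder time steps.
vocab = ["the", "cat", "sat", "mat"]
step_logits = [
    [2.0, 0.5, 0.1, -1.0],  # scores at time step 0
    [0.2, 3.0, 1.0, 0.0],   # scores at time step 1
]

top = top_k_per_step(step_logits, vocab, k=2)
for t, candidates in enumerate(top):
    print(t, [(tok, round(p, 3)) for tok, p in candidates])
```

A prediction error (S4) would show up in such a listing as the correct token being absent from, or ranked low in, the top-k list at some time step; a beam search error (S5) would instead be one where the correct token appears in the list but the search prunes the hypothesis containing it.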