Visualize: Figures 2, 3, and 5. Descriptions: "Attention over time. As the model generates each word, its attention changes to reflect the relevant parts of the image. “soft” (top row) vs “hard” (bottom row) attention. (Note that both models generated the same captions in this example.)" and "Examples of mistakes where we can use attention to gain intuition into what the model saw."
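
To make the "soft" vs "hard" distinction in these figures concrete, here is a minimal NumPy sketch of what the two rows could be visualizing at one generation step. It is an illustration under assumptions, not the authors' code: the 14x14 grid, the 512-dimensional features, the random scores, and the helper names `soft_context`, `hard_context`, and `attention_heatmap` are all hypothetical, and the block-repeat upsampling is a simple stand-in for whatever smoothing the published figures use.

```python
import numpy as np


def soft_context(alpha, features):
    """Soft attention: the expected feature vector under the weights alpha."""
    return alpha @ features  # (L,) @ (L, D) -> (D,)


def hard_context(alpha, features, rng):
    """Hard attention: sample a single location from alpha and use its feature."""
    idx = rng.choice(len(alpha), p=alpha)
    return features[idx], idx


def attention_heatmap(alpha, grid, scale):
    """Reshape the weights to a grid and upsample by block repetition for overlay."""
    amap = alpha.reshape(grid, grid)
    return np.kron(amap, np.ones((scale, scale)))


rng = np.random.default_rng(0)
L, D = 196, 512  # assumed: 14x14 image locations, 512-dim feature per location
features = rng.normal(size=(L, D))  # stand-in for CNN annotation vectors

# Stand-in attention scores for one generated word, normalized with a softmax.
scores = rng.normal(size=L)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()

z_soft = soft_context(alpha, features)            # blends all locations
z_hard, idx = hard_context(alpha, features, rng)  # picks exactly one location
heat = attention_heatmap(alpha, grid=14, scale=16)  # 224x224 map to overlay on the image
```

Repeating this for every generated word and overlaying each `heat` on the input image would give a per-word sequence of attention maps similar in spirit to the figures; for the hard variant, the map would highlight only the sampled location `idx`.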