Visualize: Figure4; Description: "At inference time, the attention matrix provides a word-level interpretation: for each MeSH prediction, the model shows which words receive high attention. This helps indexers evaluate and proofread the indexing results of our model. Figure 4 shows an example of attention used for interpretation."
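
The word-level reading of attention described above can be illustrated with a small sketch. It assumes the attention matrix has one row per predicted MeSH label and one column per input word; the description does not state the shape, so this layout is an assumption. The function name `top_attended_words`, the parameter `k`, and the toy words, labels, and weights are all hypothetical and are not taken from the model or from Figure 4.

```python
from typing import Dict, List, Sequence, Tuple


def top_attended_words(
    words: Sequence[str],
    labels: Sequence[str],
    attention: Sequence[Sequence[float]],
    k: int = 3,
) -> Dict[str, List[Tuple[str, float]]]:
    """For each label, return the k words with the highest attention weight.

    Assumes attention[i][j] is the weight that label i places on word j
    (a hypothetical layout, not confirmed by the source).
    """
    if len(attention) != len(labels):
        raise ValueError("attention needs one row per label")
    result: Dict[str, List[Tuple[str, float]]] = {}
    for label, row in zip(labels, attention):
        if len(row) != len(words):
            raise ValueError("each attention row needs one weight per word")
        # Rank words by attention weight, highest first.
        ranked = sorted(zip(words, row), key=lambda pair: pair[1], reverse=True)
        result[label] = ranked[:k]
    return result


# Toy data, invented for illustration only.
words = ["insulin", "therapy", "in", "diabetic", "mice"]
labels = ["Insulin", "Mice"]
attention = [
    [0.55, 0.20, 0.02, 0.18, 0.05],  # weights for the label "Insulin"
    [0.05, 0.05, 0.02, 0.13, 0.75],  # weights for the label "Mice"
]

highlights = top_attended_words(words, labels, attention, k=2)
for label, pairs in highlights.items():
    print(label, pairs)
```

An indexer reviewing a prediction could then check whether the highlighted words plausibly support the label, which is the proofreading use the description mentions.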