On the Behaviour of BERT’s Attention for the Classification of Medical Reports
Conference proceedings contribution
Publication date:
2022
Abstract:
Since BERT and other Transformer-based models have proved successful in many NLP tasks, several studies have sought to understand why these complex deep learning architectures reach such remarkable results. These studies have focused on visualising and analysing the behaviour of each self-attention mechanism. They are often conducted on large, generic, annotated English-language datasets, using supervised probing tasks to test specific linguistic capabilities. However, several practical contexts pose difficulties: probing tasks may not be available, the documents may contain a strictly technical lexicon, and the datasets may be noisy. In this work we analyse the behaviour of BERT in a specific context, i.e. the classification of radiology reports collected from an Italian hospital. We propose (i) a simplified way to classify head patterns without relying on probing tasks or manual observations, and (ii) an algorithm for extracting the most relevant relations among words captured by each self-attention mechanism. Combining these techniques with manual observations, we present several examples of linguistic information that can be extracted from BERT in our application.
CRIS type:
4.1 Conference proceedings contribution
Authors:
Putelli, L.; Gerevini, A. E.; Lavelli, A.; Mehmood, T.; Serina, I.
Book title:
CEUR Workshop Proceedings
Published in: