Self-attention-based models for the extraction of molecular interactions from biological texts

  • Prashant Srivastava
  • Saptarshi Bej
  • Kristina Yordanova
  • Olaf Wolkenhauer*

*Corresponding author for this work

Research output: Contribution to journal › Review article / Perspective › peer-review

16 Scopus citations

Abstract

For any molecule, network, or process of interest, keeping up with new publications is becoming increasingly difficult. For many cellular processes, the number of molecules and interactions that need to be considered can be very large. Automated mining of publications can support the curation of large-scale molecular interaction maps and databases. Text mining and Natural-Language-Processing (NLP)-based techniques are finding applications in mining the biological literature, handling problems such as Named Entity Recognition (NER) and Relationship Extraction (RE). Both rule-based and Machine-Learning (ML)-based NLP approaches have been popular in this context, with multiple research and review articles examining the scope of such models in Biological Literature Mining (BLM). In this review article, we explore self-attention-based models, a special type of Neural-Network (NN)-based architecture that has recently revitalized the field of NLP, applied to biological texts. We cover self-attention models operating at the sentence or abstract level, in the context of molecular interaction extraction, published from 2019 onwards. We conducted a comparative study of the models in terms of their architecture. Moreover, we also discuss some limitations in the field of BLM that identify opportunities for the extraction of molecular interactions from biological text.
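To make the architecture under review concrete, the following is a minimal sketch of scaled dot-product self-attention, the core operation shared by the models surveyed here. It is an illustrative NumPy implementation under simplifying assumptions (single head, no masking, no positional encoding); the dimensions and variable names are hypothetical, not drawn from any specific reviewed model.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one token sequence.

    X:  (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices
    """
    # Project each token into query, key, and value spaces
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Pairwise token affinities, scaled to stabilize the softmax
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys: each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of all tokens' values,
    # i.e., a context-aware representation of that token
    return weights @ V

# Example with random data (hypothetical sizes)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))            # 5 tokens, embedding dim 8
Wq, Wk, Wv = (rng.standard_normal((8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)        # shape (5, 4)
```

Because every token attends to every other token in one step, such models can relate entity mentions that sit far apart in a sentence or abstract, which is what makes them attractive for interaction extraction.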

Original language: English
Article number: 1591
Journal: Biomolecules
Volume: 11
Issue number: 11
DOIs
State: Published - Nov 2021
Externally published: Yes

Keywords

  • Biological literature mining
  • Natural language processing
  • Relationship extraction
  • Self-attention models
  • Text mining
