We are interested in understanding the molecular-scale mechanisms underlying biomedical problems. This involves studying the networks of molecular interactions that underlie, for instance, the etiology of a complex disease. In addition to diseases of complex origin, such as schizophrenia or intracranial aneurysms, we are also interested in the mechanisms underlying the appearance of side effects after drug treatment.
One part of our research focuses on strategies to characterize a network of molecular interactions, once it has been obtained, and to model its behavior in order to gain insight into the etiology of the disease phenotype. In particular, we are interested in applying qualitative modeling approaches, such as Petri nets and Boolean networks.
Another line of research involves strategies for obtaining the networks that are relevant to the biomedical problems mentioned above. For this, we are developing software for the retrieval and analysis of data from public network repositories (databases of signaling pathways, gene regulatory networks and metabolic reactions), using as a starting point a large collection of genes, for instance the list of genes resulting from the analysis of a microarray experiment. Although the publicly available network databases contain valuable information, we are aware that their coverage is not complete: a lot of information about interactions between biomedical entities (genes, proteins, phenotypes, chemicals, drugs, etc.) still lies in the biomedical literature as free text.
This motivates our third line of research, which involves the use of text mining approaches to extract relationships between biomedical entities from the biomedical literature. In recent years we have developed named entity recognition (NER) systems that identify mentions of gene sequence variants in MEDLINE abstracts and link the mentions found in text to the corresponding database identifiers (in this case dbSNP). In addition, we have developed a corpus annotated with variation mentions for the evaluation of this kind of NER system. Currently, we are working on the application of natural language processing (NLP) approaches to the identification and extraction of different types of relationships between biomedical entities.
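As an illustration of what qualitative modeling with a Boolean network can look like, the following minimal sketch updates a toy three-node network synchronously and searches for its fixed points. The node names (`signal`, `kinase`, `phosphatase`) and the update rules are hypothetical and chosen only for illustration; they do not correspond to any network studied by the group.

```python
from itertools import product

# Hypothetical node names, used only for illustration.
NODES = ("signal", "kinase", "phosphatase")


def step(state):
    """Synchronous update: every node is recomputed from the previous state."""
    return {
        "signal": state["signal"],  # input node keeps its value
        "kinase": state["signal"] and not state["phosphatase"],
        "phosphatase": state["kinase"],  # negative feedback on the kinase
    }


def fixed_points():
    """Enumerate all 2^n states and keep those that map onto themselves."""
    points = []
    for values in product([False, True], repeat=len(NODES)):
        state = dict(zip(NODES, values))
        if step(state) == state:
            points.append(state)
    return points


def trajectory(state, n_steps):
    """Return the list of states visited from `state` over `n_steps` updates."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


if __name__ == "__main__":
    print(fixed_points())
    start = {"signal": True, "kinase": False, "phosphatase": False}
    for s in trajectory(start, 4):
        print(s)
```

In this toy network the only fixed point is the all-off state, while switching the signal on leads to a sustained oscillation. Steady states and cycles of this kind are the sort of qualitative behavior such models are used to explore.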
The new implementation of OSIRIS (OSIRISv1.2) incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB. HgenetInfoDB is a database that integrates data on human genes from the NCBI Gene database and dbSNP. The entity recognition module is based on a corpus of articles annotated with gene identifiers and on a new search algorithm, which uses a pattern-based search strategy and a sequence variant nomenclature dictionary to identify terms denoting SNPs and other sequence variants and to map them to dbSNP entries. Running OSIRISv1.2 generates a corpus of annotated literature linked to sequence database entries (NCBI Gene and dbSNP). The results of the searches are stored in a database that can be queried and, in the future, used for the extraction of relationships among biological entities. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, yielding 99% precision at 82% recall, for an F-score of 0.9. Results on two sets of genes, one related to cerebral aneurysm and the other to breast cancer, are currently available.
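To give a flavor of what a pattern-based search for variant mentions involves, the sketch below uses a few regular expressions to find dbSNP identifiers and simple substitution notations in a sentence. It is not the OSIRIS algorithm: the patterns, the function names and the example sentence are assumptions made for illustration, and a real system would also need the nomenclature dictionary and the mapping to dbSNP described above. The `f_score` helper simply reproduces the arithmetic behind the reported F-score from the stated precision and recall.

```python
import re

# Three-letter amino acid codes, used to build a simple substitution pattern.
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|"
       "Thr|Trp|Tyr|Val")

# Illustrative patterns only; these are NOT the patterns used by OSIRIS.
PATTERNS = {
    "dbsnp_id": re.compile(r"\brs\d+\b"),
    "protein_substitution": re.compile(rf"\b(?:{AA3})\d+(?:{AA3})\b"),
    "nucleotide_substitution": re.compile(r"\bc\.\d+[ACGT]>[ACGT]\b"),
}


def find_variant_mentions(text):
    """Return (start offset, label, matched text) tuples, ordered by offset."""
    mentions = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            mentions.append((match.start(), label, match.group()))
    return sorted(mentions)


def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative sentence, not taken from the evaluation corpus.
    sentence = "The Arg72Pro polymorphism (rs1042522) and c.215C>G were genotyped."
    for start, label, mention in find_variant_mentions(sentence):
        print(start, label, mention)
    # Precision 0.99 and recall 0.82 give roughly 0.897, reported as 0.9.
    print(round(f_score(0.99, 0.82), 3))
```

Keeping the patterns in a labeled dictionary makes it easy to add further notations, at the cost of having to resolve overlapping matches once the patterns become less specific.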
The database HgenetInfoDB is a repository of Homo sapiens genes and their sequence variants.
Visit this link for a description of the available corpora.
Bauer-Mehren A, Furlong LI, Sanz F. Pathway databases and tools for their exploitation: benefits, current limitations and challenges. Mol Syst Biol. 2009;5:290.
Bauer-Mehren A, Furlong LI, Sanz F. From SNPs to pathways: integration of functional effect of sequence variations on models of cell signalling pathways. BMC Bioinformatics. 2009;10(Suppl 8):S6.
Furlong LI, Sanz F. Identification of sequence variants of genes from biomedical literature: the OSIRIS approach. In: Prince V, Roche M (Eds). Information Retrieval for Biomedicine: Natural Language Processing for Knowledge Integration. IGI Global; 2009.
Hofmann-Apitius M, Fluck J, Furlong L, Fornes O, Kolarik C, Hanser S, Boeker M, Schulz S, Sanz F, Klinger R, Mevissen T, Gattermayer T, Oliva B, Friedrich CM. Knowledge environments representing molecular entities for the virtual physiological human. Philos Transact A Math Phys Eng Sci. 2008 Jun 17.
Furlong LI, Dach H, Hofmann-Apitius M, Sanz F. OSIRISv1.2: a named entity recognition system for sequence variants of genes in biomedical literature. BMC Bioinformatics. 2008;9:84.
Klinger R, Friedrich CM, Mevissen HT, Fluck J, Hofmann-Apitius M, Furlong LI, Sanz F. Identifying gene-specific variations in biomedical text. J Bioinform Comput Biol. 2007 Dec;5(6):1277-96.
Bonis J, Furlong LI, Sanz F. OSIRIS: a tool for retrieving literature about sequence variants. Bioinformatics. 2006 Oct 15;22(20):2567-9. Epub 2006 Jul 31.
The Integrative Biomedical Informatics Group promotes and pursues synergistic, integrative approaches across the diverse research lines developed by the research groups of the Research Unit on Biomedical Informatics (GRIB). The GRIB hosts the Node of Biomedical Informatics of the INB. The Integrative Biomedical Informatics Group focuses on applying methods and software developed in-house to human health issues, including disease prevention and diagnosis and therapeutic technologies. One of our research lines is devoted to the development of new strategies and tools for text mining, focused on literature retrieval and named entity recognition, particularly for documents dealing with genetic variations.
If you are interested in working with us, please send an e-mail to Laura I. Furlong.
Comments and suggestions: Laura I. Furlong (lfurlong@imim.es), Integrative Biomedical Informatics Group, Research Unit on Biomedical Informatics (GRIB), Institut Municipal d'Investigació Médica (IMIM) and Universitat Pompeu Fabra (UPF).
Updated: December 2009
by Laura I. Furlong