Research Topics

Characterization of altered transcriptional mechanisms in advanced cancers for improved drug repurposing approaches via advanced machine learning models.

We hypothesized that aberrant activity of transcription factors (TFs) may generate aberrant transcriptional profiles that contribute to the development of advanced cancer and treatment resistance. Identifying such TFs could therefore reveal effective therapeutic targets for novel treatments that may overcome resistance. Most early methods for gene regulatory network discovery build a network of genes and then mine it to uncover previously unknown regulatory processes. We first analyzed module-based approaches to building gene regulatory networks and compared them with the well-established single-gene network approaches, showing that a novel module-based approach based on variational Bayes outperformed all other methods. Building on this work, we developed two novel methods, SparseGMM and TraRe. Applying TraRe to advanced prostate cancer, we identified abrogated TFs associated with drug response, which we were also able to identify in vivo. We have also recently moved beyond gene expression to unravel dysregulation at the biological process level. To this end, we developed NetActivity, an autoencoder-based model that efficiently and robustly translates gene expression into gene set activity scores. We then asked whether the uncovered dysregulated TFs and biological processes could be leveraged to repurpose known drugs that target them, so that the dysregulation could potentially be reverted. With this aim, we developed GeNNius, a drug-target interaction prediction model based on Graph Neural Networks that achieved state-of-the-art performance at high computational speed. We then built a novel methodology to improve the ability of these models to handle real-world applications, showing that, when properly trained, Graph Neural Network-based models can infer novel direct drug-target interactions, which we validated through surface plasmon resonance.
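To make the idea of a gene set activity score concrete, the sketch below uses a much simpler stand-in than the autoencoder described above: each gene is standardized across samples, and a set's score is the mean standardized expression of its member genes. This is not the NetActivity model; the function names, gene names, and expression values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Standardize each gene across samples (mean 0, unit variance)."""
    scaled = {}
    for gene, values in expr.items():
        mu, sigma = mean(values), pstdev(values)
        # A gene with no variation carries no signal; score it as 0 everywhere.
        scaled[gene] = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]
    return scaled


def gene_set_scores(expr, gene_sets):
    """Average the standardized expression of each set's genes, per sample."""
    scaled = zscore_genes(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in gene_sets.items():
        present = [g for g in genes if g in scaled]
        if not present:
            continue  # skip sets with no measured genes
        scores[name] = [
            mean(scaled[g][i] for g in present) for i in range(n_samples)
        ]
    return scores


# Hypothetical toy data: three samples, three genes (assumed values).
expr = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
# Hypothetical gene sets; GENE_X is not measured and is ignored.
gene_sets = {"set_1": ["GENE_A", "GENE_B"], "set_2": ["GENE_C", "GENE_X"]}

scores = gene_set_scores(expr, gene_sets)
for name, values in scores.items():
    print(name, [round(v, 3) for v in values])
```

In this toy input, the score of set_1 increases from the first sample to the last because both of its genes do, while set_2 stays at zero because its only measured gene does not vary. A learned model can replace the plain average with weights fitted to the data, but the input and output have the same shape: genes by samples in, gene sets by samples out.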

This project was initially funded by the H2020 Marie S. Curie Individual Fellowships (No. 898356) and the US Department of Defense (DoD) - CDMRP (W81XWH-20-1-0262). It is currently funded by the Spanish Ministry of Science and Innovation (Ramón y Cajal Fellowship RYC2021-033127-I, and the research projects TED2021-131300B-I00 and PID2023-151980OB-I00), and a joint Spanish ISCIII and US NIH project (AC23_2/00016).

Computational models to unravel the transcriptional machinery of cells at single-cell resolution.

Single-cell RNA-seq is a powerful tool for studying the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell types based on the expression of specific marker genes. First, we proposed JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell types can be reliably determined. Leveraging the automatic annotation, we extended our transcriptional network methodology to single-cell RNA-seq data and showed that the proposed method, SimiC, can uncover complex regulatory dynamics at single-cell resolution that were missed by previous methods. We then applied these computational methods, through a combined analysis of gene regulatory networks and cell-cell interaction maps, in several settings: 1) in liver regeneration, to elucidate a detailed blueprint of the intercellular crosstalk and cellular reprogramming that balances the metabolic and proliferative requirements of a regenerating liver; 2) in a CAR T cell study in multiple myeloma, where we showed, for the first time, an association between CAR expression on the cell membrane and clinical outcome; and 3) in del(5q) MDS patients, to uncover the molecular traits associated with CD34+ cells harboring the 5q lesion compared with those without it, showing that even cells not carrying the 5q deletion have altered key molecular pathways and are partially resistant to lenalidomide. Finally, taking the lessons learnt from the single-cell analyses, we developed ELATUS, a computational model tailored to quantify lncRNAs from single-cell data. Using ELATUS, we identified the regulatory role of key lncRNAs in triple-negative breast cancer.
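The marker-based annotation step mentioned above can be illustrated with a minimal sketch. It assigns each cell to the type whose marker genes have the highest mean expression, which is the kind of rule-based baseline that learned approaches aim to automate. This is not the JIND method; the function name, the threshold, and the expression values are assumptions for illustration, and the marker lists are a small set of commonly used T cell and B cell markers.

```python
def annotate_cell(expression, markers, min_score=0.0):
    """Return the cell type whose markers have the highest mean expression.

    expression: dict mapping gene name to expression value for one cell.
    markers: dict mapping cell type to a list of marker genes.
    min_score: a type is assigned only if its score exceeds this value.
    """
    best_type, best_score = "unassigned", min_score
    for cell_type, genes in markers.items():
        # Genes missing from the cell's profile count as zero expression.
        score = sum(expression.get(g, 0.0) for g in genes) / len(genes)
        if score > best_score:
            best_type, best_score = cell_type, score
    return best_type


# Small illustrative marker panel.
markers = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["MS4A1", "CD79A"],
}

# Hypothetical expression profiles for three cells (assumed values).
cells = {
    "cell_1": {"CD3E": 5.0, "CD3D": 4.0, "MS4A1": 0.0},
    "cell_2": {"MS4A1": 3.0, "CD79A": 6.0, "CD3E": 0.5},
    "cell_3": {},
}

labels = {name: annotate_cell(profile, markers) for name, profile in cells.items()}
for name, label in labels.items():
    print(name, "->", label)
```

A rule like this depends entirely on the chosen marker panel and threshold, which is one motivation for learning a representation from annotated data instead.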

This project is funded by a La Caixa Health Research Project (HR24-01000), the Palatchi Foundation, and the Government of Navarra (through the DIAMANTE, GRANATE, and BLANCA projects).

Efficient storage and representation of genomic data

The field of genomics is entering an exciting era with unprecedented opportunities for new medical insights, enabled by an enormous and ever-growing amount of genomic data. The data are characterized by highly distributed acquisition, huge storage requirements, and highly involved analyses that integrate heterogeneous information. My research is dedicated to identifying and addressing the challenges arising in the context of such data. This undertaking includes the design and development of new algorithms for coping with the distribution and storage of the data, for facilitating access to it, and for improving the analysis and inference performed on it.
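As a small illustration of why tailored representations matter for storage, the sketch below packs a DNA sequence into two bits per base instead of one byte per character, a fourfold reduction before any further compression. It is a textbook example and not one of the algorithms developed in this project; the function names are hypothetical, and it assumes the sequence contains only the letters A, C, G, and T.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte.

    Returns the packed bytes and the original length, which is needed
    to discard the padding in the last byte when unpacking.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a final chunk shorter than four bases.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, length):
    """Recover the DNA string from packed bytes and the original length."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


sequence = "ACGTAC"
packed, length = pack(sequence)
restored = unpack(packed, length)
print(len(sequence), "characters stored in", len(packed), "bytes")
```

Real sequencing data also carry quality scores, identifiers, and ambiguous bases, so practical formats need considerably more than this.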

This project is funded by the Spanish Ministry of Science and Innovation (PID2020-114394RA-C33), and an industrial partnership with Philips Healthcare.
