RESEARCH
Bioinformatics and Data Integration
Recent advances in science and technology are leading to a revision and re-orientation of methodologies, addressing old and current issues from a new perspective. Advances in next-generation sequencing (NGS) are allowing comparative analysis of the abundance and diversity of whole microbial communities, generating a large amount of data and findings at a systems level. A current limitation for biologists is the growing demand for computational power and training required to process NGS data.
I am working on the development of new bioinformatics tools and designing new scientific workflows to integrate multi-omics data (for example, integrating metagenomics and metatranscriptomics datasets).
![(A) Bacteriophage data representation is based on DNA sequencing approaches, followed by the assembly of short reads. (B) Model architecture for deep learning on metagenomic data to predict bacteriophage genomic sequencing from metagenomic data. [Silva et al., 2025; doi: 10.1016/j.virol.2025.110559]](https://821e51e4d6.clvaw-cdnwnd.com/c36e7c882a1e71544a4cc23eba115a4d/200000070-48c8848c8b/1-s2.0-S0042682225001722-gr1.jpeg?ph=821e51e4d6)
One Health Relationships between Human, Animal, and Environmental Microbiomes
The One Health concept is a global strategy to study the relationship between human and animal health and the transfer of pathogenic and non-pathogenic species between these systems. In the clinical context, metagenomics is also a powerful weapon in the fight against antibiotic-resistant pathogens in humans and animals.
Currently, I am further investigating the role of antibiotic-resistance genes in the microbial communities of several ecosystems, including host-associated and soil microbiomes, in a One Health context.
![Reconstruction of ~3,000 microbial genomes from a large-scale sampling of animals using the One Health concept (Lemos et al. 2022, Scientific Data; doi: 10.1038/s41597-022-01465-5).](https://821e51e4d6.clvaw-cdnwnd.com/c36e7c882a1e71544a4cc23eba115a4d/200000012-600cd600cf/OneHealth.png?ph=821e51e4d6)
Genome-Resolved Metagenomics
Despite all efforts to access microbial diversity, most soil microbes are still unknown and we are far from understanding several microbe-mediated processes in soil. The development and application of novel computational strategies have successfully allowed us to reconstruct complete or near-complete genomes of rare and/or uncultured bacteria.
I am working on the application of computational methods to reconstruct microbial genomes from metagenomes using several types of datasets (from low-diversity to highly complex environments).
![Lemos et al. (2021) Trends in Microbiology - [https://www.sciencedirect.com/science/article/pii/S0966842X21000159]](https://821e51e4d6.clvaw-cdnwnd.com/c36e7c882a1e71544a4cc23eba115a4d/200000030-4ff8b4ff8f/1-s2.0-S0966842X21000159-gr1.jpg?ph=821e51e4d6)
Computational identification of CRISPR-Cas system-associated genes in the intestinal microbiota of obese individuals
Intestinal dysbiosis is considered a factor contributing to the development of complex diseases, including human obesity. Recent studies based on high-throughput DNA sequencing indicate that obese patients exhibit reduced microbial diversity. However, many functional aspects of microbial communities remain underexplored, including the diversity of genes associated with the CRISPR-Cas system, a microbial defense mechanism against bacteriophages that stores information from past infections. In this context, the main objective of this project is to identify and quantify genes associated with the CRISPR-Cas system in the gut microbiota of obese individuals and compare them with those of non-obese individuals. The analysis will rely on in silico comparative approaches using public databases of genetic sequences and genomes reconstructed from metagenomes. We aim to identify the orientation of CRISPR arrays, annotate Cas proteins, and classify their different subtypes. This will allow us to determine whether the microbiota of obese individuals exhibits higher or lower CRISPR diversity than that of non-obese individuals, and to highlight potential new genome editing biotechnologies. This project will also contribute to the training of human resources in Data Science, Microbiology, and Bioinformatics, within the interdisciplinary context of Ilum (School of Science) at the Center for Research in Energy and Materials.
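As a rough illustration of how CRISPR diversity could be compared between the two groups, the sketch below computes a Shannon index over Cas subtype counts. The choice of the Shannon index, the function name `shannon_diversity`, and the subtype lists are all assumptions made for illustration; they are not the project's actual data or pipeline.

```python
import math
from collections import Counter


def shannon_diversity(subtype_counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over the observed subtypes."""
    total = sum(subtype_counts.values())
    if total == 0:
        return 0.0  # no annotated subtypes: define diversity as zero
    return -sum(
        (n / total) * math.log(n / total)
        for n in subtype_counts.values()
        if n > 0
    )


# Hypothetical Cas subtype annotations, one entry per detected locus.
# These values are invented for illustration only.
obese = Counter(["I-C", "I-C", "II-A", "I-C"])
non_obese = Counter(["I-C", "II-A", "I-E", "III-A"])

h_obese = shannon_diversity(obese)
h_non_obese = shannon_diversity(non_obese)

print(f"obese: H' = {h_obese:.3f}")
print(f"non-obese: H' = {h_non_obese:.3f}")
```

In a real analysis the counts would come from annotated CRISPR-Cas loci per sample, and the group comparison would need a statistical test across many individuals rather than a single pair of values.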
Collaborative Research
None of what I am doing would be possible without the collaboration and help of all the people in the multidisciplinary teams I have worked with throughout my scientific career.
