A new computer tool to predict interactions between viruses and bacteria

Andrzej Zielezinski, Sebastian Deorowicz, Adam Gudyś. PHIST: fast and accurate prediction of prokaryotic hosts from metagenomic viral sequences. Bioinformatics 38(5), 2022, pp. 1447-1449. https://doi.org/10.1093/bioinformatics/btab837

Bacteriophages, which are viruses that infect bacteria, are the most abundant biological entities on Earth. They play a crucial role in maintaining the balance of bacterial populations in ecosystems worldwide and are important tools in medicine and biotechnology. Recent advances in metagenomics and bioinformatics have enabled the use of computers to detect previously unknown viruses in various environments, such as the ocean, soil, and human gut. However, identifying which bacteria are being targeted by these newly discovered viruses can be challenging.

To address this issue, the authors of the paper have introduced a new computer tool called PHIST (Phage-Host Interaction Search Tool) that quickly and accurately predicts which bacteria a given virus is likely to infect. PHIST compares the genomic sequences of viruses and bacteria and links each virus to its potential bacterial hosts based on the number of short nucleotide sequences they have in common.
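The idea of ranking hosts by shared short sequences can be illustrated with a minimal Python sketch. This is not PHIST's actual implementation: the function names, the toy sequences, and the sequence length k = 5 are assumptions chosen only for illustration, and the sketch ignores details such as the reverse-complement strand and any statistical scoring.

```python
def kmers(seq, k):
    """Return the set of distinct length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_count(virus, host, k):
    """Count distinct k-mers that occur in both sequences."""
    return len(kmers(virus, k) & kmers(host, k))


def predict_host(virus, hosts, k=5):
    """Rank candidate hosts by shared k-mers and return the top one.

    `hosts` maps a (hypothetical) host name to its genome sequence.
    """
    counts = {name: shared_kmer_count(virus, seq, k)
              for name, seq in hosts.items()}
    best = max(counts, key=counts.get)
    return best, counts


# Toy data, invented for this example only.
virus = "ACGTACGGTCA"
hosts = {
    "host_A": "TTACGTACGGAA",
    "host_B": "GGGGCCCCAAAA",
}

best, counts = predict_host(virus, hosts, k=5)
print(best, counts)
```

With these toy sequences the virus shares four 5-mers with host_A and none with host_B, so host_A is reported as the most likely host. Real genomes would call for a longer k and a much more efficient matching strategy than comparing Python sets.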

The authors tested PHIST on several datasets of known virus-bacteria interactions and showed that it outperforms other tools by 5-20% in prediction accuracy. PHIST is also faster than other tools and has lower hardware requirements, so most analyses can be carried out on a standard workstation or a personal laptop. For instance, the authors used PHIST to analyze a set of 190 thousand metagenomic viral genomes from the human gut and predicted their hosts among nearly 300 thousand bacteria. Although this analysis required approximately 54 billion pairwise comparisons, PHIST completed the task in just three and a half hours rather than the weeks or months it would otherwise take. Most recently, PHIST has been used in IMG/VR, currently the largest database of metagenomic viruses, to predict bacterial hosts for 15 million bacteriophage genomes.

PHIST is freely available at: https://github.com/refresh-bio/phist