Track concatenator j

3/17/2023

Human-associated microbial communities include complex mixtures of bacterial species. Many of these species are renowned for their genomic and phenotypic plasticity. For example, strains of Escherichia coli share a core genome representing only about half of their genes; some cause distinct diseases, including diarrhea and urinary tract infections, or potentiate tumorigenesis, while others co-exist with their host without causing overt illness. Multiple distinct strains of the same species, often from genetically dissimilar phylogroups, frequently coexist within a single human gut community. The implications of this coexistence remain mostly underexplored because strain-level variation is difficult to study in complex community samples. Culture-based approaches have been a workhorse for dissecting strain-level diversity, but they can be slow and unfaithful to the true representation of strains, owing to culturing bottlenecks that limit observed diversity and to the potential for evolution during culture. Whole metagenome shotgun sequencing approaches offer less perturbed views of strain-level diversity, but require specialized computational tools. However, most current strain-level metagenomic data analytical tools (reviewed in Anyansi et al.) were not designed to work at the low coverages typically found for many clinically relevant organisms in metagenomic samples, such as E. coli.

Existing tools that aim to disentangle within-species strain mixtures include BIB, StrainEst, and DiTASiC, as well as broader taxonomic profiling tools like Kraken2 and GOTTCHA when given an appropriate database. These tools rely upon a precomputed database of reference genomes, from which the best matches are reported for a sample (or set of samples). Thus, output from these tools is dependent upon database granularity and does not distinguish between distinct strains matching the same reference. Another class of tools characterizes and tracks strains based on single nucleotide variant (SNV) profiles along a single reference or a set of marker genes, including MIDAS, StrainPhlAn, and ConStrains. In the case of strain mixtures, MIDAS and StrainPhlAn do not untangle the SNVs coming from different strains, while ConStrains attempts to link SNVs with similar allele frequencies, though linking SNVs requires high strain coverage to be accurate. A third class of tools aims to recover strain-level variation after de novo metagenomic assembly, including DESMAN, inStrain, and STRONG. Assembly approaches require higher sequence coverage than is typically achieved for lower-abundance members of a community. To our knowledge, none of these computational approaches work robustly at low coverages (< 10x), accurately disentangle mixtures of same-species strains, and distinguish similar strains at the nucleotide level.

To disentangle mixtures of low-abundance, clinically important strains within metagenomic data, we developed the Strain Genome Explorer (StrainGE) toolkit. In an advance over related tools, StrainGE works at exceptionally low sequence coverages (from 0.1x) to identify strains in a sample, and allows the user to characterize and compare strains across samples at the nucleotide level, with high resolution.
