Description
I have encountered two problems when using the relabeltree.py --taxonomy option to assign taxonomy to MARdb sequences:
- Some genomes from the initial MARdb files that José sent are missing the MARdb ID, so those sequences cannot be classified by their ID. To solve this, we need to:
a) First, figure out whether all teams are working with the same files (Complete and Partial genomes, QC'd by José).
b) Check with José whether the MAR IDs were removed during the QC he did. I will do that once we have figured out the first part.
- For the sequences that do have MAR IDs, the script is assigning the NCBI taxonomy and not the GTDB taxonomy. For example:
One of the genera in my tree is classified according to NCBI (Beta), but in GTDB it is Gamma.
This happens with several Betas that have been reclassified as Gammas, for example in the tree.
The label color is the classification shown on the GTDB website, and the square on the right is the taxonomy applied by the script. Green: Zproteobacteria, Pink: Gamma, Orange: Beta
I have another example with a Streptomyces that has the NCBI taxonomy in the tree label and not the GTDB taxonomy.
Are we using an older release?
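
For the first problem, a quick check along these lines could help each team list the sequences without a MAR ID and compare ID sets between files. This is only a minimal sketch and not part of relabeltree.py: the function names are hypothetical, and it assumes that MAR IDs appear in the FASTA headers as `MMP` followed by digits, which should be confirmed against the actual files.

```python
import re

# Assumption: a MAR ID is "MMP" followed by digits somewhere in the header.
# Adjust the pattern if the real files use a different format.
MAR_ID = re.compile(r"MMP\d+")


def headers_without_mar_id(fasta_lines):
    """Return the FASTA headers (without '>') that contain no MAR ID."""
    missing = []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if not MAR_ID.search(header):
                missing.append(header)
    return missing


def mar_ids(fasta_lines):
    """Return the set of MAR IDs found in the FASTA headers."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            match = MAR_ID.search(line)
            if match:
                ids.add(match.group(0))
    return ids


# Made-up example records, for illustration only.
example = [
    ">MMP00000031_contig1 some description",
    "ACGT",
    ">contig_42 no id here",
    "GGCC",
]
print(headers_without_mar_id(example))  # ['contig_42 no id here']
```

To compare two teams' files, the symmetric difference `mar_ids(lines_a) ^ mar_ids(lines_b)` would give the IDs present in only one of them, where `lines_a` and `lines_b` are the lines read from each file.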

