The pre-processed shotgun and capture data were analyzed using the MEGAN Alignment Tool (MALT) v.0.3.8 [25]. The shotgun (non-UDG) data previously generated for the three published Peruvian MTBC genomes [12] were also processed with MALT. All data were aligned against a database comprising all complete genomes in the NCBI Nucleotide (nt) database (downloaded 7 Dec. 2016), built using malt-build (v.0.3.8). The purpose was to assess the amount of non-MTBC DNA in our shotgun and capture data, particularly non-MTBC mycobacterial DNA. Two MALT runs were performed. The first used a minimum percent identity (--minPercentIdentity) of 85, a more permissive and therefore more sensitive alignment criterion. The second used --minPercentIdentity 95, allowing fewer mismatches between the reads and the database sequences. Both runs used BlastN mode and SemiGlobal alignment. All other parameters were left at their defaults, except that the minimum support (--minSupport) and the top percent value (--topPercent) were each set to 1. MEGAN6 v.6.12.3 [90] was used to view the MALT results. Taxon tables of the MALT results for the shotgun and capture data are shown in Supplementary Data 3, 4, 5, 6, 7 and 8. The captured negative controls were analyzed with MALT in the same way using --minPercentIdentity 95 (Supplementary Data 8). Before the MALT analysis, identical reads were removed from all data (samples and negative controls) using the rmdup function in seqkit v.0.11.0 [91]. This step was particularly relevant for the negative controls because of the high duplication rates observed after mapping (Supplementary Data 2).
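
The two MALT runs described above differ only in the percent identity threshold, so they can be expressed as one parameterized command line. The following is a minimal sketch, not the command actually used in this study: the file and directory names (`sample.dedup.fastq.gz`, `malt_index`, `malt_id85`, `malt_id95`) are hypothetical, and the long option spellings should be checked against `malt-run --help` for the installed MALT version. The script only assembles and prints the commands; it does not execute them.

```python
from typing import List


def malt_run_args(in_file: str, index_dir: str, out_dir: str,
                  min_percent_identity: int) -> List[str]:
    """Assemble a malt-run argument list with the settings named in the text.

    Input, index and output paths are placeholders supplied by the caller.
    """
    return [
        "malt-run",
        "--mode", "BlastN",                # nucleotide-vs-nucleotide alignment
        "--alignmentType", "SemiGlobal",   # semi-global alignment of reads
        "--minPercentIdentity", str(min_percent_identity),
        "--minSupport", "1",               # non-default value stated in the text
        "--topPercent", "1",               # non-default value stated in the text
        "--inFile", in_file,
        "--index", index_dir,
        "--output", out_dir,
    ]


# One command per identity threshold: 85 (more sensitive) and 95 (stricter).
runs = [
    malt_run_args("sample.dedup.fastq.gz", "malt_index", f"malt_id{pid}", pid)
    for pid in (85, 95)
]

for args in runs:
    print(" ".join(args))
```

Keeping the threshold as the single varying argument makes it explicit that any difference between the two taxon tables comes from the identity cutoff alone.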

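The removal of identical reads can be illustrated in the same spirit. The sketch below is a plain-Python stand-in for the idea, not a reimplementation of seqkit: it assumes that "identical" means identical sequence (comparable to by-sequence comparison, the `-s` option of `seqkit rmdup`, which the text does not state explicitly), that the first occurrence is the one kept, and that the input is well-formed four-line FASTQ. The read names and sequences are invented for the example.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        yield header[1:], seq, qual


def drop_identical_reads(records: Iterable[Record]) -> Iterator[Record]:
    """Keep the first read for each distinct sequence (assumed criterion)."""
    seen = set()
    for name, seq, qual in records:
        if seq in seen:
            continue
        seen.add(seq)
        yield name, seq, qual


# Invented example: r2 duplicates the sequence of r1 and is dropped.
demo_lines = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "ACGTACGT", "+", "IIIIIIII",
    "@r3", "TTGACCAA", "+", "IIIIIIII",
]

kept = list(drop_identical_reads(parse_fastq(demo_lines)))
print([name for name, _, _ in kept])  # ['r1', 'r3']
```

Deduplicating before taxonomic assignment prevents highly duplicated libraries, such as the negative controls, from inflating read counts for any one taxon.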
