bwa-meth is a tool to align bisulfitetreated sequences and compare it to existing aligners. It is it is fast and accurate even without quality-trimming and the output is immediately usable by downstream tools.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive [user@cn4471 ~]$module load bwa-meth [+] Loading singularity 4.1.5 on cn1091 [+] Loading bwa-meth 0.2.7Copy sample data to your current folder:
[user@cn4471 ~]$ cp $BM_DATA/* . Run the bwameth.py script:[user@cn4471 ~]$ bwameth.py index ref.fa ... indexing with bwa-mem: ref.fa.bwameth.c2t ... [bwa_index] Pack FASTA... 0.01 sec [bwa_index] Construct BWT for the packed sequence... [BWTIncCreate] textLength=3600000, availableWord=4573648 [bwt_gen] Finished constructing BWT in 5 iterations. [bwa_index] 0.39 seconds elapse. [bwa_index] Update BWT... 0.01 sec [bwa_index] Pack forward-only FASTA... 0.01 sec [bwa_index] Construct SA from BWT and Occ... 0.09 sec [main] Version: 0.7.17-r1188 [main] CMD: bwa index -a bwtsw ref.fa.bwameth.c2t [main] Real time: 0.624 sec; CPU: 0.511 sec [user@cn4471 ~]$ bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz Found BWA MEM index running: /opt/conda/envs/bwa-meth/bin/python /opt/conda/envs/bwa-meth/lib/python3.10/site-packages/bwameth-0.2.7-py3.10.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz |bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG\tID:t_R\tSM:t_R' -t 6 ref.fa.bwameth.c2t /dev/stdin -------------------- /bin/bash: /opt/conda/envs/bwa-meth/lib/libtinfo.so.6: no version information available (required by /bin/bash) [M::bwa_idx_load_from_disk] read 0 ALT contigs converting reads in t_R1.fastq.gz,t_R2.fastq.gz [M::process] read 92726 sequences (9365326 bp)... [M::process] 0 single-end sequences; 92726 paired-end sequences [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44943, 0, 0) [M::mem_pestat] skip orientation FF as there are not enough pairs [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat] (25, 50, 75) percentile: (109, 137, 175) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307) [M::mem_pestat] mean and std.dev: (145.05, 48.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 373) [M::mem_pestat] skip orientation RF as there are not enough pairs [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_process_seqs] Processed 92726 reads in 7.025 CPU sec, 1.178 real sec @SQ SN:chrREF LN:900000 @RG ID:t_R SM:t_R @PG ID:bwa-meth PN:bwa-meth VN:0.2.7 CL:"/opt/conda/envs/bwa-meth/bin/bwameth.py --reference ref.fa t_R1.fastq.g z t_R2.fastq.gz" HISEQ:105:C2UE1ACXX:3:1101:11160:2245 83 chrREF 87709 60 101M = 87498 -312 TAACTTTAAACTCAAAAATTCATCTAC CTCTACCTCCTAAATACTAAAATAAAAAATATATACTACCACTATCAAACAAATCTAATATATTAACTACTTTT FB<7FFFFBB7B7FFFFFFB7F<77FFFFBBFBBB7BIIIIIFBIIIIIII IIIIIIIIFFFFFFFFBBIIFBFFIBIIIIBIIIIIIFFFFFBFFFFBBB NM:i:2 MD:Z:22C73A4 MC:Z:101M AS:i:95 XS:i:31 RG:Z:t_R YC: Z:CT YD:Z:r HISEQ:105:C2UE1ACXX:3:1101:11160:2245 163 chrREF 87498 60 101M = 87709 312 CATAAAAACCAAAACTAACTAAACCCC AAATAAAAAACAACCTAACCTCTAACAAAAACAACAACAACTAACACCTCAAAATCAACTCTAAATAAAAACTA BBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII FIIIIIIFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:101 MC:Z:101M AS:i:101 XS:i:19 RG:Z:t_R YC:Z:GA YD:Z:r HISEQ:105:C2UE1ACXX:3:1101:19338:2197 99 chrREF 32493 60 101M = 32550 158 TTTTTTTTTAGAGGGATTAGTTTTTTT TATTGAGGTTTTTGAAAGTTGTTGTATGTTAATTGTTTTTAGAATGTTGGGTATAAGTAGGATTTAGGTCTATT BBBFFFFFFB0<BF7BBBF7BFFIIIII7BF'0<0BBFFF'<BB'<B7<B0 7<B7<BFBBF0<BBBBB0<<B0BB<<000<BF00<<'0<BBB0'00BBF# NM:i:2 MD:Z:72A27G0 MC:Z:101M AS:i:97 XS:i:20 RG:Z:t_R YC: Z:CT YD:Z:f HISEQ:105:C2UE1ACXX:3:1101:19338:2197 147 chrREF 32550 60 101M = 32493 -158 AATTGTTTTTAGAATGTTGGGTATAAG TAGGATTTAGGTCTATGTTGTTTGTTTTTTGTTAGTATTATAGTTATAGTTGTTAGGTAGGTAATAGAAATTAG BFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFIIIFFB<0IIIIIIIIIII IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFBBB NM:i:2 MD:Z:15A80T4 MC:Z:101M AS:i:95 XS:i:21 RG:Z:t_R YC: Z:GA YD:Z:f HISEQ:105:C2UE1ACXX:3:1101:19467:2281 83 chrREF 712646 60 101M = 712638 -109 AAATCCCTACTCCTATAACCTCTCACT ACACCCAAAACTCCATTCTTTTCCCCCTTTACAAAAATCACTAAAATCCAAACTATACATCTCACCCTAACCAA BB700007<7<77FBB7<7<B<<<BBBBBB777FFBFBB77FFFBFFFBB7 FBBIIFFBIIIIFFBBBIIFFFB<<IFFBIFIFFFFFFFFFBBFFF<<B< NM:i:0 MD:Z:101 MC:Z:101M AS:i:101 XS:i:0 RG:Z:t_R YC:Z:CT YD:Z:r HISEQ:105:C2UE1ACXX:3:1101:19467:2281 163 chrREF 712638 60 101M = 712646 109 ACACAACAAAATCCCTACTCCTATAAC CTCTCACTACACCCAAAACTCCATTCTTTTCCCCCTTTACAAAAATCACTAAAATCCAAACTATACATCTCACC BBBFFFFFFFFFFIIIIIIIIIIIIFIIIIIIIIIIIIIIIIIIIIIIIII FIIIIIIIIIIFFIIIIFFFFBBFFFFFFFBBFFFFFBBBFFFBFFFFBF NM:i:0 MD:Z:101 MC:Z:101M AS:i:101 XS:i:0 RG:Z:t_R Y ... [user@cn4471 ~]$End the interactive session:[user@cn4471 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$