GenomeSyn, implemented in Perl, is a bioinformatics tool for visualizing genome synteny and structural variations (SVs).
It provides genome synteny visualization for two or
three genomic sequences. Additionally, genome annotation can be
uploaded to visualize various SVs among different genomes.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
[user@cn3144 ~]$ module load
[+] Loading singularity 4.0.3 on cn3144
[+] Loading GenomeSyn 20240614
[user@cn3144 ~]$ GenomeSyn -h
Usage:
    GenomeSyn [options]

    example:
    a) GenomeSyn -g1 ../data/rice_MH63.fa -g2 ../data/rice_ZS97.fa
    b) GenomeSyn -t 3 -g1 ../data/rice_MH63.fa -g2 ../data/rice_ZS97.fa -cf1 ../data/rice_MH63vsZS97.delta.filter.coords
    c) GenomeSyn -t 3 -g1 ../data/rice_MH63.fa -g2 ../data/rice_ZS97.fa -cf1 ../data/rice_MH63vsZS97.delta.filter.coords -cen1 ../data/rice_MH63_centromere.bed -cen2 ../data/rice_ZS97_centromere.bed -tel1 ../data/rice_MH63_telomere.bed -tel2 ../data/rice_ZS97_telomere.bed -TE1 ../data/rice_MH63_repeat.bed -TE2 ../data/rice_ZS97_repeat.bed -PAV1 ../data/rice_MH63_PAV.bed -PAV2 ../data/rice_ZS97_PAV.bed -NLR1 ../data/rice_MH63_NLR.bed -NLR2 ../data/rice_ZS97_NLR.bed -r MH63 -q ZS97 -GD1 ../data/rice_MH63_nonTEgene.gff3 -GD2 ../data/rice_ZS97_nonTEgene.gff3 -GC1 ../data/rice_MH63_GC_10000.bed -GC2 ../data/rice_ZS97_GC_10000.bed -GC_win 100000 -TE_min 40
    d) GenomeSyn -t 3 -n3 12 -g1 ../data/rice_MH63.fa -g2 ../data/rice_ZS97.fa -g3 ../data/rice_R498.fasta -cf1 ../data/rice_MH63vsZS97.delta.filter.coords -cf2 ../data/rice_MH63vsR498.delta.filter.coords -cen1 ../data/rice_MH63_centromere.bed -cen2 ../data/rice_ZS97_centromere.bed -cen3 ../data/rice_R498_centromere.bed -tel1 ../data/rice_MH63_telomere.bed -tel2 ../data/rice_ZS97_telomere.bed -tel3 ../data/rice_R498_telomere.bed -TE2 ../data/rice_ZS97_repeat.bed -PAV1 ../data/rice_MH63_PAV.bed -PAV2 ../data/rice_ZS97_PAV.bed -NLR1 ../data/rice_MH63_NLR.bed -NLR2 ../data/rice_ZS97_NLR.bed -r MH63 -q1 ZS97 -q2 R498 -GD1 ../data/rice_MH63_nonTEgene.gff3 -GD2 ../data/rice_ZS97_nonTEgene.gff3 -GD3 ../data/rice_R498_IGDBv3_coreset.gff -GC2 ../data/rice_ZS97_GC_10000.bed -GC_win 100000 -TE_min 40

Options:
    -aligntype/-at/-t
        The output mode is divided into four output modes, the parameter value is 1/2/3/4, and the default value is 1; When the value is 1, only the one-to-one double/triple sequence comparison chart will be output; when the value is 2, only the multiple-to-multiple double/triple sequence comparison chart will be output; when the value is 3, it will output simultaneously in 1, 2 mode Two comparison graphs of; when the value is 4, in addition to the first two comparison graphs, some statistical sub-graphs will be generated, such as consistency heat map, consistency histogram, and coverage histogram.
    -genomeSeq1/-g1
        Input the genome1 fasta file to obtain the length of each chromosome in the genome1(ie reference genome).
    -genomeSeq2/-g2
        Input the genome2 fasta file to obtain the length of each chromosome in the genome2(ie query genome).
    -genomeSeq3/-g3
        Input the genome3 fasta file to obtain the length of each chromosome in the genome3(ie query genome2).
    -comparison_file/-comparison_file1/-cf/-cf1
        Input the coordinate file for comparing genome1 and genome2, if there is no coordinate file, call mummer to compare genome1 and genome2 to generate this coordinate file, such as ReferencevsQuery1.delta.filter.coords.
    -comparison_file2/-cf2
        Input the coordinate file for comparing genome1 and genome3, if there is no coordinate file, call mummer to compare genome1 and genome3 to generate this coordinate file, such as ReferencevsQuery2.delta.filter.coords.
    -comparison_file3/-cf3
        Input the coordinate file for comparing genome2 and genome3, if there is no coordinate file, call mummer to compare genome2 and genome3 to generate this coordinate file, such as Query1vsQuery2.delta.filter.coords.
    -SVG_PDF/-pdf
        Format transition, generate the corresponding PDF format file with the SVG format file as the original, value is 1/0, default true(1), that is default output SVG format and PDF format files are output at the same time.
    -sort
        Two sorting modes, "match" and "reference_length_match"; the default is "match". In "match" mode, query chromosomes are matched to the reference chromosome numbering. In "reference_length_match" mode, reference chromosomes are first numbered by length from longest to shortest, and query chromosomes are then numbered by matching.
    -chromosomename/-cn
        Chromosome numeration setting, the value is 1/0, and the default value is false (0); when the value is 0, the unified chromosome numeration(Chromosome numeration for reference genome) will be displayed on the output map, and when the value is 1, the actual chromosome numeration in the comparison file will be displayed on the output map.
    -referencename/-reference/-ref/-r
        Set the name of the genome1, default output is "reference".eg. MH63
    -queryname/-queryname1/-query/-query1/-q/-q1
        Set the name of the genome2, default output is "query"/"query1".eg. ZS97
    -queryname2/-query2/-q2
        Set the name of the genome3, default output is "query2".eg. R498
    -centromere_genome1/-centromere1/-cen1
        Input the centromere position file of genome1, the file uses the bed (Browser Extensible Data) format, and draw centromeres on each chromosome of genome1.
    -centromere_genome2/-centromere2/-cen2
        Input the centromere position file of genome2, the file uses the bed (Browser Extensible Data) format, and draw centromeres on each chromosome of genome2.
    -centromere_genome3/-centromere3/-cen3
        Input the centromere position file of genome3, the file uses the bed (Browser Extensible Data) format, and draw centromeres on each chromosome of genome3.
    -telomere_genome1/-telomere1/-tel1
        Input the telomere position file of genome1, the file uses the bed format, and draw telomere on each chromosome of genome1.
    -telomere_genome2/-telomere2/-tel2
        Input the telomere position file of genome2, the file uses the bed format, and draw telomere on each chromosome of genome2.
    -telomere_genome3/-telomere3/-tel3
        Input the telomere position file of genome3, the file uses the bed format, and draw telomere on each chromosome of genome3.
    -snp_genome1/-snp1
        Input the SNP file of genome1, which uses the bed format to map the SNP distribution of genome1.
    -snp_genome2/-snp2
        Input the SNP file of genome2, which uses the bed format to map the SNP distribution of genome2.
    -snp_genome3/-snp3
        Input the SNP file of genome3, which uses the bed format to map the SNP distribution of genome3.
    -snp_thresholds/-snp_max
        SNP threshold setting, that is, setting the upper limit of SNP statistics, the default value is 2000.
    -TE_genome1/-TE1
        Input the TE file of genome1, which uses the bed format to map the TE distribution of genome1.
    -TE_genome2/-TE2
        Input the TE file of genome2, which uses the bed format to map the TE distribution of genome2.
    -TE_genome3/-TE3
        Input the TE file of genome3, which uses the bed format to map the TE distribution of genome3.
    -TE_thresholds/-TE_min
        Set the TE threshold that set the lower limit of TE statistics, default the integer value of the smallest TE proportion in the TE file used is the lower limit, for example, the minimum TE is 11%, the icon in the lower right corner shows a scale of 10%-100%, the minimum TE is 28%, and the icon in the lower right corner shows a scale of 20%-100%; if the user inputs the lower limit of TE, it will be output according to the lower limit of TE input by the user, and the value is 0-100. For example: input "-TE_min 50", then a 50%-100% TE statistical graph will be drawn, TE has two display forms, but only when TE is displayed in a histogram, the lower limit of TE can be adjusted.
    -GC_genome1/-GC_content1/-GC1
        Input the bed format file of the GC content of genome1 to plot the distribution of the GC content of genome1.
    -GC_genome2/-GC_content2/-GC2
        Input the bed format file of the GC content of genome2 to plot the distribution of the GC content of genome2.
    -GC_genome3/-GC_content3/-GC3
        Input the bed format file of the GC content of genome2 to plot the distribution of the GC content of genome3.
    -PAV_genome1/-PAV1
        Input the PAV file of genome1, which uses the bed format to map the PAV distribution of genome1.
    -PAV_genome2/-PAV2
        Input the PAV file of genome2, which uses the bed format to map the PAV distribution of genome2.
    -PAV_genome3/-PAV3
        Input the PAV file of genome3, which uses the bed format to map the PAV distribution of genome3.
    -NLR_genome1/-NLR1
        Input the NLR file of genome1, which uses the bed format to map the NLR distribution of genome1.
    -NLR_genome2/-NLR2
        Input the NLR file of genome2, which uses the bed format to map the NLR distribution of genome2.
    -NLR_genome3/-NLR3
        Input the NLR file of genome3, which uses the bed format to map the NLR distribution of genome3.
    -gene_density_genome1/-GD1
        Input the annotation file of genome1, which uses the gff3 format to map the gene density distribution of genome1.
    -gene_density_genome2/-GD2
        Input the annotation file of genome2, which uses the gff3 format to map the gene density distribution of genome2.
    -gene_density_genome3/-GD3
        Input the annotation file of genome3, which uses the gff3 format to map the gene density distribution of genome3.
    -GeneDensity_Window/-GD_win
        Set the window size for statistical gene density, this parameter is a required parameter when the gene density is counted in the annotation file of the input gene, the value can be set to 100000.
    -SNP_Window/-SNP_win
        Set the window size for statistical SNPs, this parameter is optional, its value is determined by default according to the window size in the bed file of the input SNP of genome1.
    -TE_Window/-TE_win
        Set the window size for statistical TEs, this parameter is optional, its value is determined by default according to the window size in the bed file of the input TEs of genome1.
    -GC_Content_Window/-GC_win
        Set the window size for statistical GC content, this parameter is optional, its value is determined by default according to the window size in the bed file of the input GC content of genome1.
    -synteny_length_min/-synteny_min/-syn_min
        Set the minimum length for drawing synteny fragments, the default value is 10000.
    -inversion_length_min/-inversion_min/-inv_min
        Set the minimum length for drawing inversion fragments, the default value is 10000.
    -PAV_length_min/-PAV_min
        Set the minimum length for drawing PAV, the default value is 10000.
    -NLR_length_min/-NLR_min
        Set the minimum length for drawing NLR, the default value is 10000.
    -coverage_rate_min/-coverage_min/-cov_min
        Set the minimum coverage (%) for drawing synteny fragments, the default value is 90.
    -icon
        Whether to output the main image icon, value is 1/0, default true(1).
    -proportion1/-p1
        Set the chromosome window size of the one-to-one double/triple sequence alignment chart, the default value is 25000.
    -proportion2/-p2
        Set the chromosome window size of the multiple-to-multiple double/triple sequence alignment chart, the default value is four times the value of -proportion1/-p1, that is, the default is 100000.
    -targetgene_genome1/-targetgene1/-gene1
        Input the target gene file of genome 1, the file uses the bed format, the target gene can be any gene that the user studies.
    -targetgene_genome2/-targetgene2/-gene2
        Input the target gene file of genome 2, the file uses the bed format, the target gene can be any gene that the user studies.
    -targetgene_genome3/-targetgene3/-gene3
        Input the target gene file of genome 3, the file uses the bed format, the target gene can be any gene that the user studies.
    -targetgene_name/-targetgene
        Set the name of the target gene, default output as "Target Gene".
    -genomenumber/-gn/-n
        Comparison mode, double/triple sequence comparison, this parameter is optional, the parameter value can be set to 2/3,the value is determined by the number of input genomes by default, that is, when two genomes are input, the value is 2, and when three genomes are input, the value is 3.
    -chromosomenumber1/-n1
        Set the number of chromosomes in genome1,this parameter is optional, and its value is determined by the number of chromosomes in the fasta file of the input genome1 by default, or it can be set by users.eg. 12
    -chromosomenumber2/-n2
        Set the number of chromosomes in genome2,this parameter is optional, and its value is determined by the number of chromosomes in the fasta file of the input genome2 by default, or it can be set by users.eg. 12
    -chromosomenumber3/-n3
        Set the number of chromosomes in genome3,this parameter is optional, and its value is determined by the number of chromosomes in the fasta file of the input genome3 by default, or it can be set by users.eg. 12
    -output1/-o1
        Set the name of output SVG format file1, default "GenomeSyn-main-1.svg".
    -output2/-o2
        Set the name of output SVG format file2, default "GenomeSyn-main-2.svg".
    -output3/-o3
        Set the name of output SVG format file3, default "GenomeSyn heatmap.svg".
    -output4/-o4
        Set the name of output SVG format file4, default "GenomeSyn identity.svg".
    -output5/-o5
        Set the name of output SVG format file5, default "GenomeSyn coverage.svg".
    -output6/-o6
        Set the name of output SVG format file6, default "GenomeSyn heatmap2.svg".
    -headline_identity/-headline1
        Set the title of illustration1, default output is "GenomeSyn identity".
    -headline_coverage/-headline2
        Set the title of illustration2, default output is "GenomeSyn coverage".
    -headline_heatmap/-headline3
        Set the title of illustration3, default output is "GenomeSyn heatmap".
    -genome1_color/-color1/-c1
        Set the drawing color of the chromosome in genome1,default color is LightBlue (#3979BC), recommended to input in hexadecimal color code or RGB code, eg. "#3979BC"/"rgb(57,121,188)".
    -genome2_color/-color2/-c2
        Set the drawing color of the chromosome in genome2,default color is Green(#499272), recommended to input in hexadecimal color code or RGB code, eg. "#499272"/"rgb(73,146,114)".
    -genome3_color/-color3/-c3
        Set the drawing color of the chromosome in genome3, default color is DarkBlue(#447784), recommended to input in hexadecimal color code or RGB code, eg. "#447784"/"rgb(68,119,132)".
    -synteny_color/-color4/-c4
        Set the drawing color of the synteny blocks, default color is LightGray(#DFDFE1), recommended to input in hexadecimal color code or RGB code, eg. "#DFDFE1"/"rgb(223,223,225)".
    -inversion_color/-color5/-c5
        Set the drawing color of the inversion blocks, default color is DarkOrange(#E56C1A), recommended to input in hexadecimal color code or RGB code, eg. "#E56C1A"/"rgb(229,108,26)".
    -translocation_color/-color6/-c6
        Set the drawing color of the translocation blocks, default color is Saffron(#EFCF48), recommended to input in hexadecimal color code or RGB code, eg. "#EFCF48"/"rgb(239,207,72)".
    -centromere_color/-color7/-c7
        Set the drawing color of the centromere blocks, default color is Orange(#E4993F), recommended to input in hexadecimal color code or RGB code, eg. "#E4993F"/"rgb(228,153,63)".
    -telomere_color/-color8/-c8
        Set the drawing color of the telomere blocks, default color is Purple(#441680), recommended to input in hexadecimal color code or RGB code, eg. "#441680"/"rgb(68,22,128)".
    -PAV_color/-color9/-c9
        Set the drawing color of PAVs, default color is LightYellow(#F9F067), recommended to input in hexadecimal color code or RGB code, eg. "#F9F067"/"rgb(249,240,103)".
    -NLR_color/-color10/-c10
        Set the drawing color of the NLRs, default color is Cyan(#00FFFF), recommended to input in hexadecimal color code or RGB code, eg. "#00FFFF"/"rgb(0,255,255)".
    -SNP_color/-color11/-c11
        Set the drawing color of the SNPs, default color is DoderBlue(#1E90FF), recommended to input in hexadecimal color code or RGB code, eg. "#1E90FF"/"rgb(30,144,255)".
    -TE_color/-color12/-c12
        Set the drawing color of the TEs, default color is DoderBlue(#1E90FF), recommended to input in hexadecimal color code or RGB code, eg. "#1E90FF"/"rgb(30,144,255)". TE has two forms of display, when it is displayed as a histogram only, the drawing color of TE can be adjusted.
    -genedensity_color/-color13/-c13
        Set the drawing color of the gene density, default color is DarkGreen(#368F5C), recommended to input in hexadecimal color code or RGB code, eg. "#368F5C"/"rgb(54,143,92)".
    -targetgene_color/-color14/-c14
        Set the drawing color of the target gene, default color is Crimson(#DC143C), recommended to input in hexadecimal color code or RGB code, eg. "#DC143C"/"rgb(220,20,60)".
    -curveto/-curve
        Draw synteny blocks with curve or straight line, value is 1/0, default true(1), that is default output as a curve.
    -highlightinversion/-highlight
        Highlight inversion, value is 1/0, default true(1), that is default the inverted information is highlighted.
    -help/-h/?
        Print a brief help message and exits.
    -man
        Prints the manual page and exits.

[user@cn3144 ~]$ Transform -h
Usage:
    Transform [options]

    example:
    a) Transform --PAF example.PAF
    b) Transform --GFF3 example.gff3
    c) Transform -1 rice_MH63.fa -2 rice_R498.fasta --SNP MH63vsR498.delta.filter.snps
    d) Transform -1 rice_MH63.fa -2 rice_R498.fasta --SNP MH63vsR498.delta.filter.snps -r
    e) Transform -1 rice_MH63.fa -2 rice_R498.fasta --SNP MH63vsR498.delta.filter.snps -rq
    f) Transform -1 rice_MH63.fa -2 rice_R498.fasta --SNP MH63vsR498.delta.filter.snps --noquery -r
    g) Transform --PAV MH63vsR498.delta.filter.qdiff -o MH63vsR498

Options:
    --PAFtoCOORDS/--PAF
        Enter a .PAF format file to generate a .coords format file.
    --PAV
        Enter a .qdiff format file to generate a .bed format file.
    --GFF3toBED/--GFFtoBED/--GFF3/--GFF
        Enter a .gff3 format file to generate a .bed format file.
    ...
    --SNP
        Enter a .snps format file to generate a .bed format file.
    --genomeSeq1/-1
        Input the genome1 fasta file to obtain the length of each chromosome in the genome1(ie reference genome).
    --genomeSeq2/-2
        Input the genome2 fasta file to obtain the length of each chromosome in the genome2(ie query genome).
    --query/-q/--noquery
        ...
    --help/-h
        Print a brief help message and exits.
    --man
        Prints the manual page and exits.
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
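
The long example commands in the help text are easy to mistype, so it can help to assemble and print a command before running it. The following is a minimal sketch, not part of GenomeSyn: build_genomesyn_cmd is a hypothetical helper, and the file paths are the placeholders from example b) in the help output. It uses only the -t, -g1, -g2, and -cf1 options documented above, and it adds -cf1 only when the coordinate file exists, since the help text states that GenomeSyn otherwise calls mummer to generate one.

```shell
#!/bin/sh
# Sketch: print a two-genome GenomeSyn command without running it.
# build_genomesyn_cmd is a hypothetical helper, not part of GenomeSyn.
# Usage: build_genomesyn_cmd REF.fa QUERY.fa [COORDS]
build_genomesyn_cmd() {
    ref=$1
    qry=$2
    coords=${3:-}
    # -t 3 requests both the one-to-one and the multiple-to-multiple charts.
    set -- GenomeSyn -t 3 -g1 "$ref" -g2 "$qry"
    # Add -cf1 only when a coordinate file is given and exists; per the help
    # text, GenomeSyn otherwise calls mummer to generate the file.
    if [ -n "$coords" ] && [ -f "$coords" ]; then
        set -- "$@" -cf1 "$coords"
    fi
    # Print for review only; paths containing spaces are not re-quoted here.
    echo "$*"
}

# Placeholder paths taken from example b) in the help output.
build_genomesyn_cmd ../data/rice_MH63.fa ../data/rice_ZS97.fa \
    ../data/rice_MH63vsZS97.delta.filter.coords
```

After checking the printed line, paste it at the prompt of the interactive session once the module is loaded.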