High-Performance Computing at the NIH
NCBI Toolkit

The NCBI C++ Toolkit is a set of executables and libraries for a wide range of sequence analysis tasks.

These executables have been compiled and made available on the Helix Systems.

How to use

The easiest way is to load the ncbi-toolkit environment module:

$ module load ncbi-toolkit

To load the module automatically at every login, place the module load statement in your shell startup files.
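
A minimal sketch of such a startup-file entry, assuming a bash login shell with ~/.bashrc as the startup file. The file name, the function name load_ncbi_toolkit, and the guard are assumptions for illustration; this page does not specify them. The guard keeps the shell quiet on hosts where the module command is not defined.

```shell
# Hypothetical ~/.bashrc fragment: load ncbi-toolkit only where the
# `module` command is available.
load_ncbi_toolkit() {
    # `module` is normally a shell function set up by the modules system.
    if command -v module >/dev/null 2>&1; then
        module load ncbi-toolkit || echo "ncbi-toolkit module could not be loaded" >&2
    fi
}
load_ncbi_toolkit
```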

Then call the executable of your choice on the command line:

$ gi2taxid -gi 36209385

List of current executables

Judy1TablesGen                  lds_indexer                     test_id_mux
JudyLTablesGen                  lds_sample                      test_image
SRR574828-crash-test            lds_test                        test_interprocess_lock
abi-dump                        lds_unit_test                   test_lds
abi-load                        legacy_blast.pl                 test_limited_map
ace2asn                         localfinder                     test_line_reader
agp_count                       logs_splitter                   test_logrotate
agp_renumber                    makeblastdb                     test_math
agp_val_test                    makembindex                     test_multipart.cgi
agpconvert                      makeprofiledb                   test_nc_stress
align-info                      mapper_unit_test                test_nc_stress_pubmed
align_filter_unit_test          md5appendtest                   test_ncbi_buffer
align_format_unit_test          md5cp                           test_ncbi_clog_mt_ctx
aln_build                       multi_command                   test_ncbi_config
alnmgr_sample                   multireader                     test_ncbi_conn
alnmrg                          mysql_lang                      test_ncbi_conn_stream
alnvwr                          ncbi_applog                     test_ncbi_conn_stream_mt
annotwriter                     ncfetch.cgi                     test_ncbi_connutil_hit
args-test                       nenctest                        test_ncbi_connutil_misc
asn2asn                         nenctool                        test_ncbi_core
asn2fasta                       nencvalid                       test_ncbi_disp
asn2flat                        netcache_cgi_sample.cgi         test_ncbi_download
asn_assign                      netcache_client_sample          test_ncbi_dsock
asn_sample                      netcache_control                test_ncbi_file_connector
asniotest                       netschedule_client_sample       test_ncbi_ftp_connector
asnwalk_read                    netschedule_control             test_ncbi_ftp_download
asnwalk_type                    netschedule_node_sample         test_ncbi_heapmgr
asnwalk_write                   ngalign_test                    test_ncbi_hmac
autodef_demo                    nmer_repeats                    test_ncbi_http_connector
bam-load                        ns_remote_job_control           test_ncbi_http_get
bam-load.3                      ns_submit_remote_job            test_ncbi_limits
bam2graph                       nw_aligner                      test_ncbi_memory_connector
bam_test                        objects_sample                  test_ncbi_namedpipe
bamgraph_test                   objextract                      test_ncbi_namedpipe_connector
basic_sample                    objmgr_sample                   test_ncbi_null
basic_sample_lib_test           omssa2pepXML                    test_ncbi_os_unix
bdbloader_unit_test             omssacl                         test_ncbi_pipe
biosample_chk                   omssamerge                      test_ncbi_pipe_connector
bioseq_edit_sample              pacc                            test_ncbi_process
blast_dataloader_unit_test      phytree_calc_unit_test          test_ncbi_rate_monitor
blast_demo                      phytree_format_unit_test        test_ncbi_rwstream
blast_format_unit_test          pmem-test                       test_ncbi_sendmail
blast_formatter                 prefetch                        test_ncbi_service
blast_sample                    printf-test                     test_ncbi_service_connector
blast_services_unit_test        project_tree_builder            test_ncbi_socket
blast_unit_test                 psiblast                        test_ncbi_socket_connector
blastdb_aliastool               python_ncbi_dbapi_test          test_ncbi_system
blastdb_format_unit_test        qfiletest                       test_ncbi_table
blastdbcheck                    qual-recalib-stat               test_ncbi_tree
blastdbcmd                      rcexplain                       test_ncbiargs
blastdbcp                       re-compress                     test_ncbiargs_sample
blastinput_demo                 read-filter-redact              test_ncbicfg
blastinput_unit_test            readresult                      test_ncbidiag_f_mt
blastn                          refseq-load                     test_ncbidiag_mt
blastp                          regexplocdemo                   test_ncbidiag_p
blastx                          remote_app_client_sample        test_ncbidll
blobreader                      remote_blast_demo               test_ncbiexec
blobrwd                         rmblastn                        test_ncbiexpt
blobrws                         rowwritetest                    test_ncbifile
blobwriter                      rpsblast                        test_ncbimime
bm_sparse_sample                rpstblastn                      test_ncbireg_mt
bma_refiner                     sam-dump3                       test_ncbistr
bss_info                        schema-replace                  test_ncbitime
cache_demo                      score_builder_unit_test         test_ncbitime_mt
ccextract                       sdbapi_advanced_features        test_ncbiutil
cddalignview                    sdbapi_simple                   test_netcache_api
cg-load                         sdbapi_unit_test                test_netschedule_client
cgi2rcgi                        seedtop                         test_netschedule_crash
cgi_io_test                     segmasker                       test_netschedule_node
cgi_redirect                    seq_id_unit_test                test_netschedule_stress
cgi_sample.cgi                  seqalign_unit_test              test_nsstorage
cgi_session_sample.cgi          seqannot_splicer                test_objmgr
cgi_tunnel2grid.cgi             seqdb_demo                      test_objmgr_basic
cgitest                         seqdb_perf                      test_objmgr_data
clusterer                       seqdb_unit_test                 test_objmgr_gbloader
cobalt                          seqmasks_io_unit_test           test_objmgr_gbloader_mt
cobalt_unit_test                seqvec_bench                    test_objmgr_mem
compart                         sff-dump                        test_objmgr_mt
compartp                        sff-load                        test_objmgr_sv
conv_image                      soap_client_sample              test_objstore
convert2blastmask               soap_server_sample              test_param_mt
convert_seq                     socket_io_bouncer               test_plugins
copycat                         sortreadtest                    test_porter_stemming
coretest                        speedtest                       test_printf
cpgdemo                         split_cache                     test_queue_mt
csra_test_mt                    split_loader_demo               test_range_coll
ctl_lang_ftds64                 sra-dbcc                        test_rangemap
ctl_sp_databases_ftds64         sra-dflt-schema                 test_reader_gicache
ctl_sp_who_ftds64               sra-dump                        test_reader_id1
datatool                        sra-kar                         test_regexp
db_copy                         sra-pileup                      test_relloc
dbapi_advanced_features         sra-sort                        test_request_control
dbapi_bcp                       sra-stat                        test_resize_iter
dbapi_cache_admin               sra_test                        test_resource_info
dbapi_cache_test                srapath                         test_scheduler
dbapi_conn_policy               srf-load                        test_scoremat
dbapi_context_test              srsearch                        test_semaphore_mt
dbapi_cursor                    streamtest                      test_seq_entry_ci
dbapi_driver_check              struct_dp_demo                  test_seqio
dbapi_query                     struct_util_demo                test_seqmap_switch
dbapi_send_data                 sub_image                       test_seqport
dbapi_simple                    subcheck                        test_seqvector_ci
dbapi_testspeed                 tblastn                         test_serial
dbapi_unit_test                 tblastx                         test_source_mod_parser
deltablast                      test-aes-ciphers                test_sra_loader
demo_contig_assembly            test-align                      test_stacktrace
demo_gene_model                 test-block-cross-error          test_staticmap
demo_genomic_compart            test-bzip-concat                test_strdbl
demo_html                       test-cipher-speed               test_strsearch
demo_html_template              test-encapptrunc                test_sub_reg
demo_ncbi_clog                  test-encv2                      test_tar
demo_score_builder              test-error                      test_tempstr
demo_seqtest                    test-fastq-loader               test_title
disc_report                     test-float                      test_tls_object
double-VCursorCommit-test       test-headfile                   test_transmissionrw
dump-blob-boundaries            test-kdb                        test_uoconv
dustmasker                      test-kfg                        test_user_agent
ecnum_unit_test                 test-kfs                        test_utf8
entrez2client                   test-kfsmanager                 test_uttp
eutils_sample                   test-klib                       test_validator
example_value_convert           test-kpath-read-path            test_value_convert
fasthello.fcgi                  test-ktst                       test_vdbgraph_loader
fastq-dump                      test-modes                      test_vmerge
fastq-load                      test-pagefile                   test_weakref
fcgi_sample.fcgi                test-path                       test_wgs_loader
feat_unit_test                  test-ram-file-c                 testencrypt
feattree_sample                 test-ramfile                    testld
formatguess                     test-ref-list                   testreenc
formatguess_unit_test           test-ref_sub_select             time-test
gene_info_reader                test-reference-mgr              txt2kdb
gene_info_unit_test             test-resolve                    unit_test_agp_seq_entry
gene_info_writer_unit_test      test-resolver                   unit_test_alnmgr
genomic_compart_unit_test       test-sra                        unit_test_alt_sample
gi2taxid                        test-static                     unit_test_autodef
graph_test                      test-sysfile-timeout            unit_test_basic_cleanup
grid_cgi_sample.cgi             test-sysfs                      unit_test_defline
grid_cli                        test-vdb                        unit_test_entry_edit
grid_client_sample              test-vdb-resolve                unit_test_extended_cleanup
gumbelparams                    test_algo_tree                  unit_test_fasta_ostream
gumbelparams_unit_test          test_align                      unit_test_fasta_reader
helicos-load                    test_annot_ci                   unit_test_feature_table_reader
hello.cgi                       test_bam_loader                 unit_test_field_collection
hfilter                         test_basic_cleanup              unit_test_format_guess_ex
hgvs2variation                  test_biotree                    unit_test_gene_model
hooks_commented                 test_bm                         unit_test_idmapper
hooks_copy_member               test_buffer_writer              unit_test_mol_wt
hooks_copy_object               test_buffile                    unit_test_polya
hooks_copy_variant              test_bulkinfo                   unit_test_sample
hooks_read_member               test_cgi_entry_reader           unit_test_seq_loc_util
hooks_read_object               test_chainer                    unit_test_seq_translator
hooks_read_variant              test_checksum                   unit_test_validator
hooks_skip_member               test_compress                   update_blastdb.pl
hooks_skip_object               test_compress_mt                varloc-load
hooks_skip_variant              test_conn_stream_pushback       vdb-config
hooks_write_member              test_conn_tar                   vdb-copy
hooks_write_object              test_csra_loader                vdb-decrypt
hooks_write_variant             test_csra_loader_mt             vdb-dump
http_connector_hit              test_date                       vdb-encrypt
id1_fetch                       test_diag_parser                vdb-lock
id1_fetch_simple                test_diff                       vdb-passwd
id2_fetch_simple                test_edit_saver                 vdb-unlock
id_unit_test                    test_expr                       vdb-validate
idmapper                        test_fasta_round_trip           vdb_test
igblastn                        test_feat_overlap               vecscreen
igblastp                        test_feat_tree                  vsrun_sample
illumina-dump                   test_floating_point_comparison  wb-test-bam-loader
illumina-load                   test_fstream_pushback           wb-test-fastq
image_info                      test_fw                         wb-test-vxf
kar                             test_get_console_password       wgs_test
kdb2vdb                         test_grid_worker                wig2table
kdbmeta                         test_gridclient_stress          windowmasker
kqsh                            test_hash                       windowmasker_2.2.22_adapter.py
krypto-test                     test_hgvs_parser                writedb_unit_test
ktartest                        test_html                       xcompareannotsdemo
lang_query                      test_ic_client
latf-load                       test_id1_client

How to use on Biowulf

slurm

Create an sbatch script (script.sh):

#!/bin/bash
module load ncbi-toolkit

# NOTE: This is only a test to confirm that ncbi-toolkit runs correctly.
# The example itself may not be meaningful or useful.

# Create a nucleotide BLAST database suitable for the ncbi-toolkit version.
# In this example, we extract the first 1,000,000 lines from nt.fas.

head -n 1000000 /fdb/fastadb/nt.fas > nt_1M.fas
makeblastdb -in nt_1M.fas -dbtype nucl

# Now run the RepeatMasker version of BLAST (rmblastn) against this database.

rmblastn -query gi_255958152.nt.fas -db nt_1M.fas -gapopen 3 -gapextend 3

Now submit the script to Slurm:

sbatch script.sh
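
Taking a fixed number of lines with head can cut the last sequence off in the middle of a record. One alternative, sketched here, keeps the first N complete records instead. It assumes a standard FASTA file in which every record starts with a line beginning with ">". The function name first_fasta_records and the record count in the example are hypothetical and not part of the toolkit.

```shell
# Print the first N complete FASTA records from a file.
# Usage: first_fasta_records N input.fas
first_fasta_records() {
    awk -v n="$1" '
        /^>/ { count++ }          # a header line starts a new record
        count > n { exit }        # stop as soon as record N+1 begins
        count >= 1 { print }      # print header and sequence lines
    ' "$2"
}

# Example (input path as in the script above):
# first_fasta_records 1000 /fdb/fastadb/nt.fas > nt_1k.fas
# makeblastdb -in nt_1k.fas -dbtype nucl
```

Because awk exits at the first header past the limit, the rest of a large input file is never read.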

swarm

Create a swarmfile (swarmfile.txt):

igblastn -db mydb -query seq1.fas -out seq1.out
igblastn -db mydb -query seq2.fas -out seq2.out
igblastn -db mydb -query seq3.fas -out seq3.out
igblastn -db mydb -query seq4.fas -out seq4.out

Then submit to swarm:

swarm -f swarmfile.txt
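
With many query files, the swarmfile can be generated rather than typed by hand. A minimal sketch, assuming the queries follow the seq*.fas naming used above and that mydb is the database name from the example. The function name make_swarm_lines is hypothetical.

```shell
# Print one igblastn command line per query file given as an argument.
make_swarm_lines() {
    for query in "$@"; do
        out="${query%.fas}.out"    # seq1.fas -> seq1.out
        printf 'igblastn -db mydb -query %s -out %s\n' "$query" "$out"
    done
}

# Example:
# make_swarm_lines seq*.fas > swarmfile.txt
# swarm -f swarmfile.txt
```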

Documentation

Many of the executables have built-in help, which can be displayed with the -help option:

$ fastq-dump -help

Usage:
  fastq-dump [options] <path [path...]>
  fastq-dump [options] [ -A ] <accession>

INPUT
  -A|--accession <accession>       Replaces accession derived from <path> in
                                   filename(s) and deflines (only for single
                                   table dump)
  --table <table-name>             Table name within cSRA object, default is
                                   "SEQUENCE"

PROCESSING

Read Splitting                     Sequence data may be used in raw form or
                                     split into individual reads
  --split-spot                     Split spots into individual reads

Full Spot Filters                  Applied to the full spot independently
                                     of --split-spot
  -N|--minSpotId <rowid>           Minimum spot id
  -X|--maxSpotId <rowid>           Maximum spot id
  --spot-groups <[list]>           Filter by SPOT_GROUP (member): name[,...] 
  -W|--clip                        Apply left and right clips 

... etc ...