MySQL

MySQL is a general-purpose RDBMS, and is widely used as a backend for many applications. On Biowulf, users can either request a database on the mysql host biobase.nih.gov, or run a temporary Mysql instance on a node, or run an occasional Mysql instance pointing to a permanent location in the user's data area. These options are detailed below.

Choose your modality
Modality Pros Cons
biobase.nih.gov
  • always available
  • accessible within NIH network
  • most reliable data storage
  • no root access
  • strict limits on databases and use
  • controlled and restricted by HPC Staff
local_mysql
  • easy to create and manage
  • full root access
  • full control over administration
  • fastest and most reliable access
  • must be started within an HPC job
  • not accessible outside cluster
  • temporary instance on /lscratch
  • data must be archived between jobs
local_mysql --basedir
  • easy to create and manage
  • full root access
  • full control over administration
  • no need to archive data
  • must be started within an HPC job
  • not accessible outside cluster
  • data stability not 100% reliable
  • access speed affected by network slowdowns
  • InnoDB engine not available
Mysql database on biobase.nih.gov

The mysql host biobase.nih.gov is available for user accounts. This server is available only within the NIH network, and is intended only for use with applications run on HPC @ NIH hosts. To obtain a mysql account on biobase.nih.gov, please contact staff@hpc.nih.gov.

One of the primary purposes for biobase.nih.gov is hosting the backend databases for our local mirror of the UCSC Genome Browser. Users can access the local databases using a special read-only account. Please contact staff@hpc.nih.gov for details.

Using biobase from the cluster

Because nodes in the cluster are sequestered to an internal network, the host 'biobase.nih.gov' should not be used as the mysql host. Instead, use 'biobase':

[node1]$ mysql -u user -p -h biobase
Temporary Mysql instance

Once a mysql module has been loaded, the command local_mysql allows a mysql instance to be started on a node. The local_mysql has four sub-commands:

Here is an example of how to start a mysql server on an interactive node, using the dynamically-allocated local /lscratch disk as the root directory.

Start interactive session

In this example, we allocate 10 GB of local scratch space for the session. See https://hpc.nih.gov/docs/userguide.html#local for more information about using /lscratch. See https://hpc.nih.gov/docs/userguide.html#int for more information on allocating interactive sessions. Job and node ids have been changed to protect the innocent.

[biowulf]$ sinteractive --mem=10g --gres=lscratch:10
salloc.exe: Pending job allocation 123456789
salloc.exe: job 123456789 queued and waiting for resources
salloc.exe: job 123456789 has been allocated resources
salloc.exe: Granted job allocation 123456789
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes node1 are ready for job

Initialize and start a mysql instance

Once the interactive session is ready, load the mysql module of choice, initialize and start the mysql instance.

By default, local_mysql will set the root password of the instance to mysql123. This can be changed using the option --password.

[node1]$ module load mysql
[node1]$ local_mysql create

The default engine is MyISAM. This can be changed by editing my.cnf prior to starting the server:

[node1]$ cat /lscratch/$SLURM_JOB_ID/mysql/my.cnf
...
default_storage_engine      = MyISAM
default_tmp_storage_engine  = MyISAM
...
[node1]$ vi /lscratch/$SLURM_JOB_ID/mysql/my.cnf
[node1]$ cat /lscratch/$SLURM_JOB_ID/mysql/my.cnf
...
default_storage_engine      = InnoDB
default_tmp_storage_engine  = InnoDB
...

Set MYSQL_HOME and MYSQL_UNIX_PORT variables:

[node1]$ export MYSQL_HOME=/lscratch/$SLURM_JOB_ID/mysql
[node1]$ export MYSQL_UNIX_PORT=/lscratch/$SLURM_JOB_ID/mysql/mysql.sock

Now start up the server:

[node1]$ local_mysql start
MySQL is running now
You can now log in using 'mysql -u root -p'mysql123'

By default, the initialization procedure creates a new directory under /lscratch (/lscratch/$SLURM_JOB_ID/mysql) and fills it with files and directories:

[node1]$ ls -F /lscratch/123456789/mysql
bin/  data/  my.cnf  mysql.err  mysql.pid  mysql.sock  share/  support-files/  tmp/

Log in do a few things

Once the mysql instance is up and running, you can log in using either:

By default, local_mysql will set the port to 55555. This is different than the standard mysql port of 3306 and therefore must be explicitly specified either by using the socket file or with the --port option. The port can be changed using the option --port.

[node1]$ mysql -u root -p'mysql123'
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.30 Source distribution

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE USER me IDENTIFIED BY 'pass123';
mysql> CREATE DATABASE `my_db`;
mysql> GRANT ALL ON `my_db`.* TO me;
mysql> USE my_db;
mysql> CREATE TABLE my_table (x int, y int, z int);
mysql> DESC my_table;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| x     | int(11) | YES  |     | NULL    |       |
| y     | int(11) | YES  |     | NULL    |       |
| z     | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
mysql> exit
[node1]$

Simplify using ~/.my.cnf

MySQL can read default information from configuration files. For personal use, the default is ~/.my.cnf. This file can be edited to simplify use of mysql.

# Configuration for local_mysql
[mysql]
auto-rehash=FALSE
prompt= \h:\d>\_

[client]
user=me
password=pass123
database=my_db
host=node1
port=55555
socket=/lscratch/123456789/mysql/mysql.sock

Note that the host, port and socket values will likely be different.

Creating this personal ~/.my.cnf file will allow simple connections:

[node1]$ mysql
...
localhost:my_db>

Log in from another machine

While the interactive session is running, the mysql instance will be accessible from any node within the Biowulf cluster using the --host and --port options.

[node2]$ mysql -u me -p'pass123' --host=node1 --port=55555

By default the root account initially created does not allow access from outside the original host where mysql was initialized. Either modify the root account (not good idea) or create an alternate account (better idea) for access from other nodes.

Stop mysql prior to archiving and exiting the interactive session

[node1]$ local_mysql stop

MySQL is stopped!

Archive the contents

The contents of the temporary server instance can be archived into a single tarball for restarting later and on a different node. This is the recommended way of keeping instances long-term.

[node1]$ local_mysql archive --archivefile=/data/$USER/save.tgz
[node1]$ ls -l /data/$USER/save.tgz
-rw-r--r-- 1 user user 1371882 Jul  1 11:35 /data/user/save.tgz

Exit the interactive session

Lastly, exit the interactive session. The contents of /lscratch will be destroyed!

[node1]$ exit
salloc.exe: Relinquishing job allocation 123456789
[biowulf]$

Start a new interactive session, restore the contents, and start mysql again

[biowulf]$ sinteractive --mem=10g --gres=lscratch:10
salloc.exe: Pending job allocation 234567891
... 
[node3]$ module load mysql
[node3]$ local_mysql restore --archivefile=/data/$USER/save.tgz
[node3]$ local_mysql start

MySQL is running now
You can now log in using 'mysql -u root -p'mysql123' --socket=/lscratch/234567891/mysql/mysql.sock'

Mysql database with permanent location in /data

If the mysql files are required permanently, such that the server could be brought up again and again, and archiving is not a possibility (see above), then don't take the default --basedir value. Instead, point the --basedir option to a directory in a /data area.

By default mysql files are written to /lscratch. This can be changed using the option --basedir. local_mysql will attempt to create the directory path given by --basedir, and will complain if the path already exists.

The $MYSQL_HOME and $MYSQL_UNIX_PORT environment variables are very useful for simplifying things. If you need to use local_mysql quite often, placing the export commands in your ~/.bashrc circumvents the need to declare them after loading the local_mysql module.

[node1]$ module load mysql
[node1]$ export MYSQL_HOME=/data/user/mysql
[node1]$ export MYSQL_UNIX_PORT=/data/user/mysql/mysql.sock
[node1]$ mkdir -p $MYSQL_HOME
[node1]$ local_mysql --basedir $MYSQL_HOME create
[node1]$ local_mysql --basedir $MYSQL_HOME start
...
[node1]$ mysql -u root -p'mysql123' --host=$SLURM_NODELIST
mysql>

And as always be sure to stop the server when you're done using it:

[node1]$ local_mysql --basedir $MYSQL_HOME stop

Here is a script "mysql_server.sh" that can be used to launch a mysql server in a batch job (assuming that the mysql instance has already been installed and configured). The mysql server will run on a node for a defined amount of time, in this example for 8 hours. It traps signals so that the server is cleanly shut down two minutes prior to the job ending.

As with all things, the mysql_server job is only temporary. The maximum walltime allowed for batch jobs is 10 days. If you have secondary jobs that depend on mysql, make sure to allocate enough time for the mysql server that allows all secondary jobs to complete.

#!/bin/bash
# this script is named "mysql_server.sh"
#SBATCH --time=8:00:00 --signal=B:USR1@120 --job-name=mysql_server
export MYSQL_HOME=/data/user/mysql
export MYSQL_UNIX_PORT=/data/user/mysql/mysql.sock

term_handler()
{
    [[ -f /data/user/mysql/mysql.pid ]] && local_mysql --basedir $MYSQL_HOME stop
    exit
}

# associate the function "term_handler" with signals
trap 'term_handler' TERM
trap 'term_handler' EXIT
trap 'term_handler' USR1

# start the server
module load mysql
local_mysql --basedir $MYSQL_HOME start

# sleep for the time with sbatch (above)
sleep 8h & wait
[[ -f $MYSQL_HOME/mysql.pid ]] && local_mysql --basedir $MYSQL_HOME stop

Start the server by submitting the script to the batch system

[biowulf]$ sbatch mysql_server.sh

It will take a few minutes to get running, so be patient. To find the node on which the mysql server is running, use the jobname as a filter:

[biowulf]$ dashboard_cli jobs --jobname mysql_server --fields nodelist --raw --no-header
cnXXXX

or more simply

[biowulf]$ sjobs
User      JobId     JobName     Part         St  Reason  Runtime     Walltime     Nodes  CPUs   Memory  Dependency  Nodelist
==============================================================================================================================
user      24416517  mysql_serv  norm         R                 1:17      8:00:00      1      2    4 GB              cnXXXX
==============================================================================================================================

Verify the server is running

You can verify that the server is running using ssh and wazzup:

[biowulf]$ ssh cnXXXX wazzup -c
   PPID     PID USER     ELAPSED %CPU %MEM S COMMAND
1554701 1554709 user       14:56  0.0  0.0 S slurm_script
1554709 1555206 user       14:55  0.0  0.0 S  \_ sleep
      1 1554780 user       14:56  0.0  0.0 S sh
1554780 1554781 user       14:56  0.1  0.0 S  \_ mysqld_safe
1554781 1555137 user       14:56  0.1  0.1 S      \_ mysqld

Connect to the mysql server

You can connect to the mysql server from another job, for example an interactive session:

[biowulf]$ sinteractive
...
[node1]$ module load mysql
[node1]$ export MYSQL_HOME=/data/user/mysql
[node1]$ export MYSQL_UNIX_PORT=/data/user/mysql/mysql.sock
[node1]$ mysql -u me -p'pass123' --host=cnXXXX

More information

For more information about how to use local_mysql, type

local_mysql --help