High-Performance Computing at the NIH
CrystFEL on Biowulf

CrystFEL is a suite of programs for processing diffraction data acquired "serially" in a "snapshot" manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source. CrystFEL comprises programs for indexing and integrating diffraction patterns, scaling and merging intensities, simulating patterns, calculating figures of merit for the data and visualising the results. Supporting scripts are provided to help at all stages, including importing data into CCP4 for further processing.


The primary citation for CrystFEL is as follows. See the list at the end of this page for more references.

T. A. White, R. A. Kirian, A. V. Martin, A. Aquila, K. Nass, A. Barty and H. N. Chapman. "CrystFEL: a software suite for snapshot serial crystallography". J. Appl. Cryst. 45 (2012), p335–341.
doi:10.1107/S0021889812002312
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session with 100 GB of local scratch space and run the program. The sample session below works through the CrystFEL tutorial. The sample data is copied to local scratch, and all operations are performed on this copy to reduce I/O to the central filesystems.

[user@biowulf]$ sinteractive --cpus-per-task=8 --mem=20g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load crystfel

[user@cn3144 ~]$ cd /lscratch/$SLURM_JOBID

[user@cn3144 ~]$ cp -r /usr/local/apps/crystfel/example .

[user@cn3144 ~]$ cd example

[user@cn3144 ~]$ indexamajig -i files.list -g 5HT2B-Liu-2013.geom --peaks=hdf5 -o tutorial.stream

[user@cn3144 ~]$ check-peak-detection tutorial.stream -g 5HT2B-Liu-2013.geom --int-boost=5

[user@cn3144 ~]$ ....etc....

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
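
The session above requests eight CPUs, but indexamajig is run without its -j option, so it falls back to its default of one worker process. The following is a minimal sketch, assuming you want indexamajig to use the CPUs Slurm allocated: -j sets the number of patterns processed in parallel, and Slurm exports SLURM_CPUS_PER_TASK inside the job. The fallback value of 1 and the dry-run echo are illustrative choices, not part of the original tutorial.

```shell
# Use the CPU count Slurm allocated to this job; fall back to 1 when the
# variable is unset (for example, outside a Slurm job).
threads="${SLURM_CPUS_PER_TASK:-1}"

# Dry run: print the command so it can be checked before running it.
# -j sets how many patterns indexamajig processes in parallel.
cmd="indexamajig -i files.list -g 5HT2B-Liu-2013.geom --peaks=hdf5 -j ${threads} -o tutorial.stream"
echo "$cmd"
```

Once the printed command looks right, run it directly at the prompt of the interactive session.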

Batch job
Most jobs should be run as batch jobs.

Create a batch script (e.g. crystfel.sh). For example:

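#!/bin/bash
module load crystfel

cd /lscratch/$SLURM_JOBID
cp -r /usr/local/apps/crystfel/example .
cd example
indexamajig -i files.list -g 5HT2B-Liu-2013.geom --peaks=hdf5 --indexing=mosflm-raw-nolatt --int-radius=3,4,5 -o tutorial.stream

# Assumption: keep the output by copying it to the directory the job was
# submitted from; /lscratch is cleared when the job ends.
cp tutorial.stream "$SLURM_SUBMIT_DIR"/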

Submit this job using the Slurm sbatch command.

sbatch --gres=lscratch:100 crystfel.sh

Swarm of jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. crystfel.swarm). For example, the following swarmfile runs indexamajig on several sets of data:

indexamajig -i files1.list -g AB1.geom --peaks=hdf5 --indexing=mosflm-raw-nolatt --int-radius=3,4,5 -o out1
indexamajig -i files2.list -g AB2.geom --peaks=hdf5 --indexing=mosflm-raw-nolatt --int-radius=3,4,5 -o out2
indexamajig -i files3.list -g AB3.geom --peaks=hdf5 --indexing=mosflm-raw-nolatt --int-radius=3,4,5 -o out3
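
For more than a handful of datasets, the swarmfile can be generated rather than typed by hand. The following is a minimal sketch, assuming the inputs follow the filesN.list / ABN.geom naming used above. The loop range and the script itself are illustrative and should be adapted to your own file names.

```shell
# Write one indexamajig command per dataset to crystfel.swarm.
swarmfile=crystfel.swarm
: > "$swarmfile"   # create the file, or empty it if it already exists

for i in 1 2 3; do
    printf 'indexamajig -i files%s.list -g AB%s.geom --peaks=hdf5 --indexing=mosflm-raw-nolatt --int-radius=3,4,5 -o out%s\n' "$i" "$i" "$i" >> "$swarmfile"
done
```

Each line of the resulting file is an independent command, which is the form swarm expects.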

Submit this job using the swarm command.

swarm -f crystfel.swarm [-g #] [-t #] --module crystfel

-g #               Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #               Number of threads/CPUs required for each process (1 line in the swarm command file)
--module crystfel  Loads the crystfel module for each subjob in the swarm