Biowulf High Performance Computing at the NIH
Sharing Data with Collaborators

Many users wish to share their raw or processed data on Helix/Biowulf with collaborators in and outside NIH. In some cases a group may want to work in a shared disk area rather than having individual directories. Available options for sharing of data are listed below.

Important Note: It may be tempting to share data by changing the permissions on your /home or /data directory so that other users can access the files within. However, we strongly recommend against this. Opening your directory to all users could let another user inadvertently delete or overwrite your files.

Please see https://hpc.nih.gov/storage/permissions.html for more information about handling directory permissions.
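
Before sharing anything, it can help to confirm that your own directories are not already open to every user. The following is a minimal sketch, assuming GNU `stat` is available; the helper name `check_world_access` is made up for illustration and is not part of any NIH HPC tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether "other" users have any
# permissions on a directory, based on the last octal mode digit.
check_world_access() {
    dir=$1
    mode=$(stat -c '%a' "$dir")   # octal mode such as 750 (GNU stat)
    case "$mode" in
        *0) echo "OK: $dir is not world-accessible (mode $mode)" ;;
        *)  echo "WARNING: $dir is accessible to all users (mode $mode)" ;;
    esac
}

# Check your home directory; pass any other directory the same way.
check_world_access "$HOME"
```

A mode ending in 0 (for example 700 or 750) means users outside the owner and group have no access at all.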

NIH collaborators with Helix or Biowulf accounts

World-accessible /scratch

For a one-time or very occasional transfer of non-private files to a collaborator who has a Helix/Biowulf account, the simplest way is to copy the files to a uniquely named directory under /scratch, change the permissions so that your collaborator can access them, and then tell your collaborator where they are. For example:

helix% mkdir /scratch/MyUniqDir
helix% chmod a+rx /scratch/MyUniqDir
helix% cp myfile.txt /scratch/MyUniqDir
helix% chmod a+r /scratch/MyUniqDir/myfile.txt

Note that this allows everyone on Helix/Biowulf to potentially access the file, so it is only suitable for files which are non-private, such as a sequence database composed of publicly available sequences. All files in /scratch get deleted 14 days after last access.
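
Because files in /scratch are deleted 14 days after last access, it may be useful to see which shared files are approaching that limit. The sketch below uses `find -atime` for this; the helper name `list_stale_files` and the 10-day threshold are arbitrary choices for illustration. It also assumes the access times that `find` reports match the ones used by the cleanup policy, which may not be the case.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory whose last access
# time is at least DAYS days old.
list_stale_files() {
    dir=$1
    days=$2
    # -atime +N matches files last accessed more than N whole days ago,
    # that is, at least N+1 days, so subtract one from the threshold.
    find "$dir" -type f -atime "+$((days - 1))" -print
}

# Example using the directory from the text; skipped if it does not exist.
share_dir=/scratch/MyUniqDir
if [ -d "$share_dir" ]; then
    list_stale_files "$share_dir" 10
fi
```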

/scratch with permissions

A more secure way is to have a Unix group set up containing you and your collaborator. Contact staff@hpc.nih.gov about setting up a group. Once the group is set up, you can set permissions on the directory /scratch/MyUniqDir so that only you and your collaborator can access it. For example, the Helix staff can set up a group called MyGroup containing the users of your choice. Then:

helix% mkdir /scratch/MyUniqDir
helix% chgrp MyGroup /scratch/MyUniqDir
helix% chmod g+rx /scratch/MyUniqDir
helix% cp myfile.txt /scratch/MyUniqDir
helix% chmod g+r /scratch/MyUniqDir/myfile.txt
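
A stricter variant of this setup sets the modes explicitly, so that the result does not depend on your umask and users outside the group have no access. The sketch below is one way to do that; the helper name `share_with_group` is hypothetical, and MyGroup, /scratch/MyUniqDir, and myfile.txt are the example names from the text. It also assigns the copied file to the group, which the `chmod g+r` step relies on.

```shell
#!/bin/sh
# Hypothetical helper: copy a file into a directory that only the
# owner and one Unix group can read.
share_with_group() {
    dir=$1
    group=$2
    file=$3
    dest="$dir/$(basename "$file")"

    mkdir -p "$dir"
    chgrp "$group" "$dir"
    chmod u=rwx,g=rx,o= "$dir"    # 750: owner full, group read/list, others none

    cp "$file" "$dest"
    chgrp "$group" "$dest"
    chmod u=rw,g=r,o= "$dest"     # 640: owner read/write, group read, others none
}

# Example call, using the names from the text:
#   share_with_group /scratch/MyUniqDir MyGroup myfile.txt
```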

More about groups and permissions

Shared /data directory

If you wish to share data on a regular basis with other users on Helix/Biowulf, or have many people access the same data without copying it back and forth, the best way is to set up a group with shared disk area. You can apply for the shared area and the group by filling out the form at https://hpc.nih.gov/dashboard/shared_data_request.php.

Globus

Globus is a service that makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus uses GridFTP for more reliable and high-performance file transfer, and will queue file transfers to be performed asynchronously in the background. Globus can also be used for sharing data with collaborators inside and outside the NIH. NIH researchers can use their NIH Login username and password to access Globus; collaborators outside the NIH will need a free Globus account. See: Logging into Globus, data transfer and sharing.

NIH or outside collaborators without Helix/Biowulf accounts

Datashare

Helix/Biowulf users can set up special directories which are readable via the web, but are not browseable. This space is intended for data sharing only, and personal web pages are not allowed. See the datashare directories documentation for more information.

Globus

Globus can be used for sharing data with collaborators inside and outside the NIH. NIH researchers can use their NIH Login username and password to access Globus; collaborators outside the NIH will need a free Globus account. See: Logging into Globus, data transfer and sharing.

Any collaborators

Anonymous FTP

The anonymous ftp area allows for transfer of files in both directions between NIH scientists and outside collaborators. We do not recommend this option for any sensitive or private data, since the files on the anonymous ftp area are accessible and downloadable by any person. However, anonymous ftp is convenient and frequently used for relatively small data transfers, such as a sequence database created from publicly available sequences. Uploads and downloads can be performed by both the user and collaborator. More information at https://hpc.nih.gov/nih/anonFTP.html (NIH-only).

Acronis

Acronis is a desktop client and web portal that allows users to synchronize files and folders across multiple desktops and share files with collaborators. There is also a mobile application that allows NIH iPad and iPhone users direct access to their network home folders, network shared folders, SharePoint sites, and their collaboration folder. The data is encrypted using AES-256 and access-controlled, in line with NIH and other federal government security standards for official business. For more information, please see https://itwiki.nih.gov/wiki/projects.

Globus

Globus can be used for sharing data with collaborators inside and outside the NIH. NIH researchers can use their NIH Login username and password to access Globus; collaborators outside the NIH will need a free Globus account. See: Logging into Globus, data transfer and sharing. Sharing via Globus allows you to specify precisely who can access the files, and you can turn this access on and off at any time.

Access Control Lists (ACLs)

Access Control Lists (ACLs) are an extension of the traditional UNIX permission concept and allow more fine-grained control over access to files under Linux. Specifically, ACLs make it possible to grant individual users or groups access to single files or directories, with selective control over read, write, and execute permissions. This gives much more flexible control for sharing data between users on our systems.

Currently, ACLs are available ONLY on our GPFS filesystems (/gs[2-11]). Our NFS file system (/spin1) does NOT support ACLs.

More information about ACLs on NIH HPC Systems
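
As an illustration of granting a single user access, the sketch below uses the standard Linux `setfacl` and `getfacl` tools. It assumes these tools are installed and that POSIX ACLs are what the filesystem in question supports, which is not confirmed here; consult the ACL documentation for the commands that actually apply. The helper name `grant_read_acl` and the names in the example call are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: give one user read-only access to a directory
# tree via POSIX ACLs, then print the resulting ACL of the directory.
grant_read_acl() {
    user=$1
    dir=$2
    # r-X: read, plus execute (search) only on directories and on
    # files that are already executable.
    setfacl -R -m "u:$user:r-X" "$dir" && getfacl "$dir"
}

# Example call with placeholder names:
#   grant_read_acl collaborator_username /path/to/shared_directory
```

The entry can be removed again with `setfacl -R -x "u:collaborator_username" /path/to/shared_directory`.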

Custom Situation

Contact staff@hpc.nih.gov if your situation is not addressed by any of the above sections, and we'll try to come up with some options for you.