Many users wish to share their raw or processed data on Helix/Biowulf with collaborators inside and outside NIH. In some cases a group may want to work in a shared disk area rather than in individual directories. The available options for sharing data are listed below.
Important Note: It may be tempting to share data by changing the permissions on your /home or /data directory so that other users can access the files within. Allowing world access to your directory is prohibited. Limited access can be extended to a few other users with Access Control Lists.
For a one-time or very occasional transfer of non-private files to a collaborator who has a Helix/Biowulf account, the simplest way is to copy the files to a new directory /scratch/MyUniqDir, change the permissions so that your collaborator can access them, and then tell your collaborator where they are. For example:
helix% mkdir /scratch/MyUniqDir
helix% chmod a+rx /scratch/MyUniqDir
helix% cp myfile.txt /scratch/MyUniqDir
helix% chmod a+r /scratch/MyUniqDir/myfile.txt
Note that this allows everyone on Helix/Biowulf to potentially access the file, so it is only suitable for files which are non-private, such as a sequence database composed of publicly available sequences. All files in /scratch are deleted 10 days after last access.
Your collaborator should do the copy on Helix (the designated interactive data transfer node), rather than on the Biowulf login node. /scratch is not available on the Biowulf compute nodes. After your collaborator has copied the file to their own area, you should delete the directory, i.e.
helix% rm -rf /scratch/MyUniqDir
Don't use your personal /scratch/$USER for this purpose, as this will allow any user on the system to read and delete files in /scratch/$USER.
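The one-time transfer described above can be wrapped in a small shell function. This is only a sketch: the function name share_once, its optional base-directory argument, and the use of mktemp -d to pick a unique directory name (instead of a hand-chosen name such as /scratch/MyUniqDir) are assumptions made for illustration, not part of the documented procedure.

```shell
# share_once FILE [BASE_DIR]  (hypothetical helper, not an NIH HPC tool)
# Copies FILE into a new, uniquely named directory under BASE_DIR
# (default /scratch) and makes the directory and the copy world-readable.
share_once() {
    file=$1
    base=${2:-/scratch}

    # mktemp -d creates a fresh directory that initially only you can access
    dir=$(mktemp -d "$base/share.XXXXXX") || return 1

    cp "$file" "$dir/" || return 1

    # Same permissions as in the manual steps: everyone may read the file
    # and enter/list the directory
    chmod a+r "$dir/$(basename "$file")"
    chmod a+rx "$dir"

    # Print the location so you can pass it on to your collaborator
    printf '%s\n' "$dir"
}
```

Running share_once myfile.txt prints the new directory name. As with the manual steps, this is suitable only for non-private files, and the directory should be removed with rm -rf once your collaborator has copied the file.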
Access Control Lists offer a way of allowing a single user access to a portion of your personal /data directory. In this example, the subdirectory /data/user/for_my_colleague is opened for browsing to the user colleague. The first command lets colleague pass through /data/user without listing its contents; the second lets colleague read and list the subdirectory:
helix% setfacl -m u:colleague:--x /data/user
helix% setfacl -m u:colleague:r-x /data/user/for_my_colleague
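To check or later withdraw access granted this way, the standard getfacl and setfacl -x commands can be used. The sketch below wraps them in a hypothetical helper, revoke_acl, with a DRY_RUN switch (also an assumption of this example) that prints the commands instead of running them, so they can be reviewed first. The user and directory names are the illustrative ones from the example above.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN is set
# (hypothetical helper for this example)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# revoke_acl USER TOP SUB  (hypothetical helper)
# Removes USER's ACL entries from TOP/SUB and from TOP, then lists the
# remaining ACLs on both directories so the result can be checked.
revoke_acl() {
    who=$1
    top=$2
    sub=$3
    run setfacl -x "u:$who" "$top/$sub"
    run setfacl -x "u:$who" "$top"
    run getfacl "$top" "$top/$sub"
}
```

With DRY_RUN=1 set, revoke_acl colleague /data/user for_my_colleague only prints the three commands; without it, the commands are executed.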
More about groups and permissions
If you wish to share data on a regular basis with other users on Helix/Biowulf, or have many people access the same data without copying it back and forth, the best way is to set up a group with a shared disk area. The 'group owner' and 'group members' have specific responsibilities; see the groups & shared directories page for more information. You can apply for the shared area and the group by filling out the form at https://hpcnihapps.cit.nih.gov/auth/dashboard/shared_data_request.php.
Globus is a service that makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance, retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus uses GridFTP for more reliable and high-performance file transfer, and will queue file transfers to be performed asynchronously in the background.
Globus can be used for sharing data with collaborators inside and outside the NIH. NIH researchers can use their NIH Login username and password to access Globus; collaborators outside the NIH will need a free Globus account, and will need to install Globus Connect Personal (free, available for Windows, Mac, Linux). For details, see Logging into Globus, data transfer and sharing.
NIH/CIT provides Box and OneDrive collaboration tools for sharing data with collaborators. Which one you can use depends on the size of data to be shared, the size of individual files, and whether the collaborators are at NIH or outside. See https://hpc.nih.gov/docs/box_onedrive.html for more information.
Helix/Biowulf users can set up special directories which are readable via the web, but are not browseable. This space is intended for data sharing only, and personal web pages are not allowed. See more information about datashare directories here.
Sharing via Globus allows you to specify precisely who can access the files, and you can turn this access on and off at any time.
Access Control Lists (ACLs) are an extension of the traditional UNIX permission concept and allow finer-grained access control for files under Linux. Specifically, ACLs make it possible to grant individual users or groups access to single files or directories, with selective control over read, write, and execute permissions. This gives much more precise control for sharing data between users on our systems.
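As an illustration of per-user and per-group entries, the sketch below builds a setfacl -m command from its parts and prints it for review rather than running it. The helper name acl_entry_cmd and the user, group, and file names in the usage note are hypothetical; only the setfacl -m u:NAME:PERMS and g:NAME:PERMS entry syntax is standard.

```shell
# acl_entry_cmd KIND NAME PERMS TARGET  (hypothetical helper)
# KIND is u (user) or g (group); PERMS is a triplet such as r--, rw-, r-x.
# Prints the setfacl command that would grant the entry; it does not run it.
acl_entry_cmd() {
    kind=$1
    name=$2
    perms=$3
    target=$4

    case $kind in
        u|g) ;;
        *) echo "kind must be u or g" >&2; return 1 ;;
    esac

    # Each position is either its permission letter or a dash
    case $perms in
        [r-][w-][x-]) ;;
        *) echo "perms must look like r-x" >&2; return 1 ;;
    esac

    printf 'setfacl -m %s:%s:%s %s\n' "$kind" "$name" "$perms" "$target"
}
```

For example, acl_entry_cmd u colleague r-- /data/user/notes.txt prints a command granting one user read-only access to a single file, and acl_entry_cmd g mygroup rw- /data/user/shared.csv prints one granting a group read and write access.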
More information about ACLs on NIH HPC Systems
Contact staff@hpc.nih.gov if your situation is not addressed by any of the above sections, and we'll try to come up with some options for you.