Globus is a service that makes it easy to move, sync, and share large amounts of data. Globus will manage file transfers, monitor performance,
retry failures, recover from faults automatically when possible, and report the status of your data transfer. Globus uses GridFTP for more reliable
and high-performance file transfer, and will queue file transfers to be performed asynchronously in the background.
Globus was developed and is maintained at the University of Chicago and is used extensively at supercomputer centers and major research facilities (see the Globus website).

Note: no matter how you transfer data in and out of our systems, PII and PHI data cannot be stored on or transferred into the NIH HPC systems.

From either the web browser or the command line, Globus lets you:
- set up a daily, automatic sync to your back-up system
- delay the start of a big transfer until midnight on Saturday
- move data to archival storage once a week.
To set up a recurring or scheduled transfer using the web browser:
Go to the Globus file manager and set up your transfer in the usual way. Under 'Transfer and Timer options', you can set up a time for the transfer to run, and also the repeat schedule.
To set up a scheduled or recurring transfer via the command line, you will need the Globus UUIDs of the source and destination endpoints. These can be found via the Globus web interface or the Globus CLI.
Finding UUIDs using the Globus CLI:
    # Find the UUIDs of the source and destination endpoints
    biowulf% globus login
    Please paste the following URL in a browser:
    https://auth.globus.org/etc....

    # Once you go to the webpage and authenticate with your NIH login, you will see
    # a page requesting that you allow Globus CLI to manage transfers etc.
    # Click 'Allow'. You will then be provided with an authorization code, which
    # should be pasted into your terminal session.
    Please Paste your Auth Code Below: ...pasted code...

    # Search for the UUID of the 'NIH HPC Data Transfer' endpoint
    biowulf% globus endpoint search 'NIH HPC Data Transfer'
    ID                                   | Owner               | Display Name
    ------------------------------------ | ------------------- | ---------------------
    e2620047-6d04-11e5-ba46-22000b92c6ec | nihhpc@globusid.org | NIH HPC Data Transfer
    [....]
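When scripting a transfer, it can be handy to pull the UUID out of the search results rather than copying it by hand. The following is a minimal sketch, assuming the `ID | Owner | Display Name` text layout shown in the session above. The function name `endpoint_uuid` is hypothetical and is not part of the Globus CLI.

```shell
# endpoint_uuid NAME: read `globus endpoint search` table output on stdin and
# print the ID of the first row whose Display Name equals NAME exactly.
# Assumes the "ID | Owner | Display Name" layout shown in the session above.
endpoint_uuid() {
    awk -F'|' -v name="$1" '
        {
            # Trim surrounding whitespace from the ID and Display Name columns
            id = $1; disp = $3
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", disp)
        }
        disp == name { print id; exit }
    '
}
```

A possible use is `globus endpoint search 'NIH HPC Data Transfer' | endpoint_uuid 'NIH HPC Data Transfer'`. Matching on the exact display name avoids picking up similarly named endpoints that the search may also return.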
Finding UUIDs using the Globus web interface:
Go to https://app.globus.org/endpoints and search for the endpoint name.
First you need to authenticate your globus-timer session. This is exactly like the authentication for the Globus CLI.
    biowulf% globus-timer session login
    Please paste the following URL in a browser:
    https://auth.globus.org/etc....

    # Once you go to the webpage and authenticate with your NIH login, you will see
    # a page requesting that you allow Globus CLI to manage transfers etc.
    # Click 'Allow'. You will then be provided with an authorization code, which
    # should be pasted into your terminal session.
    Please Paste your Auth Code Below: ...pasted code...

Sample session to set up the transfer of a directory once a day:
    biowulf% globus-timer job transfer --name "globus-timer-test" \
    > --start '2021-02-09T14:00:00' \
    > --interval '1d' \
    > --source-endpoint e2620047-6d04-11e5-ba46-22000b92c6ec \
    > --dest-endpoint fb1b8048-f84f-11ea-892a-0a5521ff3f4b \
    > --item /data/$USER/dir1 /Users/$USER/Desktop/dir1 true
    Name: globus-timer-test
    Job ID: bef49456-8678-4326-b1d9-c9f2509a9988
    Status: new
    Start: 2021-02-09T19:00:00+00:00
    Interval: 1 day, 0:00:00
    Next Run At: 2021-02-10T19:00:00+00:00

Parameters in the command above:
| Parameter | Description |
| --- | --- |
| --name | Name of the job, to help identify it. |
| --start 'YYYY-MM-DDTHH:MM:SS' | Start time for the job. Alternate syntax is available: see the Globus Timer CLI docs. |
| --interval 'xxx' | How often the job should run. See the Globus Timer CLI docs for syntax. |
| --source-endpoint xxx | UUID of the source endpoint. |
| --dest-endpoint xxx | UUID of the destination endpoint. |
| --item sourcepath destpath recursive | Source and destination paths for the file or directory. The last parameter defines whether the transfer should be recursive (e.g. for a directory). In the example above, a directory tree is being transferred, so the last parameter is set to true. |
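In the sample `globus-timer job transfer` session above, `--start` was given as 14:00 but the job reports its Start as 19:00 +00:00. This is consistent with the start time being read as local time (US Eastern, UTC-5 in February) and reported in UTC, although that reading is an inference from the sample output rather than documented behavior. The hypothetical helper `to_utc` below sketches the same conversion with `date`, so you can predict the timestamp a job should report. It assumes a `date` command that supports `-d`, such as GNU date.

```shell
# to_utc LOCAL_TIME: interpret LOCAL_TIME ('YYYY-MM-DDTHH:MM:SS') in the
# current time zone ($TZ) and print the same instant as a UTC timestamp.
# Hypothetical helper; relies on `date -d`, which is not part of POSIX date.
to_utc() {
    local stamp epoch
    # Replace the 'T' separator with a space, which `date -d` accepts more widely
    stamp=$(printf '%s' "$1" | tr 'T' ' ')
    # Seconds since the epoch for that local time
    epoch=$(date -d "$stamp" +%s) || return 1
    # Format the same instant in UTC
    date -u -d "@$epoch" '+%Y-%m-%dT%H:%M:%S+00:00'
}
```

With the time zone set to US Eastern, `to_utc '2021-02-09T14:00:00'` prints `2021-02-09T19:00:00+00:00`, which matches the Start line in the sample output.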
Sample session to check scheduled jobs:
    biowulf% globus-timer job list
    Name              | Job ID                               | Status | Last Result
    ------------------|--------------------------------------|--------|-------------
    globus-timer-test | bef49456-8678-4326-b1d9-c9f2509a9988 | loaded | RUN COMPLETE

    biowulf% globus-timer job status 61cdd6d9-abd3-45a0-8ea2-7e961a741ca2
    Name: globus-timer-test
    Job ID: 61cdd6d9-abd3-45a0-8ea2-7e961a741ca2
    Status: loaded
    Start: 2021-02-09T20:30:00+00:00
    Interval: 1 day, 0:00:00
    Next Run At: 2021-02-10T20:30:00+00:00
    Last Run Result: RUN COMPLETE
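If you have several scheduled jobs, you may want to pick out only those whose last run did not complete. The following is a minimal sketch that filters the `globus-timer job list` table, assuming the four-column layout shown above with a header line and a dashed separator line. The function name `incomplete_timer_jobs` is hypothetical. Only the `RUN COMPLETE` value appears in the sample output, so the sketch treats every other value as worth a look rather than assuming specific failure strings.

```shell
# incomplete_timer_jobs: read `globus-timer job list` output on stdin and print
# "name job_id last_result" for every job whose Last Result column is not
# "RUN COMPLETE". Assumes the four-column table layout shown above.
incomplete_timer_jobs() {
    awk -F'|' '
        NR <= 2 { next }    # skip the header line and the dashed separator
        NF >= 4 {
            # Trim surrounding whitespace from each of the four columns
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($4 != "RUN COMPLETE") print $1, $2, $4
        }
    '
}
```

A possible use is `globus-timer job list | incomplete_timer_jobs`, followed by `globus-timer job status` on any job ID it prints.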
Sample session to delete a scheduled job:

    biowulf% globus-timer job delete f150615c-26d7-450c-975e-57a5817d817b
    Name: globus-timer-test
    Job ID: f150615c-26d7-450c-975e-57a5817d817b
    Status: deleted
    Start: 2021-02-09T17:30:00
    Interval: 1 day, 0:00:00