helix% module load azcopy
helix% azcopy login
# azcopy prints a link that you have to paste in your browser, where you enter
# the code XXXX to authenticate. Then sign in with your NIH email.
helix% azcopy cp "https://[account].blob.core.windows.net/[container]/onefile" .
helix% azcopy cp /data/$USER/mydir "https://[account].blob.core.windows.net/[container]/directory" --recursive
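The bracketed parts of the URLs above follow a fixed pattern. A minimal sketch of how the pieces assemble (the account and container names here are made-up placeholders, not real endpoints):

```shell
# Placeholder names, used only to illustrate the URL pattern.
ACCOUNT=mystorageaccount
CONTAINER=mycontainer

# Blob-service endpoints always have the form
#   https://<account>.blob.core.windows.net/<container>/<path>
BLOB_URL="https://${ACCOUNT}.blob.core.windows.net/${CONTAINER}"
echo "${BLOB_URL}/onefile"
# -> https://mystorageaccount.blob.core.windows.net/mycontainer/onefile
```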
The Azure CLI is installed and available as a module (azure_cli) that can be loaded on Helix or the Biowulf compute nodes.
Sample session:
[user@helix ~]$ module load azure_cli
[user@helix ~]$ az login
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code ABCDEFGH to authenticate.

---- After successful login ------------------------
[
  {
    "cloudName": "AzureCloud",
    "homeTenantId": "xxxxxxxxxxxxxxxxxxxxxxx",
    "id": "yyyyyyyyyyyyyyyyy",
    "isDefault": true,
    "managedByTenants": [],
    "name": "NIH.Some.Name",
    "state": "Enabled",
    "tenantId": "zzzzzzzzzzzzzzzzzzzzzzzzz",
    "user": {
      "name": "user@nih.gov",
      "type": "user"
    }
  }
]

# get a list of containers in an Azure Blob storage account
[user@helix ~]$ az storage container list --account-name my-azure-storage-account-name --account-key my-account-key
[
  {
    "deleted": null,
    "encryptionScope": {
      "defaultEncryptionScope": "$account-encryption-key",
      "preventEncryptionScopeOverride": false
    },
    [...]
    "lastModified": "2023-08-14T17:06:58+00:00",
    [...]
    "publicAccess": null,
    [...]
    "version": null
  }
]

Set some environment variables to avoid having to enter them on the CLI command line:
[user@helix ~]$ export AZURE_STORAGE_ACCOUNT=my-azure-storage-account-name
[user@helix ~]$ export AZURE_STORAGE_KEY=my-account-key
See blobs in a container
[user@helix ~]$ az storage blob list -c container1
[
  {
    "container": "container1",
    [...]
    "name": "myfile.jpg",
    [...]
    "copy": {
      [...]
    },
    "creationTime": "2023-08-14T17:07:00+00:00",
    "deletedTime": null,
    "etag": "0x8DB9CE8DFE6341E",
    "lastModified": "2023-08-14T17:07:00+00:00",
    "lease": {
      [...]
    },
    [...]
  }
]
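`az storage blob list` prints JSON, so its output can be post-processed with standard tools. A sketch of pulling out just the blob names with python3; the sample JSON below stands in for real `az` output so the snippet runs without Azure credentials:

```shell
# Sample stand-in for `az storage blob list -c container1` output.
blob_json='[{"name": "myfile.jpg"}, {"name": "img_000000255.fits"}]'

# Print just the "name" field of each blob.
echo "$blob_json" | python3 -c 'import json, sys
for blob in json.load(sys.stdin):
    print(blob["name"])'
# -> myfile.jpg
#    img_000000255.fits
```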
Uploading data:
# Upload a file
# sample speed: 100 GB file in 78 mins
az storage blob upload -c container1 -f ./img_000000255.fits

# Upload a directory
az storage blob upload-batch -d container1 -s ./test_data/
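For rough planning, the sample rate quoted above works out to about 21 MB/s:

```shell
# 100 GB (decimal) in 78 minutes, expressed as MB/s. Actual rates vary
# with file count, file sizes, and network load.
awk 'BEGIN { printf "%.1f\n", 100 * 1000 / (78 * 60) }'
# -> 21.4
```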
Documentation for storage-related Azure CLI commands can be seen at the Azure doc site.
There are multiple ways to deal with Azure authentication. See the rclone Azure auth page for more info.
In the example below, authentication has been set up using account-name/key.
[user@helix ~]$ module load rclone
[+] Loading rclone 1.62.2
[user@helix ~]$ rclone config
Enter configuration password:
password:
e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> n

Enter name for new remote.
name> azureblob

Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
[...]
30 / Microsoft Azure Blob Storage
   \ (azureblob)
Storage> 30

Option account.
Azure Storage Account Name.
Set this to the Azure Storage Account Name in use.
Leave blank to use SAS URL or Emulator, otherwise it needs to be set.
If this is blank and if env_auth is set it will be read from the
environment variable `AZURE_STORAGE_ACCOUNT_NAME` if possible.
Enter a value. Press Enter to leave empty.
account> myazurestorageaccount

Option env_auth.
Read credentials from runtime (environment variables, CLI or MSI).
See the [authentication docs](/azureblob#authentication) for full info.
Enter a boolean value (true or false). Press Enter for the default (false).
env_auth> false

Option key.
Storage Account Shared Key.
Leave blank to use SAS URL or Emulator.
Enter a value. Press Enter to leave empty.
key> myazurestoragekey

Option sas_url.
SAS URL for container level access only.
Leave blank if using account/key or Emulator.
Enter a value. Press Enter to leave empty.
sas_url>

Option tenant.
ID of the service principal's tenant. Also called its directory ID.
Set this if using
- Service principal with client secret
- Service principal with certificate
- User with username and password
Enter a value. Press Enter to leave empty.
tenant>

Option client_id.
The ID of the client in use.
Set this if using
- Service principal with client secret
- Service principal with certificate
- User with username and password
Enter a value. Press Enter to leave empty.
client_id>

Option client_secret.
One of the service principal's client secrets
Set this if using
- Service principal with client secret
Enter a value. Press Enter to leave empty.
client_secret>

Option client_certificate_path.
Path to a PEM or PKCS12 certificate file including the private key.
Set this if using
- Service principal with certificate
Enter a value. Press Enter to leave empty.
client_certificate_path>

Option client_certificate_password.
Password for the certificate file (optional).
Optionally set this if using
- Service principal with certificate
And the certificate has a password.
Choose an alternative below. Press Enter for the default (n).
y) Yes, type in my own password
g) Generate random password
n) No, leave this optional password blank (default)
y/g/n> n

Edit advanced config?
y) Yes
n) No (default)
y/n> n

Configuration complete.
Options:
- type: azureblob
- account: myazurestorageaccount
- key: myazurestoragekey
Keep this "azureblob" remote?
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y

Current remotes:

Name        Type
====        ====
azureblob   azureblob
box         box
onedrive    onedrive

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> q

Now that rclone is configured, it can be used to list blobs in a container on Azure Blob Storage, e.g.
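The answers given in the interactive session end up as a short stanza in the rclone config file (by default ~/.config/rclone/rclone.conf; encrypted if a configuration password is set). With the placeholder values used above it would look roughly like:

```
[azureblob]
type = azureblob
account = myazurestorageaccount
key = myazurestoragekey
```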
[user@helix ~]$ rclone ls azureblob:container2
Enter configuration password:
password:
     4678 img_000000248.fits

Transfer a directory:
# To avoid entering the rclone configuration password each time, store it in an environment variable
[user@helix ~]$ export RCLONE_CONFIG_PASS=myrcloneconfigpass

# Upload a directory to Azure
[user@helix ~]$ rclone mkdir azureblob:container3
[user@helix ~]$ rclone sync --progress --update ./50GB-in-medium-files/ azureblob:container3
Transferred:       46.566 GiB / 46.566 GiB, 100%, 68.300 MiB/s, ETA 0s
Checks:              304 / 304, 100%
Deleted:             304 (files), 0 (dirs)
Transferred:        1875 / 1875, 100%
Elapsed time:     10m16.9s

# Download a set of files from Azure
[user@helix ~]$ rclone sync --progress --update azureblob:container3 ./test_data
The HPC staff are in the process of setting up a Globus connector for Azure Blob Storage.