Globus

To submit datasets larger than 25 GB for curation and publication, or to download datasets or files larger than 25 GB from the Research Data Repository (RDR), you will need to use the Globus file transfer service.
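If you are unsure whether a file or folder crosses the 25 GB threshold, a short script can total its size. The following is a minimal sketch, not an official RDR tool. It assumes that "25 GB" means 25 × 10^9 bytes (decimal gigabytes); the function names and the threshold constant are illustrative.

```python
import os
import sys

# ASSUMPTION: "25 GB" is read as 25 * 10**9 bytes (decimal gigabytes).
GLOBUS_THRESHOLD_BYTES = 25 * 10**9


def total_size_bytes(path):
    """Return the size of a file, or the summed size of all files under a folder."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            # Skip symbolic links so that linked files are not counted twice.
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


def needs_globus(size_bytes, threshold=GLOBUS_THRESHOLD_BYTES):
    """Return True when the size is strictly larger than the threshold."""
    return size_bytes > threshold


if __name__ == "__main__" and len(sys.argv) > 1:
    size = total_size_bytes(sys.argv[1])
    print(f"{size / 10**9:.2f} GB; Globus needed: {needs_globus(size)}")
```

Run it with the path to your dataset folder as the only argument. A dataset of exactly 25 GB is not flagged, because the text above says "larger than 25 GB".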

What is Globus?

Globus is a nonprofit platform created by the University of Chicago and Argonne National Laboratory that enables simple transfer of digital files as large as petabytes (a petabyte is 1,000 terabytes, or 1,000,000 gigabytes) between established endpoints, one of which can be your work or personal computer.
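The unit arithmetic in the definition above can be written out as a few constants. This is only an illustration using decimal (SI) units; the constant and function names are made up for this example.

```python
# Decimal (SI) storage units, matching the definition given above.
GB = 10**9          # one gigabyte in bytes
TB = 1_000 * GB     # one terabyte is 1,000 gigabytes
PB = 1_000 * TB     # one petabyte is 1,000 terabytes


def to_gigabytes(size_bytes):
    """Convert a byte count to decimal gigabytes."""
    return size_bytes / GB


# One petabyte is 1,000 terabytes, or 1,000,000 gigabytes.
assert PB == 1_000 * TB == 1_000_000 * GB
```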

How do I use Globus to transfer files to and from my computer?

Below we describe some of the key steps to getting started with Globus, especially how to download files from the RDR and send us larger files for deposit into the repository. Globus also has detailed "How To" walkthroughs for basic and more complicated setup processes.

For Duke users, Research Computing has a page on different ways that Globus can be used with different Duke storage platforms.

If you are already a Globus user and have endpoints established for transfer, you can skip to the RDR-specific steps for downloading or uploading data.

Set up a Globus account

The first step is to set up an account with the Globus Web App. You can do so using your existing Duke NetID (most common for Duke users), a Globus ID, or a third-party login such as Google, GitHub, or ORCID. To begin setting up an account, navigate to the Globus home page and click on the Login icon. You can use the drop-down menu to select the Globus ID option or to locate the organization you are affiliated with by name.

If you are setting up an account to deposit data, you must log in using your Duke credentials. Locate Duke University in the drop-down list and click the blue "Continue" button to go to a Shibboleth login page, then enter your Duke NetID and password.

Install Globus Connect Personal

Once you have set up a Globus account, decide which endpoint you will transfer from. An "endpoint" is one of the two locations involved in a file transfer. For most users, the simplest choice is to establish a personal endpoint on your computer so that you can transfer data to it (for download) and from it (for upload). To do this, you will need to install Globus Connect Personal to connect to the Globus Web App (instructions below). Install Globus Connect Personal on the system you plan to use (server, cluster, storage system, laptop, desktop, etc.) and configure it so that it has access to the area where the data you want to transfer are stored or where you want to download data to. The process is similar to mapping a network drive or using an FTP service.

Globus does support “drag and drop” to some degree, but for large or high-volume file transfers this will likely result in a transfer error. Please use an established endpoint to avoid this issue.

Please see:

Downloading Data using Globus (Anyone)

Globus may be used to download datasets or individual files that are larger than 25 GB.

  1. Navigate to the dataset or file you wish to download and click on the "Get Data from Globus" button.
  2. You will be prompted to log in to Globus. You may do so through an existing Globus ID, an organizational login (see the drop-down list), or a third-party service such as Google, ORCID, or GitHub.
  3. After logging in, you will be taken to the Globus File Manager screen. For downloads, the File Manager is presented with two panels: the left panel shows where the data originate, and the right panel is where you select where the data should go (your established endpoint).
  4. The path field at the top will show the RDR system ID for the dataset you wish to download. In the left panel beneath it, you will see a list of the files associated with the dataset. For datasets published prior to 2026, this list will include both an export manifest containing SHA-1 checksums, with which you can verify the accuracy and fixity of the files you download, and an export README file that contains metadata and other contextual information about the dataset. Please note: if you are attempting to download a single large file from a dataset, you may have to navigate the dataset's hierarchy in this panel to find the correct file.
  5. In the right panel, use the “Search” bar to find where you want the data to go (your established endpoint). If the endpoint was newly created, it may appear in the “Recent” view. Within the endpoint, select the location where you want to download the data, if applicable (e.g. you have multiple folders and want the data to go into a specific one).
  6. Click back on the left panel and select the files you want to download (you can select all with the toggle or click files individually).
  7. Once the files are selected, choose “Transfer or Sync to…” in the center tools menu and click the “Start” button above the left panel (its arrow points to the right) to begin the transfer process.
  8. Clicking "Start" will produce a green pop-up indicating a successful transfer request. You will likely also receive an email confirmation.
  9. Because Globus is designed to handle large files that may take some time to download, a transfer will pause when your computer is no longer connected to the Internet and will resume automatically when it reconnects. Files that have been completely downloaded will appear in the destination folder on your computer. Clicking on "Activity" in the blue menu panel on the left will allow you to check the status of your current transfers.
  10. When the transfer is successful, you will receive an email that the process has finished. You should then be able to access all files from your endpoint.
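
Where an export manifest with SHA-1 checksums is included (see step 4), you can check the downloaded files against it once the transfer has finished. The sketch below shows one way to do this with Python's standard library. It is not an RDR-provided tool, and it assumes a manifest layout that may not match the real export manifest: one file per line, written as the SHA-1 hex digest, then whitespace, then the file's path relative to the download folder. Adjust the hypothetical parse_manifest function to the layout you actually receive.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-1 digest, reading in chunks so large files are not loaded whole."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(manifest_path):
    """Yield (expected_sha1, relative_path) pairs from the manifest.

    ASSUMPTION: each non-empty line is "<sha1-hex> <relative path>".
    The real export manifest may be laid out differently.
    """
    with open(manifest_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            expected, rel_path = line.split(None, 1)
            yield expected.lower(), rel_path


def verify_download(download_dir, manifest_path):
    """Return a list of (relative_path, problem) for files that fail verification."""
    problems = []
    for expected, rel_path in parse_manifest(manifest_path):
        full_path = os.path.join(download_dir, rel_path)
        if not os.path.isfile(full_path):
            problems.append((rel_path, "missing"))
        elif sha1_of_file(full_path) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty list returned by verify_download means every file listed in the manifest is present and matches its recorded checksum. Reading in chunks keeps memory use small even for files well over 25 GB.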

Uploading Data using Globus (Duke Only)

  1. Follow the instructions above to set up your account and configure your endpoint via Globus Connect Personal. For Duke users, Research Computing also has a page on different ways that Globus can be used to move files between different Duke storage platforms.
  2. You will receive an email from Globus that contains a URL to access your share where you will transfer your files. The endpoint collection name will have your NetID and a date stamp. Please note this is not an automated process and will take place during typical business hours (9am - 5pm).
  3. If you are not already logged in, you will be taken to the Globus login page when you click on this link. Click the blue "Continue" button to go to a Shibboleth login page and enter your Duke NetID and password. Please note that if you were recently logged in it may sometimes appear that you are already logged in, but you will be prompted to log in again.
  4. You will want to make sure your File Manager screen has the dual panel view for ease of transfer. If you are only seeing one panel, on the top right of your File Manager, you will see a "Panels" menu. In order to see both the source and transfer panels, select the middle option (looks like an open book).
  5. In the left-hand panel, you will see the collection that you will be transferring your files to. It will be labeled with your NetID and a date stamp and will be empty. In the right-hand panel, navigate to the endpoint you want to transfer from: click on the “Search” bar to find the endpoint you defined when you installed Globus Connect Personal, then browse to your files. You may need to use the “up one folder” arrow depending on how you have mapped your endpoint. To begin the upload, make sure the right-hand panel, which shows your own endpoint, is active and that the files you want to transfer are selected.
  6. Click on the Start button at the top of your File Manager window (its arrow points to the left) to begin the transfer process. Your files will then begin to transfer. You can view processing messages in the Activity section of Globus. To view your transferred files, click on “refresh list” and they should display. You will receive an email at the start and at the conclusion of the transfer, as well as if any errors are encountered. If you receive any error messages and cannot move forward with your upload, please contact datamanagement@duke.edu.

If you need further assistance, extensive documentation is available through the Globus website. You may also contact us at datamanagement@duke.edu to begin a troubleshooting session with Research Computing if necessary.

A note about troubleshooting Globus

By default, Globus Connect Personal will start automatically in the background when you start your computer. When you are not using Globus, you can avoid receiving error messages and other notifications by finding the Globus icon in the bottom right-hand corner of your screen (or in the menu/status bar at the top on a Mac), right-clicking it (or ctrl-clicking on a Mac), and clicking "Quit Globus Connect Personal."