Globus Data Transfer

From HPC users

Revision as of 18:11, 19 February 2020

This is highly experimental and not an official service offered by the University.

Introduction

Globus provides a secure, unified interface to your research data. Use Globus to 'fire and forget' high-performance data transfers between systems within and across organizations. [1]

Members of the University of Oldenburg can log in to the service with their university credentials (a username of the form abcd1234 and the matching password). Using the web interface, you can manage so-called endpoints and transfer large amounts of data between different endpoints (which can be shared by or with you). See the Globus How-tos for details.

Tip: You can SMB-mount your HPC directories (e.g. $DATA or $GROUP) on your local computer to make them the default directory for your personal endpoint.

Using a Virtual Machine to create a Personal Endpoint

It might be useful to use a virtual machine (VM) for creating a personal endpoint so that data transfer does not interfere with your normal computer use. Here is a summary of what needs to be done:

1. Request VM: Request a VM by following the instructions on the IT services web pages. Make sure that the VM is visible worldwide.

2. Prepare VM: As root, use SMB to mount a filesystem (e.g. $DATA or $GROUP) which will later serve as the directory for data transfer. Note that the mount will disappear if the system is rebooted. Next, set up Python 3 as the default Python (not strictly needed, but recommended):

# alternatives --install /usr/bin/python python /usr/bin/python2 50
# alternatives --install /usr/bin/python python /usr/bin/python3 60
# alternatives --install /usr/bin/pip pip /usr/bin/pip3 60
# python --version

Normally, Python 3 is already installed; if not, use yum install to install it.
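The SMB mount mentioned at the start of step 2 is not spelled out above, so here is a minimal sketch of what it might look like. The server name, share name, and mount point are placeholders and not the real names of the university file servers; the helper function build_smb_mount_cmd is hypothetical as well. It only prints the command so that it can be reviewed before being run as root.

```shell
#!/bin/bash
# Sketch: build the mount command for an SMB (CIFS) share.
# Server, share and mount point are placeholders (assumptions), not real names.

build_smb_mount_cmd() {
    local server="$1" share="$2" mountpoint="$3" user="$4"
    # mount -t cifs requires the cifs-utils package; the mount point must
    # already exist, and the password is asked for interactively.
    printf 'mount -t cifs //%s/%s %s -o username=%s\n' \
        "$server" "$share" "$mountpoint" "$user"
}

# Print the command for review; run it as root once the values are correct.
build_smb_mount_cmd smb.example.org data /mnt/data abcd1234
```

Because the mount is lost on reboot, the printed command has to be run again after each restart of the VM.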

3. Install Globus CLI: The globus-cli package is needed to connect to the Globus network and manage your private endpoints. It is a Python package, so it can be installed with pip:

# pip install --upgrade globus-cli

Using pip as root is not really recommended, but it is still the easiest way to install a package for all users. See the Globus CLI Documentation for details.
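After the installation, it may be worth checking that the globus command is actually found on the PATH before trying to log in. The following is a small sketch; the helper name have_cmd is made up for this example.

```shell
#!/bin/bash
# Sketch: check whether a command is available on the PATH.
# have_cmd is a hypothetical helper name used only in this example.

have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd globus; then
    echo "globus-cli found: run 'globus login' next"
else
    echo "globus-cli not found: check the pip installation and your PATH"
fi
```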

4. Install Globus Connect Personal: This lets you run a Globus service on your server so that the Globus network can connect to it. The commands are

# cd /opt   # usually a good place for non-standard software 
# wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
--2018-05-22 15:32:32--  https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
Resolving downloads.globus.org (downloads.globus.org)... 52.84.122.197, 52.84.122.3, 52.84.122.100, ...
Connecting to downloads.globus.org (downloads.globus.org)|52.84.122.197|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14500802 (14M) [application/x-tar]
Saving to: ‘globusconnectpersonal-latest.tgz’ 

globusconnectpersonal-latest.tgz             
100%[=====================================================================================>]  13.83M   3.63MB/s    in 3.9s
# tar xzf globusconnectpersonal-latest.tgz
# ln -s globusconnectpersonal-x.y.z globusconnectpersonal
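The version in the directory name changes with every release, so the symlink step can also be scripted instead of typing the name by hand. The sketch below assumes that the tarball contains a single versioned top-level directory, as the ln command above suggests; the function names are made up for this example.

```shell
#!/bin/bash
# Sketch: find the versioned top-level directory inside the tarball and
# point a stable symlink at it. Assumes one top-level directory per tarball.

top_dir_of_tarball() {
    # List the archive and keep the first path component of the first entry.
    tar tzf "$1" | head -n 1 | cut -d/ -f1
}

link_latest() {
    local tarball="$1" link_name="$2" dir
    dir=$(top_dir_of_tarball "$tarball") || return 1
    [ -n "$dir" ] || return 1
    # -sfn replaces an existing symlink instead of descending into it.
    ln -sfn "$dir" "$link_name"
}

# Usage (as root in /opt, after tar xzf):
#   link_latest globusconnectpersonal-latest.tgz globusconnectpersonal
```

With this approach, a later upgrade only requires unpacking the new tarball and calling link_latest again.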

If you now add to the file /etc/skel/.bashrc the two lines

# Globus Connect Personal
PATH=/opt/globusconnectpersonal:$PATH

somewhere at the end, new users should be able to run the commands in the steps below. Existing users have to add the two lines to their own ~/.bashrc.
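Adding the two lines can be scripted so that running it twice does not duplicate the entry. This is a minimal sketch; the function name add_gcp_to_path is made up for this example, and the path /opt/globusconnectpersonal is the symlink created in step 4.

```shell
#!/bin/bash
# Sketch: append the PATH entry to a bashrc file only if it is not there yet.
# add_gcp_to_path is a hypothetical helper name used only in this example.

add_gcp_to_path() {
    local rc_file="$1"
    # Single quotes keep $PATH literal so it is expanded at login time.
    local line='PATH=/opt/globusconnectpersonal:$PATH'
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '\n# Globus Connect Personal\n%s\n' "$line" >> "$rc_file"
    fi
}

# Usage:
#   add_gcp_to_path /etc/skel/.bashrc   # as root, for new users
#   add_gcp_to_path ~/.bashrc           # for an existing user
```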