A hack to copy files between two remote hosts using Python
I sometimes need to copy a file (such as a database dump) between two remote hosts on EC2. Normally this involves a few steps: scp'ing the ssh keyfile to Host 1, ssh'ing to Host 1, looking up the address for Host 2, then scp'ing the desired file from Host 1 to Host 2.
I was excited to read in the man page that scp can copy files between two remote hosts directly. However, it didn't work for me. Apparently, running
scp host1:myfile host2:
is like running
ssh host1 scp myfile host2:
so I still need the address of host2 and my ssh keyfile on host1.
My inability to let go of this small efficiency gain led me to (what else?) write a Python script. I know this is a hack, so if you know of a better way of doing this, let me know.
The script parses my ~/.ssh/config file to find the ssh keyfile and address for host 2, uses scp to copy the ssh keyfile to host 1, then runs the ssh host1 scp ... command with the appropriate options filled in. The script captures all of the ssh options for host 2 and passes them on the command line to scp via the -o command-line option. Note: I only tested this with the User option; I don't know if all ssh options will work.
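The command-building step described above can be sketched on its own. This is only an illustration, not the script itself: the function name build_remote_scp_command and its parameters are hypothetical, the options dict is assumed to use lowercase keys as in the post, and shlex.quote (Python 3) is added here to protect paths containing spaces.

```python
import shlex


def build_remote_scp_command(host1, path1, path2, options, keyfile_remote):
    """Return the argv list for `ssh host1 scp ... path1 host2:path2`.

    `options` is a dict of lowercase ssh option names for host 2
    (hypothetical input, shaped like the dict the post describes).
    """
    # hostname and identityfile are used directly below, so they are
    # not forwarded as -o options.
    forwarded = sorted((k, v) for k, v in options.items()
                       if k not in ('hostname', 'identityfile'))
    remote = ['scp', '-p', '-i', keyfile_remote, '-oStrictHostKeyChecking=no']
    remote += ['-o%s=%s' % (k, v) for k, v in forwarded]
    remote += [path1, '%s:%s' % (options['hostname'], path2)]
    # ssh hands the remote command to a shell on host1 as one string,
    # so each piece is quoted before joining.
    return ['ssh', host1, ' '.join(shlex.quote(arg) for arg in remote)]
```

The returned list could be passed to subprocess.call; building a list rather than a single string keeps the local side free of shell quoting issues.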
Warning: the script disables the StrictHostKeyChecking SSH option, so you are more vulnerable to a man-in-the-middle attack.
Update 2010-02-16: I've found there is already an SSH config file parser in the paramiko library. The source can be viewed on GitHub.
Update 2010-05-04: I modified my code to use the paramiko library and also allow command line options to be passed directly to the scp command. The latest code is available in my GitHub repository remote-tools.
import itertools
import os
import re
import sys

SSH_CONFIG_FILE = '/home/saltycrane/.ssh/config'


def main():
    # arguments look like host1:path1 host2:path2
    host1, path1 = sys.argv[1].split(':', 1)
    host2, path2 = sys.argv[2].split(':', 1)
    o = get_ssh_options(host2)
    keyfile_remote = '/tmp/%s' % os.path.basename(o['identityfile'])
    ssh_options = ' -o'.join(['='.join([k, v]) for k, v in o.items()
                              if k != 'hostname' and k != 'identityfile'])
    # copy the keyfile to host 1, then run scp on host 1
    run('scp %s %s:%s' % (o['identityfile'], host1, keyfile_remote))
    run('ssh %s scp -p -i %s -oStrictHostKeyChecking=no -o%s %s %s:%s' % (
        host1, keyfile_remote, ssh_options, path1, o['hostname'], path2))


def get_ssh_options(host):
    """Parse ~/.ssh/config file and return a dict of ssh options for host
    Note: dict keys are all lowercase
    """
    def remove_comment(line):
        return re.sub(r'#.*$', '', line)

    def get_value(line, key_arg):
        m = re.search(r'^\s*%s\s+(.+)\s*$' % key_arg, line, re.I)
        if m:
            return m.group(1)
        else:
            return ''

    def not_the_host(line):
        return get_value(line, 'Host') != host

    def not_a_host(line):
        return get_value(line, 'Host') == ''

    lines = [line.strip() for line in open(SSH_CONFIG_FILE)]
    comments_removed = [remove_comment(line) for line in lines]
    blanks_removed = [line for line in comments_removed if line]
    # skip to the requested Host line, then keep lines up to the next Host
    top_removed = list(itertools.dropwhile(not_the_host, blanks_removed))[1:]
    goodpart = itertools.takewhile(not_a_host, top_removed)
    return dict([line.lower().split(None, 1) for line in goodpart])


def run(cmd):
    print(cmd)
    os.system(cmd)


if __name__ == '__main__':
    main()
Here is an example ~/.ssh/config file:
Host testhost1
    User root
    Hostname 48.879.24.567
    IdentityFile /home/saltycrane/.ssh/test_keyfile
Host testhost2
    User root
    Hostname 56.384.58.212
    IdentityFile /home/saltycrane/.ssh/test_keyfile
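To make the lookup concrete, here is a rough sketch of the dict that comes out of a config like the one above. The helper parse_ssh_config is hypothetical and works on a string rather than the real file; unlike the script in the post it lowercases only the option names, not the values, and it ignores wildcards, Match blocks and key=value syntax.

```python
def parse_ssh_config(text):
    """Return {host: {option: value}} for simple `Host` blocks.

    Hypothetical helper for illustration only: one name per Host line,
    no wildcards, no Match blocks, no `key=value` syntax.
    """
    hosts = {}
    current = None
    for raw in text.splitlines():
        # drop trailing comments and surrounding whitespace
        line = raw.split('#', 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip lines that have a keyword but no value
        key, value = parts[0].lower(), parts[1].strip()
        if key == 'host':
            current = hosts.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return hosts


SAMPLE = """\
Host testhost1
    User root
    Hostname 48.879.24.567
    IdentityFile /home/saltycrane/.ssh/test_keyfile
Host testhost2
    User root
    Hostname 56.384.58.212
    IdentityFile /home/saltycrane/.ssh/test_keyfile
"""

options = parse_ssh_config(SAMPLE)['testhost2']
```

Here options holds the user, hostname and identityfile entries for testhost2, which is the information the script needs before it can run scp on host 1.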
Here is an example run. It copies /tmp/testfile from testhost1 to the same path on testhost2:
python scp_r2r.py testhost1:/tmp/testfile testhost2:/tmp/testfile
Here is the console output:
scp /home/saltycrane/.ssh/test_keyfile testhost1:/tmp/test_keyfile
test_keyfile 100% 1674 1.6KB/s 00:00
ssh testhost1 scp -p -i /tmp/test_keyfile -oStrictHostKeyChecking=no -ouser=root /tmp/testfile 56.384.58.212:/tmp/testfile
One inconvenience is that it doesn't show the progress for the main transfer. If anyone knows how I can fix this, please let me know.
Was this for a contest to see how many compound statements could be used while obfuscating a very simple idea inside as many iterators as humanly possible? At times I am very glad I do not think like an Engineer. :)
Neat concept though, I can see the need for this at times.
Hi Eliot, thanks for your post and especially the paramiko hint. I ran into the same problems and wrote an improved scp_r2r script. It uses SSH agent forwarding so the private key remains safe on the local computer. It also fixes some of the issues you mentioned:
- StrictHostKeyChecking is enabled and the user can type "yes" (interactively)
- the progress bar is shown (using ssh's -t option and starting the process with stdout=sys.stdout)
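The agent-forwarding idea in this comment might look roughly like the sketch below. It is not gwyn's script: the function names are hypothetical, and it assumes an ssh-agent is running locally with the key loaded and that host2 is given as an address host 1 can reach. The ssh flags used are -A (forward the agent) and -t (allocate a terminal).

```python
import shlex
import subprocess
import sys


def build_forwarded_scp(host1, path1, host2, path2):
    """Return argv for running scp on host1 with the local agent forwarded."""
    remote = 'scp -p %s %s' % (
        shlex.quote(path1), shlex.quote('%s:%s' % (host2, path2)))
    # -A forwards the local ssh-agent, so no keyfile is copied to host1.
    # -t allocates a terminal, so the remote scp can draw its progress
    # meter and prompt for host key confirmation.
    return ['ssh', '-A', '-t', host1, remote]


def copy(host1, path1, host2, path2):
    """Run the transfer, sending ssh's output straight to our stdout."""
    return subprocess.call(
        build_forwarded_scp(host1, path1, host2, path2), stdout=sys.stdout)
```

Because the key never leaves the local machine, the step that copies the keyfile to /tmp on host 1 is no longer needed in this variant.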
gwyn: This is awesome! Thank you for figuring this out and posting it. You have made a really nice tool instead of my hack. I will definitely give this a try. I don't know if you think you'll make any updates, but if you do, it'd be great if it were on github or PyPI or something. (Again sorry about your comment getting marked as spam.)