I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are four methods I've used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.
OPTION 1 - os.listdir()
This is probably the simplest way to list the contents of a directory in Python.
import os
from pprint import pprint

dirlist = os.listdir("/usr")
pprint(dirlist)
['lib', 'shareFeisty', 'src', 'bin', 'local', 'X11R6', 'lib64', 'sbin', 'share', 'include', 'lib32', 'man', 'games']
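os.listdir() returns bare names in arbitrary order and does not say which entries are directories. If you need that distinction, here is a minimal sketch using os.scandir(). It assumes Python 3.6 or later, and the helper name split_entries is made up for this illustration.

```python
import os

def split_entries(path):
    """Return (dirs, files): sorted names of subdirectories and of all other entries."""
    dirs, files = [], []
    # os.scandir() yields DirEntry objects that cache file-type information,
    # so is_dir() can usually answer without an extra stat() call per entry.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(dirs), sorted(files)

if __name__ == "__main__":
    dirs, files = split_entries(".")
    print("directories:", dirs)
    print("files:", files)
```

Note that is_dir() follows symbolic links by default, so a symlink pointing at a directory is counted as a directory here.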
OPTION 2 - glob.glob()
This method allows you to use shell-style wildcards.
import glob
from pprint import pprint

dirlist = glob.glob('/usr/*')
pprint(dirlist)
['/usr/lib', '/usr/shareFeisty', '/usr/src', '/usr/bin', '/usr/local', '/usr/X11R6', '/usr/lib64', '/usr/sbin', '/usr/share', '/usr/include', '/usr/lib32', '/usr/man', '/usr/games']
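The wildcard can be narrower than "*", and on Python 3.5 or later glob.glob() also accepts recursive=True so that "**" matches nested directories. A small sketch, where the wrapper name glob_in is hypothetical:

```python
import glob
import os

def glob_in(directory, pattern, recursive=False):
    """Return the sorted paths under directory that match a shell-style pattern."""
    # With recursive=True, a "**" path component matches zero or more nested directories.
    return sorted(glob.glob(os.path.join(directory, pattern), recursive=recursive))

if __name__ == "__main__":
    print(glob_in(".", "*.py"))                     # this directory only
    print(glob_in(".", "**/*.py", recursive=True))  # this directory and everything below it
```

Like the shell, glob skips names that start with a dot unless the pattern itself starts with one.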
OPTION 3 - Unix "ls" command using subprocess
This method uses your operating system's "ls" command. It lets you sort the output by modification time, file size, etc. by passing the corresponding command-line options to "ls". The following example lists the 10 most recently modified files in /var/log.
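from pprint import pprint
from subprocess import Popen, PIPE

def listdir_shell(path, *lsargs):
    """Run "ls" on path with any extra options and return its output lines."""
    # universal_newlines=True makes stdout yield text instead of bytes on Python 3
    p = Popen(('ls', path) + lsargs, shell=False, stdout=PIPE,
              close_fds=True, universal_newlines=True)
    return [line.rstrip('\n') for line in p.stdout.readlines()]

dirlist = listdir_shell('/var/log', '-t')[:10]
pprint(dirlist)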
['auth.log', 'syslog', 'dpkg.log', 'messages', 'user.log', 'daemon.log', 'debug', 'kern.log', 'munin', 'mysql.log']
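If you would rather not spawn a process, the same "newest first" ordering can be approximated in pure Python by sorting os.listdir() on each entry's modification time. This is only a sketch: the function name listdir_by_mtime and its limit parameter are my own, and it covers just the "-t" case, not the other "ls" options.

```python
import os

def listdir_by_mtime(path, limit=None):
    """Return the entry names in path, most recently modified first (like "ls -t")."""
    names = os.listdir(path)
    # os.lstat() does not follow symlinks, so a broken link will not raise an error.
    names.sort(key=lambda name: os.lstat(os.path.join(path, name)).st_mtime,
               reverse=True)
    # Slicing with limit=None returns the whole list.
    return names[:limit]

if __name__ == "__main__":
    print(listdir_by_mtime(".", 10))
```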
OPTION 4 - os.walk()
This method allows you to list directory contents recursively in a manner similar to the Unix "find" command. It uses Python's os.walk() function.
import os
from pprint import pprint

def unix_find(pathin):
    """Return results similar to the Unix find command run without options,
    i.e. traverse a directory tree and return all the file paths.
    """
    return [os.path.join(path, file)
            for (path, dirs, files) in os.walk(pathin)
            for file in files]

pathlist = unix_find('/etc')[-10:]
pprint(pathlist)
['/etc/fonts/conf.avail/20-lohit-gujarati.conf', '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf', '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf', '/etc/cron.weekly/0anacron', '/etc/cron.weekly/cvs', '/etc/cron.weekly/popularity-contest', '/etc/cron.weekly/man-db', '/etc/cron.weekly/apt-xapian-index', '/etc/cron.weekly/sysklogd', '/etc/cron.weekly/.placeholder']
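The same walk can be narrowed to names matching a pattern, roughly what "find -name" does. A minimal sketch combining os.walk() with the standard fnmatch module; the function name find_by_name is hypothetical, and it is written as a generator so large trees do not have to be held in memory at once.

```python
import fnmatch
import os

def find_by_name(top, pattern):
    """Yield paths of files under top whose names match a shell-style pattern."""
    for dirpath, dirnames, filenames in os.walk(top):
        # fnmatch.filter() keeps only the names that match the pattern.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)

if __name__ == "__main__":
    for path in find_by_name(".", "*.conf"):
        print(path)
```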
Adding a regexp to your option #1 is a quick way to get python's re module into play when sh regexps won't cut it:
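import os, pprint, re

pat = re.compile(r".+\d.+")
# A list comprehension gives a real list on both Python 2 and 3 (filter() is lazy on Python 3)
dirlist = [name for name in os.listdir("/usr/local") if pat.match(name)]
pprint.pprint(dirlist)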
gives me (on my FreeBSD box)
['diablo-jdk1.6.0', 'netbeans68', 'openoffice.org-3.2.0', 'i386-portbld-freebsd7.3']
Keith: That's a good tip. I will give it a try the next time I get a chance. Thanks!
...and how about an easy way for listing contents of a WEB directory? Could any of the above techniques be used?
I'm just learning python for my job and this has been a really useful reference page for me!! I realise it's only really useful for one thing - but the methods you've shown are perfect for particular types of directory listings in my code ;).
I recently started learning Python and I love your blog. I'm constantly looking for best practices and "solved" problems.
I'm also just learning python for my job and this has been a really useful reference page for me.
I hope you can post more about system administration on both Unix and Windows.
Keep up the good work man ;)
How can I get the files from three different directories in reverse order? Please give me an idea.