Options for listing the files in a directory with Python
I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are four methods I've used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.
OPTION 1 - os.listdir()
This is probably the simplest way to list the contents of a directory in Python.
import os
dirlist = os.listdir("/usr")
from pprint import pprint
pprint(dirlist)
Results:
['lib', 'shareFeisty', 'src', 'bin', 'local', 'X11R6', 'lib64', 'sbin', 'share', 'include', 'lib32', 'man', 'games']
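Note that os.listdir() returns bare names in arbitrary order, not full paths, and it mixes files and subdirectories together. If you only want the regular files, something like the following might do. This is a minimal sketch: list_files is a made-up helper name, not part of the os module.

```python
import os
from pprint import pprint

def list_files(dirpath):
    """Return sorted full paths of the regular files directly inside dirpath.

    list_files is a hypothetical helper name used only for this example.
    """
    paths = [os.path.join(dirpath, name) for name in os.listdir(dirpath)]
    # os.path.isfile follows symlinks, so a link to a file counts as a file
    return sorted(p for p in paths if os.path.isfile(p))

if __name__ == "__main__":
    pprint(list_files(os.curdir))
```

Swapping os.path.isfile for os.path.isdir should give you only the subdirectories instead.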
OPTION 2 - glob.glob()
This method allows you to use shell-style wildcards.
import glob
dirlist = glob.glob('/usr/*')
from pprint import pprint
pprint(dirlist)
Results:
['/usr/lib', '/usr/shareFeisty', '/usr/src', '/usr/bin', '/usr/local', '/usr/X11R6', '/usr/lib64', '/usr/sbin', '/usr/share', '/usr/include', '/usr/lib32', '/usr/man', '/usr/games']
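The '/usr/*' pattern above matches everything except dotfiles, so it doesn't really show off the wildcards. Here is a rough sketch with narrower patterns. glob_sorted is a hypothetical helper name, and the patterns are only examples.

```python
import glob
import os
from pprint import pprint

def glob_sorted(dirpath, pattern):
    """Return sorted paths in dirpath whose names match a shell-style pattern.

    glob_sorted is a hypothetical helper name used only for this example.
    """
    # glob.glob makes no promise about ordering, so sort for stable output
    return sorted(glob.glob(os.path.join(dirpath, pattern)))

if __name__ == "__main__":
    # '*' matches any run of characters, '?' one character, '[...]' a set.
    # Names starting with a dot are skipped unless the pattern starts with one.
    pprint(glob_sorted('/usr', 'lib*'))
    pprint(glob_sorted('/usr', '[a-m]*'))
```

If nothing matches, glob.glob() returns an empty list rather than raising an error.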
OPTION 3 - Unix "ls" command using subprocess
This method uses your operating system's "ls" command. It allows you to sort the output by modification time, file size, etc. by passing the corresponding command-line options to "ls". The following example lists the 10 most recently modified entries in /var/log:
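from subprocess import Popen, PIPE
def listdir_shell(path, *lsargs):
    # universal_newlines=True makes stdout yield text (not bytes) on Python 3 too
    p = Popen(('ls', path) + lsargs, shell=False, stdout=PIPE, close_fds=True,
              universal_newlines=True)
    # Strip the trailing newline that "ls" prints after each entry
    return [line.rstrip('\n') for line in p.stdout.readlines()]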
dirlist = listdir_shell('/var/log', '-t')[:10]
from pprint import pprint
pprint(dirlist)
Results:
['auth.log', 'syslog', 'dpkg.log', 'messages', 'user.log', 'daemon.log', 'debug', 'kern.log', 'munin', 'mysql.log']
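If spawning "ls" isn't an option (on a non-Unix box, for example), the modification-time sort can be approximated in pure Python. This is not the subprocess approach shown above, just an alternative sketch using os.path.getmtime. newest_first is a hypothetical name.

```python
import os
from pprint import pprint

def newest_first(dirpath, limit=None):
    """Return entry names in dirpath, most recently modified first.

    Roughly what "ls -t" prints; newest_first is a hypothetical name.
    """
    names = os.listdir(dirpath)
    # Sort on each entry's modification time, largest (newest) first.
    # getmtime follows symlinks and raises OSError on a dangling one.
    names.sort(key=lambda name: os.path.getmtime(os.path.join(dirpath, name)),
               reverse=True)
    # A limit of None slices to the whole list
    return names[:limit]

if __name__ == "__main__":
    pprint(newest_first(os.curdir, 10))
```

Using os.path.getsize as the sort key instead should give a rough equivalent of "ls -S".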
OPTION 4 - Unix "find" style using os.walk
This method allows you to list directory contents recursively in a manner similar to the Unix "find" command. It uses Python's os.walk.
import os
def unix_find(pathin):
    """Return results similar to the Unix find command run without options
    i.e. traverse a directory tree and return all the file paths
    """
    return [os.path.join(path, file)
            for (path, dirs, files) in os.walk(pathin)
            for file in files]
pathlist = unix_find('/etc')[-10:]
from pprint import pprint
pprint(pathlist)
Results:
['/etc/fonts/conf.avail/20-lohit-gujarati.conf', '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf', '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf', '/etc/cron.weekly/0anacron', '/etc/cron.weekly/cvs', '/etc/cron.weekly/popularity-contest', '/etc/cron.weekly/man-db', '/etc/cron.weekly/apt-xapian-index', '/etc/cron.weekly/sysklogd', '/etc/cron.weekly/.placeholder']
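The same os.walk loop can be narrowed to names matching a pattern, loosely like "find -name". Below is a minimal sketch that assumes shell-style patterns via the standard fnmatch module. find_by_name is a hypothetical name.

```python
import fnmatch
import os
from pprint import pprint

def find_by_name(top, pattern):
    """Return paths under top whose file name matches a shell-style pattern.

    Loosely similar to "find top -name pattern" restricted to non-directories;
    find_by_name is a hypothetical name used only for this example.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        # fnmatch.filter keeps only the names that match the pattern
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return matches

if __name__ == "__main__":
    pprint(find_by_name('/etc', '*.conf')[:10])
```

By default os.walk silently skips directories it cannot read, so the result may be incomplete when run as a non-root user.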
Comments
Adding a regexp to your option #1 is a quick way to bring Python's re module into play when shell wildcards won't cut it:
import os, pprint, re
pat = re.compile(r".+\d.+")
dirlist = [name for name in os.listdir("/usr/local") if pat.match(name)]
pprint.pprint(dirlist)
gives me (on my FreeBSD box)
['diablo-jdk1.6.0',
'netbeans68',
'openoffice.org-3.2.0',
'i386-portbld-freebsd7.3']
Keith: That's a good tip. I will give it a try the next time I get a chance. Thanks!
...and how about an easy way for listing contents of a WEB directory? Could any of the above techniques be used?
I'm just learning python for my job and this has been a really useful reference page for me!! I realise it's only really useful for one thing - but the methods you've shown are perfect for particular types of directory listings in my code ;).
I recently started learning Python and I love your blog. I'm constantly looking for best practices and "solved" problems.
I'm also just learning python for my job and this has been a really useful reference page for me.
I hope you can post more about system administration on both Unix and Windows.
Keep up the good work man ;)
How do I get files from three different directories in reverse order? Please give me an idea.
