SaltyCrane Blog — Notes on JavaScript and web development

Options for listing the files in a directory with Python

I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are 4 methods I've used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.

OPTION 1 - os.listdir()

This is probably the simplest way to list the contents of a directory in Python. It returns the entry names (not full paths) in arbitrary order.

import os
dirlist = os.listdir("/usr")

from pprint import pprint
pprint(dirlist)

Results:

['lib',
 'shareFeisty',
 'src',
 'bin',
 'local',
 'X11R6',
 'lib64',
 'sbin',
 'share',
 'include',
 'lib32',
 'man',
 'games']
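
Since os.listdir() gives back bare names, a common follow-up is to join them onto the directory and separate files from subdirectories. Here is a minimal sketch of that; the helper names list_paths and split_files_and_dirs are my own and not part of the os module.

```python
import os


def list_paths(dirpath):
    """Return sorted full paths for the entries in dirpath."""
    # os.listdir() returns bare names in arbitrary order, so sort them
    # and join each one back onto the directory.
    return [os.path.join(dirpath, name) for name in sorted(os.listdir(dirpath))]


def split_files_and_dirs(dirpath):
    """Return (files, dirs) as two lists of full paths."""
    paths = list_paths(dirpath)
    files = [p for p in paths if os.path.isfile(p)]
    dirs = [p for p in paths if os.path.isdir(p)]
    return files, dirs
```

Calling split_files_and_dirs("/usr") would return the regular files and the subdirectories of /usr as two separate lists. Symlinks are followed by os.path.isfile() and os.path.isdir(), so a link to a directory is counted as a directory.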

OPTION 2 - glob.glob()

This method allows you to use shell-style wildcards such as * and ?. Each match is returned with the directory part of the pattern included.

import glob
dirlist = glob.glob('/usr/*')

from pprint import pprint
pprint(dirlist)

Results:

['/usr/lib',
 '/usr/shareFeisty',
 '/usr/src',
 '/usr/bin',
 '/usr/local',
 '/usr/X11R6',
 '/usr/lib64',
 '/usr/sbin',
 '/usr/share',
 '/usr/include',
 '/usr/lib32',
 '/usr/man',
 '/usr/games']
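
The example above only uses a bare "*", so here is a short sketch with more selective patterns. The wrapper glob_in is a hypothetical helper of mine; glob.glob() and glob.escape() are the standard library calls doing the work (glob.escape() needs Python 3.4 or later).

```python
import glob
import os


def glob_in(dirpath, pattern):
    """Return sorted paths in dirpath whose names match a shell-style pattern."""
    # glob.escape() keeps any wildcard characters in dirpath itself from
    # being treated as part of the pattern.
    full_pattern = os.path.join(glob.escape(dirpath), pattern)
    # glob.glob() returns matches in arbitrary order, hence sorted().
    return sorted(glob.glob(full_pattern))


if __name__ == "__main__":
    print(glob_in("/usr", "lib*"))       # names starting with "lib"
    print(glob_in("/var/log", "*.log"))  # names ending in ".log"
    print(glob_in("/usr", "[bs]*"))      # names starting with "b" or "s"
```

As in the shell, a leading "*" does not match names that start with a dot, so hidden files need an explicit pattern such as ".*".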

OPTION 3 - Unix "ls" command using subprocess

This method uses your operating system's "ls" command. It allows you to sort the output by modification time, file size, etc. by passing the corresponding command-line options (such as "-t" or "-S") to "ls". The following example lists the 10 most recently modified files in /var/log:

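from subprocess import Popen, PIPE

def listdir_shell(path, *lsargs):
    # Options go before the path so this also works with non-GNU "ls".
    # universal_newlines=True makes stdout yield str rather than bytes on Python 3.
    p = Popen(('ls',) + lsargs + (path,), shell=False, stdout=PIPE,
              close_fds=True, universal_newlines=True)
    out, _ = p.communicate()  # read all output and wait for "ls" to exit
    return out.splitlines()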

dirlist = listdir_shell('/var/log', '-t')[:10]

from pprint import pprint
pprint(dirlist)

Results:

['auth.log',
 'syslog',
 'dpkg.log',
 'messages',
 'user.log',
 'daemon.log',
 'debug',
 'kern.log',
 'munin',
 'mysql.log']
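
If you would rather not shell out, the same "most recently modified first" listing can be approximated in pure Python. This is a different technique from the "ls" approach above, not a wrapper around it. It is a sketch only: the name listdir_by_mtime and its limit parameter are my own, and I'm assuming that sorting on os.lstat() modification times is close enough to what "ls -t" does for your purposes.

```python
import os


def listdir_by_mtime(dirpath, limit=None):
    """Return entry names in dirpath, most recently modified first."""
    names = os.listdir(dirpath)

    def mtime(name):
        # os.lstat() does not follow symlinks, so a broken link
        # does not raise an error here.
        return os.lstat(os.path.join(dirpath, name)).st_mtime

    # Newest first means sorting on mtime in descending order.
    names.sort(key=mtime, reverse=True)
    # A limit of None returns the whole list.
    return names[:limit]
```

For example, listdir_by_mtime('/var/log', 10) would give the 10 most recently modified entries. Unlike "ls", os.listdir() also includes names starting with a dot.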

OPTION 4 - Unix "find" style using os.walk

This method allows you to list directory contents recursively in a manner similar to the Unix "find" command. It uses Python's os.walk.

import os

def unix_find(pathin):
    """Return results similar to the Unix find command run without options
    i.e. traverse a directory tree and return all the file paths
    """
    return [os.path.join(path, file)
            for (path, dirs, files) in os.walk(pathin)
            for file in files]

pathlist = unix_find('/etc')[-10:]

from pprint import pprint
pprint(pathlist)

Results:

['/etc/fonts/conf.avail/20-lohit-gujarati.conf',
 '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf',
 '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf',
 '/etc/cron.weekly/0anacron',
 '/etc/cron.weekly/cvs',
 '/etc/cron.weekly/popularity-contest',
 '/etc/cron.weekly/man-db',
 '/etc/cron.weekly/apt-xapian-index',
 '/etc/cron.weekly/sysklogd',
 '/etc/cron.weekly/.placeholder']
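
A natural extension is to filter by file name, roughly like "find -name". Here is a minimal sketch that combines os.walk with the standard fnmatch module; unix_find_name is a hypothetical helper name of mine, and like the function above it only returns the non-directory entries that os.walk reports.

```python
import fnmatch
import os


def unix_find_name(top, pattern):
    """Return paths under top whose file name matches a shell-style pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        # fnmatch.filter() keeps only the names matching the pattern.
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return matches
```

For example, unix_find_name('/etc', '*.conf') would collect every ".conf" file below /etc that the current user is allowed to reach. By default os.walk silently skips directories it cannot read.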

Comments


#1 Keith Beattie commented on :

Adding a regexp to your option #1 is a quick way to get Python's re module into play when shell wildcards won't cut it:

import os, pprint, re

pat = re.compile(r".+\d.+")
# list() is needed on Python 3, where filter() returns a lazy iterator
dirlist = list(filter(pat.match, os.listdir("/usr/local")))

pprint.pprint(dirlist)

gives me (on my FreeBSD box)

['diablo-jdk1.6.0',
 'netbeans68',
 'openoffice.org-3.2.0',
 'i386-portbld-freebsd7.3']

#2 Eliot commented on :

Keith: That's a good tip. I will give it a try the next time I get a chance. Thanks!


#3 Al Jaffe commented on :

...and how about an easy way for listing contents of a WEB directory? Could any of the above techniques be used?


#4 Directory commented on :

I'm just learning python for my job and this has been a really useful reference page for me!! I realise it's only really useful for one thing - but the methods you've shown are perfect for particular types of directory listings in my code ;).


#5 gsiliceo commented on :

I recently started learning Python and I love your blog. I'm constantly looking for best practices and "solved" problems.


#6 Eriksen commented on :

I'm also just learning python for my job and this has been a really useful reference page for me.

I hope you can post more about system administration on both Unix and Windows.

Keep up the good work man ;)


#7 SunnY commented on :

How can I get the files from three different directories in reverse order? Please give me an idea.