Saltycrane logo

SaltyCrane Blog

Notes on Python, Django, and web development on Ubuntu Linux

    

How to sort a table by columns in Python

I have a 2-dimensional table of data implemented as a list of lists in Python. I would like to sort the data by an arbitrary column. This is a common task with tabular data. For example, Windows Explorer allows me to sort the list of files by Name, Size, Type, or Date Modified. I tried the code from this article, however, if there are duplicate entries in the column being sorted, the duplicates are removed. This is not what I wanted, so I did some further searching, and found a nice solution from the HowTo/Sorting article on the PythonInfo Wiki. This method also uses the built-in sorted() function, as well as the key paramenter, and operator.itemgetter(). (See section 2.1 and 6.7 of the Python Library Reference for more information.) The following code sorts the table by the second column (index 1). Note, Python 2.4 or later is required.

import operator

def sort_table(table, col=0):
    return sorted(table, key=operator.itemgetter(col))

if __name__ == '__main__':
    mytable = (
        ('Joe', 'Clark', '1989'),
        ('Charlie', 'Babbitt', '1988'),
        ('Frank', 'Abagnale', '2002'),
        ('Bill', 'Clark', '2009'),
        ('Alan', 'Clark', '1804'),
        )
    for row in sort_table(mytable, 1):
        print row

Results:
('Frank', 'Abagnale', '2002')
('Charlie', 'Babbitt', '1988')
('Joe', 'Clark', '1989')
('Bill', 'Clark', '2009')
('Alan', 'Clark', '1804')

This works well, but I would also like the table to be sorted by column 0 in addition to column 1. In this example, column 1 holds the Last Name and column 0 holds the First Name. I would like the table to be sorted first by Last Name, and then by First Name. Here is the code to sort the table by multiple columns. The cols argument is a tuple specifying the columns to sort by. The first column to sort by is listed first, the second second, and so on.
import operator

def sort_table(table, cols):
    """ sort a table by multiple columns
        table: a list of lists (or tuple of tuples) where each inner list 
               represents a row
        cols:  a list (or tuple) specifying the column numbers to sort by
               e.g. (1,0) would sort by column 1, then by column 0
    """
    for col in reversed(cols):
        table = sorted(table, key=operator.itemgetter(col))
    return table

if __name__ == '__main__':
    mytable = (
        ('Joe', 'Clark', '1989'),
        ('Charlie', 'Babbitt', '1988'),
        ('Frank', 'Abagnale', '2002'),
        ('Bill', 'Clark', '2009'),
        ('Alan', 'Clark', '1804'),
        )
    for row in sort_table(mytable, (1,0)):
        print row

Results:
('Frank', 'Abagnale', '2002')
('Charlie', 'Babbitt', '1988')
('Alan', 'Clark', '1804')
('Bill', 'Clark', '2009')
('Joe', 'Clark', '1989')

8 Comments — feed icon Comments feed for this post


#1 Anonymous commented on 2008-01-11:

Very Helpful.
Thanks.


#2 Ric. :-) commented on 2008-11-25:

Thanks a lot!! :-) It's that I was looking for! :-)

Ric.


#3 Zack commented on 2009-09-04:

Awesome. Thanks!


#4 Yar commented on 2010-03-03:

Really useful ! Thanks !


#5 Carlo commented on 2010-04-26:

Just what I was after. Thanks.


#6 Orangeboy commented on 2010-09-24:

Thanks! New to python, and had trouble with complex sorting. This does the trick!


#7 Senyai commented on 2010-11-26:

Nice, works as fast as sorted(table, cmp=lambda a, b:cmp((a[1],a[0]), (b[1],b[0])))


#8 Will commented on 2011-12-21:

Thank you for this! Very simple and straightforward, that's what I needed!

Post a comment

Required
Required, but not displayed
Optional

Format using Markdown. (No HTML.)
  • Code blocks: prefix each line by at least 4 spaces or 1 tab (and a blank line before and after)
  • Code span: surround with backticks
  • Blockquotes: prefix lines to be quoted with >
  • Links: <URL>
  • Links w/ description: [description](URL)
Created with Django | Hosted by Linode