How to sort a table by columns in Python
I have a 2-dimensional table of data implemented as a list of lists
in Python. I would like to sort the data by an arbitrary
column. This is a common task with tabular data. For example,
Windows Explorer allows me to sort the list of files
by Name, Size, Type, or Date
Modified. I tried the code
from this
article, however, if there are duplicate entries in the column
being sorted, the duplicates are removed. This is not what I wanted,
so I did some further searching, and found a nice solution from
the HowTo/Sorting
article on the PythonInfo Wiki. This method also uses the
built-in sorted() function, as well
as the key paramenter,
and operator.itemgetter(). (See section 2.1
and 6.7
of the Python Library Reference for more information.) The following
code sorts the table by the second column (index 1). Note, Python
2.4 or later is required.
import operator
def sort_table(table, col=0):
return sorted(table, key=operator.itemgetter(col))
if __name__ == '__main__':
mytable = (
('Joe', 'Clark', '1989'),
('Charlie', 'Babbitt', '1988'),
('Frank', 'Abagnale', '2002'),
('Bill', 'Clark', '2009'),
('Alan', 'Clark', '1804'),
)
for row in sort_table(mytable, 1):
print row
Results:
('Frank', 'Abagnale', '2002')
('Charlie', 'Babbitt', '1988')
('Joe', 'Clark', '1989')
('Bill', 'Clark', '2009')
('Alan', 'Clark', '1804')
This works well, but I would also like the table to be sorted by column 0 in addition to column 1. In this example, column 1 holds the Last Name and column 0 holds the First Name. I would like the table to be sorted first by Last Name, and then by First Name. Here is the code to sort the table by multiple columns. The cols argument is a tuple specifying the columns to sort by. The first column to sort by is listed first, the second second, and so on.
import operator
def sort_table(table, cols):
""" sort a table by multiple columns
table: a list of lists (or tuple of tuples) where each inner list
represents a row
cols: a list (or tuple) specifying the column numbers to sort by
e.g. (1,0) would sort by column 1, then by column 0
"""
for col in reversed(cols):
table = sorted(table, key=operator.itemgetter(col))
return table
if __name__ == '__main__':
mytable = (
('Joe', 'Clark', '1989'),
('Charlie', 'Babbitt', '1988'),
('Frank', 'Abagnale', '2002'),
('Bill', 'Clark', '2009'),
('Alan', 'Clark', '1804'),
)
for row in sort_table(mytable, (1,0)):
print row
Results:
('Frank', 'Abagnale', '2002')
('Charlie', 'Babbitt', '1988')
('Alan', 'Clark', '1804')
('Bill', 'Clark', '2009')
('Joe', 'Clark', '1989')
Related posts
- How to sort a list of dicts in Python — posted 2010-04-02
- Python setdefault example — posted 2010-02-09
- How to conditionally replace items in a list — posted 2008-08-22
- How to use Python's enumerate and zip to iterate over two lists and their indices. — posted 2008-04-18
- How to invert a dict in Python — posted 2008-01-14
- How to find the intersection and union of two lists in Python — posted 2008-01-03
8
Comments
—
Comments feed for this post
#6 Orangeboy commented on 2010-09-24:
Thanks! New to python, and had trouble with complex sorting. This does the trick!
#7 Senyai commented on 2010-11-26:
Nice, works as fast as sorted(table, cmp=lambda a, b:cmp((a[1],a[0]), (b[1],b[0])))
#8 Will commented on 2011-12-21:
Thank you for this! Very simple and straightforward, that's what I needed!
Post a comment
About
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc... more »
Search Blog
Tags
-
algorithms
(5)
-
aws
(9)
-
blogproject
(20)
-
c_cplusplus
(12)
-
cardstore
(8)
-
colinux
(2)
-
concurrency
(13)
-
conkeror
(2)
-
core
(2)
-
cygwin
(17)
-
datastructures
(14)
-
datetime
(4)
-
decorators
(4)
-
django
(40)
-
emacs
(22)
-
files_directories
(11)
-
git
(5)
-
hardware
(5)
-
install_setup
(8)
-
javascript
(3)
-
keyboard
(9)
-
matplotlib
(5)
-
mercurial
(4)
-
nginx
(2)
-
persistence
(5)
-
preferences
(7)
-
processes
(4)
-
pyqt
(18)
-
python
(144)
-
ratpoison
(3)
-
regexes
(6)
-
rsync
(3)
-
softwaretools
(17)
-
sql
(14)
-
ssh
(10)
-
subversion
(6)
-
twisted
(7)
-
ubuntu
(65)
-
urxvt
(5)
-
vxworks
(25)
-
webdev
(5)
-
wmii
(7)
Blogroll
- Adam Gomaa
- Alex Clemesha
- Amir Salihefendic
- Armin Ronacher
- David Beazley
- David Ziegler
- Duncan McGreggor
- Gareth Rushgrave
- Glyph Lefkowitz
- Guido van Rossum
- Ian Bicking
- Jacob Kaplan-Moss
- James Bennett
- James Tauber
- Jesper Noehr
- Marty Alchin
- Matt Harrison
- Nikolay Kolev
- Parand Darugar
- Peter Baumgartner
- Peter Bengtsson
- Rob Hudson
- Simon Willison
- Will McGugan
#1 Anonymous commented on 2008-01-11:
Very Helpful.
Thanks.