How to find the intersection and union of two lists in Python
My friend Bill had previously alerted me to the coolness of Python sets. However, I hadn't found an opportunity to use them until now. Here are three functions that use sets to remove duplicate entries from a list, find the intersection of two lists, and find the union of two lists. Note that the built-in set type was introduced in Python 2.4, so Python 2.4 or later is required. Also, the items in the lists must be hashable, and the order of the lists is not preserved.
For more information on Python sets, see the Library Reference.
""" NOTES:
      - requires Python 2.4 or greater
      - elements of the lists must be hashable
      - order of the original lists is not preserved
"""

def unique(a):
    """ return the list with duplicate elements removed """
    return list(set(a))

def intersect(a, b):
    """ return the intersection of two lists """
    return list(set(a) & set(b))

def union(a, b):
    """ return the union of two lists """
    return list(set(a) | set(b))

if __name__ == "__main__":
    a = [0,1,2,0,1,2,3,4,5,6,7,8,9]
    b = [5,6,7,8,9,10,11,12,13,14]
    # print(x) with a single argument behaves the same on Python 2 and 3
    print(unique(a))
    print(intersect(a, b))
    print(union(a, b))
Results:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[8, 9, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
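The shuffled intersection in the results is a side effect of going through a set. If the original order matters, one possible variant is sketched below. It assumes Python 3.7 or later, where dict keeps insertion order, and the `ordered_*` names are hypothetical, not part of the original functions. The items still have to be hashable.

```python
def ordered_unique(a):
    """Return a with duplicates removed, keeping first-occurrence order."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(a))

def ordered_intersect(a, b):
    """Return the unique items of a that also appear in b, in a's order."""
    in_b = set(b)  # set gives fast membership tests
    return [x for x in dict.fromkeys(a) if x in in_b]

def ordered_union(a, b):
    """Return the unique items of a, followed by items only found in b."""
    return list(dict.fromkeys(list(a) + list(b)))

a = [0, 1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(ordered_unique(a))
print(ordered_intersect(a, b))
print(ordered_union(a, b))
```

With these inputs the intersection comes back as [5, 6, 7, 8, 9], following the order of the first list.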
5 Comments
#3 panta commented on 2011-04-17:
Thanks for the above, very helpful. Do you also know a good solution for intersection and union without eliminating the duplicates beforehand? Say I have
a = [1,1,2,4,4]
b = [1,4,4,4]
I would like to get
for intersection: [1,4,4]
and for union: [1,1,2,4,4,4].
Right now, I cannot think of anything better than two nested loops.
Thanks.
#5 Dobes commented on 2014-05-28:
@panta,
How about this:
def to_multiset(x):
    result = set()
    max_rep = len(x)
    for elt in x:
        for n in xrange(max_rep):
            n_elt = (elt,n)
            if n_elt not in result:
                result.add(n_elt)
                break
    return result

def from_multiset(x):
    return sorted([elt for elt,n in x])

def multi_union(a, b):
    aa = to_multiset(a)
    bb = to_multiset(b)
    return from_multiset(aa | bb)

def multi_intersect(a, b):
    aa = to_multiset(a)
    bb = to_multiset(b)
    return from_multiset(aa & bb)

a = [1, 1, 2, 4, 4]
b = [1, 4, 4, 4]
expected_intersection = [1, 4, 4]
expected_union = [1, 1, 2, 4, 4, 4]
print multi_union(a, b), expected_union
print multi_intersect(a, b), expected_intersection
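Another option for panta's question may be `collections.Counter` from the standard library (available since Python 2.7 and 3.1), whose `&` and `|` operators take the minimum and maximum of the per-element counts. The following is a minimal sketch written for Python 3; the names `counter_intersect` and `counter_union` are made up for illustration.

```python
from collections import Counter

def counter_intersect(a, b):
    """Keep each element min(count in a, count in b) times."""
    return sorted((Counter(a) & Counter(b)).elements())

def counter_union(a, b):
    """Keep each element max(count in a, count in b) times."""
    return sorted((Counter(a) | Counter(b)).elements())

a = [1, 1, 2, 4, 4]
b = [1, 4, 4, 4]
print(counter_intersect(a, b))  # [1, 4, 4]
print(counter_union(a, b))      # [1, 1, 2, 4, 4, 4]
```

As with the set-based versions, the elements must be hashable, and `sorted` also requires them to be mutually comparable.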
About
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc.
#1 Derek commented on 2010-02-25:
You may also want to check out this: [http://stackoverflow.com/questions/642763/python-intersection-of-two-lists]