Saltycrane logo

SaltyCrane Blog

Notes on Python, Django, and web development on Ubuntu Linux

    

How to find the intersection and union of two lists in Python

My friend Bill had previously alerted me to the coolness of Python sets. However I hadn't found opportunity to use them until now. Here are three functions using sets to remove duplicate entries from a list, find the intersection of two lists, and find the union of two lists. Note, sets were introduced in Python 2.4, so Python 2.4 or later is required. Also, the items in the list must be hashable and order of the lists is not preserved.

For more information on Python sets, see the Library Reference.

""" NOTES:
      - requires Python 2.4 or greater
      - elements of the lists must be hashable
      - order of the original lists is not preserved
"""
def unique(a):
    """ return the list with duplicate elements removed """
    return list(set(a))

def intersect(a, b):
    """ return the intersection of two lists """
    return list(set(a) & set(b))

def union(a, b):
    """ return the union of two lists """
    return list(set(a) | set(b))

if __name__ == "__main__": 
    a = [0,1,2,0,1,2,3,4,5,6,7,8,9]
    b = [5,6,7,8,9,10,11,12,13,14]
    print unique(a)
    print intersect(a, b)
    print union(a, b)

Results:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[8, 9, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

5 Comments — feed icon Comments feed for this post


#1 Derek commented on 2010-02-25:

You may also want to check out this: [http://stackoverflow.com/questions/642763/python-intersection-of-two-lists]


#2 Eliot commented on 2010-02-25:

Derek: Thanks. My method is simple, but it has its limitations.


#3 panta commented on 2011-04-17:

thanks for the above - very helpful. do you also know a good solution for intersection & union without eliminating the duplicates beforehand? say I have

a = [1,1,2,4,4]

b = [1,4,4,4]

I would like to get

for intersection = [1,4,4]

and for union = [1,1,2,4,4,4].

Right now, I cannot think of anything better than two nested loops.

Thanks.


#4 Jey commented on 2013-05-07:

Thank you. It was exactly what I was looking for!


#5 Dobes commented on 2014-05-28:

@panta,

How about this:

def to_multiset(x):
    result = set()
    max_rep = len(x)
    for elt in x:
        for n in xrange(max_rep):
            n_elt = (elt,n)
            if n_elt not in result:
                result.add(n_elt)
                break
    return result

def from_multiset(x):
    return sorted([elt for elt,n in x])

def multi_union(a, b):
    aa = to_multiset(a)
    bb = to_multiset(b)
    return from_multiset(aa | bb)

def multi_intersect(a, b):
    aa = to_multiset(a)
    bb = to_multiset(b)
    return from_multiset(aa & bb)

a = [1, 1, 2, 4, 4]
b = [1, 4, 4, 4]

expected_intersection = [1, 4, 4]
expected_union = [1, 1, 2, 4, 4, 4]

print multi_union(a, b), expected_union
print multi_intersect(a, b), expected_intersection

Post a comment

Required
Required, but not displayed
Optional

Format using Markdown. (No HTML.)
  • Code blocks: prefix each line by at least 4 spaces or 1 tab (and a blank line before and after)
  • Code span: surround with backticks
  • Blockquotes: prefix lines to be quoted with >
  • Links: <URL>
  • Links w/ description: [description](URL)
Created with Django | Hosted by Linode