My friend Bill had previously alerted me to the coolness of Python
sets. However I hadn't found opportunity to use them
until now. Here are three functions using
remove duplicate entries from a list, find the intersection of two
lists, and find the union of two lists. Note,
introduced in Python 2.4, so Python 2.4 or later is required. Also,
the items in the list must be hashable and order of the lists is not
For more information on Python
sets, see the
""" NOTES: - requires Python 2.4 or greater - elements of the lists must be hashable - order of the original lists is not preserved """ def unique(a): """ return the list with duplicate elements removed """ return list(set(a)) def intersect(a, b): """ return the intersection of two lists """ return list(set(a) & set(b)) def union(a, b): """ return the union of two lists """ return list(set(a) | set(b)) if __name__ == "__main__": a = [0,1,2,0,1,2,3,4,5,6,7,8,9] b = [5,6,7,8,9,10,11,12,13,14] print unique(a) print intersect(a, b) print union(a, b)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [8, 9, 5, 6, 7] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
You may also want to check out this: [http://stackoverflow.com/questions/642763/python-intersection-of-two-lists]
thanks for the above - very helpful. do you also know a good solution for intersection & union without eliminating the duplicates beforehand? say I have
a = [1,1,2,4,4]
b = [1,4,4,4]
I would like to get
for intersection = [1,4,4]
and for union = [1,1,2,4,4,4].
Right now, I cannot think of anything better than two nested loops.
Thank you. It was exactly what I was looking for!
How about this:
def to_multiset(x): result = set() max_rep = len(x) for elt in x: for n in xrange(max_rep): n_elt = (elt,n) if n_elt not in result: result.add(n_elt) break return result def from_multiset(x): return sorted([elt for elt,n in x]) def multi_union(a, b): aa = to_multiset(a) bb = to_multiset(b) return from_multiset(aa | bb) def multi_intersect(a, b): aa = to_multiset(a) bb = to_multiset(b) return from_multiset(aa & bb) a = [1, 1, 2, 4, 4] b = [1, 4, 4, 4] expected_intersection = [1, 4, 4] expected_union = [1, 1, 2, 4, 4, 4] print multi_union(a, b), expected_union print multi_intersect(a, b), expected_intersection
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc... more »