bags (Multisets)¶
Bags are a multiset implementation for Python. Currently, bags have constant time inclusion testing but can only contain hashable elements due to the implementation.
There are three classes provided:
Bag
An abstract base class for bags.
bag
A mutable (unhashable) Bag.
frozenbag
An immutable (implements
collections.abc.Hashable
) Bag.
The Bag base class implements collections.abc.Sized
,
collections.abc.Iterable
and collections.abc.Container
as well as collections.abc.Collection
starting in Python
3.6 and the polyfilled Collection
for Python < 3.6.
Set Operations¶
Bags use python operators for multiset operations:
__add__ (a + b): The sum of two multisets
__sub__ (a - b): The difference between a and b
__and__ (a & b): The intersection of a and b
__or__ (a | b): The union of a and b
__xor__ (a ^ b): The symmetric difference between a and b
bag
has the equivalent in-place operators defined.
Comparison Methods¶
Bags are comparable only to other Bags. Ordering comparisons are done setwise.
>>> bag('ac') <= bag('ab')
False
>>> bag('ac') >= bag('ab')
False
>>> bag('a') <= bag('a') < bag('aa')
True
>>> bag('aa') <= bag('a')
False
Compared to existing similar implementations¶
collections.Counter¶
Counters don’t really behave like Collections - Sized, Iterable, Containers
Adding and Removing¶
>>> c = Counter()
>>> c['a'] += 1
>>> c['a'] -= 1
>>> 'a' in c
True
>>> b = bag()
>>> b.add('a')
>>> 'a' in b
True
>>> b.remove('a')
>>> 'a' in b
False
len
¶
>>> c = Counter()
>>> c['a'] += 1
>>> len(c)
1
>>> c['a'] -= 1
>>> len(c)
1
>>> c['a'] += 2
>>> len(c)
1
>>> len(Counter('aaabbc'))
3
>>> b = bag()
>>> b.add('a')
>>> len(b)
1
>>> b.remove('a')
>>> len(b)
0
>>> len(bag('aaabbc'))
6
Iterating¶
>>> for item in Counter('aaa'): print(item)
a
>>> for item in bag('aaa'): print(item)
a
a
a
Compared to Standard Types¶
bag vs. list¶
Inclusion testing is O(1)
Adding and removing elements is O(1)
Cannot add mutable elements
Elements aren’t ordered
bag vs. set¶
Can add multiple instances of equal elements
New Methods¶
These are bag methods that are not implementing an abstract method from a standard Python ABC.
num_unique_elements
Returns the number of unique elements in the bag. O(1)
unique_elements()
Returns a set of all the unique elements in the bag. O(1)
nlargest(n=None)
Returns the n most common elements and their counts from most common to least. If n is None then all elements are returned. O(n log n)
copy()
Returns a shallow copy of self. O(self.num_unique_elements())
isdisjoint(other: Iterable)
Tests if self is disjoint with any other Iterable. O(len(other))
issubset(other: Iterable)
Tests if self is a subset of another Iterable.
issuperset(other: Iterable)
Tests if self is a superset of another Iterable.
from_mapping(map: Mapping)
Classmethod to create a bag from a Mapping that maps elements to counts.
The following are only for mutable bags (not frozenbags).
pop()
add(elem)
discard(elem)
remove(elem)
clear()
API¶
Bag¶
-
class
collections_extended.
Bag
(iterable=None)[source]¶ Bases:
collections.abc.Collection
Base class for bag classes.
Base class for bag and frozenbag. Is not mutable and not hashable, so there’s no reason to use this instead of either bag or frozenbag.
-
__init__
(iterable=None)[source]¶ Create a new bag.
If iterable isn’t given, is None or is empty then the bag starts empty. Otherwise each element from iterable will be added to the bag however many times it appears.
This runs in O(len(iterable))
-
count
(value)[source]¶ Return the number of value present in this bag.
If value is not in the bag no Error is raised, instead 0 is returned.
This runs in O(1) time
- Parameters
value – The element of self to get the count of
- Returns
The count of value in self
- Return type
int
-
nlargest
(n=None)[source]¶ List the n most common elements and their counts. List is from the most common to the least. If n is None, the list all element counts. Run time should be O(m log m) where m is len(self) Args: n (int): The number of elements to return
Deprecated since version 1.0: Use heapq.nlargest(n, self.counts(), key=itemgetter(1)) instead or sorted(self.counts(), reverse=True, key=itemgetter(1)) for n=None
-
counts
()[source]¶ Return a view of the unique elements in self and their counts.
New in version 1.0.3.
-
classmethod
from_mapping
(mapping)[source]¶ Create a bag from a dict of elem->count.
Each key in the dict is added if the value is > 0.
- Raises
ValueError – If any count is < 0.
-
issubset
(other)[source]¶ Check that every element in self has a count <= in other.
- Parameters
other (Iterable) –
-
issuperset
(other)[source]¶ Check that every element in self has a count >= in other.
- Parameters
other (Iterable) –
-
__and__
(other)[source]¶ Intersection is the minimum of corresponding counts.
- This runs in O(l + n) where:
n is self.num_unique_elements()
l = 1 if other is a bag else l = len(other)
-
isdisjoint
(other)[source]¶ Return if this bag is disjoint with the passed collection.
This runs in O(len(other))
-
__or__
(other)[source]¶ Union is the maximum of all elements.
- This runs in O(m + n) where:
n = self.num_unique_elements()
m = other.num_unique_elements() if other is a bag else m = len(other)
-
__add__
(other)[source]¶ Return a new bag also containing all the elements of other.
self + other = self & other + self | other
- This runs in O(m + n) where:
n is self.num_unique_elements()
m is len(other)
- Parameters
other (Iterable) – elements to add to self
-
__sub__
(other)[source]¶ Difference between the sets.
For normal sets this is all x s.t. x in self and x not in other. For bags this is count(x) = max(0, self.count(x)-other.count(x))
- This runs in O(m + n) where:
n is self.num_unique_elements()
m is len(other)
- Parameters
other (Iterable) – elements to remove
-
product
(other, operator=None)[source]¶ Cartesian product of the two sets.
Optionally, pass an operator to combine elements instead of creating a tuple.
- This should run in O(m*n+l) where:
m is the number of unique elements in self
n is the number of unique elements in other
l is 0 if other is a bag, else l is the len(other)
- Parameters
other (Iterable) – The iterable to take the product with.
operator (Callable) – A function that accepts an element from self and other and returns a combined value to include in the return value.
-
bag¶
-
class
collections_extended.
bag
(iterable=None)[source]¶ Bases:
collections_extended.bags.Bag
bag is a mutable unhashable bag.
-
__init__
(iterable=None)¶ Create a new bag.
If iterable isn’t given, is None or is empty then the bag starts empty. Otherwise each element from iterable will be added to the bag however many times it appears.
This runs in O(len(iterable))
-
remove
(elem)[source]¶ Remove elem from this bag, raising a ValueError if it isn’t present.
- Parameters
elem – object to remove from self
- Raises
ValueError – if the elem isn’t present
-
frozenbag¶
-
class
collections_extended.
frozenbag
(iterable=None)[source]¶ Bases:
collections_extended.bags.Bag
,collections.abc.Hashable
frozenbag is an immutable, hashable bag.
-
__init__
(iterable=None)¶ Create a new bag.
If iterable isn’t given, is None or is empty then the bag starts empty. Otherwise each element from iterable will be added to the bag however many times it appears.
This runs in O(len(iterable))
-