Collections is a built-in Python module that implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers such as dict, list, set, and tuple.
Some of the useful data structures present in this module are:plain_tuple = (10,11,12,13) plain_tuple[0] 10 plain_tuple[3] 13
We can’t give names to individual elements stored in a tuple. Now, this might not be needed in simple cases. However, in case a tuple has many fields this might be kind of a necessity and will also impact the code’s readability. It is here that namedtuple’s functionality comes into the picture. It is a function for tuples with Named Fields and can be seen as an extension of the built-in tuple data type. Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. Each object stored in them can be accessed through a unique (human-readable) identifier and this frees us from having to remember integer indexes. Let’s see its implementation.
from collections import namedtuple fruit = namedtuple('fruit','number variety color') guava = fruit(number=2,variety='HoneyCrisp',color='green') apple = fruit(number=5,variety='Granny Smith',color='red')
We construct the namedtuple by first passing the object type name (fruit) and then passing a string with the variety of fields as a string with spaces between the field names. We can then call on the various attributes:
guava.color 'green' apple.variety 'Granny Smith'
# Importing Counter from collections from collections import Counter # With Strings c = Counter('abcacdabcacd') print(c) Counter('a': 4, 'c': 4, 'b': 2, 'd': 2) # With lists lst = [5,6,7,1,3,9,9,1,2,5,5,7,7] c = Counter(lst) print(c) Counter('a': 4, 'c': 4, 'b': 2, 'd': 2) # With Sentence s = 'the lazy dog attacked over another lazy dog' words = s.split() Counter(words) Counter('another': 1, 'dog': 2, 'attacked': 1, 'lazy': 2, 'over': 1, 'the': 1)Common functions on counter object
sum(c.values()) # total of all counts c.clear() # reset all counts list(c) # list unique elements set(c) # convert to a set dict(c) # convert to a regular dictionary c.items()# convert to a list like (elem, cnt) Counter(dict(list_of_pairs)) # convert from a list of(elem, cnt) c.most_common()[:-n-1:-1] # n least common elements c += Counter() # remove zero and negative counts
Dictionaries are an efficient way to store data for later retrieval having an unordered set of key: value pairs. Keys must be unique and immutable objects.
from collections import defaultdict d = defaultdict(object) print(d['A']) object object at 0x7fc9bed4cb00
The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet).The defaultdict is also a dictionary-like object and provides all methods provided by a dictionary. However, the point of difference is that it takes the first argument (default_factory) as a default data type for the dictionary.
An OrderedDict is a dictionary subclass that remembers the order in which that keys were first inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added. Since an ordered dictionary remembers its insertion order, it can be used in conjunction with sorting to make a sorted dictionary
Dictionary ordered by key
OrderedDict(sorted(d.items(), key=lambda t: t[0])) OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
Dictionary ordered by value
OrderedDict(sorted(d.items(), key=lambda t: t[1])) OrderedDict([('pear', 1), ('orange', 2), ('banana', 3), ('apple', 4)])
Dictionary ordered by length of key string
OrderedDict(sorted(d.items(), key=lambda t: len(t[0]))) OrderedDict([('pear', 1), ('apple', 4), ('banana', 3), ('orange', 2)])
from collections import deque list = ["a","b","c"] deq = deque(list) print(deq) deque(['a', 'b', 'c'])Inserting elements into deque
You can easily insert an element to the deq we created at either of the ends. To add an element to the right of the deque, you have to use append() method.
If you want to add an element to the start of the deque, you have to use appendleft() method.
deq.append("d") deq.appendleft("e") print(deq) deque(['e', 'a', 'b', 'c', 'd'])Removing elements from deque
Removing elements is similar to inserting elements. You can remove an element the similar way you insert elements. To remove an element from the right end, you can use pop() function and to remove an element from left, you can use popleft().
deq.pop() deq.popleft() print(deq) deque(['a', 'b', 'c'])Clearing deque and counting each element in deque
# Clearing the deque list = ["a","b","c"] deq = deque(list) print(deq) print(deq.clear()) # Counting each element list = ["a","b","c"] deq = deque(list) print(deq.count("a"))
Creating the ChainMap
To create a chainmap we can use ChainMap() constructor. We have to pass the dictionaries we are going to combine as an argument set.
dict1 = { 'a' : 1, 'b' : 2 } dict2 = { 'c' : 3, 'b' : 4 } chain_map = ChainMap(dict1, dict2) print(chain_map.maps) [{'b': 2, 'a': 1}, {'c': 3, 'b': 4}]
Another important point is ChainMap updates its values when its associated dictionaries are updated. For example, if you change the value of 'c' in dict2 to '5', you will notice the change in ChainMap as well.
dict2['c'] = 5 print(chain_map.maps) [{'a': 1, 'b': 2}, {'c': 5, 'b': 4}]
Getting keys and values from ChainMap
dict1 = { 'a' : 1, 'b' : 2 } dict2 = { 'c' : 3, 'b' : 4 } chain_map = ChainMap(dict1, dict2) print (list(chain_map.keys())) print (list(chain_map.values()))
Adding new dictionary to ChainMap
dict3 = {'e' : 5, 'f' : 6} new_chain_map = chain_map.new_child(dict3) print(new_chain_map) ChainMap({'f': 6, 'e': 5}, {'a': 1, 'b': 2}, {'b': 4, 'c': 3})Python collection module still needs improvements and features as compared to Java collection library. We can except changes in upcoming python releases.