Flattening data while mapping

In Chapter 4, Working with Collections, we looked at algorithms that flattened a nested tuple-of-tuples structure into a single iterable. Our goal at the time was simply to restructure some data, without doing any real processing. We can create hybrid solutions that combine a function with a flattening operation.

Let's assume that we have a block of text that we want to convert to a flat sequence of numbers. The text looks as follows:

>>> text= """
...   2   3    5    7   11   13   17   19   23   29 
...  31  37   41   43   47   53   59   61   67   71 
...  73  79   83   89   97  101  103  107  109  113 
... 127 131  137  139  149  151  157  163  167  173 
... 179 181  191  193  197  199  211  223  227  229 
... """  

Each line is a block of 10 numbers. We need to unblock the rows to create a flat sequence of numbers.

This is done with a generator expression that has two for clauses, as follows:

data = list(
    v
    for line in text.splitlines()
    for v in line.split()
)

This will split the text into lines and iterate through each line. It will split each line into words and iterate through each word. The output from this is a list of strings, as follows:
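The two for clauses read left to right in the same order that nested for statements read top to bottom. The following is a minimal sketch of that equivalence. The name sample and its two short rows are illustrative assumptions, a stand-in for the larger block of text:

```python
# A shortened, hypothetical stand-in for the block of text.
sample = """
  2   3    5
  7  11   13
"""

# Generator-expression form: the outer clause comes first, the inner second.
flat = list(
    v
    for line in sample.splitlines()
    for v in line.split()
)

# The same traversal written as nested for statements.
flat_loop = []
for line in sample.splitlines():
    for v in line.split():
        flat_loop.append(v)

print(flat)               # ['2', '3', '5', '7', '11', '13']
print(flat == flat_loop)  # True
```

The leading blank line in the triple-quoted string contributes nothing, because splitting an empty string on whitespace yields an empty list.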

['2', '3', '5', '7', '11', '13', '17', '19', '23', '29', '31', '37', 
'41', '43', '47', '53', '59', '61', '67', '71', '73', '79', '83',
'89', '97', '101', '103', '107', '109', '113', '127', '131', '137',
'139', '149', '151', '157', '163', '167', '173', '179', '181', '191',
'193', '197', '199', '211', '223', '227', '229']

To convert the strings to numbers, we must apply a conversion function as well as unwind the blocked structure from its original format, using the following code snippet:

from numbers import Number
from typing import Callable, Iterator

Num_Conv = Callable[[str], Number]

def numbers_from_rows(
        conversion: Num_Conv, text: str) -> Iterator[Number]:
    return (
        conversion(value)
        for line in text.splitlines()
        for value in line.split()
    )

This function has a conversion parameter, which is a function that is applied to each value before it is emitted. The values themselves are produced by the flattening algorithm shown previously.

We can use this numbers_from_rows() function in the following kind of expression:

print(list(numbers_from_rows(float, text)))

Here we've used the built-in float() to create a list of floating-point values from the block of text.
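Because the conversion is a parameter, other callables that accept a string should work the same way. The following is a sketch under that assumption, using int() and fractions.Fraction as examples. The name sample is a hypothetical, shortened stand-in for the block of text, and numbers_from_rows() is repeated so the example runs on its own:

```python
from fractions import Fraction
from numbers import Number
from typing import Callable, Iterator

Num_Conv = Callable[[str], Number]

def numbers_from_rows(
        conversion: Num_Conv, text: str) -> Iterator[Number]:
    # Flatten rows into words, converting each word as it is emitted.
    return (
        conversion(value)
        for line in text.splitlines()
        for value in line.split()
    )

# A shortened, hypothetical stand-in for the block of text.
sample = """
  2   3    5
  7  11   13
"""

as_ints = list(numbers_from_rows(int, sample))
as_fractions = list(numbers_from_rows(Fraction, sample))

print(as_ints)          # [2, 3, 5, 7, 11, 13]
print(as_fractions[0])  # 2
```

The flattening logic is untouched in both calls; only the function applied to each word changes.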

We have many alternatives using mixtures of higher-order functions and generator expressions. For example, we might express this as follows:

map(float, (
    value
    for line in text.splitlines()
    for value in line.split()
))

This might help us understand the overall structure of the algorithm. The principle is called chunking: once the details are wrapped in a function with a meaningful name, we can set them aside and work with the function in a new context. While we often use higher-order functions, there are times when a generator expression can be clearer.
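One way to read the chunking idea is to give the flattening step its own name and then hand the result to map(). The following is a minimal sketch; flatten_words and sample are hypothetical names introduced here for illustration:

```python
from typing import Iterator

def flatten_words(text: str) -> Iterator[str]:
    # Hypothetical helper: unblock rows of text into a flat stream of words.
    return (
        value
        for line in text.splitlines()
        for value in line.split()
    )

# A shortened, hypothetical stand-in for the block of text.
sample = """
  2   3    5
  7  11   13
"""

# With the flattening named, the conversion reads as a single map() call.
values = list(map(float, flatten_words(sample)))
print(values)  # [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
```

Whether this reads better than a single generator expression is a judgment call; the named helper pays off mainly when the flattening is reused with several conversions.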
