Assigning numbers with enumerate()

In Chapter 7, Additional Tuple Techniques, we used the enumerate() function to make a naive assignment of rank numbers to sorted data. We can do things such as pairing up a value with its position in the original sequence, as follows:

pairs = tuple(enumerate(sorted(raw_values)))  

This will sort the items in raw_values in order, create two tuples with an ascending sequence of numbers, and materialize an object we can use for further calculations. The command and the result are as follows:

>>> raw_values = [1.2, .8, 1.2, 2.3, 11, 18]
>>> tuple(enumerate( sorted(raw_values)))
((0, 0.8), (1, 1.2), (2, 1.2), (3, 2.3), (4, 11), (5, 18))

In Chapter 7, Additional Tuple Techniques, we implemented an alternative form of enumerate, the rank() function, which would handle ties in a more statistically useful way.

This is a common feature that is added to a parser to record the source data row numbers. In many cases, we'll create some kind of row_iter() function to extract the string values from a source file. This may iterate over the string values in tags of an XML file or in columns of a CSV file. In some cases, we may even be parsing data presented in an HTML file parsed with Beautiful Soup.

In Chapter 4, Working with Collections, we parsed an XML file to create a simple sequence of position tuples. We then created legs with a start, end, and distance. We did not, however, assign an explicit leg number. If we ever sorted the trip collection, we'd be unable to determine the original ordering of the legs.

In Chapter 7, Additional Tuple Techniques, we expanded on the basic parser to create named tuples for each leg of the trip. The output from this enhanced parser looks as follows:

(Leg(start=Point(latitude=37.54901619777347, longitude=
-76.33029518659048), end=Point(latitude=37.840832, longitude=
-76.273834), distance=17.7246),
Leg(start=Point(latitude=37.840832, longitude=-76.273834),
end=Point(latitude=38.331501, longitude=-76.459503),
distance=30.7382),
Leg(start=Point(latitude=38.331501, longitude=-76.459503),
end=Point(latitude=38.845501, longitude=-76.537331),
distance=31.0756),
...,
Leg(start=Point(latitude=38.330166, longitude=-76.458504),
end=Point(latitude=38.976334, longitude=-76.473503),
distance=38.8019))

The first Leg function is a short trip between two points on the Chesapeake Bay.

We can add a function that will build a more complex tuple with the input order information as part of the tuple. First, we'll define a slightly more complex version of the Leg class:

from typing import NamedTuple
class Point(NamedTuple):
latitude: float
longitude: float

class Leg(NamedTuple):
start: Point
end: Point
distance: float

The Leg definition is similar to the Leg definition shown in Chapter 7, Additional Tuple Techniques, but it includes the order as well as the other attributes. We'll define a function that decomposes pairs and creates Leg instances as follows:

from typing import Iterator

def ordered_leg_iter(
pair_iter: Iterator[Tuple[Point, Point]]
) -> Iterator[Leg]:
for order, pair in enumerate(pair_iter):
start, end = pair
yield Leg(
order,
start,
end,
round(haversine(start, end), 4)
)

We can use this function to enumerate each pair of starting and ending points. We'll decompose the pair and then re-assemble the order, start, and end parameters and the haversine(start,end) parameter's value as a single Leg instance. This generator function will work with an iterable sequence of pairs.

In the context of the preceding explanation, it is used as follows:

filename = "file:./Winter%202012-2013.kml"
with urllib.request.urlopen(filename) as source:
path_iter = float_lat_lon(row_iter_kml(source))
pair_iter = legs(path_iter)
trip_iter = ordered_leg_iter( pair_iter )
trip = list(trip_iter)

We've parsed the original file into the path points, created start—end pairs, and then created a trip that was built of individual Leg objects. The enumerate() function assures that each item in the iterable sequence is given a unique number that increments from the default starting value of 0. A second argument value to the enumerate() function can be given to provide an alternate starting value.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset