Using max() and min() to find extrema

The max() and min() functions each have a dual life. They are simple functions that apply to collections. They are also higher-order functions. We can see their default behavior as follows:

>>> max(1, 2, 3)
3
>>> max((1,2,3,4))
4  

Both functions will accept an indefinite number of arguments. They are also designed to accept a sequence or other iterable as the only argument and locate the maximum (or minimum) of that iterable.

They also do something more sophisticated. Let's say we have our trip data from the examples in Chapter 4, Working with Collections. We have a function that will generate a sequence of tuples that looks as follows:

(
    ((37.54901619777347, -76.33029518659048), (37.840832, -76.273834), 17.7246),
    ((37.840832, -76.273834), (38.331501, -76.459503), 30.7382),
    ((38.331501, -76.459503), (38.845501, -76.537331), 31.0756),
    ((36.843334, -76.298668), (37.549, -76.331169), 42.3962),
    ((37.549, -76.331169), (38.330166, -76.458504), 47.2866),
    ((38.330166, -76.458504), (38.976334, -76.473503), 38.8019)
)

Each tuple in this collection has three values: a starting location, an ending location, and a distance. The locations are given as latitude and longitude pairs. North latitude is positive and west longitude is negative, so these are points along the US East Coast, at about 76° west. The distances between points are in nautical miles.

We have three ways of getting the maximum and minimum distances from this sequence of values. They are as follows:

  • Extract the distance with a generator expression. This will give us only the distances, as we've discarded the other two attributes of each leg. This won't work out well if we have any additional processing requirements.
  • Use the unwrap(process(wrap())) pattern. This will give us the legs with the longest and shortest distances. From these, we can extract the distance item, if that's all that's needed. 
  • Use the max() and min() functions as higher-order functions, inserting a function that does the extraction of the important distance values.

To provide context, the following is a script that builds the overall trip:

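from ch02_ex3 import (
    float_from_pair, lat_lon_kml, limits, haversine, legs
)
# Note: float_lat_lon() and row_iter_kml() are not in the import list above;
# they are assumed to be available from the same example module.
path = float_from_pair(float_lat_lon(row_iter_kml(source)))
trip = tuple(
    (start, end, round(haversine(start, end), 4))
    for start, end in legs(iter(path))
)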

This script requires source to be an open file with KML-formatted data points. The essential trip object is a tuple of individual legs. Each leg is a three-tuple with the starting point, the ending point, and the distance, computed with the haversine() function. The legs() function creates start-end pairs from the overall path of points in the original KML file.
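The haversine() and legs() functions are imported rather than shown here. As a rough sketch of what such helpers might look like, the following uses the hypothetical names haversine_sketch() and legs_sketch() and assumes a mean Earth radius of about 3440 nautical miles. These are illustrative assumptions, not the definitions from ch02_ex3.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Iterable, Iterator, Tuple

Point = Tuple[float, float]

# Assumed mean Earth radius in nautical miles (an illustrative value).
EARTH_RADIUS_NM = 3440.0


def haversine_sketch(p1: Point, p2: Point, radius: float = EARTH_RADIUS_NM) -> float:
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    d_lat = lat2 - lat1
    d_lon = lon2 - lon1
    a = sin(d_lat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(d_lon / 2) ** 2
    return 2 * radius * asin(sqrt(a))


def legs_sketch(points: Iterable[Point]) -> Iterator[Tuple[Point, Point]]:
    """Yield (start, end) pairs from consecutive points."""
    point_iter = iter(points)
    try:
        start = next(point_iter)
    except StopIteration:
        return  # no points, so no legs
    for end in point_iter:
        yield start, end
        start = end


# Three consecutive points taken from the sample legs shown earlier.
path = [
    (37.54901619777347, -76.33029518659048),
    (37.840832, -76.273834),
    (38.331501, -76.459503),
]
trip_sketch = tuple(
    (start, end, round(haversine_sketch(start, end), 4))
    for start, end in legs_sketch(path)
)
```

With three points, trip_sketch holds two legs, each a three-tuple in the same shape as the sample data.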

Once we have this trip object, we can extract distances and compute the maximum and minimum of those distances. The code to do this with a generator expression looks as follows:

>>> long = max(dist for start, end, dist in trip)
>>> short = min(dist for start, end, dist in trip)

We've used a generator expression to extract the relevant item from each leg of the trip tuple. We've had to repeat the generator expression because each generator expression can be consumed only once.
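To see why the expression has to be repeated, here is a small sketch that reuses a single generator object for both calls. The trip value below is a three-leg subset of the sample data shown earlier, and the variable names are only illustrative.

```python
# Three legs copied from the sample data shown earlier.
trip = (
    ((37.54901619777347, -76.33029518659048), (37.840832, -76.273834), 17.7246),
    ((37.840832, -76.273834), (38.331501, -76.459503), 30.7382),
    ((38.331501, -76.459503), (38.845501, -76.537331), 31.0756),
)

distances = (dist for start, end, dist in trip)
longest = max(distances)  # this pass consumes every item in the generator

try:
    shortest = min(distances)  # the generator is now empty
except ValueError:
    # min() of an empty iterable raises ValueError when no default is given.
    shortest = None

# One alternative: materialize the distances once, then reuse the list.
distance_list = [dist for start, end, dist in trip]
longest_again = max(distance_list)
shortest_again = min(distance_list)
```

Materializing a list trades a little memory for the ability to make several passes over the same values.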

Here are the results based on a larger set of data than was shown previously:

>>> long
129.7748
>>> short
0.1731

The following is a version with the unwrap(process(wrap())) pattern. To make it clear, the example includes functions with the names wrap() and unwrap(). Here are the functions and the evaluation of those functions:

from typing import Iterator, Iterable, Tuple, Any

Wrapped = Tuple[Any, Tuple]

def wrap(leg_iter: Iterable[Tuple]) -> Iterable[Wrapped]:
    return ((leg[2], leg) for leg in leg_iter)

def unwrap(dist_leg: Tuple[Any, Any]) -> Any:
    distance, leg = dist_leg
    return leg

long = unwrap(max(wrap(trip)))
short = unwrap(min(wrap(trip)))

Unlike the previous version, the max() and min() functions will locate all attributes of the legs with the longest and shortest distances. Rather than simply extracting the distances, we put the distances first in each wrapped tuple. We can then use the default forms of the min() and max() functions to process the two-tuples that contain the distance and leg details. After processing, we can strip the first element, leaving just the leg details.

The results look as follows:

((27.154167, -80.195663), (29.195168, -81.002998), 129.7748)
((35.505665, -76.653664), (35.508335, -76.654999), 0.1731)
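As a self-contained illustration of the wrap and unwrap steps, the following sketch applies the same two functions to the six sample legs shown earlier (not the larger data set behind the results above). The simplified type hints are an assumption made to keep the block short.

```python
from typing import Any, Iterable, Iterator, Tuple

# The six sample legs shown earlier: (start, end, distance).
trip = (
    ((37.54901619777347, -76.33029518659048), (37.840832, -76.273834), 17.7246),
    ((37.840832, -76.273834), (38.331501, -76.459503), 30.7382),
    ((38.331501, -76.459503), (38.845501, -76.537331), 31.0756),
    ((36.843334, -76.298668), (37.549, -76.331169), 42.3962),
    ((37.549, -76.331169), (38.330166, -76.458504), 47.2866),
    ((38.330166, -76.458504), (38.976334, -76.473503), 38.8019),
)


def wrap(leg_iter: Iterable[Tuple]) -> Iterator[Tuple[Any, Tuple]]:
    # Put the distance first so that default tuple comparison uses it.
    return ((leg[2], leg) for leg in leg_iter)


def unwrap(dist_leg: Tuple[Any, Any]) -> Any:
    # Discard the distance that was added by wrap() and keep the leg.
    distance, leg = dist_leg
    return leg


wrapped = list(wrap(trip))  # each item is (distance, (start, end, distance))
longest = unwrap(max(wrap(trip)))
shortest = unwrap(min(wrap(trip)))
```

Because tuples compare element by element, two legs with equal distances would be ordered by comparing the leg tuples themselves. That works here because every element is a number or a tuple of numbers.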

The final and most important form uses the higher-order function feature of the max() and min() functions. We'll define a helper function first and then use it to reduce the collection of legs to the desired summaries by executing the following code snippet:

def by_dist(leg: Tuple[Any, Any, Any]) -> Any:
    start, end, dist = leg
    return dist

long = max(trip, key=by_dist)
short = min(trip, key=by_dist)

The by_dist() function picks apart the three items in each leg tuple and returns the distance item. We'll use this with the max() and min() functions.

The max() and min() functions both accept an iterable and a function as arguments. The keyword parameter key= is used by several of Python's higher-order functions, including max(), min(), and sorted(), to provide a function that will be used to extract the necessary key value.
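The key function does not have to be a named def. As a brief sketch, the same extraction can be written with a lambda or with operator.itemgetter(), and the same key= convention works with sorted(). The trip value is the six-leg sample shown earlier, and the variable names are illustrative.

```python
from operator import itemgetter

# The six sample legs shown earlier: (start, end, distance).
trip = (
    ((37.54901619777347, -76.33029518659048), (37.840832, -76.273834), 17.7246),
    ((37.840832, -76.273834), (38.331501, -76.459503), 30.7382),
    ((38.331501, -76.459503), (38.845501, -76.537331), 31.0756),
    ((36.843334, -76.298668), (37.549, -76.331169), 42.3962),
    ((37.549, -76.331169), (38.330166, -76.458504), 47.2866),
    ((38.330166, -76.458504), (38.976334, -76.473503), 38.8019),
)

# A lambda that picks the distance item (index 2) from each leg.
longest = max(trip, key=lambda leg: leg[2])

# itemgetter(2) builds an equivalent extraction function.
shortest = min(trip, key=itemgetter(2))

# sorted() accepts the same key= parameter; this orders legs by distance.
by_length = sorted(trip, key=itemgetter(2))
```

In each case the result is a complete leg, so the start and end points stay available alongside the distance.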

We can use the following to help conceptualize how the max() function uses the key function:

from typing import Iterable, Any, Callable

def max_like(trip: Iterable[Any], key: Callable) -> Any:
    wrap = ((key(leg), leg) for leg in trip)
    return sorted(wrap)[-1][1]

The max() and min() functions behave as if the result of the given key() function is being used to wrap each item in the sequence in a two-tuple. After sorting the two-tuples, selecting the first (for min) or last (for max) provides a two-tuple with an extreme value. This can be decomposed to recover the original value.

In order for the key() function to be optional, it can have a default value of lambda x: x.
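A minimal sketch of that idea follows, with a matching min_like() added for symmetry. Both names are illustrative, and this is a conceptual model only, not how the built-in functions are implemented.

```python
from typing import Any, Callable, Iterable


def max_like(items: Iterable[Any], key: Callable[[Any], Any] = lambda x: x) -> Any:
    # Wrap each item with its key, sort, and take the item from the last pair.
    wrapped = ((key(item), item) for item in items)
    return sorted(wrapped)[-1][1]


def min_like(items: Iterable[Any], key: Callable[[Any], Any] = lambda x: x) -> Any:
    # Same wrapping, but take the item from the first pair after sorting.
    wrapped = ((key(item), item) for item in items)
    return sorted(wrapped)[0][1]


# With the default key, each item is compared by its own value.
largest = max_like([3, 1, 2])
smallest = min_like([3, 1, 2])

# With an explicit key, the key value decides the ordering.
words = ["pear", "fig", "banana"]
longest_word = max_like(words, key=len)
shortest_word = min_like(words, key=len)
```

One difference from the built-ins is worth noting: when two keys are equal, sorting the two-tuples goes on to compare the items themselves, whereas max() and min() compare only the key values.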
