Using the map_async(), starmap_async(), and apply_async() functions

The map(), starmap(), and apply() functions allocate work to a subprocess in the Pool object and then collect the response from the subprocess when that response is ready. Because these calls block, the parent process waits until the results have been gathered before it can do anything else. The _async() variations of these functions do not wait for the child to finish. Instead, they return an object that can be queried to get the individual results from the child processes.
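As a rough illustration of that difference, here is a minimal sketch that submits work with each of the three _async() methods. The async_variants() helper and its sample inputs are hypothetical; built-in functions stand in for real work so that they can be sent to the worker processes without extra setup.

```python
import multiprocessing


def async_variants(pool):
    """Submit work with each _async() method and collect the results.

    Hypothetical helper for illustration; `pool` is an already-created Pool.
    """
    # Each call returns immediately with a result object, not the value itself.
    mapped = pool.map_async(abs, [-1, -2, 3])
    starred = pool.starmap_async(pow, [(2, 3), (3, 2)])
    applied = pool.apply_async(len, ("abc",))
    # get() blocks until the corresponding work has finished.
    return mapped.get(), starred.get(), applied.get()


if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(async_variants(pool))
```

All three submissions are queued before the first get() call, so the workers can process them while the parent is still handing out work.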

The following is a variation using the map_async() method:

import glob
import multiprocessing
from collections import Counter

pattern = "*.gz"
combined = Counter()
with multiprocessing.Pool() as workers:
    results = workers.map_async(
        analysis, glob.glob(pattern))
    data = results.get()
    for c in data:
        combined.update(c)

We've created a Counter object that we'll use to consolidate the results from each worker in the pool. We created a pool of subprocesses based on the number of available CPUs and used this Pool object as a context manager. We then mapped our analysis() function to each file that matches our file-name pattern. The response from the map_async() function is a MapResult object; we can query this for the results and overall status of the mapped work. In this case, we used the get() method to get the sequence of Counter objects.
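The result object offers more than get(). The following sketch, under the assumption that we want to give up after a time limit, uses the wait(), ready(), and successful() methods to check on the work first. The lengths_when_ready() helper, its timeout value, and the use of len() as the worker function are illustrative choices, not part of the log-analysis example.

```python
import multiprocessing


def lengths_when_ready(pool, words, timeout=10.0):
    """Return the mapped lengths, or None if the work did not finish cleanly.

    Hypothetical helper for illustration; `pool` is an already-created Pool.
    """
    result = pool.map_async(len, words)
    # wait() blocks for at most `timeout` seconds and returns None either way.
    result.wait(timeout)
    # successful() is only meaningful once ready() reports True.
    if result.ready() and result.successful():
        return result.get()
    return None


if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(lengths_when_ready(pool, ["a", "bb", "ccc"]))
```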

The resulting Counter objects from the analysis() function are combined into a single resulting Counter object. This aggregate gives us an overall summary of a number of log files. As written, this processing is not any faster than the previous example, because we call get() right away. The benefit of the map_async() function is that it allows the parent process to do additional work while waiting for the children to finish.
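A minimal sketch of that idea follows. It assumes the parent has some independent work to do between submitting the tasks and collecting the results. The summarize() helper, the parity count standing in for the parent's own work, and the use of math.factorial() as the worker function are all hypothetical.

```python
import math
import multiprocessing
from collections import Counter


def summarize(pool, numbers):
    """Compute factorials in the pool while the parent does unrelated work.

    Hypothetical helper for illustration; `pool` is an already-created Pool.
    """
    # Hand the work to the pool; this call returns without waiting.
    pending = pool.map_async(math.factorial, numbers)
    # Stand-in for the parent's additional work: count odd and even inputs.
    parity = Counter(n % 2 for n in numbers)
    # Only now block until the workers have finished.
    factorials = pending.get()
    return parity, factorials


if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        print(summarize(pool, [3, 4, 5]))
```

The parent's work is placed between map_async() and get(), so it overlaps with the work done in the child processes instead of following it.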
