Caching Downloads

In the previous chapter, we learned how to scrape data from crawled web pages and save the results to a CSV file. What if we now want to scrape an additional field, such as the flag URL? Without a stored copy of the pages, we would need to download the entire website again. This is not a significant obstacle for our small example website, but other websites can have millions of web pages, which could take weeks to recrawl. One way scrapers avoid this problem is by caching crawled web pages from the beginning, so each page only needs to be downloaded once.
We will look at a few ways to add caching to our web crawler.
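
The idea of downloading each page only once can be illustrated with a minimal sketch. It is not the implementation developed later in the chapter: the names DiskCache and cached_download are hypothetical, and the download argument stands in for whatever function the crawler uses to fetch a URL. The sketch assumes one JSON file per URL, named after a hash of the URL.

```python
import hashlib
import json
import os
import tempfile


class DiskCache:
    """Hypothetical helper: stores one JSON file per URL in cache_dir."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, url):
        # Hash the URL so that any URL maps to a safe, fixed-length filename.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, name + ".json")

    def get(self, url):
        """Return the cached HTML for url, or None if it was never stored."""
        path = self._path(url)
        if not os.path.exists(path):
            return None
        with open(path, encoding="utf-8") as f:
            return json.load(f)["html"]

    def set(self, url, html):
        """Store html for url, overwriting any earlier copy."""
        with open(self._path(url), "w", encoding="utf-8") as f:
            json.dump({"url": url, "html": html}, f)


def cached_download(url, download, cache):
    """Return the page for url, calling download(url) only on a cache miss."""
    html = cache.get(url)
    if html is None:
        html = download(url)
        cache.set(url, html)
    return html


if __name__ == "__main__":
    calls = []

    def fake_download(url):
        # Stand-in for a real network request; records each call.
        calls.append(url)
        return "<html>page for %s</html>" % url

    with tempfile.TemporaryDirectory() as tmp:
        cache = DiskCache(tmp)
        first = cached_download("http://example.com/", fake_download, cache)
        second = cached_download("http://example.com/", fake_download, cache)
        print(first == second, len(calls))  # True 1
```

The second request is served from disk, so the download function runs only once. Scraping an extra field later then means re-parsing the stored HTML rather than recrawling the site.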

In this chapter, we will cover the following topics:

When to use caching
Adding cache support to the link crawler
Testing the cache
Using requests-cache
Redis cache implementation
