Supporting proxies

Sometimes it's necessary to access a website through a proxy. For example, Hulu is blocked in many countries outside the United States as are some videos on YouTube. Supporting proxies with urllib is not as easy as it could be. We will cover requests for a more user-friendly Python HTTP module that can also handle proxies later in this chapter. Here's how to support a proxy with urllib:

proxy = 'http://myproxy.net:1234' # example string 
proxy_support = urllib.request.ProxyHandler({'http': proxy})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener) 
# now requests via urllib.request will be handled via proxy

Here is an updated version of the download function to integrate this:

def download(url, user_agent='wswp', num_retries=2, charset='utf-8', proxy=None): 
    print('Downloading:', url) 
    request = urllib.request.Request(url) 
    request.add_header('User-agent', user_agent)
    try: 
        if proxy:
            proxy_support = urllib.request.ProxyHandler({'http': proxy})
            opener = urllib.request.build_opener(proxy_support)
            urllib.request.install_opener(opener)
        resp = urllib.request.urlopen(request)
        cs = resp.headers.get_content_charset()
        if not cs:
            cs = charset
        html = resp.read().decode(cs)
    except (URLError, HTTPError, ContentTooShortError) as e:
        print('Download error:', e.reason)
        html = None 
        if num_retries > 0: 
            if hasattr(e, 'code') and 500 <= e.code < 600: 
            # recursively retry 5xx HTTP errors 
            return download(url, num_retries - 1) 
    return html

The current urllib module does not support https proxies by default (Python 3.5). This may change with future versions of Python, so check the latest documentation. Alternatively, you can use the documentation's recommended recipe (https://code.activestate.com/recipes/456195/) or keep reading to learn how to use the requests library.

Table of Contents for Supporting proxies

Create new playlist

Sign In

Sign Up

Table of Contents for
Supporting proxies