Simulating web browsing

Corporate websites are usually built by teams or departments using specialized tools and templates. Much of the content is generated on the fly, and a large part of each page consists of JavaScript and CSS. This means that even after downloading a page, we still have to evaluate at least the JavaScript code to get at the content. One way to do this from a Python program is the Selenium API. Selenium's main purpose is testing websites, but nothing stops us from using it to scrape them.
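To see why downloading alone is not enough, here is a minimal sketch using only the standard library. The page in `PAGE` is a made-up example, and `VisibleText` and `visible_text` are hypothetical helpers, not part of this recipe's code. The script in the page would fill the `app` element in a real browser, but a static parser never runs it:

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is produced by a script at load time.
PAGE = """
<html><body>
<div id="app"></div>
<script>
document.getElementById('app').textContent = 'Quarterly results';
</script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not script or style source.
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    return ' '.join(parser.chunks)


# The script was never evaluated, so no visible text is found.
print(repr(visible_text(PAGE)))
```

The parser finds no visible text because the text only exists once a JavaScript engine has run the script, which is the job a real browser driven by Selenium takes over.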

Instead of scraping a website, we will scrape an IPython Notebook: the test_widget.ipynb file in this book's code bundle. To simulate browsing this web page, the code bundle provides a unit test class in test_simulating_browsing.py. In case you wondered, this is not the recommended way to test IPython Notebooks.

For historical reasons, I prefer using XPath to find HTML elements. XPath is a query language for XML that also works with HTML. It is not the only option; you can also use CSS selectors, tag names, or IDs. To find the right XPath expression, you can install a relevant plugin for your favorite browser or, in Google Chrome for instance, inspect an element and copy its XPath.
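To get a feel for XPath expressions without starting a browser, the following sketch queries a small fragment with the standard library's `xml.etree.ElementTree`. The fragment in `SNIPPET` is a made-up stand-in for a tabbed widget, not the markup of the actual notebook. Note that ElementTree implements only a subset of XPath: it has no `contains()` function and no `and` operator, so predicates are chained here instead. Selenium hands XPath expressions to the browser, which supports the full XPath 1.0 syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment resembling a tabbed widget.
SNIPPET = """
<div id="widget">
  <ul class="nav">
    <li><a data-toggle="tab" href="#fig">Figure</a></li>
    <li><a data-toggle="tab" href="#axes">Axes</a></li>
  </ul>
  <a data-toggle="collapse" href="#size">figure.figsize</a>
</div>
"""

root = ET.fromstring(SNIPPET)

# By attribute: every link that toggles a tab.
tabs = root.findall(".//a[@data-toggle='tab']")

# By attribute and exact text, using chained predicates.
figure = root.find(".//a[@data-toggle='tab'][.='Figure']")

# By tag name only.
links = root.findall(".//a")

print([a.text for a in tabs])
print(figure.get('href'))
print(len(links))
```

The same idea carries over to the other locator strategies: a tag name or an ID is simply a shorter way to state a query that XPath can also express.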

Getting ready

Install Selenium with the following command:

```
$ pip install selenium
```

I tested the code with Selenium 2.47.1.
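The Selenium API has changed across major releases since then, so it can be useful to check which version is installed before running the recipe. A minimal sketch, assuming Python 3.8 or later for `importlib.metadata`; `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version('selenium')
if version is None:
    print('selenium is not installed')
else:
    print('selenium', version)
```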

How to do it…

The following steps show you how to simulate web browsing using an IPython widget that I made. The code for this recipe is in the test_simulating_browsing.py file in this book's code bundle:

The first step is to run the following:
```
$ ipython notebook
```
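The test in this recipe expects the notebook server to answer on port 8888. If you want to confirm that the server is up before launching the browser, a check along these lines may help. This is a sketch, not part of the recipe's code; `server_is_up` is a hypothetical helper, and the host and port defaults are assumptions based on the URL used later in the recipe:

```python
import socket


def server_is_up(host='localhost', port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not up".
        return False


if not server_is_up():
    print('No server on localhost:8888; start the notebook server first.')
```

The check only proves that some process is listening on the port, not that it is the notebook server, so treat it as a quick sanity check.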

The imports are as follows:

```
from selenium import webdriver
import time
import unittest
import dautil as dl

NAP_SECS = 10
```

Define the following test class, whose setUp() method creates a Firefox browser instance:

```
class SeleniumTest(unittest.TestCase):
    def setUp(self):
        self.logger = dl.log_api.conf_logger(__name__)
        self.browser = webdriver.Firefox()
```

Define the following function to clean up when the test is done:
```
    def tearDown(self):
        self.browser.quit()
```

The following method clicks on the widget tabs (we have to wait for the user interface to respond):

```
    def wait_and_click(self, toggle, text):
        # Match a link by its data-toggle attribute and part of its text
        xpath = "//a[@data-toggle='{0}' and contains(text(), '{1}')]"
        xpath = xpath.format(toggle, text)
        elem = dl.web.wait_browser(self.browser, xpath)
        elem.click()
```
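The recipe delegates the waiting to `dl.web.wait_browser`, whose implementation is not shown here. The general idea behind such a wait is to poll for a condition until it holds or a timeout expires. The following is a generic standard-library sketch of that idea, not the code of dautil or Selenium; `wait_until` and `fake_lookup` are hypothetical names, and the timeout and polling interval are arbitrary choices. Selenium ships its own explicit-wait utility, `WebDriverWait`, for real browser sessions.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %s s' % timeout)
        time.sleep(poll)


# Stand-in for a page element that only appears on the third lookup.
attempts = []


def fake_lookup():
    attempts.append(1)
    return 'tab element' if len(attempts) >= 3 else None


elem = wait_until(fake_lookup, timeout=2.0, poll=0.01)
print(elem, 'found after', len(attempts), 'attempts')
```

Polling with a deadline is usually preferable to a fixed `time.sleep()`, because the code continues as soon as the element is available and fails with a clear error when it never appears.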

Define the following function, which performs the test that consists of evaluating the notebook cells and clicking on a couple of tabs in the IPython widget (we use port 8888):

```
    def test_widget(self):
        self.browser.implicitly_wait(NAP_SECS)
        self.browser.get('http://localhost:8888/notebooks/test_widget.ipynb')

        try:
            # Open the Cell menu
            xpath = '//*[@id="menus"]/div/div/ul/li[5]/a'
            link = dl.web.wait_browser(self.browser, xpath)
            link.click()
            time.sleep(1)

            # Choose Run All to evaluate every cell
            xpath = '//*[@id="run_all_cells"]/a'
            link = dl.web.wait_browser(self.browser, xpath)
            link.click()
            time.sleep(1)

            # Open the Figure tab, then expand the figure.figsize panel
            self.wait_and_click('tab', 'Figure')
            self.wait_and_click('collapse', 'figure.figsize')
        except Exception:
            # Log and continue. tearDown() quits the browser; quitting here
            # would make the screenshot call below fail.
            self.logger.warning('Error while waiting to click', exc_info=True)

        # Give the widget time to render before capturing it
        time.sleep(NAP_SECS)
        self.browser.save_screenshot('widgets_screenshot.png')


if __name__ == "__main__":
    unittest.main()
```

The following screenshot is created by the code:
