10. Web Workers

If you have already been experimenting a bit with JavaScript, you may have come across a browser message similar to this: "A script on this page may be busy, or it may have stopped responding." This could be the result of a programming error, perhaps an endless loop. But what should you do if your JavaScript does not have an error and the calculation is just taking a bit longer than usual? This is where web workers come in.

10.1 Introduction to Web Workers

To ensure that long calculations on the client side do not block the browser, a worker can work in the background and inform the calling script about the status of its calculations via messages. Workers have no access to DOM APIs, the window object, and the document object. What at first seems like a great limitation is in fact very sensible on closer inspection. If scripts running in parallel access the same resources and change them, very complex situations can arise as a result. The strict isolation of the workers and their communication via messages makes the JavaScript code more secure.

Starting a new worker is relatively expensive for the operating system, and each worker takes up more memory than executing the same functions without workers. The advantages are obvious nevertheless: The browser remains responsive, and complicated calculations can be carried out in parallel, potentially increasing speed on modern hardware.

When created, each worker receives the script containing the code for the worker:

var w = new Worker("calc.js");

The script, in this case calc.js, contains JavaScript code that is executed when the worker is called. Optionally, the worker contains an event handler for the message event, reacting to requests by the calling script. In practice, this supplies the worker with data for calculations and triggers the computing process:

addEventListener('message', function(evt) {
  // evt.data contains the data passed
}, false);

The data transfer from the calling script to the worker and vice versa takes place via the postMessage() function. To supply the worker w with data, the following call is suitable:

w.postMessage(imgData);

JavaScript objects can be passed to the postMessage() call; the browser serializes them internally (early implementations converted them to JSON strings, whereas current browsers use the structured clone algorithm). The important point is that this data is copied with every call, which can mean a considerable loss of speed in the case of large amounts of data.
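The copy semantics can be tried out without a worker. The following is only a minimal sketch, not the browser's actual mechanism: the helper copyForMessage() is a hypothetical name, and the JSON round trip merely stands in for the internal copy that postMessage() makes.

```javascript
// Hypothetical helper: imitates the copy made by postMessage()
// with a JSON round trip (an assumption for illustration only).
function copyForMessage(data) {
  return JSON.parse(JSON.stringify(data));
}

var imgData = { id: 0, alpha: [12, 40, 7] };
var received = copyForMessage(imgData);

// The receiver works on its own copy; changes do not reach the sender.
received.alpha[0] = 255;
console.log(imgData.alpha[0]);   // the sender's value is unchanged
console.log(received.alpha[0]);  // the receiver's modified value
```

Because every call produces a full copy, the cost grows with the size of the data, which is why it pays to send only what the worker really needs.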

As mentioned earlier, workers have no access to the window object. Exceptions are the functions of the WindowTimers interface: setTimeout()/clearTimeout() and setInterval()/clearInterval() can also be used within a worker. And workers can load external scripts, which is why the importScripts() function was introduced. One or more JavaScript files can be passed to this function (separated by commas), which the worker loads and can then use.

The worker also has read access to the location object, where the href attribute returns the absolute URL of the running worker script. Via XMLHttpRequest, workers can communicate with web services.

For web workers, the specification distinguishes between Dedicated Workers and Shared Workers; the second category, Shared Workers, is able to receive messages from different scripts and send their own messages to various scripts. In this chapter, we will only address the first variety, Dedicated Workers; for information on Shared Workers, please refer to the relevant sections in the specification at

http://dev.w3.org/html5/workers/#shared-workers-introduction.

Because this specification on web workers is still in an early stage and the existing implementations in WebKit and Firefox are still incomplete, we will omit a detailed description of the API and instead present you with two introductory examples of the way web workers function.

10.2 Search for Leap Years

Both prime numbers and the Fibonacci sequence have already been calculated sufficiently with web workers (you can easily find the relevant examples via Google). We want to turn to another, similarly exciting, task. In the first example, we will search for leap years since January 1, 1970. Because this task would take only a fraction of a second on modern hardware and would not demonstrate the capabilities of web workers, we will make it difficult for our program: It steps through time in very short intervals (seconds or minutes) and checks at each step whether the date is February 29, which indicates a leap year. The step size is selectable, because different hardware will execute the program at different speeds. Figure 10.1 shows the output on a weak CPU after several seconds.

Figure 10.1 Web worker searching for leap years

Clicking the Start button executes the startCalc() function. This reads the step value set in the option field and then initializes the web worker w with the script date_worker.js:

var opts = document.forms.stepForm.step.options;
startCalc = function() {
  var step = opts[opts.selectedIndex].value;
  var w = new Worker('date_worker.js');
  w.postMessage(step);

The call of the postMessage() function to which the selected step size is passed communicates with the event listener for the message event in the script date_worker.js. Now the worker starts working:

addEventListener('message', function(evt) {
  var today = new Date();
  var oldMonth = -1;
  for (var i=0; i<today; i+=Number(evt.data)*1000) {
    var d = new Date(i);
    if (d.getDate() == 29 && d.getMonth() == 1
      && d.getHours() == 12 && d.getMinutes() == 0) {
      postMessage(d.toLocaleString());
    }
    if (d.getMonth() != oldMonth) {
      postMessage("y "+d.getFullYear()+"-"
        +(d.getMonth()+1));
      oldMonth = d.getMonth();
    }
  }
}, false);

A for loop in the worker runs from the second 0 to the current date (today), converting the value passed by postMessage() to a number via the Number() function and then multiplying it by 1000 to get the step size. Access to the postMessage() data takes place via the data attribute, which you have already encountered in the previous chapter about WebSockets. Multiplying by 1000 is necessary because the variable today contains the current value in milliseconds, not in seconds. If a date in the loop is recognized as February 29, the worker sends a message to the calling script and passes the day as a formatted string.
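The search loop can also be tried outside a worker. The following is a sketch under stated assumptions: findLeapDays() and its parameters are hypothetical names, the end of the search is passed in instead of using the current date, and UTC getters replace the local-time getters of the listing so that the result does not depend on the time zone.

```javascript
// Sketch of the worker's search loop as a plain function.
// findLeapDays() and its parameters are hypothetical names.
function findLeapDays(stepSeconds, endMs) {
  var found = [];
  var stepMs = Number(stepSeconds) * 1000;  // Date values are milliseconds
  for (var i = 0; i < endMs; i += stepMs) {
    var d = new Date(i);
    // UTC getters are an assumption made here for reproducible results
    if (d.getUTCDate() === 29 && d.getUTCMonth() === 1 &&
        d.getUTCHours() === 12 && d.getUTCMinutes() === 0) {
      found.push(d.getUTCFullYear());
    }
  }
  return found;
}

// Hourly steps from 1970 up to (but not including) 1980.
// The step is passed as a string, just as a form field would supply it.
var leapYears = findLeapDays("3600", Date.UTC(1980, 0, 1));
console.log(leapYears);
```

Note that the condition matches every step that falls within the minute 12:00, so a step size below 60 seconds reports the same day more than once.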

To indicate the current progress of the calculation, the program sends another message as soon as the loop reaches a new month. This message starts with the string "y " and also contains the year and the month. The following listing shows how the calling script distinguishes this message from a leap year notification:

w.onmessage = function(evt) {
  if (evt.data.substr(0,2) == "y ") {
    $("y").innerHTML = evt.data.substr(2);
  } else {
    $("cnt").innerHTML += "Leap year: "+evt.data+"<br>";
  }
}

The substr() function extracts the first two characters of the variable evt.data and compares them to the value "y ". In the case of a match, the field for displaying the date is updated; otherwise, the date is added as a new line to the field with the ID cnt. As in many other examples, we use the $() function as an abbreviation for the document.getElementById() call.
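The prefix convention can be isolated in a small function, which makes it easy to test without a browser. This is only a sketch: classifyMessage() and the returned object layout are hypothetical and not part of the original program.

```javascript
// Hypothetical helper that mirrors the "y " prefix convention.
function classifyMessage(data) {
  if (data.slice(0, 2) === "y ") {
    // Progress message: strip the two-character prefix
    return { type: "progress", text: data.slice(2) };
  }
  // Everything else is treated as a leap year notification
  return { type: "leap", text: data };
}

var progressMsg = classifyMessage("y 1972-2");
var leapMsg = classifyMessage("2/29/1972, 12:00:00 PM");
console.log(progressMsg.type, progressMsg.text);
console.log(leapMsg.type, leapMsg.text);
```

A string prefix works for two message types; with more types, an object with an explicit field for the message type, as used in the next example of this chapter, is easier to extend.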

If the worker takes too long to run (for example, if your computer does not compute fast enough), you can force the process to end by clicking the Stop button. This stops the worker via the terminate() function; thereafter, the Start button is reactivated after being inactive during the computation:

stopCalc = function() {
  w.terminate();
  $("start").removeAttribute("disabled");
}

The next, more extensive example shows how several workers can work in parallel and carry out a more practical computation than the previous one.

10.3 Calculate Altitude Profiles with Canvas

Among the areas where web workers are particularly useful is undoubtedly the client-side analysis of audio, video, and image files. In our example, we use a PNG file showing the area of Tyrol, Austria, with a special feature: The image’s alpha channel contains the altitude information of the area. You can find this image online at

http://html5.komplett.cc/code/chap_workers/images/topo_elevation_alpha.png.

Via canvas we can not only read the color values, but also the alpha channel values (see Chapter 5, Canvas), allowing us to carry out computations regarding the region. One simple example for such a computation is an altitude profile, extracting the altitude value for each point along a certain line.

The profiles in our example consist of several sections, and we can set both the number of the sections and the number of profiles via text fields on the website. This is necessary to be able to adapt the computation to computers of different speeds. The individual profile sections result from randomly chosen points within the picture. We want the program to display a progress bar during the computation and to output the calculated minimum and maximum altitude along the profile. Once all sections have been calculated, the program returns the number of points found. The website displays the number of points as well as the time it took to calculate the profile. It would make sense to send the entire altitude profile back to the calling program as a result, but if many sections are used, the profile takes up a lot of memory space and slows down the program considerably. This would not achieve the desired demo effect. Figure 10.2 shows two profiles being calculated in parallel using web workers.

Figure 10.2 Web workers calculating two altitude profiles simultaneously

If we are creating more than one profile, we can let the web workers carry out the calculations in parallel, whereas an analysis without web workers always has to be done sequentially. On modern hardware with multi-core CPUs, this means that the browser can distribute the workload across the different cores. Figure 10.3 shows this situation on a system with four CPU cores. The run with web workers uses two cores at 100 percent capacity (at about 30 seconds), whereas the run without web workers uses only one CPU core at full capacity (at 15 seconds). The result is a marginally faster computation with web workers, with the browser reacting to input during the computation and continually updating the progress bar.

Figure 10.3 CPU usage in calculations with and without web workers

10.3.1 Important Code Fragments

To compare how the script behaves with and without web workers, you can call the program with both methods. You first need to integrate the external JavaScript file containing the code for the worker (canvas_profile.js) into the head of the calling website. From that point on, the onmessage function is globally available—but more on the worker code shortly. Let’s start with the HTML code for the program:

<script src="canvas_profile.js"></script>
...
<h1>Calculate elevation profiles with Web Workers</h1>
<p>Number of profiles <input type=number id=profiles
  size=2 oninput="updateProgressBars();" value=2>
Number of sections in profile
<input type=number id=parts value=500 size=4
  oninput="updateProgressBars();">
</p>
<h3>Start
<input type=button onclick="calcProfiles(true)"
    value="with"> or
<input type=button onclick="calcProfiles(false)"
    value="without"> Web Workers
</h3>

Each time the content of one of the two input fields of the type number changes, the function updateProgressBars() is called. In it, the progress bars and the placeholders for the results output are created. The two buttons with and without start the calculation of the altitude profiles.

In the JavaScript code, we first extract the altitude values from the PNG image. To do this, we load the image into a new canvas element:

var canvas = document.createElement("CANVAS");
canvas.width = 300;
canvas.height = 300;
var context = canvas.getContext('2d');
var image = document.querySelector("IMG");
context.drawImage(image,0,0);
// document.querySelector("BODY").appendChild(canvas);
var elev =
context.getImageData(0,0,canvas.width,canvas.height).data;
var alpha = [];
for (var i=0; i<elev.length; i+=4) {
  alpha.push(elev[i+3]);
}

In the variable image, the only img element on the website is loaded and then drawn onto the newly created canvas element. Neither the image nor the canvas is visible on the website, because the img element is marked with display:none and the canvas is never attached to the DOM tree. If you activate the commented-out line in the preceding code, you can see the canvas at the end of the page. As you know from Chapter 5, Canvas, the getImageData() function produces an array with the color and alpha channel values of the canvas (in each case four entries per pixel). Because only the alpha channel values are relevant for our example, we extract them from the array via the for loop. This data reduction is sensible because each worker receives a copy of the array. If we are starting four workers in parallel, the memory usage increases linearly with each worker.
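The data reduction can be checked in isolation. The following sketch assumes a hypothetical helper extractAlpha() and uses a typed array for the result instead of the plain array in the listing; two made-up pixels stand in for the data returned by getImageData().

```javascript
// Hypothetical helper: keeps every fourth value (the alpha channel).
function extractAlpha(rgba) {
  var alpha = new Uint8Array(rgba.length / 4);
  for (var i = 0; i < alpha.length; i++) {
    alpha[i] = rgba[i * 4 + 3];  // layout per pixel: R, G, B, A
  }
  return alpha;
}

// Two made-up pixels stand in for getImageData(...).data
var pixels = new Uint8ClampedArray([10, 20, 30, 64, 40, 50, 60, 128]);
var alphaOnly = extractAlpha(pixels);
console.log(alphaOnly.length, pixels.length);
```

The reduced array holds a quarter of the values, so each copy handed to a worker is correspondingly smaller.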

The calcProfiles() function then starts the calculation with or without workers, depending on whether true or false is passed to the function:

calcProfiles = function(useWorker) {
  USE_WORKER = useWorker;
  startTime = new Date();
  for (var i=0; i<PROFILES; i++) {
    var imgData = {
      id : i,
      alpha: alpha,
      parts : PARTS,
      height : canvas.height,
      width : canvas.width
    }

The variable PROFILES contains the value of the relevant input field and controls how often the central for loop is run. The imgData variable is created with the altitude values of the image (alpha), the number of sections (PARTS), the canvas height (height), and canvas width (width), plus an ID (id), with the latter being required as reference for the profiles. Then the program logic divides itself into the part working with web workers and the part without web workers:

if (USE_WORKER) {
  imgData.useWorker = true;
  var worker = new Worker('canvas_profile.js');
  worker.postMessage(imgData);
  worker.onmessage = function(evt){
    if (evt.data.task == 'update') {
      progress.item(evt.data.id).value = evt.data.status;
    } else if (evt.data.task == 'newMin') {
      $('progDivMin'+evt.data.id).innerHTML = evt.data.min;
    } else if (evt.data.task == 'newMax') {
      $('progDivMax'+evt.data.id).innerHTML = evt.data.max;
    } else {
      showResults(evt);
    }
  };
}
else {
  imgData.useWorker = false;
  showResults(
    onmessage({data:imgData})
  );
  progress.item(i).value = PARTS;
}

In the first case, a new worker is created and activated with postMessage(). The entire data structure of the imgData variable is passed to it. Then an event listener is defined, which receives four different message types. Messages of the type update will update the progress bar, and newMin and newMax reset the relevant altitude values on the website. All other messages call the showResults() function, which works out the time of the calculation and displays it with the number of points on the altitude profile.

If the call is to be started without workers, the onmessage() function of the external JavaScript file is started, with the imgData variable wrapped into the data attribute of a JavaScript object. This is useful because the postMessage() call in the worker also wraps data into such a structure, and we therefore do not need to further adapt the external code.
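The idea of one function serving both call paths can be reduced to a few lines. This is a simplified sketch, not the code of canvas_profile.js: the handler is named handleMessage here (in a worker file it would be assigned to onmessage), and it merely sums a hypothetical values array instead of computing a profile.

```javascript
// Hypothetical handler in the style of canvas_profile.js: it sums the
// values it receives instead of computing an altitude profile.
var handleMessage = function(evt) {
  var total = 0;
  for (var i = 0; i < evt.data.values.length; i++) {
    total += evt.data.values[i];
  }
  var d = { id: evt.data.id, total: total };
  if (evt.data.useWorker) {
    postMessage(d);       // inside a worker: reply with a message
  } else {
    return { data: d };   // direct call: same shape as a message event
  }
};

// Direct call without a worker; the argument imitates a message event
var result = handleMessage({
  data: { id: 1, values: [1, 2, 3], useWorker: false }
});
console.log(result.data.total);
```

Because the return value is wrapped in a data attribute as well, the code that displays the result can treat both cases alike.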

The external JavaScript file canvas_profile.js starts with the onmessage() function. In the notation shown here, this function has a double purpose: as an event handler for the worker’s message event and also as a global function, which we can call without a worker. In it, the random points for the individual sections are created:

onmessage = function(evt) {
...
  var p1 = [Math.round(Math.random()*(evt.data.width-1)),
           Math.round(Math.random()*(evt.data.height-1))];
  for (var i=1; i<evt.data.parts; i++) {
    var p2 = [Math.round(Math.random()*(evt.data.width-1)),
           Math.round(Math.random()*(evt.data.height-1))];
    var len = Math.sqrt((Math.pow(p2[0]-p1[0],2)
      +Math.pow(p2[1]-p1[1],2)));
    var profile = [];
    for (var j=0; j<len-1; j++) {
...
      var h = getHeight([x,y]);

The length in pixels (len) between the two random points (p1 and p2) is calculated via the Pythagorean theorem, using the JavaScript function Math.sqrt() (for the square root) and Math.pow() (for squaring). Then a second loop runs over all pixels along this route and extracts the altitude value from the array:

var getHeight = function(p) {
  var pos = ((parseInt(p[1])*evt.data.width) +
               parseInt(p[0]));
  return evt.data.alpha[pos] * equidistance;
};

To determine the desired position within the one-dimensional array of alpha channel values, we need to multiply the y-value by the canvas width and then add the x-value. The attentive reader will have noticed another detail: Before returning the determined value, it is multiplied by the variable equidistance. The reason is that we can only save 256 different values per channel in an 8-bit image file. But the area around Innsbruck, Austria, has an altitude difference of more than 256 meters, so the altitude in this PNG image is specified in steps of 20 meters.
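The index calculation and the scaling can be tried on a tiny grid. The following is a sketch under assumptions: heightAt() and sectionMinMax() are hypothetical helpers, Math.floor() replaces parseInt() (equivalent for non-negative coordinates), and the linear interpolation between the two points is an assumption, because the listing above elides that step.

```javascript
var EQUIDISTANCE = 20;  // meters per alpha step, as stated in the text

// Hypothetical helper: looks up the altitude at pixel (x, y).
function heightAt(alpha, width, x, y) {
  var pos = Math.floor(y) * width + Math.floor(x);  // row offset + column
  return alpha[pos] * EQUIDISTANCE;
}

// Hypothetical helper: samples one section from p1 to p2.
// Linear interpolation between the points is an assumption.
function sectionMinMax(alpha, width, p1, p2) {
  var len = Math.sqrt(Math.pow(p2[0] - p1[0], 2) +
                      Math.pow(p2[1] - p1[1], 2));
  var min = Infinity, max = -Infinity;
  for (var j = 0; j < len - 1; j++) {
    var t = j / len;  // relative position along the section
    var h = heightAt(alpha, width,
                     p1[0] + t * (p2[0] - p1[0]),
                     p1[1] + t * (p2[1] - p1[1]));
    if (h < min) { min = h; }
    if (h > max) { max = h; }
  }
  return { min: min, max: max, length: len };
}

// A 3x3 "image" with the alpha values 0..8, stored row by row
var grid = [0, 1, 2, 3, 4, 5, 6, 7, 8];
var single = heightAt(grid, 3, 1, 2);
var section = sectionMinMax(grid, 3, [2, 2], [0, 0]);
console.log(single, section.min, section.max);
```

With an equidistance of 20 meters, the 256 possible alpha values cover an altitude range of roughly 5,100 meters, at the price of a coarser vertical resolution.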

If a new minimum value along a profile line is found, the calling script is notified accordingly:

if (h < min) {
  min = h;
  if (evt.data.useWorker) {
    postMessage({task:'newMin',min:min,id: evt.data.id});
  }
}

The same applies of course for new maximum values. At the end of each loop over all sections, the progress bar is updated, and as soon as all sections have been calculated, the result, wrapped in the variable d, is sent back to the main script. If the script is executed as a worker, the data is sent with postMessage(); without a worker, the result is returned to the calling function with return:

if (evt.data.useWorker) {
  postMessage({task:'update', status:i, id:evt.data.id});
}
...
if (evt.data.useWorker) {
  postMessage(d);
}
else {
  return {data:d};
}

The client-side analysis of image data conserves server capacity and network bandwidth. Provided there is suitable hardware on the client side, this could give users the option of digitizing altitude profiles on an image with an alpha channel and then graphically representing these in real time.

If this has whetted your appetite for web workers, please do not forget that using workers requires more resources than scripts working without workers. In particular, data transfer with messages between a worker and the calling script is slower than direct access to the resources within a single script.

Summary

This chapter introduced the concept of scripts running in parallel in the browser. In desktop applications these are known as threads; in the browser they are called web workers. Access to the elements of the website is subject to certain restrictions, but information can comfortably be exchanged between the calling script and the individual workers through the concept of message passing.

Web workers are particularly useful for large web applications where processes are running in the background and should not block user input. Think for example of automatic saving while you are working on a document or coloring source code while you create it, as demonstrated by Mozilla’s Web Editor Ace.
