7
Preventing Findability Roadblocks

Some types of content can be difficult or impossible for search engines to index. JavaScript, Ajax, Flash, audio, and video all pose unique findability challenges. But if you build intelligently you can create sophisticated interfaces with rich content without compromising search engine visibility.

Several content formats and interface scenarios might enhance the user experience, but create serious roadblocks for search engine spiders. Sites that are built entirely in Flash, though richly interactive, are often opaque to search engines. Like Flash, JavaScript-dependent interfaces often incorporate animation and sophisticated interactivity, but may make search engine indexing difficult or impossible if built improperly. Audio and video content also present huge roadblocks for search engine spiders.

With a little knowledge and forethought you can build websites that incorporate all of these great technologies without sacrificing the findability of your site. In this chapter we’ll identify the specific challenges these technologies pose for search engines, and examine practical solutions that will unlock your content for indexing. As we discovered in Chapter 2, “Markup Strategies,” search engine optimization and accessibility go hand in hand. The SEO strategies we’ll explore in this chapter will also make your content more accessible for users with disabilities and those using alternate platforms, thus broadening your potential audience.

Avoiding JavaScript Pitfalls

There’s been a boom recently in highly interactive interface design as dozens of JavaScript libraries have been released that make creating animation and various effects easy. MooTools (http://mootools.net/), Scriptaculous (http://script.aculo.us/), YUI (http://developer.yahoo.com/yui/), and Prototype (http://www.prototypejs.org/), to name just a few of the most popular libraries, have made the JavaScript that drives these systems more accessible to novice and intermediate scripters without extensive knowledge of the language.

Ironically, though JavaScript libraries have helped fuel innovations in interface design, they’ve also helped make it easier for front-end Web developers to degrade the findability of their site. It’s easy to be seduced by the bells and whistles JavaScript libraries offer, but don’t forget to be mindful of how search engines will view your site.

Search engine spiders are unable to read and execute JavaScript. If the content of your site is not accessible without the use of JavaScript it won’t be properly indexed. It’s easy to find out if your site has indexing pitfalls: Simply disable JavaScript in your browser and try to access all content. You can disable JavaScript in most modern browsers using the preferences panel. If you discover any content you cannot navigate to or view, then you’ll need to make some changes to help search engines do their job and ensure your content is accessible to your users.

The most common situations where JavaScript might pose a hindrance to indexing are scripted navigation systems, scripted CSS styles that are by default set to display:none, and content that is loaded via Ajax—see the section in Chapter 6 entitled “Using the Google Search Ajax API” for an explanation of Ajax. Each of these scenarios can be easily resolved by progressively enhancing your interface.

To get up to speed on JavaScript and Ajax, read Jeremy Keith’s books DOM Scripting (http://domscripting.com/), published by Friends of ED, and Bulletproof Ajax (http://bulletproofajax.com/), published by New Riders.

Progressive Enhancement

Progressive enhancement is a layered approach to interface development that helps ensure all users can navigate and read the content on a website regardless of browser or device limitations. The key to progressive enhancement is keeping the structure, presentation, and behavior separate (see FIGURE 7.1).

Figure 7.1 By keeping the structure (HTML), presentation (CSS), and behavior (JavaScript) separate when you build websites, content remains accessible to search engine spiders.


Before creating intricate JavaScript behaviors for your interface, begin with a semantically meaningful HTML document that communicates the information hierarchy of your page. Use heading, strong emphasis, emphasis, lists, and other semantic tags to mark up your document so search engines can understand your content (see Chapter 2 for further details).

Next, add a presentational layer using an external CSS file that creates the design of the document without altering the HTML code. Then add a behavior layer of JavaScript that enhances the interactions of the page. The JavaScript is also kept external in a separate file, and can easily take control of elements in the page without mixing in with the HTML structure, as we’ll see in some examples in this chapter.

Building your documents with this additive approach keeps them functional with each step. Without CSS and JavaScript enabled, the user can still see the content and understand the information hierarchy as it’s communicated through your semantic markup. When CSS is enabled, the interface design is enhanced. When JavaScript is enabled, the interactions of the document are enhanced.

When search engines encounter sites built using progressive enhancement they’ll be able to index all of the content because JavaScript will no longer be a roadblock. Externalizing CSS and JavaScript also improves the speed at which search engines can index your page. If the three layers were integrated, search engines would be required to download JavaScript with HTML even though they can’t read it. When it’s separate, it can be ignored. CSS can be downloaded once by search engine spiders and cached for faster reference with each additional page crawl.

As Chapter 3 explained, you can configure your server to tell search engine spiders to cache your external CSS and JavaScript files. This can greatly increase the speed at which spiders can index your site. It’s just one more reason why keeping structure, presentation, and behavior separate is good for SEO.
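As a reminder of what such a configuration looks like (the directives below come from Apache’s mod_expires module; the one-week lifetime is an arbitrary placeholder, not a recommendation from Chapter 3), an .htaccess fragment might read:

```apache
# Requires Apache's mod_expires module; lifetimes shown are placeholders
<IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType application/javascript "access plus 1 week"
</IfModule>
```

With headers like these in place, a spider that has already fetched your CSS and JavaScript can skip them on subsequent page requests.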

Of course, keeping things separate makes maintenance easier as well. When you make a change in your external CSS or JavaScript files, every page that links to them will inherit the update automatically.

Now that you are privy to the benefits of progressive enhancement and how it will serve the findability of your site, let’s take a look at some practical examples of how you can prevent problems with your navigation, scripted styles, and content loaded with Ajax.

Solving JavaScript Navigation Problems

Navigation systems that require JavaScript to function and don’t gracefully degrade into a viable alternative when JavaScript is disabled should be avoided at all costs. When search engines encounter such systems they are unable to index all pages and much of the hard work that went into creating engaging content will be lost. It’s perhaps the fastest way to destroy all search engine referrals to your site. Disable JavaScript and browse your site; if you can’t get around, then you’ve got problems to resolve.

Dropdown menu systems are a popular navigation approach that let users dig down into subsections of a site without requiring a lot of clicks. If you’re using this type of navigation system on your site, make sure it’s accessible without JavaScript enabled. Legacy dropdown menu systems significantly hinder navigation when JavaScript support is unavailable.

James Edwards—known on the Web as Brothercake—has created an accessible version of the classic dropdown navigation system called Ultimate Dropdown Menu (http://www.brothercake.com/dropdown/). It uses progressive enhancement to deliver usable versions of the system regardless of whether JavaScript is supported (see FIGURE 7.2).

Figure 7.2 James Edwards’ Ultimate Dropdown Menu (http://brothercake.com/dropdown/) is a search engine friendly, accessible navigation system that is easy to integrate into a site, and can be styled to match your design.


Edwards’ menu passes both the U.S. government’s Section 508 accessibility guidelines and the W3C’s Web Content Accessibility Guidelines (WCAG), which means it’s highly accessible to disabled users and search engines. It’s a good replacement option for JavaScript-dependent systems.

Alternatively, you can create your own search engine friendly navigation system using CSS instead of JavaScript. Rachel Andrew has written a great article on SitePoint called “Nifty Navigation Using CSS” (http://www.sitepoint.com/article/navigation-using-css), which provides a number of practical examples of dropdown menu systems. Although the navigation systems are a bit simpler than Ultimate Dropdown Menu, they’re still quite accessible and search engine friendly.

Solving Scripted Style Problems

Web pages that use JavaScript to manipulate the style of a page pose a potential problem for search engines and users. If JavaScript is disabled, your CSS should still display all of the content on the page so users can see it and search engines can index it without mistaking it for dishonestly cloaked content.

In order to dishonestly achieve higher search rankings, some sites stuff their pages with keywords. This black-hat SEO trick is often accomplished by hiding keywords from users with the display:none CSS property. Search engines are wise to these sorts of tricks, and sometimes ban sites from their listings when they see dishonest content cloaking.

JavaScript interfaces sometimes need content hidden by default so it can be revealed when the user requests it. Any element that is by default styled by your CSS with display:none and is shown when a JavaScript event changes the style to display:block runs the risk of looking spammy to search engines. It’s also a significant accessibility problem as it will be invisible to users who have JavaScript disabled. Of course, users can’t find content that isn’t visible.

A common example of this type of interface scenario is a show/hide element that reveals some content when a user chooses to expand it. If you create a series of such boxes on a page and style them to be collapsed by default using CSS they could look like keyword spam to search engine spiders, and will be invisible to users with JavaScript disabled. Here’s an example of some code that could cause such problems:

CSS

.hide {display:none;}

XHTML

<div class="display-box">
    <a href="#" onclick="showText();">Expand</a>
    <div class="hide">This text is inaccessible to users without
JavaScript support, and might look like dishonest cloaking to
search engines.</div>
</div>

The text inside <div class="hide"> is by default hidden using display:none. Because JavaScript would be required to change the style of this <div> tag to be visible, it’s inaccessible when JavaScript is unsupported and search engines might mistake it for cloaked content. A better approach to this common problem is to style the <div> to be visible by default and then, if JavaScript support is available, collapse the box.

The above example mixes the JavaScript behavior into the HTML structure. It’s preferable to keep them separate so search engines don’t have to waste time downloading JavaScript code that is of no use to them. In the example that follows, the JavaScript onclick behavior will be attached to the expand/collapse link from an external file instead of mixing behavior and structure.

FIGURE 7.3 illustrates a progressive enhancement approach to a series of expand/collapse boxes in which content is visible without CSS or JavaScript support. The page is progressively enhanced with style and behavior as support is available.

Figure 7.3 The same document can be progressively enhanced from basic HTML, to a styled design, to an interactive interface. When CSS or JavaScript are disabled, the content is still accessible and search engines won’t mistake it for cloaked content.


To build a more accessible, search engine friendly version of the page, start by creating an HTML document that keeps JavaScript and CSS external and uses some logical class names to identify the <div> tags to be manipulated. This example has three collapsible boxes, but the JavaScript we’ll write for the page behavior will be intelligent enough to accommodate as many or as few of these elements as you like.

    …
    <div class="display-box">
        <div class="expand">This text will be accessible to users
with or without JavaScript support, and won’t look like cloaked
content to search engines.</div>
    </div>

    <div class="display-box">
       <div class="expand">This is still more text that will be
accessible to users with or without JavaScript support, and won’t
look like cloaked content to search engines.</div>
    </div>

    <div class="display-box">
        <div class="expand">Watch out for <code>display:none</code>. It sends the wrong message to search engines.</div>
    </div>

Notice that there are no links in the page to make the boxes expand and collapse. This is because they are only relevant when JavaScript is enabled, so they shouldn’t be present if it’s not. These links would also unnecessarily pollute the keyword density of the page if they were present when search engines spider the content. Search engines could mistakenly perceive the link label “Expand” as an important keyword in the page as it would be repeated often. Instead, the JavaScript will dynamically write these links to the page.

With the HTML structure in place, add some CSS in an external file. Create a class to expand and collapse the content. By default the HTML document shows the content expanded.

.expand {display:block;}
.collapse {display:none;}

The last step in a progressive enhancement workflow is the behavior layer defined by JavaScript. Because the HTML document includes a few class names, the JavaScript should be able to walk the page and manipulate the desired elements. The JavaScript will be placed in an external file called expand-collapse.js.

Inside the external JavaScript file a function is created so the page behavior can be invoked easily when the page is done loading. This function first checks to see if the browser supports JavaScript sufficiently to find the elements in the page by tag name. If it can’t, then we want to stop the script from proceeding, to avoid errors.

function expandCollapseBoxes(){
    if(!document.getElementsByTagName){return;}

};

The function will find all of the <div> tags in the page with the class expand and run a loop to change them to collapse. This will close all of the content boxes, which are left open for search engines by default.

function expandCollapseBoxes(){
    if(!document.getElementsByTagName){return;}
    var divs = document.getElementsByTagName("div");
    for (var i=0; i < divs.length; i++){
        if(divs[i].className == "expand"){
            divs[i].className = "collapse";
        }
    };
};

Next, inside the loop the link tags that toggle the expand/collapse functionality are dynamically added to each display box. When each link is created it’s assigned a class so it can be styled. Though the link won’t navigate to another page, it’s given an href attribute so it will look and behave like an interactive link in the browser. A label is added to the link, then it’s drawn into the page.

var closeLink = document.createElement("a");
closeLink.className = "close-box";
closeLink.href = "#";
closeLink.innerHTML = "Expand";
divs[i].parentNode.insertBefore(closeLink,divs[i]);

Following the dynamic link-generation code inside the loop, the link behavior is added. Inside the onclick function a variable is created to capture a reference to the <div> tag that is to be manipulated when the link is clicked. Then a conditional is used to toggle between expanding and collapsing the target <div>. If the text in the link is “Expand,” then the function should expand the box and change the link label to “Collapse.”

closeLink.onclick = function(){
    var displayBox = this.parentNode.getElementsByTagName("div")[0];
    if(this.innerHTML == "Expand"){
        this.innerHTML = "Collapse";
        displayBox.className = "expand";
    }else{
        this.innerHTML = "Expand";
        displayBox.className = "collapse";
    }
};

Here’s what the function looks like when it’s completed:

function expandCollapseBoxes(){
    if(!document.getElementsByTagName){return;}
    var divs = document.getElementsByTagName("div");
    for (var i=0; i < divs.length; i++){
        if(divs[i].className == "expand"){
            divs[i].className = "collapse";

            // Build close link
            var closeLink = document.createElement("a");
            closeLink.className = "close-box";
            closeLink.href = "#";
            closeLink.innerHTML = "Expand";
            divs[i].parentNode.insertBefore(closeLink,divs[i]);

            // Create link behavior
            closeLink.onclick = function(){
                var displayBox = this.parentNode.getElementsByTagName("div")[0];
                if(this.innerHTML == "Expand"){
                    this.innerHTML = "Collapse";
                    displayBox.className = "expand";
                }else{
                    this.innerHTML = "Expand";
                    displayBox.className = "collapse";
                }
            };
        }
    };
};

All that’s left to do is link the HTML page to the external JavaScript file. To do this, add a script tag before the close </body> tag in the HTML page. You can call the expandCollapseBoxes() function with an onload event listener to make sure the JavaScript doesn’t attempt to manipulate the page until all elements are loaded into the browser.

<script type="text/javascript" src="js/expand-collapse.js"></script>
<script type="text/javascript" charset="utf-8">
    window.onload = expandCollapseBoxes;
</script>

Another benefit of using this sort of progressive enhancement approach is that it makes it very easy to add to any page. Now that you have this expand/collapse interface behavior built, in future projects all you need to do to use it is create elements in the page with the appropriate class names and link to the JavaScript. It’s instant interface functionality that preserves the findability of your content.

Solving Ajax Problems

Web pages that use Ajax to access and display content pose yet another search engine visibility problem. Ajax is simply JavaScript that can pass information to and from a server. It’s a bit like a dumbwaiter that can connect a client-side interface to server-side scripts. It’s often used in Web applications to speed up basic data manipulation tasks like reordering user input or editing text inline. Since no page refresh is required to store the user’s changes, the interface behaves very much like a desktop application with nearly instantaneous reactions.

Because only humans perform these types of information storage tasks, they typically have no adverse effect on search engine optimization. Ajax creates SEO problems when it is used to load content into the page. Because search engine spiders can’t run JavaScript, any content loaded by Ajax won’t be indexed.

There are two solutions to the Ajax content indexing problem: First, if your site’s use of Ajax is not essential to the user experience, consider eliminating it. If you’re using the technology for the cool factor rather than to speed up interface reaction time, then it’s probably not worth risking search engine visibility.

It is possible, though, to use Ajax in a search engine friendly way. If it is important for your site to load content via Ajax, you can progressively enhance your interface to work with or without JavaScript support, so search engines can still index it.

A Progressively Enhanced Ajax Catalog System

Let’s take a look at a progressively enhanced example that uses Ajax to create a single-page products catalog that users can browse quickly without a page refresh (see FIGURE 7.4). With a little bit of tweaking, this catalog example could be converted into a full ecommerce system.

Figure 7.4 This simple products catalog uses Ajax to load information without refreshing the page. If the browser doesn’t support JavaScript, the progressively enhanced interface will instead take the user to a separate product page where the content is loaded with a page refresh required.


This interface will let users browse a catalog of an artist’s pottery showing the title, a description, and a product image. When JavaScript is enabled, the links in the navigation that would normally go to another page are disabled. An Ajax content loading behavior is then attached to each link to make the browsing experience very fast. Using Ajax, no page refresh is required to view any of the products.

If JavaScript is not supported, as would be the case during search engine indexing, the links would simply navigate to a separate products page where the user could browse the catalog but the page would need to refresh for each product. FIGURE 7.5 illustrates the basic workflow the system will follow.

Figure 7.5 The catalog system should function with or without JavaScript support so search engines can index the content properly.


Both of the browsing scenarios in this example will use the same PHP script to grab content from a database. It’s never any fun to double your development efforts to support different browsing situations. With a little forethought you can make your PHP script do double duty for you, which also simplifies maintenance.

To start, create a database on your server called ajax_products with the following fields: prodid (INT, auto increment, primary key), title (TEXT), description (TEXT), image (TEXT). Add a few product records to your database to work with.

You’ll also need to set up a directory structure that matches the one shown in FIGURE 7.6. To simplify the Ajax interactions we’ll use the popular JavaScript framework Prototype (http://www.prototypejs.org/), so you’ll need to download it and place it in the js folder.

Figure 7.6 To begin building the search engine friendly Ajax catalog, create a directory structure as shown.


Building the Catalog’s HTML Structure

Following the same progressive enhancement workflow shown earlier, in the section “Solving Scripted Style Problems,” we’ll start the catalog with a well-structured HTML page. This document connects to an external style sheet inside <head>, and two external JavaScript files at the bottom of the page before the close </body> tag. Placing the JavaScript at the bottom of the document will make it render faster for users and search engines. The page can’t begin to render while it’s waiting for external JavaScript linked within the <head> tag to load. Putting your script tags at the bottom of the page is a simple technique that can speed up the user experience and help search engines index your pages faster.

Here’s the basic structure of the index.html page of the catalog:

    …
    <h1>Pottery Catalog</h1>

    <ul id="catalog-nav">
        <li><a href="products/1">Casserole Dish</a></li>
        <li><a href="products/2">Candy Dish</a></li>
        <li><a href="products/3">Salt and Pepper</a></li>
        <li><a href="products/4">Soup Bowl</a></li>
    </ul>

    <div id="product-display">
        Some introductory content would go here.
    </div>

    <script type="text/javascript" src="js/prototype.js"></script>
    <script type="text/javascript" src="js/progressive-ajax.js"></script>

All of the navigation will link to a single products.php page that will load different product information—no Ajax required—based on the product id it’s passed. But notice that the path in the anchor tag doesn’t seem to be pointing to a file called products.php. As we learned in Chapter 3 in the section “Building Search Engine Friendly URLs,” a URL like http://example.com/products.php?prodid=3 can create search engine indexing problems. This example will use the URL rewriting techniques covered in Chapter 3 to make the URLs more search engine friendly.

If you decide to convert this example into a real catalog for your site, you’ll want to have PHP generate the navigation for you with a query to the database and a loop. As it stands now, when a new product is added you’d need to manually update the navigation system.
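Hypothetically, that generation loop would look something like the following, sketched in JavaScript rather than PHP so it matches the chapter’s client-side examples (the buildCatalogNav name and the array-of-rows input are my own illustration, not part of the catalog’s code):

```javascript
// Build the catalog's nav markup from rows pulled out of the
// ajax_products table (each row has a prodid and a title)
function buildCatalogNav(products) {
    var html = '<ul id="catalog-nav">\n';
    for (var i = 0; i < products.length; i++) {
        html += '    <li><a href="products/' + products[i].prodid + '">' +
                products[i].title + '</a></li>\n';
    }
    return html + '</ul>';
}
```

A PHP version would run the same loop over a mysql_fetch_array() result set so new products appear in the navigation automatically.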

The next step in a progressive enhancement workflow would be to create a presentation layer of CSS for the page. Since you’re probably already a CSS guru, I’ll skip this step and let you style your catalog as you like.

Content Retrieval with PHP

Following a progressive enhancement philosophy, let’s first make the catalog work without JavaScript by creating the products.php file alluded to earlier. Save a copy of the index.html page as products.php. You can remove the two script tags from the new products page. No need for JavaScript here as this is the fallback solution if the user’s browser doesn’t support it.

Replace the default text inside <div id="product-display"> with the following PHP that will separate the product id from the URL then pass it over to a PHP script that fetches product information.

<div id="product-display">
      <?php
      require_once("inc/getProduct.php");
      echo getProduct($_GET["prodid"]);
      ?>
</div>

Remember, we’re not using a query string in the navigation links to pass the product id, but this script is still accessing a mysterious GET variable called prodid. Using the Apache server module mod_rewrite we’ll direct URLs like products/1 to URLs like products.php?prodid=1. Although users and search engines will see the simpler URL structure, the PHP script will be able to access $_GET variables in an invisible query string. We’ll build the getProduct.php file referenced in this script and the getProduct() function it will contain shortly.

To correctly remap the search engine friendly URLs to the products.php file with a trailing query string, create a .htaccess file with the following rewrite rule and place it in the directory on your server where your catalog system is stored.

RewriteEngine On
# Catches URL with or without a trailing slash
RewriteRule ^products/([0-9]+)/? products.php?prodid=$1

This rewrite rule, which is very similar to those covered in Chapter 3, should now successfully route your navigation system to the products.php page.
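To make the rule’s behavior concrete, here is the same pattern applied in JavaScript (purely an illustration—Apache, not your own code, performs the real rewrite, and the function name is hypothetical):

```javascript
// Mirror of the mod_rewrite rule: products/1 -> products.php?prodid=1
function rewriteProductUrl(url) {
    var match = url.match(/^products\/([0-9]+)\/?$/);
    if (!match) { return null; } // not a product URL; leave it alone
    return "products.php?prodid=" + match[1];
}
```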

All of the database interactions for the JavaScript-enabled and -disabled versions of the catalog system will be handled by a file called getProduct.php stored in the include folder called inc. It contains a single function called getProduct() that receives a product id in order to retrieve the desired record. Here’s how it starts:

<?php
function getProduct($prodid){
    if(!$prodid){
        echo "Oops! We ran into some trouble getting info about this product.";
        exit;
    }

    $con = mysql_connect('db host', 'your username', 'your password');
    mysql_select_db('db name', $con);

    $prodid = mysql_real_escape_string($prodid);
    $result = mysql_query("SELECT * FROM ajax_products WHERE prodid='$prodid'");
}

Be sure to change the host, username, password, and database name to reflect your database’s access information.

If no product id is supplied, the function will display an error message, and halt the script’s progress. If a product id has been supplied, it’s cleaned up using mysql_real_escape_string() to prevent SQL injection attacks on the database (see http://us3.php.net/manual/en/security.database.sql-injection.php to learn more about SQL injection attacks). Once the potentially dangerous characters in the product id are escaped it can be used in a query to pull the record.
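The same defensive idea can be taken a step further: since a legitimate product id is nothing but digits, you can reject anything else outright before it ever reaches a query. Here’s a hypothetical JavaScript sketch of that check (not part of the chapter’s PHP):

```javascript
// A product id should be a string of digits; anything else is suspect
function isValidProductId(prodid) {
    return /^[0-9]+$/.test(String(prodid));
}
```

Validating against a whitelist pattern like this is generally safer than trying to escape every dangerous character after the fact.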

After the query is run, you’ll need to make sure a result is returned before trying to write it to the page.

if(mysql_num_rows($result) == 0){
    echo "Sorry, we couldn't find any information about this product.";
}

while($row = mysql_fetch_array($result)) {
    $content = '
    <h3>'.$row['title'].'</h3>
    <p>
    <img src="i/'.$row['image'].'" alt="'.$row['title'].'" />
    '.$row['description'].'
    </p>
    <p>
    <a href="products/'.$row['prodid'].'" title="Right click or control click to copy URL">Link to this page</a>
    </p>';
}

return $content;

The conditional and loop are pretty basic, but notice the link at the bottom of the $content variable. Ajax interfaces that load content into a single page only use one URL for all content. This makes it difficult for users to bookmark or email links that preserve the state of the page, and degrades a site’s findability. Google Maps, a paragon of Ajax-powered interfaces, attempts to solve the problem by providing a direct link to search results that preserves the state of the page. Because we’ve already set up the products.php page to show individual records based on the parameters in the URL, it’s pretty easy to add a similar feature to this interface. In Figure 7.4 you can see at the bottom of the page a link labeled “Link to this page” that will make it easier for users to share or bookmark content.

When all of the content has been retrieved and assembled into a basic HTML structure, it’s returned to the place where the getProduct() function was called. In products.php the function is called explicitly and the results are written to the page, but when an Ajax request is made directly to getProduct.php, nothing calls the function. The script will need to be smart enough to detect when an Ajax request is being made so it can trigger the function automatically. All it takes is a simple conditional at the very end of the script:

if($_GET['ajax']=='true'){ echo getProduct($_GET['prodid']); }

We’ll set the $_GET['ajax'] variable when we build the Ajax call. That’s it for the PHP. The catalog should now work perfectly for search engines without support for JavaScript. The last piece of the system is to use JavaScript to disable the navigation links and use Ajax to load content into the page for a super-fast browsing experience.

Adding Ajax to the Catalog

Create an external JavaScript file called progressive-ajax.js and save it in the js folder. Inside this file create two functions: one to initialize the page behaviors, and another to handle the Ajax communication with the server:

window.onload = init;

function init(){

};

function loadProduct(prodId){

}

The init() function will be called automatically when the page loads. It will deactivate all of the links in the navigation and assign the alternate Ajax behavior. The loadProduct() function must receive a product id so it can in turn pass this number on to the getProduct.php script.

The init function will first determine to what degree the browser supports JavaScript. If it doesn’t support the getElementsByTagName or the getElementById methods, then the script will have to terminate. If sufficient JavaScript support is detected, the init() function will find all of the links in the navigation system and attach the Ajax loading behavior. JavaScript can locate the navigation system by its catalog-nav id:

function init(){
    if(!document.getElementsByTagName || !document.getElementById){return;}

    var nav = document.getElementById("catalog-nav");
    var navlinks = nav.getElementsByTagName("a");

    for (var i=0; i < navlinks.length; i++){
        navlinks[i].onclick = function() {
            var urlarray = this.href.split("/");
            loadProduct(urlarray[urlarray.length-1]);
            return false;
        };
    };
};

Inside the loop each link is assigned an onclick event listener. When a link is clicked the script grabs the href attribute value and splits it into an array with values created each time a / is detected. The URLs in each href look something like this: products/1. The last value in the array will be the product id. To access this value, the script finds the number of elements it contains using the JavaScript keyword length, and subtracts 1 from it, since arrays begin indexing at 0. So urlarray[urlarray.length-1] will contain the product id we need to pass to the loadProduct() function.
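That split-and-take-the-last-element logic is easy to verify in isolation; here’s the same idea factored into a small helper (the function name is mine, not part of progressive-ajax.js):

```javascript
// Pull the product id off the end of an href like "products/4"
// or "http://example.com/products/4"
function productIdFromHref(href) {
    var urlarray = href.split("/");
    return urlarray[urlarray.length - 1];
}
```

Note that the browser may report this.href as a fully qualified URL, but because the id is always the final path segment the last array element is correct either way.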

The return false in the last part of the loop prevents the links from navigating away from the page when clicked. All that remains to complete the Ajax catalog is the loadProduct() function, which will handle the Ajax communication with the getProduct.php script on the server. Here’s what it will look like:

function loadProduct(prodId){
      var display = document.getElementById('product-display');
      display.innerHTML = 'Loading …';
      var url = 'inc/getProduct.php';
      var pars = 'prodid='+escape(prodId)+'&ajax=true';
      var myAjax = new Ajax.Updater(display, url, {method:'get', parameters:pars});
}

The function starts by defining which tag will display the results of the Ajax request. In the HTML document a <div> tag with the id product-display has already been created just for this purpose. If the user’s network connection is slow, a loading message will be written to the display <div> while the page is waiting to receive the results from the database query.

Next, a few variables are defined. The url variable identifies the path to the PHP script that will receive the Ajax request. The pars variable defines the parameters that will be passed in a query string to the getProduct.php script. Notice that one of the parameters is ajax=true. At the end of the getProduct.php script a conditional was added to check whether the request was made by the Ajax script so the getProduct() function could be triggered automatically. The ajax=true parameter is what the PHP script will be looking for.
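As a hedged illustration (the function names here are mine, and the parsing is a simplified stand-in for what PHP performs automatically on an incoming request), this sketch builds the query string the way loadProduct() does and shows how the prodid and ajax parameters come back out on the server side:

```javascript
// Build the query string the way loadProduct() does. The function
// names here are mine, for illustration only.
function buildParams(prodId) {
    return 'prodid=' + escape(prodId) + '&ajax=true';
}

// A simplified stand-in for the parameter parsing PHP performs
// automatically on an incoming query string.
function parseParams(queryString) {
    var params = {};
    var pairs = queryString.split('&');
    for (var i = 0; i < pairs.length; i++) {
        var pair = pairs[i].split('=');
        params[pair[0]] = unescape(pair[1]);
    }
    return params;
}

var pars = buildParams(3);        // "prodid=3&ajax=true"
var parsed = parseParams(pars);   // { prodid: "3", ajax: "true" }
```

The escape()/unescape() pair mirrors the original code; in modern scripts encodeURIComponent()/decodeURIComponent() are the preferred equivalents.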

The JavaScript framework Prototype, to which the HTML page already links, simplifies the Ajax communication to a single line of code. The last line of this function creates a new Ajax object that will send a request to the PHP script using the get method to transmit the variables. The data that is returned will automatically be written to the <div id="product-display"> tag.

The Ajax catalog is now all set and ready for fast browsing. Search engines will appreciate the simple URLs that include no messy query strings and the graceful way the JavaScript will degrade to facilitate indexing of the site. Users will appreciate how quickly they can navigate through content without pesky page loads that can slow down their browsing experience.

Is Ajax Worth the Trouble?

Is adding Ajax to your site worth the potential findability pitfalls? Ultimately, you’ll need to evaluate your project and your users’ needs to decide for yourself. But with progressively enhanced Ajax there’s no reason you can’t support the needs of your users and search engines simultaneously.

This example doesn’t sacrifice search engine visibility. All of the content is accessible to search engine spiders and can be thoroughly indexed. Although the lack of unique URLs in Ajax systems is a potential findability pitfall, this system compensates by providing users with a link for each product page so they can bookmark it or email it to a friend. Inbound links that users create build your search engine page ranking, but you’re not likely to receive many if you don’t provide unique URLs for your content.

If the speed that Ajax can bring to your user interface is important to the objectives of your site, then don’t shy away from it because of fears of ruining your site’s search engine optimization. It is possible to have your cake and eat it too.

Findable Flash

Flash is often demonized as an SEO death sentence for websites. In reality it’s not the tool that should be receiving the criticism so much as the way it is improperly used. Flash is no worse for search engine visibility than Ajax. When used with no consideration for how search engines view content, both Flash and Ajax can, in fact, significantly hinder search engine indexing. But just as we saw with Ajax, if you use your noodle and follow a few best practices when building Flash sites, you won’t sacrifice search engine traffic.

After receiving some criticism a few years ago for Flash’s poor search engine support, Macromedia—now Adobe—released a software development kit (http://www.adobe.com/licensing/developer/) that converts SWFs to HTML documents in an effort to make Flash content more search engine friendly. Macromedia’s goal was to provide the major search engines with some helpful tools to be incorporated in page indexing systems to make reading Flash content easier.

To some degree their efforts have paid off. Today, Google actually does index Flash content, but with its own proprietary system. To get a sense of what the Google spider sees when it reads a SWF file, simply do a Google search for any keywords followed by the operator filetype:swf. This will limit your search returns to keywords found inside SWF files. FIGURE 7.7 illustrates some typical search results for content in Flash files.

Figure 7.7 This search, executed using the filetype:swf operator, illustrates that Google can see the content inside SWF files. Some of the files appear to have been poorly translated into HTML by Google’s mysterious SWF translation engine.


Some of the search results in Figure 7.7 contain consistent yet poorly written HTML that suggests it was generated by some SWF-to-HTML translation system, probably part of Google’s search spider. When you compare the text shown in Google’s results page to the content within the SWF files, you’ll discover that there’s not always parity between the two. This is probably because Google also reads meta data stored in the SWF if the developer has included any.

You can add descriptive meta data to your Flash files by simply modifying the document settings before publishing. In Flash, go to Modify > Document and enter a title and description as shown in FIGURE 7.8.

Figure 7.8 You can add meta data to your Flash content by navigating to the document settings in Modify > Document and entering a title and description.


Although Google can read content within Flash, the information hierarchy is often ambiguous to its search spider. This makes achieving good SEO results from Flash content alone difficult. When competing against HTML sites that use semantic tags to communicate information hierarchy, Flash files will consistently rank lower in search results. For a Flash site to achieve high search rankings it needs to use HTML effectively.

Though HTML can make Flash content much more findable, it’s usually an afterthought for most Flash designers. Often SWFs are embedded in pages directly from Flash, then published to the Web without modification to the HTML. By default Flash writes the file name of the FLA into the <title> tag. As discussed in Chapter 2, the <title> tag is an important location for keywords. Publishing your HTML with Flash’s default text in it is a significant SEO opportunity missed.

Flash doesn’t publish any content in the meta tags in its HTML either. Although meta tag keywords are no longer viewed by major search engines, the meta description is shown on search results pages, and can entice users to click to visit your site. It’s another missed opportunity to generate traffic to your site.
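Both gaps are cheap to close before publishing. Here’s a hedged sketch of what the <head> of the published page might contain (the title and description text are placeholders, not from an actual example site):

```html
<head>
    <!-- Replace Flash's default FLA file name with a keyword-rich title -->
    <title>Handmade Pottery Bowls | Example Studio</title>
    <!-- The meta description is shown on search results pages -->
    <meta name="description" content="Handmade pottery bowls, plates, and mugs from Example Studio." />
</head>
```

A few minutes of editing after each publish recovers keyword value that Flash’s default output throws away.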

Adding <title> and meta tag content to the HTML page Flash publishes will help SEO but it’s still not quite enough to be competitive. To create real keyword density and prominence you’ll need to follow a progressive enhancement strategy much like the one discussed earlier for JavaScript.

Rather than thinking of Flash as the structure, presentation, and behavior of a site rolled into one file, consider it the fourth layer on top of the typical progressive enhancement trinity that creates rich interaction (see FIGURE 7.9). It enhances the user experience much like JavaScript does when it’s supported.

Figure 7.9 Flash is a rich interaction layer that should be delivered on top of the typical structure, presentation, and behavior layers of a progressively enhanced website or application.


Start your Flash site with HTML documents that contain the site’s content marked up using the strategies discussed in Chapter 2. Create separate SWFs for each page in your site rather than bundling them all in one. This provides search engines with unique URLs for indexing and creates more opportunity to embed your content in the HTML pages.

With the HTML structure built to best communicate your content, you can add CSS to refine the presentation of the page in which your Flash movie will sit. Flash will be the crowning layer that will sit on top of the HTML structure embedded in the page using a simple JavaScript file called SWFObject.

Using SWFObject for Flash Progressive Enhancement

Geoff Stearns created SWFObject to offer a better way of embedding Flash content into HTML pages. Stearns’ simple JavaScript file detects the Flash plugin before embedding SWF files within a <div> tag or other element you define. The target tag can contain search engine friendly alternative content that will make your site more competitive in search results, but will be replaced when users view the page with the correct plugin. SWFObject progressively enhances the page with Flash content but gracefully degrades to HTML content for search engines. You can use SWFObject to embed smaller Flash movies into your site to create a hybrid of HTML and SWF content, or embed a single SWF that occupies the entire page.

NOTE
Bobby Van Der Sluis has written a brilliant article on the Adobe site about Progressive Enhancement and Flash: http://www.adobe.com/devnet/flash/articles/progressive_enhancement_03.html

Let’s take a look at it in action.

Hybrid Sites

This example will examine a common scenario in which a Flash slideshow is embedded on the home page of a site to catch the audience’s attention by cycling through a series of product promotions and sales (see FIGURE 7.10). The slideshow is intended as a complement to other HTML content on the page.

Figure 7.10 Hybrid sites use Flash in select areas to capture the user’s attention or enhance the interface. In this example Flash is used to cycle through sales and promotions in a slideshow that can present more information than a static image.


Once the general HTML structure is built, a <div> tag needs to be added to the page to define where the Flash content will be displayed. Add an id to the tag to make it easy for SWFObject to target.

<div id="flash-promo">
    <a href="promos/" title="See our latest promotions">
        <img src="images/promo.jpg" alt="Save 10% on the Berry Bowl" longdesc="#slideshow" />
    </a>
</div>

Users who don’t have the Flash player will see the content within <div id="flash-promo">. To make the slideshow gracefully degrade for users, an <img /> tag is added that shows the first promo slide and links to a page where all of the promos can be seen. The full text of the slideshow can be accessed via the longdesc attribute. In Chapter 2’s “Making Images Visible” you saw how the longdesc attribute could link to an area on a page that further describes content within images that would otherwise be invisible to search engines. The same thing happens here with the image substitute for the Flash movie.

Add the full text from the Flash slideshow in a div at the bottom of the page:

<a name="slideshow"></a>
<dl id="slideshow">
    <dt><a href="promos/1/" title="Buy the berry bowl">Slide 1: Berry Bowl Sale</a></dt>
    <dd>Save 10% on the berry bowl April 7-21</dd>
    […]
</dl>

A definition list lends itself well to communicating each slide’s title and content, but you could use any number of markup techniques here. The important part is that a text equivalent is provided for the slides, and markup is used to communicate the information hierarchy.

As the example in Chapter 2 illustrates, you can use some simple CSS to remove the slideshow text from view of sighted users, but it will stay visible to search engines.

#slideshow {text-indent:-9999px; position:absolute;}

To add the Flash layer to the page using SWFObject, first link to the external JavaScript file, then create a new object that will replace the text within the <div id="flash-promo"> tag:

<script type="text/javascript" src="js/swfobject.js"></script>
<script type="text/javascript">
    var so = new SWFObject("slideshow.swf", "myswf", "600", "400", "8", "#ffffff");
    so.write("flash-promo");
</script>

The instantiation of SWFObject includes six parameters to define how it will embed the SWF in your page. Here’s an explanation of each one:

• slideshow.swf: the path to the SWF to be embedded

• myswf: the name for the JavaScript object

• 600: the width of the SWF

• 400: the height of the SWF

• 8: the version of the Flash player to be detected

• #ffffff: the base color for the SWF

Once SWFObject is instantiated, call the object’s write() function, passing the id of the target tag where the SWF should be embedded.

Voila! The search engine friendly HTML will now be replaced with a SWF when Flash is supported. Search engines will never see the SWF because JavaScript is used to write it to the page. There’s no SEO compromise with this Flash embedding solution.

NOTE
To see how SWFObject stacks up to the other Flash embedding techniques, visit http://blog.deconcept.com/swfobject/#whyitsbetter. Bobby Van Der Sluis’s article, entitled “Flash Embedding Cage Match,” provides further detail: http://www.alistapart.com/articles/flashembedcagematch/

Entirely Flash Sites

As mentioned earlier, it’s not a good idea to create one giant SWF for your entire site. Instead it’s smarter to create separate SWFs, each with its own HTML page. There are many reasons why separate SWFs are more desirable, but here are a few of the more compelling ones.

Single SWF sites

• Can be significantly harder for search engines to index

• Break the browser’s back button

• Don’t provide unique URLs for bookmarking and sharing

• Require all alternative content for the entire site to be embedded in one HTML page

• Prevent direct navigation to content

• May take longer to load if the main SWF doesn’t load other SWF files

It’s OK to have an entirely Flash site, but be sure to use separate HTML pages to circumvent these issues.

The previous example of using SWFObject for hybrid sites isn’t very practical for an entirely Flash site. It would require you to write all of your content in HTML, then do it again in Flash. If you need to make a content change for any reason you’ll have twice as much work. It would be smarter to consolidate your content in one place and let some code do the work for you. There are two ways to accomplish this.

The first option would be to put all of your content into a database. Using PHP or another server-side scripting language you could write content into the HTML pages where the Flash files are embedded. Another server-side script could generate an XML page of the same content that Flash can easily link to and consume. The XML source Flash would link to would actually be a PHP file that outputs XML content. Although this approach requires a little extra work to create the XML generation script, it would be a one-time buildout that would certainly simplify maintenance.
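The book’s example calls for PHP, but the XML-generation step itself is small. Here is the same idea sketched in JavaScript (the function name and the products array are mine, standing in for rows fetched from the database):

```javascript
// Hedged sketch of the XML-generation step, in JavaScript rather than
// PHP. The products array stands in for rows fetched from the database.
function productsToXml(products) {
    var xml = '<?xml version="1.0" encoding="UTF-8"?>\n<products>';
    for (var i = 0; i < products.length; i++) {
        xml += '<product id="' + products[i].id + '">' +
               '<name>' + products[i].name + '</name>' +
               '<price>' + products[i].price + '</price>' +
               '</product>';
    }
    return xml + '</products>';
}

var xml = productsToXml([{ id: 1, name: 'Berry Bowl', price: '18.00' }]);
// xml now contains the markup Flash would load in place of a static XML file
```

Whatever language generates it, the output is ordinary XML, so the Flash layer never needs to know a database is involved.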

Putting all of the site’s content into a database would also make it easier to tie in a Content Management System (CMS) such as Joomla (http://joomla.com/), Drupal (http://drupal.org/), Expression Engine (http://expressionengine.com), or one that you create yourself. Using a CMS makes keeping your content current much easier, and can allow clients to manage the site themselves. FIGURE 7.11 illustrates the relationship between a CMS, the database where all content would be stored, and the front end that would deliver the same content in HTML and Flash formats.

Figure 7.11 When all content is stored in a database it can be sent to multiple delivery platforms using server-side scripting. A CMS could be connected to the database to make content management easier.


If the prospect of creating a PHP script to write XML is too daunting, you may find this next option more attractive.

Once you’ve built the structure of your site with HTML and have integrated all of your content, you can pass it into Flash along with the markup using SWFObject. Since the Flash layer gets its content from the HTML structure, any change you make to the text in the HTML will automatically update your SWF as well.

Once SWFObject is instantiated you can use its addVariable() function to send in the text from the HTML page. In this example we’ll create a <div> tag containing all of the text to be passed to Flash. Here’s how it might look:

<div id="content">
    <h1>Findable Flash</h1>
    <p>This text will be passed into a SWF using SWFObject. Once the content is in Flash, you will need to do a little XML parsing to grab nodes and manipulate the content as you like.</p>
</div>

Just like the previous SWFObject example we’ll instantiate it, but before writing the SWF to the page we’ll use addVariable() to pass in the HTML text:

<script type="text/javascript" src="js/swfobject.js"></script>
<script type="text/javascript">
    var so = new SWFObject("flash-content.swf", "passdata", "800", "500", "8", "#ffffff");

    var content = document.getElementById('content').innerHTML;
    so.addVariable('xmlData', encodeURIComponent(content));
    so.write("content");
</script>

Using some JavaScript, we first grab the text inside <div id="content">. The addVariable() call defines a variable named xmlData that will automatically be created inside the SWF to hold the content. Before it’s passed in, the content is URL-encoded with encodeURIComponent() so special characters in the text don’t cause trouble as Flash imports it.
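The encoding round trip can be verified in isolation. This sketch (plain JavaScript, ASCII content only) shows what encodeURIComponent() does to the markup and how an unescape()-style decode, mirroring what ActionScript performs on the other side, recovers it; non-ASCII characters would need more care:

```javascript
// What happens to the markup on its way into the SWF. For plain ASCII
// content, an unescape()-style decode reverses encodeURIComponent();
// non-ASCII characters do not survive this pairing cleanly.
var content = '<h1>Findable Flash</h1><p>Passed into a SWF & parsed as XML.</p>';
var encoded = encodeURIComponent(content);
// encoded begins "%3Ch1%3EFindable%20Flash%3C%2Fh1%3E..."
var decoded = unescape(encoded); // what Flash recovers on the other side
// decoded === content
```

Because every tag bracket and ampersand is percent-escaped, the markup travels through the flashvars mechanism without being mistaken for HTML by the browser.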

Flash will view the HTML content we’re passing it as XML. This makes a lot of sense, as both are markup languages that use tags to wrap text, and the HTML we’re using in this book is actually XHTML, a dialect of XML. With the content wrapped in HTML tags, Flash will be better able to access individual nodes of text.

In a Flash file named flash-content.fla create a new actions layer and a text field layer. In the text field layer add a dynamic text field to the stage and name the instance “display.” In the actions layer on the first frame, open the Actions window and add the following:

var xml = new XML();
xml.ignoreWhite = true;
xml.parseXML(unescape(_root.xmlData));

display.text = xml.toString();

This simple ActionScript starts by creating a new XML parsing object. After indicating that all white space in the content should be ignored, the script parses the content as XML. The variable _root.xmlData is the one just passed to Flash via the SWFObject JavaScript. The last line writes the content to the text field named display.

Once the SWF is published it will automatically display the content from your HTML page. You may want to manipulate the imported content further using Flash’s various XML parsing functions. To learn more about how to parse XML with Flash and ActionScript, check out Jesse Stratford’s article on ActionScript.org entitled “XML 101” (http://www.actionscript.org/resources/articles/9/1/XML-101/).

Now that both the Flash and HTML layers share the same content you’ll only need to make text changes in one place. This solution will provide search engines and users without Flash support the same content users with the Flash plugin will enjoy.

TIP
You can get help if you run into trouble working with SWFObject in the support forum at http://blog.deconcept.com/swfobject/forum/.

You Don’t Have To Compromise

Often Flash is written off as a technology to be avoided if SEO is at all a concern, but eliminating Flash from every site is not an acceptable compromise. Flash is a powerful, compelling technology that can deliver an enhanced user experience. When used properly, you don’t have to sacrifice Flash in the name of search engine visibility.

If you progressively enhance your interfaces and add Flash as the crowning layer, search engines can fall back to the HTML structure to discern the content and information hierarchy of the page.

The progressive enhancement examples explored in this chapter also improve accessibility. Guideline 6 of the W3C’s Web Content Accessibility Guidelines states that pages featuring new technologies should gracefully degrade (http://www.w3.org/TR/WAI-WEBCONTENT/#gl-new-technologies). Although Flash is hardly a new technology these days, it is one that can pose accessibility issues for the disabled and users of alternate devices. Using Flash as the top layer in a progressively enhanced interface ensures all users can enjoy the content.

NOTE
Claus Wahler has created another search engine visibility solution for Flash called SEFFS (Search Engine Friendly Flash Site): http://wahlers.com.br/claus/blog/seffs-to-flash-or-not-to-flash/

Findable Audio and Video

The content trapped within audio and video files is inherently invisible to most search engines. Currently, there is no mechanism built into Google, Yahoo!, MSN Live Search, or other major search engines to transcribe speech to text so the content can be indexed and searched.

Audio and video are very desirable for users. These formats have the potential to explain some topics better than plain text because they’re closer to human-to-human communication. They also provide a passive content consumption experience that people tend to enjoy, especially when transferred to a portable device like an iPod.

Audio and video content are far too attractive to users to remain un-findable. Though content in these formats is arguably the most invisible to search engines of the various technologies discussed in this chapter, it is also the easiest to transform into search engine friendly formats.

EveryZing

EveryZing—formerly PodZinger—(http://www.everyzing.com/) is a unique search tool that can transcribe the spoken word in MP3s, video, and other rich media file types to text so the content can be searched. Like YouTube, EveryZing is a central repository for video content on a wide variety of topics, but it hosts podcasts and other audio content as well. See FIGURE 7.12.

Figure 7.12 EveryZing (http://everyzing.com) is a media search tool that creates text transcripts of audio and video posted by users. It’s free to use and will automatically create searchable transcripts of your media files linked to within your RSS feed.


You can post your content on EveryZing for free and it will use its proprietary speech-to-text technology to create a text transcript of your video or audio file. When users search on EveryZing they can discover your video or audio files via keywords in the text transcript. It’s also possible for users to find your files on EveryZing via queries on major search engines too.

The content you post on EveryZing can be pulled onto your site using RSS, but the text transcript stays on EveryZing’s site. You can also provide EveryZing with your RSS feed containing links to your audio and video files, and it will automatically transcribe and post new content as you publish it.

It’s a free and easy way to expose your content to search engines and a broader audience. Any of the thousands of users who search EveryZing daily could potentially stumble across your content and visit your site via a link that will be automatically displayed with it.

When you sign up with EveryZing you can choose to display ads with your content. Revenue earned from these ads gets shared with you directly via a PayPal account.

Creating Text Transcripts

The surest way to make your audio and video content visible to all search engines is with a simple text transcript. Although EveryZing creates transcripts automatically, it would be preferable to host the text on your site to draw direct search referrals.

You can create your own transcripts, but it’s a tedious job. There are a number of inexpensive services that will do the dirty work for you with a relatively short turnaround time. All of the services discussed here use humans rather than software, which results in more accurate transcripts.

CastingWords

If you’re a podcaster, CastingWords (http://castingwords.com) is a great way to get transcripts made (see FIGURE 7.13). Simply choose the desired turnaround time, then upload your audio file, and you’ll receive your transcript in plain text, HTML, and RTF formats. They also provide an RSS feed of all of your transcripts so your podcast listeners can subscribe to the full text if they like.

Figure 7.13 CastingWords (http://castingwords.com) provides high-quality transcripts of podcasts and other audio files at reasonable prices.


You can provide CastingWords with a podcast RSS feed to request transcripts automatically each time you release a new podcast. This will save you the time and hassle of having to visit the site repeatedly to place orders.

Transcribr

Enablr has an audio transcription service called Transcribr (http://enablr.com/transcribr.php) similar to CastingWords (see FIGURE 7.14). Like CastingWords, Transcribr lets you provide your podcast RSS feed URL to automatically create transcripts for you a few days after publication.

Figure 7.14 Transcribr (http://enablr.com/transcribr.php) is another great audio transcription service worth considering.


E24 Transcription

Unlike CastingWords and Transcribr, E24 Transcription (http://www.e24tech.com) creates transcripts of both audio and video files. See FIGURE 7.15.

Figure 7.15 E24 Transcription (http://www.e24tech.com) creates audio and video transcriptions at exceptionally low prices.


It’s exceptionally inexpensive, but only provides transcripts in Word document format and has no RSS auto-generation service. If you’re looking for economy and don’t care much about advanced features and alternate transcript formats, then this might be the right option for you.

Text Transcripts Make Your Content Accessible

Like so many other SEO techniques discussed throughout this book, text transcripts for video and audio also make your content accessible. Users with hearing impairments would miss out on audio or video content if a text transcript weren’t included.

Users who don’t speak English fluently will also appreciate the transcripts as they can follow along while listening to catch words and phrases that might be obfuscated by an unfamiliar accent.

The more users you include in your audience, the more likely you are to achieve the business and communication goals of your site. Text transcripts require little effort yet pay big dividends in search traffic and reaching a broader audience.
