7.3. XParser

News aggregation sites using syndication formats are gaining popularity as the formats become more widely used. Many sites use server-side logic to parse RSS and Atom feeds and display them in a user-friendly format. However, it may be necessary to perform the same functions on the client side using JavaScript. This is where XParser comes in.

XParser is a JavaScript library that parses RSS and Atom feeds into JavaScript objects, making the feed's data easy to access in web applications. Its primary goal is to provide an interface for JavaScript developers to quickly access a feed's most important elements. The code is object-oriented, broken into abstract classes that the Atom- and RSS-specific classes inherit from. Such a design allows the different feed types to be parsed according to their specific differences while leaving room for extensions. This section explains how the XParser code is designed and implemented.

7.3.1. The xparser Namespace

XParser begins with the xparser namespace. A namespace contains the library's code in one simple package and protects the contained code from external naming conflicts. Of course, JavaScript does not implement an official namespace construct; however, you can simulate the behavior of a namespace quite easily with a simple object.

var xparser = {};

This code defines the xparser object using object literal notation. It is this object that holds data, methods, and classes for XParser.

Because the script deals with different types of feeds, it needs some way to identify the feed it is parsing. This is easily accomplished with the feedType object:

xparser.feedType = {
    rss    : 1,
    atom   : 2
};

The feedType object contains two properties (rss and atom), which are assigned numeric values. These numeric constants allow assignment and comparison of a feed's format.
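As a small illustration of how these constants might be compared, the following sketch maps a feed type value back to a readable label. The describeFeedType() helper is hypothetical and is not part of XParser; it exists only to show the comparison.

```javascript
// Recreate the namespace and constants described in the text.
var xparser = {};

xparser.feedType = {
    rss    : 1,
    atom   : 2
};

// Hypothetical helper (not part of XParser): converts a feed type
// constant into a readable label by comparing against the constants.
function describeFeedType(iFeedType) {
    switch (iFeedType) {
        case xparser.feedType.rss:
            return "RSS";
        case xparser.feedType.atom:
            return "Atom";
        default:
            return "unknown";
    }
}

console.log(describeFeedType(xparser.feedType.atom)); // "Atom"
```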

7.3.2. Retrieving the Data

To retrieve data from a specific XML node, the XParser library depends upon the FeedNode class. As its name implies, it represents a DOM node contained in the feed and is responsible for accessing and retrieving the node's value. The class accepts one argument, the XML node:

xparser.FeedNode = function (oNode) {
    this.value = (oNode && (oNode.text || oNode.getText())) || null;
};

FeedNode exposes one property called value, which either contains the node's text or a null value.

The text property does not exist in Firefox's DOM, and getText() doesn't exist in Opera. To gain this functionality, XParser uses the zXml library introduced in Chapter 2, which extends Firefox's and Opera's DOM.
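To see how the value property is resolved, the following sketch feeds the constructor three hypothetical stand-ins for XML nodes: one exposing text, one exposing only getText(), and a missing node. The plain objects are assumptions made for illustration; real nodes come from the parsed feed document.

```javascript
var xparser = {};

xparser.FeedNode = function (oNode) {
    this.value = (oNode && (oNode.text || oNode.getText())) || null;
};

// Hypothetical stand-ins for DOM nodes (assumptions, not real XML nodes).
var oTextNode = { text : "Feed title" };
var oGetTextNode = { getText : function () { return "Entry title"; } };

var oFromText = new xparser.FeedNode(oTextNode);       // uses the text property
var oFromGetText = new xparser.FeedNode(oGetTextNode); // falls back to getText()
var oMissing = new xparser.FeedNode(null);             // no node was found

console.log(oFromText.value);    // "Feed title"
console.log(oFromGetText.value); // "Entry title"
console.log(oMissing.value);     // null
```

Because the expression ends in || null, a node whose text is an empty string also yields null rather than "".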

7.3.3. The Abstract Classes

As stated earlier, XParser is responsible for parsing two types of feeds, RSS and Atom. While there are many ways to accomplish this, the best is to have one class responsible for parsing each of the two different feed types. Despite their differences, the feed types share some similarities, and finding that common ground can save time and code. To facilitate this design, XParser contains two abstract classes: BaseFeed and BaseItem.

7.3.3.1. The BaseFeed Class

The BaseFeed class represents the feed as a whole, defining several properties that each feed uses to describe itself. The constructor accepts three arguments: the feed type (1 or 2, as defined in the feedType object), a function pointer to call when parsing is complete, and the scope in which the callback function should execute. Here's the code for the BaseFeed class:

xparser.BaseFeed = function (iFeedType, fpCallBack, oCallBackScope ) {
    this.type           = iFeedType || null;
    this.title          = null;
    this.link           = null;
    this.description    = null;
    this.copyright      = null;
    this.generator      = null;
    this.modified       = null;
    this.author         = null;
    this.items          = [];

    this.callBack       =
        (typeof fpCallBack == "function") ? fpCallBack : function () {};
    this.callBackScope  =
        (typeof oCallBackScope == "object") ? oCallBackScope : this;
};

The first line assigns the feed type, which defaults to null if no argument is passed to the constructor. This ensures that no errors are thrown when prototype chaining subclasses (discussed later).

The title, link, description, copyright, generator, modified, and author properties are generalized properties that both Atom and RSS feeds contain. These properties, at some point, will hold FeedNode objects. The items array represents the feed's <rss:item/> or <atom:entry/> elements. The final four lines of the BaseFeed constructor assign the default values for the callBack and callBackScope properties. The former defaults to an empty function, while the latter defaults to the BaseFeed instance.

This class exposes a method called parse(), which accepts three arguments: a context node, an associative array (object) whose keys are property names and whose values are element names, and an associative array of namespace prefixes and namespace URIs:

xparser.BaseFeed.prototype = {
    parse : function (oContextNode, oElements, oNamespaces) {

    }
};

With the information provided, it's possible to evaluate XPath expressions to extract the desired data. To do this, loop through oElements and use the zXPath class to perform the XPath evaluation:

xparser.BaseFeed.prototype = {
    parse : function (oContextNode, oElements, oNamespaces  ) {
        //Loop through the keys
        for (var sProperty in oElements) {
            if (oElements.hasOwnProperty(sProperty)) {
                //Create FeedNode objects with the node
                //returned from the XPath evaluation
                this[sProperty] = new xparser.FeedNode(
                    zXPath.selectSingleNode(
                        oContextNode,
                        oElements[sProperty],
                        oNamespaces
                    )
                );
            }
        }
    }
};

The associative array passed to the oElements parameter contains "title", "link", "description", "copyright", "generator", "modified", and "author" as the keys. These keys correspond directly to the properties of the BaseFeed class. This provides a quick and easy way to assign values to these properties.
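The mapping from keys to properties can be tried out without a browser or the zXml library. The sketch below is only an approximation: the zXPath object here is a stub that treats the context node as a plain object and the expression as a property name, whereas the real zXPath evaluates XPath against a DOM node. The parseInto() function and the oFakeChannel data are likewise hypothetical.

```javascript
var xparser = {};

xparser.FeedNode = function (oNode) {
    this.value = (oNode && (oNode.text || oNode.getText())) || null;
};

// ASSUMPTION: a stub standing in for zXml's zXPath. It looks up a property
// on a plain object instead of evaluating a real XPath expression.
var zXPath = {
    selectSingleNode : function (oContextNode, sExpression, oNamespaces) {
        return Object.prototype.hasOwnProperty.call(oContextNode, sExpression)
            ? { text : oContextNode[sExpression] }
            : null;
    }
};

// Same loop as the parse() method in the text, written as a standalone
// function so it can be applied to any target object with call().
function parseInto(oContextNode, oElements, oNamespaces) {
    for (var sProperty in oElements) {
        if (oElements.hasOwnProperty(sProperty)) {
            this[sProperty] = new xparser.FeedNode(
                zXPath.selectSingleNode(oContextNode, oElements[sProperty], oNamespaces)
            );
        }
    }
}

// Hypothetical data standing in for a channel element.
var oFakeChannel = { title : "Example Feed", link : "http://example.com/" };
var oElements = { title : "title", link : "link", generator : "generator" };

var oTarget = {};
parseInto.call(oTarget, oFakeChannel, oElements, {});

console.log(oTarget.title.value);     // "Example Feed"
console.log(oTarget.generator.value); // null (element not present)
```

Note that a missing element still produces a FeedNode object; only its value property is null.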

NOTE

It's important to note that BaseFeed is an abstract class and as such should not be instantiated directly. These types of classes are designed to be inherited from; therefore, only the child classes need to worry about providing the information in the correct format.

7.3.3.2. The BaseItem Class

The BaseItem class follows the same pattern. Like the BaseFeed class, BaseItem's constructor initializes its properties as null:

xparser.BaseItem = function () {
    this.title          = null;
    this.author         = null;
    this.link           = null;
    this.description    = null;
    this.date           = null;
};

These properties are a generalized equivalent to the feed's item (or entry) elements. Also, like the BaseFeed class, this class exposes a parse() method, which is implemented similarly:

xparser.BaseItem.prototype = {
    parse : function (oContextNode, oElements, oNamespaces  ) {
        //Loop through the keys
        for (var sProperty in oElements) {
            if (oElements.hasOwnProperty(sProperty)) {
                //Create FeedNode objects with the node
                //returned from the XPath evaluation
                this[sProperty] = new xparser.FeedNode(
                    zXPath.selectSingleNode(
                        oContextNode,
                        oElements[sProperty],
                        oNamespaces
                    )
                );
            }
        }
    }
};

These two classes provide a basis that the RSS and Atom classes can inherit from. Also, this design future-proofs the library, allowing easy addition of new feed types (provided any new feed type uses a compatible format).

7.3.3.3. Parsing RSS Feeds

The RssFeed class is in charge of parsing RSS feeds. The constructor accepts three arguments: the root element of the XML document, the callback function, and the scope in which the callback function should run:

xparser.RssFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.rss, fpCallBack, oCallBackScope]
    );
};

xparser.RssFeed.prototype = new xparser.BaseFeed();

Two things are taking place in this code. First, the BaseFeed constructor is called using the apply() method and passing in the appropriate arguments (including xparser.feedType.rss as the feed type). This is a common way of inheriting properties from a superclass in JavaScript; it ensures that all inherited properties are instantiated with the appropriate values. Second, the RssFeed prototype is set to a new instance of BaseFeed, which inherits all methods from BaseFeed.

For more information on inheritance and object-oriented design in JavaScript, see Professional JavaScript for Web Developers (Wiley Publishing, Inc., 2005).
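The two inheritance steps can be seen in isolation in the following sketch. The Base and Sub constructors are hypothetical names used only for illustration; they follow the same pattern as BaseFeed and RssFeed but are not part of XParser.

```javascript
// Hypothetical superclass: instance properties live in the constructor,
// shared methods live on the prototype.
function Base(sName) {
    this.name = sName || null;
    this.items = [];            // created fresh on every constructor call
}

Base.prototype.describe = function () {
    return this.name + " has " + this.items.length + " item(s)";
};

// Hypothetical subclass.
function Sub(sName) {
    Base.apply(this, [sName]);  // step 1: initialize inherited properties
}

Sub.prototype = new Base();     // step 2: inherit methods via the prototype

var oFirst = new Sub("first");
var oSecond = new Sub("second");
oFirst.items.push("a");

console.log(oFirst.describe());  // "first has 1 item(s)"
console.log(oSecond.describe()); // "second has 0 item(s)"
```

Without the apply() call, both instances would fall back to the single items array on the shared prototype object, so pushing to one would affect the other.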

The next step is to parse the XML data supplied by the oRootNode argument. This is a simple matter of creating an associative array of class properties as keys and the corresponding XML element name as values.

xparser.RssFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.rss, fpCallBack, oCallBackScope]
    );

    var oChannelNode = zXPath.selectSingleNode(oRootNode, "channel");

    var oElements = {
        title          : "title",
        link           : "link",
        description    : "description",
        copyright      : "copyright",
        generator      : "generator",
        modified       : "lastBuildDate",
        author         : "managingEditor"
    };

    this.parse(oChannelNode, oElements, {});
};

This new code first retrieves the <rss:channel/> element. Remember, the <rss:channel/> element serves as a container for the entire feed. Next, create the oElements associative array by supplying the values of the XML element names. This information is passed to the parse() method, which retrieves the desired elements, creates FeedNode objects with the elements, and assigns them to the class properties.

Next, populate the items array:

xparser.RssFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.rss, fpCallBack, oCallBackScope]);

    var oChannelNode = zXPath.selectSingleNode(oRootNode, "channel");

    var oElements = {
        title          : "title",
        link           : "link",
        description    : "description",
        copyright      : "copyright",
        generator      : "generator",
        modified       : "lastBuildDate",
        author         : "managingEditor"
    };

    this.parse(oChannelNode, oElements, {});

    var cItems = zXPath.selectNodes(oChannelNode, "item");

    for (var i = 0, oItem; oItem = cItems[i]; i++) {
        this.items.push(new xparser.RssItem(oItem));
    }

    this.callBack.call(this.callBackScope, this);
};

The first new line uses XPath to retrieve the <rss:item/> nodes. Next, the code loops through the selected XML nodes and creates an RssItem object for each element. Each new object is added to the items array using the push() method. After the items array is fully populated, the feed is completely parsed; thus, the final line executes the callback function in the specified scope. The RssFeed object is also passed to the callback function, giving code that uses the library direct access to the parsed feed.
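The effect of the scope argument is easiest to see with a small example. In this sketch, oWidget and oFakeFeed are hypothetical stand-ins: the first plays the role of the object supplied as the callback scope, and the second plays the role of a parsed feed.

```javascript
// Hypothetical object that wants to receive the parsed feed.
var oWidget = {
    label : "news widget",
    received : null,
    onFeedParsed : function (oFeed) {
        // "this" is whatever scope the caller supplied through call().
        this.received = this.label + " got " + oFeed.items.length + " item(s)";
    }
};

// Stand-in for a parsed feed; a real RssFeed would hold RssItem objects.
var oFakeFeed = { items : ["item one", "item two"] };

// Mirrors this.callBack.call(this.callBackScope, this) in the constructor.
oWidget.onFeedParsed.call(oWidget, oFakeFeed);

console.log(oWidget.received); // "news widget got 2 item(s)"
```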

Just as RssFeed extends BaseFeed, the RssItem class extends BaseItem. This item class is quite simple; the RssItem constructor accepts one parameter, the <rss:item/> node:

xparser.RssItem = function (oItemNode) {
    xparser.BaseItem.apply(this);

    var oElements = {
        title       : "title",
        link        : "link",
        description : "description",
        date        : "pubDate",
        author      : "author"
    };

    this.parse(oItemNode, oElements, {});
};
xparser.RssItem.prototype = new xparser.BaseItem();

This code resembles that of RssFeed. The first line calls the parent class constructor to initialize properties. Next, the oElements associative array is created and passed, along with the XML node, to the parse() method. Since the RSS specification does not specify a namespace, an empty object is passed as the namespace parameter of the parse() method.

7.3.3.4. Parsing Atom

The code for parsing Atom feeds is very similar to the RSS-parsing code. There are just a few key differences to take into account.

The first difference is the use of namespaces. According to the Atom specification, all elements in the feed must reside in the http://www.w3.org/2005/Atom namespace. XParser may also come across an Atom feed that uses a previous version, in which case, the aforementioned namespace will not work. You can work around this issue, however, by retrieving the namespace URI of the root element:

xparser.AtomFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.atom, fpCallBack, oCallBackScope]
    );

    var oNamespaces = {
        atom : oRootNode.namespaceURI
    };
};

The first few lines are very similar to the code in the RssFeed constructor, the only difference being the feedType passed to the BaseFeed constructor. The next block of code creates an associative array called oNamespaces, which is responsible for holding key/value pairs consisting of the element prefix and the associated namespace URI. In this case, the atom key corresponds to the namespaceURI of the root element. This ensures that XParser attempts to parse the Atom feed regardless of its version.
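The following sketch shows how the same atom prefix ends up bound to different URIs depending on the feed. The buildAtomNamespaces() helper and the two root objects are hypothetical; the second URI is the namespace used by the earlier Atom 0.3 draft.

```javascript
// Hypothetical helper mirroring the oNamespaces literal in the constructor.
function buildAtomNamespaces(oRootNode) {
    return {
        atom : oRootNode.namespaceURI
    };
}

// Stand-ins for the root elements of two feeds (assumptions, not real nodes).
var oAtom10Root = { namespaceURI : "http://www.w3.org/2005/Atom" };
var oAtom03Root = { namespaceURI : "http://purl.org/atom/ns#" };

console.log(buildAtomNamespaces(oAtom10Root).atom); // "http://www.w3.org/2005/Atom"
console.log(buildAtomNamespaces(oAtom03Root).atom); // "http://purl.org/atom/ns#"
```

Either way, XPath expressions written with the atom prefix resolve against whatever namespace the feed itself declares.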

The next key difference is, of course, the elements to retrieve. As a result of XParser's design, however, this obstacle is easily overcome:

xparser.AtomFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.atom, fpCallBack, oCallBackScope]
    );

    var oNamespaces = {
        atom : oRootNode.namespaceURI
    };

    var oElements = {
        title           : "atom:title",
        link            : "atom:link/@href",
        description     : "atom:tagline",
        copyright       : "atom:copyright",
        generator       : "atom:generator",
        modified        : "atom:modified",
        author          : "atom:author"
    };

    this.parse(oRootNode, oElements, oNamespaces);
};

The first new block of code creates the oElements associative array with the Atom element names. The element's prefix, atom, matches the prefix contained in oNamespaces. The combined information is then passed to the parse() method to assign the properties their proper value.

Next, populate the items array:

xparser.AtomFeed = function (oRootNode, fpCallBack, oCallBackScope) {
    xparser.BaseFeed.apply(this,
        [xparser.feedType.atom, fpCallBack, oCallBackScope]
    );

    var oNamespaces = {
        atom : oRootNode.namespaceURI
    };

    var oElements = {
        title           : "atom:title",
        link            : "atom:link/@href",
        description     : "atom:tagline",
        copyright       : "atom:copyright",
        generator       : "atom:generator",
        modified        : "atom:modified",
        author          : "atom:author"
    };

    this.parse(oRootNode, oElements, oNamespaces);

    var cEntries = zXPath.selectNodes(oRootNode, "atom:entry", oNamespaces);

    for (var i = 0, oEntry; oEntry = cEntries[i]; i++) {
        this.items.push(new xparser.AtomItem(oEntry, oNamespaces));
    }

    this.callBack.apply(this.callBackScope, [this]);
};

The new code selects the <atom:entry/> elements and assigns the collection to cEntries. Next, the code loops through the collection and adds new AtomItem objects to the items array. When the parsing is complete, the callback function is executed in the specified scope.

Also, like the RssFeed class, the AtomFeed class's prototype is set to a new instance of BaseFeed to inherit methods:

xparser.AtomFeed.prototype = new xparser.BaseFeed();

Naturally, the code for AtomItem resembles that of RssItem. In fact, the only difference between the two is the XML element names contained in oElements:

xparser.AtomItem = function (oEntryNode, oNamespaces) {
    xparser.BaseItem.apply(this, []);

    var oElements = {
        title       : "atom:title",
        link        : "atom:link/@href",
        description : "atom:content",
        date        : "atom:issued",
        author      : "atom:author"
    };

    this.parse(oEntryNode, oElements, oNamespaces);
};

And of course, you need to assign this new class's prototype as well:

xparser.AtomItem.prototype = new xparser.BaseItem();

This last line of code completes the parsing aspect of XParser. Of course, this approach is helpful only if you know what type of feed to parse. The library needs some way of creating a feed object, regardless of the feed's type.

7.3.3.5. Putting It Together

To address this issue, XParser contains a factory method called getFeed(), whose purpose is to retrieve the feed, determine if the feed is usable, and create the feed object. The method relies upon an XHR object to retrieve the feed. In order to do this, the zXml library is used once again, as the zXmlHttp.createRequest() factory method is called to create the XHR object in a cross-browser fashion.

The getFeed() method accepts three arguments: the feed's URL, the callback function pointer, and the callback function's scope.

xparser.getFeed = function (sUrl, fpCallBack, oCallBackScope) {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            if (oReq.status == 200 || oReq.status == 304) {
                //more code here
            }
        }
    };

    oReq.open("GET", sUrl, true);
    oReq.send(null);
};

This code for creating and handling the XHR object is similar to other examples in this book, as the readystatechange handler checks for status codes of both 200 and 304. The next step is to determine the requested feed's type. In order to do this, you need to load the XHR's responseText into an XML DOM:

xparser.getFeed = function (sUrl, fpCallBack, oCallBackScope) {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            if (oReq.status == 200 || oReq.status == 304) {
                var oFeed = null;

                var oXmlDom = zXmlDom.createDocument();
                oXmlDom.loadXML(oReq.responseText);

                if (oXmlDom.parseError.errorCode != 0) {
                    throw new Error("XParser Error: The requested feed is not " +
                        "valid XML and could not be parsed.");
                } else {
                    var oRootNode = oXmlDom.documentElement;

                    //more code here
                }
            }
        }
    };

    oReq.open("GET", sUrl, true);
    oReq.send(null);
};

In this new code, an XML DOM is created and loaded with data. The XML document's documentElement is assigned to a variable for easy access to the node. Also, the variable oFeed is initialized as null; this variable eventually assumes the value of a feed object.

A simple way to determine the feed's format is to check the documentElement's nodeName property, since Atom uses <feed/> as its root element and RSS uses <rss/>. You also need to take into consideration that the Atom feed may or may not use a default namespace. This concern is easily addressed by checking whether or not the root element uses a prefix:

xparser.getFeed = function (sUrl, fpCallBack, oCallBackScope) {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            if (oReq.status == 200 || oReq.status == 304) {
                var oFeed = null;

                var oXmlDom = zXmlDom.createDocument();
                oXmlDom.loadXML(oReq.responseText);

                if (oXmlDom.parseError.errorCode != 0) {
                    throw new Error("XParser Error: The requested feed is not " +
                        "valid XML and could not be parsed.");
                } else {
                    var oRootNode = oXmlDom.documentElement;

                    //Get the name of the document element.
                    var sRootName;
                    if (oRootNode.nodeName.indexOf(":") > -1)  //a prefix exists
                        sRootName = oRootNode.nodeName.split(":")[1];
                    else
                        sRootName = oRootNode.nodeName;

                    switch (sRootName.toLowerCase()) {
                        case "feed": //It's Atom.
                            //more code here
                            break;
                        case "rss": //It's RSS
                            //more code here
                            break;
                        default: //The feed isn't supported.
                            //more code here
                            break;
                    }
                }
            }
        }
    };

    oReq.open("GET", sUrl, true);
    oReq.send(null);
};

In the newly added code, the root element's name is checked to see if it contains a colon (:). If it does, this means that the element name contains a prefix, so it's split into two parts: the prefix and the tag name. The tag name is assigned to the sRootName variable. If no prefix exists, then sRootName takes on the value of the element's name.
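The prefix handling can be isolated into a small function for experimentation. The getRootName() helper below is hypothetical and is not part of XParser; it simply repeats the same check on a plain string.

```javascript
// Hypothetical helper: returns the tag name without any namespace prefix.
function getRootName(sNodeName) {
    if (sNodeName.indexOf(":") > -1) {
        // "prefix:name" splits into ["prefix", "name"]; keep the name.
        return sNodeName.split(":")[1];
    }
    return sNodeName;
}

console.log(getRootName("atom:feed")); // "feed"
console.log(getRootName("feed"));      // "feed"
console.log(getRootName("rss"));       // "rss"
```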

Once the element's name is known, it can be handled accordingly. The switch block determines the next step based on the root element's name. Using this code, the desired AtomFeed or RssFeed object is created:

xparser.getFeed = function (sUrl, fpCallBack, oCallBackScope) {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            if (oReq.status == 200 || oReq.status == 304) {
                var oFeed = null;

                var oXmlDom = zXmlDom.createDocument();
                oXmlDom.loadXML(oReq.responseText);

                if (oXmlDom.parseError.errorCode != 0) {
                    throw new Error("XParser Error: The requested feed is not " +
                        "valid XML and could not be parsed.");
                } else {
                    var oRootNode = oXmlDom.documentElement;

                    //Get the name of the document element.
                    var sRootName;
                    if (oRootNode.nodeName.indexOf(":") > -1)
                        sRootName = oRootNode.nodeName.split(":")[1];
                    else
                        sRootName = oRootNode.nodeName;

                    switch (sRootName.toLowerCase()) {
                        case "feed": //It's Atom.

                            oFeed = new xparser.AtomFeed(
                                oRootNode,
                                fpCallBack,
                                oCallBackScope
                            );
                            break;
                        case "rss": //It's RSS
                            //Check the version.
                            if (parseInt(oRootNode.getAttribute("version"), 10) < 2)
                                throw new Error("XParser Error! RSS feed " +
                                    "version is not supported"
                                );

                            oFeed = new xparser.RssFeed(
                                oRootNode,
                                fpCallBack,
                                oCallBackScope
                            );
                            break;
                        default: //The feed isn't supported.

                            throw new Error("XParser Error: The supplied feed " +
                                "is currently not supported."
                            );
                            break;
                    }
                }
            }
        }
    };

    oReq.open("GET", sUrl, true);
    oReq.send(null);
};

The newly added code creates an AtomFeed object and passes it the required arguments. Creating an RSS feed, however, requires a few more steps. First, the RSS version is checked (by checking the version attribute in the root element). If the version is less than 2, the code throws an error stating the RSS version isn't supported. If the feed is the correct version, however, an RssFeed object is created. Last, if the document's root could not be matched, the feed isn't supported, so an error is thrown. Throwing errors allows a developer using the library to anticipate these types of errors and handle them accordingly.
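It can help to see what the version comparison does with typical attribute values. The isRejectedRssVersion() helper below is hypothetical and only reproduces the comparison from the listing; it passes an explicit radix of 10 to parseInt().

```javascript
// Hypothetical helper: true when the version check in getFeed() would
// cause an error to be thrown for the given version attribute value.
function isRejectedRssVersion(sVersion) {
    return parseInt(sVersion, 10) < 2;
}

console.log(isRejectedRssVersion("2.0"));  // false
console.log(isRejectedRssVersion("0.91")); // true
console.log(isRejectedRssVersion("1.0"));  // true
console.log(isRejectedRssVersion(null));   // false
```

The last case is worth noting: when the attribute is missing, parseInt() returns NaN, and NaN < 2 is false, so a feed without a version attribute is not rejected by this check.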

While we're on the subject of errors, the getFeed() method needs one more in case the XHR request fails:

xparser.getFeed = function (sUrl, fpCallBack, oCallBackScope) {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            if (oReq.status == 200 || oReq.status == 304) {
                var oFeed = null;

                var oXmlDom = zXmlDom.createDocument();
                oXmlDom.loadXML(oReq.responseText);
                if (oXmlDom.parseError.errorCode != 0) {
                    throw new Error("XParser Error: The requested feed is not " +
                        "valid XML and could not be parsed.");
                } else {
                    var oRootNode = oXmlDom.documentElement;

                    //Get the name of the document element.
                    var sRootName;
                    if (oRootNode.nodeName.indexOf(":") > -1)
                        sRootName = oRootNode.nodeName.split(":")[1];
                    else
                        sRootName = oRootNode.nodeName;

                    switch (sRootName.toLowerCase()) {
                        case "feed": //It's Atom. Create the object.
                            oFeed = new xparser.AtomFeed(
                                oRootNode,
                                fpCallBack,
                                oCallBackScope
                            );
                            break;
                        case "rss": //It's RSS
                            //Check the version.
                            if (parseInt(oRootNode.getAttribute("version"), 10) < 2)
                                throw new Error("XParser Error! RSS feed " +
                                    "version is not supported"
                                );

                            oFeed = new xparser.RssFeed(
                                oRootNode,
                                fpCallBack,
                                oCallBackScope
                            );
                            break;
                        default: //The feed isn't supported.
                            throw new Error("XParser Error: The supplied feed " +
                                "is currently not supported."
                            );
                            break;
                    }
                }
            } else { //The HTTP Status code isn't what we wanted; throw an error.
                throw new Error("XParser Error: XHR failed. " +
                    "HTTP Status: " + oReq.status
                );
            }
        }
    };

    oReq.open("GET", sUrl, true);
    oReq.send(null);
};

This new code throws an error if the HTTP status is anything other than 200 or 304, which makes a failed request easier to notice and debug. Also, notice that the error messages are prefixed with the string "XParser Error" to clearly indicate that the error occurred within the library.

With these final lines of code, the XParser library can now be used in any web application. The remainder of this chapter walks you through the creation of two components that utilize the XParser library.
