As you examine the design for a web page, it’s important to distinguish between data on the page that is dynamic and data that is static. Dynamic data, such as a list of search results, changes each time the page is loaded (based on the query); static data, such as a label for the query box, does not. The distinction between static and dynamic data is important because each requires its own management strategy. On the one hand, static data is easy—you simply specify it directly within the HTML of the page. With dynamic data, however, you must enlist the help of a server-side scripting language, such as PHP, so that you can interact with backend systems to store and retrieve the data. In this chapter, we look at techniques for managing dynamic data.
One of the most important goals for managing dynamic data in a large web application is to establish a clearly defined data interface through which to interact with the backend. A clearly defined data interface allows modules in the user interface (see Chapter 7) to remain loosely coupled with the backend, allows details of the backend (e.g., data dependencies) to be abstracted from modules, and gives modules the flexibility to work with any set of data that contains what the data interface requires. In teams where web developers and backend engineers are separate roles, these qualities let each role work independently, knowing that both are working toward a common point where the user interface and backend will meet. This goal for managing dynamic data is captured in the following tenet from Chapter 1:
Tenet 6: Dynamic data exchanged between the user interface and the backend is managed through a clearly defined data interface. Pages define a single point for loading data and a single point for saving it.
We begin this chapter by looking at what we mean by a dynamic module. We then discuss the concept of a data manager, look at important techniques for using data managers to store and retrieve dynamic data, and examine methods for making data managers extensible using inheritance and aggregation. Next, we look at some examples of data managers using SQL and XML, and explore some techniques for working with database connections, accessing time-consuming web services in parallel, and working with JSON, which is particularly useful for Ajax applications. Finally, we look at a few things to keep in mind when working with dynamic data in cookies and forms.
Let’s reconsider the New Car Reviews module from Example 3-3, which contains a list of three new car reviews. That example illustrates well-constructed HTML for the module, but it doesn’t address how the HTML was generated on the server or which parts of that module are dynamic versus static. Exploring that module again, it’s reasonable to expect that the list of reviews should be generated dynamically so that we can insert whichever reviews are relevant wherever the module is used. An associative array is a good data structure for organizing dynamic data. The list of reviews might be structured as shown in the PHP code in Example 6-1.
array
(
    "0" => array
    (
        "name"  => "2009 Honda Accord",
        "price" => "21905",
        "link"  => "http://.../reviews/00001/"
    ),
    "1" => array
    (
        "name"  => "2009 Toyota Prius",
        "price" => "22000",
        "link"  => "http://.../reviews/00002/"
    ),
    "2" => array
    (
        "name"  => "2009 Nissan Altima",
        "price" => "19900",
        "link"  => "http://.../reviews/00003/"
    )
)
Example 6-2 shows a method that uses the data structure of Example 6-1 to generate the HTML for the list items in the New Car Reviews module (Chapter 7 presents a complete class for implementing a module in PHP, which might employ a method like this). This method takes the array of new car reviews as an argument.
protected function get_reviews($reviews)
{
    $count = count($reviews);
    $items = "";

    for ($i = 0; $i < $count; $i++)
    {
        // Mark the first, last, and middle items for styling.
        $pos = ($i == 0) ? "beg" : (($i == $count - 1) ? "end" : "mid");
        $price = "$".number_format($reviews[$i]["price"]);

        $items .= <<<EOD
<li class="$pos">
    <p>
        <strong>{$reviews[$i]["name"]}</strong>
        <em>(from $price)</em>.
    </p>
    <a href="{$reviews[$i]["link"]}">Read the review</a>
</li>

EOD;
    }

    return $items;
}
The point in Example 6-2 is how members of the data structure for the list of reviews have been used in the dynamic generation of HTML markup for the list items of the module. To get dynamic data like this into a data structure that you can use within the PHP for a module, you need a standard, systematic way to access the data. A good way to handle this is to encapsulate access to the data within an object. That leads to our next section.
A data manager is an object that abstracts and encapsulates access to a specific set of data. Its purpose is to provide a well-defined, consistent interface by which you can get and set data in the backend, and to create a clear structure for the data itself. In Chapter 7, we will look at some techniques for invoking data managers during the generation of a complete page. Data managers are also useful for managing the data exchanged in Ajax requests. For now, let’s look at how data managers simplify access to dynamic data.
Because a data manager is an object, you simply instantiate the data manager and call its get_data method anywhere you need to get the data it manages. Example 6-3 illustrates the use of a couple of data managers to get data from the backend within the kind of PHP class for pages that we’ll develop in Chapter 7. In Chapter 7, you’ll also see that a page’s load_data method defines a single point at which to load its data.
class NewCarSearchResultsPage extends SitePage
{
    ...

    public function load_data()
    {
        // Set up load_args for each of the data managers called below.
        ...

        $dm = new NewCarListingsDataManager();

        $dm->get_data
        (
            $this->load_args["new_car_listings"],
            $this->load_data["new_car_listings"],
            $this->load_stat["new_car_listings"]
        );

        $dm = new NewCarReviewsDataManager();

        $dm->get_data
        (
            $this->load_args["new_car_reviews"],
            $this->load_data["new_car_reviews"],
            $this->load_stat["new_car_reviews"]
        );

        ...
    }

    ...
}
Notice the use of new_car_listings and new_car_reviews members (named after the data managers themselves) for each argument of the get_data calls. These ensure that the arguments, data, and status for each data manager are uniquely identifiable. All you need to know right now about get_data is that the $load_args argument is the input (allowing you to control the method’s operation), the $load_data argument is the main output, and the $load_stat argument is additional output that you can use in case something goes wrong. After get_data returns, the $load_data member of the page class contains the data retrieved by each data manager, with the data for each module placed within its own area of the data structure. Example 6-4 shows an example of this data structure.
array
(
    "new_car_listings" => array
    (
        // Data retrieved by the New Car Listings data manager is here.
        ...
    ),
    "new_car_reviews" => array
    (
        // Data retrieved by the New Car Reviews data manager is here.
        "0" => array
        (
            "name"  => "2009 Honda Accord",
            "price" => "21905",
            "link"  => "http://.../reviews/00001/"
        ),
        "1" => array
        (
            "name"  => "2009 Toyota Prius",
            "price" => "22000",
            "link"  => "http://.../reviews/00002/"
        ),
        "2" => array
        (
            "name"  => "2009 Nissan Altima",
            "price" => "19900",
            "link"  => "http://.../reviews/00003/"
        )
    )
)
Anytime you need to set some data in the backend managed by a data manager, you simply instantiate the data manager and call its set_data method. Example 6-5 illustrates the use of a data manager to set data in the backend within the kind of PHP class for pages that we’ll develop in Chapter 7. The save_data method defines a single point at which to save data for a page. As in Example 6-3, notice the use of the new_car_queries member for each argument of set_data to ensure the arguments, data, and status for this data manager are uniquely identifiable.
class NewCarSearchResultsPage extends SitePage
{
    ...

    public function save_data()
    {
        // Set up save_args and save_data for each data manager called below.
        ...

        $dm = new NewCarQueriesDataManager();

        $dm->set_data
        (
            $this->save_args["new_car_queries"],
            $this->save_data["new_car_queries"],
            $this->save_stat["new_car_queries"]
        );

        ...
    }

    ...
}
To allow a data manager to be configured before accessing the data that it manages, you can define parameters for its constructor or define various setter methods. For example, to tell the data manager whether you’d like abbreviated or full information for the listings that are retrieved, you can define a method such as set_full_listings, which can be called anytime before calling get_data.
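A setter of this kind might look like the following minimal sketch. Only the set_full_listings name comes from the text above. The $full_listings property, the hardcoded listing, the extra "details" field, and the use of 0 as a success status are assumptions made so the example runs on its own, and the DataManager class is a stand-in for the base class shown in Example 6-6.

```php
<?php
// Stand-in for the DataManager base class shown in Example 6-6.
class DataManager
{
    public function __construct() {}
    public function get_data($load_args, &$load_data, &$load_stat) {}
    public function set_data($save_args, &$save_data, &$save_stat) {}
}

class NewCarListingsDataManager extends DataManager
{
    // Hypothetical flag: false (the default) returns abbreviated listings.
    protected $full_listings = false;

    // Call this anytime before get_data to request full listings.
    public function set_full_listings($full = true)
    {
        $this->full_listings = (bool) $full;
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        // Hardcoded sample data stands in for a real backend call.
        $listing = array("name" => "2009 Honda Accord", "price" => "21905");

        if ($this->full_listings) {
            // Assumed extra field that only full listings include.
            $listing["details"] = "...";
        }

        $load_data = array($listing);
        $load_stat = 0; // Assumed convention: 0 means success.
    }
}

$dm = new NewCarListingsDataManager();
$dm->set_full_listings(true);
$dm->get_data(array(), $data, $stat);
```

Because the flag lives on the object rather than in $load_args, the page that calls get_data does not need to know how a particular data manager was configured.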
A good approach for creating data managers is to define them for fairly granular sets of data grouped logically from the backend perspective. Backend developers may be in the best position to do this since they have good visibility into details about backend systems. Ideally, these details should be abstracted from the user interface. Once data managers are defined, the user interface can instantiate whichever of them are needed to load and save data for the page.
It’s important to realize that data managers don’t necessarily correspond one-to-one to modules on the page. In fact, this is a key design attribute that makes it easy for multiple modules to access the same data, which is common in large web applications. For example, imagine a postal code stored by the backend for the current visitor. You may need to use this within multiple modules on a page, but ideally there should be a single data manager that defines the interface for getting and setting it.
Because all data managers fundamentally do the same thing (i.e., get and set data), it’s useful to define a DataManager base class (see Example 6-6). This base class defines a standard interface that all data managers implement. For each data manager that you derive from this base class, implement either or both of the methods in the interface as needed, and provide whatever supporting methods are helpful for these methods to manage the data efficiently. The default implementations do nothing.
class DataManager
{
    public function __construct()
    {
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
    }

    public function set_data($save_args, &$save_data, &$save_stat)
    {
    }
}
The get_data method of a data manager abstracts the process of getting data from the backend. A key part of implementing a clearly defined data interface for getting data is to define well-organized data structures for each of the parameters that get_data accepts or returns:
public function get_data($load_args, &$load_data, &$load_stat)
$load_args
Input arguments needed for getting the data—for example, configuration settings, a database connection, or the maximum number of items in a list of data to retrieve. Since more than one input argument is frequently required, an associative array works well for this data structure.
$load_data
A reference for where to place the retrieved data. Since more than one data member is frequently retrieved, an associative array works well for this data structure.
$load_stat
A reference for where to return the status of the operation. A status indication may be a numeric code or a string in the simplest situations, or it could be an associative array that provides more details.
The set_data method of a data manager abstracts the process of setting data in the backend.
public function set_data($save_args, &$save_data, &$save_stat)
The set_data method of a data manager uses the same arguments and internal structures as get_data, except $save_data is the data to save. This argument is a reference so that a data manager has the option to pass back some data after saving.
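To make the roles of the three parameters concrete, here is a minimal sketch of a data manager that implements both methods. The PostalCodeDataManager class, the "visitor_id" and "postal_code" keys, the "saved_at" value passed back after saving, and the shape of the status array are all hypothetical, and a static array stands in for a real backend.

```php
<?php
// Stand-in for the DataManager base class shown in Example 6-6.
class DataManager
{
    public function __construct() {}
    public function get_data($load_args, &$load_data, &$load_stat) {}
    public function set_data($save_args, &$save_data, &$save_stat) {}
}

// Hypothetical data manager; a static array stands in for the backend.
class PostalCodeDataManager extends DataManager
{
    protected static $store = array();

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        $id = $load_args["visitor_id"];

        if (isset(self::$store[$id])) {
            $load_data = array("postal_code" => self::$store[$id]);
            $load_stat = array("code" => 0, "message" => "OK");
        } else {
            $load_data = array();
            $load_stat = array("code" => 1, "message" => "Not found");
        }
    }

    public function set_data($save_args, &$save_data, &$save_stat)
    {
        self::$store[$save_args["visitor_id"]] = $save_data["postal_code"];

        // Pass something back to the caller through the reference.
        $save_data["saved_at"] = time();
        $save_stat = array("code" => 0, "message" => "OK");
    }
}

$dm = new PostalCodeDataManager();
$args = array("visitor_id" => "v1");

$save_data = array("postal_code" => "94089");
$dm->set_data($args, $save_data, $save_stat);

$dm->get_data($args, $load_data, $load_stat);
```

Using associative arrays for all three parameters means new inputs or outputs can be added later without changing the method signatures.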
Often, it makes sense to build on existing data managers when creating new ones. For example, you might create a data manager that relies on common methods for working with web services from another data manager or combine access to multiple, finer-granularity data managers within a single data manager that a page can instantiate on its own. The extension of data managers offers more than just a convenience—it also provides the opportunity for certain optimizations. For example, you might encapsulate how you share database connections or file handles. Because data managers are objects, you can extend them easily using either inheritance or aggregation.
Inheritance establishes an “is-a” relationship between data managers. To extend a data manager using inheritance, derive your new data manager class from the data manager class with the characteristics that you desire. The extension of a data manager via inheritance is a good approach when you need a data manager that is a more specific type of an existing one.
Example 6-7 derives the New Car Listings data manager from the Web Service data manager, which provides common capabilities for any data manager that accesses web services. When you extend a data manager using inheritance, the derived data manager has access to all the public and protected members of its parent. You can then add new methods or override methods from the parent to augment functionality.
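The inheritance relationship described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the chapter's own listing: the build_url and fetch methods, the example.com endpoint, and the canned JSON response are hypothetical, and fetch is stubbed so that the example runs without a network connection.

```php
<?php
// Stand-in for the DataManager base class shown in Example 6-6.
class DataManager
{
    public function __construct() {}
    public function get_data($load_args, &$load_data, &$load_stat) {}
    public function set_data($save_args, &$save_data, &$save_stat) {}
}

// Hypothetical base class with capabilities shared by web service clients.
class WebServiceDataManager extends DataManager
{
    // Build a request URL from a base address and query parameters.
    protected function build_url($base, $params)
    {
        return $base . "?" . http_build_query($params);
    }

    // Fetch a response body. Stubbed with canned JSON for this sketch;
    // a real version might use cURL.
    protected function fetch($url)
    {
        return '[{"name":"2009 Honda Accord","price":"21905"}]';
    }
}

// "Is-a" relationship: a New Car Listings data manager is a web service
// data manager, so it inherits build_url and fetch.
class NewCarListingsDataManager extends WebServiceDataManager
{
    public function get_data($load_args, &$load_data, &$load_stat)
    {
        // Placeholder endpoint, not a real service.
        $url = $this->build_url("http://example.com/listings", $load_args);

        $load_data = json_decode($this->fetch($url), true);
        $load_stat = is_array($load_data);
    }
}

$dm = new NewCarListingsDataManager();
$dm->get_data(array("max" => 10), $data, $stat);
```

The derived class calls the protected helpers directly, which would not be possible if it held the web service data manager as an aggregated member instead.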
Aggregation establishes a “has-a” relationship between data managers. To extend a data manager using aggregation, create an instance of the data manager class with the capabilities that you desire as a member of the new data manager. The extension of a data manager via aggregation is a good approach to let a single data manager provide access to the data of multiple data managers.
Example 6-8 aggregates several data managers into a New Car Listings data manager so we can retrieve new car reviews as a part of retrieving other data related to new car listings. When you extend a data manager using aggregation, your data manager has access only to the public members of the data manager that has been aggregated.
class NewCarListingsDataManager extends DataManager
{
    protected $new_car_reviews_dm;
    ...

    public function __construct()
    {
        parent::__construct();

        // Aggregate the New Car Reviews data manager as a member.
        $this->new_car_reviews_dm = new NewCarReviewsDataManager();
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        $this->new_car_reviews_dm->get_data
        (
            $load_args["new_car_reviews"],
            $load_data["new_car_reviews"],
            $load_stat["new_car_reviews"]
        );

        // Get other data needed for the New Car Listings data manager.
        ...
    }
}
Just as we saw in Example 6-3, the use of the new_car_reviews member (named after the data manager itself) for each argument of get_data ensures that the arguments, data, and status for the New Car Reviews data manager are uniquely identifiable. Assuming the get_data method of NewCarListingsDataManager is passed an associative array member called new_car_listings for its $load_data argument (per the same convention), the data structure returned by the New Car Listings data manager will be similar to the one shown in Example 6-9. This structure reflects nicely that the New Car Listings data aggregates some New Car Reviews data.
array
(
    "new_car_listings" => array
    (
        // Data from the New Car Reviews data manager, by which the
        // New Car Listings data manager was extended via aggregation.
        "new_car_reviews" => array
        (
            "0" => array
            (
                "name"  => "2009 Honda Accord",
                "price" => "21905",
                "link"  => "http://.../reviews/00001/"
            ),
            ...
        ),

        // Other data retrieved by the New Car Listings data manager.
        ...
    )
)
Databases using SQL are some of the most common sources for data from the backend that a data manager may need to manage. In this section, we look at a canonical data manager that manages access to a simple database.
Example 6-10 shows an implementation for the NewCarDetailsDataManager class, which uses SQL to access a database. The purpose of this data manager is to get detailed data about a new car. The example also shows DatabaseDataManager, a sample base class to provide common capabilities needed by most data managers that access databases, such as opening the database, looking up a user and password from a secure location, closing the database, and handling database errors, among other things.
Because the New Car Details data manager is a specific type of database data manager, we’ve extended its class from the DatabaseDataManager class using inheritance.
It’s important to notice a few key points about the data managers in Example 6-10:
DatabaseDataManager does not implement either get_data or set_data, because this class is not intended to be instantiated directly.
One of the useful features that DatabaseDataManager implements is a check of whether or not a database is already open and whether to close it when finished. This allows multiple data managers to share the same database connection when they are aggregated within other data managers.
In practice, defining another data manager (e.g., NewCarDatabaseDataManager) would let you keep the details for accessing this specific database (e.g., building queries with SQL) out of NewCarDetailsDataManager.
The database support required by most large web applications can be abstracted into other database data managers as well. These can handle things that backend systems typically deal with, such as implementing a caching layer.
class DatabaseDataManager extends DataManager
{
    protected $host;
    protected $name;
    protected $file;
    protected $user;
    protected $pass;
    protected $connection;
    protected $close_flag;

    public function __construct($connection, $close_flag)
    {
        parent::__construct();

        $this->connection = $connection;
        $this->close_flag = $close_flag;
    }

    protected function db_open()
    {
        // If there is not already an open connection, open the database.
        if (empty($this->connection))
        {
            $this->db_access();

            // Report failures through return values rather than exceptions
            // so that the checks below apply. The mysqli extension is used
            // because the older mysql_* functions were removed in PHP 7.
            mysqli_report(MYSQLI_REPORT_OFF);

            $this->connection = mysqli_connect
            (
                $this->host,
                $this->user,
                $this->pass
            );

            if (!$this->connection)
            {
                $this->db_handle_error();
                return false;
            }

            if (!mysqli_select_db($this->connection, $this->name))
            {
                $this->db_handle_error();
                return false;
            }
        }

        return true;
    }

    protected function db_access()
    {
        // The secure file holds the user and password as "user:pass".
        list($user, $pass) = explode(":", file_get_contents($this->file));

        $this->user = trim($user);
        $this->pass = trim($pass);
    }

    protected function db_close()
    {
        if ($this->connection)
            mysqli_close($this->connection);
    }

    protected function db_handle_error()
    {
        ...
    }

    ...
}

...

class NewCarDetailsDataManager extends DatabaseDataManager
{
    public function __construct($connection = "", $close_flag = true)
    {
        parent::__construct($connection, $close_flag);

        // Provide the host and name for the database as well as the
        // path of the secure file containing the user and password.
        $this->host = ...
        $this->name = ...
        $this->file = ...

        $this->db_open();
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        $load_stat = $this->get_details
        (
            $load_args["id"],
            $load_data
        );

        // Close the database after getting the data if set up for this.
        if ($this->close_flag)
            $this->db_close();
    }

    protected function get_details($id, &$details)
    {
        // Escape the ID before placing it in the query to avoid SQL injection.
        $id = mysqli_real_escape_string($this->connection, $id);
        $query = "SELECT * FROM new_cars WHERE id='$id'";

        $result = mysqli_query($this->connection, $query);

        if (!$result)
        {
            $details = array();
            $this->db_handle_error();
            return false;
        }

        $details = $this->get_details_result($result);
        mysqli_free_result($result);

        return true;
    }

    protected function get_details_result($result)
    {
        $data = mysqli_fetch_assoc($result);

        if (!empty($data))
        {
            // Massage the data structure as needed before returning it.
            ...
        }

        return $data;
    }
}
XML data is another common source for data from the backend that a data manager may need to manage. In this section, we look at a canonical data manager that manages access to data defined by XML.
Example 6-11 presents an implementation for the NewCarArticlesDataManager class, which accesses short articles about new cars stored in XML. The example also illustrates the XMLDataManager base class, which provides common capabilities needed by most data managers that process XML. In this example, a single method is shown that performs postprocessing on extracted data, but you can imagine many others to assist in various operations for XML parsing. Because the New Car Articles data manager is a specific type of XML data manager, we’ve extended its class from XMLDataManager using inheritance. Example 6-12 presents a sample of the XML (from two XML files) that the data manager processes. This XML might be from a feed produced by a content management system.
Most XML data of this kind is accessed frequently but changes rarely, so it is a good idea to cache the parsed results, for example with PHP’s APC extension (or its successor, APCu), to improve performance.
class XMLDataManager extends DataManager
{
    public function __construct()
    {
        parent::__construct();
    }

    // Trim surrounding whitespace and optionally convert to lowercase.
    protected static function clean($text, $lower = false)
    {
        $clean = trim($text);
        $clean = ($lower) ? strtolower($clean) : $clean;

        return $clean;
    }

    ...
}

...

class NewCarArticlesDataManager extends XMLDataManager
{
    public function __construct()
    {
        parent::__construct();
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        // Populate this with the path of the file containing XML data.
        $file = ...

        $data = array();

        if (file_exists($file))
        {
            $xml = simplexml_load_file
            (
                $file,
                "SimpleXMLElement",
                LIBXML_NOCDATA
            );

            foreach ($xml->article as $article)
            {
                $article_id = XMLDataManager::clean($article->article_id);

                if ($article_id == $load_args["article_id"])
                {
                    $title = XMLDataManager::clean($article->title);
                    $content = XMLDataManager::clean($article->content);

                    // Populate the array with info about related new cars.
                    if (empty($article->new_car_ids))
                        $new_cars = array();
                    else
                        $new_cars = self::get_new_cars($article->new_car_ids);

                    $data = array
                    (
                        "article_id" => $article_id,
                        "title"      => $title,
                        "content"    => $content,
                        "new_cars"   => $new_cars
                    );

                    break;
                }
            }
        }

        $load_data = $data;
    }

    protected static function get_new_cars($new_car_ids)
    {
        // Populate this with the path of the file containing XML data.
        $file = ...

        $data = array();

        if (file_exists($file))
        {
            $xml = simplexml_load_file
            (
                $file,
                "SimpleXMLElement",
                LIBXML_NOCDATA
            );

            foreach ($new_car_ids->new_car_id as $new_car_id)
            {
                $new_car_id = XMLDataManager::clean($new_car_id);

                // Find the new car whose ID matches the one requested.
                foreach ($xml->new_car as $new_car)
                {
                    $comp_id = XMLDataManager::clean($new_car->new_car_id);

                    if ($comp_id == $new_car_id)
                    {
                        $name = XMLDataManager::clean($new_car->name);
                        $price = XMLDataManager::clean($new_car->price);
                        $preview = XMLDataManager::clean($new_car->preview);
                        $details = XMLDataManager::clean($new_car->details);

                        $data[$new_car_id] = array
                        (
                            "new_car_id" => $new_car_id,
                            "name"       => $name,
                            "price"      => $price,
                            "preview"    => $preview,
                            "details"    => $details,
                            ...
                        );

                        break;
                    }
                }
            }
        }

        return $data;
    }
}
<?xml version="1.0"?>
<articles>
    <article>
        <article_id> 2009_may </article_id>
        <title> Featured New Cars for May 2009 </title>
        <content>
            <![CDATA[ ... ]]>
        </content>
        <new_car_ids>
            <new_car_id> new_car_00001 </new_car_id>
            <new_car_id> new_car_00002 </new_car_id>
            ...
        </new_car_ids>
    </article>
    ...
</articles>

...

<?xml version="1.0"?>
<new_cars>
    <new_car>
        <new_car_id> new_car_00001 </new_car_id>
        <name> New_car 1 </name>
        <price> 20.95 </price>
        <preview>
            <![CDATA[ ... ]]>
        </preview>
        <details>
            <![CDATA[ ... ]]>
        </details>
    </new_car>
    ...
</new_cars>
A web service is a system that defines an API for accessing information over a network. Data often is returned as XML, but JSON (see Data in the JSON Format) is very popular as well. The simple interface, natural abstraction, and ubiquity of web services make them very desirable for interfacing with backend systems.
To access a web service from a data manager, you can use the PHP Client URL (cURL) library. This library provides a simple way to communicate with many different servers using various protocols. Example 6-13 provides a basic example of a data manager to access a web service using cURL.
class NewCarListingsDataManager extends DataManager
{
    public function __construct()
    {
        parent::__construct();
    }

    public function get_data($load_args, &$load_data, &$load_stat)
    {
        $ch = curl_init();

        // Set the URL to the web service required by the data manager.
        $url = ...

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, false);

        // Return the response as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        header("Content-Type: application/xml");
        $results = curl_exec($ch);
        curl_close($ch);

        // Do whatever processing is needed to the data that was returned.
        ...
    }
}
Because web services involve establishing connections over a network, they can take time to generate a response. To address this, it’s a good idea to run multiple data managers for web services in parallel. You can do this using the cURL functions for making parallel requests (e.g., curl_multi_init, curl_multi_add_handle, curl_multi_exec, etc.).
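As a rough sketch of how these functions fit together, the helper below starts several requests at once and collects the responses. The fetch_in_parallel name, its $timeout parameter, and the decision to key results the same way as the input array are assumptions for illustration. It requires PHP's cURL extension, and error handling is kept to a minimum.

```php
<?php
// Hypothetical helper: fetch several URLs in parallel and return the
// response bodies keyed the same way as the input array.
function fetch_in_parallel($urls, $timeout = 10)
{
    $mh = curl_multi_init();
    $handles = array();

    // Create one handle per URL and register it with the multi handle.
    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_HEADER, false);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            // Wait for network activity instead of spinning.
            curl_multi_select($mh, 1.0);
        }
    } while ($running > 0 && $status == CURLM_OK);

    // Collect each response body and release the handles.
    $results = array();
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

A data manager that aggregates several web service data managers could pass each service's URL to a helper like this under a key such as "new_car_listings" or "new_car_reviews", then process each response as it normally would. The total wait is then roughly that of the slowest service rather than the sum of all of them.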
When we explore large-scale Ajax in Chapter 8, you’ll see that often it’s useful to exchange data between the server and browser using JSON. This is because JSON is based on the normal JavaScript syntax for object literals. Once you evaluate the data in the browser using eval, or more safely, json_parse (downloadable from http://json.org/json_parse.js) or the native JSON.parse method built into modern browsers, you can use the data like any other JavaScript object. It’s also very lightweight. Considering its simplicity and conciseness, JSON is increasingly being recognized as a great format for exchanging data in other types of applications as well.
To convert a data structure (typically an associative array or object) in PHP to JSON, use the following:
$json = json_encode($data);
It’s just as easy to get data in the JSON format back into a format that’s easy to work with in PHP:
$data = json_decode($json, true);
The second parameter of json_decode, when set to true, causes the function to return the data as an associative array as opposed to an object. Example 6-14 illustrates what the new car reviews data from Example 6-1 would look like encoded as JSON data.
[
    {
        "name" : "2009 Honda Accord",
        "price" : "21905",
        "link" : "http://.../reviews/00001/"
    },
    {
        "name" : "2009 Toyota Prius",
        "price" : "22000",
        "link" : "http://.../reviews/00002/"
    },
    {
        "name" : "2009 Nissan Altima",
        "price" : "19900",
        "link" : "http://.../reviews/00003/"
    }
]
Assuming this data is in the variable json, you can get the name of the first new car in the array using JavaScript as follows:
var reviews = json_parse(json);
var name = reviews[0].name;
To get data into the JSON format, you can either pass flags to data managers to transform the data themselves or let the PHP scripts that handle Ajax requests transform the data from the associative arrays that the data managers normally return. Whatever the case, all it takes is a call to json_encode.
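The second approach, in which the script handling the Ajax request does the encoding, might look like the sketch below. The get_new_car_reviews function is a hypothetical stand-in for the array a reviews data manager would return, and to_json_response is an assumed helper name.

```php
<?php
// Hypothetical stand-in for the associative array that a reviews data
// manager would normally return through its $load_data argument.
function get_new_car_reviews()
{
    return array(
        array(
            "name"  => "2009 Honda Accord",
            "price" => "21905",
            "link"  => "http://.../reviews/00001/"
        ),
        array(
            "name"  => "2009 Toyota Prius",
            "price" => "22000",
            "link"  => "http://.../reviews/00002/"
        )
    );
}

// Encode the data once, at the point where the response is produced.
function to_json_response($data)
{
    return json_encode($data);
}

$json = to_json_response(get_new_car_reviews());

// A real Ajax handler would typically send a JSON content type before
// any output, e.g. header("Content-Type: application/json").
echo $json;
```

Keeping the encoding in the request handler leaves the data managers unaware of the output format, so the same data manager can serve both full page loads and Ajax requests.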
Cookies and forms present their own considerations for the data they manage. Cookies provide a mechanism for browsers to store a small amount of persistent data on a visitor’s computer. Some common uses for cookies are saving visitor preferences and managing shopping carts. Forms allow visitors to enter data for transmission back to the server. Some common places where forms are used include order processing and queries for product listings.
A cookie consists of one or more name-value pairs. You can read and write them using JavaScript as well as server-side scripting languages like PHP. The following JavaScript writes two cookies that expire in one month (using the max-age cookie attribute) to save a postal code and a range in miles for new car search results:
var m = 60 * 60 * 24 * 30;
document.cookie = "nwcsrspos=94089;max-age=" + m;
document.cookie = "nwcsrsdst=50;max-age=" + m;
To write a cookie in PHP, you must send the cookie before echoing any output for the page (just as with the header function). The following PHP code writes a cookie that expires in one week to save a postal code for new car search results:
$t = time() + (60 * 60 * 24 * 7);
setcookie("nwcsrspos", "94089", $t);
In JavaScript, you retrieve the value of a cookie on a page by parsing the name-value pair that you are interested in from document.cookie. In PHP, you retrieve the value of a cookie by accessing the appropriate member of the associative array in $_COOKIE or $_REQUEST. For example, the following uses PHP to get the nwcsrspos cookie:
$pos = $_COOKIE["nwcsrspos"];
One of the concerns with cookies in large web applications is how to preserve modularity so that cookies written by one module do not conflict with those of another. To prevent conflicts, make sure to name each cookie within its own namespace. If you create unique identifiers for your modules (see Chapter 3), a simple solution is to prefix each cookie with the identifier of the module to which it belongs. For example, the nwcsrspos cookie contains name segments indicating that it is the postal code cookie for the New Car Search Results module. For cookies that you need to share across multiple modules (e.g., suppose you want the cookie for a postal code to have the same identifier anywhere you use it), you can establish a naming convention that reflects the wider scope in which the cookies will be used.
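The prefixing convention can be captured in a pair of small helpers, sketched below. The function names are hypothetical, and the "nwcsrs" module identifier and the "pos" and "dst" keys are inferred from the cookie names used earlier rather than defined by the text. A plain array stands in for $_COOKIE so the example runs outside a web request.

```php
<?php
// Hypothetical helper: build a cookie name from a module identifier and
// a short key, so "nwcsrs" and "pos" give "nwcsrspos".
function module_cookie_name($module_id, $key)
{
    return $module_id . $key;
}

// Read a module's cookie from an array shaped like $_COOKIE, returning
// a default value when the cookie is not set.
function get_module_cookie($cookies, $module_id, $key, $default = null)
{
    $name = module_cookie_name($module_id, $key);

    return isset($cookies[$name]) ? $cookies[$name] : $default;
}

// Sample array standing in for $_COOKIE.
$cookies = array("nwcsrspos" => "94089", "nwcsrsdst" => "50");

$pos = get_module_cookie($cookies, "nwcsrs", "pos");

// A different (made-up) module identifier finds no cookie of its own.
$missing = get_module_cookie($cookies, "nwcrev", "pos", "none");
```

Routing every cookie read and write through helpers like these keeps the naming convention in one place, which makes it harder for a module to read or overwrite another module's cookie by accident.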
A form typically utilizes a number of named input elements whose names and values are passed to another page for processing when the form is submitted. The values are available to the target page as members of associative arrays within the following variables. Since these variables often contain the data that you need to save to the backend, you often pass their values as arguments to the set_data method of data managers:
$_GET
An associative array of values passed to the current page via URL parameters (e.g., via the GET method of a form).
$_POST
An associative array of values passed to the current page via the HTTP POST method (e.g., via the POST method of a form).
$_REQUEST
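An associative array that contains all the values available in the $_GET, $_POST, and $_COOKIE variables (which of these are included, and in what order of precedence, depends on PHP’s request_order configuration setting).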
One of the concerns with form data in a large web application, as it is with cookies, is to preserve modularity across forms within different modules. Specifically, you need to ensure that modules containing forms do not conflict with one another as their values are passed in tandem to other pages. Otherwise, it would be impossible for those pages to know which of the modules actually sent the similarly named data.
Fortunately, the same solution given for cookies works well here, too. If you create unique identifiers for your modules (see Chapter 3), you can use the module identifiers as a prefix for each form parameter to indicate the module to which it belongs. In addition, for common parameters that may be entered from multiple modules (e.g., suppose multiple modules let you set your postal code as a location), you can establish other naming conventions that reflect the scope in which the parameters will be used.
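As a sketch of how a target page might separate submitted values by module, the helper below collects the parameters that carry a given module prefix. The get_module_params name and the "nwcsrs" and "nwcrev" identifiers are assumptions for illustration, and a plain array stands in for $_POST.

```php
<?php
// Hypothetical helper: collect the parameters that belong to one module
// from an array shaped like $_GET or $_POST, stripping the module prefix.
function get_module_params($params, $module_id)
{
    $result = array();
    $length = strlen($module_id);

    foreach ($params as $name => $value) {
        // Keep only the names that start with the module identifier.
        if (strncmp($name, $module_id, $length) === 0) {
            $result[substr($name, $length)] = $value;
        }
    }

    return $result;
}

// Sample array standing in for $_POST, with values from two modules.
$post = array(
    "nwcsrspos" => "94089",
    "nwcsrsdst" => "50",
    "nwcrevpos" => "10001"
);

// Only the values for the "nwcsrs" module, ready to pass to set_data.
$save_data = get_module_params($post, "nwcsrs");
```

Here $save_data holds only the "pos" and "dst" values for the one module, so the similarly named postal code from the other module cannot be saved by mistake.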