Chapter 12. Implementing your design

After you have your information model, metadata, and workflow decided, you need to determine how you are going to implement your unified content solution. This chapter discusses the options available. Note that it is not designed to provide technical “how-to” information; the design model (for information, metadata, and workflow) will not necessarily map directly to an off-the-shelf process or software solution.

Factors affecting implementation

You can implement a unified content solution in an infinite number of ways; no single architecture or solution will suit all needs. Some of the factors that will affect your implementation include:

  • Budget

    A unified content solution could cost from tens of thousands of dollars to millions of dollars, depending on the scope of your implementation. For example, a unified content solution for a single department requires a smaller budget than one for an enterprise.

  • Technical resources

    A large corporation with a large IT department may be in a better position to investigate and invest in new, leading-edge technologies. Having technical resources available for your implementation will give you an advantage over a small department that may not have similar access. IT resources (internal or contracted) will be required for any customization or integration that you undertake in your implementation.

  • Technical ability of authors

    The technical capabilities of your users will have a significant effect on your implementation. You should target your implementation to the computing skills and technology comfort level of your authors. You cannot implement a system that requires expert computing skills or a high level of comfort with technology if your authors do not have that skill or comfort level. The solution you implement must support your authors.

  • Diversity of users

    If you are creating a unified content solution for an enterprise-wide group of content providers, you will need to address their range of abilities. This means that a “one size fits all” solution has a high risk of failure. You will probably need to develop a solution that presents different suites of tools to different authors.

  • Required outputs

    The information you create must ultimately be delivered to users. How information is delivered will have a large impact on the technologies that you choose for your implementation. If you are confining outputs to HTML for web delivery and PDF for paper, you can build your solution around traditional desktop publishing tools. If you need to output to formats such as Wireless Markup Language (WML) for display on hand-held devices or text-enabled cell phones, then XML-based technology should probably be the backbone of your system.

The requirements for implementing any unified content solution depend on the needs that the solution must satisfy. However, there are common requirements that must be considered in any implementation. These requirements are listed in the following sections.

Scalability

Any unified content solution you implement today must be able to support your business for some time into the future. An implementation could consume considerable resources, so it is best to plan ahead and ensure that the system will be able to handle the volumes of information or numbers of authors you anticipate having for as far into the future as you plan your business (that is, it should support your five- or ten-year business plan, depending on how far ahead you plan).

Ease of use

One of the key requirements for any implementation is that it must be easy for users to use. “Ease of use” means different things depending on the technology. If you are using traditional desktop publishing technology, you may need to create custom scripts to add functionality. If you are using non-traditional tools (such as XML), you may need to customize the interface to remove functionality that you don’t want the authors to access, or functionality that may be confusing. XML tools can also be customized to insert reusable text for systematic reuse.

Ease of finding information

Finding information is an extremely important requirement for any unified content implementation. The return (in improved efficiency, faster time to market, increased consistency, or financial savings) of a unified content strategy depends on the ability of authors to find the information they need. In some implementations, depending on the amount of reuse, authors may spend more time finding and assembling blocks of information into information products than they spend writing original content.

In a typical scenario, authors begin to write a piece of content. When they need certain reusable content elements, they search for them in the content management system (CMS). If they cannot easily find the information that they are looking for, they are sure to re-create it. Then you have multiple versions of the same information, reducing the effectiveness of the unified content solution. This need to quickly and efficiently find information places importance on two different aspects of your system: the search capabilities of your CMS and the metadata that you apply to the chunks of information. If you choose to depend on the CMS for systematic reuse (reducing the need for authors to search for content), then metadata and detailed models are of even greater importance.

Physical granularity

Chapter 8, “Information modeling,” discusses granularity in the context of creating your information model. In that context, granularity refers to the size of the elements that you will reuse, and therefore the size of the building blocks. In implementing your design, you must again consider granularity, but against the functionality of your content management and delivery systems. In this context, granularity really refers to the physical chunk of information that you create and store in your CMS.

The granularity that you settle on depends to some degree on your CMS’s capabilities. With document management systems, the physical chunk of information stored in the system had to match the structural granularity for reuse. That is, if you were reusing paragraphs, every paragraph that you wanted to reuse had to be saved in a separate physical file. With the increase of “structurally aware” content management systems, the physical chunk does not always have to match your reuse granularity. Some content management systems are capable of treating elements in your document as logically separate pieces.

The granularity that you manage in your CMS affects the efficiency of your unified content solution. So how granular should your information be? What is the optimum size for chunks of information? There is no single correct answer. However, there are risks if you make your information too granular or not granular enough. The proper level of granularity depends on the information and what you want to do with it, coupled with the capabilities of the CMS that you implement.

The impact of making your content granular

When you are designing models for information products and elements, you need to identify all the unique information elements in your document. You also identify the elements that will be used or reused. Each element—unique and reusable—will have a unique name.

One approach to granularity would be to consider every element reusable, and therefore manage each element. This is certainly possible, but would probably be inefficient and ineffective. There are two issues:

  • In breaking your information into small pieces, you create a large number of chunks that the CMS must manage, which places a strain on the system as well as on the network structures that provide access to the CMS. When authors retrieve a complete piece (book, section, and so on), the system must assemble and send all the small pieces that make up the complete piece.

  • The greater the number of pieces that you manage in your CMS, the more difficult it is to find information. There are more pieces of information to choose from in searches.

The impact of not making your content granular enough

The opposite approach to granularity is to set the physical file size of managed chunks to match the higher-level structures of the information, such as book, chapter, section, and report. Granularity at this level is very comfortable for authors because it mirrors what they are typically used to. However, it may impact your ability to reuse information from the CMS. If the chunks are too big, authors may not be able to separate the smaller pieces of information they need to reuse.

Options for implementation

You can implement your models in many ways, depending on your authoring requirements, technology, and desired results, including those discussed in the following sections.

Implementing your model in XML

XML is a very powerful technology for unified content strategies (see Chapter 14, “The role of XML”). It provides capabilities not available in traditional tools. An XML Document Type Definition (DTD) provides a highly structured way to enforce your models. XML DTDs can be implemented in many ways: semantic, generic, or a mixture of semantic and generic elements.

What is a DTD?

A DTD is a formal definition of the XML elements that a specific type of document can include. The DTD defines the names of the elements, their relationship, and their frequency. For example, the DTD entry for a procedure can be defined as something like this:

  • The procedure starts with a title.

  • Following the title is an introduction.

  • The introduction contains a paragraph, followed by 0 or more additional paragraphs, tables, lists, or images.

  • After the introduction is a stem sentence.

  • Following the stem sentence are instruction steps, and so on.
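
The content model described above can be sketched as a DTD. The following is a minimal sketch, not a definitive model: it places the DTD in the internal subset of a small document so that the example is self-contained. The names procedure, title, introduction, stem, procedure_steps, and step follow Example 12.1 later in this chapter; para, table, list, and image are assumed names, and the sketch stops at the instruction steps even though a full model would continue.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE procedure [
  <!-- A procedure is a title, an introduction, a stem sentence, then the steps. -->
  <!ELEMENT procedure (title, introduction, stem, procedure_steps)>
  <!ELEMENT title (#PCDATA)>
  <!-- One required paragraph, then zero or more paragraphs, tables, lists, or images. -->
  <!ELEMENT introduction (para, (para | table | list | image)*)>
  <!ELEMENT para (#PCDATA)>
  <!-- table, list, and image are stubbed out to keep the sketch short (assumed names). -->
  <!ELEMENT table (#PCDATA)>
  <!ELEMENT list (#PCDATA)>
  <!ELEMENT image EMPTY>
  <!ELEMENT stem (#PCDATA)>
  <!-- At least one instruction step is required. -->
  <!ELEMENT procedure_steps (step+)>
  <!ELEMENT step (#PCDATA)>
]>
<procedure>
  <title>Logging On to AccSoft</title>
  <introduction>
    <para>The first time you click on a component in AccSoft you are required to log on.</para>
  </introduction>
  <stem>To log on to AccSoft:</stem>
  <procedure_steps>
    <step>Double-click the AccSoft application.</step>
    <step>Type your USERID into the Name field.</step>
  </procedure_steps>
</procedure>
```

A validating parser would reject this document if, for example, the stem appeared before the introduction or the introduction contained no paragraph.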

You can define your own DTD, use the DTD provided with XML editors, or use “industry-standard” DTDs freely available on the Web.

Is a DTD required?

When considering implementing an XML solution, the first question people ask is “Do we need a DTD?” The answer, really, is yes.

XML itself does not require that files be associated with a DTD. XML files can be “well formed,” which means that they follow the rules of syntax defined in the XML specification. However, for a unified content solution, particularly where reuse is involved, DTDs are mandatory.

The key to effective reuse is consistency. Common types of information should have a consistent structure, and reusable pieces of information should have a predictable structure. With a DTD, structural consistency is assured. You can quite clearly define the elements and the structure for all your information products and types. When you use a DTD, XML editing tools (both native XML and XML-aware editors) can check documents to ensure that all required elements are in place, in the correct order. The editors can also ensure that authors do not add elements or change element names.

Predictability is also a key requirement for formatting. You use external style sheets to format XML documents. These style sheets associate specific formatting commands with specific elements in the document. The association is by name. For example, if you have an element named “procedure” in your document, you need formatting instructions with the name “procedure” in your style sheet. If you cannot control the element names your authors use, you cannot ensure that each element has a style definition. Without the style definition, you have no control over the output.
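
The association by name might look like the following XSLT 1.0 sketch. It assumes HTML output and the element names used in Example 12.1; the class names are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Applies to every element named "procedure". -->
  <xsl:template match="procedure">
    <div class="procedure">
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Applies to a title that is a direct child of a procedure. -->
  <xsl:template match="procedure/title">
    <h2><xsl:value-of select="."/></h2>
  </xsl:template>

  <!-- Applies to every element named "step". -->
  <xsl:template match="step">
    <p class="step"><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

An element whose name matches no template falls through to the built-in XSLT rules, which output only its text. In this sketch, a note or warning would lose its identity in the output, which is the loss of control described above.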

Chapter 8, “Information modeling,” describes the value of and how to develop information models. If you are implementing your model in XML, you must create a DTD to support that model, or you must provide some other mechanism to validate content.

How many DTDs?

The number of DTDs that you create depends on personal choice and organization, more than on information volume or complexity, even for the largest implementation. For example, if you have content owned by different divisions or departments, and specific individuals in the departments will be responsible for maintaining the content models for their respective departments, you may want to create multiple DTDs, so that separate departments can work on their DTDs without conflict.

If you do create a single DTD, you should modularize your DTD to simplify maintenance. This involves grouping content models and elements in your DTD together by purpose or relationship. Some groupings include:

  • Common block elements such as generic paragraphs, tables, lists, or graphics

  • Common inline elements such as emphasis, links, or cross-references

  • Book-level content models such as books, manuals, reports, or brochures

  • Specific content models such as procedures, concepts, or policies

Also, with a single DTD, you must build your DTD to provide different views for different authors. In an enterprise-wide implementation you will have authors who provide specific subsets of your content. For example, business analysts may be limited to providing policy information for your corporation. Your DTD should be designed and built to give these authors a policy content model and nothing else. However, authors who are responsible for putting together an entire policy guide should have access to the content model for an entire guide, including policy models.

DTD or Schema?

DTDs have traditionally been the structural definition format for publication markup languages. Now, increased use of XML for business applications has led to the development of an alternative to DTDs: XML Schemas. For business applications, DTDs are seen as limited. They define the elements of content, but give you no control over the content itself. The definition of an element is that it can contain other elements or text.

Schemas, like DTDs, are a formalized language for defining structure, but they introduce data typing to element definitions. For example, if you define an element representing a date, you can define that the content must be in a date format. The validating tools that support schemas can recognize the data types and will report when content does not conform to type.
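
As a minimal sketch of data typing, the following XML Schema declares a date-typed element. The names policy, title, and effective_date are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "policy" and its children are hypothetical names. -->
  <xs:element name="policy">
    <xs:complexType>
      <xs:sequence>
        <!-- Any text is accepted here, as it would be under a DTD. -->
        <xs:element name="title" type="xs:string"/>
        <!-- Content must be a date in the xs:date form YYYY-MM-DD. -->
        <xs:element name="effective_date" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this schema, a schema-aware validator should accept an effective_date of 2003-10-01 and report free text such as “October 1” as an error. A DTD could declare the same element only as text.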

The decision to use schemas depends on two factors:

  • Do you need data typing?

  • Does the tool you need support schemas?

Most publishing solutions work very well with DTDs. But if you need data typing, you need schemas. Or, if you have chosen a tool that does not support schemas, but still need to check content against data types, you need to build custom scripting to supplement standard validation against DTDs.

Authoring forms

Authoring forms are HTML forms that can guide authors in entering structured content. Authors work in a browser, typing content into the form fields. Authoring forms can be used in conjunction with XML and a DTD, as an alternative to a DTD, or as an interface to a CMS or database.

Typically forms-based authoring systems are implemented using functionality that is part of many CMSs. The CMS provides utilities to create HTML forms with individual fields to capture content model elements. The form is used to write only a controlled chunk of content, such as a concept, feature, policy, or procedure. You would not usually use a form to build a book.

Forms-based implementations have advantages over implementations that use full authoring tools:

  • They enable collaborative authoring with remote access. Anyone with a browser and access to the Internet/extranet/intranet can author content in the forms.

  • Forms hide the complexity of XML editing from the authors. Authors do not need to apply styles or tagging at all. Authors enter content into the fields provided in the form. When they post the form to the server, the server wraps the field data in the appropriate tags or codes.

Forms-based systems also have their limitations:

  • Forms require IT support (to build the forms and the mechanisms for managing the data from the forms).

  • Forms are not effective for large information product models.

  • Forms are very inflexible; you require coding support to change them.

  • Forms do not support very granular reuse.

It can be difficult to implement granular reuse within paragraphs or even among paragraphs. For example, it may be difficult to identify individual steps in a procedure that are applicable to different roles. You need to separate each step as a field in the form, but that can be cumbersome.

Structural templates (traditional authoring tools)

Structural templates are formatting templates that use structural names to represent the structural elements of documents. They are used in implementations where traditional word-processing or desktop publishing tools provide the authoring functionality. Structural templates use semantic names as style names. Structural templates do not have the same power as DTDs or forms because the tools that use templates cannot control when the tags are used. For example, an XML editor can filter the list of elements that authors can apply so that the DTD is followed. If the DTD defines a series of tags (a title, followed by an introduction, followed by a procedure,…) the authoring software can ensure that the tags are included. Traditional tools have none of these controls. There is nothing to prevent authors from changing the structure, changing the tag names, or even creating new tags.

Structural templates can be very good for providing boilerplate information, or at least hints to common information.

Semantic versus generic element or style names

When you develop your information model you should identify all the content elements in your information and give each a unique semantic name (a semantic component of the model—see Chapter 8). An important part of the modeling exercise is to understand every possible level of structure. Without a complete understanding of your content, you cannot really make informed decisions about the outputs you require and what information you can use.

The complete model is used to guide authors, so it should uniquely identify all the content elements. However, that does not mean that you create elements or style tags that match every element in your content. You do not need to—and probably should not—create a one-to-one mapping of elements in a model to elements or style tags. Some elements should ultimately be generic or common elements.

Number of elements or tags

The number of elements or tags is a key issue in deciding whether to use semantic names to match every element in your information model. Too many elements or style tags can make authoring difficult.

For example, one of the key complaints about the DocBook DTD is that with over 400 elements it is too complicated for most authors. Think of having a huge drop-down list for style tags in a traditional authoring tool. Authors simply will not scroll through this list to select the appropriate style tag. Frequently, authors will use just a few style tags and will hand format the rest. This defeats the purpose of uniquely identifying content and your ability to automatically convert the content to multiple outputs in multiple information products.

Identifying content

If you need to be able to do something with a piece of content such as find it easily, reuse it, or manipulate it, it should have a semantic identity. There are two ways to give semantic identity to an element of text:

  • Give it a semantic element name or style tag.

  • Use a generic element name or style tag, but add metadata that qualifies what the element is used for or represents.

For example, Example 12.1 shows an XML sample procedure that uses semantic names for all elements.

Example 12.1. Semantically tagged procedure

<procedure>
    <title>Logging On to AccSoft</title>
    <introduction>The first time you click on a component in AccSoft
        you are required to log on to the system before you can
        complete any tasks.</introduction>
    <stem>To log on to AccSoft:</stem>
    <procedure_steps>
        <step>Double-click the AccSoft application.
            <result>The system displays the AccSoft main window.</result>
        </step>
        <step>Select AP from the Explorer.
            <result>The system displays the login dialog.</result>
        </step>
        <step>Type your USERID into the Name field.</step>
        <step>Type your password into the Password field.
            <result>The system displays the customer dialog.</result>
        </step>
        <step>Select the customer to update.</step>
        <step>Click the OK button to log on to AccSoft.</step>
    </procedure_steps>
    <note>If you do not know your USERID or Password, consult
        your System Administrator.</note>
    <warning>This database contains personal
        information about our clients. If you are logged on,
        do not leave your terminal unattended at any time.</warning>
</procedure>

Example 12.2 shows a sample procedure that uses generic names for some elements, but includes attributes (the XML way of identifying metadata) to give semantic identity to other elements.

Example 12.2. Combination of semantic and generic tagging with attributes

<procedure>
    <title>Logging On to AccSoft</title>
    <para type="introduction">The first time you click on a
        component in AccSoft you are required to log on to the
        system before you can complete any tasks.</para>
    <para>To log on to AccSoft:</para>
    <procedure_steps>
        <step>Double-click the AccSoft application.
            <para>The system displays the AccSoft main window.</para>
        </step>
        <step>Select AP from the Explorer.
            <para>The system displays the login dialog.</para>
        </step>
        <step>Type your USERID into the Name field.</step>
        <step>Type your password into the Password field.
            <para>The system displays the customer dialog.</para>
        </step>
        <step>Select the customer to update.</step>
        <step>Click the OK button to log on to AccSoft.</step>
    </procedure_steps>
    <para type="note">If you do not know your USERID or Password,
        consult your System Administrator.</para>
    <para type="warning">This database contains personal
        information about our clients. If you are logged on,
        do not leave your terminal unattended at any time.</para>
</procedure>

If you compare the two examples, you can see that Example 12.2 uses four fewer element names than Example 12.1: the five semantic elements (introduction, stem, result, note, and warning) have been replaced by a single generic paragraph element, as summarized in Table 12.1.

Table 12.1. Generic paragraph elements

| Semantic name | Generic name with attribute |
|---|---|
| <introduction> | <para type="introduction"> |
| <stem> | <para> |
| <result> | <para> |
| <note> | <para type="note"> |
| <warning> | <para type="warning"> |

Identifying the lead-in stem sentence is important information for your authors and should be included in your model, but does not have to be implemented in your structure; a generic element is sufficient. The result element is replaced by a generic element with no attribute. The result is a part of the step and not likely to be reused separately from the step, so it does not need to be identified separately. The note and warning have been given generic element names (para) but have been modified with an attribute (metadata) that uniquely defines them. As you begin to model, you’ll notice structures such as note, important, tip, caution, warning are often identical with the exception of the title and an icon. You can create a common model for these elements that is implemented as a generic element, then add metadata to differentiate them.
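
The generic element with its qualifying attribute could be declared along the following lines. This is a sketch under stated assumptions: the content models are simplified guesses based on Example 12.2, and the instance is shortened to keep the example compact.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE procedure [
  <!-- Generic paragraphs before and after the steps (simplified model). -->
  <!ELEMENT procedure (title, para+, procedure_steps, para*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!-- The optional type attribute gives a generic paragraph its semantic identity. -->
  <!ATTLIST para type (introduction | note | warning) #IMPLIED>
  <!ELEMENT procedure_steps (step+)>
  <!-- Mixed content: step text with optional untyped result paragraphs. -->
  <!ELEMENT step (#PCDATA | para)*>
]>
<procedure>
  <title>Logging On to AccSoft</title>
  <para type="introduction">The first time you click on a component in AccSoft
    you are required to log on to the system.</para>
  <para>To log on to AccSoft:</para>
  <procedure_steps>
    <step>Double-click the AccSoft application.
      <para>The system displays the AccSoft main window.</para>
    </step>
  </procedure_steps>
  <para type="note">If you do not know your USERID or Password,
    consult your System Administrator.</para>
</procedure>
```

The enumerated attribute restricts authors to the listed values. The trade-off is that a DTD cannot require, for example, that the first paragraph carry type="introduction"; that kind of rule needs author guidance or additional checking.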

If you were using a traditional authoring tool to create this procedure you might make some different decisions. Table 12.2 illustrates the same procedure as it might be tagged in Microsoft Word or FrameMaker (unstructured version). Notice that a semantic style tag is used for the note and warning. In this case you might want the template to automatically insert the title (Note or Warning) and possibly an icon. These tools can do this only if you have defined a unique style tag.

Table 12.2. Authoring in Word versus FrameMaker

| Procedure | Microsoft Word | FrameMaker |
|---|---|---|
| Logging On to AccSoft | Heading 1 | Heading1 |
| The first time you click on a component in AccSoft you are required to log on to the system before you can complete any tasks. | Normal | Body |
| To log on to AccSoft: | Normal | Body |
| Double-click the AccSoft application. | Normal (numbers applied) | Numbered1 |
| The system displays the AccSoft main window. | Normal (indent applied) | Indented |
| Select AP from the Explorer. | Normal (numbers applied) | Numbered |
| The system displays the Login dialog. | Normal (indent applied) | Indented |
| Type your USERID into the Name field. | Normal (numbers applied) | Numbered |
| Type your password into the Password field. | Normal (numbers applied) | Numbered |
| The system displays the Customer dialog. | Normal (indent applied) | Indented |
| Select the customer to update. | Normal (numbers applied) | Numbered |
| Click the OK button to log on to AccSoft. | Normal (numbers applied) | Numbered |
| Note: If you do not know your USERID or Password, consult your System Administrator. | Note | Note |
| Warning: This database contains personal information about our clients. If you are logged on, do not leave your terminal unattended at any time. | Warning | Warning |

Metadata

An effective metadata strategy is vital to a unified content implementation. Without effective metadata, authors cannot find information. If authors can’t find it, they can’t reuse it. Chapter 9, “Designing metadata,” defines what metadata is and describes how to determine the required metadata for your information. During implementation, you need to be concerned about where the metadata is stored and how it is managed and maintained.

In implementation, metadata can be stored in different places, depending on the capabilities of your CMS and the data format you are using. A CMS usually enables you to define metadata when you check files into the database. Rather than store the metadata values in the data file, the CMS stores metadata in tables in the underlying database. If you are managing binary data formats (usually the format created by traditional tools) this is your only option. Ideally, metadata should be stored in the data file it is describing, with the content that it identifies.

In XML, metadata can be stored in elements or in attributes. Most interfaces between XML authoring tools and the CMS include functionality to extract the metadata from the XML file when it is checked in and apply it to the metadata fields in the CMS interface. If this functionality is not available, it can be implemented as a customization. Where possible, the update should be bidirectional. For example, if the metadata is updated in the source file, the changes should be saved to the CMS metadata database when the file is checked in. If the metadata is updated in the CMS, it should automatically be updated in the source file.
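
The two storage options might look like the following sketch. Every metadata name and value here (audience, status, metadata, author, keywords, last_reviewed) is hypothetical; your own metadata design determines the real ones.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Metadata as attributes: short values that qualify the element itself. -->
<procedure audience="accounts-payable" status="approved">
  <!-- Metadata as elements: longer or repeatable values (hypothetical names). -->
  <metadata>
    <author>J. Smith</author>
    <keywords>log on, login, password</keywords>
    <last_reviewed>2003-01-15</last_reviewed>
  </metadata>
  <title>Logging On to AccSoft</title>
  <para>To log on to AccSoft:</para>
</procedure>
```

Either form travels with the content, so a check-in routine can read the values from the file and copy them to the corresponding CMS metadata fields.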

For traditional authoring tools, metadata is stored in the CMS.

Style sheets

Style sheets have different purposes, depending on the technology that you are using. When used with traditional authoring tools, style sheets control both the look of the document in the editor and the look of the document in the output. In XML, style sheets have a much broader capability.

Style sheet purposes

Style sheets can have different purposes. You can have style sheets for specific outputs, which we will refer to as output style sheets. You can also have display style sheets, which are used to format content for display in authoring tools. For an enterprise solution, it is important to give authors control over their own display templates, whenever possible. This enables them to change the look of information in their authoring tool to make the tool most effective for them. However, authors should be educated to understand that the changes they make to their display of content will not be reflected in the output.

After the authoring is complete, the content management or publication engine uses the output templates to provide the format for the specified output. While authors should control their own display templates, output templates should remain “locked” to maintain their consistency and make them easier to manage.

How many style sheets are needed?

The minimum number of style sheets you need is one for each output format that you plan to create, for example, one for paper, one for online, and so on. In practice, you will probably need one for each individual output that you create.

Capabilities of XSL style sheets

Style sheets created for XML have capabilities beyond the simple display of content. They have all the power of traditional style sheets and templates, but can also provide additional functionality, such as

  • Sorting

  • Supplying boilerplate text

  • Hiding text

  • Repeating or rearranging text

These functions can be extremely powerful in a unified content solution. The more manipulation or processing that you can do in a style sheet, the better. In solutions featuring traditional desktop publishing tools and formats, manipulating text usually requires you to create scripts in specialized languages (proprietary to the tool) or in common programming/scripting languages, such as VB or JavaScript. XSL style sheets are easily created, and they do not need to be compiled. The formatting engine or parser can apply them immediately to a document.
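
Three of the functions listed above (sorting, supplying boilerplate text, and hiding text) can be sketched in a short XSLT 1.0 style sheet. The input structure is an assumption: a glossary element containing entry elements, each with a term, a definition, and an optional audience attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="glossary">
    <!-- Boilerplate text: the heading is supplied by the style sheet. -->
    <h1>Glossary</h1>
    <dl>
      <!-- Sorting: entries are output alphabetically by term. -->
      <xsl:apply-templates select="entry">
        <xsl:sort select="term"/>
      </xsl:apply-templates>
    </dl>
  </xsl:template>

  <xsl:template match="entry">
    <dt><xsl:value-of select="term"/></dt>
    <dd><xsl:value-of select="definition"/></dd>
  </xsl:template>

  <!-- Hiding text: an empty template suppresses internal-only entries. -->
  <xsl:template match="entry[@audience = 'internal']"/>
</xsl:stylesheet>
```

The template with the predicate has a higher default priority than the plain entry template, so internal entries are dropped without any change to the source content.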

Designing style sheets for output

Output style sheets can be used to configure and structure content for delivery. They can also filter the content for dynamic delivery. When you create style sheets, you should borrow from the best principles of reuse and the best principles of software development:

  • Make your style sheets modular.

  • Use your style sheet to generate as much text as possible.

  • Use variables whenever possible.

  • Use parameters whenever possible.

Note that not all authoring and publications tools will support all these principles. However, any time you can automate a task in a style sheet, eliminating a manual task for authors, you improve the consistency and predictability of your output.
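
A minimal sketch of the variable and parameter principles in XSLT 1.0 follows. The audience attribute and the parameter and variable names are assumptions; the para element with type="warning" follows Example 12.2.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Parameter: a default that can be overridden when the transformation is run. -->
  <xsl:param name="audience" select="'external'"/>

  <!-- Variable: defined once, reused wherever the product name is needed. -->
  <xsl:variable name="product" select="'AccSoft'"/>

  <xsl:template match="procedure">
    <div class="procedure">
      <p class="product"><xsl:value-of select="$product"/></p>
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Filter: a warning marked for a different audience is dropped. -->
  <xsl:template match="para[@type = 'warning']">
    <xsl:if test="not(@audience) or @audience = $audience">
      <p class="warning"><b>Warning: </b><xsl:apply-templates/></p>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Running the same style sheet with a different value for the audience parameter yields a differently filtered output from the same source, and the generated “Warning:” label is one less thing for authors to type.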

Modular style sheets

Style sheets should be designed and implemented as modular components. Modular style sheets let you reuse styles and layer formatting, which simplifies maintenance. For example, the following style sheet components could each be maintained in separate style modules:

  • Corporate look and feel: Fonts, colors, logos, and so on

  • Page description: Page size, margins, headers, footers, and so on

  • Inline elements: Emphasis, link, cross-references, and so on

  • Block elements: Sections, subsections, paragraphs, lists, and tables

Generated text

If you examine specific outputs, you are sure to see that often people treat certain information like content when it is really format. For example, headings for the following sections typically found in books and manuals are constant:

  • Table of Contents

  • List of Tables

  • List of Figures

  • Index

  • Glossary

  • Preface

These are not really content; they are navigation elements. No author should have to type the title “Table of Contents.” This heading should be applied by the style sheet. Whenever possible, you should build your style sheets to provide these types of textual elements automatically. This will guarantee consistency across your information set because repetitive text is applied programmatically and is not subject to the author’s typing skill.

Providing textual elements automatically can be done easily in XML with an XSL style sheet. In addition, many traditional authoring tools enable you to use paragraph numbering functions to provide the same text.

Style sheets can also generate all structural numbering in your document. Structural numbering includes chapter numbers, section numbers, appendix letters, and list numbers. You should avoid having to rely on authors to manually update these numbers.

Finally, whenever possible, you should use style sheets to generate the actual content of the table of contents, compile the index and glossary, and produce lists of tables and graphics. In traditional authoring tools, this functionality is in the tool, not the style sheet.
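
Generated headings, structural numbering, and a generated table of contents can be combined in one XSLT 1.0 sketch. It assumes a hypothetical source structure: a book element containing chapter elements, each with a title.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="book">
    <html>
      <body>
        <!-- Generated heading: no author types "Table of Contents". -->
        <h1>Table of Contents</h1>
        <ul>
          <!-- Generated entries: one line per chapter, numbered automatically. -->
          <xsl:for-each select="chapter">
            <li>
              <xsl:text>Chapter </xsl:text>
              <xsl:number count="chapter" format="1"/>
              <xsl:text>. </xsl:text>
              <xsl:value-of select="title"/>
            </li>
          </xsl:for-each>
        </ul>
        <xsl:apply-templates select="chapter"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="chapter">
    <!-- The chapter heading reuses the same generated label and number. -->
    <h1>
      <xsl:text>Chapter </xsl:text>
      <xsl:number count="chapter" format="1"/>
      <xsl:text>. </xsl:text>
      <xsl:value-of select="title"/>
    </h1>
    <!-- Process everything in the chapter except the title already used above. -->
    <xsl:apply-templates select="*[not(self::title)]"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the numbers are computed from each chapter's position, reordering or inserting chapters renumbers both the contents list and the headings with no manual updates.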

Control of production style sheets

The ideal situation for any unified content solution is to isolate production style sheets from authors; production style sheets should be maintained by an individual or group of individuals responsible for the style sheets (for example, an information technologist as described in Chapter 21, “Managing change”). The information technologist is responsible for making required changes and updates as content changes over time. Most authors do not have any access to the production style sheets. However, there may be times when you want to give authors options in presentation. For example, you may want to give authors an option of format (5×9 or 8.5×11) for paper output, or enable them to select from a variety of ways of displaying content, such as inline or in a sidebar or example.

Summary

You can implement a unified content solution in a variety of ways:

  • There are many factors that affect implementation, including

    • Scalability

    • Ease of use

    • Ease of finding information

  • Selecting the correct level of physical granularity depends on many factors, including the complexity of reuse and the capabilities of your CMS.

  • There are many options for implementing a unified CMS, including

    • Implementing your model in XML

    • Authoring forms

    • Structural templates (traditional authoring tools)

  • If you implement in XML, a DTD is required.

  • You should balance the use of semantic versus generic element or style names to ensure that you can find, reuse, and manipulate all required elements easily, without creating unworkable style sheets or DTDs.

  • Metadata tends to be stored with information components in the CMS, but can also be stored in XML files.

  • You should design style sheets very carefully to ensure that you

    • Use the maximum capabilities of the style sheets.

    • Make the style sheets modular for reuse.

    • Make the style sheets easy to maintain.
