Chapter 6. DocBook Assemblies

One modern school of thought on technical documentation stresses the development of independent units of documentation, often called topics, rather than a single narrative. Instead of writing something that DocBook users would easily recognize as a book consisting of a preface, several consecutive chapters, and a few appendixes, the author (or authors) write a set of discrete topics covering different aspects of the system as if they were wholly independent.

In a typical online presentation system, such as the World Wide Web or online help, each topic is a page that stands alone. Of course, just as no man is an island, no topic is completely unrelated to the other topics that are available.

From any given topic, there may be topics of obviously related interest. The nature of the relationships may vary. Some topics are related by physical proximity (if you're interested in the ink cartridges in a printer, you may also be interested in the print head), others by their procedural nature (adding or replacing memory, adding or replacing a hard drive, or even changing the CPU are all topics that might logically follow a topic that describes how to open the computer case).

In a single narrative, it is the responsibility of the author to manage these relationships. He or she can reasonably assume that anyone reading chapter 4 has read chapters 1, 2, and 3. If the reader needs to be directed elsewhere, a cross reference can be used (for example, “for more information on paper jams, see Section 3.5, The Paper Path”).

In a topic-oriented system, authors are explicitly instructed to write independent units. No linear order can be assumed and many forms of explicit cross-reference are discouraged.

Documentation managers treat the library of available topics very much as programmers treat libraries of available functions. Just as any given program can pick and choose from the available libraries, the documentation for any given system can pick and choose from the available topics.

If you imagine a large documentation group managing the documentation for several related systems (different models of printer, different configurations of a software system, computers assembled from different components, etc.), it's easy to see the appeal of topic-oriented authoring.

In a successful deployment, you might find a library of, say, 1,000 topics which, taken together, document five or six related systems, each of which uses 700-800 topics. Some topics are used in every system, many are used in several systems, and a small number of topics are unique to a specific system.

In order to make such a documentation platform functional, you need not only the individual topics, but also some sort of “map” or “assembly” file that describes which topics from the library are used, what relationships exist between them and, at least for print presentation, what linear order should be imposed upon them.

DocBook uses an assembly for this purpose.

1. Physical structure

The notion of a DocBook assembly is predicated on a few assumptions about the physical layout of the topics that are to be combined together into units of documentation. We call the units of documentation “structures” and the topics from which they are composed “resources”.

For the most part, we assume that resources exist as standalone documents accessible via URIs. The structures that result from assembling the resources may be a single file (as in a single PDF book) or a collection of files (as in a web site or help system).

Other arrangements are possible, but for simplicity, we assume that the resources are accessible via URIs and the resulting structures can be written to the local filesystem as one or more files.

2. Logical structure

Many features of an assembly allow the assembly process to update metadata associated with a document, for example changing the title or removing the metadata altogether. Throughout the description of assemblies, we assume that all metadata always occurs inside an info wrapper. In other words, although the following is perfectly legal:

<chapter>
  <title>Chapter Title</title>
  <info>
    <pubdate>2009-11-23</pubdate>
  </info>
  <para>Some chapter content.</para>
</chapter>

we always assume that this is instead represented with a single info element:

<chapter>
  <info>
    <title>Chapter Title</title>
    <pubdate>2009-11-23</pubdate>
  </info>
  <para>Some chapter content.</para>
</chapter>

Even in cases where there is no info element:

<section>
  <title>Section Title</title>
  <para>Some section content.</para>
</section>

we assume one is present:

<section>
  <info>
    <title>Section Title</title>
  </info>
  <para>Some section content.</para>
</section>

Authors are not required to write this way in order to use assemblies; processing systems must simply behave as if they had.

The assumption that all metadata is always present in a single, explicit info wrapper greatly simplifies the exposition of assemblies without introducing any actual limitations.
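
The normalization described above can be sketched as a small XSLT 1.0 stylesheet. This is only an illustration of the "behave as if" rule, not part of the assembly mechanism itself. It assumes namespaced DocBook 5 input, and the choice to handle only chapter, appendix, section, and topic is an assumption made to keep the example short.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/ns/docbook"
                xmlns="http://docbook.org/ns/docbook"
                exclude-result-prefixes="db">

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: only these four elements are normalized in this sketch. -->
  <xsl:template match="db:chapter[db:title] | db:appendix[db:title]
                       | db:section[db:title] | db:topic[db:title]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <info>
        <!-- Keep the attributes of an existing info, then the title
             elements, then whatever the existing info already contained. -->
        <xsl:copy-of select="db:info/@*"/>
        <xsl:apply-templates select="db:title | db:titleabbrev | db:subtitle"/>
        <xsl:apply-templates select="db:info/node()"/>
      </info>
      <!-- Copy the remaining children, skipping what was moved into info. -->
      <xsl:apply-templates
          select="node()[not(self::db:title or self::db:titleabbrev
                         or self::db:subtitle or self::db:info)]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Elements that already keep their title inside info, or that have no title at all, fall through to the identity template and are copied unchanged.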

3. Assembly Files

An assembly has four major parts:

resources

Identifies a collection of topics. An assembly may identify one or more collections.

structure

Identifies a single artifact to be assembled. At present, we assume that these artifacts are single documents, but that may change. A document in this case is the particular collection of topics that forms the documentation for a product or system. An assembly may identify one or more structures.

relationships

Identifies relationships between resources. These relationships may be manifested in any number of structures during assembly. An assembly may identify any number of relationships.

transforms

Identifies transformations that can be applied during assembly. An assembly may identify any number of transformations.
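
As a rough sketch, the four parts might fit together in one file as shown here. The assembly root element, the relationships wrapper, the order of the parts, and the identifiers topicB and example-doc are assumptions made for illustration; check them against the schema you are using.

```xml
<assembly xmlns="http://docbook.org/ns/docbook" version="5.1">
  <!-- resources: the topics available to this assembly -->
  <resources>
    <resource xml:id="topicA" href="uri/for/topicA.xml"/>
    <resource xml:id="topicB" href="uri/for/topicB.xml"/>
  </resources>

  <!-- structure: one artifact to be assembled from those topics -->
  <structure xml:id="example-doc">
    <output renderas="book"/>
    <module resourceref="topicA"/>
    <module resourceref="topicB"/>
  </structure>

  <!-- relationships: assertions that certain resources are related -->
  <relationships>
    <relationship type="seealso">
      <instance linkend="topicA"/>
      <instance linkend="topicB"/>
    </relationship>
  </relationships>

  <!-- transforms: transformations available during assembly -->
  <transforms>
    <transform name="tutorial" grammar="text/xsl" href="db2tutorial.xsl"/>
  </transforms>
</assembly>
```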

3.1. Resources

Individual resources are identified through a resource element:

<resource xml:id="topicA" href="uri/for/topicA.xml"/>

Here the resource “topicA” is associated with the URI uri/for/topicA.xml. Relative URIs are made absolute with respect to the base URI of the element on which they appear. An optional description attribute may also be used to provide a description of the resource.

Fragment Identifiers

The URI specified in the href attribute may include a fragment identifier, such as uri/for/topicA.xml#sectionid. Which, if any, fragment identifier schemes are supported is implementation defined.

A collection of resources may appear in a resources wrapper. The primary motivation for the resources wrapper is to group physically and logically colocated resources together. The resources wrapper is a convenient place, for example, to specify a common xml:base or xml:lang and description.
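
For instance, a wrapper along the following lines would let two colocated topics share one base URI and language. The directory, file names, identifiers, and the overview fragment are hypothetical, and whether the fragment identifier is honored depends on the implementation.

```xml
<resources xml:base="printers/common/" xml:lang="en">
  <!-- Resolved against xml:base: printers/common/ink-cartridges.xml -->
  <resource xml:id="ink-cartridges" href="ink-cartridges.xml"/>
  <!-- A fragment identifier selects part of a document, if supported. -->
  <resource xml:id="print-head" href="print-head.xml#overview"/>
</resources>
```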

3.2. Structures

Structures are where the real work is performed. There is considerable tension between the goal of providing structure elements that are simple for the simple cases and simultaneously allowing them to be complex enough to describe the full richness of possibility.

This inherent tension is compounded by the fact that the nature of structure outputs is naturally open ended. The goal is that DocBook assemblies should be able to handle both the richness of the structures that we produce today as well as the structures we will want to produce tomorrow.

The principal task of the structure element is to identify the resources that are to be combined for that structure. The resulting structure may be delivered in any number of ways: as a single document, help system, or web site, for example.

It is important to observe that the realized structure is often, but not necessarily, a valid DocBook document. What is important is not its validity from an authoring perspective, but rather its validity with respect to what the downstream processor is expecting.

For example, if the structure is defining a book, then the realized structure should be a valid book. If, on the other hand, what is defined is a help system, then the realized structure may be a collection of nested topic elements. Even though authors are not allowed to nest topics (per the DocBook 5.1 topic element), the assembly process may be allowed to nest them.
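
A realized structure for a help system might therefore look something like the following sketch. The titles are illustrative only, and the nesting shown is what an assembly processor could emit, not something an author could validly write by hand.

```xml
<topic xmlns="http://docbook.org/ns/docbook">
  <title>Tasks</title>
  <!-- content of the parent topic -->
  <topic>
    <title>Building Documents</title>
    <!-- content of the first nested topic -->
  </topic>
  <topic>
    <title>Packaging Books for Publication</title>
    <!-- content of the second nested topic -->
  </topic>
</topic>
```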

A structure consists mostly of module elements.

3.3. Modules

The module element is a wrapper for a unit of structure. The attributes and children of a module provide facilities for selecting the content of the module and controlling its presentation.
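
A minimal sketch of modules in use: the resourceref attribute selects the content, and an output child controls how it is presented. The structure identifier quick-start and the resource topicB are hypothetical.

```xml
<structure xml:id="quick-start">
  <output renderas="article"/>
  <!-- Each module selects one resource and renders it as a section. -->
  <module resourceref="topicA">
    <output renderas="section"/>
  </module>
  <module resourceref="topicB">
    <output renderas="section"/>
  </module>
</structure>
```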

3.4. Relationships

A relationship, in the context of an assembly, is an assertion that a particular collection of resources are related. The nature of the relationship is specified with a type attribute. For example:

<relationship type="seealso">
  <instance linkend="tut1"/>
  <instance linkend="tut2"/>
  <instance linkend="task1"/>
</relationship>

This asserts that there is a “seealso” relationship between these resources. The processor might use this information to automatically generate a “See Also” section at the end of any of these topics.

Relationships are often, but not necessarily, unordered. This relationship establishes a path through a set of resources:

<relationship type="path">
  <info>
    <title>New User Introduction</title>
  </info>
  <instance linkend="over1"/>
  <instance linkend="over2"/>
  <instance linkend="task3"/>
  <instance linkend="cleanup"/>
</relationship>

A sophisticated help system could use this path to guide a new user through a sequence of resources.

3.5. Transformations

An assembly can identify a collection of transformations that can be used during the assembly process. A transformation can be associated with a resource (for example, to translate from some other format into DocBook), or with a module (to address requirements beyond the limited transformation capabilities of the assembly).

<transforms>
  <transform name="dita2docbook" grammar="text/xsl" href="dita2db.xsl"/>
  <transform name="tutorial" grammar="text/xsl" href="db2tutorial.xsl"/>
  <transform name="art2pi" grammar="text/xsl" href="art2pi.xsl"/>
  <transform name="office" grammar="application/xproc+xml" href="office2db.xpl"/>
  <transform name="office" grammar="text/xsl" href="extractoffice.xsl"/>
</transforms>

If there are several ways to provide a transformation, they may all be listed under the same name provided that they have different types (as given by the grammar attribute). In the example above, it may be that the XProc transformation from office documents to DocBook is superior to the XSLT-only transformation, but the XSLT-only transformation is better than nothing. If no type is specified, the default is implementation dependent.

Note

Not all systems may support arbitrary transformations, nor can all systems support all possible transformation languages. To maximize interoperability it is best to use as few explicit transformations as possible, ideally none.

Given the transformations above, a DITA resource might be included in the assembly:

<resource xml:id="overview" href="dita/over.xml"
          transform="dita2docbook"/>

Whenever a module refers to this resource, it will receive the transformed, DocBook result.

If a module needs to perform a transformation to get from one DocBook format to another, it can name a transform as well:

<module resourceref="overview">
  <transform name="art2pi"/>
  <output type="book" renderas="partintro"/>
</module>

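In this case, two transformations will occur: the dita2docbook transform named on the resource, followed by the art2pi transform named in the module. This can be generalized to an arbitrary number by listing more than one transform in the module. The transforms are applied in the order specified.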

If the output specifies a renderas, it is applied to the result of the last transformation.
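
As an illustration, a module that lists two transforms might look like this. It reuses the transform names declared earlier; pairing them in this way is an assumption made only to show the ordering.

```xml
<module resourceref="overview">
  <!-- The resource's own dita2docbook transform has already run. -->
  <transform name="art2pi"/>    <!-- applied first -->
  <transform name="tutorial"/>  <!-- applied second -->
  <!-- renderas applies to the result of the tutorial transform. -->
  <output type="book" renderas="partintro"/>
</module>
```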

4. Describing Structural Relationships With Assemblies

In addition to providing a mechanism for describing the structure of the classical DocBook products such as books and articles, the assembly element can be used to describe the relationships among non-linear collections of topics such as are presented in Web sites and help systems. While help systems can be produced by chunking a typical book into pieces, it is important to emphasize the non-linear nature of the content while creating it, to avoid the presumption that a particular topic follows its predecessor in the document.

Another characteristic of collections of topics such as help systems and Web sites is that it can be, at times, important to maintain explicit control over file names (to produce consistent URIs or to reduce churn in source control systems). The use of the assembly mechanism easily allows control over file names for the delivered products.

It is also desirable, at times, to produce multiple output formats for some of the non-linear documents, with different locations for some portions of the content (the index may go near the top of a navigational hierarchy in the HTML delivery of a help system, while it traditionally would be at the back of a printed version of a help system). The ability to direct an element to particular outputs and suppress it in other output formats supports this requirement.

At this point, working through an example may be helpful.

4.1. Describing a Help System with an Assembly

This section describes the use of an assembly to describe the hierarchical relationship among pages in a help system, and the use of the type attribute on assemblies to carry information beyond the simple structure of a document. While there may be other specific considerations applying the assembly element to the description of a Web site, similar considerations are likely to apply.

Note

This is a simplified example, but based on a real system.

4.1.1. Background

This is the help system for a simple documentation build system that has two Web-based screens that allow the user to build documents (books rendered into PDF and chunked HTML, and help systems that are delivered with products using a Webhelp-type system).

There are four tasks supported by the system. Two screens of help are provided for the screens in the user interface, and the help system organization has to comply with documentation standards.

Help System Structure

Each entry in this list represents a node (HTML page) in the help system.

  • ToC

  • Index

  • Overview (Welcome)

  • Tasks

    • Building Documents

    • Packaging Books for Publication

    • Packaging Help for Submitting to Product Builds

    • Packaging Documents (Books and Help) for Localization

  • Screens and Menus

    • Document Build Screen

    • Document Packaging Screen

  • Printable Version (PDF)

  • Glossary

  • Help on Help

The Webhelp system delivers topics of help with a navigational hierarchy on the left-hand side of the content presented for each topic. The navigational hierarchy indicates the location of the topic in the hierarchy by highlighting it and can be used to navigate to other topics in the help system. Help is provided for each page in the interface. In addition, a glossary is provided, with links from glossterms in the body of the help system to the glossary, along with an index and full text search that must be built across all the topics in the help system.

In addition to being delivered using Webhelp, a printable version of the help system is delivered with the help system in the form of a PDF generated by walking the structure of the help system, collecting the individual topics to create a book with the structure derived from the hierarchical structure of the help system. However, the PDF follows the conventions of printed books, putting the index at the back of the book, and does not include the help on help content.

In addition to the files for the Webhelp HTML and JavaScript, supplementary files are delivered to allow merging with other help systems and to support integration of the help on screens with the product help buttons. Some of the HTML files are expected to have standard names.

The following example assumes that the Webhelp transform has been extended to support non-chunked generation of the system by passing in a topic for each page to be rendered. The navigational structure will be derived from the assembly.

4.1.2. The Resources

A very simple set of resources is specified for the help system. The file names are immaterial, and all the referenced file resources are presumed to be topics except the glossary, which is simply a glossary element (and usually derived from a glossary database).

Example 6.1. Resources for a Help System
  <resources xml:base="some/path">
    <description>Resources for help system.</description>
    <resource xml:id="overview" href="overview.xml">
      <description>Overview of help system.</description>
    </resource>
    <resource xml:id="task.build.doc">
      <description>Describe how to build a document.</description>
    </resource>
    <resource xml:id="pckg.intro" href="task-intro.xml">
      <description>Introduction to the packaging tasks.</description>
    </resource>
    <resource xml:id="task.package.book.pub">
      <description>Describe how to package a document for publication.</description>
    </resource>
    <resource xml:id="task.package.help">
      <description>Describe how to package a help system for submission to product
        build.</description>
    </resource>
    <resource xml:id="task.package.l10n">
      <description>Describe how to package source files for localization.</description>
    </resource>
    <resource xml:id="screen.overview">
      <description>Introductory text for screens and menus section.</description>
    </resource>
    <resource xml:id="screen.build.doc">
      <description>Displayed when question mark icon on document build screen is
        clicked.</description>
    </resource>
    <resource xml:id="screen.package">
      <description>Displayed when question mark icon on packaging screen is clicked.</description>
    </resource>
    <resource xml:id="glossary">
      <description>Glossary for help system.</description>
    </resource>
    <resource xml:id="help.on.help">
      <description>How to use the help system.</description>
    </resource>
  </resources>

4.1.3. Setting Up the Structure

The first few lines of the structure set up general things about the help system:

Example 6.2. Defining the Basics of the Help System
  <structure type="help.system">
    <info>
      <title>XIDI Build System Help</title>
      <titleabbrev>XIDI Help</titleabbrev>
    </info>
    <output outputformat="pdf" file="sys-book.pdf" renderas="book"/>
    <!-- The PDF output will be rendered into a file expected by the help
         system for the printable version of the help system. -->
    <output outputformat="webhelp" chunk="false" renderas="topic"/>
    <!-- The webhelp output will NOT be chunked and the default render
         (wherever a file name is specified) is as a topic.  If the
         renderas on an output that is a child of structure is not treated as
         a default for specified files, a renderas has to be added to each of
         the webhelp output statements in the modules that specify a file. -->

The type attribute provides information to the render system so that the auxiliary files for merging and integration are built in addition to the Webhelp system.

The output element for pdf specifies the name of the file to use and tells the render system to render a book as the printable version of the help system.

The output element for webhelp suppresses chunking and sets the default value for the webhelp output statements that do not have a renderas attribute.

4.1.4. Standard Front End

By convention, the first two pages in the help system are the table of contents and the index. However, in the book, the index goes in the back.

Example 6.3. The Front End
    <module>
      <output outputformat="webhelp" file="sys-toc.html" renderas="topic"/>
      <toc/>
      <!-- By convention, there is a ToC at the beginning of a help system and
           of the PDF of the help system.  The delivery system expects the file
           to be named sys-toc.html. -->
    </module>
    <module>
      <output outputformat="webhelp" file="sys-index.html"/>
      <output outputformat="pdf" suppress="true"/>
      <index/>
      <!-- By convention, there is an index at the beginning of a help system
           but it is expected to be at the end of a book.  The delivery system
           expects the file to be named sys-index.html -->
    </module>

4.1.5. Main Body of the Help System

The main content of the help system is broken into an overview (which can have lower-level topics below the main overview topic), task descriptions, which may be grouped together or appear at the top level if they are considered important enough to the user, and a set of help pages for screens.

Each task description has a task element with a summary, prerequisites, the procedure, and lists of related topics and tasks. Help on screens always includes standard information such as how to navigate to the screen and what tasks the screen is used to accomplish. In complex systems, screens can be grouped together to deal with things like Wizards and related screens and dialogs.

Example 6.4. Middle Section of Help System
    <module resourceref="overview">
      <output outputformat="webhelp" file="overview.html"/>
      <output outputformat="pdf" renderas="chapter"/>
    </module>
    <module resourceref="task.build.doc">
      <output outputformat="webhelp" file="tsk-build-doc.html"/>
      <output outputformat="pdf" renderas="section"/>
    </module>
    <module>
      <output outputformat="webhelp" file="pckg-intro.html"/>
      <output outputformat="pdf" renderas="chapter"/>
      <title>Packaging</title>
      <module resourceref="pckg.intro" contentonly="true" omittitles="true"/>

      <module resourceref="task.package.book.pub">
        <output outputformat="webhelp" file="tsk-pckg-book-pub.html"/>
        <output outputformat="pdf" renderas="section"/>
      </module>
      <module resourceref="task.package.help">
        <output outputformat="webhelp" file="tsk-pckg-help.html"/>
        <output outputformat="pdf" renderas="section"/>
      </module>
      <module resourceref="task.package.l10n">
        <output outputformat="webhelp" file="tsk-pckg-l10n.html"/>
        <output outputformat="pdf" renderas="section"/>
      </module>
    </module>
    <module>
      <output outputformat="webhelp" file="scr-overvw.html"/>
      <output outputformat="pdf" renderas="chapter"/>
      <title>Screens</title>
      <module resourceref="screen.overview" contentonly="true"
              omittitles="true"/>
      <module resourceref="screen.build.doc">
        <output outputformat="webhelp" file="scr-build.html"/>
        <output outputformat="pdf" renderas="section"/>
      </module>
      <module resourceref="screen.package">
        <output outputformat="webhelp" file="scr-package.html"/>
        <output outputformat="pdf" renderas="section"/>
      </module>
    </module>

4.1.6. Standard Back End for the Help System

Once the main body of the help system is built, the standard back end components are added.

Example 6.5. Back End of Help System
    <module>
      <output outputformat="webhelp"/> 
      <!-- No file, just want the link in the navigation. -->
      <output outputformat="pdf" suppress="true"/>
      <!-- Nothing in PDF. -->
      <title xmlns:xl="http://www.w3.org/1999/xlink" xl:href="sys-book.pdf">Printable Version (PDF)</title>
    </module>
    <module resourceref="glossary">
      <output outputformat="webhelp" file="sys-glossary.html"/>
      <!-- The help delivery system expects the glossary to be in a file named
           sys-glossary.html.-->
    </module>
    <module>
      <output outputformat="webhelp" suppress="true"/>
      <index/>
      <!-- By convention, the index is at the back of a book, but it has
           already been presented at the front of the help system. -->
    </module>
    <module resourceref="help.on.help">
      <output outputformat="webhelp" file="help-on-help.html"/>
      <output outputformat="pdf" suppress="true"/>
      <!-- By convention help on how to use the help system is not included in
           the printable version of the help system and is stock, pulled from a
           library. -->
    </module>
  </structure>

4.1.7. What Happens

Presuming an Ant (or other target based) build system for this help system document, the following targets are meaningful for building the document described by this structure.

Targets for Building the Help System

build.help

Builds a help system with the following:

  • Webhelp files, including the full text search inverse index files

  • XML files for merging with other help systems

  • Properties files for integrating into the product

build.pdf

Builds the PDF printable version.

build.all

Builds the help system and the printable version, packaged into the same directory structure used to deliver the help system with the product, so that it can be zipped up for delivery into the product source control system.
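
A skeleton Ant build file with these three targets might look like the following. The project name is hypothetical and the target bodies are placeholders; a real build would invoke the assembly processor and the Webhelp and PDF transforms in place of the echo tasks.

```xml
<project name="xidi-help" default="build.all">

  <!-- Placeholder: would generate the Webhelp files, the XML merge
       files, and the properties files. -->
  <target name="build.help">
    <echo message="Building Webhelp, merge files, and properties files"/>
  </target>

  <!-- Placeholder: would render the printable version, sys-book.pdf. -->
  <target name="build.pdf">
    <echo message="Building sys-book.pdf"/>
  </target>

  <!-- Runs both builds, then packages the results for delivery. -->
  <target name="build.all" depends="build.help,build.pdf">
    <echo message="Packaging help system and printable version"/>
  </target>

</project>
```

Making build.all depend on the other two targets keeps the help system and the PDF individually buildable while guaranteeing that both exist before packaging.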

5. Example: Assembling a Printed Book

For the purposes of this section, let's assume that we have the following resources available:

<resource xml:id="full-toc">
  <toc/>
  <toc role="figures"/>
  <toc role="tables"/>
  <toc role="procedures"/>
</resource>

<resource xml:id="index">
  <index/>
</resource>

<resources xml:base="tutorial/">
  <resource xml:id="tut1" href="tut1.xml"/>
  <resource xml:id="tut2" href="tut2.xml"/>
  <resource xml:id="tut3" href="tut3.xml"/>
  <resource xml:id="tut4" href="tut4.xml"/>
  <resource xml:id="tut5" href="tut5.xml"/>
</resources>

<resources xml:base="tasks/">
  <resource xml:id="task1" href="task1.xml"/>
  <resource xml:id="task2" href="task2.xml"/>
  <resource xml:id="task3" href="task3.xml"/>
  <resource xml:id="task4" href="task4.xml"/>
</resources>

Further, let's assume that each of the tutorials and tasks is authored as a standalone topic.

Our first challenge is to create a PDF user guide. A subject matter expert has informed us that the proper linear presentation for the user guide is tutorials 1 and 2, task 1, tutorial 3, task 4, and an index.

Our first attempt at a structure might look like this:

<structure xml:id="user-guide">
  <output renderas="book"/>
  <module resourceref="full-toc"/>
  <module resourceref="tut1"/>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module resourceref="tut3"/>
  <module resourceref="task4"/>
  <module resourceref="index"/>
</structure>

The output element identifies the kind of output to be generated. There may be more than one type. The renderas attribute on the output tells us that the realized structure should be a book.

The module elements identify the resources to be included. That's fine as far as it goes. Processing this structure will create a realized structure that is a book (because that's what the renderas on the output tells us) consisting of the tables of contents, the correct five topics, and the index:

<book xmlns="http://docbook.org/ns/docbook">
  <toc/>
  <toc role="figures"/>
  <toc role="tables"/>
  <toc role="procedures"/>
  <topic xml:base="tutorial/tut1.xml">
    <title>Introduction</title>
    ...
  </topic>
  <topic xml:base="tutorial/tut2.xml">
    <title>Getting Started</title>
    ...
  </topic>
  <topic xml:base="tasks/task1.xml">
    <title>Engaging the spindle</title>
    ...
  </topic>
  <topic xml:base="tutorial/tut3.xml">
    <title>Troubleshooting</title>
    ...
  </topic>
  <topic xml:base="tasks/task4.xml">
    <title>Diagnosing spindle problems</title>
    ...
  </topic>
  <index/>
</book>

The trouble is, that's not a valid book. Not only does it consist of topics instead of the expected chapters and such, but it's lacking necessary metadata like a title.

We can address the first problem by allowing output elements inside modules.

Most high-level elements (chapter, appendix, book, section, topic, etc.) in DocBook have the same general structure: they contain an (optional) info element followed by a collection of block elements.

The semantics of renderas on a module that points to a resource is that it renames the root element of that resource. This simple transformation will allow us to turn all those topics into elements more appropriate for book content. (We'll consider later how to address the situation where a more complex transformation is necessary.)

We can address the problem of missing metadata by creating a new resource that contains an info element containing the appropriate metadata and then pointing to it. Alternatively, you can supply a merge element containing metadata for the structure. The merge element has the same content model as info. Metadata elements that you include in the merge element are applied to the output element for the structure.

With these amendments, here's our new structure:

<structure xml:id="user-guide">
  <output renderas="book"/>
  <merge>
    <title>Widget User Guide</title>
  </merge>
  <module resourceref="full-toc"/>
  <module resourceref="tut1">
    <output renderas="chapter"/>
  </module>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module resourceref="tut3">
    <output renderas="appendix"/>
  </module>
  <module resourceref="task4"/>
  <module resourceref="index"/>
</structure>

The realized structure that this produces is:

<book xmlns="http://docbook.org/ns/docbook">
  <info>
    <title>Widget User Guide</title>
  </info>
  <toc/>
  <toc role="figures"/>
  <toc role="tables"/>
  <toc role="procedures"/>
  <chapter xml:base="tutorial/tut1.xml">
    <title>Introduction</title>
    ...
  </chapter>
  <chapter xml:base="tutorial/tut2.xml">
    <title>Getting Started</title>
    ...
  </chapter>
  <chapter xml:base="tasks/task1.xml">
    <title>Engaging the spindle</title>
    ...
  </chapter>
  <appendix xml:base="tutorial/tut3.xml">
    <title>Troubleshooting</title>
    ...
  </appendix>
  <appendix xml:base="tasks/task4.xml">
    <title>Diagnosing spindle problems</title>
    ...
  </appendix>
  <index/>
</book>

That realized structure is a valid book, so you would expect the processing system to produce the correct results.

Except that it's not the right book. Upon further review by our subject matter expert, it's been decided that tutorial 3 and task 4 shouldn't be separate appendixes; they should instead both be sections in a single appendix called “Troubleshooting”.

We can address this issue with nested modules. Instead of having separate top-level modules for the tutorial and task, we'll make them subordinate modules in a new top-level module:

<structure xml:id="user-guide">
  <output renderas="book"/>
  <merge>
    <title>Widget User Guide</title>
  </merge>
  <module resourceref="full-toc"/>
  <module resourceref="tut1">
    <output renderas="chapter"/>
  </module>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module>
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting</title>
    </merge>
    <module resourceref="tut3">
      <output renderas="section"/>
    </module>
    <module resourceref="task4"/>
  </module>
  <module resourceref="index"/>
</structure>

The semantics of renderas on a module that does not have a resourceref is that it simply generates that element.

The realized document is now:

<book xmlns="http://docbook.org/ns/docbook">
  <info>
    <title>Widget User Guide</title>
  </info>
  <toc/>
  <toc role="figures"/>
  <toc role="tables"/>
  <toc role="procedures"/>
  <chapter xml:base="tutorial/tut1.xml">
    <title>Introduction</title>
    ...
  </chapter>
  <chapter xml:base="tutorial/tut2.xml">
    <title>Getting Started</title>
    ...
  </chapter>
  <chapter xml:base="tasks/task1.xml">
    <title>Engaging the spindle</title>
    ...
  </chapter>
  <appendix>
    <info>
      <title>Troubleshooting</title>
    </info>
    <section xml:base="tutorial/tut3.xml">
      <title>Troubleshooting</title>
      ...
    </section>
    <section xml:base="tasks/task4.xml">
      <title>Diagnosing spindle problems</title>
      ...
    </section>
  </appendix>
  <index/>
</book>

That's probably not quite what was intended. Given that tutorial 3 has the title “Troubleshooting”, what we probably wanted was to make that tutorial the appendix and make the task a subordinate section of that appendix.

We can achieve that by nesting the modules differently:

  <module resourceref="tut3">
    <output renderas="appendix"/>
    <module resourceref="task4">
      <output renderas="section"/>
    </module>
  </module>

Now the appendix in the realized structure has the form that we want:

  <appendix xml:base="tutorial/tut3.xml">
    <title>Troubleshooting</title>
    ...
    <section xml:base="tasks/task4.xml">
      <title>Diagnosing spindle problems</title>
      ...
    </section>
  </appendix>

If a module “A” is nested within a module “B” that refers to a resource, the result of processing “A” is inserted into “B” as the last child of “B”. (In the case of several nested modules, they are inserted as the last children in the order specified.)

After further review of the content, the subject matter expert decides that tutorial 5 and task 3 are also relevant to troubleshooting. Tutorial 5 also has the title “Troubleshooting” and naturally follows tutorial 3. Task 3 is about diagnosing bearing issues.

It's easy to combine the new content into the appendix:

  <module resourceref="tut3">
    <output renderas="appendix"/>
    <module resourceref="tut5">
      <output renderas="section"/>
    </module>
    <module resourceref="task4"/>
    <module resourceref="task3"/>
  </module>

but that doesn't produce a pleasing result:

  <appendix xml:base="tutorial/tut3.xml">
    <title>Troubleshooting</title>
    ...
    <section xml:base="tutorial/tut5.xml">
      <title>Troubleshooting</title>
      ...
    </section>
    <section xml:base="tasks/task4.xml">
      <title>Diagnosing spindle problems</title>
      ...
    </section>
    <section xml:base="tasks/task3.xml">
      <title>Diagnosing bearing problems</title>
      ...
    </section>
  </appendix>

What we really want to do is combine tutorials 3 and 5, not make 5 a subordinate section of 3. While we're at it, let's change the title of the appendix to “Troubleshooting spindle and bearing problems” since that's what it's really about.

We can solve the first problem with the contentonly attribute on module:

  <module resourceref="tut3">
    <output renderas="appendix"/>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
    </module>
    <module resourceref="task3"/>
  </module>

If contentonly is true, then only the content of the document element, and not the document element itself, is included. Any metadata associated with the document element is also discarded. (If the referenced resource has several top-level element children, then this processing is applied to each, in turn.)

The realized structure for this appendix is now:

  <appendix xml:base="tutorial/tut3.xml">
    <title>Troubleshooting</title>
    ...
    ...content (only) of tutorial 5...
    <section xml:base="tasks/task4.xml">
      <title>Diagnosing spindle problems</title>
      ...
    </section>
    <section xml:base="tasks/task3.xml">
      <title>Diagnosing bearing problems</title>
      ...
    </section>
  </appendix>

There are two ways that we can change the title. The first uses only semantics we've already encountered:

  <module>
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut3" contentonly="true"/>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
    </module>
    <module resourceref="task3"/>
  </module>

The only disadvantage of this approach is that we lose all of the metadata associated with tutorials 3 and 5. This might include publication dates, copyright information, etc. We could add those fields to the merge wrapper for the appendix in the assembly, but then they may have to be maintained in two places.

Instead, we introduce one more convention: if a module refers to another resource and contains a merge element, then the elements within that merge replace any elements of the same name in the referenced resource's metadata. (If the referenced resource has no metadata, then the specified merge becomes an info and is inserted as the first child of the referenced resource.)

The second approach uses this convention:

  <module resourceref="tut3">
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
    </module>
    <module resourceref="task3"/>
  </module>

In the realized structure for this example, the appendix metadata includes everything in tutorial 3's metadata, with the title changed as indicated. Of course, all of the metadata for tutorial 5 is still lost, but there's nothing we can do about that because there's nowhere to put it.
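
As a rough sketch of that result, assume that tutorial 3 carries an info element holding its title, a publication date, and a copyright notice. Those metadata fields are hypothetical here, and the elided values stand for whatever the resource actually contains. The realized appendix might then begin like this:

```xml
<appendix xml:base="tutorial/tut3.xml">
  <info>
    <!-- Replaced by the title in the module's merge element -->
    <title>Troubleshooting spindle and bearing problems</title>
    <!-- Hypothetical tutorial 3 metadata, carried over unchanged -->
    <pubdate>...</pubdate>
    <copyright>...</copyright>
  </info>
  ...
</appendix>
```

Only the title differs from the original resource, so the remaining fields are still maintained in one place.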

6. Example: Assembling an Online Book

This example extends the book example; see Section 5, “Example: Assembling a Printed Book”. In addition to the printed user guide, we want to produce an online user guide. The online guide is, by design, a straightforward, linear rendering of the book. We'll look at a more intricately structured example later.

We could create a separate assembly for the online guide, but it will be better in the long run if we can manage the structure of the user guide in one place.

The first step is to create a new output type:

<structure xml:id="user-guide">
  <output type="book" renderas="book"/>
  <output type="web" renderas="book"/>
  <merge>
    <title>Widget User Guide</title>
  </merge>
  <module resourceref="full-toc"/>
  <module resourceref="tut1">
    <output renderas="chapter"/>
  </module>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module resourceref="tut3">
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
    </module>
    <module resourceref="task3"/>
  </module>
  <module resourceref="index"/>
</structure>

Now we can process this structure either as a “book” type structure or a “web” type structure. At the moment, there's no difference between them. Let's consider some of the changes we need to make for the online site:

  1. We want to control the names of the HTML documents produced.

  2. We want to control how chunking is performed.

  3. We want to suppress some “print-only” content.

The first two of these we can address with output elements:

<structure xml:id="user-guide">
  <output type="book" renderas="book"/>
  <output type="web" renderas="book" file="user-guide.html"/>
  <merge>
    <title>Widget User Guide</title>
  </merge>
  <module resourceref="full-toc">
    <output type="web" chunk="false"/>
  </module>
  <module resourceref="tut1">
    <output renderas="chapter"/>
  </module>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module resourceref="tut3">
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
      <output type="web" chunk="false"/>
    </module>
    <module resourceref="task3"/>
  </module>
  <module resourceref="index">
    <output type="web" file="book-index.html"/>
  </module>
</structure>

Specifying a file on an output tells the processor where to write the chunk. (If it occurs on a module that is not chunked, it is simply ignored.) The chunk attribute can be used to specify which modules should (or should not) be placed in separate chunks. The implied value on modules where it is not specified depends on the name of the rendered element and the output type.
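
As a small illustration (the file name is hypothetical), the file attribute in the following fragment has no effect, because the same output element prevents the module from being chunked:

```xml
<module resourceref="task4">
  <!-- file is ignored: chunk="false" means no separate chunk is written -->
  <output type="web" chunk="false" file="spindle-problems.html"/>
</module>
```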

When processing a module, all of the relevant output directives are combined. An output is relevant if it does not specify a type or if the type specified matches the type of processing that is being performed. If conflicting values are specified for any attribute, the last value specified (in document order) is used.
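
The following fragment is a minimal sketch of that rule and is not part of the user guide assembly. It gives the task 4 module two output elements that disagree about chunking, and it assumes that chunk accepts “true” as the counterpart of the “false” value used above:

```xml
<module resourceref="task4">
  <!-- No type: relevant to every kind of processing -->
  <output renderas="section" chunk="true"/>
  <!-- type="web": relevant only when processing for the web -->
  <output type="web" chunk="false"/>
</module>
```

When this module is processed for the web, both output elements are relevant and the later chunk value wins, so the section is not placed in a separate chunk. When it is processed as a book, only the first output is relevant, so the section is chunked. In both cases, renderas is taken from the first output because nothing conflicts with it.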

To exclude (or include) content, we add the filterout (and filterin) elements to the assembly. Each may have a type and specifies some number of effectivity values.

If no conditions are specified on filterout, the module is unconditionally excluded. (Specifying no conditions on a filterin has no effect.)

If filters are used at the structure level, they apply to the entire document. If they are used at the module level, they apply only to that module.

With filters, our structure might look like this:

<structure xml:id="user-guide">
  <output type="book" renderas="book"/>
  <output type="web" renderas="book" file="user-guide.html"/>
  <filterout type="web" condition="print"/>
  <merge>
    <title>Widget User Guide</title>
  </merge>
  <module resourceref="full-toc">
    <output type="web" chunk="false"/>
  </module>
  <module resourceref="tut1">
    <output renderas="chapter"/>
  </module>
  <module resourceref="tut2">
    <filterin type="web" condition="print"/>
    <filterout type="web" userlevel="advanced"/>
  </module>
  <module resourceref="task1"/>
  <module resourceref="tut3">
    <output renderas="appendix"/>
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4">
      <output renderas="section"/>
      <output type="web" chunk="false"/>
    </module>
    <module resourceref="task3"/>
  </module>
  <module resourceref="index">
    <filterout type="web"/>
  </module>
</structure>

This structure applies the following filters when rendering for the web: print-only content is excluded globally; in tutorial 2, print-only content is included, but advanced material is excluded; and the entire index is excluded.
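
To see what those filters act on, consider a sketch of how a resource might mark its content. The condition and userlevel attributes are DocBook effectivity attributes; the section and its paragraphs are hypothetical and do not come from the widget resources.

```xml
<section xmlns="http://docbook.org/ns/docbook">
  <title>...</title>
  <!-- Matches condition="print" -->
  <para condition="print">See the fold-out diagram at the back of the printed guide.</para>
  <!-- Matches userlevel="advanced" -->
  <para userlevel="advanced">Experienced users can adjust the spindle speed directly.</para>
  <!-- No effectivity attributes: untouched by these filters -->
  <para>Check that the spindle turns freely before you begin.</para>
</section>
```

If this section were part of tutorial 2, web rendering would keep the first paragraph and drop the second. In tutorial 1, for example, web rendering would drop the first and keep the second. Book rendering would keep all three, because every filter in the structure is restricted to the “web” type.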

7. Example: Assembling an Online Reference

The online reference differs from the preceding book-based examples in that no total, linear order is imposed on the content. Instead, navigation is performed either by search (which is outside the scope of assemblies) or through related links.

Using the resources defined in Section 5, “Example: Assembling a Printed Book”, here's a plausible structure for our online reference.

<structure xml:id="user-guide">
  <output renderas="topic"/>
  <filterout type="web" condition="print"/>
  <merge>
    <title>Widget Reference</title>
  </merge>
  <module resourceref="tut1"/>
  <module resourceref="tut2"/>
  <module resourceref="task1"/>
  <module resourceref="tut3">
    <merge>
      <title>Troubleshooting spindle and bearing problems</title>
    </merge>
    <module resourceref="tut5" contentonly="true"/>
    <module resourceref="task4"/>
    <module resourceref="task3"/>
  </module>
  <module resourceref="tut4"/>
  <module resourceref="task2"/>
  <module resourceref="index"/>
</structure>

The resulting realized structure will be a nested set of topic elements. Topics don't nest in DocBook (at least in DocBook 5.1), so this isn't a valid DocBook document. However, the nested topics provide a navigational structure for the online presentation. The rendering system is, we assume, smart enough to render each of the topics separately, as if they were all top-level siblings.
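
A sketch of the beginning of that realized structure follows. It assumes that the structure-level merge becomes an info element, as it did for the book, and that the retitled tutorial 3 topic receives its children in the same way the appendix did. The remaining modules are elided.

```xml
<topic xmlns="http://docbook.org/ns/docbook">
  <info>
    <title>Widget Reference</title>
  </info>
  <topic xml:base="tutorial/tut1.xml">
    <title>Introduction</title>
    ...
  </topic>
  <topic xml:base="tutorial/tut2.xml">
    <title>Getting Started</title>
    ...
  </topic>
  <topic xml:base="tasks/task1.xml">
    <title>Engaging the spindle</title>
    ...
  </topic>
  <topic xml:base="tutorial/tut3.xml">
    <title>Troubleshooting spindle and bearing problems</title>
    ...
    ...content (only) of tutorial 5...
    <topic xml:base="tasks/task4.xml">
      <title>Diagnosing spindle problems</title>
      ...
    </topic>
    <topic xml:base="tasks/task3.xml">
      <title>Diagnosing bearing problems</title>
      ...
    </topic>
  </topic>
  <!-- remaining modules elided -->
  ...
</topic>
```

The topics nested inside the troubleshooting topic are the ones a rendering system could present as related links from that page.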