In DocBook V3.1, introduced in February, 1999, the following
elements were added to DocBook:
In DocBook V4.0, introduced in January, 2001, the following elements
were added to DocBook:
The following additional changes were made in DocBook V4.0:
artheader was renamed
comment was renamed
docinfo was broken into a set of other info elements;
seriesinfo were removed.
In DocBook V4.2, introduced in FIXME: April, 2002, the following elements
were added to DocBook:
Additional changes made to DocBook V4.2 are summarized in the DocBook V4.2 Specification.
The description of each element in this reference is divided into the following sections:
Describes the content model of the element in SGML/XML DTD terms. See the section called “Understanding Content Models”.
Lists “inclusions.” Inclusions are an SGML feature.
Included elements can appear anywhere inside the element
that includes them, even in places that aren't ordinarily
valid. For example,
This means that within a
Emphasis, for instance, even though the
content model of
Emphasis does not explicitly allow
Lists “exclusions.” Exclusions are an SGML feature.
Excluded elements cannot appear anywhere inside the element that
excludes them, even in places that are ordinarily valid. For
example, Footnote excludes Footnote. This
means that a
Footnote cannot appear inside a
Para inside a
Footnote, even though
Footnote appears in the content model of
Para.
Lists elements that are excluded from appearing at any level below the element described.
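The effect of an exclusion can be sketched in a few lines of Python. This is only an illustration, not how an SGML parser works: the function name `find_exclusion_violations` is hypothetical, the sample markup is invented for the sketch, and the check uses the standard library's XML parser on lowercase tag names.

```python
import xml.etree.ElementTree as ET

def find_exclusion_violations(xml_text, excluder, excluded):
    """Return paths of `excluded` elements found at any depth inside `excluder`."""
    root = ET.fromstring(xml_text)
    violations = []

    def walk(element, path, inside_excluder):
        here = path + [element.tag]
        # An excluded element is an error at any depth below the excluder.
        if inside_excluder and element.tag == excluded:
            violations.append("/".join(here))
        inside = inside_excluder or element.tag == excluder
        for child in element:
            walk(child, here, inside)

    walk(root, [], False)
    return violations

# Hypothetical sample content: a footnote, and a footnote nested in a footnote.
ok = "<para>Text<footnote><para>A note.</para></footnote></para>"
bad = ("<para>Text<footnote><para>A note"
       "<footnote><para>Nested.</para></footnote>"
       "</para></footnote></para>")

print(find_exclusion_violations(ok, "footnote", "footnote"))   # []
print(find_exclusion_violations(bad, "footnote", "footnote"))  # ['para/footnote/para/footnote']
```

The nested footnote is reported even though it sits inside a paragraph, which mirrors the rule that an exclusion applies at every level below the excluding element.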
Provides a synopsis of the attributes on the element. For brevity, common attributes are described only once, in this introduction.
Indicates if start- or end-tags may be omitted. Tag omission is dependent on both the DTD and your SGML declaration. If a tag is described as omissible here, it is omissible only if your declaration allows tag omission. The standard DocBook declaration does not.
Lists the parameter entities in which the element described appears. Parameter entities are important when you are customizing the DTD.
Identifies changes that are scheduled for future versions of the DTD. These changes are highlighted because they involve some backward incompatibility that may make currently valid DocBook documents no longer valid under the new version.
Provides examples of proper usage for the element. Generally, the smallest example required to reasonably demonstrate the element is used. In many cases, a formatted version of the example is also shown.
All of the examples printed in the book use the SGML version of DocBook. The CD-ROM includes the full text of all of the examples.
Formatted examples are indicated using a vertical bar.
Content models are the way that DTDs describe the name, number, and order of other elements that may be used inside an element. The primary feature of content model syntax is that it is concise, but this conciseness comes at the cost of legibility until you are familiar with the syntax.
A content model that consists of the single keyword EMPTY
identifies an element as an empty element. Empty elements are not allowed
to have any content. In order for the word “EMPTY” to have
this special meaning, it must be the first and only word in the content
model. The word “EMPTY” at any other place is treated as
an element name.
The #PCDATA keyword indicates that text may
occur at that position. The text may consist of entity references and
any characters that are legal in the document character set. For XML
documents, the document character set is always Unicode. In SGML the
declaration can identify character sets and ranges that are allowed.
DocBook SGML documents use the ISO Latin 1 character set.
An unadorned element name indicates that an element must occur exactly
once at that position. A content model can also specify that an element
may occur zero or more times, one or more times, or exactly zero or one
time. This is accomplished by following the element name with one of
the following characters:
* for zero or more times,
+ for one or more times, or
? for exactly
zero or one times.
A content model of
Para+ indicates
that the element must contain at least one paragraph and may contain
more.
A content model of
Title, Para indicates
that the element must contain a single title followed by a single paragraph.
If element names in a content model are separated by vertical bars (|), then they are alternatives. These are sometimes called “or groups” because they require the selection of one or another element.
A content model of
Phrase | Para indicates
that the element must contain either a single phrase or a single paragraph.
In SGML, there is another connector: the ampersand (&). The ampersand is a kind of combination of alternative and sequence, which means that all of the elements must occur, but they can occur in any order. DocBook does not have any content models that use the ampersand connector. XML does not allow it.
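One way to get a feel for this syntax is to notice how closely it resembles regular expressions. The following Python sketch translates a simple content model into a regular expression over a list of child element names. The helper names `model_to_regex` and `matches` are hypothetical, and the sketch deliberately ignores #PCDATA, EMPTY, and the ampersand connector.

```python
import re

# Element names, or one of the punctuation characters used in content models.
TOKEN = re.compile(r"[A-Za-z][\w.-]*|[(),|*+?]")

def model_to_regex(model):
    """Translate a simple content model into a regex over 'Name ' tokens."""
    parts = []
    for token in TOKEN.findall(model):
        if token == ",":
            continue                      # a sequence is plain concatenation
        elif token == "(":
            parts.append("(?:")           # content model groups become regex groups
        elif token in ")|*+?":
            parts.append(token)           # same meaning in both notations
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("(?:" + "".join(parts) + ")")

def matches(model, children):
    """True if the list of child element names satisfies the content model."""
    content = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(content) is not None

print(matches("Para+", ["Para", "Para"]))             # True
print(matches("Para+", []))                           # False
print(matches("Title, Para", ["Title", "Para"]))      # True
print(matches("Title, Para", ["Para", "Title"]))      # False
print(matches("Phrase | Para", ["Phrase"]))           # True
print(matches("Phrase | Para", ["Phrase", "Para"]))   # False
```

A regular expression engine backtracks, whereas a real parser must decide on each token as it arrives, so this sketch accepts some models that a DTD parser would reject as ambiguous.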
A parser uses the content models to determine if a given document is valid. In order for a document to be valid, the content of every element in the document must “match” the content model for that element.
In practical terms, match means that it must be possible to expand the content model until it exactly matches the sequence of elements in the document.
For example, consider the content model of the
Epigraph element:
Attribution?, (FormalPara | Para | SimPara)+.
This indicates that the following document fragment is valid:
<epigraph> <para>Some text</para> </epigraph>
It is valid because the following expansion of the content model exactly
matches the actual content: choose zero occurrences of
Attribution, choose the alternative
Para from
the group, and choose to let the “+” match once.
By the same token, this example is not valid because there is no expansion of the content model that can match it:
<epigraph> <para>Some text</para> <attribution>John Doe</attribution> </epigraph>
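The two fragments above can be checked mechanically. The sketch below is an illustration rather than a real validator: the content model is translated by hand into a regular expression, the function name `epigraph_is_valid` is hypothetical, and only the names of the direct children are examined (whitespace between elements and the name of the outer element are ignored).

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated from: Attribution?, (FormalPara | Para | SimPara)+
EPIGRAPH_MODEL = re.compile(r"(?:attribution )?(?:(?:formalpara|para|simpara) )+")

def epigraph_is_valid(xml_text):
    """Check the direct children of the root element against the model."""
    root = ET.fromstring(xml_text)
    content = "".join(child.tag.lower() + " " for child in root)
    return EPIGRAPH_MODEL.fullmatch(content) is not None

valid = "<epigraph> <para>Some text</para> </epigraph>"
invalid = ("<epigraph> <para>Some text</para> "
           "<attribution>John Doe</attribution> </epigraph>")

print(epigraph_is_valid(valid))    # True
print(epigraph_is_valid(invalid))  # False
```

The second fragment fails because the model only allows an attribution before the paragraphs, never after them.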
Ambiguity is not allowed. The parser must always be able to choose
exactly what to match based upon the next input token. Consider the
following content model:
Meta*, Title?, Meta*.
The intent is clear: to allow some meta-information and a single,
optional Title. But this content model is ambiguous
for the following reason: if the document content begins with a
Meta element, it is impossible to tell if it matches the
Meta before the
Title or the one after it
without looking ahead.
Ambiguous content models are detected by the parser when it reads the DTD. It is not sufficient that your document simply be unambiguous; it must not be possible to construct any ambiguous document.
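The ambiguity of the content model Meta*, Title?, Meta* can be made concrete by counting how many different ways a given sequence of elements matches it. The Python sketch below does this by brute force. It is not how a parser detects ambiguity (a parser analyzes the DTD itself, not documents), and the function name `count_matches` is hypothetical.

```python
def count_matches(children):
    """Count the ways `children` can match Meta*, Title?, Meta*."""
    n = len(children)
    ways = 0
    for i in range(n + 1):           # children[:i] is assigned to the first Meta*
        for j in range(i, n + 1):    # children[i:j] to Title?, children[j:] to the last Meta*
            first, middle, last = children[:i], children[i:j], children[j:]
            if (all(c == "Meta" for c in first)
                    and middle in ([], ["Title"])
                    and all(c == "Meta" for c in last)):
                ways += 1
    return ways

print(count_matches(["Meta", "Title", "Meta"]))  # 1
print(count_matches(["Meta"]))                   # 2
print(count_matches(["Meta", "Meta"]))           # 3
```

A document that contains a Title matches in exactly one way, but a document with only Meta elements matches in several. Because such a document can be constructed, the content model is ambiguous regardless of what any particular document contains.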
The #PCDATA keyword can always match the empty string. This
makes it impossible to force an element that may contain
characters not to be empty. In other words, the following content
model does not guarantee that the element is not
[a] On a
few elements, the
Condition is a general-purpose
effectivity attribute with no specified semantics. Many DocBook users
observed that adding an effectivity condition that was unique
to their environment required “abusing” the semantics of
one of the existing attributes, or adding their own, making their
customization an extension rather than a subset. Condition
provides a standard place for application-specific effectivity.
Lang should be a language
code drawn from ISO 639 (perhaps extended with a
country code drawn from ISO 3166, as
en-US). Use it when you need to signal your
application to change hyphenation and other display
While Role is a common attribute in the sense that it occurs on
almost all elements, it is not part of either of the common attributes
parameter entities (%common.attrib; or
%idreq.common.attrib;). It is
parameterized differently because it is useful to be able to subclass
Role independently on different
elements.
RevisionFlag indicates the
revision status of the element; the default is that the element hasn't
been revised. RevisionFlag is
intended only for simple revision management: to track the entire
history of a document, use a proper revision control system. Use
RevisionFlag for indicating
changes from one version to the next, no more.
XrefLabel holds text to be
used when a cross reference (
XRef) is made to the
element.
A sequence of one or more space-delimited
A sequence of one or more space-delimited