Attributes augment the element on which they appear; they also provide additional information about the element.

Attributes appear as name-value pairs in the element’s start tag. For example, to assign the value hostname to the role attribute of systemitem, you would use the markup: <systemitem role="hostname">.
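As a minimal sketch of how a processor sees that markup, the following Python snippet parses a systemitem element with the standard library and reads the attribute back. The element content example.org is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Parse one element whose start tag carries a single attribute.
# The content "example.org" is a made-up value for illustration.
element = ET.fromstring('<systemitem role="hostname">example.org</systemitem>')

# Attributes are exposed as name-value pairs, separate from the content.
role = element.get("role")
print(element.tag, role, element.text)
```

The attribute value is available independently of the element's text, which is what lets it carry additional information about the element.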


Block elements are ones that generally get rendered as discrete units running vertically down the page (orthogonal to the writing direction). The div element in HTML is one example.

See also inline.


A pointer, verbal or graphical or both, to a component of an illustration or a text object.

CDATA section

A CDATA section is a region of text in an XML document delimited by <![CDATA[ at the beginning and ]]> at the end. All of the characters in a CDATA section are taken literally as part of the document. In other words, literal “<” characters, “&” characters, and other markup characters are not interpreted as markup. The delimiters are not rendered.

CDATA sections are a convenience for authoring; they are not part of the XML data model. After the document has been parsed, it is not possible to determine where CDATA sections were used (as opposed to other forms of escaping, such as character entities).
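The point that CDATA sections leave no trace after parsing can be illustrated with a short Python sketch. The para element and its content are assumptions chosen for the example; the parser is the one in Python's standard library.

```python
import xml.etree.ElementTree as ET

# The same character data, once in a CDATA section and once with escapes.
with_cdata = ET.fromstring("<para><![CDATA[if (a < b && b < c)]]></para>")
with_escapes = ET.fromstring("<para>if (a &lt; b &amp;&amp; b &lt; c)</para>")

# After parsing, the two forms are indistinguishable.
same_text = with_cdata.text == with_escapes.text
print(with_cdata.text)
```

Both documents yield identical text, so an application working on the parsed tree cannot tell which form the author typed.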

character reference

A character reference is a mechanism for inserting an arbitrary Unicode character into a document. Character references are most often used for characters that aren’t available on the author’s keyboard or in the author’s font. Syntactically, they have the form &#number; where “number” is the Unicode codepoint of the character expressed as a decimal number. Hexadecimal numbers can also be used, with character references of the form &#xnumber;.

For example, you can type &#169; (or &#xa9;) to insert a © symbol.
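A brief Python sketch, using the standard library parser and a para element chosen only for illustration, suggests how the decimal and hexadecimal forms resolve to the same character:

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal character references for the same codepoint.
decimal = ET.fromstring("<para>&#169;</para>").text
hexadecimal = ET.fromstring("<para>&#xa9;</para>").text

# ord() recovers the Unicode codepoint of the resulting character.
codepoint = ord(decimal)
print(decimal, hexadecimal, codepoint)
```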

content area

See viewport area.


Cooked data, as distinct from raw, is a collection of elements and character data that’s ready for presentation. The processor is not expected to rearrange, select, or suppress any of the elements, but simply present them as specified.

See also raw.

customization layer

Many XML technologies (RELAX NG grammars and XSLT stylesheets, for example) are designed to be extended. Such an extension is often referred to as a customization layer.

Document Type Declaration

A set of declarations that defines the names of the elements and their attributes, and that specifies rules for their combination or sequence.


See Document Type Declaration.


A term used to identify attributes used for profiling or conditional processing. DocBook contains a set of effectivity attributes that allow you to flag elements as being effective under particular conditions. For example, you might set the value of the os effectivity attribute to linux to indicate that this element is applicable to the Linux operating system. With the DocBook stylesheets, if you set the profile.os parameter to linux this element will be included. If you set the parameter to some other value, the element will be excluded. Further information about using the DocBook stylesheets for profiling can be found in Bob Stayton’s DocBook XSL: The Complete Guide.
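The include-or-exclude behavior described above can be sketched in Python. This is not how the DocBook stylesheets implement profiling; it is a simplified illustration. The function name profile_os, the sample section, and the treatment of semicolon-separated values are all assumptions.

```python
import xml.etree.ElementTree as ET

def profile_os(root, wanted):
    """Remove descendants whose os attribute is set but does not list `wanted`.

    Hypothetical helper: elements with no os attribute are always kept, and
    multiple values are assumed to be separated by semicolons.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            value = child.get("os")
            if value is not None and wanted not in value.split(";"):
                # Note: remove() also discards any text that follows the child.
                parent.remove(child)
    return root

doc = ET.fromstring(
    "<section>"
    '<para os="linux">Linux only.</para>'
    '<para os="windows">Windows only.</para>'
    '<para os="linux;macos">Linux or macOS.</para>'
    "<para>Always included.</para>"
    "</section>"
)
profile_os(doc, "linux")
kept = [para.text for para in doc.iter("para")]
print(kept)
```

Elements that carry no effectivity attribute pass through untouched, which matches the idea that the attribute only restricts where an element applies.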


Elements define the hierarchical structure of a document. Most elements have start and end tags and contain some part of the document content. Empty elements have only a start tag and have no content.


A name assigned (by means of a declaration) to some chunk of data so that it can be referred to by that name; the data can be of various kinds (e.g., a special character or a chapter or a set of declarations in a DTD), and the way in which it is referred to depends on the type of data and where it is being referenced.

Extended Backus-Naur Form

Any of a variety of notations for describing context-free grammars.


When a grammar permits new markup that would not be valid against the original grammar, it is an extension.

external entity

An external entity is a general entity that refers to another document. External entities are often used to incorporate parsable text documents, like legal notices or chapters, into larger units, like chapters or books.

external subset

Element, attribute, and other declarations that compose (part of) a document type definition that are stored in an external entity, and referenced from a document’s Document Type Declaration using a system identifier and optionally a public identifier.


Text objects such as sidebars, figures, tables, and graphics are said to float when their actual place in the document is not fixed. For presentation on a printed page, for instance, a graphic may float to the top of the next page if it is too tall to fit on the page on which it actually falls, in the sequence of words and the sequence of other like objects in a document.

formal public identifier

A public identifier that conforms to the specification of formal public identifiers in ISO 8879.

general entity

An entity referenced by a name that starts with an ampersand (&) and ends with a semicolon. Most of the time general entities are used in document instances, not in the schema. There are two types, external and internal entities, and they refer either to special characters or to text objects such as commonly repeated phrases or names or chapters.


Inline elements are ones that get rendered in the flow of prose. The span element in HTML is one example.

See also block.

internal entity

A general entity that references a piece of text (including its markup and even other internal entities), usually as a keyboard shortcut.
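As an illustrative sketch, the Python snippet below declares two internal entities in a document's internal subset and lets the standard library parser expand them. The entity names product and tagline and their replacement text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two internal entities; the second refers to the first.
# The names and replacement text are hypothetical.
source = """<!DOCTYPE para [
  <!ENTITY product "Example Widget">
  <!ENTITY tagline "&product;, version 2">
]>
<para>Welcome to &tagline;.</para>"""

# The parser replaces each entity reference with its declared text.
para = ET.fromstring(source)
print(para.text)
```

The author types the short reference once per use, and the parser substitutes the full text, including the nested reference.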

internal subset

Element, attribute, and other declarations that compose (part of) a document type definition that are stored in a document, within the Document Type Declaration.

intrinsic size

The intrinsic size of an image is its actual size, measured in pixels or other appropriate units, before any scaling or transformations have been applied.


The terms “may”, “must”, etc. are interpreted as defined in RFC 2119.


Meta-information is information about a document, such as the specification of its author or its date of composition, as opposed to the content of a document itself.


This glossary term appears in the example for firstterm. We’re really not going to attempt to define object oriented programming terms here.


NVDL is the Namespace-based Validation Dispatching Language; see

Object Oriented

This glossary term appears in the example for firstterm. We’re really not going to attempt to define object oriented programming here.

processing instruction

An essentially arbitrary string, delimited by <? at the beginning and ?> at the end, that is intended to convey information to an application that processes an XML instance. For example, the processing instruction <?linebreak?> might cause the formatter to introduce a line break at the position where the processing instruction occurs.

In XML documents, processing instructions should have the form:

  <?pitarget param1="value1" param2="value2"?>

The pitarget should be a name that the processing application will recognize. Additional information in the processing instruction should be added using attribute syntax.
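One way an application might read such a processing instruction is sketched below in Python. XML parsers hand over everything after the target as a single string, so the attribute-like syntax has to be parsed by the application itself. The pseudo_attributes helper and its regular expression are assumptions for illustration and handle only double-quoted values.

```python
import re
from xml.dom import Node, minidom

document = minidom.parseString(
    '<para>First line<?pitarget param1="value1" param2="value2"?>second line</para>'
)

def pseudo_attributes(data):
    """Hypothetical helper: split name="value" pairs out of the PI data string."""
    return dict(re.findall(r'([\w.-]+)\s*=\s*"([^"]*)"', data))

# Collect each processing instruction as (target, parameters).
instructions = [
    (node.target, pseudo_attributes(node.data))
    for node in document.documentElement.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

An application would compare the target against the names it recognizes and ignore any processing instruction intended for another tool.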

public identifier

An abstract identifier for an XML document, DTD, or external entity.


Raw data is just a collection of elements, with no additional punctuation or information about presentation. To continue the cooking metaphor, raw data is just a set of ingredients. It’s up to the processor to select appropriate elements, arrange them for display, and add required presentational information.

See also cooked.


RELAX NG is a grammar-based schema language for XML; see


Schematron is a language for making assertions about patterns found in XML documents; see


Standard Generalized Markup Language is an international standard (ISO 8879) that specifies the rules for the creation of platform-independent markup languages for electronic texts.


A file that specifies the presentation or appearance of a document; there are several standards for such stylesheets, including CSS, FOSIs, DSSSL, and XSL.

system identifier

In SGML, a local, system-dependent identifier for a document, DTD, or external entity. Usually a filename on the local system.

In XML, a system identifier is required to be a URI.


An XML element name enclosed in angle brackets (<>), used to mark up the semantics or structure of a document. para is a tag in DocBook used to mark the beginning of a paragraph.


Uniform Resource Identifier, the W3C’s codification of the name and address syntax of present and future objects on the Internet. In its most basic form, a URI consists of a scheme name (such as file, http, ftp, news, mailto, gopher) followed by a colon, followed by a path whose nature is determined by the scheme that precedes it (see RFC 1630).

URI is the umbrella term for URNs, URLs, and all other Uniform Resource Identifiers.
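The basic scheme-colon-path form can be illustrated with Python's standard urllib.parse module. The sample URIs and the helper name scheme_and_rest are assumptions for the example.

```python
from urllib.parse import urlsplit

def scheme_and_rest(uri):
    """Hypothetical helper: return (scheme, authority, path) for a URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path

# The part after the colon is interpreted according to the scheme.
web = scheme_and_rest("http://example.org/docs/index.html")
mail = scheme_and_rest("mailto:user@example.org")
local = scheme_and_rest("file:///etc/hosts")
print(web, mail, local)
```

The http URI carries a host name and a hierarchical path, while the mailto URI has only an address after the colon.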


Uniform Resource Locator, a name and address for an existing object accessible over the Internet. is an example of a URL (see RFC 1738).

validating parser

A validating parser is one that performs DTD validation. Other forms of validation, for example against RELAX NG grammars or W3C XML Schemas, don’t imply a validating parser.

See also Document Type Declaration.

viewport area

Images in DocBook are defined by two bounding boxes: a bounding box for the overall image, the viewport area, and the bounding box for the image itself, the content area, which may be aligned within the viewport area.
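To make the relationship between the two boxes concrete, the following Python sketch computes where the content area would sit inside the viewport area for a given alignment. The function name, the keyword values, and the use of plain numeric units are assumptions; this is not taken from any DocBook processor.

```python
def align_content(viewport, content, align="center", valign="middle"):
    """Return the (x, y) offset of the content area inside the viewport area.

    Hypothetical helper: sizes are (width, height) pairs in the same unit.
    """
    viewport_width, viewport_height = viewport
    content_width, content_height = content
    x = {
        "left": 0,
        "center": (viewport_width - content_width) / 2,
        "right": viewport_width - content_width,
    }[align]
    y = {
        "top": 0,
        "middle": (viewport_height - content_height) / 2,
        "bottom": viewport_height - content_height,
    }[valign]
    return x, y

centered = align_content((400, 300), (200, 100))
bottom_right = align_content((400, 300), (200, 100), align="right", valign="bottom")
print(centered, bottom_right)
```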


The World Wide Web Consortium.


Some elements, such as chapter, have important semantic significance. Other elements serve no obvious purpose except to contain a number of other elements. For example, info has no important semantics; it merely serves as a container for the meta-information about a book. Elements that are just containers are sometimes called wrappers.


The Extensible Markup Language, a subset of SGML designed specifically for use over the Web.

XML processor

This term is used in an example for termdef where it claims that an XML processor is “used to read XML documents and provide access to their content and structure.” The definition comes from the XML Recommendation.


Extensible Stylesheet Language (XSL), an evolving language for stylesheets to be attached to XML documents. The stylesheet is itself an XML document.