XTiger Language Specification

Authors: Émilien Kia, Vincent Quint, Irène Vatton -- INRIA Rhône-Alpes

Version: 1.0 - Date: 2009-12-15


Abstract

This document presents XTiger, an XML language for specifying document templates. XTiger templates are intended to guide an editing tool for building documents that follow a predefined model. The XTiger language is used jointly with another XML language, typically XHTML, which is called the target language. A template is a target language document where XTiger elements indicate how the document can be edited and still conform with the model. XTiger is versatile enough to represent templates that capture the overall structure of large documents as well as the fine details of a microformat.

1. Introduction to the XTiger language

Most popular XML document formats used on the Web, such as XHTML or SVG, are very flexible: they allow many different types of documents to be represented. This is an advantage in a space as wide as the Web, as a broad range of documents can be handled consistently. XHTML, for instance, is used to represent not only traditional Web pages, but also complex technical documents, sophisticated e-commerce forms or rich media slides, and all these documents can be accessed with a single browser. But this flexibility makes document authoring a complex task. When producing a specific type of document, an author is faced with all the possibilities provided by XHTML and has to make a number of difficult decisions. If multiple similar documents have to be produced consistently, for a particular use or for some specific application, authors have to use the XHTML document format consistently, which has proven to be very difficult.

XTiger (eXtensible Templates for Interactive Guided Editing of Resources) tackles this problem by defining how the document format (XHTML, for instance) has to be used for representing a certain type of document. To do so, XTiger relies on the notion of a template. A template is a skeleton representing a given type of document, expressed in the format of the final documents to be produced (XHTML, for instance). The format of the final documents is called the target language and must be an XML language. The skeleton contains some statements, expressed in the XTiger language, that specify how this minimal document can evolve and grow, while keeping in line with the intended type of the final documents. Some parts of the template may be frozen, if they have to appear as is in the final document. Some parts may be modified when producing the final document, some others may be added either freely or under some constraints. It is the role of the XTiger language to specify these possibilities and constraints.

When talking about XTiger, it is important to distinguish between two kinds of documents: a template and its instances. A template is the skeleton presented above, containing XTiger elements and defining a certain type of document. It is the seed used to produce a series of documents, called instances, that are derived from the template by following the statements expressed by the embedded XTiger elements. In the rest of this document, we use the term instance instead of final document.

The statements expressed by XTiger elements are supposed to be interpreted by a document authoring tool. Starting from a template, the tool helps the user to follow the XTiger statements, thus ensuring that the instance being edited will stick to the document type specified by the template.

XTiger templates may be used to specify the overall structure of a large document, as well as the fine details of some of its parts. The latter feature makes it possible, in particular, to express how microformats are to be used in large documents.

XTiger is not a document type like XHTML, SVG or MathML. It is always used in combination with a target language, which is a document type. The XTiger elements interspersed in a template are not supposed to be displayed in the same way as elements of the target language. Instead, the role of these XTiger elements is to specify what elements and attributes of the target language must, should or could be present at these positions in the document instance. That is the core of the language, which specifies the structure and (parts of) the content of documents.

This functionality is complemented with additional features that make the language easier to use. For instance, structure fragments can be defined only once and used at several places, in one or several templates. This facilitates a modular construction of templates, by sharing reusable pieces of structure stored in libraries.

As XTiger is used to describe structures, and because it is always mixed with XML languages, it is itself an XML language. XML namespaces are used to distinguish between XTiger elements and elements from the target language. This distinction allows existing web browsers to simply ignore the XTiger elements and to display a template as if these elements were not present.

The XTiger namespace is http://ns.inria.org/xtiger. For the sake of readability, all examples in this document use prefix xt: for XTiger element names, while names from the target language are not prefixed.

The target language used in the following examples is XHTML, but it might be any other XML language as well. The first example below is a piece of XTiger language that defines a component called "author". This component consists of an XHTML paragraph that contains a few XHTML span and br elements, with classes from the hCard microformat. The "author" component can be used to generate the XHTML structure representing an author in document instances, following the hCard microformat.

<xt:component name="author">
  <p class="vcard">
    <span class="fn">
      <xt:use types="string" label="name">Author name</xt:use>
    </span>
    <br/>
    <span class="adr">
      <xt:use types="string" label="address">Address ...</xt:use>
    </span>
    <br/>
    <span class="email">
      <xt:use types="string" label="email">email ...</xt:use>
    </span>
  </p>
</xt:component>
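
As a rough sketch of where such a definition might sit, the fragment below places a shortened "author" component in a minimal XHTML template. The xt:head element and the xt:use reference in the body are described in sections 3 and 4. The document title, the label "main_author" and the surrounding XHTML are illustrative assumptions, not part of this specification.

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>Article template (illustrative)</title>
    <xt:head version="1.0">
      <xt:component name="author">
        <p class="vcard">
          <span class="fn">
            <xt:use types="string" label="name">Author name</xt:use>
          </span>
        </p>
      </xt:component>
    </xt:head>
  </head>
  <body>
    <!-- one "author" structure is generated at this position -->
    <xt:use label="main_author" types="author"/>
  </body>
</html>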

2. Types in XTiger

In XTiger, types are used to specify pieces of structure that may occur at several places in a template or in several templates. XTiger offers a few basic types and allows constructed types to be built. Constructed types are built with constructors that combine XTiger basic types and types from the target language. Two constructors are available: component and union.

2.1. Basic types

XTiger offers three basic types: number, boolean and string.
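
The following minimal sketch shows how the basic types named throughout this specification (number, boolean and string) might be referenced from a use element (section 4.1). The labels and initial values are illustrative. In particular, the textual form of a boolean value ("true" here) is an assumption, as this document does not define it.

<p>
  Pages: <xt:use label="page_count" types="number">12</xt:use>
</p>
<p>
  Title: <xt:use label="title" types="string">Title here</xt:use>
</p>
<p>
  Peer reviewed: <xt:use label="reviewed" types="boolean">true</xt:use>
</p>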

2.2. Target language types

As XTiger always works with a target language and is used to produce documents in that language, it may use elements and attributes from the target language. For instance, when the target language is XHTML, elements h1, h2, p, strong, span, cite are target language types.

2.3. Component

Component is a constructor that creates a new constructed type by specifying an XML structure assembling other types, which may be basic types, target language types and constructed types (unions and other components). The type thus created has a name that allows it to be referred to from other XTiger elements. This name must be unique in the template where it is defined.

The XTiger element component is used to define a component type:

<!ELEMENT component ANY>
<!ATTLIST component
    name  NMTOKEN  #REQUIRED>
Attributes:
name
The name of the type. This attribute must be unique in the template and is mandatory.
Content:
The content of the component element defines the structure of the new type. It may be any XML structure that combines target language elements (possibly with attributes) and XTiger elements allowed in the template body.

Example:

<xt:component name="hello">
    <p>Hello world!</p>
</xt:component>

This example defines a type called "hello" that is an XHTML paragraph (the target language of the template where this element occurs is XHTML) containing the text "Hello world!". It uses a target language type (the p element).
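
Components may also mix target language elements with XTiger elements. The following sketch, in which the names "greeting" and "who" are illustrative, defines a paragraph whose fixed text surrounds an editable string (the use element is described in section 4.1):

<xt:component name="greeting">
  <p>Hello <xt:use label="who" types="string">world</xt:use>!</p>
</xt:component>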

2.4. Union

Union is a constructor that defines a new type as a choice between several types, each of which is a basic type, a target language type, or a constructed type (component or other union). The new type has a name that allows it to be used in other XTiger elements. This name must be unique in the template where it is defined.

The XTiger element union is used to define a union type:

<!ELEMENT union EMPTY>
<!ATTLIST union
    name    NMTOKEN  #REQUIRED
    include CDATA    #REQUIRED
    exclude CDATA    #IMPLIED>
Attributes:
name
The name of the type. This attribute must be unique in the template and is mandatory.
include
A list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ... for XHTML), or the name attribute of a component or another union. This attribute is used to define the options that constitute the union. This attribute is mandatory.
exclude
This attribute is a list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ...), or the name attribute of a component or another union. This attribute is used to exclude some elements that are part of the union as defined by the include attribute. This attribute is optional.
Content:
empty.

XTiger provides four predefined unions that may be used in any type definition:

anySimple
includes all basic types (number, string and boolean)
anyElement
includes all elements defined in the target language.
anyComponent
includes all components defined in the template.
any
includes anySimple, anyElement and anyComponent.

Example:

<xt:union name="hello_or_p" include="hello p"/>
<xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
<xt:union name="headings1to4" include="headings" exclude="h5 h6"/>

With these definitions, the hello_or_p union provides a choice between the hello component and the p element. The headings union provides a choice between all HTML headings (h1 to h6). The headings1to4 union provides a choice between all HTML headings except h5 and h6.
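
The predefined unions can be combined with include and exclude in the same way. In the following sketch, the union names are illustrative and "hello" is the component defined in section 2.3:

<!-- any basic type, or an XHTML em or strong element -->
<xt:union name="text_or_emphasis" include="anySimple em strong"/>
<!-- any component defined in the template except "hello" -->
<xt:union name="other_components" include="anyComponent" exclude="hello"/>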

3. Type definitions

The definitions of components and unions presented above must appear in the head of an XTiger template, or in an XTiger library imported by a template.

Type definitions do not appear in document instances. Instead, instances include a processing instruction that refers to their template, which contains type definitions and references to libraries containing additional type definitions.

3.1. Head element

The head element collects definitions of components and unions that are used in the template. It also refers to the libraries that contain additional components and/or unions used in the template. This is done with the import element.

A template always contains exactly one head element. It may appear anywhere in the template, but it cannot be the root of the document. In XHTML documents, it is recommended to insert it in the XHTML head element.

<!ELEMENT head ((component | union | import )*) >
<!ATTLIST head
    version         CDATA    #REQUIRED
    templateVersion CDATA    #IMPLIED>
Attributes:
version
Version of the XTiger language used in the template. For the XTiger language defined in this specification, the value of the version attribute must be "1.0". This attribute is mandatory.
templateVersion
Version of the template. A template may evolve over time, when the type of document it represents is modified and several versions are available. This version number may be used to make sure that the right version of the template is used for a given document instance.
Content:
The head element contains any number (including 0) of component, union and import elements, but no other elements.
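
As a sketch of the recommended placement in an XHTML template, the fragment below puts an xt:head element inside the XHTML head element. The templateVersion value, the library URI and the definitions themselves are illustrative assumptions.

<head>
  <title>Report template (illustrative)</title>
  <xt:head version="1.0" templateVersion="0.3">
    <xt:import src="libs/common.xtl"/>
    <xt:component name="hello">
      <p>Hello world!</p>
    </xt:component>
    <xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
  </xt:head>
</head>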

3.2. XTiger libraries

An XTiger library is an XML document containing definitions of constructed types (components and/or unions). Libraries allow types to be declared only once and to be shared between different templates. An XTiger library is defined by the root element library. Its content model is the same as that of the head element of a template. Like the head element, a library can import other XTiger libraries using the import element.

<!ELEMENT library ((component | union | import)*)>
<!ATTLIST library
    version         CDATA    #REQUIRED
    templateVersion CDATA    #IMPLIED>
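
A library document might therefore look like the following sketch. The definitions and the imported URI are illustrative. The xt prefix is bound to the XTiger namespace given in section 1, and the XHTML default namespace is assumed for the target language elements.

<?xml version="1.0" encoding="UTF-8"?>
<xt:library xmlns:xt="http://ns.inria.org/xtiger"
            xmlns="http://www.w3.org/1999/xhtml"
            version="1.0">
  <xt:import src="base.xtl"/>
  <xt:component name="author">
    <p class="vcard">
      <span class="fn">
        <xt:use types="string" label="name">Author name</xt:use>
      </span>
    </p>
  </xt:component>
  <xt:union name="headings" include="h1 h2 h3 h4 h5 h6"/>
</xt:library>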

3.3. Using libraries

When a template or a library uses constructed types defined in a library, that library must be explicitly imported in the template or library that uses it by an import element.

<!ELEMENT import EMPTY >
<!ATTLIST import
    src    CDATA    #REQUIRED>
Attributes:
src
URI of the imported library. This attribute is mandatory.
Content:
empty.

All components and unions declared in the imported library are inserted at the position of the import element. Some imported components and unions can be redeclared (same name attribute) in the current head or library element. The order of import elements in a head or library is important: a component or a union defined in an imported library with the same name attribute as a previous definition replaces that previous definition.
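
The effect of import order can be sketched as follows. The library file names and their assumed contents are hypothetical and serve only to illustrate the replacement rule stated above.

<xt:head version="1.0">
  <!-- assumed to define a component named "author" -->
  <xt:import src="people.xtl"/>
  <!-- assumed to define "author" as well: this later definition
       replaces the one imported from people.xtl -->
  <xt:import src="hcard.xtl"/>
  <!-- local redeclaration of a union named "headings", assumed to be
       defined in one of the imported libraries -->
  <xt:union name="headings" include="h1 h2 h3"/>
</xt:head>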

4. The template body

A template contains a set of type definitions grouped in the head element but also the skeleton of a target language document and some XTiger statements that are used to generate instances. The latter (skeleton and statements) is called the template body. A copy of it serves as initial instance when a new document is created from the template. The head element and its definitions are not copied in the instance.

All target language elements included in the template body appear in all document instances exactly as they are in the template. Their content is preserved and cannot be modified in instances. This is the static part of the template.

There is also a dynamic part in a template, i.e. a part that can be modified under the control of XTiger elements. The XTiger elements that control the dynamic part are use, bag, repeat and attribute. They are described in the following sections.

4.1. Inclusion of types

The use element indicates what type(s) of element can appear at that position in an instance. Only one element of the specified type(s) can appear at that position in an instance document.

<!ELEMENT use ANY>
<!ATTLIST use
    label       NMTOKEN  #IMPLIED
    types       CDATA    #REQUIRED
    option  (set|unset)  #IMPLIED
    currentType CDATA    #IMPLIED
    initial (true)       #IMPLIED>
Attributes:
label
Label associated with the use element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
types
A list of space-separated names. A single name is also allowed. Each name is either a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union. The element to be inserted at that position in an instance must be of one of these types, but there is no constraint on the descendants of the inserted elements, provided they comply with the DTD or schema of the target language, when target language elements are used. This attribute is mandatory.

Recursion is forbidden. For example, when a use element is part of a component, it cannot refer to that component.

option
Indicates whether the content of the use element is optional. The value is set when the content is generated and unset when it is omitted. Usually in a template the value is set.
currentType
Name of the selected type, when a choice has been made. This attribute is present only in instances. It should not be used in a template.
initial
Indicates whether the content of the use element in the instance is the initial value provided by the template. This attribute is present only in instances. It should not be used in a template.
Content:
The use element may have content. If content is present in the template, it must be of one of the types listed in the types attribute. This content is considered as an initial value that will be present in an instance. An instance author may replace it with other content, provided the new content is compliant with the types attribute.

Even if a component is used only once in the template, it must be declared within the template head and a use element will refer to it.

Example 1:

<xt:use label="birthday" types="string">
Your birth date here
</xt:use>

In this example, "Your birth date here" is the content that will be displayed when a new instance is created from the template. An instance author can freely replace this string with any other string, but only with a string.

Example 2:

<xt:head version="1.0">
  <xt:component name="short_date">
    <xt:use label="day"   types="number">20</xt:use> /
    <xt:use label="month" types="number">10</xt:use> /
    <xt:use label="year"  types="number">1981</xt:use>
  </xt:component>
 ...
</xt:head>
 ...
<xt:use label="birthday" types="short_date"/>

This example shows how a component can be used to make sure that the user will enter a date in the dd/mm/yyyy format.

<xt:use label="date" types="em short_date">
  <em>20 october 1981</em>
</xt:use>

Here, the content of the xt:use element may be either an XHTML em element or a short_date component. Only one of them can be inserted at that position in an instance. The current content <em>20 october 1981</em> is a valid value, because it is an em. It does not need to be also a short_date.
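
The option attribute and the instance-only attributes can be sketched as follows. The first fragment is a template fragment with an optional subtitle. The other two show what parts of an instance might look like: one where an author has picked the em alternative, and one where the initial value is still in place. The labels, the content, and the exact serialization in an instance are assumptions made for illustration.

<!-- template: the subtitle is optional and currently generated -->
<xt:use label="subtitle" types="h2" option="set">
  <h2>Subtitle</h2>
</xt:use>

<!-- instance: the author chose the em alternative and typed a new value -->
<xt:use label="date" types="em short_date" currentType="em">
  <em>3 March 2009</em>
</xt:use>

<!-- instance: the content is still the initial value from the template -->
<xt:use label="birthday" types="string" initial="true">
Your birth date here
</xt:use>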

4.2. Free content areas

The use element puts strong constraints on the structure and/or content of a part of a document. It is sometimes useful to have more flexibility. That is the role of the bag element. It indicates that any number of elements, taken from a given set of types, may appear at that position in an instance document, and it specifies the allowed types for these elements.

<!ELEMENT bag ANY>
<!ATTLIST bag
    label   NMTOKEN  #REQUIRED
    types   CDATA    #REQUIRED
    include CDATA    #IMPLIED
    exclude CDATA    #IMPLIED>
Attributes:
label
Label associated with the bag element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
types
A list of space-separated names. A single name is also allowed. Each name is either a basic type (number, boolean, string), an element of the target language (h1, h2, p, ...), or the name attribute of a component or a union.

The elements to be inserted at the top level of the bag in an instance (bag children) must be of one of these types. The types attribute is mandatory. By default, all descendant element types allowed by the target language can be inserted into bag children.

include
A list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ... for XHTML), or the name attribute of a component or another union.

This attribute is used to extend the list of allowed descendant element types that could be inserted into bag children. This attribute is optional.

exclude
This attribute is a list of names separated by spaces. These names can be basic types (number, boolean, string), names of elements from the target language (div, h1, h2, p, ...), or the name attribute of a component or another union.

This attribute is used to exclude some element types from the possible set of descendant element types. This attribute is optional.

Content:
The bag element may have content. If content is present, it must follow the constraints set by the types attribute. This content is considered as an initial value that will be present in an instance. An instance author may replace it with other content, provided the new content remains compliant with the types attribute.

Example 1:

<div>
  <xt:bag label="sect" types="p h2 h3 h4 div">
  <p>
    This <em>paragraph</em> contains <em><strong>strings</strong></em>
    and <strong><code>any</code></strong> combination of <em>emphasis</em>,
    <code>code</code> and <strong>strong</strong> elements.
  </p>
  </xt:bag>
</div>

Any number of p, h2, h3, h4, and div elements may appear at the top level of the bag, and only these elements. There is no constraint on the order of these elements and, as for the use element, no constraint is specified on their content.

By default, the bag element will generate this initial paragraph; its em, strong and code elements are allowed by the target language.

Example 2:

<div>
  <xt:bag label="sect" types="p h2 div" include="author" exclude="h2">
  <h2>Title...</h2>
  <p>
    This <em>paragraph</em> contains <em><strong>strings</strong></em>
    and <strong><code>any</code></strong> combination of <em>emphasis</em>,
    <code>code</code> and <strong>strong</strong> elements.
  </p>
  </xt:bag>
</div>

In example 2, only p, h2, and div elements are allowed at the top level of the bag; h3 and h4 elements cannot appear there. The include attribute says that the author component can be inserted within the bag, but not at the top level. The exclude attribute says that the h2 element can be inserted only at the top level.

Example 3:

<div>
  <xt:bag label="sect" types="anyElement">
  <p>
    This <em>paragraph</em> contains <em><strong>strings</strong></em>
    and <strong><code>any</code></strong> combination of <em>emphasis</em>,
    <code>code</code> and <strong>strong</strong> elements.
  </p>
  </xt:bag>
</div>

In example 3, any element of the target language can appear at any level of the bag. Only the target language constraints apply.
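
Because the types attribute also accepts union names, a bag can refer to a union declared in the template head. In the following sketch, the union name "block", the label and the choice of excluded element are illustrative assumptions:

<xt:head version="1.0">
  <xt:union name="block" include="p ul ol pre"/>
</xt:head>
<!-- in the template body -->
<div>
  <!-- top-level children are p, ul, ol or pre elements;
       img is excluded from their descendants -->
  <xt:bag label="notes" types="block" exclude="img">
    <p>First note ...</p>
  </xt:bag>
</div>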

4.3. Repeated elements

It is often useful to be able to repeat a piece of the document structure (or a choice between several pieces) several times. In this case, the structure to be repeated must first be declared as a component. It can then be used with a repeat element in the template body around a use element that refers to the component(s).

<!ELEMENT repeat ( use+ )>
<!ATTLIST repeat
    label         NMTOKEN #REQUIRED
    minOccurs     CDATA   "1"
    maxOccurs     CDATA   "*">
Attributes:
label
Label associated with the repeat element. This attribute allows authors of instances to distinguish between the many XTiger elements that appear in a document. It is mandatory.
minOccurs
Minimum number of times the component must be repeated. If this attribute is absent, the minimum is 1.
maxOccurs
Maximum number of times the component may be repeated. "*" means no upper bound. If this attribute is absent, it is equivalent to "*".
Content:

A use element indicates (with its types attribute) the type of the component to be repeated. Basic types are not allowed. The use element cannot have an option attribute, as the option is equivalent to minOccurs="0".

If the types attribute of the use element is a list of several types, the repeated elements may have any of these types. Several use elements may be present in a repeat element in a template to provide initial values to several repeated elements.

Example:

<xt:head version="1.0">
  <xt:component name="author">
     <xt:use label="given_name" types="string"/>
     <xt:use label="family_name" types="string"/>
  </xt:component>

  <xt:component name="bib_item">
    <li>
      <xt:repeat label="authors" minOccurs="1" maxOccurs="5">
        <xt:use types="author"/>
      </xt:repeat>
      ...
    </li>
  </xt:component>
  ...
</xt:head>
 ...
<h2>Bibliography</h2>
<ul>
  <xt:repeat label="bib_list" minOccurs="1">
    <xt:use label="entry" types="bib_item"/>
  </xt:repeat>
</ul>

This example describes a bibliography section which includes at least one bib_item element. Each of these elements may contain one to five authors.

The document bibliography could also be defined with a bag element:

<h2>Bibliography</h2>
<ul>
  <xt:bag label="bib_list" types="bib_item"/>
</ul>

In that case, the list of bib_item could be empty. It is equivalent to a repeat with minOccurs="0".
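
The content rules of the repeat element also allow a repeat to offer several types and to provide several initial elements. In the sketch below, the component names "book" and "article", their content and the labels are illustrative assumptions:

<xt:head version="1.0">
  <xt:component name="book">
    <li>Book: <xt:use label="title" types="string">Title</xt:use></li>
  </xt:component>
  <xt:component name="article">
    <li>Article: <xt:use label="title" types="string">Title</xt:use></li>
  </xt:component>
</xt:head>
<!-- in the template body -->
<ul>
  <xt:repeat label="publications" minOccurs="0" maxOccurs="*">
    <!-- two initial entries; each repeated element is a book or an article -->
    <xt:use label="entry" types="book article"/>
    <xt:use label="entry" types="book article"/>
  </xt:repeat>
</ul>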

4.4. Attributes

XTiger provides a way to control attributes from the target language. This is achieved by inserting an attribute element as a child of a target language element. The attribute element makes an attribute of its parent element mandatory, fixed, or prohibited. If several attributes of a single target language element have to be controlled, several attribute elements must be used, one for each of these attributes.

<!ELEMENT attribute EMPTY>
<!ATTLIST attribute
    name    NMTOKEN                          #REQUIRED
    type    (number|string|list)           "string"
    use     (required|optional|prohibited) "required"
    default CDATA #IMPLIED
    fixed   CDATA #IMPLIED
    values  CDATA #IMPLIED>
Attributes:
name
Name of the attribute of the parent element that is constrained. This attribute is mandatory.
type
Type of the constrained attribute (number, string, list). If the type attribute is not present, the default type "string" is assumed.
use
Indicates whether the constrained attribute is required, optional, or prohibited. If an attribute is required by its DTD, this attribute will be added even if the template makes it prohibited or optional. If attribute use is not present, the default value "required" is assumed.
default
Default value of the constrained attribute. This value can be replaced by another value in the instance. Attribute default is optional.
fixed
Fixed value of the constrained attribute. Attribute fixed is optional.
values
List of possible values. Possible values are separated by spaces. Attribute values is optional.
Content:
empty.

Example:

<div>
  <xt:attribute name="class" use="optional" 
    values="comment example info" default="comment"/>
  ...
</div>

This example shows an XHTML div element whose class attribute is made optional, with its value limited to the three options comment, example and info. The default value is set to comment.
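
Several attribute elements can constrain the same parent element. The following sketch, whose attribute choices and values are illustrative, combines a numeric attribute with a default value, a fixed attribute and a prohibited attribute on an XHTML table element:

<table>
  <!-- border must be present and hold a number; 1 is the default value -->
  <xt:attribute name="border" type="number" use="required" default="1"/>
  <!-- class always has the value "data" -->
  <xt:attribute name="class" fixed="data"/>
  <!-- style must not be used on this element -->
  <xt:attribute name="style" use="prohibited"/>
  <tr><td>Cell</td></tr>
</table>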

5. Resources and processing

When working with XTiger templates, three different kinds of resources are involved:

Template file
A template defines the skeleton of a document and the constructed types (components and unions) it uses. Template files have the .xtd extension.
XTiger libraries
Libraries are lists of constructed type definitions. They can be imported by templates and other libraries. Library files have the .xtl extension.
Document instances
Instances are documents generated from templates. Instance files have the usual extension of their target language.

When a user creates a document from a template, the new document instance is created as a copy of the template. The xt:head element with its type definitions is kept by the authoring tool, but it is not copied into the document instance.

The template is linked to the new instance by a processing instruction:
<?xtiger template="URI/of/the/template.xtd" version="1.0" templateVersion="xx" ?>
which is inserted at the beginning of the instance, in the same way CSS style sheets are linked to XML documents. With this link, the authoring tool can find all the type definitions needed during editing sessions. All other XTiger elements (use, bag, repeat, attribute) as well as all target language elements are kept in the copy that constitutes the initial instance. XTiger types that appear in these elements are replaced by references to their definition in the template (actually, by references to a parsed representation of types in core memory which is more compact).
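
Putting these rules together, the beginning of an instance derived from an XHTML template might look like the following sketch. The template URI, the templateVersion value, the label "main_author" and the document content are illustrative assumptions. The placement of the processing instruction before the root element follows the description above.

<?xml version="1.0" encoding="UTF-8"?>
<?xtiger template="http://example.org/templates/article.xtd" version="1.0" templateVersion="0.3" ?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xt="http://ns.inria.org/xtiger">
  <head>
    <title>My article</title>
    <!-- the xt:head element of the template is not copied here -->
  </head>
  <body>
    <xt:use label="main_author" types="author">
      <p class="vcard">
        <span class="fn">
          <xt:use types="string" label="name">Author name</xt:use>
        </span>
      </p>
    </xt:use>
  </body>
</html>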

6. References

Francesc Campoy Flores, Vincent Quint, Irène Vatton, Templates, Microformats and Structured Editing, Proceedings of DocEng'06, ACM Symposium on Document Engineering, 10-13 October 2006, Amsterdam, The Netherlands, pp. 188-197. This research paper presents an early version of the XTiger language.