MicroXML uses a simplified version of SXML as the internal representation of documents. Each SXML element is a list whose first member is a symbol representing the element name, whose second member is a SRFI 180 JSON object mapping the attribute names (as symbols) to their values (as strings), and whose remaining members (if any) are either SXML elements or strings. There is no representation of comments or processing instructions in this version of SXML, and no notion of document objects (a document is just an element that has no parent).
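As an illustration of this representation, the following is a minimal sketch that assumes the SRFI 180 convention of writing a JSON object as an association list with symbol keys. The accessor names (sketch-name, sketch-attrs, sketch-children, sketch-attr-ref) are hypothetical and are not part of this library.

```scheme
(import (scheme base))

;; <p class="note">Hello, <b>world</b>!</p> as a simplified SXML element.
;; Assumption: the attribute object is an alist, and () is the empty object.
(define doc '(p ((class . "note")) "Hello, " (b () "world") "!"))

;; Hypothetical accessors, for illustration only.
(define (sketch-name element) (car element))
(define (sketch-attrs element) (cadr element))
(define (sketch-children element) (cddr element))

;; Returns the value of the attribute NAME, or #f if it is absent.
(define (sketch-attr-ref element name)
  (cond ((assq name (sketch-attrs element)) => cdr)
        (else #f)))
```

Here (sketch-children doc) is a list of three content children: a string, an element, and another string.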
Options is a list of symbols that control how MicroXML is written.
All implementations recognize apos to wrap attribute values in apostrophes, end-tags to write end-tags for empty elements, and ascii to escape all characters outside the ASCII range.
The symbol pretty may cause pretty-printing.
Other symbols are also allowed; their effects are implementation-defined.
(uxml->sxml
port handler)
Reads all the characters from the textual input port port as a MicroXML document and returns the SXML equivalent.
The procedure handler is invoked when a $error event (see Events) is produced.
The default handler signals an error that satisfies uxml-error?.
If handler is #f, errors are ignored; this relaxed parsing mode allows some XML documents that are not well-formed MicroXML to be parsed.
(sxml->uxml
element port options)
Writes the SXML element in MicroXML format to the textual output port port, using the symbols in options.
(make-uxml-generator
port)
Returns a SRFI 158 generator of event objects that represent a MicroXML document read from the textual input port port. Processing continues no matter how many errors there are until all characters have been read.
(make-sxml-generator
element)
Returns a generator of event objects representing the SXML element.
(event-generator->uxml
gen port options)
Invokes the generator gen to obtain event objects
and writes the corresponding MicroXML document to the textual output port port,
using the symbols in options.
If the resulting document would not be well-formed MicroXML, an error is signaled that satisfies uxml-error?.
(event-generator->sxml
gen)
Invokes the generator gen to obtain event objects,
constructs the corresponding SXML element, and returns it.
If the resulting object would not be structurally correct SXML, an error is signaled that satisfies uxml-error?.
(write-uxml
port options)
Returns a SRFI 158 accumulator
that accepts an event object or an end of file object.
When invoked repeatedly, the accumulator writes the corresponding MicroXML representation
to the textual output port port using the symbols in options,
and returns an unspecified value. An end of file object is written as a newline.
If the resulting document would not be well-formed MicroXML, an error is signaled that satisfies uxml-error?.
(build-sxml)
Returns an accumulator
that accepts an event object or an end of file object.
When invoked repeatedly, it builds the corresponding SXML representation.
If the object is an end of file object, the procedure returns the SXML element;
if not, it returns an unspecified value.
If the resulting document would not be well-formed MicroXML, an error is signaled that satisfies uxml-error?.
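The accumulator protocol can be sketched as a closure that keeps a stack of partially built elements. This is only an illustration under stated assumptions: make-sketch-builder is a hypothetical stand-in for build-sxml, it handles only the $start, $text, and $end event formats described under Events, it treats the attribute object as an association list, and it performs no well-formedness checking.

```scheme
(import (scheme base))

;; Hypothetical stand-in for build-sxml; not the library's implementation.
(define (make-sketch-builder)
  (let ((open '())     ; partially built elements, innermost first,
                       ; each stored in reverse: (child ... attrs name)
        (result #f))   ; the finished root element, once it is closed
    (define (add-child! child)
      (set-car! open (cons child (car open))))
    (lambda (event)
      (cond
        ((eof-object? event) result)
        ((eq? (car event) '$start)
         ;; The car of the event's stack is the name of the new element.
         (set! open (cons (list (list-ref event 2) (car (list-ref event 1)))
                          open)))
        ((eq? (car event) '$text)
         (add-child! (list-ref event 1)))
        ((eq? (car event) '$end)
         (let ((done (reverse (car open))))
           (set! open (cdr open))
           (if (null? open)
               (set! result done)
               (add-child! done))))
        (else (error "unexpected event" event))))))
```

Feeding the events for a document one at a time and then passing an end-of-file object returns the assembled element.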
A MicroXML event is either an end-of-file object or a list representing a parsing event, in one of the following formats. Stack is a list of SXML element names currently being processed; the car of the list is the name of the current element.
($start
stack attr-list)
Represents a start-tag. Attr-list is a JSON object representing the attributes.
($end
stack)
Represents an end-tag.
($text
text)
Represents character content. Text is the character content as a string.
($error
stack error-code . other)
Represents a parsing error. Error-code is a symbol. Other is implementation-dependent.
In particular, if error-code is $pi, this error indicates the presence of a processing instruction or XML declaration, which are not part of MicroXML. In that case, the first two elements of other are the PI target and the PI content.
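To make the event formats concrete, here is a minimal sketch that flattens an SXML element into a list of events. It is an illustration, not the behavior of make-sxml-generator: the name element->events is hypothetical, the attribute object is assumed to be an association list, and the stack in a $end event is assumed to still include the element being closed.

```scheme
(import (scheme base))

;; Hypothetical helper: returns the list of events for ELEMENT.
(define (element->events element)
  (let walk ((element element) (stack '()))
    ;; Push the element's name; the car of the stack is the current element.
    (let ((stack (cons (car element) stack)))
      (append
       (list (list '$start stack (cadr element)))
       (apply append
              (map (lambda (child)
                     (if (string? child)
                         (list (list '$text child))
                         (walk child stack)))
                   (cddr element)))
       (list (list '$end stack))))))
```

For the element (p ((class . "note")) "Hi " (b () "there")), the sketch yields a $start event with stack (p), a $text event, the three events for b with stack (b p), and finally a $end event with stack (p).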
(sxml-element?
obj)
Returns #t
if obj is an SXML element and #f
otherwise.
The procedure checks that obj is a list whose first element is a symbol and whose second element's car is the symbol @; further elements of the list are not examined.
(sxml-empty?
element)
Returns #t
if element is an empty SXML element and #f
otherwise.
(sxml-wf-element-name?
string)
Returns #t
if string matches the MicroXML name production; returns #f
otherwise.
(sxml-wf-attribute-name?
string)
Returns #t
if string matches the MicroXML attribute name production; returns #f
otherwise.
(sxml-wf-string?
string)
Returns #t
if all the characters in string are allowed in MicroXML character content and #f
otherwise.
(sxml-wf-element?
element)
Returns #t if element is well-formed and #f otherwise.
The first element of element must be a symbol whose print name satisfies sxml-wf-element-name?.
The second element of element must be a JSON object that maps symbols whose print names satisfy sxml-wf-attribute-name? to strings that satisfy sxml-wf-string?.
The remaining elements must be either strings that satisfy sxml-wf-string? or lists that satisfy sxml-wf-element?.
(sxml-attribute?
element attr-name)
Returns #t
if attr-name (a symbol) is an attribute of element and #f
otherwise.
(sxml-id-valid?
element id-mapping idref-list)
Returns #t if all idref attributes contain valid ids.
An id is valid if it appears as a key in id-mapping (see sxml-make-id-mapping).
Idref-list is a list of 2-element sublists, where the first element of each sublist is an element name and the second element is an attribute name.
It specifies for each element name which of its attributes are idrefs.
(sxml-language?
element language)
Returns #t
if the language of element,
as specified by the value of a lang
or xml:lang
attribute
(the latter is not well-formed MicroXML but is supported for backward compatibility)
matches language; returns #f
otherwise.
If element has no such attribute, the language of the nearest ancestor of element
that has such an attribute is used.
If there is no such attribute at all, then sxml-language? returns #f.
The attribute value matches language if, in a case-insensitive comparison, language exactly equals the attribute value, or if language exactly equals a prefix of the attribute value such that the first character following the prefix is "-".
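The matching rule can be sketched as a predicate on two strings. The name language-matches? is hypothetical, and the sketch covers only the comparison itself, not the lookup of the attribute on the element or its ancestors.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical helper: does LANGUAGE match the attribute value ATTR-VALUE?
(define (language-matches? attr-value language)
  (let* ((v (string-downcase attr-value))   ; case-insensitive comparison
         (l (string-downcase language))
         (vlen (string-length v))
         (llen (string-length l)))
    (cond
      ;; Same length: the two must be exactly equal.
      ((= vlen llen) (string=? v l))
      ;; Longer value: LANGUAGE must be a prefix followed by "-".
      ((> vlen llen)
       (and (string=? (substring v 0 llen) l)
            (char=? (string-ref v llen) #\-)))
      (else #f))))
```

Under this rule the language "en" matches the attribute values "en" and "en-US", but not "eng"; and the language "en-US" does not match the attribute value "en".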
(sxml-make-parent-mapping
document)
Creates a parent mapping based on the SXML element document.
A parent mapping is an opaque object that maps each element to its parent,
or to #f
if there is no parent.
Returns the parent mapping.
(sxml-parent
element parent-mapping)
Uses parent-mapping to determine the parent of element and returns it,
or #f
if there is no parent.
(sxml-root
element parent-mapping)
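Returns the root element of the tree containing element, that is, the ancestor of element (or element itself) that has no parent in parent-mapping.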
(sxml-detach-parent!
element parent-mapping)
Removes the mapping from element to its parent from parent-mapping. If element does not have a parent, nothing is done. Returns an unspecified value.
(sxml-make-id-mapping
document)
Creates an id mapping based on the SXML element document.
An id mapping is an opaque object that maps an id (a symbol) to an element.
The element and all its descendants are checked for the presence
of an attribute named id
or xml:id
(the latter is not well-formed MicroXML but is allowed in SXML for backward compatibility).
If found, an entry is created in the id mapping
that maps the corresponding attribute value as a symbol to the element.
Returns the id mapping.
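The traversal described for sxml-make-id-mapping can be sketched as follows. The id mapping is opaque in this library; the sketch models it as an association list purely for illustration, assumes the attribute object is an association list, and uses the hypothetical names sketch-id-alist and sketch-id-ref.

```scheme
(import (scheme base))

;; Hypothetical helper: collects (id-symbol . element) pairs for ROOT
;; and all its descendants.
(define (sketch-id-alist root)
  (let walk ((element root) (acc '()))
    (let* ((attrs (cadr element))
           ;; Accept either id or xml:id, as described above.
           (id (or (assq 'id attrs) (assq 'xml:id attrs)))
           (acc (if id
                    (cons (cons (string->symbol (cdr id)) element) acc)
                    acc)))
      (let loop ((children (cddr element)) (acc acc))
        (cond ((null? children) acc)
              ((pair? (car children))          ; an element child
               (loop (cdr children) (walk (car children) acc)))
              (else                            ; a string child
               (loop (cdr children) acc)))))))

;; Hypothetical lookup in the style of sxml-id: element or #f.
(define (sketch-id-ref id alist)
  (cond ((assq id alist) => cdr)
        (else #f)))
```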
(sxml-id
id id-mapping)
Looks up the symbol id in id-mapping and returns the corresponding element,
or #f
if there is none.
(sxml-copy
element)
Returns a copy of element that shares nothing with it except possibly strings.
(sxml-name
element)
Returns the name of element as a symbol.
(sxml-set-name!
element name)
Sets the name of element to name (a symbol).
(sxml-set-attr-list!
element jso)
Sets the attribute list of element to jso (a JSON object).
(sxml-value
element)
Returns the result of concatenating all string content children of element and of all its descendants in depth-first, left-to-right preorder.
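The concatenation can be sketched as a simple recursion over the content children. The name sketch-value is hypothetical, and the sketch assumes that content children start at the third member of the element, as in the representation described at the top of this document.

```scheme
(import (scheme base))

;; Hypothetical helper: concatenates all string content in document order.
(define (sketch-value element)
  (let loop ((children (cddr element)) (acc ""))
    (cond
      ((null? children) acc)
      ((string? (car children))
       (loop (cdr children) (string-append acc (car children))))
      (else   ; an element child: splice in its own string value
       (loop (cdr children)
             (string-append acc (sketch-value (car children))))))))
```

For (p ((class . "note")) "Hello, " (b () "wor" (i () "ld")) "!") the sketch returns "Hello, world!".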
(sxml-defaults
element attribute-defaults element-defaults inherited-attributes)
Returns element with default values expanded in itself and all its descendants. The following transformations are made:
- attribute-defaults is a list of 3-element sublists. Each sublist contains an element name (a symbol), an attribute name (a symbol), and a default value (a string). All elements with those names are checked for the presence of the corresponding attribute. If it is missing, the attribute is added with the specified default value.
- element-defaults is a list of 2-element sublists. Each sublist contains an element name (a symbol) and a default value (a string). All empty elements with any of those names have the default value installed as the only content child.
- inherited-attributes is a list of symbols. All elements are checked for an attribute whose name is in the list. If it is absent, then the nearest ancestor of the element that has this attribute is found (note that no parent mapping is required), and the attribute is added to the element being processed with the same value as in the ancestor.
(sxml-element-position
element parent-map)
Returns the position of element among the element children of the parent of element as an exact integer, with 1 meaning the first element child.
Elements are compared in the sense of eqv?.
If there is no parent, returns 0.
(sxml-child-position
child parent-map)
Returns the position of child among the content children of the parent of child as an exact integer, with 1 meaning the first content child.
Elements are compared in the sense of eqv?; strings are compared in the sense of string=?.
Note that if there are multiple equal strings, the position of the first is returned.
If there is no parent, returns 0.
(sxml-element-size
element)
Returns the number of content children of element as an exact integer.
(sxml-normalize-element!
element)
Returns a normalized version (using mutation) of an SXML element that does not necessarily conform to the definition. In particular, at least the following repairs are made:
- If the name is a string, it is converted to a symbol with string->symbol.
- If the attribute-list is missing, an empty JSO is provided.
- If the attribute-list does not begin with an @ element, one is provided.
- If one of the content children or an attribute value is a number, it is converted to a string with number->string.
- If one of the content children or an attribute value is a boolean, it is converted to a string with sxml-boolean->string.
- If one of the content children or an attribute value is a symbol, it is converted to a string with symbol->string.
- If one of the content children is some other type of Scheme object, it is converted to a string by some implementation-defined means or else removed.
- If an attribute value is some other type of Scheme object, it is converted to a string by some implementation-defined means or else that attribute is removed.
- If after the above transformations are completed, two or more consecutive content children are strings, they are consolidated.
- If any content children are elements, they are recursively normalized.
(sxml-display
element)
Displays information on (current-error-port)
about element.
The precise nature of the information displayed is undefined,
except that it should not recurse into child elements and should
end with a newline;
there is no guarantee that it can be re-read.
Element is returned.
(uxml-escape-string
string attribute? apos? ascii?)
Converts string to contain the necessary entity references for MicroXML.
In all cases, the characters <, &, and > are escaped with entity references.
If attribute? is true, then ' is escaped if apos? is true, and " is escaped if apos? is false, in both cases with an entity reference. Finally, if ascii? is true, non-ASCII characters are escaped with numeric character references. All other characters are left unchanged. The escaped result is returned.
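The escaping rules can be sketched character by character. The name sketch-escape is hypothetical; the sketch assumes the five predefined entity names (lt, amp, gt, apos, quot) and hexadecimal numeric character references of the form &#x...;, and it is not the library's implementation.

```scheme
(import (scheme base))

;; Hypothetical helper following the rules described above.
(define (sketch-escape str attribute? apos? ascii?)
  (let ((out (open-output-string)))
    (string-for-each
     (lambda (c)
       (cond
         ;; Always escaped.
         ((char=? c #\<) (write-string "&lt;" out))
         ((char=? c #\&) (write-string "&amp;" out))
         ((char=? c #\>) (write-string "&gt;" out))
         ;; In attribute values, escape whichever quote delimits the value.
         ((and attribute? apos? (char=? c #\'))
          (write-string "&apos;" out))
         ((and attribute? (not apos?) (char=? c #\"))
          (write-string "&quot;" out))
         ;; Optionally escape everything outside the ASCII range.
         ((and ascii? (> (char->integer c) 127))
          (write-string "&#x" out)
          (write-string (number->string (char->integer c) 16) out)
          (write-string ";" out))
         (else (write-char c out))))
     str)
    (get-output-string out)))
```

Note that when attribute? is false, both kinds of quotation mark pass through unchanged, since they need no escaping in character content.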
(uxml-unescape-string
string)
Converts string by translating all MicroXML escapes, both entity references and numeric character references, to single characters. All other characters are left unchanged. The result is returned.
(uxml-normalize-space
string)
Returns a string that is equal to string, but with all leading and trailing whitespace removed, and all other consecutive whitespace characters replaced by a single space.
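A minimal sketch of this normalization follows. The name sketch-normalize-space is hypothetical, and the sketch uses char-whitespace? as its notion of whitespace, which is an assumption: it accepts more characters than the space, tab, carriage return, and newline that XML treats as whitespace.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical helper: trims the ends and collapses inner whitespace runs.
(define (sketch-normalize-space str)
  (let loop ((chars (string->list str))
             (out '())             ; output characters, in reverse
             (pending-space? #f))  ; a whitespace run was seen after some output
    (cond
      ((null? chars) (list->string (reverse out)))   ; trailing run is dropped
      ((char-whitespace? (car chars))
       ;; Leading whitespace is ignored because nothing has been output yet.
       (loop (cdr chars) out (pair? out)))
      (else
       (loop (cdr chars)
             (if pending-space?
                 (cons (car chars) (cons #\space out))
                 (cons (car chars) out))
             #f)))))
```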
These use the conventions of XPath and XML Schema.
(sxml-string->boolean
string)
Converts the strings "1" and "true" to #t, and the strings "0" and "false" to #f.
If any other string is passed, an error is signaled that satisfies uxml-error?.
(sxml-boolean->string
boolean)
Converts #t to "true" and #f to "false".
(sxml-number->boolean
number)
Returns #f if number satisfies zero?; otherwise returns #t.
(sxml-boolean->number
boolean)
If boolean is true, returns 1, otherwise returns 0.
The following procedures are generator operations: they accept a generator of SXML elements and return another generator, also of SXML elements.
After the sxml- prefix, their names begin with g, following the convention of SRFI 158 for generator operations.
If the source generator is empty, so is the result generator.
(sxml-groot
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns the root element of each.
(sxml-gparent
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns the parent element of each.
(sxml-gancestor
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns the ancestor elements of each, from parent to root.
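The ancestor axis can be sketched with generators modeled as thunks that return an end-of-file object when exhausted, as in SRFI 158. Everything here is illustrative: the parent mapping is opaque in this library and is modeled as an association list keyed by element identity, the attribute object is assumed to be an association list, and all procedure names are hypothetical.

```scheme
(import (scheme base))

;; Minimal generator helpers (thunks that return eof when exhausted).
(define (list->gen lst)
  (lambda ()
    (if (null? lst)
        (eof-object)
        (let ((x (car lst))) (set! lst (cdr lst)) x))))

(define (gen->list g)
  (let loop ((acc '()))
    (let ((x (g)))
      (if (eof-object? x) (reverse acc) (loop (cons x acc))))))

;; Hypothetical parent mapping: alist of (element . parent-or-#f).
(define (make-parent-alist root)
  (let walk ((element root) (parent #f) (acc '()))
    (let loop ((children (cddr element))
               (acc (cons (cons element parent) acc)))
      (cond ((null? children) acc)
            ((pair? (car children))
             (loop (cdr children) (walk (car children) element acc)))
            (else (loop (cdr children) acc))))))

(define (parent-of element mapping)
  (cond ((assq element mapping) => cdr)
        (else #f)))

;; For each element drawn from GEN, yields its ancestors from parent to root.
(define (sketch-gancestor mapping gen)
  (let ((current #f))   ; next ancestor to yield, or #f when none is pending
    (define (next)
      (if current
          (let ((result current))
            (set! current (parent-of current mapping))
            result)
          (let ((element (gen)))
            (if (eof-object? element)
                element
                (begin
                  (set! current (parent-of element mapping))
                  (next))))))
    next))
```

Because the result is itself a generator, it can be passed to another operation of the same shape, which is how chains of axes compose.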
(sxml-gancestor-or-self
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns each element itself followed by its ancestor elements, from parent to root.
(sxml-gchild
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns the child elements of each, from left to right.
(sxml-gdescendant
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns the descendant elements of each, in depth-first order from left to right.
(sxml-gdescendant-or-self
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns each element itself followed by its descendant elements, in depth-first order from left to right.
(sxml-gfollowing
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns all the following elements of each, in document order.
(sxml-gfollowing-or-self
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns each element itself followed by all its following elements, in document order.
(sxml-gpreceding
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns all the preceding elements of each, in reverse document order.
(sxml-gpreceding-or-self
parent-mapping gen)
Returns a generator that draws SXML elements from gen and, on successive invocations, returns each element itself followed by all its preceding elements, in reverse document order.
(sxml-path
element parent-mapping item ...)
The items are interpreted as follows: the symbol / is root, // is descendant, and .. is parent.
All other symbols are the names of child elements.
An exact integer is the number of a content child.
A procedure is an axis procedure or a user-written procedure with the same signature.
Errors are signaled using objects of a disjoint type. They contain an $error or $pi event.
(uxml-error?
obj)
Returns #t
if obj belongs to the error type, and #f
otherwise.
(uxml-error-event
xml-error)
Returns an $error
event (i.e. a list) encapsulated in xml-error.