XProc Steps

Summary of the XProc Steps by Roger L. Costello

Each entry below lists, in order: step name, input ports, output ports, options, and description.
add-attribute

source (primary)

result (primary)

match (required): XSLT match pattern

attribute-name (required)

attribute-value (required)

Adds an attribute/value to the element(s) identified by the match pattern
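To make the behaviour concrete, here is a rough Python emulation of what add-attribute does. It is only a sketch, not XProc itself: the function name add_attribute is hypothetical, and ElementTree's limited path syntax stands in for a real match pattern.

```python
import xml.etree.ElementTree as ET


def add_attribute(doc: ET.Element, path: str, name: str, value: str) -> ET.Element:
    """Set name=value on every element selected by an ElementTree path.

    Assumption: ElementTree paths are a small XPath subset, used here
    only as a stand-in for the step's match option.
    """
    for element in doc.findall(path):
        element.set(name, value)
    return doc


doc = ET.fromstring("<doc><item/><item/><note/></doc>")
add_attribute(doc, ".//item", "status", "new")
print(ET.tostring(doc, encoding="unicode"))
```

Only the two item elements receive the new attribute; the note element is left alone.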

add-xml-base

source (primary)

result (primary)

all (boolean, default is false)

relative (boolean, default is true)

Adds an xml:base attribute to the root element; its value is the base URI of the document on the source port. Set "all" to true to add xml:base to every element in the document.

compare

source (primary)

alternate

result

fail-if-not-equal (boolean, default is false)

Compares the document on the source port with the document on the alternate port.

count

source (primary, sequence)

result (primary)

limit (integer, defaults to 0)

Counts the documents in the sequence on the source input port. Use limit to cut off the counting (limit=0 means count all).
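The effect of the limit option can be sketched in a few lines of Python. This is an illustration of the counting rule only; count_documents is a hypothetical name and any iterable stands in for the document sequence.

```python
from typing import Iterable


def count_documents(documents: Iterable[object], limit: int = 0) -> int:
    """Count documents, stopping early once `limit` is reached (0 = no limit)."""
    total = 0
    for _ in documents:
        total += 1
        if limit > 0 and total == limit:
            break  # stop reading the sequence once the limit is hit
    return total


print(count_documents(["a.xml", "b.xml", "c.xml"]))
print(count_documents(["a.xml", "b.xml", "c.xml"], limit=2))
```

A limit is handy when the question is merely "is there more than one document?", because counting can stop as soon as the answer is known.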

delete

source (primary)

result (primary)

match (required): XSLT match pattern

Deletes the node(s), such as elements and attributes, identified by the match pattern.
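A minimal Python sketch of deleting matched elements follows. It is an emulation under stated assumptions: delete_elements is a hypothetical name, an ElementTree path stands in for the match option, and only elements (not attributes) are handled.

```python
import xml.etree.ElementTree as ET


def delete_elements(doc: ET.Element, path: str) -> ET.Element:
    """Remove every element selected by `path` from the tree rooted at `doc`."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in doc.iter() for child in parent}
    for element in doc.findall(path):
        parent = parents.get(element)
        if parent is not None:
            # Note: remove() also drops any text that trailed the element.
            parent.remove(element)
    return doc


doc = ET.fromstring("<doc><keep/><drop><drop/></drop><keep/></doc>")
delete_elements(doc, ".//drop")
print(ET.tostring(doc, encoding="unicode"))
```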

directory-list

result (primary)

path (required): URI; location of the directory

include-filter: regular expression

exclude-filter: regular expression

Produces a list of the file names and folder names of a specified directory. Can restrict the listed files/folders using the include and exclude filters.

error

source (*not* primary)

result (primary, sequence)

code (required): QName

Generates a dynamic error. The error message is what you put on the source input port.

escape-markup

source (primary)

result (primary)

cdata-section-elements: list of element names

doctype-public: string

doctype-system: URL

escape-uri-attributes: boolean, default is false

include-content-type: boolean, default is true

indent: boolean, default is false

media-type: a MIME type

method: string, default is 'xml'

omit-xml-declaration: boolean, default is true

standalone: true, false, or omit, default is omit

undeclare-prefixes: boolean

version: string, default is '1.0'

This escapes all the markup between the root element's start tag and end tag. It does not escape the root element.
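The idea of escaping everything inside the root element, but not the root itself, can be sketched in Python. This is an approximation, not the step's serializer: escape_markup is a hypothetical name, and none of the serialization options listed above are modelled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_markup(root: ET.Element) -> ET.Element:
    """Return a copy of `root` whose content is the serialized form of its children."""
    # Serialize the leading text plus each child; ET.tostring() also
    # includes the text that trails each child element.
    inner = escape(root.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in root
    )
    escaped = ET.Element(root.tag, dict(root.attrib))
    escaped.text = inner  # '<', '>' and '&' are escaped when this is serialized
    return escaped


doc = ET.fromstring("<doc><p>Hi <b>there</b></p></doc>")
print(ET.tostring(escape_markup(doc), encoding="unicode"))
```

The root element survives as markup, while its former children become plain text.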

exec

source (primary, sequence)

result (primary), the result is wrapped in a <c:result> element

errors

exit-status

command (required): the name of the command to be executed

args: the arguments passed to the command named in the command option

cwd: (current working directory) use this to specify the directory where you want the command run

source-is-xml: boolean, default is true

result-is-xml: boolean, default is true

wrap-result-lines: boolean, default is false

errors-is-xml: boolean, default is false

wrap-error-lines: boolean, default is false

path-separator: string

failure-threshold: integer

arg-separator: string, default is a single space

byte-order-mark: boolean

cdata-section-elements: string, default is the empty string

doctype-public: string

doctype-system: URL

encoding: string

escape-uri-attributes: boolean, default is false

include-content-type: boolean, default is true

indent: boolean, default is false

media-type: a MIME type

method: string, default is 'xml'

omit-xml-declaration: boolean, default is true

standalone: true, false, or omit, default is omit

undeclare-prefixes: boolean

version: string, default is '1.0'

This runs an external command. The data on the source port is passed to the command as standard input, the standard output of the command is put on the result port, the standard error is put on the errors port, and the exit code is put on the exit-status port.
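The flow of data through the three output ports can be sketched with Python's subprocess module. This is an analogy rather than an implementation of the step: run_command is a hypothetical name, the dictionary keys simply mirror the port names, and no XML wrapping is performed.

```python
import subprocess
import sys
from typing import Dict, List, Optional, Union


def run_command(
    command: str, args: List[str], stdin_text: str, cwd: Optional[str] = None
) -> Dict[str, Union[str, int]]:
    """Run a command, feeding it stdin_text, and collect its three outputs."""
    completed = subprocess.run(
        [command, *args],
        input=stdin_text,     # plays the role of the source port
        capture_output=True,
        text=True,
        cwd=cwd,              # plays the role of the cwd option
    )
    return {
        "result": completed.stdout,
        "errors": completed.stderr,
        "exit-status": completed.returncode,
    }


# Use the running Python interpreter as a portable example command.
script = "import sys; sys.stdout.write(sys.stdin.read().upper())"
print(run_command(sys.executable, ["-c", script], "hello"))
```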

filter

source (primary)

result (primary, sequence)

select (required): XPath expression

Selects for output portions of the source document; the portions are identified by an XPath expression.

hash

source (primary)

parameters: input, kind="parameter"; some hash algorithms require arguments, this is used to provide those arguments (settings)

result (primary)

value (required): string; the hash algorithm takes this string and generates a hash of it

algorithm (required): 'crc' or 'md' or 'sha' (the name of a hash algorithm)

match (required): XSLT match pattern; the element(s) identified is replaced by the hash value that is generated

version: if the algorithm is 'crc' then version must be 32, if the algorithm is 'md' then version must be 5, if the algorithm is 'sha' then version must be 1

Makes a digital fingerprint and embeds it within the document.
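The algorithm/version pairs above map naturally onto Python's standard library. The sketch below only computes the fingerprint; embedding it in the document is not shown. The function name compute_hash is hypothetical, and the UTF-8 encoding and lowercase hexadecimal output are assumptions, not something this summary specifies.

```python
import hashlib
import zlib


def compute_hash(value: str, algorithm: str, version: str) -> str:
    """Hash `value` with one of the algorithm/version pairs listed above."""
    data = value.encode("utf-8")  # assumption: the string is hashed as UTF-8
    if (algorithm, version) == ("crc", "32"):
        return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
    if (algorithm, version) == ("md", "5"):
        return hashlib.md5(data).hexdigest()
    if (algorithm, version) == ("sha", "1"):
        return hashlib.sha1(data).hexdigest()
    raise ValueError(f"unsupported algorithm/version: {algorithm}/{version}")


for algorithm, version in [("crc", "32"), ("md", "5"), ("sha", "1")]:
    print(algorithm, version, compute_hash("abc", algorithm, version))
```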

http-request

source (primary): the data bound to this port must be a <c:request> element

result (primary)

byte-order-mark: boolean

cdata-section-elements: string, default is the empty string

doctype-public: string

doctype-system: URL

encoding: string

escape-uri-attributes: boolean, default is false

include-content-type: boolean, default is true

indent: boolean, default is false

media-type: a MIME type

method: string, default is 'xml'

omit-xml-declaration: boolean, default is true

standalone: true, false, or omit, default is omit

undeclare-prefixes: boolean

version: string, default is '1.0'

Enables a step to interact with web services (both REST and SOAP-based web services). You can GET and POST to a web service.

identity

source (primary, sequence)

result (primary, sequence)

Makes a verbatim copy of its input available on its output. Not very exciting, but remarkably useful (e.g. it can be used to generate data).

insert

source (primary)

insertion (sequence): the documents to insert

result (primary)

match: an XSLT match pattern; the nodes relative to which the documents are inserted

position (required): first-child, last-child, before, or after; where to insert the documents

Inserts the document on the insertion port into the document on the source port. It is inserted at the point identified by the options.

label-elements

source (primary)

result (primary)

attribute: QName, default is xml:id; the name of the attribute

label: XPath expression, default is concat("_",$p:index); computes the value of the attribute

match: XSLT match pattern, default is '*'; the elements to be labeled

replace: boolean, default is true; true means an existing label should be overwritten

Generates a label for each matched element and stores that label in an attribute.

load

result (primary)

href (required): URI; the location of the XML

dtd-validate: boolean, default is false; true means validate the XML against a DTD

Loads the XML document identified by a URI.

make-absolute-uris

source (primary)

result (primary)

match (required): XSLT match pattern; the location of an element or attribute

base-uri: URL

Replaces an element or attribute containing a relative URL with its absolute URL.
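Resolving a relative URL against a base is the core of this step, and Python's urljoin performs the same kind of resolution. The sketch below is an emulation for attributes only: make_absolute is a hypothetical name, an ElementTree path stands in for the match option, and the example.com URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def make_absolute(doc: ET.Element, path: str, attribute: str, base_uri: str) -> ET.Element:
    """Resolve `attribute` on each selected element against `base_uri`."""
    for element in doc.findall(path):
        relative = element.get(attribute)
        if relative is not None:
            element.set(attribute, urljoin(base_uri, relative))
    return doc


doc = ET.fromstring('<doc><img href="images/a.png"/><img href="../b.png"/></doc>')
make_absolute(doc, ".//img", "href", "http://example.com/docs/index.xml")
print([img.get("href") for img in doc.findall("img")])
```

A URL that is already absolute is left unchanged by the resolution.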

namespace-rename

source (primary)

result (primary)

from: URL

to: URL

apply-to: all, elements, or attributes

Changes a namespace URL to a new URL.

pack

source (primary, sequence)

alternate (sequence): the second sequence of documents

result (primary, sequence)

wrapper (required): QName; an element by this name will wrap the documents

Merges the document sequences on the two input ports in a pair-wise fashion and wraps the merged documents.
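Pair-wise merging can be sketched with zip_longest from Python's standard library. This is an illustrative emulation: pack is a hypothetical function name, and the sketch assumes that when one sequence runs out, each remaining document in the longer one is wrapped on its own.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest
from typing import List


def pack(source: List[ET.Element], alternate: List[ET.Element], wrapper: str) -> List[ET.Element]:
    """Wrap the i-th document of each sequence together in a `wrapper` element."""
    packed = []
    for first, second in zip_longest(source, alternate):
        wrapped = ET.Element(wrapper)
        for document in (first, second):
            if document is not None:  # one sequence may be shorter
                wrapped.append(document)
        packed.append(wrapped)
    return packed


source = [ET.Element("a"), ET.Element("b"), ET.Element("c")]
alternate = [ET.Element("x")]
for pair in pack(source, alternate, "pair"):
    print(ET.tostring(pair, encoding="unicode"))
```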

parameters

parameters (sequence, kind="parameter")

result (not primary)

Wraps a sequence of <c:param> elements in a <c:param-set> element.

rename

source (primary)

result (primary)

match (required): XSLT match pattern; item to be renamed

new-name (required): QName; the new name for the item

Used to rename elements, attributes, or processing-instructions.

replace

source (primary)

replacement: the new XML

result (primary)

match (required): XSLT match pattern; the element to replace

Replaces matching elements in its primary input with the document in the replacement port.

set-attributes

source (primary)

attributes: the document containing the attributes to be copied

result (primary)

match (required): XSLT match pattern; the elements to receive attributes

Copies the attributes on the root element of the attributes port to each element identified by the match pattern.

sink

source (primary, sequence)

Accepts a sequence of documents and discards them. It has no output.

split-sequence

source (primary, sequence)

matched (primary, sequence)

not-matched (sequence)

test (required): XPath expression; each document is tested using this XPath expression

initial-only: boolean, defaults to false; if this is set to true then the first document that fails the test, and all subsequent documents, are output on the not-matched port

Divides up a sequence of documents; the documents that meet a boolean expression exit on the matched output port, the others exit on the not-matched output port.
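The difference that initial-only makes is easiest to see in code. The Python sketch below emulates the routing rule only: split_sequence is a hypothetical name, and a plain Python predicate stands in for the XPath test.

```python
from typing import Callable, Iterable, List, Tuple


def split_sequence(
    documents: Iterable[object],
    test: Callable[[object], bool],
    initial_only: bool = False,
) -> Tuple[List[object], List[object]]:
    """Return (matched, not_matched) according to `test`."""
    matched: List[object] = []
    not_matched: List[object] = []
    failed = False
    for document in documents:
        if initial_only and failed:
            # After the first failure, everything goes to not-matched.
            not_matched.append(document)
        elif test(document):
            matched.append(document)
        else:
            failed = True
            not_matched.append(document)
    return matched, not_matched


sizes = [1, 2, 5, 1]
print(split_sequence(sizes, lambda n: n < 3))
print(split_sequence(sizes, lambda n: n < 3, initial_only=True))
```

With initial-only left false, the final 1 still reaches matched; with it set to true, that 1 follows the 5 onto not-matched.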

store

source (primary)

result (not primary)

href (required): URL; the XML document on the source input port is stored to this URL.

byte-order-mark: boolean

cdata-section-elements: list of QNames, default is the empty string

doctype-public: string

doctype-system: URL

encoding: string

escape-uri-attributes: boolean, default is false

include-content-type: boolean, default is true

indent: boolean, default is false

media-type: a MIME type

method: string, default is 'xml'

omit-xml-declaration: boolean, default is true

standalone: true, false, or omit, default is omit

undeclare-prefixes: boolean

version: string, default is '1.0'

Stores a serialized version of its input to a URI.

string-replace

source (primary)

result (primary)

match (required): XSLT match pattern; the elements that are to be replaced with string data

replace (required): an XPath expression; it is evaluated for each node identified by the match pattern, and its string value replaces that node

Replaces matched elements with string data.

unescape-markup

source (primary)

result (primary)

namespace: a URI

content-type: a MIME type, default is application/xml

encoding: string

charset: string

Does the opposite of p:escape-markup; it converts escaped markup to markup, e.g. &lt; is converted to <

unwrap

source (primary)

result (primary)

match (required): an XSLT match pattern

Replaces matched elements with their children. In other words, it deletes the current element but retains its children.

validate-with-relax-ng

source (primary)

schema

result (primary)

dtd-attribute-values: true or false, default is false

dtd-id-idref-warnings: true or false, default is false

assert-valid: true or false (default is true); assert-valid="true" means you want an error thrown if the instance document fails validation

Performs RELAX NG validation on the input document.

validate-with-schematron

source (primary)

schema

parameters (kind="parameter")

result (primary)

report (sequence)

phase: the identifier of a schematron phase; its default is #ALL (all phases are selected for execution)

assert-valid: true or false (default is true); assert-valid="true" means you want an error thrown if the instance document fails validation

Performs Schematron validation on the input document.

validate-with-xml-schema

source (primary)

schema

result (primary)

mode: strict or lax (default is strict)

assert-valid: true or false (default is true); assert-valid="true" means you want an error thrown if the instance document fails validation

use-location-hints: true or false (default is false)

try-namespaces: true or false (default is false)

Performs XML Schema validation on the input document.

wrap

source (primary)

result (primary)

match (required): XSLT match pattern; identifies the elements to be wrapped

wrapper (required): QName; create an element with this name and use it to wrap the elements identified by match

group-adjacent: XPath expression

Places an element around each element identified by the match pattern.

wrap-sequence

source (primary, sequence)

result (primary, sequence)

wrapper (required): QName; create an element with this name and use it to wrap the documents on the input port

group-adjacent: XPath expression

Wraps the sequence of documents on the input port.
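How group-adjacent changes the wrapping can be sketched with itertools.groupby. This is an emulation under assumptions: wrap_sequence is a hypothetical name, and a Python key function stands in for the XPath expression that would be evaluated against each document.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from typing import Callable, List, Optional


def wrap_sequence(
    documents: List[ET.Element],
    wrapper: str,
    group_key: Optional[Callable[[ET.Element], object]] = None,
) -> List[ET.Element]:
    """Wrap all documents together, or wrap runs of adjacent documents sharing a key."""
    if group_key is None:
        groups = [list(documents)]
    else:
        groups = [list(run) for _, run in groupby(documents, key=group_key)]
    wrapped = []
    for group in groups:
        element = ET.Element(wrapper)
        element.extend(group)
        wrapped.append(element)
    return wrapped


docs = [ET.Element("a"), ET.Element("a"), ET.Element("b"), ET.Element("a")]
for result in wrap_sequence(docs, "group", group_key=lambda d: d.tag):
    print(ET.tostring(result, encoding="unicode"))
```

Only adjacent documents are grouped, so the final a document gets its own wrapper rather than joining the first two.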

www-form-urldecode

No input port

result (primary)

value: string, the x-www-form-urlencoded string

Converts an x-www-form-urlencoded string into a set of parameters.
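The decoding itself is what Python's parse_qsl does, so the step can be approximated as follows. This is a sketch, not the step's implementation: www_form_urldecode is a hypothetical function name, and the result is modelled as a c:param-set element holding one c:param per name/value pair.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl

# Namespace bound to the "c" prefix in XProc step vocabularies.
C_NS = "http://www.w3.org/ns/xproc-step"


def www_form_urldecode(value: str) -> ET.Element:
    """Decode name=value pairs into a c:param-set of c:param elements."""
    param_set = ET.Element(f"{{{C_NS}}}param-set")
    for name, decoded in parse_qsl(value, keep_blank_values=True):
        ET.SubElement(param_set, f"{{{C_NS}}}param", {"name": name, "value": decoded})
    return param_set


params = www_form_urldecode("name=W3C&spec=XProc%201.0&note=b+c")
for param in params:
    print(param.get("name"), "=", param.get("value"))
```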

www-form-urlencode

source (primary)

parameters (kind="parameter")

result (primary)

match (required): an XSLT match pattern; the encoded string is placed in the nodes specified here

Encodes the name/value pairs supplied on the parameters port as an x-www-form-urlencoded string, which is then inserted into the XML document.
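The encoding half of this step corresponds to Python's urlencode. The sketch below covers only the string generation; placing the string into the matched nodes is not modelled. The function name www_form_urlencode and the sample parameters are hypothetical.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode


def www_form_urlencode(parameters: Dict[str, str]) -> str:
    """Encode name/value pairs as an x-www-form-urlencoded string."""
    return urlencode(parameters)


encoded = www_form_urlencode({"name": "W3C", "spec": "XProc 1.0"})
print(encoded)
# Decoding the string again recovers the original pairs.
print(parse_qsl(encoded))
```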

xinclude

source (primary)

result (primary)

fixup-xml-base: boolean, default is false

fixup-xml-lang: boolean, default is false

Pulls in (macro-substitutes) all the XML snippets referenced using xi:include.

xquery

source (primary, sequence)

query

parameters (kind="parameter")

result (primary, sequence)

Executes an XQuery on one or more XML documents.

xslt

source (primary, sequence)

stylesheet

parameters (kind="parameter")

result (primary)

secondary (sequence)

initial-mode: QName

template-name: QName

output-base-uri: URI

version: '1.0' or '2.0'

Applies an XSLT 1.0 or 2.0 stylesheet to a document.