Reducing proliferation of function namespaces #21

Open
michaelhkay opened this issue Oct 28, 2020 · 5 comments

@michaelhkay

michaelhkay commented Oct 28, 2020

As the number of function libraries increases, the number of namespace declarations required proliferates. There are several practical consequences. The code becomes more cluttered. It also becomes slower, because namespace declarations have to be maintained at run time: they aren't used only for compile-time disambiguation. Even when exclude-result-prefixes="#all" is used in XSLT to prevent the namespaces finding their way into the result tree, there are other operations where a large static context becomes a nuisance. For example, when we compile code to SEF files, a significant proportion of the size of the compiled file is taken up with namespace information, much of which is never used. The more namespaces there are, the more likely it becomes that two functions have different namespace contexts, and when functions are inlined, the different namespace contexts need to be maintained in the optimized code.

Many languages have a mechanism that allows functions to have a "full name" and a "short name", with a reasonably flexible way to expand the short name statically to the full name. It would be good to do this without having to proliferate namespace declarations that must be maintained at run time.

An idea for doing this, using XSLT syntax, is to allow something like this:

<xsl:function-library>
    <xsl:import-functions namespace="http://www.w3.org/2005/xpath-functions">
       <xsl:alias-function name="put#1" as="update-put"/>
    </xsl:import-functions>
    <xsl:import-functions namespace="http://www.w3.org/2005/xpath-functions/map">
       <xsl:alias-function name="put" as="map-put"/>
    </xsl:import-functions>
    <xsl:import-functions namespace="http://www.w3.org/2005/xpath-functions/array">
       <xsl:alias-function name="put" as="array-put"/>
       <xsl:alias-function name="get" as="array-get"/>
    </xsl:import-functions>
</xsl:function-library>

When an unprefixed function name is referenced, the local name (and arity) is resolved using the declared function library. The basic rule is that the reference must be unambiguous: if two of the imported function namespaces overlap, making the local-name/arity combination ambiguous, then that function is not accessible by local-name/arity, unless it has been assigned an alias.

(This rule is designed so that if one of the function libraries expands over time, causing an ambiguity to arise where there was none before, then (a) there is no failure unless the affected function is actually used, and (b) if it is used, then a static error is reported; this situation never causes the wrong function to be executed.)
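
A minimal Python sketch of the rule above, to make the failure mode concrete. It is not processor code: the signature table is an illustrative subset, the names `build_library`, `resolve` and `AmbiguousFunctionError` are hypothetical, and it assumes that a function that has been given an alias is registered only under that alias (the proposal does not say whether the original local name stays visible).

```python
from collections import defaultdict

MAP = "http://www.w3.org/2005/xpath-functions/map"
ARRAY = "http://www.w3.org/2005/xpath-functions/array"

# Illustrative subset of statically known signatures: namespace -> {(local, arity)}.
AVAILABLE = {
    MAP: {("put", 3), ("get", 2), ("size", 1), ("keys", 1)},
    ARRAY: {("put", 3), ("get", 2), ("size", 1)},
}

# Imported namespaces with their aliases, mirroring the nested alias-function
# elements. A key (local, None) means "all arities"; (local, n) means "name#n".
IMPORTS = [
    (MAP, {("put", None): "map-put"}),
    (ARRAY, {("put", None): "array-put", ("get", None): "array-get"}),
]


class AmbiguousFunctionError(Exception):
    """Stands in for the static error reported at compile time."""


def build_library(imports, available):
    """Map (short name, arity) to every (namespace, local name) it could mean."""
    table = defaultdict(list)
    for ns, aliases in imports:
        for local, arity in sorted(available[ns]):
            alias = aliases.get((local, arity)) or aliases.get((local, None))
            # Assumption: an aliased function is reachable only via its alias.
            table[(alias or local, arity)].append((ns, local))
    return table


def resolve(table, name, arity):
    """Resolve an unprefixed reference; fail only when this reference is used."""
    candidates = table.get((name, arity), [])
    if len(candidates) == 1:
        return candidates[0]
    if not candidates:
        raise LookupError(f"no function {name}#{arity} in the function library")
    raise AmbiguousFunctionError(f"{name}#{arity} is ambiguous: {candidates}")


table = build_library(IMPORTS, AVAILABLE)
map_put = resolve(table, "map-put", 3)   # resolved through the alias
plain_get = resolve(table, "get", 2)     # unambiguous: array's get is aliased
```

Building the table never fails, which matches point (a): `size#1` exists in both namespaces here, but the conflict only surfaces as an error if some expression actually calls `size` with one argument.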

To keep independence between modules, it would probably make sense for a function library to be named and for the name to be scoped to a package, and for individual modules to say explicitly what library they are using with a declaration such as <xsl:use-function-library name="xxx"/> which has module scope.

The impact on XPath, I think, is that the concept of "default function namespace" would be replaced by "unqualified function name resolution algorithm", whose value is a procedure for statically resolving a local-name/arity to a fully qualified function name; different host languages could use different resolution algorithms.

@gimsieke

Alternatively, in XSLT, a stylesheet author may use an attribute [xsl:]infer-function-prefixes-for. For example,

<xsl:stylesheet infer-function-prefixes-for="http://www.w3.org/2005/xpath-functions/map
                               http://www.w3.org/2005/xpath-functions/array 
                               http://www.w3.org/2005/xpath-functions/math
                               http://expath.org/ns/binary
                               http://expath.org/ns/file">

Then the processor can look up in the statically known function signatures whether the prefixless function encode-ASN-integer() can be unambiguously resolved. If not, for example for get() that is declared for arrays and maps, a static error is thrown.

One might use wildcards on the URIs, or tokens that stand for a list of namespace URIs: infer-function-prefixes-for="http://www.w3.org/2005/xpath-functions/* http://expath.org/ns/*" or infer-function-prefixes-for="#xpath31 #expath". The issue with “tokens for lists of URIs” is that each processor version may use a different list as new namespace URIs are added to EXPath etc.
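
A rough Python sketch of how the wildcard form could be expanded and then used for lookup. Everything here is an assumption made for illustration: the function names `expand_patterns` and `infer_namespace`, the small signature table, and the choice of shell-style matching via `fnmatch` (which also treats `?` and `[` as special, so a real implementation would want a stricter matcher).

```python
from fnmatch import fnmatchcase

FN = "http://www.w3.org/2005/xpath-functions"
MAP = FN + "/map"
ARRAY = FN + "/array"
MATH = FN + "/math"
FILE = "http://expath.org/ns/file"

# Illustrative subset of the statically known function signatures.
SIGNATURES = {
    FN: {("count", 1)},
    MAP: {("get", 2), ("keys", 1)},
    ARRAY: {("get", 2)},
    MATH: {("pi", 0)},
    FILE: {("exists", 1)},
}


def expand_patterns(attribute_value, known_namespaces):
    """Expand a whitespace-separated list of URI patterns to known namespaces."""
    patterns = attribute_value.split()
    return [ns for ns in known_namespaces
            if any(fnmatchcase(ns, p) for p in patterns)]


def infer_namespace(local, arity, namespaces, signatures):
    """Return the single namespace declaring local#arity, else raise."""
    hits = [ns for ns in namespaces if (local, arity) in signatures.get(ns, ())]
    if len(hits) != 1:
        # Stands in for the static error described above.
        raise ValueError(f"{local}#{arity} resolves to {len(hits)} namespaces")
    return hits[0]


namespaces = expand_patterns(
    "http://www.w3.org/2005/xpath-functions/* http://expath.org/ns/*",
    list(SIGNATURES),
)
pi_namespace = infer_namespace("pi", 0, namespaces, SIGNATURES)
```

Note that the pattern ending in `/*` does not match the base `xpath-functions` namespace itself, and that the expanded list depends on which namespaces the processor happens to know, which is the same versioning concern raised for tokens.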

A similar infer-function-prefixes-for directive can be provided for XQuery.

Ironically, unless this attribute/directive becomes standardized, it probably needs to have a vendor namespace prefix.

@michaelhkay

michaelhkay commented Oct 28, 2020

I think that just providing a list of namespaces is not enough. Without a mechanism such as alias-function, it still becomes necessary to add a namespace declaration for those function names (like get, put) that are ambiguous, and so you still end up with the problem of having lots of namespace declarations (unless you rely on using EQName syntax for those cases, which is pretty unwieldy). A simpler solution might be for us to define aliases for those functions in the "standard" namespaces where there's an overlap; but I think the alias mechanism is useful for cases where a conflict between different libraries defined by third parties arises over a period of time due to growth of one of the libraries.

But perhaps one could do:

<xsl:stylesheet function-library="http://www.w3.org/2005/xpath-functions/map 
                               http://www.w3.org/2005/xpath-functions/array 
                               http://www.w3.org/2005/xpath-functions/math
                               http://expath.org/ns/binary
                               http://expath.org/ns/file">

<xsl:function-alias local-name="array.put" full-name="Q{http://www.w3.org/2005/xpath-functions/array}put"/>

which reduces the weight of syntax.

Or perhaps we could have a "smart aliasing" scheme, where if you write select="array.put($x, $y)" and there is no function named array.put, then we search the declared function library for a function named "put" in a namespace whose last component is "array"?
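
The "smart aliasing" lookup could be sketched like this in Python. The names `smart_resolve` and `last_component` and the signature table are hypothetical, and the sketch assumes that an exact match on the full name takes priority, since `.` is legal in an NCName and a function really could be called `array.put`.

```python
MAP = "http://www.w3.org/2005/xpath-functions/map"
ARRAY = "http://www.w3.org/2005/xpath-functions/array"

LIBRARY = [MAP, ARRAY]

# Illustrative subset of statically known signatures.
SIGNATURES = {
    MAP: {("get", 2), ("put", 3)},
    ARRAY: {("get", 2), ("put", 3)},
}


def last_component(namespace):
    """Last path segment of a namespace URI, e.g. 'array'."""
    return namespace.rstrip("/").rsplit("/", 1)[-1]


def smart_resolve(name, arity, library, signatures):
    """Resolve name#arity, falling back to 'qualifier.local' matching."""
    # Step 1: a function that really has this name wins.
    exact = [ns for ns in library if (name, arity) in signatures.get(ns, ())]
    if len(exact) == 1:
        return exact[0], name
    if len(exact) > 1:
        raise ValueError(f"{name}#{arity} is ambiguous")
    # Step 2: treat 'array.put' as 'put' in a namespace ending in 'array'.
    if "." in name:
        qualifier, local = name.split(".", 1)
        hits = [ns for ns in library
                if last_component(ns) == qualifier
                and (local, arity) in signatures.get(ns, ())]
        if len(hits) == 1:
            return hits[0], local
    raise LookupError(f"cannot resolve {name}#{arity}")


array_get = smart_resolve("array.get", 2, LIBRARY, SIGNATURES)
map_put = smart_resolve("map.put", 3, LIBRARY, SIGNATURES)
```

One consequence of this design is that no alias declarations are needed for the common case, at the cost of tying short names to the spelling of the namespace URI.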

@liamquin

I'd love to be able to have a single "header file" that contains all of those namespace URIs and just refer to it. Sort of like the "automatic namespaces" proposal I wrote a few years ago (that proposal was defeated by people who wanted it to use namespaces heavily itself). You'd still have to deal with array:new and map:new though.

@michaelhkay

I was thinking the "header file" could just be a stylesheet module that contains an <xsl:function-library> element which is imported into those modules that want to use it.

@liamquin

module - that would work for XSLT, although I'm not sure it helps XPath and XQuery users as much as I'd like.
