sortinfo as optional #27

Closed

arademaker
Member

In #15, I proposed adding the DTD specifications of our semantic representations here.

  1. What tools depend on these files?
  2. As far as I understand, quantifiers do not have sortinfo attached to them in a DMRS graph. If that is the case, I propose making it an optional tag for nodes.

@goodmami
Member

goodmami commented Sep 2, 2021

Re (1), none that I'm aware of, besides just validating outputs manually using xmllint or xmlstarlet or something.

Re (2), sortinfo is required for non-quantifier nodes, however, so making it optional would be misleading. To be compliant, DMRX should have empty <sortinfo /> elements on quantifiers:

$ echo "The dog barked." | ace -g ~/grammars/erg.dat | delphin convert -f ace -t dmrx
NOTE: 1 readings, added 909 / 48 edges to chart (26 fully instantiated, 26 actives used, 15 passives used)	RAM: 1516k
NOTE: parsed 1 / 1 sentences, avg 1516k, time 0.00984s
<dmrs-list>
<dmrs cfrom="-1" cto="-1" top="10002" index="10002" surface="The dog barked.">
<node nodeid="10000" cfrom="0" cto="3"><realpred lemma="the" pos="q" /><sortinfo /></node>
<node nodeid="10001" cfrom="4" cto="7"><realpred lemma="dog" pos="n" sense="1" /><sortinfo PERS="3" NUM="sg" IND="+" PT="pt" cvarsort="x" /></node>
<node nodeid="10002" cfrom="8" cto="14"><realpred lemma="bark" pos="v" sense="1" /><sortinfo SF="prop" TENSE="past" MOOD="indicative" PROG="-" PERF="-" cvarsort="e" /></node>
<link from="10000" to="10001"><rargname>RSTR</rargname><post>H</post></link>
<link from="10002" to="10001"><rargname>ARG1</rargname><post>NEQ</post></link>
</dmrs>
</dmrs-list>

So I'm not in favor of this change. The empty <sortinfo /> is a bit annoying, but not enough to break the DTD in this way.
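
As a rough illustration of the compliance point above, here is a minimal Python sketch that checks whether every node carries exactly one `<sortinfo>` child, empty or not. It uses only the standard library; the function name `nodes_missing_sortinfo` and the trimmed sample string are hypothetical, and this is not how xmllint or PyDelphin validate DMRX.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the DMRX output shown above (two nodes only).
DMRX = """<dmrs-list>
<dmrs cfrom="-1" cto="-1" top="10002" index="10002" surface="The dog barked.">
<node nodeid="10000" cfrom="0" cto="3"><realpred lemma="the" pos="q" /><sortinfo /></node>
<node nodeid="10001" cfrom="4" cto="7"><realpred lemma="dog" pos="n" sense="1" /><sortinfo PERS="3" NUM="sg" IND="+" PT="pt" cvarsort="x" /></node>
</dmrs>
</dmrs-list>"""


def nodes_missing_sortinfo(dmrx_text):
    """Return the nodeid of every <node> that lacks exactly one <sortinfo> child."""
    root = ET.fromstring(dmrx_text)
    return [
        node.get("nodeid")
        for node in root.iter("node")
        if len(node.findall("sortinfo")) != 1
    ]


# The quantifier node passes because its <sortinfo /> is present, just empty.
print(nodes_missing_sortinfo(DMRX))
```

Dropping the empty `<sortinfo />` from the quantifier node would make that node show up in the returned list, which is the situation the current DTD rules out.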

RelaxNG could, in theory, make the sortinfo required if the node's realpred does not have pos="q", but it wouldn't be able to deal with gpred quantifiers (e.g., udef_q), which do not have a pos attribute to rely on. We could enumerate every quantifier gpred, but then it becomes grammar-specific (to be fair, it already is somewhat language-specific with the constrained attributes on <sortinfo>).
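
To make the conditional rule concrete, here is a hedged Python stand-in for what such a schema would have to express (this is not RelaxNG, just the same logic in code). It assumes that a gpred appears as `<gpred>udef_q</gpred>` and that quantifier gpreds can be recognized by a `_q` suffix; that suffix test is a naming heuristic for illustration only, not something the DTD or any grammar guarantees. The names `is_quantifier` and `sortinfo_violations` are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_quantifier(node):
    """Guess whether a DMRX <node> is a quantifier."""
    realpred = node.find("realpred")
    if realpred is not None:
        # Surface predicates carry a pos attribute a schema could test.
        return realpred.get("pos") == "q"
    gpred = node.find("gpred")
    # Assumption: abstract quantifiers end in "_q" (e.g. udef_q).
    # There is no pos attribute here, which is the problem described above.
    return gpred is not None and (gpred.text or "").strip().endswith("_q")


def sortinfo_violations(dmrx_text):
    """Return nodeids whose <sortinfo> is missing, or empty on a non-quantifier."""
    root = ET.fromstring(dmrx_text)
    bad = []
    for node in root.iter("node"):
        sortinfo = node.find("sortinfo")
        if sortinfo is None:
            bad.append(node.get("nodeid"))
        elif not sortinfo.attrib and not is_quantifier(node):
            bad.append(node.get("nodeid"))
    return bad


SAMPLE = """<dmrs-list><dmrs>
<node nodeid="1"><gpred>udef_q</gpred><sortinfo /></node>
<node nodeid="2"><realpred lemma="dog" pos="n" sense="1" /><sortinfo /></node>
<node nodeid="3"><realpred lemma="the" pos="q" /><sortinfo /></node>
</dmrs></dmrs-list>"""

print(sortinfo_violations(SAMPLE))
```

In this sample only node 2 is flagged: it is a non-quantifier with an empty `<sortinfo />`. Replacing the suffix heuristic with an explicit list of quantifier gpreds would be stricter, but it would tie the check to one grammar.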

@arademaker
Member Author

arademaker commented Sep 2, 2021

Good arguments for me! Ok, I will close the PR. This discussion was motivated by the RDF vocabularies, where we could add these extra validations to avoid RDF nodes that have no information attached to them.

DMRX stands for the XML representation of DMRS? If so, it can be distracting to name the schema dmrs.dtd?

It looks like DMRS is the only one that has one extra name for one among its possible serializations

@goodmami
Member

goodmami commented Sep 2, 2021

> DMRX stands for the XML representation of DMRS?

Correct

> If so, it can be distracting to name the schema dmrs.dtd?

It's the only XML representation for DMRS, so it's not ambiguous. Also I think DMRX/MRX were only shorthand names, but they were useful so I used them as the actual name for the PyDelphin codecs.

> It looks like DMRS is the only one that has one extra name for one among its possible serializations

Not sure what you mean here. MRX is the XML representation of MRS, and I think there's RMRX as well.

@arademaker
Member Author

Oh, I see. I had forgotten about MRX, and RMRX is new to me. I found very few pages about these formats in the wiki, e.g. https://github.com/delph-in/docs/search?q=MRX&type=wikis.

@arademaker arademaker closed this Sep 3, 2021
@arademaker arademaker deleted the issue-15 branch September 3, 2021 00:31