diff --git a/docs/index.html b/docs/index.html index 78d1ec517..a2c0bc819 100644 --- a/docs/index.html +++ b/docs/index.html @@ -4544,6 +4544,6 @@ tei_teidata.xpath = textNote

Any XPath expression using the syntax defined in 6.2.

When writing programs that evaluate XPath expressions, programmers should be mindful of the possibility of malicious code injection attacks. For further information about XPath injection attacks, see the article at OWASP.
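As an illustration of the injection concern above, the following is a minimal sketch, not a definitive defence. It assumes Python's standard xml.etree.ElementTree module, which implements only a subset of XPath. The helper names (xpath_literal, find_by_id) and the sample document are hypothetical and introduced only for this example. The idea is to never paste untrusted text directly between quotes in an expression, but to turn it into a well-formed string literal first, and to reject values that cannot be represented as one.

```python
import xml.etree.ElementTree as ET


def xpath_literal(value: str) -> str:
    """Quote `value` as a single XPath 1.0 string literal.

    XPath 1.0 has no escape character inside string literals, so a value
    containing both quote types cannot be written as one literal; this
    sketch rejects such input rather than guessing.
    """
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    raise ValueError("value contains both ' and \" and cannot be quoted safely")


# Hypothetical sample data, for illustration only.
DOC = ET.fromstring(
    "<listPerson>"
    "<person id='a' role='member'/>"
    "<person id='b' role='admin'/>"
    "</listPerson>"
)


def find_by_id(root, person_id):
    """Look up <person> elements by id, treating person_id as data only."""
    return root.findall(f"./person[@id={xpath_literal(person_id)}]")


# A normal lookup matches exactly one element.
print(len(find_by_id(DOC, "a")))

# A classic injection payload is compared as a plain string and matches nothing.
print(len(find_by_id(DOC, "a' or '1'='1")))
```

With a full XPath engine, binding the untrusted value as an XPath variable (where the library offers this) is generally preferable to building the expression as a string.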

Notes
1
Note that this is an illustrative example, i.e. a valid ParlaMint corpus would also need certain attributes to be defined on the illustrated elements. This holds for all the examples in this section.
2
Note that parliaments also have unaffiliated (or independent) MPs, who either belong to a special ‘unaffiliated’ parliamentary group or do not belong to any parliamentary group. In the former case, an ‘unaffiliated’ parliamentaryGroup organisation must be created, and such MPs are affiliated with it as members. In the latter case, they are simply not affiliated with any parliamentary group.
3
Ideally, the organisation somebody is affiliated with is specified as an organisation, using the <org> element (cf. the Section on Organisations); if this is not the case, using <orgName> directly in the <affiliation> is an alternative encoding.
4
Note that, in general, an utterance can also be split in the middle of a sentence. This causes problems for automatic linguistic processing because, ideally, the parts should first be joined and only then processed.
5
These are typically tagsets developed and used for specific languages and can be found in the XPOS column of CoNLL-U files, which is the native format for UD treebanks.
6
Note that the example is rendered on three lines; however, the correct encoding in the corpus is on a single line, without any spaces between the elements, as otherwise the new line and indenting spaces would be a part of the word ‘abyste’.
7
Because <name> and <phr> can give conflicting markup (i.e. crossing tags), the current script annotates phrases only where they are not related to names: not only conflicting markup but also nestings of phr/name and name/phr are forbidden, and such MWEs are not retained in the XML. Furthermore, due to a bug in the script, phrases adjacent to names are also not retained. We hope to introduce a better script and encoding in the future.
Tomaž Erjavec, tomaz.erjavec@ijs.si, Matyáš Kopp, kopp@ufal.mff.cuni.cz and Andrej Pančur, andrej.pancur@inz.si. Date: 2025-01-13
\ No newline at end of file