Generate app and rdgGrp structure from c and u #3
Comments
Looking for a place to intervene in the TAN-fn-strings-collate-standard.xsl. What about setting an <app> element down after line 418? |
The output of tan:collate() can be redirected in any number of ways, not
just TEI but docx, HTML, etc. I'm inclined to keep the diff/collate output
generic, but propose a postprocessing function. Let me throw together an
idea or two.
jk
…On Sun, Apr 24, 2022 at 11:40 AM Elisa Beshero-Bondar < ***@***.***> wrote:
Looking for a place to intervene in the
TAN-fn-strings-collate-standard.xsl
<https://github.com/textalign/TAN-2021/blob/master/functions/strings/TAN-fn-strings-collate-standard.xsl>.
What about setting an <app> element down after line 418
<https://github.com/textalign/TAN-2021/blob/730bd16200e38eab3e1d20727bae5d882e194c57/functions/strings/TAN-fn-strings-collate-standard.xsl#L418>
?
--
Joel Kalvesmaki
kalvesmaki.com
|
Looking at your desired output, it is half TEI, half non-TEI. Is that intentional, or are you going to do something different downstream that is un-TEI-like? I'm wondering why, using "Kirwin," versus "Kirwin" as my example, you wouldn't prefer something like:
Isn't that what most TEI users would want to do if they tried to convert Also I'm uncertain why
|
@Arithmeticus Okay, looking back at this, my example wasn't fully "fleshed out". I was just concentrating on converting the So, let me show you what the full TEI output looks like, with a few more comments... |
@Arithmeticus Here's what we're really doing in the Frankenstein Variorum with TEI critical apparatus. Indeed, we want to be seeing the text of the original witnesses output in the We definitely don't favor The structure of the edition is driven by the simple hierarchy of
So here's what we're building out of collation software, and this is mostly the standard TEI with just one unusual dimension: an explicit view of the normalized tokens for comparison:
Here's a view of how collateX generates the collation output in its TEI critical apparatus form. There's just one modification to their standard TEI critical apparatus output as I recall, and that's the
<app>
<rdgGrp n="['i', 'saw', 'the', 'dull', 'yellow', 'eye', 'of', 'the', 'creature']">
<rdg wit="f1818">I saw the dull yellow eye of the creature </rdg>
<rdg wit="f1823">I saw the dull yellow eye of the creature </rdg>
<rdg wit="fThomas">I saw the dull yellow eye of the creature </rdg>
<rdg wit="f1831">I saw the dull yellow eye of the creature </rdg>
<rdg wit="fMS">I saw the dull yellow eye of <lb n="c56-0045__main__14"/> the creature </rdg>
</rdgGrp>
</app>
<app>
<rdgGrp n="['open.—it']">
<rdg wit="fMS">open.—It </rdg>
</rdgGrp>
<rdgGrp n="['open;', 'it']">
<rdg wit="f1818">open; it </rdg>
<rdg wit="f1823">open; it </rdg>
<rdg wit="fThomas">open; it </rdg>
<rdg wit="f1831">open; it </rdg>
</rdgGrp>
</app>
We need to start there: that's actually the structure of the output we need from the collation process. Afterwards, we post-process this in two ways:
I saw the dull yellow eye of the creature <seg xml:id="C10_app29-fThomas">open; it </seg> (Where there was a single
<app xml:id="C10_app29" n="2">
<rdgGrp xml:id="C10_app29_rg1" n="['open.—it']">
<rdg wit="#fMS">
<ptr target="https://raw.githubusercontent.com/PghFrankenstein/fv-data/master/variorum-chunks/fMS_C10.xml#string-range(//tei:surface[@xml:id='ox-ms_abinger_c56-0045']/tei:zone[@type='main']//tei:line[14],14,23)"/>
</rdg>
</rdgGrp>
<rdgGrp xml:id="C10_app29_rg2" n="['open;', 'it']">
<rdg wit="#f1818">
<ptr target="https://raw.githubusercontent.com/PghFrankenstein/fv-data/master/variorum-chunks/f1818_C10.xml#C10_app29-f1818"/>
</rdg>
<rdg wit="#f1823">
<ptr target="https://raw.githubusercontent.com/PghFrankenstein/fv-data/master/variorum-chunks/f1823_C10.xml#C10_app29-f1823"/>
</rdg>
<rdg wit="#fThomas">
<ptr target="https://raw.githubusercontent.com/PghFrankenstein/fv-data/master/variorum-chunks/fThomas_C10.xml#C10_app29-fThomas"/>
</rdg>
<rdg wit="#f1831">
<ptr target="https://raw.githubusercontent.com/PghFrankenstein/fv-data/master/variorum-chunks/f1831_C10.xml#C10_app29-f1831"/>
</rdg>
</rdgGrp>
</app>
So about the "TEI-ness" of this: I was part of a TEI-Council-hosted panel on the TEI Critical Apparatus at the TEI 2019 conference in Graz, thinking about what the TEI critical apparatus can do. And yes, the Frankenstein Variorum is trying to say something about the critical apparatus as a way of moving interchangeably between differently-marked editions. Here's my part of the slide deck, which pretty much expresses what we are trying to do: https://slides.com/elisabeshero-bondar/app-crit#/2 |
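The `<rdgGrp>` grouping in the apparatus above can be approximated in a few lines. This is only a sketch: the thread does not show the actual normalization used in the Frankenstein Variorum, so `normalize` (whitespace split plus lowercasing) is an assumption that merely happens to reproduce the two `@n` values in the second `<app>`. The function names are hypothetical, and inline markup such as `<lb/>` would have to be stripped before this step.

```python
def normalize(reading):
    """Assumed normalization: split on whitespace, lowercase each token."""
    return tuple(token.lower() for token in reading.split())


def group_readings(readings):
    """Map each normalized token tuple to the witnesses that share it."""
    groups = {}
    for wit, text in readings.items():
        groups.setdefault(normalize(text), []).append(wit)
    return groups


# Readings copied from the second <app> above.
READINGS = {
    "fMS": "open.—It ",
    "f1818": "open; it ",
    "f1823": "open; it ",
    "fThomas": "open; it ",
    "f1831": "open; it ",
}

# One line per would-be <rdgGrp>: the token list, then its witnesses.
for tokens, wits in group_readings(READINGS).items():
    print(list(tokens), wits)
```

Each key of the returned dictionary corresponds to one `<rdgGrp>`, and each witness in its list to one `<rdg>`.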
There are a few issues you raise, and I propose we focus for now on what I think is the primary one, the method/technique of consolidating a group of adjacent If you are content treating all adjacent A more difficult question arises if you are not happy treating adjacent Consider the following:
Anyone wanting to cluster/group the differences between common substrings XX and YY will need to make decisions about where one Out of the box,
With so many RMDs, choices need to be made, such as why the RMD "34" wasn't captured. Try capturing it. You must sacrifice some other RMD, if you are committed to a granular approach that preserves text order of each witness. If granularity is not important, and you want to group RMDs in constellations, so to speak, you still have to choose some criterion for constellation formation. That faces the original problem: tessellation and overlap. The example above is actually rather simple. As the number of versions grows, so do the challenges behind the choices. If you cannot articulate a criterion for the creation of RMD constellations, then you cannot code a solution. If you can articulate one, you might be able to. (And you have to be aware that other people may vehemently disagree with the principles you adopt.) Then you start. In my XSLT code in an answer above I proposed Many such constellation functions could be written. Few would be trivial to write. Few would be widely adopted. If at the end you go back to simply one |
Find where the <c> and <u> collation structure is generated.
Can this be modified in the TAN collate source to rename <u> as <rdgGrp>, and (crucially) bundle the moments of divergence in <app>? Without this, the <rdgGrp>s are just sibling elements, without an indication of their delta relationship (or the delta is just implied), and they're harder to process. Having to bundle them up after the collation will be more challenging than capturing them as they're formed. Why challenging? Because we cannot always expect just one set of related <u> elements in between each <c>.
@Arithmeticus This is probably the most important question I have now for applying tan:collate() in our workflow. Can you help?
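For reference, the simplest version of the bundling asked for here can be sketched as a post-processing pass. It is written in Python rather than as a change to the TAN XSLT, and it uses the crudest criterion: every run of adjacent `<u>` elements between two `<c>` elements becomes one `<app>`. The input shape is an assumption, not confirmed TAN output: each `<c>` or `<u>` is taken to hold a `<txt>` child plus `<wit ref="...">` children. The `<ab>` wrapper, the `bundle_divergences` name, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET


def bundle_divergences(collation):
    """Wrap each run of adjacent <u> in one <app>, renaming <u> to <rdgGrp>."""
    out = ET.Element("ab")  # hypothetical wrapper element
    app = None              # the <app> currently being filled, if any
    for child in collation:
        if child.tag == "u":
            if app is None:  # first <u> after a <c>: open a new <app>
                app = ET.SubElement(out, "app")
            grp = ET.SubElement(app, "rdgGrp")
            text = child.findtext("txt", default="")
            for wit in child.findall("wit"):
                rdg = ET.SubElement(grp, "rdg", wit=wit.get("ref", ""))
                rdg.text = text
        elif child.tag == "c":
            app = None       # common text closes the current <app>
            ET.SubElement(out, "c").text = child.findtext("txt", default="")
        # any other element (for example witness metadata) is skipped
    return out


# Illustrative input only; not real tan:collate() output.
SAMPLE = """<collation>
  <c><txt>I saw the dull yellow eye of the creature </txt><wit ref="fMS"/><wit ref="f1818"/><wit ref="f1823"/></c>
  <u><txt>open.—It </txt><wit ref="fMS"/></u>
  <u><txt>open; it </txt><wit ref="f1818"/><wit ref="f1823"/></u>
  <c><txt>breathed hard</txt><wit ref="fMS"/><wit ref="f1818"/><wit ref="f1823"/></c>
</collation>"""

bundled = bundle_divergences(ET.fromstring(SAMPLE))
print(ET.tostring(bundled, encoding="unicode"))
```

This pass cannot tell apart several unrelated sets of `<u>` elements that fall between the same pair of `<c>` elements; it lumps them into one `<app>`. Separating them needs an explicit grouping criterion.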
Example (this isn't the best example because the <u> siblings are members of the same divergent group, but imagine if there were three or four sets of <u>s generated between a <c>):
TAN collation output
Desired output (same collation data, bundled in <app> elements):