This repository has been archived by the owner on Aug 22, 2020. It is now read-only.

XPath activities

G Bondar edited this page Oct 14, 2018 · 13 revisions

Things XPath can do:

XPath: not just for walking trees, and not just in oXygen. Open eXide at http://newtfire.org:8338/exist/apps/eXide/index.html and click “New XQuery”, then erase all content in the editing window. Type (or paste) your XPath in the editing window and run it with the “Eval” button. Try:

current-dateTime()

Separate expressions with commas to evaluate several XPaths at once:

current-dateTime(),
current-date()

And here is the arrow operator (=>), which sends the result of one function to another function:

current-dateTime(),
current-date(),
current-dateTime() => format-dateTime('[h].[m01][Pn] on [FNn], [D1o] [MNn]')
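The arrow operator passes the value on its left to the function on its right as that function's first argument. The following is a small sketch of that behavior; upper-case() and string-length() are standard XPath library functions chosen here only for illustration, and are not part of the exercise above.

```xpath
(: An ordinary function call and the same call written with the arrow. :)
upper-case("write code"),                        (: "WRITE CODE" :)
"write code" => upper-case(),                    (: "WRITE CODE" :)

(: Arrows can be chained and are read left to right. :)
"write code" => upper-case() => string-length()  (: 10 :)
```

In the format-dateTime() picture string above, [h] is the hour on a 12-hour clock, [m01] is two-digit minutes, [Pn] is the am/pm marker, [FNn] is the name of the weekday, [D1o] is the day of the month as an ordinal, and [MNn] is the name of the month. The exact rendering of some components (the am/pm marker in particular) can differ between XPath processors.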

These are "library functions" in XPath (lots of programming languages have functions like these for basic tasks). You can also type things like 2 + 2 (try it). Arithmetic operations:

4 div 2, 5 mod 2
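A few more arithmetic expressions to try, as a quick sketch. XPath spells division as div because the slash is already used for path steps; idiv is the integer-division operator and is an addition here, not part of the original exercise.

```xpath
2 + 2,      (: 4 :)
4 div 2,    (: 2 :)
5 div 2,    (: 2.5 :)
5 idiv 2,   (: 2, integer division discards the remainder :)
5 mod 2     (: 1, the remainder after division :)
```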

Sequences and Position:

All XPath expressions return a sequence. Sequences may contain XML nodes (elements, attributes, etc.), atomic values (strings, numbers, etc.), or both. A sequence of one item is nonetheless a sequence, as is an empty sequence. Nested sequences are automatically flattened. So here are some atomic values, expressed in different ways to make a sequence of one or a sequence of several. Plug these in and compare the results: what's the difference?

"eat breakfast, write code, go to class",
"eat breakfast", "write code", "go to class",
("eat breakfast", "write code", "go to class")
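One way to see the difference is to count the items in each sequence. This sketch uses the standard count() function; the extra pair of parentheses is needed because count() takes a single argument, so the sequence has to be wrapped.

```xpath
count("eat breakfast, write code, go to class"),          (: 1: one string that happens to contain commas :)
count(("eat breakfast", "write code", "go to class")),    (: 3: three separate strings :)
count(("eat breakfast", ("write code", "go to class"))),  (: 3: the nested sequence is flattened :)
count(())                                                 (: 0: the empty sequence :)
```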

You can "pull" data from a sequence based on its position:

("eat breakfast", "write code", "go to class")[last()],
("eat breakfast", "write code", "go to class")[1],
("eat breakfast", "write code",  ("go to class", "eat lunch")[position() = 1])[last()]
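A few more positional filters, as a sketch with the expected results in comments. A bare number in a predicate is shorthand for a test on position(), and last() gives the position of the final item.

```xpath
("eat breakfast", "write code", "go to class")[2],                   (: "write code" :)
("eat breakfast", "write code", "go to class")[position() gt 1],     (: "write code", "go to class" :)
("eat breakfast", "write code", "go to class")[position() = last()]  (: "go to class", the same as [last()] :)
```

In the third expression of the exercise above, the inner predicate [position() = 1] keeps only "go to class" from the nested sequence, so the outer sequence has three items and [last()] returns "go to class".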

Working with XML trees:

Switch to oXygen. Open URL: http://digitalmitford.org/si.xml

  • Look at the div elements in the site index (//div). Notice how this returns a sequence. What attribute on this element can tell you how the document is organized? Write an XPath that isolates these attribute values.
  • //div/listPerson: how many?
  • Return a count: count(//div/listPerson) or //div/listPerson => count() (using the arrow operator)
  • //listPerson/person: how many?
  • Let's try to find the last person listed in the site index. Does this work? //listPerson/person[last()] Why did this return six results? How do we get one result? What's the very last node in an XML document? How do we understand sequences? Try: (//listPerson/person)[last()]
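The difference between the two last() expressions can be checked with count(). This is a sketch to run against the same si.xml file in oXygen: without parentheses, the predicate is evaluated separately inside each listPerson parent, while the parentheses build one sequence of every person first and then filter it.

```xpath
(: One result per listPerson parent: the last person in each list. :)
count(//listPerson/person[last()]),

(: Exactly one result: the last person in the whole sequence. :)
count((//listPerson/person)[last()]),

(: By the same logic, the first person in the whole document. :)
(//listPerson/person)[1]
```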

Predicates and how to nest them:

  • //div[@type='historical_people']

  • Find all the women in the list of historical people: //div[@type='historical_people']//person[@sex='f']

  • Notice that person elements contain birth elements. Take a look at them with XPath.

  • Nested predicates: Find all the people whose birth element is coded with a placeName inside.

  • Nested predicates using attributes: This shows us birth elements that have a @notBefore attribute, a handy TEI attribute for when we're not sure of a precise date!

//div[@type='historical_people']//person/birth[@notBefore]

  • How many females have a birth element coded with @notBefore that also contains a placeName?

//div[@type='historical_people']//person[@sex="f"][birth[@notBefore][placeName]]

  • Here's what the XML for one of these entries looks like:
                  <persName>
                     <surname type="paternal">Webb</surname>
                     <forename>Jane</forename>
                     <forename>Eleanor</forename>
                  </persName>
                  <persName>Jane Eleanor Webb</persName>
                  <birth notBefore="1797-03-03">
                     <placeName>Wokingham, Berkshire, England</placeName>
                  </birth>
                  <death when="1851-03-24">
                     <placeName>Sandgate, Kent, England</placeName>
                  </death>
                  <note resp="#scw #lmw">Friend of <persName ref="#MRM">Mary Russell
                        Mitford</persName>. Jane Webb was born about 1795, the daughter of James
                     Webb, Esq., and Jane Elizabeth Ogbourn. Baptized on <date when="1797-03-03">March 3, 1797</date> in Wokingham, Berkshire. Sister of <persName ref="#Webb_Eliza">Elizabeth</persName> (called "Eliza") and Mary Elizabeth
                     Webb and niece of <persName ref="#Webb_Mary_elder">the elder Mary Webb, "Aunt
                        Mary"</persName>. In <bibl corresp="#Needham_PapersRCL">
                        <persName ref="#Needham_Francis">Needham</persName>’s papers</bibl>, he
                     notes from the <title>Berkshire Directory</title>that she lived on
                        <placeName>Broad street</placeName>, presumably in Wokingham, Berkshire. She
                     married Henry Walters, Esq., a land-surveyor and amateur antiquarian, and they
                     lived at The Willows, near Windsor, Berkshire, according to census and other
                     period records. Their date of marriage is unknown, but is likely between 1822
                     and 1832, based on her father’s 1822 will and 1831 census records. She died on
                     March 24, 1851 at Sandgate, Kent. More research needed.</note>
               </person>
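The sample entry above has a birth element with both a @notBefore attribute and a placeName child, so it satisfies birth[@notBefore][placeName]. The sketch below shows some ways to write the nested-predicate tests from this section; the rewritten forms are offered as equivalents to try, not as the expressions originally used on this page.

```xpath
(: People whose birth element has a placeName child: a predicate inside a predicate. :)
//div[@type='historical_people']//person[birth[placeName]],

(: The same test written as a path inside a single predicate. :)
//div[@type='historical_people']//person[birth/placeName],

(: Stacked predicates on birth act as a logical "and", so these two counts should match. :)
count(//div[@type='historical_people']//person[@sex='f'][birth[@notBefore][placeName]]),
count(//div[@type='historical_people']//person[@sex='f'][birth[@notBefore and placeName]])
```

Wrapping a path in count() answers a "how many?" question directly instead of returning the matching nodes.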

Resource: DHSI XPath class materials: https://ebeshero.github.io/UpTransformation/schedule.html