XQuery Exercise 2: Where are the Pokemon?
In our eXist-db we have uploaded a collection of XML files from the Pokemon team. Access the Pokemon collection with:
collection('/db/pokemonMap/pokemon')
(The collection is not coded in a namespace, so you can simply open a New XQuery and begin writing queries over it with no namespace declaration line.)
Open a text or markdown file to paste in the expressions you use for each of the following:
- Write an expression to return a count of the number of files in the collection, using the `count()` function.
- Return the filepaths of all the files in the Pokemon collection with the `base-uri()` function. Then trim the results to return only the filenames: tokenize the file paths on the `/` and retrieve the last token.
- For the next few steps, we will ask you to study and answer some questions about the XML you retrieved to help prepare you to query it.
  - a. We need to see how the XML in each file is structured to know how to query it. Starting from the `collection()`, write a basic XQuery expression to show you the coding of the files, using `/*`: This will show you the root element of each file (and thus each entire file).
  - b. What XML element and attribute hold the type of a Pokemon?
  - c. Where can you find the locations associated with each Pokemon? (What element and attribute hold this information?)
  - d. If we started an XPath from the element holding the type of Pokemon, what XPath axis would we use to find the name of the Pokemon?
  - e. If we started an XPath from an element holding the landmark, what XPath axis would we use to find the type of Pokemon here? (We will need to express this relationship in our XQuery below.)
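The first three steps can be sketched together as below. This is only one possible shape, assuming the collection path given above; the variable names (`$pokemon`, `$fileCount`, `$paths`, `$fileNames`, `$roots`) are made up for illustration.

```xquery
xquery version "3.1";

(: Assumption: the collection path is the one given in the exercise :)
let $pokemon := collection('/db/pokemonMap/pokemon')

(: Step 1: count the document nodes in the collection :)
let $fileCount := count($pokemon)

(: Step 2: one full path per file, then keep only the last "/"-separated token :)
let $paths := $pokemon/base-uri()
let $fileNames :=
    for $p in $paths
    return tokenize($p, '/')[last()]

(: Step 3a: the root element of every file :)
let $roots := $pokemon/*

return ($fileCount, $fileNames)
```

Swap the `return` to `$roots` when you want to read the markup itself rather than the count and filenames.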
- We want to learn from the Pokemon collection what types of Pokemon can be found at specific landmarks in the Pokemon world. Let’s start working on a FLWOR expression to help us pull this data from the files.
  - a. Start by defining (and returning) a variable holding all the Pokemon types. We want to work with the `@type` attribute on the `typing` element, because this seems to return a list of standardized values. Note: To return an attribute value in eXist-db you will need to add `string()` at the end of your expression.
  - b. Look at your output: Do you see the white spaces? Several attributes hold a list of multiple type values separated by white space. Use the `tokenize()` function to break these apart on the white space separator, and return all of the individual values.
  - c. You now have a big list with many duplicate values. Define a variable to get rid of those duplicates and return only the distinct values.
  - d. Look at the return for the above list of distinct values. Do you still see some duplicates (values that differ only in capitalization)? Here's a good opportunity to try the `lower-case()` or `upper-case()` function on your nodes before you send them to `distinct-values()`. Do it. How many items do you see now in the sequence of values that you return?
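Steps a through d can be chained roughly as follows. Treat this as a sketch, not the expected answer: it assumes the `typing` elements can be reached with `//` from the collection, and the variable names are hypothetical. The `normalize-space()` call is an extra precaution (not required by the exercise) so that leading or trailing spaces do not produce empty tokens.

```xquery
xquery version "3.1";

let $pokemon := collection('/db/pokemonMap/pokemon')

(: a. string value of every @type on a typing element
      (assumption: typing may sit at any depth, hence //) :)
let $types := $pokemon//typing/@type/string()

(: b. split multi-valued attributes on runs of white space :)
let $tokens :=
    for $t in $types
    return tokenize(normalize-space($t), '\s+')

(: d. normalize capitalization before removing duplicates :)
let $lowered :=
    for $t in $tokens
    return lower-case($t)

(: c. keep one copy of each value :)
let $distTypes := distinct-values($lowered)

return $distTypes
```

Wrapping the final `return` in `count()` is a quick way to answer the "how many items" question in step d.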
- Now things get interesting! We want to build up this FLWOR so XQuery will show us which locations are associated with each distinct type of Pokemon. We have a list of distinct values that is off the XML tree. Our goal: We want to output a simple chart that contains, on one side, the distinct type of Pokemon, and on the other, a list of the distinct locations where we can find that Pokemon type. For each type there are going to be multiple locations.
  - a. Make a special `for` statement to create an index variable, to take each member of the distinct-values list of types one by one.
    - Understand, this sequence of `distinct-values()` is off the tree, and the values have been tokenized and lower-cased or upper-cased. Notice: once you define a `for` statement, your returns are constructed in a `for loop`: If you return any variable, it returns once for each member of the sequence of values you are looping over. Try it.
  - b. Define a variable with a `let` statement in the `for loop` that returns to the Pokemon XML collection and looks for all the `landmark` elements (inside the `location` elements).
  - c. Define a new variable to return only the landmarks in files that hold the current type value in the `for loop`. That current value is stored in the `for loop` variable. Make a predicate filter on the landmark nodes to return only the landmarks that are associated with the current value of your `for loop`. How to do that? You want to find the landmarks whose `preceding::` node gives a type, which, if you lower-case or upper-case it, will contain the current `for loop` variable.
  - d. Try testing your return in stages.
    - If you return your `for` loop variable, you should see 26 results. We want a total of 26 results in our return, so each type of Pokemon is presented alongside a list of landmarks for finding it.
    - Try returning just the special matching landmarks variable: there will be 144 of them. But notice there are many duplicates in the results. To keep our chart concise and tidy, let’s define a new variable to hold the `distinct-values()` of those matching landmarks. There will be 83 distinct values.
    - You can return multiple variables together by wrapping them in parentheses like this: `return ($d, $distLM)` The first is my distinct type (my `for-loop` variable), and the second is the list of distinct landmarks.
    - Frequently when we are working with a list of `for loops` and matching values in XQuery, we have a small list (26 `for` values), with a longer list of matching results indexed to them. Let’s play with string functions to tie these together into 26 tidy bundles. Use the `concat()` function in a one-to-one relationship with single strings; it can take any number of comma-separated pieces, including literal strings in quotation marks and XQuery variables. Try `concat($d, ' is cool')` to see how this works. If we take `string-join()` and bundle up each list of landmarks associated with a type, you can make one string with a separator that will fit as an argument in your `concat()`, so you can construct something like: `concat('Type: ', $d, ' :Landmark: ', string-join($distLM, ', '))`
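Putting steps a through d together, the FLWOR might look something like the sketch below. Several things here are assumptions: the `//` paths (the actual depth of `typing`, `location`, and `landmark` is something you should confirm in your own files), the variable names `$matchLM` and `$distTypes`, and the choice to take the landmark element's string value. If the landmark name actually lives in an attribute, step to that attribute before calling `distinct-values()`.

```xquery
xquery version "3.1";

let $pokemon := collection('/db/pokemonMap/pokemon')

(: distinct, lower-cased type values, built as in the previous step :)
let $distTypes := distinct-values(
    for $t in $pokemon//typing/@type/string()
    return tokenize(lower-case(normalize-space($t)), '\s+')
)

(: a. visit each distinct type once :)
for $d in $distTypes

(: b. go back to the tree for every landmark inside a location :)
let $landmarks := $pokemon//location//landmark

(: c. keep landmarks preceded by a typing whose lower-cased @type contains $d :)
let $matchLM := $landmarks[preceding::typing/@type[contains(lower-case(.), $d)]]

(: d. one copy of each landmark value (string value of the element, an assumption) :)
let $distLM := distinct-values($matchLM)

return concat('Type: ', $d, ' :Landmark: ', string-join($distLM, ', '))
```

Note that `contains()` is a substring test, which follows the wording of step c; it would also match a type name that happens to sit inside a longer one, so check your results against the tokenized list.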
- Want to make that prettier for your website? Look at the examples on our tutorial on Building New HTML or XML with XQuery and take careful note of how and where to use the curly braces to activate XQuery (so it is not just treated as plain text). See if you can construct an HTML page with a table in it, outputting inside the `for loop` each of 26 table rows containing a Pokemon type and its associated landmarks. Congratulations! You have found and formatted the Pokemon types! Try creating a folder for yourself in the database to save your work as a `.xql` document, but also copy and paste it into a text or markdown file to submit as homework over Courseweb.
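For the HTML version, the curly braces are what switch from literal markup back into XQuery. A rough sketch, reusing the same assumed paths and hypothetical variable names as the earlier steps (the page title and column headings are invented for illustration):

```xquery
xquery version "3.1";

let $pokemon := collection('/db/pokemonMap/pokemon')
let $distTypes := distinct-values(
    for $t in $pokemon//typing/@type/string()
    return tokenize(lower-case(normalize-space($t)), '\s+')
)
return
<html>
    <head><title>Pokemon types and landmarks</title></head>
    <body>
        <table>
            <tr><th>Type</th><th>Landmarks</th></tr>
            {
                (: everything inside these braces is evaluated as XQuery :)
                for $d in $distTypes
                let $matchLM := $pokemon//location//landmark[
                    preceding::typing/@type[contains(lower-case(.), $d)]]
                let $distLM := distinct-values($matchLM)
                return
                    <tr>
                        <td>{$d}</td>
                        <td>{string-join($distLM, ', ')}</td>
                    </tr>
            }
        </table>
    </body>
</html>
```

Without the braces around `$d` and the `string-join()` call, the table cells would contain those expressions as plain text instead of their values.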