Wikibase to Solr #8: qualifiers

Human Experience Systems LLC edited this page Apr 27, 2023 · 2 revisions

To understand the Wikibase to Solr script, it is useful to first understand the Wikibase JSON structure and how it is interpreted inside Ruby.

The program below loads the export.json file into a Ruby array using the JSON library, iterates over each item, and prints the first value of each property along with the keys of any qualifiers.

Each property under claims holds an array of statements, and a statement may carry qualifiers. When present, they are stored under the qualifiers key, at the same level as mainsnak.

Example (first statement for P16): item["claims"]["P16"][0]["mainsnak"]

Example (first statement for P16): item["claims"]["P16"][0]["qualifiers"]
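
To make the nesting concrete, here is a minimal sketch that uses a hand-built hash instead of real export data. Only the key layout follows the structure described above; the P20 qualifier value is a placeholder invented for illustration.

```ruby
# Hand-built stand-in for one item from export.json.
# The P20 qualifier value is a placeholder, not real data.
item = {
  "id" => "Q1300",
  "claims" => {
    "P18" => [
      {
        "mainsnak" => {
          "property" => "P18",
          "datavalue" => { "value" => "Tables (Data)", "type" => "string" }
        },
        "qualifiers" => {
          "P20" => [
            { "property" => "P20",
              "datavalue" => { "value" => "placeholder qualifier value", "type" => "string" } }
          ]
        }
      }
    ]
  }
}

# First statement recorded for property P18
statement = item["claims"]["P18"][0]

# mainsnak and qualifiers are sibling keys of the same statement hash
main_value     = statement.dig("mainsnak", "datavalue", "value")
qualifier_keys = statement["qualifiers"].keys

puts main_value     # Tables (Data)
p qualifier_keys    # ["P20"]
```

Each qualifier key maps to an array of snaks, so a single statement can hold several values for the same qualifier property.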

## import Ruby libraries
require 'json'
require 'csv'
require 'date'
require 'time'
require 'optparse'

dir = File.dirname __FILE__
importJSONfile = File.expand_path 'export.json', dir

## Load the import JSON file into a Ruby array
data = JSON.load_file importJSONfile

data.each do |item|
  @id = item["id"]
  @keys = item.keys          # ["type", "id", "labels", "descriptions", "aliases", "claims", "sitelinks", "lastrevid"]
  @claims = item["claims"]

  puts @id
  @claims.each_key do |property|

    @propertyArray = @claims.dig(property)&.first
    @propertyValue = @propertyArray.dig "mainsnak", "datavalue", "value"
    puts "#{property} #{@propertyValue}"

    @qualifiers = @propertyArray["qualifiers"]
    p @qualifiers.keys
  end 
  puts "--"

end

If a statement has no qualifiers, @qualifiers is nil, and calling keys on it without a nil check raises a NoMethodError:

wikibase4.rb:27:in `block (2 levels) in <main>': undefined method `keys' for nil:NilClass (NoMethodError)
p @qualifiers.keys
             ^^^^^
from wikibase4.rb:20:in `each_key'
from wikibase4.rb:20:in `block in <main>'
from wikibase4.rb:14:in `each'
from wikibase4.rb:14:in `<main>'

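When we add the nil check with the safe navigation operator, p @qualifiers&.keys, the call returns nil instead of raising an error. The output then shows each property, the property value, and the keys of any qualifiers attached to it (nil where there are none).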

Q1300
P10 Kitāb al-Majisṭī
["P13", "P11"]
P12 Almagest.
nil
P14 Ptolemy, active 2nd century
["P15", "P13", "P17"]
P16 {"entity-type"=>"item", "numeric-id"=>3, "id"=>"Q3"}
nil
P18 Tables (Data)
["P20"]
P19 Astronomy--Early works to 1800
["P20"]
P21 Arabic
["P22"]
P23 1381
["P25", "P24", "P37", "P36"]
P29 Extent: i, 174, i leaves : paper ; 280 x 215 (220 x 145) mm bound to 280 x 225 mm.
nil
P3 {"entity-type"=>"item", "numeric-id"=>1299, "id"=>"Q1299"}
nil
P30 paper
["P31"]
P32 Many edges and corners of leaves mended with paper.
nil
P34 {"time"=>"+2023-03-17T00:00:00Z", "timezone"=>0, "before"=>0, "after"=>0, "precision"=>11, "calendarmodel"=>"http://www.wikidata.org/entity/Q1985727"}
nil
P35 {"time"=>"+2023-03-17T00:00:00Z", "timezone"=>0, "before"=>0, "after"=>0, "precision"=>11, "calendarmodel"=>"http://www.wikidata.org/entity/Q1985727"}
nil
P41 https://colenda.library.upenn.edu/phalt/iiif/2/81431-p3ff3m111/manifest
nil
--
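
The same nil-safe approach extends to reading qualifier values, not only their keys. The following is a minimal sketch, assuming each entry under qualifiers maps a property ID to an array of snaks shaped like mainsnak. The helper name qualifier_values and both sample statements are hypothetical.

```ruby
# Hypothetical helper: returns { qualifier property ID => [values] } for one
# statement, or an empty hash when the statement has no "qualifiers" key.
def qualifier_values(statement)
  qualifiers = statement["qualifiers"] || {}
  qualifiers.transform_values do |snaks|
    # dig returns nil rather than raising if a snak has no datavalue
    snaks.map { |snak| snak.dig("datavalue", "value") }
  end
end

# Sample statements; the qualifier value is a placeholder.
with_qualifiers = {
  "mainsnak" => { "datavalue" => { "value" => "Arabic" } },
  "qualifiers" => {
    "P22" => [{ "datavalue" => { "value" => "placeholder value" } }]
  }
}
without_qualifiers = {
  "mainsnak" => { "datavalue" => { "value" => "Almagest." } }
}

p qualifier_values(with_qualifiers)
p qualifier_values(without_qualifiers)
```

Falling back to an empty hash with || {} means the caller always receives a hash and can iterate over it without a separate nil check, whereas &. passes the nil along to the caller.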

Note that there is no explicit definition of which properties will or may contain which qualifiers, other than what we observe in the exported data. Qualifiers are themselves simply properties, in effect properties of properties.
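
Since the pairing of properties and qualifiers can only be observed in the data, one option is to tally it from the export itself. This is a minimal sketch under that assumption; the helper name qualifier_usage and the sample items are hypothetical, and in practice the input would be the array loaded from export.json.

```ruby
require 'set'

# Hypothetical helper: maps each property ID to the set of qualifier IDs
# seen on any of its statements across all items.
def qualifier_usage(items)
  usage = Hash.new { |hash, key| hash[key] = Set.new }
  items.each do |item|
    (item["claims"] || {}).each do |property, statements|
      statements.each do |statement|
        # Statements without qualifiers contribute an empty key list
        usage[property].merge((statement["qualifiers"] || {}).keys)
      end
    end
  end
  usage
end

# Sample items with empty snak arrays; only the keys matter for the tally.
sample = [
  { "id" => "Q1",
    "claims" => {
      "P14" => [{ "qualifiers" => { "P15" => [], "P13" => [] } }],
      "P12" => [{}]
    } },
  { "id" => "Q2",
    "claims" => {
      "P14" => [{ "qualifiers" => { "P17" => [] } }]
    } }
]

qualifier_usage(sample).each do |property, qualifiers|
  puts "#{property}: #{qualifiers.to_a.sort.join(', ')}"
end
```

Unlike the script above, which reads only the first statement of each property, this sketch visits every statement, so a qualifier that appears only on a later statement is still counted.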