ActiveTriples/active_triples-solrizer

ActiveTriples::Solrizer

Provides a default Solr implementation for the ActiveTriples framework.

Installation

Add this line to your application's Gemfile:

gem 'active_triples-solrizer'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install active_triples-solrizer

Usage

Property definitions for ActiveTriples resources can be extended with a block that defines the indexing data type and modifiers (see the tables of supported values below).

property :title, :predicate => RDF::SCHEMA.title do |index|
  # specify the data type of the field in solr; see solr/schema.xml
  # (https://github.com/elrayle/active_triples-solrizer/blob/master/solr/schema.xml)
  # for field type definitions
  index.data_type = :text
  # specify modifiers for the solr field
  index.as :indexed, :sortable
end
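
To make the block syntax above easier to picture, here is a minimal sketch of how a property block of this shape could be captured. The `IndexConfig` class and the `capture_index_config` helper are hypothetical names invented for illustration; this is not the gem's actual implementation, only one plausible way a `data_type=` setter and an `as` method can record what the block declares.

```ruby
# Hypothetical stand-in for the object yielded to the property block.
class IndexConfig
  attr_accessor :data_type
  attr_reader :modifiers

  def initialize
    @data_type = nil
    @modifiers = []
  end

  # Record modifiers such as :indexed, :stored, :sortable (duplicates dropped).
  def as(*modifiers)
    @modifiers.concat(modifiers).uniq!
    self
  end
end

# Hypothetical helper: run the block against a fresh config and return it.
def capture_index_config
  config = IndexConfig.new
  yield config if block_given?
  config
end

title_config = capture_index_config do |index|
  index.data_type = :text
  index.as :indexed, :sortable
end

title_config.data_type  # => :text
title_config.modifiers  # => [:indexed, :sortable]
```

A property declared without a block simply leaves the data type unset and the modifier list empty in this sketch, which mirrors the non-indexed property case.
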

| data_type | Notes |
| --- | --- |
| :text | tokenized text |
| :text_en | tokenized English text |
| :string | non-tokenized string |
| :integer | |
| :long | |
| :double | |
| :float | |
| :boolean | |
| :date | dates take the form 1995-12-31T23:59:59Z; optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z |
| :coordinate | TBA - used to index the lat and long components for the "location" |
| :location | TBA - latitude/longitude |
| :guess | guesses the type from the type of the property value; NOTE: only the first value is checked when there are multiple values |
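
The date format described for `:date` is UTC ISO 8601 with a trailing `Z`. As a small illustration, Ruby's standard `Time#iso8601` produces strings of that shape. The `solr_date` helper name is an assumption made for this sketch and is not part of the gem.

```ruby
require 'time'

# Hypothetical helper: format a Time the way the :date notes describe,
# i.e. converted to UTC, ISO 8601, with optional fractional seconds.
def solr_date(time, fraction_digits = 0)
  time.getutc.iso8601(fraction_digits)
end

solr_date(Time.utc(1995, 12, 31, 23, 59, 59))
# => "1995-12-31T23:59:59Z"

# The seventh argument to Time.utc is microseconds.
solr_date(Time.utc(1995, 12, 31, 23, 59, 59, 999_000), 3)
# => "1995-12-31T23:59:59.999Z"
```

Times in other zones are converted to UTC first, so the trailing `Z` stays accurate.
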

| index.as modifiers | works with types | Notes |
| --- | --- | --- |
| :indexed | all types except :coordinate | searchable, but not returned in solr doc unless also has :stored modifier |
| :stored | all types except :coordinate | returned in solr doc, but not searchable unless also has :indexed modifier |
| :multiValued | all types except :boolean, :coordinate | NOTE: if not specified and multiple values exist, only the first value is included in the solr doc |
| :sortable | all types except :boolean, :coordinate, :location | numbers are stored as trie version of numeric type; :string, :text, :text_XX have an extra alphaSort field |
| :range | all numeric types including :integer, :long, :float, :double, :date | optimize for range queries |
| :vectored | valid for :text, :text_XX only | |

NOTE: Modifiers placed on types that do not support the modifier are ignored.
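
The data type and modifiers together determine the suffix of the generated solr field name. The following sketch reproduces the suffixes that appear in the example output later in this README (`_ti`, `_tsi`, `_si`, `_itsim`, `_fs`, `_ssort`). The rules are inferred from those examples only; the `solr_field_names` helper and its constants are hypothetical, the trie suffix for `:float` is a guess by analogy with `:integer`, and the gem's real logic may differ, especially for types not covered here.

```ruby
# Type prefixes inferred from the example field names in this README.
TYPE_PREFIX   = { text: 't', string: 's', integer: 'i', float: 'f' }.freeze
NUMERIC_TYPES = [:integer, :float].freeze

# Hypothetical helper: returns the solr field names for one property.
def solr_field_names(name, data_type, modifiers)
  prefix = TYPE_PREFIX.fetch(data_type) do
    raise ArgumentError, "type not covered by this sketch: #{data_type.inspect}"
  end
  numeric = NUMERIC_TYPES.include?(data_type)

  # :range (and, per the table, :sortable) switch numbers to the trie variant.
  trie = numeric && (modifiers & [:range, :sortable]).any?

  suffix = prefix.dup
  suffix << 't' if trie
  suffix << 's' if modifiers.include?(:stored)
  suffix << 'i' if modifiers.include?(:indexed)
  suffix << 'm' if modifiers.include?(:multiValued)

  names = ["#{name}_#{suffix}"]
  # Sortable text and string fields get an extra alphaSort field.
  names << "#{name}_ssort" if modifiers.include?(:sortable) && !numeric
  names
end

solr_field_names(:title, :text, [:indexed, :sortable])
# => ["title_ti", "title_ssort"]
solr_field_names(:clip_number_simr, :integer, [:stored, :indexed, :multiValued, :range])
# => ["clip_number_simr_itsim"]
```

Raising on uncovered types keeps the sketch from silently inventing suffixes that the examples do not confirm.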

Examples

Common prep code for all examples:

require 'active_triples'
require 'active_triples/solrizer'

# create an in-memory repository for ad-hoc testing
ActiveTriples::Repositories.add_repository :default, RDF::Repository.new

# configure the solr url
ActiveTriples::Solrizer.configure do |config|
  # point at the core itself, not at the admin UI page (".../solr/#/~cores/...")
  config.solr_uri = "http://localhost:8983/solr/active_triples"
end

# create a DummyResource for ad-hoc testing
class DummyResource < ActiveTriples::Resource
  configure :type => RDF::URI('http://example.org/SomeClass')
  property :title,            :predicate => RDF::SCHEMA.title       do |index|
    index.data_type = :text
    index.as :indexed, :sortable
  end
  property :description_si,   :predicate => RDF::SCHEMA.description do |index|
    index.data_type = :text
    index.as :stored, :indexed
  end
  property :borrower_uri_i,   :predicate => RDF::SCHEMA.borrower    do |index|
    index.data_type = :string
    index.as :indexed
  end
  property :clip_number_simr, :predicate => RDF::SCHEMA.clipNumber  do |index|
    index.data_type = :integer
    index.as :stored, :indexed, :multiValued, :range
  end
  property :price_s,          :predicate => RDF::SCHEMA.price       do |index|
    index.data_type = :float
    index.as :stored
  end
  property :bookEdition,      :predicate => RDF::SCHEMA.bookEdition # non-indexed property
end


# initialize solr service with defaults
ActiveTriples::Solrizer::SolrService.register

Example: Indexing Service to create solr document

# create a new resource
dr = DummyResource.new('http://www.example.org/dr')
dr.title                = 'Test Title'
dr.description_si       = 'Test text description stored and indexed.'
dr.borrower_uri_i       = 'http://example.org/i/b2'
dr.clip_number_simr     = [7,8,9,10]
dr.price_s              = 789.01
dr.bookEdition          = 'Ed. 2'
dr

# get solr doc
doc = ActiveTriples::Solrizer::IndexingService.new(dr).generate_solr_document
# =>  {
#         :id=>"http://www.example.org/dr",
#         :at_model_ssi=>"DummyResource",
#         :object_profile_ss=>expected_object_profile_short_all_values,   # placeholder for the JSON object profile string
#         :title_ti=>"Test Title",
#         :title_ssort=>"Test Title",
#         :description_si_tsi=>"Test text description stored and indexed.",
#         :borrower_uri_i_si=>"http://example.org/i/b2",
#         :clip_number_simr_itsim=>[7,8,9,10],
#         :price_s_fs=>789.01
#     }


# persist doc to solr
ActiveTriples::Solrizer::SolrService.add(doc)
ActiveTriples::Solrizer::SolrService.commit
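
For readers new to Solr, adding a document and committing corresponds conceptually to posting JSON to a core's `update` handler. The sketch below builds such a request with Ruby's standard library and does not send it. It is not a description of what `SolrService` does internally; the `build_solr_update_request` helper and the core URL are assumptions made for illustration.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a Solr JSON update request.
def build_solr_update_request(core_url, docs, commit: true)
  uri = URI("#{core_url.chomp('/')}/update")
  uri.query = 'commit=true' if commit
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(docs) # Solr accepts a JSON array of documents
  request
end

request = build_solr_update_request(
  'http://localhost:8983/solr/active_triples', # assumed core URL
  [{ id: 'http://www.example.org/dr', title_ti: 'Test Title' }]
)
request.path # => "/solr/active_triples/update?commit=true"
```

Sending the request would be a matter of passing it to `Net::HTTP.start(host, port) { |http| http.request(request) }` against a running Solr instance.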

Example: Profile Indexing Service to serialize/deserialize resource

# create a new resource with all properties having values
dr1 = DummyResource.new('http://www.example.org/dr1')
dr1.title                = 'Test Title'
dr1.description_si       = 'Test text description stored and indexed.'
dr1.borrower_uri_i       = 'http://example.org/i/b2'
dr1.clip_number_simr     = [7,8,9,10]
dr1.price_s              = 789.01
dr1.bookEdition          = 'Ed. 2'
dr1

# serialize resource into object profile
object_profile1 = ActiveTriples::Solrizer::ProfileIndexingService.new(dr1).export
# =>    '{"id":"http://www.example.org/dr1",'\
#        '"title":["Test Title"],'\
#        '"description_si":["Test text description stored and indexed."],'\
#        '"borrower_uri_i":["http://example.org/i/b2"],'\
#        '"clip_number_simr":[7,8,9,10],'\
#        '"price_s":[789.01],'\
#        '"bookEdition":["Ed. 2"]}'

# deserialize resource from object profile
dr1_filled = ActiveTriples::Solrizer::ProfileIndexingService.new().import( object_profile1, DummyResource )
dr1_filled.attributes
# => {"id"=>"http://www.example.org/dr1",
#     "title"=>["Test Title"],
#     "description_si"=>["Test text description stored and indexed."],
#     "borrower_uri_i"=>["http://example.org/i/b2"],
#     "clip_number_simr"=>[7, 8, 9, 10],
#     "price_s"=>[789.01],
#     "bookEdition"=>["Ed. 2"]}

# create a new resource with some properties with unset values
dr2 = DummyResource.new('http://www.example.org/dr2')
dr2.title                = 'Test Title'
dr2.description_si       = 'Test text description stored and indexed.'
dr2.price_s              = 789.01
dr2.bookEdition          = 'Ed. 2'
dr2

# serialize resource into object profile
object_profile2 = ActiveTriples::Solrizer::ProfileIndexingService.new(dr2).export
# =>    '{"id":"http://www.example.org/dr2",'\
#        '"title":["Test Title"],'\
#        '"description_si":["Test text description stored and indexed."],'\
#        '"borrower_uri_i":[],'\
#        '"clip_number_simr":[],'\
#        '"price_s":[789.01],'\
#        '"bookEdition":["Ed. 2"]}'

# deserialize resource from object profile
dr2_filled = ActiveTriples::Solrizer::ProfileIndexingService.new().import( object_profile2, DummyResource )
dr2_filled.attributes
# => {"id"=>"http://www.example.org/dr2",
#     "title"=>["Test Title"],
#     "description_si"=>["Test text description stored and indexed."],
#     "borrower_uri_i"=>[],
#     "clip_number_simr"=>[],
#     "price_s"=>[789.01],
#     "bookEdition"=>["Ed. 2"]}
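
The object profile in these examples is a JSON string whose keys are the resource id plus each property name, with unset properties kept as empty arrays. The round trip can be illustrated with plain hashes and the standard `json` library. The `export_profile` and `import_profile` helpers are hypothetical and only mimic the shape of the output shown above; they do not build an ActiveTriples resource.

```ruby
require 'json'

# Hypothetical helpers that mimic the shape of the profiles shown above.
def export_profile(id, attributes)
  JSON.generate({ 'id' => id }.merge(attributes))
end

def import_profile(profile_json)
  JSON.parse(profile_json)
end

attributes = {
  'title'            => ['Test Title'],
  'borrower_uri_i'   => [],        # unset property stays an empty array
  'clip_number_simr' => [],
  'price_s'          => [789.01]
}

profile  = export_profile('http://www.example.org/dr2', attributes)
restored = import_profile(profile)

restored['id']             # => "http://www.example.org/dr2"
restored['borrower_uri_i'] # => []
```

Keeping empty arrays in the profile means every declared property is present after import, whether or not it had a value.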

Example: Properties Indexing Service to generate solr fields based on property definitions

# NOTE re-use dr1 and dr2 from object profile examples

# generate property fields
property_fields1 = ActiveTriples::Solrizer::PropertiesIndexingService.new(dr1).export
# => {
#        :title_ti=>"Test Title",
#        :title_ssort=>"Test Title",
#        :description_si_tsi=>"Test text description stored and indexed.",
#        :borrower_uri_i_si=>"http://example.org/i/b2",
#        :clip_number_simr_itsim=>[7,8,9,10],
#        :price_s_fs=>789.01
#    }

# generate property fields
property_fields2 = ActiveTriples::Solrizer::PropertiesIndexingService.new(dr2).export
# => {
#        :title_ti=>"Test Title",
#        :title_ssort=>"Test Title",
#        :description_si_tsi=>"Test text description stored and indexed.",
#        :price_s_fs=>789.01
#    }
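
Comparing the three examples, the full solr document appears to be the id, the model name, and the object profile combined with the property fields. A minimal sketch of that composition follows. It is inferred from the example outputs in this README; `compose_solr_document` is a hypothetical name and not a method of the gem.

```ruby
# Hypothetical composition of the pieces shown in the examples above.
def compose_solr_document(id:, model_name:, object_profile:, property_fields:)
  {
    id: id,
    at_model_ssi: model_name,
    object_profile_ss: object_profile
  }.merge(property_fields)
end

doc = compose_solr_document(
  id: 'http://www.example.org/dr2',
  model_name: 'DummyResource',
  object_profile: '{"id":"http://www.example.org/dr2"}', # shortened for the sketch
  property_fields: { title_ti: 'Test Title', price_s_fs: 789.01 }
)

doc.keys
# => [:id, :at_model_ssi, :object_profile_ss, :title_ti, :price_s_fs]
```

Properties without values contribute no fields, which matches the shorter output for dr2 above.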

Development Notes:

  • I would like to see this expand to support specification of facets.
  • The location and coordinate field types have not been tested and do not have examples.
  • Some of the code in solr_service.rb is untested. It was copied from ActiveFedora as is. References to querying in that code have not been tested, and the query code itself had not been copied when this document was written.

Contributing

Please observe the following guidelines:

  • Do your work in a feature branch based on master and rebase before submitting a pull request.
  • Write tests for your contributions.
  • Document every method you add using YARD annotations. (Note: Annotations are sparse in the existing codebase. Help us fix that!)
  • Organize your commits into logical units.
  • Don't leave trailing whitespace (i.e. run git diff --check before committing).
  • Use well-formed commit messages.
