- Namespace: thi.ng.fabric.facts.core
These protocols act as the main public & internal API for this module and their roles are described in more detail throughout this document.
(defprotocol IFactGraph
(facts [_])
(fact-indices [_])
(fact-transform [_])
(add-fact! [_ t])
(remove-fact! [_ t]))
(defprotocol ICache
(cached [_ type k])
(cache! [_ type k v])
(expire! [_ type k]))
(defprotocol IFactIndex
(index-signal-handler [_]))
(defprotocol IFactQuery
(raw-pattern [_]))
(defprotocol IQueryResult
(pre-result-vertex [_])
(result-vertex [_]))
(defprotocol ITwoWayTransform
(transform [_ x])
(untransform [_ x]))
Facts are simply vectors of (currently 3 or 4) items, e.g. [subject
predicate object]
(for triples) and stored in custom vertices. A
FactVertex
is more lightweight than a standard Vertex
, configured
to never collect (facts are immutable) and signal only once when added
(or removed) from the graph.
Any Clojure value can be used as fact item, though in most cases you’ll want to use strings, keywords, symbols, UUIDs or numbers.
Example facts:
;; P123 has name Alice
[:P123 :name "Alice"]
;; toxi is author of fabric
'[toxi author fabric]
;; in transaction UUID we state "XYZ" has a template function f
[#uuid "cffedb78-0e16-4f7e-8494-a19b749e771b" "XYZ" :template-fn (fn [title] [:div [:h1 title]])]
As with the default Vertex
type defined in the fabric-core module,
a vertex’ state can be obtained by dereferencing it via Clojure’s @
reader macro or deref
. The FactVertex
impl will throw errors for
many operations supported by the default vertex. This is because
unlike standard vertices, facts (as values) in a graph must be
immutable. The only mutable aspects of a FactVertex
are related to
connectivity and signalling (e.g. each fact vertex is automatically
connected to index vertices).
(deftype FactVertex
[id fact new-edges outs]
#?@(:clj
[clojure.lang.IDeref
(deref
[_] fact)]
:cljs
[IDeref
(-deref
[_] fact)])
f/IVertex
(vertex-id
[_] id)
(set-value!
[_ val] (err/unsupported!))
(update-value!
[_ f] (err/unsupported!))
(previous-value
[_] fact)
(collect!
[_] (err/unsupported!))
(collect-final!
[_] (err/unsupported!))
(score-collect
[_] 0)
(connect-to!
[_ v sig-fn opts]
(swap! outs assoc v [sig-fn opts])
(swap! new-edges inc)
(debug id "edge to" (f/vertex-id v) "(" (pr-str opts) ") new:" @new-edges)
_)
(neighbors
[_] (keys @outs))
(disconnect-neighbor!
[_ v]
(when v
(debug id "disconnect from" (f/vertex-id v))
(swap! outs dissoc v))
_)
(disconnect-all!
[_]
(run! #(f/disconnect-neighbor! _ %) (keys @outs))
_)
(new-edge-count
[_] @new-edges)
(score-signal
[_] @new-edges)
(signal!
[_ handler]
(reset! new-edges 0)
(handler _ @outs))
(receive-signal
[_ src sig] (err/unsupported!))
(signal-map
[_] (err/unsupported!))
(uncollected-signals
[_] (err/unsupported!)))
This constructor is not intended for public use. Instead it is called
by the FactGraph
when a new fact is to be added.
(defn fact-vertex
[id fact _]
(FactVertex. id fact (atom 0) (atom {})))
Facts are stored in a FactGraph
instance which has been configured
with a maximum fact length (currently 3 or 4 items per fact, all
facts in a graph MUST have the same length). The default
implementation of the FactGraph
wraps an existing IComputeGraph
instance (used as backend) and provides an extended API via the
IFactGraph
and ICache
protocols.
The FactGraph
indexes all facts by their individual items (e.g.
subject, predicate and object values). These indices are special
vertices, are customizable, and automatically added during graph
construction. All fact vertices are automatically connected to (and
only to!) these indices. When queries are added to the graph, they
indirectly attach themselves to these indices (via intermediate index
selection vertices, see Queries section below) and are notified each
time their index selection is changing.
The module currently implements two types of indices:
- plain fact components
- indexing with dynamic entity aliasing (equivalence/synonym)
Both are described (incl. examples) further below.
The ITwoWayTransform
protocol allows the graph to invisibly store
transformed facts, possibly in more compact form than specified by the
user. When a fact is added, the transform
method of the transform
is applied. Queries automatically call untransform
on their results,
so the user will be oblivious to the changed internal representation.
Custom implementations MUST follow this rule:
- Query variables (e.g.
?a
, Clojure symbols prefixed with?
) andnil
values MUST NOT be transformed.
This is a null transform (NOP), used when no fact transform is given for a graph.
(def identity-transform
(reify ITwoWayTransform
(transform [_ x] x)
(untransform [_ x] x)))
This transform is useful when dealing with Linked Data (RDF), where most items are URIs. It’s common practice in the LD community to specify prefixes for various often used vocabularies. E.g. We might define a prefix map and transform like this:
(def prefixes
{"rdf" "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
"rdfs" "http://www.w3.org/2000/01/rdf-schema#"
"owl" "http://www.w3.org/2002/07/owl#"})
(def tx (prefix-transform prefixes))
(transform tx "http://www.w3.org/2002/07/owl#Ontology")
;; ["owl" "Ontology"]
(untransform tx ["owl" "Ontology"])
;; "http://www.w3.org/2002/07/owl#Ontology"
The transform phase takes such a prefix map and value. If value is a
string, attempts to find prefix from map and if found returns a 2-elem
vector of [prefix name]
, else original.
The untransformer too takes the same prefix map and a value. If the value is a vector as produced by the transformation fn and has a known prefix, returns expanded string or else original value.
Important note: This transform only applies to single fact
sub-items, to apply it to full facts, wrap it with others in
combine-transforms
(described below).
(defn sort-prefixes
[prefixes] (rseq (vec (sort-by #(count (peek %)) prefixes))))
(defn prefix-vector-transform
[prefixes]
(let [prefixes' (sort-prefixes prefixes)]
(reify ITwoWayTransform
(transform
[_ x]
(if (string? x)
(reduce
(fn [acc p]
(if (zero? (.indexOf ^String x ^String (peek p)))
(reduced [(first p) (subs x (count (peek p)))])
acc))
x prefixes')
x))
(untransform
[_ x]
(if (vector? x)
(if-let [p (prefixes (first x))]
(str p (nth x 1))
x)
x)))))
This second version of this transform is very similar and only differs
in its transformed result type: Instead of a vector, it produces a
string of prefix:name
.
(defn ->prefix-string
[prefixes' x]
(if (string? x)
(reduce
(fn [acc p]
(if (zero? (.indexOf ^String x ^String (peek p)))
(reduced (str (first p) \: (subs x (count (peek p)))))
acc))
x prefixes')
x))
(defn <-prefix-string
[prefixes x]
(if (string? x)
(let [[[_ p n]] (re-seq #"^(\w[\w\-_]+):([\w\-_]*)$" x)]
(if-let [p (prefixes p)]
(str p n)
x))
x))
(defn prefix-string-transform
[prefixes]
(let [prefixes' (sort-prefixes prefixes)]
(reify ITwoWayTransform
(transform
[_ x] (->prefix-string prefixes' x))
(untransform
[_ x] (<-prefix-string prefixes x)))))
This transform too applies to fact sub-items only (see note in previous section) and maps values to/from unique index IDs.
In Clojure, the reverse index uses Zach Tellman’s org.clojure/data.int-map]] for faster and more memory efficient behavior.
(defn global-index-transform
[]
(let [index (atom #?(:clj {:fwd {} :rev (imap/int-map) :id 0}
:cljs {:fwd {} :rev {} :id 0}))]
(reify ITwoWayTransform
(transform
[_ x]
(if (or (nil? x) (qvar? x))
x
(or ((@index :fwd) x)
(let [curr (volatile! nil)]
(swap! index
#(let [id (:id %)]
(vreset! curr id)
(-> %
(update :id inc)
(update :fwd assoc x id)
(update :rev assoc id x))))
@curr))))
(untransform
[_ id] ((@index :rev) id id)))))
Transforms can be composed using the functions below:
compose-transforms
applies the given transforms serially (viareduce
) and in reverse order foruntransform
.one-way-transform
takes anITwoWayTransform
and modifies it to useidentity
as its reverse transformation, therefore keeping transformed values.combine-transforms
applies 3 or 4 transforms in parallel (one per fact component) to (un)transform a single fact
(defn compose-transforms
[& transforms]
(let [rtx (reverse transforms)]
(reify ITwoWayTransform
(transform [_ x]
(reduce #(transform %2 %) x transforms))
(untransform [_ x]
(reduce #(untransform %2 %) x rtx)))))
(defn one-way-transform
[tx]
(reify ITwoWayTransform
(transform [_ x] (transform tx x))
(untransform [_ x] x)))
(defn combine-transforms
([tx len]
(apply combine-transforms (repeat len tx)))
([txs txp txo]
(reify ITwoWayTransform
(transform [_ fact]
[(transform txs (first fact))
(transform txp (nth fact 1))
(transform txo (nth fact 2))])
(untransform [_ fact]
[(untransform txs (first fact))
(untransform txp (nth fact 1))
(untransform txo (nth fact 2))])))
([txt txs txp txo]
(reify ITwoWayTransform
(transform [_ fact]
[(transform txt (first fact))
(transform txs (nth fact 1))
(transform txp (nth fact 2))
(transform txo (nth fact 3))])
(untransform [_ fact]
[(untransform txt (first fact))
(untransform txs (nth fact 1))
(untransform txp (nth fact 2))
(untransform txo (nth fact 3))]))))
The ICache
protocol is used to cache various query related entities
for DRY reasons. Intermediate result caching is not just done for
performance reasons, but also to minimize memory usage, since queries
often make use of similar patterns and sub-patterns (e.g. two separate
query joins might use the same underlying pattern as LHS or RHS).
Currently the cache is an atom with a nested map and these top-level
keys:
::index-sel
- caches vertices selecting a specific key from any of the indices (or a wildcard selection). These vertices basically act as sub-sub-queries of a specific fact component.::queries
- caches existing single-pattern queries based on their query pattern (only basic & parametric queries without options are cached). Note: the cached values are NOT vertices, but query entities (reify’s w/IFactQuery
,IGraphComponent
and other protocol implementations)::rules
- caches inference rules based on their ID (also defrecords)
(defrecord FactGraph
[g indices facts cache ftx]
f/IComputeGraph
(add-vertex!
[_ val vspec] (f/add-vertex! g val vspec))
(remove-vertex!
[_ v] (f/remove-vertex! g v))
(vertex-for-id
[_ id] (f/vertex-for-id g id))
(vertices
[_] (f/vertices g))
(add-edge!
[_ src dest sig opts] (f/add-edge! g src dest sig opts))
f/IWatch
(add-watch!
[_ type id f] (f/add-watch! g type id f) _)
(remove-watch!
[_ type id] (f/remove-watch! g type id) _)
(notify-watches
[_ evt] (f/notify-watches g evt))
IFactGraph
(facts
[_] (keys @facts))
(fact-indices
[_] indices)
(fact-transform
[_] ftx)
(add-fact!
[_ f]
(let [f' (transform ftx f)]
(or (@facts f')
(let [v (f/add-vertex! g f' nil fact-vertex)]
(f/notify-watches g [:add-fact f])
(debug :add-fact f f')
(run! #(f/add-edge! g v % signal-fact :add) @indices)
(swap! facts assoc f' v)
v))))
(remove-fact!
[_ f]
(let [f' (transform ftx f)]
(if-let [v (@facts f')]
(do
(debug :remove-fact f f')
(run! #(f/add-edge! g v % signal-fact :remove) @indices)
(swap! facts dissoc f')
(f/notify-watches g [:remove-fact f])
(f/remove-vertex! g v)
v)
(warn "attempting to remove unknown fact:" f))))
ICache
(cached
[_ type k] (get-in @cache [type k]))
(cache!
[_ type k v] (swap! cache assoc-in [type k] v) v)
(expire!
[_ type k] (swap! cache update type dissoc k) nil))
(defn fact-graph
"Creates a new FactGraph instance configured with given options map:
:graph - backing IComputeGraph (default fabric.core/compute-graph)
:len - fact length (default 3)
:index - index vertex ctor (default default-index-vertex)
:transform - fact transform (default identity-transform)"
([]
(fact-graph {}))
([{:keys [graph len index transform]
:or {graph (f/compute-graph)
len 3
index default-index-vertex
transform identity-transform}}]
(let [indices (atom nil)
g (map->FactGraph
{:indices indices
:facts (atom {})
:cache (atom {})
:ftx transform
:g graph})]
(reset! indices (mapv #(add-index-vertex! graph (index g %)) (range len)))
g)))
Index vertices are automatically created during graph construction.
This is the helper function used by the FactGraph
constructor to
create a single fact component index vertex. The diagram in the
following section shows how these vertices are used (users don’t need
to deal with them directly, though).
(deftype IndexVertex
[id index prev-index uncollected signal-map new-edges outs
signal-fn collect-fn]
#?@(:clj
[clojure.lang.IDeref
(deref
[_] @index)]
:cljs
[IDeref
(-deref
[_] @index)])
f/IVertex
(vertex-id
[_] id)
(set-value!
[_ val] (reset! index val) _)
(update-value!
[_ f] (swap! index f) _)
(previous-value
[_] @prev-index)
(collect!
[_]
(collect-fn _)
(reset! uncollected [])
_)
(collect-final!
[_] _)
(score-collect
[_] (count @uncollected))
(connect-to!
[_ v sig-fn opts]
(swap! outs assoc v [sig-fn opts])
(swap! new-edges inc)
(debug id "edge to" (f/vertex-id v) "(" (pr-str opts) ") new:" @new-edges)
_)
(neighbors
[_] (keys @outs))
(disconnect-neighbor!
[_ v]
(when v
(debug id "disconnect from" (f/vertex-id v))
(swap! outs dissoc v))
_)
(disconnect-all!
[_]
(run! #(f/disconnect-neighbor! _ %) (keys @outs))
_)
(new-edge-count
[_] @new-edges)
(score-signal
[_] (+ @new-edges (if (= @index @prev-index) 0 1)))
(signal!
[_ handler]
(reset! prev-index @index)
(reset! new-edges 0)
(handler _ @outs))
(receive-signal
[_ src sig]
(if-not (= sig (@signal-map src))
(do (swap! uncollected conj sig)
(swap! signal-map assoc src sig)
true)
(debug id " ignoring unchanged signal: " (pr-str sig))))
(signal-map
[_] @signal-map)
(uncollected-signals
[_] @uncollected)
IFactIndex
(index-signal-handler
[_] signal-fn))
(defn add-index-vertex!
[g ctor]
(f/add-vertex! g nil nil ctor))
(defn- collect-index
[idx]
(let [tx (map (fn [[op vid t]] [op vid (nth t idx)]))
rf (completing
(fn [acc [op vid x]]
(case op
:add (assoc acc x (conj (or (acc x) #{}) vid))
:remove (if-let [idx (acc x)]
(if (= #{vid} idx)
(dissoc acc x)
(assoc acc x (disj idx vid)))
acc)
(do (warn "ignoring unknown index signal op:" op)
acc))))]
(f/collect-pure
(fn [val incoming]
;;(debug :old-index val)
(let [val (transduce tx rf val incoming)]
;;(debug :new-index val)
val)))))
(defn- signal-index-select
[[idx sel]]
(if sel
(if-let [choices (and (map? sel) (::choices sel))]
(fn [vertex _]
(let [res (into #{} (mapcat identity) (vals (select-keys* @vertex choices)))]
[idx (if (seq res) res ::none)]))
(fn [vertex _]
[idx (@vertex sel ::none)]))
(fn [vertex _]
[idx (if-let [v (vals @vertex)] (into #{} (mapcat identity) v) ::none)])))
(defn default-index-vertex
[g idx]
(fn
[id _ _]
(IndexVertex.
id
(atom {}) ;; index
(atom nil) ;; prev
(atom []) ;; uncollected
(atom {}) ;; sig-map
(atom 0) ;; new-edges
(atom {}) ;; outs
signal-index-select
(collect-index idx))))
An alias index vertex allows the declaration of equivalent fact items and
their interchanged use in queries. For example we might configure such
an index with an initial same-as
property like this:
(def graph (ff/fact-graph {:index (ff/alias-index-vertex '#{same-as})}))
With the graph configured that way, any fact added which has same-as
in predicate (middle) position, will have its subject and object
values declared as synonyms/aliases. Example:
;; facts: bob is a person and his homepage
(ff/add-fact! graph '[bob type person])
(ff/add-fact! graph '[bob url "http://bob.com"])
;; declare further that...
(ff/add-fact! graph '[bobby same-as bob])
;; at this point bob & bobby are semantically merged to refer to the
;; same entity and one could query the graph for all facts about
;; bobby, but retrieve the facts about both bob AND bobby
(ff/add-query! graph '[bobby nil nil])
;; => [bob type person] [bob url "http://bob.com"]
The alias merging is implemented on top of a Disjoint set (or Unionfind) datastructure and has the following features and behaviors:
- Order independence - declaring an equivalence before adding facts produces the same result as declaring it after facts have been added.
- Indempodent - declaring the same equivalence more than once has no further effect (apart from if the fact expressing the equivalence has been removed meanwhile).
- Recursive merging - equivalences can be merged into bigger sets,
e.g. the 2 facts
[a same-as b] [b same-as c]
will alias A,B and C. This too works with declaring new properties defining aliases themselves:
;; again, configure graph to use 'same-as' property to indicate aliases
(def graph (ff/fact-graph {:index (ff/alias-index-vertex '#{same-as})}))
;; add fact declaring new alias property
(ff/add-fact! graph '[equiv same-as same-as])
;; now aliases can be expressed using either 'same-as' or 'equiv'
(ff/add-fact! graph '[bob same-as bobby])
(ff/add-fact! graph '[bb equiv bob])
In query patterns, aliases are not resolved for qvars. I.e. using
the above graph/facts as example, the query pattern [bobby ?pred
?obj]
will match the facts about subject bob
and bobby
, but the
pattern [?subj url "http://bob.com"]
will always only bind qvar
?subj
to bob
, but not relate any of bob
’s aliases (only the
original facts are matched). That also means the following query (with
filter) will only produce an empty result set:
{:where [[?subj url "http://bob.com"]] :filter (= ?subj "bobby")}
(Queries are described in more detail in the next section of this document.)
(defn collect-alias-index
[g props idx]
(let [props (atom (set props))
rf (fn [[index aliases :as acc] [op vid t]]
(let [[s p o] t
x (nth t idx)
[index aliases]
(case op
:add [(assoc index x (conj (or (index x) #{}) vid))
(uf/register aliases x)]
:remove (if-let [idx (index x)]
(if (= #{vid} idx)
(let [index (dissoc index x)
comp (disj (uf/component aliases x) x)]
[index
(if (some index comp)
aliases
(uf/unregister aliases x))])
[(assoc index x (disj idx vid))
aliases])
acc)
(do (warn "ignoring unknown index signal op:" op)
acc))
aliases
(if (@props p)
(case op
:add (let [_ (debug idx :unify s o :pred p)
aliases (uf/union aliases s o)
facts' (cond
(@props o)
(do
(swap! props conj s)
(into #{} (filter #(= s (nth % 1))) (facts g)))
(@props s)
(do
(swap! props conj o)
(into #{} (filter #(= o (nth % 1))) (facts g)))
:else nil)]
(debug idx :unify-facts facts')
(reduce
(fn [acc [s _ o]]
(debug idx :unify s o)
(uf/union acc s o))
aliases facts'))
:remove (-> aliases
(uf/unregister s)
(uf/unregister o)
(uf/register s)
(uf/register o))
aliases)
aliases)]
[index aliases]))]
(f/collect-pure
(fn [val incoming]
;;(debug idx :old-index val)
(let [val (reduce rf val incoming)]
;;(debug idx :new-index val)
val)))))
(defn signal-alias-index-select
[[idx sel]]
(if sel
(if-let [choices (and (map? sel) (::choices sel))]
(fn [vertex _]
(let [[index aliases] @vertex
choices (into #{} (mapcat #(uf/component aliases %)) choices)
_ (info :index-sel idx :sel sel :choices choices)
res (into #{} (mapcat identity) (vals (select-keys* index choices)))]
[idx (if (seq res) res ::none)]))
(fn [vertex _]
(let [[index aliases] @vertex
choices (uf/component aliases sel)
_ (info :index-sel idx :sel sel :choices choices)
res (into #{} (mapcat identity) (vals (select-keys* index choices)))]
[idx (if (seq res) res ::none)])))
(fn [vertex _]
[idx (if-let [v (vals (first @vertex))] (into #{} (mapcat identity) v) ::none)])))
(defn alias-index-vertex
[alias-props]
(fn [g idx]
(fn [id _ _]
(IndexVertex.
id
(atom [{} (uf/disjoint-set)])
(atom nil) ;; prev
(atom []) ;; uncollected
(atom {}) ;; sig-map
(atom 0) ;; new-edges
(atom {}) ;; outs
signal-alias-index-select
(collect-alias-index g alias-props idx)))))
Fact queries are by far the major key operation and functional aspect
provided by this module. In fact, the FactGraph
and vertices setup
discussed above are really just means to an end - queries.
Queries allow us to treat (parts of) a FactGraph
like a database to
answer questions using the facts stored, or even manipulate the sets
of facts by inferring new facts (or contradictions) and add/remove
them to/fron the graph. Automatically.
Queries are defined declaratively and attached to the graph as (potentially nested trees, or at the very least, as vertex chains). That approach allows for the caching of intermediate results and their reuse by other queries, thus minimizing work effort.
Queries remain part of the graph until removed again and take part in every execution cycle of an attached execution context.
The query operations provided in this namespace are considered the low-level raw building blocks for constructing query trees and are therefore meant for advanced users. The DSL namespace provides a much more userfriendly and concise approach to define complex queries using a fully-featured, extensible, hashmap based DSL and expressions. These query specs are then compiled into query trees using the lowlevel forms provided here (but also offer many more additional features).
This diagram illustrates how queries are generally implemented via multiple vertices in the graph, though we here use a graph with triples only (SPO facts). There’re two queries here:
- Q1 : All “friend” facts
- Q2 : All facts about subject “alice”
- Row 1 : 4 facts stored in the graph
- Row 2 : S, P, O index vertices
- Row 3 : Index selection vertices (the blue vertex is re-used by both queries)
- Row 4 : Query accumulators (combine intersection of index selection sets)
- Row 5 : Query results (basic queries)
- Row 6 : Query results (parametric queries)
The vertex values of the result row differ based on the query type used. In this example Q1 is a basic fact query and Q2 a parametric query with variables. The result set of a basic query contains the matching facts. The result set of parametric queries contains unique maps of variable bindings.
In code form this same graph could be constructed like this:
(require '[thi.ng.fabric.core :as f])
(require '[thi.ng.fabric.facts.core :as ff])
(def g (ff/fact-graph))
(ff/add-fact! g '[alice friend bob])
(ff/add-fact! g '[bob friend charlie])
(ff/add-fact! g '[alice friend dora])
(ff/add-fact! g '[alice email "[email protected]"])
(def friends (ff/add-query! g '[nil friend nil] {}))
(def alice (ff/add-param-query! g '[alice ?p ?o] {}))
(f/execute! (f/sync-execution-context {:graph g}))
@friends
;; #{[alice friend dora] [bob friend charlie] [alice friend bob]}
@alice
;; #{{?p email, ?o "[email protected]"} {?p friend, ?o bob} {?p friend, ?o dora}}
(defn index-selection
[g sel]
(if-let [sel' (cached g ::index-sel sel)]
(do (debug :reuse-index-sel sel) sel')
(let [index (@(fact-indices g) (first sel))
vertex (f/add-vertex!
g nil
{::f/score-signal-fn f/score-signal-with-new-edges
::f/collect-fn (f/collect-pure (fn [_ in] (peek in)))})
isel (reify
IQueryResult
(pre-result-vertex
[_] nil)
(result-vertex
[_] vertex)
f/IGraphComponent
(component-type
[_] :index-sel)
(add-to-graph!
[_ g] _)
(remove-from-graph!
[_ g] (f/remove-from-graph! _ g nil))
(remove-from-graph!
[_ g parent]
(if (f/none-or-single-user? vertex parent)
(do (debug "removing :index-sel" sel)
(expire! g ::index-sel sel)
(f/disconnect-neighbor! index vertex)
(f/remove-vertex! g vertex)
true)
(do (f/disconnect-neighbor! vertex parent)
false))))]
(f/add-edge! g index vertex ((index-signal-handler index) sel) nil)
(cache! g ::index-sel sel isel))))
(defn index-sel-choice
[coll] {::choices (set coll)})
(defn make-index-selections
[g pattern]
(into [] (map-indexed #(index-selection g [% %2])) pattern))
A basic fact query is used to match a single or multiple facts without variable bindings:
['alice 'friend 'bob]
only matches the stated fact[nil 'friend 'bob]
matches all facts which have predicate'friend
and object'bob
In general, nil
is used as wildcard to match any S, P or O position.
Therefore, the query pattern [nil nil nil]
matches all facts in a
triple graph (or [nil nil nil nil]
for quads).
(defn add-query!
([g pattern opts]
(add-query! g (fact-transform g) pattern opts))
([g ptx pattern opts]
(let [pattern (transform ptx pattern)]
(if-let [q (and (empty? opts) (cached g ::queries pattern))]
(do (debug :reuse-basic-query pattern) q)
(let [[s p o] pattern
sels (make-index-selections g pattern)
acc (f/add-vertex!
g {} {::f/collect-fn collect-select})
res (f/add-vertex!
g nil
{::f/collect-fn (collect-basic-query-results g opts)
::f/score-signal-fn f/score-signal-with-new-edges
::f/score-collect-fn (score-collect-min-signal-vals (count sels))})
q (reify
#?@(:clj
[clojure.lang.IDeref (deref [_] @res)]
:cljs
[IDeref (-deref [_] @res)])
IFactQuery
(raw-pattern
[_] pattern)
IQueryResult
(pre-result-vertex
[_] acc)
(result-vertex
[_] res)
f/IGraphComponent
(component-type
[_] :basic-query)
(add-to-graph!
[_ g] (err/unsupported!))
(remove-from-graph!
[_ g] (f/remove-from-graph! _ g nil))
(remove-from-graph!
[_ g parent]
(if (f/none-or-single-user? res parent)
(do (debug "removing :basic-query" pattern)
(expire! g ::queries pattern)
(f/remove-vertex! g res)
(f/remove-vertex! g acc)
(run! #(f/remove-from-graph! % g acc) sels)
true)
(do (f/disconnect-neighbor! res parent)
false))))]
(run! #(f/add-edge! g (result-vertex %) acc f/signal-forward nil) sels)
(f/add-edge! g acc res f/signal-forward nil)
(if-not (seq opts)
(cache! g ::queries pattern q)
q))))))
This type of query supports binding fact items to query variables. Instead of fact triples or quads, the result set of this query contains maps of unique variable bindings.
E.g. given the query pattern [?s :type ?t]
and two matching facts of
[:fabric :type :project]
and [:alice :type :person]
, the result
set is: #{{?s :fabric ?t :project} {?s :alice ?t :person}}
.
Use nil
for a pattern position, if it should catch all values at
this position, but not be bound to a variable. Example: [?s :type
nil]
will catch any facts with :type
as predicate, but the result
set only contains unique bindings of query variable ?s
.
(defn add-param-query!
([g pattern opts]
(add-param-query! g (fact-transform g) pattern opts))
([g ptx pattern opts]
(let [pattern (transform ptx pattern)]
(if-let [q (and (empty? opts) (cached g ::queries pattern))]
(do (debug :reuse-param-query pattern) q)
(let [pattern' (inject-value-choices pattern (:values opts))
qvars? (mapv qvar? pattern)
raw (mapv #(if-not (qvar? %) %) pattern')
vmap (bind-translator pattern)
verify (fact-verifier qvars? pattern)
tx (if verify
(map #(if (verify %) (vmap %)))
(map vmap))
tx (comp tx (filter identity))
tx (non-grouping-result-transducer opts tx)
coll-fn (f/collect-pure
(fn [_ incoming]
(if-let [res (seq (peek incoming))]
(into #{} tx res)
#{})))
sub-q (add-query! g identity-transform raw {})
sub-res (result-vertex sub-q)
result (f/add-vertex!
g nil
{::f/collect-fn coll-fn
::f/score-signal-fn f/score-signal-with-new-edges})
pq (reify
#?@(:clj
[clojure.lang.IDeref (deref [_] @result)]
:cljs
[IDeref (-deref [_] @result)])
IFactQuery
(raw-pattern
[_] raw)
IQueryResult
(pre-result-vertex
[_] sub-res)
(result-vertex
[_] result)
f/IGraphComponent
(component-type
[_] :param-query)
(add-to-graph!
[_ g] (err/unsupported!))
(remove-from-graph!
[_ g] (f/remove-from-graph! _ g nil))
(remove-from-graph!
[_ g parent]
(if (f/none-or-single-user? result parent)
(do (debug "removing :param-query" pattern)
(expire! g ::queries pattern)
(f/remove-vertex! g result)
(f/remove-from-graph! sub-q g result)
true)
(do (f/disconnect-neighbor! result parent)
false))))]
(f/add-edge! g sub-res result f/signal-forward nil)
(if-not (seq opts)
(cache! g ::queries pattern pq)
pq))))))
(defn add-join!
([g lhs rhs opts]
(add-join! g join lhs rhs opts))
([g join-fn lhs rhs opts]
(let [lhs-v (result-vertex lhs)
rhs-v (result-vertex rhs)
lhs-id (f/vertex-id lhs-v)
rhs-id (f/vertex-id rhs-v)
_ (debug :add-join lhs-id rhs-id opts)
tx (non-grouping-result-transducer opts nil)
cfn (if tx #(into #{} tx (join-fn % %2)) join-fn)
res (f/add-vertex!
g nil
{::f/collect-fn
(fn [vertex]
(let [sig-map (f/signal-map vertex)
a (sig-map lhs-id)
b (sig-map rhs-id)]
;;(debug (f/vertex-id vertex) :join-sets a b)
(f/set-value! vertex (cfn a b))))
::f/score-collect-fn score-collect-join
::f/score-signal-fn f/score-signal-with-new-edges})
jq (as-query-entity res nil [lhs rhs] :join-query)]
(f/add-edge! g lhs-v res f/signal-forward nil)
(f/add-edge! g rhs-v res f/signal-forward nil)
jq)))
(defn add-query-join!
([g patterns opts]
(add-query-join! g (fact-transform g) patterns opts))
([g ptx patterns opts]
(let [[a b & more :as p] patterns ;;(sort-patterns patterns)
_ (assert (and a b) "Requires min. 2 query patterns")
opts' (select-keys* opts [:values])
jq (reduce
#(add-join! g join % (add-param-query! g ptx %2 opts') {})
(add-join!
g join
(add-param-query! g ptx a opts')
(add-param-query! g ptx b opts')
(if (seq more) {} opts))
(butlast more))]
(if-let [p (last more)]
(add-join! g join jq (add-param-query! g ptx p {}) opts)
jq))))
(defn add-query-join-optional!
([g patterns opts]
(add-query-join-optional! g (fact-transform g) patterns opts))
([g ptx patterns opts]
(let [[a b & more] patterns
_ (assert (and a b) "Requires min. 2 query patterns")
opts' (select-keys* opts [:values])
jq (reduce
#(add-join! g join-optional % (add-param-query! g ptx %2 opts') {})
(add-join!
g join-optional
(add-param-query! g ptx a opts')
(add-param-query! g ptx b opts')
(if (seq more) {} opts))
(butlast more))]
(if-let [p (last more)]
(add-join! g join-optional jq (add-param-query! g ptx p {}) opts)
jq))))
(defn add-query-union!
[g queries opts]
(assert (< 1 (count queries)) "min. 2 queries required")
(let [tx (non-grouping-result-transducer opts nil)
res (f/add-vertex!
g nil
{::f/collect-fn
(fn [vertex]
(let [subs (vals (f/signal-map vertex))
res (reduce #(if (seq %2) (into % %2) %) subs)]
(f/set-value! vertex (if tx (into #{} tx res) res))))
::f/score-signal-fn f/score-signal-with-new-edges})
q (as-query-entity res nil queries :union-query)]
(run! #(f/add-edge! g (result-vertex %) res f/signal-forward nil) queries)
q))
(defn add-query-negation!
[g lhs rhs opts]
(let [lhs-v (result-vertex lhs)
rhs-v (result-vertex rhs)
lhs-id (f/vertex-id lhs-v)
rhs-id (f/vertex-id rhs-v)
tx (non-grouping-result-transducer opts nil)
cfn (if tx #(into #{} tx %) identity)
res (f/add-vertex!
g nil
{::f/collect-fn
(fn [vertex]
(let [sig-map (f/signal-map vertex)
a (sig-map lhs-id)
b (sig-map rhs-id)
res (join-negative a b)]
(f/set-value! vertex (cfn res))))
::f/score-signal-fn f/score-signal-with-new-edges})
q (as-query-entity res nil [lhs rhs] :negation-query)]
(f/add-edge! g lhs-v res f/signal-forward nil)
(f/add-edge! g rhs-v res f/signal-forward nil)
q))
A property path is a possible route through a graph between two nodes. The most trivial case, a single fact, has a path length of 1. The ends of the path may be constants or variables. Variables can not be used as part of the path (predicates) itself, only at the ends.
Path queries allow for more concise expressions for some graph patterns and they also add the ability to match connectivity of subjects/objects by an arbitrary length path.
Note: Currently, path queries are only supported for triples (fact len = 3) and only bounded path queries are implemented, meaning both min/max path lengths MUST be specified.
;; match path (of single pred) between depth 1 and 3
(add-path-query! g '[?s [pred] ?o] {:min 1 :max 3})
;; match path (cyclically repeated) between 2 hops 4
(add-path-query! g '[?s [pred1 pred2] ?o] {:min 2 :max 4})
;; match given path
(add-path-query! g '[?s [p1 p2 p3] ?o])
;; match *any* paths starting at "node-a" with len 2 - 4
(add-path-query! g '[node-a [nil] ?o] {:min 1 :max 4})
;; check if path exists between fixed subject/object
;; this query has no variables, see note below
(add-path-query! g '[node-a [pred] node-b] {:min 3 :max 3})
(defn add-path-query!
([g path-pattern]
(add-path-query! g path-pattern {}))
([g path-pattern opts]
(add-path-query! g (fact-transform g) path-pattern opts))
([g ptx path-pattern opts]
(let [len (count (nth path-pattern 1))
{:keys [min max] :or {min len max len}} opts]
(assert (== 3 (count path-pattern)) "path queries only supported for triple patterns")
(assert (pos? min) "min depth must be >= 1")
(assert (<= min max) "min depth must be <= max depth")
(let [[patterns avars] (resolve-path-pattern path-pattern max)
[?s _ ?o] path-pattern
vs? (qvar? ?s)
vo? (qvar? ?o)
req (take min patterns)
opt (drop min (take max patterns))
req (if (seq req)
(if (== 1 (count req))
(add-param-query! g ptx (first req) {})
(add-query-join! g ptx req {})))
opt (if (seq opt)
(if (== 1 (count opt))
(add-param-query! g ptx (first opt) {})
(add-query-join-optional! g ptx opt {})))
q (if (and req opt)
(add-join! g join-optional req opt {}) ;; TODO opts
(or req opt))
qr (result-vertex q)
tx (cond
(or (= min max) (and vs? (not vo?)))
(let [qv (filter qvar? path-pattern)]
(map #(select-keys* % qv)))
(and vo? (not vs?))
(let [rv (take (dec min) avars)]
(mapcat #(map (fn [v] {?o v}) (vals (apply dissoc % rv)))))
:else
(let [rv (cons ?s (take (dec min) avars))]
(mapcat
#(let [s (% ?s)] (map (fn [v] {?s s ?o v}) (vals (apply dissoc % rv)))))))
tx (non-grouping-result-transducer opts tx)
res (f/add-vertex!
g #{}
{::f/collect-fn (f/collect-pure (fn [_ in] (into #{} tx (peek in))))})
pq (as-query-entity res qr [q] :path-query)]
(f/add-edge! g qr res f/signal-forward nil)
pq))))
See the query specification section in the DSL namespace for all available options…
(defn make-query-result
[{:keys [aggregate order group-by] :as spec}]
(let [in-tx (if (and aggregate (not group-by))
(result-pre-aggregator aggregate))
in-tx (if order
(if in-tx
#(sort-by order (in-tx %))
#(sort-by order %))
in-tx)
tx (or (result-transducer spec nil) (map identity))
cfn (if group-by
(let [gfn (transient-group-by-reducer spec group-by)]
(if in-tx
(fn [in] (postprocess-group-by (transduce tx gfn (in-tx in))))
(fn [in] (postprocess-group-by (transduce tx gfn in)))))
(if in-tx
(fn [in] (into (if order [] #{}) tx (in-tx in)))
(fn [in] (into #{} tx in))))]
(if (and aggregate group-by)
(result-post-aggregator aggregate cfn spec)
cfn)))
(defn add-query-result!
[g spec q]
(let [cfn (make-query-result spec)
res (f/add-vertex!
g nil
{::f/collect-fn (f/collect-pure (fn [_ in] (cfn (peek in))))
::f/score-signal-fn f/score-signal-with-new-edges})
qraw (result-vertex q)
qres (as-query-entity res qraw [q] :query-result)]
(f/add-edge! g qraw res f/signal-forward nil)
qres))
(def ^:dynamic *auto-qvar-prefix* "?__q")
(defn qvar?
"Returns true, if x is a qvar (a symbol prefixed with '?')"
[x] (and (symbol? x) (= \? (.charAt ^String (name x) 0))))
(defn auto-qvar?
"Returns true, if x is an auto-generated qvar (a symbol prefixed
with *auto-qvar-prefix*)"
[x] (and (symbol? x) (zero? (.indexOf ^String (name x) ^String *auto-qvar-prefix*))))
(defn auto-qvar
"Creates a new auto-named qvar (symbol)."
[] (gensym *auto-qvar-prefix*))
(defn qvar-name
[x] (-> x name (subs 1)))
(defn resolve-path-pattern
"Takes a path triple pattern and max depth. The pattern's predicate
must be a seq of preds. Returns a 2-elem vector [patterns vars],
where `patterns` is a seq of query patterns with injected temp qvars
for inbetween patterns and `vars` the autogenerated qvars
themselves.
Example:
[?s [p1 p2 p3] ?o]
=> [([?s p1 ?__q0] [?__q0 p2 ?__q1] [?__q1 p3 ?o]) (?__q0 ?__q1)]"
[[s p o] maxd]
(let [avars (repeatedly maxd auto-qvar)
vars (cons s avars)]
[(->> (concat (interleave vars (take maxd (cycle p))) [o])
(partition 3 2))
avars]))
(defn inject-value-choices
[pattern value-map]
(if (seq value-map)
(mapv #(if-let [choices (value-map %)] (index-sel-choice choices) %) pattern)
pattern))
(defn pattern-var-count
[pattern]
(count (filter qvar? pattern)))
(defn sort-patterns
[patterns]
(sort-by pattern-var-count patterns))
(defn unique-bindings?
"Returns true if all value-map in the given map are unique, i.e.
no two keys are mapped to the same value."
[res] (== (count (into #{} (vals res))) (count res)))
(defn select-keys*
"Like c.c/select-keys, but doesn't retain map's meta"
{:static true}
[map keyseq]
(loop [ret {} keys (seq keyseq)]
(if keys
#?(:clj
(let [entry (. clojure.lang.RT (find map (first keys)))]
(recur
(if entry (conj ret entry) ret)
(next keys)))
:cljs
(let [key (first keys)
entry (get map key ::not-found)]
(recur
(if (= entry ::not-found) ret (assoc ret key entry))
(next keys))))
ret)))
(defn index*
"Like clojure.set/index, but using select-keys w/o metadata retention."
[xrel ks]
(persistent!
(reduce
(fn [m x]
(let [ik (select-keys* x ks)]
(assoc! m ik (conj (get m ik #{}) x))))
(transient {}) xrel)))
(defn join
"Based on clojure.set/join. Does not join when there're no shared
keys and no key mapping, enforced result set limit, uses
transients."
[xrel yrel]
(if (and (seq xrel) (seq yrel))
(let [ks (set/intersection (set (keys (first xrel))) (set (keys (first yrel))))]
(if (seq ks)
(let [[r s] (if (<= (count xrel) (count yrel))
[xrel yrel]
[yrel xrel])
idx (index* r ks)]
(persistent!
(reduce
(fn [ret x]
(let [found (idx (select-keys* x ks))]
(if found
(reduce #(conj! % (conj %2 x)) ret found)
ret)))
(transient #{}) s)))
#{}))
#{}))
(defn join-optional
[a b]
(loop [old (transient #{}), new (transient #{}), kb b]
(if kb
(let [kb' [(first kb)]
[old new] (loop [old old, new new, ka a]
(if ka
(let [ka' (first ka)
j (first (join [ka'] kb'))]
(if j
(recur (conj! old ka') (conj! new j) (next ka))
(recur old new (next ka))))
[old new]))]
(recur old new (next kb)))
(let [new (persistent! new)]
(if (seq new)
(into (apply disj (set a) (persistent! old)) new)
a)))))
(defn joinable?
([a b]
(joinable? a b (set (keys a)) (set (keys b))))
([a b aks bks]
(let [ks (set/intersection aks bks)]
(and (seq ks) (= (select-keys* a ks) (select-keys* b ks))))))
(defn join-negative
[a b]
(if (and (seq a) (seq b))
(reduce
(fn [res bres]
(let [bks (set (keys bres))]
(persistent!
(reduce
(fn [res ares]
(if (joinable? ares bres (set (keys ares)) bks)
(disj! res ares)
res))
(transient res) res))))
a b)
a))
(defn result-pre-aggregator
[agg]
(fn [results]
(let [ares (agg results)]
(map #(merge % ares) results))))
(defn result-post-aggregator
[agg cfn spec]
(let [sel (:select spec)
rfn (if (or (nil? sel) (= :* sel))
(let [sel (if (sequential? sel) sel [sel])]
(fn [acc k res]
(let [ares (agg res)]
(assoc acc k (mapv #(merge % ares) res)))))
(let [avars (set (keys (:aggregate* spec)))]
(if (and (seq avars) (every? avars sel))
(fn [acc k res]
(assoc acc k (select-keys (merge (first res) (agg res)) sel)))
(fn [acc k res]
(let [ares (agg res)]
(assoc acc k (into (empty res) (map #(select-keys (merge % ares) sel)) res)))))))]
(fn [in] (reduce-kv rfn {} (cfn in)))))
(defn result-transducer
[spec tx]
(let [tx (if-let [bnd (:bind spec)]
(let [tx' (map bnd)]
(if tx (comp tx tx') tx'))
tx)
tx (if (:unique spec)
(let [tx' (filter unique-bindings?)]
(if tx (comp tx tx') tx'))
tx)
tx (if-let [flt (:filter spec)]
(let [tx' (filter flt)]
(if tx (comp tx tx') tx'))
tx)
tx (if-let [lim (:limit spec)]
(let [tx' (take lim)]
(if tx (comp tx tx') tx'))
tx)
tx (if-let [sel (and (not (:group-by spec)) (:select spec))]
(if-not (= :* sel)
(let [sel (if (sequential? sel) sel [sel])
tx' (comp (map #(select-keys % sel)) (filter seq))
agg (set (keys (:aggregate* spec)))
tx' (if (and (seq agg) (every? agg sel))
(comp tx' (take 1))
tx')]
(if tx (comp tx tx') tx'))
tx)
tx)]
tx))
(defn non-grouping-result-transducer
[spec tx] (result-transducer (dissoc spec :group-by :aggregate*) tx))
(defn transient-group-by-reducer
[{:keys [aggregate order select]} grp]
(let [ctype (if order [] #{})]
(if-not (or aggregate (nil? select) (= :* select))
(let [sel (if (sequential? select) select [select])]
(fn
([] (transient {}))
([res] res)
([res input]
(let [k (grp input)]
(assoc! res k (conj! (get res k (transient ctype)) (select-keys* input sel)))))))
(fn
([] (transient {}))
([res] res)
([res input]
(let [k (grp input)]
(assoc! res k (conj! (get res k (transient ctype)) input))))))))
(defn postprocess-group-by
[res]
(reduce-kv
(fn [acc k v] (assoc acc k (persistent! v)))
{} (persistent! res)))
(defn accumulate-result-vars
[results]
(->> results
(reduce
(fn [acc res]
(reduce-kv
(fn [acc k v]
(assoc! acc k (conj! (get acc k (transient #{})) v)))
acc res))
(transient {}))
(postprocess-group-by)))
(defn remove-recursively
[g res parent deps type]
(if (f/none-or-single-user? res parent)
(do (debug "removing" type)
(f/remove-vertex! g res)
(run! #(f/remove-from-graph! % g res) deps)
true)
(do (f/disconnect-neighbor! res parent)
false)))
(defn as-query-entity
[res pre-res deps type]
(reify
#?@(:clj
[clojure.lang.IDeref (deref [_] @res)]
:cljs
[IDeref (-deref [_] @res)])
IFactQuery
(raw-pattern
[_] nil)
IQueryResult
(pre-result-vertex
[_] pre-res)
(result-vertex
[_] res)
f/IGraphComponent
(component-type
[_] type)
(add-to-graph!
[_ g] (err/unsupported!))
(remove-from-graph!
[_ g] (f/remove-from-graph! _ g nil))
(remove-from-graph!
[_ g parent]
(remove-recursively g res parent deps type))))
This function (and fact-verifier
below) are used as part of the
mechanism allowing the graph to support facts of varying lengths and
queries over them. Currently, this module only supports fact triples
(len=3) or quads (len=4).
bind-translator
is used to provide optimized functions to bind query
variables to their fact components in the result set of a parametric
query.
(defn bind-translator
[pattern]
(let [vmap (reduce-kv
(fn [acc k v]
(if (and (qvar? k) (not (acc k)))
(assoc acc k #(% v))
acc))
{} (zipmap pattern (range)))]
(case (count vmap)
0 (fn [r] {})
1 (let [[[a av]] (seq vmap)] (fn [r] {a (av r)}))
2 (let [[[a av] [b bv]] (seq vmap)] (fn [r] {a (av r) b (bv r)}))
3 (let [[[a av] [b bv] [c cv]] (seq vmap)] (fn [r] {a (av r) b (bv r) c (cv r)}))
(fn [r] (reduce-kv (fn [acc k f] (assoc acc k (f r))) {} vmap)))))
The fact verifier function is used to provide optimized validations to
ensure query variables are bound to unique values. E.g. given a query
pattern ['?a ?b '?a]
, these functions ensure that all qvars are
bound to different values, i.e. it would reject the fact [fact is
fiction]
(yes, pretty meaningless..) because ?a
wouldn’t be bound to the
same values (first ?a
binds to “fact”, second ?a
to “fiction”)…
((fact-verifier [true true true] '[?a ?b ?a]) '[fact is fiction])
;; false
((fact-verifier [true true true] '[?a ?b ?a]) '[fact is fact])
;; true
(defmulti fact-verifier
(fn [qvars? pattern] (count pattern)))
(defmethod fact-verifier 3
[[vs? vp? vo?] [s p o]]
(cond
(and vs? vp? vo?) (cond
(= s p o) #(= (% 0) (% 1) (% 2))
(= s p) #(and (= (% 0) (% 1)) (not= (% 0) (% 2)))
(= s o) #(and (= (% 0) (% 2)) (not= (% 0) (% 1)))
(= p o) #(and (= (% 1) (% 2)) (not= (% 0) (% 1)))
:else nil)
(and vs? vp?) (if (= s p) #(= (% 0) (% 1)) #(not= (% 0) (% 1)))
(and vs? vo?) (if (= s o) #(= (% 0) (% 2)) #(not= (% 0) (% 2)))
(and vp? vo?) (if (= p o) #(= (% 1) (% 2)) #(not= (% 1) (% 2)))
:else nil))
(defmethod fact-verifier 4
[[vt? vs? vp? vo?] [t s p o]]
(cond
(and vt? vs? vp? vo?)
(cond
(= t s p o) #(= (% 0) (% 1) (% 2) (% 3))
(= t s p) #(and (not= (% 0) (% 3)) (= (% 0) (% 1) (% 2)))
(= t s o) #(and (not= (% 0) (% 2)) (= (% 0) (% 1) (% 3)))
(= t p o) #(and (not= (% 0) (% 1)) (= (% 0) (% 2) (% 3)))
(= s p o) #(and (not= (% 0) (% 1)) (= (% 1) (% 2) (% 3)))
(and (= t s) (= p o)) #(and (= (% 0) (% 1)) (= (% 2) (% 3)))
(and (= t p) (= s o)) #(and (= (% 0) (% 2)) (= (% 1) (% 3)))
(and (= t o) (= s p)) #(and (= (% 0) (% 3)) (= (% 1) (% 2)))
(= t s) #(let [t (first %)] (and (= t (% 1)) (not= t (% 2)) (not= t (% 3))))
(= t p) #(let [t (first %)] (and (= t (% 2)) (not= t (% 1)) (not= t (% 3))))
(= t o) #(let [o (peek %)] (and (= o (% 0)) (not= o (% 1)) (not= o (% 2))))
(= s p) #(let [s (nth % 1)] (and (= s (% 2)) (not= s (% 3)) (not= s (% 0))))
(= s o) #(let [o (peek %)] (and (= o (% 1)) (not= o (% 2)) (not= o (% 0))))
(= p o) #(let [o (peek %)] (and (= o (% 2)) (not= o (% 1)) (not= o (% 0))))
:else nil)
(and vt? vs? vp?)
(cond
(= t s p) #(= (% 0) (% 1) (% 2))
(= t s) #(and (= (% 0) (% 1)) (not= (% 0) (% 2)))
(= t p) #(and (= (% 0) (% 2)) (not= (% 0) (% 1)))
(= s p) #(and (= (% 1) (% 2)) (not= (% 1) (% 0)))
:else nil)
(and vt? vs? vo?)
(cond
(= t s o) #(= (% 0) (% 1) (% 3))
(= t s) #(and (= (% 0) (% 1)) (not= (% 0) (% 3)))
(= t o) #(and (= (% 0) (% 3)) (not= (% 0) (% 1)))
(= s o) #(and (= (% 1) (% 3)) (not= (% 1) (% 0)))
:else nil)
(and vt? vp? vo?)
(cond
(= t p o) #(= (% 0) (% 2) (% 3))
(= t p) #(and (= (% 0) (% 2)) (not= (% 0) (% 3)))
(= t o) #(and (= (% 0) (% 3)) (not= (% 0) (% 1)))
(= p o) #(and (= (% 2) (% 3)) (not= (% 2) (% 0)))
:else nil)
(and vs? vp? vo?)
(cond
(= s p o) #(= (% 1) (% 2) (% 3))
(= s p) #(and (= (% 1) (% 2)) (not= (% 1) (% 3)))
(= s o) #(and (= (% 1) (% 3)) (not= (% 1) (% 2)))
(= p o) #(and (= (% 2) (% 3)) (not= (% 2) (% 1)))
:else nil)
(and vt? vs?) (if (= t s) #(= (% 0) (% 1)) #(not= (% 0) (% 1)))
(and vt? vp?) (if (= t p) #(= (% 0) (% 2)) #(not= (% 0) (% 2)))
(and vt? vo?) (if (= t o) #(= (% 0) (% 3)) #(not= (% 0) (% 3)))
(and vs? vp?) (if (= s p) #(= (% 1) (% 2)) #(not= (% 1) (% 2)))
(and vs? vo?) (if (= s o) #(= (% 1) (% 3)) #(not= (% 1) (% 3)))
(and vp? vo?) (if (= p o) #(= (% 2) (% 3)) #(not= (% 2) (% 3)))
:else nil))
(defn add-rule!
[g {:keys [id query patterns transform production collect-fn] :as opts}]
(let [id (or id (f/random-id))
coll-fn (or collect-fn (collect-inference g production))
query (or query
(let [tx (or transform (fact-transform g))
q-opts (select-keys opts [:filter :limit :select :unique])]
(if (< 1 (count patterns))
(add-query-join! g tx patterns q-opts)
(add-param-query! g tx (first patterns) q-opts))))
inf (f/add-vertex! g #{} {::f/collect-fn coll-fn})
rule (reify
#?@(:clj
[clojure.lang.IDeref (deref [_] @query)]
:cljs
[IDeref (-deref [_] @query)])
IFactQuery
(raw-pattern
[_] nil)
IQueryResult
(pre-result-vertex
[_] (pre-result-vertex query))
(result-vertex
[_] (result-vertex query))
f/IGraphComponent
(component-type
[_] :rule)
(add-to-graph!
[_ g] (err/unsupported!))
(remove-from-graph!
[_ g] (f/remove-from-graph! _ g nil))
(remove-from-graph!
[_ g parent]
(if (f/none-or-single-user? inf parent)
(do (debug "removing :rule" id)
(f/remove-vertex! g inf)
(f/remove-from-graph! query g inf)
true)
(do (f/disconnect-neighbor! inf parent)
false))))]
(f/add-edge! g (result-vertex query) inf f/signal-forward nil)
(cache! g ::rules id rule)))
(defn- signal-fact
[vertex op] [op (f/vertex-id vertex) @vertex])
(def ^:private collect-select
(f/collect-pure
(fn [val incoming]
(let [val (reduce (fn [acc [idx res]] (assoc acc idx res)) val incoming)]
;;(debug :coll-select val incoming)
val))))
(defn- score-collect-min-signal-vals
[num]
(fn [vertex]
(if (> num (count (vals (peek (f/uncollected-signals vertex))))) 0 1)))
(defn- score-collect-min-signals
[num]
(fn [vertex]
(if (> num (count (f/uncollected-signals vertex))) 0 1)))
(defn- collect-basic-query-results
[g opts]
(let [tx (if (not= false (:untransform opts))
(let [ftx (fact-transform g)]
(map #(untransform ftx @(f/vertex-for-id g %))))
(map #(deref (f/vertex-for-id g %))))
tx (if-let [flt (:filter opts)]
(let [tx' (filter flt)]
(if tx (comp tx tx') tx'))
tx)
tx (if-let [lim (:limit opts)]
(let [tx' (take lim)]
(if tx (comp tx tx') tx'))
tx)]
(f/collect-pure
(fn [_ incoming]
(let [res (vals (peek incoming))]
;;(debug :agg-incoming res)
(if (every? #(not= ::none %) res)
(->> res
(into #{})
(sort-by count)
(reduce set/intersection)
(into #{} tx))
#{}))))))
(defn- score-collect-join
[vertex]
(if (and (seq (f/uncollected-signals vertex))
(== (count (f/signal-map vertex)) 2))
1 0))
(defn- collect-inference
[g production]
(fn [vertex]
(let [prev @vertex
in (reduce into #{} (f/uncollected-signals vertex))
adds (set/difference in prev)]
(debug (f/vertex-id vertex) :additions adds)
(run! #(production g vertex %) adds)
(f/update-value! vertex #(set/union % adds)))))
(defn collect-into-set
[tx] (f/collect-pure (fn [_ in] (into #{} tx (peek in)))))
In addition to the standard events emitted by the default
ComputeGraph
of the fabric-core module, a FactGraph
also emits
the events :add-fact
or :remove-fact
whenever a fact is added or
removed. The add-fact-graph-logger
function below can be used to
subscribe to these (and only these) events and react to them, for
example to write a log of fact changes to disk. The fn takes a logging
function which is called with a single arg, a 2-element vector of
[event fact]
. The user function is called from a core.async
go-loop
setup by add-fact-graph-logger
.
(def fact-log-transducer
(map (fn [[op f]] [({:add-fact :add :remove-fact :remove} op) f])))
(defn add-fact-graph-logger
([g log-fn]
(add-fact-graph-logger g log-fn 1024))
([g log-fn buf-size]
(let [ch (async/chan buf-size)
watch-id (f/random-id)
log->chan #(go (>! ch %))]
(async/thread
(loop []
(when-let [t (<!! ch)]
(log-fn t)
(recur))))
(f/add-watch! g :add-fact watch-id log->chan)
(f/add-watch! g :remove-fact watch-id log->chan)
(info "started fact logger with buffer" buf-size)
{:graph g :chan ch :watch-id watch-id})))
(defn remove-fact-graph-logger
[{:keys [graph watch-id chan]}]
(f/remove-watch! graph :add-fact watch-id)
(f/remove-watch! graph :remove-fact watch-id)
(async/close! chan))
Clojure’s hashmaps provide a natural way to express nested
relationships, but cannot be directly used as facts. The function
map->facts
recursively converts a given hashmap into a lazy-seq of
facts (triples).
Conversion rules:
- toplevel keys are used as subjects and MUST each have a map as value, where their keys are used as predicates and their values as objects.
- maps in object position are translated into UUID strings in order to allow the nested map to become a subject itself. In RDF terms, this is equivalent to BNodes.
- sequential types (vectors, lists) in object position cause multiple triples with same subject & predicate to be emitted. If any of the items in the object sequence is a map, it will be treated as BNode as described in previous rule
The following example demonstrates all these rules:
(map->facts
{"people:toxi"
{"rdf:type" "schema:Person"
"schema:givenName" "Karsten"
"schema:familyName" "Schmidt"
"foaf:nick" "toxi"
"foaf:account" [{"rdf:type" "foaf:OnlineAccount"
"foaf:accountServiceHomepage" "https://twitter.com/"
"foaf:accountName" "toxi"}
{"rdf:type" "foaf:OnlineAccount"
"foaf:accountServiceHomepage" "https://github.com/"
"foaf:accountName" "postspectacular"}]
"schema:memberOf" ["orgs:postspectacular"
"orgs:thing"]
"schema:gender" "male"
"schema:jobTitle" "Computational Designer"}})
;; (["people:toxi" "rdf:type" "schema:Person"]
;; ["people:toxi" "schema:givenName" "Karsten"]
;; ["people:toxi" "schema:familyName" "Schmidt"]
;; ["people:toxi" "foaf:nick" "toxi"]
;; ["people:toxi" "foaf:account" "b2c1c57e-2967-44fe-8b66-135ec6fa80bf"]
;; ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "rdf:type" "foaf:OnlineAccount"]
;; ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "foaf:accountServiceHomepage" "https://twitter.com/"]
;; ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "foaf:accountName" "toxi"]
;; ["people:toxi" "foaf:account" "4e6fba86-92ec-4422-a8a2-c80150ddb885"]
;; ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "rdf:type" "foaf:OnlineAccount"]
;; ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "foaf:accountServiceHomepage" "https://github.com/"]
;; ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "foaf:accountName" "postspectacular"]
;; ["people:toxi" "schema:memberOf" "orgs:postspectacular"]
;; ["people:toxi" "schema:memberOf" "orgs:thing"]
;; ["people:toxi" "schema:gender" "male"]
;; ["people:toxi" "schema:jobTitle" "Computational Designer"])
(declare map->facts)
(defn- bnode-facts
[s p o]
(let [bn (strf/new-uuid)] (cons [s p bn] (map->facts o bn))))
(defn- map->facts*
[s]
(fn [[p o]]
(cond
(sequential? o) (mapcat (fn [o] (if (map? o) (bnode-facts s p o) [[s p o]])) o)
(map? o) (bnode-facts s p o)
:else [[s p o]])))
(defn map->facts
([fact-map]
(mapcat (fn [[s v]] (mapcat (map->facts* s) v)) fact-map))
([fact-map s]
(mapcat (map->facts* s) fact-map)))
(ns thi.ng.fabric.facts.core
#?@(:clj
[(:require
[thi.ng.fabric.core :as f]
[thi.ng.xerror.core :as err]
[thi.ng.strf.core :as strf]
[thi.ng.dstruct.unionfind :as uf]
[clojure.set :as set]
[clojure.data.int-map :as imap]
[clojure.core.async :as async :refer [go go-loop <! <!! >!]]
[taoensso.timbre :refer [debug info warn]])]
:cljs
[(:require-macros
[cljs.core.async.macros :refer [go go-loop]]
[cljs-log.core :refer [debug info warn]])
(:require
[thi.ng.fabric.core :as f]
[thi.ng.xerror.core :as err]
[thi.ng.strf.core :as strf]
[clojure.set :as set]
[cljs.core.async :as async :refer [<! <!! >!]])]))
(def ^:private MAX_LIMIT #?(:clj Long/MAX_VALUE :cljs (.-MAX_VALUE js/Number)))
(declare select-keys* qvar? add-query! add-query-join!)
(declare index-selection index-sel-choice make-index-selections)
<<protos>>
<<helpers>>
<<sig-coll>>
<<query-helpers>>
<<fact-v>>
<<index-v>>
<<fact-tx>>
<<graph>>
<<queries>>
<<inference>>
<<logger>>
- maybe more convenient, but not strictly facts
- requires custom fact transforms & query patterns (also maps)
- how to deal with adding new facts, merged into original map?
- just provide map translations: map->facts (use trio triple-seq conversion)
;; example map
{"P-123" {:name "karsten" :nick "toxi" :address {:country "uk", :city "london"}}}
;; query: subjects with name & city
{?n :name ?city [:address :city]}