Skip to content

Latest commit

 

History

History
2116 lines (1857 loc) · 75.4 KB

core.org

File metadata and controls

2116 lines (1857 loc) · 75.4 KB

Contents

Namespace: thi.ng.fabric.facts.core

Protocols

These protocols act as the main public & internal API for this module and their roles are described in more detail throughout this document.

(defprotocol IFactGraph
  (facts [_])
  (fact-indices [_])
  (fact-transform [_])
  (add-fact! [_ t])
  (remove-fact! [_ t]))

(defprotocol ICache
  (cached [_ type k])
  (cache! [_ type k v])
  (expire! [_ type k]))

(defprotocol IFactIndex
  (index-signal-handler [_]))

(defprotocol IFactQuery
  (raw-pattern [_]))

(defprotocol IQueryResult
  (pre-result-vertex [_])
  (result-vertex [_]))

(defprotocol ITwoWayTransform
  (transform [_ x])
  (untransform [_ x]))

Core types & concepts

Facts

Facts are simply vectors of (currently 3 or 4) items, e.g. [subject predicate object] (for triples) and stored in custom vertices. A FactVertex is more lightweight than a standard Vertex, configured to never collect (facts are immutable) and signal only once when added (or removed) from the graph.

Any Clojure value can be used as fact item, though in most cases you’ll want to use strings, keywords, symbols, UUIDs or numbers.

Example facts:

;; P123 has name Alice
[:P123 :name "Alice"]

;; toxi is author of fabric
'[toxi author fabric]

;; in transaction UUID we state "XYZ" has a template function f
[#uuid "cffedb78-0e16-4f7e-8494-a19b749e771b" "XYZ" :template-fn (fn [title] [:div [:h1 title]])]

FactVertex implementation

As with the default Vertex type defined in the fabric-core module, a vertex’ state can be obtained by dereferencing it via Clojure’s @ reader macro or deref. The FactVertex impl will throw errors for many operations supported by the default vertex. This is because unlike standard vertices, facts (as values) in a graph must be immutable. The only mutable aspects of a FactVertex are related to connectivity and signalling (e.g. each fact vertex is automatically connected to index vertices).

(deftype FactVertex
    [id fact new-edges outs]
  #?@(:clj
       [clojure.lang.IDeref
        (deref
         [_] fact)]
       :cljs
       [IDeref
        (-deref
         [_] fact)])
  f/IVertex
  (vertex-id
    [_] id)
  (set-value!
    [_ val] (err/unsupported!))
  (update-value!
    [_ f] (err/unsupported!))
  (previous-value
    [_] fact)
  (collect!
    [_] (err/unsupported!))
  (collect-final!
    [_] (err/unsupported!))
  (score-collect
    [_] 0)
  (connect-to!
    [_ v sig-fn opts]
    (swap! outs assoc v [sig-fn opts])
    (swap! new-edges inc)
    (debug id "edge to" (f/vertex-id v) "(" (pr-str opts) ") new:" @new-edges)
    _)
  (neighbors
    [_] (keys @outs))
  (disconnect-neighbor!
    [_ v]
    (when v
      (debug id "disconnect from" (f/vertex-id v))
      (swap! outs dissoc v))
    _)
  (disconnect-all!
    [_]
    (run! #(f/disconnect-neighbor! _ %) (keys @outs))
    _)
  (new-edge-count
    [_] @new-edges)
  (score-signal
    [_] @new-edges)
  (signal!
    [_ handler]
    (reset! new-edges 0)
    (handler _ @outs))
  (receive-signal
    [_ src sig] (err/unsupported!))
  (signal-map
    [_] (err/unsupported!))
  (uncollected-signals
    [_] (err/unsupported!)))

Constructor

This constructor is not intended for public use. Instead it is called by the FactGraph when a new fact is to be added.

(defn fact-vertex
  [id fact _]
  (FactVertex. id fact (atom 0) (atom {})))

FactGraph

Facts are stored in a FactGraph instance which has been configured with a maximum fact length (currently 3 or 4 items per fact, all facts in a graph MUST have the same length). The default implementation of the FactGraph wraps an existing IComputeGraph instance (used as backend) and provides an extended API via the IFactGraph and ICache protocols.

Fact indexing

The FactGraph indexes all facts by their individual items (e.g. subject, predicate and object values). These indices are special vertices, are customizable, and automatically added during graph construction. All fact vertices are automatically connected to (and only to!) these indices. When queries are added to the graph, they indirectly attach themselves to these indices (via intermediate index selection vertices, see Queries section below) and are notified each time their index selection is changing.

The module currently implements two types of indices:

  • plain fact components
  • indexing with dynamic entity aliasing (equivalence/synonym)

Both are described (incl. examples) further below.

Fact transformation

The ITwoWayTransform protocol allows the graph to invisibly store transformed facts, possibly in more compact form than specified by the user. When a fact is added, the transform method of the transform is applied. Queries automatically call untransform on their results, so the user will be oblivious to the changed internal representation.

Custom implementations MUST follow this rule:

  • Query variables (e.g. ?a, Clojure symbols prefixed with ?) and nil values MUST NOT be transformed.
Identity transform

This is a null transform (NOP), used when no fact transform is given for a graph.

(def identity-transform
  (reify ITwoWayTransform
    (transform [_ x] x)
    (untransform [_ x] x)))
String prefix replacements

This transform is useful when dealing with Linked Data (RDF), where most items are URIs. It’s common practice in the LD community to specify prefixes for various often used vocabularies. E.g. We might define a prefix map and transform like this:

(def prefixes
  {"rdf"  "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   "rdfs" "http://www.w3.org/2000/01/rdf-schema#"
   "owl"  "http://www.w3.org/2002/07/owl#"})

(def tx (prefix-transform prefixes))

(transform tx "http://www.w3.org/2002/07/owl#Ontology")
;; ["owl" "Ontology"]

(untransform tx ["owl" "Ontology"])
;; "http://www.w3.org/2002/07/owl#Ontology"

The transform phase takes such a prefix map and value. If value is a string, attempts to find prefix from map and if found returns a 2-elem vector of [prefix name], else original.

The untransformer too takes the same prefix map and a value. If the value is a vector as produced by the transformation fn and has a known prefix, returns expanded string or else original value.

Important note: This transform only applies to single fact sub-items, to apply it to full facts, wrap it with others in combine-transforms (described below).

(defn sort-prefixes
  [prefixes] (rseq (vec (sort-by #(count (peek %)) prefixes))))

(defn prefix-vector-transform
  [prefixes]
  (let [prefixes' (sort-prefixes prefixes)]
    (reify ITwoWayTransform
      (transform
        [_ x]
        (if (string? x)
          (reduce
           (fn [acc p]
             (if (zero? (.indexOf ^String x ^String (peek p)))
               (reduced [(first p) (subs x (count (peek p)))])
               acc))
           x prefixes')
          x))
      (untransform
        [_ x]
        (if (vector? x)
          (if-let [p (prefixes (first x))]
            (str p (nth x 1))
            x)
          x)))))

This second version of this transform is very similar and only differs in its transformed result type: Instead of a vector, it produces a string of prefix:name.

(defn ->prefix-string
  [prefixes' x]
  (if (string? x)
    (reduce
     (fn [acc p]
       (if (zero? (.indexOf ^String x ^String (peek p)))
         (reduced (str (first p) \: (subs x (count (peek p)))))
         acc))
     x prefixes')
    x))

(defn <-prefix-string
  [prefixes x]
  (if (string? x)
    (let [[[_ p n]] (re-seq #"^(\w[\w\-_]+):([\w\-_]*)$" x)]
      (if-let [p (prefixes p)]
        (str p n)
        x))
    x))

(defn prefix-string-transform
  [prefixes]
  (let [prefixes' (sort-prefixes prefixes)]
    (reify ITwoWayTransform
      (transform
        [_ x] (->prefix-string prefixes' x))
      (untransform
        [_ x] (<-prefix-string prefixes x)))))
Global fact item indexing

This transform too applies to fact sub-items only (see note in previous section) and maps values to/from unique index IDs.

In Clojure, the reverse index uses Zach Tellman’s org.clojure/data.int-map]] for faster and more memory efficient behavior.

(defn global-index-transform
  []
  (let [index (atom #?(:clj  {:fwd {} :rev (imap/int-map) :id 0}
                       :cljs {:fwd {} :rev {} :id 0}))]
    (reify ITwoWayTransform
      (transform
        [_ x]
        (if (or (nil? x) (qvar? x))
          x
          (or ((@index :fwd) x)
              (let [curr (volatile! nil)]
                (swap! index
                       #(let [id (:id %)]
                          (vreset! curr id)
                          (-> %
                              (update :id inc)
                              (update :fwd assoc x id)
                              (update :rev assoc id x))))
                @curr))))
      (untransform
        [_ id] ((@index :rev) id id)))))
Higher-order transforms & composition

Transforms can be composed using the functions below:

  • compose-transforms applies the given transforms serially (via reduce) and in reverse order for untransform.
  • one-way-transform takes an ITwoWayTransform and modifies it to use identity as its reverse transformation, therefore keeping transformed values.
  • combine-transforms applies 3 or 4 transforms in parallel (one per fact component) to (un)transform a single fact
(defn compose-transforms
  [& transforms]
  (let [rtx (reverse transforms)]
    (reify ITwoWayTransform
      (transform [_ x]
        (reduce #(transform %2 %) x transforms))
      (untransform [_ x]
        (reduce #(untransform %2 %) x rtx)))))

(defn one-way-transform
  [tx]
  (reify ITwoWayTransform
    (transform [_ x] (transform tx x))
    (untransform [_ x] x)))

(defn combine-transforms
  ([tx len]
   (apply combine-transforms (repeat len tx)))
  ([txs txp txo]
   (reify ITwoWayTransform
     (transform [_ fact]
       [(transform txs (first fact))
        (transform txp (nth fact 1))
        (transform txo (nth fact 2))])
     (untransform [_ fact]
       [(untransform txs (first fact))
        (untransform txp (nth fact 1))
        (untransform txo (nth fact 2))])))
  ([txt txs txp txo]
   (reify ITwoWayTransform
     (transform [_ fact]
       [(transform txt (first fact))
        (transform txs (nth fact 1))
        (transform txp (nth fact 2))
        (transform txo (nth fact 3))])
     (untransform [_ fact]
       [(untransform txt (first fact))
        (untransform txs (nth fact 1))
        (untransform txp (nth fact 2))
        (untransform txo (nth fact 3))]))))

Entity caching

The ICache protocol is used to cache various query related entities for DRY reasons. Intermediate result caching is not just done for performance reasons, but also to minimize memory usage, since queries often make use of similar patterns and sub-patterns (e.g. two separate query joins might use the same underlying pattern as LHS or RHS). Currently the cache is an atom with a nested map and these top-level keys:

  • ::index-sel - caches vertices selecting a specific key from any of the indices (or a wildcard selection). These vertices basically act as sub-sub-queries of a specific fact component.
  • ::queries - caches existing single-pattern queries based on their query pattern (only basic & parametric queries without options are cached). Note: the cached values are NOT vertices, but query entities (reify’s w/ IFactQuery, IGraphComponent and other protocol implementations)
  • ::rules - caches inference rules based on their ID (also defrecords)

Graph implementation

(defrecord FactGraph
    [g indices facts cache ftx]
  f/IComputeGraph
  (add-vertex!
    [_ val vspec] (f/add-vertex! g val vspec))
  (remove-vertex!
    [_ v] (f/remove-vertex! g v))
  (vertex-for-id
    [_ id] (f/vertex-for-id g id))
  (vertices
    [_] (f/vertices g))
  (add-edge!
    [_ src dest sig opts] (f/add-edge! g src dest sig opts))
  f/IWatch
  (add-watch!
    [_ type id f] (f/add-watch! g type id f) _)
  (remove-watch!
    [_ type id] (f/remove-watch! g type id) _)
  (notify-watches
    [_ evt] (f/notify-watches g evt))
  IFactGraph
  (facts
    [_] (keys @facts))
  (fact-indices
    [_] indices)
  (fact-transform
    [_] ftx)
  (add-fact!
    [_ f]
    (let [f' (transform ftx f)]
      (or (@facts f')
          (let [v (f/add-vertex! g f' nil fact-vertex)]
            (f/notify-watches g [:add-fact f])
            (debug :add-fact f f')
            (run! #(f/add-edge! g v % signal-fact :add) @indices)
            (swap! facts assoc f' v)
            v))))
  (remove-fact!
    [_ f]
    (let [f' (transform ftx f)]
      (if-let [v (@facts f')]
        (do
          (debug :remove-fact f f')
          (run! #(f/add-edge! g v % signal-fact :remove) @indices)
          (swap! facts dissoc f')
          (f/notify-watches g [:remove-fact f])
          (f/remove-vertex! g v)
          v)
        (warn "attempting to remove unknown fact:" f))))
  ICache
  (cached
    [_ type k] (get-in @cache [type k]))
  (cache!
    [_ type k v] (swap! cache assoc-in [type k] v) v)
  (expire!
    [_ type k] (swap! cache update type dissoc k) nil))

Constructor

(defn fact-graph
  "Creates a new FactGraph instance configured with given options map:
  :graph     - backing IComputeGraph (default fabric.core/compute-graph)
  :len       - fact length (default 3)
  :index     - index vertex ctor (default default-index-vertex)
  :transform - fact transform (default identity-transform)"
  ([]
   (fact-graph {}))
  ([{:keys [graph len index transform]
     :or   {graph     (f/compute-graph)
            len       3
            index     default-index-vertex
            transform identity-transform}}]
   (let [indices (atom nil)
         g       (map->FactGraph
                  {:indices indices
                   :facts   (atom {})
                   :cache   (atom {})
                   :ftx     transform
                   :g       graph})]
     (reset! indices (mapv #(add-index-vertex! graph (index g %)) (range len)))
     g)))

Index vertices

Index vertices are automatically created during graph construction. This is the helper function used by the FactGraph constructor to create a single fact component index vertex. The diagram in the following section shows how these vertices are used (users don’t need to deal with them directly, though).

IndexVertex type

(deftype IndexVertex
    [id index prev-index uncollected signal-map new-edges outs
     signal-fn collect-fn]
  #?@(:clj
       [clojure.lang.IDeref
        (deref
         [_] @index)]
       :cljs
       [IDeref
        (-deref
         [_] @index)])
  f/IVertex
  (vertex-id
    [_] id)
  (set-value!
    [_ val] (reset! index val) _)
  (update-value!
    [_ f] (swap! index f) _)
  (previous-value
    [_] @prev-index)
  (collect!
    [_]
    (collect-fn _)
    (reset! uncollected [])
    _)
  (collect-final!
    [_] _)
  (score-collect
    [_] (count @uncollected))
  (connect-to!
    [_ v sig-fn opts]
    (swap! outs assoc v [sig-fn opts])
    (swap! new-edges inc)
    (debug id "edge to" (f/vertex-id v) "(" (pr-str opts) ") new:" @new-edges)
    _)
  (neighbors
    [_] (keys @outs))
  (disconnect-neighbor!
    [_ v]
    (when v
      (debug id "disconnect from" (f/vertex-id v))
      (swap! outs dissoc v))
    _)
  (disconnect-all!
    [_]
    (run! #(f/disconnect-neighbor! _ %) (keys @outs))
    _)
  (new-edge-count
    [_] @new-edges)
  (score-signal
    [_] (+ @new-edges (if (= @index @prev-index) 0 1)))
  (signal!
    [_ handler]
    (reset! prev-index @index)
    (reset! new-edges 0)
    (handler _ @outs))
  (receive-signal
    [_ src sig]
    (if-not (= sig (@signal-map src))
      (do (swap! uncollected conj sig)
          (swap! signal-map assoc src sig)
          true)
      (debug id " ignoring unchanged signal: " (pr-str sig))))
  (signal-map
    [_] @signal-map)
  (uncollected-signals
    [_] @uncollected)
  IFactIndex
  (index-signal-handler
    [_] signal-fn))

Default index vertex

(defn add-index-vertex!
  [g ctor]
  (f/add-vertex! g nil nil ctor))

(defn- collect-index
  [idx]
  (let [tx (map (fn [[op vid t]] [op vid (nth t idx)]))
        rf (completing
            (fn [acc [op vid x]]
              (case op
                :add    (assoc acc x (conj (or (acc x) #{}) vid))
                :remove (if-let [idx (acc x)]
                          (if (= #{vid} idx)
                            (dissoc acc x)
                            (assoc acc x (disj idx vid)))
                          acc)
                (do (warn "ignoring unknown index signal op:" op)
                    acc))))]
    (f/collect-pure
     (fn [val incoming]
       ;;(debug :old-index val)
       (let [val (transduce tx rf val incoming)]
         ;;(debug :new-index val)
         val)))))

(defn- signal-index-select
  [[idx sel]]
  (if sel
    (if-let [choices (and (map? sel) (::choices sel))]
      (fn [vertex _]
        (let [res (into #{} (mapcat identity) (vals (select-keys* @vertex choices)))]
          [idx (if (seq res) res ::none)]))
      (fn [vertex _]
        [idx (@vertex sel ::none)]))
    (fn [vertex _]
      [idx (if-let [v (vals @vertex)] (into #{} (mapcat identity) v) ::none)])))

(defn default-index-vertex
  [g idx]
  (fn
    [id _ _]
    (IndexVertex.
     id
     (atom {})              ;; index
     (atom nil)             ;; prev
     (atom [])              ;; uncollected
     (atom {})              ;; sig-map
     (atom 0)               ;; new-edges
     (atom {})              ;; outs
     signal-index-select
     (collect-index idx))))

Alias index vertex (fact triples only)

An alias index vertex allows the declaration of equivalent fact items and their interchanged use in queries. For example we might configure such an index with an initial same-as property like this:

(def graph (ff/fact-graph {:index (ff/alias-index-vertex '#{same-as})}))

With the graph configured that way, any fact added which has same-as in predicate (middle) position, will have its subject and object values declared as synonyms/aliases. Example:

;; facts: bob is a person and his homepage
(ff/add-fact! graph '[bob type person])
(ff/add-fact! graph '[bob url "http://bob.com"])

;; declare further that...
(ff/add-fact! graph '[bobby same-as bob])

;; at this point bob & bobby are semantically merged to refer to the
;; same entity and one could query the graph for all facts about
;; bobby, but retrieve the facts about both bob AND bobby
(ff/add-query! graph '[bobby nil nil])

;; => [bob type person] [bob url "http://bob.com"]
Behaviors

The alias merging is implemented on top of a Disjoint set (or Unionfind) datastructure and has the following features and behaviors:

  • Order independence - declaring an equivalence before adding facts produces the same result as declaring it after facts have been added.
  • Indempodent - declaring the same equivalence more than once has no further effect (apart from if the fact expressing the equivalence has been removed meanwhile).
  • Recursive merging - equivalences can be merged into bigger sets, e.g. the 2 facts [a same-as b] [b same-as c] will alias A,B and C. This too works with declaring new properties defining aliases themselves:
;; again, configure graph to use 'same-as' property to indicate aliases
(def graph (ff/fact-graph {:index (ff/alias-index-vertex '#{same-as})}))

;; add fact declaring new alias property
(ff/add-fact! graph '[equiv same-as same-as])

;; now aliases can be expressed using either 'same-as' or 'equiv'
(ff/add-fact! graph '[bob same-as bobby])
(ff/add-fact! graph '[bb equiv bob])

In query patterns, aliases are not resolved for qvars. I.e. using the above graph/facts as example, the query pattern [bobby ?pred ?obj] will match the facts about subject bob and bobby, but the pattern [?subj url "http://bob.com"] will always only bind qvar ?subj to bob, but not relate any of bob’s aliases (only the original facts are matched). That also means the following query (with filter) will only produce an empty result set:

{:where [[?subj url "http://bob.com"]] :filter (= ?subj "bobby")}

(Queries are described in more detail in the next section of this document.)

Implementation
(defn collect-alias-index
  [g props idx]
  (let [props (atom (set props))
        rf    (fn [[index aliases :as acc] [op vid t]]
                (let [[s p o] t
                      x (nth t idx)

                      [index aliases]
                      (case op
                        :add    [(assoc index x (conj (or (index x) #{}) vid))
                                 (uf/register aliases x)]
                        :remove (if-let [idx (index x)]
                                  (if (= #{vid} idx)
                                    (let [index (dissoc index x)
                                          comp  (disj (uf/component aliases x) x)]
                                      [index
                                       (if (some index comp)
                                         aliases
                                         (uf/unregister aliases x))])
                                    [(assoc index x (disj idx vid))
                                     aliases])
                                  acc)
                        (do (warn "ignoring unknown index signal op:" op)
                            acc))

                      aliases
                      (if (@props p)
                        (case op
                          :add    (let [_ (debug idx :unify s o :pred p)
                                        aliases (uf/union aliases s o)
                                        facts'  (cond
                                                  (@props o)
                                                  (do
                                                    (swap! props conj s)
                                                    (into #{} (filter #(= s (nth % 1))) (facts g)))

                                                  (@props s)
                                                  (do
                                                    (swap! props conj o)
                                                    (into #{} (filter #(= o (nth % 1))) (facts g)))

                                                  :else nil)]
                                    (debug idx :unify-facts facts')
                                    (reduce
                                     (fn [acc [s _ o]]
                                       (debug idx :unify s o)
                                       (uf/union acc s o))
                                     aliases facts'))
                          :remove (-> aliases
                                      (uf/unregister s)
                                      (uf/unregister o)
                                      (uf/register s)
                                      (uf/register o))
                          aliases)
                        aliases)]
                  [index aliases]))]
    (f/collect-pure
     (fn [val incoming]
       ;;(debug idx :old-index val)
       (let [val (reduce rf val incoming)]
         ;;(debug idx :new-index val)
         val)))))

(defn signal-alias-index-select
  [[idx sel]]
  (if sel
    (if-let [choices (and (map? sel) (::choices sel))]
      (fn [vertex _]
        (let [[index aliases] @vertex
              choices (into #{} (mapcat #(uf/component aliases %)) choices)
              _ (info :index-sel idx :sel sel :choices choices)
              res (into #{} (mapcat identity) (vals (select-keys* index choices)))]
          [idx (if (seq res) res ::none)]))
      (fn [vertex _]
        (let [[index aliases] @vertex
              choices (uf/component aliases sel)
              _ (info :index-sel idx :sel sel :choices choices)
              res (into #{} (mapcat identity) (vals (select-keys* index choices)))]
          [idx (if (seq res) res ::none)])))
    (fn [vertex _]
      [idx (if-let [v (vals (first @vertex))] (into #{} (mapcat identity) v) ::none)])))

(defn alias-index-vertex
  [alias-props]
  (fn [g idx]
    (fn [id _ _]
      (IndexVertex.
       id
       (atom [{} (uf/disjoint-set)])
       (atom nil)               ;; prev
       (atom [])                ;; uncollected
       (atom {})                ;; sig-map
       (atom 0)                 ;; new-edges
       (atom {})                ;; outs
       signal-alias-index-select
       (collect-alias-index g alias-props idx)))))

Queries

Overview

Fact queries are by far the major key operation and functional aspect provided by this module. In fact, the FactGraph and vertices setup discussed above are really just means to an end - queries.

Queries allow us to treat (parts of) a FactGraph like a database to answer questions using the facts stored, or even manipulate the sets of facts by inferring new facts (or contradictions) and add/remove them to/fron the graph. Automatically.

Queries are defined declaratively and attached to the graph as (potentially nested trees, or at the very least, as vertex chains). That approach allows for the caching of intermediate results and their reuse by other queries, thus minimizing work effort.

Queries remain part of the graph until removed again and take part in every execution cycle of an attached execution context.

Highlevel query declaration

The query operations provided in this namespace are considered the low-level raw building blocks for constructing query trees and are therefore meant for advanced users. The DSL namespace provides a much more userfriendly and concise approach to define complex queries using a fully-featured, extensible, hashmap based DSL and expressions. These query specs are then compiled into query trees using the lowlevel forms provided here (but also offer many more additional features).

Example

This diagram illustrates how queries are generally implemented via multiple vertices in the graph, though we here use a graph with triples only (SPO facts). There’re two queries here:

  • Q1 : All “friend” facts
  • Q2 : All facts about subject “alice”
  • Row 1 : 4 facts stored in the graph
  • Row 2 : S, P, O index vertices
  • Row 3 : Index selection vertices (the blue vertex is re-used by both queries)
  • Row 4 : Query accumulators (combine intersection of index selection sets)
  • Row 5 : Query results (basic queries)
  • Row 6 : Query results (parametric queries)

The vertex values of the result row differ based on the query type used. In this example Q1 is a basic fact query and Q2 a parametric query with variables. The result set of a basic query contains the matching facts. The result set of parametric queries contains unique maps of variable bindings.

../../assets/query-example01.png

In code form this same graph could be constructed like this:

(require '[thi.ng.fabric.core :as f])
(require '[thi.ng.fabric.facts.core :as ff])

(def g (ff/fact-graph))
(ff/add-fact! g '[alice friend bob])
(ff/add-fact! g '[bob friend charlie])
(ff/add-fact! g '[alice friend dora])
(ff/add-fact! g '[alice email "[email protected]"])

(def friends (ff/add-query! g '[nil friend nil] {}))
(def alice   (ff/add-param-query! g '[alice ?p ?o] {}))

(f/execute! (f/sync-execution-context {:graph g}))

@friends
;; #{[alice friend dora] [bob friend charlie] [alice friend bob]}
@alice
;; #{{?p email, ?o "[email protected]"} {?p friend, ?o bob} {?p friend, ?o dora}}

Index selection

(defn index-selection
  [g sel]
  (if-let [sel' (cached g ::index-sel sel)]
    (do (debug :reuse-index-sel sel) sel')
    (let [index  (@(fact-indices g) (first sel))
          vertex (f/add-vertex!
                  g nil
                  {::f/score-signal-fn f/score-signal-with-new-edges
                   ::f/collect-fn      (f/collect-pure (fn [_ in] (peek in)))})
          isel   (reify
                   IQueryResult
                   (pre-result-vertex
                     [_] nil)
                   (result-vertex
                     [_] vertex)
                   f/IGraphComponent
                   (component-type
                     [_] :index-sel)
                   (add-to-graph!
                     [_ g] _)
                   (remove-from-graph!
                     [_ g] (f/remove-from-graph! _ g nil))
                   (remove-from-graph!
                     [_ g parent]
                     (if (f/none-or-single-user? vertex parent)
                       (do (debug "removing :index-sel" sel)
                           (expire! g ::index-sel sel)
                           (f/disconnect-neighbor! index vertex)
                           (f/remove-vertex! g vertex)
                           true)
                       (do (f/disconnect-neighbor! vertex parent)
                           false))))]
      (f/add-edge! g index vertex ((index-signal-handler index) sel) nil)
      (cache! g ::index-sel sel isel))))

(defn index-sel-choice
  [coll] {::choices (set coll)})

(defn make-index-selections
  [g pattern]
  (into [] (map-indexed #(index-selection g [% %2])) pattern))

Basic pattern query

A basic fact query is used to match a single or multiple facts without variable bindings:

  • ['alice 'friend 'bob] only matches the stated fact
  • [nil 'friend 'bob] matches all facts which have predicate 'friend and object 'bob

In general, nil is used as wildcard to match any S, P or O position. Therefore, the query pattern [nil nil nil] matches all facts in a triple graph (or [nil nil nil nil] for quads).

(defn add-query!
  ([g pattern opts]
   (add-query! g (fact-transform g) pattern opts))
  ([g ptx pattern opts]
   (let [pattern (transform ptx pattern)]
     (if-let [q (and (empty? opts) (cached g ::queries pattern))]
       (do (debug :reuse-basic-query pattern) q)
       (let [[s p o] pattern
             sels (make-index-selections g pattern)
             acc  (f/add-vertex!
                   g {} {::f/collect-fn collect-select})
             res  (f/add-vertex!
                   g nil
                   {::f/collect-fn       (collect-basic-query-results g opts)
                    ::f/score-signal-fn  f/score-signal-with-new-edges
                    ::f/score-collect-fn (score-collect-min-signal-vals (count sels))})
             q    (reify
                    #?@(:clj
                         [clojure.lang.IDeref (deref [_] @res)]
                         :cljs
                         [IDeref (-deref [_] @res)])
                    IFactQuery
                    (raw-pattern
                      [_] pattern)
                    IQueryResult
                    (pre-result-vertex
                      [_] acc)
                    (result-vertex
                      [_] res)
                    f/IGraphComponent
                    (component-type
                      [_] :basic-query)
                    (add-to-graph!
                      [_ g] (err/unsupported!))
                    (remove-from-graph!
                      [_ g] (f/remove-from-graph! _ g nil))
                    (remove-from-graph!
                      [_ g parent]
                      (if (f/none-or-single-user? res parent)
                        (do (debug "removing :basic-query" pattern)
                            (expire! g ::queries pattern)
                            (f/remove-vertex! g res)
                            (f/remove-vertex! g acc)
                            (run! #(f/remove-from-graph! % g acc) sels)
                            true)
                        (do (f/disconnect-neighbor! res parent)
                            false))))]
         (run! #(f/add-edge! g (result-vertex %) acc f/signal-forward nil) sels)
         (f/add-edge! g acc res f/signal-forward nil)
         (if-not (seq opts)
           (cache! g ::queries pattern q)
           q))))))

Parametric query

This type of query supports binding fact items to query variables. Instead of fact triples or quads, the result set of this query contains maps of unique variable bindings.

E.g. given the query pattern [?s :type ?t] and two matching facts of [:fabric :type :project] and [:alice :type :person], the result set is: #{{?s :fabric ?t :project} {?s :alice ?t :person}}.

Use nil for a pattern position, if it should catch all values at this position, but not be bound to a variable. Example: [?s :type nil] will catch any facts with :type as predicate, but the result set only contains unique bindings of query variable ?s.

(defn add-param-query!
  ([g pattern opts]
   (add-param-query! g (fact-transform g) pattern opts))
  ([g ptx pattern opts]
   (let [pattern (transform ptx pattern)]
     (if-let [q (and (empty? opts) (cached g ::queries pattern))]
       (do (debug :reuse-param-query pattern) q)
       (let [pattern' (inject-value-choices pattern (:values opts))
             qvars?   (mapv qvar? pattern)
             raw      (mapv #(if-not (qvar? %) %) pattern')
             vmap     (bind-translator pattern)
             verify   (fact-verifier qvars? pattern)
             tx       (if verify
                        (map #(if (verify %) (vmap %)))
                        (map vmap))
             tx       (comp tx (filter identity))
             tx       (non-grouping-result-transducer opts tx)
             coll-fn  (f/collect-pure
                       (fn [_ incoming]
                         (if-let [res (seq (peek incoming))]
                           (into #{} tx res)
                           #{})))
             sub-q    (add-query! g identity-transform raw {})
             sub-res  (result-vertex sub-q)
             result   (f/add-vertex!
                       g nil
                       {::f/collect-fn      coll-fn
                        ::f/score-signal-fn f/score-signal-with-new-edges})
             pq       (reify
                        #?@(:clj
                             [clojure.lang.IDeref (deref [_] @result)]
                             :cljs
                             [IDeref (-deref [_] @result)])
                        IFactQuery
                        (raw-pattern
                          [_] raw)
                        IQueryResult
                        (pre-result-vertex
                          [_] sub-res)
                        (result-vertex
                          [_] result)
                        f/IGraphComponent
                        (component-type
                          [_] :param-query)
                        (add-to-graph!
                          [_ g] (err/unsupported!))
                        (remove-from-graph!
                          [_ g] (f/remove-from-graph! _ g nil))
                        (remove-from-graph!
                          [_ g parent]
                          (if (f/none-or-single-user? result parent)
                            (do (debug "removing :param-query" pattern)
                                (expire! g ::queries pattern)
                                (f/remove-vertex! g result)
                                (f/remove-from-graph! sub-q g result)
                                true)
                            (do (f/disconnect-neighbor! result parent)
                                false))))]
         (f/add-edge! g sub-res result f/signal-forward nil)
         (if-not (seq opts)
           (cache! g ::queries pattern pq)
           pq))))))

Query join

(defn add-join!
  ([g lhs rhs opts]
   (add-join! g join lhs rhs opts))
  ([g join-fn lhs rhs opts]
   (let [lhs-v  (result-vertex lhs)
         rhs-v  (result-vertex rhs)
         lhs-id (f/vertex-id lhs-v)
         rhs-id (f/vertex-id rhs-v)
         _      (debug :add-join lhs-id rhs-id opts)
         tx     (non-grouping-result-transducer opts nil)
         cfn    (if tx #(into #{} tx (join-fn % %2)) join-fn)
         res    (f/add-vertex!
                 g nil
                 {::f/collect-fn
                  (fn [vertex]
                    (let [sig-map (f/signal-map vertex)
                          a (sig-map lhs-id)
                          b (sig-map rhs-id)]
                      ;;(debug (f/vertex-id vertex) :join-sets a b)
                      (f/set-value! vertex (cfn a b))))
                  ::f/score-collect-fn score-collect-join
                  ::f/score-signal-fn  f/score-signal-with-new-edges})
         jq     (as-query-entity res nil [lhs rhs] :join-query)]
     (f/add-edge! g lhs-v res f/signal-forward nil)
     (f/add-edge! g rhs-v res f/signal-forward nil)
     jq)))

(defn add-query-join!
  ([g patterns opts]
   (add-query-join! g (fact-transform g) patterns opts))
  ([g ptx patterns opts]
   (let [[a b & more :as p] patterns ;;(sort-patterns patterns)
         _     (assert (and a b) "Requires min. 2 query patterns")
         opts' (select-keys* opts [:values])
         jq    (reduce
                #(add-join! g join % (add-param-query! g ptx %2 opts') {})
                (add-join!
                 g join
                 (add-param-query! g ptx a opts')
                 (add-param-query! g ptx b opts')
                 (if (seq more) {} opts))
                (butlast more))]
     (if-let [p (last more)]
       (add-join! g join jq (add-param-query! g ptx p {}) opts)
       jq))))

(defn add-query-join-optional!
  ([g patterns opts]
   (add-query-join-optional! g (fact-transform g) patterns opts))
  ([g ptx patterns opts]
   (let [[a b & more] patterns
         _     (assert (and a b) "Requires min. 2 query patterns")
         opts' (select-keys* opts [:values])
         jq    (reduce
                #(add-join! g join-optional % (add-param-query! g ptx %2 opts') {})
                (add-join!
                 g join-optional
                 (add-param-query! g ptx a opts')
                 (add-param-query! g ptx b opts')
                 (if (seq more) {} opts))
                (butlast more))]
     (if-let [p (last more)]
       (add-join! g join-optional jq (add-param-query! g ptx p {}) opts)
       jq))))

Query union

(defn add-query-union!
  [g queries opts]
  (assert (< 1 (count queries)) "min. 2 queries required")
  (let [tx  (non-grouping-result-transducer opts nil)
        res (f/add-vertex!
             g nil
             {::f/collect-fn
              (fn [vertex]
                (let [subs (vals (f/signal-map vertex))
                      res  (reduce #(if (seq %2) (into % %2) %) subs)]
                  (f/set-value! vertex (if tx (into #{} tx res) res))))
              ::f/score-signal-fn f/score-signal-with-new-edges})
        q   (as-query-entity res nil queries :union-query)]
    (run! #(f/add-edge! g (result-vertex %) res f/signal-forward nil) queries)
    q))

Query negation

(defn add-query-negation!
  [g lhs rhs opts]
  (let [lhs-v  (result-vertex lhs)
        rhs-v  (result-vertex rhs)
        lhs-id (f/vertex-id lhs-v)
        rhs-id (f/vertex-id rhs-v)
        tx     (non-grouping-result-transducer opts nil)
        cfn    (if tx #(into #{} tx %) identity)
        res    (f/add-vertex!
                g nil
                {::f/collect-fn
                 (fn [vertex]
                   (let [sig-map (f/signal-map vertex)
                         a       (sig-map lhs-id)
                         b       (sig-map rhs-id)
                         res     (join-negative a b)]
                     (f/set-value! vertex (cfn res))))
                 ::f/score-signal-fn f/score-signal-with-new-edges})
        q   (as-query-entity res nil [lhs rhs] :negation-query)]
    (f/add-edge! g lhs-v res f/signal-forward nil)
    (f/add-edge! g rhs-v res f/signal-forward nil)
    q))

Path queries

A property path is a possible route through a graph between two nodes. The most trivial case, a single fact, has a path length of 1. The ends of the path may be constants or variables. Variables can not be used as part of the path (predicates) itself, only at the ends.

Path queries allow for more concise expressions for some graph patterns and they also add the ability to match connectivity of subjects/objects by an arbitrary length path.

Note: Currently, path queries are only supported for triples (fact len = 3) and only bounded path queries are implemented, meaning both min/max path lengths MUST be specified.

Bounded path query

;; match path (of single pred) between depth 1 and 3
(add-path-query! g '[?s [pred] ?o] {:min 1 :max 3})

;; match path (cyclically repeated) between 2 hops 4
(add-path-query! g '[?s [pred1 pred2] ?o] {:min 2 :max 4})

;; match given path
(add-path-query! g '[?s [p1 p2 p3] ?o])

;; match *any* paths starting at "node-a" with len 2 - 4
(add-path-query! g '[node-a [nil] ?o] {:min 1 :max 4})

;; check if path exists between fixed subject/object
;; this query has no variables, see note below
(add-path-query! g '[node-a [pred] node-b] {:min 3 :max 3})
(defn add-path-query!
  ([g path-pattern]
   (add-path-query! g path-pattern {}))
  ([g path-pattern opts]
   (add-path-query! g (fact-transform g) path-pattern opts))
  ([g ptx path-pattern opts]
   (let [len (count (nth path-pattern 1))
         {:keys [min max] :or {min len max len}} opts]
     (assert (== 3 (count path-pattern)) "path queries only supported for triple patterns")
     (assert (pos? min) "min depth must be >= 1")
     (assert (<= min max) "min depth must be <= max depth")
     (let [[patterns avars] (resolve-path-pattern path-pattern max)
           [?s _ ?o] path-pattern
           vs? (qvar? ?s)
           vo? (qvar? ?o)
           req (take min patterns)
           opt (drop min (take max patterns))
           req (if (seq req)
                 (if (== 1 (count req))
                   (add-param-query! g ptx (first req) {})
                   (add-query-join! g ptx req {})))
           opt (if (seq opt)
                 (if (== 1 (count opt))
                   (add-param-query! g ptx (first opt) {})
                   (add-query-join-optional! g ptx opt {})))
           q   (if (and req opt)
                 (add-join! g join-optional req opt {}) ;; TODO opts
                 (or req opt))
           qr  (result-vertex q)
           tx  (cond
                 (or (= min max) (and vs? (not vo?)))
                 (let [qv (filter qvar? path-pattern)]
                   (map #(select-keys* % qv)))

                 (and vo? (not vs?))
                 (let [rv (take (dec min) avars)]
                   (mapcat #(map (fn [v] {?o v}) (vals (apply dissoc % rv)))))

                 :else
                 (let [rv (cons ?s (take (dec min) avars))]
                   (mapcat
                    #(let [s (% ?s)] (map (fn [v] {?s s ?o v}) (vals (apply dissoc % rv)))))))
           tx  (non-grouping-result-transducer opts tx)
           res (f/add-vertex!
                g #{}
                {::f/collect-fn (f/collect-pure (fn [_ in] (into #{} tx (peek in))))})
           pq  (as-query-entity res qr [q] :path-query)]
       (f/add-edge! g qr res f/signal-forward nil)
       pq))))

Query results

See the query specification section in the DSL namespace for all available options…

(defn make-query-result
  [{:keys [aggregate order group-by] :as spec}]
  (let [in-tx (if (and aggregate (not group-by))
                (result-pre-aggregator aggregate))
        in-tx (if order
                (if in-tx
                  #(sort-by order (in-tx %))
                  #(sort-by order %))
                in-tx)
        tx    (or (result-transducer spec nil) (map identity))
        cfn   (if group-by
                (let [gfn (transient-group-by-reducer spec group-by)]
                  (if in-tx
                    (fn [in] (postprocess-group-by (transduce tx gfn (in-tx in))))
                    (fn [in] (postprocess-group-by (transduce tx gfn in)))))
                (if in-tx
                  (fn [in] (into (if order [] #{}) tx (in-tx in)))
                  (fn [in] (into #{} tx in))))]
    (if (and aggregate group-by)
      (result-post-aggregator aggregate cfn spec)
      cfn)))

(defn add-query-result!
  [g spec q]
  (let [cfn  (make-query-result spec)
        res  (f/add-vertex!
              g nil
              {::f/collect-fn      (f/collect-pure (fn [_ in] (cfn (peek in))))
               ::f/score-signal-fn f/score-signal-with-new-edges})
        qraw (result-vertex q)
        qres (as-query-entity res qraw [q] :query-result)]
    (f/add-edge! g qraw res f/signal-forward nil)
    qres))

Query helpers

(def ^:dynamic *auto-qvar-prefix* "?__q")

(defn qvar?
  "Returns true, if x is a qvar (a symbol prefixed with '?')"
  [x] (and (symbol? x) (= \? (.charAt ^String (name x) 0))))

(defn auto-qvar?
  "Returns true, if x is an auto-generated qvar (a symbol prefixed
    with *auto-qvar-prefix*)"
  [x] (and (symbol? x) (zero? (.indexOf ^String (name x) ^String *auto-qvar-prefix*))))

(defn auto-qvar
  "Creates a new auto-named qvar (symbol)."
  [] (gensym *auto-qvar-prefix*))

(defn qvar-name
  [x] (-> x name (subs 1)))

(defn resolve-path-pattern
  "Takes a path triple pattern and max depth. The pattern's predicate
    must be a seq of preds. Returns a 2-elem vector [patterns vars],
    where `patterns` is a seq of query patterns with injected temp qvars
    for inbetween patterns and `vars` the autogenerated qvars
    themselves.

    Example:

      [?s [p1 p2 p3] ?o]
      => [([?s p1 ?__q0] [?__q0 p2 ?__q1] [?__q1 p3 ?o]) (?__q0 ?__q1)]"
  [[s p o] maxd]
  (let [avars (repeatedly maxd auto-qvar)
        vars  (cons s avars)]
    [(->> (concat (interleave vars (take maxd (cycle p))) [o])
          (partition 3 2))
     avars]))

(defn inject-value-choices
  [pattern value-map]
  (if (seq value-map)
    (mapv #(if-let [choices (value-map %)] (index-sel-choice choices) %) pattern)
    pattern))

(defn pattern-var-count
  [pattern]
  (count (filter qvar? pattern)))

(defn sort-patterns
  [patterns]
  (sort-by pattern-var-count patterns))

(defn unique-bindings?
  "Returns true if all value-map in the given map are unique, i.e.
    no two keys are mapped to the same value."
  [res] (== (count (into #{} (vals res))) (count res)))

(defn select-keys*
  "Like c.c/select-keys, but doesn't retain map's meta"
  {:static true}
  [map keyseq]
  (loop [ret {} keys (seq keyseq)]
    (if keys
      #?(:clj
         (let [entry (. clojure.lang.RT (find map (first keys)))]
           (recur
            (if entry (conj ret entry) ret)
            (next keys)))
         :cljs
         (let [key   (first keys)
               entry (get map key ::not-found)]
           (recur
            (if (= entry ::not-found) ret (assoc ret key entry))
            (next keys))))
      ret)))

(defn index*
  "Like clojure.set/index, but using select-keys w/o metadata retention."
  [xrel ks]
  (persistent!
   (reduce
    (fn [m x]
      (let [ik (select-keys* x ks)]
        (assoc! m ik (conj (get m ik #{}) x))))
    (transient {}) xrel)))

(defn join
  "Based on clojure.set/join. Does not join when there're no shared
    keys and no key mapping, enforced result set limit, uses
    transients."
  [xrel yrel]
  (if (and (seq xrel) (seq yrel))
    (let [ks (set/intersection (set (keys (first xrel))) (set (keys (first yrel))))]
      (if (seq ks)
        (let [[r s] (if (<= (count xrel) (count yrel))
                      [xrel yrel]
                      [yrel xrel])
              idx (index* r ks)]
          (persistent!
           (reduce
            (fn [ret x]
              (let [found (idx (select-keys* x ks))]
                (if found
                  (reduce #(conj! % (conj %2 x)) ret found)
                  ret)))
            (transient #{}) s)))
        #{}))
    #{}))

(defn join-optional
  [a b]
  (loop [old (transient #{}), new (transient #{}), kb b]
    (if kb
      (let [kb'       [(first kb)]
            [old new] (loop [old old, new new, ka a]
                        (if ka
                          (let [ka' (first ka)
                                j   (first (join [ka'] kb'))]
                            (if j
                              (recur (conj! old ka') (conj! new j) (next ka))
                              (recur old new (next ka))))
                          [old new]))]
        (recur old new (next kb)))
      (let [new (persistent! new)]
        (if (seq new)
          (into (apply disj (set a) (persistent! old)) new)
          a)))))

(defn joinable?
  ([a b]
   (joinable? a b (set (keys a)) (set (keys b))))
  ([a b aks bks]
   (let [ks (set/intersection aks bks)]
     (and (seq ks) (= (select-keys* a ks) (select-keys* b ks))))))

(defn join-negative
  [a b]
  (if (and (seq a) (seq b))
    (reduce
     (fn [res bres]
       (let [bks (set (keys bres))]
         (persistent!
          (reduce
           (fn [res ares]
             (if (joinable? ares bres (set (keys ares)) bks)
               (disj! res ares)
               res))
           (transient res) res))))
     a b)
    a))

(defn result-pre-aggregator
  [agg]
  (fn [results]
    (let [ares (agg results)]
      (map #(merge % ares) results))))

(defn result-post-aggregator
  [agg cfn spec]
  (let [sel (:select spec)
        rfn (if (or (nil? sel) (= :* sel))
              (let [sel (if (sequential? sel) sel [sel])]
                (fn [acc k res]
                  (let [ares (agg res)]
                    (assoc acc k (mapv #(merge % ares) res)))))
              (let [avars (set (keys (:aggregate* spec)))]
                (if (and (seq avars) (every? avars sel))
                  (fn [acc k res]
                    (assoc acc k (select-keys (merge (first res) (agg res)) sel)))
                  (fn [acc k res]
                    (let [ares (agg res)]
                      (assoc acc k (into (empty res) (map #(select-keys (merge % ares) sel)) res)))))))]
    (fn [in] (reduce-kv rfn {} (cfn in)))))

(defn result-transducer
  [spec tx]
  (let [tx (if-let [bnd (:bind spec)]
             (let [tx' (map bnd)]
               (if tx (comp tx tx') tx'))
             tx)
        tx (if (:unique spec)
             (let [tx' (filter unique-bindings?)]
               (if tx (comp tx tx') tx'))
             tx)
        tx (if-let [flt (:filter spec)]
             (let [tx' (filter flt)]
               (if tx (comp tx tx') tx'))
             tx)
        tx (if-let [lim (:limit spec)]
             (let [tx' (take lim)]
               (if tx (comp tx tx') tx'))
             tx)
        tx (if-let [sel (and (not (:group-by spec)) (:select spec))]
             (if-not (= :* sel)
               (let [sel (if (sequential? sel) sel [sel])
                     tx' (comp (map #(select-keys % sel)) (filter seq))
                     agg (set (keys (:aggregate* spec)))
                     tx' (if (and (seq agg) (every? agg sel))
                           (comp tx' (take 1))
                           tx')]
                 (if tx (comp tx tx') tx'))
               tx)
             tx)]
    tx))

(defn non-grouping-result-transducer
  [spec tx] (result-transducer (dissoc spec :group-by :aggregate*) tx))

(defn transient-group-by-reducer
  [{:keys [aggregate order select]} grp]
  (let [ctype (if order [] #{})]
    (if-not (or aggregate (nil? select) (= :* select))
      (let [sel (if (sequential? select) select [select])]
        (fn
          ([] (transient {}))
          ([res] res)
          ([res input]
           (let [k (grp input)]
             (assoc! res k (conj! (get res k (transient ctype)) (select-keys* input sel)))))))
      (fn
        ([] (transient {}))
        ([res] res)
        ([res input]
         (let [k (grp input)]
           (assoc! res k (conj! (get res k (transient ctype)) input))))))))

(defn postprocess-group-by
  [res]
  (reduce-kv
   (fn [acc k v] (assoc acc k (persistent! v)))
   {} (persistent! res)))

(defn accumulate-result-vars
  [results]
  (->> results
       (reduce
        (fn [acc res]
          (reduce-kv
           (fn [acc k v]
             (assoc! acc k (conj! (get acc k (transient #{})) v)))
           acc res))
        (transient {}))
       (postprocess-group-by)))

(defn remove-recursively
  [g res parent deps type]
  (if (f/none-or-single-user? res parent)
    (do (debug "removing" type)
        (f/remove-vertex! g res)
        (run! #(f/remove-from-graph! % g res) deps)
        true)
    (do (f/disconnect-neighbor! res parent)
        false)))

(defn as-query-entity
  [res pre-res deps type]
  (reify
    #?@(:clj
         [clojure.lang.IDeref (deref [_] @res)]
         :cljs
         [IDeref (-deref [_] @res)])
    IFactQuery
    (raw-pattern
      [_] nil)
    IQueryResult
    (pre-result-vertex
      [_] pre-res)
    (result-vertex
      [_] res)
    f/IGraphComponent
    (component-type
      [_] type)
    (add-to-graph!
      [_ g] (err/unsupported!))
    (remove-from-graph!
      [_ g] (f/remove-from-graph! _ g nil))
    (remove-from-graph!
      [_ g parent]
      (remove-recursively g res parent deps type))))

Variable binding result mapping

This function (and fact-verifier below) are used as part of the mechanism allowing the graph to support facts of varying lengths and queries over them. Currently, this module only supports fact triples (len=3) or quads (len=4).

bind-translator is used to provide optimized functions to bind query variables to their fact components in the result set of a parametric query.

(defn bind-translator
  [pattern]
  (let [vmap (reduce-kv
              (fn [acc k v]
                (if (and (qvar? k) (not (acc k)))
                  (assoc acc k #(% v))
                  acc))
              {} (zipmap pattern (range)))]
    (case (count vmap)
      0 (fn [r] {})
      1 (let [[[a av]] (seq vmap)] (fn [r] {a (av r)}))
      2 (let [[[a av] [b bv]] (seq vmap)] (fn [r] {a (av r) b (bv r)}))
      3 (let [[[a av] [b bv] [c cv]] (seq vmap)] (fn [r] {a (av r) b (bv r) c (cv r)}))
      (fn [r] (reduce-kv (fn [acc k f] (assoc acc k (f r))) {} vmap)))))

Fact verifier multi-method

The fact verifier function is used to provide optimized validations to ensure query variables are bound to unique values. E.g. given a query pattern ['?a ?b '?a], these functions ensure that all qvars are bound to different values, i.e. it would reject the fact [fact is fiction] (yes, pretty meaningless..) because ?a wouldn’t be bound to the same values (first ?a binds to “fact”, second ?a to “fiction”)…

((fact-verifier [true true true] '[?a ?b ?a]) '[fact is fiction])
;; false
((fact-verifier [true true true] '[?a ?b ?a]) '[fact is fact])
;; true
(defmulti fact-verifier
  (fn [qvars? pattern] (count pattern)))

(defmethod fact-verifier 3
  [[vs? vp? vo?] [s p o]]
  (cond
    (and vs? vp? vo?) (cond
                        (= s p o) #(= (% 0) (% 1) (% 2))
                        (= s p) #(and (= (% 0) (% 1)) (not= (% 0) (% 2)))
                        (= s o) #(and (= (% 0) (% 2)) (not= (% 0) (% 1)))
                        (= p o) #(and (= (% 1) (% 2)) (not= (% 0) (% 1)))
                        :else nil)
    (and vs? vp?)     (if (= s p) #(= (% 0) (% 1)) #(not= (% 0) (% 1)))
    (and vs? vo?)     (if (= s o) #(= (% 0) (% 2)) #(not= (% 0) (% 2)))
    (and vp? vo?)     (if (= p o) #(= (% 1) (% 2)) #(not= (% 1) (% 2)))
    :else             nil))

(defmethod fact-verifier 4
  [[vt? vs? vp? vo?] [t s p o]]
  (cond
    (and vt? vs? vp? vo?)
    (cond
      (= t s p o)           #(= (% 0) (% 1) (% 2) (% 3))
      (= t s p)             #(and (not= (% 0) (% 3)) (= (% 0) (% 1) (% 2)))
      (= t s o)             #(and (not= (% 0) (% 2)) (= (% 0) (% 1) (% 3)))
      (= t p o)             #(and (not= (% 0) (% 1)) (= (% 0) (% 2) (% 3)))
      (= s p o)             #(and (not= (% 0) (% 1)) (= (% 1) (% 2) (% 3)))
      (and (= t s) (= p o)) #(and (= (% 0) (% 1)) (= (% 2) (% 3)))
      (and (= t p) (= s o)) #(and (= (% 0) (% 2)) (= (% 1) (% 3)))
      (and (= t o) (= s p)) #(and (= (% 0) (% 3)) (= (% 1) (% 2)))
      (= t s)               #(let [t (first %)] (and (= t (% 1)) (not= t (% 2)) (not= t (% 3))))
      (= t p)               #(let [t (first %)] (and (= t (% 2)) (not= t (% 1)) (not= t (% 3))))
      (= t o)               #(let [o (peek %)]  (and (= o (% 0)) (not= o (% 1)) (not= o (% 2))))
      (= s p)               #(let [s (nth % 1)] (and (= s (% 2)) (not= s (% 3)) (not= s (% 0))))
      (= s o)               #(let [o (peek %)]  (and (= o (% 1)) (not= o (% 2)) (not= o (% 0))))
      (= p o)               #(let [o (peek %)]  (and (= o (% 2)) (not= o (% 1)) (not= o (% 0))))
      :else                 nil)
    (and vt? vs? vp?)
    (cond
      (= t s p)             #(= (% 0) (% 1) (% 2))
      (= t s)               #(and (= (% 0) (% 1)) (not= (% 0) (% 2)))
      (= t p)               #(and (= (% 0) (% 2)) (not= (% 0) (% 1)))
      (= s p)               #(and (= (% 1) (% 2)) (not= (% 1) (% 0)))
      :else                 nil)
    (and vt? vs? vo?)
    (cond
      (= t s o)             #(= (% 0) (% 1) (% 3))
      (= t s)               #(and (= (% 0) (% 1)) (not= (% 0) (% 3)))
      (= t o)               #(and (= (% 0) (% 3)) (not= (% 0) (% 1)))
      (= s o)               #(and (= (% 1) (% 3)) (not= (% 1) (% 0)))
      :else                 nil)
    (and vt? vp? vo?)
    (cond
      (= t p o)             #(= (% 0) (% 2) (% 3))
      (= t p)               #(and (= (% 0) (% 2)) (not= (% 0) (% 3)))
      (= t o)               #(and (= (% 0) (% 3)) (not= (% 0) (% 1)))
      (= p o)               #(and (= (% 2) (% 3)) (not= (% 2) (% 0)))
      :else                 nil)
    (and vs? vp? vo?)
    (cond
      (= s p o)             #(= (% 1) (% 2) (% 3))
      (= s p)               #(and (= (% 1) (% 2)) (not= (% 1) (% 3)))
      (= s o)               #(and (= (% 1) (% 3)) (not= (% 1) (% 2)))
      (= p o)               #(and (= (% 2) (% 3)) (not= (% 2) (% 1)))
      :else                 nil)
    (and vt? vs?)           (if (= t s) #(= (% 0) (% 1)) #(not= (% 0) (% 1)))
    (and vt? vp?)           (if (= t p) #(= (% 0) (% 2)) #(not= (% 0) (% 2)))
    (and vt? vo?)           (if (= t o) #(= (% 0) (% 3)) #(not= (% 0) (% 3)))
    (and vs? vp?)           (if (= s p) #(= (% 1) (% 2)) #(not= (% 1) (% 2)))
    (and vs? vo?)           (if (= s o) #(= (% 1) (% 3)) #(not= (% 1) (% 3)))
    (and vp? vo?)           (if (= p o) #(= (% 2) (% 3)) #(not= (% 2) (% 3)))
    :else                   nil))

Rulebased inference

(defn add-rule!
  [g {:keys [id query patterns transform production collect-fn] :as opts}]
  (let [id      (or id (f/random-id))
        coll-fn (or collect-fn (collect-inference g production))
        query   (or query
                    (let [tx (or transform (fact-transform g))
                          q-opts (select-keys opts [:filter :limit :select :unique])]
                      (if (< 1 (count patterns))
                        (add-query-join! g tx patterns q-opts)
                        (add-param-query! g tx (first patterns) q-opts))))
        inf     (f/add-vertex! g #{} {::f/collect-fn coll-fn})
        rule    (reify
                  #?@(:clj
                       [clojure.lang.IDeref (deref [_] @query)]
                       :cljs
                       [IDeref (-deref [_] @query)])
                  IFactQuery
                  (raw-pattern
                    [_] nil)
                  IQueryResult
                  (pre-result-vertex
                    [_] (pre-result-vertex query))
                  (result-vertex
                    [_] (result-vertex query))
                  f/IGraphComponent
                  (component-type
                    [_] :rule)
                  (add-to-graph!
                    [_ g] (err/unsupported!))
                  (remove-from-graph!
                    [_ g] (f/remove-from-graph! _ g nil))
                  (remove-from-graph!
                    [_ g parent]
                    (if (f/none-or-single-user? inf parent)
                      (do (debug "removing :rule" id)
                          (f/remove-vertex! g inf)
                          (f/remove-from-graph! query g inf)
                          true)
                      (do (f/disconnect-neighbor! inf parent)
                          false))))]
    (f/add-edge! g (result-vertex query) inf f/signal-forward nil)
    (cache! g ::rules id rule)))

Signal & collect functions

(defn- signal-fact
  [vertex op] [op (f/vertex-id vertex) @vertex])

(def ^:private collect-select
  (f/collect-pure
   (fn [val incoming]
     (let [val (reduce (fn [acc [idx res]] (assoc acc idx res)) val incoming)]
       ;;(debug :coll-select val incoming)
       val))))

(defn- score-collect-min-signal-vals
  [num]
  (fn [vertex]
    (if (> num (count (vals (peek (f/uncollected-signals vertex))))) 0 1)))

(defn- score-collect-min-signals
  [num]
  (fn [vertex]
    (if (> num (count (f/uncollected-signals vertex))) 0 1)))

(defn- collect-basic-query-results
  [g opts]
  (let [tx  (if (not= false (:untransform opts))
              (let [ftx (fact-transform g)]
                (map #(untransform ftx @(f/vertex-for-id g %))))
              (map #(deref (f/vertex-for-id g %))))
        tx  (if-let [flt (:filter opts)]
              (let [tx' (filter flt)]
                (if tx (comp tx tx') tx'))
              tx)
        tx  (if-let [lim (:limit opts)]
              (let [tx' (take lim)]
                (if tx (comp tx tx') tx'))
              tx)]
    (f/collect-pure
     (fn [_ incoming]
       (let [res (vals (peek incoming))]
         ;;(debug :agg-incoming res)
         (if (every? #(not= ::none %) res)
           (->> res
                (into #{})
                (sort-by count)
                (reduce set/intersection)
                (into #{} tx))
           #{}))))))

(defn- score-collect-join
  [vertex]
  (if (and (seq (f/uncollected-signals vertex))
           (== (count (f/signal-map vertex)) 2))
    1 0))

(defn- collect-inference
  [g production]
  (fn [vertex]
    (let [prev @vertex
          in   (reduce into #{} (f/uncollected-signals vertex))
          adds (set/difference in prev)]
      (debug (f/vertex-id vertex) :additions adds)
      (run! #(production g vertex %) adds)
      (f/update-value! vertex #(set/union % adds)))))

(defn collect-into-set
  [tx] (f/collect-pure (fn [_ in] (into #{} tx (peek in)))))

Graph logging

In addition to the standard events emitted by the default ComputeGraph of the fabric-core module, a FactGraph also emits the events :add-fact or :remove-fact whenever a fact is added or removed. The add-fact-graph-logger function below can be used to subscribe to these (and only these) events and react to them, for example to write a log of fact changes to disk. The fn takes a logging function which is called with a single arg, a 2-element vector of [event fact]. The user function is called from a core.async go-loop setup by add-fact-graph-logger.

(def fact-log-transducer
  (map (fn [[op f]] [({:add-fact :add :remove-fact :remove} op) f])))

(defn add-fact-graph-logger
  ([g log-fn]
   (add-fact-graph-logger g log-fn 1024))
  ([g log-fn buf-size]
   (let [ch        (async/chan buf-size)
         watch-id  (f/random-id)
         log->chan #(go (>! ch %))]
     (async/thread
       (loop []
         (when-let [t (<!! ch)]
           (log-fn t)
           (recur))))
     (f/add-watch! g :add-fact watch-id log->chan)
     (f/add-watch! g :remove-fact watch-id log->chan)
     (info "started fact logger with buffer" buf-size)
     {:graph g :chan ch :watch-id watch-id})))

(defn remove-fact-graph-logger
  [{:keys [graph watch-id chan]}]
  (f/remove-watch! graph :add-fact watch-id)
  (f/remove-watch! graph :remove-fact watch-id)
  (async/close! chan))

Generic helpers

Converting maps to facts / triples

Clojure’s hashmaps provide a natural way to express nested relationships, but cannot be directly used as facts. The function map->facts recursively converts a given hashmap into a lazy-seq of facts (triples).

Conversion rules:

  • toplevel keys are used as subjects and MUST each have a map as value, where their keys are used as predicates and their values as objects.
  • maps in object position are translated into UUID strings in order to allow the nested map to become a subject itself. In RDF terms, this is equivalent to BNodes.
  • sequential types (vectors, lists) in object position cause multiple triples with same subject & predicate to be emitted. If any of the items in the object sequence is a map, it will be treated as BNode as described in previous rule

The following example demonstrates all these rules:

(map->facts
 {"people:toxi"
  {"rdf:type"          "schema:Person"
   "schema:givenName"  "Karsten"
   "schema:familyName" "Schmidt"
   "foaf:nick"         "toxi"
   "foaf:account"      [{"rdf:type" "foaf:OnlineAccount"
                         "foaf:accountServiceHomepage" "https://twitter.com/"
                         "foaf:accountName" "toxi"}
                        {"rdf:type" "foaf:OnlineAccount"
                         "foaf:accountServiceHomepage" "https://github.com/"
                         "foaf:accountName" "postspectacular"}]
   "schema:memberOf"   ["orgs:postspectacular"
                        "orgs:thing"]
   "schema:gender"     "male"
   "schema:jobTitle"   "Computational Designer"}})

;; (["people:toxi" "rdf:type" "schema:Person"]
;;  ["people:toxi" "schema:givenName" "Karsten"]
;;  ["people:toxi" "schema:familyName" "Schmidt"]
;;  ["people:toxi" "foaf:nick" "toxi"]
;;  ["people:toxi" "foaf:account" "b2c1c57e-2967-44fe-8b66-135ec6fa80bf"]
;;  ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "rdf:type" "foaf:OnlineAccount"]
;;  ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "foaf:accountServiceHomepage" "https://twitter.com/"]
;;  ["b2c1c57e-2967-44fe-8b66-135ec6fa80bf" "foaf:accountName" "toxi"]
;;  ["people:toxi" "foaf:account" "4e6fba86-92ec-4422-a8a2-c80150ddb885"]
;;  ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "rdf:type" "foaf:OnlineAccount"]
;;  ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "foaf:accountServiceHomepage" "https://github.com/"]
;;  ["4e6fba86-92ec-4422-a8a2-c80150ddb885" "foaf:accountName" "postspectacular"]
;;  ["people:toxi" "schema:memberOf" "orgs:postspectacular"]
;;  ["people:toxi" "schema:memberOf" "orgs:thing"]
;;  ["people:toxi" "schema:gender" "male"]
;;  ["people:toxi" "schema:jobTitle" "Computational Designer"])
(declare map->facts)

(defn- bnode-facts
  [s p o]
  (let [bn (strf/new-uuid)] (cons [s p bn] (map->facts o bn))))

(defn- map->facts*
  [s]
  (fn [[p o]]
    (cond
      (sequential? o) (mapcat (fn [o] (if (map? o) (bnode-facts s p o) [[s p o]])) o)
      (map? o)        (bnode-facts s p o)
      :else           [[s p o]])))

(defn map->facts
  ([fact-map]
   (mapcat (fn [[s v]] (mapcat (map->facts* s) v)) fact-map))
  ([fact-map s]
   (mapcat (map->facts* s) fact-map)))

Complete namespace definition

(ns thi.ng.fabric.facts.core
  #?@(:clj
      [(:require
        [thi.ng.fabric.core :as f]
        [thi.ng.xerror.core :as err]
        [thi.ng.strf.core :as strf]
        [thi.ng.dstruct.unionfind :as uf]
        [clojure.set :as set]
        [clojure.data.int-map :as imap]
        [clojure.core.async :as async :refer [go go-loop <! <!! >!]]
        [taoensso.timbre :refer [debug info warn]])]
      :cljs
      [(:require-macros
        [cljs.core.async.macros :refer [go go-loop]]
        [cljs-log.core :refer [debug info warn]])
       (:require
        [thi.ng.fabric.core :as f]
        [thi.ng.xerror.core :as err]
        [thi.ng.strf.core :as strf]
        [clojure.set :as set]
        [cljs.core.async :as async :refer [<! <!! >!]])]))

(def ^:private MAX_LIMIT #?(:clj Long/MAX_VALUE :cljs (.-MAX_VALUE js/Number)))

(declare select-keys* qvar? add-query! add-query-join!)
(declare index-selection index-sel-choice make-index-selections)

<<protos>>

<<helpers>>

<<sig-coll>>

<<query-helpers>>

<<fact-v>>

<<index-v>>

<<fact-tx>>

<<graph>>

<<queries>>

<<inference>>

<<logger>>

Questions

maps as facts?

  • maybe more convenient, but not strictly facts
  • requires custom fact transforms & query patterns (also maps)
  • how to deal with adding new facts, merged into original map?
    • just provide map translations: map->facts (use trio triple-seq conversion)
;; example map
{"P-123" {:name "karsten" :nick "toxi" :address {:country "uk", :city "london"}}}

;; query: subjects with name & city
{?n :name ?city [:address :city]}