[WIP] Improve Index Structure
Closes: #2028
alexanderkiel committed Sep 10, 2024
1 parent 7e95761 commit c78bd5b
Showing 109 changed files with 4,596 additions and 548 deletions.
1 change: 1 addition & 0 deletions dev/blaze/dev/rocksdb.clj
Expand Up @@ -21,6 +21,7 @@
(ac/supply-async #(rocksdb/compact-range! (index-kv-store) :resource-as-of-index))

(doseq [index [:search-param-value-index
:type-search-param-token-full-resource-index
:resource-value-index
:compartment-search-param-value-index
:compartment-resource-type-index
Expand Down
139 changes: 130 additions & 9 deletions docs/implementation/database.md
Expand Up @@ -120,14 +120,27 @@ The `SystemStats` index keeps track of the total number of resources, and the nu

The indices not depending on `t` directly point to the resource versions by their content hash.

| Name | Key Parts | Value |
|-------------------------------------|----------------------------------------------------------------|-------|
| SearchParamValueResource | search-param, type, value, id, hash-prefix | - |
| ResourceSearchParamValue | type, id, hash-prefix, search-param, value | - |
| CompartmentSearchParamValueResource | comp-code, comp-id, search-param, type, value, id, hash-prefix | - |
| CompartmentResourceType | comp-code, comp-id, type, id | - |
| SearchParam | code, type | id |
| ActiveSearchParams | id | - |
| Name | Key Parts | Value | Since |
|-------------------------------------------|----------------------------------------------------------------|-------|------:|
| SearchParamValueResource                  | search-param, type, value, id, hash-prefix                     | -     |       |
| ResourceSearchParamValue                  | type, id, hash-prefix, search-param, value                     | -     |       |
| CompartmentSearchParamValueResource       | comp-code, comp-id, search-param, type, value, id, hash-prefix | -     |       |
| CompartmentResourceType                   | comp-code, comp-id, type, id                                   | -     |       |
| TypeSearchParamTokenFullResource | search-param, type, value, system, id, hash-prefix | - | 0.27 |
| TypeSearchParamTokenSystemResource | search-param, type, system, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceCanonicalResource | search-param, type, url, version, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceUrlResource | search-param, type, url, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceLocalResource | search-param, type, ref-id, ref-type, id, hash-prefix | - | 0.27 |
| ResourceSearchParamTokenFull | type, id, hash-prefix, search-param, value, system | - | 0.27 |
| ResourceSearchParamTokenSystem | type, id, hash-prefix, search-param, system | - | 0.27 |
| ResourceSearchParamReferenceCanonical | type, id, hash-prefix, search-param, url, version | - | 0.27 |
| ResourceSearchParamReferenceUrl | type, id, hash-prefix, search-param, url | - | 0.27 |
| ResourceSearchParamReferenceLocal | type, id, hash-prefix, search-param, ref-id, ref-type | - | 0.27 |
| PatientTypeSearchParamTokenFullResource | patient-id, search-param, type, value, system, id, hash-prefix | - | 0.27 |
| SearchParam                               | code, type                                                     | id    |       |
| ActiveSearchParams                        | id                                                             | -     |       |
| SearchParamCode                           | code                                                           | id    |       |
| System                                    | code                                                           | id    |       |

#### SearchParamValueResource

Expand All @@ -137,7 +150,7 @@ The `SearchParamValueResource` index is used to find resources based on search p
* `type` - a 4-byte hash of the resource type
* `value` - the encoded value of the resource reachable by the search parameter's FHIRPath expression. The encoding depends on the search parameter's type.
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

How the `SearchParamValueResource` index is used depends on the type of the search parameter. The following sections explain this in detail for each type:

Expand Down Expand Up @@ -223,6 +236,102 @@ The `ResourceSearchParamValue` index is used to decide whether a resource contai
* `search-param` - a 4-byte hash of the search parameter's code
* `value` - the encoded value of the resource reachable by the search parameter's FHIRPath expression. The encoding depends on the search parameter's type.

#### TypeSearchParamTokenFullResource

New index in v0.27.0. It is used to find resources based on full values of search parameters of type token. A full value consists of the system plus the value (for Identifiers) or the code (for Codings). If the resource provides no system, the special value 0x000000 is used instead.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
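The key layout above can be sketched as follows. This is only an illustration, not Blaze's actual encoder: the function names `put-3-byte-id!` and `encode-key` are hypothetical, the field order follows the bullet list above, and the null byte (0x00) used to terminate the variable-length `value` and `id` parts is an assumption (the excerpt does not say how those parts are delimited).

```clojure
(import '[java.nio ByteBuffer]
        '[java.nio.charset StandardCharsets])

(defn- put-3-byte-id!
  "Writes the low 3 bytes of `id` in big-endian order (hypothetical helper)."
  [^ByteBuffer buf id]
  (let [id (long id)]
    (.put buf (unchecked-byte (bit-shift-right id 16)))
    (.put buf (unchecked-byte (bit-shift-right id 8)))
    (.put buf (unchecked-byte id))))

(defn encode-key
  "Builds a key with the parts type, search-param, value, system, id and
  hash-prefix. ASSUMPTION: `value` and `id` are terminated by 0x00."
  [{:keys [type-byte search-param-id value system-id id hash-prefix]}]
  (let [value-bytes (.getBytes ^String value StandardCharsets/UTF_8)
        id-bytes (.getBytes ^String id StandardCharsets/UTF_8)
        buf (ByteBuffer/allocate
             (+ 1 3 (alength value-bytes) 1 3 (alength id-bytes) 1 4))]
    (.put buf (unchecked-byte type-byte))      ; type byte
    (put-3-byte-id! buf search-param-id)       ; 3-byte search param identifier
    (.put buf value-bytes)                     ; full code/value
    (.put buf (byte 0))                        ; assumed terminator
    (put-3-byte-id! buf system-id)             ; 3-byte system identifier
    (.put buf id-bytes)                        ; logical id
    (.put buf (byte 0))                        ; assumed terminator
    (.putInt buf (unchecked-int hash-prefix))  ; 4-byte hash prefix
    (.array buf)))
```

Because the fixed-width `type` and `search-param` parts come first, all keys of one resource type and search parameter are stored next to each other, and a prefix up to `value` or `system` narrows the scan further.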

#### TypeSearchParamTokenSystemResource

New index in v0.27.0. It is used to find resources based only on the system of search parameters of type token. If the system is not available, no index entry is written.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
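The different handling of a missing system in the two token indices can be illustrated with a small sketch. The function `token-index-entries` and the map shape are hypothetical and only model which entries would be written; they are not taken from the Blaze source.

```clojure
(def no-system-id
  "Special system identifier 0x000000 used when a token has no system."
  0)

(defn token-index-entries
  "Returns the index entries to write for one token. The full index always
  gets an entry (with system 0x000000 if there is none), the system-only
  index gets an entry only if a system is available."
  [{:keys [system-id value]}]
  (cond-> [{:index :TypeSearchParamTokenFullResource
            :value value
            :system-id (or system-id no-system-id)}]
    system-id
    (conj {:index :TypeSearchParamTokenSystemResource
           :system-id system-id})))
```

For a token without a system, only the entry for the full index is returned. A token with a system yields one entry per index.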

#### TypeSearchParamReferenceCanonicalResource

New index in v0.27.0. It is used to find resources based on the value of search parameters of type reference when that value is a canonical URL.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - a 4-byte identifier of the canonical URL
* `version` - the full version
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### TypeSearchParamReferenceUrlResource

New index in v0.27.0. It is used to find resources based on the value of search parameters of type reference when that value is a URL.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - the full url
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### TypeSearchParamReferenceLocalResource

New index in v0.27.0. It is used to find resources based on the value of search parameters of type reference when that value is a local reference.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `ref-id` - the logical id of the referenced resource
* `ref-type` - the type byte of the referenced resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### ResourceSearchParamTokenFull

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI
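Reading such a resource-first key back can be sketched as below. This is a simplified illustration using plain `java.nio.ByteBuffer`. The functions `read-null-terminated!` and `decode-key-head` are hypothetical, and the assumption that the variable-length `id` is terminated by a null byte (0x00) is not confirmed by this excerpt.

```clojure
(import '[java.nio ByteBuffer]
        '[java.nio.charset StandardCharsets])

(defn read-null-terminated!
  "Reads the bytes up to (excluding) the next 0x00 as UTF-8 string and moves
  the position past the null byte. Returns nil and leaves the position
  unchanged if `buf` contains no null byte."
  [^ByteBuffer buf]
  (let [start (.position buf)]
    (loop [i start]
      (cond
        (= i (.limit buf)) nil
        (zero? (.get buf (int i)))
        (let [^bytes bs (byte-array (- i start))]
          (.get buf bs)                    ; relative read of the payload
          (.position buf (int (inc i)))    ; skip the null byte
          (String. bs StandardCharsets/UTF_8))
        :else (recur (inc i))))))

(defn decode-key-head
  "Decodes the leading type, id and hash-prefix parts of a key.
  ASSUMPTION: the id is null-terminated."
  [^bytes key]
  (let [buf (ByteBuffer/wrap key)
        type-byte (.get buf)
        id (read-null-terminated! buf)
        hash-prefix (.getInt buf)]
    {:type-byte type-byte :id id :hash-prefix hash-prefix}))
```

Since `type`, `id` and `hash-prefix` lead the key, all entries of one resource version share a common prefix, which fits a lookup that starts from a known resource.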

#### ResourceSearchParamTokenSystem

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `system` - a 3-byte identifier of the system URI

#### ResourceSearchParamReferenceCanonical

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - a 4-byte identifier of the canonical URL
* `version` - the full version

#### ResourceSearchParamReferenceUrl

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - the full url

#### ResourceSearchParamReferenceLocal

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `ref-id` - the logical id of the referenced resource
* `ref-type` - the type byte of the referenced resource type (one byte)

#### CompartmentSearchParamValueResource

The `CompartmentSearchParamValueResource` index is used to find resources of a particular compartment based on search parameter values.
Expand All @@ -236,6 +345,18 @@ The `CompartmentResourceType` index is used to find all resources that belong to
* `type` - a 4-byte hash of the resource type of the resource that belongs to the compartment, ex. `Observation`
* `id` - the logical id of the resource that belongs to the compartment, ex. the logical id of the Observation

#### PatientTypeSearchParamTokenFullResource

New index in v0.27.0. It is used to find resources of a particular patient based on full values of search parameters of type token. A full value consists of the system plus the value (for Identifiers) or the code (for Codings). If the resource provides no system, the special value 0x000000 is used instead.

* `patient-id` - the logical id of the patient
* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
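With `patient-id` as the leading key part, all entries of one patient can be reached through a common prefix. The sketch below shows why a variable-length leading part needs a delimiter. The functions `patient-prefix` and `starts-with?` are hypothetical, and the terminating null byte (0x00) is an assumption rather than something this excerpt states.

```clojure
(import '[java.nio.charset StandardCharsets])

(defn patient-prefix
  "Returns the prefix shared by all keys of one patient: the id bytes followed
  by 0x00. ASSUMPTION: the patient id is null-terminated inside the key."
  ^bytes [^String patient-id]
  (let [id-bytes (.getBytes patient-id StandardCharsets/UTF_8)
        ;; byte-array initializes with zeros, so the last byte is the terminator
        prefix (byte-array (inc (alength id-bytes)))]
    (System/arraycopy id-bytes 0 prefix 0 (alength id-bytes))
    prefix))

(defn starts-with?
  "Returns true if `key` starts with all bytes of `prefix`."
  [^bytes key ^bytes prefix]
  (and (<= (alength prefix) (alength key))
       (every? #(= (aget key %) (aget prefix %))
               (range (alength prefix)))))
```

Without the terminator, the prefix for patient `p1` would also match keys of patient `p10`. With it, only keys of `p1` match.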

#### ActiveSearchParams

Currently not used.
Expand Down
42 changes: 7 additions & 35 deletions modules/admin-api/test/blaze/admin_api_test.clj
Expand Up @@ -3,7 +3,7 @@
[blaze.admin-api :as admin-api]
[blaze.async.comp :as ac :refer [do-sync]]
[blaze.db.api :as d]
[blaze.db.api-stub]
[blaze.db.api-stub :as api-stub]
[blaze.db.impl.index.patient-last-change :as plc]
[blaze.db.kv :as-alias kv]
[blaze.db.kv.rocksdb :as rocksdb]
Expand Down Expand Up @@ -97,50 +97,21 @@
{:dir (str dir "/index")
:block-cache (ig/ref ::rocksdb/block-cache)
:column-families
{:search-param-value-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:resource-value-index nil
(assoc
api-stub/index-kv-store-column-families
:compartment-search-param-value-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:compartment-resource-type-index nil
:active-search-params nil
:tx-success-index {:reverse-comparator? true}
:tx-error-index nil
:t-by-instant-index {:reverse-comparator? true}
:resource-as-of-index nil
:type-as-of-index nil
:system-as-of-index nil
:patient-last-change-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:type-stats-index nil
:system-stats-index nil
:cql-bloom-filter nil
:cql-bloom-filter-by-t nil}}
:target-file-size-base-in-mb 1})}

[::kv/mem :blaze.db.admin/index-kv-store]
{:column-families
{:search-param-value-index nil
:resource-value-index nil
:compartment-search-param-value-index nil
:compartment-resource-type-index nil
:active-search-params nil
:tx-success-index {:reverse-comparator? true}
:tx-error-index nil
:t-by-instant-index {:reverse-comparator? true}
:resource-as-of-index nil
:type-as-of-index nil
:system-as-of-index nil
:type-stats-index nil
:system-stats-index nil}}
{:column-families api-stub/index-kv-store-column-families}

::rs/kv
{:kv-store (ig/ref :blaze.db/resource-kv-store)
Expand Down Expand Up @@ -169,7 +140,8 @@
[:blaze.db.node.resource-indexer/executor :blaze.db.node.resource-indexer.admin/executor] {}

:blaze.db/search-param-registry
{:structure-definition-repo structure-definition-repo}
{:kv-store (ig/ref :blaze.db.main/index-kv-store)
:structure-definition-repo structure-definition-repo}

::rocksdb/block-cache {:size-in-mb 1}

Expand Down
31 changes: 23 additions & 8 deletions modules/byte-buffer/src/blaze/byte_buffer.clj
Expand Up @@ -91,6 +91,12 @@
(.copyTo ^ByteString byte-string byte-buffer)
byte-buffer)

(defn put-null-terminated-byte-string!
"Copies all bytes of `byte-string` into `byte-buffer`, followed by a null byte (0x00)."
[byte-buffer byte-string]
(.copyTo ^ByteString byte-string byte-buffer)
(put-byte! byte-buffer 0))

(defn limit
"Returns the limit of `byte-buffer`."
{:inline
Expand Down Expand Up @@ -120,6 +126,13 @@
[byte-buffer position]
(.position ^ByteBuffer byte-buffer (int position)))

(defn inc-position!
"Increments the position of `byte-buffer` by `amount`."
{:inline
(fn [byte-buffer amount]
`(set-position! ~byte-buffer (unchecked-add-int (position ~byte-buffer) (int ~amount))))}
[byte-buffer amount]
(set-position! byte-buffer (+ (position byte-buffer) (long amount))))

(defn remaining
"Returns the number of elements between the current position and the limit."
{:inline
Expand Down Expand Up @@ -229,19 +242,21 @@
[byte-buffer]
(when (pos? (remaining byte-buffer))
(mark! byte-buffer)
(loop [byte (bit-and (long (get-byte! byte-buffer)) 0xFF)
(loop [byte (long (get-byte! byte-buffer))
size 0]
(cond
(zero? byte)
(if (zero? byte)
(do (reset! byte-buffer)
size)

(pos? (remaining byte-buffer))
(recur (bit-and (long (get-byte! byte-buffer)) 0xFF) (inc size))
(if (zero? (remaining byte-buffer))
(do (reset! byte-buffer)
nil)
(recur (long (get-byte! byte-buffer)) (inc size)))))))

:else
(do (reset! byte-buffer)
nil)))))
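(defn skip-null-terminated!
"Moves the position of `byte-buffer` past the next null byte (0x00). Throws an exception if `byte-buffer` doesn't include a null byte."
[byte-buffer]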
(if-let [size (size-up-to-null byte-buffer)]
(set-position! byte-buffer (+ (position byte-buffer) (long size) 1))
(throw (Exception. "Can't skip null terminated byte sequence."))))

(defn mismatch
"Finds and returns the relative index of the first mismatch between `a` and
Expand Down
33 changes: 32 additions & 1 deletion modules/byte-buffer/src/blaze/byte_buffer_spec.clj
@@ -1,7 +1,9 @@
(ns blaze.byte-buffer-spec
(:require
[blaze.byte-buffer :as bb :refer [byte-buffer?]]
[clojure.spec.alpha :as s]))
[clojure.spec.alpha :as s])
(:import
[com.google.protobuf ByteString]))

(s/fdef bb/allocate
:args (s/cat :capacity nat-int?)
Expand Down Expand Up @@ -31,10 +33,39 @@
:args (s/cat :byte-buffer byte-buffer? :x int?)
:ret byte-buffer?)

(s/fdef bb/put-byte-array!
:args (s/cat :byte-buffer byte-buffer? :byte-array bytes?
:offset (s/? nat-int?) :length (s/? nat-int?))
:ret byte-buffer?)

(s/fdef bb/put-byte-buffer!
:args (s/cat :dst byte-buffer? :src byte-buffer?)
:ret byte-buffer?)

(s/fdef bb/put-byte-string!
:args (s/cat :byte-buffer byte-buffer? :byte-string #(instance? ByteString %))
:ret byte-buffer?)

(s/fdef bb/put-null-terminated-byte-string!
:args (s/cat :byte-buffer byte-buffer? :byte-string #(instance? ByteString %))
:ret byte-buffer?)

(s/fdef bb/limit
:args (s/cat :byte-buffer byte-buffer?)
:ret nat-int?)

(s/fdef bb/position
:args (s/cat :byte-buffer byte-buffer?)
:ret nat-int?)

(s/fdef bb/set-position!
:args (s/cat :byte-buffer byte-buffer? :position nat-int?)
:ret byte-buffer?)

(s/fdef bb/size-up-to-null
:args (s/cat :byte-buffer byte-buffer?)
:ret (s/nilable nat-int?))

(s/fdef bb/skip-null-terminated!
:args (s/cat :byte-buffer byte-buffer?)
:ret byte-buffer?)
9 changes: 5 additions & 4 deletions modules/byte-string/src/blaze/byte_string.clj
Expand Up @@ -64,14 +64,15 @@

(defn from-byte-buffer-null-terminated!
"Returns the bytes from `byte-buffer` up to (exclusive) a null byte (0x00) as
byte string ot nil if `byte-buffer` doesn't include a null byte.
byte string. Throws an exception if `byte-buffer` doesn't include a null byte.
Moves the position of `byte-buffer` past the null byte."
[byte-buffer]
(when-let [size (bb/size-up-to-null byte-buffer)]
(if-let [size (bb/size-up-to-null byte-buffer)]
(let [bs (from-byte-buffer! byte-buffer size)]
(bb/get-byte! byte-buffer)
bs)))
(bb/set-position! byte-buffer (inc (bb/position byte-buffer)))
bs)
(throw (Exception. "Can't read null terminated byte string."))))

(defn from-hex [s]
(ByteString/copyFrom (.decode (BaseEncoding/base16) s)))
Expand Down
2 changes: 1 addition & 1 deletion modules/byte-string/src/blaze/byte_string_spec.clj
Expand Up @@ -27,7 +27,7 @@

(s/fdef bs/from-byte-buffer-null-terminated!
:args (s/cat :byte-buffer byte-buffer?)
:ret (s/nilable bs/byte-string?))
:ret bs/byte-string?)

(s/fdef bs/from-hex
:args (s/cat :s string?)
Expand Down