Welcome to the recipes section of the Couchbase Shell (cbsh) documentation.
Here you can find some examples of the more complicated tasks that can be performed using cbsh.
1. Moving data between clusters
CBShell allows data to be moved between clusters, along with the buckets, scopes and collections that contain it.
Imagine you have two clusters, one self-managed (named local) and a Capella cluster called remote:
+👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> cb-env managed
+╭───┬────────┬───────┬────────────┬───────────────┬──────────────────────┬─────────────────╮
+│ # │ active │ tls │ identifier │ username │ capella_organization │ project │
+├───┼────────┼───────┼────────────┼───────────────┼──────────────────────┼─────────────────┤
+│ 0 │ false │ true │ remote │ Administrator │ my-org │ CBShell Testing │
+│ 1 │ true │ false │ local │ Administrator │ │ │
+╰───┴────────┴───────┴────────────┴───────────────┴──────────────────────┴─────────────────╯
The remote cluster is empty, while local contains two buckets with scopes and collections containing data:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> buckets
+╭───┬─────────┬───────────────┬───────────┬──────────┬──────────────────────┬───────────┬───────────────┬───────┬────────────╮
+│ # │ cluster │ name │ type │ replicas │ min_durability_level │ ram_quota │ flush_enabled │ cloud │ max_expiry │
+├───┼─────────┼───────────────┼───────────┼──────────┼──────────────────────┼───────────┼───────────────┼───────┼────────────┤
+│ 0 │ local │ beer-sample │ couchbase │ 1 │ none │ 200.0 MiB │ false │ false │ 0 │
+│ 1 │ local │ travel-sample │ couchbase │ 1 │ none │ 200.0 MiB │ false │ false │ 0 │
+╰───┴─────────┴───────────────┴───────────┴──────────┴──────────────────────┴───────────┴───────────────┴───────┴────────────╯
The first thing to do is to recreate on the remote cluster all of the buckets that we have on the local cluster:
> buckets | each {|in| buckets create $in.name ($in.ram_quota / 1MB | into int) --clusters remote}
Here we simply get all of the buckets, then iterate over the list with each and create buckets with the same name and RAM quota, specifying the remote cluster with the --clusters flag.
We can check that this has worked by running the buckets command against the remote cluster:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> buckets --clusters remote
+╭───┬─────────┬───────────────┬───────────┬──────────┬──────────────────────┬───────────┬───────────────┬───────┬────────────╮
+│ # │ cluster │ name │ type │ replicas │ min_durability_level │ ram_quota │ flush_enabled │ cloud │ max_expiry │
+├───┼─────────┼───────────────┼───────────┼──────────┼──────────────────────┼───────────┼───────────────┼───────┼────────────┤
+│ 0 │ remote │ beer-sample │ couchbase │ 1 │ none │ 209.0 MiB │ false │ true │ 0 │
+│ 1 │ remote │ travel-sample │ couchbase │ 1 │ none │ 209.0 MiB │ false │ true │ 0 │
+╰───┴─────────┴───────────────┴───────────┴──────────┴──────────────────────┴───────────┴───────────────┴───────┴────────────╯
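The remote buckets report 209.0 MiB rather than the 200.0 MiB of their local counterparts. One plausible explanation, sketched below in Python, is a unit mismatch in the conversion. The sketch assumes that 1MB denotes 10^6 bytes, that MiB denotes 2^20 bytes, that into int truncates, and that buckets create reads its quota argument as a number of MiB. None of these assumptions is stated in the text above, and the function name is hypothetical.

```python
MIB = 1024 ** 2  # bytes in one mebibyte (assumed meaning of "MiB")
MB = 1000 ** 2   # bytes in one megabyte (assumed meaning of "1MB")

def quota_argument(ram_quota_mib: float) -> int:
    """Mimic `$in.ram_quota / 1MB | into int` for a quota given in MiB."""
    quota_bytes = ram_quota_mib * MIB
    return int(quota_bytes / MB)  # int() truncates toward zero

print(quota_argument(200.0))  # 209
```

Under the same assumptions, dividing by a binary unit instead of a decimal one would keep the quotas identical on both clusters.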
Next we need to create all of the scopes within these buckets.
First we get all the buckets again on the local cluster, then for each of the buckets we get the scopes:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> buckets | each {|bckt| scopes --bucket $bckt.name | where scope not-in [_default _system] | get scope}
+╭───┬─────────────────────────╮
+│ 0 │ ╭───┬───────╮ │
+│ │ │ 0 │ Cafes │ │
+│ │ ╰───┴───────╯ │
+│ 1 │ ╭───┬─────────────────╮ │
+│ │ │ 0 │ inventory │ │
+│ │ │ 1 │ tenant_agent_00 │ │
+│ │ │ 2 │ tenant_agent_01 │ │
+│ │ │ 3 │ tenant_agent_02 │ │
+│ │ │ 4 │ tenant_agent_03 │ │
+│ │ │ 5 │ tenant_agent_04 │ │
+│ │ ╰───┴─────────────────╯ │
+╰───┴─────────────────────────╯
Here we iterate over each of the buckets and call scopes with the --bucket flag to get the scopes from each of them.
Then we use where with the not-in operator to filter out the _default and _system scopes, since these are empty.
Note that for this demo we have moved the data in beer-sample out of the default scope and collection into the Cafes scope and Breweries collection.
Now that we have listed all the scopes in the buckets, we can amend the previous command to use scopes create to create the scopes on the remote cluster:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> buckets | each {|bckt| scopes --bucket $bckt.name | where scope not-in [_default _system] | get scope | each {|scp| scopes create $scp --clusters remote --bucket $bckt.name}}
Here we have run the same command to list all the scopes, then for each scope we create one of the same name in the corresponding bucket on the remote cluster.
The next step is to do the same with the collections:
+👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> buckets | each {|bckt| scopes --bucket $bckt.name | where scope not-in [_default _system] | get scope | each {|scp| collections --scope $scp --bucket $bckt.name | get collection | each {|col| collections create $col --bucket $bckt.name --scope $scp --clusters remote}}}
Here we have fetched the buckets, then for each bucket fetched the scopes, and finally for each of the scopes fetched the collections.
Then we re-create each collection on the remote cluster in the corresponding bucket and scope.
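The shape of this nested iteration can be modelled outside the shell. The Python sketch below walks a small, hypothetical topology (the dictionary contents and the function name are illustrative and do not come from a real cluster) and lists the bucket/scope/collection triples that would be re-created, skipping the _default and _system scopes as the pipeline does.

```python
# Hypothetical topology: bucket -> scope -> list of collections.
local = {
    "beer-sample": {"_default": ["_default"], "Cafes": ["Breweries"]},
    "travel-sample": {
        "_default": ["_default"],
        "_system": ["_query"],
        "inventory": ["airline", "airport"],
    },
}

SKIPPED_SCOPES = {"_default", "_system"}

def collections_to_create(topology):
    """Walk buckets, then scopes, then collections, skipping the filtered scopes."""
    targets = []
    for bucket, scopes in topology.items():
        for scope, collections in scopes.items():
            if scope in SKIPPED_SCOPES:
                continue
            for collection in collections:
                targets.append((bucket, scope, collection))
    return targets

for target in collections_to_create(local):
    print(target)
```

Each printed triple corresponds to one collections create call in the command above.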
Before we copy our data over to our new collections we also want to migrate our indexes across.
The first step to doing this is to list all of the index definitions on the local cluster as follows:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> query indexes --definitions | where name != '#primary'
+╭────┬───────────────┬───────────┬────────────┬───────────────────────────────────────┬────────┬──────────────┬──────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────╮
+│ # │ bucket │ scope │ collection │ name │ status │ storage_mode │ replicas │ definition │ ... │
+├────┼───────────────┼───────────┼────────────┼───────────────────────────────────────┼────────┼──────────────┼──────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────┤
+│ 0 │ beer-sample │ _default │ _default │ beer_primary │ Ready │ plasma │ 0 │ CREATE PRIMARY INDEX `beer_primary` ON `beer-sample` WITH { "defer_build":true } │ ... │
+│ 1 │ travel-sample │ _default │ _default │ def_airportname │ Ready │ plasma │ 0 │ CREATE INDEX `def_airportname` ON `travel-sample`(`airportname`) WITH { "defer_build":true } │ ... │
+│ .. │ ... │ ... │ ... │ ... │ ... │ ... │ ... │ ... │ ... │
+│ 23 │ travel-sample │ _default │ _default │ def_type │ Ready │ plasma │ 0 │ CREATE INDEX `def_type` ON `travel-sample`(`type`) WITH { "defer_build":true } │ ... │
+╰────┴───────────────┴───────────┴────────────┴───────────────────────────────────────┴────────┴──────────────┴──────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────╯
Here we use query indexes with the --definitions flag to fetch all of the index definitions on the active cluster.
Then we use where to filter out all of the primary indexes, as these are created automatically when we created the buckets on the Capella cluster.
Now we can select the definition column and iterate over the definitions, using query to re-create the indexes on the remote cluster:
👤 Charlie 🏠 local in 🗄 travel-sample._default._default
+> query indexes --definitions | where name != '#primary' | get definition | each {|def| query $def --clusters remote}
The final step is to copy the data from each bucket/scope/collection on the local cluster into the corresponding bucket/scope/collection on the remote cluster.
This is done as follows:
> buckets | each {|bckt| scopes --bucket $bckt.name | where scope not-in [_default _system] | get scope | each {|scp| collections --scope $scp --bucket $bckt.name | get collection | each {|col| "SELECT meta().id, * FROM `" + $bckt.name + "`." + $scp + "." + $col | query $in | if $in != null {$in | reject cluster | rename id content | doc upsert --bucket $bckt.name --scope $scp --collection $col --clusters remote}}}}
Here we have the same nested each loops that get the scopes for each bucket, then the collections contained within each scope.
We then use the bucket, scope and collection names to construct a query:
"SELECT meta().id, * FROM `" + $bckt.name + "`." + $scp + "." + $col
So when the bucket is travel-sample, the scope is inventory and the collection is airline, the above will give:
"SELECT meta().id, * FROM `travel-sample`.inventory.airline"
We run each of these queries using the query command, and pipe the output into the final section of our command:
if $in != null {$in | reject cluster | rename id content | doc upsert --bucket $bckt.name --scope $scp --collection $col --clusters remote}
To understand what we are doing here you need to know the format of the data output by our queries.
If the query run is the example given above then the output will look like this:
+╭─────┬───────────────┬────────────────────────────────────────────────────────┬─────────╮
+│ # │ id │ airline │ cluster │
+├─────┼───────────────┼────────────────────────────────────────────────────────┼─────────┤
+│ 0 │ airline_10 │ ╭──────────┬───────────────╮ │ local │
+│ │ │ │ id │ 10 │ │ │
+│ │ │ │ type │ airline │ │ │
+│ │ │ │ name │ 40-Mile Air │ │ │
+│ │ │ │ iata │ Q5 │ │ │
+│ │ │ │ icao │ MLA │ │ │
+│ │ │ │ callsign │ MILE-AIR │ │ │
+│ │ │ │ country │ United States │ │ │
+│ │ │ ╰──────────┴───────────────╯ │ │
+│ ... │ ... │ ... │ ... │
+│ 186 │ airline_9833 │ ╭──────────┬───────────────╮ │ local │
+│ │ │ │ id │ 9833 │ │ │
+│ │ │ │ type │ airline │ │ │
+│ │ │ │ name │ Epic Holiday │ │ │
+│ │ │ │ iata │ FA │ │ │
+│ │ │ │ icao │ 4AA │ │ │
+│ │ │ │ callsign │ Epic │ │ │
+│ │ │ │ country │ United States │ │ │
+│ │ │ ╰──────────┴───────────────╯ │ │
+╰─────┴───────────────┴────────────────────────────────────────────────────────┴─────────╯
Before we can insert this into our remote cluster using doc upsert we need it to be correctly formatted.
But before we try to reformat any of the data we make sure that the query has not returned null with if $in != null, since trying to manipulate a null value will return an error.
The formatting required is to drop the cluster column, which we do using reject, then rename the column named after the collection, in this case airline, to content, which we do using rename.
Because rename assigns the new names by position, id keeps its name and the second column becomes content.
After the formatting has been applied to the above example it would become:
╭─────┬───────────────┬────────────────────────────────────────────────────────╮
+│ # │ id │ content │
+├─────┼───────────────┼────────────────────────────────────────────────────────┤
+│ 0 │ airline_10 │ ╭──────────┬───────────────╮ │
+│ │ │ │ id │ 10 │ │
+│ │ │ │ type │ airline │ │
+│ │ │ │ name │ 40-Mile Air │ │
+│ │ │ │ iata │ Q5 │ │
+│ │ │ │ icao │ MLA │ │
+│ │ │ │ callsign │ MILE-AIR │ │
+│ │ │ │ country │ United States │ │
+│ │ │ ╰──────────┴───────────────╯ │
+│ ... │ ... │ ... │
+│ 186 │ airline_9833 │ ╭──────────┬───────────────╮ │
+│ │ │ │ id │ 9833 │ │
+│ │ │ │ type │ airline │ │
+│ │ │ │ name │ Epic Holiday │ │
+│ │ │ │ iata │ FA │ │
+│ │ │ │ icao │ 4AA │ │
+│ │ │ │ callsign │ Epic │ │
+│ │ │ │ country │ United States │ │
+│ │ │ ╰──────────┴───────────────╯ │
+╰─────┴───────────────┴────────────────────────────────────────────────────────╯
Now that our data is correctly formatted it can be piped into doc upsert and, using the appropriate flags, upserted into the corresponding bucket/scope/collection on our remote cluster.
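To make the per-collection step concrete, here is a minimal Python model of its two data transformations: building the query string and reshaping the result rows. The function names are hypothetical, the row is an abbreviated copy of the airline_10 example above, and the code imitates the pipeline rather than showing how cbsh is implemented.

```python
def build_query(bucket, scope, collection):
    """Concatenate the statement the same way the pipeline does."""
    return "SELECT meta().id, * FROM `" + bucket + "`." + scope + "." + collection

def reshape(rows, collection):
    """Drop the cluster column and rename the collection column to content."""
    if rows is None:  # mirrors the `if $in != null` guard
        return None
    return [{"id": row["id"], "content": row[collection]} for row in rows]

# One abbreviated row in the shape of the query output shown earlier.
rows = [
    {"id": "airline_10", "airline": {"id": 10, "name": "40-Mile Air"}, "cluster": "local"},
]

print(build_query("travel-sample", "inventory", "airline"))
print(reshape(rows, "airline"))
```

Note that only the bucket name is wrapped in backticks. Scope or collection names containing characters such as hyphens would presumably need the same quoting before the generated statement could run.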
2. Similarity Search
The vector commands can be used to enrich your existing data and allow you to experiment with the value that similarity search can add.
Before you can follow this recipe you’ll need to configure an LLM for use with the shell.
Next you’ll need a set of data. For this example we’ll be using the travel-sample data set, which you can load with:
+> buckets load-sample travel-sample
Once we have loaded the sample data we want to add embeddings to our documents:
+👤 Charlie 🏠 remote in ☁️ travel-sample._default._default
+> query 'SELECT meta().id, * FROM `travel-sample` WHERE type = "landmark"' | vector enrich-doc content | doc upsert
+Batch size limited to 2047
+Embedding batch 1/3
+Embedding batch 2/3
+Embedding batch 3/3
+╭───┬───────────┬─────────┬────────┬──────────┬─────────╮
+│ # │ processed │ success │ failed │ failures │ cluster │
+├───┼───────────┼─────────┼────────┼──────────┼─────────┤
+│ 0 │ 4495 │ 4495 │ 0 │ │ remote │
+╰───┴───────────┴─────────┴────────┴──────────┴─────────╯
Here we have used query to get all the landmark doc ids and bodies.
Then we have enriched all of these with the embedding generated from the content field; see vector enrich-doc for details.
Finally we pipe the output directly into doc upsert to overwrite the original landmark documents with our enriched versions.
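The three "Embedding batch" lines are what you would expect for 4495 documents and a batch limit of 2047. The Python sketch below checks the arithmetic, assuming the documents are simply split into consecutive batches; the log output suggests this but does not state it, and the function name is hypothetical.

```python
import math

def batch_sizes(total: int, limit: int):
    """Split `total` items into consecutive batches of at most `limit`."""
    count = math.ceil(total / limit)
    return [min(limit, total - i * limit) for i in range(count)]

print(batch_sizes(4495, 2047))  # [2047, 2047, 401]
```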
Now that we have a set of docs containing vectors we can create a vector index over them using vector create-index:
+👤 Charlie 🏠 remote in ☁️ travel-sample._default._default
+> vector create-index landmark-content-index contentVector
Once the index has finished building we can use it to perform similarity searches over all of the contentVector fields.
This is done using the vector search command as follows:
+👤 Charlie 🏠 remote in ☁️ travel-sample._default._default
+> subdoc get contentVector landmark_10019 | select content | vector search landmark-content-index contentVector --neighbors 5
+╭───┬────────────────┬─────────────────────────────────────────┬─────────╮
+│ # │ id │ score │ cluster │
+├───┼────────────────┼─────────────────────────────────────────┼─────────┤
+│ 0 │ landmark_10019 │ 340282350000000000000000000000000000000 │ remote │
+│ 1 │ landmark_28965 │ 1.0286641 │ remote │
+│ 2 │ landmark_3547 │ 1.0150012 │ remote │
+│ 3 │ landmark_16379 │ 0.9759125 │ remote │
+│ 4 │ landmark_33857 │ 0.9599941 │ remote │
+╰───┴────────────────┴─────────────────────────────────────────┴─────────╯
Here we have used subdoc get to get the contentVector field from landmark_10019 and used it as the source vector for the search, which is why the most similar result is landmark_10019 itself: the vector is the same.
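The reason a document always ranks first against its own vector can be seen in a brute-force nearest-neighbour search. The Python sketch below uses tiny hypothetical embeddings and Euclidean distance. Both are assumptions made for illustration, and a real vector index neither stores two-dimensional vectors nor scans every document.

```python
import math

def nearest(query, vectors, neighbors):
    """Return the ids of the `neighbors` vectors closest to `query`."""
    ranked = sorted(vectors, key=lambda doc_id: math.dist(query, vectors[doc_id]))
    return ranked[:neighbors]

# Hypothetical two-dimensional embeddings; real ones have far more dimensions.
vectors = {
    "landmark_a": [0.9, 0.1],
    "landmark_b": [0.8, 0.3],
    "landmark_c": [0.1, 0.9],
}

print(nearest(vectors["landmark_a"], vectors, 2))  # ['landmark_a', 'landmark_b']
```

The query vector is at distance zero from itself, so no other document can rank above it.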
Once we have this list of results from the vector search we can use the ids to inspect the source documents:
> subdoc get contentVector landmark_10019 | select content | vector search landmark-content-index contentVector --neighbors 5 | subdoc get [name address content]
+╭───┬────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────╮
+│ # │ id │ content │ ... │
+├───┼────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────┤
+│ 0 │ landmark_16379 │ ╭─────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ... │
+│ │ │ │ name │ Royal Hospital │ │ │
+│ │ │ │ address │ Royal Hospital Rd │ │ │
+│ │ │ │ content │ A retirement home for soldiers created by King Charles II. Tours around the listed building and grounds are regular and include the museum (which can be visited separately) whose exhibits contain │ │ │
+│ │ │ │ │ military memorabilia donated by Chelsea Pensioners over the years. │ │ │
+│ │ │ ╰─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │
+│ 1 │ landmark_28965 │ ╭─────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ... │
+│ │ │ │ name │ Steam: The Great Western Railway Museum │ │ │
+│ │ │ │ address │ Fire Fly Ave, SN2 2EY │ │ │
+│ │ │ │ content │ The museum is located in a restored railway works building. The building is a treat in itself. As well as having a wealth of information about the railways, it also is an invaluable source of social │ │ │
+│ │ │ │ │ history. There are plenty of events for children, and it is right next to the Great Western Designer Outlet Village and the National Trust Headquarters, so anyone in the family who doesn't want to │ │ │
+│ │ │ │ │ visit the museum has plenty of other options. │ │ │
+│ │ │ ╰─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │
+│ 2 │ landmark_10019 │ ╭─────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ... │
+│ │ │ │ name │ Royal Engineers Museum │ │ │
+│ │ │ │ address │ Prince Arthur Road, ME4 4UG │ │ │
+│ │ │ │ content │ Adult - £6.99 for an Adult ticket that allows you to come back for further visits within a year (children's and concessionary tickets also available). Museum on military engineering and the history │ │ │
+│ │ │ │ │ of the British Empire. A quite extensive collection that takes about half a day to see. Of most interest to fans of British and military history or civil engineering. The outside collection of tank │ │ │
+│ │ │ │ │ mounted bridges etc can be seen for free. There is also an extensive series of themed special event weekends, admission to which is included in the cost of the annual ticket. │ │ │
+│ │ │ ╰─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │
+│ 3 │ landmark_33857 │ ╭─────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ... │
+│ │ │ │ name │ National Railway Museum │ │ │
+│ │ │ │ address │ Leeman Road, YO26 4XJ │ │ │
+│ │ │ │ content │ The largest railway museum in the world, responsible for the conservation and interpretation of the British national collection of historically significant railway vehicles and other artefacts. │ │ │
+│ │ │ │ │ Contains an unrivalled collection of locomotives, rolling stock, railway equipment, documents and records. │ │ │
+│ │ │ ╰─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │
+│ 4 │ landmark_3547 │ ╭─────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ... │
+│ │ │ │ name │ The Giant Screen │ │ │
+│ │ │ │ address │ │ │ │
+│ │ │ │ content │ Millennium Point, Curzon St. Daily 10AM-5PM. Part of the Thinktank science museum. 2D and 3D films shown on an enormous (five story) screen. Some mainstream films, mainly documentaries. £9.60 │ │ │
+│ │ │ │ │ (''concessions £7.60, children under 16 £7.60, family and joint Thinktank tickets available''). │ │ │
+│ │ │ ╰─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ │ │
+╰───┴────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────╯
Here we could have used doc get to get the whole of the documents, but to keep things tidy we’ve used another subdoc get to retrieve the name, address and content fields.
As you can see by examining the results, they all have semantically similar content fields.
Another way that CBShell can be used to generate embeddings is from plain text with vector enrich-text:
+👤 Charlie 🏠 remote in ☁️ travel-sample._default._default
+> "physical exercise" | vector enrich-text | vector search landmark-content-index contentVector --neighbors 5 | subdoc get [name address content] | select content | flatten
+Embedding batch 1/1
+╭───┬───────────────────────────┬───────────────────────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+│ # │ name │ address │ content │
+├───┼───────────────────────────┼───────────────────────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
+│ 0 │ Altrincham Leisure Centre │ Oakfield Rd │ Includes swimming pools, sports halls and gym. │
+│ 1 │ Hornchurch Sports Centre │ Hornchurch Road, Hornchurch, RM11 1JU │ You can find several activities like swimming, squash, cricket and gym. │
+│ 2 │ Outdoor Swimming Pool │ │ Swim outdoors in the summer │
+│ 3 │ Rothesay Leisure Centre │ High Street, Rothesay │ For those rainy days. Pool, gym and sauna open daily. │
+│ 4 │ Sydney G. Walton Square │ │ Small (one square block), well maintained park/square in the heart of the city, located right beside the Financial District. Tai Chi practitioners exercise here in │
+│ │ │ │ the early morning hours. │
+╰───┴───────────────────────────┴───────────────────────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Here we have done another similarity search using the same index, but our source vector is the result of embedding the phrase "physical exercise".
One important detail to remember is that the embedding generated by vector enrich-text must have the same dimension as the vectors over which the index was created, otherwise vector search will return no results.
See vector enrich-text for how to specify the dimension of the generated embeddings.
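Since a dimension mismatch shows up only as an empty result set, it can help to compare the two numbers explicitly when preparing vectors outside the shell. A minimal Python sketch of such a guard follows; the function name and the idea of checking up front are illustrative and not part of cbsh.

```python
def check_dimension(query_vector, index_dimension: int) -> None:
    """Raise early if the query embedding cannot match the indexed vectors."""
    if len(query_vector) != index_dimension:
        raise ValueError(
            f"query has {len(query_vector)} dimensions, index expects {index_dimension}"
        )

check_dimension([0.1, 0.2, 0.3], 3)  # passes silently
```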
3. Useful snippets
This section contains a collection of useful commands and sets of commands which don’t really fit into their own section of recipes.
3.1. Migrating scope and collection definitions
When you create a new cluster it can be useful to migrate scope and collection definitions from an old cluster.
A good example here is migrating from an on-premise cluster to a Capella cluster.
To migrate scopes, except the _default scope:
scopes --clusters "On-Prem-Cluster" --bucket travel-sample | select scope | where scope != "_default" | each { |it| scopes create $it.scope --clusters "Capella-Cluster" }
To migrate all collections, except the _default collection:
collections --clusters "On-Prem-Cluster" --bucket "travel-sample" | select scope collection | where $it.scope != "_default" | where $it.collection != "_default" | each { |it| collections create $it.collection --clusters "Capella-Cluster" --bucket "travel-sample-import" --scope $it.scope }
These examples can easily be extended to filter out any other scopes and collections you do not want to migrate.
For example, to filter more scopes you would just add more where clauses: … | where scope != "_default" | where scope != "inventory" | …
3.2. Migrating query index definitions
When you create a new cluster it can be useful to migrate index definitions from an old cluster.
A good example here is migrating from an on-premise cluster to a Capella cluster.
To migrate all of your index definitions:
query indexes --definitions --clusters "On-Prem-Cluster" | get definition | each { |it| query $it --clusters "Capella-Cluster" }