Transporter that lets Integreat use a MongoDB database as service.
Requires at least node v18, Integreat v1.0, and MongoDB 5.0.
Install from npm:
npm install integreat-transporter-mongodb
Example of use:
import Integreat from 'integreat'
import mongodb from 'integreat-transporter-mongodb'
import defs from './config'
const resources = {
// ... you'll probably want to include other resources as well
transporters: { mongodb },
}
const great = Integreat.create(defs, resources)
// ... and then dispatch actions as usual
The data returns from GET
actions will be the retrieved documents, while for
SET
, UPDATE
, and DELETE
actions the data will be result stats in the form
of { modifiedCount: 1, insertedCount: 2, deletedCount: 0 }
.
UPDATE
actions may update a given array of items or use the given data item to
update all documents matched by a given query. In the first case, the action
will respond with a notfound
error if one or more of the provided data items
are not already in the database. In the second case, the action will respond
with noaction
if no documents are matched by the query.
GET
actions will also return a totalCount
in the params
object of the
response, with the total number of documents matching the query. This is useful
for paged queries.
After including the mongodb
transporter in your resources object, you still
need to configure your service to use it.
Example service configuration:
{
id: 'store',
transporter: 'mongodb',
auth: 'mongoAuth', // See below for documentation of authentication
options: {
uri: 'mongodb://mymongo.com',
}
endpoints: [
{ options: { db: 'store', collection: 'documents' } }
]
}
The uri
is used as the uri to the database.
MongoDB uses an _id
field as the primary key for documents. By default, we
let MongoDB generate this field and never touch it, but if you set
idIsUnique: true
in the option, signaling that the id
of the data items
you'll send will always be unique, we'll map the id
to _id
and back again.
One of the advantages of this, in addition to not have an extra id field,
is that MongoDB will always create an index for _id
, so this may speed up
operations on the collection without any extra setup.
idIsUnique: true
may be set on the service options
to apply to all
collections, or on each endpoint, applying to the collection relevant to that
endpoint. You are responsible for making sure the idIsUnique
is set
consistently, if more than one endpoint is dealing with the same collection, or
else you'll end up with inconsistencies.
When listening to MongoDb, you may set idIsUnique
on the incoming
options.
If not set here, incoming will use whatever is set on the service options
.
Keep in mind that setting idIsUnique
on endpoints will not affect incoming
data, so make sure to set it on the service level when needed.
There are also cases where one endpoint may involve more than one collection
(e.g. in aggregations), and in these cases all relevant collections must adhere
to idIsUnique
being either false
or true
. It's usually a good idea to be
consistent within a database anyway.
By setting the appendOnly
option to true
, any document will be inserted as
a new document, unless a query
is also specified. This means that we will
not update an existing document by the id of the given document, as is the
default behaviour.
Setting appendOnly
to true
will also override idIsUnique
, which will be
treated as false
regardless of what it is set to.
Some characters are not allowed in MongoDB keys, so we have to escape them when
setting data to MongoDB. Leading $
is escape as \$
, and .
is escaped as
\_
. Consequently, \
is mapped to \\
as well. This is done automatically by
the transporter, and reverted when values are fetched, but you will see it if
you query data directly from the database.
Integreat accepts using an empty string ''
as a key in an object (as does
JavaScript and JSON), but MongoDB does not. We therefore replace empty strings
keys with the string '**empty**'
when storing data in MongoDB, and normalizes
it back when fetching data.
Also, we remove all properties with a undefined
value, to not fill the
database with empty values. This also means that existing values are not
overwritten with undefined
when updating documents. This is usually what you
want when updating data from Integreat, and it's easy to end up with unintended
undefined
values from mutation pipelines, but if you actually want to set
undefined
values, you can do so by setting the keepUndefined
option to
true
. Note that undefined
values in an array are always preserved, e.g.
['ent1', undefined, 'ent3']
. Also, note that MongoDB will turn undefined
into null
.
An endpoint may have a query
property, which should be an array of path
objects describing the query object used with MongoDB's find()
method.
Here's an example:
{
...
endpoints: [
{
id: 'getDrafts',
options: {
db: 'store',
collection: 'documents',
query: [
{ path: 'type', param: 'type' },
{ path: 'meta.status', value: 'draft' }
{ path: 'meta.views', op: 'gt', value: 1000 }
],
allowDiskUse: true
}
}
]
}
The path
property describes what property to set, and the property is set to
the value of value
or to the value of the request parameter in param
. The
default operand is eq
, but you may also use gt
, gte
, lt
, lte
, or in
.
There are also two special operands: isset
and notset
. They will match when
a field is set or not (using MongoDB operator $exists
).
To do a match on objects in an array, use the match
operand. This will match
any document with an array at path
that contains an object with the properties
specified in value
or param
. This uses MongoDB's $elemMatch
operator under
the hood.
To do a text search in the text index set up for th collection, use the search
operand and set value
to search string or param
to the parameter that holds
the search string. See MongoDB docs for more on
setting up a text index.
The query object will look like this, for a request for items of type entry
:
{
type: 'entry',
'meta.status': 'draft',
'meta.views': { $gt: 1000 }
}
To specify or logic, you put several queries in an array. To have and logic within an or array, you again use an array.
To query for type and a meta.status
of draft
or published
:
// ...
query: [
{ path: 'type', param: 'type' },
[
// or
{ path: 'meta.status', value: 'draft' },
{ path: 'meta.status', value: 'published' },
],
]
To query for type and a meta.status
of draft
or published
, with
draft
having an and logic with meta.author.role
:
// ...
query: [
{ path: 'type', param: 'type' },
[
// or
[
// and
{ path: 'meta.status', value: 'draft' },
{ path: 'meta.author.role', value: 'author' },
],
{ path: 'meta.status', value: 'published' },
],
]
When no query is specified and the action has an id
param, the following query
will be used by default (the value of id
is 'ent1'
in this example):
{
id: 'ent1',
}
Aggregation is supported by specifying a pipeline on the aggregation
property
on the options
object. If a query or a sort order is specified, they are put
first in the aggregation pipeline, query first, then sorting.
Example of an aggregation pipeline:
{
...
endpoints: [
{
id: 'getNewestVersion',
options: {
db: 'store',
collection: 'documents',
aggregation: [
{ type: 'sort', sortBy: { updatedAt: -1 } },
{
type: 'group',
groupBy: ['account', 'id'],
values: { updatedAt: 'first', status: 'first' },
},
{
type: 'query',
query: [
{ path: 'updatedAt', op: 'gt', param: 'updatedAfter' },
],
},
]
}
}
]
}
When the pageSize
param is set in a request, it is taken as the max number of
documents to return in the response. When nothing else is specified, the first
page of documents is returned, and the paging.next
prop on the response will
hold a params object that may be used to get the next page.
There are two types of pagination; pageId
or pageOffset
. The first one is
used by default, and returns an id for the next page in the dataset. All details
around this id is internal to the transporter and may change without being
considered a breaking change. Just treat it as an id and you'll be find.
The pageOffset
approach kicks in when a pageOffset
param is specified on the
action, so to use this approach, you need to set pageOffset: 0
for the first
page. If the pageSize
is e.g. 100
, the next pageOffset
will be 100
, etc.
Pagination works for both aggregations and simple queries.
We recommend using Integreat's built in authentication mechanism to authenticate
with MongoDB. To do this, set the id of an auth object on the auth
prop of the
service definition -- in the example above this is set to mongoAuth
. Then
define an auth object like this:
{
id: 'mongoAuth',
authenticator: 'options',
options: {
key: '<mongo username>',
secret: '<mongo password>',
}
}
The options
authenticator will simply pass on the options object to Integreat,
which will again pass it on to the MongoDB transport -- which will know how to
use this to authenticate with MongoDB.
Note: Including credential in the connection uri, is a fairly common
practice with MongoDB. When using this approach, tell Integreat that the service
is authenticated by setting auth: true
on the service definition. However, we
not recommend this approach, as the username and password is then included in
the definition file and this makes the chance of it being e.g. commited to a git
repo, much higher.
The MongoDB transporter supports listening to changes in the database. To enable
this, set an array of collections to listen to in the collections
property on
the incoming
object on options
. When the Integreat instance is set up, call
listen()
on the instance, and Integreat will dispatch SET
actions for
insert
and update
events, and and DELETE
actions for delete
events. Note
that DELETE
will only be dispatched when idIsUnique
is true
, as we would
otherwise not know the id of the deleted document.
You may also specify a database in options.incoming.db
, otherwise options.db
is used.
The incoming action will have the following payload properties:
collection
: The name of the collectiondb
: The name of the databasedata
: The document forinsert
andupdate
eventsid
: The id of the document fordelete
eventsmethod
: The method that triggered the change:insert
,update
, ordelete
.
Note that we disregard any incoming settings in endpoint options
for now. You
may use the endpoint match
settings to direct incoming actions for different
collections and event types, to different endpoints. In the future we may allow
different incoming settings for different endpoints, so only specify this on the
service to make sure you are future compatible.
We use MongoDB's change streams to listen to changes, so this feature requires a replica set or a shared cluster. See the MongoDB documentation on Change Streams for more.
Experimental: By setting a number on the throwAfterFailedHeartbeatCount
option, the transporter will throw after the number of heartbeat failures you
specify. The counter will reset for every sucessful heartbeat, so if
throwAfterFailedHeartbeatCount
is 3
, it will throw when after three
heartbeat failures in a row.
The point of this is to allow the server to restart after loosing contact with MongoDB.
The tests can be run with npm test
.
Please read CONTRIBUTING for details on our code of conduct, and the process for submitting pull requests.
This project is licensed under the ISC License - see the LICENSE file for details.