Configuration Options
You can use a custom configuration file to specify some options to mongo-connector.
This page details all the options that can be specified in Mongo Connector's configuration file. You can also look at an example, and reading the tests may help you understand the configuration options.
Mongo Connector uses JSON as the format for its configuration file. We'll use MongoDB "dot-notation" for the configuration option names themselves. For example, we'll use the name authentication.password to mean:
{"authentication": {"password": XXX}}
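To make the dot-notation convention concrete, here is a minimal Python sketch that expands a dotted option name into the nested JSON structure it stands for. The helper name `expand_dotted` is hypothetical and is not part of Mongo Connector.

```python
def expand_dotted(name, value):
    """Expand a dot-notation option name into nested dictionaries."""
    nested = value
    # Build the structure from the innermost key outwards.
    for key in reversed(name.split(".")):
        nested = {key: nested}
    return nested


print(expand_dotted("authentication.password", "XXX"))
# {'authentication': {'password': 'XXX'}}
```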
Please note that any option that starts with __ will be ignored. For example, in the following fragment the __include option is ignored:
"namespaces": {
    "__include": ["test.talks"]
},
You can tell mongo-connector which configuration file to use via the -c option (this is also shown with --help). To invoke mongo-connector with a configuration file, run:
mongo-connector -c config.json
(This assumes that your configuration file is called config.json and that it is in the directory from which you invoke mongo-connector.)
Although JSON itself doesn't provide a syntax for comments, Mongo Connector allows its JSON configuration file to have comments, which are defined as any key in an object that is prefixed by two underscores (__). For example:
{
"__comment": "this is a comment"
}
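The effect of the double-underscore rule can be sketched in Python as a filter applied after parsing the JSON. This illustrates the behavior described above under the assumption that the rule applies at every nesting level; it is not Mongo Connector's actual code, and `strip_ignored` is a hypothetical helper.

```python
import json


def strip_ignored(value):
    """Recursively drop every object key that starts with two underscores."""
    if isinstance(value, dict):
        return {
            key: strip_ignored(item)
            for key, item in value.items()
            if not key.startswith("__")
        }
    if isinstance(value, list):
        return [strip_ignored(item) for item in value]
    return value


raw = """
{
    "__comment": "this is a comment",
    "namespaces": {
        "__include": ["test.talks"]
    }
}
"""
effective = strip_ignored(json.loads(raw))
print(effective)  # {'namespaces': {}}
```

This is also why prefixing an option with __ is a convenient way to disable it temporarily without deleting it from the file.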
Command-line equivalent: -m, --main
Default: localhost:27017
The address of the replica set or sharded cluster from which to replicate. This may be any MongoDB connection string.
Command-line equivalent: -o, --oplog-ts
Default: oplog.timestamp
The path to the oplog progress file. Note: backslashes must be escaped, e.g., "C:\\path\\to\\oplog.timestamp".
Command-line equivalent: --no-dump
Default: false
Do not dump collections from MongoDB to the remote system prior to tailing the MongoDB oplog.
Command-line equivalent: --batch-size
Default: -1
Number of records processed from the oplog before updating the timestamp file.
Command-line equivalent: -v, --verbose
Default: 0
The verbosity of Mongo Connector. Note that the command-line option only turns debug-level logging on or off. In the config file, verbosity may be set according to the following table:
Verbosity | Log Level |
---|---|
0 | ERROR |
1 | WARNING |
2 | INFO |
3 | DEBUG |
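The table above corresponds directly to the standard Python `logging` levels. A minimal sketch of that mapping follows; the names `VERBOSITY_TO_LEVEL` and `level_for` are hypothetical, and the way Mongo Connector applies the level internally is not shown here.

```python
import logging

# Verbosity values from the table, mapped to Python logging levels.
VERBOSITY_TO_LEVEL = {
    0: logging.ERROR,
    1: logging.WARNING,
    2: logging.INFO,
    3: logging.DEBUG,
}


def level_for(verbosity):
    """Return the logging level for a verbosity value (KeyError if unknown)."""
    return VERBOSITY_TO_LEVEL[verbosity]


print(logging.getLevelName(level_for(2)))  # INFO
```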
Command-line equivalent: --continue-on-error
Default: false
Whether to continue tailing the oplog after an error occurred while dumping a collection. This doesn't affect the connector's behavior while already tailing the oplog.
Command-line equivalent: -i, --fields
Default: all fields
Comma-separated list of fields to read from MongoDB documents. This option can be used to select just a few fields out of every document. Note that the _id field, and the ns and _ts fields for Solr, will always be included. This option is mutually exclusive with the exclude_fields option.
Command-line equivalent: -e, --exclude_fields
Default: empty
Comma-separated list of fields to exclude from MongoDB documents. This option can be used to drop a few fields from every document. Note that the _id field, and the ns and _ts fields for Solr, will always be included. This option is mutually exclusive with the fields option.
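The include and exclude semantics described for fields and exclude_fields can be sketched as follows. This simplified illustration handles only top-level keys and ignores the Solr-specific ns and _ts fields; the function name `project` is hypothetical and does not reflect Mongo Connector's internal implementation.

```python
def project(doc, fields=None, exclude_fields=None):
    """Apply either an include list or an exclude list to a document."""
    if fields and exclude_fields:
        raise ValueError("fields and exclude_fields are mutually exclusive")
    if fields:
        # _id is always kept, even if it is not listed.
        keep = set(fields) | {"_id"}
        return {key: value for key, value in doc.items() if key in keep}
    if exclude_fields:
        # _id is always kept, even if it is listed for exclusion.
        drop = set(exclude_fields) - {"_id"}
        return {key: value for key, value in doc.items() if key not in drop}
    return dict(doc)


doc = {"_id": 1, "name": "Ada", "secret": "x"}
print(project(doc, fields=["name"]))            # {'_id': 1, 'name': 'Ada'}
print(project(doc, exclude_fields=["secret"]))  # {'_id': 1, 'name': 'Ada'}
```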
Command-line equivalent: --tz-aware
Default: false
Whether Dates read from MongoDB should be timezone-aware.
Command-line equivalents: --logfile, -s, --enable-syslog
Default: file
Where to direct Mongo Connector logs. This may be one of "file", "syslog", or "stream".
Command-line equivalent: --logfile
Default: mongo-connector.log
The path to Mongo Connector's log file. This option only applies if logging.type is "file". Note: backslashes must be escaped, e.g., "C:\\path\\to\\mongo-connector.log".
Command-line equivalent: --logfile-when
Default: midnight
The type of period defining when Mongo Connector should rotate its log file. This must be one of:
- S (second)
- M (minute)
- H (hour)
- D (day)
- W0 - W6 (days of the week, numbered 0 - 6)
- midnight
For more details, see the Python documentation for TimedRotatingFileHandler.
This option only applies if logging.type is "file".
Command-line equivalent: --logfile-interval
Default: 1
How frequently the log file should be rotated. Specifically, how many units of logging.rotationWhen should occur before rotation. This option cannot be used if logging.rotationWhen is any of W0 - W6.
For more details, see the Python documentation for TimedRotatingFileHandler.
This option only applies if logging.type is "file".
Command-line equivalent: --logfile-backups
Default: 7
How many rotated log files to keep around.
This option only applies if logging.type is "file".
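The three rotation options correspond to the `when`, `interval`, and `backupCount` parameters of Python's `logging.handlers.TimedRotatingFileHandler`. The sketch below builds such a handler with the defaults listed above (midnight, 1, 7). The helper name `make_rotating_handler` is hypothetical, and the code shows how the standard-library handler is configured, not how Mongo Connector wires it up internally.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_handler(path, when="midnight", interval=1, backups=7):
    """Create a time-based rotating file handler.

    delay=True defers opening the file until the first record is emitted.
    """
    return logging.handlers.TimedRotatingFileHandler(
        path, when=when, interval=interval, backupCount=backups, delay=True
    )


# Demo in a temporary directory so no real log file is touched.
log_dir = tempfile.mkdtemp()
handler = make_rotating_handler(os.path.join(log_dir, "mongo-connector.log"))
logger = logging.getLogger("rotation-demo")
logger.addHandler(handler)
logger.error("written to the rotating log file")
logger.removeHandler(handler)
handler.close()
```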
Command-line equivalent: --syslog-host
Default: localhost:512
Address of the syslog. This can include a host and port like "localhost:512" or, on Unix/Linux, be a Unix domain socket such as "/dev/log".
This option only applies if logging.type is "syslog".
Command-line equivalent: --syslog-facility
Default: user
The syslog facility to use.
This option only applies if logging.type is "syslog".
Command-line equivalent: -a, --admin-username
Default: (no default)
The username that Mongo Connector should use to log into MongoDB.
Command-line equivalent: -p, --password
Default: (no default)
The password for authentication.adminUsername. This option cannot be used with authentication.passwordFile.
Command-line equivalent: -f, --password-file
Default: (no default)
A path to a file that contains the password for authentication.adminUsername. This option cannot be used with authentication.password.
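The mutual exclusion between authentication.password and authentication.passwordFile can be sketched as a small validation step. The function `resolve_password` is hypothetical, and stripping surrounding whitespace from the file contents is an assumption made for this example, not documented Mongo Connector behavior.

```python
def resolve_password(authentication):
    """Return the password from an "authentication" config object.

    Raises ValueError if both password and passwordFile are given.
    """
    password = authentication.get("password")
    password_file = authentication.get("passwordFile")
    if password is not None and password_file is not None:
        raise ValueError("password and passwordFile cannot be used together")
    if password_file is not None:
        with open(password_file) as handle:
            # Assumption: surrounding whitespace and newlines are not
            # part of the password.
            return handle.read().strip()
    return password


print(resolve_password({"adminUsername": "admin", "password": "s3cret"}))
# s3cret
```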
Command-line equivalent: --ssl-certfile
Default: (no default)
A path to the SSL certificate that Mongo Connector should use to identify the local connection to MongoDB.
Command-line equivalent: --ssl-keyfile
Default: (no default)
A path to the private key for ssl.sslCertfile. This option isn't necessary if ssl.sslCertfile already has the private key included.
Command-line equivalent: --ssl-certificate-policy
Default: ignored
Policy for validating SSL certificates provided from the other end of the connection (i.e., to MongoDB). Must be one of:
- required - Require and validate the remote certificate.
- optional - Validate the remote certificate only if one is provided.
- ignored - Remote SSL certificates are ignored completely.
Command-line equivalent: -n, --namespace-set
Default: all namespaces
List of collections to read from MongoDB. Collection names should be given as database_name.collection_name. By default, Mongo Connector will replicate all namespaces except for system and GridFS collections. Each namespace may contain a single wildcard (*) which matches any characters. For example, db_*.foo matches db_bar.foo and db_a.foo. Cannot be used in combination with namespaces.exclude.
Usage Examples: -n test.test,alpha.*,db_*.foo on the command line or ["test.test", "alpha.*", "db_*.foo"] in a config file.
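As an illustration of the wildcard rule, the following sketch converts a namespace pattern into a regular expression and tests candidate namespaces against it. It follows the description above literally (the wildcard matches any characters). The function name `namespace_matches` is hypothetical, and Mongo Connector's own matching may differ in detail.

```python
import re


def namespace_matches(pattern, namespace):
    """Return True if namespace matches pattern, where * matches any characters."""
    # Escape regex metacharacters, then turn the escaped wildcard into ".*".
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, namespace) is not None


include = ["test.test", "alpha.*", "db_*.foo"]
for candidate in ["db_bar.foo", "db_a.foo", "alpha.users", "test.other"]:
    selected = any(namespace_matches(p, candidate) for p in include)
    print(candidate, selected)
```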
Command-line equivalent: -x, --exclude-namespace-set
Default: no namespaces
List of collections not to read from MongoDB. Collection names should be given as database_name.collection_name. By default, Mongo Connector will not exclude any namespace. Each namespace may contain a single wildcard (*) which matches any characters. For example, db_*.foo matches db_bar.foo and db_a.foo. Cannot be used in combination with namespaces.include.
Usage Examples: -x test.test,alpha.*,db_*.foo on the command line or ["test.test", "alpha.*", "db_*.foo"] in a config file.
Command-line equivalent: -g, --dest-namespace-set
Default: no mapping
Comma-separated list of new names to use for each collection. Each namespace provided in namespaces.include will be renamed respectively at the destination according to this list. This option may only be used with namespaces.include, and both options must include the same number of names. By default, no renaming will occur. If the source namespace contains a wildcard (*), then the destination must also contain a wildcard (*) which will be replaced during the renaming. Consider for example this config:
{
    "namespaces": {
        "include": ["db.col", "company.*"],
        "mapping": {
            "company.*": "company_*.col"
        }
    }
}
With this config, the company.employees collection read from MongoDB will be renamed and sent to the target system as company_employees.col instead. This may be useful if you want each collection to be mapped to a different Elasticsearch index.
Note that when replicating to Elasticsearch, the MongoDB database name, which will become the Elasticsearch index name, is always made lowercase.
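The wildcard substitution described for namespace mapping can be sketched like this: the text matched by the source wildcard is substituted for the wildcard in the destination. The function `rename_namespace` is a hypothetical illustration and not Mongo Connector's implementation.

```python
import re


def rename_namespace(source_pattern, dest_pattern, namespace):
    """Map namespace to its destination name, or return None if it does not match."""
    if "*" not in source_pattern:
        return dest_pattern if namespace == source_pattern else None
    # Capture whatever the wildcard matched in the source namespace.
    regex = re.escape(source_pattern).replace(r"\*", "(.*)")
    match = re.fullmatch(regex, namespace)
    if match is None:
        return None
    # Substitute the captured text for the destination wildcard.
    return dest_pattern.replace("*", match.group(1), 1)


print(rename_namespace("company.*", "company_*.col", "company.employees"))
# company_employees.col
```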
Command-line equivalent: --gridfs-set
Default: empty
Comma-separated list of GridFS root collections. For example, if GridFS metadata is stored in the test.fs.files collection, and chunks are stored in the test.fs.chunks collection, pass test.fs as the namespace.
Mongo Connector may use more than one DocManager at a time to support replicating to more than one location simultaneously. An array of DocManagers should be provided, even if that array only contains one DocManager configuration. Here we use <index> in the configuration key name to mean "at any index within the array". For example, docManagers.0.docManager means:
{"docManagers": [{"docManager": XXX}]}
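To show how an indexed name such as docManagers.0.docManager addresses a value inside the array, here is a small sketch that walks a parsed configuration along a dotted path. The helper `get_option` is hypothetical. The two DocManager names are the ones bundled with Mongo Connector, used here only as sample values.

```python
def get_option(config, dotted_name):
    """Look up a value by dot-notation, treating numeric parts as array indexes."""
    current = config
    for part in dotted_name.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current


config = {
    "docManagers": [
        {"docManager": "solr_doc_manager"},
        {"docManager": "mongo_doc_manager"},
    ]
}
print(get_option(config, "docManagers.0.docManager"))  # solr_doc_manager
print(get_option(config, "docManagers.1.docManager"))  # mongo_doc_manager
```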
Command-line equivalent: -d, --doc-manager
Default: doc_manager_simulator
Module name of the DocManager to use. Included in Mongo Connector are mongo_doc_manager, solr_doc_manager, and doc_manager_simulator. To write your own DocManager, see Writing Your Own DocManager.
The elastic_doc_manager is included in mongo-connector versions < 2.3, and only supports Elastic 1.x. For mongo-connector versions >= 2.3, doc managers for Elastic 1.x and 2.x are available as plugins.
Elastic 1.x doc manager: https://github.com/mongodb-labs/elastic-doc-manager
Elastic 2.x doc manager: https://github.com/mongodb-labs/elastic2-doc-manager
Command-line equivalent: -t, --target-url
Default: (no default)
URL to pass to the DocManager. For example, this should point to the base REST endpoint for a Solr core, or should be a MongoDB connection string, or the base REST endpoint for Elasticsearch.
Command-line equivalent: -u, --unique-key
Default: _id
What to call the _id field in a MongoDB document. This is useful for certain systems that call their primary key something else (e.g., Solr uses id instead).
Command-line equivalent: --auto-commit-interval
Default: no auto commit
Interval in seconds between when the DocManager forces the end system to flush changes. This doesn't apply to every system.
Command-line equivalent: (none)
Default: 1000
The number of documents that are sent in a single batch to the remote system.
Command-line equivalent: (none)
Default: (no default)
Any arbitrary keyword arguments to pass to the constructor of the DocManager. What arguments can be passed should be documented by the author of the DocManager.