MongoDB 5.0.6 Compatibility Issues #933

Open
bablf opened this issue Feb 15, 2022 · 7 comments

bablf commented Feb 15, 2022

Hey,

First up, please tell me what additional information is needed to debug this error.

I ran the following command:
mongo-connector --unique-key=id -n news-articles.articles -m localhost:27017 -t http://localhost:8983/solr/mongo_solr_collection -d solr_doc_manager

(Yes, I am passing the collection to -t because the core does not work; I get an Error 404.)

The connector starts and logs the following:

2022-02-15 18:23:34,584 [ALWAYS] mongo_connector.connector:50 - Python version: 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110]
2022-02-15 18:23:34,586 [ALWAYS] mongo_connector.connector:50 - Platform: Linux-5.10.0-11-amd64-x86_64-with-glibc2.31
2022-02-15 18:23:34,587 [ALWAYS] mongo_connector.connector:50 - pymongo version: 4.0.1
2022-02-15 18:23:34,587 [WARNING] mongo_connector.connector:170 - MongoConnector: Can't find /srv/news_crawler/oplog.timestamp, attempting to create an empty progress log
2022-02-15 18:23:34,597 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 5.0.6
2022-02-15 18:23:34,597 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0
2022-02-15 18:25:33,715 [ERROR] mongo_connector.util:97 - Call to Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset='rs0'), 'news-articles'), 'collection_names') failed too many times in retry_until_ok
Traceback (most recent call last):
  File "/srv/news_crawler/venv/lib/python3.9/site-packages/mongo_connector/util.py", line 79, in retry_until_ok
    return func(*args, **kwargs)
  File "/srv/news_crawler/venv/lib/python3.9/site-packages/pymongo/collection.py", line 2579, in __call__
    raise TypeError("'Collection' object is not callable. If you "
TypeError: 'Collection' object is not callable. If you meant to call the 'collection_names' method on a 'Database' object it is failing because no such method exists.

I don't know what to do with this. MongoDB is up and running:

Current Mongosh Log ID:	620be3f843ef12a095c537e4
Connecting to:		mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.1.9
Using MongoDB:		5.0.6
Using Mongosh:		1.1.9
For mongosh info see: https://docs.mongodb.com/mongodb-shell/
   The server generated these startup warnings when booting:
   2022-02-15T17:19:31.349+01:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
   2022-02-15T17:19:32.383+01:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
   2022-02-15T17:19:32.384+01:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
Warning: Found ~/.mongorc.js, but not ~/.mongoshrc.js. ~/.mongorc.js will not be loaded.
  You may want to copy or rename ~/.mongorc.js to ~/.mongoshrc.js.
rs0 [direct: primary] test>

and the designated collection has several thousand items in it:

rs0 [direct: primary] news-articles> db.articles.find().count()
147203

Is rs0 set up wrong?

And solr is also up and running:

$ curl http://localhost:8983/solr/admin/cores?action=STATUS
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "initFailures":{},
  "status":{
    "mongo_solr_collection":{
      "name":"mongo_solr_collection",
      "instanceDir":"/srv/solr-8.11.1/server/solr/mongo_solr_collection",
      "dataDir":"/srv/solr-8.11.1/server/solr/mongo_solr_collection/data/",
      "config":"solrconfig.xml",
      "schema":"managed-schema",
      "startTime":"2022-02-15T15:09:55.153Z",
      "uptime":9084767,
      "index":{
        "numDocs":0,
        "maxDoc":0,
        "deletedDocs":0,
        "indexHeapUsageBytes":0,
        "version":2,
        "segmentCount":0,
        "current":true,
        "hasDeletions":false,
        "directory":"org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/srv/solr-8.11.1/server/solr/mongo_solr_collection/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@5d83b5e6; maxCacheMB=48.0 maxMergeSizeMB=4.0)",
        "segmentsFile":"segments_1",
        "segmentsFileSizeInBytes":69,
        "userData":{},
        "sizeInBytes":69,
        "size":"69 bytes"}}}}

I really don't know what I am doing wrong. Would be great if someone could point me in the right direction. 🙏

bablf commented Feb 16, 2022

For now I think pymongo and MongoDB are the problem. I am using MongoDB 5.0.6 and tried different pymongo versions (3.6 through 3.12) because I saw a comment about this in another issue, #916.

The method collection_names was definitely available in 3.5 according to the 3.5-docs, but I still get the same error.
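For reference, the startup log above reports pymongo 4.0.1, and as far as I know Database.collection_names() was removed in PyMongo 4.0 in favor of list_collection_names() (available since 3.6). On a Database object, an unknown attribute is treated as a collection name, which would explain why the traceback complains that a 'Collection' object is not callable. Below is a minimal sketch of a version-tolerant lookup. The helper name get_collection_names is hypothetical and is not part of mongo-connector or pymongo.

```python
def get_collection_names(db):
    """Return the collection names of a pymongo Database across versions.

    Hypothetical helper. The method is looked up on the class rather than
    the instance, because attribute access on a Database instance falls
    back to returning a Collection for any unknown name.
    """
    if hasattr(type(db), "list_collection_names"):
        # PyMongo 3.6 and later provide list_collection_names().
        return db.list_collection_names()
    # Older PyMongo releases only offer collection_names().
    return db.collection_names()
```

The class-level check matters: hasattr(db, "collection_names") would be true on any PyMongo version, since the instance simply hands back a Collection with that name.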

RicardoM17 commented Feb 17, 2022

My free Atlas cluster was also updated to 5.0.6 yesterday, which seems to have broken mongo-connector. Now, when certain documents are updated, I just get a diff on the other end without any of the other fields, whereas before everything seemed to be working fine. I'll investigate, but any tips are welcome.

Good document:

{
    _id: ObjectId('620e7c84872ee87a1d736101'),
    component_id: 'gps',
    version: '1.30.0',
    deprecated: false
}

Bad document:

{
    _id: ObjectId('620e7c84872ee87a1d736101'),
    diff: {
        u: {
            deprecated: false
        }
    }
}

It's a shame that this project seems to be dead at this point. It is quite useful.

bablf changed the title from "ERROR: Call to Collection failed to many times" to "MongoDB 5.0.6 Compatibility Issues" on Feb 21, 2022

Aberion-e commented Apr 1, 2022

Hi @bablf, I got the same error as you with MongoDB. I have found a workaround if you're using Solr.

Solr Part


In solr_doc_manager.py, change this at line 292:

            batch = list(next(cleaned) for i in range(self.chunk_size))
            while batch:
                self.solr.add(batch, **add_kwargs)
                batch = list(next(cleaned)
                             for i in range(self.chunk_size))

to this:

            try:
                batch = list(next(cleaned) for i in range(self.chunk_size))
                while batch:
                    self.solr.add(batch, **add_kwargs)
                    batch = list(next(cleaned)
                                for i in range(self.chunk_size))
            except Exception:
                pass 
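A note on why this seems to work, as far as I can tell: since Python 3.7 (PEP 479), a StopIteration raised by next(cleaned) inside the generator expression is converted into a RuntimeError, and the try/except above swallows it. That also appears to drop the final partial batch, because the exception fires before the last list is built. The sketch below is an alternative that uses itertools.islice and avoids both problems. The helper name iter_batches is hypothetical, and the commented usage lines only mirror the snippet above.

```python
from itertools import islice


def iter_batches(iterable, chunk_size):
    """Yield lists of up to chunk_size items until the input is exhausted."""
    iterator = iter(iterable)
    while True:
        # islice stops quietly at the end of the input, so the last,
        # shorter batch is still returned instead of raising.
        batch = list(islice(iterator, chunk_size))
        if not batch:
            return
        yield batch


# Hypothetical use inside the doc manager, mirroring the snippet above:
# for batch in iter_batches(cleaned, self.chunk_size):
#     self.solr.add(batch, **add_kwargs)
```

This keeps real errors from self.solr.add visible rather than hiding them behind a bare "except Exception: pass".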

Downgrade pysolr to 3.8.1.


Mongo Part


Downgrade pymongo to 2.9.


This workaround works for MongoDB 5.0.6 with mongo-connector 3.1.1, pymongo 2.9, and pysolr 3.8.1.

Hope it helps you.

Yinkash100 commented

This helped me. Thank you.

I can't install pymongo 2.9, so I used 3.12.3.

gurmitteotia commented Sep 8, 2022

@RicardoM17 did you find a solution to your problem? I'm experiencing a problem where mongo-connector can successfully add and delete documents in the Solr instance but fails to update them. Version 4.x of MongoDB used to work fine.

My setup
Mongodb: 5.0.9
Solr: 9.0.0

RicardoM17 commented

> @RicardoM17 did you find solution to your problem. I'm experiencing the problem where mongo connector can successfully add and update the document to Solr instance but it failing to update the document. Version 4.x of mongodb used to work fine.
>
> My setup Mongodb: 5.0.9 Solr: 9.0.0

Hi @gurmitteotia, I don't remember exactly, but I believe what happened was the following. I was testing a new feature on my personal account. This worked because, when I first started, I was still on MongoDB 4.X.X. Then they rolled out the update to 5.X.X, which broke my stuff. That being said, I was already going to implement this in my company's MongoDB account, which has 4.2.22, so it is working there. I can't remember exactly if that was what "fixed" it or if I did something else, but I believe that to be the case.

I guess if you use a paid account you can still specify the version of MongoDB you want.

Sorry if I couldn't be of more help.

gurmitteotia commented Sep 8, 2022

Thanks @RicardoM17 ,

I investigated and found the reason why the document update was failing. I have also published a fix.

The primary reason is that the oplog entry format for document updates changed in MongoDB 5.x. In MongoDB 4.x, the following oplog entry is generated when fields are changed:

{
	"op" : "u",
	"ns" : "mydb.emps",
	"ui" : UUID("79dc60c3-2fa9-454f-a84e-8b6d694d4ad3"),
	"o" : {
		"$v" : 1,
		"$set" : {
			"company_name" : "new company"
		}
	},
	"o2" : {
		"_id" : ObjectId("6319e61bc84d0826deef55e8")
	},
	"ts" : Timestamp(1662641739, 1),
	"t" : NumberLong(1),
	"wall" : ISODate("2022-09-08T12:55:39.544Z"),
	"v" : NumberLong(2)
}

In MongoDB 5, however, the oplog entry has the following format:

{
	"op" : "u",
	"ns" : "mydb.col",
	"ui" : UUID("beebd3ac-fc25-41c9-98cf-90c87bfe79e0"),
	"o" : {
		"$v" : 2,
		"diff" : {
			"u" : {
				"company_name" : "new name",
				"website_url" : "https://www.newname.com1"
			}
		}
	},
	"o2" : {
		"_id" : ObjectId("5fa1ab5c184246d62af8fc46")
	},
	"ts" : Timestamp(1662644253, 1),
	"t" : NumberLong(4),
	"v" : NumberLong(2),
	"wall" : ISODate("2022-09-08T13:37:33.238Z")
}

The solr-doc-manager plug-in was misinterpreting the "update" document and adding it as a new document, which was being rejected by Solr. I have put a fix in the solr-doc-manager plug-in; you can find it in the mongo5-fix branch. I have added unit tests to make sure the fix works for both MongoDB 4 and MongoDB 5. I will try to get my fix reviewed and merged into the main repository.
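To illustrate the difference between the two entry shapes, here is a rough sketch that rewrites a "$v": 2 diff into the older $set/$unset form. This is not the code from the mongo5-fix branch, and the function names are hypothetical. Only the "u" section appears in the entries above; the handling of "i" (inserted fields), "d" (deleted fields), and "s"-prefixed sub-diffs reflects my understanding of the format and should be treated as an assumption. Array diffs are deliberately not handled.

```python
def diff_to_update_spec(diff, prefix=""):
    """Translate a $v:2 oplog diff into a $set/$unset style update spec."""
    spec = {"$set": {}, "$unset": {}}
    for key, value in diff.items():
        if key in ("u", "i"):
            # Updated ("u") and, by assumption, inserted ("i") fields.
            for field, new_value in value.items():
                spec["$set"][prefix + field] = new_value
        elif key == "d":
            # Assumption: "d" lists the fields that were removed.
            for field in value:
                spec["$unset"][prefix + field] = True
        elif key.startswith("s"):
            # Assumption: "s<field>" holds a nested diff for a subdocument.
            if value.get("a"):
                raise NotImplementedError("array diffs are not handled here")
            nested = diff_to_update_spec(value, prefix + key[1:] + ".")
            spec["$set"].update(nested.get("$set", {}))
            spec["$unset"].update(nested.get("$unset", {}))
        else:
            raise ValueError("unrecognized diff section: %r" % key)
    # Drop empty operators so the result resembles a 4.x style entry.
    return {op: fields for op, fields in spec.items() if fields}


def normalize_update(o):
    """Return a $set/$unset spec for the "o" field of either entry format."""
    if o.get("$v") == 2 and "diff" in o:
        return diff_to_update_spec(o["diff"])
    # MongoDB 4.x style entry: strip the version marker only.
    return {key: value for key, value in o.items() if key != "$v"}
```

With something like this, a doc manager could pass normalize_update(entry["o"]) to its existing update path for both MongoDB 4 and MongoDB 5 entries.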

Summary
The last released versions of mongo-connector (3.1.1) and solr-doc-manager (0.1.0) will not work with the latest versions of MongoDB (5.0.9) and Solr (9.0.0). You will encounter two issues:

  • A pymongo error, as originally reported by @bablf. The fix is to install pymongo version 3.12.3.
  • Document updates will fail. The solution is described above.
