deadlock in org.fusesource.lmdbjni.JNI.mdb_txn_begin? #70

Open
jayenashar opened this issue Jun 3, 2016 · 9 comments

Comments

@jayenashar

jayenashar commented Jun 3, 2016

I'm trying to convert from my old db to lmdb, using 8 threads in parallel. There are about 20000 entries, but after about 6000 calls to org.fusesource.lmdbjni.Database#put(byte[], byte[]), I either hit a SIGSEGV or all the threads lock up with this stack trace:

      at org.fusesource.lmdbjni.JNI.mdb_txn_begin(JNI.java:-1)
      at org.fusesource.lmdbjni.Env.createTransaction(Env.java:453)
      at org.fusesource.lmdbjni.Env.createWriteTransaction(Env.java:411)
      at org.fusesource.lmdbjni.Database.put(Database.java:394)
      at org.fusesource.lmdbjni.Database.put(Database.java:386)

I'm using my fork of 0.4.7-SNAPSHOT. Increasing the map size works, but I would have expected an MDB_MAP_FULL instead of a hang.

@krisskross
Member

Try one thread with an external transaction instead, and commit when done.

Example:

 try (Transaction tx = env.createWriteTransaction()) {
   db.put(tx, key, val);
   ...
   tx.commit();
 }

@krisskross
Member

LMDB is single-writer so other writing threads would block anyway.
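
As a rough illustration of what single-writer means for a test like this, here is a toy model in plain Java. It does not use lmdbjni: a ReentrantLock stands in for LMDB's writer lock, and the class and method names (SingleWriterModel, put, run) are made up for this sketch. It only shows that with one writer lock, puts from 8 threads never overlap, so extra writer threads add contention rather than throughput.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: a ReentrantLock stands in for LMDB's writer lock.
// None of these names come from lmdbjni.
class SingleWriterModel {
  private final ReentrantLock writerLock = new ReentrantLock();
  private final AtomicInteger activeWriters = new AtomicInteger();
  private final AtomicInteger maxActiveWriters = new AtomicInteger();
  private final AtomicInteger completedPuts = new AtomicInteger();

  // Models one auto-committed put: take the writer lock, write, release.
  void put() {
    writerLock.lock(); // every other writer blocks here
    try {
      int now = activeWriters.incrementAndGet();
      maxActiveWriters.accumulateAndGet(now, Math::max);
      completedPuts.incrementAndGet();
      activeWriters.decrementAndGet();
    } finally {
      writerLock.unlock();
    }
  }

  int maxActiveWriters() { return maxActiveWriters.get(); }

  int completedPuts() { return completedPuts.get(); }

  // Starts `threads` writers, each doing `putsPerThread` puts, and waits for them.
  static SingleWriterModel run(int threads, int putsPerThread) {
    SingleWriterModel model = new SingleWriterModel();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < putsPerThread; i++) {
          model.put();
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        pool.shutdownNow();
        throw new IllegalStateException("writers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return model;
  }

  public static void main(String[] args) {
    SingleWriterModel model = run(8, 15000);
    System.out.println("puts=" + model.completedPuts()
        + " maxActiveWriters=" + model.maxActiveWriters());
  }
}
```

In this model the number of simultaneously active writers never exceeds one, however many threads are started.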

If your application hits a SIGSEGV, you're probably using the API incorrectly. Show me the code and I may be able to help you.

@jayenashar
Author

Yes, single-threaded, it gives an MDB_MAP_FULL. I haven't tried with a single transaction, but I imagine that would not hang or SIGSEGV.

I got trigger happy and deleted my old db, but here's a small testcase with the same symptoms.

  @Test
  public void testStress() {
    Collections.nCopies(8, null).parallelStream().forEach(new Consumer<Object>() {
      @Override
      public void accept(Object ignored) {
        Random random = new Random();
        for (int i = 0; i < 15000; i++) {
          db.put(bytes(Long.toString(random.nextLong())), bytes(Long.toString(random.nextLong())));
        }
      }
    });
  }

@krisskross
Member

krisskross commented Jun 4, 2016

Hmm. Yes, I get the same SIGSEGV. The test runs fine with an ExecutorService though.

    ExecutorService service = Executors.newFixedThreadPool(8);
    service.execute(() -> {
      Random random = new Random();
      for (int i = 0; i < 15000; i++) {
        db.put(bytes(Long.toString(random.nextLong())), bytes(Long.toString(random.nextLong())));
      }
    });

MDB_MAP_FULL means the environment has reached its configured map size limit, i.e. the database is full. You can raise the limit by calling Env.setMapSize before opening the environment.

@krisskross
Member

The SIGSEGV happens when the transaction aborts, which is strange. A put should always either succeed immediately or block and then succeed once the write lock is released.

I can also see the hang now sometimes. Sometimes a thread seems to hang in mdb_put, and sometimes all threads block at mdb_txn_begin.

And all this only happens with ForkJoin. Weird.

@krisskross
Member

krisskross commented Jun 4, 2016

It may well be a bug in LMDB. But maybe not since it works with an ExecutorService.

@jayenashar
Author

I'm not sure I understand your ExecutorService example. Doesn't that only execute the loop once and not 8 times concurrently?

Yeah, I got it all working (serial and parallel) with Env.setMapSize. Sorry I didn't mention that earlier.

I haven't observed it hang in mdb_put, but I suppose at this point you've run it more times than I have. Hopefully all three symptoms will have the same fix.

@krisskross
Member

Sorry, my bad about the ExecutorService example.
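
For reference, a version of the ExecutorService harness that really runs the loop on 8 threads might look like the sketch below. The helper name runOnAllWorkers and the 5-minute timeout are assumptions for illustration, and the loop body uses a counter as a stand-in for the db.put(...) call so the snippet compiles without lmdbjni.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness, not part of lmdbjni or its test suite.
class ConcurrentStressSketch {

  // Submits the same task once per worker, so it runs `workers` times concurrently,
  // then waits for all submissions to finish.
  static void runOnAllWorkers(int workers, Runnable task) {
    ExecutorService service = Executors.newFixedThreadPool(workers);
    for (int w = 0; w < workers; w++) {
      service.execute(task);
    }
    service.shutdown(); // stop accepting new tasks; already submitted ones keep running
    try {
      // The timeout turns a hang into a visible failure instead of blocking forever.
      if (!service.awaitTermination(5, TimeUnit.MINUTES)) {
        service.shutdownNow();
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    AtomicInteger puts = new AtomicInteger();
    runOnAllWorkers(8, () -> {
      for (int i = 0; i < 15000; i++) {
        // Stand-in for the db.put(...) call used in the test from this thread.
        puts.incrementAndGet();
      }
    });
    System.out.println("puts issued: " + puts.get());
  }
}
```

Unlike a single service.execute(...) call, this submits the loop once per pool thread, which matches the 8-way parallelism of the parallelStream test.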

The problem only seems to manifest when the database is too small. Do you see this as well?

@jayenashar
Author

env.setMapSize(50, ByteUnit.MEBIBYTES); - OK
env.setMapSize(30, ByteUnit.MEBIBYTES); - OK
env.setMapSize(20, ByteUnit.MEBIBYTES); - OK
env.setMapSize(15, ByteUnit.MEBIBYTES); - OK
env.setMapSize(10, ByteUnit.MEBIBYTES); - OK
env.setMapSize(5, ByteUnit.MEBIBYTES); - SIGSEGV
env.setMapSize(8, ByteUnit.MEBIBYTES); - hang in mdb_txn_begin
env.setMapSize(9, ByteUnit.MEBIBYTES); - OK

I thought 10 MiB was the default, so I'm not sure why it has issues without setting the map size.
