Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Still having large bit precision calculation issues #522

Open
wjamesjr opened this issue Oct 14, 2024 · 45 comments
Open

Still having large bit precision calculation issues #522

wjamesjr opened this issue Oct 14, 2024 · 45 comments

Comments

@wjamesjr
Copy link

When using mpfrs at high precision (> 2 ** 30 bits) algorithms fail. I have verified that mpfr works as expected using a C++ wrapper and test pi agm code. Using very similar code with gmpy, the code fails (agm fails to converge). I am attaching some test code to demonstrate the problem. At 250 million digits the agm routine works, at 400 million digits the agm routine fails to converge. The c++ wrapper had a similar issue that was determined to be an argument in a call that needed to be a long that was an int.
PiAGM_GMPY.py.zip

@skirpichev
Copy link
Contributor

Using very similar code with gmpy

Attached archive has only Python code. We can't guess how "similar" it is to your C/C++ code.

@wjamesjr
Copy link
Author

C++ code is attached.
main.cpp.zip

@skirpichev
Copy link
Contributor

Well, maybe I miss something else (versions use different variable name conventions, etc; it's not trivial to compare literally), but so far you code examples aren't "identical". For example:

  • to set precision you are using math.log(10) / math.log(2) coefficient, instead of more precise math.log2(10). Either way it's not an equivalent of mpfr::digits2bits().
  • you are using different initial values for y0/g. sqrt(0.5) != 1/sqrt(mpfr("2.0").
  • finally, mpf("10.E-digits") != mpfr("10.0")**-digits.

So, if you C++ code works correctly, please double check that Python code match it.

@wjamesjr
Copy link
Author

Made changes to Python source as suggested. Still exhibits same issue with large bit precision.
PiAGM_GMPY.py.zip

@skirpichev
Copy link
Contributor

So far you just changed math.log(10) / math.log(2) to math.log2(10)

@wjamesjr
Copy link
Author

bcf = math.log2(10)

g, z, half = sqrt(mpfr("0.5")), mpfr("0.25"), mpfr("0.5")

epsilon = mpfr("10.E-" + str(needed_digits))

The above three lines were changes.

@casevh
Copy link
Collaborator

casevh commented Oct 20, 2024

I have identified a bug in gmpy2. The new exponent limits assigned to a context are not being used to configure the MPFR library. The are currently only used when normalizing a final result.

The easy fix is to configure the MPFR library to used the maximum exponent on startup. The MPFR exponent range is thread local setting. So unilaterally change the value may impact other CPython applications (SageMATH) that may be running in the same thread.

The better option is saving the current exponent range, setting the new exponent range, calling MPFR, restoring the old exponent range.

I prefer the second option but it will be more complicated.

@wjamesjr
Copy link
Author

Glad to hear that you found a problem. I am in no hurry for a solution, so take whatever time is needed to get it right.

@skirpichev
Copy link
Contributor

@wjamesjr, just to confirm, could you try the "easy fix"?

diff --git a/src/gmpy2_context.c b/src/gmpy2_context.c
index fecd79af..e4c776da 100644
--- a/src/gmpy2_context.c
+++ b/src/gmpy2_context.c
@@ -1034,6 +1034,7 @@ GMPy_CTXT_Set_emin(CTXT_Object *self, PyObject *value, void *closure)
         return -1;
     }
     self->ctx.emin = exp;
+    mpfr_set_emin(exp);
     return 0;
 }

@@ -1067,6 +1068,7 @@ GMPy_CTXT_Set_emax(CTXT_Object *self, PyObject *value, void *closure)
         return -1;
     }
     self->ctx.emax = exp;
+    mpfr_set_emax(exp);
     return 0;
 }

@wjamesjr
Copy link
Author

Sorry I have never built gmpy from source. Always used pip.

@skirpichev
Copy link
Contributor

It's not too hard and documented:
https://gmpy2.readthedocs.io/en/latest/install.html#from-sources

Just apply the patch before doing pip install ..

@wjamesjr
Copy link
Author

I tried to follow instructions and it timed out on my Mac. See below:

Last login: Sun Oct 20 03:49:44 on ttys000
williamjames@Mac gmpy % git clone git://github.com/aleaxit/gmpy.git
Cloning into 'gmpy'...
fatal: unable to connect to github.com:
github.com[0: 140.82.112.3]: errno=Operation timed out

@skirpichev
Copy link
Contributor

Ah, perhaps it's some firewall settings on your side. git clone https://github.com/aleaxit/gmpy.git - probably should work as you comment on the github.com.

@wjamesjr
Copy link
Author

Got it cloned and source change made now have an error on pip install:

williamjames@Mac gmpy % pip install -e .[tests]
zsh: no matches found: .[tests]

directory looks like:

COPYING MANIFEST.in demo pyproject.toml src
COPYING.LESSER README.rst docs scripts test
INSTALL TODO gmpy2 setup.py test_cython

@skirpichev
Copy link
Contributor

skirpichev commented Oct 20, 2024 via email

@wjamesjr
Copy link
Author

OK just pip install . worked but now 'gmp.h' can't be found . gmp, mpfr and mpc are installed but since they were installed by home-brew they are installed in non-standard locations. I ran into this before that stopped me from compiling open source packages.

@wjamesjr
Copy link
Author

Installed gmp, mpfr and libmpc with macports, pip install . still can't find gmp.h.

@skirpichev
Copy link
Contributor

Hmm, probably you can set correct CFLAGS and LDFLAGS like this:

PGMP="$(brew --prefix gmp)"
PMFR="$(brew --prefix mpfr)"
PMPC="$(brew --prefix libmpc)"
CFLAGS="-I$PGMP/include -I$PMFR/include -I$PMPC/include" LDFLAGS="-L$PGMP/lib -L$PMFR/lib -L$PMPC/lib" pip install .

@casevh
Copy link
Collaborator

casevh commented Oct 21, 2024 via email

@wjamesjr
Copy link
Author

@wjamesjr, just to confirm, could you try the "easy fix"?

diff --git a/src/gmpy2_context.c b/src/gmpy2_context.c
index fecd79af..e4c776da 100644
--- a/src/gmpy2_context.c
+++ b/src/gmpy2_context.c
@@ -1034,6 +1034,7 @@ GMPy_CTXT_Set_emin(CTXT_Object *self, PyObject *value, void *closure)
         return -1;
     }
     self->ctx.emin = exp;
+    mpfr_set_emin(exp);
     return 0;
 }

@@ -1067,6 +1068,7 @@ GMPy_CTXT_Set_emax(CTXT_Object *self, PyObject *value, void *closure)
         return -1;
     }
     self->ctx.emax = exp;
+    mpfr_set_emax(exp);
     return 0;
 }

I finally got the "easy fix" to build and tested it against the test case that I provided and it did not work for me. That being said I am not confident that I have done everything right. Skirpichev, thank you for your patience, I can now build the latest version in the future if I need to.

Case, thank you for your explanation of a fix. I wasn't expecting it to be this complicated.

@wjamesjr
Copy link
Author

Figured out what I was doing wrong. I was running Python 3.13 interpreter and never using the modified code. I had already backed out the "quick fix" change and changed to the two line addition to gmpy2.c when I figured this out. Ran a test against the test code I submitted and using Python 3.11 it worked perfectly. How can you modify the downloaded code to use 3.13 instead of 3.11?

@skirpichev
Copy link
Contributor

Usually it's a good idea to use virtual environments, especially if you are using different interpreter versions, work on different projects, etc.

E.g. after git clone ...gmpy.git && cd gmpy && patch the gmpy2_context.c, do:

python3.13 -m venv .venv/3.13/  # create virtual environment
. .venv/3.13/bin/activate  # activate this virtual environment
pip install .  # install gmpy2 into this environment
python /path/to/PiAGM_GMPY.py  # run your script

@casevh
Copy link
Collaborator

casevh commented Oct 22, 2024

I created a branch called issue522. The resulting binaries can be found at https://github.com/aleaxit/gmpy/actions/runs/11452424995

Download the wheels.zip file at the bottom of the page and extract the appropriate version for you system. It can be installed with the command "python3.13 -m pip install gmpy2-2.2.1-cp313-cp313-macosx_11_0_arm64.whl". (I think that is correct version for you system.)

I've already tested the Windows version and it fails. The MPFR exponent type is defined as a C long so it is only 32 bits on Windows platform.

@wjamesjr
Copy link
Author

Tested against Python 3.11, 3.12, and 3.13 on arm64 Mac. All three worked.

@casevh
Copy link
Collaborator

casevh commented Oct 22, 2024

Thanks for checking. I completed a successful test for 1_000_000_000 digits in slightly less than 2 hours. Now testing 2_000_000_000.

@wjamesjr
Copy link
Author

I am running into a problem now converting large mpfrs to a string. It actually causes a Python crash. I am now trying to find the hard cutoff. Suspect it is about 2**31 decimal digits (2147483648). Working to verify it now. Are you aware of a hard limit for max digits for mpfr to string conversions?

@skirpichev
Copy link
Contributor

What kind of "crash", have you seen some error message? How you do conversion? E.g. str/repr/format.

@wjamesjr
Copy link
Author

I am running it within PyCharm. I get a popup window that says that Python quit unexpectedly. I am attaching a simple test case that will demonstrate the problem. It causes Python to quit on my Mac.
test.py.zip

@wjamesjr
Copy link
Author

Error message is:

Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)

Using str(mpfr) to convert.

@wjamesjr
Copy link
Author

Exit code 139 is a seg fault. The conversion to a string is taking some time so it isn't happening immediately after start of conversion. My guess is some code is writing beyond a memory buffer.

@skirpichev
Copy link
Contributor

My guess is some code is writing beyond a memory buffer.

Can you try to reproduce this in C? str(mpfr) internally just calls "{0:.pg}".format(mpfr), where

p = (long)(log10(2) * (double)mpfr_get_prec(MPFR(self))) + 2;

which in turn calls mpfr_asprintf(ppbuf, "%.pRg", mpfr).

Looking on the code, I don't think problem is on our side:

GMPy_MPFR_Str_Slot(MPFR_Object *self)

GMPy_MPFR_Format(PyObject *self, PyObject *args)

@wjamesjr
Copy link
Author

Looking on the code, I don't think problem is on our side:

I agree. The mpfr_asprintf call returns an int with the number of characters placed in the buffer. int on my system is an int32. So the call is incapable of returning a value greater than 2,147,483,647. This is why 2,147,483,648 is failing for my test case. I am guessing that the call is returning a negative value for number of characters written in this case. The C++ wrapper that I was using checks for a negative return.

inline std::string mpreal::toString(const std::string& format) const
{
char *s = NULL;
std::string out;

if( !format.empty() )
{
    if(!(mpfr_asprintf(&s, format.c_str(), mpfr_srcptr()) < 0))
    {
        out = std::string(s);

        mpfr_free_str(s);
    }
}

return out;

}

So it looks like mpfr is not capable of converting mpfrs whose length is greater than 2,147,483,647. It looks like the gmpy2 code does not check for a negative value and presses forward. Is that an issue to fix.

@skirpichev
Copy link
Contributor

Good catch. See #526, I would appreciate testing (wheels could be downloaded at https://github.com/aleaxit/gmpy/actions/runs/11478047408?pr=526).

On another hand, e.g. on my system - INT_MAX is 2147483647, while gmpy2.get_max_precision() reports 9223372036854775551. Perhaps, we should use the mpfr_out_str() (returns size_t) in str/repr instead.

@wjamesjr
Copy link
Author

Good catch. See #526, I would appreciate testing (wheels could be downloaded at https://github.com/aleaxit/gmpy/actions/runs/11478047408?pr=526).

Tested against 2147483648 digit precision on Python 3.11, 3.12, 3.13 and received the message below:

Traceback (most recent call last):
File "/Users/williamjames/PycharmProjects/stringtest/test.py", line 16, in
s1 = str(z)
RuntimeError: Number of characters written exceeds the INT_MAX

Tested against 2147483600 and got normal conversion.

mpfr_out_str() looks promising!

@wjamesjr
Copy link
Author

I did some testing to see if there is a precision at the boundary from returning a string and throwing an exception. Setting a precision of 2147483644 in my test script is the boundary and it still seg faults. Maybe it is returning a zero versus a negative value??

@wjamesjr
Copy link
Author

mpfr_get_str writes directly to a string but separates the exponent into a separate variable. It has size_t character size which makes it viable. But I don't know how it can be a replacement for str() since the formatting is different.

@wjamesjr
Copy link
Author

The C++ wrapper http://www.holoborodko.com/pavel/mpfr/ that I have been testing has a toString() call. It has a conditional if construct that uses mpfr_asprintf() for mpfr versions 2.4.0+ and mpfr_get_str for versions less than 2.4.0. I am modifying the code to force the use of mpfr_get_str and see how well it works.

@casevh
Copy link
Collaborator

casevh commented Oct 24, 2024

I have some good news. I accidentally solved this issue years ago.

The digits() method of an mpfr returns a tuple containing the digits in the mantissa, the exponent, and the precision. It calls mpfr_get_str which doesn't appear to have a size limit.

Maybe we just update some documentation and the error message in #526?

>>> from gmpy2 import *
>>> get_context().precision=8_500_000_000
>>> a=mpfr(1)/7
>>> b=a.digits()
>>> len(b[0])
2558754965
>>>

@skirpichev
Copy link
Contributor

Setting a precision of 2147483644 in my test script is the boundary and it still seg faults.

Did you check error message, it has SIGSEGV?

Perhaps, you could try following C program and check it's behaviour near the limit:

/* compile with cc a.c -lmpfr */
#include <stdio.h>
#include <stdlib.h>
#include <mpfr.h>
#include <math.h>
int main(int argc, char* argv[])
{
    mpfr_t x;
    mpfr_init_set_ui(x, 3, MPFR_RNDN);
    char *buf, fmt[20];
    long dps = INT_MAX + atoi(argv[1]);
    sprintf(fmt, "%%#.%ldRg", dps);

    int sz = mpfr_asprintf(&buf, fmt, x);
    if (sz != -1) {
//      printf("%s\n", buf);
        mpfr_free_str(buf);
    }

    mpfr_clear(x);
    return 0;
}
?

The digits() method of an mpfr returns a tuple containing the digits in the mantissa, the exponent, and the precision. It calls mpfr_get_str which doesn't appear to have a size limit.

This doesn't fix str/repr. On another hand, now I'm not sure that current str/repr is helpful for big precision.

Maybe we just update some documentation and the error message in #526?

What you suggest?

@wjamesjr
Copy link
Author

Did you check error message, it has SIGSEGV?

I ran it again and the error message is different:

GNU MP: Cannot reallocate memory (old_size=2147487744 new_size=18446744071562067968)

Process finished with exit code 134 (interrupted by signal 6:SIGABRT)

@wjamesjr
Copy link
Author

I have some good news. I accidentally solved this issue years ago.

The digits() method of an mpfr returns a tuple containing the digits in the mantissa, the exponent, and the precision. It calls mpfr_get_str which doesn't appear to have a size limit.

The digits() method meets my requirement.

@skirpichev
Copy link
Contributor

I ran it again and the error message is different: GNU MP: Cannot reallocate memory

That's ok. It means you run out of memory. The GMP in such cases just calls abort() and we can't recover from this. So, #526 - seems to be working.

@wjamesjr
Copy link
Author

Maybe we just update some documentation and the error message in #526?

What you suggest?

What would the documentation change be? Maybe something like the below:

There are two methods of displaying mpfrs, str() and digits(). str() is most useful for smaller values and digits() for larger ones. str() will fail for string lengths greater than about 2147483647.

@wjamesjr
Copy link
Author

I am now good with the changes made. I can calculate with high precision and display using digits().

@skirpichev
Copy link
Contributor

There are two methods of displaying mpfrs, str() and digits(). str() is most useful for smaller values and digits() for larger ones. str() will fail for string lengths greater than about 2147483647.

I don't think this belongs to the error message.

  1. it's too verbose
  2. the exact bound depends on system

@casevh, I think #526 may be merged as is.

Perhaps, we should include some hints to use the digits function/method in the __str__/__repr__ docstring? But I think it's better just fix those dunders to remove that limitation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants