Fix for wide-Unicode little-endian Py3k. #2

Open
wants to merge 1 commit into
base: master

@abarnert abarnert commented Oct 5, 2012

  • Inside ae.c, AE_GetCFStringRef assumes that the data inside a wide
    PyUnicode is in the same endianness that CF wants. But PyUnicode is
    native-endian, while kCFStringEncodingUTF32 is big-endian (if no BOM).
    We could write code to explicitly use kCFStringEncodingUTF32[LE|BE] as
    appropriate, or tack on a BOM to the start of a copy of the UTF-32
    and use kCFStringEncodingUTF32 as-is, or various other possibilities...
    but it's a lot simpler to use UTF8, and I doubt the performance will
    ever be an issue.
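
For illustration only, the three options in the commit message can be sketched in Python rather than C. This is an analogy under stated assumptions, not the code in ae.c: Python's utf-32-le, utf-32-be and utf-32 codecs stand in for the CF encodings, and the `native` bytes stand in for the buffer of a wide PyUnicode. Note that the generic utf-32 codec is only used here on data that carries a BOM.

```python
import codecs
import sys

text = "iTunes"
little = sys.byteorder == "little"

# Stand-in for a wide PyUnicode buffer: native-endian UTF-32, no BOM.
native_codec = "utf-32-le" if little else "utf-32-be"
native = text.encode(native_codec)

# Option 1: name the byte order explicitly (the [LE|BE] approach).
via_explicit = native.decode(native_codec)

# Option 2: prepend a BOM to a copy and let the generic decoder read it.
bom = codecs.BOM_UTF32_LE if little else codecs.BOM_UTF32_BE
via_bom = (bom + native).decode("utf-32")

# Option 3: go through UTF-8, which has no byte-order variants at all.
via_utf8 = text.encode("utf-8").decode("utf-8")

print(via_explicit, via_bom, via_utf8)
```

All three round-trip the string; the third is the only one that needs no byte-order decision, which is the argument for UTF8 made above.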

@abarnert
Author

abarnert commented Oct 5, 2012

To see the problem, you need a wide-Unicode Python 3 on an Intel Mac. The current python.org 3.3.0 installer will do.

$ python3 -c "import appscript; print appscript.app('iTunes')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/abarnert/src/github/appscript/py-appscript/trunk/build/lib.macosx-10.6-intel-3.3/appscript/reference.py", line 799, in __call__
    return self._appclass(*args, **kargs)
  File "/Users/abarnert/src/github/appscript/py-appscript/trunk/build/lib.macosx-10.6-intel-3.3/appscript/reference.py", line 734, in __init__
    constructor, identifier = 'path', aem.findapp.byname(name)
  File "/Users/abarnert/src/github/appscript/py-appscript/trunk/build/lib.macosx-10.6-intel-3.3/aem/findapp.py", line 47, in byname
    name = _findapp(name)
  File "/Users/abarnert/src/github/appscript/py-appscript/trunk/build/lib.macosx-10.6-intel-3.3/aem/findapp.py", line 15, in _findapp
    return findapplicationforinfo(creator, id, name)
aem.ae.MacOSError: -50

You might be able to cause the same problem with a wide-Unicode (UTF32) Python 2 by using u'iTunes' instead of 'iTunes', but I haven't tested. Narrow unicode (UTF16) has a different, less serious bug. None of this is a problem on PowerPC builds, because the underlying problem is endianness-related.

The error -50 means "bad parameters". It happens because we're passing a Python native-endian (here little-endian) UTF32 string to CoreFoundation as if it were a big-endian UTF32 string, which gives us invalid Unicode, which LaunchServices rejects.
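
A small Python sketch of why the result is invalid Unicode. It uses only the standard codecs, so it can be run anywhere; it imitates the byte-order mismatch and does not call CoreFoundation or LaunchServices.

```python
MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point

name = "iTunes"

# Little-endian UTF-32, as found in a wide PyUnicode on an Intel Mac.
le_bytes = name.encode("utf-32-le")

# Read the first 4-byte unit as big-endian: 'i' (0x69) becomes 0x69000000.
first_unit_as_be = int.from_bytes(le_bytes[:4], "big")
is_valid_code_point = first_unit_as_be <= MAX_CODE_POINT

# Python's own big-endian decoder rejects the same bytes.
try:
    le_bytes.decode("utf-32-be")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(hex(first_unit_as_be), is_valid_code_point, decode_failed)
```

Every ASCII character ends up far above the valid range when its bytes are read in the wrong order, so the whole string is garbage to a big-endian consumer.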

@mattneub
Owner

mattneub commented Oct 5, 2012

I wanted first to test the original problem, so I started with py-appscript as it stands. I installed Python 3.3 and ran the py-appscript python3 setup.py. I can't get as far as your test. This works fine:

$ python -c "import appscript; print appscript.app('Finder')"
app(u'/System/Library/CoreServices/Finder.app')

(I chose the Finder because we know there are problems with iTunes.) But with python3:

$ python3 -c "import appscript; print appscript.app('Finder')"
  File "<string>", line 1
    import appscript; print appscript.app('Finder')
                                    ^
SyntaxError: invalid syntax

I don't know much about python; can you explain how I can get further? It isn't just a command-line thing; the same thing happens when running a script file.

@mattneub
Owner

mattneub commented Oct 5, 2012

Okay, we got past that; it turns out you have to say:

python3 -c "import appscript; print(appscript.app('Finder'))"

Let's try to be accurate here.

@mattneub
Owner

mattneub commented Oct 5, 2012

Next question. I'm not seeing any problem with python 2, so why are we patching appscript_2x/ext/ae.c?

@abarnert
Author

abarnert commented Oct 5, 2012

Apologies for the parens.

The main reason to fix 2x is that the code is identical. The relevant types are defined the same way, so this can't possibly work. If it can be triggered, it will have the same bug; if it can't be triggered it doesn't matter either way. Meanwhile, the fact that the two functions are nearly identical, as with most of the other code in ae.c, makes maintenance and debugging easier. For example, if they had been radically different when I was looking to fix this bug, I would have wasted time trying to figure out why they're different, which change is relevant to the bug, etc., before spotting the problem.

I'll make a utf32 py2.7 build to test it, because I'm not positive that app(u'iTunes') would trigger this code. It's worth having a test to be sure. And if it can't be triggered, it might be more reasonable to just remove the code. But I think the virtue of having the two branches be identical would still be a good argument.

@abarnert
Author

abarnert commented Oct 5, 2012

Verified that the problem exists in Python 2.

First you need to get a wide-Unicode build. Apple's builds are narrow, as are the python.org installers. If you're not sure about what you have:

python -c 'import sys; print(sys.maxunicode)'

For 2.2 through 3.2, this will be 65535 for narrow, 1114111 for wide.
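
If you'd rather have the check spelled out, here is a minimal sketch. The helper name `unicode_build` is hypothetical, made up for this comment; the only real API used is sys.maxunicode.

```python
import sys

def unicode_build(maxunicode=None):
    """Classify a build as 'narrow' or 'wide' from sys.maxunicode."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow builds stop at 0xFFFF (65535); wide builds reach 0x10FFFF (1114111).
    return "wide" if maxunicode > 0xFFFF else "narrow"

print(sys.maxunicode, unicode_build())
```

On 3.3 and later sys.maxunicode is always 1114111, so this only tells you something useful on 2.2 through 3.2.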

The easiest way to get a wide-Unicode build is to build it:

./configure --enable-unicode=ucs4
make

You can run make install, but you can also just run it in place by specifying the full path .../Python-2.7.3/python.exe. Since you won't have distribute/easy_install/pip, you'll need to download and untar appscript manually, then:

../Python-2.7.3/python.exe setup.py build_ext -i
PYTHONPATH=appscript_2x/lib ../Python-2.7.3/python.exe -c 'import appscript; print(appscript.app("Finder"))'

You'll get the exact same error as with the official 3.3.0 package. And the same fix works.

@abarnert
Author

abarnert commented Oct 5, 2012

One more thing: The official Python 3.3.0 installer isn't actually wide Unicode; that distinction no longer matters. But it acts as if it were wide for the purposes of Py_UNICODE_WIDE, PyUnicode_AsUnicode, etc. See http://docs.python.org/py3k/whatsnew/3.3.html#pep-393 for details.
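
A rough way to see the 3.3 behaviour from Python itself, assuming CPython 3.3 or later (sys.getsizeof results are implementation-specific, so treat the sizes as indicative only):

```python
import sys

n = 1000
latin1 = "a" * n           # fits in 1 byte per character
bmp = "\u0100" * n         # needs 2 bytes per character
astral = "\U00010000" * n  # needs 4 bytes per character

# Storage grows with the widest character in the string, yet
# sys.maxunicode reports the full range regardless.
sizes = [sys.getsizeof(s) for s in (latin1, bmp, astral)]
print(sys.maxunicode, sizes)
```

So the build is neither narrow nor wide internally, even though the legacy Py_UNICODE APIs behave as described above.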

Anyway, I tested with an actual wide 3.x (built from the 3.2.3 source, just like the 2.7.3 above) to verify that the same problem exists, and the same fix works.
