Speed-up character data loading
In Python 3, sys.maxunicode is 0x10FFFF, which made this loading loop
slow, and scanning the full range is not necessary in our code.

Also clear some now-unused data to reduce memory use.
SeaHOH authored and Kronuz committed Jan 26, 2022
1 parent 809cb6e commit 298c7aa
Showing 1 changed file with 4 additions and 3 deletions.
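The optimization can be sketched on plain Python 3, where the `compat` shims `uchr` and `xrange` are simply `chr` and `range`. Scanning only the Basic Multilingual Plane (code points 0x0000-0xFFFF) does 0x10000 iterations instead of `sys.maxunicode + 1` == 0x110000, a 17x reduction. The exact set of letter categories below is truncated in the diff, so the `Lt`/`Lm`/`Lo`/`Nl` entries here are the conventional JavaScript identifier-letter categories, assumed for illustration:

```python
import unicodedata
from collections import defaultdict

# Build a Unicode-category -> characters table, scanning only the BMP
# (0x0000-0xFFFF) rather than all sys.maxunicode + 1 code points.
U_CATEGORIES = defaultdict(list)
for cp in range(0x10000):
    c = chr(cp)
    U_CATEGORIES[unicodedata.category(c)].append(c)

# Assumed letter categories (the diff truncates the real list).
UNICODE_LETTER = set(
    U_CATEGORIES['Lu'] + U_CATEGORIES['Ll'] + U_CATEGORIES['Lt'] +
    U_CATEGORIES['Lm'] + U_CATEGORIES['Lo'] + U_CATEGORIES['Nl']
)
```

Characters outside the BMP (e.g. emoji, supplementary-plane scripts) are simply absent from the table, which the commit message asserts is acceptable for this code.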
7 changes: 4 additions & 3 deletions esprima/character.py
@@ -23,16 +23,14 @@

 from __future__ import absolute_import, unicode_literals

-import sys
-
 import unicodedata
 from collections import defaultdict

 from .compat import uchr, xrange

 # https://stackoverflow.com/questions/14245893/efficiently-list-all-characters-in-a-given-unicode-category
 U_CATEGORIES = defaultdict(list)
-for c in map(uchr, xrange(sys.maxunicode + 1)):
+for c in map(uchr, xrange(0x10000)):
     U_CATEGORIES[unicodedata.category(c)].append(c)
 UNICODE_LETTER = set(
     U_CATEGORIES['Lu'] + U_CATEGORIES['Ll'] +
@@ -82,6 +80,9 @@
 OCTAL_DIGIT = set(OCTAL_CONV.keys())
 HEX_DIGIT = set(HEX_CONV.keys())

+del U_CATEGORIES, UNICODE_LETTER, UNICODE_COMBINING_MARK
+del UNICODE_DIGIT, UNICODE_CONNECTOR_PUNCTUATION
+del DECIMAL_CONV, OCTAL_CONV, HEX_CONV

 class Character:
     @staticmethod
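The second hunk frees the intermediate tables once the module's derived sets have been built: module-level names that are only needed during import can be `del`eted so the interpreter may reclaim their memory. A minimal sketch of that pattern, with a hypothetical `IDENTIFIER_START` standing in for the derived sets the real module keeps:

```python
import unicodedata
from collections import defaultdict

# Import-time intermediate: large category -> characters table.
U_CATEGORIES = defaultdict(list)
for cp in range(0x10000):
    c = chr(cp)
    U_CATEGORIES[unicodedata.category(c)].append(c)

# Hypothetical derived set kept for runtime lookups (for illustration;
# not the actual names in esprima/character.py).
IDENTIFIER_START = frozenset(
    U_CATEGORIES['Lu'] + U_CATEGORIES['Ll'] + ['$', '_']
)

# The big table is only needed while building the derived set;
# deleting the name drops the last reference to it.
del U_CATEGORIES
```

Only throwaway intermediates should be deleted this way; anything referenced later at runtime (here, `IDENTIFIER_START`) must survive.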
