Some image name strings are exponential in parse times WRT length #12

Open
MrCreosote opened this issue Oct 9, 2024 · 1 comment

@MrCreosote

I'm not sure what's causing this, so this is probably only a fraction of the cases where it can occur, but there are certain image name strings whose parse time grows exponentially with the length of the name:

In [1]: from docker_image import reference

In [2]: import time

In [3]: def test_char(name, char):
   ...:     # Time one parse; always return the elapsed seconds, even if parsing succeeds
   ...:     t1 = time.time()
   ...:     try:
   ...:         reference.Reference.parse_normalized_named(f"{name}:0.15{char}.0")
   ...:     except reference.ReferenceInvalidFormat:
   ...:         pass
   ...:     return time.time() - t1
   ...: 

In [4]: namefull = "abcdefghijklmnopqrst"

In [7]: for ln in range(10, 21, 5):
   ...:     name = namefull[:ln]
   ...:     print(f"Name len: {len(name)}")
   ...:     for char in "[]$()`&|<>;,\n{}'\"":
   ...:         print(f"    {char} {test_char(name, char):0.2f}")
   ...:     print("\n")
   ...: 
Name len: 10
    [ 0.00
    ] 0.00
    $ 0.00
    ( 0.00
    ) 0.00
    ` 0.06
    & 0.00
    | 0.00
    < 0.00
    > 0.00
    ; 0.03
    , 0.00
    
 0.03
    { 0.00
    } 0.00
    ' 0.00
    " 0.00


Name len: 15
    [ 0.00
    ] 0.00
    $ 0.00
    ( 0.00
    ) 0.00
    ` 0.86
    & 0.00
    | 0.00
    < 0.00
    > 0.00
    ; 0.88
    , 0.00
    
 0.88
    { 0.00
    } 0.00
    ' 0.00
    " 0.00


Name len: 20
    [ 0.00
    ] 0.00
    $ 0.00
    ( 0.00
    ) 0.00
    ` 29.65
    & 0.00
    | 0.00
    < 0.00
    > 0.00
    ; 28.44
    , 0.00
    
 28.26
    { 0.00
    } 0.00
    ' 0.00
    " 0.00

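In particular, the backtick, ;, and newline characters in the tag seem to cause major parse problems for the regex parser: parse time goes from roughly 0.03-0.06 s with a 10-character name to about 28-30 s with a 20-character name, while every other character tested stays at 0.00 s. I tried putting the characters into the name portion of the image string, but that didn't seem to have the same effect. Possibly the proximity to . causes the issue...? I didn't test that.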
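To put a rough number on "exponential", here is a minimal sketch that computes the per-character growth factor implied by the backtick timings in the output above. The helper name `per_char_growth` is made up for illustration, and the calculation assumes the time is multiplied by a constant factor for each extra character, which the three data points can only suggest, not prove.

```python
# Backtick timings in seconds, copied from the output above
# (name length -> elapsed time).
timings = {10: 0.06, 15: 0.86, 20: 29.65}


def per_char_growth(t_short, t_long, extra_chars):
    """Constant factor per extra character that would turn t_short into t_long.

    Hypothetical helper: assumes time(n + k) = time(n) * factor ** k.
    """
    return (t_long / t_short) ** (1 / extra_chars)


lengths = sorted(timings)
for shorter, longer in zip(lengths, lengths[1:]):
    factor = per_char_growth(timings[shorter], timings[longer], longer - shorter)
    print(f"{shorter} -> {longer} chars: x{factor:.2f} per extra character")
```

Between 15 and 20 characters the factor comes out close to 2, i.e. the parse time roughly doubles with each extra character in the name. The 10-character timing is small enough that timer noise probably affects it, so the first interval is less trustworthy.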

@MrCreosote
Author

As a workaround, prior to parsing I just run the regex r"([^a-zA-Z0-9@:_.\-\/])" against the image string and throw an error if it finds anything.
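A minimal sketch of that pre-check, using only the standard library. The regex is the one quoted above; the function name `check_image_name` and the choice of `ValueError` are assumptions for illustration, not part of docker_image.

```python
import re

# Character class quoted from the workaround above: anything outside
# letters, digits, and @ : _ . - / is treated as illegal.
_DISALLOWED = re.compile(r"([^a-zA-Z0-9@:_.\-\/])")


def check_image_name(image: str) -> str:
    """Reject image strings containing disallowed characters.

    Hypothetical helper: returns the string unchanged if it passes,
    so it can wrap the argument of the real parse call.
    """
    match = _DISALLOWED.search(image)
    if match:
        raise ValueError(
            f"Illegal character {match.group(1)!r} at position "
            f"{match.start()} in image name"
        )
    return image
```

The check is a single linear scan, so it runs quickly regardless of length, and it rejects the backtick, ;, and newline cases before they reach the parser. It would be used as `reference.Reference.parse_normalized_named(check_image_name(image))`. Note that it is stricter than the parser: it rejects any string containing a character outside that set.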
