TypeError: Wrong number or type of arguments for overloaded function 'Rule_extract' #27

Closed
impredicative opened this issue Nov 26, 2023 · 5 comments
@impredicative

With Python 3.12, hext.Rule('').extract('') gives the error:

  File "python3.12/site-packages/hext/__init__.py", line 139, in extract
    return _hext.Rule_extract(self, html, max_searches)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Wrong number or type of arguments for overloaded function 'Rule_extract'.
  Possible C/C++ prototypes are:
    Rule::extract(Html const &,std::uint64_t) const
    Rule::extract(Html const &) const

I am of course also getting this error with a more real-life example. At this time I cannot use hext for anything new.

@thomastrapp thomastrapp self-assigned this Nov 26, 2023
@thomastrapp
Member

Rule.extract does not accept a string, only hext.Html.

import hext
rule = hext.Rule("<a href:link/>")
# (1) Ok, the argument for extract is of type hext.Html
results = rule.extract(hext.Html("""<a href="b"></a>"""))
# (2) Error, the argument for extract is of type string:
results = rule.extract("""<a href="b"></a>""")
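
For callers who would rather pass either form, a small wrapper can do the conversion and raise a clearer error for anything else. This is only a sketch: extract_checked is a hypothetical name and not part of hext, and it assumes nothing beyond the two calls shown above, hext.Html(...) and rule.extract(...). The import is placed inside the function so the helper can be defined even where hext is not installed.

```python
def extract_checked(rule, html):
    """Hypothetical helper (not part of hext): call rule.extract with a
    clearer error when the argument has the wrong type."""
    import hext  # imported lazily; only needed when the helper is called

    if isinstance(html, str):
        # Rule.extract only accepts hext.Html, so wrap plain strings first.
        html = hext.Html(html)
    elif not isinstance(html, hext.Html):
        raise TypeError(
            f"expected hext.Html or str, got {type(html).__name__}"
        )
    return rule.extract(html)


# Usage, assuming hext is installed:
#   import hext
#   rule = hext.Rule("<a href:link/>")
#   results = extract_checked(rule, """<a href="b"></a>""")
```

With this, both case (1) and case (2) above go through the same path, and any other argument type fails with a message that names the offending type.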

If this was possible in a previous version of Hext (≥1.0.0), please let me know, as this would be a breaking change in the API.

The error message is unfortunately very unhelpful, and I will fix that in a future release with #28.

Thank you for creating this issue.

@brandonrobertz
Member

> If this was possible in a previous version of Hext (≥1.0.0), please let me know, as this would be a breaking change in the API.

This was not possible in 0.8 (just re-tested to be sure). AFAIK you have always needed to pass an Html object.

@impredicative
Author

impredicative commented Nov 26, 2023

Yes, it had been a while since I used hext, and I misremembered. Indeed hext.Rule('').extract(hext.Html('')) is what works.

As an aside, I think there really needs to exist at least one comprehensive page (or tabs) per supported programming language in the documentation. It would contain various necessary examples to train the user to use hext effectively.

@impredicative
Author

impredicative commented Nov 26, 2023

As an example, please see the organization and tabs here (one tab per supported language).

@thomastrapp
Member

> As an aside, I think there really needs to exist at least one comprehensive page (or tabs) per supported programming language in the documentation. It would contain various necessary examples to train the user to use hext effectively.

I agree and have added another issue for this: html-extract/html-extract.github.io#4.
