TypeError: expected string or bytes-like object #12

Open
martincousi opened this issue Aug 20, 2024 · 7 comments

Comments

@martincousi

When translating a notebook to French, jtranslate reports the following error:

Traceback (most recent call last):
  File "\\?\C:\Users\11143054\AppData\Local\miniconda3\envs\jtranslate\Scripts\jupyter_translate-script.py", line 33, in <module>
    sys.exit(load_entry_point('jupyter-translate', 'console_scripts', 'jupyter_translate')())
  File "c:\users\11143054\documents\github\jupyter-translate\jupyter_translate.py", line 209, in main
    jupyter_translate(
  File "c:\users\11143054\documents\github\jupyter-translate\jupyter_translate.py", line 172, in jupyter_translate
    translate_markdown(source, translator, delay=delay)
  File "c:\users\11143054\documents\github\jupyter-translate\jupyter_translate.py", line 96, in translate_markdown
    return translate(text) + '\n'
  File "c:\users\11143054\documents\github\jupyter-translate\jupyter_translate.py", line 86, in translate
    text = replace_from_list('[Xx]' + LINK_REPLACEMENT_KW[1:], text, md_links)
  File "c:\users\11143054\documents\github\jupyter-translate\jupyter_translate.py", line 66, in replace_from_list
    return re.sub(tag, lambda x: next(iter(replacement_gen)), text)
  File "C:\Users\11143054\AppData\Local\miniconda3\envs\jtranslate\lib\re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

Removing the Markdown horizontal lines (---) from my notebook allowed jtranslate to progress further. However, it stopped again when it reached a Markdown table:

| ClientID | Date | Demand (units) |
|---|---|---|
| X15 | 06-12-2020 | 560 |
| AO5 | 06-12-2020 | 1152 |
| ZI5 | 08-12-2020 | 32 |
| T65 | 10-12-2020 | 194 |

Once that table was removed, I was able to fully translate the notebook. Is there an option to skip --- lines and Markdown tables?

@WittmannF
Owner

WittmannF commented Aug 20, 2024

Hi @martincousi, thanks for the heads up. Apparently a recent update made the library incompatible with Windows. Let me try to replicate this on my side and confirm whether that's the issue. In the meantime, can you try running it on a Unix-based OS? For example, you can run it on Google Colab: https://colab.research.google.com/drive/1QL7-L4AjL0kZ4nC51K2BmE_9_pNHsdtu?usp=sharing
Update: sorry, I only saw the full message now. Can you share an example notebook that raises this error?

@martincousi
Author

Here is a notebook that contains such a table and such lines: https://github.com/acedesci/scanalytics/blob/master/EN/S01_Intro/01_InClass_Exercises.ipynb

I also saw that jtranslate translates the \left( and \right) LaTeX commands, which is problematic.

@WittmannF
Owner

Thanks @martincousi! I was able to replicate it here. In the meantime, can you please run the legacy version at https://github.com/WittmannF/jupyter-translate/tree/master/legacy ? I was able to run the file from there with no issues.

@andrebelem, can you please take a look? I'm trying to understand which specific recent change is raising this error.

@andrebelem
Collaborator

The code is probably misreading the sequence of strings that marks the table. Tomorrow I will study how the code reacts in different situations so I can correct it. In the meantime, I suggest using the legacy version (just point Python to the legacy code).

@andrebelem
Collaborator

andrebelem commented Aug 21, 2024

First assessment:
The code was designed to search for a specific pattern in a text and replace it with something else. However, it ran into a problem because of two special cases:

  • When the text was None: The code expected to receive a piece of text to work with, but sometimes it received None instead, meaning there was no text to process. The code was not prepared to handle None, so it raised an error.
  • When the pattern was '---' (a Markdown horizontal line): The code did not treat it as a plain run of three dashes. It tried to interpret it as a special instruction, which confused the text-replacement process.

What Was Done to Fix It?

  • The code was updated to check if the text is None before doing anything. If the text is None, the code simply returns None because there's nothing to translate.
  • The code also checks if the pattern is '---'. If it is, the code now returns '---' without trying to replace it, since this is a common pattern that shouldn't be changed.
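The two checks described above could look roughly like the sketch below. This is not the actual patch: `safe_replace_from_list` is a hypothetical name, the placeholder pattern in the usage example is made up, and the sketch assumes the '---' check is applied to the text being processed.

```python
import re

def safe_replace_from_list(tag, text, replacements):
    """Hypothetical guarded variant of replace_from_list (illustration only)."""
    # Guard 1: nothing to process, so hand None back instead of calling re.sub.
    if text is None:
        return None
    # Guard 2 (assumed to apply to the text): leave a horizontal line untouched.
    if text.strip() == '---':
        return text
    replacement_gen = iter(replacements)
    # Substitute each match with the next replacement; if the list runs out,
    # keep the matched text as it was.
    return re.sub(tag, lambda m: next(replacement_gen, m.group(0)), text)
```

For example, `safe_replace_from_list(r'[Xx]LINK', 'see XLINK', ['[docs](url)'])` substitutes the placeholder, while passing `None` or `'---'` as the text returns the input unchanged.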

Update:
I am currently running tests, but I've encountered significant degradation of Google services today. Because of this, I will need more time to complete the testing.

I appreciate your understanding and will keep you posted on any further developments.

@andrebelem
Collaborator

⚠️ Warning: Proper Handling of Embedded Code
When including embedded code in your markdown files, please ensure that the code is enclosed within triple backticks (```). This is essential to prevent the program from mistakenly translating the code or misinterpreting it as regular text.

Example of Proper Embedded Code:

def example_function():
    """This is a docstring."""
    return "Hello, World!"

Why This is Important: If embedded code is not properly enclosed within triple backticks, the program may inadvertently translate the code, altering its functionality or meaning. This can lead to unexpected results, especially when the embedded code is meant for demonstration purposes rather than execution.
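To illustrate why the fences matter, here is a minimal sketch of how a tool could set fenced blocks aside before translating and restore them afterwards. It is an assumption about the general technique, not jupyter-translate's actual code: `translate_outside_fences`, the `XCODEBLOCK…X` placeholder, and the `translate` callable are all hypothetical, and a real translation service might alter the placeholder.

```python
import re

FENCE = "`" * 3  # triple backtick, built this way to keep the example readable
FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)

def translate_outside_fences(markdown, translate):
    """Translate only the text outside fenced code blocks (illustration only)."""
    blocks = []

    def stash(match):
        # Remember the fenced block and leave a numbered placeholder behind.
        blocks.append(match.group(0))
        return f"XCODEBLOCK{len(blocks) - 1}X"

    masked = FENCE_RE.sub(stash, markdown)
    translated = translate(masked)
    # Put each original block back where its placeholder ended up.
    return re.sub(r"XCODEBLOCK(\d+)X", lambda m: blocks[int(m.group(1))], translated)
```

Code that is not fenced never matches the pattern, so it is passed to `translate` along with the surrounding prose.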

@MrCosta57

Hi, I noticed the same error while the script was translating Markdown cells that contain tables or LaTeX formulas (delimited with $$).

Here is the traceback:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\ProgramData\miniforge3\envs\ml\Scripts\jupyter_translate.exe\__main__.py", line 7, in <module>
  File "C:\ProgramData\miniforge3\envs\ml\Lib\site-packages\jupyter_translate.py", line 209, in main
    jupyter_translate(
  File "C:\ProgramData\miniforge3\envs\ml\Lib\site-packages\jupyter_translate.py", line 172, in jupyter_translate
    translate_markdown(source, translator, delay=delay)
  File "C:\ProgramData\miniforge3\envs\ml\Lib\site-packages\jupyter_translate.py", line 96, in translate_markdown
    return translate(text) + '\n'
           ^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniforge3\envs\ml\Lib\site-packages\jupyter_translate.py", line 86, in translate
    text = replace_from_list('[Xx]' + LINK_REPLACEMENT_KW[1:], text, md_links)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniforge3\envs\ml\Lib\site-packages\jupyter_translate.py", line 66, in replace_from_list
    return re.sub(tag, lambda x: next(iter(replacement_gen)), text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniforge3\envs\ml\Lib\re\__init__.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'
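The last frame of the traceback can be reproduced in isolation: `re.sub` raises this TypeError whenever the string it is asked to search is None. The sketch below uses a made-up pattern and a hypothetical helper name, `reproduce`. It only demonstrates the failure mode and does not show where the None comes from in jupyter-translate.

```python
import re

def reproduce(text):
    """Call re.sub the way the failing frame does and report any TypeError."""
    try:
        # The pattern and replacement are placeholders for illustration.
        return re.sub(r"[Xx]LINK", "replacement", text)
    except TypeError as exc:
        return f"TypeError: {exc}"

print(reproduce("see XLINK"))  # a normal string is processed
print(reproduce(None))         # None triggers the error from the traceback
```

Newer Python versions append ", got 'NoneType'" to the message, which would explain why the two tracebacks in this thread differ slightly.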

4 participants