[BUG] in chapter 3 classification last exercise email classification #650

Open
faisalhussain-devs opened this issue Dec 12, 2024 · 0 comments

Comments

@faisalhussain-devs

Describe the bug
The bug is in the last exercise of chapter 3 (classification), the spam/ham email classifier, in cell 144. The loop walks over every part of the email, but it returns as soon as it reaches the first text/plain part, and if there is no text/plain part it keeps only the last text/html part. It never looks at any further text/plain or text/html parts, even though an email can be multipart and many emails contain a text/plain part followed by a text/html part and more.

To Reproduce
def email_to_text(email):
    html = None
    for part in email.walk():
        ctype = part.get_content_type()
        if not ctype in ("text/plain", "text/html"):
            continue
        try:
            content = part.get_content()
        except: # in case of encoding issues
            content = str(part.get_payload())
        if ctype == "text/plain":
            return content
        else:
            html = content
    if html:
        return html_to_plain_text(html)
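A minimal sketch of how the early return could be reproduced with only the standard library. The test message and the simplified `html_to_plain_text` stand-in are assumptions for illustration (the notebook's own helper does more), and the bare `except` is narrowed to `except Exception` here:

```python
import re
from email.message import EmailMessage
from html import unescape


def html_to_plain_text(html):
    # Hypothetical stand-in for the notebook's helper: it only strips tags.
    return unescape(re.sub(r"<[^>]+>", "", html))


def email_to_text(email):
    # Same logic as the function quoted above.
    html = None
    for part in email.walk():
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        try:
            content = part.get_content()
        except Exception:  # in case of encoding issues
            content = str(part.get_payload())
        if ctype == "text/plain":
            return content
        else:
            html = content
    if html:
        return html_to_plain_text(html)


# Build a multipart/mixed message that contains two text/plain parts.
msg = EmailMessage()
msg.set_content("first part")
msg.add_attachment("second part")  # a str attachment becomes text/plain

result = email_to_text(msg)
print(repr(result))
```

With this message, only the first text/plain part reaches the caller; the second one is never read.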

Expected behavior
I expected the code to visit every part of the email, store each text/plain or text/html part it finds in a variable, concatenating it with any parts found earlier, and return that variable only after the loop has finished.

Additional context
This is my proposed fix:

def email_to_text(email):
    total_content = ""
    for part in email.walk():
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        try:
            content = part.get_content()
        except: # in case of encoding issues
            content = str(part.get_payload())
        if ctype == "text/plain":
            total_content += content
        else:
            # the notebook's helper is named html_to_plain_text
            total_content += html_to_plain_text(content)
    return total_content
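A rough, self-contained check of the concatenating approach might look like the sketch below. The `html_to_plain_text` stand-in, the newline separator between parts, and the sample message are assumptions, not the notebook's code:

```python
import re
from email.message import EmailMessage
from html import unescape


def html_to_plain_text(html):
    # Hypothetical stand-in for the notebook's helper: it only strips tags.
    return unescape(re.sub(r"<[^>]+>", "", html))


def email_to_text(email):
    pieces = []
    for part in email.walk():
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue
        try:
            content = part.get_content()
        except Exception:  # in case of encoding issues
            content = str(part.get_payload())
        if ctype == "text/html":
            content = html_to_plain_text(content)
        pieces.append(content)
    # Join with a newline so adjacent parts do not run together.
    return "\n".join(pieces)


# multipart/alternative message: one text/plain part and one text/html part.
msg = EmailMessage()
msg.set_content("plain body")
msg.add_alternative("<p>html body</p>", subtype="html")

text = email_to_text(msg)
print(text)
```

One caveat worth weighing: in a multipart/alternative email the text/plain and text/html parts usually carry the same message in two formats, so concatenating them will likely duplicate the text. That may or may not matter for a bag-of-words spam classifier.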
