Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move get_books to parse.py and add get_chapters #61

Merged
merged 9 commits into from
Nov 21, 2023
Merged

Conversation

isaac091
Copy link
Collaborator

@isaac091 isaac091 commented Nov 10, 2023

Created new module for get_books and get_chapters due to circular import conflict.


This change is Reviewable

@codecov-commenter
Copy link

codecov-commenter commented Nov 10, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (58919a5) 86.62% compared to head (5baf156) 86.74%.

Files Patch % Lines
machine/scripture/parse.py 96.63% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #61      +/-   ##
==========================================
+ Coverage   86.62%   86.74%   +0.11%     
==========================================
  Files         223      225       +2     
  Lines       13376    13526     +150     
==========================================
+ Hits        11587    11733     +146     
- Misses       1789     1793       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 5 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @isaac091 and @johnml1135)


machine/scripture/parse.py line 44 at r2 (raw file):

# Output format: { book_num: [chapters] }
# An empty list, i.e. book_num: [] signifies the inclusion of all chapters, while the absence of an entry means the book is not included
def get_chapters(chapter_selections: str) -> dict:

The return type should be more explicit.


machine/scripture/parse.py line 45 at r2 (raw file):

# An empty list, i.e. book_num: [] signifies the inclusion of all chapters, while the absence of an entry means the book is not included
def get_chapters(chapter_selections: str) -> dict:
    versification = Versification.create("Original")

You should pass in the versification as an argument.


machine/scripture/parse.py line 51 at r2 (raw file):

    # Normalize books written as "MAT01.usfm" to "MAT"
    chapter_selections = re.sub(USFM_FILE_PATTERN, "", chapter_selections)

This seems specific to silnlp. You should do this in silnlp.


machine/scripture/parse.py line 72 at r2 (raw file):

            if book == 0:
                raise RuntimeError(f"{section[:3]} is an invalid book ID.")

I would use ValueError for all exceptions.


machine/scripture/parse.py line 77 at r2 (raw file):

            chapters[book] = set()
            last_chapter = versification.get_last_chapter(book)
            for chapter_num in chapter_nums:

We should raise an exception if the chapter is not valid, i.e. out of range.


machine/scripture/parse.py line 87 at r2 (raw file):

            # Delete entry if no chapter numbers were valid
            if len(chapters[book]) == 0:
                del chapters[book]

It would probably be clearer to only add the set if it is not empty.


machine/scripture/parse.py line 89 at r2 (raw file):

                del chapters[book]
        elif "-" in section:  # Spans of books
            spans.append(section)

Is there some reason why you don't just process the span here?

@isaac091
Copy link
Collaborator Author

I don't process the spans right away to prioritize getting specific chapters from a book, so for example if someone wanted "MAT-ROM;ACT1-3", they wouldn't have to worry about putting the Acts chapters first to avoid getting the selections overwritten with the span. But I guess that's not consistent all the way through because if "NT" or "OT" are passed first, all of those books are added right away. I can change it so that there's no "order of operations", or if we want to keep it this way, I'll change it so that "NT" and "OT" are added last as well.

@johnml1135
Copy link
Collaborator

The order shouldn't matter - I would keep it as original versification order. Are you accounting for removals as well?

@johnml1135
Copy link
Collaborator

machine/scripture/parse.py line 8 at r3 (raw file):

USFM_FILE_PATTERN = re.compile(r"(?<=[A-Z]{3})\d+\.usfm")
BOOK_SPAN = re.compile(r"[A-Z]{3}-[A-Z]{3}")

Books can include numbers - such as 1CO.

@johnml1135
Copy link
Collaborator

machine/scripture/parse.py line 7 at r3 (raw file):

from .verse_ref import Versification

USFM_FILE_PATTERN = re.compile(r"(?<=[A-Z]{3})\d+\.usfm")

Books include numbers.

@johnml1135
Copy link
Collaborator

machine/scripture/parse.py line 13 at r3 (raw file):

def get_books(books: Union[str, List[str]]) -> Set[int]:
    if isinstance(books, str):
        books = books.split(",")

Book chapter lists can have multiple parts, such as MAT1-3, 5, 7, 10-12. Because of this, we need to use a ";" for splitting sections.

@johnml1135
Copy link
Collaborator

machine/scripture/parse.py line 8 at r3 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Books can include numbers - such as 1CO.

Nevermind - this is the simple selection, not the advanced. Ignore my comment.

@isaac091
Copy link
Collaborator Author

The order shouldn't matter - I would keep it as original versification order. Are you accounting for removals as well?

I want to make sure I'm understanding - you're saying to add whole books last? I'm currently doing the removals after everything is added, but not allowing removals of spans of books. I can also change that to allow span removals.

Nevermind - this is the simple selection, not the advanced. Ignore my comment.

What do you mean by simple vs advanced selection? You were right originally, I need to consider books with numbers in the title.

@johnml1135
Copy link
Collaborator

Let me try it again: for MAT-ROM; ACT1-3, the ACT1-3 would be redundant. Instead the user should put in MAT-ROM;-ACT4-28. Is that how you understand it?

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @isaac091)

@isaac091
Copy link
Collaborator Author

I see. For my current code, MAT-ROM; ACT1-3 and MAT-ROM;-ACT4-28 give the same result. Basically, if chapters are specified for a book, the whole book is never added. I did it this way because I thought it would be more convenient for the user so they wouldn't have do something like MAT-JHN;ACT1-3;ROM, but I didn't think about doing MAT-ROM;-ACT4-28, which is not any more complicated, and probably more intuitive.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just apply them in the order that they appear. That way the precedence is explicit.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @isaac091)


machine/scripture/parse.py line 77 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should raise an exception if the chapter is not valid, i.e. out of range.

I would also strip spaces from the chapter numbers.

@johnml1135
Copy link
Collaborator

I agree with Damein.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r1, 1 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @isaac091)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @isaac091 and @johnml1135)


machine/scripture/parse.py line 45 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

You should pass in the versification as an argument.

You should use the ORIGINAL_VERSIFICATION constant rather than creating a new instance.

@johnml1135
Copy link
Collaborator

The updated tests look like they cover the cases that I was concerned about. If they pass, it looks good to me.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @isaac091)

…e modules and use constant in get_chapters signature
Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @isaac091)


machine/scripture/__init__.py line 39 at r5 (raw file):

def __getattr__(name: str) -> Any:

You can remove this function. It is no longer needed.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @isaac091)

@isaac091 isaac091 merged commit 63738c2 into main Nov 21, 2023
13 checks passed
@isaac091 isaac091 deleted the add_get_chapters branch November 21, 2023 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants