-
-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge "sibling" RSS item that belong to a multi-entry docket item #217
base: main
Are you sure you want to change the base?
Changes from 13 commits
a978447
3b60e8f
49cbaf2
1a1c476
358bf2d
d59e007
e7ace3b
05d7078
5ea25af
84d6050
a6a9c08
b43395c
d3d7526
6ce975c
7daede3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,7 @@ | |
from ..lib.html_utils import html_unescape | ||
from ..lib.log_tools import make_default_logger | ||
from ..lib.string_utils import harmonize, clean_string | ||
from ..lib.utils import previous_and_next | ||
|
||
logger = make_default_logger() | ||
|
||
|
@@ -98,19 +99,52 @@ def _parse_text(self, text): | |
|
||
@property | ||
def data(self): | ||
"""Override this to create a list of docket-like objects instead of the | ||
usual dict that is usually provided by the docket report. | ||
"""Return a list of docket-like objects instead of the usual dict that | ||
is usually provided by the BaseDocketReport superclass. | ||
|
||
When CMECF generates the RSS feed, it breaks up items with | ||
multiple consecutive entries into multiple RSS items with | ||
identical timestamp/id/title. We reverse that and recombine | ||
those items. | ||
""" | ||
if self._data is not None: | ||
return self._data | ||
|
||
data_list = [] | ||
for entry in self.feed.entries: | ||
for previous_entry, entry, next_entry in previous_and_next( | ||
self.feed.entries): | ||
data = self.metadata(entry) | ||
|
||
# We are guaranteed to only have a single docket entry for each | ||
# RSS item, and thus we use data['docket_entries'][0] below. | ||
# Coming up with an alternative data representation here and | ||
# then transforming it into what CL expects after we're done | ||
# iterating over the list is just not worth the bother. | ||
data[u'docket_entries'] = self.docket_entries(entry) | ||
# BUT: Guarantee this condition persists into the future: | ||
assert len(data[u'docket_entries']) <= 1 | ||
|
||
# If this entry and the immediately prior entry match | ||
# in metadata, then add the current description to | ||
# the previous entry's and continue the loop. | ||
if ( | ||
data_list and data_list[-1][u'docket_entries'] | ||
and data[u'docket_entries'] | ||
and entry.title == previous_entry.title | ||
and entry.link == previous_entry.link | ||
and entry.id == previous_entry.id | ||
and entry.published == previous_entry.published | ||
): | ||
data_list[-1][u'docket_entries'][0][u'short_description'] += ( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I feel like a line like this deserves a comment, if even a short one. Or maybe break the bits into variables for readability?
That's a bit wordy, but clearer. This multiline thing is pretty bad. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I actually don't think that is clearer. Of course, the flipside is that I suspect you have an unfamiliarity with the And I definitely don't think that I'll think about how to write this better. At one point I had There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm familiar with the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Me:
Mike:
So B then. As I feared. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, I suspect you won't go for this one, but it kind of jumped out at me so I thought I'd offer it here: def entry(d):
return d[u'docket_entries'][0]
def get_append_shorttext(de, text=''):
de[u'short_description'] += text
return de[u'short_description']
previous_entry = entry(data_list[-1])
current_text = get_append_shorttext(entry(data))
get_append_shorttext(previous_entry, ' AND ' + current_text) Think of it as a conversation-starter :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, there's a lot of indirection in there, and Hm. What about something like:
That gets us back down to three lines, avoids repetition of the 5 levels of hierarchy, and uses There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, while I'm functionally happen with that, it fails PEP-8. And it's actually kind of tricky to fix, and it's why I introduced the helper function, which I agree is totally ridiculous and I'm not sure I could dignify by calling it a strawman: # 1 2 3 4 5 6 7 8
#2345678901234567890123456789012345678901234567890123456789012345678901234567890
previous_entry[u'short_description'] += ' AND ' +current_entry[u'short_description'] #1
previous_entry[u'short_description'] += ( #2
' AND ' + current_entry[u'short_description'])
previous_entry[u'short_description'] += ( #3
' AND ' +
current_entry[u'short_description'])
previous_entry[u'short_description'] += ( #4
' AND ' +
current_entry[u'short_description']
)
previous_entry[u'short_description'] += \
' AND ' + current_entry[u'short_description'] #5 Although I find 5 tempting, I think that backslash terminates are disfavored (although not against PEP8?). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think 5 has my vote followed closely by 1 (a minor PEP8 violation isn't the end of the world in my book). 2, 3, and 4 just don't seem worth it. I leave the rest of this to you. Let's land this. |
||
' AND ' + | ||
data[u'docket_entries'][0][u'short_description']) | ||
continue | ||
|
||
data[u'parties'] = None | ||
data[u'docket_entries'] = self.docket_entries(entry) | ||
if data[u'docket_entries'] and data['docket_number']: | ||
if data[u'docket_entries'] and data[u'docket_number']: | ||
data_list.append(data) | ||
|
||
self._data = data_list | ||
return data_list | ||
|
||
|
@@ -146,7 +180,7 @@ def docket_entries(self, entry): | |
u'date_filed': date(*entry.published_parsed[:3]), | ||
u'document_number': self._get_value(self.document_number_regex, | ||
entry.summary), | ||
u'description': '', | ||
u'description': u'', | ||
u'short_description': html_unescape( | ||
self._get_value(self.short_desc_regex, entry.summary)), | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like these two lines are of a different nature than the rest of the equality lines. Is it worth pulling them out of the if statement here? For example, maintain the logic, but do these types of tests in an earlier if statement? Something like:
It might allow you to put the comments about equality closer to where the equality is tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err...I think it is advantageous to not materially increase the nesting level of logic with gratuitous nested ifs.
The comments didn't seem to me to be far removed from equality — to me they made sense before the word
if
.Perhaps your reaction is to the fact that the comments don't discuss the first 2 lines of the if expressions…that's because I was focusing on what was maybe a little specialized. But certainly the comment could be broadened to discus the first 2 lines, i.e. "If there are docket entries associated with the previous entry and the current entry, and if those two entries match in metdata…"
But also, there's nothing stopping us from putting comments in the midst of the if expressions, so if there's really a desire for "Instructions at the point of need" then that can just happen, there's no need to change the structure of the
if
statement.So if you want to change the structure of the
if
, probably a better reason is required. I actually did try theall()
style and concluded erroneously that it didn't work here (I think I omitted the[]
), but that said it's not terribly familiar to me so I wouldn't vote for it though I don't oppose it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the extraneous nesting bugged me too. If you pull all this logic into a function, you could reverse the logic here and do:
That'd avoid the deep logic and keep the two things apart?