-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix Post.get_keywords to handle empty keywords, cleanup search_indexes #205
Conversation
@@ -47,32 +47,32 @@ def get_model(self): | |||
return Post | |||
|
|||
def get_search_data(self, post, language, request): | |||
optional_attributes = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this optional_attributes thing was never used :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a zombie code, yeah. The idea behind, if i remember correctly, was to provide optional attribute to extend the search. Maybe we could just remove it and eventually develop the idea in a separated branch.
Tests have to be rewritten. I will try to finish this up tomorrow. |
@@ -28,7 +25,7 @@ def get_description(self, post): | |||
return post.get_description() | |||
|
|||
def prepare_pub_date(self, post): | |||
return post.date_published.strftime("%Y-%m-%d %H:%M:%S") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haystack handles datetime
objects, strftime
is really unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
85a17db
to
977b0ef
Compare
This is pretty much finished. But there is an interesting thing going on with date_published. For some reason the prepared datetime differs a few seconds from the one saved. Not sure what to blame yet :) |
@skirsdeda let's ping @fmarco who worked on the first implementation of the haystack indexes IIRC he had time skewes issues as well |
@yakky I would say that the logic in |
@yakky Actually |
Also note that |
2285ce3
to
a85cc16
Compare
So I removed the offending test which did nothing more than the one that is left now. |
Also removed |
I'll have a deep look into ASAP @skirsdeda |
@@ -47,32 +44,36 @@ def get_model(self): | |||
return Post | |||
|
|||
def get_search_data(self, post, language, request): | |||
optional_attributes = [] | |||
abstract = post.safe_translation_getter('abstract') | |||
text_bits = [post.get_title()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep this one, because it returns the meta_title or the title as a fallback, so it's not a real repetition.
See: https://github.com/nephila/djangocms-blog/blob/develop/djangocms_blog/models.py#L228
Btw, we can discuss about it :)!
@fmarco Since |
@@ -15,20 +15,17 @@ class PostIndex(get_index_base()): | |||
|
|||
author = indexes.CharField(indexed=True, model_attr='get_author') | |||
keywords = indexes.CharField(null=True) | |||
tags = indexes.CharField(null=True) | |||
tags = indexes.CharField(null=True, model_attr='get_tags') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@skirsdeda Ok, it's an idea. By the way i think we could keep as a separated default search attribute the title, in the case we want to search for the title or the meta_title. @yakky what do you think? |
@fmarco Regarding title, I personally use |
@@ -232,7 +232,7 @@ def get_title(self): | |||
return title.strip() | |||
|
|||
def get_keywords(self): | |||
return self.safe_translation_getter('meta_keywords').strip().split(',') | |||
return self.safe_translation_getter('meta_keywords', default='').strip().split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we could move it directly here: https://github.com/nephila/djangocms-blog/blob/develop/djangocms_blog/models.py#L235
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ps: btw, how can the error be reproduced? Because here: https://github.com/nephila/djangocms-blog/blob/develop/djangocms_blog/models.py#L133 "meta_keywords" have the '' default.
Just to have an idea of the flow error. Anyway, it's a sane check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fmarco Now that is interesting because I have all migrations applied and yet meta_keywords is not null with no default in database. Actually all fields in djangocms_blog_post_translation
are not null with no default... So either there were some migration changes after they were first created or it has to do with parler. I have to investigate this a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fmarco So Django doesn't save default values in DB schema. Means that all is good with migrations. I bet I am hitting some django-parler bug here. Maybe it doesn't consider field's default value in safe_translation_getter
for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh! So that's the problem that causes the issue!
So, the stronger check have even more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree :)! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ps: so, as said above let's make the check on the Post model method get_keywords to have a default '' fallback 👍 .
@skirsdeda About test_blog_post_is_indexed_using_update_object test, do you any suggestion to make test using a different backend? Otherwise we keep it removed as you did :). |
@fmarco If you use a different backend you will need to pull in some dependencies for tests. For Solr and elasticsearch you would have to install and run an actual service which is a bit of an overkill for one test. Also it would make Travis builds considerably longer. The only option is using Whoosh backend which just runs as a python lib. |
@skirsdeda I see, infact i read that someone try to use services (as even Docker containers to run those services), but i agree with you that's an overkill for the moment. Maybe we could integrate a whoosh based test next. |
@yakky LGTM |
@skirsdeda thanks for your work |
fix Post.get_keywords to handle empty keywords, cleanup search_indexes
Tried using search_indexes got some exceptions (like None returned from get_keywords). Ended up cleaning up the entire
PostIndex
.