ENT-8400: Added transcript_languages_search_facet_names to CourseRunSerializer #4296

Merged
merged 1 commit into from
Mar 25, 2024
17 changes: 15 additions & 2 deletions course_discovery/apps/api/serializers.py
@@ -1018,6 +1018,7 @@ class CourseRunSerializer(MinimalCourseRunSerializer):
required=False, many=True, slug_field='code',
queryset=LanguageTag.objects.prefetch_related('translations').order_by('name')
)
+ transcript_languages_search_facet_names = serializers.SerializerMethodField()
Contributor

The field name is not descriptive. If you want language names only, it should be transcript_language_names. That said, I don't know why we need a new field for this when the language info is already available.

Contributor Author

@DawoudSheraz The field name transcript_languages_search_facet_names suggests that it's specifically for Algolia search facets. We already have language information (stored in the transcript_languages attribute) in the form of language codes. Now, we need to extract language names from these codes to store in Algolia.
That's why we introduced this new variable, which indicates that it contains language names mapped to each language code, specifically for use in Algolia search facets.

Contributor

If it is only needed for Algolia (and for enterprise usage), the API is not the right place to add this information. It should be done at the Enterprise Algolia code level.

Contributor Author

@DawoudSheraz In the past, this approach has been explored here, where the catalog can use the langcodes method to convert language codes into human-readable names. However, this method didn't align perfectly with the LanguageTag model data stored in the course-discovery database, and to keep the two in sync we would need to replicate the LanguageTag model in the catalog.

A similar strategy has been used previously for similar tasks (here). Language names were included in the CourseRunSerializer within course-discovery. This setup allows the enterprise-catalog to retrieve language names seamlessly through the API endpoint, eliminating the need to parse language codes or replicate the LanguageTag model from course-discovery.

Contributor

In the past, this approach has been explored #2990

And

A similar strategy has been used previously for similar tasks #2990

Both seem to refer to the same PR? Anyway, while I get the context, I would suggest finding a better way of doing this. I personally do not agree with how it has been done in the past 🙂. Discovery APIs already carry too much information, and adding a new field for a very specific consumer is not worth it.

Contributor Author

@mahamakifdar19 mahamakifdar19 Mar 25, 2024

@DawoudSheraz Alternative approaches that have been considered in the past are listed in the PR.
image

Contributor Author

@DawoudSheraz I will discuss this with team markhors, and going forward we will explore more effective methods of obtaining this information. Currently, replicating such information in the catalog seems like a substantial effort. Could you please confirm whether these changes are ready for deployment as they stand, so that we can refine the approach in the future?
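
For readers following the trade-off debated in this thread, here is a minimal sketch of the consumer side, assuming the serialized course run arrives as a plain dict. The helper name `transcript_facet_names` and the sample payload are hypothetical; only the two field names come from this PR. De-duplicating the names is also an assumption, in case two language codes resolve to the same display name.

```python
from typing import Any, Dict, List


def transcript_facet_names(course_run: Dict[str, Any]) -> List[str]:
    """Return transcript language names for indexing, de-duplicated in order.

    Hypothetical consumer-side helper: it reads the names the API already
    resolved, so the consumer never maps language codes itself.
    """
    # The serializer in this PR returns None when a run has no transcript
    # languages, so None (or a missing key) is normalized to an empty list.
    names = course_run.get("transcript_languages_search_facet_names") or []
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(names))


# Illustrative payload; the values are made up for this sketch.
sample_run = {
    "transcript_languages": ["en-us", "es-419", "es-es"],
    "transcript_languages_search_facet_names": ["English", "Spanish", "Spanish"],
}

facets = transcript_facet_names(sample_run)
empty = transcript_facet_names({"transcript_languages_search_facet_names": None})
```

With this shape the consumer needs no copy of the LanguageTag data, which is the argument made for the field; the cost, as the reviewer points out, is one more consumer-specific field on the API.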

video = VideoSerializer(required=False, allow_null=True, source='get_video')
instructors = serializers.SerializerMethodField(help_text='This field is deprecated. Use staff.')
staff = SlugRelatedFieldWithReadSerializer(slug_field='uuid', required=False, many=True,
@@ -1066,10 +1067,11 @@ class Meta(MinimalCourseRunSerializer.Meta):
'level_type', 'mobile_available', 'hidden', 'reporting_type', 'eligible_for_financial_aid',
'first_enrollable_paid_seat_price', 'has_ofac_restrictions', 'ofac_comment',
'enrollment_count', 'recent_enrollment_count', 'expected_program_type', 'expected_program_name',
- 'course_uuid', 'estimated_hours', 'content_language_search_facet_name', 'enterprise_subscription_inclusion'
+ 'course_uuid', 'estimated_hours', 'content_language_search_facet_name', 'enterprise_subscription_inclusion',
+ 'transcript_languages_search_facet_names'
)
read_only_fields = ('enrollment_count', 'recent_enrollment_count', 'content_language_search_facet_name',
- 'enterprise_subscription_inclusion')
+ 'enterprise_subscription_inclusion', 'transcript_languages_search_facet_names')

def get_instructors(self, obj): # pylint: disable=unused-argument
# This field is deprecated. Use the staff field.
@@ -1084,6 +1086,17 @@ def get_content_language_search_facet_name(self, obj):
return None
return language.get_search_facet_display(translate=True)

+    def get_transcript_languages_search_facet_names(self, obj):
+        transcript_languages = obj.transcript_languages.all()
+        if not transcript_languages:
+            return None
+
+        transcript_languages_facet_names = []
+        for language in transcript_languages:
+            transcript_languages_facet_names.append(language.get_search_facet_display(translate=True))
+
+        return transcript_languages_facet_names

def update_video(self, instance, video_data):
# A separate video object is a historical concept. These days, we really just use the link address. So
# we look up a foreign key just based on the link and don't bother trying to match or set any other fields.
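
To make the behavior of `get_transcript_languages_search_facet_names` above easy to check outside Django, here is a minimal stand-alone sketch. `FakeLanguageTag`, `FakeManager`, and `FakeCourseRun` are hypothetical stand-ins for the real model objects, and the stub's `get_search_facet_display` simply returns a stored name, which is an assumption about what the real method returns.

```python
from typing import List, Optional


class FakeLanguageTag:
    """Hypothetical stand-in for the LanguageTag model."""

    def __init__(self, code: str, name: str) -> None:
        self.code = code
        self.name = name

    def get_search_facet_display(self, translate: bool = False) -> str:
        # Assumption: the real method returns a display name. This stub
        # returns the stored name and ignores `translate`.
        return self.name


class FakeManager:
    """Hypothetical stand-in for a related manager exposing .all()."""

    def __init__(self, items: List[FakeLanguageTag]) -> None:
        self._items = items

    def all(self) -> List[FakeLanguageTag]:
        return list(self._items)


class FakeCourseRun:
    """Hypothetical stand-in for a course run with transcript languages."""

    def __init__(self, languages: List[FakeLanguageTag]) -> None:
        self.transcript_languages = FakeManager(languages)


def transcript_facet_names(obj: FakeCourseRun) -> Optional[List[str]]:
    """Same logic as the serializer method, written as a comprehension."""
    transcript_languages = obj.transcript_languages.all()
    if not transcript_languages:
        return None  # mirrors the PR: no languages yields None, not []
    return [
        language.get_search_facet_display(translate=True)
        for language in transcript_languages
    ]


with_languages = transcript_facet_names(
    FakeCourseRun([FakeLanguageTag("en", "English"), FakeLanguageTag("fr", "French")])
)
without_languages = transcript_facet_names(FakeCourseRun([]))
```

Returning `None` for the empty case is consistent with the expected value in the serializer test in this PR (`'transcript_languages_search_facet_names': None`); consumers that prefer a list would need to normalize it themselves.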
1 change: 1 addition & 0 deletions course_discovery/apps/api/tests/test_serializers.py
@@ -705,6 +705,7 @@ def get_expected_data(cls, course_run, request):
'ofac_comment': course_run.ofac_comment,
'estimated_hours': get_course_run_estimated_hours(course_run),
'enterprise_subscription_inclusion': course_run.enterprise_subscription_inclusion,
+ 'transcript_languages_search_facet_names': None
})
return expected
