Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEV: Move the post and topic custom fields into a table #309

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

Lhcfl
Copy link

@Lhcfl Lhcfl commented Sep 5, 2024

This PR does some preparatory work to prepare for the front-end display of who marked an answer as solved. It migrates the custom fields from discourse-solved of Topic and Post to a new table, except "accepted_answer_post_id"

ref: /t/-/95318

This PR does some preparatory work to prepare for the front-end display
of who marked an answer as solved. It migrates the custom fields from
discourse-solved of Topic and Post to a new table, except "accepted_answer_post_id"

ref: /t/-/95318
@Lhcfl Lhcfl marked this pull request as draft September 5, 2024 11:19
FROM topic_custom_fields tc
LEFT JOIN topic_custom_fields tc2
ON tc2.topic_id = tc.topic_id AND tc2.name = 'solved_auto_close_topic_timer_id'
LEFT JOIN user_actions ua
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an index in user_actions for target_topic_id?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query plan is:

                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=14.85..14.87 rows=1 width=32)
   ->  Sort  (cost=14.85..14.86 rows=1 width=32)
         Sort Key: tc.topic_id, ((tc.value)::integer), ((tc2.value)::integer), ua.acting_user_id, tc.created_at, tc.updated_at
         ->  Nested Loop  (cost=0.42..14.84 rows=1 width=32)
               ->  Nested Loop Left Join  (cost=0.00..6.16 rows=1 width=84)
                     Join Filter: (tc2.topic_id = tc.topic_id)
                     ->  Seq Scan on topic_custom_fields tc  (cost=0.00..3.08 rows=1 width=52)
                           Filter: ((name)::text = 'accepted_answer_post_id'::text)
                     ->  Seq Scan on topic_custom_fields tc2  (cost=0.00..3.08 rows=1 width=36)
                           Filter: ((name)::text = 'solved_auto_close_topic_timer_id'::text)
               ->  Index Only Scan using idx_unique_rows on user_actions ua  (cost=0.42..8.66 rows=1 width=8)
                     Index Cond: ((action_type = 15) AND (target_topic_id = tc.topic_id))
(12 rows)

The query is using idx_unique_rows on user_actions ua, so i think yeah, it is using an index

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> Seq Scan on topic_custom_fields tc (cost=0.00..3.08 rows=1 width=52) Filter: ((name)::text = 'accepted_answer_post_id'::text)

This looks interesting to me. I wonder if we need a index on name. From my rusty knowledge Seq Scan can be costly if the table is large and if there are no indexes to filter on (e.g. the accepted_answer_post_id)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just noticed that custom_fields does not an index on name... 😢

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it interesting that we have no index on topic_custom_fields.name. Wondering if it makes sense to have one added in core.

We have this one:

#  topic_custom_fields_value_key_idx                             (value,name) WHERE ((value IS NOT NULL) AND (char_length(value) < 400))

Which I find a bit strange, why not (name, value)?

@Lhcfl Lhcfl marked this pull request as ready for review September 6, 2024 11:25
Comment on lines +39 to +47
execute <<-SQL
DELETE FROM post_custom_fields
WHERE name = 'is_accepted_answer'
SQL

execute <<-SQL
DELETE FROM topic_custom_fields
WHERE name = 'solved_auto_close_topic_timer_id'
SQL
Copy link
Contributor

@nattsw nattsw Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the deletions here are ok though..

... if we have better knowledge or data, and we know that there are many rows to be deleted, we can consider deletion in batches. Let me do a quick check on who may be a heavy user of discourse-solved...

Copy link
Contributor

@nattsw nattsw Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ok looks like we do have some hundreds of thousands of rows (as mentioned in chat) so perhaps nice to batch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a trick to make it a bit better:

      DELETE FROM topic_custom_fields
      WHERE id IN (SELECT id FROM topic_custom_fields WHERE name = 'solved_auto_close_topic_timer_id')

... but it needs to be tested.

plugin.rb Outdated Show resolved Hide resolved
@@ -190,6 +182,12 @@ def self.skip_db?
::PostSerializer.prepend(DiscourseSolved::PostSerializerExtension)
::UserSummary.prepend(DiscourseSolved::UserSummaryExtension)
::Topic.attr_accessor(:accepted_answer_user_id)
::Topic.has_one(:solution, class_name: ::DiscourseSolved::Solution.to_s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if this and the next line should be in a Extension module.

spec/fabricators/extend_topic_fabricator.rb Outdated Show resolved Hide resolved
FROM topic_custom_fields tc
LEFT JOIN topic_custom_fields tc2
ON tc2.topic_id = tc.topic_id AND tc2.name = 'solved_auto_close_topic_timer_id'
LEFT JOIN user_actions ua
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it interesting that we have no index on topic_custom_fields.name. Wondering if it makes sense to have one added in core.

We have this one:

#  topic_custom_fields_value_key_idx                             (value,name) WHERE ((value IS NOT NULL) AND (char_length(value) < 400))

Which I find a bit strange, why not (name, value)?

Comment on lines +39 to +47
execute <<-SQL
DELETE FROM post_custom_fields
WHERE name = 'is_accepted_answer'
SQL

execute <<-SQL
DELETE FROM topic_custom_fields
WHERE name = 'solved_auto_close_topic_timer_id'
SQL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a trick to make it a bit better:

      DELETE FROM topic_custom_fields
      WHERE id IN (SELECT id FROM topic_custom_fields WHERE name = 'solved_auto_close_topic_timer_id')

... but it needs to be tested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants