DB/1 Store suggested sentences as separate entries with separate ids (all corresponding to the same sentence id) #27

Open
AppolinFotso opened this issue Oct 7, 2023 · 2 comments · Fixed by #38

@AppolinFotso
Contributor

Currently, all suggested sentences are stored under one id. In the JSON file, they appear as a single comma-separated entry. This is not ideal for further data processing.
We want to store the suggested sentences as separate entries (a sentence per JSON line) with separate ids.
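
A minimal sketch of the target format, assuming the combined entry can be split on commas (which breaks as soon as a suggestion itself contains a comma, one more reason to store them separately). The field names `id`, `sentence_id` and `suggestion`, and the helper names, are hypothetical and not taken from the project:

```python
import json
from itertools import count


def split_suggestions(sentence_id, combined, id_counter):
    """Yield one record per suggestion, each with its own id (hypothetical field names)."""
    for text in combined.split(","):
        text = text.strip()
        if text:  # skip empty fragments such as a trailing comma
            yield {
                "id": next(id_counter),
                "sentence_id": sentence_id,
                "suggestion": text,
            }


def to_json_lines(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


ids = count(1)
records = list(split_suggestions(7, "first suggestion, second suggestion", ids))
print(to_json_lines(records))
```

Every record keeps the same `sentence_id` but gets its own `id`, so later processing can address a single suggestion directly.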

@AppolinFotso
Contributor Author

PR #38

sztupy added a commit that referenced this issue Oct 12, 2023
#27 and #34- store each suggestion with unique id and alter sentences table - Appolin Semegni Fotso
@sztupy
Contributor

sztupy commented Oct 12, 2023

While there are some issues with this PR, I will commit it because it includes a couple of important architectural changes that should be in main. The ticket should be kept open so the remaining issues can be fixed in a new PR:

We still need a good database setup script. We have the schema file now, but it is not runnable (if you just pass it into Postgres it will throw syntax errors), and it is missing the setup of the seed data (the initial sentences).

We should be using IDs throughout. There are two places where this becomes important:

In the selected_suggestion column we still store the full suggestion. This should be the id of the relevant entry in the suggestion table.
We send the full sentence to the frontend but not its ID. Then, when we click a suggestion, we send the full sentence back and ask the database to find its ID. What we should do is send both the sentence AND its ID to the frontend, and the frontend should send BOTH back. The database can then look up the sentence by ID (and verify it is correct by matching it against the sentence). This will still be safe, but also performant. (Note that sending back the full sentence is not technically necessary; the ID is enough for most purposes.)
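
The round trip described above could look roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in so that it runs without a Postgres server; the `suggestion` table layout (`id`, `sentence_id`, `text`) and the function names are assumptions, not the project's actual schema or code:

```python
import sqlite3

# In-memory stand-in database; the real project uses Postgres.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suggestion ("
    " id INTEGER PRIMARY KEY,"
    " sentence_id INTEGER NOT NULL,"
    " text TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO suggestion (id, sentence_id, text) VALUES (?, ?, ?)",
    [(1, 7, "first suggestion"), (2, 7, "second suggestion")],
)


def payload_for_frontend(conn, sentence_id):
    """Return every suggestion for a sentence together with its id."""
    rows = conn.execute(
        "SELECT id, text FROM suggestion WHERE sentence_id = ? ORDER BY id",
        (sentence_id,),
    ).fetchall()
    return [{"id": row[0], "text": row[1]} for row in rows]


def verify_selection(conn, suggestion_id, text):
    """Look up by primary key, then check that the text sent back matches."""
    row = conn.execute(
        "SELECT text FROM suggestion WHERE id = ?", (suggestion_id,)
    ).fetchone()
    return row is not None and row[0] == text


payload = payload_for_frontend(conn, 7)
print(payload)
print(verify_selection(conn, 2, "second suggestion"))
```

Once the check passes, only `suggestion_id` would need to be written to selected_suggestion. The lookup goes through the primary key, so no text search over the table is required.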

When getting a sentence from the database, we query the entirety of the database and select a random one from the result. Postgres is clever enough to have features (please have a look on your own) that provide you with a single random sentence; use one of them. The current implementation would lose performance once the sentence database gets huge.
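
One possible shape for this, again sketched with `sqlite3` as a stand-in so it runs anywhere: let the database pick the row with `ORDER BY RANDOM() LIMIT 1`, so only one sentence is returned to the application. Postgres accepts the same clause via its `random()` function. The `sentences` columns (`id`, `text`) are assumptions. For very large tables this still sorts the rows inside the database, so Postgres's `TABLESAMPLE` clause may be worth a look as well:

```python
import sqlite3

# In-memory stand-in database; the real project uses Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
seed = [(1, "sentence one"), (2, "sentence two"), (3, "sentence three")]
conn.executemany("INSERT INTO sentences (id, text) VALUES (?, ?)", seed)


def random_sentence(conn):
    """Ask the database for one random row instead of fetching every row."""
    return conn.execute(
        "SELECT id, text FROM sentences ORDER BY RANDOM() LIMIT 1"
    ).fetchone()


picked = random_sentence(conn)
print(picked)
```

The function returns a single `(id, text)` tuple, or `None` when the table is empty, so the caller never holds the whole table in memory.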

@sztupy sztupy reopened this Oct 12, 2023
Labels
None yet
Projects
Status: 👀 In review
2 participants