Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHARD-1104 - Postgres DB support #17

Open
wants to merge 71 commits into
base: dev
Choose a base branch
from
Open

SHARD-1104 - Postgres DB support #17

wants to merge 71 commits into from

Conversation

yaseen-oakrev
Copy link

No description provided.

yaseen-oakrev and others added 30 commits July 23, 2024 16:52
Copy link

github-actions bot commented Sep 5, 2024

PR Reviewer Guide 🔍

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 Security concerns

SQL Injection Risk:
The dynamic SQL generation in src/storage/transaction.ts could lead to SQL injection if not handled properly. Ensure all inputs are sanitized and use parameterized queries where possible.

⚡ Key issues to review

SQL Injection Risk
The SQL queries constructed in functions like insertTransaction and bulkInsertTransactions are dynamically creating SQL strings which could potentially lead to SQL injection if not properly sanitized. It's crucial to ensure that all inputs are properly sanitized and parameterized to avoid SQL injection vulnerabilities.

Code Duplication
There is significant code duplication in the transaction handling for PostgreSQL and SQLite. Consider abstracting the database interaction to reduce duplication and improve maintainability.

Error Handling
The current implementation logs the error but does not handle it in a way that the calling function can react to it. Consider throwing the error or handling it in a way that the caller can make decisions based on the type of error.

src/storage/pgStorage.ts Fixed Show fixed Hide fixed
src/storage/pgStorage.ts Fixed Show fixed Hide fixed
src/storage/pgStorage.ts Fixed Show fixed Hide fixed
src/storage/pgStorage.ts Fixed Show fixed Hide fixed
@@ -214,8 +263,10 @@ export async function queryBlockCount(): Promise<number> {

export async function queryLatestBlocks(count: number): Promise<DbBlock[]> {
try {
const sql = `SELECT * FROM blocks ORDER BY number DESC LIMIT ${count}`
const blocks: DbBlock[] = await db.all(sql)
const sql = `SELECT *${config.postgresEnabled ? ', readableBlock::TEXT' : ''} FROM blocks ORDER BY number DESC LIMIT ${count}`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of something like this can we abstract the query creation into a method (of a class maybe) and have the conditionals in that method instead of having it in place? This is hard to debug in case of issues and from readability POV it's not pleasant. Same applies for the rest of the queries.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhamj, due to the differences in sql syntax of postgres and sqlite, these queries are different.
If I understand this correctly, having a separate method that I call to abstract out select * from ..., select field from ... queries would result in that method becoming a rule-based mess. Please let me know if this isn't what you had in mind. Also, could you give me an example? That'll help me visualise the solution better.

const sql = `SELECT * FROM blocks ORDER BY number DESC LIMIT ${count}`
const blocks: DbBlock[] = await db.all(sql)
const sql = `SELECT *${config.postgresEnabled ? ', readableBlock::TEXT' : ''} FROM blocks ORDER BY number DESC LIMIT ${count}`
const blocks: DbBlock[] = config.postgresEnabled
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can choosing the DB instance also be abstracted into a single method? If I have to update the config in the future or add a new database it will be cumbersome. Unless there is a specific reason for this.

Ideally, I would have liked if we used a simple repo interface defining run, all, etc. which is composed into this file and has multiple implementations one for sqlite and one for postgres.

Copy link

@mgthuramoemyint mgthuramoemyint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query functions look okay, just wanna make sure in insert functions, all variables should come from trusted source or validate already.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these data such as account, token, _accountId coming from trusted source, type-check validation already done? All query functions look okay, just some concern for update function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the above come from the distributor. And we use parameterized queries to insert data, so threat of an SQL injection should be minimum.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the data are properly validated, then it will be fine. Just wanted to bring your attention that there could be SQL injection even using parameterized queries. Example: insertAccount(account: Account), if the Account object has the key name like maliciousdata') , the query will become:
INSERT INTO accounts (maliciousdata')) VALUES ($1) ON CONFLICT(accountId) DO UPDATE SET ... which could change the query.

Copy link
Author

@yaseen-oakrev yaseen-oakrev Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this data is queried from the distributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same goes for AccountHistoryState, is it coming from trusted data?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this data is queried from the distributor

@@ -55,6 +55,8 @@
"morgan": "1.9.1",
"node-cron": "3.0.2",
"nodemon": "^2.0.20",
"pg": "8.12.0",
"pg-format": "^1.0.4",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"pg-format": "^1.0.4",
"pg-format": "1.0.4",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also commit necessary changes in package-lock.json

@@ -76,6 +78,7 @@
"@typescript-eslint/eslint-plugin": "5.60.1",
"@typescript-eslint/parser": "5.60.1",
"@typescript-eslint/typescript-estree": "5.61.0",
"@types/pg-format": "^1.0.5",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"@types/pg-format": "^1.0.5",
"@types/pg-format": "1.0.5",

Comment on lines +44 to +45
const balance = BigInt(`0x${account.account.account.balance}`)
const nonce = BigInt(`0x${account.account.account.nonce}`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we missed using these below?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they aren't required, removed them

const sql = 'INSERT OR REPLACE INTO accounts (' + fields + ') VALUES (' + placeholders + ')'
await db.run(sql, values)

if (config.postgresEnabled) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Arham has already mentioned this; it would be great if we can have some kind of factory pattern on datasource or config driven data source (or if I can say DAO layer which abstracts the methods) instead of if/else conditions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a conscious decision (Here's the relevant discussion)

await pgDb.run(sql, values)

// await transformCycle(cycle)
await updateCycleAnalytics()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed here? can this be done separately? I'm sorry I may not have enough context here but we may need to check impact it may have on inserting cycles. Will this run as a separate collector inserting only in PG and will not contribute to network?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aniketdivekar this is to store things for marketing dashboard. This runs only on one server, and has no response times. I don't think we need to do it seperately as it does not hold up anything which can cause downstream problems.

@justin-shardeum justin-shardeum changed the title Postgres DB support SHARD-1104 - Postgres DB support Nov 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants