Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error with writing prices to database #87

Merged
merged 6 commits into from
Jan 31, 2025
Merged

Conversation

fhenneke
Copy link
Contributor

This PR attempts to fix an issue with writing prices to the database. It might close #82.

Before, it could happen that the same price (i.e. the same token, time stamp, source) was inserted into the prices table. This failed due to violating the primary key constraint. This error (and potentially more?) was silenced in #84.

With this PR, the code first trys to update a value and then inserts it when the update did not succeed.

Alternatively, one could also first try to insert and update if that throws an error.

I added a test for writing a price twice. I was not able to reproduce the error by running the daemon, so that we might need to test some other way.

this should fix a bug with trying to write the same price into the table multiple times
@harisang
Copy link
Contributor

harisang commented Jan 30, 2025

@fhenneke Out of curiosity, i attempted to pull a "review" from ChatGPT for the write_prices_new() function. Let me know if it makes sense overall.....

Your function is mostly correct, but I see a few potential issues and areas for improvement:


✅ What’s Good?

  • Uses SQLAlchemy transactions correctly with begin() and begin_nested().
  • Handles IntegrityError properly to update instead of inserting.
  • Logs errors properly, which is good for debugging.
  • Uses bytes.fromhex(token_address[2:]), correctly stripping the 0x prefix from hex strings.

⚠️ Issues and Improvements

1. autoload_with=self.engine Is Deprecated

  • Problem: Table("prices", MetaData(), autoload_with=self.engine) is not the best way to reflect an existing table anymore.
  • Fix: Use MetaData(bind=self.engine, reflect=True) instead.

Better Way to Load Table:

metadata = MetaData(bind=self.engine)
metadata.reflect()
prices_table = metadata.tables["prices"]

2. savepoint.rollback() May Not Be Necessary

  • Problem: The savepoint.rollback() inside except IntegrityError: isn't always necessary because a failed INSERT doesn't necessarily put the transaction in an invalid state.
  • Fix: Remove savepoint.rollback() and simply proceed with UPDATE.

Safer Alternative:

try:
    conn.execute(insert_stmt)
except sqlalchemy.exc.IntegrityError:
    logger.info("Insert failed, attempting update...")
    conn.execute(update_stmt)

3. Handle NumericValueOutOfRange Properly

  • Problem: psycopg.errors.NumericValueOutOfRange isn't caught because psycopg exceptions usually don’t propagate well inside SQLAlchemy’s transaction management.
  • Fix: Wrap it in sqlalchemy.exc.DataError, which is more reliable.

Updated Exception Handling:

except sqlalchemy.exc.DataError as e:
    logger.error(f"Value out of range error: {e}")

4. Missing Commit for the Update

  • Problem: Your update_stmt executes, but you never commit the transaction.
  • Fix: Explicitly commit after update.

Ensure Commit:

conn.execute(update_stmt)
conn.commit()

5. Improve Logging for Debugging

  • Right now, your logs don't indicate whether the UPDATE was actually successful.

Improve Logging:

result = conn.execute(update_stmt)
if result.rowcount == 0:
    logger.warning(f"Update failed: No matching record found for {token_address}")

🔍 Final Improved Code

Here’s a cleaned-up version with all fixes applied:

from sqlalchemy import Table, MetaData, insert, update
from sqlalchemy.exc import IntegrityError, DataError
import logging
from datetime import datetime, timezone

logger = logging.getLogger(__name__)

def write_prices_new(self, prices: list[tuple[str, int, float, str]]) -> None:
    """Write prices to database."""
    try:
        metadata = MetaData(bind=self.engine)
        metadata.reflect()
        prices_table = metadata.tables["prices"]

        with self.engine.connect() as conn:
            with conn.begin():
                for token_address, time, price, source in prices:
                    try:
                        # Try inserting first
                        insert_stmt = insert(prices_table).values(
                            token_address=bytes.fromhex(token_address[2:]),
                            time=datetime.fromtimestamp(time, tz=timezone.utc),
                            price=price,
                            source=source,
                        )
                        conn.execute(insert_stmt)

                    except IntegrityError:
                        logger.info(f"Insert failed, updating instead: {token_address}, {time}, {price}, {source}")
                        
                        update_stmt = (
                            update(prices_table)
                            .where(prices_table.c.token_address == bytes.fromhex(token_address[2:]))
                            .where(prices_table.c.time == datetime.fromtimestamp(time, tz=timezone.utc))
                            .where(prices_table.c.source == source)
                            .values(price=price)
                        )

                        result = conn.execute(update_stmt)
                        if result.rowcount == 0:
                            logger.warning(f"Update failed: No matching record found for {token_address}")

                    except DataError as e:
                        logger.error(f"Value out of range error: {e}")

    except Exception as e:
        logger.error(f"Error writing prices: {e}")

🔹 Key Fixes

✅ Uses proper table reflection
✅ Removes unnecessary savepoint.rollback()
✅ Ensures commit after update
✅ Uses DataError for NumericValueOutOfRange
✅ Improves logging for debugging

This version should be safer and more reliable! 🚀 Let me know if you need further tweaks.

@fhenneke
Copy link
Contributor Author

fhenneke commented Jan 31, 2025

I guess it is good to get such comments for double checking. The issues pointed out, though, are mostly incorrect.

  1. autoload_with=self.engine does not seem to be deprecated. It is part of the documentation for setting up tables here.
  2. Removing the call to rollback will crash the test with the error
     ERROR    root:database.py:217 Error writing prices:(psycopg.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block
    
    So it does seem to be necessary.
  3. It is not correct that the errors for large values are not caught. I added a test to check that writing large prices emits an error log but does not crash the code. I want to be rather specific in the errors which are caught, so I would like to keep it as psycopg.errors.NumericValueOutOfRange.
  4. A call to commit is not required due to the with conn.begin(): block. This block automatically commits at the end, see the documentation.
  5. I added an error log if updating fails. This should never happened nd indicates that something is wrong with the database (or the price).

@harisang
Copy link
Contributor

Ok thanks for double-checking everything. It was mostly out of curiosity that i posted the above, as i wouldn't have come up with such comments myself.

Things look good and since i want to get this out of the way, i am fine approving now

@harisang harisang merged commit 5f28499 into main Jan 31, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Properly handle duplicate entries in prices table
2 participants