-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Create database connection on need instead of keeping long-living connection #57
Conversation
WalkthroughThis pull request introduces a comprehensive refactoring of MySQL storage connection management across the Spider project. The changes primarily focus on introducing a new Changes
Possibly related PRs
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (12)
src/spider/storage/MySqlConnection.hpp (5)
18-23
: Ensure consistent use ofconst
qualifiers.In the
create
method, the parameterstd::string const& url
uses aconst
qualifier after the type, whereas in the rest of the code, theconst
qualifier is placed before the type (e.g.,MySqlConnection(MySqlConnection const&)
). For consistency and readability, it's recommended to place theconst
qualifier before the type in all cases.Apply this diff to standardize the
const
placement:-static auto create(std::string const& url) -> std::variant<MySqlConnection, StorageErr>; +static auto create(const std::string& url) -> std::variant<MySqlConnection, StorageErr>;
21-25
: Consider deleting move operations for clarity.Currently, the copy constructor and copy assignment operator are deleted, but the move constructor and move assignment operator are defaulted. Since
MySqlConnection
manages a unique resource (std::unique_ptr<sql::Connection>
), allowing move semantics is acceptable. However, to prevent accidental misuse and to make the class non-copyable and non-movable, consider deleting the move constructor and move assignment operator as well.If intentional, please disregard this suggestion.
29-30
: Implementconst
correctness in operator overloads.The
operator*
andoperator->
methods allow access to the underlyingsql::Connection
. To promoteconst
correctness and allow access toconst
member functions ofsql::Connection
, consider addingconst
versions of these operators.Apply this diff to add
const
versions:auto operator*() -> sql::Connection&; +auto operator*() const -> const sql::Connection&; auto operator->() -> sql::Connection*; +auto operator->() const -> const sql::Connection*;
33-35
: Initialize member variables in the constructor initializer list.In the private constructor,
m_connection
is initialized using an initializer list. The default member initialization= nullptr
on line 35 is redundant sincem_connection
is assigned a new value. Remove the default initialization for clarity.Apply this diff to remove the redundant initialization:
std::unique_ptr<sql::Connection> m_connection = nullptr; +std::unique_ptr<sql::Connection> m_connection;
15-37
: Add documentation for theMySqlConnection
class and its methods.To improve maintainability and readability, consider adding documentation comments to the
MySqlConnection
class and its methods. This will help other developers understand the purpose and usage of the class.src/spider/storage/MySqlConnection.cpp (1)
52-58
: Check for null pointers before dereferencing.In the
operator*
andoperator->
methods, there is an implicit assumption thatm_connection
is not null. To prevent potential null pointer dereferences, consider adding assertions or checks to ensurem_connection
is valid before accessing it.Apply this diff to add checks:
auto MySqlConnection::operator*() -> sql::Connection& { + assert(m_connection && "m_connection is null"); return *m_connection; } auto MySqlConnection::operator->() -> sql::Connection* { + assert(m_connection && "m_connection is null"); return m_connection.get(); }src/spider/storage/MySqlStorage.hpp (5)
27-28
: Provide meaningful default constructors or delete them with rationale.The default constructor for
MySqlMetadataStorage
is explicitly deleted. Deleting default constructors can be acceptable, but consider providing a comment explaining why the default constructor is deleted to aid future maintainability.
76-79
: Declareadd_task
andfetch_full_task
asstatic
or consider design alternatives.The methods
add_task
andfetch_full_task
are declared asstatic
, which may not align with object-oriented design principles if they manipulate class member data. Evaluate whether these methods should be static or whether they should be member functions.
78-79
: Consistency in method declarations between classes.In
MySqlMetadataStorage
, there are private static methodsadd_task
andfetch_full_task
, but similar methods are not present inMySqlDataStorage
. For consistency and maintainability, consider reviewing and aligning the design of both classes.
84-85
: Same as earlier: Provide meaningful default constructors or delete them with rationale.The default constructor for
MySqlDataStorage
is also explicitly deleted. As withMySqlMetadataStorage
, consider providing a comment explaining the deletion.
121-121
: Consider adding access modifiers for member variables for clarity.The member variable
m_url
inMySqlDataStorage
is declared without an explicit access modifier. While it's currently underprivate
, explicitly specifyingprivate
improves clarity.Apply this diff to add the access modifier:
+private: std::string m_url;
src/spider/storage/MySqlStorage.cpp (1)
Line range hint
386-452
: Consider adding exception handling withinadd_task
to ensure exception safety.The
add_task
function performs multiple database operations that may throwsql::SQLException
. Currently, it relies on the caller to handle exceptions. For better encapsulation and robustness, consider wrapping the body ofadd_task
in a try-catch block to handle exceptions internally.Apply this diff to include exception handling within
add_task
:void MySqlMetadataStorage::add_task(MySqlConnection& conn, sql::bytes job_id, Task const& task) { + try { // Add task std::unique_ptr<sql::PreparedStatement> task_statement( conn->prepareStatement("INSERT INTO `tasks` (`id`, `job_id`, `func_name`, `state`, " "`timeout`, `max_retry`) VALUES (?, ?, ?, ?, ?, ?)") ); // ... [rest of the function body] + } catch (sql::SQLException& e) { + conn->rollback(); + throw; // Rethrow the exception to be handled by the caller if necessary + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
src/spider/CMakeLists.txt
(2 hunks)src/spider/client/Driver.cpp
(3 hunks)src/spider/scheduler/scheduler.cpp
(2 hunks)src/spider/storage/DataStorage.hpp
(0 hunks)src/spider/storage/MetadataStorage.hpp
(0 hunks)src/spider/storage/MySqlConnection.cpp
(1 hunks)src/spider/storage/MySqlConnection.hpp
(1 hunks)src/spider/storage/MySqlStorage.cpp
(58 hunks)src/spider/storage/MySqlStorage.hpp
(4 hunks)src/spider/worker/task_executor.cpp
(2 hunks)src/spider/worker/worker.cpp
(2 hunks)tests/storage/StorageTestHelper.hpp
(3 hunks)
💤 Files with no reviewable changes (2)
- src/spider/storage/MetadataStorage.hpp
- src/spider/storage/DataStorage.hpp
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
- GitHub Check: lint
🔇 Additional comments (8)
src/spider/storage/MySqlConnection.cpp (2)
20-24
: Improve URL validation and error messages.The regular expression for parsing the JDBC URL may not cover all valid scenarios and could lead to false negatives. Additionally, the error message "Invalid url" lacks detail. Consider enhancing the URL validation to cover more cases and providing more informative error messages.
Would you like me to suggest an improved regular expression or validation approach?
28-37
: Securely handle sensitive information in exceptions and logs.When an exception occurs, the error message
e.what()
could contain sensitive information, such as database connection details. Ensure that sensitive information is not exposed in logs or error messages.[security_issue]
Consider sanitizing error messages before returning or logging them.src/spider/storage/MySqlStorage.cpp (1)
269-273
: Connection handling updated to create connections on-demand.The changes appropriately modify the connection handling by creating a new
MySqlConnection
for each database operation ininitialize()
, aligning with the PR objectives to avoid long-lived connections.tests/storage/StorageTestHelper.hpp (1)
14-14
: Update storage constructors to acceptcStorageUrl
parameter.The updates correctly modify the constructors to include
cStorageUrl
, conforming to the new requirement that storage classes are initialized with the storage URL.Also applies to: 27-27, 35-35, 44-44, 46-46
src/spider/client/Driver.cpp (1)
17-17
: Simplify storage initialization and error handling inDriver
constructor.The changes appropriately adjust the constructors to pass
storage_url
directly to the storage classes and streamline error handling by declaringerr
as a constant. This enhances code clarity and aligns with the updated connection management strategy.Also applies to: 26-27, 29-29, 51-52, 54-54
src/spider/worker/task_executor.cpp (1)
127-129
: Consider adding error handling for storage initialization.The storage objects are initialized with the storage_url parameter, but there's no explicit error handling for potential connection failures. This could lead to runtime issues if the database connection cannot be established.
Run this script to check if error handling exists in the MySqlStorage implementation:
src/spider/scheduler/scheduler.cpp (1)
169-174
: LGTM! Well-structured error handling.The initialization of storage objects includes proper error handling through the
StorageErr
mechanism, which is a good practice for managing database connection failures.src/spider/CMakeLists.txt (1)
3-4
: LGTM! Consistent naming convention and proper integration.The addition of MySqlConnection files and the renaming from Mysql to MySql maintains consistency across the codebase. The new connection management class is properly integrated into the build system.
Also applies to: 27-28
MySqlConnection::~MySqlConnection() { | ||
if (m_connection) { | ||
m_connection->close(); | ||
m_connection.reset(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure proper exception handling in the destructor.
The destructor calls m_connection->close()
, which may throw an exception. Throwing exceptions from a destructor can lead to undefined behaviour. Consider wrapping the close
call in a try-catch block to handle any potential exceptions.
Apply this diff to handle exceptions:
void MySqlConnection::~MySqlConnection() {
if (m_connection) {
+ try {
m_connection->close();
+ } catch (const sql::SQLException& e) {
+ // Log the exception or handle it appropriately
+ }
m_connection.reset();
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
MySqlConnection::~MySqlConnection() { | |
if (m_connection) { | |
m_connection->close(); | |
m_connection.reset(); | |
} | |
} | |
MySqlConnection::~MySqlConnection() { | |
if (m_connection) { | |
try { | |
m_connection->close(); | |
} catch (const sql::SQLException& e) { | |
// Log the exception or handle it appropriately | |
} | |
m_connection.reset(); | |
} | |
} |
bool const credential = match[2].matched && match[4].matched; | ||
std::unique_ptr<sql::Connection> conn; | ||
try { | ||
sql::Driver* driver = sql::mariadb::get_driver_instance(); | ||
if (credential) { | ||
conn = std::unique_ptr<sql::Connection>( | ||
driver->connect(sql::SQLString(url), match[2].str(), match[4].str()) | ||
); | ||
} else { | ||
conn = std::unique_ptr<sql::Connection>( | ||
driver->connect(sql::SQLString(url), sql::Properties{}) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle cases where only username or password is provided.
The current logic checks for both username and password (match[2].matched && match[4].matched
). If only a username or only a password is provided in the URL, the connection may fail or behave unexpectedly. Consider handling cases where only one credential is provided.
Apply this diff to adjust the credential handling:
-bool const credential = match[2].matched && match[4].matched;
+bool const has_user = match[2].matched;
+bool const has_password = match[4].matched;
Then update the connection logic accordingly to handle all combinations.
Committable suggestion skipped: line range outside the PR's diff.
src/spider/storage/MySqlStorage.cpp
Outdated
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | ||
if (std::holds_alternative<StorageErr>(conn_result)) { | ||
return std::get<StorageErr>(conn_result); | ||
} | ||
auto& conn = std::get<MySqlConnection>(conn_result); | ||
try { | ||
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | ||
if (std::holds_alternative<StorageErr>(conn_result)) { | ||
return std::get<StorageErr>(conn_result); | ||
} | ||
auto& conn = std::get<MySqlConnection>(conn_result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove redundant connection initialization in MySqlDataStorage::initialize()
.
The connection initialization code is duplicated between lines 1655 and 1665. This redundancy can lead to confusion and potential errors. Please remove the duplicate code to maintain clarity and prevent unnecessary operations.
Apply this diff to eliminate the redundancy:
auto MySqlDataStorage::initialize() -> StorageErr {
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url);
if (std::holds_alternative<StorageErr>(conn_result)) {
return std::get<StorageErr>(conn_result);
}
auto& conn = std::get<MySqlConnection>(conn_result);
- std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url);
- if (std::holds_alternative<StorageErr>(conn_result)) {
- return std::get<StorageErr>(conn_result);
- }
- auto& conn = std::get<MySqlConnection>(conn_result);
// Need to initialize metadata storage first so that foreign constraint is not violated
for (char const* create_table_str : cCreateStorage) {
std::unique_ptr<sql::Statement> statement(conn->createStatement());
statement->executeUpdate(create_table_str);
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | |
if (std::holds_alternative<StorageErr>(conn_result)) { | |
return std::get<StorageErr>(conn_result); | |
} | |
auto& conn = std::get<MySqlConnection>(conn_result); | |
try { | |
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | |
if (std::holds_alternative<StorageErr>(conn_result)) { | |
return std::get<StorageErr>(conn_result); | |
} | |
auto& conn = std::get<MySqlConnection>(conn_result); | |
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | |
if (std::holds_alternative<StorageErr>(conn_result)) { | |
return std::get<StorageErr>(conn_result); | |
} | |
auto& conn = std::get<MySqlConnection>(conn_result); |
= std::make_shared<spider::core::MySqlMetadataStorage>(storage_url); | ||
std::shared_ptr<spider::core::DataStorage> const data_store | ||
= std::make_shared<spider::core::MySqlDataStorage>(); | ||
err = data_store->connect(storage_url); | ||
if (!err.success()) { | ||
spdlog::error("Cannot connect to data storage: {}", err.description); | ||
return cStorageConnectionErr; | ||
} | ||
= std::make_shared<spider::core::MySqlDataStorage>(storage_url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Add initialization error handling after storage creation
The storage classes require explicit initialization after construction. Add error checking similar to the pattern in scheduler.cpp:
- After creating the storage objects, call
initialize()
on both stores - Check the returned
StorageErr
and handle any connection failures
Example locations requiring error handling:
src/spider/worker/worker.cpp:363-365
: Add initialization and error checking for both metadata_store and data_store
🔗 Analysis chain
Consider adding error handling for storage initialization.
While error handling exists for driver addition, there's no explicit error handling for potential connection failures during storage initialization.
Run this script to verify the error handling pattern across the codebase:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for consistent error handling patterns
rg -A 5 "std::make_shared<.*MySql.*Storage>" | grep -B 5 "if.*err.*success"
Length of output: 1393
Script:
#!/bin/bash
# Check storage class implementations
ast-grep --pattern 'class MySql(Meta)?DataStorage {
$$$
}'
# Also check for initialize method implementations
ast-grep --pattern 'StorageErr $_(MySql(Meta)?DataStorage::initialize)() {
$$$
}'
Length of output: 148
Script:
#!/bin/bash
# Find storage implementation files
fd -e cpp -e hpp -e h "MySql.*Storage"
# Search for constructor and initialization patterns
rg -A 5 "MySql(Meta)?DataStorage::(MySql|initialize)" --type cpp
Length of output: 711
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/spider/storage/MySqlStorage.hpp (1)
Line range hint
27-122
: Document connection lifecycle management.Consider adding documentation that describes:
- Connection lifecycle management
- Thread-safety requirements
- Best practices for connection sharing/pooling
- Error handling expectations
This will help users of these classes properly manage database connections and avoid potential issues.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/spider/storage/MySqlStorage.hpp
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: lint
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
🔇 Additional comments (4)
src/spider/storage/MySqlStorage.hpp (4)
6-6
: LGTM! Required includes added for new connection management.The addition of necessary headers supports the refactoring of connection management.
Also applies to: 22-22
27-28
: LGTM! Constructor changes enforce proper initialization.The explicit constructor with URL parameter and deletion of the default constructor ensure proper initialization of the storage object. The m_url member appropriately replaces the connection pointer for the new connection management strategy.
Also applies to: 76-76
85-86
: LGTM! Consistent implementation with MySqlMetadataStorage.The changes to MySqlDataStorage mirror those in MySqlMetadataStorage, maintaining consistency in the connection management approach across storage classes.
Also applies to: 122-122
78-80
: Consider thread-safety implications of shared connections.The static methods accepting a MySqlConnection reference might lead to thread-safety issues if the same connection is shared across multiple threads. Consider documenting thread-safety requirements or adding synchronization mechanisms.
Let's verify the thread-safety implications:
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/spider/storage/MySqlConnection.cpp (1)
25-37
: 🛠️ Refactor suggestionHandle cases where only username or password is provided
The current logic at line 25 checks for both username and password (
match[2].matched && match[4].matched
). If only a username or only a password is provided in the JDBC URL, the connection may fail or behave unexpectedly. Consider updating the logic to handle cases where only one of the credentials is provided.You can adjust the credential handling as follows:
-bool const credential = match[2].matched && match[4].matched; +bool const has_user = match[2].matched; +bool const has_password = match[4].matched;Then update the connection logic accordingly to handle all credential combinations.
src/spider/storage/MySqlStorage.cpp (1)
1655-1671
: 🛠️ Refactor suggestionConsolidate storage initialization to avoid redundancy
The
MySqlDataStorage::initialize()
method duplicates the storage initialization code fromMySqlMetadataStorage::initialize()
. To prevent redundancy and potential maintenance issues, consider invoking the metadata storage initialization from within the data storage initialization.You can modify
MySqlDataStorage::initialize()
as follows:auto MySqlDataStorage::initialize() -> StorageErr { // Initialize metadata storage first MySqlMetadataStorage metadata_storage(m_url); StorageErr err = metadata_storage.initialize(); if (err.has_error()) { return err; } // Additional data storage initialization if necessary return StorageErr{}; }
🧹 Nitpick comments (4)
src/spider/storage/MySqlConnection.cpp (2)
39-42
: Catch exceptions by constant referenceIn the
create
method, it is preferable to catch exceptions by constant reference to avoid unnecessary copying and to adhere to best practices.Update the catch block as follows:
-} catch (sql::SQLException& e) { +} catch (const sql::SQLException& e) {
50-51
: Log exceptions in the destructor for debuggingWhile ignoring exceptions in destructors prevents potential termination during stack unwinding, logging the exception details can aid in debugging connection closure issues.
Modify the catch block to log the exception:
} catch (const sql::SQLException& e) { - // Ignore + spdlog::error("Exception caught in MySqlConnection destructor: {}", e.what()); }Ensure that the appropriate logging headers are included.
src/spider/storage/MySqlStorage.cpp (2)
382-386
: Limit the scope of prepared statements to prevent resource retentionIn the
add_task
function, prepared statements inside loops can retain resources longer than necessary. Limiting their scope ensures resources are released promptly.Enclose prepared statements within their own scope:
for (std::uint64_t i = 0; i < task.get_num_inputs(); ++i) { { // Existing code std::unique_ptr<sql::PreparedStatement> input_statement(/* ... */); // Use input_statement } // input_statement is destroyed here }
Line range hint
1466-1514
: Optimize queries to fetch only necessary task detailsThe
get_child_tasks
andget_parent_tasks
functions retrieve full task details, which may be inefficient for large datasets if only specific fields are needed. Consider fetching only required fields to enhance performance.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/spider/storage/MySqlConnection.cpp
(1 hunks)src/spider/storage/MySqlConnection.hpp
(1 hunks)src/spider/storage/MySqlStorage.cpp
(58 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/spider/storage/MySqlConnection.hpp
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: lint
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | ||
if (std::holds_alternative<StorageErr>(conn_result)) { | ||
return std::get<StorageErr>(conn_result); | ||
} | ||
auto& conn = std::get<MySqlConnection>(conn_result); | ||
try { | ||
for (char const* create_table_str : cCreateStorage) { | ||
std::unique_ptr<sql::Statement> statement(m_conn->createStatement()); | ||
std::unique_ptr<sql::Statement> statement(conn->createStatement()); | ||
statement->executeUpdate(create_table_str); | ||
} | ||
} catch (sql::SQLException& e) { | ||
m_conn->rollback(); | ||
conn->rollback(); | ||
return StorageErr{StorageErrType::OtherErr, e.what()}; | ||
} | ||
|
||
m_conn->commit(); | ||
conn->commit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Wrap connection initialization in a try-catch block
In MySqlMetadataStorage::initialize()
, exceptions during connection creation are not currently caught. Wrapping the connection initialization in a try-catch block ensures that any exceptions are properly handled, preventing unexpected behaviour.
Apply this change to handle exceptions during connection initialization:
auto MySqlMetadataStorage::initialize() -> StorageErr {
+ try {
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url);
if (std::holds_alternative<StorageErr>(conn_result)) {
return std::get<StorageErr>(conn_result);
}
auto& conn = std::get<MySqlConnection>(conn_result);
try {
for (char const* create_table_str : cCreateStorage) {
std::unique_ptr<sql::Statement> statement(conn->createStatement());
statement->executeUpdate(create_table_str);
}
- } catch (sql::SQLException& e) {
+ } catch (const sql::SQLException& e) {
conn->rollback();
return StorageErr{StorageErrType::OtherErr, e.what()};
}
conn->commit();
return StorageErr{};
+ } catch (const sql::SQLException& e) {
+ return StorageErr{StorageErrType::ConnectionErr, e.what()};
+ }
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | |
if (std::holds_alternative<StorageErr>(conn_result)) { | |
return std::get<StorageErr>(conn_result); | |
} | |
auto& conn = std::get<MySqlConnection>(conn_result); | |
try { | |
for (char const* create_table_str : cCreateStorage) { | |
std::unique_ptr<sql::Statement> statement(m_conn->createStatement()); | |
std::unique_ptr<sql::Statement> statement(conn->createStatement()); | |
statement->executeUpdate(create_table_str); | |
} | |
} catch (sql::SQLException& e) { | |
m_conn->rollback(); | |
conn->rollback(); | |
return StorageErr{StorageErrType::OtherErr, e.what()}; | |
} | |
m_conn->commit(); | |
conn->commit(); | |
try { | |
std::variant<MySqlConnection, StorageErr> conn_result = MySqlConnection::create(m_url); | |
if (std::holds_alternative<StorageErr>(conn_result)) { | |
return std::get<StorageErr>(conn_result); | |
} | |
auto& conn = std::get<MySqlConnection>(conn_result); | |
try { | |
for (char const* create_table_str : cCreateStorage) { | |
std::unique_ptr<sql::Statement> statement(conn->createStatement()); | |
statement->executeUpdate(create_table_str); | |
} | |
} catch (const sql::SQLException& e) { | |
conn->rollback(); | |
return StorageErr{StorageErrType::OtherErr, e.what()}; | |
} | |
conn->commit(); | |
return StorageErr{}; | |
} catch (const sql::SQLException& e) { | |
return StorageErr{StorageErrType::ConnectionErr, e.what()}; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/spider/storage/MySqlConnection.cpp (1)
26-38
:⚠️ Potential issueHandle cases where only username or password is provided.
The logic currently checks if both username and password are provided (
match[2].matched && match[4].matched
). If only one is present, the connection may fail. Consider handling cases where only the username or only the password is provided to improve flexibility and robustness.Apply this diff to adjust the credential handling:
-bool const credential = match[2].matched && match[4].matched; +bool const has_user = match[2].matched; +bool const has_password = match[4].matched;Then update the connection logic accordingly to handle all combinations of user credentials.
🧹 Nitpick comments (3)
src/spider/worker/task_executor.cpp (1)
126-128
: Consider adding error handling for storage initialization.While the connection handling has been moved into the constructors, it would be beneficial to catch and handle any potential connection errors at this level to provide more specific error codes and logging.
std::shared_ptr<spider::core::MetadataStorage> const metadata_store - = std::make_shared<spider::core::MySqlMetadataStorage>(storage_url); + = nullptr; std::shared_ptr<spider::core::DataStorage> const data_store - = std::make_shared<spider::core::MySqlDataStorage>(storage_url); + = nullptr; +try { + metadata_store = std::make_shared<spider::core::MySqlMetadataStorage>(storage_url); + data_store = std::make_shared<spider::core::MySqlDataStorage>(storage_url); +} catch (const std::exception& e) { + spdlog::error("Failed to initialize storage: {}", e.what()); + return cStorageErr; +}src/spider/storage/MySqlConnection.cpp (2)
21-25
: Improve URL parsing robustness.The current regular expression may not handle all valid JDBC URLs, especially if the URL includes additional query parameters or lacks user credentials. Consider using a more robust URL parsing method or a dedicated URI parsing library to handle various URL formats and edge cases.
24-24
: Enhance error message for invalid URL.Including the invalid URL in the error message can help users debug issues more effectively.
Apply this diff to improve the error message:
- return StorageErr{StorageErrType::OtherErr, "Invalid url"}; + return StorageErr{StorageErrType::OtherErr, "Invalid URL: " + url};
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/spider/storage/MySqlConnection.cpp
(1 hunks)src/spider/worker/task_executor.cpp
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: lint
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
🔇 Additional comments (1)
src/spider/worker/task_executor.cpp (1)
27-27
: LGTM! Header inclusion is appropriate.The inclusion of MySqlStorage.hpp is necessary for the MySQL storage implementations and follows the changes in storage initialization.
Description
MySqlMetadataStorage
andMySqlDataStorage
connects to database after it is created and keeps the connection alive to reduce the latency in database operations. However, database connection is a scarce resource. Keeping long-living connections significantly restricts scalability of the system. Thus, we now create database connection on need instead of keeping long-living connection.Validation performed
Summary by CodeRabbit
Release Notes: Storage Connection Management Update
New Features
MySqlConnection
class for robust database connection management.Changes
connect()
andclose()
methods from storage classes.Improvements
Compatibility