334 handle spark session creation when already exists #335

Merged

xuwenyihust
Owner

No description provided.

…nagement

- Added a new code cell in the demo notebook to initialize a Spark session with detailed configuration settings, including application ID and Spark UI link.
- Updated SparkModel.js to check for existing Spark application IDs before storing new session information, improving error handling and preventing duplicate entries.
- Enhanced logging for better visibility into Spark session creation and management processes.
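
The duplicate check described in the list above can be sketched as follows. This is a minimal Python illustration, not the code from SparkModel.js: the class name `SparkAppRegistry`, the method `store_if_new`, and the in-memory dictionary are hypothetical stand-ins for whatever storage the project actually uses.

```python
from typing import Dict


class SparkAppRegistry:
    """Hypothetical in-memory store mapping Spark application IDs to notebook paths."""

    def __init__(self) -> None:
        self._apps: Dict[str, str] = {}

    def store_if_new(self, app_id: str, notebook_path: str) -> bool:
        """Store the session info only if the ID is not already known.

        Returns True when a new entry was written, False for a duplicate.
        """
        if app_id in self._apps:
            # An entry already exists, so skip the write instead of duplicating it.
            return False
        self._apps[app_id] = notebook_path
        return True

    def count(self) -> int:
        """Number of distinct application IDs stored so far."""
        return len(self._apps)
```

Returning a boolean rather than raising lets the caller log a duplicate and carry on, which matches the stated goal of handling a session that already exists.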
…encies flag

- Modified the npm install command to include the --legacy-peer-deps flag, ensuring compatibility with older peer dependencies during the build process of the React application.
- Upgraded Node.js version from 14 to 18 for improved performance and compatibility.
- Cleared npm cache before installing dependencies to ensure a clean environment.
- Added installation of @jridgewell/gen-mapping to support additional functionality.
- Increased memory allocation for the build process to 4096 MB via NODE_OPTIONS.
- Downgraded Node.js version from 18 to 14 for compatibility.
- Simplified npm installation by removing cache cleaning and legacy peer dependencies flag.
- Removed increased memory allocation for the build process, optimizing the Dockerfile for a more straightforward build.
- Upgraded Node.js version from 14 to 18 for improved performance.
- Implemented a clean install of npm dependencies with legacy peer dependencies support.
- Added specific package installations for @jridgewell/gen-mapping and @babel/generator.
- Increased memory allocation for the build process to 4096 MB via NODE_OPTIONS.
…process

- Updated npm installation commands to set legacy peer dependencies and install packages in a specific order.
- Cleaned npm cache and rebuilt before running the build command to ensure a fresh environment.
- Increased clarity and efficiency in the Dockerfile setup for the web application.
- Updated demo notebook to reflect successful Spark session execution, including updated execution metadata and application ID.
- Refactored Spark session creation in backend to streamline the process, removing unnecessary parameters and improving error handling.
- Modified SparkModel.js to ensure proper session initialization and validation of Spark application IDs.
- Improved logging for better visibility during Spark session creation and management processes.
…andling

- Removed outdated code cells from the demo notebook to enhance clarity and usability.
- Updated SparkModel.js to improve validation of Spark application IDs, ensuring they start with 'app-' and are correctly extracted from the HTML.
- Simplified the logic for storing Spark session information in the Notebook component, enhancing overall session management.
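
The validation rule mentioned above (an ID must start with 'app-' and is pulled out of HTML) might look roughly like this. The real logic lives in SparkModel.js; this Python sketch only illustrates the idea. The regular expression, both function names, and the sample HTML in the usage note are assumptions, since the PR does not show the markup being parsed.

```python
import re
from typing import Optional

# Assumed pattern: the source only states that a valid ID starts with 'app-'.
# The leading \b avoids matching the tail of a longer word such as 'webapp-x'.
APP_ID_PATTERN = re.compile(r"\bapp-[0-9A-Za-z-]+")


def extract_spark_app_id(html: str) -> Optional[str]:
    """Return the first token that looks like a Spark app ID, or None."""
    match = APP_ID_PATTERN.search(html)
    return match.group(0) if match else None


def is_valid_spark_app_id(app_id: object) -> bool:
    """Accept only non-empty strings that start with the 'app-' prefix."""
    return (
        isinstance(app_id, str)
        and app_id.startswith("app-")
        and len(app_id) > len("app-")
    )
```

For instance, calling `extract_spark_app_id` on a hypothetical fragment such as `<p>Application ID: app-20241210114500-0000</p>` yields a candidate that `is_valid_spark_app_id` can then confirm before anything is stored.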
…ling

- Updated Notebook.js to extract and store Spark app ID more efficiently, ensuring it is only stored if valid.
- Enhanced logging to provide clearer visibility of the extracted Spark app ID.
- Added a console log in SparkModel.js to confirm successful extraction of the Spark app ID, improving debugging capabilities.
- Removed redundant console log for Spark app ID and retained a single log statement for clarity.
- Enhanced error handling in the Notebook component to ensure better debugging during cell execution.
- Added a new endpoint to retrieve the status of a Spark application by its ID in spark_app.py, improving the API's functionality.
- Enhanced logging in SparkModel.js to provide better visibility during the storage process of Spark application information, including status checks and error handling.
- Improved validation for Spark application IDs to ensure only valid IDs are processed, contributing to more robust error management.
- Moved the Spark app status retrieval logic from the route handler in spark_app.py to a static method in SparkApp class for better separation of concerns.
- Improved error handling and logging in the new get_spark_app_status method, ensuring clearer responses for application not found and internal errors.
- Simplified the route handler to directly return the response from the SparkApp method, enhancing code readability and maintainability.
- Safely handled notebook paths by simplifying them when they match the pattern work/user@domain/notebook.ipynb.
- Improved clarity by logging the simplified notebook path for better debugging and visibility.
- Removed unused JWT and user identification decorators from the Spark app route in spark_app.py for cleaner code.
- Simplified the notebook path handling in getSparkApps method of NotebookModel.js by removing unnecessary path simplification logic, allowing direct usage of the provided notebook path.
- Enhanced code readability and maintainability by streamlining the logic in both files.
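
The move of the status lookup from the route handler into a static method on SparkApp can be sketched as below. Only the names `SparkApp` and `get_spark_app_status` come from the commit messages; the dictionary-backed store, the `_lookup` helper, the route function name, the response shapes, and the 404/500 status codes are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


class SparkApp:
    """Service class; the dictionary is a hypothetical stand-in for the real data store."""

    _apps: Dict[str, str] = {"app-20241210114500-0000": "running"}

    @staticmethod
    def _lookup(spark_app_id: str) -> Optional[str]:
        # Hypothetical helper: the real project presumably queries a database here.
        return SparkApp._apps.get(spark_app_id)

    @staticmethod
    def get_spark_app_status(spark_app_id: str) -> Tuple[dict, int]:
        """Return a (body, status_code) pair for the given application ID."""
        try:
            status = SparkApp._lookup(spark_app_id)
            if status is None:
                # Assumed response for an unknown application.
                return {"message": "Spark application not found"}, 404
            return {"spark_app_id": spark_app_id, "status": status}, 200
        except Exception as exc:
            # Assumed response for any unexpected failure.
            return {"message": f"Internal error: {exc}"}, 500


def get_spark_app_status_route(spark_app_id: str) -> Tuple[dict, int]:
    """Thin route handler: it only forwards the service method's response."""
    return SparkApp.get_spark_app_status(spark_app_id)
```

Keeping the handler this thin means the lookup and its error handling can be tested without a running web server.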
…handling

- Added JWT authentication and user identification decorators to the create_spark_app route in spark_app.py to ensure only authenticated users can create Spark applications.
- Implemented user context validation in the SparkApp service, returning a 401 response if the user is not found.
- Added a database rollback mechanism on error during Spark app creation to maintain data integrity.
- Updated demo notebook to include successful Spark session execution details, including execution metadata and application ID.
- Removed the create_spark_session endpoint from spark_app.py to streamline session management.
- Refactored SparkApp service by removing the create_spark_session method, as session creation is now handled directly in the notebook.
- Improved SparkModel.js to ensure proper validation and storage of Spark application IDs, including enhanced logging for better visibility during the process.
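
The user check and the rollback on error described above can be sketched with the standard-library `sqlite3` module. The project's actual database layer is not shown in this PR, so the table layout, the function names, and every status code other than the 401 named in the commit message are assumptions.

```python
import sqlite3
from typing import Optional, Tuple


def make_connection() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical spark_apps table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spark_apps ("
        "spark_app_id TEXT PRIMARY KEY, notebook_id INTEGER, user_id INTEGER)"
    )
    conn.commit()
    return conn


def create_spark_app(
    conn: sqlite3.Connection,
    user_id: Optional[int],
    spark_app_id: str,
    notebook_id: int,
) -> Tuple[dict, int]:
    """Insert a Spark app row, rolling back if the insert fails."""
    if user_id is None:
        # No user context: reject before touching the database.
        return {"message": "User not found"}, 401
    try:
        conn.execute(
            "INSERT INTO spark_apps (spark_app_id, notebook_id, user_id) "
            "VALUES (?, ?, ?)",
            (spark_app_id, notebook_id, user_id),
        )
        conn.commit()
        return {"spark_app_id": spark_app_id}, 201
    except sqlite3.Error as exc:
        # Undo the failed transaction so the connection stays usable.
        conn.rollback()
        return {"message": f"Could not create Spark app: {exc}"}, 500
```

In this sketch, a second insert with the same ID violates the primary key, takes the rollback path, and leaves the table with a single row.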
@xuwenyihust linked an issue on Dec 10, 2024 that may be closed by this pull request
@xuwenyihust merged commit 3b2dca7 into release-0.6.1 on Dec 10, 2024
7 checks passed
@xuwenyihust deleted the 334-handle-spark-session-creation-when-already-exists branch on December 10, 2024 at 11:45
Development

Successfully merging this pull request may close these issues.

Handle Spark Session Creation When Already Exists