Watch the preliminary demo of ExpEngine here.
In the realm of data science and analytics, constructing effective workflows involves navigating through numerous variability points such as different implementations, training algorithms, hyperparameters, and deployment strategies. For data scientists and analysts, the challenge lies in fine-tuning these workflows to deliver precise and meaningful results that align with user requirements.
To address these complexities, we propose a novel approach centered around user-driven experimentation to optimize data analytics workflows.
Our approach revolves around a robust tool framework comprising an Experimentation Engine and a Domain-Specific Language (DSL) tailored for workflow optimization. Developed as part of the ExtremeXP EU project, this framework aims to empower data scientists and analysts by streamlining the experimentation process.
- Experiment Specification: The process starts with a data scientist using our DSL editor to create an experiment specification. This specification includes the different options and strategies for the experiment.
- Execution and Iteration: The Experimentation Engine reads the specification and sets up multiple workflows based on the specified options. The data scientist can control the experiment by adjusting settings, pausing, resuming, and changing the workflow order as needed.
- Optimal Workflow Delivery: At the end of the experiment, the Engine gathers results from all workflows. It identifies the best workflow setup and provides detailed metrics and outputs from the experiment.
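The three steps above can be pictured with a small Python sketch. It is not the ExpEngine implementation: the variability points in `SPEC`, the `evaluate` scoring function, and every other name here are hypothetical, chosen only to show how a specification with several options expands into concrete workflow variants and how the best variant is picked from the collected results.

```python
from itertools import product

# Hypothetical variability points; a real specification is written in the ExpEngine DSL.
SPEC = {
    "model": ["random_forest", "neural_net"],
    "learning_rate": [0.01, 0.1],
    "batch_size": [32, 64],
}


def expand_workflows(spec):
    """Return one configuration dict per combination of options."""
    names = list(spec)
    return [dict(zip(names, values)) for values in product(*(spec[n] for n in names))]


def evaluate(config):
    """Stand-in metric (assumption): a real run would execute the workflow tasks."""
    score = 0.8 if config["model"] == "neural_net" else 0.7
    score += 0.05 if config["learning_rate"] == 0.01 else 0.0
    score += 0.01 if config["batch_size"] == 64 else 0.0
    return round(score, 2)


def run_experiment(spec):
    """Evaluate every workflow variant and return (best_config, all_results)."""
    results = [(config, evaluate(config)) for config in expand_workflows(spec)]
    best_config, _ = max(results, key=lambda pair: pair[1])
    return best_config, results


if __name__ == "__main__":
    best, results = run_experiment(SPEC)
    for config, score in results:
        print(score, config)
    print("best:", best)
```

With three variability points of two options each, the sketch produces eight workflow variants. An exhaustive expansion like this grows quickly as options are added, which is one reason the ability to pause, resume, and reorder workflows matters in practice.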
- Install Python:
  - Ensure Python is installed on your system. Recommended version: Python 3.x.
- Create a virtual environment:
  - Set up a Python virtual environment for the ExpEngine to manage dependencies.
    python -m venv env
    source env/bin/activate  # On Windows use `env\Scripts\activate`
- Install textx:
  - Install the textx library for DSL parsing.
    pip install textx
- Install matplotlib:
  - Install matplotlib for generating plots (optional but recommended).
    pip install matplotlib
- Install proactive:
  - Install proactive if it's a required dependency.
    pip install proactive
- Change the proactive credentials in credentials.py:
  - Update the `credentials.py` file with the necessary credentials for proactive usage.
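After the installation steps, a quick check that the packages are visible to the active interpreter can save a failed first run. The snippet below is a minimal sketch using only the standard library. It assumes that each package's import name matches its pip name (`textx`, `matplotlib`, `proactive`); the function name `check_dependencies` is made up for this example.

```python
from importlib.util import find_spec

# Import names of the packages installed in the steps above (assumed to match the pip names).
REQUIRED = ["textx", "matplotlib", "proactive"]


def check_dependencies(modules):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_dependencies(REQUIRED).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Run it with the virtual environment activated. A `missing` entry usually means the package was installed into a different interpreter.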
To run ExpEngine, follow these steps:
- Use an IDE
  - Open your preferred Integrated Development Environment (IDE). This will help you manage your project files and provide a convenient environment for running and debugging your code.
- Activate the virtual environment
  - Before running ExpEngine, activate the virtual environment where the dependencies are installed. This isolates your project's dependencies and ensures compatibility.
  - In your terminal, navigate to your project directory and activate the virtual environment. The command varies with your operating system and with the name you gave the environment (`env` in the installation steps):
    On Windows:
    .\env\Scripts\activate
    On macOS and Linux:
    source env/bin/activate
- Navigate to the project directory
  - Change your directory to the `exp-engine` folder where the main script is located:
    cd exp-engine
- Run the ExpEngine script
  - Execute the main script to start the ExpEngine:
    python exp_engine.py
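A common cause of import errors at this point is running the script outside the virtual environment. The following sketch shows one way to check this from Python before starting the engine. It assumes the environment was created with `python -m venv`; the function name `in_virtualenv` is made up for this example and is not part of ExpEngine.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment detected; activate it before running exp_engine.py")
```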
The datasets and scripts provided with ExpEngine are proprietary. To use your own datasets and task scripts, please upload these materials into the relevant folders within the project directory. Ensure that your datasets and scripts are formatted and organized correctly to integrate smoothly with the ExpEngine framework.
First, create the DSL artifact using Maven. Open your terminal or command prompt, navigate to the `exp.engine.dsl.parent` directory, and run Maven:
cd exp.engine.dsl.parent
mvn install
Next, create the language server by navigating to the `exp.engine.dsl.ide` directory and running Maven with the `lang-server` profile. The path below is relative to the directory that contains `exp.engine.dsl.parent`:
cd exp.engine.dsl.parent/exp.engine.dsl.ide
mvn install -Plang-server
After building the language server, navigate to the `target` directory and run the server using the following command:
cd exp.engine.dsl.parent/exp.engine.dsl.ide/target
java -jar exp.engine.dsl.ide-1.0.0-SNAPSHOT-ls.jar
You should see the following message indicating that the language server is running:
Welcome to Experiment LSP version 4.0 - Resolved
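The build and launch commands above can also be collected in a small Python helper so the sequence is the same on every platform. This is only a sketch: the directory, profile, and jar names are copied from the commands above, while the function names and the `repo_root` parameter are made up for this example. It assumes `mvn` and `java` are on the `PATH`. On Windows the Maven launcher is a `.cmd` script, so the command name may need to be `mvn.cmd`.

```python
import subprocess
from pathlib import Path

# Directory, profile, and jar names are taken from the commands above.
PARENT = Path("exp.engine.dsl.parent")
IDE = PARENT / "exp.engine.dsl.ide"
JAR = "exp.engine.dsl.ide-1.0.0-SNAPSHOT-ls.jar"


def language_server_steps(repo_root="."):
    """Return (working_directory, command) pairs in the order they should run."""
    root = Path(repo_root)
    return [
        (root / PARENT, ["mvn", "install"]),
        (root / IDE, ["mvn", "install", "-Plang-server"]),
        (root / IDE / "target", ["java", "-jar", JAR]),
    ]


def run_steps(steps):
    """Run each command in its directory, stopping at the first failure."""
    for cwd, command in steps:
        print(f"[{cwd}] {' '.join(command)}")
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run: only print the plan. Call run_steps(language_server_steps()) to execute it.
    for cwd, command in language_server_steps():
        print(f"[{cwd}] {' '.join(command)}")
```

Calling `run_steps(language_server_steps())` executes the three commands in order. The last one starts the language server and blocks until the server is stopped, so run it in its own terminal.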
The EXP Language Server can be integrated with VS Code to support `.exp` files.
First, install the necessary packages by navigating to the `vs-code-ext` directory and running `npm install`:
cd vs-code-ext
npm install
After installing the packages, build the extension. The generated files will be in the `src/out` directory. Open the `extension.js` script in VS Code:
code src/out/extension.js
Press F5 to run the extension. This will open a new VS Code window with the extension loaded.
Note: Make sure the language server is running in a separate process.
To test the extension, create a new file with the `.exp` extension and write some DSL code. The VS Code extension should provide syntax highlighting, code completion, and other language features for the EXP DSL.
On Windows:
- Open Command Prompt or PowerShell.
- Follow the Building the Language Server and Running the Language Server steps.
- For the VS Code extension, open a new Command Prompt or PowerShell window and follow the Building the VS Code Extension and Running the VS Code Extension steps.
On Linux and macOS:
- Open a terminal.
- Follow the Building the Language Server and Running the Language Server steps.
- For the VS Code extension, open a new terminal window and follow the Building the VS Code Extension and Running the VS Code Extension steps.
By following these instructions, you should be able to set up and run the ExpEngine, EXP Language Server and VS Code extension on Windows, Linux, and macOS. If you encounter any issues, please refer to the respective documentation for Maven, Java, and Node.js, or seek assistance from the community.