This Python tool converts Toppr question links into a single CSV file containing the solutions and metadata. It uses Playwright for web automation, Beautiful Soup for HTML parsing, and Celery for background task processing, with a simple web interface built with Flask, HTML, CSS, and JavaScript.
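As a rough illustration of the final step, combining scraped records into one CSV, here is a minimal sketch using only the standard library. The column names (`link`, `question`, `solution`, `subject`) and the function name `records_to_csv` are hypothetical; the tool's actual CSV schema and code are not described here.

```python
import csv
import io

# Hypothetical column names; the real tool's CSV schema may differ.
FIELDNAMES = ["link", "question", "solution", "subject"]


def records_to_csv(records):
    """Serialize a list of scraped question records (dicts) into one CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    for record in records:
        # Keys missing from a record become empty cells instead of raising.
        writer.writerow({name: record.get(name, "") for name in FIELDNAMES})
    return buffer.getvalue()
```

Using `csv.DictWriter` means values containing commas or quotes are escaped automatically, which matters for free-text fields such as questions and solutions.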
1. Clone the project to your local system:
   git clone https://github.com/adhirajpandey/Toppr-Extractor
2. Create a virtual environment and activate it:
   python -m venv venv && source venv/bin/activate
3. Install the required dependencies:
   pip install -r requirements.txt
4. If you are using Playwright for the first time, you also need to install its browsers:
   playwright install
5. Create an empty DATA directory and run the Flask app and Celery worker using:
   bash start.sh
6. Open your web browser and navigate to http://localhost:4322.
7. Paste the Toppr question link into the input field.
8. Click the "Generate CSV" button. The tool gathers information from the provided links and generates a CSV file.
9. A "Download CSV" button appears once the Celery worker has completed the task. Click it to download the CSV file.
1. Clone the project to your local system:
   git clone https://github.com/adhirajpandey/Toppr-Extractor
2. Build the Docker image:
   docker build -t toppr .
3. Run the Docker container:
   docker run -d -p 4322:5000 toppr
4. Follow steps 6 to 9 above.
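Whether the app is started with the start script or with Docker (where `-p 4322:5000` maps host port 4322 to port 5000 inside the container), it can take a moment before the web interface accepts connections. The following is a small sketch for checking that something is listening on the port before opening the browser. The helper name `is_port_open` is an illustration, not part of the project.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("localhost", 4322)` should return `True` once the app is accepting connections. A successful connection only shows that the port is open, not that the Celery worker is running.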
- This tool relies on web automation and HTML parsing, which might break if the structure of Toppr's website changes significantly.
- The tool's accuracy and reliability depend on the stability of Playwright, Beautiful Soup, and Flask, as well as the underlying operating system.
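One way to make breakage from a site layout change easier to notice, sketched here under assumptions, is to check each scraped record for required fields before it is written to the CSV. The field names (`question`, `solution`) and both function names are hypothetical and are not taken from the project's code.

```python
# Hypothetical required fields; adjust to whatever the scraper actually extracts.
REQUIRED_FIELDS = ("question", "solution")


def find_missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required field names that are absent, None, or blank in `record`."""
    return [name for name in required if not str(record.get(name) or "").strip()]


def split_valid(records):
    """Separate complete records from those with missing fields.

    Returns (valid, rejected), where rejected holds (record, missing_fields) pairs.
    """
    valid, rejected = [], []
    for record in records:
        missing = find_missing_fields(record)
        if missing:
            rejected.append((record, missing))
        else:
            valid.append(record)
    return valid, rejected
```

If most records suddenly land in `rejected`, the page structure has probably changed and the parsing selectors need updating.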
This project is not affiliated with Toppr or its services. It's an independent tool created for educational and personal use.