✨ Web Scraper ✨

🌐 Web Text Scraper extracts text elements from web pages. 🧰 Choose what to extract: 📃 paragraphs, 🏷️ titles, or specific HTML tags. It handles errors, reports them through feedback messages, and displays the extracted text in a readable layout. 🚀

🔧 Features:

  • Flexible Element Selection 🖋️
  • Interactive Interface 🌐
  • Real-Time Text Extraction ⏳
  • Feedback Messages 📢

📥 Installation:

To run the Web Text Scraper, make sure you have the following dependencies installed:

  • streamlit
  • beautifulsoup4==4.11.1
  • pip==23.1.2
  • requests==2.28.0

Command:

👉 pip install -r requirements.txt 👈

📝 Usage:

  1. Run the streamlit_app.py script using the following command:

🚀 streamlit run streamlit_app.py 🚀

  2. Enter the URL of the web page to scrape.

  3. Select the elements to scrape: "Paragraphs", "Titles", "Paragraphs and Titles", "All", or "Custom".

  4. If choosing the "Custom" option, enable the "Custom Tag" checkbox and enter HTML tags (comma-separated).

  5. Click the "Scrape" button to start scraping.

  6. View the extracted text.
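
For readers curious how the element selection above could map onto BeautifulSoup, here is a minimal sketch. It is not the code from streamlit_app.py: the names `PRESET_TAGS`, `resolve_tags`, and `extract_text` are hypothetical, and treating "Titles" as the heading tags h1 to h6 is an assumption about what the app does.

```python
from bs4 import BeautifulSoup

HEADINGS = ["h1", "h2", "h3", "h4", "h5", "h6"]

# Hypothetical mapping from the UI options to HTML tag names.
# Assumption: "Titles" means heading tags; the real app may differ.
PRESET_TAGS = {
    "Paragraphs": ["p"],
    "Titles": HEADINGS,
    "Paragraphs and Titles": ["p"] + HEADINGS,
}


def resolve_tags(option, custom_tags=""):
    """Turn a UI option into a list of tag names, or None for "All"."""
    if option == "All":
        return None
    if option == "Custom":
        # Split the comma-separated input, dropping blanks and extra spaces.
        return [t.strip().lower() for t in custom_tags.split(",") if t.strip()]
    return PRESET_TAGS[option]


def extract_text(html, tags):
    """Return the non-empty text of each matching element, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    if tags is None:
        # "All": every text fragment on the page, whitespace-stripped.
        return list(soup.stripped_strings)
    texts = [el.get_text(" ", strip=True) for el in soup.find_all(tags)]
    return [t for t in texts if t]
```

The sketch works on an HTML string so it can be tried without network access; in a real run the string would typically come from something like `requests.get(url, timeout=10).text`, with request errors caught and shown to the user.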

📌 Note:

Make sure to replace 'Jatin_Agrawal_20BCS6606' with your desired page title and 'LOGO.png' with the path to your desired page icon in the set_page_config function.

Contributing 🤝

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

License 📄

This project is licensed under the MIT License.