yash3004/extraction_data-selenium-

Data Extraction with Selenium, NumPy, and Pandas

This project extracts data from websites with Selenium and then cleans, analyzes, and stores the extracted data with the NumPy and Pandas libraries in Python.

Overview

The purpose of this project is to demonstrate how to:

  • Use Selenium to automate web browser interactions for data extraction.
  • Employ NumPy and Pandas for data manipulation, analysis, and storage.

Prerequisites

Ensure you have the following installed:

  • Python (3.x recommended)
  • Selenium library (pip install selenium)
  • NumPy library (pip install numpy)
  • Pandas library (pip install pandas)
  • WebDriver for your browser (e.g., ChromeDriver for Google Chrome)
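
A quick way to confirm the prerequisites above is to ask Python whether each package can be found. The sketch below is illustrative and not part of the repository: the function names `check_prerequisites` and `find_chromedriver` are hypothetical, and the driver lookup assumes Chrome is the browser in use.

```python
import importlib.util
import shutil
from pathlib import Path


def check_prerequisites(packages=("selenium", "numpy", "pandas")):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def find_chromedriver(project_dir="."):
    """Return a path to chromedriver, or None if it cannot be found.

    Assumption: Chrome is the target browser. The project directory is
    checked first, then the directories listed on PATH.
    """
    local = Path(project_dir) / "chromedriver"
    for candidate in (local, local.with_suffix(".exe")):
        if candidate.is_file():
            return str(candidate)
    return shutil.which("chromedriver")


# Print a short report of what is installed and what is missing.
for name, available in check_prerequisites().items():
    print(f"{name}: {'installed' if available else 'missing'}")
print("chromedriver:", find_chromedriver() or "not found")
```

Any package reported as missing can be installed with the matching pip command from the list above.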

Usage

  1. Clone the repository:

    git clone https://github.com/yash3004/extraction_data-selenium-/
  2. Install the required libraries:

    pip install -r requirements.txt
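  3. Download the WebDriver for your browser and place it in the project directory. With Selenium 4.6 or later, the bundled Selenium Manager can usually locate or download a matching driver automatically, so this step may not be needed.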

  4. Customize the Selenium script (voyalla.py, the script run in the next step) to target the desired website(s) and data.

  5. Run the data extraction script:

    python voyalla.py
    
  6. The extracted data is stored in NumPy arrays or Pandas DataFrames, depending on how your script is configured.
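
The customization in step 4 usually comes down to choosing a URL and a selector for the data you want. The following is a minimal sketch of that idea, not the code in this repository: the function names `scrape_table` and `rows_to_records`, the `table` selector, and the example URL are all assumptions made for illustration.

```python
def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts keyed by header.

    Rows whose length does not match the header are skipped.
    """
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body if len(row) == len(header)]


def scrape_table(url, table_css="table", timeout=10):
    """Open `url` in headless Chrome and return the first matching table as rows of cell text."""
    # Imported here so rows_to_records stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run without opening a browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the table is present instead of sleeping for a fixed time.
        table = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, table_css))
        )
        rows = []
        for tr in table.find_elements(By.CSS_SELECTOR, "tr"):
            cells = tr.find_elements(By.CSS_SELECTOR, "th, td")
            if cells:
                rows.append([cell.text.strip() for cell in cells])
        return rows
    finally:
        driver.quit()  # always close the browser, even if scraping fails


# Example usage (hypothetical URL; requires Chrome and a matching driver):
# records = rows_to_records(scrape_table("https://example.com/table-page"))
# import pandas as pd
# df = pd.DataFrame(records)
```

Keeping the row-to-record conversion separate from the browser code makes it possible to test the conversion without launching a browser.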

Scripts Overview

  • extract_data.py: Contains the Selenium code for web scraping and data extraction.
  • data_analysis.py: Demonstrates data manipulation, analysis, and storage using NumPy and Pandas.

Examples

  • Use voyalla.py to extract tabular data from a website and store it in a Pandas DataFrame.
  • Use cleaning.py to clean, transform, and analyze the extracted data.
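
Scraped cells arrive as text, so a typical cleaning step converts numeric columns before any calculation. The sketch below shows one way this might look and is not the contents of cleaning.py: the function names `parse_number` and `clean_records`, the `price` column, and the sample values are hypothetical.

```python
def parse_number(text):
    """Convert strings such as '$1,234.50' or ' 42 ' to float; return None if not parseable."""
    cleaned = str(text).replace(",", "").replace("$", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_records(records, numeric_columns):
    """Build a DataFrame from scraped records and convert the named columns to floats."""
    # Imported here so parse_number can be reused without these libraries.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame(records).drop_duplicates().reset_index(drop=True)
    for column in numeric_columns:
        # Cells that cannot be parsed become NaN, so numeric operations still work.
        values = [parse_number(value) for value in df[column]]
        df[column] = np.array(values, dtype=float)
    return df


# Example usage with made-up records:
# records = [{"name": "A", "price": "$1,234.50"}, {"name": "B", "price": "n/a"}]
# df = clean_records(records, numeric_columns=["price"])
# print(df["price"].mean())  # NaN values are skipped by default
```

Converting unparseable cells to NaN rather than dropping the rows keeps the row count intact, which makes it easier to see how much of the scraped data was unusable.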

Contributing

Contributions are welcome! Feel free to open issues or pull requests for improvements, bug fixes, or additional features.

License

This project is licensed under the MIT License.
