This project is designed to scrape listing and detail pages from a website and store the extracted data into Google Sheets. Additionally, it downloads images from the detail pages and uploads them to an FTP server.
- Clone the repository:

  `git clone https://github.com/claudioandriaan/Python_web_scraping_test.git`
- Install dependencies:

  `pip install beautifulsoup4 selenium gspread ftputil requests`

  (Note: BeautifulSoup is published on PyPI as `beautifulsoup4`.)
- Set up Google API credentials:
  - Obtain Google API credentials (JSON file) and save the file as `dot.json` in the project directory.
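Before running the scraper, it can help to verify that the credentials file is in place and looks like a valid key. The sketch below is a hypothetical sanity check, not part of the project; the field names (`type`, `client_email`, `private_key`) are those found in a Google service-account key, which is what `gspread` commonly consumes, and the `check_credentials` helper is an assumption.

```python
import json
import pathlib

# Fields typically present in a Google service-account key file
# (assumption: the project authenticates gspread with such a key).
REQUIRED_FIELDS = {"type", "client_email", "private_key"}

def check_credentials(path="dot.json"):
    """Return True if the JSON key file exists and has the expected fields."""
    p = pathlib.Path(path)
    if not p.is_file():
        return False
    data = json.loads(p.read_text())
    # set.issubset over a dict checks its keys
    return REQUIRED_FIELDS.issubset(data)
```

Running `check_credentials()` before the scrape gives a clearer error than a failed API call deep inside the script.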
- Configure FTP credentials:
  - Update the FTP server details (host, username, password) in the `scrape_data_from_link()` function in the script.
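The upload step configured above can be sketched as follows. The project itself uses `ftputil`; this minimal alternative uses the standard-library `ftplib` so it runs without extra dependencies. The `upload_image` function name and the argument layout are assumptions for illustration, not the script's actual API.

```python
import ftplib

def upload_image(host, user, password, local_path, remote_name):
    """Upload a single downloaded image to the FTP server in binary mode.

    A minimal sketch of the upload step; in the real script the host,
    username, and password live inside scrape_data_from_link().
    """
    ftp = ftplib.FTP(host)
    try:
        ftp.login(user, password)
        with open(local_path, "rb") as fh:
            # STOR transfers the file to the server under remote_name
            ftp.storbinary(f"STOR {remote_name}", fh)
    finally:
        ftp.quit()
```

Keeping the connection logic in one function makes it easy to swap in `ftputil.FTPHost` later without touching the scraping code.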
- Run the script:

  `python spiders.py -d <output_directory>`
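The `-d` flag above might be parsed with `argparse` along these lines. Only the flag itself appears in the run command; the `--directory` long form, the `output_dir` destination, and the help text are assumptions for illustration.

```python
import argparse

def build_parser():
    """Build a parser for the command line implied by:
    python spiders.py -d <output_directory>
    """
    parser = argparse.ArgumentParser(
        description="Scrape listing/detail pages and upload images."
    )
    # -d: where downloaded images (and any local output) are written
    parser.add_argument(
        "-d", "--directory", dest="output_dir", required=True,
        help="Directory where downloaded images are stored.",
    )
    return parser
```

With `required=True`, running the script without `-d` exits with a usage message instead of failing later on a missing directory.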