Comp-Scraping is a web scraper designed to collect salary data for software engineers in Brazil from levels.fyi. This project uses Python and Selenium to automate data collection and analysis.
Want to check out the latest analysis? Open the notebook here.
The project is built with:

- Python 3.12+
- Poetry (for dependency management)
- Selenium (for web scraping)
- BeautifulSoup4 (for HTML parsing)
- Pandas (for data manipulation)
- Pytest (for testing)
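As a rough illustration of how these pieces fit together, here is a minimal sketch of a Selenium + BeautifulSoup + Pandas scraping loop. This is not the project's actual code: the URL, the CSS selector, and the output filename are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Run Chrome headless so the scraper can work unattended.
options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)

try:
    # Hypothetical target page; the real scraper's URL may differ.
    driver.get("https://www.levels.fyi/")
    # Selenium renders the JavaScript-heavy page; BeautifulSoup parses the result.
    soup = BeautifulSoup(driver.page_source, "html.parser")
    rows = [row.get_text(" ", strip=True) for row in soup.select("table tr")]
finally:
    driver.quit()

# Pandas turns the raw rows into a frame that can be cleaned and saved.
Path("data").mkdir(exist_ok=True)
pd.DataFrame({"raw_row": rows}).to_csv("data/salaries_raw.csv", index=False)
```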
To set up the project:

1. Ensure you have Python 3.12 or higher installed on your system.

2. Install Poetry if you haven't already:

   ```bash
   curl -sSL https://install.python-poetry.org | python3 -
   ```

3. Clone the repository:

   ```bash
   git clone https://github.com/lucasheriques/comp-scraping.git
   cd comp-scraping
   ```

4. Install the dependencies using Poetry:

   ```bash
   poetry install
   ```

5. Activate the virtual environment:

   ```bash
   poetry shell
   ```
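As an optional sanity check (not part of the project's tooling), you can confirm that the main dependencies resolved inside the Poetry environment by printing their versions:

```python
# Save as a small script and run with `poetry run python check_env.py`
# (the filename is just an example). Prints the installed versions.
import bs4
import pandas
import selenium

print(f"selenium {selenium.__version__}")
print(f"beautifulsoup4 {bs4.__version__}")
print(f"pandas {pandas.__version__}")
```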
To run the scraper:

```bash
poetry run scrape
```

This will start the scraping process and save the data to a CSV file in the `data` directory.
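The `scrape` command is presumably a Poetry script entry point (declared under `[tool.poetry.scripts]` in `pyproject.toml`). A minimal sketch of the saving step such an entry point might perform is below; the timestamped filename pattern and column names are assumptions, not the project's actual conventions.

```python
from datetime import datetime
from pathlib import Path

import pandas as pd


def save_scrape(df: pd.DataFrame, data_dir: str = "data") -> Path:
    """Write scraped results to a timestamped CSV so earlier runs are kept."""
    out_dir = Path(data_dir)
    out_dir.mkdir(exist_ok=True)
    out_path = out_dir / f"salaries_{datetime.now():%Y-%m-%d_%H-%M-%S}.csv"
    df.to_csv(out_path, index=False)
    return out_path


if __name__ == "__main__":
    # Tiny stand-in frame; the real scraper would supply actual salary rows.
    sample = pd.DataFrame(
        {"company": ["ExampleCo"], "level": ["Senior"], "total_comp_brl": [450_000]}
    )
    print(f"Saved to {save_scrape(sample)}")
```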
To analyze the scraped data using a Jupyter notebook:

1. Ensure you're in the project's virtual environment:

   ```bash
   poetry shell
   ```

2. Launch the Jupyter notebook:

   ```bash
   poetry run analyze
   ```

   This will start the Jupyter notebook server and open the data analysis notebook in your default web browser (a sketch of what such an entry point might look like follows this list).

3. Open the notebook here, then run the cells to load the most recent data, perform the analysis, and visualize the results.

4. To re-run the analysis with updated data, restart the kernel and run all cells again.
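Like `scrape`, the `analyze` command is presumably a Poetry script entry point. A minimal sketch of such an entry point, assuming a hypothetical notebook path, could look like this:

```python
import subprocess


def main() -> None:
    # Launch the Jupyter server and open the analysis notebook.
    # "notebooks/analysis.ipynb" is a placeholder path, not necessarily the real one.
    subprocess.run(["jupyter", "notebook", "notebooks/analysis.ipynb"], check=True)


if __name__ == "__main__":
    main()
```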
Note: The analysis notebook automatically uses the most recent CSV file in the `data` directory, so you don't need to update the file path manually.
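That "most recent CSV" behavior could be implemented along these lines. This is a sketch, assuming the scraper writes plain `.csv` files into `data/`; `load_latest_csv` is a hypothetical helper, not necessarily the notebook's actual code.

```python
from pathlib import Path

import pandas as pd


def load_latest_csv(data_dir: str = "data") -> pd.DataFrame:
    """Load the most recently modified CSV file from the data directory."""
    csv_files = sorted(Path(data_dir).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    if not csv_files:
        raise FileNotFoundError(f"No CSV files found in {data_dir}/")
    return pd.read_csv(csv_files[-1])


# In the notebook, the first cell might then simply do:
df = load_latest_csv()
```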