Python Web Scraper

A lightweight, modular web scraping library written in Python. It provides a simple interface for fetching web pages and parsing their HTML content.

Features

Clean and modular architecture
HTML content parsing
Mock fetcher for testing and development
Easy-to-use API

Project Structure

├── demo.py               # Demo script showing usage
├── scraper/              # Main package directory
│   ├── __init__.py       # Package initialization
│   ├── main.py           # Main scraping coordinator
│   ├── parser.py         # HTML content parser
│   └── fetcher.py        # Web content fetcher

Installation

Clone the repository:

git clone https://github.com/dimikarl2022/python-web-scraper.git
cd python-web-scraper

Usage

Basic usage example:

from scraper import scrape_website

# Scrape a website
url = "https://example.com"
content = scrape_website(url)

# Print extracted content
for item in content:
    print(item)

Run the demo script:

python demo.py
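
The mock fetcher listed under Features lets you exercise the scraper without network access. The project's actual mock class is not shown in this README, so the following is only a minimal sketch of what such a stand-in might look like. The class name `MockFetcher`, its constructor arguments, and the `fetch` method are all assumptions made for illustration.

```python
class MockFetcher:
    """Stand-in fetcher that serves canned HTML instead of touching the network.

    Hypothetical sketch: the class name, constructor, and `fetch` method are
    assumptions, not the project's confirmed API.
    """

    def __init__(self, pages=None, default_html=None):
        # Map of URL -> canned HTML string.
        self.pages = dict(pages or {})
        # Optional fallback returned for URLs that were not registered.
        self.default_html = default_html
        # Every requested URL is recorded, which is handy in test assertions.
        self.requested = []

    def fetch(self, url):
        self.requested.append(url)
        if url in self.pages:
            return self.pages[url]
        if self.default_html is not None:
            return self.default_html
        raise LookupError(f"no canned page registered for {url!r}")


if __name__ == "__main__":
    fetcher = MockFetcher({"https://example.com": "<h1>Example Domain</h1>"})
    print(fetcher.fetch("https://example.com"))
```

An object like this could be passed wherever the scraper expects a fetcher, provided the library accepts an injected fetcher. Check `scraper/fetcher.py` for the interface the project really uses.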

Components

WebFetcher: Handles webpage content retrieval
ContentParser: Parses HTML content and extracts text
Main Scraper: Coordinates fetching and parsing operations
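
To make the division of labour concrete, here is a rough sketch of how the three components might fit together using only the standard library. The names `WebFetcher`, `ContentParser`, and `scrape_website` come from this README. The method names `fetch` and `parse`, the optional `fetcher`/`parser` arguments, the helper `_TextExtractor`, and the User-Agent string are assumptions for illustration and may differ from the real code in `scraper/`.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class WebFetcher:
    """Retrieves raw HTML for a URL (the `fetch` method name is an assumption)."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout

    def fetch(self, url):
        # Hypothetical User-Agent; some sites reject requests without one.
        request = Request(url, headers={"User-Agent": "python-web-scraper-sketch"})
        with urlopen(request, timeout=self.timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


class ContentParser:
    """Parses HTML and returns a list of text fragments (assumed return shape)."""

    def parse(self, html):
        extractor = _TextExtractor()
        extractor.feed(html)
        extractor.close()
        return extractor.chunks


def scrape_website(url, fetcher=None, parser=None):
    """Coordinates the two steps: fetch the page, then parse it."""
    fetcher = fetcher or WebFetcher()
    parser = parser or ContentParser()
    return parser.parse(fetcher.fetch(url))
```

Accepting the fetcher and parser as optional arguments is one way to keep the coordinator testable: any object with a compatible `fetch` method can stand in for the network during development.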

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.
