GoScrape 🐙: Universal hltv.org demofile scraper

GoScrape is a small open-source project I created to make it easy to bulk download demofiles for the FPS CS:GO from the popular CS:GO fan site hltv.org.

Installation in Python - PyPI release

GoScrape is on PyPI, so you can install it with pip.

  pip install goscrape

TL;DR

GoScrape consists of two main commands.

command	description
`events`	The first step. Creates a json lookup file with structured information about CS:GO esports events in a given timeframe and, if specified, links to the associated matches and demofiles.
`fetch`	Builds on top of the events command. Bulk downloads the demofiles listed in the json output of the events command; alternatively, a single event id can be specified to download only the demofiles for that event.

Getting Started

Events 🎮

argument	datatype	description	notes	required
STARTDATE	string	the start date from which event data should be gathered	formatted as string 'YYYY-MM-DD'	required
ENDDATE	string	the end date up to which event data should be gathered	formatted as string 'YYYY-MM-DD'	required
STORAGEPATH	string	the directory or filepath where the resulting json should be stored		optional (default is cwd)
MATCHES	boolean	whether match information and demofile urls should be scraped as well	this flag is required if the resulting json file should be used for the fetch command	optional (True if present)
EVENT TYPE	enum	which type of event data should be pulled (Online, Lan ...)		optional (default is online)
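
Since both dates are passed as 'YYYY-MM-DD' strings, it can help to check them before starting a long scrape. The snippet below is only a sketch of such a check; `parse_range` is a hypothetical helper and not part of GoScrape.

```python
from datetime import date, datetime


def parse_range(start: str, end: str) -> tuple[date, date]:
    """Parse two 'YYYY-MM-DD' strings and check that start is not after end.

    Hypothetical helper for illustration; GoScrape may validate differently.
    """
    # strptime raises ValueError if a string does not match the format.
    start_date = datetime.strptime(start, "%Y-%m-%d").date()
    end_date = datetime.strptime(end, "%Y-%m-%d").date()
    if start_date > end_date:
        raise ValueError(f"start date {start} is after end date {end}")
    return start_date, end_date


print(parse_range("2022-04-20", "2022-04-21"))
```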

The objects in the resulting json are keyed by their event id and look something like this:

{
  "6475": {
    "event_data": {
      "entity": "event",
      "event_id": "6475",
      "event_url": "https://www.hltv.org/events/6475/iem-dallas-2022-oceania-open-qualifier-2",
      "event_name_encoded": "iem-dallas-2022-oceania-open-qualifier-2",
      "event_name_full": "IEM Dallas 2022 Oceania Open Qualifier 2",
      "nr_of_teams": "8+",
      "prize": "Other",
      "event_type": "Online",
      "location": "Oceania (Online)",
      "event_start": "2022-04-20",
      "event_end": "2022-04-21"
    },
    "matches": [
      {
        "entity": "match",
        "teams": ["Paradox", "Aftershock"],
        "date_time": "2022-04-21 10:00:00",
        "match_url": "https://www.hltv.org//matches/2355881/paradox-vs-aftershock-iem-dallas-2022-oceania-open-qualifier-2",
        "demo_id": "71497",
        "demo_url": "https://www.hltv.org/download/demo/71497"
      }
    ]
  }
}
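
If you want to work with the lookup file yourself, the structure above is easy to walk in Python. The following is a minimal sketch that assumes only the keys shown in the example (`matches`, `demo_url`); the function name `demo_urls` is made up for illustration, and the inline sample stands in for a file written by the events command.

```python
import json


def demo_urls(lookup: dict) -> list[tuple[str, str]]:
    """Return (event_id, demo_url) pairs for every match that lists a demo."""
    pairs = []
    for event_id, entry in lookup.items():
        # "matches" is only present when match data was scraped as well.
        for match in entry.get("matches", []):
            url = match.get("demo_url")
            if url:
                pairs.append((event_id, url))
    return pairs


# A trimmed-down version of the example object shown above.
sample = json.loads("""
{
  "6475": {
    "event_data": {"entity": "event", "event_id": "6475"},
    "matches": [
      {
        "entity": "match",
        "teams": ["Paradox", "Aftershock"],
        "demo_id": "71497",
        "demo_url": "https://www.hltv.org/download/demo/71497"
      }
    ]
  }
}
""")

for event_id, url in demo_urls(sample):
    print(event_id, url)
```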

Fetch 💾

argument	datatype	description	notes	required
EVENT ID	string | int	the id of the event whose demofiles should be downloaded	LOOKUP FILE & EVENT ID are mutually exclusive; only one can be used	required
LOOKUP FILE	string	the filepath of the lookup file generated by the events command that should be used for demo downloading	LOOKUP FILE & EVENT ID are mutually exclusive; only one can be used	required
STORAGEPATH	string	the directory to which the demofiles should be written		optional (default is cwd)
MULTIPROCESSING	boolean	whether multiprocessing should be utilized to speed up downloading		optional (True if present)
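
The "exactly one of EVENT ID or LOOKUP FILE" rule can be expressed in a few lines of Python. This is only a sketch of the rule itself; `pick_source` and its return values are hypothetical and do not mirror GoScrape's internals or its command-line flags.

```python
from typing import Optional, Union


def pick_source(
    event_id: Optional[Union[str, int]] = None,
    lookup_file: Optional[str] = None,
) -> tuple[str, str]:
    """Return which input to use, enforcing that exactly one was given.

    Hypothetical helper for illustration only.
    """
    # Both missing or both present violates the mutual-exclusivity rule.
    if (event_id is None) == (lookup_file is None):
        raise ValueError("pass exactly one of event_id or lookup_file")
    if event_id is not None:
        # Event ids may arrive as int or str, so normalise to str.
        return ("event", str(event_id))
    return ("lookup", lookup_file)


print(pick_source(event_id=6475))
```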

Disclaimer

Neither this tool nor I have any affiliation with HLTV. I originally built this CLI to help me download demos for scientific research purposes. I made it publicly available because I thought it might benefit others as well. If you download a lot of demos, the tool will automatically insert a sleep time to avoid a temporary Cloudflare ban.
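
If you write your own download loop around the lookup file, a similar pause is easy to add. The sketch below assumes a fixed pause after every batch of downloads; `throttled`, `batch_size` and `pause_seconds` are made-up names and values, not GoScrape's actual behaviour or settings.

```python
import time
from typing import Callable, Iterable, Iterator


def throttled(
    items: Iterable[str],
    batch_size: int = 10,
    pause_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield items one by one, pausing before each new batch.

    Illustrative only: the batch size and pause length are arbitrary.
    """
    for index, item in enumerate(items):
        # Pause before the first item of every batch except the first one.
        if index and index % batch_size == 0:
            sleep(pause_seconds)
        yield item


urls = [f"https://www.hltv.org/download/demo/{n}" for n in (71497, 71498, 71499)]
for url in throttled(urls, batch_size=2, pause_seconds=0.0):
    print("would download", url)
```

Passing the `sleep` function in as a parameter keeps the helper easy to test without actually waiting.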

Changelog

Version 0.1.3 (2022.09.22)

Resolved an issue where the package failed to gather the file name of the fetched demo file

Version 0.1.2 (2022.05.30)

Bug fixes and improvements

Version 0.1.1 (2022.04.29)

Bug fixes on multiprocessed downloading

Version 0.1.0 (2022.04.24)

Initial release

Contributing

Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Issues

If you experience any issues, please message me or raise an issue here.
