Steam Games Scraper

Extracts information about every game published on Steam through its Web API and stores it in JSON format.

I used this code to generate this dataset: 'Steam Games Dataset'.

🔧 Requisites

Python 3.8
requests and argparse (argparse is already part of the Python standard library, so installing it is usually unnecessary).

pip3 install requests argparse

🚀 Usage

Start generating data simply with:

python SteamGamesScraper.py

On the first run, the file 'appplist.json' is created with all the IDs that Steam provides (>140K). On later runs, that file is used instead of requesting all the data again. If you want to get new IDs, simply delete the file 'appplist.json'.
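The caching behaviour described above could be sketched roughly as follows. This is not the scraper's actual code: the function name `load_app_ids` and the `fetch` callable (standing in for the request to Steam) are hypothetical, and the file is assumed to hold a plain JSON list of IDs.

```python
import json
import os


def load_app_ids(path, fetch):
    """Return the cached ID list if `path` exists; otherwise fetch and cache it."""
    if os.path.exists(path):
        # Later runs: reuse the file instead of asking Steam again.
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    # First run (or after the file was deleted): `fetch` is a hypothetical
    # callable that returns every app ID from Steam.
    app_ids = fetch()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(app_ids, f)
    return app_ids
```

Deleting the cache file makes the next call fall through to `fetch` again, which matches the "delete the file to get new IDs" behaviour.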

Only game data is saved. DLCs, music, tools, etc. are ignored and added to the file 'discarted.json' so that they are not requested again in future runs. You can delete the file to request those IDs again.

Finally, a game is stored in the file 'games.json' only if:

It has already been released.
Its 'developers' field is not empty.
Its price is included, if it is not free.
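As an illustration only, the three conditions could be expressed as a small predicate. The function name `should_store` is hypothetical, and the field names are assumed to match the output format documented in this README rather than the raw Steam response the scraper actually inspects.

```python
def should_store(game):
    """Apply the three storage conditions to one game record (a dict)."""
    # Condition 1: already released (a missing flag is treated as not released).
    released = not game.get("release_date", {}).get("coming_soon", True)
    # Condition 2: the 'developers' list exists and is not empty.
    has_developers = bool(game.get("developers"))
    # Condition 3: free games need no price; paid games must carry one.
    has_price = bool(game.get("is_free", False)) or game.get("price") is not None
    return released and has_developers and has_price
```

Treating a missing 'coming_soon' flag as "not released" is a conservative choice made for this sketch.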

The format is this:

{
    "906850": {
        "name": "...",
        "release_date": {
            "coming_soon": false,
            "date": "..."
        },
        "required_age": 0,
        "is_free": false,
        "price": 0.99,
        "detailed_description": "...",
        "supported_languages": "...",
        "reviews": "...",
        "header_image": "...",
        "website": "...",
        "support_url": "...",
        "support_email": "...",
        "windows": true,
        "mac": false,
        "linux": false,
        "metacritic_score": 0,
        "metacritic_url": "...",
        "achievements": 0,
        "recommendations": 0,
        "notes": "",
        "packages": [
            {
                "title": "...",
                "description": "...",
                "subs": [
                    {
                        "text": "...",
                        "description": "...",
                        "price": 0.99
                    }
                ]
            }
        ],
        "developers": [
            "..."
        ],
        "publishers": [
            "..."
        ],
        "categories": [
            "..."
        ],
        "genres": [
            "..."
        ],
        "screenshots": [
            "..."
        ],
        "movies": [
            "..."
        ]
    },
    ...
}

In the file 'ParseExample.py' you can see a simple example of how to parse the information.
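For readers who only want a quick look at the data, a minimal sketch of reading the output file might look like this. It is not the content of 'ParseExample.py'; the helper names `load_games` and `paid_games_by_price` are hypothetical, and only fields shown in the format above are used.

```python
import json


def load_games(path):
    """Load the scraper output: a dict mapping app ID strings to game records."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def paid_games_by_price(games):
    """Return (name, price) pairs for non-free games, cheapest first."""
    paid = [
        (game["name"], game["price"])
        for game in games.values()
        if not game.get("is_free", False) and "price" in game
    ]
    return sorted(paid, key=lambda item: item[1])
```

Calling `paid_games_by_price(load_games("games.json"))` would then give a list that can be printed or processed further.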

⚙️ Parameters

To change the output file, use the parameter '-o' / '-outfile':

python SteamGamesScraper.py -o output.json

Steam can reject your requests, or even ban your IP, if it considers that you are making too many requests. That's why the scraper waits 5.0 seconds by default. You can change this with the parameter '-s' / '-sleep':

python SteamGamesScraper.py -s 2.0

It is not recommended to set the wait time below 5.0 seconds.

When Steam denies a request, it is retried up to four times by default. You can change the number of retries with '-r' / '-retries':

python SteamGamesScraper.py -r 10

Although it is not recommended, you can make the scraper retry indefinitely by setting the value to 0.
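The interaction between the wait time and the retry count can be sketched as below. This is an assumption-laden illustration, not the scraper's implementation: `request_with_retries` and `do_request` are hypothetical names, a denied request is modelled as `None`, and "up to four times" is read as four retries after the first attempt.

```python
import time


def request_with_retries(do_request, retries=4, sleep=5.0, wait=time.sleep):
    """Call `do_request` until it returns something other than None.

    `retries` is the number of extra attempts after the first one;
    0 means retry forever (not recommended).
    """
    failures = 0
    while True:
        result = do_request()
        if result is not None:
            return result
        failures += 1
        # Give up once the allowed number of retries has been used.
        if retries > 0 and failures > retries:
            return None
        # Pause before trying again so Steam is not flooded with requests.
        wait(sleep)
```

The `wait` parameter exists only so the pause can be replaced in tests; in normal use it stays `time.sleep`.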

Games that have not yet been released are added to the file 'notreleased.json' and are not checked again. If you want to ignore this list, you can set the parameter '-d' / '-released' to False, or delete the file.
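The four options described so far could be declared with argparse along these lines. This is a sketch, not the script's real parser: `build_parser` and `str_to_bool` are hypothetical helpers, and the defaults for '-o' and '-d' are assumptions inferred from the text.

```python
import argparse


def str_to_bool(value):
    """Parse 'True'/'False' style strings, since bool('False') would be True."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Steam games scraper (sketch).")
    # The long names use a single dash, as written in this README, so `dest`
    # is set explicitly instead of being derived from the short flag.
    parser.add_argument("-o", "-outfile", dest="outfile", default="games.json")  # assumed default
    parser.add_argument("-s", "-sleep", dest="sleep", type=float, default=5.0)
    parser.add_argument("-r", "-retries", dest="retries", type=int, default=4)
    parser.add_argument("-d", "-released", dest="released", type=str_to_bool, default=True)  # assumed default
    return parser
```

With this sketch, `build_parser().parse_args(["-s", "2.0"])` yields an object whose `sleep` attribute is the float 2.0 and whose other attributes keep their defaults.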

At the end of the scan, or when you press Ctrl + C, all data is saved. You can also enable auto-save every X new entries with '-a' / '-autosave':

python SteamGamesScraper.py -a 100

A backup file will also be generated with the previous data.
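One way to picture auto-save with a backup is the sketch below. Everything here is illustrative: the names `save_with_backup` and `AutoSaver`, the '.bak' suffix, and the convention that 0 disables auto-save are assumptions, not details taken from the scraper.

```python
import json
import os
import shutil


def save_with_backup(path, data):
    """Write `data` as JSON to `path`, first copying any previous file to a backup."""
    if os.path.exists(path):
        shutil.copyfile(path, path + ".bak")  # backup file name is an assumption
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)


class AutoSaver:
    """Save the whole dataset every `every` new entries."""

    def __init__(self, path, every):
        self.path = path
        self.every = every  # 0 disables auto-save (assumption)
        self.pending = 0

    def added(self, data):
        """Call after each new entry; returns True when a save happened."""
        self.pending += 1
        if self.every > 0 and self.pending >= self.every:
            save_with_backup(self.path, data)
            self.pending = 0
            return True
        return False
```

Copying the old file before overwriting means an interrupted write still leaves the previous data recoverable.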

📜 License

Code released under MIT License.
