Resource - Generate Links.txt from subscribed item pages html #21

Qazzquimby · 2021-01-30T07:02:29Z

This isn't an issue, so sorry if there's a better way I should have shared this.
I wanted to quickly generate a Links.txt from the items I currently have subscribed. I expect there's a better way using steamcmd, but I'm more familiar with web scraping.

This Python script requires the user to manually download the HTML for each page of their collection (ideally with 30 items per page) and put the files in a directory called workshop_html. This could be automated, but I didn't want to mess around with authorization and requests.

Requires `pip install bs4`.

```python
import dataclasses
import glob

import bs4


@dataclasses.dataclass
class Mod:
    name: str
    url: str


def get_mods_from_soup(soup: bs4.BeautifulSoup):
    # Each subscribed item sits in its own details div.
    mod_soups = soup.select("div.workshopItemSubscriptionDetails")
    # Use a distinct loop variable so the `soup` argument is not shadowed.
    mods = [Mod(url=mod_soup.select_one('a')['href'],
                name=mod_soup.select_one('div.workshopItemTitle').text.strip())
            for mod_soup in mod_soups]

    return mods


def get_mods():
    mods = []

    # sorted() makes the file order deterministic across runs.
    for path in sorted(glob.glob('workshop_html/*.html')):
        with open(path, encoding='utf-8', errors='ignore') as file:
            # Name the parser explicitly so bs4 does not have to guess one.
            soup = bs4.BeautifulSoup(file, 'html.parser')
            mods_in_file = get_mods_from_soup(soup)
            mods += mods_in_file

    return mods


def get_mods_text():
    mods = get_mods()
    # One "*name" line followed by one URL line per mod.
    mod_texts = [f"*{mod.name}\n{mod.url}" for mod in mods]
    mods_text = '\n'.join(mod_texts)
    return mods_text


if __name__ == '__main__':
    mods_text = get_mods_text()
    with open('Links.txt', 'w', encoding='utf-8') as file:
        file.write(mods_text)
```
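
If you want to sanity-check the generated file, something like the following could read it back into (name, url) pairs and flag duplicate URLs. This is only a rough sketch: it assumes the file looks exactly like what the script above writes (a `*name` line followed by a URL line), and `parse_links_text` and the example.com URLs are made-up names for illustration, not anything from Conexus.

```python
from typing import List, Tuple


def parse_links_text(text: str) -> List[Tuple[str, str]]:
    """Read back '*name' / 'url' line pairs (hypothetical helper)."""
    pairs = []
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('*'):
            # A '*' line carries the mod name; the next line should be its URL.
            name = line[1:]
        elif name is not None:
            pairs.append((name, line))
            name = None
    return pairs


# Placeholder data; in practice, read the text from Links.txt instead.
sample = "*Example Mod A\nhttps://example.com/a\n*Example Mod B\nhttps://example.com/b"
entries = parse_links_text(sample)
urls = [url for _, url in entries]
duplicates = {url for url in urls if urls.count(url) > 1}
print(len(entries), "entries,", len(duplicates), "duplicate URLs")
```

URL lines that do not follow a `*name` line are skipped, so a much lower entry count than expected would suggest the file is not in the assumed format.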


Hypocrita20XX · 2021-02-06T02:33:49Z

Hey there, thanks for sharing!
Your solution is definitely more elegant than how I usually go about it: I take what Conexus gives me and edit that a fair bit in Notepad++.
He does something similar, although it's all tied up in a package, whereas what you have here is standalone, as best I understand. I've only ever used Python very briefly, to automate Blender some years ago.

Edit: I may integrate this into the wiki, although I'll need some time to figure out exactly how best to do this, and if so, I'll be sure to properly credit you. =-)

Hypocrita20XX pinned this issue Feb 6, 2021

Hypocrita20XX added the enhancement label Feb 6, 2021

Hypocrita20XX closed this as completed Feb 6, 2021

Hypocrita20XX mentioned this issue Feb 11, 2021

Index outside the bounds of the array when copying files #23

Closed
