Resource - Generate Links.txt from subscribed item pages html #21

Closed
Qazzquimby opened this issue Jan 30, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@Qazzquimby

This isn't an issue, so sorry if there's a better way I should have shared this.
I wanted to quickly generate a Links.txt from the items I'm currently subscribed to. I expect there's a better way using steamcmd, but I'm more familiar with web scraping.

This Python script requires the user to manually download the HTML for each page of their subscribed items (ideally with 30 items per page) and put the files in a directory called workshop_html. This could be automated, but I didn't want to mess around with authorization and requests.

Requires `pip install bs4`.

`import glob
import dataclasses

import bs4


@dataclasses.dataclass
class Mod:
    name: str
    url: str


def get_mods_from_soup(soup: bs4.BeautifulSoup):
    # Each subscribed item sits in its own details div.
    mod_soups = soup.select("div.workshopItemSubscriptionDetails")
    # Use a separate loop variable so the page-level soup is not shadowed.
    mods = [Mod(url=mod_soup.select_one('a')['href'],
                name=mod_soup.select_one('div.workshopItemTitle').text)
            for mod_soup in mod_soups]
    return mods


def get_mods():
    mods = []
    for path in glob.glob('workshop_html/*.html'):
        with open(path, errors='ignore') as file:
            # Name the parser explicitly to avoid bs4's "no parser specified" warning.
            soup = bs4.BeautifulSoup(file, 'html.parser')
            mods += get_mods_from_soup(soup)
    return mods


def get_mods_text():
    mods = get_mods()
    # One "*<name>" line followed by the URL line for each mod.
    mod_texts = [f"*{mod.name}\n{mod.url}" for mod in mods]
    return '\n'.join(mod_texts)


if __name__ == '__main__':
    mods_text = get_mods_text()
    with open('Links.txt', 'w+') as file:
        file.write(mods_text)`
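
To show what ends up in Links.txt, here is a minimal sketch of just the formatting step, run on two made-up entries. The mod names and URLs are placeholders rather than real Workshop items, and `format_links` is a hypothetical helper name that is not part of the script above.

```python
import dataclasses


@dataclasses.dataclass
class Mod:
    name: str
    url: str


def format_links(mods):
    # Same layout as the script above: "*<name>" on one line, the URL on the next.
    return '\n'.join(f"*{mod.name}\n{mod.url}" for mod in mods)


# Placeholder entries for illustration only (assumed, not real Workshop items).
sample_mods = [
    Mod(name="Example Mod A", url="https://example.com/mod-a"),
    Mod(name="Example Mod B", url="https://example.com/mod-b"),
]

links_text = format_links(sample_mods)
print(links_text)
```

Each mod takes two lines in the output, so a full page of 30 subscribed items should produce 60 lines.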

Hypocrita20XX pinned this issue Feb 6, 2021
@Hypocrita20XX
Owner

Hypocrita20XX commented Feb 6, 2021

Hey there, thanks for sharing!
Your solution is definitely more elegant than how I usually go about it: I take what Conexus gives me and edit that a fair bit in Notepad++.
He does something similar, although it's all tied up in a package, versus what you have here, which is standalone, as best I understand, at least. I've only ever messed with Python to very briefly automate Blender some years ago.

Edit: I may integrate this into the wiki, although I'll need some time to figure out exactly how best to do this, and if so, I'll be sure to properly credit you. =-)
