Crawler outputs thing ids to screen but does not put anything into the summary.csv #6

Open
rtho782 opened this issue Nov 19, 2019 · 2 comments


rtho782 commented Nov 19, 2019

As per title, the summary CSV is created with the column headers in:

thing_id, file_id, file, license, link

But no rows are written below the header.

The terminal displays the thing IDs, and no errors are shown.


rtho782 commented Nov 19, 2019

Changing the following section:

    def get_thing(thing_id):
        base_url = "https://www.thingiverse.com/{}:{}"
        file_ids = []
        url = base_url.format("thing", thing_id)
        contents = get_url(url).text
        license = parse_license(contents)
        return license, parse_file_ids(contents)

As follows:

    def get_thing(thing_id):
        base_url = "https://www.thingiverse.com/{}:{}/files"
        file_ids = []
        url = base_url.format("thing", thing_id)
        contents = get_url(url).text
        license = parse_license(contents)
        return license, parse_file_ids(contents)

seems to resolve this?


rtho782 commented Nov 19, 2019

By default, the page the crawler parses for download links doesn't contain any matching strings, so no file IDs are found. Requesting the /files page fixes this, but it means the crawler is looking for the individual files rather than the zipped files.
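
For anyone hitting the same thing, here is a rough sketch of the URL change plus a guard that would make the empty result visible instead of silently producing a header-only CSV. The helper names `thing_url` and `check_file_ids` are hypothetical and not part of the crawler; only the URL pattern comes from the snippet above.

```python
import warnings

# URL pattern taken from get_thing() above.
BASE_URL = "https://www.thingiverse.com/thing:{}"


def thing_url(thing_id, files_page=True):
    """Build the page URL for a thing (hypothetical helper).

    With files_page=True the "/files" suffix is appended, which is the
    page that lists the individual files.
    """
    url = BASE_URL.format(thing_id)
    return url + "/files" if files_page else url


def check_file_ids(thing_id, file_ids):
    """Warn when parsing produced no file IDs (hypothetical helper).

    An empty list here is what leaves the summary CSV with only its header.
    """
    if not file_ids:
        warnings.warn(
            "thing {}: no file ids found at {}".format(thing_id, thing_url(thing_id))
        )
    return file_ids
```

In `get_thing`, the result of `parse_file_ids(contents)` could be passed through `check_file_ids` before returning, so a page layout change shows up as a warning in the terminal.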
