harvester: add arXiv metadata harvester #68
base: master
3ca54b5 to 1548182
1548182 to 8494608
Pull Request Test Coverage Report for Build 240 (Coveralls)
else:
    return False

def get_metadata(self, arxiv_id):
Using the same function name in this case (for both Client and Harvester) might be a bit misleading.
# Identifiers
result['Identifier'] = []
doi = metadata['doi']
if doi:
As a suggestion, maybe the construction of the results could be moved into a separate function to make it more distinct.
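To illustrate the suggestion, here is a rough sketch of what splitting out the result construction might look like. The helper names `build_identifiers` and `build_result`, and the layout of each identifier entry, are hypothetical and not taken from the PR; the sketch also uses `metadata.get('doi')` so that a missing key is tolerated, which differs from the `metadata['doi']` lookup in the diff.

```python
def build_identifiers(metadata):
    """Collect identifier entries from a raw metadata dict."""
    identifiers = []
    doi = metadata.get('doi')
    if doi:
        # Assumption: the entry layout below is illustrative only.
        identifiers.append({'IDScheme': 'doi', 'ID': doi})
    return identifiers


def build_result(metadata):
    """Assemble the harvested record from small, separately testable pieces."""
    result = {}
    result['Identifier'] = build_identifiers(metadata)
    return result
```

With this shape, each field group can get its own small function and its own unit test, and the harvester method only has to call `build_result`.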

def get_metadata(self, arxiv_id):
    """Get metadata from ArXiv."""
    res = arxiv.query(query="",
I am not sure if this is in scope, but maybe we could make this function a bit more versatile by giving it a query parameter that defaults to empty. That would let us harvest by query in the future, if needed, without having to refactor this.
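Something along these lines is what I have in mind, as a sketch only. The `search` argument is a hypothetical stand-in for the real arXiv client call, and the `id_list` keyword is an assumption about how identifiers would be passed; neither is confirmed by the diff. It is written as a plain function here so it can be run on its own.

```python
def get_metadata(search, arxiv_id=None, query=""):
    """Fetch metadata by identifier, by free-text query, or both.

    `search` is a hypothetical callable standing in for the arXiv client.
    """
    # An empty default query keeps the current "harvest by ID" behaviour.
    id_list = [arxiv_id] if arxiv_id else []
    if not id_list and not query:
        raise ValueError("provide an arxiv_id, a query, or both")
    return search(query=query, id_list=id_list)
```

Existing callers that pass only an identifier would keep working, while a later query-based harvest would just supply `query=...`.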
"""."""
data = self.get_metadata(identifier)
if data:
    providers = set(providers) if providers else set()
Is there a case where we can get duplicate providers?
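For reference, this is what the expression does with duplicates and with a missing value. It is a standalone sketch; the function name `normalize_providers` and the provider strings in the usage note are made up for illustration.

```python
def normalize_providers(providers=None):
    """Mirror the expression from the diff as a standalone function."""
    # None or an empty iterable becomes an empty set; duplicates collapse.
    return set(providers) if providers else set()
```

So `normalize_providers(['arxiv', 'arxiv', 'other'])` yields a set with two members, and `normalize_providers(None)` yields an empty set. If duplicates can never occur, the conversion mainly serves as the `None` guard.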