Show information about the journal #10015
Following the suggestion here (#6189 (comment)), as a first step I've written a script that combines Scimagojr data across all the available years (1999-2022). The idea is to build a consolidated json and to include it with JabRef. The data source contains info on around 38k journals. An issue here is the size of the consolidated JSON, which is ~115MB (compressed size of 21MB), when all the fields are included for all years. These fields include -
Is this size acceptable or should json size be reduced further by |
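To make the size question concrete, here is a minimal sketch of how per-year rows could be merged into one record per journal and how the raw and gzip-compressed sizes could be measured. It is not the script mentioned above: the function names are hypothetical, and the field names (`scimagoId`, `name`, `sjrIndex`) are assumptions borrowed from the API draft later in this thread.

```python
import gzip
import json


def consolidate(yearly_rows):
    """Merge per-year rows into one record per journal, keyed by Scimago id."""
    journals = {}
    for year, rows in sorted(yearly_rows.items()):
        for row in rows:
            record = journals.setdefault(
                row["scimagoId"], {"name": row["name"], "citationInfo": []}
            )
            record["citationInfo"].append({"year": year, "sjrIndex": row["sjrIndex"]})
    return journals


def sizes(journals):
    """Return (raw, gzip-compressed) size in bytes of the consolidated JSON."""
    raw = json.dumps(journals, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Illustrative input: two years of data for a single journal.
yearly_rows = {
    2021: [{"scimagoId": 27514, "name": "Antioxidants & Redox Signaling", "sjrIndex": 1.832}],
    2022: [{"scimagoId": 27514, "name": "Antioxidants & Redox Signaling", "sjrIndex": 1.706}],
}
journals = consolidate(yearly_rows)
raw_size, compressed_size = sizes(journals)
```

Dropping fields or years before calling `consolidate` would be one way to shrink the output, at the cost of losing the year-wise history.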
Hmm a possible solution could be to use a similar approach as the journal abbreviations, put them in a mv database. |
Thanks, I'll take a look at that |
@aqurilla some more context (was only on mobile earlier) We are regularly updating the journal abbreviations either manually or partially automated and then merge them together to store them in a mv store database. https://www.h2database.com/html/mvstore.html Another option could be as a first step to download the file from a GitHub repo |
Is there no way to receive the data about the journals just in time, maybe caching them as soon as they are received once? A progress indicator could be shown while a background process loads the journal information... |
@Siedlerchr Thanks for the additional context - using an mv database would be a good option if we are going with the consolidated json approach. I expect it would need similar components i.e. creating something like the @calixtus thanks for bringing this up. I tried the Elsevier API again and it looks like one of their endpoints works well for our use case (https://dev.elsevier.com/documentation/SerialTitleAPI.wadl#d1e534). It allows search by ISSN and returns the latest data for several journal metrics. We could just call the endpoint when the user clicks the journal info button. The drawback in this approach is the API quota/limits. What would be the best approach to follow? |
I'm not happy with another repository distributed with every release of JabRef. It adds another 30 MB to our installer. Lightweight is something different. Maybe we can cache the journals as soon as they are loaded from the API and store the cache when closing JabRef, so the API quota limit is not exceeded too soon. |
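The caching idea can be sketched roughly as follows. This is an illustration only, not JabRef code: the class name `JournalInfoCache`, the cache file name, and the `fake_fetch` stand-in for the real API call are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class JournalInfoCache:
    """Hypothetical ISSN -> journal info cache persisted as a JSON file."""

    def __init__(self, path, fetch):
        self.path = Path(path)
        self.fetch = fetch  # callable that performs the real API request
        self.entries = {}
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))

    def get(self, issn):
        # Only call the API when this ISSN has not been seen before.
        if issn not in self.entries:
            self.entries[issn] = self.fetch(issn)
        return self.entries[issn]

    def save(self):
        # Intended to be called once, when the application closes.
        self.path.write_text(json.dumps(self.entries), encoding="utf-8")


calls = []


def fake_fetch(issn):
    # Stand-in for the remote API; records each call so quota use is visible.
    calls.append(issn)
    return {"issn": issn, "hIndex": 217}


cache_file = Path(tempfile.mkdtemp()) / "journal-info-cache.json"
cache = JournalInfoCache(cache_file, fake_fetch)
cache.get("15230864")
cache.get("15230864")  # served from memory, no second fetch
cache.save()

reloaded = JournalInfoCache(cache_file, fake_fetch)
reloaded.get("15230864")  # served from the file written by save()
```

A real implementation would probably also need an expiry policy, since journal metrics change over time.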
Also please avoid adding the future repository to a preferences object. |
I could host the journal information on jabref online. 100mb is nothing for a real DB. Has the disadvantage that the user needs to be online. There is now also openalex, eg https://docs.openalex.org/api-entities/sources/get-a-single-source |
openalex looks great w.r.t having a very high daily API quota, so there is low chance of quota exhaustion. It does seem to only have yearly counts for number of works and number of citations though |
Devcall
We had a short discussion tonight in our maintainers devcall. We really like this little enhancement, but we also see some problems that may arise. If we distribute all the journal data with the JabRef package on release, it will only be updated approximately twice a year. Also, this feature may be useful only for a very small number of users, but we would add a database of 20 MB (?) to every release. This is something we are not very happy about. |
@aqurilla thanks for your efforts already. Would be great if you could keep the concerns above in mind continuing your PR! ❤️ |
@aqurilla just a quick question about this, do you have a link? I'd expect at least monthly based on https://docs.openalex.org/api-entities/publishers/publisher-object
(perhaps you would need to combine the information to get the current years count, I haven't tried the API yet) EDIT: Oh wait, you are talking about yearly vs monthly? Nvm. |
@calixtus thanks for sharing the devcall discussion details! Keeping all these points in mind, I think the best way would be to go ahead with a jit data fetcher using the Scopus API, and include caching on the user system. It looks like this would address all of the issues. @k3KAW8Pnf7mkmdSMPHz27 sure! I am referring to the link that tobiasdiez shared (https://docs.openalex.org/api-entities/sources/get-a-single-source). For showing year-wise variation charts we only have the |
I'll try to implement the API at JabRef Online later this week. This approach also has the advantage that we can easily extend it in the future and enrich it with data from other sources (e.g. OpenAlex). For the moment I would say concentrate on the display of the data in JabRef. |
@tobiasdiez sounds good! |
I now have a first draft for the API. You can try it out by issuing a POST request to https://mango-pebble-0224c3803-2067.westeurope.1.azurestaticapps.net/api with
{
"query":"query GetJournalByIssn($issn: Int) {\n journal(issn: $issn) {\n id\n name\n issn\n scimagoId\n country\n publisher\n areas\n categories\n citationInfo {\n year\n docsThisYear\n docsPrevious3Years\n citableDocsPrevious3Years\n citesOutgoing\n citesOutgoingPerDoc\n citesIncomingByRecentlyPublished\n citesIncomingPerDocByRecentlyPublished\n sjrIndex\n }\n hIndex\n }\n}\n",
"variables":{"issn":15230864},
"operationName":"GetJournalByIssn"
}
This should give you a result of the type
{
"data": {
"journal": {
"id": "ckslj3f10000f09jvc1xifgi9",
"name": "Antioxidants & Redox Signaling",
"issn": [
15230864,
15577716
],
"scimagoId": 27514,
"country": "United States",
"publisher": "Mary Ann Liebert Inc.",
"areas": [
"Biochemistry, Genetics and Molecular Biology",
"Medicine"
],
"categories": [
"Biochemistry (Q1)",
"Cell Biology (Q1)",
"Clinical Biochemistry (Q1)",
"Medicine (miscellaneous) (Q1)",
"Molecular Biology (Q1)",
"Physiology (Q1)"
],
"citationInfo": [
{
"year": 2022,
"docsThisYear": 217,
"docsPrevious3Years": 488,
"citableDocsPrevious3Years": 487,
"citesOutgoing": 19202,
"citesOutgoingPerDoc": 130.63,
"citesIncomingByRecentlyPublished": 3692,
"citesIncomingPerDocByRecentlyPublished": 7.21,
"sjrIndex": 1.706
},
{
"year": 2021,
"docsThisYear": 158,
"docsPrevious3Years": 530,
"citableDocsPrevious3Years": 530,
"citesOutgoing": 24155,
"citesOutgoingPerDoc": 152.88,
"citesIncomingByRecentlyPublished": 4724,
"citesIncomingPerDocByRecentlyPublished": 7.59,
"sjrIndex": 1.832
}
],
"hIndex": 217
}
}
}
Do you think this data format is convenient or would you like to see some changes? Currently, it only contains this particular test data. I'll later add all of the data from Scimago. Let me know if you encounter any issues or questions. EDIT: Something went wrong with the deployment of the database. Will try to have a look at this tomorrow. |
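For anyone wanting to try the draft from a script, here is a minimal sketch that builds the POST request and extracts the year-wise SJR values from a response shaped like the sample above. The helper names `build_request` and `sjr_by_year` are hypothetical, the query is trimmed to a few of the fields listed above, and the request is only constructed, not sent, because the draft deployment may not be reachable.

```python
import json
import urllib.request

# Draft endpoint quoted in the comment above; it may no longer be available.
API_URL = "https://mango-pebble-0224c3803-2067.westeurope.1.azurestaticapps.net/api"

QUERY = """
query GetJournalByIssn($issn: Int) {
  journal(issn: $issn) {
    name
    hIndex
    citationInfo { year sjrIndex }
  }
}
"""


def build_request(issn):
    """Build (but do not send) the POST request for one ISSN."""
    payload = {
        "query": QUERY,
        "variables": {"issn": issn},
        "operationName": "GetJournalByIssn",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sjr_by_year(response_body):
    """Map year -> SJR index from a response shaped like the sample."""
    journal = json.loads(response_body)["data"]["journal"]
    return {info["year"]: info["sjrIndex"] for info in journal["citationInfo"]}


request = build_request(15230864)

# Subset of the sample response, used here instead of a live call.
sample = json.dumps({"data": {"journal": {
    "name": "Antioxidants & Redox Signaling",
    "hIndex": 217,
    "citationInfo": [
        {"year": 2022, "sjrIndex": 1.706},
        {"year": 2021, "sjrIndex": 1.832},
    ],
}}})
trend = sjr_by_year(sample)
```

Sending the request would be a matter of passing `request` to `urllib.request.urlopen` and feeding the response body to `sjr_by_year`.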
@tobiasdiez thanks I will check it out. The data format looks convenient to me 👍 |
Sorry I wasn't able to read your PR in detail earlier, this week was unexpectedly very busy and next week isn't much better too. Please hang on... |
@calixtus no worries, thanks for the update! |
Tried it again and looked through the code. One nitpick. Nevertheless, I would vote for merging it to be able to gather more user feedback.
The only thing that worries me is that the popup shows up on another screen and is not resizable/movable here. I would have expected the popup to show up on the same screen as JabRef.
src/main/java/org/jabref/logic/journals/JournalInformationFetcher.java
Why is the script removed? Is it available at another repository? I would like to keep it somewhere "near" to enable switching to other endpoints. |
Would it be possible to output the "INFO" only if no information could be found for a journal name?
|
I've removed the python script here and will add it to JabRef/JabRefOnline#2067. Also updated the api url to the production server. This needs JabRef/JabRefOnline#2067 to be merged, which I will hopefully do tomorrow or latest at the weekend. |
For JabRef/jabref#10015.

Sample query:
```
query GetJournalByIssn($issn: String) {
  journal(issn: $issn) {
    id
    name
    issn
    scimagoId
    country
    publisher
    areas
    categories
    citationInfo {
      year
      docsThisYear
      docsPrevious3Years
      citableDocsPrevious3Years
      citesOutgoing
      citesOutgoingPerDoc
      citesIncomingByRecentlyPublished
      citesIncomingPerDocByRecentlyPublished
      sjrIndex
    }
    hIndex
  }
}
```
with `issn: 15230864`.

References:
- https://www.scimagojr.com/help.php
- Example: https://www.scimagojr.com/journalsearch.php?q=27514&tip=sid&clean=0
- https://docs.openalex.org/api-entities/sources/source-object

---------
Co-authored-by: Nitin Suresh <[email protected]>
I've now merged the PR in the jabref online repo, but the data ingestion is still ongoing (might take a few hours). You can already test it with issn 15454509. |
Can we get this in as soon as possible? It's a very nice new feature and it would be nice to get some feedback from dev build users before the next release (mainly because after the release I need to be careful with api changes to not break backwards compatibility).
Hi, is there any further work required for this PR? It looks like the failing test is unrelated |
I've merged this now as there were no further requests for changes. Thanks a lot @aqurilla for your nice work on this, and your patience! |
Thank you! I appreciate the extensive discussions.
This fixes #6189 by adding a fetcher for journal information. Info buttons are added next to the `Journal` and `ISSN` fields in the entry editor, and show the information as a popover. An `EnablementStatus` enum is also added for generally maintaining state of online services in Preferences.

Mandatory checks
- `CHANGELOG.md` described in a way that is understandable for the average user (if applicable)