On the curation page for chemicals, add panel for work with titles containing the chemical name but with no corresponding main subject tag #1909

Closed
Adafede opened this issue Mar 20, 2022 · 5 comments · Fixed by #2308
Assignees
Labels
panels screen space for displaying the result of a query

Comments

Contributor

Adafede commented Mar 20, 2022

What kind of panel would you like to add to which Scholia aspect?

A panel that assists in finding works that are candidates for being tagged with the chemical as main subject.

What kind of information should the panel provide, and which of the visualization options (e.g. table, bubble chart, map) should it use?

Here is a draft query:

PREFIX target: <http://www.wikidata.org/entity/Q2079986>

SELECT DISTINCT 
?item ?itemLabel ?chemicalname
  (REPLACE(STR(?item), ".*Q", "Q") AS ?work) 
  ("P921" AS ?main_subject)
  (REPLACE(STR(target: ), ".*Q", "Q") AS ?chemical)
  ("S887" AS ?heuristic)
  ("Q69652283" AS ?deduced)

WITH
{ SELECT  ?item ?chemicalname WHERE {
  VALUES ?chemicals {
    wd:Q11173 # chemical compounds
    wd:Q59199015 # group of stereoisomers
  }
  target: rdfs:label ?chemicalname .
  target: wdt:P31 ?chemicals. # comment out or adapt as necessary

  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
      wikibase:api "Generator";
      mwapi:generator "search";
      mwapi:gsrsearch ?chemicalname;
      mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P1476 ?title .
  
  MINUS {?item wdt:P921 target: }

  FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?chemicalname ,"\\", "b"))))
  }
  LIMIT 200
}
AS %items
WHERE {
  INCLUDE %items
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 200

Which Wikidata entries would be good candidates to explore such visualizations?

Chemicals that are common in titles of publications, e.g. https://w.wiki/4y5S.

Anything else?

I am not sure how safe this is. I dislike working with chemical labels, but we do not have an equivalent of the taxon name used in #1903.
This might end up with articles about (+)-Quassin being tagged with Quassin as the topic, which is not the best we can do but not completely incorrect either (see query).
On the other hand, it allows finding articles where the chemical name is maybe not the most used one (e.g. acetylsalicylic acid/aspirin).
Tagging @egonw for his expertise here.
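
To make the Quassin concern concrete, here is a minimal Python sketch of the title filter used in the draft query. It is an illustration only: `title_mentions` is a hypothetical helper, the example titles are made up, and Python's `re` module may treat `\b` differently from the regex engine behind the query service. The `re.escape` call is an addition that the draft query does not have.

```python
import re


def title_mentions(title: str, chemical_name: str) -> bool:
    """Case-insensitive, word-bounded search for a chemical name in a title.

    Mirrors the idea of the FILTER(REGEX(LCASE(?title), ...)) clause.
    re.escape is an assumption added here so that characters such as
    '(' or '+' in a name are matched literally.
    """
    pattern = r"\b" + re.escape(chemical_name.lower()) + r"\b"
    return re.search(pattern, title.lower()) is not None


# Made-up titles, for illustration only.
print(title_mentions("Total synthesis of (+)-Quassin", "Quassin"))   # True
print(title_mentions("Neoquassin from Quassia amara", "Quassin"))    # False
```

Under these rules the stereo-specific title still matches the plain label, which is the over-tagging risk described above. A label that itself starts with punctuation, such as (+)-Quassin, would not match at all in Python, because `\b` needs a word character on one side of the boundary.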

@Adafede Adafede added the panels screen space for displaying the result of a query label Mar 20, 2022
@egonw egonw moved this to In progress in Scholia Chemistry paper Mar 11, 2023
@egonw egonw self-assigned this Jul 21, 2023
Collaborator

egonw commented Jul 23, 2023

I wonder whether this could generate some QuickStatements ...

Contributor Author

Adafede commented Jul 23, 2023

@egonw Sure!

PREFIX target: <http://www.wikidata.org/entity/Q2079986>

SELECT DISTINCT 
  (REPLACE(STR(?item), ".*Q", "Q") AS ?qid) 
  (REPLACE(STR(target: ), ".*Q", "Q") AS ?P921)
  ("Q69652283" AS ?S887)

WITH
{ SELECT  ?item ?chemicalname WHERE {
  VALUES ?chemicals {
    wd:Q11173 # chemical compounds
    wd:Q59199015 # group of stereoisomers
  }
  target: rdfs:label ?chemicalname .
  target: wdt:P31 ?chemicals. # comment out or adapt as necessary

  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
      wikibase:api "Generator";
      mwapi:generator "search";
      mwapi:gsrsearch ?chemicalname;
      mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item wdt:P1476 ?title .
  
  MINUS {?item wdt:P921 target: }

  FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?chemicalname ,"\\", "b"))))
  }
  LIMIT 200
}
AS %items
WHERE {
  INCLUDE %items
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 200

⚠️ The example highlights some of the limitations, for example names of the form (-)-chemical-name.
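
For reference, the three columns returned by the query above could be turned into a QuickStatements CSV batch roughly as follows. This is a minimal Python sketch and not part of Scholia: `to_qid` and `build_batch` are hypothetical helpers, the item URIs in the usage line are placeholders, and the `qid,P921,S887` header is simply taken from the column aliases in the query.

```python
import csv
import io
import re


def to_qid(uri: str) -> str:
    """Same idea as REPLACE(STR(?item), ".*Q", "Q"): keep only the Q-id."""
    return re.sub(r".*Q", "Q", uri)


def build_batch(item_uris, target_uri, heuristic_qid="Q69652283"):
    """Return CSV text with one header line and one row per candidate work."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["qid", "P921", "S887"])
    for uri in item_uris:
        writer.writerow([to_qid(uri), to_qid(target_uri), heuristic_qid])
    return buffer.getvalue()


# Placeholder work items; the target is the chemical from the PREFIX line.
print(build_batch(
    ["http://www.wikidata.org/entity/Q1", "http://www.wikidata.org/entity/Q2"],
    "http://www.wikidata.org/entity/Q2079986",
))
```

The rows would still need a manual check before upload, given the limitation noted above.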

Collaborator

egonw commented Jul 23, 2023

This version is slightly faster and outputs QuickStatements:

PREFIX target: <http://www.wikidata.org/entity/Q2079986>

SELECT DISTINCT ?item ?itemLabel ?chemicalname
  (CONCAT("qid,P921,S887\n", SUBSTR(STR(?item),32), ",Q2079986,Q69652283") AS ?quickStatements)
WITH {
  SELECT  ?item ?chemicalname WHERE {
    VALUES ?chemicalType {
      wd:Q113145171 # type of a chemical entity
      wd:Q59199015 # group of stereoisomers
    }
    target: wdt:P31 ?chemicalType ; rdfs:label ?chemicalname .
    FILTER(lang(?chemicalname) = "en")
    SERVICE wikibase:mwapi {
      bd:serviceParam wikibase:endpoint "www.wikidata.org";
      wikibase:api "Generator";
      mwapi:generator "search";
      mwapi:gsrsearch ?chemicalname;
      mwapi:gsrlimit "max".
      ?item wikibase:apiOutputItem mwapi:title.
    }
    ?item wdt:P1476 ?title .
    MINUS {?item wdt:P921 target: }
    FILTER (REGEX(LCASE(?title), LCASE(CONCAT( "\\", "b", ?chemicalname ,"\\", "b"))))
  } LIMIT 200
} AS %items WHERE {
  INCLUDE %items
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
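
A note on `SUBSTR(STR(?item),32)` in the query above: the entity prefix is 31 characters long and SPARQL's `SUBSTR` counts from 1, so position 32 is where the Q-id starts. The Python sketch below checks that arithmetic and rebuilds the `?quickStatements` string for a single row. `sparql_substr` and `quickstatements_cell` are hypothetical names used only for this illustration, and `Q42` is a placeholder item.

```python
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def sparql_substr(text: str, start: int) -> str:
    """Emulate SPARQL SUBSTR(text, start), which is 1-based."""
    return text[start - 1:]


def quickstatements_cell(item_uri: str,
                         chemical_qid: str = "Q2079986",
                         heuristic_qid: str = "Q69652283") -> str:
    """Rebuild the CONCAT(...) expression from the query for one item."""
    qid = sparql_substr(item_uri, len(ENTITY_PREFIX) + 1)
    return "qid,P921,S887\n" + qid + "," + chemical_qid + "," + heuristic_qid


print(len(ENTITY_PREFIX))  # 31
print(quickstatements_cell(ENTITY_PREFIX + "Q42"))
```

Each result row carries its own header line, so a single cell can be pasted into QuickStatements on its own; when combining several rows, the repeated header would have to be dropped.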

Collaborator

egonw commented Jul 23, 2023

Implemented, but did not finalize the patch yet:

image

cc @fnielsen @Daniel-Mietchen

@egonw egonw linked a pull request Jul 23, 2023 that will close this issue
10 tasks
Collaborator

egonw commented Jul 23, 2023

@Adafede, pull request submitted.

@egonw egonw moved this from In progress to Done in Scholia Chemistry paper Oct 20, 2023

2 participants