Testing figures template. #171

balhoff · 2023-08-14T22:39:00Z

Currently seeing this error:

jim (figures)$ poetry run ontogpt extract -t figure.FigureCaption -i caption.txt -o caption.yaml
Configuration file exists at /Users/jim/Library/Preferences/pypoetry, reusing this directory.

Consider moving TOML configuration files to /Users/jim/Library/Application Support/pypoetry, as support for the legacy directory will be removed in an upcoming release.
ERROR:root:HuggingFace Hub API key not found. See README.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/jim/Library/Caches/pypoetry/virtualenvs/ontogpt-T28sWqJT-py3.11/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Library/Caches/pypoetry/virtualenvs/ontogpt-T28sWqJT-py3.11/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/jim/Library/Caches/pypoetry/virtualenvs/ontogpt-T28sWqJT-py3.11/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Library/Caches/pypoetry/virtualenvs/ontogpt-T28sWqJT-py3.11/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Library/Caches/pypoetry/virtualenvs/ontogpt-T28sWqJT-py3.11/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Documents/Source/ontogpt/src/ontogpt/cli.py", line 298, in extract
    results = ke.extract_from_text(text, target_class_def)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Documents/Source/ontogpt/src/ontogpt/engines/spires_engine.py", line 91, in extract_from_text
    extracted_object = self.parse_completion_payload(raw_text, cls, object=object)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jim/Documents/Source/ontogpt/src/ontogpt/engines/spires_engine.py", line 529, in parse_completion_payload
    self._auto_add_ids(raw, cls)
  File "/Users/jim/Documents/Source/ontogpt/src/ontogpt/engines/spires_engine.py", line 542, in _auto_add_ids
    if slot.range == "uriorcurie" or self.range == "uri":
                                     ^^^^^^^^^^
AttributeError: 'SPIRESEngine' object has no attribute 'range'

balhoff · 2023-08-14T23:19:29Z

Here is caption.txt:

Fig. 3. Morphological characters. A–D. Head in dorsal view. A. Gerbelius nr. confluens. B. Voconia decorata sp. nov. C. Voconia pallidipes Stål, 1866. D. Voconia schoutedeni (Villiers, 1964) comb. nov. E–G. Head in lateral view. E. Voconia wegneri (Miller, 1954) comb. nov. F. Voconia dolichocephala sp. nov. G. Gerbelius typicus Distant, 1903. H. Voconia loki sp. nov., head and pronotum in dorsal view. I–J. Prosternum in ventrolateral view. I. Voconia mexicana sp. nov. J. Voconia bracata sp. nov. K–L. Pronotum in dorsal view. K. Voconia conradti (Jeannel, 1917) comb. nov. L. Voconia tuberculata sp. nov.

cmungall · 2023-08-14T23:55:12Z

I can replicate this.
You are running into a bug that was fixed after you branched a3e5452

However, the bug is only triggered by certain odd schema configurations. In this case, I don't think you want to be subclassing NamedEntity, which has a special meaning for OntoGPT (sorry for the out-of-band secrets).
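
For readers following the traceback: the failing condition reads `self.range` on the engine, where `slot.range` was presumably intended. Below is a minimal sketch of that failure mode, not the actual OntoGPT code. `Slot`, `Engine`, and both method names are hypothetical stand-ins, and the "fixed" variant is an assumption about the intent rather than the real patch.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical stand-in for a schema slot definition."""

    name: str
    range: str


class Engine:
    """Hypothetical stand-in for the engine; note it has no `range` attribute."""

    def is_uri_like_buggy(self, slot: Slot) -> bool:
        # Mirrors the failing line: the second comparison reads `self.range`.
        return slot.range == "uriorcurie" or self.range == "uri"

    def is_uri_like(self, slot: Slot) -> bool:
        # Assumed intent: both comparisons inspect the slot.
        return slot.range in ("uriorcurie", "uri")
```

In this sketch the `AttributeError` only surfaces when the first comparison is false, because `or` short-circuits and never evaluates `self.range` otherwise.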

cmungall · 2023-08-14T23:56:46Z

After removing the two is_a declarations, I get:

input_text: |
  Fig. 3. Morphological characters. A–D. Head in dorsal view. A. Gerbelius nr. confluens. B. Voconia decorata sp. nov. C. Voconia pallidipes Stål, 1866. D. Voconia schoutedeni (Villiers, 1964) comb. nov. E–G. Head in lateral view. E. Voconia wegneri (Miller, 1954) comb. nov. F. Voconia dolichocephala sp. nov. G. Gerbelius typicus Distant, 1903. H. Voconia loki sp. nov., head and pronotum in dorsal view. I–J. Prosternum in ventrolateral view. I. Voconia mexicana sp. nov. J. Voconia bracata sp. nov. K–L. Pronotum in dorsal view. K. Voconia conradti (Jeannel, 1917) comb. nov. L. Voconia tuberculata sp. nov.
raw_completion_output: |-
  title: Morphological characters
  subpanel: A–D. Head in dorsal view.
  subpanel: E–G. Head in lateral view.
  subpanel: H. Voconia loki sp. nov., head and pronotum in dorsal view.
  subpanel: I–J. Prosternum in ventrolateral view.
  subpanel: K–L. Pronotum in dorsal view.
prompt: |+
  Split the following piece of text into fields in the following format:

  id: <The identifier for this figure subpanel>
  text: <The text associated with this figure subpanel>
  info: <any information from the overall figure caption that applies to that subpanel (which may be duplicated across other subpanels).>


  Text:
  K–L. Pronotum in dorsal view.

  ===

extracted_object:
  title: Morphological characters
  subpanel:
    - id: K-L
      text: Pronotum in dorsal view.
      info: None

which is disappointing but at least works!

cmungall · 2023-08-15T00:02:54Z

I get much better results with a hint:

      subpanel:
        description: a subpanel of the figure
        annotations:
          prompt: >-
            a semicolon separated list of descriptions of every panel in the text. Keep the panel id and text together.
            for example: "1A: A side view of the foo; 1B: A frontal view of the foo"
        multivalued: true
        range: SubPanel

results:

extracted_object:
  title: Morphological characters
  subpanel:
    - id: A
      text: Head in dorsal view of Gerbelius nr. confluens
      info: None
    - id: B
      text: Head in dorsal view of Voconia decorata sp. nov.
      info: None
    - text: C
      info: Head in dorsal view of Voconia pallidipes Stål, 1866
    - id: E
      text: Head in lateral view of Voconia wegneri (Miller, 1954) comb. nov.
      info: None
    - id: F
      text: Head in lateral view of Voconia dolichocephala sp. nov.
      info: None
    - id: G
      text: Head in lateral view of Gerbelius typicus Distant, 1903
      info: None
    - id: N/A
      text: 'H: Head and pronotum in dorsal view of Voconia loki sp. nov.'
      info: N/A
    - id: I
      text: Prosternum in ventrolateral view of Voconia mexicana sp. nov.
      info: None
    - id: J
      text: Prosternum in ventrolateral view of Voconia bracata sp. nov.
      info: None
    - id: K
      text: Pronotum in dorsal view of Voconia conradti (Jeannel, 1917) comb. nov.
      info: None
    - id: L
      text: Pronotum in dorsal view of Voconia tuberculata sp. nov.
      info: None
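
The hint asks the model for a semicolon-separated list that keeps each panel id together with its text. As a rough illustration of that format only, here is a sketch of how such a string could be split into id/text pairs. `split_panels` is a hypothetical helper and is not how OntoGPT itself parses the completion.

```python
def split_panels(value: str) -> list[dict[str, str]]:
    """Split 'ID: text; ID: text' into a list of {'id', 'text'} dicts."""
    panels = []
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        # Split on the first colon only, so colons inside the text survive.
        panel_id, sep, text = chunk.partition(":")
        if sep:
            panels.append({"id": panel_id.strip(), "text": text.strip()})
        else:
            # No colon found: keep the text and leave the id out.
            panels.append({"text": chunk})
    return panels


example = "1A: A side view of the foo; 1B: A frontal view of the foo"
print(split_panels(example))
```

Entries without a colon come back with no id, which makes it easy to spot items the model did not format as requested.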

caufieldjh · 2023-12-22T19:34:11Z

Merging this to retain the template, though it will likely need to be rebuilt (for pydantic classes) before use.

Testing figures template.

977a4fa

balhoff marked this pull request as draft August 14, 2023 22:39

balhoff and others added 2 commits August 14, 2023 19:22

Update yaml.

507e96e

fixed schema

1474d40

Change subpanel slot as per suggestion

5f13c4f

caufieldjh marked this pull request as ready for review December 22, 2023 19:43

caufieldjh merged commit 4d1f1cc into monarch-initiative:main Dec 22, 2023
2 checks passed
