Word process and result table generation #150

Open
wants to merge 9 commits into
base: master

@haohangyan haohangyan commented Aug 22, 2024

Word process branch for entity processing (commits after 8/23)

  1. Dash Handling: In annotate(), replaced dashes with spaces so that words like A-B are split into A and B, allowing correct recognition of all forms (A, B, and A-B).
  2. Greek Letter Handling: In _generate_lookups(), added a step to strip Greek letters (e.g., pra1Δ becomes pra1) to ensure these entries are recognized.
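A minimal sketch of the two normalization steps described above. This is not the code from the PR: the helper names `split_dashes`, `strip_greek`, and `lookup_variants` are hypothetical, only the ASCII hyphen is treated as a dash, and "Greek letters" is assumed to mean the Unicode Greek and Coptic block.

```python
import re

# Assumption: "Greek letters" means the Unicode Greek and Coptic block
# (U+0370 to U+03FF), which includes the capital delta in "pra1Δ".
GREEK_LETTERS = re.compile(r"[\u0370-\u03FF]")


def split_dashes(text: str) -> str:
    """Replace ASCII hyphens with spaces so 'A-B' is tokenized as 'A' and 'B'."""
    return text.replace("-", " ")


def strip_greek(term: str) -> str:
    """Remove Greek letters from a lookup term, e.g. 'pra1Δ' becomes 'pra1'."""
    return GREEK_LETTERS.sub("", term)


def lookup_variants(term: str) -> set:
    """Return the original term plus its Greek-stripped form, if they differ."""
    variants = {term}
    stripped = strip_greek(term)
    if stripped and stripped != term:
        variants.add(stripped)
    return variants
```

Keeping the original term alongside the stripped one is a design choice of this sketch, so that a lookup built from the variants would still contain the unmodified entry.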

Added functions to generate the NER output result table (first 6 commits, before 8/23)
The output result table contains the columns ['text', 'obj', 'obj_synonyms', 'don_article', 'groundings', 'match'].
There are three types of output:

  1. The reference ('obj', 'obj_synonyms') is None but 'groundings' has values
  2. 'groundings' is None but the reference ('obj', 'obj_synonyms') has values
  3. Both the reference and 'groundings' have values

The 'match' column specifies whether the references and groundings match.
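The three output types and the 'match' column could be sketched as follows. The function names `classify_row` and `is_match` are hypothetical, and the matching rule (case-insensitive overlap between reference names and grounding names) is an assumption; the PR description does not state how a match is determined.

```python
def classify_row(reference, groundings) -> str:
    """Label a result row by which side has values.

    `reference` stands for the 'obj' / 'obj_synonyms' values and `groundings`
    for the NER output; None or an empty list counts as "no value".
    """
    has_reference = bool(reference)
    has_groundings = bool(groundings)
    if has_groundings and not has_reference:
        return "grounding_only"
    if has_reference and not has_groundings:
        return "reference_only"
    if has_reference and has_groundings:
        return "both"
    return "empty"


def is_match(reference_names, grounding_names) -> bool:
    """Assumed rule: a match is any case-insensitive name shared by both sides."""
    reference_set = {name.lower() for name in reference_names or []}
    grounding_set = {name.lower() for name in grounding_names or []}
    return bool(reference_set & grounding_set)
```

For example, `classify_row(None, ["pra1"])` returns `"grounding_only"`, which corresponds to the first output type listed above.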

The older commits used to be on the master branch. All commits are now on the word process branch.

@haohangyan haohangyan changed the title from "Word process" to "Word process and result table generation" on Aug 26, 2024