[en] new dicts #10699

Open
wants to merge 16 commits into
base: master

jaumeortola
Member

No description provided.

@jaumeortola jaumeortola requested review from evan-defran-lt and AzadehSafakish and removed request for evan-defran-lt July 2, 2024 11:42
@AzadehSafakish
Collaborator

I didn't build the new dictionaries locally, so these antipatterns (APs) aren't guaranteed to work, although I think most of them should, since they don't rely much on postag info.
If they don't work, they should at least give an idea of what needs to be fixed and how.

  • COMMA_THANKS[4]
<antipattern>
    <token postag="SENT_START|PCT" postag_regexp="yes" />
    <token>no</token>
    <token>thank</token>
    <token>you</token>
    <token>,</token>
    <example>No thank you, I'm full.</example>
</antipattern>
  • POSSESSIVE_APOSTROPHE[1]
<antipattern>
    <token>missing</token>
    <token>persons</token>
    <token chunk_re="[IE]-NP.*" />
    <example>Fadil requested the help of missing persons investigator Rami Hasan.</example>
</antipattern>
  • ADVISE_VBG[3]
<antipattern>   <!-- this isn't an FP, but the suggestions are incorrect in this context and interrupting the parallel structure makes the sentence worse -->
    <token skip="5" postag="VBG">
        <exception scope="next" postag="V.*" postag_regexp="yes" />
    </token>
    <token regexp="yes">and|or</token>
    <token postag="VBG" chunk="B-VP" />
    <example>Meditation helps downshifting more easily after work and sleeping better at night.</example>
</antipattern>
  • MISSING_HYPHEN[5]
<antipattern>
    <token regexp="yes" case_sensitive="yes" postag="CD">[A-Z].*</token>
    <token min="2" regexp="yes" case_sensitive="yes">[A-Z].*</token>
    <example>Mussels contributed to a valve problem in the 1990s at the Nine Mile Nuclear Power plant in Lake Ontario.</example>
    <example>The Eighty Minute Hour (1974) — A weird and ambitious "space opera" whose characters actually sing.</example>
    <example>With the Nine Inch Nails album Year Zero, the concept of the albums songs which "[take] place about 15 years in the future" when "Things are not good." and incorporated sites from the Web.</example>
    <example>Chris, As per clause 7.2 (a) (i) of the LNG sales contract we are providing you EcoElctrica's Ninety Day Schedule.</example>
</antipattern>
  • UH_UH_COMMA[1] (pattern, not antipattern)
<pattern>
    <token postag="UH">
        <exception postag="IN" />
        <exception regexp="yes">ha|yo|why|health|check|hip|meow|break|really|never|contact|blah|there|yum</exception>
    </token>
    <token>
        <match no="0" />
    </token>
</pattern>
<message>Consider adding a comma between these interjections.</message>
<suggestion>\1, \2</suggestion>
<suggestion>\1</suggestion>
<example correction="Oh, oh|Oh"><marker>Oh oh</marker>, he is coming.</example>
<example>Yum yum!</example>
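
For anyone unsure where these snippets go: as far as I understand the rule format, an antipattern is placed inside the rule it protects, before the `<pattern>`, and text it matches is exempt from that rule. Below is a minimal sketch of that layout. The rule id, name, pattern, message and second example are all hypothetical and are NOT the real COMMA_THANKS[4] rule; only the `<antipattern>` block is copied from above.

```xml
<!-- Hypothetical rule for illustration only; not the actual COMMA_THANKS[4] rule. -->
<rule id="HYPOTHETICAL_NO_THANK_YOU" name="Hypothetical: comma after 'no' in 'no thank you'">
    <!-- Antipattern copied from the list above: text it matches is exempt from this rule. -->
    <antipattern>
        <token postag="SENT_START|PCT" postag_regexp="yes" />
        <token>no</token>
        <token>thank</token>
        <token>you</token>
        <token>,</token>
        <example>No thank you, I'm full.</example>
    </antipattern>
    <!-- Assumed pattern: only 'no' is marked, so \1 in the suggestion refers to it. -->
    <pattern>
        <marker>
            <token>no</token>
        </marker>
        <token>thank</token>
        <token>you</token>
    </pattern>
    <message>Consider adding a comma after 'no'.</message>
    <suggestion>\1,</suggestion>
    <!-- Assumed example sentence; has no comma after 'you', so the antipattern does not apply. -->
    <example correction="No,"><marker>No</marker> thank you for the offer.</example>
</rule>
```

The `<example>` inside the antipattern is the sentence that should stay unflagged, so if the new dictionaries change the tagging, that example is the first thing to check.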
