forked from natasha/yargy

Tiny package for information extraction

Yargy

Yargy is an Earley parser that uses Russian morphology for fact extraction. It is written in pure Python.
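For readers unfamiliar with the term, the following is a minimal sketch of an Earley recognizer in plain Python. It is not yargy's implementation: the toy grammar, the token names (`title`, `first`, `last`) and the `recognize` function are all hypothetical, and rules with empty bodies are deliberately left out to keep the sketch short.

```python
from collections import namedtuple

# An Earley item: rule head, rule body, dot position, start index of the item.
Item = namedtuple('Item', ['head', 'body', 'dot', 'origin'])

# Hypothetical toy grammar (not yargy syntax). Keys are nonterminals,
# every other symbol is treated as a terminal token type.
GRAMMAR = {
    'PERSON': [('POSITION', 'NAME')],
    'NAME': [('first', 'last')],
    'POSITION': [('title',), ('title', 'title')],
}


def recognize(tokens, start='PERSON', grammar=GRAMMAR):
    """Return True if the whole token sequence can be derived from `start`."""
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(index, item):
        if item not in chart[index]:
            chart[index].append(item)

    for body in grammar[start]:
        add(0, Item(start, body, 0, 0))

    for i in range(len(tokens) + 1):
        # chart[i] may grow during the loop; newly added items are visited too.
        for item in chart[i]:
            if item.dot < len(item.body):
                symbol = item.body[item.dot]
                if symbol in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for body in grammar[symbol]:
                        add(i, Item(symbol, body, 0, i))
                elif i < len(tokens) and tokens[i] == symbol:
                    # Scan: the next token matches the terminal after the dot.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for this head.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.body)
                            and parent.body[parent.dot] == item.head):
                        add(i, parent._replace(dot=parent.dot + 1))

    return any(
        item.head == start and item.dot == len(item.body) and item.origin == 0
        for item in chart[len(tokens)]
    )
```

Calling `recognize(['title', 'title', 'first', 'last'])` accepts a two-word position followed by a name, while `recognize(['first', 'last'])` is rejected because the position is missing. Yargy additionally matches tokens with morphological predicates rather than plain string equality.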

Install

Yargy supports Python 2.7+ and 3.4+, including PyPy.

$ pip install yargy

Usage

from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline


Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

# Require agreement in gender, number and case between the matched parts
gnc = gnc_relation()
NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

parser = Parser(PERSON)

match = parser.match('управляющий директор Иван Ульянов')
print(match)

The output will look something like this:

Person(
    position='управляющий директор',
    name=Name(
        first='Иван',
        last='Ульянов'
    )
)

For more examples and details on grammar syntax, predicates, and pipelines, see the Yargy documentation.
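The usage example relies on `gnc_relation` to require gender, number and case agreement between matched parts. As a rough illustration of the idea only, the sketch below checks agreement over a hand-written table of word forms. The `FORMS` table, the `Form` tuple and the `agree` function are hypothetical and heavily simplified; yargy obtains this information from a morphological analyzer, and its actual agreement logic may differ.

```python
from collections import namedtuple

# One morphological reading of a word: gender, number, case.
Form = namedtuple('Form', ['gender', 'number', 'case'])

# Hypothetical, simplified readings written by hand for this sketch.
# A word may have several readings because surface forms are ambiguous.
FORMS = {
    'Иван': [Form('masc', 'sing', 'nomn')],
    'Ивана': [Form('masc', 'sing', 'gent'), Form('masc', 'sing', 'accs')],
    'Ульянов': [Form('masc', 'sing', 'nomn')],
    'Ульянова': [
        Form('femn', 'sing', 'nomn'),
        Form('masc', 'sing', 'gent'),
        Form('masc', 'sing', 'accs'),
    ],
}


def agree(first, second, forms=FORMS):
    """Return True if some reading of `first` equals some reading of `second`."""
    return any(a == b for a in forms[first] for b in forms[second])
```

With this table, `agree('Иван', 'Ульянов')` and `agree('Ивана', 'Ульянова')` hold, while `agree('Иван', 'Ульянова')` does not, since no pair of readings shares gender, number and case. Such a constraint lets a grammar reject first and last name pairs that do not belong together.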

License

The source code of yargy is distributed under the MIT license, which allows modification and commercial use.
