PyPLN Application Programming Interface

PyPLN is a distributed pipeline for natural language processing, written in Python. Learn more at the PyPLN website.

pypln.api is a package that talks to the PyPLN HTTP API so you can do everything programmatically, in a Pythonic way. With it you can add and list corpora, add and list documents, and retrieve a document's properties (the results of the backend's pipeline processing).

Installation

pypln.api is available on the Python Package Index. To install it, run:

pip install pypln.api

Example - Usage

You can read the docstrings in pypln.api.PyPLN, but general usage looks something like this:

from pypln.api import PyPLN

# Start an authenticated session with the PyPLN demo server
pypln = PyPLN('http://fgv.pypln.org/', ('username', 'password'))

# You could also use your authentication token:
#pypln = PyPLN('http://fgv.pypln.org/', 'my-auth-token')

# Add a new corpus to your account
new_corpus = pypln.add_corpus(name='test', description='my new corpus')

# Add a document to this new corpus
# (open the PDF in binary mode so its bytes are sent unchanged)
with open('my-file.pdf', 'rb') as fp:
    new_doc = new_corpus.add_document(fp)
print('Document added: {}'.format(new_doc))

# Retrieve all available (processed) properties for your brand new document
print('Processed properties:')
for document_property in new_doc.properties:
    print(' - {}'.format(document_property))

# Retrieve one document property:
print('Extracted text from our PDF:')
print(new_doc.get_property('text'))

# Retrieve a document using its URL:
from pypln.api import Document
# Make sure you replace this URL with the URL of a document you have access to!
my_doc = Document.from_url('http://fgv.pypln.org/documents/1/',
    ('username', 'password'))
print(my_doc.get_property('text'))

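# Retrieve the wordcloud image built from the document.
# The property is base64-encoded, so decode it and write the bytes in binary mode.
import base64
doc_id = 1  # ID of the document retrieved above (from its URL)
with open("wordcloud_{}.png".format(doc_id), 'wb') as fd:
    fd.write(base64.b64decode(my_doc.get_property("wordcloud")))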
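
The wordcloud step above can be wrapped in a small helper. This is only a sketch: `save_base64_image` is a hypothetical name that is not part of pypln.api, and it assumes the property value is a base64-encoded string, as the example above suggests.

```python
import base64


def save_base64_image(encoded, path):
    """Decode a base64 string and write the raw bytes to `path`.

    Hypothetical helper, not part of pypln.api.
    Returns the number of bytes written.
    """
    raw = base64.b64decode(encoded)
    # Binary mode matters: the decoded payload is image data, not text.
    with open(path, 'wb') as fd:
        fd.write(raw)
    return len(raw)
```

With a document object at hand, it would be called as `save_base64_image(my_doc.get_property('wordcloud'), 'wordcloud.png')`.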

ProTip™: use ipython to discover all the methods available on the PyPLN, Corpus and Document classes. They are very simple and straightforward to use.
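
The example above hard-codes the credentials. Since the second argument to PyPLN can be either a (username, password) tuple or an auth token, one option is to read it from the environment. The sketch below is not part of pypln.api: the function name and the `PYPLN_TOKEN`, `PYPLN_USERNAME` and `PYPLN_PASSWORD` variable names are assumptions chosen for illustration.

```python
import os


def load_credentials(environ=None):
    """Return an auth token string, or a (username, password) tuple.

    Hypothetical helper; the environment variable names are assumptions.
    """
    environ = os.environ if environ is None else environ
    # Prefer the token when it is set.
    token = environ.get('PYPLN_TOKEN')
    if token:
        return token
    try:
        return (environ['PYPLN_USERNAME'], environ['PYPLN_PASSWORD'])
    except KeyError as exc:
        raise RuntimeError(
            'missing environment variable: {}'.format(exc.args[0]))


# Usage (requires pypln.api and a reachable server):
# pypln = PyPLN('http://fgv.pypln.org/', load_credentials())
```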

License

pypln.api is free software, released under the GPLv3.
