Python parser for microformats 2.
Current status: Full-featured and mostly stable. Implements the full mf2 spec, including backward compatibility with microformats1.
Documentation, code tidying and so on are rather lacking.
License: MIT
pip install mf2py
Import the parser using
import mf2py
Parse a file containing the content
with open('file/content.html','r') as file:
    obj = mf2py.parse(doc=file)
Parse a string containing the content
content = '<article class="h-entry"><h1 class="p-name">Hello</h1></article>'
obj = mf2py.parse(doc=content)
Parse content from a URL
obj = mf2py.parse(url="http://tommorris.org/")
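Each of these calls returns a plain dictionary. As a rough sketch of how to read it, the helper below pulls the name of every top-level h-entry out of a result. The helper name entry_names and the hand-written sample are illustrative assumptions, not part of mf2py; the sample only mimics the items / rels / rel-urls layout of mf2 JSON, and a real result may carry additional keys.

```python
def entry_names(parsed):
    """Collect the first `name` value of every top-level h-entry."""
    names = []
    for item in parsed.get("items", []):
        if "h-entry" in item.get("type", []):
            values = item.get("properties", {}).get("name", [])
            if values:
                names.append(values[0])
    return names


# Hand-written stand-in for a parse result (assumed shape, not real output).
sample = {
    "items": [{"type": ["h-entry"], "properties": {"name": ["Hello"]}}],
    "rels": {},
    "rel-urls": {},
}

print(entry_names(sample))
```

With the library installed, the same helper should work on a real result, for example entry_names(mf2py.parse(doc=content)).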
parse is a convenience method that actually delegates to mf2py.Parser to do the real work. More sophisticated behaviors are available by invoking the object directly.
Get parsed microformat in a variety of formats
p = mf2py.Parser(...)
p.to_dict() # returns a python dictionary
p.to_json() # returns a JSON string
Filter by microformat type
p.to_dict(filter_by_type="h-entry")
p.to_json(filter_by_type="h-entry")
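To make the idea of filtering concrete, here is a hand-rolled approximation that works on an already parsed dictionary. It assumes filtering means keeping the top-level items whose type list contains the requested type; the library's own matching rule may differ in detail, and filter_items is a hypothetical name used only for this sketch.

```python
def filter_items(parsed, mf_type):
    """Keep top-level items whose `type` list contains `mf_type`."""
    return [
        item
        for item in parsed.get("items", [])
        if mf_type in item.get("type", [])
    ]


# Hand-written stand-in for a parse result (assumed shape, not real output).
sample = {
    "items": [
        {"type": ["h-entry"], "properties": {"name": ["Hello"]}},
        {"type": ["h-card"], "properties": {"name": ["Alice"]}},
    ],
    "rels": {},
    "rel-urls": {},
}

entries = filter_items(sample, "h-entry")
print(len(entries))
```

Nested microformats (for example an h-card inside an h-entry's author property) are not visited by this sketch; it only looks at the top level.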
- Pass the optional argument img_with_alt=True to either the Parser object or to the parse method to enable parsing of the alt attribute of <img> tags, according to the issue "image alt text is lost during parsing". By default this is False, to be backwards compatible.
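Code that consumes image properties may then have to cope with two shapes. The sketch below assumes that with img_with_alt=True an image value arrives as a dictionary with "value" and "alt" keys, and as a plain URL string otherwise; photo_url_and_alt is a hypothetical helper, not an mf2py function.

```python
def photo_url_and_alt(value):
    """Return (url, alt) for one image property value in either shape."""
    if isinstance(value, dict):
        # Assumed shape when alt parsing is enabled.
        return value.get("value"), value.get("alt")
    # Plain string: only the URL is known.
    return value, None


print(photo_url_and_alt("http://example.com/photo.jpg"))
print(photo_url_and_alt({"value": "http://example.com/photo.jpg", "alt": "A photo"}))
```

Normalizing at the point of use like this lets the same consumer work whether or not the option was enabled when the document was parsed.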
- I passed mf2py.parse() a BeautifulSoup document, and it got modified!
Yes, mf2py currently does that. We're working on preventing it! Hopefully soon.
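In the meantime, one possible workaround is to hand the parser a copy of the document instead of the original. The sketch below assumes that copy.copy() on a BeautifulSoup object yields an independent tree; parse_preserving is a hypothetical helper, and the list and fake parser at the bottom are stand-ins so the example runs without either library installed.

```python
import copy


def parse_preserving(doc, parser):
    """Call `parser` on a copy of `doc` so the caller's object is untouched."""
    return parser(doc=copy.copy(doc))


# Stand-in parser that mutates its input, mimicking the behaviour described.
def fake_parse(doc):
    doc.append("modified")
    return {"items": []}


original = ["untouched"]
result = parse_preserving(original, fake_parse)
print(original)
```

With real objects the call would look like parse_preserving(soup, mf2py.parse). Passing str(soup) as doc is another option, at the cost of parsing the markup a second time.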
A basic web interface for mf2py and mf2util is available at mf2py-web.
A hosted live version can be found at python.microformats.io.
We welcome contributions and bug reports via GitHub, and on the microformats wiki.
We try to follow the IndieWebCamp code of conduct. Please be respectful of other contributors, and forge a spirit of positive co-operation without discrimination or disrespect.