Every post on opensiddur.org, parsed from WordPress eXtended RSS (XML) export files.
These XML files are generated manually from our WordPress backend. (I'd like to figure out how to generate them automatically.)
Individually parsed posts (see posts/HTML) are generated from these XML files using a fork of ruslanosipov's wxr2txt.py script.
The generated posts, which carry the .md Markdown extension, contain the HTML body of each post on opensiddur.org along with some limited metadata. Not all postmeta data (e.g., co-authors, categories, tags, and license information) is currently parsed into these posts. Working out how to do this in Python or via XSLT is a current goal.
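As a starting point for the Python route, here is a minimal sketch of pulling categories and tags out of a WXR export with the standard library. It assumes the usual WXR layout, where each post is an `<item>` and each term is a `<category>` element whose `domain` attribute is `category` or `post_tag`. The sample document and the function names `extract_terms` and `terms_by_title` are made up for illustration and are not part of wxr2txt.py.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the shape of a WXR export.
SAMPLE_WXR = """<rss version="2.0">
  <channel>
    <item>
      <title>Example post</title>
      <category domain="category" nicename="weekday"><![CDATA[Weekday]]></category>
      <category domain="post_tag" nicename="morning"><![CDATA[Morning]]></category>
      <category domain="post_tag" nicename="psalms"><![CDATA[Psalms]]></category>
    </item>
  </channel>
</rss>"""


def extract_terms(item):
    """Group the <category> children of one <item> by their domain attribute."""
    terms = {}
    for cat in item.findall("category"):
        domain = cat.get("domain", "unknown")
        terms.setdefault(domain, []).append((cat.text or "").strip())
    return terms


def terms_by_title(wxr_text):
    """Map each post title to its terms (assumes titles are unique)."""
    root = ET.fromstring(wxr_text)
    return {
        item.findtext("title", default=""): extract_terms(item)
        for item in root.iter("item")
    }


if __name__ == "__main__":
    for title, terms in terms_by_title(SAMPLE_WXR).items():
        print(title, terms)
```

The resulting dictionary could then be written into the metadata header of each generated .md file. Co-authors are stored differently depending on the plugin in use, so they are left out of this sketch.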
Content in the posts file is shared under a mix of Creative Commons Attribution-ShareAlike and Creative Commons Attribution licenses. To determine the license for a specific post, inspect its "open_content_license" postmeta key.
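The lookup can be sketched in Python roughly as follows. This assumes the export declares the `wp` prefix as `http://wordpress.org/export/1.2/` (older exports use 1.0 or 1.1, so check the root element of your file) and that postmeta is stored as `<wp:postmeta>` blocks with `<wp:meta_key>` and `<wp:meta_value>` children. The sample data, the placeholder license value, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: WXR 1.2 namespace; adjust to match the export file's root element.
NS = {"wp": "http://wordpress.org/export/1.2/"}

# Hypothetical sample; the license value is a placeholder, not a real stored value.
SAMPLE_EXPORT = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Example post</title>
      <wp:postmeta>
        <wp:meta_key><![CDATA[open_content_license]]></wp:meta_key>
        <wp:meta_value><![CDATA[EXAMPLE-LICENSE-VALUE]]></wp:meta_value>
      </wp:postmeta>
    </item>
    <item>
      <title>Post without license metadata</title>
    </item>
  </channel>
</rss>"""


def license_for_item(item, key="open_content_license"):
    """Return the value stored under `key` for one <item>, or None if absent."""
    for meta in item.findall("wp:postmeta", NS):
        meta_key = meta.findtext("wp:meta_key", default="", namespaces=NS)
        if meta_key.strip() == key:
            return meta.findtext("wp:meta_value", default="", namespaces=NS).strip()
    return None


def licenses_by_title(wxr_text):
    """Map each post title to its license value (None when the key is missing)."""
    root = ET.fromstring(wxr_text)
    return {
        item.findtext("title", default=""): license_for_item(item)
        for item in root.iter("item")
    }


if __name__ == "__main__":
    for title, value in licenses_by_title(SAMPLE_EXPORT).items():
        print(f"{title}: {value}")
```

Posts that come back as `None` have no "open_content_license" entry in the export and would need to be checked by hand.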
Content in all other files (pages, authors, media) is shared under a Creative Commons Attribution-ShareAlike 4.0 International license.
Email addresses of users and commenters have been redacted.