hpiwowar edited this page May 16, 2011 · 16 revisions

Open questions

1. How do we organise plugins?

Should input be filtered before sending it to plugins? Jason says no (+Heather agrees). Plugins should know what kind of artefacts their Sources can handle, and how to identify those based on ID-strings. Once you store that knowledge somewhere else as well, you've made two points of contact between the external Source and us. When stuff changes--and it does, frequently, with these Source APIs--you've got two places to fix stuff.

How should plugins be registered? Jason: All we should need to know about the plugin is a URL. Plugin authors are responsible for everything else. Specifically, they need to:

  • accept a JSON object full of IDs
  • filter those IDs in a sensible way
  • chunk them together if their Source API accepts batched requests
  • make as many calls to their Source API as needed to get data on everything
  • do error checking on what they get back from the Source and handle things appropriately (exponential backoff, return nulls, throw an error, log stuff, etc.)
  • send back a JSON object that exactly matches our specification.
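
The plugin responsibilities listed above can be sketched in Python roughly as follows. This is only an illustration, not the specification: the request shape (a JSON object whose keys are the IDs), the DOI-style filter `looks_like_doi`, the `CHUNK_SIZE` value, and the `fetch` callable standing in for a real Source API call are all assumptions made for the example.

```python
import json
import time

CHUNK_SIZE = 50  # hypothetical batch size; depends on the Source API


def looks_like_doi(id_string):
    """Hypothetical filter: keep only the IDs this Source can handle."""
    return id_string.startswith("10.") and "/" in id_string


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_with_backoff(fetch, chunk, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(chunk), retrying with exponential backoff.

    Returns None if every attempt fails, so the caller can report nulls.
    """
    for attempt in range(retries):
        try:
            return fetch(chunk)
        except Exception:
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None


def run_plugin(request_json, fetch, sleep=time.sleep):
    """Accept a JSON object of IDs and return a JSON object of results.

    `fetch` is a stand-in for the real Source API call: it takes a list
    of IDs and returns a dict mapping each ID to its metric data.
    """
    ids = json.loads(request_json)  # assumed: keys of the object are the IDs
    results = {id_string: None for id_string in ids}  # unhandled IDs stay null
    wanted = [id_string for id_string in ids if looks_like_doi(id_string)]
    for chunk in chunked(wanted, CHUNK_SIZE):
        data = fetch_with_backoff(fetch, chunk, sleep=sleep)
        if data is not None:
            results.update(data)
    return json.dumps(results)
```

Returning null for IDs the plugin cannot handle keeps the filtering knowledge inside the plugin, which is the point Jason makes above.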

Question: Do plugins also need to send a documentation blurb about their metric to dynamically populate the "Metrics are computed based on the following data sources" documentation? If not, where should that text come from? It sort of makes me think that our API should make a "register yourself" call to each plugin on startup or something, to get this meta info (Heather).
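
One possible shape for that "register yourself" idea, as a rough sketch only. The field names in `PLUGIN_INFO`, the `describe` function, and the example URL are hypothetical; nothing here is an agreed format.

```python
import json

# Hypothetical metadata a plugin could return when asked to describe itself.
PLUGIN_INFO = {
    "name": "example-source",
    "description": "Counts mentions of an artifact in Example Source.",
    "url": "http://example.org/plugins/example-source/",
}


def describe():
    """What a plugin might send back from a 'register yourself' call."""
    return json.dumps(PLUGIN_INFO)


def build_sources_blurb(plugin_infos):
    """Assemble the data-sources documentation from collected plugin metadata."""
    lines = ["Metrics are computed based on the following data sources:"]
    for info in sorted(plugin_infos, key=lambda i: i["name"]):
        lines.append("  - {name}: {description}".format(**info))
    return "\n".join(lines)
```

With something like this, the documentation text would live in the plugin, so there is still only one place to update when a Source changes.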

How do we run plugins in Python (or any language)? Jason: There are two parts to this:

  1. external authors with their own hosting: We don't care what language you use. Give us a url. Implement the steps listed above. Done.
  2. us, with a bunch of Python plugins sitting on our server: I think there are two routes:
    1. we ditch everything in "plugins/common/" and rely on Apache to route requests. Each plugin folder then has an index.php that simply reads the content of a POST request, starts a Python process with the request JSON as a param, and then prints out whatever it gets back from the Python script. We break DRY a bit, but it's about 5 lines and lets each plugin be completely self-reliant. This stackoverflow question explains it well. As long as we escape correctly I don't think it's a big security hole. Heather thinks this sounds fine.
    2. We set up Python so that it can receive HTTP requests, handle them, and then serve responses itself. I (jason) have no idea how to do this or how it interacts with Apache. So, I favour option 1 :) Heather thinks this isn't hard, but I don't have time to do it this week, so weight my vote appropriately.
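
For comparison, here is a minimal sketch of the Python side of both routes, using only the Python 3 standard library. In route 1 the script is started by the index.php wrapper with the request JSON as its first argument; in route 2 the same logic is served over HTTP directly. The `handle_request` body, the `--serve` flag, and the host and port are placeholders invented for the example, and how the server would sit behind Apache is not covered.

```python
import json
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_request(request_json):
    """Placeholder plugin logic: echo each ID back with a null metric."""
    ids = json.loads(request_json)
    return json.dumps({id_string: None for id_string in ids})


class PluginHandler(BaseHTTPRequestHandler):
    """Route 2: Python receives the POST and serves the response itself."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        try:
            response, status = handle_request(body), 200
        except (ValueError, TypeError):
            response, status = json.dumps({"error": "invalid request"}), 400
        payload = response.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def main(argv):
    if len(argv) > 1 and argv[1] == "--serve":
        # Route 2: hypothetical host and port, for local testing only.
        HTTPServer(("127.0.0.1", 8000), PluginHandler).serve_forever()
    elif len(argv) > 1:
        # Route 1: the wrapper passes the request JSON as the first argument
        # and prints whatever this script writes to stdout.
        print(handle_request(argv[1]))
    else:
        sys.stderr.write("usage: plugin.py '<request json>' | --serve\n")


if __name__ == "__main__":
    main(sys.argv)
```

Because both routes call the same `handle_request`, a plugin written for route 1 could move to route 2 later without changing its logic.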

2. How much do we get done before we start to circulate?

  1. I think there are some other handlers implemented, but not currently getting called. Can we add them? Examples: icpsr, facebook, plosalm....
  2. Do we want to do viz to make it prettier before we send it around?
  3. Sure would be cool to get the artifact metadata (article names etc. from crossref) in there, but maybe this counts as a nice-to-have? The crossref filter is working, I think, but requires unique rendering.