Configuration of harvesting #36

odscjames · 2020-06-25T13:09:12Z

Optimisation and Configuration
The process of harvesting itself needs to be parameterised so that users can split up the harvesting work as they see fit and assign additional workers to it as they see fit - in addition to any other system settings that might help speed the process of harvesting for particular use-cases. Scheduling should also be configurable.

Can we explore what use cases you're looking to meet here, so we can plan this?

robredpath · 2020-06-26T12:41:21Z

Discuss with client: parameterise harvesting?

robredpath · 2020-06-29T15:08:40Z

Hi @thill-odi @nickevansuk !

I'm looking at the work for our second sprint on the OA Harvester Extension, and this is an outstanding question.

We've got a bit of work to do around making the harvesting part of the system customisable. @odscjames quoted from one of the early spec documents, but that has left quite a lot undefined.

Can you tell us a bit about:

- Who you expect to be using the harvester code directly?
- What we know about how they might want to slice up the harvesting? Would it be per-publisher? Per-feed? Based on some sort of query/filter that they might apply at the filtering stage and/or from the status service?
- Any constraints around speed that you might know about?

robredpath · 2020-07-13T12:49:25Z

I spoke with @thill-odi a couple of weeks ago.

The work to be done here in this round of development is around parallelisation - threads, workers, etc.

The anticipated use case is a developer setting up their own application that consumes OA data, and wanting to configure the process to match the resources available to them. We anticipate that they'll already broadly know what they want to do, having (hopefully) used the API and other developer resources to try out their idea.

The expectation is that they'll be competent enough to do any sort of source selection themselves, so no filtering/per-source stuff.
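
To make the parallelisation point concrete, here is a minimal sketch of what a configurable worker count could look like. It is only an illustration under stated assumptions: the `HARVEST_MAX_WORKERS` variable, `HarvestConfig`, `harvest_feed` and `harvest_all` are all hypothetical names, not part of the existing harvester, and a thread pool is just one of the options (threads, workers, etc.) mentioned above.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class HarvestConfig:
    """Hypothetical settings; none of these names come from the harvester."""
    max_workers: int = 4


def load_config(env=None):
    """Read the worker count from HARVEST_MAX_WORKERS (an assumed name)."""
    env = os.environ if env is None else env
    workers = int(env.get("HARVEST_MAX_WORKERS", "4"))
    if workers < 1:
        raise ValueError("HARVEST_MAX_WORKERS must be at least 1")
    return HarvestConfig(max_workers=workers)


def harvest_feed(feed_url):
    """Placeholder for the real harvesting work; returns a fake summary."""
    return {"feed": feed_url, "status": "ok"}


def harvest_all(feed_urls, config, harvest=harvest_feed):
    """Harvest every feed using at most config.max_workers threads."""
    with ThreadPoolExecutor(max_workers=config.max_workers) as pool:
        # map() preserves input order, so results line up with feed_urls.
        return list(pool.map(harvest, feed_urls))
```

A developer with limited resources might set `HARVEST_MAX_WORKERS=2` and call `harvest_all(feeds, load_config())` with their own list of feeds. Source selection stays entirely in their hands, in line with the "no filtering/per-source stuff" decision.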

odscjames assigned nickevansuk and thill-odi Jun 25, 2020
