Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for more Container Types and Instance Types out-of-the-box #517

Closed
ctgraham opened this issue Aug 1, 2022 · 5 comments · Fixed by #518 or #519
Closed

Support for more Container Types and Instance Types out-of-the-box #517

ctgraham opened this issue Aug 1, 2022 · 5 comments · Fixed by #518 or #519

Comments

@ctgraham
Copy link
Contributor

ctgraham commented Aug 1, 2022

Is your feature request related to a problem? Please describe.

Supported container type choices and instance type choices may be fairly bespoke, but additional common selections exist. It would be preferable for each implementer to need to edit transfomer/resources/configs.py as little as possible.

Describe the solution you'd like

For now, Pitt is submitting a list from our data which seems to be potentially common.

Describe alternatives you've considered

This list includes some capitalization/wordbreak variants. It might be preferable to programatically transform the options to normalize the incoming data (or to explode the configuration data) to cover variants like:

Additional context

N.b.: I normalized the presentation of our container type choices to singular when the code was plural, to match existing pairings. I'm not sure if that is appropriate from an Archives perspective.

@helrond
Copy link
Member

helrond commented Aug 2, 2022

@ctgraham Thanks for raising this! First a bit of context from our perspective: the main use of these config lists is to validate data coming from ArchivesSpace. While these values are all controlled in AS by enumeration lists, it is possible to add to those lists by importing data via a spreadsheet, so those values aren't as controlled as they might appear to be. So those lists serve as a way of warning us when there's some funky data that's made its way into the system.

That said, I totally get that not everyone has that use case, and I also very much agree with the principle of making as few modifications to the transformer/resources/configs.py file as possible.

It seems to me like there are a range of options here:

  • Just get rid of those config lists and deal with ArchivesSpace validation some other way. There may be some implications for this in the mappings or elsewhere in the application, so we'd have to do a bit more research before I'd be comfortable recommending that as an option.
  • Adding to the list (as your PR has done). I think this is a good stopgap.
  • Creating variables in the application config which would be appended to a standardized list. This would make the config file a lot heavier, but would keep all of the config edits in one place.
  • Storing these configs in the application database, which would allow them to be updated on a per-instance basis (using the built-in Django admin UI). This is a bit more work but might be the most sustainable choice long-term.

I don't really love the idea of programmatically exploding the options, and would argue for just removing those lists instead, because what we're saying in that case is that we really don't care about the values that are coming in.

I am happy for now to accept the existing PR, but would be interested in thoughts on whether we should add a more substantive change to the backlog.

helrond added a commit that referenced this issue Aug 2, 2022
#517: provide additonal CONTAINER_TYPE_CHOICES and INSTANCE_TYPE_CHOICES
@ctgraham
Copy link
Contributor Author

ctgraham commented Aug 4, 2022

Thanks, @helrond . Speculatively, how does this data validation relate to the schema validation done by https://github.com/RockefellerArchiveCenter/rac_schemas ? Could this schema, which already likely needs to be customized per institution, be the location to store the enforcement of certain term lists? (Maybe not, because it seems we are one step beyond simple enums here, in mapping the values against human readable representations?)

@helrond
Copy link
Member

helrond commented Aug 4, 2022

They're validating two different data structures - the lists we're looking at in /transformer/resources/configs.py relate to the incoming source data from ArchivesSpace, whereas the schemas in rac_schemas validate the transformed output.

I think there is a reasonable case to be made that really what we care about is validating the output. My concern about removing these lists is whether there is application logic which relies on specific values for source data (for example, are we building logic off of values like instance.type). If that's a direction that seems promising I'm happy to dig in and see what the potential effects of removing those controlled lists might be.

@ctgraham
Copy link
Contributor Author

ctgraham commented Aug 4, 2022

Understood, thanks. I appreciate the challenge of wanting to build logic around customizable data values, without turning the application configuration into rat's nest. Of the options you describe, I think the one with the best UX is going to be the Django admin based config; the most expedient would likely be moving this to the application config (including specific values which affect application logic).

@helrond
Copy link
Member

helrond commented Aug 5, 2022

I've moved this future work to #521, and will merge the addition to the values list shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants