Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Api reverse links #139

Merged
merged 20 commits into from
Aug 20, 2024
Merged

Api reverse links #139

merged 20 commits into from
Aug 20, 2024

Conversation

liviuba
Copy link
Contributor

@liviuba liviuba commented Aug 6, 2024

For https://app.clickup.com/t/86959dg0w , https://app.clickup.com/t/869527km2

Adding ObjectReference as it as (with the runtime validation hooks) just so we don't lose it when we squash, in case we ever have a usecase for it. Will make a PR deleting that part (so ObjectReference will just be a container for the link target) after that

Not fully tested, but tests should cover "one of each" for the major usecases (validating that a dependency object exists, and generating reverse links)

@liviuba liviuba force-pushed the api_reverse_links branch from a7bf304 to 48efe5e Compare August 7, 2024 15:03
…t used anyway)

Clarify pydantic.raises block, because I didn't know what was failing
@liviuba
Copy link
Contributor Author

liviuba commented Aug 19, 2024

Note: Had to delete the ObjectReference type-validation-hooks because it broke schema chaining. Something to do with __get_pydantic_core_schema__, probably fixable but we were going to delete it in the next PR anyway.

Also, changes to the models tests are just to get a better exception for easier debugging (IMO - default error was just "Exception type not matching" and it took a lot of pytest flag fidgeting to get any meaningful info). Happy to undo

@liviuba liviuba requested a review from sherwoodf August 19, 2024 13:33
@liviuba liviuba marked this pull request as ready for review August 19, 2024 13:34
):
less_than_minimal_dict[key] = []
with pytest.raises(ValidationError):
if isinstance(less_than_minimal_dict[key], list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the try-catch. Some super nit-picky comments:

  1. the else on line 114 executes exactly the same code as in the else on line 130, so you could just check both conditions on line 96 & 97 and the do this logic.
  2. I guess this also no longer checks a case where if a field is provided it must have 1 value, but can be not provided - which would be weird & require custom validation logic.
  3. I think your code is clear enough to not need the comments you've included. If there is one that should have an explanation of what you're testing, its the comment one on line 99 since minimum cardinality requirements are unusual.

Copy link
Contributor Author

@liviuba liviuba Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Makes sense, I think I defaulted to refining things by context, changed now
  2. Was that checked previously? I didn't read it like that. But happy to tweak this in this PR if it was
  3. Rephrased some of the comments, added to help keep track of branch context but happy to remove / trim down if you want the code to be neater

Copy link
Contributor

@sherwoodf sherwoodf Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think del less_than_minimal_dict[key] ... test run in all test cases, the
                isinstance(less_than_minimal_dict[key], list)
                and len(less_than_minimal_dict[key]) > 0
            ):
                ...```

conditions added an additional test for minimal list length: at least i presume this is the test logic if the test doesn't 'exit' on `with pytest.raises(...)` ?

api/api/app.py Outdated
@@ -1,6 +1,7 @@
from . import public
from . import private
from .models.repository import repository_create, Repository
from bia_shared_datamodels.bia_data_model import ObjectReference, Study
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit-pick: unused imports

) in object_to_check.get_object_reference_fields().items():
if not getattr(object_to_check, link_attribute_name):
"""
If a field optional and a link to another object,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like something from this comment was deleted & it's a comment about the whole assert_model_doc_dependencies_exists logic, rather than specifically the logic in from like 93-101

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually meant to only cover that branch, just because it looks odd / handles the edge case of Optional[List[X]]. Changed the comment a bit, does that describe the context better?


docs = await self.biaint.find(
{"$or": doc_dependencies}, {"uuid": 1, "model": 1}
).to_list(length=None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit-pick - explicitly saying length=None made me double check what this was doing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it's annoying, but it's a required param. It probably should be some number instead of None for safety, will most likely change when I do pagination

if len(docs) != len(doc_dependencies):
"""
! This also fails if an object has the same object as a dependency twice.
Feature? Could it happen for something like links in different contexts to the same file/fileref?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing currently has 2 links objects of the same class (except through generated endpoint links). It could be possible in the furture, but i would expect some of the fields to be 'calculateable'. I presume that since this is checking what we're putting in mongo, it's not testing generated fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand this. The main actual place where this would actually be useful (there's a test for it too, to make sure it stays consistent) is if we point to the same object multiple times in a list (which we technically treat as sets). This was an issue a while ago because diffing ingested studies that were changed is tricky.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't a feedback comment, was just reacting to the 'Could it happen...'

I think the accidentally pointing twice to the same object in the same field is probably a feature - most data we care about is like to be a set of unique values and rarely some intrinsic order (maybe some order for e.g. authors).

But when you said 'different context' i thought you mean different fields pointing to the same uuid. i was wondering if we could have a case where (and actually im realising as i'm typing this that i bet we do have this use-case and that the current models can't cope with it...) an Experimental Imaging Dataset with an annotation csv that is annotation if the images in that dataset. The csv, if it were an AnnotationFileReference, has both "submitted_in_dataset": <dataset123> and "source_dataset": <dataset123>.

# just raise and defer definining behaviour until we use it
if link_attribute_type in models_public:
router.add_api_route(
f"/{to_snake(link_attribute_type.__name__)}/{{uuid}}/{to_snake(model.__name__)}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 You're using the type of the object to create the field names for the reverse links, right?

That feels not ideal when the field names of the forwards links aren't the names of the classes, but it does follow the type pattern of api definitions. Eurgh, i'm not sure what would be better here. But this method no longer works if we ever have 2 links in the same direction between two classes (which we don't currently, in the diagram, and we don't have plans for, but could be possible in the future).

Will we eventually generate the forward links as paths as well? My understanding was this was the eventual intent - if so we start getting weird discrepancies between the uuid field names and the path/endpoint.

Copy link
Contributor Author

@liviuba liviuba Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the paths, I hadn't thought of that. Makes sense that it could come up as something we might need. The thing is we only have the "arrow origin" object in that context. Could add some extra params to the annotation to control what the reverse link shows up as, when we need it?

Also for the forward links, I didn't have those in mind here, the forward "links" would basically be get the dataset, then get the study (vs this which is "get datasets for some study"). The reason we needed the reverse links was so that we could actually find things that point to the thing upstream, so we could walk down the model. Like for example in biaint, if I had a study and wanted to know which datasets does it have, wouldn't have a way to do that without them. It would be a nice convenience feature though, but the thing we would actually need to be able to walk through everything that's in a study was this AFAICT.

Copy link
Contributor

@sherwoodf sherwoodf Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, totally, this was definitely the higher priority feature to get implemented. I was just bringing up the forward links here because the naming of them might influence our decision here.

I think extra params to the annotation might be the nicest option if you're okay with them not being the class names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, totally, do you think it should go in here, or a ticket to do later once we release? I think it would be a default-first sort of thing (with the default being the class name) just to avoid making it too verbose. We would also need to come up with a way to add that information without calling it api_reverse_link_fragment (or similar) to make it usecase-independent. Something like association_id I guess?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it should go in here, or a ticket to do later once we release

I'm happy to approve this PR without the feature in the name of iterative development, smaller PRs being faster to merge etc. Depends what you mean by release - i think it's okay to get the website working with this since it's only going to mean that i'll need to make changes to the export code once the client updates, but i think it's a high priority ticket before anyone else starts using the client, since it would be quite a big change to a lot of endpoints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@sherwoodf sherwoodf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sherwoodf sherwoodf merged commit 897014a into main Aug 20, 2024
30 of 33 checks passed
@sherwoodf sherwoodf deleted the api_reverse_links branch August 20, 2024 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants