ENG-6733 fix data for ‘modified’ gitlab attribute is not returned from WaterButler service on sorting attempt issue #414

Open
wants to merge 1 commit into base: develop

mkovalua

Ticket

https://openscience.atlassian.net/browse/ENG-6733

Purpose

When file data is rendered on the OSF frontend, the WaterButler service does not return the ‘modified’ attribute for GitLab files, so sorting by that attribute is not available.

Changes

Implemented async GitLab API calls to https://gitlab.com/api/v4/projects/{}/commits?path= to gather the modified datetime. The related changes for this issue are in CenterForOpenScience/osf.io#10956
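As a rough illustration of the concurrent calls (a minimal sketch, not the code in this PR), the per-file lookups can be gathered with asyncio. `gather_modified`, `fetch_latest` and `limit` are hypothetical names; `fetch_latest` stands in for whatever coroutine performs the HTTP request to the commits endpoint.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional

# Hypothetical callable: given a file path, returns the ISO datetime of the
# latest commit touching that path (or None). In practice it would wrap the
# HTTP request to the commits endpoint.
FetchLatest = Callable[[str], Awaitable[Optional[str]]]


async def gather_modified(
    paths: Iterable[str], fetch_latest: FetchLatest, limit: int = 10
) -> Dict[str, Optional[str]]:
    """Look up the last-modified datetime for every path concurrently."""
    semaphore = asyncio.Semaphore(limit)  # cap the number of in-flight requests

    async def one(path: str) -> Optional[str]:
        async with semaphore:
            return await fetch_latest(path)

    path_list = list(paths)
    # gather() preserves input order, so results line up with path_list.
    results = await asyncio.gather(*(one(p) for p in path_list))
    return dict(zip(path_list, results))
```

The semaphore is an assumption about how one might avoid sending too many requests at once for large folders; the limit value is arbitrary.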

Side effects

This requires an additional API call for each folder item to get its datetime, because the https://gitlab.com/api/v4/projects/{}/repository/tree?ref= API call does not return a last-modified date.
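To make the cost concrete, here is a small sketch of how one commits request per tree item could be built. It assumes the `repository/commits` path and the `path`, `ref_name` and `per_page` parameters from GitLab's documented Commits API, which may differ from the abbreviated URL quoted above; the helper names are hypothetical.

```python
from typing import Dict, List
from urllib.parse import quote, urlencode


def latest_commit_url(base_url: str, project_id, path: str, ref: str) -> str:
    """Build the URL that asks for only the newest commit touching `path`."""
    # per_page=1 keeps the response small: only the latest commit is needed.
    query = urlencode({"path": path, "ref_name": ref, "per_page": 1})
    project = quote(str(project_id), safe="")  # "group/repo" -> "group%2Frepo"
    return f"{base_url}/projects/{project}/repository/commits?{query}"


def commit_urls_for_tree(
    base_url: str, project_id, tree_items: List[Dict], ref: str
) -> List[str]:
    """One extra request per tree item, which is the side effect described above."""
    return [
        latest_commit_url(base_url, project_id, item["path"], ref)
        for item in tree_items
    ]
```

A folder with N items therefore costs N additional requests on top of the single tree request.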


A GraphQL approach could also be investigated, to make fewer API calls and extract only the needed GitLab data. I tried it and did not see a time difference between the calls, and so far I have not found an API call that combines both https://gitlab.com/api/v4/projects/{}/repository/tree?ref= and https://gitlab.com/api/v4/projects/{}/commits?path= and returns only the needed data.


I also thought about how to improve performance. For example, we could avoid the additional calls if the last commit saved in the OSF ORM is the same as the one in WaterButler, but I did not find a Repository item in the OSF ORM (database) that stores the last commit hash and its related items as an attribute.

Commit hashes are saved into _history = DateTimeAwareJSONField(default=list, blank=True) for files in BaseFileNode, which makes it harder to achieve good performance:

  1. determine all files related to one repository
  2. get the last commit in _history for each file
  3. compare datetimes
  4. get the specific commit hash
  5. check whether that commit hash is the latest using the GitLab API (if yes, no additional calls to https://gitlab.com/api/v4/projects/{}/commits?path= are needed; otherwise make the calls to update the datetime).
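The steps above could be sketched roughly as follows. This is only an illustration of the caching idea: the shape of the stored history entries (the `sha` and `modified` keys) and the function names are assumptions, not the actual layout of _history in OSF.

```python
from datetime import datetime
from typing import Dict, List, Optional


def latest_stored_sha(histories: Dict[str, List[dict]]) -> Optional[str]:
    """Steps 2-4: find the newest stored entry across all files of a repository.

    `histories` maps a file path to its stored history entries. Each entry is
    assumed (hypothetically) to look like {"sha": ..., "modified": <ISO datetime>}.
    """
    entries = [entry for file_history in histories.values() for entry in file_history]
    if not entries:
        return None
    newest = max(entries, key=lambda e: datetime.fromisoformat(e["modified"]))
    return newest["sha"]


def needs_refresh(histories: Dict[str, List[dict]], remote_head_sha: str) -> bool:
    """Step 5: only make the per-file commit calls when the stored hash is stale."""
    return latest_stored_sha(histories) != remote_head_sha
```

Here `remote_head_sha` would come from a single GitLab request for the latest commit of the branch, so an unchanged repository costs one call instead of one per file.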

The aim of these changes is to get the datetime. We need to consider whether it is needed at all and, if so, how to speed it up, for example with the approach I mentioned above (adding caching) or any other ideas.

Although it was not the purpose of this ticket, I suppose the created time of a file may also be needed in other workflows.
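If both values were wanted, they could in principle be derived from the same commit data. The sketch below assumes the full list of commits for one path has already been fetched (all pages, not only the newest commit) and that each commit carries a `committed_date` ISO string, as in GitLab's commit responses; `created_and_modified` is a hypothetical helper.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def created_and_modified(
    commits: List[dict],
) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return (created, modified) as the oldest and newest commit datetimes."""
    dates = [datetime.fromisoformat(c["committed_date"]) for c in commits]
    if not dates:
        return None, None
    # Timezone-aware datetimes compare correctly across different UTC offsets.
    return min(dates), max(dates)
```

Fetching every commit for a path is more expensive than asking for only the latest one, so the created time would add to the performance concern described above.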

QA Notes

Deployment Notes

…m WaterButler service on sorting attempt issue