This repository has been archived by the owner on Nov 7, 2024. It is now read-only.

Add transforms for girder items and folders #37

Open
wants to merge 1 commit into
base: master
63 changes: 49 additions & 14 deletions girder_worker_utils/transforms/girder_io.py
@@ -1,6 +1,8 @@
+import abc
 import mimetypes
 import os
 import shutil
+import six
 import tempfile

 from girder_client import GirderClient
@@ -9,6 +11,7 @@
 from ..transform import ResultTransform, Transform


+@six.add_metaclass(abc.ABCMeta)
 class GirderClientTransform(Transform):
     def __init__(self, *args, **kwargs):
         gc = kwargs.pop('gc', None)
@@ -37,36 +40,68 @@ def __init__(self, *args, **kwargs):
             self.gc = None


+@six.add_metaclass(abc.ABCMeta)
 class GirderClientResultTransform(ResultTransform, GirderClientTransform):
     pass


-class GirderFileId(GirderClientTransform):
+@six.add_metaclass(abc.ABCMeta)
+class _GirderResourceDownload(GirderClientTransform):
+    def __init__(self, _id, **kwargs):
+        super(_GirderResourceDownload, self).__init__(**kwargs)
+        self._id = _id
+        self._local_path = None
+
+    def _repr_model_(self):
+        return "{}('{}')".format(self.__class__.__name__, self._id)
+
+    @property
+    def local_path(self):
+        if self._local_path is None:
+            self._local_path = os.path.join(tempfile.mkdtemp(), str(self._id))
Member

One of the virtues of downloading an item via the old Girder worker methods was that the files kept their original names from the item (with some sanitization). The drawback is that two files with the same name would conflict, but the benefit is that a task that uses a group of files in an item sees the names it expects. For a single file, this is usually just the loss of the extension, but when downloading a whole item, the names of files relative to each other are more useful. It would be better to use the original file name (projects like gaia would otherwise need to reimplement this).

Member Author

@zachmullen Oct 24, 2018

In the case of an item with multiple files, the files will still retain their names from the server, and will be siblings on the filesystem: https://github.com/girder/girder/blob/master/clients/python/girder_client/__init__.py#L1294-L1309

However, I do think it's a bug that a single file download loses its name. Placing each resource inside a folder by its ID is a good feature because it prevents collision based on name, but we don't want to lose either the name or the file extension.

I think I'd prefer to fix that problem on a future PR, though, since it was not introduced by this branch.

Contributor

Is this something that we can let the calling context decide, by adding an optional file_name kwarg to __init__(...) and just using the ID if it's not passed?

Member Author

That was my plan.
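
A minimal sketch of that suggestion, not code from this PR. `build_local_path` is a hypothetical standalone helper (in the PR this logic would live in `_GirderResourceDownload.local_path`), and `file_name` is the assumed optional argument. It keeps one temporary directory per download and falls back to the ID when no name is given.

```python
import os
import tempfile


def build_local_path(_id, file_name=None):
    """Return a path inside a fresh temp directory for one downloaded resource.

    Hypothetical helper: `file_name` is the optional name discussed above.
    """
    # Each call gets its own temp directory, so equal names cannot collide.
    base_dir = tempfile.mkdtemp()
    # basename() drops any directory components from a caller-supplied name.
    name = os.path.basename(file_name) if file_name else ''
    # Fall back to the resource ID when no usable name was given.
    return os.path.join(base_dir, name or str(_id))
```

With this shape, callers that care about the extension pass the server-side name, and existing callers keep the ID-based path unchanged.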

+        return self._local_path
+
+    def cleanup(self):
+        shutil.rmtree(os.path.dirname(self.local_path), ignore_errors=True)
Contributor

This is going to remove the downloaded folder or file, but not the tempdir that was created.
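
One way to check what this call removes is a small standard-library experiment that mirrors the path layout of `local_path` (a fresh `tempfile.mkdtemp()` directory with the resource stored under its ID). The ID and file contents here are made up, and no Girder client is involved.

```python
import os
import shutil
import tempfile

# Mirror local_path: <fresh temp dir>/<resource id>. The ID is a placeholder.
temp_dir = tempfile.mkdtemp()
local_path = os.path.join(temp_dir, 'example_id')

# Stand in for the download by writing a small file at local_path.
with open(local_path, 'w') as f:
    f.write('placeholder contents')

# Same call as cleanup(): remove the parent directory of local_path.
shutil.rmtree(os.path.dirname(local_path), ignore_errors=True)

file_still_exists = os.path.exists(local_path)
temp_dir_still_exists = os.path.exists(temp_dir)
print(file_still_exists, temp_dir_still_exists)
```

In this layout `os.path.dirname(local_path)` is the directory that `mkdtemp()` returned, so both flags print as `False`.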



+class GirderFileId(_GirderResourceDownload):
     """
-    This transform downloads a Girder File to the local machine and passes its
+    This transform downloads a Girder file to the local machine and passes its
     local path into the function.

     :param _id: The ID of the file to download.
     :type _id: str
     """
-    def __init__(self, _id, **kwargs):
-        super(GirderFileId, self).__init__(**kwargs)
-        self.file_id = _id
+    def transform(self):
+        self.gc.downloadFile(self._id, self.local_path)
+        return self.local_path

-    def _repr_model_(self):
-        return "{}('{}')".format(self.__class__.__name__, self.file_id)

+class GirderFolderId(_GirderResourceDownload):
+    """
+    This transform downloads a Girder folder to the local machine and passes its
+    local path into the function.
+
+    :param _id: The ID of the folder to download.
+    :type _id: str
+    """
     def transform(self):
-        self.file_path = os.path.join(
-            tempfile.mkdtemp(), '{}'.format(self.file_id))
+        self.gc.downloadFolderRecursive(self._id, self.local_path)
+        return self.local_path

-        self.gc.downloadFile(self.file_id, self.file_path)

-        return self.file_path
+class GirderItemId(_GirderResourceDownload):
+    """
+    This transform downloads a Girder item to the local machine and passes its
+    local path into the function.

-    def cleanup(self):
-        shutil.rmtree(os.path.dirname(self.file_path),
-                      ignore_errors=True)
+    :param _id: The ID of the item to download.
+    :type _id: str
+    """
+    def transform(self):
+        self.gc.downloadItem(self._id, self.local_path)
+        return self.local_path


 class GirderItemMetadata(GirderClientTransform):