Add filters in DID search #40

Open
wants to merge 32 commits into
base: master
32 commits
7401c83
Add metadata filter interface
May 17, 2024
c7e0499
Fix invalid style references
May 17, 2024
f38f165
Fix invalid reference
May 29, 2024
a6fa766
Add filtering functionality
Jul 23, 2024
f8de872
Improve UI: alignment, responsiveness, spacing, icon sizes
Jul 29, 2024
975b74d
Merge branch 'master' of github.com:GeorgySk/jupyterlab-extension int…
Aug 20, 2024
a8a26b4
Fix version in python code
Aug 21, 2024
f93bb0a
Fix ESLint errors
Aug 21, 2024
71713dc
Move TS tests to standard location
Aug 21, 2024
d37e533
Fix routes to TS libraries
Aug 21, 2024
1cf8e34
Remove dummy test
Aug 21, 2024
e0b0141
Add minimal test on MetadataFilterItem rendering
Aug 21, 2024
28f53f1
Add empty filters argument in DID search tests
Aug 21, 2024
35b0864
Move metadata file container into an independent module
Aug 22, 2024
59e4200
Move MetadataFilterContainer into a separate module and add tests on it
Aug 22, 2024
3280f7b
Add class names
Aug 23, 2024
e865c04
Add test on the whole ExploreTab
Aug 23, 2024
53dba0a
Add -u flag to jest call to fix test github action
Aug 23, 2024
2c29635
Fix ESlint errors
Aug 23, 2024
c690410
Merge pull request #1 from ftorradeflot/master_georgy
GeorgySk Sep 5, 2024
7e270af
Add quotes to filter keys and values
Sep 10, 2024
09f6e82
Remove unnecessary `json.dumps` for filters
Sep 12, 2024
9f74e33
Fix DID search not working with filters
Sep 12, 2024
909bf20
Make selectors change colors when switching themes
Sep 12, 2024
0619140
Remove comma, ES lint check
Dec 16, 2024
389df01
inherit jupyterlab ci/cd actions
Dec 17, 2024
721d24d
Fix test workflow
Dec 17, 2024
72a3e45
Add back custom setup step
Dec 17, 2024
4097970
Add firefox setup step
Dec 17, 2024
fe40d40
Force browser to firefox
Dec 17, 2024
ced8b56
Adapt build & publish workflow
Dec 17, 2024
8b82e00
Merge pull request #2 from ftorradeflot/master_georgy
GeorgySk Dec 18, 2024
29 changes: 0 additions & 29 deletions .github/actions/setup/action.yml
@@ -2,35 +2,6 @@ name: "Setup"
runs:
using: "composite"
steps:
- name: Install node
uses: actions/setup-node@v4
with:
node-version: '18.x'
- name: Install Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
architecture: 'x64'
- name: Setup pip cache
uses: actions/cache@v2
with:
path: ~/.cache/pip
key: pip-3.11-${{ hashFiles('package.json') }}
restore-keys: |
pip-3.11-
pip-
- name: Get yarn cache directory path
id: yarn-cache-dir-path
run: echo "::set-output name=dir::$(yarn cache dir)"
shell: bash
- name: Setup yarn cache
uses: actions/cache@v2
id: yarn-cache
with:
path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
key: yarn-${{ hashFiles('**/yarn.lock') }}
restore-keys: |
yarn-
- name: Install Python dependencies
run: python -m pip install -r requirements.txt
shell: bash
2 changes: 1 addition & 1 deletion .github/actions/test/action.yml
@@ -6,7 +6,7 @@ runs:
run: jlpm run eslint:check
shell: bash
- name: Run Jest
run: jlpm jest
run: jlpm jest -u
shell: bash
- name: Run Pytest
run: pytest rucio_jupyterlab/tests/
28 changes: 16 additions & 12 deletions .github/workflows/build-and-publish-tagged.yml
@@ -5,24 +5,28 @@ on:
tags: 'v*'

jobs:
build:
name: Build and Publish to PyPI

Test:
uses: ./.github/workflows/test.yml

Build:
needs: [Test]
runs-on: ubuntu-latest
environment:
name: pypi
url: https://pypi.org/p/rucio-jupyterlab
permissions:
id-token: write
steps:
- uses: actions/checkout@v3
- uses: ./.github/actions/setup
- uses: ./.github/actions/test
- uses: ./.github/actions/build-ext
- uses: ./.github/actions/post-test
- name: Checkout
uses: actions/checkout@v4

- name: Base Setup
uses: jupyterlab/maintainer-tools/.github/actions/base-setup@v1

- name: Custom Setup
uses: ./.github/actions/setup

- name: Build sdist
run: |
pip install build
python -m build --sdist

- name: Publish distribution to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
36 changes: 29 additions & 7 deletions .github/workflows/test.yml
@@ -8,12 +8,34 @@ on:
workflow_dispatch:

jobs:
build:
runs-on: ubuntu-latest
Test:
name: Test
strategy:
matrix:
python-version: ["3.9", "3.11", "3.12"]
node-version: ["18.x", "20.x"]
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- uses: ./.github/actions/setup
- uses: ./.github/actions/test
- uses: ./.github/actions/build-ext
- uses: ./.github/actions/post-test
- name: Checkout
uses: actions/checkout@v4

- name: Base Setup
uses: jupyterlab/maintainer-tools/.github/actions/base-setup@v1

- name: Setup firefox
uses: browser-actions/setup-firefox@latest

- name: Custom Setup
uses: ./.github/actions/setup

- name: Test
uses: ./.github/actions/test

- name: Build
uses: ./.github/actions/build-ext

- name: Post Test
env:
JLAB_BROWSER_TYPE: firefox
uses: ./.github/actions/post-test
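
The `strategy.matrix` in this workflow makes GitHub Actions run the `Test` job once per combination of the listed values. A small Python sketch of that expansion, using only the two lists from the workflow (the variable names are illustrative and not part of the PR):

```python
from itertools import product

# Values copied from the matrix in .github/workflows/test.yml.
python_versions = ["3.9", "3.11", "3.12"]
node_versions = ["18.x", "20.x"]

# One job per (python-version, node-version) pair.
jobs = [
    {"python-version": py, "node-version": node}
    for py, node in product(python_versions, node_versions)
]

for job in jobs:
    print(job)
```

This yields six jobs. In the hunk as shown, no step references `matrix.python-version` or `matrix.node-version`, so it may be worth checking that the setup steps actually consume these values.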

4 changes: 4 additions & 0 deletions .gitignore
@@ -123,3 +123,7 @@ dmypy.json

# Yarn cache
.yarn/

# jest files
src/__tests__/__snapshots__/
junit.xml
4 changes: 2 additions & 2 deletions jest.config.js
@@ -23,6 +23,6 @@ module.exports = {
'!src/**/.ipynb_checkpoints/*'
],
coverageReporters: ['lcov', 'text'],
testRegex: 'src/.*/.*.spec.ts[x]?$',
testRegex: 'src/__tests__/.*\.tsx?$',
transformIgnorePatterns: [`/node_modules/(?!${esModules}).+`]
};
};
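
The new `testRegex` restricts Jest to files under `src/__tests__/`. Below is a rough Python approximation of which paths it selects. It assumes the intended pattern is `src/__tests__/.*\.tsx?$` and that the pattern is searched for anywhere in each file path; the sample paths are made up for illustration.

```python
import re

# Assumed intended pattern from jest.config.js (see the note above).
TEST_REGEX = re.compile(r"src/__tests__/.*\.tsx?$")

# Hypothetical file paths, not taken from the repository.
sample_paths = [
    "/repo/src/__tests__/ExploreTab.spec.tsx",
    "/repo/src/__tests__/utils.ts",
    "/repo/src/components/Foo.spec.tsx",
    "/repo/src/__tests__/__snapshots__/ExploreTab.spec.tsx.snap",
]

matched = [path for path in sample_paths if TEST_REGEX.search(path)]

for path in matched:
    print(path)
```

Only the first two paths match. Unlike the previous pattern, the new one does not require a `.spec` suffix, so any `.ts` or `.tsx` helper placed in `src/__tests__/` would also be picked up as a test file.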
2 changes: 2 additions & 0 deletions package.json
@@ -102,6 +102,7 @@
"@types/mocha": "^10.0.6",
"@types/react": "^18.0.26",
"@types/react-addons-linked-state-mixin": "^0.14.22",
"@types/react-test-renderer": "^18.3.0",
"@typescript-eslint/eslint-plugin": "^7.2.0",
"@typescript-eslint/parser": "^7.2.0",
"css-loader": "^6.7.1",
@@ -113,6 +114,7 @@
"mkdirp": "^3.0.1",
"npm-run-all": "^4.1.5",
"prettier": "^3.0.0",
"react-test-renderer": "^18.3.1",
"rimraf": "^5.0.1",
"source-map-loader": "^1.0.2",
"style-loader": "^3.3.1",
2 changes: 1 addition & 1 deletion rucio_jupyterlab/_version.py
@@ -1,4 +1,4 @@
# This file is auto-generated by Hatchling. As such, do not:
# - modify
# - track in version control e.g. be sure to add to .gitignore
__version__ = VERSION = '0.10.0'
__version__ = VERSION = '1.0.0'
13 changes: 10 additions & 3 deletions rucio_jupyterlab/handlers/did_search.py
@@ -28,14 +28,20 @@ def __init__(self, namespace, rucio):
self.rucio = rucio
self.db = get_db() # pylint: disable=invalid-name

def search_did(self, scope, name, search_type, limit):
def search_did(self, scope, name, search_type, filters, limit):
wildcard_enabled = self.rucio.instance_config.get('wildcard_enabled', False)

if ('*' in name or '%' in name) and not wildcard_enabled:
raise WildcardDisallowedException()

dids = self.rucio.search_did(scope, name, search_type, limit)
dids = self.rucio.search_did(scope, name, search_type, filters, limit)

for did in dids:
if did['did_type'] is None: # JSON plugin was used lacking data
metadata = self.rucio.get_metadata(scope, did['name'])[0]
did['did_type'] = f"DIDType.{metadata['did_type']}"
did['bytes'] = metadata['bytes']
did['length'] = metadata['length']

def mapper(entry, _):
return {
@@ -54,13 +60,14 @@ def get(self):
namespace = self.get_query_argument('namespace')
search_type = self.get_query_argument('type', 'collection')
did = self.get_query_argument('did')
filters = self.get_query_argument('filters', default=None)
rucio = self.rucio.for_instance(namespace)

(scope, name) = did.split(':')
handler = DIDSearchHandlerImpl(namespace, rucio)

try:
dids = handler.search_did(scope, name, search_type, ROW_LIMIT)
dids = handler.search_did(scope, name, search_type, filters, ROW_LIMIT)
self.finish(json.dumps(dids))
except RucioAuthenticationException:
self.set_status(401)
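
The loop added to `search_did` fills in `did_type`, `bytes` and `length` for entries where the search result lacks them. A minimal sketch of that behaviour follows; `FakeRucio` and `fill_missing_fields` are hypothetical names used only for this illustration, and the metadata values are invented.

```python
class FakeRucio:
    """Hypothetical stand-in for the Rucio client, for illustration only."""

    def get_metadata(self, scope, name):
        # Invented metadata record; a real client would query the server.
        return [{"did_type": "DATASET", "bytes": 2048, "length": 3}]


def fill_missing_fields(rucio, scope, dids):
    """Mirror the loop in search_did: patch entries whose did_type is None."""
    for did in dids:
        if did["did_type"] is None:
            metadata = rucio.get_metadata(scope, did["name"])[0]
            did["did_type"] = f"DIDType.{metadata['did_type']}"
            did["bytes"] = metadata["bytes"]
            did["length"] = metadata["length"]
    return dids


dids = [
    {"name": "ds1", "did_type": "DIDType.CONTAINER", "bytes": 10, "length": 1},
    {"name": "ds2", "did_type": None, "bytes": None, "length": None},
]
patched = fill_missing_fields(FakeRucio(), "user.test", dids)
print(patched)
```

Note that this pattern issues one extra metadata request per incomplete entry, which may matter for searches returning many DIDs.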
151 changes: 148 additions & 3 deletions rucio_jupyterlab/rucio/rucio.py
@@ -8,6 +8,7 @@
# - Muhammad Aditya Hilmy, <[email protected]>, 2020

import logging
import re
import time
import json
from urllib.parse import urlencode, quote
@@ -16,6 +17,129 @@
from .authenticators import RucioAuthenticationException, authenticate_userpass, authenticate_x509, authenticate_oidc


def parse_did_filter_from_string_fe(input_string, name='*', type='collection', omit_name=False):
"""
Parse DID filter string for the filter engine (fe).

Should adhere to the following conventions:
- ';' represents the logical OR operator
- ',' represents the logical AND operator
- all operators belong to set of (<=, >=, ==, !=, >, <, =)
- there should be no duplicate key+operator criteria.

One sided and compound inequalities are supported.

Sanity checking of input is left to the filter engine.

:param input_string: String containing the filter options.
:param name: DID name.
:param type: The type of the did: all(container, dataset, file), collection(dataset or container), dataset, container.
:param omit_name: omit addition of name to filters.
:return: list of dictionaries with each dictionary as a separate OR expression.
"""
# lookup table unifying all comprehended operators to a nominal suffix.
# note that the order matters as the regex engine is eager, e.g. don't want to evaluate '<=' as '<' and '='.
operators_suffix_LUT = dict({
'<=': 'lte',
'>=': 'gte',
'==': '',
'!=': 'ne',
'>': 'gt',
'<': 'lt',
'=': ''
})

# lookup table mapping operator opposites, used to reverse compound inequalities.
operator_opposites_LUT = {
'lt': 'gt',
'lte': 'gte'
}
operator_opposites_LUT.update({op2: op1 for op1, op2 in operator_opposites_LUT.items()})

filters = []
if input_string:
or_groups = list(filter(None, input_string.split(';'))) # split <input_string> into OR clauses
for or_group in or_groups:
or_group = or_group.strip()
and_groups = list(filter(None, or_group.split(','))) # split <or_group> into AND clauses
and_group_filters = {}
for and_group in and_groups:
and_group = and_group.strip()
# tokenise this AND clause using operators as delimiters.
tokenisation_regex = "({})".format('|'.join(operators_suffix_LUT.keys()))
and_group_split_by_operator = list(filter(None, re.split(tokenisation_regex, and_group)))
if len(and_group_split_by_operator) == 3: # this is a one-sided inequality or expression
key, operator, value = [token.strip() for token in and_group_split_by_operator]

# substitute input operator with the nominal operator defined by the LUT, <operators_suffix_LUT>.
operator_mapped = operators_suffix_LUT.get(operator)

filter_key_full = key = "'{}'".format(key)
if operator_mapped is not None:
if operator_mapped:
filter_key_full = "{}.{}".format(key, operator_mapped)
else:
raise ValueError("{} operator not understood.".format(operator))

if filter_key_full in and_group_filters:
raise ValueError(filter_key_full)
else:
if not is_numeric(value):
value = "'{}'".format(value)
and_group_filters[filter_key_full] = value
elif len(and_group_split_by_operator) == 5: # this is a compound inequality
value1, operator1, key, operator2, value2 = [token.strip() for token in and_group_split_by_operator]

# substitute input operator with the nominal operator defined by the LUT, <operators_suffix_LUT>.
operator1_mapped = operator_opposites_LUT.get(operators_suffix_LUT.get(operator1))
operator2_mapped = operators_suffix_LUT.get(operator2)

key = "'{}'".format(key)
filter_key1_full = filter_key2_full = key
if operator1_mapped is not None and operator2_mapped is not None:
if operator1_mapped: # ignore '' operator (maps from equals)
filter_key1_full = "{}.{}".format(key, operator1_mapped)
if operator2_mapped: # ignore '' operator (maps from equals)
filter_key2_full = "{}.{}".format(key, operator2_mapped)
else:
raise ValueError("{} operator not understood.".format(operator1 if operator1_mapped is None else operator2))

if filter_key1_full in and_group_filters:
raise ValueError(filter_key1_full)
else:
if not is_numeric(value1):
value1 = "'{}'".format(value1)
and_group_filters[filter_key1_full] = value1
if filter_key2_full in and_group_filters:
raise ValueError(filter_key2_full)
else:
if not is_numeric(value2):
value2 = "'{}'".format(value2)
and_group_filters[filter_key2_full] = value2
else:
raise ValueError(and_group)

# add name key to each AND clause if it hasn't already been populated from the filter and <omit_name> not set.
if not omit_name and 'name' not in and_group_filters:
and_group_filters['name'] = name

filters.append(and_group_filters)
else:
if not omit_name:
filters.append({
'name': name
})
return filters, type


def is_numeric(value):
try:
float(value)
return True
except ValueError:
return False
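
The conventions in the `parse_did_filter_from_string_fe` docstring can be illustrated with a much smaller parser. The sketch below is a simplification written for this note, not the PR's implementation: it assumes the operators listed in the docstring (`<=`, `>=`, `==`, `!=`, `>`, `<`, `=`), and it skips the quoting, the automatic `name` key and the duplicate-key check. The helper names `tokenise`, `clause_to_filter` and `parse` are hypothetical.

```python
import re

# Multi-character operators come first so '<=' is not split into '<' and '='.
OPERATOR_SUFFIX = {
    "<=": "lte", ">=": "gte", "==": "", "!=": "ne",
    ">": "gt", "<": "lt", "=": "",
}
# Used to flip the left-hand operator of a compound inequality.
OPPOSITE = {"lt": "gt", "gt": "lt", "lte": "gte", "gte": "lte"}
TOKEN_RE = re.compile("({})".format("|".join(OPERATOR_SUFFIX)))


def tokenise(clause):
    """Split one AND clause into tokens, keeping the operators."""
    return [token.strip() for token in TOKEN_RE.split(clause) if token]


def clause_to_filter(clause):
    """Turn 'key op value' or 'value op key op value' into a dict."""
    tokens = tokenise(clause)
    if len(tokens) == 3:  # one-sided expression
        key, op, value = tokens
        pairs = [(key, OPERATOR_SUFFIX[op], value)]
    elif len(tokens) == 5:  # compound inequality
        value1, op1, key, op2, value2 = tokens
        suffix1 = OPPOSITE.get(OPERATOR_SUFFIX[op1])
        if suffix1 is None:
            raise ValueError(clause)
        pairs = [(key, suffix1, value1), (key, OPERATOR_SUFFIX[op2], value2)]
    else:
        raise ValueError(clause)
    return {f"{k}.{s}" if s else k: v for k, s, v in pairs}


def parse(filter_string):
    """';' separates OR groups, ',' separates AND clauses within a group."""
    or_groups = []
    for group in filter_string.split(";"):
        and_filters = {}
        for clause in group.split(","):
            if clause.strip():
                and_filters.update(clause_to_filter(clause))
        if and_filters:
            or_groups.append(and_filters)
    return or_groups


print(parse("run = 42, 1 < length <= 5; site != CERN"))
```

Each dictionary in the returned list is one OR branch. In the compound case, `1 < length` is stored as `length.gt`, because the operator is reversed when the value sits on the left of the key.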


class RucioAPI:
rucio_auth_token_cache = dict()

@@ -55,16 +179,20 @@ def get_rses(self, rse_expression=None):

return results

def search_did(self, scope, name, search_type='collection', limit=None):
def search_did(self, scope, name, search_type='collection', filters=None, limit=None):
token = self._get_auth_token()
headers = {'X-Rucio-Auth-Token': token}

scope = quote(scope)
urlencoded_params = urlencode({
params = {
'type': search_type,
'long': '1',
'name': name
})
}
if filters:
filters, _ = parse_did_filter_from_string_fe(filters, name=name)
params['filters'] = filters
urlencoded_params = urlencode(params)

response = requests.get(url=f'{self.base_url}/dids/{scope}/dids/search?{urlencoded_params}', headers=headers, verify=self.rucio_ca_cert)
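
Because the parsed filters are passed to `urlencode` as a list of dictionaries, it helps to see what actually ends up in the query string. The following sketch uses a hypothetical filter value shaped like the parser's output; the DID name is invented.

```python
import ast
from urllib.parse import parse_qs, urlencode

# Hypothetical parsed filters, shaped like the parser's output.
filters = [{"'length'.gte": "10", "name": "data*"}]

params = {"type": "collection", "long": "1", "name": "data*", "filters": filters}
query = urlencode(params)

# urlencode() calls str() on non-string values, so the server receives the
# percent-encoded Python repr of the list.
decoded = parse_qs(query)["filters"][0]
print(decoded)

# The repr can be read back as a Python literal.
round_tripped = ast.literal_eval(decoded)
```

The value sent is therefore a Python literal rather than JSON, which is presumably what the receiving endpoint expects given that the `json.dumps` call was removed in an earlier commit of this PR.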

@@ -80,6 +208,23 @@ def search_did(self, scope, name, search_type='collection', limit=None):

return results

def get_metadata(self, scope, name):
token = self._get_auth_token()
headers = {'X-Rucio-Auth-Token': token}

scope = quote(scope)
name = quote(name)

response = requests.get(url=f'{self.base_url}/dids/{scope}/{name}/meta', headers=headers, verify=self.rucio_ca_cert)

if response.text == '':
return []

lines = response.text.rstrip('\n').splitlines()
results = [json.loads(l) for l in lines]

return results
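
`get_metadata` treats the response body as newline-delimited JSON: one JSON document per line, with an empty body meaning no results. A standalone sketch of that parsing step, using an invented response body and a hypothetical helper name `parse_json_lines`:

```python
import json


def parse_json_lines(text):
    """Parse a newline-delimited JSON body the way get_metadata does."""
    if text == "":
        return []
    return [json.loads(line) for line in text.rstrip("\n").splitlines()]


# Hypothetical response body containing a single metadata record.
body = '{"did_type": "DATASET", "bytes": 2048, "length": 3}\n'
records = parse_json_lines(body)
print(records)
```

The caller in `did_search.py` indexes the result with `[0]`, so an empty body would raise an `IndexError` there unless it is handled separately.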

def get_files(self, scope, name):
token = self._get_auth_token()
headers = {'X-Rucio-Auth-Token': token}