Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eduid_user_cleaner skatteverket worker #314

Closed
wants to merge 62 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
62 commits
Select commit Hold shift + click to select a range
53d6da9
Init commit.
masv3971 Sep 28, 2022
1eddece
Update hashes.
masv3971 Sep 28, 2022
6fae0d5
First.
masv3971 Sep 30, 2022
4ab0f89
Add first layout.
masv3971 Oct 3, 2022
97cf1dc
Merge
masv3971 Nov 10, 2022
fae6838
Skeleton worker structure.
masv3971 Nov 14, 2022
a9d0d23
Merge masv_amapi-client
masv3971 Nov 18, 2022
38e02f4
Merge branch 'masv_amapi-client' into masv-cleaner-worker
masv3971 Nov 21, 2022
6ab8095
Add navet stuff to worker.
masv3971 Nov 22, 2022
12cff53
user cleaner barebone.
masv3971 Nov 29, 2022
76cb98a
Merge from masv_amapi-client.
masv3971 Nov 29, 2022
91f82cd
Add delay and change quota.
masv3971 Nov 30, 2022
9dd0d10
Rename file.
masv3971 Dec 1, 2022
ce5b3be
Add stats.
masv3971 Dec 5, 2022
4392f41
Add init files.
masv3971 Dec 5, 2022
a72be83
Cleand away unused folder.
masv3971 Dec 5, 2022
503bef0
Add dry run and change log to debug.
masv3971 Dec 6, 2022
486159e
Merge branch 'masv_amapi-client' into masv-cleaner-worker
masv3971 Dec 6, 2022
e62b015
Add teleadress and some spelling.
masv3971 Dec 6, 2022
90b0ae7
Add health check.
masv3971 Dec 6, 2022
949032c
Add words to vscode.
masv3971 Dec 7, 2022
ec76bd5
Merge from masv_amapi-client.
masv3971 Dec 12, 2022
5cc7f84
Add comments and general sleep method for faster kills of the app.
masv3971 Dec 12, 2022
297531a
Add words to dictionary.
masv3971 Dec 12, 2022
393ed71
Better syntax.
masv3971 Dec 12, 2022
e878033
Add milliseconds sleep.
masv3971 Dec 13, 2022
4638578
Merge from main.
masv3971 Dec 14, 2022
fac9091
Merge branch 'main' into masv-cleaner-worker
masv3971 Dec 19, 2022
dce92b8
untangle files: cleaner related files.
masv3971 Dec 23, 2022
bbc85a9
Merge from main.
masv3971 Jan 11, 2023
38ac3a8
Add wait, meta and cache.
masv3971 Jan 26, 2023
e7a02ad
Merge from main.
masv3971 Jan 26, 2023
a1a790e
Change auth url due to change.
masv3971 Jan 29, 2023
942f3a9
Merge branch 'main' into masv-cleaner-worker
masv3971 Jan 29, 2023
86aef2c
Lost method replaced.
masv3971 Jan 30, 2023
c7fa42f
Correct comment.
masv3971 Jan 30, 2023
ace25ce
Remove test cases from app, since they should not be in app testing s…
masv3971 Jan 30, 2023
d7d29b0
Add test and add working fastapi client mock.
masv3971 Jan 31, 2023
2df3f6b
Rename queue_user to cache_user in order to bring ORDER.
masv3971 Jan 31, 2023
f8ab129
Use TUserDbDocument instead of dict.
masv3971 Jan 31, 2023
6f18a76
Add routes for amapi_client mock.
masv3971 Jan 31, 2023
3879339
Since almost nothing load in app, it's more efficient to test in skv.
masv3971 Jan 31, 2023
d21f3b6
Fix typing issue.
masv3971 Jan 31, 2023
87fbf30
Add queue test.
masv3971 Jan 31, 2023
9bc8af6
Change to __future__ annotations.
masv3971 Feb 1, 2023
6062809
Change to use isoformat instead of strfrime.
masv3971 Feb 1, 2023
3b0072a
Rename queue to cache due to legacy.
masv3971 Feb 1, 2023
1a75778
Remove mock from self, since it's not used globaly.
masv3971 Feb 1, 2023
f270aaa
Add stat collection for each completed task.
masv3971 Feb 1, 2023
35897db
Remove dubble fnutts around Queue type, not necessary.
masv3971 Feb 1, 2023
2bf2ec7
Remove trivial method and apply its logic directly.
masv3971 Feb 1, 2023
4ad0778
Optimize db query.
masv3971 Feb 1, 2023
521432c
Use _get_all_docs() istead of _coll.find({}].
masv3971 Feb 1, 2023
ac5a6cf
Use BaseDB drop method.
masv3971 Feb 1, 2023
a94bb52
Use BaseDB method instead of _coll.
masv3971 Feb 1, 2023
e8f0a07
Replace int (unix epoch) with datetime.
masv3971 Feb 1, 2023
4378c29
Remove unused dep.
masv3971 Feb 1, 2023
37b1a71
Merge branch 'main' into masv-cleaner-worker
helylle Oct 3, 2023
67397b9
Merge branch 'main' into masv-cleaner-worker
helylle Oct 9, 2023
44ffed6
move function to common utils
helylle Oct 11, 2023
05b4e09
make reformat
helylle Oct 11, 2023
b36dbeb
make the worker startable
helylle Oct 13, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions .vscode/settings.json
Original file line number Diff line number Diff line change
Expand Up @@ -18,11 +18,13 @@
"cSpell.words": [
"accr",
"amapi",
"amdb",
"authlib",
"authns",
"baar",
"backchannel",
"backupdir",
"bson",
"checkpw",
"chpass",
"csrft",
Expand All @@ -33,36 +35,48 @@
"eppn",
"fastapi",
"Freja",
"gnap",
"hashname",
"hashpw",
"httpx",
"hubba",
"IDNIN",
"jsonify",
"jwks",
"klass",
"ladok",
"lundberg",
"masv",
"mischttp",
"nameid",
"navet",
"nins",
"NORDUnet",
"oidc",
"orcid",
"prid",
"proofings",
"proquint",
"PWAUTH",
"pydantic",
"pysaml2",
"reauthn",
"REFEDS",
"replicaset",
"respx",
"satosa",
"scim",
"SCIMAPI",
"sess",
"SIGTERM",
"skatteverket",
"Svenska",
"svipe",
"svipeid",
"swamid",
"swedenconnect",
"TELE",
"teleadress",
"Testsson",
"timeutil",
"Unmarshal",
Expand Down
1 change: 0 additions & 1 deletion requirements/test_requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2085,7 +2085,6 @@ zxcvbn==4.4.28 \
# via
# -c main.txt
# -r main.in

# The following packages are considered to be unsafe in a requirements file:
pip==23.2.1 \
--hash=sha256:7ccf472345f20d35bdc9d1841ff5f313260c2c33fe417f48c30ac46cccabf5be \
Expand Down
6 changes: 3 additions & 3 deletions src/eduid/common/clients/amapi_client/amapi_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,15 +20,15 @@


class AMAPIClient(GNAPClient):
def __init__(self, amapi_url: str, app, auth_data=GNAPClientAuthData, **kwargs):
super().__init__(auth_data=auth_data, app=app, **kwargs)
def __init__(self, amapi_url: str, auth_data=GNAPClientAuthData, verify_tls: bool = True, **kwargs):
super().__init__(auth_data=auth_data, verify=verify_tls, **kwargs)
self.amapi_url = amapi_url

def _users_base_url(self) -> str:
return urlappend(self.amapi_url, "users")

def _put(self, base_path: str, user: str, endpoint: str, body: Any) -> httpx.Response:
return self.put(urlappend(base_path, f"{user}/{endpoint}"), json=body.json())
return self.put(url=urlappend(base_path, f"{user}/{endpoint}"), content=body.json())

def update_user_email(self, user: str, body: UserUpdateEmailRequest) -> UserUpdateResponse:
ret = self._put(base_path=self._users_base_url(), user=user, endpoint="email", body=body)
Expand Down
91 changes: 82 additions & 9 deletions src/eduid/common/clients/amapi_client/testing.py
Original file line number Diff line number Diff line change
@@ -1,19 +1,92 @@
from typing import Optional
from typing import Mapping, Optional

import respx
from httpx import Request, Response

from eduid.common.clients.gnap_client.testing import MockedSyncAuthAPIMixin
from eduid.common.misc.timeutil import utc_now
from eduid.common.models.amapi_user import (
UserUpdateEmailRequest,
UserUpdateLanguageRequest,
UserUpdateNameRequest,
UserUpdatePhoneRequest,
)
from eduid.userdb.mail import MailAddressList
from eduid.userdb.phone import PhoneNumberList
from eduid.userdb.user import User
from eduid.userdb.userdb import AmDB


class MockedAMAPIMixin(MockedSyncAuthAPIMixin):
def start_mock_amapi(self, access_token_value: Optional[str] = None):
def start_mock_amapi(self, central_user_db: Optional[AmDB] = None, access_token_value: Optional[str] = None):
self.start_mock_auth_api(access_token_value=access_token_value)
self.central_user_db = central_user_db
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vet inte om mypy eller våra IDEer kommer att vara så imponerade av att vi sätter något på self här.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vscode brydde sig inte. Men tagit bort dom eftersom dom inte används utanför metoden.

mocked_users = respx.mock(base_url="http://localhost/amapi", assert_all_called=False)

self.mocked_users = respx.mock(base_url="http://localhost", assert_all_called=False)
put_users_name_route = self.mocked_users.put(
url="/users/hubba-bubba/name",
name="put_users_name_request",
mocked_users.put(url="/users/hubba-bubba/name", name="200_put_name").mock(
side_effect=self._side_effect_update_name
)
put_users_name_route.pass_through()
self.mocked_users.start()
self.addCleanup(self.mocked_users.stop) # type: ignore
mocked_users.put(url="/users/hubba-bubba/email", name="200_put_email").mock(
side_effect=self._side_effect_update_email
)
mocked_users.put(url="/users/hubba-bubba/language", name="200_put_language").mock(
side_effect=self._side_effect_update_language
)

mocked_users.start()
self.addCleanup(mocked_users.stop) # type: ignore

def _side_effect_update(self, func_name: str) -> User:
if self.central_user_db is None:
raise ValueError(f"side affect '{func_name}' was called but self.amdb is None")
return self.central_user_db.get_user_by_eppn("hubba-bubba")

def _side_effect_update_name(self, request: Request) -> Response:
db_user = self._side_effect_update(func_name="_side_effect_update_name")
assert self.central_user_db is not None
mock_request = UserUpdateNameRequest.parse_raw(request.content)

db_user.given_name = mock_request.given_name
db_user.surname = mock_request.surname
db_user.display_name = mock_request.display_name
self.central_user_db.save(user=db_user)
return Response(200, text='{"status": "true"}')

def _side_effect_update_email(self, request: Request) -> Response:
db_user = self._side_effect_update(func_name="_side_effect_update_email")
assert self.central_user_db is not None
mock_request = UserUpdateEmailRequest.parse_raw(request.content)

mail_addresses = [mail.to_dict() for mail in mock_request.mail_addresses]
db_user = self.central_user_db.get_user_by_eppn("hubba-bubba")
self.central_user_db.unverify_mail_aliases(user_id=db_user.user_id, mail_aliases=mail_addresses)

db_user.mail_addresses = MailAddressList(elements=mock_request.mail_addresses)
self.central_user_db.save(user=db_user)
return Response(200, text='{"status": "true"}')

def _side_effect_update_language(self, request: Request) -> Response:
db_user = self._side_effect_update(func_name="_side_effect_update_language")
assert self.central_user_db is not None
mock_request = UserUpdateLanguageRequest.parse_raw(request.content)

db_user.language = mock_request.language
self.central_user_db.save(user=db_user)
return Response(200, text='{"status": "true"}')

def _side_effect_update_phone(self, request: Request) -> Response:
db_user = self._side_effect_update(func_name="_side_effect_update_phone")
assert self.central_user_db is not None
mock_request = UserUpdatePhoneRequest.parse_raw(request.content)

phones = [phone.to_dict() for phone in mock_request.phone_numbers]
self.central_user_db.unverify_phones(user_id=db_user.user_id, phones=phones)
db_user.phone_numbers = PhoneNumberList(elements=mock_request.phone_numbers)
return Response(200, text='{"status": "true"}')

def _side_effect_update_terminate(self, request: Request) -> Response:
db_user = self._side_effect_update(func_name="_side_effect_update_terminate")
assert self.central_user_db is not None

db_user.terminated = utc_now()
return Response(200, text='{"status": "true"}')
5 changes: 3 additions & 2 deletions src/eduid/common/clients/gnap_client/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,8 @@ class GNAPClientException(Exception):


class GNAPClientAuthData(BaseModel):
authn_server_url: str
auth_server_url: str
auth_server_verify: bool = True
key_name: str
client_jwk: JWK
access: list[Union[str, Access]] = Field(default_factory=list)
Expand All @@ -51,7 +52,7 @@ class GNAPBearerTokenMixin(ABC):

@property
def transaction_uri(self) -> str:
return urlappend(self._auth_data.authn_server_url, "transaction")
return urlappend(self._auth_data.auth_server_url, "transaction")

def _create_grant_request_jws(self) -> str:
req = GrantRequest(
Expand Down
3 changes: 1 addition & 2 deletions src/eduid/common/clients/gnap_client/sync_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@ def __init__(self, auth_data: GNAPClientAuthData, **kwargs):

super().__init__(**kwargs)

self.verify = kwargs.get("verify", True)
self._auth_data = auth_data

@staticmethod
Expand All @@ -34,7 +33,7 @@ def _request_bearer_token(self) -> GrantResponse:
url=self.transaction_uri,
content=data,
headers={"Content-Type": "application/jose+json"},
verify=self.verify,
verify=self._auth_data.auth_server_verify,
)
resp.raise_for_status()
return GrantResponse.parse_raw(resp.text)
Expand Down
14 changes: 9 additions & 5 deletions src/eduid/common/rpc/msg_relay.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
# -*- coding: utf-8 -*-
import logging
from enum import Enum
from typing import Optional
Expand Down Expand Up @@ -123,17 +124,20 @@ def is_deregistered(self) -> bool:
return bool(self.deregistration_information.cause_code or self.deregistration_information.date)


class NavetData(NavetModelConfig):
case_information: CaseInformation = Field(alias="CaseInformation")
person: Person = Field(alias="Person")


# Used to parse data from get_postal_address
class FullPostalAddress(NavetModelConfig):
name: Name = Field(default_factory=Name, alias="Name")
official_address: OfficialAddress = Field(default_factory=OfficialAddress, alias="OfficialAddress")


class NavetData(NavetModelConfig):
case_information: CaseInformation = Field(alias="CaseInformation")
person: Person = Field(alias="Person")

def get_full_postal_address(self) -> FullPostalAddress:
return FullPostalAddress(name=self.person.name, official_address=self.person.postal_addresses)


class MsgRelay:
"""
This is the interface to the RPC task to fetch data from NAVET, and to send SMSs.
Expand Down
75 changes: 75 additions & 0 deletions src/eduid/common/utils.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,13 @@
__author__ = "lundberg"

import logging
from typing import List, Optional

from eduid.common.rpc.msg_relay import FullPostalAddress
from eduid.userdb.user import TUserSubclass

logger = logging.getLogger(__name__)


def urlappend(base: str, path: str) -> str:
"""
Expand Down Expand Up @@ -40,3 +48,70 @@ def removesuffix(s: str, suffix: str) -> str:
return s[: -len(suffix)]
else:
return s[:]


def get_marked_given_name(given_name: str, given_name_marking: Optional[str]) -> str:
"""
Given name marking denotes up to two given names, and is used to determine
which of the given names are to be primarily used in addressing a person.
For this purpose, the given_name_marking is two numbers:
indexing starting at 1
the second can be 0 for only one mark
hyphenated names are counted separately (i.e. Jan-Erik are two separate names)
If they are both marked they should be re-hyphenated

current version of documentation:
https://www.skatteverket.se/download/18.2cf1b5cd163796a5c8bf20e/1530691773712/AllmanBeskrivning.pdf

:param given_name: Given name
:param given_name_marking: Given name marking

:return: Marked given name (Tilltalsnamn)
"""
if not given_name_marking or "00" == given_name_marking:
return given_name

# cheating with indexing
_given_names: List[Optional[str]] = [None]
for name in given_name.split():
if "-" in name:
# hyphenated names are counted separately
_given_names.extend(name.split("-"))
else:
_given_names.append(name)

_optional_marked_names: List[Optional[str]] = []
for i in given_name_marking:
_optional_marked_names.append(_given_names[int(i)])
# remove None values
# i.e. 0 index and hyphenated names second part placeholder
_marked_names: List[str] = [name for name in _optional_marked_names if name is not None]
if "-".join(_marked_names) in given_name:
return "-".join(_marked_names)
else:
return " ".join(_marked_names)


def set_user_names_from_official_address(user: TUserSubclass, user_postal_address: FullPostalAddress) -> TUserSubclass:
"""
:param user: Proofing app private userdb user
:param user_postal_address: user postal address

:returns: User object
"""
user.given_name = user_postal_address.name.given_name
user.surname = user_postal_address.name.surname
if user.given_name is None or user.surname is None:
# please mypy
raise RuntimeError("No given name or surname found in proofing log user postal address")
given_name_marking = user_postal_address.name.given_name_marking
user.display_name = f"{user.given_name} {user.surname}"
if given_name_marking:
_given_name = get_marked_given_name(user.given_name, given_name_marking)
user.display_name = f"{_given_name} {user.surname}"
logger.info("User names set from official address")
logger.debug(
f"{user_postal_address.name} resulted in given_name: {user.given_name}, "
f"surname: {user.surname} and display_name: {user.display_name}"
)
return user
14 changes: 14 additions & 0 deletions src/eduid/userdb/tests/test_user_cleaner.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
from unittest import TestCase

from eduid.userdb.testing import MongoTemporaryInstance
from eduid.userdb.user_cleaner.cache import CacheUser
from eduid.userdb.user_cleaner.cachedb import CacheDB


class TestUserCleanerCache(TestCase):
def setUp(self):
self.tmp_db = MongoTemporaryInstance.get_instance()
self.user_cleaner_meta_db = CacheDB(db_uri=self.tmp_db.uri, collection="skv_cache")

def tearDown(self):
self.user_cleaner_meta_db._drop_whole_collection()
39 changes: 39 additions & 0 deletions src/eduid/userdb/user_cleaner/cache.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
from __future__ import annotations

from datetime import datetime
from typing import Any, Mapping, Optional

from pydantic import BaseModel, Field

from eduid.common.misc.timeutil import utc_now
from eduid.userdb.db import TUserDbDocument
from eduid.userdb.user import User


class CacheUser(BaseModel):
eppn: str
created_ts: datetime = Field(default_factory=utc_now)
next_run_ts: Optional[datetime] = None

def to_dict(self) -> TUserDbDocument:
"""
Convert Element to a dict in eduid format, that can be used to reconstruct the
Element later.
"""
data = self.dict(exclude_none=True)

return TUserDbDocument(data)

@classmethod
def from_dict(cls, data: Mapping[str, Any]) -> CacheUser:
"""Convert a dict to a Element object."""
return cls(**data)

def from_user(self, data: User) -> CacheUser:
"""
Convert a User object to a CacheUser object.
"""
cache_user = CacheUser(
eppn=data.eppn,
)
return cache_user
Loading