Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service cleanup #184

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Service cleanup #184

wants to merge 3 commits into from

Conversation

romain-intel
Copy link
Contributor

No description provided.

tags:
- Auth
produces:
- text/plain
- application/json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you verified that this change wouldn't affect the sandbox? We have a similar issue where the heartbeat endpoint returns an actual JSON but is tagged as text/plain which has caused issues in the metaflow codebase before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to rework things and push a PR to make it so that everything works on application/json. I will also check with Ferras if there was a reason for the string only implementation.

@@ -78,10 +77,12 @@ async def get_artifact(self, request):
required: true
type: "string"
produces:
- text/plain
- application/json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the comment above.

romain-intel added a commit that referenced this pull request May 28, 2021
Comment on lines +318 to +322
run = await self._db.get_run_ids(flow_name, run_number)
task = await self._db.get_task_ids(flow_name, run_number,
step_name, task_id)
if run.response_code != 200 or task.response_code != 200:
return DBResponse(400, {"message": "need to register run_id and task_id first"})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more generic thought and not a blocker for merging the cleanup, but is there a reason why this kind of integrity check is not covered by a foreign key constraint in the table?

also affects the metadata create handler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is covered I think; the issue is that it used to be silent (ie: the db would raise the error but it could silently make it through.

return http_500(str(err))
# either use provided traceback from subprocess, or generate trace from current process
err_trace = getattr(err, 'traceback_str', None) or get_traceback_str()
print(err_trace)
Copy link
Collaborator

@saikonen saikonen May 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary print or leftover from some testing? If this is for logging purposes, consider using the logger from services.utils instead.

If this is for generic logging over the process, it should probably be set as a logging handler with loop.set_exception_handler(some_error_handling_function) in the server setup phase instead. You can see https://github.com/Netflix/metaflow-service/blob/ui/services/ui_backend_service/ui_server.py#L98 for an example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use the logger. I did want to log things to aid in debugging after.

Comment on lines -21 to -29
async def read_body(request_content):
byte_array = bytearray()
while not request_content.at_eof():
data = await request_content.read(4)
byte_array.extend(data)

return json.loads(byte_array.decode("utf-8"))


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there some issues encountered with using await response.json() or why was this previously necessary? Concerned about some possible edge cases that motivated the previous implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check with Ferras for the why of the previous implementation but I felt it would be better to use the standard implementation instead of doing our own.

@@ -139,5 +139,5 @@ async def get_authorization_token(self, request):

return web.Response(status=200, body=json.dumps(credentials))
except Exception as ex:
body = {"err_msg": str(ex), "traceback": get_traceback_str()}
body = {"message": str(ex), "traceback": get_traceback_str()}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the expected body for the authorization route? the http_500 helper only has a detail field for the error message. Consider keeping it consistent with the helper output if possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to make them all consistent. Still working on it. After I pushed this, I realized I missed af ew things.

body = await request.text()
except:
body = '<no body>'
print("Error caused when %s %s with query %s and body %s" %
Copy link
Collaborator

@saikonen saikonen May 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also instead be using the logger here to adhere to configured log levels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will change to a logger here and elsewhere. I will try to get the proper one (I couldn't find it initially).

run_number, run_id = await self._db.get_run_ids(flow_name, run_number)
run = await self._db.get_run_ids(flow_name, run_number)
if run.response_code != 200:
return DBResponse(400, {"message": "need to register run_id first"})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Steps have a foreign key for runs at least in the V3 schema, so consider handling error code 409 specifically for this instead?

Is it necessary to rewrite the error messages on the response handler level, or could this be handled at a higher level in aiopg_exception_handling instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am reworking this and yes, plan to raise an exception all the way through. The issue before though was that it would be silent.

Comment on lines +185 to +186
if run.response_code != 200:
return DBResponse(400, {"message": "need to register run_id and task_id first"})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as for steps. Tasks in the V3 schema also have a foreign key covering this, so consider handling the specific error instead.

@romain-intel
Copy link
Contributor Author

Thanks for all the comments. I am going to keep working on this. After I pushed it out, I realized it was still not as clean as I wanted it to be and that I would need something on the Metaflow side to talk properly to an actual REST service.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants