
Dataflow logs #88

Closed
wants to merge 10 commits into from

Conversation

@cisaacstern
Member Author

cisaacstern commented Jun 3, 2022

When run locally with a gcloud system login, the new API route now works in a very rudimentary fashion: the URL http://localhost:3000/api/dataflow/wordcount-example-0 returns something like:

[{"timestamp":"2022-06-01T19:23:27.133Z","severity":"INFO","textPayload":"Executing operation [1]: Write/Write/WriteImpl/FinalizeWrite/View-python_side_input1-[1]: Write/Write/WriteImpl/FinalizeWrite"}]

Next steps:

  1. Refine the API query (add time range and severity filters)
  2. Figure out gcloud authentication on Vercel so this works in production
  3. Build a frontend component to display API query results
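
For context, here is a minimal sketch of how an API route like the one above could query Cloud Logging with @google-cloud/logging. It is illustrative only, not necessarily how this PR implements the route; in particular, the resource type and job-name label in the filter are assumptions.

// pages/api/dataflow/[jobName].js — illustrative sketch, not the PR's actual code
import { Logging } from '@google-cloud/logging'

const logging = new Logging() // picks up local gcloud credentials when run locally

export default async function handler(req, res) {
  const { jobName, startTime, stopTime, severity } = req.query

  // Hypothetical Cloud Logging filter; the exact resource type and label
  // used to match a Dataflow job by name may differ in practice.
  const filter = [
    'resource.type="dataflow_step"',
    `resource.labels.job_name="${jobName}"`,
    startTime && `timestamp>=${startTime}`,
    stopTime && `timestamp<=${stopTime}`,
    severity && `severity=${severity}`,
  ]
    .filter(Boolean)
    .join(' AND ')

  const [entries] = await logging.getEntries({ filter, orderBy: 'timestamp desc' })

  res.status(200).json(
    entries.map((entry) => ({
      timestamp: entry.metadata.timestamp,
      severity: entry.metadata.severity,
      textPayload: entry.data, // string payload for text log entries
    }))
  )
}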

@cisaacstern
Member Author

The first item above is now implemented (albeit without graceful failure modes), so that

/* e.g., this works:
http://localhost:3000/api/dataflow/wordcount-example-0?
startTime="2022-06-01T19:23:26.133337225Z"
&stopTime="2022-06-01T19:23:29.133337225Z"
&severity=INFO
*/

produces

[{"timestamp":"2022-06-01T19:23:27.381Z","severity":"INFO","textPayload":"Executing operation [1]: Write/Write/WriteImpl/PreFinalize"},{"timestamp":"2022-06-01T19:23:27.233Z","severity":"INFO","textPayload":"Finished operation [1]: Write/Write/WriteImpl/PreFinalize/View-python_side_input1-[1]: Write/Write/WriteImpl/PreFinalize"},{"timestamp":"2022-06-01T19:23:27.187Z","severity":"INFO","textPayload":"Finished operation [1]: Write/Write/WriteImpl/FinalizeWrite/View-python_side_input1-[1]: Write/Write/WriteImpl/FinalizeWrite"},{"timestamp":"2022-06-01T19:23:27.167Z","severity":"INFO","textPayload":"Executing operation [1]: Write/Write/WriteImpl/PreFinalize/View-python_side_input1-[1]: Write/Write/WriteImpl/PreFinalize"},{"timestamp":"2022-06-01T19:23:27.133Z","severity":"INFO","textPayload":"Executing operation [1]: Write/Write/WriteImpl/FinalizeWrite/View-python_side_input1-[1]: Write/Write/WriteImpl/FinalizeWrite"},{"timestamp":"2022-06-01T19:23:26.984Z","severity":"INFO","textPayload":"Finished operation [1]: Write/Write/WriteImpl/GroupByKey/Read+[1]: Write/Write/WriteImpl/Extract"}]

Going to see about the frontend component now.

@cisaacstern
Member Author

A question I don't know the answer to: if a job fails due to a Python error (not a Dataflow issue), will the traceback be surfaced by a severity=ERROR query? Ideally, yes. Surfacing such tracebacks is the main value of this PR, so I'll test this shortly.

@cisaacstern
Member Author

I'm confused about why data here ends up as undefined:

const route = `/api/dataflow/${jobName}?startTime="${startTime}"&stopTime="${stopTime}"&severity="${severity}"`
console.log('\n\n ROUTE:', route, '\n\n')
const { data, error } = useSWR(route, jsonFetcher, options)
console.log('\n\n DATA:', data, '\n\n')

wait  - compiling /dataflow-logs (client and server)...
wait  - compiling...
event - compiled client and server successfully in 54 ms (237 modules)

 ROUTE: /api/dataflow/wordcount-example-0?startTime="2022-06-01T19:23:26.133337225Z"&stopTime="2022-06-01T19:23:29.133337225Z"&severity="INFO" 


 DATA: undefined 


 DATAFLOW: undefined 

This is despite the fact that the route returns the data shown in #88 (comment) when queried directly (by navigating to it in the browser).

@jhamman and/or @andersy005, I would appreciate your perspective on this next week. (As noted above, Vercel doesn't have the gcloud logging credentials set up yet, so for the moment this can only be reproduced locally.)

My overall impression: developing this stuff is really fun! I look forward to getting more competent at it.

@jhamman

jhamman commented Jun 6, 2022

Happy to take a look @cisaacstern. Two things I'd check right off the bat:

  1. Does data remain undefined, or is it just undefined initially? As you have likely seen elsewhere in the site, we protect against load delays by waiting to render data until it has been fetched (see the sketch after this list).
  2. Is the API call via SWR authenticating in the same way as the direct API route? Any hints from SWR or the API route?
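
For illustration, the guard pattern from point 1 looks roughly like this (component and fetcher names are placeholders, not the site's actual code):

// Sketch of the loading-guard pattern used with SWR.
import useSWR from 'swr'

const jsonFetcher = (url) => fetch(url).then((res) => res.json())

function DataflowLogs({ route }) {
  const { data, error } = useSWR(route, jsonFetcher)

  if (error) return <p>Failed to load logs.</p>
  // `data` is undefined until the first fetch resolves, so bail out here
  // instead of trying to render it.
  if (!data) return <p>Loading...</p>

  return <pre>{JSON.stringify(data, null, 2)}</pre>
}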

Suggested changes to lib/endpoints.js and pages/dataflow-logs.js (outdated, resolved).
Co-authored-by: Anderson Banihirwe <[email protected]>
@cisaacstern
Member Author

Thanks for the reflections, Joe, and for the suggestion, Anderson. I'll pull the suggestion locally and see how it runs.

@cisaacstern
Member Author

cisaacstern commented Jun 6, 2022

As you both suspected, it looks like the data was only undefined in the console.log output because the fetch is asynchronous; it resolves shortly afterward.

And then once it did load, I wasn't seeing it on the rendered page because I wasn't mapping it onto the frontend component correctly. Now on my local server I get:

[Screenshot: Dataflow logs rendered on the local dev server (Screen Shot 2022-06-06 at 9.42.41 AM)]

which was the goal!
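
The mapping fix amounts to something like the following sketch (the real component's markup and prop names will differ):

// Illustrative mapping of fetched log entries onto rendered elements.
function LogList({ entries }) {
  return (
    <ul>
      {entries.map((entry, i) => (
        <li key={i}>
          [{entry.severity}] {entry.timestamp}: {entry.textPayload}
        </li>
      ))}
    </ul>
  )
}

// e.g. rendered from the page component once the data has loaded:
// <LogList entries={data} />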

I'll probably leave off the Loading... option for now, because the data actually appears quickly enough that it seems unnecessary. Good to see how that's done, though.

Going to move forward with:

  • Probably adding some type of selector widget for refining the time extent and/or logging level
  • Running some actual recipe pipelines, so we can see what pangeo-forge-recipes logs look like (as opposed to these example logs)
  • Authentication for the Vercel environment
  • Surfacing Python module-level logging statements & tracebacks in these logs

@cisaacstern
Member Author

Just noting that recent experience with pangeo-forge/cesm-atm-025deg-feedstock#2 highlights that in order to be truly useful for production debugging, we need a resolution for #63. I'm currently aiming to incorporate some minimal solution to that into this PR.

@cisaacstern closed this Dec 7, 2023