-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[APM] Fix missing code_owner cases in Datadog #609
Comments
Interestingly, there's a property |
|
I've updated the ticket with a few notes (a query to find these traces, course assets middleware was fixed). Running
So probably we could get down to zero (and re-establish our monitor) if we:
On the |
A/C:
env:prod service:edx-edxapp-lms operation_name:django.request [email protected]_code:(4* OR 5*) -@code_owner_squad:*
@session_email_mismatch:*
(any custom attribute we always attach to the root span) to filter out false positives from Disabling NR APM causes trace concatenation in Datadog #692 (which causes us to attach the code owner attribute to the wrong span).@http.status_code:*
to filter out traces where an HTTP status code was never even attached.The current implementation of code-owner tagging in LMS uses
_get_module_from_request_path
to determine what Python module is handling the request. However, some requests fail to identify the module and fall back to_get_module_from_current_transaction
, which contains New Relic specific code. We'll need to find a way to make this work with Datadog.Identifying where fallbacks occur
code_owner_transaction_name
. (We only getcode_owner_path_error
if both path-based and transaction-based module lookups fail.)_get_module_from_request
.Known problematic views
Over the past month I observed that about 1% of requests in prod LMS had required a transaction-based module lookup, but only for a very small list of transaction types:
Occurrences in production, from most to least frequent, along with how to reproduce them:
openedx.core.djangoapps.contentserver.middleware:StaticContentServer
- Course assets should be served by a view rather than a middleware openedx/edx-platform#34702django.core.handlers.wsgi:WSGIHandler._get_response
200
responses, mostly under/api
, such as a301
from/api/discussion/v1/comments
lms.djangoapps.mobile_api.middleware:AppVersionUpgrade
426
s from/api/mobile/v2/users/.../course_enrollments
corsheaders.middleware:CorsMiddleware
200
s from URLs like/api/courseware/course/course-v1:...
,/api/grades/v1/gradebook/course-v1:.../bulk-updatehistory/
,/api/course_home/course_metadata/course-v1:ArmEducationX
(latter is possibly someone probing the API)openedx.core.djangoapps.safe_sessions.middleware:SafeSessionMiddleware
401
s and302
s from a collection of endpointsOther observations
code_owner_module like '%middleware%' and code_owner_transaction_name is null
-- requests that are handled by a middleware never have a request-path-based answer for code owner module.resolve
.Assorted notes
_get_module_from_request_path
could be improved, because I think theview_func
might come in different places for different requests. There is auth related test code that does this. (TODO: find code example)The text was updated successfully, but these errors were encountered: