Improve performance for Trip Segmentation Stage #1041
Spent a week and a half figuring out the pipeline flow (including the trip segmentation flow), including from a discussion with @MukuFlash03, where he gave Issue #950 as a good way to understand the flow of the prediction pipeline. We'll currently focus on the trip segmentation stage.
The very first point of improvement I could find is related to the negation (invalidation) of out-of-order data. This is currently done one ID at a time, which can be inefficient, especially with a large list. Instead, we can build the update operations up front and then issue a single bulk write. To handle too big a list, we can split it into batches and run bulk operations on those batches; a sketch of this is below.
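A minimal sketch of the batched invalidation, assuming a pymongo collection `ts_collection`, a list of out-of-order ids `out_of_order_ids`, and an illustrative `$set` payload (these names are mine, not the actual e-mission identifiers):

```python
from pymongo import UpdateOne

def invalidate_out_of_order(ts_collection, out_of_order_ids, batch_size=1000):
    # Build one UpdateOne per id instead of issuing a separate update call
    # per id, then submit them in batches via bulk_write.
    ops = [UpdateOne({"_id": oid},
                     {"$set": {"invalid": True}})  # illustrative invalidation field
           for oid in out_of_order_ids]
    for start in range(0, len(ops), batch_size):
        ts_collection.bulk_write(ops[start:start + batch_size], ordered=False)
```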
To test the above, I flipped (reversed) the input dataframe, thus allowing us to exercise the out-of-order part of the code. Another thing I tried was experimenting with the scale of the data by appending the same dataframe to the end of itself n_fold times, where n_fold is the number of times you want to upscale the dataframe. After this I reversed the dataframe as above (a rough sketch of this setup is after the next paragraph). Results were as below:
I did try the batch method (with bulk_write) as well, using different batch sizes (500, 1000, 2000), but there was no improvement compared to a single bulk_write over the whole list.
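For reference, a minimal sketch of the upscale-and-reverse setup used for these tests, assuming the location dataframe is called `loc_df` (my naming, not the original code's):

```python
import pandas as pd

def upscale_and_reverse(loc_df, n_fold):
    # Replicate the dataframe n_fold times to simulate a larger input...
    scaled = pd.concat([loc_df] * n_fold, ignore_index=True)
    # ...then reverse the rows so the entries arrive out of order.
    return scaled.iloc[::-1].reset_index(drop=True)
```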
There are also two other dependencies on this code path. For example, in the issue here: #950.
Looking into the Android run time for now. Out of this, almost all the time (~6.774s) is spent running the loop. Inside the loop, each iteration took ~0.0043s to run the first part of the loop body, and another ~0.011s to run the rest of the loop body. Meaning that the second part of the loop is currently taking the most time.
In this part, there was a db call (fetching the transitions used by `is_tracking_restarted_in_range`) which was run in every iteration. Instead, we can place this upstream, outside the loop. Once we have the df, since all the rows are sorted by timestamp, each iteration only needs to filter the in-memory dataframe for its time range. Similar changes apply to `get_ongoing_motion_in_range`: move the motion query upstream, extract it into a df, and replace the per-iteration db call with a dataframe filter. With these changes the runtime came down to ~0.0001s (from ~0.005s) for these checks, so all in all the second part of the for loop that took ~0.011s came down to ~0.0006s.
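A minimal sketch of the idea, with illustrative names (the actual signatures in `restart_checking.py` and the timeseries API may differ): fetch the transition dataframe once, upstream of the loop, and answer each per-iteration question with an in-memory filter.

```python
import pandas as pd

# One DB call for the whole segmentation window, done upstream of the loop.
# `timeseries.get_data_df` and the "statemachine/transition" key are assumptions
# about the timeseries abstraction, not a verified call site.
def fetch_transition_df(timeseries, time_query):
    return timeseries.get_data_df("statemachine/transition", time_query)

# Each loop iteration then filters the dataframe in memory. The real check
# also inspects the transition types; that detail is omitted here.
def is_tracking_restarted_in_range(start_ts, end_ts, transition_df):
    in_range = transition_df[(transition_df.ts >= start_ts) &
                             (transition_df.ts <= end_ts)]
    return len(in_range) > 0
```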
e-mission/e-mission-server@1d1b31f handles these changes. This was bound to fail.
And with these changes inside the loop, I tried some pandas improvements, which might be overkill (in which case we can roll this back), and which reduced the overall runtime from ~2.12s to ~1.5s. For this, I started by adding a vectorised implementation in `dwell_segmentation_time_filter.py` and in `common.py` (so that `calDistance` accepts numpy arrays).
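As a rough sketch of what "calDistance supports numpy arrays" could look like, here is a haversine-style great-circle distance over array inputs, assuming points are [longitude, latitude] pairs (the actual function in `common.py` may differ in signature and argument order). This is what lets `last10PointsDistances` / `last5MinsPoints` be computed in one vectorised call:

```python
import numpy as np

EARTH_RADIUS_M = 6371000  # mean earth radius in metres

def calDistance(point1, point2):
    # point1 / point2 can each be a single [lon, lat] pair or an (N, 2) array
    # of pairs; numpy broadcasting computes N distances in one call.
    p1 = np.asarray(point1, dtype=float)
    p2 = np.asarray(point2, dtype=float)
    dLat = np.radians(p1[..., 1] - p2[..., 1])
    dLon = np.radians(p1[..., 0] - p2[..., 0])
    a = (np.sin(dLat / 2) ** 2 +
         np.cos(np.radians(p1[..., 1])) * np.cos(np.radians(p2[..., 1])) *
         np.sin(dLon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

# Example: distances from the last 10 points to the current point in one shot
# last10PointsDistances = calDistance(
#     last_10_points_df[["longitude", "latitude"]].values,
#     np.array([curr_lon, curr_lat]))
```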
For the iOS wrapper, I made the necessary changes to support the generalized flow of trip segmentation, and that brought the iOS runtime to ~0.3s. However, when running the combinedWrapper (iOS + Android), one of the db calls doesn't work when placed upstream. For now, on dummy data, keeping this additional call inside the loop increases Android runtime from ~1.5s to ~1.6s and iOS from ~0.3s to ~0.4s. Let's try to figure out why the upstream call isn't working.
Also, the current CombinedWrapper runtime is ~2.1s.
On the side, after yesterday's PR, I was expecting all the tests to pass. Figured from the logs that
The changes below that led to these performance upgrades are investigated in e-mission/e-mission-docs#1041. They are:
1. The db calls for the transition and motion dataframes are moved upstream, from the `is_tracking_restarted_in_range` and `get_ongoing_motion_in_range` functions in `restart_checking.py` to `trip_segmentation.py`. The old setup issued multiple db calls (one per iteration); in the improved setup they happen once.
2. All the other changes in `trip_segmentation.py` and `dwell_segmentation_dist_filter.py` are just to support the change in point 1 above.
3. In `dwell_segmentation_time_filter.py`, other than the changes to support point 1 above, there is an additional improvement: the calculations for `last10PointsDistances` and `last5MinsPoints` are vectorised. For this, `calDistance` in `common.py` now supports numpy arrays.
This brought the runtime for this entire loop down from ~2s to ~1.2s over 327 iterations.
Instead of invalidating one out-of-order id (ooid) from the list at a time, use UpdateOne and bulk_write to invalidate the entire list. This is supported by the findings here: e-mission/e-mission-docs#1041 (comment)
The changes below that led to these performance upgrades are investigated in e-mission/e-mission-docs#1041. They are:
1. The db calls for the transition and motion dataframes are moved upstream, from the `is_tracking_restarted_in_range` and `get_ongoing_motion_in_range` functions in `restart_checking.py` to `trip_segmentation.py`. The old setup issued multiple db calls (one per iteration); in the improved setup they happen once.
2. All the other changes in `trip_segmentation.py`, `dwell_segmentation_dist_filter.py` and `dwell_segmentation_time_filter.py` are just to support the change in point 1 above.
I think we should revisit this from first principles and then validate that what Satyam came up with aligns with what you find.
we can instrument in two ways:
Pro of the logs: it is quick and easy to see results as you are making changes. I would suggest additional logs initially so you can fix it and potentially generate some stats around improvements on pulled data and do lots of analysis; then we deploy to production, and then we can run analyses (before and after) on multiple deployments of various sizes.
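A minimal sketch of the log-based instrumentation, assuming we simply wrap the suspected-slow sections with timing statements (the section name in the message is a placeholder):

```python
import logging
import time

# Wrap a suspected-slow section with timing logs so that before/after
# comparisons can be pulled out of the pipeline logs later.
section_start = time.time()
# ... run the section being measured, e.g. the per-iteration restart check ...
logging.debug("segmentation: restart check took %.4f s",
              time.time() - section_start)
```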
Aim: To improve the performance of the trip segmentation stage of the pipeline by reducing the number of DB calls and performing more in-memory operations (potentially using pandas).