Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise Simpson Desert tests #896

Merged
merged 9 commits into from
Nov 2, 2023
Merged

Revise Simpson Desert tests #896

merged 9 commits into from
Nov 2, 2023

Conversation

abyrd
Copy link
Member

@abyrd abyrd commented Oct 18, 2023

These tests were using Mercator gridded destinations that did not align with the street intersections or transit stops. This added a somewhat unpredictable walking delay to the end of each itinerary. This also caused the tests to fail when changing the representative points of grid pixels where travel times are measured. This PR or similar changes are necessary to make tests pass on PR #894: ideally this PR should be merged first before ones changing anything about Mercator grids or travel time sample points.

While I was working on this I took care of some other cleanup items for these tests. These are the changes:

  • Use single destination instead of Mercator grid.
  • Expand and add Javadoc comments.
  • Add better measure of distribution goodness-of-fit.
  • Apply percentile check and goodness-of-fit in every test.
  • Increase default Monte Carlo draws to get smoother histograms.
  • Add Javadoc caveats on clone() overrides.
  • Adapt and document test task builder accordingly.

This PR removes the use of web Mercator grids as destinations, concentrating testing on transit routing. To the extent that we also want the Simpson Desert tests to serve as integration tests of the whole routing system, we will eventually want some tests to use gridded destinations, test that grid cell / pixel sample points are properly situated and introducing the correct amount of walk time. But that should be done in a controlled way, not by inducing unpredictable extra walk times on the end of these more precise tests.

Use single destination instead of Mercator grid.
Expand and add Javadoc comments.
Add better measure of distribution goodness-of-fit.
Apply percentile check and goodness-of-fit in every test.
Increase default Monte Carlo draws to get smoother histograms.
Add Javadoc caveats on clone() overrides.
Adapt and document test task builder accordingly.
the predicted distribution for the multi-frequency test needs improvement
also remove unused opportunity density grids
Copy link
Member

@ansoncfit ansoncfit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to have the linking time removed from consideration.

Separate concern to consider outside this PR: I noticed some pre-existing code (at least in Distribution) has 120 hard-coded as the number of cutoff minutes. If our standard two-hour cutoff might ever be changed, we might want to use a consistent symbolic constant (see also maxTripDurationMinutes)

// 10 minutes wait, 10 minutes ride, giving 31 to 51 minutes.
// This estimation logic could be better codified as something like TravelTimeEstimate.waitWithHeadaway(20) etc.

// TODO For some reason observed is off by 1 minute, figure out why.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is board slack being applied at the second (transfer) boarding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not expect it to be applied, as the purpose of slack is to allow at least N minutes (currently set to 1) between arrival at the transit stop and departure of the vehicle, not to add 1 minute to every boarding time. In a case where the transfer target vehicle always departs 10 minutes after the passenger arrives at the stop, the fixed 10 minute wait exceeds the 1 minute slack, so I would expect the total wait to always be 10 minutes. However FastRaptorWorker#BOARD_SLACK_SECONDS seems to only be used in the multi-criteria RAPTOR code, and there's another constant FastRaptorWorker#MINIMUM_BOARD_WAIT_SEC which serves a very similar purpose and comments advising that these two be merged. Uses of this MINIMUM_BOARD_WAIT_SEC imply that it would just search for any departure more than one minute later (so finding one 10 minutes later rather than 11). I will keep digging into this though to see if there's anywhere we'd just be adding the slack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved some constants and updated some comments to make things a little clearer. Board slack (actually MINIMUM_BOARD_WAIT_SEC) seems to work as I expected, establishing a minimum wait but not always introducing an additional wait. My best guess now for the source of the extra minute is some kind of edge effect due to the fact that we're binning times into one-minute bins, and something may be pushing the travel time over the edge into the next bin. I'll have to look at the travel time in seconds instead of minutes to confirm this.

@abyrd
Copy link
Member Author

abyrd commented Oct 27, 2023

While debugging a one-minute (board slack?) discrepancy, I noticed that OneOriginResult.traveltimes has 5252 values even though there's only one destination point. Need to investigate.

@trevorgerhardt
Copy link
Member

Comments seem to indicate this PR is still being worked on. Is this PR still in progress or ready for a review? @abyrd

@abyrd
Copy link
Member Author

abyrd commented Nov 2, 2023

Comments seem to indicate this PR is still being worked on. Is this PR still in progress or ready for a review? @abyrd

I am taking a look at the OneOriginResult.traveltimes with 5252 values because I think it's the only thing really blocking this change, and this PR is needed before we can switch to pixel centers. Hoping to have it cleared up shortly.

The two-minute board slack (?) thing is not a regression, just an unclear explanation of why exactly the travel times are what they are. I haven't yet figured out the origin of this discrepancy but I think it has to do with things being binned into 1-minute bins. I am comfortable leaving it as-is until we have a full explanation.

abyrd and others added 2 commits November 2, 2023 19:24
Includes assertions to validate assumptions and ensure non-testing
behavior is unchanged.

This could later be used to generally allow any kind of PointSet as the
destinations in travel time tasks.
@abyrd abyrd enabled auto-merge November 2, 2023 11:58
Copy link
Member

@ansoncfit ansoncfit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@abyrd abyrd merged commit 7970690 into dev Nov 2, 2023
3 checks passed
@abyrd abyrd deleted the revise-simpson-desert branch November 2, 2023 13:19
abyrd added a commit that referenced this pull request Nov 3, 2023
Addresses #907, follow up to #896.
Regional tasks already allow freeform destinations.
Removed special code path required to do this with single point tasks.
Comments added to better explain the constraints and expectations around
single point and regional tasks, and how destinations are validated.
Assertions added pending better general validation on TravelTimeReducer.
Geographic dimensions and/or number of target points should be checked.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants