Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend and simplify API for calculation of range-based rolling window offsets #17807

Open
wants to merge 25 commits into
base: branch-25.04
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
8fba685
New implementation of range-based window bound calculation
wence- Jan 23, 2025
b4f1ba0
Refactor to try and reduce/parallelise compile times
wence- Jan 24, 2025
7f75f8f
Better names for range-window column factories
wence- Jan 24, 2025
8148603
Simplify dispatch
wence- Jan 24, 2025
f5aa177
Reduce repetition in tests
wence- Jan 24, 2025
4cc530c
Explicitly write out ctor
wence- Jan 30, 2025
596cb3e
proclaim_return_type
wence- Jan 30, 2025
64aa4ec
Merge remote-tracking branch 'upstream/branch-25.04' into wence/fea/r…
wence- Feb 13, 2025
6898955
Use explicit template instantiation rather than forwarding functions
wence- Feb 13, 2025
7d04ec8
Move polyfill code to its use site
wence- Feb 13, 2025
0c5f162
Naming
wence- Feb 13, 2025
548171c
No need for an internal enum tag
wence- Feb 14, 2025
c0f9437
Merge remote-tracking branch 'upstream/branch-25.04' into wence/fea/r…
wence- Feb 24, 2025
bb09793
Fix stupid
wence- Feb 24, 2025
fec5788
Partial refactor following Vyas
wence- Feb 24, 2025
a3eab98
Grouping row info is named struct
wence- Feb 24, 2025
fb1af0f
Separate distance functors by window type
wence- Feb 24, 2025
2f7b5e2
More refactoring to reduce duplication
wence- Feb 25, 2025
4405187
Some docs
wence- Feb 25, 2025
107ec10
Tidy a bit more
wence- Feb 25, 2025
3cc2bb5
More doc fixes
wence- Feb 25, 2025
9db3613
Remove unnecessary files
wence- Feb 25, 2025
e1e535e
SFINAE for saturating add/sub
wence- Feb 25, 2025
de42bc2
Refactor a bit to reduce specialisations
wence- Feb 27, 2025
7d6d005
Use `static_cast`
wence- Mar 4, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cpp/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -642,6 +642,7 @@ add_library(
src/rolling/detail/rolling_fixed_window.cu
src/rolling/detail/rolling_variable_window.cu
src/rolling/grouped_rolling.cu
src/rolling/range_rolling.cu
src/rolling/range_window_bounds.cpp
src/rolling/rolling.cu
src/round/round.cu
Expand Down
63 changes: 62 additions & 1 deletion cpp/include/cudf/detail/rolling.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2021-2024, NVIDIA CORPORATION.
* Copyright (c) 2021-2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -29,6 +29,39 @@
namespace CUDF_EXPORT cudf {
namespace detail {

namespace rolling {
/**
* @brief Direction tag for a range-based rolling window.
*/
enum class direction : bool {
PRECEDING, ///< A preceding window.
FOLLOWING, ///< A following window.
};
/**
* @brief Wrapper for preprocessed group information from sorted group keys.
*/
struct preprocessed_group_info {
rmm::device_uvector<size_type> const& labels; ///< Mapping from row index to group label
rmm::device_uvector<size_type> const& offsets; ///< Mapping from group label to row offsets
rmm::device_uvector<size_type> const&
nulls_per_group; ///< Mapping from group label to null count in the group
};

} // namespace rolling

/**
* @brief Compute the number of nulls in each group.
*
* @param orderby Column with null mask.
* @param offsets Offset array defining the (sorted) groups.
* @param stream CUDA stream used for kernel launches
* @return device_uvector containing the null count per group.
*/
[[nodiscard]] rmm::device_uvector<cudf::size_type> nulls_per_group(
column_view const& orderby,
rmm::device_uvector<size_type> const& offsets,
rmm::cuda_stream_view stream);

/**
* @copydoc std::unique_ptr<column> rolling_window(
* column_view const& input,
Expand All @@ -48,5 +81,33 @@ std::unique_ptr<column> rolling_window(column_view const& input,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);

/**
* @brief Make a column representing the window offsets for a range-based window
*
* @param orderby Column use to define window ranges. If @p grouping is empty,
* must be sorted. If
* @p grouping is non-empty, must be sorted within each group. As well as
* being sorted, must be sorted consistently with the @p order and @p null_order
* parameters.
* @param grouping Optional preprocessed grouping information.
* @param order The sort order of the @p orderby column.
* @param null_order The sort order of nulls in the @p orderby column.
* @param window Descriptor specifying the window type.
* @param stream CUDA stream used for device memory operations and kernel
* launches.
* @param mr Device memory resource used for allocations.
* @return Column representing the window offsets as requested, suitable for passing to
* `rolling_window`.
*/
[[nodiscard]] std::unique_ptr<column> make_range_window(
column_view const& orderby,
std::optional<rolling::preprocessed_group_info> const& grouping,
rolling::direction direction,
order order,
null_order null_order,
range_window_type window,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);

} // namespace detail
} // namespace CUDF_EXPORT cudf
118 changes: 118 additions & 0 deletions cpp/include/cudf/rolling.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,15 @@
#include <cudf/aggregation.hpp>
#include <cudf/rolling/range_window_bounds.hpp>
#include <cudf/types.hpp>
#include <cudf/utilities/default_stream.hpp>
#include <cudf/utilities/export.hpp>
#include <cudf/utilities/memory_resource.hpp>

#include <rmm/resource_ref.hpp>

#include <memory>
#include <optional>
#include <variant>

namespace CUDF_EXPORT cudf {
/**
Expand All @@ -31,6 +36,92 @@ namespace CUDF_EXPORT cudf {
* @file
*/

/**
* @brief Strongly typed wrapper for bounded closed rolling windows.
*
* The endpoints of this window are included.
*/
struct bounded_closed {
cudf::scalar const& delta_; ///< Delta from the current row in the window. Must be valid,
///< behaviour is undefined if not.
/**
* @brief Return pointer to the row delta scalar.
* @return pointer to scalar, not null.
*/
cudf::scalar const* delta() const noexcept { return &delta_; }
};

/**
* @brief Strongly typed wrapper for bounded open rolling windows.
*
* The endpoints of this window are excluded.
*/
struct bounded_open {
cudf::scalar const& delta_; ///< Delta from the current row in the window. Must be valid,
///< behaviour is undefined if not.
/**
* @brief Return pointer to the row delta scalar.
* @return pointer to scalar, not null.
*/
cudf::scalar const* delta() const noexcept { return &delta_; }
};

/**
* @brief Strongly typed wrapper for unbounded rolling windows.
*
* This window runs to the begin/end of the current row's group.
*/
struct unbounded {
/**
* @brief Return a null row delta
* @return nullptr
*/
constexpr cudf::scalar const* delta() const noexcept { return nullptr; }
};
/**
* @brief Strongly typed wrapper for current_row rolling windows.
*
* This window contains all rows that are equal to the current row.
*/
struct current_row {
/**
* @brief Return a null row delta
* @return nullptr
*/
constexpr cudf::scalar const* delta() const noexcept { return nullptr; }
};

/**
* @brief The type of the range-based rolling window endpoint.
*/
using range_window_type = std::variant<unbounded, current_row, bounded_closed, bounded_open>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm more familiar with what the code is doing, is the main reason that we're using a variant here that the bounded windows have a window size (delta) while the other two do not? If so, would things be simpler if we switched to a single class + enum? I guess it would be a little awkward to have to deal with having a delta and not in others.

Perhaps the better question in the long run is, is there a way for us to unify range_window_type and range_window_bounds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want range_window_bounds to go away


/**
* @brief Constructs preceding and following columns given window range specifications.
*
* @param group_keys Possibly empty table of sorted keys defining groups.
* @param orderby Column defining window ranges. Must be sorted. If `group_keys` is non-empty, must
* be sorted groupwise.
* @param order Sort order of the `orderby` column.
* @param null_order Null sort order in the sorted `orderby` column. Apples groupwise if
* `group_keys` is non-empty.
* @param preceding Type of the preceding window.
* @param following Type of the following window.
* @param stream CUDA stream used for device memory operations and kernel launches
* @param mr Device memory resource used to allocate the returned column's device memory
* @return pair of preceding and following columns that define the window bounds for each row,
* suitable for passing to `rolling_window`.
*/
std::pair<std::unique_ptr<column>, std::unique_ptr<column>> make_range_windows(
table_view const& group_keys,
column_view const& orderby,
order order,
null_order null_order,
range_window_type preceding,
range_window_type following,
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref());

/**
* @brief Applies a fixed-size rolling window function to the values in a column.
*
Expand Down Expand Up @@ -443,6 +534,33 @@ std::unique_ptr<column> grouped_range_rolling_window(
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref());

/**
* @brief Apply a grouping-aware range-based rolling window function to a sequence of columns.
*
* @param group_keys Possibly empty table of sorted keys defining groups.
* @param orderby Column defining window ranges. Must be sorted. If `group_keys` is non-empty, must
* be sorted groupwise.
* @param order Sort order of the `orderby` column.
* @param null_order Null sort order in the sorted `orderby` column.
* @param preceding Type of the preceding window.
* @param following Type of the following window.
* @param min_periods Minimum number of observations in the window required to have a value.
* @param requests Vector of pairs of columns and aggregation requests.
* @param stream CUDA stream used for device memory operations and kernel launches
* @param mr Device memory resource used to allocate the returned column's device memory
* @return A table of results, one column per input request.
*/
std::unique_ptr<table> grouped_range_rolling_window(
table_view const& group_keys,
column_view const& orderby,
order order,
null_order null_order,
range_window_type preceding,
range_window_type following,
size_type min_periods,
std::vector<std::pair<column_view const&, rolling_aggregation const&>> requests,
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref());
Comment on lines +553 to +563
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet added tests of this function, want to bikeshed the interface first.

For example, should the min_periods be part of the request, do we want a rolling_request object similar to the groupby_request we have for grouped aggregations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t feel like I know the status quo of our rolling APIs or the downstream requirements well enough to opine on this without significant research time. I might be able to circle back to this but it’ll be at least a week.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the thing to do for now is to remove this API and then when using the other stuff from python we might notice things we want.

/**
* @brief Applies a variable-size rolling window function to the values in a column.
*
Expand Down
143 changes: 0 additions & 143 deletions cpp/src/rolling/detail/range_comparator_utils.cuh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are now no longer required.

This file was deleted.

Loading
Loading