From 9021941263e48c9f5515929679feb4a190522096 Mon Sep 17 00:00:00 2001
From: Anton <100830759+antonwolfy@users.noreply.github.com>
Date: Thu, 23 May 2024 11:28:16 +0200
Subject: [PATCH 01/49] Updated CHANGELOG.md for 0.15.0 (#1846)

* Updated CHANGELOG.md for 0.15.0

* Update CHANGELOG.md

Co-authored-by: Natalia Polina

* Update CHANGELOG.md

Co-authored-by: Natalia Polina

* Update CHANGELOG.md

Co-authored-by: Natalia Polina

---------

Co-authored-by: Natalia Polina
---
 CHANGELOG.md | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d8c5261cc1b..9e2bc27d4e1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,7 +4,75 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [0.14.0] - MM/DD/2024
+## [0.15.0] - 05/DD/2024
+
+This release completes the implementation of the `dpnp.linalg` module and array creation routines, and adds cumulative reductions and histogram functions.
+
+### Added
+
+* Implemented `dpnp.frombuffer`, `dpnp.fromfile` and `dpnp.fromstring` functions [#1727](https://github.com/IntelPython/dpnp/pull/1727)
+* Implemented `dpnp.fromfunction`, `dpnp.fromiter` and `dpnp.loadtxt` functions [#1728](https://github.com/IntelPython/dpnp/pull/1728)
+* Added implementation of `dpnp.linalg.pinv` function [#1704](https://github.com/IntelPython/dpnp/pull/1704)
+* Added implementation of `dpnp.linalg.eigvalsh` function [#1714](https://github.com/IntelPython/dpnp/pull/1714)
+* Added implementation of `dpnp.linalg.tensorinv` function [#1752](https://github.com/IntelPython/dpnp/pull/1752)
+* Added implementation of `dpnp.linalg.tensorsolve` function [#1753](https://github.com/IntelPython/dpnp/pull/1753)
+* Added implementation of `dpnp.linalg.lstsq` function [#1792](https://github.com/IntelPython/dpnp/pull/1792)
+* Added implementation of `dpnp.einsum` and `dpnp.einsum_path` functions [#1779](https://github.com/IntelPython/dpnp/pull/1779)
+* Added implementation of `dpnp.histogram` function [#1785](https://github.com/IntelPython/dpnp/pull/1785)
+* Added implementation of `dpnp.histogram_bin_edges` function [#1823](https://github.com/IntelPython/dpnp/pull/1823)
+* Extended pre-commit hooks with `pylint` configuration [#1718](https://github.com/IntelPython/dpnp/pull/1718)
+* Extended pre-commit hooks with `codespell` configuration [#1798](https://github.com/IntelPython/dpnp/pull/1798)
+* Added a Security policy page [#1730](https://github.com/IntelPython/dpnp/pull/1730)
+* Implemented `nin` and `nout` properties for `dpnp` elementwise functions [#1712](https://github.com/IntelPython/dpnp/pull/1712)
+* Implemented `outer` method for `dpnp` elementwise functions [#1813](https://github.com/IntelPython/dpnp/pull/1813)
+
+### Changed
+
+* Added support for more data types and dimensions of input arrays, and for all keyword arguments, in `dpnp.cross` function [#1715](https://github.com/IntelPython/dpnp/pull/1715)
+* Added support for more data types and dimensions of the input array, and for all keyword arguments, in `dpnp.linalg.matrix_rank` function [#1717](https://github.com/IntelPython/dpnp/pull/1717)
+* Added support for more data types and dimensions of input arrays in `dpnp.inner` function [#1726](https://github.com/IntelPython/dpnp/pull/1726)
+* Added support for more data types and dimensions of input arrays in `dpnp.linalg.multi_dot` function [#1729](https://github.com/IntelPython/dpnp/pull/1729)
+* Added support for more data types and dimensions of input arrays in `dpnp.kron` function [#1732](https://github.com/IntelPython/dpnp/pull/1732)
+* Added support for more data types and dimensions of input arrays in `dpnp.linalg.matrix_power` function [#1748](https://github.com/IntelPython/dpnp/pull/1748)
+* Added support for more data types and dimensions of the input array, and for all keyword arguments, in `dpnp.norm` function [#1746](https://github.com/IntelPython/dpnp/pull/1746)
+* Added support for more data types and dimensions of the input array in `dpnp.cond` function [#1773](https://github.com/IntelPython/dpnp/pull/1773)
+* Extended `dpnp.matmul` function to support `axes` keyword argument [#1705](https://github.com/IntelPython/dpnp/pull/1705)
+* Extended `dpnp.searchsorted` function to support `side` and `sorter` keyword arguments [#1751](https://github.com/IntelPython/dpnp/pull/1751)
+* Extended `dpnp.where` function to support scalars as the `x` and `y` arguments [#1760](https://github.com/IntelPython/dpnp/pull/1760)
+* Extended `dpnp.ndarray.transpose` method to support `axes` keyword as a list [#1770](https://github.com/IntelPython/dpnp/pull/1770)
+* Extended `dpnp.nancumsum` function to support `axis`, `dtype` and `out` keyword arguments [#1781](https://github.com/IntelPython/dpnp/pull/1781)
+* Extended `dpnp.nancumprod` function to support `axis`, `dtype` and `out` keyword arguments [#1812](https://github.com/IntelPython/dpnp/pull/1812)
+* Extended `dpnp.put` function to support more data types and dimensions of input arrays [#1838](https://github.com/IntelPython/dpnp/pull/1838)
+* Extended `dpnp.trace` function to support `axis1`, `axis2`, `dtype` and `out` keyword arguments [#1842](https://github.com/IntelPython/dpnp/pull/1842)
+* Corrected `dpnp.ndarray.real` and `dpnp.ndarray.imag` methods to return a view of the array [#1719](https://github.com/IntelPython/dpnp/pull/1719)
+* Corrected `dpnp.nonzero` function to raise `TypeError` exception for an input array of unexpected type [#1764](https://github.com/IntelPython/dpnp/pull/1764)
+* Corrected `dpnp.diagonal` function to return a view of the array [#1817](https://github.com/IntelPython/dpnp/pull/1817)
+* Removed `dpnp.find_common_type` function, as it has been deprecated since NumPy 1.25.0 [#1742](https://github.com/IntelPython/dpnp/pull/1742)
+* Removed use of `dpctl` queue manager API [#1735](https://github.com/IntelPython/dpnp/pull/1735)
+* Leveraged `dpctl.tensor` implementation for `dpnp.cumsum` function [#1772](https://github.com/IntelPython/dpnp/pull/1772)
+* Leveraged `dpctl.tensor` implementation for `dpnp.cumprod` function [#1811](https://github.com/IntelPython/dpnp/pull/1811)
+* Leveraged `dpctl.tensor` implementation for `dpnp.cumlogsumexp` function [#1816](https://github.com/IntelPython/dpnp/pull/1816)
+* Leveraged `dpctl.tensor` support of `out` keyword argument in reduction and `dpnp.where` functions [#1808](https://github.com/IntelPython/dpnp/pull/1808)
+* Aligned with `dpctl` interface changes per Python Array API 2023.12 specification [#1774](https://github.com/IntelPython/dpnp/pull/1774)
+* Reworked `dpnp.linalg.eig` and `dpnp.linalg.eigvals` implementations to fall back on NumPy calculation due to a lack of required functionality in OneMKL LAPACK [#1780](https://github.com/IntelPython/dpnp/pull/1780)
+* Updated `dpnp` to use pybind11 2.12.0 [#1783](https://github.com/IntelPython/dpctl/pull/1783)
+* Improved `dpnp.matmul` implementation to use a column-major `gemm` layout for F-contiguous input arrays [#1793](https://github.com/IntelPython/dpnp/pull/1793)
+* Improved performance of `dpnp.matmul` function by calling `dpnp.kron` and `dpnp.dot` in special cases [#1815](https://github.com/IntelPython/dpnp/pull/1815)
+* Improved performance of `dpnp.diag` function by using `dpnp.diagonal`, which returns a view of the array [#1822](https://github.com/IntelPython/dpnp/pull/1822)
+* Removed limitations from `diag_indices`, `diag_indices_from`, `fill_diagonal`, `tril_indices`, `tril_indices_from`, `triu_indices`, `triu_indices_from` functions
+and added implementation of `dpnp.mask_indices` function [#1814](https://github.com/IntelPython/dpnp/pull/1814)
+
+### Fixed
+
+* Changed `dpnp.linalg.solve` to use a pair of `getrf` and `getrs` calls from the OneMKL library instead of a single `gesv` call to mitigate an unexpected `RuntimeError` exception [#1763](https://github.com/IntelPython/dpnp/pull/1763)
+* Resolved a hang in the batch implementation of `dpnp.linalg.solve` when computing on a CPU device [#1778](https://github.com/IntelPython/dpnp/pull/1778)
+* Resolved an unexpected `TypeError` exception raised from `dpnp.random.vonmises` when used with a scalar `kappa` argument [#1799](https://github.com/IntelPython/dpnp/pull/1799)
+* Changed `dpnp.flatten` to comply with the compute-follows-data approach [#1825](https://github.com/IntelPython/dpnp/pull/1825)
+* Resolved a hang in the batch implementation of `dpnp.linalg.eigh` when computing on a CPU device [#1832](https://github.com/IntelPython/dpnp/pull/1832)
+* Resolved an unexpected `ValueError` exception raised from `dpnp.linalg.pinv` due to a shape issue in `dpnp.matmul` [#1843](https://github.com/IntelPython/dpnp/pull/1843)
+
+
+## [0.14.0] - 02/16/2024
 
 This release will require DPC++ `2024.1.0`, which no longer supports Intel Gen9 integrated GPUs found in Intel CPUs of 10th generation and older.

From 6c41c4f250a495a8428094575d42c7e8cf774c64 Mon Sep 17 00:00:00 2001
From: vlad-perevezentsev
Date: Thu, 23 May 2024 14:23:51 +0200
Subject: [PATCH 02/49] Implement `dpnp.digitize()` (#1847)

* Implement dpnp.digitize
* Update cupy tests for digitize func
* Update skipped_tests files
* Add tests in test_sycl_queue and test_usm_type
* Return pylint disable
* Handle empty bins
* Small update cupy tests
* Add dpnp tests for dpnp.digitize
* Increase code coverage
* Apply remarks
* Move tests from test_statistic to test_histogram
* Add test with different dtypes
---
 dpnp/dpnp_iface_histograms.py | 95 ++++++++++++++++++-
 tests/skipped_tests.tbl | 55 -----------
 tests/skipped_tests_gpu.tbl | 55 -----------
 tests/test_histogram.py | 93 ++++++++++++++++++
 tests/test_sycl_queue.py | 1 +
 tests/test_usm_type.py | 1 +
 .../cupy/statistics_tests/test_histogram.py | 15 ++-
 7 files changed, 196 insertions(+), 119 deletions(-)

diff --git a/dpnp/dpnp_iface_histograms.py b/dpnp/dpnp_iface_histograms.py
index 919c3f64b99..1a1b4daf740 100644
--- a/dpnp/dpnp_iface_histograms.py
+++ b/dpnp/dpnp_iface_histograms.py
@@ -46,6 +46,7 @@
 import dpnp
 
 __all__ = [
+    "digitize",
     "histogram",
     "histogram_bin_edges",
 ]
@@ -208,6 +209,98 @@ def _search_sorted_inclusive(a, v):
     )
 
 
+def digitize(x, bins, right=False):
+    """
+    Return the indices of the bins to which each value in input array belongs.
+
+    For full documentation refer to :obj:`numpy.digitize`.
+
+    Parameters
+    ----------
+    x : {dpnp.ndarray, usm_ndarray}
+        Input array to be binned.
+    bins : {dpnp.ndarray, usm_ndarray}
+        Array of bins. It has to be 1-dimensional and monotonically
+        increasing or decreasing.
+    right : bool, optional
+        Indicates whether the intervals include the right or the left bin edge.
+        Default: ``False``.
+
+    Returns
+    -------
+    indices : dpnp.ndarray
+        Array of indices with the same shape as `x`.
+
+    Notes
+    -----
+    This will not raise an exception when the input array is
+    not monotonic.
+
+    See Also
+    --------
+    :obj:`dpnp.bincount` : Count number of occurrences of each value in array
+        of non-negative integers.
+    :obj:`dpnp.histogram` : Compute the histogram of a data set.
+    :obj:`dpnp.unique` : Find the unique elements of an array.
+    :obj:`dpnp.searchsorted` : Find indices where elements should be inserted
+        to maintain order.
+
+    Examples
+    --------
+    >>> import dpnp as np
+    >>> x = np.array([0.2, 6.4, 3.0, 1.6])
+    >>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
+    >>> inds = np.digitize(x, bins)
+    >>> inds
+    array([1, 4, 3, 2])
+    >>> for n in range(x.size):
+    ...     print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
+    ...
+    0. <= 0.2 < 1.
+    4. <= 6.4 < 10.
+    2.5 <= 3. < 4.
+    1. <= 1.6 < 2.5
+
+    >>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.])
+    >>> bins = np.array([0, 5, 10, 15, 20])
+    >>> np.digitize(x, bins, right=True)
+    array([1, 2, 3, 4, 4])
+    >>> np.digitize(x, bins, right=False)
+    array([1, 3, 3, 4, 5])
+
+    """
+
+    dpnp.check_supported_arrays_type(x, bins)
+
+    if dpnp.issubdtype(x.dtype, dpnp.complexfloating):
+        raise TypeError("x may not be complex")
+
+    if bins.ndim > 1:
+        raise ValueError("object too deep for desired array")
+    if bins.ndim < 1:
+        raise ValueError("object of too small depth for desired array")
+
+    # This is backwards because the arguments below are swapped
+    side = "left" if right else "right"
+
+    # Check if bins are monotonically increasing.
+    # If bins is empty, the array is considered to be increasing.
+    # If all bins are NaN, the array is considered to be decreasing.
+    if bins.size == 0:
+        bins_increasing = True
+    else:
+        bins_increasing = bins[0] <= bins[-1] or (
+            not dpnp.isnan(bins[0]) and dpnp.isnan(bins[-1])
+        )
+
+    if bins_increasing:
+        # Use dpnp.searchsorted directly if bins are increasing
+        return dpnp.searchsorted(bins, x, side=side)
+
+    # Reverse bins and adjust indices if bins are decreasing
+    return bins.size - dpnp.searchsorted(bins[::-1], x, side=side)
+
+
 def histogram(a, bins=10, range=None, density=None, weights=None):
     """
     Compute the histogram of a data set.
@@ -335,8 +428,8 @@ def histogram(a, bins=10, range=None, density=None, weights=None):
         n = dpnp.diff(cum_n)
 
     if density:
-        db = dpnp.diff(bin_edges).astype(dpnp.default_float_type())
         # pylint: disable=possibly-used-before-assignment
+        db = dpnp.diff(bin_edges).astype(dpnp.default_float_type())
         return n / db / n.sum(), bin_edges
 
     return n, bin_edges
diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl
index a9cb3d09560..7fa1510e8a5 100644
--- a/tests/skipped_tests.tbl
+++ b/tests/skipped_tests.tbl
@@ -613,61 +613,6 @@ tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_
 tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_rowvar
 tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_y
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeInvalid::test_digitize_complex
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeInvalid::test_digitize_nd_bins
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_all_nan_bins
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_decreasing -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_decreasing_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_searchsorted_inf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_searchsorted_minf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_all_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_decreasing -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_decreasing_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_searchsorted_inf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_searchsorted_minf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_0_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_10_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, 
right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_11_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_12_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_13_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_14_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_15_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_16_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_17_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_18_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_19_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_1_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_20_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize 
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_21_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_22_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_23_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_24_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_25_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_26_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_27_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_28_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_29_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_2_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_30_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=()}::test_digitize 
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_31_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_32_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_33_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_34_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_35_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_3_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_4_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_5_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_6_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_7_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_8_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_9_{bins=[1.5, 
2.5, 4.0, 6.0], increasing=False, right=False, shape=()}::test_digitize - tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[linear] tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[lower] tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[higher] diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index fa8d00145d1..8791400846b 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -619,61 +619,6 @@ tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_ tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_rowvar tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_y -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeInvalid::test_digitize_complex -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeInvalid::test_digitize_nd_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_all_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_decreasing -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_decreasing_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_digitize_nan_bins_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_searchsorted_inf 
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_0_{right=True}::test_searchsorted_minf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_all_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_decreasing -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_decreasing_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_digitize_nan_bins_repeated -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_searchsorted_inf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitizeNanInf_param_1_{right=False}::test_searchsorted_minf -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_0_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_10_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_11_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_12_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_13_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, 
shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_14_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_15_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_16_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_17_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_18_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_19_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_1_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_20_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_21_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_22_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_23_{bins=[-1.0, 1.0, 2.5, 4.0, 20.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize 
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_24_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_25_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_26_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_27_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_28_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_29_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_2_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_30_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_31_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_32_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_33_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=()}::test_digitize 
-tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_34_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_35_{bins=[0.0, 1.0, 1.0, 4.0, 4.0, 10.0], increasing=False, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_3_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_4_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_5_{bins=[1.5, 2.5, 4.0, 6.0], increasing=True, right=False, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_6_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=()}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_7_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=(10,)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_8_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=True, shape=(6, 3, 3)}::test_digitize -tests/third_party/cupy/statistics_tests/test_histogram.py::TestDigitize_param_9_{bins=[1.5, 2.5, 4.0, 6.0], increasing=False, right=False, shape=()}::test_digitize - tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[linear] tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[lower] tests/third_party/cupy/statistics_tests/test_order.py::TestOrder::test_percentile_defaults[higher] diff --git a/tests/test_histogram.py b/tests/test_histogram.py index a70f2db8044..7601d67c54a 100644 --- a/tests/test_histogram.py +++ b/tests/test_histogram.py @@ 
-20,6 +20,99 @@ ) +class TestDigitize: + @pytest.mark.parametrize( + "dtype", get_all_dtypes(no_bool=True, no_complex=True) + ) + @pytest.mark.parametrize("right", [True, False]) + @pytest.mark.parametrize( + "x, bins", + [ + # Negative values + ( + numpy.array([-5, -3, -1, 0, 1, 3, 5]), + numpy.array([-4, -2, 0, 2, 4]), + ), + # Non-uniform bins + ( + numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9]), + numpy.array([1, 4, 6, 7]), + ), + # Infinity values + ( + numpy.array([-numpy.inf, -1, 0, 1, numpy.inf]), + numpy.array([-2, -1, 0, 1, 2]), + ), + # Repeated elements + (numpy.array([1, 2, 2, 3, 3, 3, 4, 5]), numpy.array([1, 2, 3, 4])), + ], + ) + def test_digitize(self, x, bins, dtype, right): + x = x.astype(dtype) + bins = bins.astype(dtype) + x_dp = dpnp.array(x) + bins_dp = dpnp.array(bins) + + result = dpnp.digitize(x_dp, bins_dp, right=right) + expected = numpy.digitize(x, bins, right=right) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize( + "dtype_x", get_all_dtypes(no_bool=True, no_complex=True) + ) + @pytest.mark.parametrize( + "dtype_bins", get_all_dtypes(no_bool=True, no_complex=True) + ) + @pytest.mark.parametrize("right", [True, False]) + def test_digitize_diff_types(self, dtype_x, dtype_bins, right): + x = numpy.array([1, 2, 3, 4, 5], dtype=dtype_x) + bins = numpy.array([1, 3, 5], dtype=dtype_bins) + x_dp = dpnp.array(x) + bins_dp = dpnp.array(bins) + + result = dpnp.digitize(x_dp, bins_dp, right=right) + expected = numpy.digitize(x, bins, right=right) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize( + "dtype", get_all_dtypes(no_bool=True, no_complex=True) + ) + @pytest.mark.parametrize( + "x, bins", + [ + # Empty array + (numpy.array([]), numpy.array([1, 2, 3])), + # Empty bins + (numpy.array([1, 2, 3]), numpy.array([])), + ], + ) + def test_digitize_empty(self, x, bins, dtype): + x = x.astype(dtype) + bins = bins.astype(dtype) + x_dp = dpnp.array(x) + bins_dp = dpnp.array(bins) + + result = dpnp.digitize(x_dp, 
bins_dp) + expected = numpy.digitize(x, bins) + assert_dtype_allclose(result, expected) + + def test_digitize_error(self): + x_dp = dpnp.array([1, 2, 3], dtype="float32") + bins_dp = dpnp.array([1, 2, 3], dtype="float32") + + # unsupported type + x_np = dpnp.asnumpy(x_dp) + bins_np = dpnp.asnumpy(bins_dp) + with pytest.raises(TypeError): + dpnp.digitize(x_np, bins_dp) + dpnp.digitize(x_dp, bins_np) + + # bins ndim < 1 + bins_scalar = dpnp.array(1) + with pytest.raises(ValueError): + dpnp.digitize(x_dp, bins_scalar) + + class TestHistogram: @pytest.mark.usefixtures("suppress_complex_warning") @pytest.mark.parametrize( diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 9286131a65b..fae4dd52221 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -606,6 +606,7 @@ def test_reduce_hypot(device): pytest.param("arctan2", [[-1, +1, +1, -1]], [[-1, -1, +1, +1]]), pytest.param("copysign", [0.0, 1.0, 2.0], [-1.0, 0.0, 1.0]), pytest.param("cross", [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]), + pytest.param("digitize", [0.2, 6.4, 3.0], [0.0, 1.0, 2.5, 4.0]), pytest.param( "divide", [0.0, 1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 4.0, 4.0, 4.0] ), diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index a2b38b82e8d..eab59cf001b 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -614,6 +614,7 @@ def test_1in_1out(func, data, usm_type): pytest.param("arctan2", [[-1, +1, +1, -1]], [[-1, -1, +1, +1]]), pytest.param("copysign", [0.0, 1.0, 2.0], [-1.0, 0.0, 1.0]), pytest.param("cross", [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]), + pytest.param("digitize", [0.2, 6.4, 3.0], [0.0, 1.0, 2.5, 4.0]), # dpnp.dot has 3 different implementations based on input arrays dtype # checking all of them pytest.param("dot", [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]), diff --git a/tests/third_party/cupy/statistics_tests/test_histogram.py b/tests/third_party/cupy/statistics_tests/test_histogram.py index bb1dd8e07ce..18fd4a0aa55 100644 --- 
a/tests/third_party/cupy/statistics_tests/test_histogram.py +++ b/tests/third_party/cupy/statistics_tests/test_histogram.py @@ -345,7 +345,6 @@ def test_bincount_too_small_minlength(self, dtype): # in this comment to restore the support. -@pytest.mark.skip("digitize() is not implemented yet") @testing.parameterize( *testing.product( { @@ -367,6 +366,8 @@ class TestDigitize: @testing.for_all_dtypes(no_bool=True, no_complex=True) @testing.numpy_cupy_array_equal() def test_digitize(self, xp, dtype): + if self.shape == () and not self.increasing: + pytest.skip("dpctl issue #1689") x = testing.shaped_arange(self.shape, xp, dtype) bins = self.bins if not self.increasing: @@ -376,7 +377,6 @@ def test_digitize(self, xp, dtype): return (y,) -@pytest.mark.skip("digitize() is not implemented yet") @testing.parameterize({"right": True}, {"right": False}) class TestDigitizeNanInf(unittest.TestCase): @testing.numpy_cupy_array_equal() @@ -432,7 +432,7 @@ def test_digitize_all_nan_bins(self, xp): @testing.numpy_cupy_array_equal() def test_searchsorted_inf(self, xp): - x = testing.shaped_arange((14,), xp, xp.float64) + x = testing.shaped_arange((14,), xp, cupy.default_float_type()) x[5] = float("inf") bins = xp.array([0, 1, 2, 4, 10]) y = xp.digitize(x, bins, right=self.right) @@ -440,25 +440,24 @@ def test_searchsorted_inf(self, xp): @testing.numpy_cupy_array_equal() def test_searchsorted_minf(self, xp): - x = testing.shaped_arange((14,), xp, xp.float64) + x = testing.shaped_arange((14,), xp, cupy.default_float_type()) x[5] = float("-inf") bins = xp.array([0, 1, 2, 4, 10]) y = xp.digitize(x, bins, right=self.right) return (y,) -@pytest.mark.skip("digitize() is not implemented yet") class TestDigitizeInvalid(unittest.TestCase): def test_digitize_complex(self): for xp in (numpy, cupy): - x = testing.shaped_arange((14,), xp, complex) - bins = xp.array([1.0, 3.0, 5.0, 8.0, 12.0], complex) + x = testing.shaped_arange((14,), xp, xp.complex64) + bins = xp.array([1.0, 3.0, 5.0, 8.0, 
12.0], xp.complex64) with pytest.raises(TypeError): xp.digitize(x, bins) def test_digitize_nd_bins(self): for xp in (numpy, cupy): - x = testing.shaped_arange((14,), xp, xp.float64) + x = testing.shaped_arange((14,), xp, cupy.default_float_type()) bins = xp.array([[1], [2]]) with pytest.raises(ValueError): xp.digitize(x, bins) From 41bd6586fee2b9a68b3ba1cdb7551fb4423fef3c Mon Sep 17 00:00:00 2001 From: vlad-perevezentsev Date: Thu, 23 May 2024 16:07:44 +0200 Subject: [PATCH 03/49] Update CHANGELOG.md (#1848) --- CHANGELOG.md | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9e2bc27d4e1..1fc3b1d5078 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -20,6 +20,7 @@ This release completes implementation of `dpnp.linalg` module and array creation * Added implementation of `dpnp.einsum` and `dpnp.einsum_path` functions [#1779](https://github.com/IntelPython/dpnp/pull/1779) * Added implementation of `dpnp.histogram` function [#1785](https://github.com/IntelPython/dpnp/pull/1785) * Added implementation of `dpnp.histogram_bin_edges` function [#1823](https://github.com/IntelPython/dpnp/pull/1823) +* Added implementation of `dpnp.digitize` function [#1847](https://github.com/IntelPython/dpnp/pull/1847) * Extended pre-commit hooks with `pylint` configuration [#1718](https://github.com/IntelPython/dpnp/pull/1718) * Extended pre-commit hooks with `codespell` configuration [#1798](https://github.com/IntelPython/dpnp/pull/1798) * Added a Security policy page [#1730](https://github.com/IntelPython/dpnp/pull/1730) From 71bdbe1fb187d4fd544f865f29b05e153c901628 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 23 May 2024 18:13:36 +0200 Subject: [PATCH 04/49] Start 0.16 development (#1850) * Added stub for 0.16 release cycle * Set CMake version to 0.16 --- CHANGELOG.md | 9 +++++++++ CMakeLists.txt | 2 ++ 2 files changed, 11 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 1fc3b1d5078..1d02f4eb8f3 
100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,15 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.16.0] - MM/DD/2024 + +### Added + +### Changed + +### Fixed + + ## [0.15.0] - 05/DD/2024 This release completes implementation of `dpnp.linalg` module and array creation routine, adds cumulative reductions and histogram functions. diff --git a/CMakeLists.txt b/CMakeLists.txt index 6a3c7d8c99e..9d061b8020c 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -1,6 +1,8 @@ cmake_minimum_required(VERSION 3.21...3.27 FATAL_ERROR) project(dpnp + VERSION 0.16 + LANGUAGES CXX DESCRIPTION "NumPy-like API accelerated by SYCL." ) From 7afd98d9f72ecdec95e9d2a466668a4a7dcd3792 Mon Sep 17 00:00:00 2001 From: vlad-perevezentsev Date: Thu, 23 May 2024 20:35:22 +0200 Subject: [PATCH 05/49] Remove skip in test_digitize (#1851) --- tests/third_party/cupy/statistics_tests/test_histogram.py | 2 -- 1 file changed, 2 deletions(-) diff --git a/tests/third_party/cupy/statistics_tests/test_histogram.py b/tests/third_party/cupy/statistics_tests/test_histogram.py index 18fd4a0aa55..521bd4062fb 100644 --- a/tests/third_party/cupy/statistics_tests/test_histogram.py +++ b/tests/third_party/cupy/statistics_tests/test_histogram.py @@ -366,8 +366,6 @@ class TestDigitize: @testing.for_all_dtypes(no_bool=True, no_complex=True) @testing.numpy_cupy_array_equal() def test_digitize(self, xp, dtype): - if self.shape == () and not self.increasing: - pytest.skip("dpctl issue #1689") x = testing.shaped_arange(self.shape, xp, dtype) bins = self.bins if not self.increasing: From d819a087992a8003dd6ef207e7e7cc7f4e841e60 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Sat, 25 May 2024 18:41:16 +0200 Subject: [PATCH 06/49] Bump github/codeql-action from
3.25.5 to 3.25.6 (#1857) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.5 to 3.25.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/b7cec7526559c32f1616476ff32d17ba4c59b2d6...9fdb3e49720b44c48891d036bb502feb25684276) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/openssf-scorecard.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml index ce05a4b2acf..726b817e2ff 100644 --- a/.github/workflows/openssf-scorecard.yml +++ b/.github/workflows/openssf-scorecard.yml @@ -68,6 +68,6 @@ jobs: # Upload the results to GitHub's code scanning dashboard. 
- name: "Upload to code-scanning" - uses: github/codeql-action/upload-sarif@b7cec7526559c32f1616476ff32d17ba4c59b2d6 # v3.25.5 + uses: github/codeql-action/upload-sarif@9fdb3e49720b44c48891d036bb502feb25684276 # v3.25.6 with: sarif_file: results.sarif From cb48f8de8610c72c68998d09e0d55976b8a66f32 Mon Sep 17 00:00:00 2001 From: vtavana <120411540+vtavana@users.noreply.github.com> Date: Mon, 27 May 2024 09:02:23 -0500 Subject: [PATCH 07/49] implement gemv (#1834) --- dpnp/backend/extensions/blas/CMakeLists.txt | 1 + dpnp/backend/extensions/blas/blas_py.cpp | 17 +- dpnp/backend/extensions/blas/gemv.cpp | 295 ++++++++++++++++++ dpnp/backend/extensions/blas/gemv.hpp | 62 ++++ dpnp/backend/extensions/blas/types_matrix.hpp | 25 ++ dpnp/dpnp_iface_linearalgebra.py | 10 +- dpnp/dpnp_utils/dpnp_utils_linearalgebra.py | 48 ++- tests/test_mathematical.py | 92 +++++- tests/test_product.py | 10 +- 9 files changed, 521 insertions(+), 39 deletions(-) create mode 100644 dpnp/backend/extensions/blas/gemv.cpp create mode 100644 dpnp/backend/extensions/blas/gemv.hpp diff --git a/dpnp/backend/extensions/blas/CMakeLists.txt b/dpnp/backend/extensions/blas/CMakeLists.txt index debd412da9f..8ef4e7d79e1 100644 --- a/dpnp/backend/extensions/blas/CMakeLists.txt +++ b/dpnp/backend/extensions/blas/CMakeLists.txt @@ -29,6 +29,7 @@ set(_module_src ${CMAKE_CURRENT_SOURCE_DIR}/blas_py.cpp ${CMAKE_CURRENT_SOURCE_DIR}/gemm.cpp ${CMAKE_CURRENT_SOURCE_DIR}/gemm_batch.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/gemv.cpp ) pybind11_add_module(${python_module_name} MODULE ${_module_src}) diff --git a/dpnp/backend/extensions/blas/blas_py.cpp b/dpnp/backend/extensions/blas/blas_py.cpp index fee0c3bf6ca..3fdfebe7c30 100644 --- a/dpnp/backend/extensions/blas/blas_py.cpp +++ b/dpnp/backend/extensions/blas/blas_py.cpp @@ -35,17 +35,19 @@ #include "dotc.hpp" #include "dotu.hpp" #include "gemm.hpp" +#include "gemv.hpp" namespace blas_ext = dpnp::backend::ext::blas; namespace py = pybind11; namespace dot_ext = 
blas_ext::dot; using dot_ext::dot_impl_fn_ptr_t; -// populate dispatch tables -void init_dispatch_tables(void) +// populate dispatch vectors and tables +void init_dispatch_vectors_tables(void) { blas_ext::init_gemm_batch_dispatch_table(); blas_ext::init_gemm_dispatch_table(); + blas_ext::init_gemv_dispatch_vector(); } static dot_impl_fn_ptr_t dot_dispatch_vector[dpctl_td_ns::num_types]; @@ -54,7 +56,7 @@ static dot_impl_fn_ptr_t dotu_dispatch_vector[dpctl_td_ns::num_types]; PYBIND11_MODULE(_blas_impl, m) { - init_dispatch_tables(); + init_dispatch_vectors_tables(); using arrayT = dpctl::tensor::usm_ndarray; using event_vecT = std::vector; @@ -129,4 +131,13 @@ PYBIND11_MODULE(_blas_impl, m) py::arg("sycl_queue"), py::arg("matrixA"), py::arg("matrixB"), py::arg("resultC"), py::arg("depends") = py::list()); } + + { + m.def("_gemv", &blas_ext::gemv, + "Call `gemv` from OneMKL BLAS library to return " + "the matrix-vector product using a general matrix.", + py::arg("sycl_queue"), py::arg("matrixA"), py::arg("vectorX"), + py::arg("vectorY"), py::arg("transpose"), + py::arg("depends") = py::list()); + } } diff --git a/dpnp/backend/extensions/blas/gemv.cpp b/dpnp/backend/extensions/blas/gemv.cpp new file mode 100644 index 00000000000..c325299aa03 --- /dev/null +++ b/dpnp/backend/extensions/blas/gemv.cpp @@ -0,0 +1,295 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include + +// dpctl tensor headers +#include "utils/memory_overlap.hpp" +#include "utils/output_validation.hpp" +#include "utils/type_utils.hpp" + +#include "gemv.hpp" +#include "types_matrix.hpp" + +#include "dpnp_utils.hpp" + +namespace dpnp +{ +namespace backend +{ +namespace ext +{ +namespace blas +{ +namespace mkl_blas = oneapi::mkl::blas; +namespace py = pybind11; +namespace type_utils = dpctl::tensor::type_utils; + +typedef sycl::event (*gemv_impl_fn_ptr_t)(sycl::queue &, + oneapi::mkl::transpose, + const std::int64_t, + const std::int64_t, + char *, + const std::int64_t, + char *, + const std::int64_t, + char *, + const std::int64_t, + bool, + const std::vector &); + +static gemv_impl_fn_ptr_t gemv_dispatch_vector[dpctl_td_ns::num_types]; + +template +static sycl::event gemv_impl(sycl::queue &exec_q, + oneapi::mkl::transpose transA, + const std::int64_t m, + const std::int64_t n, + char *matrixA, + const std::int64_t lda, + char *vectorX, + const std::int64_t incx, + char *vectorY, + const std::int64_t incy, + bool is_row_major, + const std::vector &depends) +{ + 
type_utils::validate_type_for_device(exec_q); + + T *a = reinterpret_cast(matrixA); + T *x = reinterpret_cast(vectorX); + T *y = reinterpret_cast(vectorY); + + std::stringstream error_msg; + bool is_exception_caught = false; + + sycl::event gemv_event; + try { + auto gemv_func = + [&](sycl::queue &q, oneapi::mkl::transpose transA, std::int64_t m, + std::int64_t n, T alpha, const T *a, std::int64_t lda, + const T *x, std::int64_t incx, T beta, T *y, std::int64_t incy, + const std::vector &deps) -> sycl::event { + if (is_row_major) { + return mkl_blas::row_major::gemv(q, transA, m, n, alpha, a, lda, + x, incx, beta, y, incy, deps); + } + else { + return mkl_blas::column_major::gemv(q, transA, m, n, alpha, a, + lda, x, incx, beta, y, incy, + deps); + } + }; + gemv_event = gemv_func( + exec_q, + transA, // Defines the transpose operation for matrix A: + // 'N' indicates no transpose, 'T' for transpose, + // or 'C' for a conjugate transpose. + m, // Number of rows in matrix A. + n, // Number of columns in matrix A. + T(1), // Scaling factor for the matrix-vector product. + a, // Pointer to the input matrix A. + lda, // Leading dimension of matrix A, which is the + // stride between successive rows (for row major + // layout). + x, // Pointer to the input vector x. + incx, // The stride of vector x. + T(0), // Scaling factor for vector y. + y, // Pointer to output vector y, where the result is stored. + incy, // The stride of vector y. 
+ depends); + } catch (oneapi::mkl::exception const &e) { + error_msg + << "Unexpected MKL exception caught during gemv() call:\nreason: " + << e.what(); + is_exception_caught = true; + } catch (sycl::exception const &e) { + error_msg << "Unexpected SYCL exception caught during gemv() call:\n" + << e.what(); + is_exception_caught = true; + } + + if (is_exception_caught) // an unexpected error occurs + { + throw std::runtime_error(error_msg.str()); + } + + return gemv_event; +} + +std::pair + gemv(sycl::queue &exec_q, + dpctl::tensor::usm_ndarray matrixA, + dpctl::tensor::usm_ndarray vectorX, + dpctl::tensor::usm_ndarray vectorY, + bool transpose, + const std::vector &depends) +{ + const int matrixA_nd = matrixA.get_ndim(); + const int vectorX_nd = vectorX.get_ndim(); + const int vectorY_nd = vectorY.get_ndim(); + + if ((matrixA_nd != 2) || (vectorX_nd != 1) || (vectorY_nd != 1)) { + throw py::value_error("The arrays have incorrect dimensions."); + } + + auto const &overlap = dpctl::tensor::overlap::MemoryOverlap(); + if (overlap(matrixA, vectorY)) { + throw py::value_error("Input matrix and output vector are overlapping " + "segments of memory"); + } + if (overlap(vectorX, vectorY)) { + throw py::value_error("Input vector and output vector are overlapping " + "segments of memory"); + } + + if (!dpctl::utils::queues_are_compatible( + exec_q, + {matrixA.get_queue(), vectorX.get_queue(), vectorY.get_queue()})) + { + throw py::value_error( + "USM allocations are not compatible with the execution queue."); + } + + bool is_matrixA_f_contig = matrixA.is_f_contiguous(); + bool is_matrixA_c_contig = matrixA.is_c_contiguous(); + + if (!is_matrixA_f_contig and !is_matrixA_c_contig) { + throw py::value_error( + "Input matrix is not c-contiguous nor f-contiguous."); + } + + bool is_row_major = true; + if (is_matrixA_f_contig) { + is_row_major = false; + } + + const py::ssize_t *a_shape = matrixA.get_shape_raw(); + const py::ssize_t *x_shape = vectorX.get_shape_raw(); + const 
py::ssize_t *y_shape = vectorY.get_shape_raw(); + const std::int64_t m = a_shape[0]; + const std::int64_t n = a_shape[1]; + const std::int64_t lda = is_row_major ? n : m; + + oneapi::mkl::transpose transA; + size_t src_nelems; + if (transpose) { + transA = oneapi::mkl::transpose::T; + src_nelems = n; + if (m != x_shape[0]) { + throw py::value_error("The number of rows in A must be equal to " + "the number of elements in X."); + } + if (n != y_shape[0]) { + throw py::value_error("The number of columns in A must be equal to " + "the number of elements in Y."); + } + } + else { + transA = oneapi::mkl::transpose::N; + src_nelems = m; + if (n != x_shape[0]) { + throw py::value_error("The number of columns in A must be equal to " + "the number of elements in X."); + } + if (m != y_shape[0]) { + throw py::value_error("The number of rows in A must be equal to " + "the number of elements in Y."); + } + } + dpctl::tensor::validation::CheckWritable::throw_if_not_writable(vectorY); + dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(vectorY, + src_nelems); + + int matrixA_typenum = matrixA.get_typenum(); + int vectorX_typenum = vectorX.get_typenum(); + int vectorY_typenum = vectorY.get_typenum(); + + if (matrixA_typenum != vectorX_typenum || + matrixA_typenum != vectorY_typenum) { + throw py::value_error("Given arrays must be of the same type."); + } + + auto array_types = dpctl_td_ns::usm_ndarray_types(); + int type_id = array_types.typenum_to_lookup_id(matrixA_typenum); + + gemv_impl_fn_ptr_t gemv_fn = gemv_dispatch_vector[type_id]; + if (gemv_fn == nullptr) { + throw py::value_error( + "Types of input arrays and result array are mismatched."); + } + + char *a_typeless_ptr = matrixA.get_data(); + char *x_typeless_ptr = vectorX.get_data(); + char *y_typeless_ptr = vectorY.get_data(); + + std::vector x_stride = vectorX.get_strides_vector(); + std::vector y_stride = vectorY.get_strides_vector(); + const int x_elemsize = vectorX.get_elemsize(); + const int y_elemsize = 
vectorY.get_elemsize(); + const std::int64_t incx = x_stride[0]; + const std::int64_t incy = y_stride[0]; + if (incx < 0) { + x_typeless_ptr -= (x_shape[0] - 1) * std::abs(incx) * x_elemsize; + } + if (incy < 0) { + y_typeless_ptr -= (y_shape[0] - 1) * std::abs(incy) * y_elemsize; + } + + sycl::event gemv_ev = + gemv_fn(exec_q, transA, m, n, a_typeless_ptr, lda, x_typeless_ptr, incx, + y_typeless_ptr, incy, is_row_major, depends); + + sycl::event args_ev = dpctl::utils::keep_args_alive( + exec_q, {matrixA, vectorX, vectorY}, {gemv_ev}); + + return std::make_pair(args_ev, gemv_ev); +} + +template +struct GemvContigFactory +{ + fnT get() + { + if constexpr (types::GemvTypePairSupportFactory::is_defined) { + return gemv_impl; + } + else { + return nullptr; + } + } +}; + +void init_gemv_dispatch_vector(void) +{ + dpctl_td_ns::DispatchVectorBuilder + contig; + contig.populate_dispatch_vector(gemv_dispatch_vector); +} +} // namespace blas +} // namespace ext +} // namespace backend +} // namespace dpnp diff --git a/dpnp/backend/extensions/blas/gemv.hpp b/dpnp/backend/extensions/blas/gemv.hpp new file mode 100644 index 00000000000..703f9c4cc0a --- /dev/null +++ b/dpnp/backend/extensions/blas/gemv.hpp @@ -0,0 +1,62 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include +#include + +#include + +namespace dpnp +{ +namespace backend +{ +namespace ext +{ +namespace blas +{ +extern std::pair + gemv(sycl::queue &exec_q, + dpctl::tensor::usm_ndarray matrixA, + dpctl::tensor::usm_ndarray vectorX, + dpctl::tensor::usm_ndarray vectorY, + bool transpose, + const std::vector &depends); + +extern std::pair + gemv_batch(sycl::queue &exec_q, + dpctl::tensor::usm_ndarray matrixA, + dpctl::tensor::usm_ndarray vectorX, + dpctl::tensor::usm_ndarray vectorY, + bool transpose, + const std::vector &depends); + +extern void init_gemv_dispatch_vector(void); +extern void init_gemv_batch_dispatch_vector(void); +} // namespace blas +} // namespace ext +} // namespace backend +} // namespace dpnp diff --git a/dpnp/backend/extensions/blas/types_matrix.hpp b/dpnp/backend/extensions/blas/types_matrix.hpp index 2a62a0bd917..a33fa42b971 100644 --- a/dpnp/backend/extensions/blas/types_matrix.hpp +++ b/dpnp/backend/extensions/blas/types_matrix.hpp @@ -165,6 +165,31 @@ struct GemmBatchTypePairSupportFactory // fall-through dpctl_td_ns::NotDefinedEntry>::is_defined; }; + 
+/** + * @brief A factory to define pairs of supported types for which + * MKL BLAS library provides support in oneapi::mkl::blas::gemv + * function. + * + * @tparam T Type of input and output arrays. + */ +template +struct GemvTypePairSupportFactory +{ + static constexpr bool is_defined = std::disjunction< + dpctl_td_ns::TypePairDefinedEntry, + dpctl_td_ns::TypePairDefinedEntry, + dpctl_td_ns::TypePairDefinedEntry, + T, + std::complex>, + dpctl_td_ns::TypePairDefinedEntry, + T, + std::complex>, + // fall-through + dpctl_td_ns::NotDefinedEntry>::is_defined; +}; } // namespace types } // namespace blas } // namespace ext diff --git a/dpnp/dpnp_iface_linearalgebra.py b/dpnp/dpnp_iface_linearalgebra.py index 4bf0d2ab524..033929443a5 100644 --- a/dpnp/dpnp_iface_linearalgebra.py +++ b/dpnp/dpnp_iface_linearalgebra.py @@ -140,19 +140,21 @@ def dot(a, b, out=None): # functions from BLAS here instead of dpnp.multiply return dpnp.multiply(a, b, out=out) - if a.ndim == 0 or b.ndim == 0: + a_ndim = a.ndim + b_ndim = b.ndim + if a_ndim == 0 or b_ndim == 0: # TODO: investigate usage of axpy (axpy_batch) or scal # functions from BLAS here instead of dpnp.multiply return dpnp.multiply(a, b, out=out) - if a.ndim == 1 and b.ndim == 1: + if a_ndim == 1 and b_ndim == 1: return dpnp_dot(a, b, out=out) - if a.ndim == 2 and b.ndim == 2: + if a_ndim == 2 and b_ndim == 2: # NumPy does not allow casting even if it is safe return dpnp.matmul(a, b, out=out, casting="no") - if a.ndim == 1 or b.ndim == 1: + if a_ndim == 1 or b_ndim == 1: # NumPy does not allow casting even if it is safe return dpnp.matmul(a, b, out=out, casting="no") diff --git a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py index 616b47483a0..43f6cc1f3fe 100644 --- a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py +++ b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py @@ -726,8 +726,16 @@ def _gemm_batch_matmul(exec_q, x1, x2, res, dev_tasks_list): chunk = 2048 * 2048 batch_size = 
res.shape[0] for i in range(0, batch_size, chunk): - x1_usm = dpnp.get_usm_ndarray(x1[i : i + chunk, ...]) - x2_usm = dpnp.get_usm_ndarray(x2[i : i + chunk, ...]) + if x1.shape[0] == 1: + # x1 is repeatedly multiplied with each matrix in x2 + x1_usm = dpnp.get_usm_ndarray(x1) + x2_usm = dpnp.get_usm_ndarray(x2[i : i + chunk, ...]) + elif x2.shape[0] == 1: + x1_usm = dpnp.get_usm_ndarray(x1[i : i + chunk, ...]) + x2_usm = dpnp.get_usm_ndarray(x2) + else: + x1_usm = dpnp.get_usm_ndarray(x1[i : i + chunk, ...]) + x2_usm = dpnp.get_usm_ndarray(x2[i : i + chunk, ...]) res_usm = dpnp.get_usm_ndarray(res[i : i + chunk, ...]) ht_blas_ev, _, row_major = bi._gemm_batch( exec_q, @@ -2090,6 +2098,7 @@ def dpnp_matmul( ) call_flag = None + transpose = False x1_shape = x1.shape x2_shape = x2.shape x1_is_2D, x1_is_1D, x1_base_is_1D = _define_dim_flags(x1, pos=0) @@ -2110,19 +2119,16 @@ def dpnp_matmul( call_flag = "gemm_batch" res_shape = result_shape elif x1_is_1D and x2_is_2D: - # TODO: implement gemv to use it here with transpose - call_flag = "gemm" - x1 = dpnp.reshape(x1, (1, x1.size)) + transpose = True + call_flag = "gemv" + x1 = dpnp.reshape(x1, x1.size) x2 = dpnp.reshape(x2, x2_shape[-2:]) - x1_shape = x1.shape - res_shape = (x1_shape[-2], x2_shape[-1]) + res_shape = (x2_shape[-1],) elif x1_is_2D and x2_is_1D: - # TODO: implement gemv to use it here without transpose - call_flag = "gemm" + call_flag = "gemv" x1 = dpnp.reshape(x1, x1_shape[-2:]) - x2 = dpnp.reshape(x2, (x2.size, 1)) - x2_shape = x2.shape - res_shape = (x1_shape[-2], x2_shape[-1]) + x2 = dpnp.reshape(x2, x2.size) + res_shape = (x1_shape[-2],) elif x1_is_2D and x2_is_2D: call_flag = "gemm" x1 = dpnp.reshape(x1, x1_shape[-2:]) @@ -2189,7 +2195,23 @@ def dpnp_matmul( dtype=compute_dtype, ) - if call_flag == "gemm": + if call_flag == "gemv": + if transpose: + a_usm = dpnp.get_usm_ndarray(x2) + x_usm = dpnp.get_usm_ndarray(x1) + else: + a_usm = dpnp.get_usm_ndarray(x1) + x_usm = dpnp.get_usm_ndarray(x2) + 
ht_blas_ev, _ = bi._gemv( + exec_q, + a_usm, + x_usm, + dpnp.get_usm_ndarray(res), + transpose, + dep_events_list, + ) + host_tasks_list.append(ht_blas_ev) + elif call_flag == "gemm": res = _gemm_matmul( exec_q, x1, diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index 6dc5cb01688..69b590b386c 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -2594,6 +2594,70 @@ def test_matmul_strided3(self, stride, transpose): assert result is out assert_dtype_allclose(result, expected) + @pytest.mark.parametrize("shape", [(8, 10)], ids=["2D"]) + @pytest.mark.parametrize("incx", [-2, 2], ids=["-2", "2"]) + @pytest.mark.parametrize("incy", [-2, 2], ids=["-2", "2"]) + @pytest.mark.parametrize("transpose", [False, True], ids=["False", "True"]) + def test_matmul_strided_mat_vec(self, shape, incx, incy, transpose): + if transpose: + s1 = shape[-2] + s2 = shape[-1] + else: + s1 = shape[-1] + s2 = shape[-2] + a = numpy.random.rand(*shape) + B = numpy.random.rand(2 * s1) + a_dp = dpnp.asarray(a) + if transpose: + a = numpy.moveaxis(a, (-2, -1), (-1, -2)) + a_dp = dpnp.moveaxis(a_dp, (-2, -1), (-1, -2)) + B_dp = dpnp.asarray(B) + b = B[::incx] + b_dp = B_dp[::incx] + + result = dpnp.matmul(a_dp, b_dp) + expected = numpy.matmul(a, b) + assert_dtype_allclose(result, expected) + + out_shape = shape[:-2] + (2 * s2,) + OUT = dpnp.empty(out_shape, dtype=result.dtype) + out = OUT[..., ::incy] + result = dpnp.matmul(a_dp, b_dp, out=out) + assert result is out + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize("shape", [(8, 10)], ids=["2D"]) + @pytest.mark.parametrize("incx", [-2, 2], ids=["-2", "2"]) + @pytest.mark.parametrize("incy", [-2, 2], ids=["-2", "2"]) + @pytest.mark.parametrize("transpose", [False, True], ids=["False", "True"]) + def test_matmul_strided_vec_mat(self, shape, incx, incy, transpose): + if transpose: + s1 = shape[-2] + s2 = shape[-1] + else: + s1 = shape[-1] + s2 = shape[-2] + a = numpy.random.rand(*shape) + 
B = numpy.random.rand(2 * s2) + a_dp = dpnp.asarray(a) + if transpose: + a = numpy.moveaxis(a, (-2, -1), (-1, -2)) + a_dp = dpnp.moveaxis(a_dp, (-2, -1), (-1, -2)) + B_dp = dpnp.asarray(B) + b = B[::incx] + b_dp = B_dp[::incx] + + result = dpnp.matmul(b_dp, a_dp) + expected = numpy.matmul(b, a) + assert_dtype_allclose(result, expected) + + out_shape = shape[:-2] + (2 * s1,) + OUT = dpnp.empty(out_shape, dtype=result.dtype) + out = OUT[..., ::incy] + result = dpnp.matmul(b_dp, a_dp, out=out) + assert result is out + assert_dtype_allclose(result, expected) + @pytest.mark.parametrize( "dtype", get_all_dtypes(no_none=True, no_bool=True) ) @@ -2631,26 +2695,24 @@ def test_matmul_out_0D(self, out_shape): @testing.slow @pytest.mark.parametrize( - "shape", + "shape_pair", [ - ((4096, 4096, 4, 4)), - ((2048, 2048, 8, 8)), + ((4096, 4096, 2, 2), (4096, 4096, 2, 2)), + ((2, 2), (4096, 4096, 2, 2)), + ((4096, 4096, 2, 2), (2, 2)), ], ) - def test_matmul_large(self, shape): - size = numpy.prod(shape, dtype=int) - a = numpy.array(numpy.random.uniform(-5, 5, size)).reshape(shape) + def test_matmul_large(self, shape_pair): + shape1, shape2 = shape_pair + size1 = numpy.prod(shape1, dtype=int) + size2 = numpy.prod(shape2, dtype=int) + a = numpy.array(numpy.random.uniform(-5, 5, size1)).reshape(shape1) + b = numpy.array(numpy.random.uniform(-5, 5, size2)).reshape(shape2) a_dp = dpnp.asarray(a) + b_dp = dpnp.asarray(b) - result = dpnp.matmul(a_dp, a_dp) - expected = numpy.matmul(a, a) - assert_dtype_allclose(result, expected, factor=24) - - # make the 2-d base f-contiguous - a = a.transpose(0, 1, 3, 2) - a_dp = a_dp.transpose(0, 1, 3, 2) - result = dpnp.matmul(a_dp, a_dp) - expected = numpy.matmul(a, a) + result = dpnp.matmul(a_dp, b_dp) + expected = numpy.matmul(a, b) assert_dtype_allclose(result, expected, factor=24) diff --git a/tests/test_product.py b/tests/test_product.py index ae233b7d3ab..ded938bda7f 100644 --- a/tests/test_product.py +++ b/tests/test_product.py @@ -244,9 
+244,9 @@ def test_dot_scalar(self, dtype): ((10,), (10,)), ((4, 3), (3, 2)), ((4, 3), (3,)), + ((4,), (4, 2)), ((5, 4, 3), (3,)), ((4,), (5, 4, 3)), - ((4,), (4, 2)), ((5, 3, 4), (6, 4, 2)), ], ids=[ @@ -256,9 +256,9 @@ def test_dot_scalar(self, dtype): "1d_1d", "2d_2d", "2d_1d", + "1d_2d", "3d_1d", "1d_3d", - "1d_2d", "3d_3d", ], ) @@ -404,8 +404,9 @@ def test_dot_out_scalar(self, dtype): ((10,), (10,), ()), ((4, 3), (3, 2), (4, 2)), ((4, 3), (3,), (4,)), - ((5, 4, 3), (3,), (5, 4)), ((4,), (4, 2), (2,)), + ((5, 4, 3), (3,), (5, 4)), + ((4,), (5, 4, 3), (5, 3)), ((5, 3, 4), (6, 4, 2), (5, 3, 6, 2)), ], ids=[ @@ -415,8 +416,9 @@ def test_dot_out_scalar(self, dtype): "1d_1d", "2d_2d", "2d_1d", - "3d_1d", "1d_2d", + "3d_1d", + "1d_3d", "3d_3d", ], ) From 410cb1ba46fcfb9f53dbd5c8f444316224da4aab Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Tue, 28 May 2024 18:49:02 +0200 Subject: [PATCH 08/49] Use mamba in GitHub actions (#1858) * Use mamba in GitHub actions * Use rattler-build to build conda package * Resolve spaces issue * Use conda-build command * Use conda activate inside retry step * Explicitly removing defaults channel * Update actions for Generate coverage and Build Sphinx * Disable download speed check in mamba * Corrected setting the env variable --- .github/workflows/build-sphinx.yml | 28 +++++--- .github/workflows/conda-package.yml | 90 +++++++++++++++++------- .github/workflows/generate_coverage.yaml | 25 +++++-- 3 files changed, 102 insertions(+), 41 deletions(-) diff --git a/.github/workflows/build-sphinx.yml b/.github/workflows/build-sphinx.yml index 7d5cd8afc3f..5d0372fb48d 100644 --- a/.github/workflows/build-sphinx.yml +++ b/.github/workflows/build-sphinx.yml @@ -99,31 +99,43 @@ jobs: - name: Setup miniconda uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + 
channels: conda-forge python-version: ${{ env.python-ver }} - miniconda-version: 'latest' activate-environment: 'docs' - channels: intel, conda-forge + + # Here is an issue in conda gh-12356 causing adding defaults to the list of channels + # upon running `conda config --append channels conda-forge`, while mamba requires to have only conda-forge channel + - name: Remove defaults channel + run: | + conda config --remove channels defaults + conda config --show + + # Sometimes `mamba install ...` fails due to slow download speed rate, so disable the check in mamba + - name: Disable speed limit check in mamba + run: echo "MAMBA_NO_LOW_SPEED_LIMIT=1" >> $GITHUB_ENV - name: Install sphinx dependencies run: | - conda install sphinx sphinx_rtd_theme + mamba install sphinx sphinx_rtd_theme pip install sphinxcontrib-googleanalytics==0.4 \ pyenchant sphinxcontrib-spelling - name: Install dpnp dependencies run: | - conda install numpy"<1.24" dpctl">=0.17.0dev0" mkl-devel-dpcpp onedpl-devel tbb-devel dpcpp_linux-64 \ + mamba install numpy"<1.24" dpctl">=0.17.0dev0" mkl-devel-dpcpp onedpl-devel tbb-devel dpcpp_linux-64 \ cmake cython pytest ninja scikit-build ${{ env.CHANNELS }} - name: Install cuPy dependencies - run: conda install cupy cudatoolkit=10.0 + run: mamba install cupy cudatoolkit=10.0 - name: Conda info - run: conda info + run: mamba info - name: Conda list - run: conda list + run: mamba list - name: Build library run: python scripts/build_locally.py diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml index 55ec174227c..3b24acf7774 100644 --- a/.github/workflows/conda-package.yml +++ b/.github/workflows/conda-package.yml @@ -12,7 +12,7 @@ env: PACKAGE_NAME: dpnp MODULE_NAME: dpnp CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' - CONDA_BUILD_VERSION: '24.1.2' + CONDA_BUILD_VERSION: '24.5.0' CONDA_INDEX_VERSION: '0.4.0' TEST_ENV_NAME: 'test' TEST_SCOPE: >- @@ -96,19 +96,32 @@ jobs: - name: Setup miniconda 
uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + channels: conda-forge python-version: ${{ matrix.python }} - miniconda-version: 'latest' activate-environment: 'build' + # Here is an issue in conda gh-12356 causing adding defaults to the list of channels + # upon running `conda config --append channels conda-forge`, while mamba requires to have only conda-forge channel + - name: Remove defaults channel + run: | + conda config --remove channels defaults + conda config --show + + # Sometimes `mamba install ...` fails due to slow download speed rate, so disable the check in mamba + - name: Disable speed limit check in mamba + run: echo "MAMBA_NO_LOW_SPEED_LIMIT=1" >> $GITHUB_ENV + - name: Store conda paths as envs shell: bash -l {0} run: | - echo "CONDA_BLD=$CONDA_PREFIX/conda-bld/${{ runner.os == 'Linux' && 'linux' || 'win' }}-64/" | tr "\\" '/' >> $GITHUB_ENV + echo "CONDA_BLD=$CONDA_PREFIX/conda-bld/${{ runner.os == 'Linux' && 'linux' || 'win' }}-64/" | tr "\\\\" '/' >> $GITHUB_ENV echo "WHEELS_OUTPUT_FOLDER=$GITHUB_WORKSPACE${{ runner.os == 'Linux' && '/' || '\\' }}" >> $GITHUB_ENV - name: Install conda-build - run: conda install conda-build=${{ env.CONDA_BUILD_VERSION}} + run: mamba install conda-build=${{ env.CONDA_BUILD_VERSION}} - name: Cache conda packages uses: actions/cache@0c45773b623bea8c8e75f6c82b208c3cf94ea4f9 # v4.0.2 @@ -123,7 +136,7 @@ jobs: ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}- - name: Build conda package - run: conda build --no-test --python ${{ matrix.python }} --numpy 1.23 ${{ env.CHANNELS }} conda-recipe + run: conda build --no-test --python ${{ matrix.python }} --numpy 1.24 ${{ env.CHANNELS }} conda-recipe - name: Upload artifact uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3 @@ -178,13 +191,18 @@ jobs: - name: Setup miniconda uses: 
conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + channels: conda-forge python-version: ${{ matrix.python }} - miniconda-version: 'latest' activate-environment: ${{ env.TEST_ENV_NAME }} + - name: Remove defaults channel + run: conda config --remove channels defaults + - name: Install conda-index - run: conda install conda-index=${{ env.CONDA_INDEX_VERSION }} + run: mamba install conda-index=${{ env.CONDA_INDEX_VERSION }} - name: Create conda channel run: | @@ -192,7 +210,7 @@ jobs: - name: Test conda channel run: | - conda search ${{ env.PACKAGE_NAME }} -c ${{ env.channel-path }} --override-channels --info --json > ${{ env.ver-json-path }} + mamba search ${{ env.PACKAGE_NAME }} -c ${{ env.channel-path }} --override-channels --info --json > ${{ env.ver-json-path }} cat ${{ env.ver-json-path }} - name: Collect dependencies @@ -202,7 +220,7 @@ jobs: echo PACKAGE_VERSION=${PACKAGE_VERSION} echo "PACKAGE_VERSION=$PACKAGE_VERSION" >> $GITHUB_ENV - conda install ${{ env.PACKAGE_NAME }}=${PACKAGE_VERSION} python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} --only-deps --dry-run > lockfile + mamba install ${{ env.PACKAGE_NAME }}=${PACKAGE_VERSION} python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} --only-deps --dry-run > lockfile cat lockfile env: TEST_CHANNELS: '-c ${{ env.channel-path }} ${{ env.CHANNELS }}' @@ -220,12 +238,13 @@ jobs: ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}- - name: Install dpnp - run: conda install ${{ env.PACKAGE_NAME }}=${{ env.PACKAGE_VERSION }} pytest python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} + run: mamba install ${{ env.PACKAGE_NAME }}=${{ env.PACKAGE_VERSION }} pytest python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} env: TEST_CHANNELS: '-c ${{ env.channel-path }} ${{ env.CHANNELS }}' + MAMBA_NO_LOW_SPEED_LIMIT: 1 - name: List installed packages - run: conda list + run: mamba list - 
name: Smoke test run: | @@ -302,11 +321,16 @@ jobs: - name: Setup miniconda uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + channels: conda-forge python-version: ${{ matrix.python }} - miniconda-version: 'latest' activate-environment: ${{ env.TEST_ENV_NAME }} + - name: Remove defaults channel + run: conda config --remove channels defaults + - name: Store conda paths as envs run: | @echo on @@ -314,7 +338,7 @@ jobs: (echo CONDA_LIB_BIN_PATH=%CONDA_PREFIX%\Library\bin\) >> %GITHUB_ENV% - name: Install conda-index - run: conda install conda-index=${{ env.CONDA_INDEX_VERSION}} + run: mamba install conda-index=${{ env.CONDA_INDEX_VERSION }} - name: Create conda channel run: | @@ -324,7 +348,7 @@ jobs: - name: Test conda channel run: | @echo on - conda search ${{ env.PACKAGE_NAME }} -c ${{ env.channel-path }} --override-channels --info --json > ${{ env.ver-json-path }} + mamba search ${{ env.PACKAGE_NAME }} -c ${{ env.channel-path }} --override-channels --info --json > ${{ env.ver-json-path }} - name: Dump version.json run: more ${{ env.ver-json-path }} @@ -339,7 +363,7 @@ jobs: echo PACKAGE_VERSION: %PACKAGE_VERSION% (echo PACKAGE_VERSION=%PACKAGE_VERSION%) >> %GITHUB_ENV% - conda install ${{ env.PACKAGE_NAME }}=%PACKAGE_VERSION% python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} --only-deps --dry-run > lockfile + mamba install ${{ env.PACKAGE_NAME }}=%PACKAGE_VERSION% python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} --only-deps --dry-run > lockfile env: TEST_CHANNELS: '-c ${{ env.channel-path }} ${{ env.CHANNELS }}' @@ -361,12 +385,13 @@ jobs: - name: Install dpnp run: | @echo on - conda install ${{ env.PACKAGE_NAME }}=${{ env.PACKAGE_VERSION }} pytest python=${{ matrix.python }} ${{ env.TEST_CHANNELS }} + mamba install ${{ env.PACKAGE_NAME }}=${{ env.PACKAGE_VERSION }} pytest python=${{ matrix.python }} ${{ 
env.TEST_CHANNELS }} env: TEST_CHANNELS: '-c ${{ env.channel-path }} ${{ env.CHANNELS }}' + MAMBA_NO_LOW_SPEED_LIMIT: 1 - name: List installed packages - run: conda list + run: mamba list - name: Activate OCL CPU RT shell: pwsh @@ -398,7 +423,7 @@ jobs: max_attempts: 5 retry_on: any command: >- - conda activate ${{ env.TEST_ENV_NAME }} + mamba activate ${{ env.TEST_ENV_NAME }} & cd ${{ env.tests-path }} & python -m pytest -q -ra --disable-warnings -vv ${{ env.TEST_SCOPE }} @@ -438,13 +463,18 @@ jobs: - name: Setup miniconda uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + channels: conda-forge python-version: ${{ matrix.python }} - miniconda-version: 'latest' activate-environment: 'upload' + - name: Remove defaults channel + run: conda config --remove channels defaults + - name: Install anaconda-client - run: conda install anaconda-client + run: mamba install anaconda-client - name: Package version run: echo "PACKAGE_VERSION=$(basename ${{ env.PACKAGE_NAME }}-*.tar.bz2 | sed 's/^${{ env.PACKAGE_NAME }}-\([^-]*\).*/\1/')" >> $GITHUB_ENV @@ -469,13 +499,19 @@ jobs: steps: - uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - run-post: false - channel-priority: "disabled" + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true channels: conda-forge + run-post: false python-version: '3.11' + activate-environment: 'cleanup' + + - name: Remove defaults channel + run: conda config --remove channels defaults - name: Install anaconda-client - run: conda install anaconda-client + run: mamba install anaconda-client - name: Checkout repo uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6 diff --git a/.github/workflows/generate_coverage.yaml b/.github/workflows/generate_coverage.yaml index bd966a79095..d0f65e9729b 100644 --- 
a/.github/workflows/generate_coverage.yaml +++ b/.github/workflows/generate_coverage.yaml @@ -59,27 +59,40 @@ jobs: - name: Setup miniconda uses: conda-incubator/setup-miniconda@a4260408e20b96e80095f42ff7f1a15b27dd94ca # v3.0.4 with: - auto-update-conda: true + miniforge-variant: Mambaforge + miniforge-version: latest + use-mamba: true + channels: conda-forge python-version: ${{ env.python-ver }} - miniconda-version: 'latest' activate-environment: 'coverage' + # Here is an issue in conda gh-12356 causing adding defaults to the list of channels + # upon running `conda config --append channels conda-forge`, while mamba requires to have only conda-forge channel + - name: Remove defaults channel + run: | + conda config --remove channels defaults + conda config --show + + # Sometimes `mamba install ...` fails due to slow download speed rate, so disable the check in mamba + - name: Disable speed limit check in mamba + run: echo "MAMBA_NO_LOW_SPEED_LIMIT=1" >> $GITHUB_ENV + - name: Install dpnp dependencies if: env.INSTALL_ONE_API == 'yes' run: | - conda install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ + mamba install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ dpctl">=0.17.0dev0" onedpl-devel ${{ env.CHANNELS }} - name: Install dpnp dependencies if: env.INSTALL_ONE_API != 'yes' run: | - conda install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ + mamba install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ dpctl">=0.17.0dev0" dpcpp_linux-64 mkl-devel-dpcpp tbb-devel onedpl-devel ${{ env.CHANNELS }} - name: Conda info run: | - conda info - conda list + mamba info + mamba list - name: Build dpnp with coverage id: build_coverage From 59be03de5d96cb49e7334d55da50a5f5517a2df2 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Wed, 29 May 2024 15:25:17 +0200 Subject: [PATCH 09/49] Enable 
pre-commit pylint check in fft module (#1860) --- .pre-commit-config.yaml | 2 +- dpnp/fft/__init__.py | 23 ++++++++++++ dpnp/fft/dpnp_iface_fft.py | 75 +++++++++++++++++++++++++++++--------- 3 files changed, 82 insertions(+), 18 deletions(-) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index b996291c155..e5e77c67768 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -100,4 +100,4 @@ repos: "--disable=redefined-builtin", "--disable=unused-wildcard-import" ] - files: '^dpnp/(dpnp_iface.*|linalg)' + files: '^dpnp/(dpnp_iface.*|fft|linalg)' diff --git a/dpnp/fft/__init__.py b/dpnp/fft/__init__.py index 1b743518d79..811e9b23ad0 100644 --- a/dpnp/fft/__init__.py +++ b/dpnp/fft/__init__.py @@ -24,6 +24,29 @@ # THE POSSIBILITY OF SUCH DAMAGE. # ***************************************************************************** +""" +``dpnp.fft`` +=========================== +Discrete Fourier Transform. + +Fourier analysis is fundamentally a method for expressing a function as a sum +of periodic components, and for recovering the function from those components. +When both the function and its Fourier transform are replaced with discretized +counterparts, it is called the discrete Fourier transform (DFT). The DFT has +become a mainstay of numerical computing in part because of a very fast +algorithm for computing it, called the Fast Fourier Transform (FFT), which was +known to Gauss (1805) and was brought to light in its current form by Cooley +and Tukey. + +Because the discrete Fourier transform separates its input into components +that contribute at discrete frequencies, it has a great number of applications +in digital signal processing, e.g., for filtering, and in this context the +discretized input to the transform is customarily referred to as a *signal*, +which exists in the *time domain*. The output is called a *spectrum* or +*transform* and exists in the *frequency domain*. 
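The DFT described in the docstring above can be sketched, for illustration only, as the textbook O(n²) sum — this is not dpnp's implementation (which relies on a fast FFT backend), just the definition the module computes:

```python
import cmath
import math

def dft(x):
    """Textbook O(n^2) DFT: X_k = sum_m x_m * exp(-2j*pi*k*m/n)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

# A pure cosine at frequency 1 over 8 samples concentrates all of its
# energy into bins 1 and 7 (= n - 1), each with magnitude n / 2 = 4.
signal = [math.cos(2.0 * math.pi * m / 8.0) for m in range(8)]
spectrum = dft(signal)
```

The name `dft` is a hypothetical helper introduced here for illustration; `dpnp.fft.fft` follows the `numpy.fft` API instead.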
+ +""" + from dpnp.fft.dpnp_iface_fft import * from dpnp.fft.dpnp_iface_fft import __all__ as __all__fft diff --git a/dpnp/fft/dpnp_iface_fft.py b/dpnp/fft/dpnp_iface_fft.py index f8609293701..c8064e0122e 100644 --- a/dpnp/fft/dpnp_iface_fft.py +++ b/dpnp/fft/dpnp_iface_fft.py @@ -39,14 +39,23 @@ """ +# pylint: disable=invalid-name from enum import Enum import numpy import dpnp -from dpnp.dpnp_utils import * -from dpnp.fft.dpnp_algo_fft import * + +# pylint: disable=no-name-in-module +from dpnp.dpnp_utils import ( + call_origin, + checker_throw_axis_error, +) +from dpnp.fft.dpnp_algo_fft import ( + dpnp_fft, + dpnp_rfft, +) __all__ = [ "fft", @@ -70,12 +79,16 @@ ] +# TODO: remove pylint disable, once new implementation is ready +# pylint: disable=missing-class-docstring class Norm(Enum): backward = 0 forward = 1 ortho = 2 +# TODO: remove pylint disable, once new implementation is ready +# pylint: disable=missing-function-docstring def get_validated_norm(norm): if norm is None or norm == "backward": return Norm.backward @@ -98,8 +111,10 @@ def fft(x, n=None, axis=-1, norm=None): Parameter `axis` is supported with its default value. Only `dpnp.float64`, `dpnp.float32`, `dpnp.int64`, `dpnp.int32`, `dpnp.complex128`, `dpnp.complex64` data types are supported. - The `dpnp.bool` data type is not supported and will raise a `TypeError` exception. + The `dpnp.bool` data type is not supported and will raise a `TypeError` + exception. Otherwise the function will be executed sequentially on CPU. 
+ """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) @@ -205,12 +220,12 @@ def fftn(x, s=None, axes=None, norm=None): x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) if x_desc: if s is None: - boundaries = tuple([x_desc.shape[i] for i in range(x_desc.ndim)]) + boundaries = tuple(x_desc.shape[i] for i in range(x_desc.ndim)) else: boundaries = s if axes is None: - axes_param = tuple([i for i in range(x_desc.ndim)]) + axes_param = list(range(x_desc.ndim)) else: axes_param = axes @@ -256,6 +271,8 @@ def fftshift(x, axes=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: norm_ = Norm.backward @@ -267,6 +284,9 @@ def fftshift(x, axes=None): if x_desc.size < 1: pass # let fallback to handle exception else: + input_boundarie = x_desc.shape[axis_param] + output_boundarie = input_boundarie + return dpnp_fft( x_desc, input_boundarie, @@ -281,7 +301,8 @@ def fftshift(x, axes=None): def hfft(x, n=None, axis=-1, norm=None): """ - Compute the one-dimensional discrete Fourier Transform of a signal that has Hermitian symmetry. + Compute the one-dimensional discrete Fourier Transform of a signal that has + Hermitian symmetry. For full documentation refer to :obj:`numpy.fft.hfft`. @@ -296,6 +317,8 @@ def hfft(x, n=None, axis=-1, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: norm_ = get_validated_norm(norm) @@ -342,7 +365,8 @@ def ifft(x, n=None, axis=-1, norm=None): Parameter `axis` is supported with its default value. Only `dpnp.float64`, `dpnp.float32`, `dpnp.int64`, `dpnp.int32`,, `dpnp.complex128`, `dpnp.complex64` data types are supported. - The `dpnp.bool` data type is not supported and will raise a `TypeError` exception. 
+ The `dpnp.bool` data type is not supported and will raise a `TypeError` + exception. Otherwise the function will be executed sequentially on CPU. """ @@ -430,6 +454,8 @@ def ifftshift(x, axes=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: norm_ = Norm.backward @@ -478,14 +504,16 @@ def ifftn(x, s=None, axes=None, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: if s is None: - boundaries = tuple([x_desc.shape[i] for i in range(x_desc.ndim)]) + boundaries = tuple(x_desc.shape[i] for i in range(x_desc.ndim)) else: boundaries = s if axes is None: - axes_param = tuple([i for i in range(x_desc.ndim)]) + axes_param = list(range(x_desc.ndim)) else: axes_param = axes @@ -522,7 +550,8 @@ def ifftn(x, s=None, axes=None, norm=None): def ihfft(x, n=None, axis=-1, norm=None): """ - Compute inverse one-dimensional discrete Fourier Transform of a signal that has Hermitian symmetry. + Compute inverse one-dimensional discrete Fourier Transform of a signal that + has Hermitian symmetry. For full documentation refer to :obj:`numpy.fft.ihfft`. @@ -537,6 +566,8 @@ def ihfft(x, n=None, axis=-1, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: norm_ = get_validated_norm(norm) @@ -575,7 +606,8 @@ def ihfft(x, n=None, axis=-1, norm=None): def irfft(x, n=None, axis=-1, norm=None): """ - Compute the one-dimensional inverse discrete Fourier Transform for real input. + Compute the one-dimensional inverse discrete Fourier Transform for real + input. For full documentation refer to :obj:`numpy.fft.irfft`. 
@@ -590,6 +622,8 @@ def irfft(x, n=None, axis=-1, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: norm_ = get_validated_norm(norm) @@ -622,7 +656,8 @@ def irfft(x, n=None, axis=-1, norm=None): True, norm_.value, ).get_pyobj() - # TODO tmp = utils.create_output_array(result_shape, result_c_type, out) + # TODO: + # tmp = utils.create_output_array(result_shape, result_c_type, out) # tmp = dparray(result.shape, dtype=dpnp.float64) # for it in range(tmp.size): # tmp[it] = result[it].real @@ -678,14 +713,16 @@ def irfftn(x, s=None, axes=None, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: if s is None: - boundaries = tuple([x_desc.shape[i] for i in range(x_desc.ndim)]) + boundaries = tuple(x_desc.shape[i] for i in range(x_desc.ndim)) else: boundaries = s if axes is None: - axes_param = tuple([i for i in range(x_desc.ndim)]) + axes_param = list(range(x_desc.ndim)) else: axes_param = axes @@ -732,8 +769,10 @@ def rfft(x, n=None, axis=-1, norm=None): Parameter `norm` is unsupported. Only `dpnp.float64`, `dpnp.float32`, `dpnp.int64`, `dpnp.int32`, `dpnp.complex128` data types are supported. - The `dpnp.bool` data type is not supported and will raise a `TypeError` exception. + The `dpnp.bool` data type is not supported and will raise a `TypeError` + exception. Otherwise the function will be executed sequentially on CPU. 
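The real-input transforms above (`rfft`, `irfft`, `ihfft`) exploit the Hermitian symmetry of a real signal's spectrum. A quick pure-Python check of that property — for illustration only, not dpnp's implementation; `dft` is a hypothetical helper:

```python
import cmath

def dft(x):
    """Textbook O(n^2) DFT, used here only to demonstrate the symmetry."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

x = [0.5, 1.0, -2.0, 3.0, 0.0, -1.5]   # an arbitrary real signal
X = dft(x)
n = len(x)

# Real input => Hermitian-symmetric spectrum: X[k] == conj(X[n - k]),
# which is why rfft only needs to return the first n // 2 + 1 bins.
symmetric = all(abs(X[k] - X[n - k].conjugate()) < 1e-9 for k in range(1, n))
```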
+ """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) @@ -844,14 +883,16 @@ def rfftn(x, s=None, axes=None, norm=None): """ x_desc = dpnp.get_dpnp_descriptor(x, copy_when_nondefault_queue=False) + # TODO: enable implementation + # pylint: disable=condition-evals-to-constant if x_desc and 0: if s is None: - boundaries = tuple([x_desc.shape[i] for i in range(x_desc.ndim)]) + boundaries = tuple(x_desc.shape[i] for i in range(x_desc.ndim)) else: boundaries = s if axes is None: - axes_param = tuple([i for i in range(x_desc.ndim)]) + axes_param = list(range(x_desc.ndim)) else: axes_param = axes From 841664c9fa0f46df227020bb6192e58c51ac404d Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Fri, 31 May 2024 11:33:59 +0200 Subject: [PATCH 10/49] Implement `dpnp.gradient` function (#1859) * Implement dpnp.gradient function * Resolve pre-commit issues * Update dpnp/dpnp_iface_mathematical.py Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> * Update dpnp/dpnp_iface_mathematical.py Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --------- Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --- dpnp/dpnp_algo/dpnp_algo_mathematical.pxi | 31 -- dpnp/dpnp_iface_mathematical.py | 372 ++++++++++++++-- tests/skipped_tests_gpu_no_fp64.tbl | 4 - tests/test_mathematical.py | 421 +++++++++++++++--- tests/test_sycl_queue.py | 7 +- tests/test_usm_type.py | 4 + .../cupy/math_tests/test_sumprod.py | 13 +- 7 files changed, 734 insertions(+), 118 deletions(-) diff --git a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi index f2111e4e671..2b8d63c6d2d 100644 --- a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi +++ b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi @@ -39,7 +39,6 @@ __all__ += [ "dpnp_ediff1d", "dpnp_fabs", "dpnp_fmod", - "dpnp_gradient", "dpnp_fmax", "dpnp_fmin", "dpnp_modf", @@ -123,36 +122,6 @@ cpdef utils.dpnp_descriptor 
dpnp_fmod(utils.dpnp_descriptor x1_obj, return call_fptr_2in_1out_strides(DPNP_FN_FMOD_EXT, x1_obj, x2_obj, dtype, out, where) -cpdef utils.dpnp_descriptor dpnp_gradient(utils.dpnp_descriptor y1, int dx=1): - - cdef size_t size = y1.size - - y1_obj = y1.get_array() - - # create result array with type given by FPTR data - cdef shape_type_c result_shape = utils._object_to_tuple(size) - cdef utils.dpnp_descriptor result = utils_py.create_output_descriptor_py(result_shape, - dpnp.default_float_type(y1_obj.sycl_queue), - None, - device=y1_obj.sycl_device, - usm_type=y1_obj.usm_type, - sycl_queue=y1_obj.sycl_queue) - - cdef double cur = (y1.get_pyobj()[1] - y1.get_pyobj()[0]) / dx - - result.get_pyobj().flat[0] = cur - - cur = (y1.get_pyobj()[-1] - y1.get_pyobj()[-2]) / dx - - result.get_pyobj().flat[size - 1] = cur - - for i in range(1, size - 1): - cur = (y1.get_pyobj()[i + 1] - y1.get_pyobj()[i - 1]) / (2 * dx) - result.get_pyobj().flat[i] = cur - - return result - - cpdef utils.dpnp_descriptor dpnp_fmax(utils.dpnp_descriptor x1_obj, utils.dpnp_descriptor x2_obj, object dtype=None, diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index 716696cacbd..b0d0c7b6123 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -46,6 +46,7 @@ import dpctl.tensor as dpt import dpctl.tensor._tensor_elementwise_impl as ti import dpctl.tensor._type_utils as dtu +import dpctl.utils as dpu import numpy from dpctl.tensor._type_utils import _acceptance_fn_divide from numpy.core.numeric import ( @@ -63,7 +64,6 @@ dpnp_fmax, dpnp_fmin, dpnp_fmod, - dpnp_gradient, dpnp_modf, dpnp_trapz, ) @@ -168,6 +168,169 @@ def _get_reduction_res_dt(a, dtype, _out): return dtu._to_device_supported_dtype(dtype, a.sycl_device) +def _gradient_build_dx(f, axes, *varargs): + """Build an array with distance per each dimension.""" + + len_axes = len(axes) + n = len(varargs) + if n == 0: + # no spacing argument - use 1 in all axes + dx = [1.0] * len_axes 
+ elif n == 1 and numpy.ndim(varargs[0]) == 0: + dpnp.check_supported_arrays_type( + varargs[0], scalar_type=True, all_scalars=True + ) + + # single scalar for all axes + dx = varargs * len_axes + elif n == len_axes: + # scalar or 1d array for each axis + dx = list(varargs) + for i, distances in enumerate(dx): + dpnp.check_supported_arrays_type( + distances, scalar_type=True, all_scalars=True + ) + + if numpy.ndim(distances) == 0: + continue + if distances.ndim != 1: + raise ValueError("distances must be either scalars or 1d") + + if len(distances) != f.shape[axes[i]]: + raise ValueError( + "when 1d, distances must match " + "the length of the corresponding dimension" + ) + + if dpnp.issubdtype(distances.dtype, dpnp.integer): + # Convert integer types to default float type to avoid modular + # arithmetic in dpnp.diff(distances). + distances = distances.astype(dpnp.default_float_type()) + diffx = dpnp.diff(distances) + + # if distances are constant reduce to the scalar case + # since it brings a consistent speedup + if (diffx == diffx[0]).all(): + diffx = diffx[0] + dx[i] = diffx + else: + raise TypeError("invalid number of arguments") + return dx + + +def _gradient_num_diff_2nd_order_interior( + f, ax_dx, out, slices, axis, uniform_spacing +): + """Numerical differentiation: 2nd order interior.""" + + slice1, slice2, slice3, slice4 = slices + ndim = f.ndim + + slice1[axis] = slice(1, -1) + slice2[axis] = slice(None, -2) + slice3[axis] = slice(1, -1) + slice4[axis] = slice(2, None) + + if uniform_spacing: + out[tuple(slice1)] = (f[tuple(slice4)] - f[tuple(slice2)]) / ( + 2.0 * ax_dx + ) + else: + dx1 = ax_dx[0:-1] + dx2 = ax_dx[1:] + a = -(dx2) / (dx1 * (dx1 + dx2)) + b = (dx2 - dx1) / (dx1 * dx2) + c = dx1 / (dx2 * (dx1 + dx2)) + + # fix the shape for broadcasting + shape = [1] * ndim + shape[axis] = -1 + # TODO: use shape.setter once dpctl#1699 is resolved + # a.shape = b.shape = c.shape = shape + a = a.reshape(shape) + b = b.reshape(shape) + c = c.reshape(shape) 
+ + # 1D equivalent -- out[1:-1] = a * f[:-2] + b * f[1:-1] + c * f[2:] + t1 = a * f[tuple(slice2)] + t2 = b * f[tuple(slice3)] + t3 = c * f[tuple(slice4)] + t4 = t1 + t2 + t3 + + out[tuple(slice1)] = t4 + out[tuple(slice1)] = ( + a * f[tuple(slice2)] + b * f[tuple(slice3)] + c * f[tuple(slice4)] + ) + + +def _gradient_num_diff_edges( + f, ax_dx, out, slices, axis, uniform_spacing, edge_order +): + """Numerical differentiation: 1st and 2nd order edges.""" + + slice1, slice2, slice3, slice4 = slices + + # Numerical differentiation: 1st order edges + if edge_order == 1: + slice1[axis] = 0 + slice2[axis] = 1 + slice3[axis] = 0 + dx_0 = ax_dx if uniform_spacing else ax_dx[0] + + # 1D equivalent -- out[0] = (f[1] - f[0]) / (x[1] - x[0]) + out[tuple(slice1)] = (f[tuple(slice2)] - f[tuple(slice3)]) / dx_0 + + slice1[axis] = -1 + slice2[axis] = -1 + slice3[axis] = -2 + dx_n = ax_dx if uniform_spacing else ax_dx[-1] + + # 1D equivalent -- out[-1] = (f[-1] - f[-2]) / (x[-1] - x[-2]) + out[tuple(slice1)] = (f[tuple(slice2)] - f[tuple(slice3)]) / dx_n + + # Numerical differentiation: 2nd order edges + else: + slice1[axis] = 0 + slice2[axis] = 0 + slice3[axis] = 1 + slice4[axis] = 2 + if uniform_spacing: + a = -1.5 / ax_dx + b = 2.0 / ax_dx + c = -0.5 / ax_dx + else: + dx1 = ax_dx[0] + dx2 = ax_dx[1] + a = -(2.0 * dx1 + dx2) / (dx1 * (dx1 + dx2)) + b = (dx1 + dx2) / (dx1 * dx2) + c = -dx1 / (dx2 * (dx1 + dx2)) + + # 1D equivalent -- out[0] = a * f[0] + b * f[1] + c * f[2] + out[tuple(slice1)] = ( + a * f[tuple(slice2)] + b * f[tuple(slice3)] + c * f[tuple(slice4)] + ) + + slice1[axis] = -1 + slice2[axis] = -3 + slice3[axis] = -2 + slice4[axis] = -1 + if uniform_spacing: + a = 0.5 / ax_dx + b = -2.0 / ax_dx + c = 1.5 / ax_dx + else: + dx1 = ax_dx[-2] + dx2 = ax_dx[-1] + a = (dx2) / (dx1 * (dx1 + dx2)) + b = -(dx2 + dx1) / (dx1 * dx2) + c = (2.0 * dx2 + dx1) / (dx2 * (dx1 + dx2)) + + # 1D equivalent -- out[-1] = a * f[-3] + b * f[-2] + c * f[-1] + out[tuple(slice1)] = ( + a * 
f[tuple(slice2)] + b * f[tuple(slice3)] + c * f[tuple(slice4)] + ) + + _ABS_DOCSTRING = """ Calculates the absolute value for each element `x_i` of input array `x`. @@ -1682,51 +1845,206 @@ def fmod(x1, x2, /, out=None, *, where=True, dtype=None, subok=True, **kwargs): ) -def gradient(x1, *varargs, **kwargs): +def gradient(f, *varargs, axis=None, edge_order=1): """ - Return the gradient of an array. + Return the gradient of an N-dimensional array. + + The gradient is computed using second order accurate central differences + in the interior points and either first or second order accurate one-sides + (forward or backwards) differences at the boundaries. + The returned gradient hence has the same shape as the input array. For full documentation refer to :obj:`numpy.gradient`. - Limitations - ----------- - Parameter `y1` is supported as :class:`dpnp.ndarray`. - Argument `varargs[0]` is supported as `int`. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by supported DPNP :ref:`Data types`. + Parameters + ---------- + f : {dpnp.ndarray, usm_ndarray} + An N-dimensional array containing samples of a scalar function. + varargs : {scalar, list of scalars, list of arrays}, optional + Spacing between `f` values. Default unitary spacing for all dimensions. + Spacing can be specified using: + + 1. Single scalar to specify a sample distance for all dimensions. + 2. N scalars to specify a constant sample distance for each dimension. + i.e. `dx`, `dy`, `dz`, ... + 3. N arrays to specify the coordinates of the values along each + dimension of `f`. The length of the array must match the size of + the corresponding dimension + 4. Any combination of N scalars/arrays with the meaning of 2. and 3. + + If `axis` is given, the number of `varargs` must equal the number of + axes. + Default: ``1``. 
+ axis : {None, int, tuple of ints}, optional + Gradient is calculated only along the given axis or axes. + The default is to calculate the gradient for all the axes of the input + array. `axis` may be negative, in which case it counts from the last to + the first axis. + Default: ``None``. + edge_order : {1, 2}, optional + Gradient is calculated using N-th order accurate differences + at the boundaries. + Default: ``1``. + + Returns + ------- + gradient : {dpnp.ndarray, list of ndarray} + A list of :class:`dpnp.ndarray` (or a single :class:`dpnp.ndarray` if + there is only one dimension) corresponding to the derivatives of `f` + with respect to each dimension. + Each derivative has the same shape as `f`. See Also -------- :obj:`dpnp.diff` : Calculate the n-th discrete difference along the given axis. + :obj:`dpnp.ediff1d` : Calculate the differences between consecutive + elements of an array. Examples -------- >>> import dpnp as np - >>> y = np.array([1, 2, 4, 7, 11, 16], dtype=float) - >>> result = np.gradient(y) - >>> [x for x in result] - [1.0, 1.5, 2.5, 3.5, 4.5, 5.0] - >>> result = np.gradient(y, 2) - >>> [x for x in result] - [0.5, 0.75, 1.25, 1.75, 2.25, 2.5] + >>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float) + >>> np.gradient(f) + array([1. , 1.5, 2.5, 3.5, 4.5, 5. ]) + >>> np.gradient(f, 2) + array([0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ]) + + Spacing can be also specified with an array that represents the coordinates + of the values `f` along the dimensions. + For instance a uniform spacing: + + >>> x = np.arange(f.size) + >>> np.gradient(f, x) + array([1. , 1.5, 2.5, 3.5, 4.5, 5. ]) + + Or a non uniform one: + + >>> x = np.array([0., 1., 1.5, 3.5, 4., 6.], dtype=float) + >>> np.gradient(f, x) + array([1. , 3. , 3.5, 6.7, 6.9, 2.5]) + + For two dimensional arrays, the return will be two arrays ordered by + axis. 
In this example the first array stands for the gradient in + rows and the second one in columns direction: + + >>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float)) + (array([[ 2., 2., -1.], + [ 2., 2., -1.]]), + array([[1. , 2.5, 4. ], + [1. , 1. , 1. ]])) + + In this example the spacing is also specified: + uniform for axis=0 and non uniform for axis=1 + + >>> dx = 2. + >>> y = np.array([1., 1.5, 3.5]) + >>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), dx, y) + (array([[ 1. , 1. , -0.5], + [ 1. , 1. , -0.5]]), + array([[2. , 2. , 2. ], + [2. , 1.7, 0.5]])) + + It is possible to specify how boundaries are treated using `edge_order` + + >>> x = np.array([0, 1, 2, 3, 4]) + >>> f = x**2 + >>> np.gradient(f, edge_order=1) + array([1., 2., 4., 6., 7.]) + >>> np.gradient(f, edge_order=2) + array([0., 2., 4., 6., 8.]) + + The `axis` keyword can be used to specify a subset of axes of which the + gradient is calculated + + >>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), axis=0) + array([[ 2., 2., -1.], + [ 2., 2., -1.]]) """ - x1_desc = dpnp.get_dpnp_descriptor(x1, copy_when_nondefault_queue=False) - if x1_desc and not kwargs: - if len(varargs) > 1: - pass - elif len(varargs) == 1 and not isinstance(varargs[0], int): - pass + dpnp.check_supported_arrays_type(f) + ndim = f.ndim # number of dimensions + + if axis is None: + axes = tuple(range(ndim)) + else: + axes = normalize_axis_tuple(axis, ndim) + + dx = _gradient_build_dx(f, axes, *varargs) + if edge_order > 2: + raise ValueError("'edge_order' greater than 2 not supported") + + # Use central differences on interior and one-sided differences on the + # endpoints. This preserves second order-accuracy over the full domain. 
+ outvals = [] + + # create slice objects --- initially all are [:, :, ..., :] + slice1 = [slice(None)] * ndim + slice2 = [slice(None)] * ndim + slice3 = [slice(None)] * ndim + slice4 = [slice(None)] * ndim + + otype = f.dtype + if dpnp.issubdtype(otype, dpnp.inexact): + pass + else: + # All other types convert to floating point. + # First check if f is a dpnp integer type; if so, convert f to default + # float type to avoid modular arithmetic when computing changes in f. + if dpnp.issubdtype(otype, dpnp.integer): + f = f.astype(dpnp.default_float_type()) + otype = dpnp.default_float_type() + + for axis_, ax_dx in zip(axes, dx): + if f.shape[axis_] < edge_order + 1: + raise ValueError( + "Shape of array too small to calculate a numerical gradient, " + "at least (edge_order + 1) elements are required." + ) + + # result allocation + if dpnp.isscalar(ax_dx): + usm_type = f.usm_type else: - if len(varargs) == 0: - return dpnp_gradient(x1_desc).get_pyobj() + usm_type = dpu.get_coerced_usm_type([f.usm_type, ax_dx.usm_type]) + out = dpnp.empty_like(f, dtype=otype, usm_type=usm_type) + + # spacing for the current axis + uniform_spacing = numpy.ndim(ax_dx) == 0 + + # Numerical differentiation: 2nd order interior + _gradient_num_diff_2nd_order_interior( + f, + ax_dx, + out, + (slice1, slice2, slice3, slice4), + axis_, + uniform_spacing, + ) + + # Numerical differentiation: 1st and 2nd order edges + _gradient_num_diff_edges( + f, + ax_dx, + out, + (slice1, slice2, slice3, slice4), + axis_, + uniform_spacing, + edge_order, + ) + + outvals.append(out) - return dpnp_gradient(x1_desc, varargs[0]).get_pyobj() + # reset the slice object in this dimension to ":" + slice1[axis_] = slice(None) + slice2[axis_] = slice(None) + slice3[axis_] = slice(None) + slice4[axis_] = slice(None) - return call_origin(numpy.gradient, x1, *varargs, **kwargs) + if len(axes) == 1: + return outvals[0] + return tuple(outvals) _IMAG_DOCSTRING = """ diff --git a/tests/skipped_tests_gpu_no_fp64.tbl 
b/tests/skipped_tests_gpu_no_fp64.tbl index 7a999c99617..c209c876df6 100644 --- a/tests/skipped_tests_gpu_no_fp64.tbl +++ b/tests/skipped_tests_gpu_no_fp64.tbl @@ -1,7 +1,3 @@ -tests/test_mathematical.py::TestGradient::test_gradient_y1_dx[3.5-array0] -tests/test_mathematical.py::TestGradient::test_gradient_y1_dx[3.5-array1] -tests/test_mathematical.py::TestGradient::test_gradient_y1_dx[3.5-array2] - tests/test_strides.py::test_strides_1arg[(10,)-int32-fabs] tests/test_strides.py::test_strides_1arg[(10,)-int64-fabs] tests/test_strides.py::test_strides_1arg[(10,)-None-fabs] diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index 69b590b386c..4a86cdc081e 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -9,6 +9,7 @@ assert_array_equal, assert_equal, assert_raises, + assert_raises_regex, ) import dpnp @@ -23,7 +24,6 @@ get_float_dtypes, get_integer_dtypes, has_support_aspect64, - is_cpu_device, ) from .test_umath import ( _get_numpy_arrays_1in_1out, @@ -73,6 +73,35 @@ def test_angle_complex(self, dtype, deg): assert_dtype_allclose(result, expected) +@pytest.mark.usefixtures("allow_fall_back_on_numpy") +class TestConvolve: + def test_object(self): + d = [1.0] * 100 + k = [1.0] * 3 + assert_array_almost_equal(dpnp.convolve(d, k)[2:-2], dpnp.full(98, 3)) + + def test_no_overwrite(self): + d = dpnp.ones(100) + k = dpnp.ones(3) + dpnp.convolve(d, k) + assert_array_equal(d, dpnp.ones(100)) + assert_array_equal(k, dpnp.ones(3)) + + def test_mode(self): + d = dpnp.ones(100) + k = dpnp.ones(3) + default_mode = dpnp.convolve(d, k, mode="full") + full_mode = dpnp.convolve(d, k, mode="f") + assert_array_equal(full_mode, default_mode) + # integer mode + with assert_raises(ValueError): + dpnp.convolve(d, k, mode=-1) + assert_array_equal(dpnp.convolve(d, k, mode=2), full_mode) + # illegal arguments + with assert_raises(TypeError): + dpnp.convolve(d, k, mode=None) + + class TestClip: @pytest.mark.parametrize( "dtype", 
get_all_dtypes(no_bool=True, no_none=True, no_complex=True) @@ -582,33 +611,347 @@ def test_prepend_append_axis_error(self, xp): assert_raises(numpy.AxisError, xp.diff, a, axis=3, append=0) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") -class TestConvolve: - def test_object(self): - d = [1.0] * 100 - k = [1.0] * 3 - assert_array_almost_equal(dpnp.convolve(d, k)[2:-2], dpnp.full(98, 3)) +class TestGradient: + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_basic(self, dt): + x = numpy.array([[1, 1], [3, 4]], dtype=dt) + ix = dpnp.array(x) - def test_no_overwrite(self): - d = dpnp.ones(100) - k = dpnp.ones(3) - dpnp.convolve(d, k) - assert_array_equal(d, dpnp.ones(100)) - assert_array_equal(k, dpnp.ones(3)) + expected = numpy.gradient(x) + result = dpnp.gradient(ix) + assert_array_equal(result, expected) - def test_mode(self): - d = dpnp.ones(100) - k = dpnp.ones(3) - default_mode = dpnp.convolve(d, k, mode="full") - full_mode = dpnp.convolve(d, k, mode="f") - assert_array_equal(full_mode, default_mode) - # integer mode - with assert_raises(ValueError): - dpnp.convolve(d, k, mode=-1) - assert_array_equal(dpnp.convolve(d, k, mode=2), full_mode) - # illegal arguments - with assert_raises(TypeError): - dpnp.convolve(d, k, mode=None) + @pytest.mark.parametrize( + "args", + [3.0, numpy.array(3.0), numpy.cumsum(numpy.ones(5))], + ids=["scalar", "array", "cumsum"], + ) + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_args_1d(self, args, dt): + x = numpy.arange(5, dtype=dt) + ix = dpnp.array(x) + + if numpy.isscalar(args): + iargs = args + else: + iargs = dpnp.array(args) + + expected = numpy.gradient(x, args) + result = dpnp.gradient(ix, iargs) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize( + "args", [1.5, numpy.array(1.5)], ids=["scalar", "array"] + ) + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_args_2d(self, args, dt): + 
x = numpy.arange(25, dtype=dt).reshape(5, 5) + ix = dpnp.array(x) + + if numpy.isscalar(args): + iargs = args + else: + iargs = dpnp.array(args) + + expected = numpy.gradient(x, args) + result = dpnp.gradient(ix, iargs) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_args_2d_uneven(self, dt): + x = numpy.arange(25, dtype=dt).reshape(5, 5) + ix = dpnp.array(x) + + dx = numpy.array([1.0, 2.0, 5.0, 9.0, 11.0]) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx, dx) + result = dpnp.gradient(ix, idx, idx) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_args_2d_mix_with_scalar(self, dt): + x = numpy.arange(25, dtype=dt).reshape(5, 5) + ix = dpnp.array(x) + + dx = numpy.cumsum(numpy.ones(5)) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx, 2) + result = dpnp.gradient(ix, idx, 2) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True, no_bool=True)) + def test_axis_args_2d(self, dt): + x = numpy.arange(25, dtype=dt).reshape(5, 5) + ix = dpnp.array(x) + + dx = numpy.cumsum(numpy.ones(5)) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx, axis=1) + result = dpnp.gradient(ix, idx, axis=1) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_args_2d_error(self, xp): + x = xp.arange(25).reshape(5, 5) + dx = xp.cumsum(xp.ones(5)) + assert_raises_regex( + ValueError, + ".*scalars or 1d", + xp.gradient, + x, + xp.stack([dx] * 2, axis=-1), + 1, + ) + + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_badargs(self, xp): + x = xp.arange(25).reshape(5, 5) + dx = xp.cumsum(xp.ones(5)) + + # wrong sizes + assert_raises(ValueError, xp.gradient, x, x, xp.ones(2)) + 
assert_raises(ValueError, xp.gradient, x, 1, xp.ones(2)) + assert_raises(ValueError, xp.gradient, x, xp.ones(2), xp.ones(2)) + # wrong number of arguments + assert_raises(TypeError, xp.gradient, x, x) + assert_raises(TypeError, xp.gradient, x, dx, axis=(0, 1)) + assert_raises(TypeError, xp.gradient, x, dx, dx, dx) + assert_raises(TypeError, xp.gradient, x, 1, 1, 1) + assert_raises(TypeError, xp.gradient, x, dx, dx, axis=1) + assert_raises(TypeError, xp.gradient, x, 1, 1, axis=1) + + @pytest.mark.parametrize( + "x", + [ + numpy.linspace(0, 1, 10), + numpy.sort(numpy.random.RandomState(0).random(10)), + ], + ids=["linspace", "random_sorted"], + ) + @pytest.mark.parametrize("dt", get_float_dtypes()) + # testing that the relative numerical error is close to numpy + def test_second_order_accurate(self, x, dt): + x = x.astype(dt) + dx = x[1] - x[0] + y = 2 * x**3 + 4 * x**2 + 2 * x + + iy = dpnp.array(y) + idx = dpnp.array(dx) + + expected = numpy.gradient(y, dx, edge_order=2) + result = dpnp.gradient(iy, idx, edge_order=2) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize("edge_order", [1, 2]) + @pytest.mark.parametrize("axis", [0, 1, (0, 1)]) + @pytest.mark.parametrize("dt", get_float_dtypes()) + def test_spacing_axis_scalar(self, edge_order, axis, dt): + x = numpy.array([0, 2.0, 3.0, 4.0, 5.0, 5.0], dtype=dt) + x = numpy.tile(x, (6, 1)) + x.reshape(-1, 1) + ix = dpnp.array(x) + + expected = numpy.gradient(x, 1.0, axis=axis, edge_order=edge_order) + result = dpnp.gradient(ix, 1.0, axis=axis, edge_order=edge_order) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("edge_order", [1, 2]) + @pytest.mark.parametrize("axis", [(0, 1), None]) + @pytest.mark.parametrize("dt", get_float_dtypes()) + @pytest.mark.parametrize( + "dx", + [numpy.arange(6.0), numpy.array([0.0, 0.5, 1.0, 3.0, 5.0, 7.0])], + ids=["even", "uneven"], + ) + def test_spacing_axis_two_args(self, edge_order, axis, dt, dx): + x = 
numpy.array([0, 2.0, 3.0, 4.0, 5.0, 5.0], dtype=dt) + x = numpy.tile(x, (6, 1)) + x.reshape(-1, 1) + + ix = dpnp.array(x) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx, dx, axis=axis, edge_order=edge_order) + result = dpnp.gradient(ix, idx, idx, axis=axis, edge_order=edge_order) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("edge_order", [1, 2]) + @pytest.mark.parametrize("axis", [0, 1]) + @pytest.mark.parametrize("dt", get_float_dtypes()) + @pytest.mark.parametrize( + "dx", + [numpy.arange(6.0), numpy.array([0.0, 0.5, 1.0, 3.0, 5.0, 7.0])], + ids=["even", "uneven"], + ) + def test_spacing_axis_args(self, edge_order, axis, dt, dx): + x = numpy.array([0, 2.0, 3.0, 4.0, 5.0, 5.0], dtype=dt) + x = numpy.tile(x, (6, 1)) + x.reshape(-1, 1) + + ix = dpnp.array(x) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx, axis=axis, edge_order=edge_order) + result = dpnp.gradient(ix, idx, axis=axis, edge_order=edge_order) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("edge_order", [1, 2]) + @pytest.mark.parametrize("dt", get_float_dtypes()) + def test_spacing_mix_args(self, edge_order, dt): + x = numpy.array([0, 2.0, 3.0, 4.0, 5.0, 5.0], dtype=dt) + x = numpy.tile(x, (6, 1)) + x.reshape(-1, 1) + x_uneven = numpy.array([0.0, 0.5, 1.0, 3.0, 5.0, 7.0]) + x_even = numpy.arange(6.0) + + ix = dpnp.array(x) + ix_uneven = dpnp.array(x_uneven) + ix_even = dpnp.array(x_even) + + expected = numpy.gradient( + x, x_even, x_uneven, axis=(0, 1), edge_order=edge_order + ) + result = dpnp.gradient( + ix, ix_even, ix_uneven, axis=(0, 1), edge_order=edge_order + ) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + expected = numpy.gradient( + x, x_uneven, x_even, axis=(1, 0), edge_order=edge_order + ) + result = dpnp.gradient( + ix, ix_uneven, ix_even, axis=(1, 0), edge_order=edge_order + ) + for gr, igr in zip(expected, result): + 
assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("axis", [0, 1, -1, (1, 0), None]) + def test_specific_axes(self, axis): + x = numpy.array([[1, 1], [3, 4]]) + ix = dpnp.array(x) + + expected = numpy.gradient(x, axis=axis) + result = dpnp.gradient(ix, axis=axis) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + def test_axis_scalar_args(self): + x = numpy.array([[1, 1], [3, 4]]) + ix = dpnp.array(x) + + expected = numpy.gradient(x, 2, 3, axis=(1, 0)) + result = dpnp.gradient(ix, 2, 3, axis=(1, 0)) + for gr, igr in zip(expected, result): + assert_dtype_allclose(igr, gr) + + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_wrong_number_of_args(self, xp): + x = xp.array([[1, 1], [3, 4]]) + assert_raises(TypeError, xp.gradient, x, 1, 2, axis=1) + + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_wrong_axis(self, xp): + x = xp.array([[1, 1], [3, 4]]) + assert_raises(numpy.AxisError, xp.gradient, x, axis=3) + + @pytest.mark.parametrize( + "size, edge_order", + [ + pytest.param(2, 1), + pytest.param(3, 2), + ], + ) + def test_min_size_with_edge_order(self, size, edge_order): + x = numpy.arange(size) + ix = dpnp.array(x) + + expected = numpy.gradient(x, edge_order=edge_order) + result = dpnp.gradient(ix, edge_order=edge_order) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize( + "size, edge_order", + [ + pytest.param(0, 1), + pytest.param(0, 2), + pytest.param(1, 1), + pytest.param(1, 2), + pytest.param(2, 2), + ], + ) + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_wrong_size_with_edge_order(self, size, edge_order, xp): + assert_raises( + ValueError, xp.gradient, xp.arange(size), edge_order=edge_order + ) + + @pytest.mark.parametrize( + "dt", [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64] + ) + def test_f_decreasing_unsigned_int(self, dt): + x = numpy.array([5, 4, 3, 2, 1], dtype=dt) + ix = dpnp.array(x) + + expected = numpy.gradient(x) + result = dpnp.gradient(ix) + 
assert_array_equal(result, expected) + + @pytest.mark.parametrize( + "dt", [numpy.int8, numpy.int16, numpy.int32, numpy.int64] + ) + def test_f_signed_int_big_jump(self, dt): + maxint = numpy.iinfo(dt).max + x = numpy.array([-1, maxint], dtype=dt) + dx = numpy.array([1, 3]) + + ix = dpnp.array(x) + idx = dpnp.array(dx) + + expected = numpy.gradient(x, dx) + result = dpnp.gradient(ix, idx) + assert_array_equal(result, expected) + + @pytest.mark.parametrize( + "dt", [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64] + ) + def test_x_decreasing_unsigned(self, dt): + x = numpy.array([3, 2, 1], dtype=dt) + f = numpy.array([0, 2, 4]) + + dp_x = dpnp.array(x) + dp_f = dpnp.array(f) + + expected = numpy.gradient(f, x) + result = dpnp.gradient(dp_f, dp_x) + assert_array_equal(result, expected) + + @pytest.mark.parametrize( + "dt", [numpy.int8, numpy.int16, numpy.int32, numpy.int64] + ) + def test_x_signed_int_big_jump(self, dt): + minint = numpy.iinfo(dt).min + maxint = numpy.iinfo(dt).max + x = numpy.array([-1, maxint], dtype=dt) + f = numpy.array([minint // 2, 0]) + + dp_x = dpnp.array(x) + dp_f = dpnp.array(f) + + expected = numpy.gradient(f, x) + result = dpnp.gradient(dp_f, dp_x) + assert_array_equal(result, expected) + + def test_return_type(self): + x = dpnp.array([[1, 2], [2, 3]]) + res = dpnp.gradient(x) + assert type(res) is tuple @pytest.mark.parametrize("dtype1", get_all_dtypes()) @@ -1384,32 +1727,6 @@ def test_trapz_with_dx_params(self, y_array, dx): assert_array_equal(expected, result) -class TestGradient: - @pytest.mark.parametrize( - "array", [[2, 3, 6, 8, 4, 9], [3.0, 4.0, 7.5, 9.0], [2, 6, 8, 10]] - ) - def test_gradient_y1(self, array): - np_y = numpy.array(array) - dpnp_y = dpnp.array(array) - - result = dpnp.gradient(dpnp_y) - expected = numpy.gradient(np_y) - assert_array_equal(expected, result) - - @pytest.mark.usefixtures("allow_fall_back_on_numpy") - @pytest.mark.parametrize( - "array", [[2, 3, 6, 8, 4, 9], [3.0, 4.0, 7.5, 9.0], [2, 6, 8, 10]] 
- ) - @pytest.mark.parametrize("dx", [2, 3.5]) - def test_gradient_y1_dx(self, array, dx): - np_y = numpy.array(array) - dpnp_y = dpnp.array(array) - - result = dpnp.gradient(dpnp_y, dx) - expected = numpy.gradient(np_y, dx) - assert_array_equal(expected, result) - - class TestRoundingFuncs: @pytest.fixture( params=[ diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index fae4dd52221..e66c1a55b87 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -625,6 +625,11 @@ def test_reduce_hypot(device): [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0], [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], ), + pytest.param( + "gradient", + [1.0, 2.0, 4.0, 7.0, 11.0, 16.0], + [0.0, 1.0, 1.5, 3.5, 4.0, 6.0], + ), pytest.param( "histogram_bin_edges", [0, 0, 0, 1, 2, 3, 3, 4, 5], @@ -691,7 +696,7 @@ def test_2in_1out(func, data1, data2, device): x2 = dpnp.array(data2, device=device) result = getattr(dpnp, func)(x1, x2) - assert_allclose(result, expected) + assert_dtype_allclose(result, expected) assert_sycl_queue_equal(result.sycl_queue, x1.sycl_queue) assert_sycl_queue_equal(result.sycl_queue, x2.sycl_queue) diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index eab59cf001b..f42b6a769bc 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -539,6 +539,7 @@ def test_norm(usm_type, ord, axis): pytest.param("exp2", [0.0, 1.0, 2.0]), pytest.param("expm1", [1.0e-10, 1.0, 2.0, 4.0, 7.0]), pytest.param("floor", [-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]), + pytest.param("gradient", [1, 2, 4, 7, 11, 16]), pytest.param("histogram_bin_edges", [0, 0, 0, 1, 2, 3, 3, 4, 5]), pytest.param( "imag", [complex(1.0, 2.0), complex(3.0, 4.0), complex(5.0, 6.0)] @@ -622,6 +623,9 @@ def test_1in_1out(func, data, usm_type): pytest.param("dot", [3 + 2j, 4 + 1j, 5], [1, 2 + 3j, 3]), pytest.param("fmax", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), pytest.param("fmin", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), + pytest.param( + "gradient", [1, 2, 4, 7, 11, 16], [0.0, 1.0, 1.5, 3.5, 4.0, 6.0] 
+ ), pytest.param( "hypot", [[1.0, 2.0, 3.0, 4.0]], [[-1.0, -2.0, -4.0, -5.0]] ), diff --git a/tests/third_party/cupy/math_tests/test_sumprod.py b/tests/third_party/cupy/math_tests/test_sumprod.py index 18a74a76330..f36086755e9 100644 --- a/tests/third_party/cupy/math_tests/test_sumprod.py +++ b/tests/third_party/cupy/math_tests/test_sumprod.py @@ -717,9 +717,15 @@ def test_diff_invalid_axis(self): ), ) ) -@pytest.mark.skip("gradient() is not implemented yet") class TestGradient: def _gradient(self, xp, dtype, shape, spacing, axis, edge_order): + if ( + not has_support_aspect64() + and shape == (10, 20, 30) + and spacing == "arrays" + ): + pytest.skip("too big values") + x = testing.shaped_random(shape, xp, dtype=dtype) if axis is None: normalized_axes = tuple(range(x.ndim)) @@ -755,7 +761,9 @@ def test_gradient_floating(self, xp, dtype): # https://github.com/numpy/numpy/issues/15207 @testing.with_requires("numpy>=1.18.1") @testing.for_int_dtypes(no_bool=True) - @testing.numpy_cupy_allclose(atol=1e-6, rtol=1e-5) + @testing.numpy_cupy_allclose( + atol=1e-6, rtol=1e-5, type_check=has_support_aspect64() + ) def test_gradient_int(self, xp, dtype): return self._gradient( xp, dtype, self.shape, self.spacing, self.axis, self.edge_order @@ -773,7 +781,6 @@ def test_gradient_float16(self, xp): ) -@pytest.mark.skip("gradient() is not implemented yet") class TestGradientErrors: def test_gradient_invalid_spacings1(self): # more spacings than axes From 48d6191e1a03dfdbf66bfba63028201074e290a8 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Fri, 31 May 2024 15:12:36 +0200 Subject: [PATCH 11/49] Bump conda-build version to 24.5.1 (#1862) --- .github/workflows/conda-package.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml index 3b24acf7774..dde7e3d435a 100644 --- a/.github/workflows/conda-package.yml +++ b/.github/workflows/conda-package.yml @@ 
-12,7 +12,7 @@ env: PACKAGE_NAME: dpnp MODULE_NAME: dpnp CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' - CONDA_BUILD_VERSION: '24.5.0' + CONDA_BUILD_VERSION: '24.5.1' CONDA_INDEX_VERSION: '0.4.0' TEST_ENV_NAME: 'test' TEST_SCOPE: >- From 807dc14807d1e08b2643d15f8bb5500908b565aa Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 3 Jun 2024 12:38:17 +0200 Subject: [PATCH 12/49] Bump github/codeql-action from 3.25.6 to 3.25.7 (#1865) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.6 to 3.25.7. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/9fdb3e49720b44c48891d036bb502feb25684276...f079b8493333aace61c81488f8bd40919487bd9f) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/openssf-scorecard.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml index 726b817e2ff..25f351b9c1e 100644 --- a/.github/workflows/openssf-scorecard.yml +++ b/.github/workflows/openssf-scorecard.yml @@ -68,6 +68,6 @@ jobs: # Upload the results to GitHub's code scanning dashboard. 
- name: "Upload to code-scanning" - uses: github/codeql-action/upload-sarif@9fdb3e49720b44c48891d036bb502feb25684276 # v3.25.6 + uses: github/codeql-action/upload-sarif@f079b8493333aace61c81488f8bd40919487bd9f # v3.25.7 with: sarif_file: results.sarif From 006ccf95392e86f8034955cae66e56e39880cad0 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Tue, 4 Jun 2024 17:10:44 +0200 Subject: [PATCH 13/49] Implement `dpnp.sort_complex` function (#1864) * Implement dpnp.sort_complex * Improve test coverage --- doc/reference/sorting.rst | 1 - dpnp/dpnp_iface_sorting.py | 40 +++++- tests/skipped_tests.tbl | 44 ------- tests/skipped_tests_gpu.tbl | 44 ------- tests/test_sort.py | 25 ++++ tests/test_sycl_queue.py | 1 + tests/test_usm_type.py | 1 + .../cupy/sorting_tests/test_sort.py | 123 ++++++++++-------- 8 files changed, 135 insertions(+), 144 deletions(-) diff --git a/doc/reference/sorting.rst b/doc/reference/sorting.rst index 170e0d33662..d0a966c6731 100644 --- a/doc/reference/sorting.rst +++ b/doc/reference/sorting.rst @@ -13,7 +13,6 @@ Sorting dpnp.sort dpnp.lexsort dpnp.argsort - dpnp.msort dpnp.sort_complex dpnp.partition dpnp.argpartition diff --git a/dpnp/dpnp_iface_sorting.py b/dpnp/dpnp_iface_sorting.py index 3dff8adf485..a6fed26d406 100644 --- a/dpnp/dpnp_iface_sorting.py +++ b/dpnp/dpnp_iface_sorting.py @@ -51,9 +51,10 @@ from .dpnp_array import dpnp_array from .dpnp_utils import ( call_origin, + map_dtype_to_device, ) -__all__ = ["argsort", "partition", "sort"] +__all__ = ["argsort", "partition", "sort", "sort_complex"] def argsort(a, axis=-1, kind=None, order=None): @@ -263,3 +264,40 @@ def sort(a, axis=-1, kind=None, order=None): return dpnp_array._create_from_usm_ndarray( dpt.sort(dpnp.get_usm_ndarray(a), axis=axis) ) + + +def sort_complex(a): + """ + Sort a complex array using the real part first, then the imaginary part. + + For full documentation refer to :obj:`numpy.sort_complex`. 
+ + Parameters + ---------- + a : {dpnp.ndarray, usm_ndarray} + Input array. + + Returns + ------- + out : dpnp.ndarray of complex dtype + Always returns a sorted complex array. + + Examples + -------- + >>> import dpnp as np + >>> a = np.array([5, 3, 6, 2, 1]) + >>> np.sort_complex(a) + array([1.+0.j, 2.+0.j, 3.+0.j, 5.+0.j, 6.+0.j]) + + >>> a = np.array([1 + 2j, 2 - 1j, 3 - 2j, 3 - 3j, 3 + 5j]) + >>> np.sort_complex(a) + array([1.+2.j, 2.-1.j, 3.-3.j, 3.-2.j, 3.+5.j]) + + """ + + b = dpnp.sort(a) + if not dpnp.issubsctype(b.dtype, dpnp.complexfloating): + if b.dtype.char in "bhBH": + return b.astype(dpnp.complex64) + return b.astype(map_dtype_to_device(dpnp.complex128, b.sycl_device)) + return b diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl index 7fa1510e8a5..5e012b3a496 100644 --- a/tests/skipped_tests.tbl +++ b/tests/skipped_tests.tbl @@ -564,50 +564,6 @@ tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_bo tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_goodness_of_fit tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_goodness_of_fit_2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_axis2 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_negative_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_non_contiguous -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_none_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_sequence_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_zero_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_axis2 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_negative_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_non_contiguous -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_none_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_sequence_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_zero_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_F_order -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_lexsort_dtype -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_lexsort_three_or_more_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan3 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_view -tests/third_party/cupy/sorting_tests/test_sort.py::TestMsort::test_msort_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestMsort::test_msort_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_1dim 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_nan -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_ndim -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_zero_dim - tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_diag_exception tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_rowvar diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index 8791400846b..e14b954abe6 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -570,50 +570,6 @@ tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_bo tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_goodness_of_fit tests/third_party/cupy/random_tests/test_sample.py::TestRandomIntegers2::test_goodness_of_fit_2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_invalid_negative_kth 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_negative_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_non_contiguous -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_none_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_sequence_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_0_{external=False}::test_argpartition_zero_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_axis1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_axis2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_invalid_negative_kth 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_negative_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_negative_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_non_contiguous -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_none_axis -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_sequence_kth -tests/third_party/cupy/sorting_tests/test_sort.py::TestArgpartition_param_1_{external=True}::test_argpartition_zero_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_F_order -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_lexsort_dtype -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_lexsort_three_or_more_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan1 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan2 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_nan3 -tests/third_party/cupy/sorting_tests/test_sort.py::TestLexsort::test_view -tests/third_party/cupy/sorting_tests/test_sort.py::TestMsort::test_msort_multi_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestMsort::test_msort_one_dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_1dim -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_nan -tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_ndim 
-tests/third_party/cupy/sorting_tests/test_sort.py::TestSort_complex::test_sort_complex_zero_dim - tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_diag_exception tests/third_party/cupy/statistics_tests/test_correlation.py::TestCorrcoef::test_corrcoef_rowvar diff --git a/tests/test_sort.py b/tests/test_sort.py index e9e8afb4454..289f7c9716b 100644 --- a/tests/test_sort.py +++ b/tests/test_sort.py @@ -340,6 +340,31 @@ def test_sort_notimplemented(self): dpnp.sort(dp_array, order=["age"]) +class TestSortComplex: + @pytest.mark.parametrize( + "dtype", get_all_dtypes(no_complex=True) + [numpy.int8, numpy.int16] + ) + def test_real(self, dtype): + # sort_complex() type casting for real input types + a = numpy.array([5, 3, 6, 2, 1], dtype=dtype) + ia = dpnp.array(a) + + result = dpnp.sort_complex(ia) + expected = numpy.sort_complex(a) + assert_dtype_allclose(result, expected) + + @pytest.mark.parametrize("dtype", get_complex_dtypes()) + def test_complex(self, dtype): + # sort_complex() handling of complex input + a = numpy.array([2 + 3j, 1 - 2j, 1 - 3j, 2 + 1j], dtype=dtype) + ia = dpnp.array(a) + + result = dpnp.sort_complex(ia) + expected = numpy.sort_complex(a) + assert_equal(result, expected) + assert result.dtype == expected.dtype + + @pytest.mark.parametrize("kth", [0, 1], ids=["0", "1"]) @pytest.mark.parametrize("dtype", get_all_dtypes(no_none=True)) @pytest.mark.parametrize( diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index e66c1a55b87..8332f26949b 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -463,6 +463,7 @@ def test_meshgrid(device_x, device_y): ), pytest.param("sinh", [-5.0, -3.5, 0.0, 3.5, 5.0]), pytest.param("sort", [2.0, 1.0, 7.0, 4.0]), + pytest.param("sort_complex", [1 + 2j, 2 - 1j, 3 - 2j, 3 - 3j, 3 + 5j]), pytest.param("sqrt", [1.0, 3.0, 9.0]), pytest.param("square", [1.0, 3.0, 
9.0]), pytest.param("std", [1.0, 2.0, 4.0, 7.0]), diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index f42b6a769bc..f66017ea6e2 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -581,6 +581,7 @@ def test_norm(usm_type, ord, axis): ), pytest.param("sinh", [-5.0, -3.5, 0.0, 3.5, 5.0]), pytest.param("sort", [2.0, 1.0, 7.0, 4.0]), + pytest.param("sort_complex", [1 + 2j, 2 - 1j, 3 - 2j, 3 - 3j, 3 + 5j]), pytest.param("sqrt", [1.0, 3.0, 9.0]), pytest.param("square", [1.0, 3.0, 9.0]), pytest.param("std", [1.0, 2.0, 4.0, 7.0]), diff --git a/tests/third_party/cupy/sorting_tests/test_sort.py b/tests/third_party/cupy/sorting_tests/test_sort.py index 8715702db51..154e0b4f599 100644 --- a/tests/third_party/cupy/sorting_tests/test_sort.py +++ b/tests/third_party/cupy/sorting_tests/test_sort.py @@ -4,6 +4,7 @@ import pytest import dpnp as cupy +from tests.helper import has_support_aspect64 from tests.third_party.cupy import testing @@ -210,6 +211,7 @@ def test_large(self, xp): return xp.sort(a, axis=-1) +@pytest.mark.skip("lexsort() is not implemented yet") class TestLexsort(unittest.TestCase): # Test ranks @@ -221,12 +223,12 @@ def test_lexsort_zero_dim(self): with pytest.raises(numpy.AxisError): return xp.lexsort(a) - @testing.numpy_cupy_array_equal + @testing.numpy_cupy_array_equal() def test_lexsort_one_dim(self, xp): a = testing.shaped_random((2,), xp) return xp.lexsort(a) - @testing.numpy_cupy_array_equal + @testing.numpy_cupy_array_equal() def test_lexsort_two_dim(self, xp): a = xp.array( [[9, 4, 0, 4, 0, 2, 1], [1, 5, 1, 4, 3, 4, 4]] @@ -411,11 +413,10 @@ def test_nan2(self, xp, dtype): return self.argsort(a) +@pytest.mark.skip("msort() is deprecated") class TestMsort(unittest.TestCase): # Test base cases - # TODO(niboshi): Fix xfail - @pytest.mark.xfail(reason="Explicit error types required") def test_msort_zero_dim(self): for xp in (numpy, cupy): a = testing.shaped_random((), xp) @@ -443,19 +444,19 @@ def test_sort_complex_zero_dim(self): 
xp.sort_complex(a) @testing.for_all_dtypes() - @testing.numpy_cupy_array_equal() + @testing.numpy_cupy_array_equal(type_check=has_support_aspect64()) def test_sort_complex_1dim(self, xp, dtype): a = testing.shaped_random((100,), xp, dtype) return a, xp.sort_complex(a) @testing.for_all_dtypes() - @testing.numpy_cupy_array_equal() + @testing.numpy_cupy_array_equal(type_check=has_support_aspect64()) def test_sort_complex_ndim(self, xp, dtype): a = testing.shaped_random((2, 5, 3), xp, dtype) return a, xp.sort_complex(a) @testing.for_dtypes("efdFD") - @testing.numpy_cupy_array_equal() + @testing.numpy_cupy_array_equal(type_check=has_support_aspect64()) def test_sort_complex_nan(self, xp, dtype): a = testing.shaped_random((2, 3, 5), xp, dtype) a[0, 2, 1] = a[1, 0, 3] = xp.nan @@ -618,6 +619,7 @@ def test_partition_invalid_negative_axis2(self): } ) ) +@pytest.mark.skip("not fully supported yet") class TestArgpartition(unittest.TestCase): def argpartition(self, a, kth, axis=-1): if self.external: @@ -641,9 +643,9 @@ def test_argpartition_one_dim(self, xp, dtype): a = testing.shaped_random((10,), xp, dtype, 100) kth = 2 idx = self.argpartition(a, kth) - self.assertTrue((a[idx[:kth]] < a[idx[kth]]).all()) - self.assertTrue((a[idx[kth]] < a[idx[kth + 1 :]]).all()) - return idx[kth] + assert (a[idx[:kth]] <= a[idx[kth]]).all() + assert (a[idx[kth]] <= a[idx[kth + 1 :]]).all() + return a[idx[kth]] # TODO(leofang): test all dtypes -- this workaround needs to be kept, # likely due to #3287? Need investigation. 
@@ -655,18 +657,39 @@ def test_argpartition_multi_dim(self, xp, dtype): idx = self.argpartition(a, kth) rows = [[[0]], [[1]], [[2]]] cols = [[[0], [1], [2]]] - self.assertTrue( - ( - a[rows, cols, idx[:, :, :kth]] - < a[rows, cols, idx[:, :, kth : kth + 1]] - ).all() - ) - self.assertTrue( - ( - a[rows, cols, idx[:, :, kth : kth + 1]] - < a[rows, cols, idx[:, :, kth + 1 :]] - ).all() - ) + assert ( + a[rows, cols, idx[:, :, :kth]] + < a[rows, cols, idx[:, :, kth : kth + 1]] + ).all() + assert ( + a[rows, cols, idx[:, :, kth : kth + 1]] + < a[rows, cols, idx[:, :, kth + 1 :]] + ).all() + return idx[:, :, kth : kth + 1] + + @testing.for_all_dtypes(no_bool=True) + @testing.numpy_cupy_array_equal() + def test_argpartition_multi_dim_kernel(self, xp, dtype): + # Use a larger scale for shaped_random to avoid duplicated numbers, + # which may make different indices at kth between NumPy and CuPy. Skip + # if int8 and uint8 not to overflow. + if dtype in (xp.int8, xp.uint8): + pytest.skip() + a = testing.shaped_random((3, 3, 256), xp, dtype, 10000) + kth = 20 + idx = self.argpartition(a, kth, axis=-1) + + rows = [[[0]], [[1]], [[2]]] + cols = [[[0], [1], [2]]] + + assert ( + a[rows, cols, idx[:, :, :kth]] + <= a[rows, cols, idx[:, :, kth : kth + 1]] + ).all() + assert ( + a[rows, cols, idx[:, :, kth : kth + 1]] + <= a[rows, cols, idx[:, :, kth + 1 :]] + ).all() return idx[:, :, kth : kth + 1] # Test non-contiguous array @@ -676,8 +699,8 @@ def test_argpartition_non_contiguous(self, xp): a = testing.shaped_random((10,), xp, "i", 100)[::2] kth = 2 idx = self.argpartition(a, kth) - self.assertTrue((a[idx[:kth]] < a[idx[kth]]).all()) - self.assertTrue((a[idx[kth]] < a[idx[kth + 1 :]]).all()) + assert (a[idx[:kth]] < a[idx[kth]]).all() + assert (a[idx[kth]] < a[idx[kth + 1 :]]).all() return idx[kth] # Test kth @@ -688,8 +711,8 @@ def test_argpartition_sequence_kth(self, xp): kth = (2, 4) idx = self.argpartition(a, kth) for _kth in kth: - self.assertTrue((a[idx[:_kth]] < 
a[idx[_kth]]).all()) - self.assertTrue((a[idx[_kth]] < a[idx[_kth + 1 :]]).all()) + assert (a[idx[:_kth]] < a[idx[_kth]]).all() + assert (a[idx[_kth]] < a[idx[_kth + 1 :]]).all() return (idx[2], idx[4]) @testing.numpy_cupy_equal() @@ -697,8 +720,8 @@ def test_argpartition_negative_kth(self, xp): a = testing.shaped_random((10,), xp, scale=100) kth = -3 idx = self.argpartition(a, kth) - self.assertTrue((a[idx[:kth]] < a[idx[kth]]).all()) - self.assertTrue((a[idx[kth]] < a[idx[kth + 1 :]]).all()) + assert (a[idx[:kth]] < a[idx[kth]]).all() + assert (a[idx[kth]] < a[idx[kth + 1 :]]).all() return idx[kth] def test_argpartition_invalid_kth(self): @@ -725,18 +748,14 @@ def test_argpartition_axis(self, xp): idx = self.argpartition(a, kth, axis=axis) rows = [[[0], [1], [2]]] cols = [[[0, 1, 2]]] - self.assertTrue( - ( - a[idx[:kth, :, :], rows, cols] - < a[idx[kth : kth + 1, :, :], rows, cols] - ).all() - ) - self.assertTrue( - ( - a[idx[kth : kth + 1, :, :], rows, cols] - < a[idx[kth + 1 :, :, :], rows, cols] - ).all() - ) + assert ( + a[idx[:kth, :, :], rows, cols] + < a[idx[kth : kth + 1, :, :], rows, cols] + ).all() + assert ( + a[idx[kth : kth + 1, :, :], rows, cols] + < a[idx[kth + 1 :, :, :], rows, cols] + ).all() return idx[kth : kth + 1, :, :] @testing.numpy_cupy_array_equal() @@ -747,18 +766,14 @@ def test_argpartition_negative_axis(self, xp): idx = self.argpartition(a, kth, axis=axis) rows = [[[0]], [[1]], [[2]]] cols = [[[0], [1], [2]]] - self.assertTrue( - ( - a[rows, cols, idx[:, :, :kth]] - < a[rows, cols, idx[:, :, kth : kth + 1]] - ).all() - ) - self.assertTrue( - ( - a[rows, cols, idx[:, :, kth : kth + 1]] - < a[rows, cols, idx[:, :, kth + 1 :]] - ).all() - ) + assert ( + a[rows, cols, idx[:, :, :kth]] + < a[rows, cols, idx[:, :, kth : kth + 1]] + ).all() + assert ( + a[rows, cols, idx[:, :, kth : kth + 1]] + < a[rows, cols, idx[:, :, kth + 1 :]] + ).all() return idx[:, :, kth : kth + 1] @testing.numpy_cupy_equal() @@ -768,8 +783,8 @@ def 
test_argpartition_none_axis(self, xp): axis = None idx = self.argpartition(a, kth, axis=axis) a1 = a.flatten() - self.assertTrue((a1[idx[:kth]] < a1[idx[kth]]).all()) - self.assertTrue((a1[idx[kth]] < a1[idx[kth + 1 :]]).all()) + assert (a1[idx[:kth]] < a1[idx[kth]]).all() + assert (a1[idx[kth]] < a1[idx[kth + 1 :]]).all() return idx[kth] def test_argpartition_invalid_axis1(self): From 062bbd7a81d8ebe104a8b00344866d20122cf076 Mon Sep 17 00:00:00 2001 From: vtavana <120411540+vtavana@users.noreply.github.com> Date: Fri, 7 Jun 2024 08:49:52 -0500 Subject: [PATCH 14/49] minor updates for related to BLAS routines (#1869) --- dpnp/backend/extensions/blas/blas_py.cpp | 14 ++++----- dpnp/backend/extensions/blas/gemm_batch.cpp | 16 +--------- dpnp/dpnp_iface_linearalgebra.py | 33 ++++++++++++--------- dpnp/dpnp_utils/dpnp_utils_linearalgebra.py | 7 ++--- tests/test_product.py | 25 ++++++++++------ 5 files changed, 45 insertions(+), 50 deletions(-) diff --git a/dpnp/backend/extensions/blas/blas_py.cpp b/dpnp/backend/extensions/blas/blas_py.cpp index 3fdfebe7c30..b5d83375f23 100644 --- a/dpnp/backend/extensions/blas/blas_py.cpp +++ b/dpnp/backend/extensions/blas/blas_py.cpp @@ -73,7 +73,7 @@ PYBIND11_MODULE(_blas_impl, m) }; m.def("_dot", dot_pyapi, - "Call `dot` from OneMKL BLAS library to return " + "Call `dot` from OneMKL BLAS library to compute " "the dot product of two real-valued vectors.", py::arg("sycl_queue"), py::arg("vectorA"), py::arg("vectorB"), py::arg("result"), py::arg("depends") = py::list()); @@ -91,7 +91,7 @@ PYBIND11_MODULE(_blas_impl, m) }; m.def("_dotc", dotc_pyapi, - "Call `dotc` from OneMKL BLAS library to return " + "Call `dotc` from OneMKL BLAS library to compute " "the dot product of two complex vectors, " "conjugating the first vector.", py::arg("sycl_queue"), py::arg("vectorA"), py::arg("vectorB"), @@ -110,7 +110,7 @@ PYBIND11_MODULE(_blas_impl, m) }; m.def("_dotu", dotu_pyapi, - "Call `dotu` from OneMKL BLAS library to return " + "Call `dotu` 
from OneMKL BLAS library to compute " "the dot product of two complex vectors.", py::arg("sycl_queue"), py::arg("vectorA"), py::arg("vectorB"), py::arg("result"), py::arg("depends") = py::list()); @@ -118,7 +118,7 @@ PYBIND11_MODULE(_blas_impl, m) { m.def("_gemm", &blas_ext::gemm, - "Call `gemm` from OneMKL BLAS library to return " + "Call `gemm` from OneMKL BLAS library to compute " "the matrix-matrix product with 2-D matrices.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("matrixB"), py::arg("resultC"), py::arg("depends") = py::list()); @@ -126,7 +126,7 @@ PYBIND11_MODULE(_blas_impl, m) { m.def("_gemm_batch", &blas_ext::gemm_batch, - "Call `gemm_batch` from OneMKL BLAS library to return " + "Call `gemm_batch` from OneMKL BLAS library to compute " "the matrix-matrix product for a batch of 2-D matrices.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("matrixB"), py::arg("resultC"), py::arg("depends") = py::list()); @@ -134,8 +134,8 @@ PYBIND11_MODULE(_blas_impl, m) { m.def("_gemv", &blas_ext::gemv, - "Call `gemv` from OneMKL BLAS library to return " - "the matrix-vector product using a general matrix.", + "Call `gemv` from OneMKL BLAS library to compute " + "the matrix-vector product with a general matrix.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("vectorX"), py::arg("vectorY"), py::arg("transpose"), py::arg("depends") = py::list()); diff --git a/dpnp/backend/extensions/blas/gemm_batch.cpp b/dpnp/backend/extensions/blas/gemm_batch.cpp index 0d8ad1a6743..689ef77b786 100644 --- a/dpnp/backend/extensions/blas/gemm_batch.cpp +++ b/dpnp/backend/extensions/blas/gemm_batch.cpp @@ -257,21 +257,7 @@ std::tuple throw py::value_error("The number of columns in B must be equal to " "the number of columns in result array."); } - - std::int64_t first_dim; - if (a_shape[0] == b_shape[0]) { - first_dim = a_shape[0]; - } - else if (a_shape[0] == 1 || b_shape[0] == 1) { - first_dim = std::max(a_shape[0], b_shape[0]); - } - else { - throw py::value_error("Array 
shapes do not match."); - } - if (first_dim != c_shape[0]) { - throw py::value_error("Array shapes do not match."); - } - std::int64_t src_nelems = first_dim * m * n; + std::int64_t src_nelems = batch_size * m * n; dpctl::tensor::validation::CheckWritable::throw_if_not_writable(resultC); dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(resultC, src_nelems); diff --git a/dpnp/dpnp_iface_linearalgebra.py b/dpnp/dpnp_iface_linearalgebra.py index 033929443a5..1af952388a6 100644 --- a/dpnp/dpnp_iface_linearalgebra.py +++ b/dpnp/dpnp_iface_linearalgebra.py @@ -136,32 +136,29 @@ def dot(a, b, out=None): raise ValueError("Only C-contiguous array is acceptable.") if dpnp.isscalar(a) or dpnp.isscalar(b): - # TODO: investigate usage of axpy (axpy_batch) or scal - # functions from BLAS here instead of dpnp.multiply + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b, out=out) a_ndim = a.ndim b_ndim = b.ndim if a_ndim == 0 or b_ndim == 0: - # TODO: investigate usage of axpy (axpy_batch) or scal - # functions from BLAS here instead of dpnp.multiply + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b, out=out) if a_ndim == 1 and b_ndim == 1: return dpnp_dot(a, b, out=out) + # NumPy does not allow casting even if it is safe + # casting="no" is used in the following if a_ndim == 2 and b_ndim == 2: - # NumPy does not allow casting even if it is safe return dpnp.matmul(a, b, out=out, casting="no") if a_ndim == 1 or b_ndim == 1: - # NumPy does not allow casting even if it is safe return dpnp.matmul(a, b, out=out, casting="no") # TODO: investigate usage of matmul for some possible # use cases instead of dpnp.tensordot result = dpnp.tensordot(a, b, axes=(-1, -2)) - # NumPy does not allow casting even if it is safe return dpnp.get_result_array(result, out, casting="no") @@ -619,9 +616,11 @@ def inner(a, b): dpnp.check_supported_arrays_type(a, b, scalar_type=True) if dpnp.isscalar(a) or dpnp.isscalar(b): + # TODO: use specific scalar-vector 
kernel return dpnp.multiply(a, b) if a.ndim == 0 or b.ndim == 0: + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b) if a.shape[-1] != b.shape[-1]: @@ -696,11 +695,13 @@ def kron(a, b): dpnp.check_supported_arrays_type(a, b, scalar_type=True) if dpnp.isscalar(a) or dpnp.isscalar(b): + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b) a_ndim = a.ndim b_ndim = b.ndim if a_ndim == 0 or b_ndim == 0: + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b) return dpnp_kron(a, b, a_ndim, b_ndim) @@ -999,6 +1000,7 @@ def tensordot(a, b, axes=2): raise ValueError( "One of the inputs is scalar, axes should be zero." ) + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b) try: @@ -1028,6 +1030,7 @@ def tensordot(a, b, axes=2): axes_b = normalize_axis_tuple(axes_b, b_ndim, "axis_b") if a.ndim == 0 or b.ndim == 0: + # TODO: use specific scalar-vector kernel return dpnp.multiply(a, b) a_shape = a.shape @@ -1112,14 +1115,16 @@ def vdot(a, b): dpnp.check_supported_arrays_type(a, b, scalar_type=True) - if dpnp.isscalar(a) or dpnp.isscalar(b): - if dpnp.isscalar(b) and a.size != 1: - raise ValueError("The first array should be of size one.") - if dpnp.isscalar(a) and b.size != 1: + if dpnp.isscalar(a): + if b.size != 1: raise ValueError("The second array should be of size one.") - a_conj = numpy.conj(a) if dpnp.isscalar(a) else dpnp.conj(a) - # TODO: investigate usage of axpy (axpy_batch) or scal - # functions from BLAS here instead of dpnp.multiply + a_conj = numpy.conj(a) + return dpnp.multiply(a_conj, b) + + if dpnp.isscalar(b): + if a.size != 1: + raise ValueError("The first array should be of size one.") + a_conj = dpnp.conj(a) return dpnp.multiply(a_conj, b) if a.ndim == 1 and b.ndim == 1: diff --git a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py index 43f6cc1f3fe..0b9686771c3 100644 --- a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py +++ 
b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py @@ -108,7 +108,7 @@ def _chr(label): return chr(label) -def _compute_res_dtype(*arrays, dtype, casting, sycl_queue): +def _compute_res_dtype(*arrays, sycl_queue, dtype=None, casting="no"): """ Determines the output array data type and an intermediate data type used in performing calculations related to a specific math function. @@ -1748,10 +1748,7 @@ def dpnp_dot(a, b, /, out=None, *, conjugate=False): res_usm_type, exec_q = get_usm_allocations([a, b]) # Determine the appropriate data types - # casting is irrelevant here since dtype is `None` - dot_dtype, res_dtype = _compute_res_dtype( - a, b, dtype=None, casting="no", sycl_queue=exec_q - ) + dot_dtype, res_dtype = _compute_res_dtype(a, b, sycl_queue=exec_q) result = _create_result_array( a, b, out, (), dot_dtype, res_usm_type, exec_q diff --git a/tests/test_product.py b/tests/test_product.py index ded938bda7f..d9463a1546c 100644 --- a/tests/test_product.py +++ b/tests/test_product.py @@ -1,6 +1,7 @@ import dpctl import numpy import pytest +from numpy.testing import assert_raises import dpnp @@ -8,6 +9,11 @@ def _assert_selective_dtype_allclose(result, expected, dtype): + # For numpy.dot, numpy.vdot, numpy.kron, numpy.inner, and numpy.tensordot, + # when inputs are a scalar (which has the default dtype of the platform) and + # an array, the scalar dtype precision determines the output dtype + # precision. In dpnp, we rely on dpnp.multiply for the scalar-array product + # and the array (not the scalar) determines the output dtype precision of dpnp.multiply if dtype in [numpy.int32, numpy.float32, numpy.complex64]: assert_dtype_allclose(result, expected, check_only_type_kind=True) else: @@ -467,21 +473,22 @@ def test_dot_sycl_queue_error(self): with pytest.raises(ValueError): dpnp.dot(a, b) - # NumPy does not raise an error for the following test.
- # it just does not update the out keyword if it as not properly defined - @pytest.mark.parametrize("ia", [1, dpnp.ones((), dtype=dpnp.int32)]) + @pytest.mark.parametrize("ia", [1, dpnp.ones((), dtype=dpnp.float32)]) def test_dot_out_error_scalar(self, ia): - ib = dpnp.ones(10, dtype=dpnp.int32) + a = ia if dpnp.isscalar(ia) else ia.asnumpy() + ib = dpnp.ones(10, dtype=dpnp.float32) + b = ib.asnumpy() # output data type is incorrect - dp_out = dpnp.empty((10,), dtype=dpnp.int64) - with pytest.raises(ValueError): - dpnp.dot(ia, ib, out=dp_out) + dp_out = dpnp.empty((10,), dtype=dpnp.complex64) + out = numpy.empty((10,), dtype=numpy.complex64) + assert_raises(ValueError, dpnp.dot, ia, ib, out=dp_out) + assert_raises(ValueError, numpy.dot, a, b, out=out) # output shape is incorrect dp_out = dpnp.empty((2,), dtype=dpnp.int32) - with pytest.raises(ValueError): - dpnp.dot(ia, ib, out=dp_out) + assert_raises(ValueError, dpnp.dot, ia, ib, out=dp_out) + assert_raises(ValueError, numpy.dot, a, b, out=out) @pytest.mark.parametrize( "shape_pair", From 0d326ea273bd064aa303481e59b39339d49b5b3b Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 10 Jun 2024 11:23:35 +0200 Subject: [PATCH 15/49] Bump github/codeql-action from 3.25.7 to 3.25.8 (#1876) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.7 to 3.25.8. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/f079b8493333aace61c81488f8bd40919487bd9f...2e230e8fe0ad3a14a340ad0815ddb96d599d2aff) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... 
Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/openssf-scorecard.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml index 25f351b9c1e..5d0d13d45fb 100644 --- a/.github/workflows/openssf-scorecard.yml +++ b/.github/workflows/openssf-scorecard.yml @@ -68,6 +68,6 @@ jobs: # Upload the results to GitHub's code scanning dashboard. - name: "Upload to code-scanning" - uses: github/codeql-action/upload-sarif@f079b8493333aace61c81488f8bd40919487bd9f # v3.25.7 + uses: github/codeql-action/upload-sarif@2e230e8fe0ad3a14a340ad0815ddb96d599d2aff # v3.25.8 with: sarif_file: results.sarif From 97512e541dcb8b09383492a8dc67f841a4352e8f Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Mon, 10 Jun 2024 13:14:44 +0200 Subject: [PATCH 16/49] Limit rerun of the tests in GitHub action by 2 attempts (#1875) * Limit rerun of the tests in GH action by 2 attempts * Set retry limit to 2 --- .github/workflows/conda-package.yml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml index dde7e3d435a..83c657a77c5 100644 --- a/.github/workflows/conda-package.yml +++ b/.github/workflows/conda-package.yml @@ -14,6 +14,7 @@ env: CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' CONDA_BUILD_VERSION: '24.5.1' CONDA_INDEX_VERSION: '0.4.0' + RUN_TESTS_MAX_ATTEMPTS: 2 TEST_ENV_NAME: 'test' TEST_SCOPE: >- test_absolute.py @@ -264,7 +265,7 @@ jobs: with: shell: bash timeout_minutes: 10 - max_attempts: 5 + max_attempts: ${{ env.RUN_TESTS_MAX_ATTEMPTS }} retry_on: any command: | . 
$CONDA/etc/profile.d/conda.sh @@ -420,7 +421,7 @@ jobs: with: shell: cmd timeout_minutes: 15 - max_attempts: 5 + max_attempts: ${{ env.RUN_TESTS_MAX_ATTEMPTS }} retry_on: any command: >- mamba activate ${{ env.TEST_ENV_NAME }} From 114dff6df72b43b123d55d293c5db312b0102c0a Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:47:28 +0200 Subject: [PATCH 17/49] Bump dpctl version for meta.yaml and dependencies of GH actions (#1874) --- .github/workflows/build-sphinx.yml | 2 +- .github/workflows/generate_coverage.yaml | 4 ++-- conda-recipe/meta.yaml | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/.github/workflows/build-sphinx.yml b/.github/workflows/build-sphinx.yml index 5d0372fb48d..02d4be09541 100644 --- a/.github/workflows/build-sphinx.yml +++ b/.github/workflows/build-sphinx.yml @@ -125,7 +125,7 @@ jobs: - name: Install dpnp dependencies run: | - mamba install numpy"<1.24" dpctl">=0.17.0dev0" mkl-devel-dpcpp onedpl-devel tbb-devel dpcpp_linux-64 \ + mamba install numpy"<1.24" dpctl">=0.18.0dev0" mkl-devel-dpcpp onedpl-devel tbb-devel dpcpp_linux-64 \ cmake cython pytest ninja scikit-build ${{ env.CHANNELS }} - name: Install cuPy dependencies diff --git a/.github/workflows/generate_coverage.yaml b/.github/workflows/generate_coverage.yaml index d0f65e9729b..90586747841 100644 --- a/.github/workflows/generate_coverage.yaml +++ b/.github/workflows/generate_coverage.yaml @@ -81,13 +81,13 @@ jobs: if: env.INSTALL_ONE_API == 'yes' run: | mamba install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ - dpctl">=0.17.0dev0" onedpl-devel ${{ env.CHANNELS }} + dpctl">=0.18.0dev0" onedpl-devel ${{ env.CHANNELS }} - name: Install dpnp dependencies if: env.INSTALL_ONE_API != 'yes' run: | mamba install cython llvm cmake">=3.21" scikit-build ninja pytest pytest-cov coverage[toml] \ - dpctl">=0.17.0dev0" dpcpp_linux-64 mkl-devel-dpcpp tbb-devel onedpl-devel ${{ env.CHANNELS
}} + dpctl">=0.18.0dev0" dpcpp_linux-64 mkl-devel-dpcpp tbb-devel onedpl-devel ${{ env.CHANNELS }} - name: Conda info run: | diff --git a/conda-recipe/meta.yaml b/conda-recipe/meta.yaml index 80375eb26f0..c10cd061345 100644 --- a/conda-recipe/meta.yaml +++ b/conda-recipe/meta.yaml @@ -2,7 +2,7 @@ {% set excluded_compiler_version1 = "2024.0.1" %} {% set excluded_compiler_version2 = "2024.0.2" %} {% set excluded_compiler_version3 = "2024.0.3" %} -{% set required_dpctl_version = "0.16.0" %} +{% set required_dpctl_version = "0.17.0" %} package: name: dpnp From 896209a1715dba870867ee9d27ff585113c50d42 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Mon, 10 Jun 2024 16:22:16 +0200 Subject: [PATCH 18/49] Use dpctl from dedicated channel in coverage GH action (#1873) --- .github/workflows/generate_coverage.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/generate_coverage.yaml b/.github/workflows/generate_coverage.yaml index 90586747841..22ec13da23a 100644 --- a/.github/workflows/generate_coverage.yaml +++ b/.github/workflows/generate_coverage.yaml @@ -21,7 +21,7 @@ jobs: env: python-ver: '3.10' - CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' + CHANNELS: '-c dppy/label/coverage -c intel -c conda-forge --override-channels' # Install the latest oneAPI compiler to work around an issue INSTALL_ONE_API: 'yes' From d553611d9670fb1ee50e6b293ffea2853d0f5655 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Wed, 12 Jun 2024 21:26:04 +0200 Subject: [PATCH 19/49] Preparation to reuse future common dpctl f/w in functions from `vm` extension (#1868) * Preparation to reuse common dpctl f/w for VM functions * PoC to decouple abs implementation to separate source file * Reuse typedef for function poiter from dpctl.tensor * Define populating vectors by a separate macro * Move implementation of utility functions from headers to source to 
resolve link issues * Separated implementation of acos function * Separated implementation of acosh function * Use function to simplify strides from dpctl tensor headers * PoC to decouple add implementation to separate source file * Separated implementation of asin function * Separated implementation of asinh function * Separated implementation of atan, atan2, atanh functions * Resolve issue with calling MKL function for undefined types * Separated implementation of cbrt, ceil, conj, cos and cosh functions * Separated implementation of div, exp, exp2, expm1, floor and hypot functions * Separated implementation of ln, log1p, log2 and log10 functions * Separated implementation of mul, pow, rint, sin and sinh functions * Separated implementation of sqr, sqrt, sub, tan, tanh and trunc functions * Removed unused header with types matrix * Remove unused functions * Use passing by reference in unary and binary funcs --- .../elementwise_functions.hpp | 824 +++++++++++++ .../elementwise_functions_type_utils.cpp | 87 ++ .../elementwise_functions_type_utils.hpp | 47 + .../simplify_iteration_space.cpp | 205 ++++ .../simplify_iteration_space.hpp | 61 + dpnp/backend/extensions/vm/CMakeLists.txt | 44 +- dpnp/backend/extensions/vm/abs.cpp | 138 +++ dpnp/backend/extensions/vm/abs.hpp | 54 +- dpnp/backend/extensions/vm/acos.cpp | 138 +++ dpnp/backend/extensions/vm/acos.hpp | 54 +- dpnp/backend/extensions/vm/acosh.cpp | 138 +++ dpnp/backend/extensions/vm/acosh.hpp | 54 +- dpnp/backend/extensions/vm/add.cpp | 171 +++ dpnp/backend/extensions/vm/add.hpp | 57 +- dpnp/backend/extensions/vm/asin.cpp | 138 +++ dpnp/backend/extensions/vm/asin.hpp | 54 +- dpnp/backend/extensions/vm/asinh.cpp | 138 +++ dpnp/backend/extensions/vm/asinh.hpp | 54 +- dpnp/backend/extensions/vm/atan.cpp | 138 +++ dpnp/backend/extensions/vm/atan.hpp | 54 +- dpnp/backend/extensions/vm/atan2.cpp | 160 +++ dpnp/backend/extensions/vm/atan2.hpp | 57 +- dpnp/backend/extensions/vm/atanh.cpp | 138 +++ 
dpnp/backend/extensions/vm/atanh.hpp | 54 +- dpnp/backend/extensions/vm/cbrt.cpp | 136 +++ dpnp/backend/extensions/vm/cbrt.hpp | 54 +- dpnp/backend/extensions/vm/ceil.cpp | 136 +++ dpnp/backend/extensions/vm/ceil.hpp | 54 +- dpnp/backend/extensions/vm/common.hpp | 374 ++---- dpnp/backend/extensions/vm/conj.cpp | 136 +++ dpnp/backend/extensions/vm/conj.hpp | 54 +- dpnp/backend/extensions/vm/cos.cpp | 138 +++ dpnp/backend/extensions/vm/cos.hpp | 54 +- dpnp/backend/extensions/vm/cosh.cpp | 138 +++ dpnp/backend/extensions/vm/cosh.hpp | 54 +- dpnp/backend/extensions/vm/div.cpp | 171 +++ dpnp/backend/extensions/vm/div.hpp | 57 +- dpnp/backend/extensions/vm/exp.cpp | 138 +++ dpnp/backend/extensions/vm/exp.hpp | 54 +- dpnp/backend/extensions/vm/exp2.cpp | 136 +++ dpnp/backend/extensions/vm/exp2.hpp | 54 +- dpnp/backend/extensions/vm/expm1.cpp | 136 +++ dpnp/backend/extensions/vm/expm1.hpp | 54 +- dpnp/backend/extensions/vm/floor.cpp | 136 +++ dpnp/backend/extensions/vm/floor.hpp | 54 +- dpnp/backend/extensions/vm/hypot.cpp | 160 +++ dpnp/backend/extensions/vm/hypot.hpp | 57 +- dpnp/backend/extensions/vm/ln.cpp | 138 +++ dpnp/backend/extensions/vm/ln.hpp | 53 +- dpnp/backend/extensions/vm/log10.cpp | 138 +++ dpnp/backend/extensions/vm/log10.hpp | 54 +- dpnp/backend/extensions/vm/log1p.cpp | 136 +++ dpnp/backend/extensions/vm/log1p.hpp | 54 +- dpnp/backend/extensions/vm/log2.cpp | 136 +++ dpnp/backend/extensions/vm/log2.hpp | 54 +- dpnp/backend/extensions/vm/mul.cpp | 171 +++ dpnp/backend/extensions/vm/mul.hpp | 57 +- dpnp/backend/extensions/vm/pow.cpp | 171 +++ dpnp/backend/extensions/vm/pow.hpp | 57 +- dpnp/backend/extensions/vm/rint.cpp | 136 +++ .../extensions/vm/{round.hpp => rint.hpp} | 54 +- dpnp/backend/extensions/vm/sin.cpp | 138 +++ dpnp/backend/extensions/vm/sin.hpp | 54 +- dpnp/backend/extensions/vm/sinh.cpp | 138 +++ dpnp/backend/extensions/vm/sinh.hpp | 54 +- dpnp/backend/extensions/vm/sqr.cpp | 136 +++ dpnp/backend/extensions/vm/sqr.hpp | 54 +- 
dpnp/backend/extensions/vm/sqrt.cpp | 139 +++ dpnp/backend/extensions/vm/sqrt.hpp | 54 +- dpnp/backend/extensions/vm/sub.cpp | 171 +++ dpnp/backend/extensions/vm/sub.hpp | 57 +- dpnp/backend/extensions/vm/tan.cpp | 138 +++ dpnp/backend/extensions/vm/tan.hpp | 54 +- dpnp/backend/extensions/vm/tanh.cpp | 138 +++ dpnp/backend/extensions/vm/tanh.hpp | 54 +- dpnp/backend/extensions/vm/trunc.cpp | 136 +++ dpnp/backend/extensions/vm/trunc.hpp | 54 +- dpnp/backend/extensions/vm/types_matrix.hpp | 659 ---------- dpnp/backend/extensions/vm/vm_py.cpp | 1081 +---------------- 79 files changed, 6629 insertions(+), 3681 deletions(-) create mode 100644 dpnp/backend/extensions/elementwise_functions/elementwise_functions.hpp create mode 100644 dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.cpp create mode 100644 dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.hpp create mode 100644 dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.cpp create mode 100644 dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.hpp create mode 100644 dpnp/backend/extensions/vm/abs.cpp create mode 100644 dpnp/backend/extensions/vm/acos.cpp create mode 100644 dpnp/backend/extensions/vm/acosh.cpp create mode 100644 dpnp/backend/extensions/vm/add.cpp create mode 100644 dpnp/backend/extensions/vm/asin.cpp create mode 100644 dpnp/backend/extensions/vm/asinh.cpp create mode 100644 dpnp/backend/extensions/vm/atan.cpp create mode 100644 dpnp/backend/extensions/vm/atan2.cpp create mode 100644 dpnp/backend/extensions/vm/atanh.cpp create mode 100644 dpnp/backend/extensions/vm/cbrt.cpp create mode 100644 dpnp/backend/extensions/vm/ceil.cpp create mode 100644 dpnp/backend/extensions/vm/conj.cpp create mode 100644 dpnp/backend/extensions/vm/cos.cpp create mode 100644 dpnp/backend/extensions/vm/cosh.cpp create mode 100644 dpnp/backend/extensions/vm/div.cpp create mode 100644 dpnp/backend/extensions/vm/exp.cpp create 
mode 100644 dpnp/backend/extensions/vm/exp2.cpp create mode 100644 dpnp/backend/extensions/vm/expm1.cpp create mode 100644 dpnp/backend/extensions/vm/floor.cpp create mode 100644 dpnp/backend/extensions/vm/hypot.cpp create mode 100644 dpnp/backend/extensions/vm/ln.cpp create mode 100644 dpnp/backend/extensions/vm/log10.cpp create mode 100644 dpnp/backend/extensions/vm/log1p.cpp create mode 100644 dpnp/backend/extensions/vm/log2.cpp create mode 100644 dpnp/backend/extensions/vm/mul.cpp create mode 100644 dpnp/backend/extensions/vm/pow.cpp create mode 100644 dpnp/backend/extensions/vm/rint.cpp rename dpnp/backend/extensions/vm/{round.hpp => rint.hpp} (53%) create mode 100644 dpnp/backend/extensions/vm/sin.cpp create mode 100644 dpnp/backend/extensions/vm/sinh.cpp create mode 100644 dpnp/backend/extensions/vm/sqr.cpp create mode 100644 dpnp/backend/extensions/vm/sqrt.cpp create mode 100644 dpnp/backend/extensions/vm/sub.cpp create mode 100644 dpnp/backend/extensions/vm/tan.cpp create mode 100644 dpnp/backend/extensions/vm/tanh.cpp create mode 100644 dpnp/backend/extensions/vm/trunc.cpp delete mode 100644 dpnp/backend/extensions/vm/types_matrix.hpp diff --git a/dpnp/backend/extensions/elementwise_functions/elementwise_functions.hpp b/dpnp/backend/extensions/elementwise_functions/elementwise_functions.hpp new file mode 100644 index 00000000000..01013d10f5d --- /dev/null +++ b/dpnp/backend/extensions/elementwise_functions/elementwise_functions.hpp @@ -0,0 +1,824 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. 
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include + +#include "dpctl4pybind11.hpp" +#include +#include +#include + +#include "elementwise_functions_type_utils.hpp" +#include "simplify_iteration_space.hpp" + +// dpctl tensor headers +#include "kernels/alignment.hpp" +// #include "kernels/dpctl_tensor_types.hpp" +#include "utils/memory_overlap.hpp" +#include "utils/offset_utils.hpp" +#include "utils/output_validation.hpp" +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; +namespace td_ns = dpctl::tensor::type_dispatch; + +static_assert(std::is_same_v); + +namespace dpnp::extensions::py_internal +{ + +using dpctl::tensor::kernels::alignment_utils::is_aligned; +using dpctl::tensor::kernels::alignment_utils::required_alignment; + +/*! 
@brief Template implementing Python API for unary elementwise functions */ +template +std::pair + py_unary_ufunc(const dpctl::tensor::usm_ndarray &src, + const dpctl::tensor::usm_ndarray &dst, + sycl::queue &q, + const std::vector &depends, + // + const output_typesT &output_type_vec, + const contig_dispatchT &contig_dispatch_vector, + const strided_dispatchT &strided_dispatch_vector) +{ + int src_typenum = src.get_typenum(); + int dst_typenum = dst.get_typenum(); + + const auto &array_types = td_ns::usm_ndarray_types(); + int src_typeid = array_types.typenum_to_lookup_id(src_typenum); + int dst_typeid = array_types.typenum_to_lookup_id(dst_typenum); + + int func_output_typeid = output_type_vec[src_typeid]; + + // check that types are supported + if (dst_typeid != func_output_typeid) { + throw py::value_error( + "Destination array has unexpected elemental data type."); + } + + // check that queues are compatible + if (!dpctl::utils::queues_are_compatible(q, {src, dst})) { + throw py::value_error( + "Execution queue is not compatible with allocation queues"); + } + + dpctl::tensor::validation::CheckWritable::throw_if_not_writable(dst); + + // check that dimensions are the same + int src_nd = src.get_ndim(); + if (src_nd != dst.get_ndim()) { + throw py::value_error("Array dimensions are not the same."); + } + + // check that shapes are the same + const py::ssize_t *src_shape = src.get_shape_raw(); + const py::ssize_t *dst_shape = dst.get_shape_raw(); + bool shapes_equal(true); + size_t src_nelems(1); + + for (int i = 0; i < src_nd; ++i) { + src_nelems *= static_cast(src_shape[i]); + shapes_equal = shapes_equal && (src_shape[i] == dst_shape[i]); + } + if (!shapes_equal) { + throw py::value_error("Array shapes are not the same."); + } + + // if nelems is zero, return + if (src_nelems == 0) { + return std::make_pair(sycl::event(), sycl::event()); + } + + dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(dst, src_nelems); + + // check memory overlap + auto const 
&overlap = dpctl::tensor::overlap::MemoryOverlap(); + auto const &same_logical_tensors = + dpctl::tensor::overlap::SameLogicalTensors(); + if (overlap(src, dst) && !same_logical_tensors(src, dst)) { + throw py::value_error("Arrays index overlapping segments of memory"); + } + + const char *src_data = src.get_data(); + char *dst_data = dst.get_data(); + + // handle contiguous inputs + bool is_src_c_contig = src.is_c_contiguous(); + bool is_src_f_contig = src.is_f_contiguous(); + + bool is_dst_c_contig = dst.is_c_contiguous(); + bool is_dst_f_contig = dst.is_f_contiguous(); + + bool both_c_contig = (is_src_c_contig && is_dst_c_contig); + bool both_f_contig = (is_src_f_contig && is_dst_f_contig); + + if (both_c_contig || both_f_contig) { + auto contig_fn = contig_dispatch_vector[src_typeid]; + + if (contig_fn == nullptr) { + throw std::runtime_error( + "Contiguous implementation is missing for src_typeid=" + + std::to_string(src_typeid)); + } + + auto comp_ev = contig_fn(q, src_nelems, src_data, dst_data, depends); + sycl::event ht_ev = + dpctl::utils::keep_args_alive(q, {src, dst}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + + // simplify iteration space + // if 1d with strides 1 - input is contig + // dispatch to strided + + auto const &src_strides = src.get_strides_vector(); + auto const &dst_strides = dst.get_strides_vector(); + + using shT = std::vector; + shT simplified_shape; + shT simplified_src_strides; + shT simplified_dst_strides; + py::ssize_t src_offset(0); + py::ssize_t dst_offset(0); + + int nd = src_nd; + const py::ssize_t *shape = src_shape; + + simplify_iteration_space(nd, shape, src_strides, dst_strides, + // output + simplified_shape, simplified_src_strides, + simplified_dst_strides, src_offset, dst_offset); + + if (nd == 1 && simplified_src_strides[0] == 1 && + simplified_dst_strides[0] == 1) { + // Special case of contiguous data + auto contig_fn = contig_dispatch_vector[src_typeid]; + + if (contig_fn == nullptr) { + throw 
std::runtime_error( + "Contiguous implementation is missing for src_typeid=" + + std::to_string(src_typeid)); + } + + int src_elem_size = src.get_elemsize(); + int dst_elem_size = dst.get_elemsize(); + auto comp_ev = + contig_fn(q, src_nelems, src_data + src_elem_size * src_offset, + dst_data + dst_elem_size * dst_offset, depends); + + sycl::event ht_ev = + dpctl::utils::keep_args_alive(q, {src, dst}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + + // Strided implementation + auto strided_fn = strided_dispatch_vector[src_typeid]; + + if (strided_fn == nullptr) { + throw std::runtime_error( + "Strided implementation is missing for src_typeid=" + + std::to_string(src_typeid)); + } + + using dpctl::tensor::offset_utils::device_allocate_and_pack; + + std::vector host_tasks{}; + host_tasks.reserve(2); + + const auto &ptr_size_event_triple_ = device_allocate_and_pack( + q, host_tasks, simplified_shape, simplified_src_strides, + simplified_dst_strides); + py::ssize_t *shape_strides = std::get<0>(ptr_size_event_triple_); + const sycl::event &copy_shape_ev = std::get<2>(ptr_size_event_triple_); + + if (shape_strides == nullptr) { + throw std::runtime_error("Device memory allocation failed"); + } + + sycl::event strided_fn_ev = + strided_fn(q, src_nelems, nd, shape_strides, src_data, src_offset, + dst_data, dst_offset, depends, {copy_shape_ev}); + + // async free of shape_strides temporary + auto ctx = q.get_context(); + sycl::event tmp_cleanup_ev = q.submit([&](sycl::handler &cgh) { + cgh.depends_on(strided_fn_ev); + cgh.host_task( + [ctx, shape_strides]() { sycl::free(shape_strides, ctx); }); + }); + host_tasks.push_back(tmp_cleanup_ev); + + return std::make_pair( + dpctl::utils::keep_args_alive(q, {src, dst}, host_tasks), + strided_fn_ev); +} + +/*!
@brief Template implementing Python API for querying of type support by + * unary elementwise functions */ +template +py::object py_unary_ufunc_result_type(const py::dtype &input_dtype, + const output_typesT &output_types) +{ + int tn = input_dtype.num(); // NumPy type numbers are the same as in dpctl + int src_typeid = -1; + + auto array_types = td_ns::usm_ndarray_types(); + + try { + src_typeid = array_types.typenum_to_lookup_id(tn); + } catch (const std::exception &e) { + throw py::value_error(e.what()); + } + + using type_utils::_result_typeid; + int dst_typeid = _result_typeid(src_typeid, output_types); + + if (dst_typeid < 0) { + auto res = py::none(); + return py::cast(res); + } + else { + using type_utils::_dtype_from_typenum; + + auto dst_typenum_t = static_cast(dst_typeid); + auto dt = _dtype_from_typenum(dst_typenum_t); + + return py::cast(dt); + } +} + +// ======================== Binary functions =========================== + +namespace +{ +template +bool isEqual(Container const &c, std::initializer_list const &l) +{ + return std::equal(std::begin(c), std::end(c), std::begin(l), std::end(l)); +} +} // namespace + +/*! 
@brief Template implementing Python API for binary elementwise + * functions */ +template +std::pair py_binary_ufunc( + const dpctl::tensor::usm_ndarray &src1, + const dpctl::tensor::usm_ndarray &src2, + const dpctl::tensor::usm_ndarray &dst, // dst = op(src1, src2), elementwise + sycl::queue &exec_q, + const std::vector depends, + // + const output_typesT &output_type_table, + const contig_dispatchT &contig_dispatch_table, + const strided_dispatchT &strided_dispatch_table, + const contig_matrix_row_dispatchT + &contig_matrix_row_broadcast_dispatch_table, + const contig_row_matrix_dispatchT + &contig_row_matrix_broadcast_dispatch_table) +{ + // check type_nums + int src1_typenum = src1.get_typenum(); + int src2_typenum = src2.get_typenum(); + int dst_typenum = dst.get_typenum(); + + auto array_types = td_ns::usm_ndarray_types(); + int src1_typeid = array_types.typenum_to_lookup_id(src1_typenum); + int src2_typeid = array_types.typenum_to_lookup_id(src2_typenum); + int dst_typeid = array_types.typenum_to_lookup_id(dst_typenum); + + int output_typeid = output_type_table[src1_typeid][src2_typeid]; + + if (output_typeid != dst_typeid) { + throw py::value_error( + "Destination array has unexpected elemental data type."); + } + + // check that queues are compatible + if (!dpctl::utils::queues_are_compatible(exec_q, {src1, src2, dst})) { + throw py::value_error( + "Execution queue is not compatible with allocation queues"); + } + + dpctl::tensor::validation::CheckWritable::throw_if_not_writable(dst); + + // check shapes, broadcasting is assumed done by caller + // check that dimensions are the same + int dst_nd = dst.get_ndim(); + if (dst_nd != src1.get_ndim() || dst_nd != src2.get_ndim()) { + throw py::value_error("Array dimensions are not the same."); + } + + // check that shapes are the same + const py::ssize_t *src1_shape = src1.get_shape_raw(); + const py::ssize_t *src2_shape = src2.get_shape_raw(); + const py::ssize_t *dst_shape = dst.get_shape_raw(); + bool 
shapes_equal(true); + size_t src_nelems(1); + + for (int i = 0; i < dst_nd; ++i) { + src_nelems *= static_cast(src1_shape[i]); + shapes_equal = shapes_equal && (src1_shape[i] == dst_shape[i] && + src2_shape[i] == dst_shape[i]); + } + if (!shapes_equal) { + throw py::value_error("Array shapes are not the same."); + } + + // if nelems is zero, return + if (src_nelems == 0) { + return std::make_pair(sycl::event(), sycl::event()); + } + + dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(dst, src_nelems); + + auto const &overlap = dpctl::tensor::overlap::MemoryOverlap(); + auto const &same_logical_tensors = + dpctl::tensor::overlap::SameLogicalTensors(); + if ((overlap(src1, dst) && !same_logical_tensors(src1, dst)) || + (overlap(src2, dst) && !same_logical_tensors(src2, dst))) + { + throw py::value_error("Arrays index overlapping segments of memory"); + } + // check memory overlap + const char *src1_data = src1.get_data(); + const char *src2_data = src2.get_data(); + char *dst_data = dst.get_data(); + + // handle contiguous inputs + bool is_src1_c_contig = src1.is_c_contiguous(); + bool is_src1_f_contig = src1.is_f_contiguous(); + + bool is_src2_c_contig = src2.is_c_contiguous(); + bool is_src2_f_contig = src2.is_f_contiguous(); + + bool is_dst_c_contig = dst.is_c_contiguous(); + bool is_dst_f_contig = dst.is_f_contiguous(); + + bool all_c_contig = + (is_src1_c_contig && is_src2_c_contig && is_dst_c_contig); + bool all_f_contig = + (is_src1_f_contig && is_src2_f_contig && is_dst_f_contig); + + // dispatch for contiguous inputs + if (all_c_contig || all_f_contig) { + auto contig_fn = contig_dispatch_table[src1_typeid][src2_typeid]; + + if (contig_fn != nullptr) { + auto comp_ev = contig_fn(exec_q, src_nelems, src1_data, 0, + src2_data, 0, dst_data, 0, depends); + sycl::event ht_ev = dpctl::utils::keep_args_alive( + exec_q, {src1, src2, dst}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + } + + // simplify strides + auto const &src1_strides = 
src1.get_strides_vector(); + auto const &src2_strides = src2.get_strides_vector(); + auto const &dst_strides = dst.get_strides_vector(); + + using shT = std::vector; + shT simplified_shape; + shT simplified_src1_strides; + shT simplified_src2_strides; + shT simplified_dst_strides; + py::ssize_t src1_offset(0); + py::ssize_t src2_offset(0); + py::ssize_t dst_offset(0); + + int nd = dst_nd; + const py::ssize_t *shape = src1_shape; + + simplify_iteration_space_3( + nd, shape, src1_strides, src2_strides, dst_strides, + // outputs + simplified_shape, simplified_src1_strides, simplified_src2_strides, + simplified_dst_strides, src1_offset, src2_offset, dst_offset); + + std::vector host_tasks{}; + if (nd < 3) { + static constexpr auto unit_stride = + std::initializer_list{1}; + + if ((nd == 1) && isEqual(simplified_src1_strides, unit_stride) && + isEqual(simplified_src2_strides, unit_stride) && + isEqual(simplified_dst_strides, unit_stride)) + { + auto contig_fn = contig_dispatch_table[src1_typeid][src2_typeid]; + + if (contig_fn != nullptr) { + auto comp_ev = contig_fn(exec_q, src_nelems, src1_data, + src1_offset, src2_data, src2_offset, + dst_data, dst_offset, depends); + sycl::event ht_ev = dpctl::utils::keep_args_alive( + exec_q, {src1, src2, dst}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + } + if (nd == 2) { + static constexpr auto zero_one_strides = + std::initializer_list{0, 1}; + static constexpr auto one_zero_strides = + std::initializer_list{1, 0}; + constexpr py::ssize_t one{1}; + // special case of C-contiguous matrix and a row + if (isEqual(simplified_src2_strides, zero_one_strides) && + isEqual(simplified_src1_strides, {simplified_shape[1], one}) && + isEqual(simplified_dst_strides, {simplified_shape[1], one})) + { + auto matrix_row_broadcast_fn = + contig_matrix_row_broadcast_dispatch_table[src1_typeid] + [src2_typeid]; + if (matrix_row_broadcast_fn != nullptr) { + int src1_itemsize = src1.get_elemsize(); + int src2_itemsize = 
src2.get_elemsize(); + int dst_itemsize = dst.get_elemsize(); + + if (is_aligned( + src1_data + src1_offset * src1_itemsize) && + is_aligned( + src2_data + src2_offset * src2_itemsize) && + is_aligned( + dst_data + dst_offset * dst_itemsize)) + { + size_t n0 = simplified_shape[0]; + size_t n1 = simplified_shape[1]; + sycl::event comp_ev = matrix_row_broadcast_fn( + exec_q, host_tasks, n0, n1, src1_data, src1_offset, + src2_data, src2_offset, dst_data, dst_offset, + depends); + + return std::make_pair( + dpctl::utils::keep_args_alive( + exec_q, {src1, src2, dst}, host_tasks), + comp_ev); + } + } + } + if (isEqual(simplified_src1_strides, one_zero_strides) && + isEqual(simplified_src2_strides, {one, simplified_shape[0]}) && + isEqual(simplified_dst_strides, {one, simplified_shape[0]})) + { + auto row_matrix_broadcast_fn = + contig_row_matrix_broadcast_dispatch_table[src1_typeid] + [src2_typeid]; + if (row_matrix_broadcast_fn != nullptr) { + + int src1_itemsize = src1.get_elemsize(); + int src2_itemsize = src2.get_elemsize(); + int dst_itemsize = dst.get_elemsize(); + + if (is_aligned( + src1_data + src1_offset * src1_itemsize) && + is_aligned( + src2_data + src2_offset * src2_itemsize) && + is_aligned( + dst_data + dst_offset * dst_itemsize)) + { + size_t n0 = simplified_shape[1]; + size_t n1 = simplified_shape[0]; + sycl::event comp_ev = row_matrix_broadcast_fn( + exec_q, host_tasks, n0, n1, src1_data, src1_offset, + src2_data, src2_offset, dst_data, dst_offset, + depends); + + return std::make_pair( + dpctl::utils::keep_args_alive( + exec_q, {src1, src2, dst}, host_tasks), + comp_ev); + } + } + } + } + } + + // dispatch to strided code + auto strided_fn = strided_dispatch_table[src1_typeid][src2_typeid]; + + if (strided_fn == nullptr) { + throw std::runtime_error( + "Strided implementation is missing for src1_typeid=" + + std::to_string(src1_typeid) + + " and src2_typeid=" + std::to_string(src2_typeid)); + } + + using 
dpctl::tensor::offset_utils::device_allocate_and_pack; + const auto &ptr_sz_event_triple_ = device_allocate_and_pack( + exec_q, host_tasks, simplified_shape, simplified_src1_strides, + simplified_src2_strides, simplified_dst_strides); + + py::ssize_t *shape_strides = std::get<0>(ptr_sz_event_triple_); + const sycl::event &copy_shape_ev = std::get<2>(ptr_sz_event_triple_); + + if (shape_strides == nullptr) { + throw std::runtime_error("Unable to allocate device memory"); + } + + sycl::event strided_fn_ev = strided_fn( + exec_q, src_nelems, nd, shape_strides, src1_data, src1_offset, + src2_data, src2_offset, dst_data, dst_offset, depends, {copy_shape_ev}); + + // async free of shape_strides temporary + auto ctx = exec_q.get_context(); + + sycl::event tmp_cleanup_ev = exec_q.submit([&](sycl::handler &cgh) { + cgh.depends_on(strided_fn_ev); + cgh.host_task( + [ctx, shape_strides]() { sycl::free(shape_strides, ctx); }); + }); + + host_tasks.push_back(tmp_cleanup_ev); + + return std::make_pair( + dpctl::utils::keep_args_alive(exec_q, {src1, src2, dst}, host_tasks), + strided_fn_ev); +} + +/*!
@brief Type querying for binary elementwise functions */ +template +py::object py_binary_ufunc_result_type(const py::dtype &input1_dtype, + const py::dtype &input2_dtype, + const output_typesT &output_types_table) +{ + int tn1 = input1_dtype.num(); // NumPy type numbers are the same as in dpctl + int tn2 = input2_dtype.num(); // NumPy type numbers are the same as in dpctl + int src1_typeid = -1; + int src2_typeid = -1; + + auto array_types = td_ns::usm_ndarray_types(); + + try { + src1_typeid = array_types.typenum_to_lookup_id(tn1); + src2_typeid = array_types.typenum_to_lookup_id(tn2); + } catch (const std::exception &e) { + throw py::value_error(e.what()); + } + + if (src1_typeid < 0 || src1_typeid >= td_ns::num_types || src2_typeid < 0 || + src2_typeid >= td_ns::num_types) + { + throw std::runtime_error("binary output type lookup failed"); + } + int dst_typeid = output_types_table[src1_typeid][src2_typeid]; + + if (dst_typeid < 0) { + auto res = py::none(); + return py::cast(res); + } + else { + using type_utils::_dtype_from_typenum; + + auto dst_typenum_t = static_cast(dst_typeid); + auto dt = _dtype_from_typenum(dst_typenum_t); + + return py::cast(dt); + } +} + +// ==================== Inplace binary functions ======================= + +template +std::pair + py_binary_inplace_ufunc(const dpctl::tensor::usm_ndarray &lhs, + const dpctl::tensor::usm_ndarray &rhs, + sycl::queue &exec_q, + const std::vector depends, + // + const output_typesT &output_type_table, + const contig_dispatchT &contig_dispatch_table, + const strided_dispatchT &strided_dispatch_table, + const contig_row_matrix_dispatchT + &contig_row_matrix_broadcast_dispatch_table) +{ + dpctl::tensor::validation::CheckWritable::throw_if_not_writable(lhs); + + // check type_nums + int rhs_typenum = rhs.get_typenum(); + int lhs_typenum = lhs.get_typenum(); + + auto array_types = td_ns::usm_ndarray_types(); + int rhs_typeid = array_types.typenum_to_lookup_id(rhs_typenum); + int lhs_typeid = 
array_types.typenum_to_lookup_id(lhs_typenum); + + int output_typeid = output_type_table[rhs_typeid][lhs_typeid]; + + if (output_typeid != lhs_typeid) { + throw py::value_error( + "Left-hand side array has unexpected elemental data type."); + } + + // check that queues are compatible + if (!dpctl::utils::queues_are_compatible(exec_q, {rhs, lhs})) { + throw py::value_error( + "Execution queue is not compatible with allocation queues"); + } + + // check shapes, broadcasting is assumed done by caller + // check that dimensions are the same + int lhs_nd = lhs.get_ndim(); + if (lhs_nd != rhs.get_ndim()) { + throw py::value_error("Array dimensions are not the same."); + } + + // check that shapes are the same + const py::ssize_t *rhs_shape = rhs.get_shape_raw(); + const py::ssize_t *lhs_shape = lhs.get_shape_raw(); + bool shapes_equal(true); + size_t rhs_nelems(1); + + for (int i = 0; i < lhs_nd; ++i) { + rhs_nelems *= static_cast(rhs_shape[i]); + shapes_equal = shapes_equal && (rhs_shape[i] == lhs_shape[i]); + } + if (!shapes_equal) { + throw py::value_error("Array shapes are not the same."); + } + + // if nelems is zero, return + if (rhs_nelems == 0) { + return std::make_pair(sycl::event(), sycl::event()); + } + + dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(lhs, rhs_nelems); + + // check memory overlap + auto const &same_logical_tensors = + dpctl::tensor::overlap::SameLogicalTensors(); + auto const &overlap = dpctl::tensor::overlap::MemoryOverlap(); + if (overlap(rhs, lhs) && !same_logical_tensors(rhs, lhs)) { + throw py::value_error("Arrays index overlapping segments of memory"); + } + // check memory overlap + const char *rhs_data = rhs.get_data(); + char *lhs_data = lhs.get_data(); + + // handle contiguous inputs + bool is_rhs_c_contig = rhs.is_c_contiguous(); + bool is_rhs_f_contig = rhs.is_f_contiguous(); + + bool is_lhs_c_contig = lhs.is_c_contiguous(); + bool is_lhs_f_contig = lhs.is_f_contiguous(); + + bool both_c_contig = (is_rhs_c_contig && 
is_lhs_c_contig); + bool both_f_contig = (is_rhs_f_contig && is_lhs_f_contig); + + // dispatch for contiguous inputs + if (both_c_contig || both_f_contig) { + auto contig_fn = contig_dispatch_table[rhs_typeid][lhs_typeid]; + + if (contig_fn != nullptr) { + auto comp_ev = contig_fn(exec_q, rhs_nelems, rhs_data, 0, lhs_data, + 0, depends); + sycl::event ht_ev = + dpctl::utils::keep_args_alive(exec_q, {rhs, lhs}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + } + + // simplify strides + auto const &rhs_strides = rhs.get_strides_vector(); + auto const &lhs_strides = lhs.get_strides_vector(); + + using shT = std::vector; + shT simplified_shape; + shT simplified_rhs_strides; + shT simplified_lhs_strides; + py::ssize_t rhs_offset(0); + py::ssize_t lhs_offset(0); + + int nd = lhs_nd; + const py::ssize_t *shape = rhs_shape; + + simplify_iteration_space(nd, shape, rhs_strides, lhs_strides, + // outputs + simplified_shape, simplified_rhs_strides, + simplified_lhs_strides, rhs_offset, lhs_offset); + + std::vector host_tasks{}; + if (nd < 3) { + static constexpr auto unit_stride = + std::initializer_list{1}; + + if ((nd == 1) && isEqual(simplified_rhs_strides, unit_stride) && + isEqual(simplified_lhs_strides, unit_stride)) + { + auto contig_fn = contig_dispatch_table[rhs_typeid][lhs_typeid]; + + if (contig_fn != nullptr) { + auto comp_ev = + contig_fn(exec_q, rhs_nelems, rhs_data, rhs_offset, + lhs_data, lhs_offset, depends); + sycl::event ht_ev = dpctl::utils::keep_args_alive( + exec_q, {rhs, lhs}, {comp_ev}); + + return std::make_pair(ht_ev, comp_ev); + } + } + if (nd == 2) { + static constexpr auto one_zero_strides = + std::initializer_list{1, 0}; + constexpr py::ssize_t one{1}; + // special case of C-contiguous matrix and a row + if (isEqual(simplified_rhs_strides, one_zero_strides) && + isEqual(simplified_lhs_strides, {one, simplified_shape[0]})) + { + auto row_matrix_broadcast_fn = + contig_row_matrix_broadcast_dispatch_table[rhs_typeid] + [lhs_typeid]; + if 
(row_matrix_broadcast_fn != nullptr) { + size_t n0 = simplified_shape[1]; + size_t n1 = simplified_shape[0]; + sycl::event comp_ev = row_matrix_broadcast_fn( + exec_q, host_tasks, n0, n1, rhs_data, rhs_offset, + lhs_data, lhs_offset, depends); + + return std::make_pair(dpctl::utils::keep_args_alive( + exec_q, {lhs, rhs}, host_tasks), + comp_ev); + } + } + } + + // dispatch to strided code + auto strided_fn = strided_dispatch_table[rhs_typeid][lhs_typeid]; + + if (strided_fn == nullptr) { + throw std::runtime_error( + "Strided implementation is missing for rhs_typeid=" + + std::to_string(rhs_typeid) + + " and lhs_typeid=" + std::to_string(lhs_typeid)); + } + + using dpctl::tensor::offset_utils::device_allocate_and_pack; + const auto &ptr_sz_event_triple_ = device_allocate_and_pack<py::ssize_t>( + exec_q, host_tasks, simplified_shape, simplified_rhs_strides, + simplified_lhs_strides); + + py::ssize_t *shape_strides = std::get<0>(ptr_sz_event_triple_); + const sycl::event &copy_shape_ev = std::get<2>(ptr_sz_event_triple_); + + if (shape_strides == nullptr) { + throw std::runtime_error("Unable to allocate device memory"); + } + + sycl::event strided_fn_ev = + strided_fn(exec_q, rhs_nelems, nd, shape_strides, rhs_data, rhs_offset, + lhs_data, lhs_offset, depends, {copy_shape_ev}); + + // async free of shape_strides temporary + auto ctx = exec_q.get_context(); + + sycl::event tmp_cleanup_ev = exec_q.submit([&](sycl::handler &cgh) { + cgh.depends_on(strided_fn_ev); + cgh.host_task( + [ctx, shape_strides]() { sycl::free(shape_strides, ctx); }); + }); + + host_tasks.push_back(tmp_cleanup_ev); + + return std::make_pair( + dpctl::utils::keep_args_alive(exec_q, {rhs, lhs}, host_tasks), + strided_fn_ev); +} + +} // namespace dpnp::extensions::py_internal diff --git a/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.cpp b/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.cpp new file mode 100644 index 00000000000..3f88f735a71 --- 
/dev/null +++ b/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.cpp @@ -0,0 +1,87 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include "dpctl4pybind11.hpp" + +#include +#include +#include + +#include "elementwise_functions_type_utils.hpp" + +// dpctl tensor headers +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; +namespace td_ns = dpctl::tensor::type_dispatch; + +namespace dpnp::extensions::py_internal::type_utils +{ +py::dtype _dtype_from_typenum(td_ns::typenum_t dst_typenum_t) +{ + switch (dst_typenum_t) { + case td_ns::typenum_t::BOOL: + return py::dtype("?"); + case td_ns::typenum_t::INT8: + return py::dtype("i1"); + case td_ns::typenum_t::UINT8: + return py::dtype("u1"); + case td_ns::typenum_t::INT16: + return py::dtype("i2"); + case td_ns::typenum_t::UINT16: + return py::dtype("u2"); + case td_ns::typenum_t::INT32: + return py::dtype("i4"); + case td_ns::typenum_t::UINT32: + return py::dtype("u4"); + case td_ns::typenum_t::INT64: + return py::dtype("i8"); + case td_ns::typenum_t::UINT64: + return py::dtype("u8"); + case td_ns::typenum_t::HALF: + return py::dtype("f2"); + case td_ns::typenum_t::FLOAT: + return py::dtype("f4"); + case td_ns::typenum_t::DOUBLE: + return py::dtype("f8"); + case td_ns::typenum_t::CFLOAT: + return py::dtype("c8"); + case td_ns::typenum_t::CDOUBLE: + return py::dtype("c16"); + default: + throw py::value_error("Unrecognized dst_typeid"); + } +} + +int _result_typeid(int arg_typeid, const int *fn_output_id) +{ + if (arg_typeid < 0 || arg_typeid >= td_ns::num_types) { + throw py::value_error("Input typeid " + std::to_string(arg_typeid) + + " is outside of expected bounds."); + } + + return fn_output_id[arg_typeid]; +} +} // namespace dpnp::extensions::py_internal::type_utils diff --git a/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.hpp b/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.hpp new file mode 100644 index 00000000000..ede4ea35fad --- /dev/null +++ 
b/dpnp/backend/extensions/elementwise_functions/elementwise_functions_type_utils.hpp @@ -0,0 +1,47 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include "dpctl4pybind11.hpp" +#include +#include +#include + +// dpctl tensor headers +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; +namespace td_ns = dpctl::tensor::type_dispatch; + +namespace dpnp::extensions::py_internal::type_utils +{ +/*! 
@brief Produce dtype from a type number */ +extern py::dtype _dtype_from_typenum(td_ns::typenum_t); + +/*! @brief Lookup typeid of the result from typeid of + * argument and the mapping table */ +extern int _result_typeid(int, const int *); +} // namespace dpnp::extensions::py_internal::type_utils diff --git a/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.cpp b/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.cpp new file mode 100644 index 00000000000..a3ab0b99b7a --- /dev/null +++ b/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.cpp @@ -0,0 +1,205 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include "dpctl4pybind11.hpp" + +#include +#include + +#include "simplify_iteration_space.hpp" + +// dpctl tensor headers +#include "utils/strided_iters.hpp" + +namespace dpnp::extensions::py_internal +{ +namespace py = pybind11; +namespace st_ns = dpctl::tensor::strides; + +void simplify_iteration_space(int &nd, + const py::ssize_t *const &shape, + std::vector<py::ssize_t> const &src_strides, + std::vector<py::ssize_t> const &dst_strides, + // output + std::vector<py::ssize_t> &simplified_shape, + std::vector<py::ssize_t> &simplified_src_strides, + std::vector<py::ssize_t> &simplified_dst_strides, + py::ssize_t &src_offset, + py::ssize_t &dst_offset) +{ + if (nd > 1) { + // Simplify iteration space to reduce dimensionality + // and improve access pattern + simplified_shape.reserve(nd); + simplified_shape.insert(std::begin(simplified_shape), shape, + shape + nd); + assert(simplified_shape.size() == static_cast<std::size_t>(nd)); + + simplified_src_strides.reserve(nd); + simplified_src_strides.insert(std::end(simplified_src_strides), + std::begin(src_strides), + std::end(src_strides)); + assert(simplified_src_strides.size() == static_cast<std::size_t>(nd)); + + simplified_dst_strides.reserve(nd); + simplified_dst_strides.insert(std::end(simplified_dst_strides), + std::begin(dst_strides), + std::end(dst_strides)); + assert(simplified_dst_strides.size() == static_cast<std::size_t>(nd)); + + int contracted_nd = st_ns::simplify_iteration_two_strides( + nd, 
simplified_shape.data(), simplified_src_strides.data(), + simplified_dst_strides.data(), + src_offset, // modified by reference + dst_offset // modified by reference + ); + simplified_shape.resize(contracted_nd); + simplified_src_strides.resize(contracted_nd); + simplified_dst_strides.resize(contracted_nd); + + nd = contracted_nd; + } + else if (nd == 1) { + src_offset = 0; + dst_offset = 0; + // Populate vectors + simplified_shape.reserve(nd); + simplified_shape.push_back(shape[0]); + assert(simplified_shape.size() == static_cast<std::size_t>(nd)); + + simplified_src_strides.reserve(nd); + simplified_dst_strides.reserve(nd); + + if (src_strides[0] < 0 && dst_strides[0] < 0) { + simplified_src_strides.push_back(-src_strides[0]); + simplified_dst_strides.push_back(-dst_strides[0]); + if (shape[0] > 1) { + src_offset += (shape[0] - 1) * src_strides[0]; + dst_offset += (shape[0] - 1) * dst_strides[0]; + } + } + else { + simplified_src_strides.push_back(src_strides[0]); + simplified_dst_strides.push_back(dst_strides[0]); + } + + assert(simplified_src_strides.size() == static_cast<std::size_t>(nd)); + assert(simplified_dst_strides.size() == static_cast<std::size_t>(nd)); + } +} + +void simplify_iteration_space_3( + int &nd, + const py::ssize_t *const &shape, + // src1 + std::vector<py::ssize_t> const &src1_strides, + // src2 + std::vector<py::ssize_t> const &src2_strides, + // dst + std::vector<py::ssize_t> const &dst_strides, + // output + std::vector<py::ssize_t> &simplified_shape, + std::vector<py::ssize_t> &simplified_src1_strides, + std::vector<py::ssize_t> &simplified_src2_strides, + std::vector<py::ssize_t> &simplified_dst_strides, + py::ssize_t &src1_offset, + py::ssize_t &src2_offset, + py::ssize_t &dst_offset) +{ + if (nd > 1) { + // Simplify iteration space to reduce dimensionality + // and improve access pattern + simplified_shape.reserve(nd); + simplified_shape.insert(std::end(simplified_shape), shape, shape + nd); + assert(simplified_shape.size() == static_cast<std::size_t>(nd)); + + simplified_src1_strides.reserve(nd); + simplified_src1_strides.insert(std::end(simplified_src1_strides), + 
std::begin(src1_strides), + std::end(src1_strides)); + assert(simplified_src1_strides.size() == static_cast<std::size_t>(nd)); + + simplified_src2_strides.reserve(nd); + simplified_src2_strides.insert(std::end(simplified_src2_strides), + std::begin(src2_strides), + std::end(src2_strides)); + assert(simplified_src2_strides.size() == static_cast<std::size_t>(nd)); + + simplified_dst_strides.reserve(nd); + simplified_dst_strides.insert(std::end(simplified_dst_strides), + std::begin(dst_strides), + std::end(dst_strides)); + assert(simplified_dst_strides.size() == static_cast<std::size_t>(nd)); + + int contracted_nd = st_ns::simplify_iteration_three_strides( + nd, simplified_shape.data(), simplified_src1_strides.data(), + simplified_src2_strides.data(), simplified_dst_strides.data(), + src1_offset, // modified by reference + src2_offset, // modified by reference + dst_offset // modified by reference + ); + simplified_shape.resize(contracted_nd); + simplified_src1_strides.resize(contracted_nd); + simplified_src2_strides.resize(contracted_nd); + simplified_dst_strides.resize(contracted_nd); + + nd = contracted_nd; + } + else if (nd == 1) { + src1_offset = 0; + src2_offset = 0; + dst_offset = 0; + // Populate vectors + simplified_shape.reserve(nd); + simplified_shape.push_back(shape[0]); + assert(simplified_shape.size() == static_cast<std::size_t>(nd)); + + simplified_src1_strides.reserve(nd); + simplified_src2_strides.reserve(nd); + simplified_dst_strides.reserve(nd); + + if ((src1_strides[0] < 0) && (src2_strides[0] < 0) && + (dst_strides[0] < 0)) { + simplified_src1_strides.push_back(-src1_strides[0]); + simplified_src2_strides.push_back(-src2_strides[0]); + simplified_dst_strides.push_back(-dst_strides[0]); + if (shape[0] > 1) { + src1_offset += src1_strides[0] * (shape[0] - 1); + src2_offset += src2_strides[0] * (shape[0] - 1); + dst_offset += dst_strides[0] * (shape[0] - 1); + } + } + else { + simplified_src1_strides.push_back(src1_strides[0]); + simplified_src2_strides.push_back(src2_strides[0]); + 
simplified_dst_strides.push_back(dst_strides[0]); + } + + assert(simplified_src1_strides.size() == static_cast<std::size_t>(nd)); + assert(simplified_src2_strides.size() == static_cast<std::size_t>(nd)); + assert(simplified_dst_strides.size() == static_cast<std::size_t>(nd)); + } +} +} // namespace dpnp::extensions::py_internal diff --git a/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.hpp b/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.hpp new file mode 100644 index 00000000000..111050ae59a --- /dev/null +++ b/dpnp/backend/extensions/elementwise_functions/simplify_iteration_space.hpp @@ -0,0 +1,61 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include +#include + +namespace dpnp::extensions::py_internal +{ +namespace py = pybind11; + +void simplify_iteration_space(int &, + const py::ssize_t *const &, + std::vector<py::ssize_t> const &, + std::vector<py::ssize_t> const &, + std::vector<py::ssize_t> &, + std::vector<py::ssize_t> &, + std::vector<py::ssize_t> &, + py::ssize_t &, + py::ssize_t &); + +void simplify_iteration_space_3(int &, + const py::ssize_t *const &, + // src1 + std::vector<py::ssize_t> const &, + // src2 + std::vector<py::ssize_t> const &, + // dst + std::vector<py::ssize_t> const &, + // output + std::vector<py::ssize_t> &, + std::vector<py::ssize_t> &, + std::vector<py::ssize_t> &, + std::vector<py::ssize_t> &, + py::ssize_t &, + py::ssize_t &, + py::ssize_t &); +} // namespace dpnp::extensions::py_internal diff --git a/dpnp/backend/extensions/vm/CMakeLists.txt b/dpnp/backend/extensions/vm/CMakeLists.txt index 1fa895f4e69..ba1e46ea0ed 100644 --- a/dpnp/backend/extensions/vm/CMakeLists.txt +++ b/dpnp/backend/extensions/vm/CMakeLists.txt @@ -23,12 +23,54 @@ # THE POSSIBILITY OF SUCH DAMAGE. 
# ***************************************************************************** +set(_elementwise_sources + ${CMAKE_CURRENT_SOURCE_DIR}/abs.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/acos.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/acosh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/add.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/asin.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/asinh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/atan.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/atan2.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/atanh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/cbrt.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/ceil.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/conj.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/cos.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/cosh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/div.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/exp.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/exp2.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/expm1.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/floor.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/hypot.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/ln.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/log10.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/log1p.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/log2.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/mul.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/pow.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/rint.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/sin.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/sinh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/sqr.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/sqrt.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/sub.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/tan.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/tanh.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/trunc.cpp +) -set(python_module_name _vm_impl) set(_module_src + # TODO: remove sources from `elementwise_functions` folder + ${CMAKE_CURRENT_SOURCE_DIR}/../elementwise_functions/elementwise_functions_type_utils.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/../elementwise_functions/simplify_iteration_space.cpp ${CMAKE_CURRENT_SOURCE_DIR}/vm_py.cpp + ${_elementwise_sources} ) +set(python_module_name _vm_impl) + pybind11_add_module(${python_module_name} MODULE ${_module_src}) add_sycl_to_target(TARGET ${python_module_name} SOURCES ${_module_src}) 
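The `simplify_iteration_space` helpers added above delegate the actual dimensionality contraction to dpctl's `simplify_iteration_two_strides` / `simplify_iteration_three_strides`. The core idea — fuse two adjacent dimensions whenever both operands' strides are "nested", i.e. `stride[i] == shape[i + 1] * stride[i + 1]` — can be sketched in isolation. The helper below (`collapse_two_strides`) is a hypothetical illustration for the common contiguous case, not the dpctl implementation:

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch (hypothetical, not the dpctl implementation): fuse
// adjacent dimensions of a two-operand iteration space when the strides of
// BOTH operands are nested, i.e. stride[i] == shape[i + 1] * stride[i + 1].
// Returns the contracted number of dimensions.
inline int collapse_two_strides(std::vector<long> &shape,
                                std::vector<long> &src_strides,
                                std::vector<long> &dst_strides)
{
    int nd = static_cast<int>(shape.size());
    for (int i = nd - 2; i >= 0; --i) {
        bool src_nested = (src_strides[i] == shape[i + 1] * src_strides[i + 1]);
        bool dst_nested = (dst_strides[i] == shape[i + 1] * dst_strides[i + 1]);
        if (src_nested && dst_nested) {
            // dimensions i and i+1 are traversed as one contiguous run:
            // merge them into a single larger dimension
            shape[i] *= shape[i + 1];
            shape.erase(shape.begin() + i + 1);
            src_strides[i] = src_strides[i + 1];
            src_strides.erase(src_strides.begin() + i + 1);
            dst_strides[i] = dst_strides[i + 1];
            dst_strides.erase(dst_strides.begin() + i + 1);
        }
    }
    return static_cast<int>(shape.size());
}
```

For example, a C-contiguous 3x4 pair (shape `{3, 4}`, strides `{4, 1}` for both operands) collapses to a single dimension of length 12 with unit stride, which is exactly what lets the caller above take the fast contiguous dispatch path after simplification.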
diff --git a/dpnp/backend/extensions/vm/abs.cpp b/dpnp/backend/extensions/vm/abs.cpp new file mode 100644 index 00000000000..7eb7086de85 --- /dev/null +++ b/dpnp/backend/extensions/vm/abs.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "abs.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::abs function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template <typename T> +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry<T, std::complex<double>, double>, + td_ns::TypeMapResultEntry<T, std::complex<float>, float>, + td_ns::TypeMapResultEntry<T, double>, + td_ns::TypeMapResultEntry<T, float>, + td_ns::DefaultResultEntry<void>>::result_type; +}; + +template <typename T> +static sycl::event abs_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector<sycl::event> &depends) +{ + tu_ns::validate_type_for_device<T>(exec_q); + + std::int64_t n = static_cast<std::int64_t>(in_n); + const T *a = reinterpret_cast<const T *>(in_a); + + using resTy = typename OutputType<T>::value_type; + resTy *y = reinterpret_cast<resTy *>(out_y); + + return mkl_vm::abs(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(abs); +} // namespace impl + +void init_abs(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector<sycl::event>; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto abs_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{}); + }; + m.def("_abs", abs_pyapi, + "Call `abs` function from OneMKL VM library to compute " + "the absolute value of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto abs_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_abs_to_call", abs_need_to_call_pyapi, + "Check input arguments to answer if `abs` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/abs.hpp b/dpnp/backend/extensions/vm/abs.hpp index bb5e55010b4..9e074bc1ac8 100644 --- a/dpnp/backend/extensions/vm/abs.hpp +++ b/dpnp/backend/extensions/vm/abs.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template <typename T> -sycl::event abs_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector<sycl::event> &depends) -{ - type_utils::validate_type_for_device<T>(exec_q); - - const T *a = reinterpret_cast<const T *>(in_a); - using resTy = typename types::AbsOutputType<T>::value_type; - resTy *y = reinterpret_cast<resTy *>(out_y); - - return mkl_vm::abs(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template <typename fnT, typename T> -struct AbsContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AbsOutputType<T>::value_type, void>) - { - return nullptr; - } - else { - return abs_contig_impl<T>; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_abs(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/acos.cpp b/dpnp/backend/extensions/vm/acos.cpp new file mode 100644 index 00000000000..ab744bf99c4 --- /dev/null +++ b/dpnp/backend/extensions/vm/acos.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// 
Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "acos.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::acos function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template <typename T> +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry<T, std::complex<double>>, + td_ns::TypeMapResultEntry<T, std::complex<float>>, + td_ns::TypeMapResultEntry<T, double>, + td_ns::TypeMapResultEntry<T, float>, + td_ns::DefaultResultEntry<void>>::result_type; +}; + +template <typename T> +static sycl::event acos_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector<sycl::event> &depends) +{ + tu_ns::validate_type_for_device<T>(exec_q); + + std::int64_t n = static_cast<std::int64_t>(in_n); + const T *a = reinterpret_cast<const T *>(in_a); + + using resTy = typename OutputType<T>::value_type; + resTy *y = reinterpret_cast<resTy *>(out_y); + + return mkl_vm::acos(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(acos); +} // namespace impl + +void init_acos(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector<sycl::event>; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto acos_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{}); + }; + m.def("_acos", acos_pyapi, + "Call `acos` function from OneMKL VM library to compute " + "the inverse cosine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto acos_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_acos_to_call", acos_need_to_call_pyapi, + "Check input arguments to answer if `acos` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/acos.hpp b/dpnp/backend/extensions/vm/acos.hpp index 029a9d9c886..2bfb2a71d6b 100644 --- a/dpnp/backend/extensions/vm/acos.hpp +++ b/dpnp/backend/extensions/vm/acos.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template <typename T> -sycl::event acos_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector<sycl::event> &depends) -{ - type_utils::validate_type_for_device<T>(exec_q); - - const T *a = reinterpret_cast<const T *>(in_a); - using resTy = typename types::AcosOutputType<T>::value_type; - resTy *y = reinterpret_cast<resTy *>(out_y); - - return mkl_vm::acos(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template <typename fnT, typename T> -struct AcosContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AcosOutputType<T>::value_type, void>) - { - return nullptr; - } - else { - return acos_contig_impl<T>; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_acos(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/acosh.cpp b/dpnp/backend/extensions/vm/acosh.cpp new file mode 100644 index 00000000000..2cab39313d2 --- /dev/null +++ b/dpnp/backend/extensions/vm/acosh.cpp @@ -0,0 +1,138 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "acosh.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::acosh function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event acosh_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::acosh(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(acosh); +} // namespace impl + +void init_acosh(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto acosh_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_acosh", acosh_pyapi, + "Call `acosh` function from OneMKL VM library to compute " + "the inverse hyperbolic cosine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto acosh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + 
return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_acosh_to_call", acosh_need_to_call_pyapi, + "Check input arguments to answer if `acosh` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/acosh.hpp b/dpnp/backend/extensions/vm/acosh.hpp index 9f86ae589cf..6cfde12cbcb 100644 --- a/dpnp/backend/extensions/vm/acosh.hpp +++ b/dpnp/backend/extensions/vm/acosh.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event acosh_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::AcoshOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::acosh(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct AcoshContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AcoshOutputType::value_type, void>) - { - return nullptr; - } - else { - return acosh_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_acosh(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/add.cpp b/dpnp/backend/extensions/vm/add.cpp new file mode 100644 index 00000000000..c43f07bbcde --- /dev/null +++ b/dpnp/backend/extensions/vm/add.cpp @@ -0,0 +1,171 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "add.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::add function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<double>,
+                                        T2,
+                                        std::complex<double>,
+                                        std::complex<double>>,
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<float>,
+                                        T2,
+                                        std::complex<float>,
+                                        std::complex<float>>,
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event add_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   ssize_t a_offset,
+                                   const char *in_b,
+                                   ssize_t b_offset,
+                                   char *out_y,
+                                   ssize_t out_offset,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::add(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing 1st input vector of size n
+                       b, // pointer `b` containing 2nd input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(add);
+} // namespace impl
+
+void init_add(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto add_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                         const arrayT &src2, const arrayT &dst,
+                         const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_add", add_pyapi,
+          "Call `add` function from OneMKL VM library to perform element "
+          "by element addition of vector `src1` and vector `src2` "
+          "into resulting vector `dst`",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto add_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                      const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_add_to_call", add_need_to_call_pyapi,
+          "Check input arguments to determine whether `add` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/add.hpp b/dpnp/backend/extensions/vm/add.hpp
index 47ff60ed96a..824fb649f2d 100644
--- a/dpnp/backend/extensions/vm/add.hpp
+++ b/dpnp/backend/extensions/vm/add.hpp
@@ -25,58 +25,11 @@
 #pragma once
 
-#include
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event add_contig_impl(sycl::queue exec_q,
-                            const std::int64_t n,
-                            const char *in_a,
-                            const char *in_b,
-                            char *out_y,
-                            const std::vector
&depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - const T *b = reinterpret_cast(in_b); - using resTy = typename types::AddOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::add(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct AddContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AddOutputType::value_type, void>) - { - return nullptr; - } - else { - return add_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_add(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/asin.cpp b/dpnp/backend/extensions/vm/asin.cpp new file mode 100644 index 00000000000..afbb868e8cc --- /dev/null +++ b/dpnp/backend/extensions/vm/asin.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "asin.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::asin function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event asin_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::asin(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(asin); +} // namespace impl + +void init_asin(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto asin_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_asin", asin_pyapi, + "Call `asin` function from OneMKL VM library to compute " + "the inverse sine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto asin_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_asin_to_call", asin_need_to_call_pyapi, + "Check input arguments to answer if `asin` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/asin.hpp b/dpnp/backend/extensions/vm/asin.hpp index 5e44aa5bde6..a37bff38fbc 100644 --- a/dpnp/backend/extensions/vm/asin.hpp +++ b/dpnp/backend/extensions/vm/asin.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event asin_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::AsinOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::asin(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct AsinContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AsinOutputType::value_type, void>) - { - return nullptr; - } - else { - return asin_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_asin(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/asinh.cpp b/dpnp/backend/extensions/vm/asinh.cpp new file mode 100644 index 00000000000..0f70c3cb501 --- /dev/null +++ b/dpnp/backend/extensions/vm/asinh.cpp @@ -0,0 +1,138 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "asinh.hpp" +#include "common.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::asinh function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event asinh_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::asinh(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(asinh); +} // namespace impl + +void init_asinh(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto asinh_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_asinh", asinh_pyapi, + "Call `asinh` function from OneMKL VM library to compute " + "the inverse hyperbolic sine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto asinh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + 
return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_asinh_to_call", asinh_need_to_call_pyapi, + "Check input arguments to answer if `asinh` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/asinh.hpp b/dpnp/backend/extensions/vm/asinh.hpp index 58e2815e3f7..ad40f0d4efb 100644 --- a/dpnp/backend/extensions/vm/asinh.hpp +++ b/dpnp/backend/extensions/vm/asinh.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event asinh_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::AsinhOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::asinh(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct AsinhContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::AsinhOutputType::value_type, void>) - { - return nullptr; - } - else { - return asinh_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_asinh(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/atan.cpp b/dpnp/backend/extensions/vm/atan.cpp new file mode 100644 index 00000000000..59f7064ef15 --- /dev/null +++ b/dpnp/backend/extensions/vm/atan.cpp @@ -0,0 +1,138 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "atan.hpp"
+#include "common.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::atan<T> function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event atan_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::atan(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(atan);
+} // namespace impl
+
+void init_atan(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto atan_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_atan", atan_pyapi,
+          "Call `atan` function from OneMKL VM library to compute "
+          "the inverse tangent of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto atan_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                       const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst, output_typeid_vector, contig_dispatch_vector);
+    };
+    m.def("_mkl_atan_to_call", atan_need_to_call_pyapi,
+          "Check input arguments to answer if `atan` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/atan.hpp b/dpnp/backend/extensions/vm/atan.hpp
index b36abc16138..90547e92c8d 100644
--- a/dpnp/backend/extensions/vm/atan.hpp
+++ b/dpnp/backend/extensions/vm/atan.hpp
@@ -25,55 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event atan_contig_impl(sycl::queue exec_q,
-                             const std::int64_t n,
-                             const char *in_a,
-                             char *out_y,
-                             const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    using resTy = typename types::AtanOutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::atan(exec_q,
-                        n, // number of elements to be calculated
-                        a, // pointer `a` containing input vector of size n
-                        y, // pointer `y` to the output vector of size n
-                        depends);
-}
-
-template <typename fnT, typename T>
-struct AtanContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::AtanOutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return atan_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_atan(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/atan2.cpp b/dpnp/backend/extensions/vm/atan2.cpp
new file mode 100644
index 00000000000..30bb59c9c42
--- /dev/null
+++ b/dpnp/backend/extensions/vm/atan2.cpp
@@ -0,0 +1,160 @@
+//*****************************************************************************
+// Copyright (c) 2024, Intel Corporation
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+// - Redistributions of source code must retain the above copyright notice,
+//   this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice,
+//   this list of conditions and the following disclaimer in the documentation
+//   and/or other materials provided with the distribution.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+// THE POSSIBILITY OF SUCH DAMAGE.
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "atan2.hpp"
+#include "common.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::atan2<T> function.
+ *
+ * @tparam T Type of input vectors `a` and `b` and of result vector `y`.
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event atan2_contig_impl(sycl::queue &exec_q,
+                                     std::size_t in_n,
+                                     const char *in_a,
+                                     ssize_t a_offset,
+                                     const char *in_b,
+                                     ssize_t b_offset,
+                                     char *out_y,
+                                     ssize_t out_offset,
+                                     const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::atan2(exec_q,
+                         n, // number of elements to be calculated
+                         a, // pointer `a` containing 1st input vector of size n
+                         b, // pointer `b` containing 2nd input vector of size n
+                         y, // pointer `y` to the output vector of size n
+                         depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(atan2);
+} // namespace impl
+
+void init_atan2(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto atan2_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                           const arrayT &src2, const arrayT &dst,
+                           const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_atan2", atan2_pyapi,
+          "Call `atan2` function from OneMKL VM library to compute element "
+          "by element inverse tangent of `x1/x2`",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto atan2_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                        const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_atan2_to_call", atan2_need_to_call_pyapi,
+          "Check input arguments to answer if `atan2` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/atan2.hpp b/dpnp/backend/extensions/vm/atan2.hpp
index 19a66e877ac..cd0e259914c 100644
--- a/dpnp/backend/extensions/vm/atan2.hpp
+++ b/dpnp/backend/extensions/vm/atan2.hpp
@@ -25,58 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event atan2_contig_impl(sycl::queue exec_q,
-                              const std::int64_t n,
-                              const char *in_a,
-                              const char *in_b,
-                              char *out_y,
-                              const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    const T *b = reinterpret_cast<const T *>(in_b);
-    using resTy = typename types::Atan2OutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::atan2(exec_q,
-                         n, // number of elements to be calculated
-                         a, // pointer `a` containing 1st input vector of size n
-                         b, // pointer `b` containing 2nd input vector of size n
-                         y, // pointer `y` to the output vector of size n
-                         depends);
-}
-
-template <typename fnT, typename T>
-struct Atan2ContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::Atan2OutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return atan2_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_atan2(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/atanh.cpp b/dpnp/backend/extensions/vm/atanh.cpp
new file mode 100644
index 00000000000..bd32d25f2a6
--- /dev/null
+++ b/dpnp/backend/extensions/vm/atanh.cpp
@@ -0,0 +1,138 @@
+//*****************************************************************************
+// Copyright (c) 2024, Intel Corporation
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+// - Redistributions of source code must retain the above copyright notice,
+//   this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice,
+//   this list of conditions and the following disclaimer in the documentation
+//   and/or other materials provided with the distribution.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+// THE POSSIBILITY OF SUCH DAMAGE.
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "atanh.hpp"
+#include "common.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::atanh<T> function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event atanh_contig_impl(sycl::queue &exec_q,
+                                     std::size_t in_n,
+                                     const char *in_a,
+                                     char *out_y,
+                                     const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::atanh(exec_q,
+                         n, // number of elements to be calculated
+                         a, // pointer `a` containing input vector of size n
+                         y, // pointer `y` to the output vector of size n
+                         depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(atanh);
+} // namespace impl
+
+void init_atanh(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto atanh_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                           const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_atanh", atanh_pyapi,
+          "Call `atanh` function from OneMKL VM library to compute "
+          "the inverse hyperbolic tangent of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto atanh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                        const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst, output_typeid_vector, contig_dispatch_vector);
+    };
+    m.def("_mkl_atanh_to_call", atanh_need_to_call_pyapi,
+          "Check input arguments to answer if `atanh` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/atanh.hpp b/dpnp/backend/extensions/vm/atanh.hpp
index 9764df84ce3..afe404adf9b 100644
--- a/dpnp/backend/extensions/vm/atanh.hpp
+++ b/dpnp/backend/extensions/vm/atanh.hpp
@@ -25,55 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event atanh_contig_impl(sycl::queue exec_q,
-                              const std::int64_t n,
-                              const char *in_a,
-                              char *out_y,
-                              const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    using resTy = typename types::AtanhOutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::atanh(exec_q,
-                         n, // number of elements to be calculated
-                         a, // pointer `a` containing input vector of size n
-                         y, // pointer `y` to the output vector of size n
-                         depends);
-}
-
-template <typename fnT, typename T>
-struct AtanhContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::AtanhOutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return atanh_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_atanh(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/cbrt.cpp b/dpnp/backend/extensions/vm/cbrt.cpp
new file mode 100644
index 00000000000..88bc8282418
--- /dev/null
+++ b/dpnp/backend/extensions/vm/cbrt.cpp
@@ -0,0 +1,136 @@
+//*****************************************************************************
+// Copyright (c) 2024, Intel Corporation
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+// - Redistributions of source code must retain the above copyright notice,
+//   this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice,
+//   this list of conditions and the following disclaimer in the documentation
+//   and/or other materials provided with the distribution.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+// THE POSSIBILITY OF SUCH DAMAGE.
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "cbrt.hpp"
+#include "common.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::cbrt<T> function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type =
+        typename std::disjunction<td_ns::TypeMapResultEntry<T, double>,
+                                  td_ns::TypeMapResultEntry<T, float>,
+                                  td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event cbrt_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::cbrt(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(cbrt);
+} // namespace impl
+
+void init_cbrt(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto cbrt_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_cbrt", cbrt_pyapi,
+          "Call `cbrt` function from OneMKL VM library to compute "
+          "the element-wise cube root of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto cbrt_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                       const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst, output_typeid_vector, contig_dispatch_vector);
+    };
+    m.def("_mkl_cbrt_to_call", cbrt_need_to_call_pyapi,
+          "Check input arguments to answer if `cbrt` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/cbrt.hpp b/dpnp/backend/extensions/vm/cbrt.hpp
index 5c0a0adc53e..d4eb052a65b 100644
--- a/dpnp/backend/extensions/vm/cbrt.hpp
+++ b/dpnp/backend/extensions/vm/cbrt.hpp
@@ -25,55 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event cbrt_contig_impl(sycl::queue exec_q,
-                             const std::int64_t n,
-                             const char *in_a,
-                             char *out_y,
-                             const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    using resTy = typename types::CbrtOutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::cbrt(exec_q,
-                        n, // number of elements to be calculated
-                        a, // pointer `a` containing input vector of size n
-                        y, // pointer `y` to the output vector of size n
-                        depends);
-}
-
-template <typename fnT, typename T>
-struct CbrtContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::CbrtOutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return cbrt_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_cbrt(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/ceil.cpp b/dpnp/backend/extensions/vm/ceil.cpp
new file mode 100644
index 00000000000..14e7234a54c
--- /dev/null
+++ b/dpnp/backend/extensions/vm/ceil.cpp
@@ -0,0 +1,136 @@
+//*****************************************************************************
+// Copyright (c) 2024, Intel Corporation
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+// - Redistributions of source code must retain the above copyright notice,
+//   this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice,
+//   this list of conditions and the following disclaimer in the documentation
+//   and/or other materials provided with the distribution.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+// THE POSSIBILITY OF SUCH DAMAGE.
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "ceil.hpp"
+#include "common.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::ceil<T> function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type =
+        typename std::disjunction<td_ns::TypeMapResultEntry<T, double>,
+                                  td_ns::TypeMapResultEntry<T, float>,
+                                  td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event ceil_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::ceil(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(ceil);
+} // namespace impl
+
+void init_ceil(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto ceil_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_ceil", ceil_pyapi,
+          "Call `ceil` function from OneMKL VM library to compute "
+          "the ceiling of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto ceil_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                       const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst, output_typeid_vector, contig_dispatch_vector);
+    };
+    m.def("_mkl_ceil_to_call", ceil_need_to_call_pyapi,
+          "Check input arguments to answer if `ceil` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/ceil.hpp b/dpnp/backend/extensions/vm/ceil.hpp
index fd4f3a8680c..dd9006d1b18 100644
--- a/dpnp/backend/extensions/vm/ceil.hpp
+++ b/dpnp/backend/extensions/vm/ceil.hpp
@@ -25,55 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event ceil_contig_impl(sycl::queue exec_q,
-                             const std::int64_t n,
-                             const char *in_a,
-                             char *out_y,
-                             const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    using resTy = typename types::CeilOutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::ceil(exec_q,
-                        n, // number of elements to be calculated
-                        a, // pointer `a` containing input vector of size n
-                        y, // pointer `y` to the output vector of size n
-                        depends);
-}
-
-template <typename fnT, typename T>
-struct CeilContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::CeilOutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return ceil_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_ceil(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/common.hpp b/dpnp/backend/extensions/vm/common.hpp
index b53b9b0881c..74e9f81fa0f 100644
--- a/dpnp/backend/extensions/vm/common.hpp
+++ b/dpnp/backend/extensions/vm/common.hpp
@@ -25,252 +25,46 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
 #include <oneapi/mkl.hpp>
+#include <pybind11/pybind11.h>
 #include <sycl/sycl.hpp>
-#include <type_traits>
 #include <vector>
 
 // dpctl tensor headers
 #include "utils/memory_overlap.hpp"
 #include "utils/type_dispatch.hpp"
-#include "utils/type_utils.hpp"
 
 #include "dpnp_utils.hpp"
 
 static_assert(INTEL_MKL_VERSION >= __INTEL_MKL_2023_2_0_VERSION_REQUIRED,
               "OneMKL does not meet minimum version requirement");
 
-// OneMKL namespace with VM functions
-namespace mkl_vm = oneapi::mkl::vm;
-
-// dpctl namespace for type utils
-namespace type_utils = dpctl::tensor::type_utils;
-
-namespace dpnp
-{
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-typedef sycl::event (*unary_impl_fn_ptr_t)(sycl::queue,
-                                           const std::int64_t,
-                                           const char *,
-                                           char *,
-                                           const std::vector<sycl::event> &);
-
-typedef sycl::event (*binary_impl_fn_ptr_t)(sycl::queue,
-                                            const std::int64_t,
-                                            const char *,
-                                            const char *,
-                                            char *,
-                                            const std::vector<sycl::event> &);
-
-namespace dpctl_td_ns = dpctl::tensor::type_dispatch;
 namespace py = pybind11;
+namespace td_ns = dpctl::tensor::type_dispatch;
 
-template <typename dispatchT>
-std::pair<sycl::event, sycl::event>
-    unary_ufunc(sycl::queue exec_q,
-                dpctl::tensor::usm_ndarray src,
-                dpctl::tensor::usm_ndarray dst, // dst = op(src), elementwise
-                const std::vector<sycl::event> &depends,
-                const dispatchT &dispatch_vector)
+namespace dpnp::extensions::vm::py_internal
 {
-    // check type_nums
-    int src_typenum = src.get_typenum();
-    auto array_types = dpctl_td_ns::usm_ndarray_types();
-    int src_typeid = array_types.typenum_to_lookup_id(src_typenum);
-
-    // check that queues are compatible
-    if (!dpctl::utils::queues_are_compatible(exec_q, {src, dst})) {
-        throw py::value_error(
-            "Execution queue is not compatible with allocation queues.");
-    }
-
-    // check that dimensions are the same
-    int dst_nd = dst.get_ndim();
-    if (dst_nd != src.get_ndim()) {
-        throw py::value_error(
-            "Input and output arrays have have different dimensions.");
-    }
-
-    // check that shapes are the same
-    const py::ssize_t *src_shape = src.get_shape_raw();
-    const py::ssize_t *dst_shape = dst.get_shape_raw();
-    bool shapes_equal(true);
-    size_t src_nelems(1);
-
-    for (int i = 0; i < dst_nd; ++i) {
-        src_nelems *= static_cast<size_t>(src_shape[i]);
-        shapes_equal = shapes_equal && (src_shape[i] == dst_shape[i]);
-    }
-    if (!shapes_equal) {
-        throw py::value_error("Input and output arrays have different shapes.");
-    }
-
-    // if nelems is zero, return
-    if (src_nelems == 0) {
-        return std::make_pair(sycl::event(), sycl::event());
-    }
-
-    // ensure that output is ample enough to accommodate all elements
-    auto dst_offsets = dst.get_minmax_offsets();
-    // destination must be ample enough to accommodate all elements
-    {
-        size_t range =
-            static_cast<size_t>(dst_offsets.second - dst_offsets.first);
-        if (range + 1 < src_nelems) {
-            throw py::value_error(
-                "Destination array can not accommodate all the elements "
-                "of source array.");
-        }
-    }
-
-    // check memory overlap
-    auto const &overlap = dpctl::tensor::overlap::MemoryOverlap();
-    if (overlap(src, dst)) {
-        throw py::value_error("Arrays index overlapping segments of memory.");
-    }
-
-    const char *src_data = src.get_data();
-    char *dst_data = dst.get_data();
-
-    // handle contiguous inputs
-    bool is_src_c_contig = src.is_c_contiguous();
-    bool is_dst_c_contig = dst.is_c_contiguous();
-
-    bool all_c_contig = (is_src_c_contig && is_dst_c_contig);
-    if (!all_c_contig) {
-        throw py::value_error("Input and outpur arrays must be C-contiguous.");
-    }
-
-    auto dispatch_fn = dispatch_vector[src_typeid];
-    if (dispatch_fn == nullptr) {
-        throw py::value_error("No implementation is defined for ufunc.");
-    }
-    sycl::event comp_ev =
-        dispatch_fn(exec_q, src_nelems, src_data, dst_data, depends);
-
-    sycl::event ht_ev =
-        dpctl::utils::keep_args_alive(exec_q, {src, dst}, {comp_ev});
-    return std::make_pair(ht_ev, comp_ev);
-}
-
-template <typename dispatchT>
-std::pair<sycl::event, sycl::event> binary_ufunc(
-    sycl::queue exec_q,
-    dpctl::tensor::usm_ndarray src1,
-    dpctl::tensor::usm_ndarray src2,
-    dpctl::tensor::usm_ndarray dst, // dst = op(src1, src2), elementwise
-    const std::vector<sycl::event> &depends,
-    const dispatchT &dispatch_vector)
+template <typename output_typesT, typename contig_dispatchT>
+bool need_to_call_unary_ufunc(sycl::queue &exec_q,
+                              const dpctl::tensor::usm_ndarray &src,
+                              const dpctl::tensor::usm_ndarray &dst,
+                              const output_typesT &output_type_vec,
+                              const contig_dispatchT &contig_dispatch_vector)
 {
     // check type_nums
-    int src1_typenum = src1.get_typenum();
-    int src2_typenum = src2.get_typenum();
-
-    auto array_types = dpctl_td_ns::usm_ndarray_types();
-    int src1_typeid = array_types.typenum_to_lookup_id(src1_typenum);
-    int src2_typeid = array_types.typenum_to_lookup_id(src2_typenum);
-
-    if (src1_typeid != src2_typeid) {
-        throw py::value_error("Input arrays have different types.");
-    }
-
-    // check that queues are compatible
-    if (!dpctl::utils::queues_are_compatible(exec_q, {src1, src2, dst})) {
-        throw py::value_error(
-            "Execution queue is not compatible with allocation queues.");
-    }
-
-    // check shapes, broadcasting is assumed done by caller
-    // check that dimensions are the same
-    int dst_nd = dst.get_ndim();
-    if (dst_nd != src1.get_ndim() || dst_nd != src2.get_ndim()) {
-        throw py::value_error("Array dimensions are not the same.");
-    }
-
-    // check that shapes are the same
-    const py::ssize_t *src1_shape = src1.get_shape_raw();
-    const py::ssize_t *src2_shape = src2.get_shape_raw();
-    const py::ssize_t *dst_shape = dst.get_shape_raw();
-    bool shapes_equal(true);
-    size_t src_nelems(1);
-
-    for (int i = 0; i < dst_nd; ++i) {
-        src_nelems *= static_cast<size_t>(src1_shape[i]);
-        shapes_equal = shapes_equal && (src1_shape[i] == dst_shape[i] &&
-                                        src2_shape[i] == dst_shape[i]);
-    }
-    if (!shapes_equal) {
-        throw py::value_error("Array shapes are not the same.");
-    }
-
-    // if nelems is zero, return
-    if (src_nelems == 0) {
-        return std::make_pair(sycl::event(), sycl::event());
-    }
-
-    // ensure that output is ample enough to accommodate all elements
-    auto dst_offsets = dst.get_minmax_offsets();
-    // destination must be ample enough to accommodate all elements
-    {
-        size_t range =
-            static_cast<size_t>(dst_offsets.second - dst_offsets.first);
-        if (range + 1 < src_nelems) {
-            throw py::value_error(
-                "Destination array can not accommodate all
the " - "elements of source array."); - } - } - - // check memory overlap - auto const &overlap = dpctl::tensor::overlap::MemoryOverlap(); - if (overlap(src1, dst) || overlap(src2, dst)) { - throw py::value_error("Arrays index overlapping segments of memory."); - } - - const char *src1_data = src1.get_data(); - const char *src2_data = src2.get_data(); - char *dst_data = dst.get_data(); - - // handle contiguous inputs - bool is_src1_c_contig = src1.is_c_contiguous(); - bool is_src2_c_contig = src2.is_c_contiguous(); - bool is_dst_c_contig = dst.is_c_contiguous(); + int src_typenum = src.get_typenum(); + int dst_typenum = dst.get_typenum(); - bool all_c_contig = - (is_src1_c_contig && is_src2_c_contig && is_dst_c_contig); - if (!all_c_contig) { - throw py::value_error("Input and outpur arrays must be C-contiguous."); - } + auto array_types = td_ns::usm_ndarray_types(); + int src_typeid = array_types.typenum_to_lookup_id(src_typenum); + int dst_typeid = array_types.typenum_to_lookup_id(dst_typenum); - auto dispatch_fn = dispatch_vector[src1_typeid]; - if (dispatch_fn == nullptr) { - throw py::value_error("No implementation is defined for ufunc."); + // check that types are supported + int func_output_typeid = output_type_vec[src_typeid]; + if (dst_typeid != func_output_typeid) { + return false; } - sycl::event comp_ev = dispatch_fn(exec_q, src_nelems, src1_data, src2_data, - dst_data, depends); - - sycl::event ht_ev = - dpctl::utils::keep_args_alive(exec_q, {src1, src2, dst}, {comp_ev}); - return std::make_pair(ht_ev, comp_ev); -} - -template -bool need_to_call_unary_ufunc(sycl::queue exec_q, - dpctl::tensor::usm_ndarray src, - dpctl::tensor::usm_ndarray dst, - const dispatchT &dispatch_vector) -{ - // check type_nums - int src_typenum = src.get_typenum(); - auto array_types = dpctl_td_ns::usm_ndarray_types(); - int src_typeid = array_types.typenum_to_lookup_id(src_typenum); // OneMKL VM functions perform a copy on host if no double type support if 
(!exec_q.get_device().has(sycl::aspect::fp64)) { @@ -338,26 +132,35 @@ bool need_to_call_unary_ufunc(sycl::queue exec_q, } // MKL function is not defined for the type - if (dispatch_vector[src_typeid] == nullptr) { + if (contig_dispatch_vector[src_typeid] == nullptr) { return false; } return true; } -template -bool need_to_call_binary_ufunc(sycl::queue exec_q, - dpctl::tensor::usm_ndarray src1, - dpctl::tensor::usm_ndarray src2, - dpctl::tensor::usm_ndarray dst, - const dispatchT &dispatch_vector) +template +bool need_to_call_binary_ufunc(sycl::queue &exec_q, + const dpctl::tensor::usm_ndarray &src1, + const dpctl::tensor::usm_ndarray &src2, + const dpctl::tensor::usm_ndarray &dst, + const output_typesT &output_type_table, + const contig_dispatchT &contig_dispatch_table) { // check type_nums int src1_typenum = src1.get_typenum(); int src2_typenum = src2.get_typenum(); + int dst_typenum = dst.get_typenum(); - auto array_types = dpctl_td_ns::usm_ndarray_types(); + auto array_types = td_ns::usm_ndarray_types(); int src1_typeid = array_types.typenum_to_lookup_id(src1_typenum); int src2_typeid = array_types.typenum_to_lookup_id(src2_typenum); + int dst_typeid = array_types.typenum_to_lookup_id(dst_typenum); + + // check that types are supported + int output_typeid = output_type_table[src1_typeid][src2_typeid]; + if (output_typeid != dst_typeid) { + return false; + } // types must be the same if (src1_typeid != src2_typeid) { @@ -434,23 +237,110 @@ bool need_to_call_binary_ufunc(sycl::queue exec_q, } // MKL function is not defined for the type - if (dispatch_vector[src1_typeid] == nullptr) { + if (contig_dispatch_table[src1_typeid] == nullptr) { return false; } return true; } +/** + * @brief A macro used to define factories and a populating unary functions + * to dispatch to a callback with proper OneMKL function within VM extension + * scope. 
+ */ +#define MACRO_POPULATE_DISPATCH_VECTORS(__name__) \ + template \ + struct ContigFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v::value_type, \ + void>) { \ + return nullptr; \ + } \ + else { \ + return __name__##_contig_impl; \ + } \ + } \ + }; \ + \ + template \ + struct TypeMapFactory \ + { \ + std::enable_if_t::value, int> get() \ + { \ + using rT = typename OutputType::value_type; \ + return td_ns::GetTypeid{}.get(); \ + } \ + }; \ + \ + static void populate_dispatch_vectors(void) \ + { \ + py_internal::init_ufunc_dispatch_vector( \ + output_typeid_vector); \ + py_internal::init_ufunc_dispatch_vector( \ + contig_dispatch_vector); \ + }; + +/** + * @brief A macro used to define factories and a populating binary functions + * to dispatch to a callback with proper OneMKL function within VM extension + * scope. + */ +#define MACRO_POPULATE_DISPATCH_TABLES(__name__) \ + template \ + struct ContigFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v< \ + typename OutputType::value_type, void>) \ + { \ + return nullptr; \ + } \ + else { \ + return __name__##_contig_impl; \ + } \ + } \ + }; \ + \ + template \ + struct TypeMapFactory \ + { \ + std::enable_if_t::value, int> get() \ + { \ + using rT = typename OutputType::value_type; \ + return td_ns::GetTypeid{}.get(); \ + } \ + }; \ + \ + static void populate_dispatch_tables(void) \ + { \ + py_internal::init_ufunc_dispatch_table( \ + output_typeid_vector); \ + py_internal::init_ufunc_dispatch_table( \ + contig_dispatch_vector); \ + }; + template - typename factoryT> + typename factoryT, + int _num_types = td_ns::num_types> void init_ufunc_dispatch_vector(dispatchT dispatch_vector[]) { - dpctl_td_ns::DispatchVectorBuilder - contig; - contig.populate_dispatch_vector(dispatch_vector); + td_ns::DispatchVectorBuilder dvb; + dvb.populate_dispatch_vector(dispatch_vector); +} + +template + typename factoryT, + int _num_types = td_ns::num_types> +void init_ufunc_dispatch_table(dispatchT 
dispatch_table[][_num_types]) +{ + td_ns::DispatchTableBuilder dtb; + dtb.populate_dispatch_table(dispatch_table); } -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::vm::py_internal diff --git a/dpnp/backend/extensions/vm/conj.cpp b/dpnp/backend/extensions/vm/conj.cpp new file mode 100644 index 00000000000..edfb4384dad --- /dev/null +++ b/dpnp/backend/extensions/vm/conj.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
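The factory/builder machinery introduced above (`ContigFactory`, `TypeMapFactory`, `init_ufunc_dispatch_vector`) boils down to a type-id-indexed function-pointer table populated once at module load. The following is a minimal, self-contained sketch of that pattern, independent of dpctl/SYCL; all names (`ContigFactorySketch`, `need_to_call_sketch`, the fake type ids) are illustrative only, not the real dpctl API:

```cpp
#include <array>
#include <cassert>
#include <utility>

// A unary implementation slot, standing in for unary_contig_impl_fn_ptr_t
using unary_fn = double (*)(double);

constexpr int num_types = 3; // pretend slots: 0 = float, 1 = double, 2 = unsupported

static double negate_impl(double v) { return -v; }

// Factory queried once per type id: returns an implementation for supported
// ids and nullptr otherwise, mirroring the ContigFactory in the macro above
template <int TypeId>
struct ContigFactorySketch
{
    unary_fn get()
    {
        if constexpr (TypeId == 2) { // pretend type id 2 has no MKL support
            return nullptr;
        }
        else {
            return negate_impl;
        }
    }
};

// Populate the dispatch vector by instantiating the factory for every type id,
// mirroring init_ufunc_dispatch_vector / populate_dispatch_vector
template <int... Ids>
void populate(std::array<unary_fn, num_types> &vec,
              std::integer_sequence<int, Ids...>)
{
    ((vec[Ids] = ContigFactorySketch<Ids>{}.get()), ...);
}

std::array<unary_fn, num_types> make_dispatch_vector()
{
    std::array<unary_fn, num_types> vec{};
    populate(vec, std::make_integer_sequence<int, num_types>{});
    return vec;
}

// Decision helper mirroring the "MKL function is not defined for the type"
// check in need_to_call_unary_ufunc: usable only if a slot is registered
bool need_to_call_sketch(const std::array<unary_fn, num_types> &vec, int type_id)
{
    return vec[type_id] != nullptr;
}
```

The real code differs in that the builder walks dpctl's full type list and the slots hold contiguous-kernel entry points, but the lookup-then-fallback decision is the same.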
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "conj.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::conj function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event conj_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::conj(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(conj); +} // namespace impl + +void init_conj(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto conj_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_conj", conj_pyapi, + "Call `conj` function from OneMKL VM library to compute " + "the conjugate of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto conj_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, 
output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_conj_to_call", conj_need_to_call_pyapi, + "Check input arguments to answer if `conj` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/conj.hpp b/dpnp/backend/extensions/vm/conj.hpp index af3acb3466e..0ce61082ab6 100644 --- a/dpnp/backend/extensions/vm/conj.hpp +++ b/dpnp/backend/extensions/vm/conj.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event conj_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::ConjOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::conj(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct ConjContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::ConjOutputType::value_type, void>) - { - return nullptr; - } - else { - return conj_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_conj(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/cos.cpp b/dpnp/backend/extensions/vm/cos.cpp new file mode 100644 index 00000000000..e7925cc3298 --- /dev/null +++ b/dpnp/backend/extensions/vm/cos.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights 
reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
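Each new `.cpp` file in this patch declares an `OutputType` factory built on `std::disjunction` over `td_ns::TypeMapResultEntry` entries, resolving the result type at compile time (with `void` meaning "unsupported"). A self-contained sketch of that resolution, using simplified stand-in helpers (`MapEntry`, `DefaultEntry` are illustrative names, not the real dpctl ones):

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Stand-in for td_ns::TypeMapResultEntry: "true" when T matches MatchT,
// carrying the mapped result type
template <typename T, typename MatchT, typename ResT>
struct MapEntry : std::is_same<T, MatchT>
{
    using result_type = ResT;
};

// Stand-in for td_ns::DefaultResultEntry: always true, used as the fallback
template <typename ResT>
struct DefaultEntry : std::true_type
{
    using result_type = ResT;
};

// std::disjunction derives from the first entry whose ::value is true, so
// ::result_type below is the mapped output type; void marks "no MKL support"
template <typename T>
struct OutputTypeSketch
{
    using value_type = typename std::disjunction<
        MapEntry<T, double, double>,
        MapEntry<T, float, float>,
        MapEntry<T, std::complex<float>, std::complex<float>>,
        DefaultEntry<void>>::result_type;
};

static_assert(std::is_same_v<OutputTypeSketch<double>::value_type, double>);
static_assert(std::is_same_v<OutputTypeSketch<int>::value_type, void>);
```

This is why the `ContigFactory` in the dispatch macros can test `std::is_same_v<typename OutputType<T>::value_type, void>` to return `nullptr` for unsupported types.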
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "cos.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::cos function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event cos_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::cos(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(cos); +} // namespace impl + +void init_cos(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto cos_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_cos", cos_pyapi, + "Call `cos` function from OneMKL VM library to compute " + "the cosine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto cos_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_cos_to_call", cos_need_to_call_pyapi, + "Check input arguments to answer if `cos` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/cos.hpp b/dpnp/backend/extensions/vm/cos.hpp index a085123ca14..59c92ad0fd8 100644 --- a/dpnp/backend/extensions/vm/cos.hpp +++ b/dpnp/backend/extensions/vm/cos.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event cos_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::CosOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::cos(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct CosContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::CosOutputType::value_type, void>) - { - return nullptr; - } - else { - return cos_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_cos(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/cosh.cpp b/dpnp/backend/extensions/vm/cosh.cpp new file mode 100644 index 00000000000..bb883c97c33 --- /dev/null +++ b/dpnp/backend/extensions/vm/cosh.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// 
Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "cosh.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::cosh function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event cosh_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::cosh(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(cosh); +} // namespace impl + +void init_cosh(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto cosh_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_cosh", cosh_pyapi, + "Call `cosh` function from OneMKL VM library to compute " + "the hyperbolic cosine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto cosh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_cosh_to_call", cosh_need_to_call_pyapi, + "Check input arguments to answer if `cosh` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/cosh.hpp b/dpnp/backend/extensions/vm/cosh.hpp index 301a2fbeb22..030ef945823 100644 --- a/dpnp/backend/extensions/vm/cosh.hpp +++ b/dpnp/backend/extensions/vm/cosh.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event cosh_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::CoshOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::cosh(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct CoshContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::CoshOutputType::value_type, void>) - { - return nullptr; - } - else { - return cosh_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_cosh(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/div.cpp b/dpnp/backend/extensions/vm/div.cpp new file mode 100644 index 00000000000..8cdb547feb4 --- /dev/null +++ b/dpnp/backend/extensions/vm/div.cpp @@ -0,0 +1,171 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
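For binary functions such as `div`, the dispatch data is two-dimensional: `output_type_table[src1_typeid][src2_typeid]` yields the expected output type id, and `need_to_call_binary_ufunc` additionally requires the destination's type id to match it. A minimal sketch of that check, with made-up type ids and a toy table (the real table is built by `MACRO_POPULATE_DISPATCH_TABLES` over dpctl's type list):

```cpp
#include <array>
#include <cassert>

constexpr int num_types = 3;

// output_type_table[src1][src2] -> output type id, or -1 for unsupported pairs
using type_table = std::array<std::array<int, num_types>, num_types>;

constexpr type_table make_output_type_table()
{
    type_table t{};
    for (int i = 0; i < num_types; ++i) {
        for (int j = 0; j < num_types; ++j) {
            // toy rule: only same-typed inputs are supported, output keeps the type
            t[i][j] = (i == j) ? i : -1;
        }
    }
    return t;
}

// Mirrors the type portion of need_to_call_binary_ufunc: the MKL path applies
// only when the input pair is supported AND dst matches the mapped output type
bool can_use_mkl_sketch(int src1_id, int src2_id, int dst_id)
{
    constexpr type_table table = make_output_type_table();
    int out_id = table[src1_id][src2_id];
    return out_id != -1 && out_id == dst_id;
}
```

The real check also verifies queue compatibility, shapes, contiguity, and a non-null slot in the contiguous dispatch table, but the table lookup above is the gatekeeping idea.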
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "div.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::div function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::BinaryTypeMapResultEntry, + T2, + std::complex, + std::complex>, + td_ns::BinaryTypeMapResultEntry, + T2, + std::complex, + std::complex>, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event div_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + ssize_t a_offset, + const char *in_b, + ssize_t b_offset, + char *out_y, + ssize_t out_offset, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + tu_ns::validate_type_for_device(exec_q); + + if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) { + throw std::runtime_error("Arrays offsets have to be equals to 0"); + } + + std::int64_t n = static_cast(in_n); + const T1 *a = reinterpret_cast(in_a); + const T2 *b = reinterpret_cast(in_b); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::div(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing 1st input vector of size n + b, // pointer `b` containing 2nd input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types][td_ns::num_types]; +static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types] + [td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(div); +} // namespace impl + +void init_div(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_tables(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto 
div_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                        const arrayT &src2, const arrayT &dst,
+                        const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_div", div_pyapi,
+          "Call `div` function from OneMKL VM library to perform "
+          "element-by-element division of vector `src1` by vector `src2` "
+          "into resulting vector `dst`",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto div_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                      const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_div_to_call", div_need_to_call_pyapi,
+          "Check input arguments to answer if `div` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/div.hpp b/dpnp/backend/extensions/vm/div.hpp
index c1306660484..8095f0bb2cb 100644
--- a/dpnp/backend/extensions/vm/div.hpp
+++ b/dpnp/backend/extensions/vm/div.hpp
@@ -25,58 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event div_contig_impl(sycl::queue exec_q,
-                            const std::int64_t n,
-                            const char *in_a,
-                            const char *in_b,
-                            char *out_y,
-                            const std::vector<sycl::event>
&depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - const T *b = reinterpret_cast(in_b); - using resTy = typename types::DivOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::div(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct DivContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::DivOutputType::value_type, void>) - { - return nullptr; - } - else { - return div_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_div(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/exp.cpp b/dpnp/backend/extensions/vm/exp.cpp new file mode 100644 index 00000000000..b7f8d4422d1 --- /dev/null +++ b/dpnp/backend/extensions/vm/exp.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "exp.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::exp function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event exp_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::exp(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(exp); +} // namespace impl + +void init_exp(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto exp_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_exp", exp_pyapi, + "Call `exp` function from OneMKL VM library to compute " + "the natural (base-e) exponential of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto exp_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_exp_to_call", exp_need_to_call_pyapi, + "Check input arguments to answer if `exp` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/exp.hpp b/dpnp/backend/extensions/vm/exp.hpp index 936b6a5a0ce..a1d88998fd4 100644 --- a/dpnp/backend/extensions/vm/exp.hpp +++ b/dpnp/backend/extensions/vm/exp.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event exp_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::ExpOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::exp(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct ExpContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::ExpOutputType::value_type, void>) - { - return nullptr; - } - else { - return exp_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_exp(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/exp2.cpp b/dpnp/backend/extensions/vm/exp2.cpp new file mode 100644 index 00000000000..8b5d7a7c5ff --- /dev/null +++ b/dpnp/backend/extensions/vm/exp2.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// 
Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "exp2.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::exp2 function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event exp2_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::exp2(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(exp2); +} // namespace impl + +void init_exp2(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto exp2_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_exp2", exp2_pyapi, + "Call `exp2` function from OneMKL VM library to compute " + "the base-2 exponential of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto exp2_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, 
contig_dispatch_vector); + }; + m.def("_mkl_exp2_to_call", exp2_need_to_call_pyapi, + "Check input arguments to answer if `exp2` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/exp2.hpp b/dpnp/backend/extensions/vm/exp2.hpp index 362897fdbe6..fe0694c5181 100644 --- a/dpnp/backend/extensions/vm/exp2.hpp +++ b/dpnp/backend/extensions/vm/exp2.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event exp2_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::Exp2OutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::exp2(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct Exp2ContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::Exp2OutputType::value_type, void>) - { - return nullptr; - } - else { - return exp2_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_exp2(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/expm1.cpp b/dpnp/backend/extensions/vm/expm1.cpp new file mode 100644 index 00000000000..b27668ba7c4 --- /dev/null +++ b/dpnp/backend/extensions/vm/expm1.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. 
+// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "expm1.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::expm1 function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event expm1_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::expm1(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(expm1); +} // namespace impl + +void init_expm1(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto expm1_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_expm1", expm1_pyapi, + "Call `expm1` function from OneMKL VM library to compute " + "the subtraction of 1 from the exponential of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto expm1_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, 
output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_expm1_to_call", expm1_need_to_call_pyapi, + "Check input arguments to answer if `expm1` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/expm1.hpp b/dpnp/backend/extensions/vm/expm1.hpp index d0a94bca8e9..7719d4948b4 100644 --- a/dpnp/backend/extensions/vm/expm1.hpp +++ b/dpnp/backend/extensions/vm/expm1.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event expm1_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::Expm1OutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::expm1(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct Expm1ContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::Expm1OutputType::value_type, void>) - { - return nullptr; - } - else { - return expm1_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_expm1(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/floor.cpp b/dpnp/backend/extensions/vm/floor.cpp new file mode 100644 index 00000000000..8a32f40e0ff --- /dev/null +++ b/dpnp/backend/extensions/vm/floor.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel 
Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "floor.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::floor function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event floor_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::floor(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(floor); +} // namespace impl + +void init_floor(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto floor_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_floor", floor_pyapi, + "Call `floor` function from OneMKL VM library to compute " + "the floor of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto floor_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, 
contig_dispatch_vector); + }; + m.def("_mkl_floor_to_call", floor_need_to_call_pyapi, + "Check input arguments to answer if `floor` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/floor.hpp b/dpnp/backend/extensions/vm/floor.hpp index c138b8b6678..4cc85f2bb89 100644 --- a/dpnp/backend/extensions/vm/floor.hpp +++ b/dpnp/backend/extensions/vm/floor.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event floor_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::FloorOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::floor(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct FloorContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::FloorOutputType::value_type, void>) - { - return nullptr; - } - else { - return floor_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_floor(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/hypot.cpp b/dpnp/backend/extensions/vm/hypot.cpp new file mode 100644 index 00000000000..42dd8127111 --- /dev/null +++ b/dpnp/backend/extensions/vm/hypot.cpp @@ -0,0 +1,160 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights 
reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "hypot.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::hypot function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event hypot_contig_impl(sycl::queue &exec_q,
+                                     std::size_t in_n,
+                                     const char *in_a,
+                                     ssize_t a_offset,
+                                     const char *in_b,
+                                     ssize_t b_offset,
+                                     char *out_y,
+                                     ssize_t out_offset,
+                                     const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::hypot(exec_q,
+                         n, // number of elements to be calculated
+                         a, // pointer `a` containing 1st input vector of size n
+                         b, // pointer `b` containing 2nd input vector of size n
+                         y, // pointer `y` to the output vector of size n
+                         depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(hypot);
+} // namespace impl
+
+void init_hypot(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto hypot_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                           const arrayT &src2, const arrayT &dst,
+                           const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_hypot", hypot_pyapi,
+          "Call `hypot` function from OneMKL VM library to compute "
+          "the element-wise square root of the sum of squares",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto hypot_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                        const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_hypot_to_call", hypot_need_to_call_pyapi,
+          "Check input arguments to answer if `hypot` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/hypot.hpp b/dpnp/backend/extensions/vm/hypot.hpp
index 19dd4345c36..f7a171556d0 100644
--- a/dpnp/backend/extensions/vm/hypot.hpp
+++ b/dpnp/backend/extensions/vm/hypot.hpp
@@ -25,58 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event hypot_contig_impl(sycl::queue exec_q,
-                              const std::int64_t n,
-                              const char *in_a,
-                              const char *in_b,
-                              char *out_y,
-                              const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    const T *b = reinterpret_cast<const T *>(in_b);
-    using resTy = typename
types::HypotOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::hypot(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct HypotContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::HypotOutputType::value_type, void>) - { - return nullptr; - } - else { - return hypot_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_hypot(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/ln.cpp b/dpnp/backend/extensions/vm/ln.cpp new file mode 100644 index 00000000000..2eb321a3777 --- /dev/null +++ b/dpnp/backend/extensions/vm/ln.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "ln.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::ln function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event ln_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::ln(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(ln); +} // namespace impl + +void init_ln(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto ln_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_ln", ln_pyapi, + "Call `ln` function from OneMKL VM library to compute " + "the natural logarithm of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto ln_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_ln_to_call", ln_need_to_call_pyapi, + "Check input arguments to answer if `ln` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/ln.hpp b/dpnp/backend/extensions/vm/ln.hpp index 574cc8fa33c..7dadf76b2fd 100644 --- a/dpnp/backend/extensions/vm/ln.hpp +++ b/dpnp/backend/extensions/vm/ln.hpp @@ -25,54 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event ln_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::LnOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::ln(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct LnContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::LnOutputType::value_type, void>) { - return nullptr; - } - else { - return ln_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_ln(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log10.cpp b/dpnp/backend/extensions/vm/log10.cpp new file mode 100644 index 00000000000..e685e5fce60 --- /dev/null +++ b/dpnp/backend/extensions/vm/log10.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 
2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "log10.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::log10 function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event log10_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::log10(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(log10); +} // namespace impl + +void init_log10(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto log10_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_log10", log10_pyapi, + "Call `log10` function from OneMKL VM library to compute " + "the base-10 logarithm of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto log10_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_log10_to_call", log10_need_to_call_pyapi, + "Check input arguments to answer if `log10` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log10.hpp b/dpnp/backend/extensions/vm/log10.hpp index dc030817cda..c62ae122d35 100644 --- a/dpnp/backend/extensions/vm/log10.hpp +++ b/dpnp/backend/extensions/vm/log10.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event log10_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::Log10OutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::log10(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct Log10ContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::Log10OutputType::value_type, void>) - { - return nullptr; - } - else { - return log10_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_log10(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log1p.cpp b/dpnp/backend/extensions/vm/log1p.cpp new file mode 100644 index 00000000000..2db1491e5eb --- /dev/null +++ b/dpnp/backend/extensions/vm/log1p.cpp @@ -0,0 +1,136 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "log1p.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::log1p function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event log1p_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::log1p(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(log1p); +} // namespace impl + +void init_log1p(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto log1p_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_log1p", log1p_pyapi, + "Call `log1p` function from OneMKL VM library to compute " + "the natural logarithm of 1 plus vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto log1p_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, 
contig_dispatch_vector); + }; + m.def("_mkl_log1p_to_call", log1p_need_to_call_pyapi, + "Check input arguments to answer if `log1p` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log1p.hpp b/dpnp/backend/extensions/vm/log1p.hpp index 39ab1b3a21c..7cbfb1fe187 100644 --- a/dpnp/backend/extensions/vm/log1p.hpp +++ b/dpnp/backend/extensions/vm/log1p.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event log1p_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::Log1pOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::log1p(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct Log1pContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::Log1pOutputType::value_type, void>) - { - return nullptr; - } - else { - return log1p_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_log1p(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log2.cpp b/dpnp/backend/extensions/vm/log2.cpp new file mode 100644 index 00000000000..a6800185c25 --- /dev/null +++ b/dpnp/backend/extensions/vm/log2.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights 
reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "log2.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::log2 function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event log2_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::log2(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(log2); +} // namespace impl + +void init_log2(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto log2_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_log2", log2_pyapi, + "Call `log2` function from OneMKL VM library to compute " + "the base-2 logarithm of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto log2_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, 
contig_dispatch_vector); + }; + m.def("_mkl_log2_to_call", log2_need_to_call_pyapi, + "Check input arguments to answer if `log2` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/log2.hpp b/dpnp/backend/extensions/vm/log2.hpp index 2c419ac8ab2..34dd1a92136 100644 --- a/dpnp/backend/extensions/vm/log2.hpp +++ b/dpnp/backend/extensions/vm/log2.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event log2_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::Log2OutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::log2(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct Log2ContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::Log2OutputType::value_type, void>) - { - return nullptr; - } - else { - return log2_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_log2(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/mul.cpp b/dpnp/backend/extensions/vm/mul.cpp new file mode 100644 index 00000000000..34007fbc07c --- /dev/null +++ b/dpnp/backend/extensions/vm/mul.cpp @@ -0,0 +1,171 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. 
+// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "mul.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::mul function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<double>,
+                                        T2,
+                                        std::complex<double>,
+                                        std::complex<double>>,
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<float>,
+                                        T2,
+                                        std::complex<float>,
+                                        std::complex<float>>,
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event mul_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   ssize_t a_offset,
+                                   const char *in_b,
+                                   ssize_t b_offset,
+                                   char *out_y,
+                                   ssize_t out_offset,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::mul(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing 1st input vector of size n
+                       b, // pointer `b` containing 2nd input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(mul);
+} // namespace impl
+
+void init_mul(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto mul_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                         const arrayT &src2, const arrayT &dst,
+                         const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_mul", mul_pyapi,
+          "Call `mul` function from OneMKL VM library to perform element "
+          "by element multiplication of vector `src1` by vector `src2`, "
+          "writing the result to vector `dst`",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto mul_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                      const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_mul_to_call", mul_need_to_call_pyapi,
+          "Check input arguments to answer if `mul` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/mul.hpp b/dpnp/backend/extensions/vm/mul.hpp
index 39ea8eec20a..4dd138aea52 100644
--- a/dpnp/backend/extensions/vm/mul.hpp
+++ b/dpnp/backend/extensions/vm/mul.hpp
@@ -25,58 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event mul_contig_impl(sycl::queue exec_q,
-                            const std::int64_t n,
-                            const char *in_a,
-                            const char *in_b,
-                            char *out_y,
-                            const
std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - const T *b = reinterpret_cast(in_b); - using resTy = typename types::MulOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::mul(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct MulContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::MulOutputType::value_type, void>) - { - return nullptr; - } - else { - return mul_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_mul(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/pow.cpp b/dpnp/backend/extensions/vm/pow.cpp new file mode 100644 index 00000000000..65acd2ece44 --- /dev/null +++ b/dpnp/backend/extensions/vm/pow.cpp @@ -0,0 +1,171 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "pow.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::pow function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<double>,
+                                        T2,
+                                        std::complex<double>,
+                                        std::complex<double>>,
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<float>,
+                                        T2,
+                                        std::complex<float>,
+                                        std::complex<float>>,
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event pow_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   ssize_t a_offset,
+                                   const char *in_b,
+                                   ssize_t b_offset,
+                                   char *out_y,
+                                   ssize_t out_offset,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::pow(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing 1st input vector of size n
+                       b, // pointer `b` containing 2nd input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(pow);
+} // namespace impl
+
+void init_pow(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto pow_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                         const arrayT &src2, const arrayT &dst,
+                         const event_vecT &depends = {}) {
+        return py_int::py_binary_ufunc(
+            src1, src2, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{},
+            // no support of C-contig row with broadcasting in OneMKL
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{},
+            td_ns::NullPtrTable<
+                impl::
+                    binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{});
+    };
+    m.def("_pow", pow_pyapi,
+          "Call `pow` function from OneMKL VM library to perform element "
+          "by element exponentiation of vector `src1` raised to the power "
+          "of vector `src2` to resulting vector `dst`",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"), py::arg("depends") = py::list());
+
+    auto pow_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1,
+                                      const arrayT &src2, const arrayT &dst) {
+        return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst,
+                                                      output_typeid_vector,
+                                                      contig_dispatch_vector);
+    };
+    m.def("_mkl_pow_to_call", pow_need_to_call_pyapi,
+          "Check input arguments to answer if `pow` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"),
+          py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/pow.hpp b/dpnp/backend/extensions/vm/pow.hpp
index f5e946914bf..ef6770d1065 100644
--- a/dpnp/backend/extensions/vm/pow.hpp
+++ b/dpnp/backend/extensions/vm/pow.hpp
@@ -25,58 +25,11 @@
 
 #pragma once
 
-#include <CL/sycl.hpp>
+#include <pybind11/pybind11.h>
 
-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;
 
-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event pow_contig_impl(sycl::queue exec_q,
-                            const std::int64_t n,
-                            const char *in_a,
-                            const char *in_b,
-                            char
*out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - const T *b = reinterpret_cast(in_b); - using resTy = typename types::PowOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::pow(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct PowContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::PowOutputType::value_type, void>) - { - return nullptr; - } - else { - return pow_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_pow(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/rint.cpp b/dpnp/backend/extensions/vm/rint.cpp new file mode 100644 index 00000000000..ee0edbecd23 --- /dev/null +++ b/dpnp/backend/extensions/vm/rint.cpp @@ -0,0 +1,136 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "rint.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::rint function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event rint_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::rint(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(rint); +} // namespace impl + +void init_rint(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto rint_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_round", rint_pyapi, + "Call `rint` function from OneMKL VM library to compute " + "the rounded value of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto rint_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, 
contig_dispatch_vector); + }; + m.def("_mkl_round_to_call", rint_need_to_call_pyapi, + "Check input arguments to answer if `rint` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/round.hpp b/dpnp/backend/extensions/vm/rint.hpp similarity index 53% rename from dpnp/backend/extensions/vm/round.hpp rename to dpnp/backend/extensions/vm/rint.hpp index a2ae3b3bc52..ce493368788 100644 --- a/dpnp/backend/extensions/vm/round.hpp +++ b/dpnp/backend/extensions/vm/rint.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event round_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::RoundOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::rint(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct RoundContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::RoundOutputType::value_type, void>) - { - return nullptr; - } - else { - return round_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_rint(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sin.cpp b/dpnp/backend/extensions/vm/sin.cpp new file mode 100644 index 00000000000..55d9f8ed301 --- /dev/null +++ b/dpnp/backend/extensions/vm/sin.cpp @@ -0,0 +1,138 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "sin.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::sin function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry>, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event sin_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + char *out_y, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + + std::int64_t n = static_cast(in_n); + const T *a = reinterpret_cast(in_a); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::sin(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types]; +static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(sin); +} // namespace impl + +void init_sin(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_vectors(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto sin_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst, const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrVector{}); + }; + m.def("_sin", sin_pyapi, + "Call `sin` function from OneMKL VM library to compute " + "the sine of vector elements", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), + py::arg("depends") = py::list()); + + auto sin_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src, + const arrayT &dst) { + return 
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_sin_to_call", sin_need_to_call_pyapi, + "Check input arguments to answer if `sin` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sin.hpp b/dpnp/backend/extensions/vm/sin.hpp index 0af14c68c87..dcda488e728 100644 --- a/dpnp/backend/extensions/vm/sin.hpp +++ b/dpnp/backend/extensions/vm/sin.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event sin_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::SinOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::sin(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct SinContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::SinOutputType::value_type, void>) - { - return nullptr; - } - else { - return sin_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_sin(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sinh.cpp b/dpnp/backend/extensions/vm/sinh.cpp new file mode 100644 index 00000000000..f8ddbc580eb --- /dev/null +++ b/dpnp/backend/extensions/vm/sinh.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// 
Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "sinh.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::sinh function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event sinh_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::sinh(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(sinh);
+} // namespace impl
+
+void init_sinh(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto sinh_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_sinh", sinh_pyapi,
+          "Call `sinh` function from OneMKL VM library to compute "
+          "the hyperbolic sine of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto sinh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                       const arrayT &dst) {
+        return
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_sinh_to_call", sinh_need_to_call_pyapi, + "Check input arguments to answer if `sinh` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sinh.hpp b/dpnp/backend/extensions/vm/sinh.hpp index 6fe53423c53..92f1e740a62 100644 --- a/dpnp/backend/extensions/vm/sinh.hpp +++ b/dpnp/backend/extensions/vm/sinh.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event sinh_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::SinhOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::sinh(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct SinhContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::SinhOutputType::value_type, void>) - { - return nullptr; - } - else { - return sinh_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_sinh(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sqr.cpp b/dpnp/backend/extensions/vm/sqr.cpp new file mode 100644 index 00000000000..f42427ea00f --- /dev/null +++ b/dpnp/backend/extensions/vm/sqr.cpp @@ -0,0 +1,136 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "sqr.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::sqr function. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type =
+        typename std::disjunction<td_ns::TypeMapResultEntry<T, double>,
+                                  td_ns::TypeMapResultEntry<T, float>,
+                                  td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event sqr_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   char *out_y,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::sqr(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(sqr);
+} // namespace impl
+
+void init_sqr(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto sqr_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                         const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_sqr", sqr_pyapi,
+          "Call `sqr` function from OneMKL VM library to perform "
+          "element-by-element squaring of vector `src`, writing results "
+          "to vector `dst`",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto sqr_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                      const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst,
output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_sqr_to_call", sqr_need_to_call_pyapi, + "Check input arguments to answer if `sqr` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sqr.hpp b/dpnp/backend/extensions/vm/sqr.hpp index 8f1d4ac44fd..2fe78ceead6 100644 --- a/dpnp/backend/extensions/vm/sqr.hpp +++ b/dpnp/backend/extensions/vm/sqr.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event sqr_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::SqrOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::sqr(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct SqrContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::SqrOutputType::value_type, void>) - { - return nullptr; - } - else { - return sqr_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_sqr(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sqrt.cpp b/dpnp/backend/extensions/vm/sqrt.cpp new file mode 100644 index 00000000000..70ebbf298fd --- /dev/null +++ b/dpnp/backend/extensions/vm/sqrt.cpp @@ -0,0 +1,139 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. 
+// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "common.hpp"
+#include "sqrt.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::sqrt function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event sqrt_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::sqrt(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(sqrt);
+} // namespace impl
+
+void init_sqrt(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto sqrt_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_sqrt", sqrt_pyapi,
+          "Call `sqrt` function from OneMKL VM library to perform "
+          "element-by-element extraction of the square root "
+          "of vector `src`, writing results to vector `dst`",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto sqrt_need_to_call_pyapi = [&](sycl::queue &exec_q, const
arrayT &src, + const arrayT &dst) { + return py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_sqrt_to_call", sqrt_need_to_call_pyapi, + "Check input arguments to answer if `sqrt` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sqrt.hpp b/dpnp/backend/extensions/vm/sqrt.hpp index e3984133628..08d37049580 100644 --- a/dpnp/backend/extensions/vm/sqrt.hpp +++ b/dpnp/backend/extensions/vm/sqrt.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event sqrt_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::SqrtOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::sqrt(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct SqrtContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::SqrtOutputType::value_type, void>) - { - return nullptr; - } - else { - return sqrt_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_sqrt(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sub.cpp b/dpnp/backend/extensions/vm/sub.cpp new file mode 100644 index 00000000000..4ec1bdc36b5 --- /dev/null +++ b/dpnp/backend/extensions/vm/sub.cpp @@ -0,0 +1,171 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "common.hpp"
+#include "sub.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::sub function.
+ *
+ * @tparam T Type of input vectors `a` and `b` and of result vector `y`.
+ */
+template <typename T1, typename T2>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<double>,
+                                        T2,
+                                        std::complex<double>,
+                                        std::complex<double>>,
+        td_ns::BinaryTypeMapResultEntry<T1,
+                                        std::complex<float>,
+                                        T2,
+                                        std::complex<float>,
+                                        std::complex<float>>,
+        td_ns::BinaryTypeMapResultEntry<T1, double, T2, double, double>,
+        td_ns::BinaryTypeMapResultEntry<T1, float, T2, float, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T1, typename T2>
+static sycl::event sub_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   ssize_t a_offset,
+                                   const char *in_b,
+                                   ssize_t b_offset,
+                                   char *out_y,
+                                   ssize_t out_offset,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T1>(exec_q);
+    tu_ns::validate_type_for_device<T2>(exec_q);
+
+    if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) {
+        throw std::runtime_error("Array offsets have to be equal to 0");
+    }
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T1 *a = reinterpret_cast<const T1 *>(in_a);
+    const T2 *b = reinterpret_cast<const T2 *>(in_b);
+
+    using resTy = typename OutputType<T1, T2>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::sub(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing 1st input vector of size n
+                       b, // pointer `b` containing 2nd input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::binary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t;
+using ew_cmn_ns::binary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types][td_ns::num_types];
+static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types]
+                                                         [td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_TABLES(sub);
+} // namespace impl
+
+void init_sub(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_tables();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto
sub_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { + return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrTable{}, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_sub", sub_pyapi, + "Call `sub` function from OneMKL VM library to performs element " + "by element subtraction of vector `src1` by vector `src2` " + "to resulting vector `dst`", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("depends") = py::list()); + + auto sub_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst) { + return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst, + output_typeid_vector, + contig_dispatch_vector); + }; + m.def("_mkl_sub_to_call", sub_need_to_call_pyapi, + "Check input arguments to answer if `sub` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/sub.hpp b/dpnp/backend/extensions/vm/sub.hpp index e1a2464b867..059a78dcbda 100644 --- a/dpnp/backend/extensions/vm/sub.hpp +++ b/dpnp/backend/extensions/vm/sub.hpp @@ -25,58 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event sub_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - const char *in_b, - char *out_y, - const std::vector 
&depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - const T *b = reinterpret_cast(in_b); - using resTy = typename types::SubOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::sub(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing 1st input vector of size n - b, // pointer `b` containing 2nd input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct SubContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::SubOutputType::value_type, void>) - { - return nullptr; - } - else { - return sub_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_sub(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/tan.cpp b/dpnp/backend/extensions/vm/tan.cpp new file mode 100644 index 00000000000..250c3838722 --- /dev/null +++ b/dpnp/backend/extensions/vm/tan.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+// THE POSSIBILITY OF SUCH DAMAGE.
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "common.hpp"
+#include "tan.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::tan function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event tan_contig_impl(sycl::queue &exec_q,
+                                   std::size_t in_n,
+                                   const char *in_a,
+                                   char *out_y,
+                                   const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::tan(exec_q,
+                       n, // number of elements to be calculated
+                       a, // pointer `a` containing input vector of size n
+                       y, // pointer `y` to the output vector of size n
+                       depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(tan);
+} // namespace impl
+
+void init_tan(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto tan_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                         const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_tan", tan_pyapi,
+          "Call `tan` function from OneMKL VM library to compute "
+          "the tangent of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto tan_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                      const arrayT &dst) {
+        return
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_tan_to_call", tan_need_to_call_pyapi, + "Check input arguments to answer if `tan` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/tan.hpp b/dpnp/backend/extensions/vm/tan.hpp index d759ea46fe1..6fcfed9f816 100644 --- a/dpnp/backend/extensions/vm/tan.hpp +++ b/dpnp/backend/extensions/vm/tan.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event tan_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::TanOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::tan(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct TanContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::TanOutputType::value_type, void>) - { - return nullptr; - } - else { - return tan_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_tan(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/tanh.cpp b/dpnp/backend/extensions/vm/tanh.cpp new file mode 100644 index 00000000000..d0e9ecc1669 --- /dev/null +++ b/dpnp/backend/extensions/vm/tanh.cpp @@ -0,0 +1,138 @@ +//***************************************************************************** +// 
Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//*****************************************************************************
+
+#include <oneapi/mkl.hpp>
+#include <sycl/sycl.hpp>
+
+#include "dpctl4pybind11.hpp"
+
+#include "common.hpp"
+#include "tanh.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::tanh function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type = typename std::disjunction<
+        td_ns::TypeMapResultEntry<T, std::complex<double>>,
+        td_ns::TypeMapResultEntry<T, std::complex<float>>,
+        td_ns::TypeMapResultEntry<T, double>,
+        td_ns::TypeMapResultEntry<T, float>,
+        td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event tanh_contig_impl(sycl::queue &exec_q,
+                                    std::size_t in_n,
+                                    const char *in_a,
+                                    char *out_y,
+                                    const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::tanh(exec_q,
+                        n, // number of elements to be calculated
+                        a, // pointer `a` containing input vector of size n
+                        y, // pointer `y` to the output vector of size n
+                        depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(tanh);
+} // namespace impl
+
+void init_tanh(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto tanh_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                          const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_tanh", tanh_pyapi,
+          "Call `tanh` function from OneMKL VM library to compute "
+          "the hyperbolic tangent of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto tanh_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                       const arrayT &dst) {
+        return
py_internal::need_to_call_unary_ufunc( + exec_q, src, dst, output_typeid_vector, contig_dispatch_vector); + }; + m.def("_mkl_tanh_to_call", tanh_need_to_call_pyapi, + "Check input arguments to answer if `tanh` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/tanh.hpp b/dpnp/backend/extensions/vm/tanh.hpp index 98909685ff2..9afbe1eb480 100644 --- a/dpnp/backend/extensions/vm/tanh.hpp +++ b/dpnp/backend/extensions/vm/tanh.hpp @@ -25,55 +25,11 @@ #pragma once -#include +#include -#include "common.hpp" -#include "types_matrix.hpp" +namespace py = pybind11; -namespace dpnp +namespace dpnp::extensions::vm { -namespace backend -{ -namespace ext -{ -namespace vm -{ -template -sycl::event tanh_contig_impl(sycl::queue exec_q, - const std::int64_t n, - const char *in_a, - char *out_y, - const std::vector &depends) -{ - type_utils::validate_type_for_device(exec_q); - - const T *a = reinterpret_cast(in_a); - using resTy = typename types::TanhOutputType::value_type; - resTy *y = reinterpret_cast(out_y); - - return mkl_vm::tanh(exec_q, - n, // number of elements to be calculated - a, // pointer `a` containing input vector of size n - y, // pointer `y` to the output vector of size n - depends); -} - -template -struct TanhContigFactory -{ - fnT get() - { - if constexpr (std::is_same_v< - typename types::TanhOutputType::value_type, void>) - { - return nullptr; - } - else { - return tanh_contig_impl; - } - } -}; -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp +void init_tanh(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/trunc.cpp b/dpnp/backend/extensions/vm/trunc.cpp new file mode 100644 index 00000000000..f47da825719 --- /dev/null +++ b/dpnp/backend/extensions/vm/trunc.cpp @@ -0,0 +1,136 @@ 
+//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//*****************************************************************************
+
+#include
+#include
+
+#include "dpctl4pybind11.hpp"
+
+#include "common.hpp"
+#include "trunc.hpp"
+
+// include a local copy of elementwise common header from dpctl tensor:
+// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp
+// TODO: replace by including dpctl header once available
+#include "../elementwise_functions/elementwise_functions.hpp"
+
+// dpctl tensor headers
+#include "kernels/elementwise_functions/common.hpp"
+#include "utils/type_dispatch.hpp"
+#include "utils/type_utils.hpp"
+
+namespace dpnp::extensions::vm
+{
+namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common;
+namespace py = pybind11;
+namespace py_int = dpnp::extensions::py_internal;
+namespace td_ns = dpctl::tensor::type_dispatch;
+namespace tu_ns = dpctl::tensor::type_utils;
+
+namespace impl
+{
+// OneMKL namespace with VM functions
+namespace mkl_vm = oneapi::mkl::vm;
+
+/**
+ * @brief A factory to define pairs of supported types for which
+ * MKL VM library provides support in oneapi::mkl::vm::trunc function.
+ *
+ * @tparam T Type of input vector `a` and of result vector `y`.
+ */
+template <typename T>
+struct OutputType
+{
+    using value_type =
+        typename std::disjunction<td_ns::TypeMapResultEntry<T, double>,
+                                  td_ns::TypeMapResultEntry<T, float>,
+                                  td_ns::DefaultResultEntry<void>>::result_type;
+};
+
+template <typename T>
+static sycl::event trunc_contig_impl(sycl::queue &exec_q,
+                                     std::size_t in_n,
+                                     const char *in_a,
+                                     char *out_y,
+                                     const std::vector<sycl::event> &depends)
+{
+    tu_ns::validate_type_for_device<T>(exec_q);
+
+    std::int64_t n = static_cast<std::int64_t>(in_n);
+    const T *a = reinterpret_cast<const T *>(in_a);
+
+    using resTy = typename OutputType<T>::value_type;
+    resTy *y = reinterpret_cast<resTy *>(out_y);
+
+    return mkl_vm::trunc(exec_q,
+                         n, // number of elements to be calculated
+                         a, // pointer `a` containing input vector of size n
+                         y, // pointer `y` to the output vector of size n
+                         depends);
+}
+
+using ew_cmn_ns::unary_contig_impl_fn_ptr_t;
+using ew_cmn_ns::unary_strided_impl_fn_ptr_t;
+
+static int output_typeid_vector[td_ns::num_types];
+static unary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types];
+
+MACRO_POPULATE_DISPATCH_VECTORS(trunc);
+} // namespace impl
+
+void init_trunc(py::module_ m)
+{
+    using arrayT = dpctl::tensor::usm_ndarray;
+    using event_vecT = std::vector<sycl::event>;
+
+    impl::populate_dispatch_vectors();
+    using impl::contig_dispatch_vector;
+    using impl::output_typeid_vector;
+
+    auto trunc_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                           const arrayT &dst, const event_vecT &depends = {}) {
+        return py_int::py_unary_ufunc(
+            src, dst, exec_q, depends, output_typeid_vector,
+            contig_dispatch_vector,
+            // no support of strided implementation in OneMKL
+            td_ns::NullPtrVector<impl::unary_strided_impl_fn_ptr_t>{});
+    };
+    m.def("_trunc", trunc_pyapi,
+          "Call `trunc` function from OneMKL VM library to compute "
+          "the truncated value of vector elements",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"),
+          py::arg("depends") = py::list());
+
+    auto trunc_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src,
+                                        const arrayT &dst) {
+        return py_internal::need_to_call_unary_ufunc(
+            exec_q, src, dst, output_typeid_vector, contig_dispatch_vector);
+    };
+    m.def("_mkl_trunc_to_call", trunc_need_to_call_pyapi,
+          "Check input arguments to answer if `trunc` function from "
+          "OneMKL VM library can be used",
+          py::arg("sycl_queue"), py::arg("src"), py::arg("dst"));
+}
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/trunc.hpp b/dpnp/backend/extensions/vm/trunc.hpp
index c06c7cf566f..0b430fd1efc 100644
--- a/dpnp/backend/extensions/vm/trunc.hpp
+++ b/dpnp/backend/extensions/vm/trunc.hpp
@@ -25,55 +25,11 @@
 #pragma once

-#include
+#include

-#include "common.hpp"
-#include "types_matrix.hpp"
+namespace py = pybind11;

-namespace dpnp
+namespace dpnp::extensions::vm
 {
-namespace backend
-{
-namespace ext
-{
-namespace vm
-{
-template <typename T>
-sycl::event trunc_contig_impl(sycl::queue exec_q,
-                              const std::int64_t n,
-                              const char *in_a,
-                              char *out_y,
-                              const std::vector<sycl::event> &depends)
-{
-    type_utils::validate_type_for_device<T>(exec_q);
-
-    const T *a = reinterpret_cast<const T *>(in_a);
-    using resTy = typename types::TruncOutputType<T>::value_type;
-    resTy *y = reinterpret_cast<resTy *>(out_y);
-
-    return mkl_vm::trunc(exec_q,
-                         n, // number of elements to be calculated
-                         a, // pointer `a` containing input vector of size n
-                         y, // pointer `y` to the output vector of size n
-                         depends);
-}
-
-template <typename fnT, typename T>
-struct TruncContigFactory
-{
-    fnT get()
-    {
-        if constexpr (std::is_same_v<
-                          typename types::TruncOutputType<T>::value_type, void>)
-        {
-            return nullptr;
-        }
-        else {
-            return trunc_contig_impl<T>;
-        }
-    }
-};
-} // namespace vm
-} // namespace ext
-} // namespace backend
-} // namespace dpnp
+void init_trunc(py::module_ m);
+} // namespace dpnp::extensions::vm
diff --git a/dpnp/backend/extensions/vm/types_matrix.hpp b/dpnp/backend/extensions/vm/types_matrix.hpp
deleted file mode 100644
index 5b4ccb8fdf6..00000000000
--- a/dpnp/backend/extensions/vm/types_matrix.hpp
+++ /dev/null
@@ -1,659 +0,0 @@
-//*****************************************************************************
-// Copyright (c) 2023-2024, Intel
Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -#pragma once - -#include - -// dpctl tensor headers -#include "utils/type_dispatch.hpp" - -// dpctl namespace for types dispatching -namespace dpctl_td_ns = dpctl::tensor::type_dispatch; - -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace vm -{ -namespace types -{ -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::abs function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct AbsOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, double>, - dpctl_td_ns::TypeMapResultEntry, float>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::acos function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct AcosOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::acosh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct AcoshOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::add function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
- */ -template -struct AddOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::asin function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct AsinOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::asinh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct AsinhOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::atan function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct AtanOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::atan2 function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. - */ -template -struct Atan2OutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::atanh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct AtanhOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::cbrt function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct CbrtOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::ceil function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct CeilOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::conj function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct ConjOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::cos function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct CosOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::cosh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct CoshOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::div function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
- */ -template -struct DivOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::exp function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct ExpOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::exp2 function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct Exp2OutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::expm1 function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct Expm1OutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::floor function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct FloorOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::hypot function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. - */ -template -struct HypotOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::ln function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct LnOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::log10 function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct Log10OutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::log1p function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct Log1pOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::log2 function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct Log2OutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::mul function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. - */ -template -struct MulOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::pow function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. - */ -template -struct PowOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::rint function. 
- * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct RoundOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::sin function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct SinOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::sinh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct SinhOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::sqr function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct SqrOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::sqrt function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct SqrtOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::sub function. - * - * @tparam T Type of input vectors `a` and `b` and of result vector `y`. - */ -template -struct SubOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - T, - std::complex, - std::complex>, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::BinaryTypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::tan function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct TanOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::tanh function. - * - * @tparam T Type of input vector `a` and of result vector `y`. 
- */ -template -struct TanhOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry>, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -/** - * @brief A factory to define pairs of supported types for which - * MKL VM library provides support in oneapi::mkl::vm::trunc function. - * - * @tparam T Type of input vector `a` and of result vector `y`. - */ -template -struct TruncOutputType -{ - using value_type = typename std::disjunction< - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::TypeMapResultEntry, - dpctl_td_ns::DefaultResultEntry>::result_type; -}; - -} // namespace types -} // namespace vm -} // namespace ext -} // namespace backend -} // namespace dpnp diff --git a/dpnp/backend/extensions/vm/vm_py.cpp b/dpnp/backend/extensions/vm/vm_py.cpp index 74d2ae67794..791a8f6d656 100644 --- a/dpnp/backend/extensions/vm/vm_py.cpp +++ b/dpnp/backend/extensions/vm/vm_py.cpp @@ -27,9 +27,6 @@ // //***************************************************************************** -#include -#include - #include "abs.hpp" #include "acos.hpp" #include "acosh.hpp" @@ -41,7 +38,6 @@ #include "atanh.hpp" #include "cbrt.hpp" #include "ceil.hpp" -#include "common.hpp" #include "conj.hpp" #include "cos.hpp" #include "cosh.hpp" @@ -57,7 +53,7 @@ #include "log2.hpp" #include "mul.hpp" #include "pow.hpp" -#include "round.hpp" +#include "rint.hpp" #include "sin.hpp" #include "sinh.hpp" #include "sqr.hpp" @@ -66,1047 +62,44 @@ #include "tan.hpp" #include "tanh.hpp" #include "trunc.hpp" -#include "types_matrix.hpp" - -namespace py = pybind11; -namespace vm_ext = dpnp::backend::ext::vm; -using vm_ext::binary_impl_fn_ptr_t; -using vm_ext::unary_impl_fn_ptr_t; - -static unary_impl_fn_ptr_t abs_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t acos_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t 
acosh_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t add_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t asin_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t asinh_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t atan_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t atan2_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t atanh_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t cbrt_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t ceil_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t conj_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t cos_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t cosh_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t div_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t exp_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t exp2_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t expm1_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t floor_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t hypot_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t ln_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t log10_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t log1p_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t log2_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t mul_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t pow_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t round_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t sin_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t sinh_dispatch_vector[dpctl_td_ns::num_types]; -static 
unary_impl_fn_ptr_t sqr_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t sqrt_dispatch_vector[dpctl_td_ns::num_types]; -static binary_impl_fn_ptr_t sub_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t tan_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t tanh_dispatch_vector[dpctl_td_ns::num_types]; -static unary_impl_fn_ptr_t trunc_dispatch_vector[dpctl_td_ns::num_types]; +namespace vm_ns = dpnp::extensions::vm; PYBIND11_MODULE(_vm_impl, m) { - using arrayT = dpctl::tensor::usm_ndarray; - using event_vecT = std::vector; - - // UnaryUfunc: ==== Abs(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - abs_dispatch_vector); - - auto abs_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - abs_dispatch_vector); - }; - m.def("_abs", abs_pyapi, - "Call `abs` function from OneMKL VM library to compute " - "the absolute value of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto abs_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - abs_dispatch_vector); - }; - m.def("_mkl_abs_to_call", abs_need_to_call_pyapi, - "Check input arguments to answer if `abs` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Acos(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - acos_dispatch_vector); - - auto acos_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - acos_dispatch_vector); - }; - m.def("_acos", acos_pyapi, - "Call `acos` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = 
py::list()); - - auto acos_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - acos_dispatch_vector); - }; - m.def("_mkl_acos_to_call", acos_need_to_call_pyapi, - "Check input arguments to answer if `acos` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Acosh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - acosh_dispatch_vector); - - auto acosh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - acosh_dispatch_vector); - }; - m.def("_acosh", acosh_pyapi, - "Call `acosh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto acosh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - acosh_dispatch_vector); - }; - m.def("_mkl_acosh_to_call", acosh_need_to_call_pyapi, - "Check input arguments to answer if `acosh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Add(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - add_dispatch_vector); - - auto add_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - add_dispatch_vector); - }; - m.def("_add", add_pyapi, - "Call `add` function from OneMKL VM library to performs element " - "by element addition of vector `src1` by vector `src2` " - "to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto add_need_to_call_pyapi = [&](sycl::queue exec_q, 
arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - add_dispatch_vector); - }; - m.def("_mkl_add_to_call", add_need_to_call_pyapi, - "Check input arguments to answer if `add` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Asin(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - asin_dispatch_vector); - - auto asin_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - asin_dispatch_vector); - }; - m.def("_asin", asin_pyapi, - "Call `asin` function from OneMKL VM library to compute " - "inverse sine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto asin_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - asin_dispatch_vector); - }; - m.def("_mkl_asin_to_call", asin_need_to_call_pyapi, - "Check input arguments to answer if `asin` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Asinh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - asinh_dispatch_vector); - - auto asinh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - asinh_dispatch_vector); - }; - m.def("_asinh", asinh_pyapi, - "Call `asinh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto asinh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - asinh_dispatch_vector); - }; - 
m.def("_mkl_asinh_to_call", asinh_need_to_call_pyapi, - "Check input arguments to answer if `asinh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Atan(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - atan_dispatch_vector); - - auto atan_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - atan_dispatch_vector); - }; - m.def("_atan", atan_pyapi, - "Call `atan` function from OneMKL VM library to compute " - "inverse tangent of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto atan_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - atan_dispatch_vector); - }; - m.def("_mkl_atan_to_call", atan_need_to_call_pyapi, - "Check input arguments to answer if `atan` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Atan2(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - atan2_dispatch_vector); - - auto atan2_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - atan2_dispatch_vector); - }; - m.def("_atan2", atan2_pyapi, - "Call `atan2` function from OneMKL VM library to compute element " - "by element inverse tangent of `x1/x2`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto atan2_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - atan2_dispatch_vector); - }; - m.def("_mkl_atan2_to_call", atan2_need_to_call_pyapi, - "Check input arguments to answer if 
`atan2` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Atanh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - atanh_dispatch_vector); - - auto atanh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - atanh_dispatch_vector); - }; - m.def("_atanh", atanh_pyapi, - "Call `atanh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto atanh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - atanh_dispatch_vector); - }; - m.def("_mkl_atanh_to_call", atanh_need_to_call_pyapi, - "Check input arguments to answer if `atanh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Cbrt(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - cbrt_dispatch_vector); - - auto cbrt_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - cbrt_dispatch_vector); - }; - m.def("_cbrt", cbrt_pyapi, - "Call `cbrt` function from OneMKL VM library to compute " - "the element-wise cube root of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto cbrt_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - cbrt_dispatch_vector); - }; - m.def("_mkl_cbrt_to_call", cbrt_need_to_call_pyapi, - "Check input arguments to answer if `cbrt` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: 
==== Ceil(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - ceil_dispatch_vector); - - auto ceil_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - ceil_dispatch_vector); - }; - m.def("_ceil", ceil_pyapi, - "Call `ceil` function from OneMKL VM library to compute " - "ceiling of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto ceil_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - ceil_dispatch_vector); - }; - m.def("_mkl_ceil_to_call", ceil_need_to_call_pyapi, - "Check input arguments to answer if `ceil` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Conj(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - conj_dispatch_vector); - - auto conj_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - conj_dispatch_vector); - }; - m.def("_conj", conj_pyapi, - "Call `conj` function from OneMKL VM library to compute " - "conjugate of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto conj_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - conj_dispatch_vector); - }; - m.def("_mkl_conj_to_call", conj_need_to_call_pyapi, - "Check input arguments to answer if `conj` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Cos(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - cos_dispatch_vector); - - auto cos_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return 
vm_ext::unary_ufunc(exec_q, src, dst, depends, - cos_dispatch_vector); - }; - m.def("_cos", cos_pyapi, - "Call `cos` function from OneMKL VM library to compute " - "cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto cos_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - cos_dispatch_vector); - }; - m.def("_mkl_cos_to_call", cos_need_to_call_pyapi, - "Check input arguments to answer if `cos` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Cosh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - cosh_dispatch_vector); - - auto cosh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - cosh_dispatch_vector); - }; - m.def("_cosh", cosh_pyapi, - "Call `cosh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto cosh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - cosh_dispatch_vector); - }; - m.def("_mkl_cosh_to_call", cosh_need_to_call_pyapi, - "Check input arguments to answer if `cosh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Div(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - div_dispatch_vector); - - auto div_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - div_dispatch_vector); - }; - m.def("_div", div_pyapi, - "Call `div` function from OneMKL VM library to performs element " - "by 
element division of vector `src1` by vector `src2` " - "to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto div_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - div_dispatch_vector); - }; - m.def("_mkl_div_to_call", div_need_to_call_pyapi, - "Check input arguments to answer if `div` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Exp(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - exp_dispatch_vector); - - auto exp_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - exp_dispatch_vector); - }; - m.def("_exp", exp_pyapi, - "Call `exp` function from OneMKL VM library to compute " - "natural (base-e) exponential of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto exp_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - exp_dispatch_vector); - }; - m.def("_mkl_exp_to_call", exp_need_to_call_pyapi, - "Check input arguments to answer if `exp` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== exp2(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - exp2_dispatch_vector); - - auto exp2_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - exp2_dispatch_vector); - }; - m.def("_exp2", exp2_pyapi, - "Call `exp2` function from OneMKL VM library to compute " - "the element-wise base-2 exponential of vector elements", - py::arg("sycl_queue"), 
py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto exp2_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - exp2_dispatch_vector); - }; - m.def("_mkl_exp2_to_call", exp2_need_to_call_pyapi, - "Check input arguments to answer if `exp2` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== expm1(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - expm1_dispatch_vector); - - auto expm1_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - expm1_dispatch_vector); - }; - m.def("_expm1", expm1_pyapi, - "Call `expm1` function from OneMKL VM library to compute " - "subtraction of 1 from the exponential of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto expm1_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - expm1_dispatch_vector); - }; - m.def("_mkl_expm1_to_call", expm1_need_to_call_pyapi, - "Check input arguments to answer if `expm1` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Floor(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - floor_dispatch_vector); - - auto floor_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - floor_dispatch_vector); - }; - m.def("_floor", floor_pyapi, - "Call `floor` function from OneMKL VM library to compute " - "floor of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto floor_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) 
{ - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - floor_dispatch_vector); - }; - m.def("_mkl_floor_to_call", floor_need_to_call_pyapi, - "Check input arguments to answer if `floor` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Hypot(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - hypot_dispatch_vector); - - auto hypot_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - hypot_dispatch_vector); - }; - m.def("_hypot", hypot_pyapi, - "Call `hypot` function from OneMKL VM library to compute element " - "by element hypotenuse of `x`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto hypot_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - hypot_dispatch_vector); - }; - m.def("_mkl_hypot_to_call", hypot_need_to_call_pyapi, - "Check input arguments to answer if `hypot` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Ln(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - ln_dispatch_vector); - - auto ln_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - ln_dispatch_vector); - }; - m.def("_ln", ln_pyapi, - "Call `ln` function from OneMKL VM library to compute " - "natural logarithm of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto ln_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - ln_dispatch_vector); - }; - 
m.def("_mkl_ln_to_call", ln_need_to_call_pyapi, - "Check input arguments to answer if `ln` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Log10(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - log10_dispatch_vector); - - auto log10_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - log10_dispatch_vector); - }; - m.def("_log10", log10_pyapi, - "Call `log10` function from OneMKL VM library to compute " - "base-10 logarithm of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto log10_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - log10_dispatch_vector); - }; - m.def("_mkl_log10_to_call", log10_need_to_call_pyapi, - "Check input arguments to answer if `log10` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Log1p(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - log1p_dispatch_vector); - - auto log1p_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - log1p_dispatch_vector); - }; - m.def("_log1p", log1p_pyapi, - "Call `log1p` function from OneMKL VM library to compute " - "natural logarithm of 1 plus vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto log1p_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - log1p_dispatch_vector); - }; - m.def("_mkl_log1p_to_call", log1p_need_to_call_pyapi, - "Check input arguments to answer if `log1p` function from " - "OneMKL VM library can be used", - 
py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Log2(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - log2_dispatch_vector); - - auto log2_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - log2_dispatch_vector); - }; - m.def("_log2", log2_pyapi, - "Call `log2` function from OneMKL VM library to compute " - "base-2 logarithm of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto log2_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - log2_dispatch_vector); - }; - m.def("_mkl_log2_to_call", log2_need_to_call_pyapi, - "Check input arguments to answer if `log2` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Mul(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - mul_dispatch_vector); - - auto mul_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - mul_dispatch_vector); - }; - m.def("_mul", mul_pyapi, - "Call `mul` function from OneMKL VM library to performs element " - "by element multiplication of vector `src1` by vector `src2` " - "to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto mul_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - mul_dispatch_vector); - }; - m.def("_mkl_mul_to_call", mul_need_to_call_pyapi, - "Check input arguments to answer if `mul` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - 
py::arg("dst")); - } - - // BinaryUfunc: ==== Pow(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - pow_dispatch_vector); - - auto pow_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, src1, src2, dst, depends, - pow_dispatch_vector); - }; - m.def("_pow", pow_pyapi, - "Call `pow` function from OneMKL VM library to performs element " - "by element exponentiation of vector `src1` raised to the power " - "of vector `src2` to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto pow_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - pow_dispatch_vector); - }; - m.def("_mkl_pow_to_call", pow_need_to_call_pyapi, - "Check input arguments to answer if `pow` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Round(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - round_dispatch_vector); - - auto round_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - round_dispatch_vector); - }; - m.def("_round", round_pyapi, - "Call `rint` function from OneMKL VM library to compute " - "the rounded value of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto round_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - round_dispatch_vector); - }; - m.def("_mkl_round_to_call", round_need_to_call_pyapi, - "Check input arguments to answer if `rint` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), 
py::arg("dst")); - } - - // UnaryUfunc: ==== Sin(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - sin_dispatch_vector); - - auto sin_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - sin_dispatch_vector); - }; - m.def("_sin", sin_pyapi, - "Call `sin` function from OneMKL VM library to compute " - "sine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto sin_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - sin_dispatch_vector); - }; - m.def("_mkl_sin_to_call", sin_need_to_call_pyapi, - "Check input arguments to answer if `sin` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Sinh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - sinh_dispatch_vector); - - auto sinh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - sinh_dispatch_vector); - }; - m.def("_sinh", sinh_pyapi, - "Call `sinh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto sinh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - sinh_dispatch_vector); - }; - m.def("_mkl_sinh_to_call", sinh_need_to_call_pyapi, - "Check input arguments to answer if `sinh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Sqr(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - sqr_dispatch_vector); - - auto sqr_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const 
event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - sqr_dispatch_vector); - }; - m.def( - "_sqr", sqr_pyapi, - "Call `sqr` from OneMKL VM library to performs element by element " - "operation of squaring of vector `src` to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto sqr_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - sqr_dispatch_vector); - }; - m.def("_mkl_sqr_to_call", sqr_need_to_call_pyapi, - "Check input arguments to answer if `sqr` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Sqrt(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - sqrt_dispatch_vector); - - auto sqrt_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - sqrt_dispatch_vector); - }; - m.def( - "_sqrt", sqrt_pyapi, - "Call `sqrt` from OneMKL VM library to performs element by element " - "operation of extracting the square root " - "of vector `src` to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto sqrt_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - sqrt_dispatch_vector); - }; - m.def("_mkl_sqrt_to_call", sqrt_need_to_call_pyapi, - "Check input arguments to answer if `sqrt` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // BinaryUfunc: ==== Sub(x1, x2) ==== - { - vm_ext::init_ufunc_dispatch_vector( - sub_dispatch_vector); - - auto sub_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return vm_ext::binary_ufunc(exec_q, 
src1, src2, dst, depends, - sub_dispatch_vector); - }; - m.def("_sub", sub_pyapi, - "Call `sub` function from OneMKL VM library to performs element " - "by element subtraction of vector `src1` by vector `src2` " - "to resulting vector `dst`", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst"), py::arg("depends") = py::list()); - - auto sub_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src1, - arrayT src2, arrayT dst) { - return vm_ext::need_to_call_binary_ufunc(exec_q, src1, src2, dst, - sub_dispatch_vector); - }; - m.def("_mkl_sub_to_call", sub_need_to_call_pyapi, - "Check input arguments to answer if `sub` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), - py::arg("dst")); - } - - // UnaryUfunc: ==== Tan(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - tan_dispatch_vector); - - auto tan_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - tan_dispatch_vector); - }; - m.def("_tan", tan_pyapi, - "Call `tan` function from OneMKL VM library to compute " - "tangent of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto tan_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - tan_dispatch_vector); - }; - m.def("_mkl_tan_to_call", tan_need_to_call_pyapi, - "Check input arguments to answer if `tan` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Tanh(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - tanh_dispatch_vector); - - auto tanh_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - tanh_dispatch_vector); - }; - m.def("_tanh", tanh_pyapi, - "Call 
`tanh` function from OneMKL VM library to compute " - "inverse cosine of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto tanh_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - tanh_dispatch_vector); - }; - m.def("_mkl_tanh_to_call", tanh_need_to_call_pyapi, - "Check input arguments to answer if `tanh` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } - - // UnaryUfunc: ==== Trunc(x) ==== - { - vm_ext::init_ufunc_dispatch_vector( - trunc_dispatch_vector); - - auto trunc_pyapi = [&](sycl::queue exec_q, arrayT src, arrayT dst, - const event_vecT &depends = {}) { - return vm_ext::unary_ufunc(exec_q, src, dst, depends, - trunc_dispatch_vector); - }; - m.def("_trunc", trunc_pyapi, - "Call `trunc` function from OneMKL VM library to compute " - "the truncated value of vector elements", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst"), - py::arg("depends") = py::list()); - - auto trunc_need_to_call_pyapi = [&](sycl::queue exec_q, arrayT src, - arrayT dst) { - return vm_ext::need_to_call_unary_ufunc(exec_q, src, dst, - trunc_dispatch_vector); - }; - m.def("_mkl_trunc_to_call", trunc_need_to_call_pyapi, - "Check input arguments to answer if `trunc` function from " - "OneMKL VM library can be used", - py::arg("sycl_queue"), py::arg("src"), py::arg("dst")); - } + vm_ns::init_abs(m); + vm_ns::init_acos(m); + vm_ns::init_acosh(m); + vm_ns::init_add(m); + vm_ns::init_asin(m); + vm_ns::init_asinh(m); + vm_ns::init_atan(m); + vm_ns::init_atan2(m); + vm_ns::init_atanh(m); + vm_ns::init_cbrt(m); + vm_ns::init_ceil(m); + vm_ns::init_conj(m); + vm_ns::init_cos(m); + vm_ns::init_cosh(m); + vm_ns::init_div(m); + vm_ns::init_exp(m); + vm_ns::init_exp2(m); + vm_ns::init_expm1(m); + vm_ns::init_floor(m); + vm_ns::init_hypot(m); + vm_ns::init_ln(m); + 
vm_ns::init_log10(m); + vm_ns::init_log1p(m); + vm_ns::init_log2(m); + vm_ns::init_mul(m); + vm_ns::init_pow(m); + vm_ns::init_rint(m); + vm_ns::init_sin(m); + vm_ns::init_sinh(m); + vm_ns::init_sqr(m); + vm_ns::init_sqrt(m); + vm_ns::init_sub(m); + vm_ns::init_tan(m); + vm_ns::init_tanh(m); + vm_ns::init_trunc(m); } From 6a737c49919baf52f2186182019410b17a7c5080 Mon Sep 17 00:00:00 2001 From: vlad-perevezentsev Date: Fri, 14 Jun 2024 18:35:37 +0200 Subject: [PATCH 20/49] Handling warnings in pytest (#1845) * Enable loading of warning plugin in pytest * Fix DeprecationWarning in test_histogram.py * Ignore DeprecationWarning for pkg_resources * Fix SyntaxWarning in test_ndarray_math.py * Deprecate numpy_cupy_array_list_equal * Fix DeprecationWarning in test_mathematical.py * Avoid FutureWarning for rcond parameter of numpy.linalg.lstsq * Fix DeprecationWarning in cupy test_elementwise.py * Skip test_msort_zero_dim - not implemented * Ignore RuntimeWarning for numpy.arccosh * Fix DeprecationWarning for numpy.fromstring * Add test_digitize_inf to TestDigitize * Fix DeprecationWarning: Converting np.integer to a dtype is deprecated * Fix ComplexWarning in 2 ways * Fix RuntimeWarning by reducing shape in TestNansumNanprodLong * Handle DeprecationWarning in test_dparray.py * Skip test_lexsort_one_dim/_two_dim - not implemented * Handle RuntimeWarning in test_linspace_float_underflow * Fix DeprecationWarning in test_round_halfway_uint * Update test_linspace to avoid DeprecationWarning * Use pytest.mark.usefixtures('suppress_complex_warning') * Handle RuntimeWarning: divide by zero in test_reciprocal * Fix DeprecationWarning in test_mathematical.py * Handle RuntimeWarning in test_from_dlpack * Add the fixture to test_from_dlpack_with_dpt instead of test_from_dlpack --- setup.cfg | 12 ++++++++--- tests/test_arraycreation.py | 6 +++--- tests/test_arraymanipulation.py | 1 + tests/test_dparray.py | 6 ++++++ tests/test_histogram.py | 20 +++++++++++++------ tests/test_linalg.py | 
12 ++++++++--- tests/test_mathematical.py | 2 +- tests/test_random_state.py | 17 ++++++---------- tests/test_sycl_queue.py | 7 +++++-- tests/test_umath.py | 1 + tests/test_usm_type.py | 2 +- .../cupy/binary_tests/test_elementwise.py | 6 +++--- .../cupy/core_tests/test_ndarray_math.py | 8 ++++---- .../cupy/creation_tests/test_ranges.py | 6 +++--- .../cupy/logic_tests/test_comparison.py | 8 ++++---- .../cupy/math_tests/test_sumprod.py | 14 ++++++++++++- 16 files changed, 83 insertions(+), 45 deletions(-) diff --git a/setup.cfg b/setup.cfg index 387884acef0..60c8b1f6372 100644 --- a/setup.cfg +++ b/setup.cfg @@ -6,14 +6,20 @@ ignore = E201 # By default, tests marked as slow will be deselected. # To run all tests, use -m "slow and not slow". # To run only slow tests, use -m "slow". -addopts = -m "not slow" -p no:warnings --tb=short --strict-markers +addopts = -m "not slow" --tb=short --strict-markers norecursedirs = tests_perf testpaths = tests markers = slow: marks tests as slow (deselect with '-m "not slow"') multi_gpu: marks tests that require a specified number of GPUs - # Added due to -p no:warnings to avoid errors with --strict-markers - filterwarnings: mark to filter warnings during tests +filterwarnings = + # pkg_resources + ignore:pkg_resources is deprecated as an API:DeprecationWarning + # NumPy arccosh + # Undefined behavior depends on the backend: + # NumPy with OpenBLAS for np.array[1.0] does not raise a warning + # while numpy with OneMKL raises RuntimeWarning + ignore:invalid value encountered in arccosh:RuntimeWarning [versioneer] VCS = git diff --git a/tests/test_arraycreation.py b/tests/test_arraycreation.py index 1f98aa2de6f..ca91ec6f699 100644 --- a/tests/test_arraycreation.py +++ b/tests/test_arraycreation.py @@ -516,6 +516,7 @@ def test_vander_seq(sequence): assert_allclose(vander_func(numpy, sequence), vander_func(dpnp, sequence)) +@pytest.mark.usefixtures("suppress_complex_warning") @pytest.mark.parametrize( "shape", [(), 0, (0,), (2, 0, 3), (3, 
2)], @@ -531,6 +532,7 @@ def test_full(shape, fill_value, dtype, order): assert_array_equal(func(numpy), func(dpnp)) +@pytest.mark.usefixtures("suppress_complex_warning") @pytest.mark.parametrize( "array", [[], 0, [1, 2, 3], [[1, 2], [3, 4]]], @@ -709,9 +711,7 @@ def test_linspace(start, stop, num, dtype, retstep): if numpy.issubdtype(dtype, dpnp.integer): assert_allclose(res_np, res_dp, rtol=1) else: - if dtype is None and not has_support_aspect64(): - dtype = dpnp.float32 - assert_allclose(res_np, res_dp, rtol=1e-06, atol=dpnp.finfo(dtype).eps) + assert_dtype_allclose(res_dp, res_np) @pytest.mark.parametrize( diff --git a/tests/test_arraymanipulation.py b/tests/test_arraymanipulation.py index 69116ef8692..12f14bf4109 100644 --- a/tests/test_arraymanipulation.py +++ b/tests/test_arraymanipulation.py @@ -846,6 +846,7 @@ def test_asfarray(dtype, data): assert_array_equal(result, expected) +@pytest.mark.usefixtures("suppress_complex_warning") @pytest.mark.parametrize("dtype", get_all_dtypes()) @pytest.mark.parametrize("data", [[1.0, 2.0, 3.0]], ids=["[1., 2., 3.]"]) @pytest.mark.parametrize("data_dtype", get_all_dtypes(no_none=True)) diff --git a/tests/test_dparray.py b/tests/test_dparray.py index 874493f4e95..ac9757c580a 100644 --- a/tests/test_dparray.py +++ b/tests/test_dparray.py @@ -220,6 +220,9 @@ def test_print_dpnp_zero_shape(): assert result == expected +# Numpy will raise an error when converting a.ndim > 0 to a scalar +# TODO: Discuss dpnp behavior according to these future changes +@pytest.mark.filterwarnings("ignore::DeprecationWarning") @pytest.mark.parametrize("func", [bool, float, int, complex]) @pytest.mark.parametrize("shape", [tuple(), (1,), (1, 1), (1, 1, 1)]) @pytest.mark.parametrize( @@ -231,6 +234,9 @@ def test_scalar_type_casting(func, shape, dtype): assert func(numpy_array) == func(dpnp_array) +# Numpy will raise an error when converting a.ndim > 0 to a scalar +# TODO: Discuss dpnp behavior according to these future changes 
+@pytest.mark.filterwarnings("ignore::DeprecationWarning") @pytest.mark.parametrize( "method", ["__bool__", "__float__", "__int__", "__complex__"] ) diff --git a/tests/test_histogram.py b/tests/test_histogram.py index 7601d67c54a..da58a4ac2f8 100644 --- a/tests/test_histogram.py +++ b/tests/test_histogram.py @@ -38,11 +38,6 @@ class TestDigitize: numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9]), numpy.array([1, 4, 6, 7]), ), - # Infinity values - ( - numpy.array([-numpy.inf, -1, 0, 1, numpy.inf]), - numpy.array([-2, -1, 0, 1, 2]), - ), # Repeated elements (numpy.array([1, 2, 2, 3, 3, 3, 4, 5]), numpy.array([1, 2, 3, 4])), ], @@ -57,6 +52,18 @@ def test_digitize(self, x, bins, dtype, right): expected = numpy.digitize(x, bins, right=right) assert_dtype_allclose(result, expected) + @pytest.mark.parametrize("dtype", get_float_dtypes()) + @pytest.mark.parametrize("right", [True, False]) + def test_digitize_inf(self, dtype, right): + x = numpy.array([-numpy.inf, -1, 0, 1, numpy.inf], dtype=dtype) + bins = numpy.array([-2, -1, 0, 1, 2], dtype=dtype) + x_dp = dpnp.array(x) + bins_dp = dpnp.array(bins) + + result = dpnp.digitize(x_dp, bins_dp, right=right) + expected = numpy.digitize(x, bins, right=right) + assert_dtype_allclose(result, expected) + @pytest.mark.parametrize( "dtype_x", get_all_dtypes(no_bool=True, no_complex=True) ) @@ -386,7 +393,8 @@ def test_infinite_edge(self, xp, inf_val): # both first and last ranges must be finite with assert_raises_regex( - ValueError, f"autodetected range of \[{min}, {max}\] is not finite" + ValueError, + f"autodetected range of \\[{min}, {max}\\] is not finite", ): xp.histogram(v) diff --git a/tests/test_linalg.py b/tests/test_linalg.py index a45b2826b3a..ec2a085d4d3 100644 --- a/tests/test_linalg.py +++ b/tests/test_linalg.py @@ -780,7 +780,9 @@ def test_lstsq(self, a_shape, b_shape, dtype): b_dp = inp.array(b_np) result = inp.linalg.lstsq(a_dp, b_dp) - expected = numpy.linalg.lstsq(a_np, b_np) + # if rcond is not set, FutureWarning is 
given. + # By default Numpy uses None for calculations + expected = numpy.linalg.lstsq(a_np, b_np, rcond=None) for param_dp, param_np in zip(result, expected): assert_dtype_allclose(param_dp, param_np) @@ -794,7 +796,9 @@ def test_lstsq_diff_type(self, a_dtype, b_dtype): a_dp = inp.array(a_np) b_dp = inp.array(b_np) - expected = numpy.linalg.lstsq(a_np, b_np) + # if rcond is not set, FutureWarning is given. + # By default Numpy uses None for calculations + expected = numpy.linalg.lstsq(a_np, b_np, rcond=None) result = inp.linalg.lstsq(a_dp, b_dp) for param_dp, param_np in zip(result, expected): @@ -813,7 +817,9 @@ def test_lstsq_empty(self, m, n, nrhs, dtype): b_dp = inp.array(b_np) result = inp.linalg.lstsq(a_dp, b_dp) - expected = numpy.linalg.lstsq(a_np, b_np) + # if rcond is not set, FutureWarning is given. + # By default Numpy uses None for calculations + expected = numpy.linalg.lstsq(a_np, b_np, rcond=None) for param_dp, param_np in zip(result, expected): assert_dtype_allclose(param_dp, param_np) diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index 4a86cdc081e..4a1ee63c8fc 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -91,7 +91,7 @@ def test_mode(self): d = dpnp.ones(100) k = dpnp.ones(3) default_mode = dpnp.convolve(d, k, mode="full") - full_mode = dpnp.convolve(d, k, mode="f") + full_mode = dpnp.convolve(d, k, mode="full") assert_array_equal(full_mode, default_mode) # integer mode with assert_raises(ValueError): diff --git a/tests/test_random_state.py b/tests/test_random_state.py index 70940501d2e..ed56dbdf730 100644 --- a/tests/test_random_state.py +++ b/tests/test_random_state.py @@ -239,7 +239,6 @@ def test_fallback(self, loc, scale): [ dpnp.float16, float, - dpnp.integer, dpnp.int64, dpnp.int32, dpnp.int, @@ -253,7 +252,6 @@ def test_fallback(self, loc, scale): ids=[ "dpnp.float16", "float", - "dpnp.integer", "dpnp.int64", "dpnp.int32", "dpnp.int", @@ -366,8 +364,8 @@ def test_wrong_dims(self): class 
TestRandInt: @pytest.mark.parametrize( "dtype", - [int, dpnp.int32, dpnp.int, dpnp.integer], - ids=["int", "dpnp.int32", "dpnp.int", "dpnp.integer"], + [int, dpnp.int32, dpnp.int], + ids=["int", "dpnp.int32", "dpnp.int"], ) @pytest.mark.parametrize( "usm_type", @@ -379,7 +377,7 @@ def test_distr(self, dtype, usm_type): low = 1 high = 10 - if dtype in (dpnp.int, dpnp.integer) and dtype != dpnp.dtype("int32"): + if dtype == dpnp.int and dtype != dpnp.dtype("int32"): pytest.skip( "dtype isn't alias on dpnp.int32 on the target OS, so there will be a fallback" ) @@ -566,11 +564,10 @@ def test_bounds_fallback(self, low, high): @pytest.mark.usefixtures("allow_fall_back_on_numpy") @pytest.mark.parametrize( "dtype", - [dpnp.int64, dpnp.int, dpnp.integer, dpnp.bool, dpnp.bool_, bool], + [dpnp.int64, dpnp.int, dpnp.bool, dpnp.bool_, bool], ids=[ "dpnp.int64", "dpnp.int", - "dpnp.integer", "dpnp.bool", "dpnp.bool_", "bool", @@ -582,7 +579,7 @@ def test_dtype_fallback(self, dtype): high = 37 if not dtype in {dpnp.bool_, bool} else 2 size = (3, 2, 5) - if dtype in (dpnp.int, dpnp.integer) and dtype == dpnp.dtype("int32"): + if dtype == dpnp.int and dtype == dpnp.dtype("int32"): pytest.skip( "dtype is alias on dpnp.int32 on the target OS, so no fallback here" ) @@ -1157,7 +1154,6 @@ def test_fallback(self, low, high): [ dpnp.float16, float, - dpnp.integer, dpnp.int64, dpnp.int, int, @@ -1170,7 +1166,6 @@ def test_fallback(self, low, high): ids=[ "dpnp.float16", "float", - "dpnp.integer", "dpnp.int64", "dpnp.int", "int", @@ -1182,7 +1177,7 @@ def test_fallback(self, low, high): ], ) def test_invalid_dtype(self, dtype): - if dtype in (dpnp.int, dpnp.integer) and dtype == dpnp.dtype("int32"): + if dtype == dpnp.int and dtype == dpnp.dtype("int32"): pytest.skip( "dtype is alias on dpnp.int32 on the target OS, so no error here" ) diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 8332f26949b..3073b8806e5 100644 --- a/tests/test_sycl_queue.py +++ 
b/tests/test_sycl_queue.py @@ -103,7 +103,7 @@ def vvsort(val, vec, size, xp): {"dtype": dpnp.int32}, ), pytest.param("fromiter", [[1, 2, 3, 4]], {"dtype": dpnp.int64}), - pytest.param("fromstring", ["1, 2"], {"dtype": int, "sep": " "}), + pytest.param("fromstring", ["1 2"], {"dtype": int, "sep": " "}), pytest.param("full", [(2, 2)], {"fill_value": 5}), pytest.param("eye", [4, 2], {}), pytest.param("geomspace", [1, 4, 8], {}), @@ -1686,6 +1686,7 @@ def test_from_dlpack(arr_dtype, shape, device): assert V.strides == W.strides +@pytest.mark.usefixtures("suppress_invalid_numpy_warnings") @pytest.mark.parametrize( "device", valid_devices, @@ -2112,7 +2113,9 @@ def test_lstsq(m, n, nrhs, device): b_dp = dpnp.array(b_np, device=device) result_dp = dpnp.linalg.lstsq(a_dp, b_dp) - result = numpy.linalg.lstsq(a_np, b_np) + # if rcond is not set, FutureWarning is given. + # By default Numpy uses None for calculations + result = numpy.linalg.lstsq(a_np, b_np, rcond=None) for param_dp, param_np in zip(result_dp, result): assert_dtype_allclose(param_dp, param_np) diff --git a/tests/test_umath.py b/tests/test_umath.py index 5a61079335f..c302cbba3a0 100644 --- a/tests/test_umath.py +++ b/tests/test_umath.py @@ -398,6 +398,7 @@ def test_invalid_out(self, out): class TestReciprocal: + @pytest.mark.usefixtures("suppress_divide_numpy_warnings") @pytest.mark.parametrize("dtype", get_float_complex_dtypes()) def test_reciprocal(self, dtype): np_array, expected = _get_numpy_arrays_1in_1out( diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index f66017ea6e2..77839f9b933 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -199,7 +199,7 @@ def test_array_creation_from_2d_array(func, args, usm_type_x, usm_type_y): "fromfunction", [(lambda i, j: i + j), (3, 3)], {"dtype": dp.int32} ), pytest.param("fromiter", [[1, 2, 3, 4]], {"dtype": dp.int64}), - pytest.param("fromstring", ["1, 2"], {"dtype": int, "sep": " "}), + pytest.param("fromstring", ["1 2"], {"dtype": int, 
"sep": " "}), pytest.param("full", [(2, 2)], {"fill_value": 5}), pytest.param("eye", [4, 2], {}), pytest.param("geomspace", [1, 4, 8], {}), diff --git a/tests/third_party/cupy/binary_tests/test_elementwise.py b/tests/third_party/cupy/binary_tests/test_elementwise.py index a2698366256..e756c454f15 100644 --- a/tests/third_party/cupy/binary_tests/test_elementwise.py +++ b/tests/third_party/cupy/binary_tests/test_elementwise.py @@ -7,14 +7,14 @@ class TestElementwise(unittest.TestCase): @testing.for_int_dtypes() @testing.numpy_cupy_array_equal() def check_unary_int(self, name, xp, dtype): - a = xp.array([-3, -2, -1, 0, 1, 2, 3], dtype=dtype) + a = xp.array([-3, -2, -1, 0, 1, 2, 3]).astype(dtype) return getattr(xp, name)(a) @testing.for_int_dtypes() @testing.numpy_cupy_array_equal() def check_binary_int(self, name, xp, dtype): - a = xp.array([-3, -2, -1, 0, 1, 2, 3], dtype=dtype) - b = xp.array([0, 1, 2, 3, 4, 5, 6], dtype=dtype) + a = xp.array([-3, -2, -1, 0, 1, 2, 3]).astype(dtype) + b = xp.array([0, 1, 2, 3, 4, 5, 6]).astype(dtype) return getattr(xp, name)(a, b) def test_bitwise_and(self): diff --git a/tests/third_party/cupy/core_tests/test_ndarray_math.py b/tests/third_party/cupy/core_tests/test_ndarray_math.py index 3233687789a..81caf2c8ceb 100644 --- a/tests/third_party/cupy/core_tests/test_ndarray_math.py +++ b/tests/third_party/cupy/core_tests/test_ndarray_math.py @@ -57,7 +57,7 @@ class TestRoundHalfway(unittest.TestCase): @testing.for_float_dtypes() @testing.numpy_cupy_allclose(atol=1e-5) def test_round_halfway_float(self, xp, dtype): - if self.decimals is -3 and dtype == numpy.float32: + if self.decimals == -3 and dtype == numpy.float32: pytest.skip( "Case with decimals=-3 and dtype float32 has divide error less than 1e-5" ) @@ -78,7 +78,7 @@ def test_round_halfway_float(self, xp, dtype): @testing.numpy_cupy_array_equal() def test_round_halfway_int(self, xp, dtype): # generate [..., -1.5, -0.5, 0.5, 1.5, ...] 
* 10^{-decimals} - if self.decimals is -3 and not has_support_aspect64(): + if self.decimals == -3 and not has_support_aspect64(): pytest.skip( "Case with decimals=-3 and dtype float32 has divide error less than 1e-5" ) @@ -96,7 +96,7 @@ def test_round_halfway_int(self, xp, dtype): @testing.numpy_cupy_array_equal() def test_round_halfway_uint(self, xp, dtype): # generate [0.5, 1.5, ...] * 10^{-decimals} - if self.decimals is -3 and not has_support_aspect64(): + if self.decimals == -3 and not has_support_aspect64(): pytest.skip( "Case with decimals=-3 and dtype float32 has divide error less than 1e-5" ) @@ -105,7 +105,7 @@ def test_round_halfway_uint(self, xp, dtype): a -= 1 scale = 10 ** abs(self.decimals) if self.decimals < 0: - a *= xp.array(scale, dtype=dtype) + a *= xp.array(scale).astype(dtype) a >>= 1 return a.round(self.decimals) diff --git a/tests/third_party/cupy/creation_tests/test_ranges.py b/tests/third_party/cupy/creation_tests/test_ranges.py index 2094f2ffc8e..92c81061b7a 100644 --- a/tests/third_party/cupy/creation_tests/test_ranges.py +++ b/tests/third_party/cupy/creation_tests/test_ranges.py @@ -176,9 +176,9 @@ def test_linspace_float_overflow(self, xp): def test_linspace_float_underflow(self, xp): # find minimum subnormal number dtype = cupy.default_float_type() - x = xp.finfo(dtype).min - while x / 2 > 0: - x /= 2 + # use .tiny instead of .min and while to get + # minimum subnormal number directly and avoid RuntimeWarning + x = xp.finfo(dtype).tiny return xp.linspace(0.0, x, 10, dtype=dtype) @testing.with_requires("numpy>=1.16") diff --git a/tests/third_party/cupy/logic_tests/test_comparison.py b/tests/third_party/cupy/logic_tests/test_comparison.py index b7dba2a219b..eed4c7f9b36 100644 --- a/tests/third_party/cupy/logic_tests/test_comparison.py +++ b/tests/third_party/cupy/logic_tests/test_comparison.py @@ -46,28 +46,28 @@ class TestComparisonOperator(unittest.TestCase): ] @testing.for_all_dtypes(no_complex=True) - 
@testing.numpy_cupy_array_list_equal() + @testing.numpy_cupy_array_equal() def test_binary_npscalar_array(self, xp, dtype): a = numpy.int16(3) b = testing.shaped_arange((2, 3), xp, dtype) return [op(a, b) for op in self.operators] @testing.for_all_dtypes(no_complex=True) - @testing.numpy_cupy_array_list_equal() + @testing.numpy_cupy_array_equal() def test_binary_pyscalar_array(self, xp, dtype): a = 3.0 b = testing.shaped_arange((2, 3), xp, dtype) return [op(a, b) for op in self.operators] @testing.for_all_dtypes(no_complex=True) - @testing.numpy_cupy_array_list_equal() + @testing.numpy_cupy_array_equal() def test_binary_array_npscalar(self, xp, dtype): a = testing.shaped_arange((2, 3), xp, dtype) b = numpy.float32(3.0) return [op(a, b) for op in self.operators] @testing.for_all_dtypes(no_complex=True) - @testing.numpy_cupy_array_list_equal() + @testing.numpy_cupy_array_equal() def test_binary_array_pyscalar(self, xp, dtype): a = testing.shaped_arange((2, 3), xp, dtype) b = 3 diff --git a/tests/third_party/cupy/math_tests/test_sumprod.py b/tests/third_party/cupy/math_tests/test_sumprod.py index f36086755e9..b1561260402 100644 --- a/tests/third_party/cupy/math_tests/test_sumprod.py +++ b/tests/third_party/cupy/math_tests/test_sumprod.py @@ -234,7 +234,18 @@ def _numpy_nanprod_implemented(self): ) def _test(self, xp, dtype): - a = testing.shaped_arange(self.shape, xp, dtype) + shape = self.shape + # Reduce the shape of the input array to avoid overflow warning + # for nanprod with float32, shape=(20, 30, 40), axis=0 and transpose_axes=False + if ( + self.func == "nanprod" + and dtype == xp.float32 + and self.shape == (20, 30, 40) + and self.axis == 0 + and not self.transpose_axes + ): + shape = (10, 20, 30) + a = testing.shaped_arange(shape, xp, dtype) if self.transpose_axes: a = a.transpose(2, 0, 1) if not issubclass(dtype, xp.integer): @@ -245,6 +256,7 @@ def _test(self, xp, dtype): @testing.for_all_dtypes(no_bool=True, no_float16=True) 
    @testing.numpy_cupy_allclose(type_check=has_support_aspect64())
     def test_nansum_all(self, xp, dtype):
+        dtype = xp.float32
         if (
             not self._numpy_nanprod_implemented()
             or not self._do_transposed_axis_test()

From 38fd39debebc741349587bcd254648cc9b1ca724 Mon Sep 17 00:00:00 2001
From: Natalia Polina
Date: Fri, 14 Jun 2024 11:07:34 -0700
Subject: [PATCH 21/49] Added device keyword argument to astype function
 (#1870)

* Added device keyword argument to astype function

* Added test for astype function

* address comments

---------

Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 dpnp/dpnp_array.py       | 21 +++++++++++++++++++--
 dpnp/dpnp_iface.py       | 11 +++++++++--
 tests/test_sycl_queue.py | 19 +++++++++++++++++++
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/dpnp/dpnp_array.py b/dpnp/dpnp_array.py
index fb8e1fcef12..fd2d06f7428 100644
--- a/dpnp/dpnp_array.py
+++ b/dpnp/dpnp_array.py
@@ -562,7 +562,15 @@ def asnumpy(self):

         return dpt.asnumpy(self._array_obj)

-    def astype(self, dtype, order="K", casting="unsafe", subok=True, copy=True):
+    def astype(
+        self,
+        dtype,
+        order="K",
+        casting="unsafe",
+        subok=True,
+        copy=True,
+        device=None,
+    ):
         """
         Copy the array with data type casting.

@@ -597,6 +605,13 @@ def astype(self, dtype, order="K", casting="unsafe", subok=True, copy=True):
             this is set to ``False``, and the `dtype`, `order`, and `subok`
             requirements are satisfied, the input array is returned instead
             of a copy.
+        device : {None, string, SyclDevice, SyclQueue}, optional
+            An array API concept of device where the output array is created.
+            The `device` can be ``None`` (the default), a oneAPI filter selector
+            string, an instance of :class:`dpctl.SyclDevice` corresponding to
+            a non-partitioned SYCL device, an instance of :class:`dpctl.SyclQueue`,
+            or a `Device` object returned by
+            :obj:`dpnp.dpnp_array.dpnp_array.device` property. Default: ``None``.
        Returns
         -------
@@ -626,7 +641,9 @@
                 f"subok={subok} is currently not supported"
             )

-        return dpnp.astype(self, dtype, order=order, casting=casting, copy=copy)
+        return dpnp.astype(
+            self, dtype, order=order, casting=casting, copy=copy, device=device
+        )

     # 'base',
     # 'byteswap',
diff --git a/dpnp/dpnp_iface.py b/dpnp/dpnp_iface.py
index 0dfd63dab21..49e7b41c01c 100644
--- a/dpnp/dpnp_iface.py
+++ b/dpnp/dpnp_iface.py
@@ -180,7 +180,7 @@ def asnumpy(a, order="C"):


 # pylint: disable=redefined-outer-name
-def astype(x1, dtype, order="K", casting="unsafe", copy=True):
+def astype(x1, dtype, order="K", casting="unsafe", copy=True, device=None):
     """
     Copy the array with data type casting.

@@ -213,6 +213,13 @@ def astype(x1, dtype, order="K", casting="unsafe", copy=True):
         By default, ``astype`` always returns a newly allocated array. If this
         is set to ``False``, and the `dtype`, `order`, and `subok` requirements
         are satisfied, the input array is returned instead of a copy.
+    device : {None, string, SyclDevice, SyclQueue}, optional
+        An array API concept of device where the output array is created.
+        The `device` can be ``None`` (the default), a oneAPI filter selector
+        string, an instance of :class:`dpctl.SyclDevice` corresponding to
+        a non-partitioned SYCL device, an instance of :class:`dpctl.SyclQueue`,
+        or a `Device` object returned by
+        :obj:`dpnp.dpnp_array.dpnp_array.device` property. Default: ``None``.
Returns ------- @@ -228,7 +235,7 @@ def astype(x1, dtype, order="K", casting="unsafe", copy=True): x1_obj = dpnp.get_usm_ndarray(x1) array_obj = dpt.astype( - x1_obj, dtype, order=order, casting=casting, copy=copy + x1_obj, dtype, order=order, casting=casting, copy=copy, device=device ) # return x1 if dpctl returns a zero copy of x1_obj diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 3073b8806e5..99334cfabfc 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -2211,3 +2211,22 @@ def test_histogram_bin_edges(weights, device): edges_queue = result_edges.sycl_queue assert_sycl_queue_equal(edges_queue, iv.sycl_queue) + + +@pytest.mark.parametrize( + "device_x", + valid_devices, + ids=[device.filter_string for device in valid_devices], +) +@pytest.mark.parametrize( + "device_y", + valid_devices, + ids=[device.filter_string for device in valid_devices], +) +def test_astype(device_x, device_y): + x = dpnp.array([1, 2, 3], dtype="i4", device=device_x) + y = dpnp.astype(x, dtype="f4") + assert_sycl_queue_equal(y.sycl_queue, x.sycl_queue) + sycl_queue = dpctl.SyclQueue(device_y) + y = dpnp.astype(x, dtype="f4", device=sycl_queue) + assert_sycl_queue_equal(y.sycl_queue, sycl_queue) From fe93c056e9998f55ce1711110704955fa2fc690c Mon Sep 17 00:00:00 2001 From: vtavana <120411540+vtavana@users.noreply.github.com> Date: Sat, 15 Jun 2024 05:03:00 -0500 Subject: [PATCH 22/49] resolve gh-1871 (#1872) * update returned result when out is defined with order F * address comments * add test for out keyword in einsum --------- Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com> --- dpnp/dpnp_iface_linearalgebra.py | 1 - dpnp/dpnp_utils/dpnp_utils_linearalgebra.py | 101 ++++++++++++++------ tests/test_linalg.py | 16 ++++ tests/test_mathematical.py | 100 +++++++++++++++++-- tests/test_product.py | 8 ++ 5 files changed, 191 insertions(+), 35 deletions(-) diff --git a/dpnp/dpnp_iface_linearalgebra.py 
b/dpnp/dpnp_iface_linearalgebra.py index 1af952388a6..f674c96040a 100644 --- a/dpnp/dpnp_iface_linearalgebra.py +++ b/dpnp/dpnp_iface_linearalgebra.py @@ -821,7 +821,6 @@ def matmul( """ - dpnp.check_supported_arrays_type(x1, x2) if subok is False: raise NotImplementedError( "subok keyword argument is only supported by its default value." diff --git a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py index 0b9686771c3..c98acc2c81e 100644 --- a/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py +++ b/dpnp/dpnp_utils/dpnp_utils_linearalgebra.py @@ -33,6 +33,7 @@ import dpctl.tensor._tensor_elementwise_impl as tei import dpctl.tensor._tensor_impl as ti import numpy +from dpctl.utils import ExecutionPlacementError from numpy.core.numeric import normalize_axis_tuple import dpnp @@ -218,7 +219,9 @@ def _compute_size(start, shape): return ret -def _copy_array(x, dep_events, host_events, copy_flag=False, dtype=None): +def _copy_array( + x, dep_events, host_events, copy_flag=False, dtype=None, order="C" +): """ Creating a copy of input array if needed. @@ -236,7 +239,7 @@ def _copy_array(x, dep_events, host_events, copy_flag=False, dtype=None): copy = x.dtype != dtype if dtype is not None else False if copy: - x_copy = dpnp.empty_like(x, dtype=dtype, order="C") + x_copy = dpnp.empty_like(x, dtype=dtype, order=order) ht_copy_ev, copy_ev = ti._copy_usm_ndarray_into_usm_ndarray( src=dpnp.get_usm_ndarray(x), dst=x_copy.get_array(), @@ -248,7 +251,9 @@ def _copy_array(x, dep_events, host_events, copy_flag=False, dtype=None): return x -def _create_result_array(x1, x2, out, shape, dtype, usm_type, sycl_queue): +def _create_result_array( + x1, x2, out, shape, dtype, usm_type, sycl_queue, order="C" +): """ Create the result array. 
@@ -263,13 +268,12 @@ def _create_result_array(x1, x2, out, shape, dtype, usm_type, sycl_queue): x1_usm = dpnp.get_usm_ndarray(x1) x2_usm = dpnp.get_usm_ndarray(x2) out_usm = dpnp.get_usm_ndarray(out) - contig_flag = _define_contig_flag(out) + contig_flag, _, _ = _define_contig_flag(out) if ( out.dtype == dtype and out.shape == shape and out.usm_type == usm_type - and out.sycl_queue == sycl_queue and contig_flag and not ti._array_overlap(x1_usm, out_usm) and not ti._array_overlap(x2_usm, out_usm) @@ -279,6 +283,7 @@ def _create_result_array(x1, x2, out, shape, dtype, usm_type, sycl_queue): return dpnp.empty( shape, dtype=dtype, + order=order, usm_type=usm_type, sycl_queue=sycl_queue, ) @@ -295,14 +300,14 @@ def _define_contig_flag(x): x_strides = x.strides x_shape = x.shape if x.ndim < 2: - return True + return True, True, True x_strides = _standardize_strides_to_nonzero(x_strides, x_shape) x_is_c_contiguous = x_strides[-1] == 1 and x_strides[-2] == x_shape[-1] x_is_f_contiguous = x_strides[-2] == 1 and x_strides[-1] == x_shape[-2] if x_is_c_contiguous or x_is_f_contiguous: flag = True - return flag + return flag, x_is_c_contiguous, x_is_f_contiguous def _define_dim_flags(x, pos): @@ -746,17 +751,26 @@ def _gemm_batch_matmul(exec_q, x1, x2, res, dev_tasks_list): ) ht_tasks_list.append(ht_blas_ev) dpctl.SyclEvent.wait_for(ht_tasks_list) + res_shape = res.shape - if not row_major: - res = dpnp.reshape( - res.ravel(), (batch_size, res_shape[2], res_shape[1]) - ).transpose(0, 2, 1) + _, res_is_c_contig, res_is_f_contig = _define_contig_flag(res) + if row_major: + if res_is_f_contig: + res = dpnp.reshape( + dpnp.ravel(res, order="F"), + (res_shape[1], res_shape[2], batch_size), + ).transpose(2, 0, 1) + else: + if res_is_c_contig: + res = dpnp.reshape( + dpnp.ravel(res, order="C"), + (batch_size, res_shape[2], res_shape[1]), + ).transpose(0, 2, 1) if res_shape != orig_shape: res = res.reshape(orig_shape) - res = dpnp.ascontiguousarray(res) - return res + return 
dpnp.ascontiguousarray(res) def _gemm_matmul(exec_q, x1, x2, res, dev_tasks_list): @@ -769,14 +783,16 @@ def _gemm_matmul(exec_q, x1, x2, res, dev_tasks_list): ) ht_blas_ev.wait() - if not row_major: - # TODO: investigate the possibility of defining result - # array with "F" order for this case - res = dpnp.ascontiguousarray( - dpnp.reshape(res.ravel(), res.shape, order="F") - ) + if row_major: + if res.flags.f_contiguous is True: + # read data in "F" order and write it in "C" order + res = dpnp.reshape(dpnp.ravel(res, order="F"), res.shape, order="C") + else: + if res.flags.c_contiguous is True: + # read data in "C" order and write it in "F" order + res = dpnp.reshape(dpnp.ravel(res, order="C"), res.shape, order="F") - return res + return dpnp.ascontiguousarray(res) def _greedy_path(input_sets, output_set, idx_dict, memory_limit): @@ -1746,6 +1762,13 @@ def dpnp_dot(a, b, /, out=None, *, conjugate=False): ) res_usm_type, exec_q = get_usm_allocations([a, b]) + if ( + out is not None + and dpctl.utils.get_execution_queue((exec_q, out.sycl_queue)) is None + ): + raise ExecutionPlacementError( + "Input and output allocation queues are not compatible" + ) # Determine the appropriate data types dot_dtype, res_dtype = _compute_res_dtype(a, b, sycl_queue=exec_q) @@ -1812,6 +1835,12 @@ def dpnp_einsum( arrays.append(a) res_usm_type, exec_q = get_usm_allocations(arrays) + if out is not None: + dpnp.check_supported_arrays_type(out) + if dpctl.utils.get_execution_queue((exec_q, out.sycl_queue)) is None: + raise ExecutionPlacementError( + "Input and output allocation queues are not compatible" + ) result_dtype = dpnp.result_type(*arrays) if dtype is None else dtype for id, a in enumerate(operands): if dpnp.isscalar(a): @@ -2056,10 +2085,17 @@ def dpnp_matmul( """ - x1_ndim = x1.ndim - x2_ndim = x2.ndim + dpnp.check_supported_arrays_type(x1, x2) res_usm_type, exec_q = get_usm_allocations([x1, x2]) + if out is not None: + dpnp.check_supported_arrays_type(out) + if 
dpctl.utils.get_execution_queue((exec_q, out.sycl_queue)) is None: + raise ExecutionPlacementError( + "Input and output allocation queues are not compatible" + ) + x1_ndim = x1.ndim + x2_ndim = x2.ndim if axes is not None: axes = _validate_axes(x1, x2, axes) @@ -2072,7 +2108,6 @@ def dpnp_matmul( x2 = dpnp.moveaxis(x2, axes_x2, (-2, -1)) if x2_ndim != 1 else x2 out_orig = out if out is not None: - dpnp.check_supported_arrays_type(out) # out that is passed to the backend should have the correct shape if len(axes_res) == 2: out = dpnp.moveaxis(out, axes_res, (-2, -1)) @@ -2161,8 +2196,18 @@ def dpnp_matmul( res = dpnp_dot(x1, x2, out=out) res_shape = res.shape else: + x1_contig_flag, _, x1_f = _define_contig_flag(x1) + x2_contig_flag, _, x2_f = _define_contig_flag(x2) + res_order = "F" if (x1_f and x2_f and call_flag == "gemm") else "C" res = _create_result_array( - x1, x2, out, res_shape, compute_dtype, res_usm_type, exec_q + x1, + x2, + out, + res_shape, + compute_dtype, + res_usm_type, + exec_q, + res_order, ) # calculate result @@ -2175,21 +2220,21 @@ def dpnp_matmul( # their base (last 2-dimensions) to be c-contiguous or f-contiguous dep_events_list = [] host_tasks_list = [] - contig_flag = _define_contig_flag(x1) x1 = _copy_array( x1, dep_events_list, host_tasks_list, - copy_flag=not contig_flag, + copy_flag=not x1_contig_flag, dtype=compute_dtype, + order=res_order, ) - contig_flag = _define_contig_flag(x2) x2 = _copy_array( x2, dep_events_list, host_tasks_list, - copy_flag=not contig_flag, + copy_flag=not x2_contig_flag, dtype=compute_dtype, + order=res_order, ) if call_flag == "gemv": diff --git a/tests/test_linalg.py b/tests/test_linalg.py index ec2a085d4d3..48a4891034c 100644 --- a/tests/test_linalg.py +++ b/tests/test_linalg.py @@ -613,12 +613,28 @@ def test_einsum_trivial_cases(self): expected = numpy.einsum("i,i,i", b_np, b_np, b_np, optimize="greedy") assert_dtype_allclose(result, expected) + def test_einsum_out(self): + a = inp.ones((5, 5)) + a_np = 
a.asnumpy() + out = inp.empty((5,)) + out_np = out.asnumpy() + result = inp.einsum("ii->i", a, out=out) + assert result is out + expected = numpy.einsum("ii->i", a_np, out=out_np) + assert_dtype_allclose(result, expected) + def test_einsum_error(self): a = inp.ones((5, 5)) # unknown keyword argument with pytest.raises(TypeError): inp.einsum("ii->i", a, copy=False) + a = inp.ones((5, 5)) + out = inp.empty((5,), sycl_queue=dpctl.SyclQueue()) + # inconsistent sycl_queue + with pytest.raises(ExecutionPlacementError): + inp.einsum("ii->i", a, out=out) + # unknown value for optimize keyword with pytest.raises(TypeError): inp.einsum("ii->i", a, optimize="average") diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index 4a1ee63c8fc..6cf52e91deb 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -2,6 +2,7 @@ import dpctl.tensor as dpt import numpy import pytest +from dpctl.utils import ExecutionPlacementError from numpy.testing import ( assert_allclose, assert_almost_equal, @@ -2975,20 +2976,99 @@ def test_matmul_strided_vec_mat(self, shape, incx, incy, transpose): assert result is out assert_dtype_allclose(result, expected) + @pytest.mark.parametrize( + "order1, order2, out_order", + [ + ("C", "C", "C"), + ("C", "C", "F"), + ("C", "F", "C"), + ("C", "F", "F"), + ("F", "C", "C"), + ("F", "C", "F"), + ("F", "F", "F"), + ("F", "F", "C"), + ], + ) @pytest.mark.parametrize( "dtype", get_all_dtypes(no_none=True, no_bool=True) ) - def test_matmul_out(self, dtype): - a1 = numpy.arange(5 * 4, dtype=dtype).reshape(5, 4) - a2 = numpy.arange(7 * 4, dtype=dtype).reshape(4, 7) + def test_matmul_out1(self, order1, order2, out_order, dtype): + # test gemm with out keyword + a1 = numpy.arange(20, dtype=dtype).reshape(5, 4, order=order1) + a2 = numpy.arange(28, dtype=dtype).reshape(4, 7, order=order2) b1 = dpnp.asarray(a1) b2 = dpnp.asarray(a2) - dpnp_out = dpnp.empty((5, 7), dtype=dtype) + dpnp_out = dpnp.empty((5, 7), dtype=dtype, 
             order=out_order)
         result = dpnp.matmul(b1, b2, out=dpnp_out)
-        expected = numpy.matmul(a1, a2)
         assert result is dpnp_out
+
+        out = numpy.empty((5, 7), dtype=dtype, order=out_order)
+        expected = numpy.matmul(a1, a2, out=out)
+        assert result.flags.c_contiguous == expected.flags.c_contiguous
+        assert result.flags.f_contiguous == expected.flags.f_contiguous
+        assert_dtype_allclose(result, expected)
+
+    @pytest.mark.parametrize("trans", [True, False])
+    @pytest.mark.parametrize(
+        "dtype", get_all_dtypes(no_none=True, no_bool=True)
+    )
+    def test_matmul_out2(self, trans, dtype):
+        # test gemm_batch with out keyword
+        # the base of input arrays is c-contiguous
+        # the base of output array is c-contiguous or f-contiguous
+        a1 = numpy.arange(24, dtype=dtype).reshape(2, 3, 4)
+        a2 = numpy.arange(40, dtype=dtype).reshape(2, 4, 5)
+        b1 = dpnp.asarray(a1)
+        b2 = dpnp.asarray(a2)
+
+        if trans:
+            dpnp_out = dpnp.empty((2, 5, 3), dtype=dtype).transpose(0, 2, 1)
+            out = numpy.empty((2, 5, 3), dtype=dtype).transpose(0, 2, 1)
+        else:
+            dpnp_out = dpnp.empty((2, 3, 5), dtype=dtype)
+            out = numpy.empty((2, 3, 5), dtype=dtype)
+
+        result = dpnp.matmul(b1, b2, out=dpnp_out)
+        assert result is dpnp_out
+
+        expected = numpy.matmul(a1, a2, out=out)
+        assert result.flags.c_contiguous == expected.flags.c_contiguous
+        assert result.flags.f_contiguous == expected.flags.f_contiguous
+        assert_dtype_allclose(result, expected)
+
+    @pytest.mark.parametrize("trans", [True, False])
+    @pytest.mark.parametrize(
+        "dtype", get_all_dtypes(no_none=True, no_bool=True)
+    )
+    def test_matmul_out3(self, trans, dtype):
+        # test gemm_batch with out keyword
+        # the base of input arrays is f-contiguous
+        # the base of output array is c-contiguous or f-contiguous
+        a1 = numpy.arange(24, dtype=dtype).reshape(2, 4, 3)
+        a2 = numpy.arange(40, dtype=dtype).reshape(2, 5, 4)
+        b1 = dpnp.asarray(a1)
+        b2 = dpnp.asarray(a2)
+
+        a1 = numpy.asarray(a1).transpose(0, 2, 1)
+        a2 = numpy.asarray(a2).transpose(0, 2, 1)
+        b1 = b1.transpose(0, 2, 1)
+        b2 = b2.transpose(0, 2, 1)
+
+        if trans:
+            dpnp_out = dpnp.empty((2, 5, 3), dtype=dtype).transpose(0, 2, 1)
+            out = numpy.empty((2, 5, 3), dtype=dtype).transpose(0, 2, 1)
+        else:
+            dpnp_out = dpnp.empty((2, 3, 5), dtype=dtype)
+            out = numpy.empty((2, 3, 5), dtype=dtype)
+
+        result = dpnp.matmul(b1, b2, out=dpnp_out)
+        assert result is dpnp_out
+
+        expected = numpy.matmul(a1, a2, out=out)
+        assert result.flags.c_contiguous == expected.flags.c_contiguous
+        assert result.flags.f_contiguous == expected.flags.f_contiguous
         assert_dtype_allclose(result, expected)

     @pytest.mark.parametrize(
@@ -3000,6 +3080,9 @@ def test_matmul_out(self, dtype):
         ],
     )
     def test_matmul_out_0D(self, out_shape):
+        # for matmul of 0-D arrays with out keyword,
+        # NumPy repeats the data to match the shape
+        # of output array
         a = numpy.arange(3)
         b = dpnp.asarray(a)

@@ -3107,10 +3190,15 @@ def test_invalid_dtype(self, dtype):
     def test_exe_q(self):
         x1 = dpnp.ones((5, 4), sycl_queue=dpctl.SyclQueue())
         x2 = dpnp.ones((4, 7), sycl_queue=dpctl.SyclQueue())
-
         with pytest.raises(ValueError):
             dpnp.matmul(x1, x2)

+        x1 = dpnp.ones((5, 4))
+        x2 = dpnp.ones((4, 7))
+        out = dpnp.empty((5, 7), sycl_queue=dpctl.SyclQueue())
+        with pytest.raises(ExecutionPlacementError):
+            dpnp.matmul(x1, x2, out=out)
+
     def test_matmul_casting(self):
         a1 = dpnp.arange(2 * 4, dtype=dpnp.float32).reshape(2, 4)
         a2 = dpnp.arange(4 * 3).reshape(4, 3)
diff --git a/tests/test_product.py b/tests/test_product.py
index d9463a1546c..a15e82f6d90 100644
--- a/tests/test_product.py
+++ b/tests/test_product.py
@@ -1,6 +1,7 @@
 import dpctl
 import numpy
 import pytest
+from dpctl.utils import ExecutionPlacementError
 from numpy.testing import assert_raises

 import dpnp
@@ -473,6 +474,12 @@ def test_dot_sycl_queue_error(self):
         with pytest.raises(ValueError):
             dpnp.dot(a, b)

+        a = dpnp.ones((5,))
+        b = dpnp.ones((5,))
+        out = dpnp.empty((), sycl_queue=dpctl.SyclQueue())
+        with pytest.raises(ExecutionPlacementError):
+            dpnp.dot(a, b, out=out)
+
     @pytest.mark.parametrize("ia", [1, dpnp.ones((), dtype=dpnp.float32)])
     def test_dot_out_error_scalar(self, ia):
         a = ia if dpnp.isscalar(ia) else ia.asnumpy()
@@ -487,6 +494,7 @@ def test_dot_out_error_scalar(self, ia):
         # output shape is incorrect
         dp_out = dpnp.empty((2,), dtype=dpnp.int32)
+        out = numpy.empty((2,), dtype=numpy.int32)
         assert_raises(ValueError, dpnp.dot, ia, ib, out=dp_out)
         assert_raises(ValueError, numpy.dot, a, b, out=out)

From de71047cc6b5f29363c1bda2caa14af387e180aa Mon Sep 17 00:00:00 2001
From: Anton <100830759+antonwolfy@users.noreply.github.com>
Date: Sat, 15 Jun 2024 13:28:11 +0200
Subject: [PATCH 23/49] Update docstrings for ufuncs (#1881)

---
 dpnp/dpnp_iface_bitwise.py       |  24 +++----
 dpnp/dpnp_iface_logic.py         |  53 ++++++++--------
 dpnp/dpnp_iface_mathematical.py  | 100 ++++++++++++++---------
 dpnp/dpnp_iface_trigonometric.py | 106 +++++++++++++++----------------
 4 files changed, 142 insertions(+), 141 deletions(-)

diff --git a/dpnp/dpnp_iface_bitwise.py b/dpnp/dpnp_iface_bitwise.py
index 91560732d3f..21ee7cc3d82 100644
--- a/dpnp/dpnp_iface_bitwise.py
+++ b/dpnp/dpnp_iface_bitwise.py
@@ -70,12 +70,12 @@
 x2 : {dpnp.ndarray, usm_ndarray}
     Second input array, also expected to have integer or boolean data type.
-out : {None, dpnp.ndarray}, optional
+out : {None, dpnp.ndarray, usm_ndarray}, optional
     Output array to populate.
     Array must have the correct shape and the expected data type.
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -199,12 +199,12 @@ x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have integer or boolean data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -261,12 +261,12 @@ ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have integer or boolean data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -331,12 +331,12 @@ x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have integer data type. Each element must be greater than or equal to 0. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- out : dpnp.ndarray @@ -389,12 +389,12 @@ x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have integer data type. Each element must be greater than or equal to 0. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. 
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- diff --git a/dpnp/dpnp_iface_logic.py b/dpnp/dpnp_iface_logic.py index dad2dd78039..d780cf578bf 100644 --- a/dpnp/dpnp_iface_logic.py +++ b/dpnp/dpnp_iface_logic.py @@ -317,12 +317,12 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -386,12 +386,12 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -449,12 +449,12 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. 
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -553,12 +553,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -612,12 +612,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -665,12 +665,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -722,12 +722,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. 
-out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -739,6 +739,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ----------- Parameters `where` and `subok` are supported with their default values. Otherwise ``NotImplementedError`` exception will be raised. + See Also -------- :obj:`dpnp.greater` : Return the truth value of (x1 > x2) element-wise. @@ -784,12 +785,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -847,12 +848,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array. x2 : {dpnp.ndarray, usm_ndarray} Second input array. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -909,12 +910,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ---------- x : {dpnp.ndarray, usm_ndarray} Input array. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -964,12 +965,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array. x2 : {dpnp.ndarray, usm_ndarray} Second input array. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1029,12 +1030,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array. x2 : {dpnp.ndarray, usm_ndarray} Second input array. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1092,12 +1093,12 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
Returns ------- diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index b0d0c7b6123..fb3496709df 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -340,12 +340,12 @@ def _gradient_num_diff_edges( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -408,12 +408,12 @@ def _gradient_num_diff_edges( First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -477,12 +477,12 @@ def _gradient_num_diff_edges( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a complex-valued floating-point data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -534,11 +534,10 @@ def around(x, /, decimals=0, out=None): Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point. 
- out : {None, dpnp.ndarray}, optional + out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. - Returns ------- out : dpnp.ndarray @@ -556,6 +555,7 @@ def around(x, /, decimals=0, out=None): Notes ----- This function works the same as :obj:`dpnp.round`. + """ return round(x, decimals, out) @@ -570,12 +570,12 @@ def around(x, /, decimals=0, out=None): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a real-valued data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -631,7 +631,7 @@ def clip(a, a_min, a_max, *, out=None, order="K", **kwargs): output. Its type is preserved. order : {"C", "F", "A", "K", None}, optional Memory layout of the newly output array, if parameter `out` is `None`. - If `order` is ``None``, the default value "K" will be used. + If `order` is ``None``, the default value ``"K"`` will be used. Returns ------- @@ -696,12 +696,12 @@ def clip(a, a_min, a_max, *, out=None, order="K", **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -772,7 +772,7 @@ def convolve(a, v, mode="full"): Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. 
- Default: "K". + Default: ``"K"``. Returns ------- @@ -1240,12 +1240,12 @@ def diff(a, n=1, axis=-1, prepend=None, append=None): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1393,12 +1393,12 @@ def fabs(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a real-valued data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1452,12 +1452,12 @@ def fabs(x1, **kwargs): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2056,12 +2056,12 @@ def gradient(f, *varargs, axis=None, edge_order=1): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. 
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2113,12 +2113,12 @@ def gradient(f, *varargs, axis=None, edge_order=1): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2185,12 +2185,12 @@ def gradient(f, *varargs, axis=None, edge_order=1): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2344,12 +2344,12 @@ def modf(x1, **kwargs): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2410,12 +2410,12 @@ def modf(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. 
-out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2465,12 +2465,12 @@ def modf(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2527,12 +2527,12 @@ def modf(x1, **kwargs): First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2553,7 +2553,6 @@ def modf(x1, **kwargs): :obj:`dpnp.fmin` : Element-wise minimum of array elements. :obj:`dpnp.fmod` : Calculate the element-wise remainder of division. - Examples -------- >>> import dpnp as dp @@ -2708,12 +2707,12 @@ def prod( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
Returns ------- @@ -2791,12 +2790,12 @@ def prod( First input array, expected to have a real-valued data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have a real-valued data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2806,6 +2805,7 @@ def prod( array is determined by the Type Promotion Rules. Limitations +----------- Parameters `where` and `subok` are supported with their default values. Keyword argument `kwargs` is currently unsupported. Otherwise ``NotImplementedError`` exception will be raised. @@ -2857,12 +2857,12 @@ def prod( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -2916,7 +2916,7 @@ def prod( decimals : int, optional Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. @@ -2972,12 +2972,12 @@ def prod( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. 
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -3026,12 +3026,12 @@ def prod( ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -3079,12 +3079,12 @@ def prod( First input array, expected to have numeric data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -3362,12 +3362,12 @@ def trapz(y1, x1=None, dx=1.0, axis=-1): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a real-valued data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
Returns ------- diff --git a/dpnp/dpnp_iface_trigonometric.py b/dpnp/dpnp_iface_trigonometric.py index 64c110190bf..d38af96ea2c 100644 --- a/dpnp/dpnp_iface_trigonometric.py +++ b/dpnp/dpnp_iface_trigonometric.py @@ -118,12 +118,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -172,12 +172,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -226,12 +226,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -280,12 +280,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type.. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -334,12 +334,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type.. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -396,12 +396,12 @@ def _get_accumulation_res_dt(a, dtype, _out): x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have a real-valued floating-point data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -463,12 +463,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -517,12 +517,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a real-valued data type. 
-out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -568,12 +568,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -621,12 +621,12 @@ def _get_accumulation_res_dt(a, dtype, _out): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -817,12 +817,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -869,12 +869,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a floating-point data type. 
-out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -924,12 +924,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -986,12 +986,12 @@ def degrees(x1, **kwargs): First input array, expected to have a real-valued data type. x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have a real-valued data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1046,12 +1046,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
Returns ------- @@ -1100,12 +1100,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1159,12 +1159,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1218,12 +1218,12 @@ def degrees(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1284,12 +1284,12 @@ def degrees(x1, **kwargs): x2 : {dpnp.ndarray, usm_ndarray} Second input array, also expected to have a real-valued floating-point data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
Returns ------- @@ -1423,12 +1423,12 @@ def logsumexp(x, /, *, axis=None, dtype=None, keepdims=False, out=None): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have a real-valued floating-point data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1558,7 +1558,7 @@ def reduce_hypot(x, /, *, axis=None, dtype=None, keepdims=False, out=None): Array must have the correct shape and the expected data type. order : ({'C', 'F', 'A', 'K'}, optional): Memory layout of the newly output array, if parameter `out` is `None`. - Default: "K" + Default: ``"K"`` Returns ------- @@ -1657,12 +1657,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1710,12 +1710,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1762,12 +1762,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array. 
-out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1817,12 +1817,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1871,12 +1871,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. Returns ------- @@ -1924,12 +1924,12 @@ def radians(x1, **kwargs): ---------- x : {dpnp.ndarray, usm_ndarray} Input array, expected to have numeric data type. -out : {None, dpnp.ndarray}, optional +out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. - Default: "K". + Default: ``"K"``. 
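These docstrings also describe the `order` keyword, which takes effect only when `out` is ``None``: ``"K"`` (the default) preserves the input's memory layout where possible, while ``"C"`` or ``"F"`` forces a layout. NumPy ufuncs expose the same keyword, used here as a runnable stand-in:

```python
import numpy as np  # NumPy ufuncs accept the same `order` keyword as dpnp

a = np.asfortranarray(np.ones((2, 3)))  # F-contiguous input

y_k = np.negative(a)                    # order="K" (default): layout follows input
y_c = np.negative(a, order="C")         # force a C-contiguous result

assert y_k.flags.f_contiguous
assert y_c.flags.c_contiguous
```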
Returns ------- From 79eded160dc3abab3c983943d0c770127ff78e39 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Sat, 15 Jun 2024 15:16:01 +0200 Subject: [PATCH 24/49] mod is an alias of remainder (#1882) --- dpnp/dpnp_iface_mathematical.py | 65 ++++----------------------------- 1 file changed, 8 insertions(+), 57 deletions(-) diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index fb3496709df..dc27384a917 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -2245,63 +2245,6 @@ def gradient(f, *varargs, axis=None, edge_order=1): ) -def mod( - x1, - x2, - /, - out=None, - *, - where=True, - order="K", - dtype=None, - subok=True, - **kwargs, -): - """ - Compute element-wise remainder of division. - - For full documentation refer to :obj:`numpy.mod`. - - Returns - ------- - out : dpnp.ndarray - The element-wise remainder of the quotient `floor_divide(x1, x2)`. - - Limitations - ----------- - Parameters `x1` and `x2` are supported as either scalar, - :class:`dpnp.ndarray` or :class:`dpctl.tensor.usm_ndarray`, but both `x1` - and `x2` can not be scalars at the same time. - Parameters `where`, `dtype` and `subok` are supported with their default - values. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by supported DPNP :ref:`Data types`. - - See Also - -------- - :obj:`dpnp.fmod` : Calculate the element-wise remainder of division - :obj:`dpnp.remainder` : Remainder complementary to floor_divide. - :obj:`dpnp.divide` : Standard division. - - Notes - ----- - This function works the same as :obj:`dpnp.remainder`. - - """ - - return dpnp.remainder( - x1, - x2, - out=out, - where=where, - order=order, - dtype=dtype, - subok=subok, - **kwargs, - ) - - def modf(x1, **kwargs): """ Return the fractional and integral parts of an array, element-wise. 
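The hunk above deletes the hand-written `dpnp.mod` wrapper; the rest of this patch replaces it with a plain alias, since `mod` and `remainder` are the same operation. NumPy exposes the identical aliasing, shown here as a stand-in:

```python
import numpy as np  # NumPy mirrors the aliasing introduced by this patch

assert np.mod is np.remainder  # one function object, two names

# remainder takes the sign of the divisor, matching Python's % operator
assert np.remainder(7, 3) == 1
assert np.remainder(-7, 3) == 2
```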
@@ -2818,6 +2761,12 @@ def prod( :obj:`dpnp.floor_divide` : Compute the largest integer smaller or equal to the division of the inputs. :obj:`dpnp.mod` : Calculate the element-wise remainder of division. +Notes +----- +Returns ``0`` when `x2` is ``0`` and both `x1` and `x2` are (arrays of) +integers. +:obj:`dpnp.mod` is an alias of :obj:`dpnp.remainder`. + Examples -------- >>> import dpnp as np @@ -2843,6 +2792,8 @@ def prod( binary_inplace_fn=ti._remainder_inplace, ) +mod = remainder + _RINT_DOCSTRING = """ Rounds each element `x_i` of the input array `x` to From 2d50ce1b949c97dde002510c2342e7febe15350c Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Sat, 15 Jun 2024 19:04:59 +0200 Subject: [PATCH 25/49] Rework implementation of `dpnp.fabs` function (#1878) * Preparation to reuse common dpctl f/w for VM functions * PoC to decouple abs implementation to separate source file * Reuse typedef for function pointer from dpctl.tensor * Define populating vectors by a separate macro * Move implementation of utility functions from headers to source to resolve link issues * Separated implementation of acos function * Separated implementation of acosh function * Use function to simplify strides from dpctl tensor headers * PoC to decouple add implementation to separate source file * Separated implementation of asin function * Separated implementation of asinh function * Separated implementation of atan, atan2, atanh functions * Resolve issue with calling MKL function for undefined types * Separated implementation of cbrt, ceil, conj, cos and cosh functions * Separated implementation of div, exp, exp2, expm1, floor and hypot functions * Separated implementation of ln, log1p, log2 and log10 functions * Separated implementation of mul, pow, rint, sin and sinh functions * Separated implementation of sqr, sqrt, sub, tan, tanh and trunc functions * Removed unused header with types matrix * Remove unused functions * Use passing by reference
in unary and binary funcs * Implement dpnp.fabs function * Create an instance of DPNPUnaryFunc for fabs * Enable and add relating tests * Decouple populate logic to a macro * Resolve compilation failure on Win * Update dpnp/dpnp_iface_mathematical.py Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --------- Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --- dpnp/CMakeLists.txt | 1 + dpnp/backend/extensions/ufunc/CMakeLists.txt | 79 +++++++++++ .../ufunc/elementwise_functions/common.cpp | 41 ++++++ .../ufunc/elementwise_functions/common.hpp | 35 +++++ .../ufunc/elementwise_functions/fabs.cpp | 128 ++++++++++++++++++ .../ufunc/elementwise_functions/fabs.hpp | 35 +++++ .../ufunc/elementwise_functions/populate.hpp | 122 +++++++++++++++++ dpnp/backend/extensions/ufunc/ufunc_py.cpp | 36 +++++ dpnp/backend/extensions/vm/add.cpp | 6 +- dpnp/backend/extensions/vm/atan2.cpp | 6 +- dpnp/backend/extensions/vm/div.cpp | 6 +- dpnp/backend/extensions/vm/hypot.cpp | 6 +- dpnp/backend/extensions/vm/mul.cpp | 6 +- dpnp/backend/extensions/vm/pow.cpp | 6 +- dpnp/backend/extensions/vm/sub.cpp | 6 +- dpnp/backend/include/dpnp_iface_fptr.hpp | 28 ++-- dpnp/backend/kernels/dpnp_krnl_elemwise.cpp | 9 -- .../kernels/elementwise_functions/fabs.hpp | 49 +++++++ dpnp/dpnp_algo/dpnp_algo.pxd | 1 - dpnp/dpnp_algo/dpnp_algo_mathematical.pxi | 5 - dpnp/dpnp_iface_mathematical.py | 69 ++++++---- tests/skipped_tests.tbl | 92 ------------- tests/skipped_tests_gpu.tbl | 90 ------------ tests/skipped_tests_gpu_no_fp64.tbl | 7 - tests/test_usm_type.py | 1 + .../third_party/cupy/math_tests/test_misc.py | 10 +- 26 files changed, 610 insertions(+), 270 deletions(-) create mode 100644 dpnp/backend/extensions/ufunc/CMakeLists.txt create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/common.hpp create mode 100644 
dpnp/backend/extensions/ufunc/elementwise_functions/fabs.cpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fabs.hpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp create mode 100644 dpnp/backend/extensions/ufunc/ufunc_py.cpp create mode 100644 dpnp/backend/kernels/elementwise_functions/fabs.hpp diff --git a/dpnp/CMakeLists.txt b/dpnp/CMakeLists.txt index 9c79d5af385..d9c95b62c0b 100644 --- a/dpnp/CMakeLists.txt +++ b/dpnp/CMakeLists.txt @@ -60,6 +60,7 @@ add_subdirectory(backend/extensions/blas) add_subdirectory(backend/extensions/lapack) add_subdirectory(backend/extensions/vm) add_subdirectory(backend/extensions/sycl_ext) +add_subdirectory(backend/extensions/ufunc) add_subdirectory(dpnp_algo) add_subdirectory(dpnp_utils) diff --git a/dpnp/backend/extensions/ufunc/CMakeLists.txt b/dpnp/backend/extensions/ufunc/CMakeLists.txt new file mode 100644 index 00000000000..7f9a240271b --- /dev/null +++ b/dpnp/backend/extensions/ufunc/CMakeLists.txt @@ -0,0 +1,79 @@ +# ***************************************************************************** +# Copyright (c) 2024, Intel Corporation +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are met: +# - Redistributions of source code must retain the above copyright notice, +# this list of conditions and the following disclaimer. +# - Redistributions in binary form must reproduce the above copyright notice, +# this list of conditions and the following disclaimer in the documentation +# and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +# ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +# THE POSSIBILITY OF SUCH DAMAGE. +# ***************************************************************************** + +set(_elementwise_sources + ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/common.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fabs.cpp +) + +set(python_module_name _ufunc_impl) + +set(_module_src + # TODO: remove sources from `elementwise_functions` folder + ${CMAKE_CURRENT_SOURCE_DIR}/../elementwise_functions/elementwise_functions_type_utils.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/../elementwise_functions/simplify_iteration_space.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/ufunc_py.cpp + ${_elementwise_sources} +) + +pybind11_add_module(${python_module_name} MODULE ${_module_src}) +add_sycl_to_target(TARGET ${python_module_name} SOURCES ${_module_src}) + +if (WIN32) + if (${CMAKE_VERSION} VERSION_LESS "3.27") + # this is a work-around for target_link_options inserting option after -link option, cause + # linker to ignore it. 
+ set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} -fsycl-device-code-split=per_kernel") + endif() +endif() + +set_target_properties(${python_module_name} PROPERTIES CMAKE_POSITION_INDEPENDENT_CODE ON) + +target_include_directories(${python_module_name} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../../) + +target_include_directories(${python_module_name} PUBLIC ${Dpctl_INCLUDE_DIR}) +target_include_directories(${python_module_name} PUBLIC ${Dpctl_TENSOR_INCLUDE_DIR}) + +if (WIN32) + target_compile_options(${python_module_name} PRIVATE + /clang:-fno-approx-func + /clang:-fno-finite-math-only + ) +else() + target_compile_options(${python_module_name} PRIVATE + -fno-approx-func + -fno-finite-math-only + ) +endif() + +target_link_options(${python_module_name} PUBLIC -fsycl-device-code-split=per_kernel) + +if (DPNP_GENERATE_COVERAGE) + target_link_options(${python_module_name} PRIVATE -fprofile-instr-generate -fcoverage-mapping) +endif() + +install(TARGETS ${python_module_name} + DESTINATION "dpnp/backend/extensions/ufunc" +) diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp new file mode 100644 index 00000000000..44173fc764f --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp @@ -0,0 +1,41 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include + +#include "fabs.hpp" + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +/** + * @brief Add elementwise functions to Python module + */ +void init_elementwise_functions(py::module_ m) +{ + init_fabs(m); +} +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/common.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/common.hpp new file mode 100644 index 00000000000..345ff14308e --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/common.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. 
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +void init_elementwise_functions(py::module_); +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.cpp new file mode 100644 index 00000000000..7588e133473 --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.cpp @@ -0,0 +1,128 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. 
+// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include + +#include "dpctl4pybind11.hpp" + +#include "fabs.hpp" +#include "kernels/elementwise_functions/fabs.hpp" +#include "populate.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +namespace impl +{ +/** + * @brief A factory to define pairs of supported types for which + * sycl::fabs function is available. + * + * @tparam T Type of input vector `a` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = + typename std::disjunction, + td_ns::TypeMapResultEntry, + td_ns::TypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +using dpnp::kernels::fabs::FabsFunctor; + +template +using ContigFunctor = ew_cmn_ns::UnaryContigFunctor, + vec_sz, + n_vecs, + enable_sg_loadstore>; + +template +using StridedFunctor = ew_cmn_ns:: + UnaryStridedFunctor>; + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +static unary_contig_impl_fn_ptr_t fabs_contig_dispatch_vector[td_ns::num_types]; +static int fabs_output_typeid_vector[td_ns::num_types]; +static unary_strided_impl_fn_ptr_t + fabs_strided_dispatch_vector[td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_VECTORS(fabs); +} // namespace impl + +void init_fabs(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + { + impl::populate_fabs_dispatch_vectors(); + using impl::fabs_contig_dispatch_vector; + using impl::fabs_output_typeid_vector; + using impl::fabs_strided_dispatch_vector; + + auto fabs_pyapi = [&](const arrayT &src, const arrayT &dst, + sycl::queue &exec_q, + const event_vecT &depends = {}) { + return py_int::py_unary_ufunc( + src, dst, exec_q, depends, fabs_output_typeid_vector, + fabs_contig_dispatch_vector, fabs_strided_dispatch_vector); + }; + m.def("_fabs", fabs_pyapi, "", py::arg("src"), py::arg("dst"), + py::arg("sycl_queue"), py::arg("depends") = py::list()); + + auto fabs_result_type_pyapi = [&](const py::dtype &dtype) { + return py_int::py_unary_ufunc_result_type( + dtype, fabs_output_typeid_vector); + }; + m.def("_fabs_result_type", fabs_result_type_pyapi); + } +} +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.hpp new file mode 100644 index 00000000000..f4a070747ac --- /dev/null +++ 
b/dpnp/backend/extensions/ufunc/elementwise_functions/fabs.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +void init_fabs(py::module_ m); +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp new file mode 100644 index 00000000000..6261fcc08eb --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp @@ -0,0 +1,122 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +/** + * @brief A macro used to define factories and a populating universal functions. + */ +#define MACRO_POPULATE_DISPATCH_VECTORS(__name__) \ + template \ + class __name__##_contig_kernel; \ + \ + template \ + sycl::event __name__##_contig_impl( \ + sycl::queue &exec_q, size_t nelems, const char *arg_p, char *res_p, \ + const std::vector &depends = {}) \ + { \ + return ew_cmn_ns::unary_contig_impl( \ + exec_q, nelems, arg_p, res_p, depends); \ + } \ + \ + template \ + struct ContigFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v::value_type, \ + void>) { \ + fnT fn = nullptr; \ + return fn; \ + } \ + else { \ + fnT fn = __name__##_contig_impl; \ + return fn; \ + } \ + } \ + }; \ + \ + template \ + struct TypeMapFactory \ + { \ + std::enable_if_t::value, int> get() \ + { \ + using rT = typename OutputType::value_type; \ + return td_ns::GetTypeid{}.get(); \ + } \ + }; \ + \ + template \ + class __name__##_strided_kernel; \ + \ + template \ + sycl::event __name__##_strided_impl( \ + sycl::queue &exec_q, size_t nelems, int nd, \ + const py::ssize_t *shape_and_strides, const char *arg_p, \ + py::ssize_t arg_offset, char *res_p, py::ssize_t res_offset, \ + const std::vector &depends, \ + const std::vector &additional_depends) \ + { \ + return ew_cmn_ns::unary_strided_impl< \ + argTy, OutputType, StridedFunctor, 
__name__##_strided_kernel>( \ + exec_q, nelems, nd, shape_and_strides, arg_p, arg_offset, res_p, \ + res_offset, depends, additional_depends); \ + } \ + \ + template \ + struct StridedFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v::value_type, \ + void>) { \ + fnT fn = nullptr; \ + return fn; \ + } \ + else { \ + fnT fn = __name__##_strided_impl; \ + return fn; \ + } \ + } \ + }; \ + \ + void populate_##__name__##_dispatch_vectors(void) \ + { \ + td_ns::DispatchVectorBuilder \ + dvb1; \ + dvb1.populate_dispatch_vector(__name__##_contig_dispatch_vector); \ + \ + td_ns::DispatchVectorBuilder \ + dvb2; \ + dvb2.populate_dispatch_vector(__name__##_strided_dispatch_vector); \ + \ + td_ns::DispatchVectorBuilder \ + dvb3; \ + dvb3.populate_dispatch_vector(__name__##_output_typeid_vector); \ + }; diff --git a/dpnp/backend/extensions/ufunc/ufunc_py.cpp b/dpnp/backend/extensions/ufunc/ufunc_py.cpp new file mode 100644 index 00000000000..3618bce2cec --- /dev/null +++ b/dpnp/backend/extensions/ufunc/ufunc_py.cpp @@ -0,0 +1,36 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include + +#include "elementwise_functions/common.hpp" + +namespace py = pybind11; +namespace ufunc_ns = dpnp::extensions::ufunc; + +PYBIND11_MODULE(_ufunc_impl, m) +{ + ufunc_ns::init_elementwise_functions(m); +} diff --git a/dpnp/backend/extensions/vm/add.cpp b/dpnp/backend/extensions/vm/add.cpp index c43f07bbcde..c174bf73a99 100644 --- a/dpnp/backend/extensions/vm/add.cpp +++ b/dpnp/backend/extensions/vm/add.cpp @@ -83,11 +83,11 @@ template static sycl::event add_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector &depends) { tu_ns::validate_type_for_device(exec_q); diff --git a/dpnp/backend/extensions/vm/atan2.cpp b/dpnp/backend/extensions/vm/atan2.cpp index 30bb59c9c42..4820a9623f0 100644 --- a/dpnp/backend/extensions/vm/atan2.cpp +++ b/dpnp/backend/extensions/vm/atan2.cpp @@ -73,11 +73,11 @@ template static sycl::event atan2_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector &depends) { tu_ns::validate_type_for_device(exec_q); diff --git 
a/dpnp/backend/extensions/vm/div.cpp b/dpnp/backend/extensions/vm/div.cpp index 8cdb547feb4..5fb7122a76c 100644 --- a/dpnp/backend/extensions/vm/div.cpp +++ b/dpnp/backend/extensions/vm/div.cpp @@ -83,11 +83,11 @@ template <typename T> static sycl::event div_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector<sycl::event> &depends) { tu_ns::validate_type_for_device<T>(exec_q); diff --git a/dpnp/backend/extensions/vm/hypot.cpp b/dpnp/backend/extensions/vm/hypot.cpp index 42dd8127111..50ca178c37c 100644 --- a/dpnp/backend/extensions/vm/hypot.cpp +++ b/dpnp/backend/extensions/vm/hypot.cpp @@ -73,11 +73,11 @@ template <typename T> static sycl::event hypot_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector<sycl::event> &depends) { tu_ns::validate_type_for_device<T>(exec_q); diff --git a/dpnp/backend/extensions/vm/mul.cpp b/dpnp/backend/extensions/vm/mul.cpp index 34007fbc07c..de59d087f51 100644 --- a/dpnp/backend/extensions/vm/mul.cpp +++ b/dpnp/backend/extensions/vm/mul.cpp @@ -83,11 +83,11 @@ template <typename T> static sycl::event mul_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector<sycl::event> &depends) { tu_ns::validate_type_for_device<T>(exec_q); diff --git a/dpnp/backend/extensions/vm/pow.cpp b/dpnp/backend/extensions/vm/pow.cpp index 65acd2ece44..491b86f7946 100644 --- a/dpnp/backend/extensions/vm/pow.cpp +++ b/dpnp/backend/extensions/vm/pow.cpp @@ -83,11 +83,11 @@ template <typename T> static sycl::event pow_contig_impl(sycl::queue &exec_q, std::size_t in_n, 
const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector<sycl::event> &depends) { tu_ns::validate_type_for_device<T>(exec_q); diff --git a/dpnp/backend/extensions/vm/sub.cpp b/dpnp/backend/extensions/vm/sub.cpp index 4ec1bdc36b5..8bfc477bfa7 100644 --- a/dpnp/backend/extensions/vm/sub.cpp +++ b/dpnp/backend/extensions/vm/sub.cpp @@ -83,11 +83,11 @@ template <typename T> static sycl::event sub_contig_impl(sycl::queue &exec_q, std::size_t in_n, const char *in_a, - ssize_t a_offset, + py::ssize_t a_offset, const char *in_b, - ssize_t b_offset, + py::ssize_t b_offset, char *out_y, - ssize_t out_offset, + py::ssize_t out_offset, const std::vector<sycl::event> &depends) { tu_ns::validate_type_for_device<T>(exec_q); diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index d8e6f8b26e8..0f6ef51bc7c 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -117,21 +117,19 @@ enum class DPNPFuncName : size_t DPNP_FN_DOT, /**< Used in numpy.dot() impl */ DPNP_FN_DOT_EXT, /**< Used in numpy.dot() impl, requires extra parameters */ DPNP_FN_EDIFF1D, /**< Used in numpy.ediff1d() impl */ - DPNP_FN_EDIFF1D_EXT, /**< Used in numpy.ediff1d() impl, requires extra - parameters */ - DPNP_FN_EIG, /**< Used in numpy.linalg.eig() impl */ - DPNP_FN_EIGVALS, /**< Used in numpy.linalg.eigvals() impl */ - DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ - DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra - parameters */ - DPNP_FN_EYE, /**< Used in numpy.eye() impl */ - DPNP_FN_EXP, /**< Used in numpy.exp() impl */ - DPNP_FN_EXP2, /**< Used in numpy.exp2() impl */ - DPNP_FN_EXPM1, /**< Used in numpy.expm1() impl */ - DPNP_FN_FABS, /**< Used in numpy.fabs() impl */ - DPNP_FN_FABS_EXT, /**< Used in numpy.fabs() impl, requires extra parameters - */ - DPNP_FN_FFT_FFT, /**< Used in 
numpy.fft.fft() impl */ + DPNP_FN_EDIFF1D_EXT, /**< Used in numpy.ediff1d() impl, requires extra + parameters */ + DPNP_FN_EIG, /**< Used in numpy.linalg.eig() impl */ + DPNP_FN_EIGVALS, /**< Used in numpy.linalg.eigvals() impl */ + DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ + DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra + parameters */ + DPNP_FN_EYE, /**< Used in numpy.eye() impl */ + DPNP_FN_EXP, /**< Used in numpy.exp() impl */ + DPNP_FN_EXP2, /**< Used in numpy.exp2() impl */ + DPNP_FN_EXPM1, /**< Used in numpy.expm1() impl */ + DPNP_FN_FABS, /**< Used in numpy.fabs() impl */ + DPNP_FN_FFT_FFT, /**< Used in numpy.fft.fft() impl */ DPNP_FN_FFT_FFT_EXT, /**< Used in numpy.fft.fft() impl, requires extra parameters */ DPNP_FN_FFT_RFFT, /**< Used in numpy.fft.rfft() impl */ diff --git a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp index a69a875fc1e..122a3ccdedd 100644 --- a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp @@ -462,15 +462,6 @@ static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_FABS][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_fabs_c_default<double, double>}; - fmap[DPNPFuncName::DPNP_FN_FABS_EXT][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_fabs_c_ext<int32_t, double>}; - fmap[DPNPFuncName::DPNP_FN_FABS_EXT][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_fabs_c_ext<int64_t, double>}; - fmap[DPNPFuncName::DPNP_FN_FABS_EXT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_fabs_c_ext<float, float>}; - fmap[DPNPFuncName::DPNP_FN_FABS_EXT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_fabs_c_ext<double, double>}; - fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_INT][eft_INT] = { eft_DBL, (void *)dpnp_floor_c_default<int32_t, double>}; fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_LNG][eft_LNG] = { diff --git a/dpnp/backend/kernels/elementwise_functions/fabs.hpp b/dpnp/backend/kernels/elementwise_functions/fabs.hpp new file mode 100644 index 00000000000..525cfc5bfe6 --- /dev/null +++ 
b/dpnp/backend/kernels/elementwise_functions/fabs.hpp @@ -0,0 +1,49 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include <sycl/sycl.hpp> + +namespace dpnp::kernels::fabs +{ +template <typename argT, typename resT> +struct FabsFunctor +{ + // is function constant for given argT + using is_constant = typename std::false_type; + // constant value, if constant + // constexpr resT constant_value = resT{}; + // is function defined for sycl::vec + using supports_vec = typename std::false_type; + // do both argT and resT support subgroup store/load operation + using supports_sg_loadstore = typename std::true_type; + + resT operator()(const argT &x) const + { + return sycl::fabs(x); + } +}; +} // namespace dpnp::kernels::fabs diff --git a/dpnp/dpnp_algo/dpnp_algo.pxd b/dpnp/dpnp_algo/dpnp_algo.pxd index a82a96ed0c5..f6df42981a9 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pxd +++ b/dpnp/dpnp_algo/dpnp_algo.pxd @@ -40,7 -40,6 @@ cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncName": # need this na DPNP_FN_DEGREES_EXT DPNP_FN_EDIFF1D_EXT DPNP_FN_ERF_EXT - DPNP_FN_FABS_EXT DPNP_FN_FFT_FFT_EXT DPNP_FN_FFT_RFFT_EXT DPNP_FN_FMOD_EXT diff --git a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi index 2b8d63c6d2d..405037da782 100644 --- a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi +++ b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi @@ -37,7 +37,6 @@ and the rest of the library __all__ += [ "dpnp_ediff1d", - "dpnp_fabs", "dpnp_fmod", "dpnp_fmax", "dpnp_fmin", @@ -110,10 +109,6 @@ cpdef utils.dpnp_descriptor dpnp_ediff1d(utils.dpnp_descriptor x1): return result -cpdef utils.dpnp_descriptor dpnp_fabs(utils.dpnp_descriptor x1): - return call_fptr_1in_1out_strides(DPNP_FN_FABS_EXT, x1) - - cpdef utils.dpnp_descriptor dpnp_fmod(utils.dpnp_descriptor x1_obj, utils.dpnp_descriptor x2_obj, object dtype=None, diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index dc27384a917..2f34be46312 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -55,12 +55,12 
@@ ) import dpnp +import dpnp.backend.extensions.ufunc._ufunc_impl as ufi import dpnp.backend.extensions.vm._vm_impl as vmi from .backend.extensions.sycl_ext import _sycl_ext_impl from .dpnp_algo import ( dpnp_ediff1d, - dpnp_fabs, dpnp_fmax, dpnp_fmin, dpnp_fmod, @@ -1347,39 +1347,54 @@ def ediff1d(x1, to_end=None, to_begin=None): return call_origin(numpy.ediff1d, x1, to_end=to_end, to_begin=to_begin) -def fabs(x1, **kwargs): - """ - Compute the absolute values element-wise. +_FABS_DOCSTRING = """ +Compute the absolute values element-wise. - For full documentation refer to :obj:`numpy.fabs`. +This function returns the absolute values (positive magnitude) of the data in +`x`. Complex values are not handled; use :obj:`dpnp.absolute` to find the +absolute values of complex data. - Limitations - ----------- - Parameter `x1` is supported as :class:`dpnp.ndarray`. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by supported DPNP :ref:`Data types`. +For full documentation refer to :obj:`numpy.fabs`. - See Also - -------- - :obj:`dpnp.absolute` : Calculate the absolute value element-wise. +Parameters +---------- +x : {dpnp.ndarray, usm_ndarray} + The array of numbers for which the absolute values are required. +out : {None, dpnp.ndarray, usm_ndarray}, optional + Output array to populate. + Array must have the correct shape and the expected data type. +order : {"C", "F", "A", "K"}, optional + Memory layout of the newly created output array, if parameter `out` is + ``None``. Default: ``"K"``. - Examples - -------- - >>> import dpnp as np - >>> result = np.fabs(np.array([1, -2, 6, -9])) - >>> [x for x in result] - [1.0, 2.0, 6.0, 9.0] +Returns +------- +out : dpnp.ndarray + The absolute values of `x`; the returned values are always floats. 
+ If `x` does not have a floating point data type, the returned array + will have a data type that depends on the capabilities of the device + on which the array resides. - """ +See Also +-------- +:obj:`dpnp.absolute` : Absolute values including `complex` types. - x1_desc = dpnp.get_dpnp_descriptor( - x1, copy_when_strides=False, copy_when_nondefault_queue=False - ) - if x1_desc: - return dpnp_fabs(x1_desc).get_pyobj() +Examples +-------- +>>> import dpnp as np +>>> a = np.array([-1.2, 1.2]) +>>> np.fabs(a) +array([1.2, 1.2]) +""" - return call_origin(numpy.fabs, x1, **kwargs) +fabs = DPNPUnaryFunc( + "fabs", + ufi._fabs_result_type, + ufi._fabs, + _FABS_DOCSTRING, + mkl_fn_to_call=vmi._mkl_abs_to_call, + mkl_impl_fn=vmi._abs, +) _FLOOR_DOCSTRING = """ diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl index 5e012b3a496..c86b0d848c5 100644 --- a/tests/skipped_tests.tbl +++ b/tests/skipped_tests.tbl @@ -36,7 +36,6 @@ tests/third_party/cupy/fft_tests/test_fft.py::TestFftn_param_23_{axes=None, norm tests/third_party/intel/test_zero_copy_test1.py::test_dpnp_interaction_with_dpctl_memory tests/test_strides.py::test_strides_1arg[(10,)-None-degrees] -tests/test_strides.py::test_strides_1arg[(10,)-None-fabs] tests/test_strides.py::test_strides_1arg[(10,)-None-radians] tests/test_umath.py::test_umaths[('divmod', 'ii')] @@ -260,12 +259,6 @@ tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_inf_ar tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[nan] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[posinf] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[neginf] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_0_{mode='valid'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_1_{mode='same'}] 
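The contract the new fabs docstring describes (results are always floating point, complex input is rejected in favor of absolute) can be mimicked with a small pure-Python helper for reference; `ref_fabs` is a hypothetical illustration, not part of dpnp:

```python
import math


def ref_fabs(values):
    """Element-wise absolute value returning floats, rejecting complex.

    Mirrors the documented dpnp.fabs contract: the results are always
    floating point, and complex input raises an error (use absolute()
    for complex data instead).
    """
    out = []
    for v in values:
        if isinstance(v, complex):
            raise TypeError(
                "fabs is not defined for complex input; use absolute() instead"
            )
        out.append(math.fabs(v))
    return out


print(ref_fabs([1, -2, 6, -9]))  # -> [1.0, 2.0, 6.0, 9.0]
```

Note that integer input comes back as floats, matching the "returned values are always floats" wording in the docstring above.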
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_2_{mode='full'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_0_{mode='valid'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_1_{mode='same'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_2_{mode='full'}] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_scalar_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_copy @@ -292,91 +285,6 @@ tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_interp_inf_to_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_heaviside tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_heaviside_nan_inf -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_0_{mode='valid', shape1=(), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_1_{mode='valid', shape1=(), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_2_{mode='valid', shape1=(), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_3_{mode='valid', shape1=(), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_4_{mode='valid', shape1=(), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_5_{mode='valid', shape1=(5,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_6_{mode='valid', shape1=(5,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_7_{mode='valid', shape1=(5,), 
shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_8_{mode='valid', shape1=(5,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_9_{mode='valid', shape1=(5,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_10_{mode='valid', shape1=(6,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_11_{mode='valid', shape1=(6,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_12_{mode='valid', shape1=(6,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_13_{mode='valid', shape1=(6,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_14_{mode='valid', shape1=(6,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_15_{mode='valid', shape1=(20,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_16_{mode='valid', shape1=(20,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_17_{mode='valid', shape1=(20,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_18_{mode='valid', shape1=(20,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_19_{mode='valid', shape1=(20,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_20_{mode='valid', shape1=(21,), shape2=()}] 
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_21_{mode='valid', shape1=(21,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_22_{mode='valid', shape1=(21,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_23_{mode='valid', shape1=(21,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_24_{mode='valid', shape1=(21,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_25_{mode='same', shape1=(), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_26_{mode='same', shape1=(), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_27_{mode='same', shape1=(), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_28_{mode='same', shape1=(), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_29_{mode='same', shape1=(), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_30_{mode='same', shape1=(5,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_31_{mode='same', shape1=(5,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_32_{mode='same', shape1=(5,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_33_{mode='same', shape1=(5,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_34_{mode='same', 
shape1=(5,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_35_{mode='same', shape1=(6,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_36_{mode='same', shape1=(6,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_37_{mode='same', shape1=(6,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_38_{mode='same', shape1=(6,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_39_{mode='same', shape1=(6,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_40_{mode='same', shape1=(20,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_41_{mode='same', shape1=(20,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_42_{mode='same', shape1=(20,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_43_{mode='same', shape1=(20,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_44_{mode='same', shape1=(20,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_45_{mode='same', shape1=(21,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_46_{mode='same', shape1=(21,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_47_{mode='same', shape1=(21,), shape2=(6,)}] 
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_48_{mode='same', shape1=(21,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_49_{mode='same', shape1=(21,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_50_{mode='full', shape1=(), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_51_{mode='full', shape1=(), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_52_{mode='full', shape1=(), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_53_{mode='full', shape1=(), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_54_{mode='full', shape1=(), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_55_{mode='full', shape1=(5,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_56_{mode='full', shape1=(5,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_57_{mode='full', shape1=(5,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_58_{mode='full', shape1=(5,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_59_{mode='full', shape1=(5,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_60_{mode='full', shape1=(6,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_61_{mode='full', shape1=(6,), 
shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_62_{mode='full', shape1=(6,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_63_{mode='full', shape1=(6,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_64_{mode='full', shape1=(6,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_65_{mode='full', shape1=(20,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_66_{mode='full', shape1=(20,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_67_{mode='full', shape1=(20,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_68_{mode='full', shape1=(20,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_69_{mode='full', shape1=(20,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_70_{mode='full', shape1=(21,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_71_{mode='full', shape1=(21,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_72_{mode='full', shape1=(21,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_73_{mode='full', shape1=(21,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_74_{mode='full', shape1=(21,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[valid] 
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[same] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[full] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[valid] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[same] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[full] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[valid] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[same] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[full] - tests/third_party/cupy/math_tests/test_rounding.py::TestRounding::test_fix tests/third_party/cupy/math_tests/test_trigonometric.py::TestUnwrap::test_unwrap_1dim_with_discont diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index e14b954abe6..45b41f2dafb 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -310,12 +310,6 @@ tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_inf_ar tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[nan] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[posinf] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_broadcast[neginf] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_0_{mode='valid'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_1_{mode='same'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_empty[_param_2_{mode='full'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_0_{mode='valid'}] 
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_1_{mode='same'}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveInvalid::test_convolve_ndim[_param_2_{mode='full'}] tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_scalar_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_copy @@ -341,90 +335,6 @@ tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_interp_size1 tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_interp_inf_to_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_heaviside tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_heaviside_nan_inf -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_0_{mode='valid', shape1=(), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_1_{mode='valid', shape1=(), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_2_{mode='valid', shape1=(), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_3_{mode='valid', shape1=(), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_4_{mode='valid', shape1=(), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_5_{mode='valid', shape1=(5,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_6_{mode='valid', shape1=(5,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_7_{mode='valid', shape1=(5,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_8_{mode='valid', shape1=(5,), 
shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_9_{mode='valid', shape1=(5,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_10_{mode='valid', shape1=(6,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_11_{mode='valid', shape1=(6,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_12_{mode='valid', shape1=(6,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_13_{mode='valid', shape1=(6,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_14_{mode='valid', shape1=(6,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_15_{mode='valid', shape1=(20,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_16_{mode='valid', shape1=(20,), shape2=(5,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_17_{mode='valid', shape1=(20,), shape2=(6,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_18_{mode='valid', shape1=(20,), shape2=(20,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_19_{mode='valid', shape1=(20,), shape2=(21,)}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_20_{mode='valid', shape1=(21,), shape2=()}] -tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_21_{mode='valid', shape1=(21,), shape2=(5,)}] 
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_22_{mode='valid', shape1=(21,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_23_{mode='valid', shape1=(21,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_24_{mode='valid', shape1=(21,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_25_{mode='same', shape1=(), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_26_{mode='same', shape1=(), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_27_{mode='same', shape1=(), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_28_{mode='same', shape1=(), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_29_{mode='same', shape1=(), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_30_{mode='same', shape1=(5,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_31_{mode='same', shape1=(5,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_32_{mode='same', shape1=(5,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_33_{mode='same', shape1=(5,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_34_{mode='same', shape1=(5,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_35_{mode='same', shape1=(6,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_36_{mode='same', shape1=(6,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_37_{mode='same', shape1=(6,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_38_{mode='same', shape1=(6,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_39_{mode='same', shape1=(6,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_40_{mode='same', shape1=(20,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_41_{mode='same', shape1=(20,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_42_{mode='same', shape1=(20,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_43_{mode='same', shape1=(20,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_44_{mode='same', shape1=(20,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_45_{mode='same', shape1=(21,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_46_{mode='same', shape1=(21,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_47_{mode='same', shape1=(21,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_48_{mode='same', shape1=(21,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_49_{mode='same', shape1=(21,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_50_{mode='full', shape1=(), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_51_{mode='full', shape1=(), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_52_{mode='full', shape1=(), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_53_{mode='full', shape1=(), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_54_{mode='full', shape1=(), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_55_{mode='full', shape1=(5,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_56_{mode='full', shape1=(5,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_57_{mode='full', shape1=(5,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_58_{mode='full', shape1=(5,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_59_{mode='full', shape1=(5,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_60_{mode='full', shape1=(6,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_61_{mode='full', shape1=(6,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_62_{mode='full', shape1=(6,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_63_{mode='full', shape1=(6,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_64_{mode='full', shape1=(6,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_65_{mode='full', shape1=(20,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_66_{mode='full', shape1=(20,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_67_{mode='full', shape1=(20,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_68_{mode='full', shape1=(20,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_69_{mode='full', shape1=(20,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_70_{mode='full', shape1=(21,), shape2=()}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_71_{mode='full', shape1=(21,), shape2=(5,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_72_{mode='full', shape1=(21,), shape2=(6,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_73_{mode='full', shape1=(21,), shape2=(20,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolveShapeCombination::test_convolve[_param_74_{mode='full', shape1=(21,), shape2=(21,)}]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[valid]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[same]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_non_contiguous[full]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[valid]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[same]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_large_non_contiguous[full]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[valid]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[same]
-tests/third_party/cupy/math_tests/test_misc.py::TestConvolve::test_convolve_diff_types[full]
 
 tests/third_party/cupy/math_tests/test_rounding.py::TestRounding::test_fix
diff --git a/tests/skipped_tests_gpu_no_fp64.tbl b/tests/skipped_tests_gpu_no_fp64.tbl
index c209c876df6..44e4c856b77 100644
--- a/tests/skipped_tests_gpu_no_fp64.tbl
+++ b/tests/skipped_tests_gpu_no_fp64.tbl
@@ -1,12 +1,5 @@
-tests/test_strides.py::test_strides_1arg[(10,)-int32-fabs]
-tests/test_strides.py::test_strides_1arg[(10,)-int64-fabs]
-tests/test_strides.py::test_strides_1arg[(10,)-None-fabs]
-
 tests/test_umath.py::test_umaths[('floor_divide', 'ff')]
 
-tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fabs
-tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fabs_negative
-
 tests/third_party/cupy/math_tests/test_trigonometric.py::TestUnwrap::test_unwrap_1dim
 
 tests/third_party/cupy/random_tests/test_distributions.py::TestDistributionsBeta_param_6_{a_shape=(3, 2), b_shape=(3, 2), shape=(4, 3, 2)}::test_beta
diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py
index 77839f9b933..4f7314ff2db 100644
--- a/tests/test_usm_type.py
+++ b/tests/test_usm_type.py
@@ -538,6 +538,7 @@ def test_norm(usm_type, ord, axis):
         pytest.param("exp", [1.0, 2.0, 4.0, 7.0]),
         pytest.param("exp2", [0.0, 1.0, 2.0]),
         pytest.param("expm1", [1.0e-10, 1.0, 2.0, 4.0, 7.0]),
+        pytest.param("fabs", [-1.2, 1.2]),
         pytest.param("floor", [-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]),
         pytest.param("gradient", [1, 2, 4, 7, 11, 16]),
         pytest.param("histogram_bin_edges", [0, 0, 0, 1, 2, 3, 3, 4, 5]),
diff --git a/tests/third_party/cupy/math_tests/test_misc.py b/tests/third_party/cupy/math_tests/test_misc.py
index dd7fe9dcc1a..62717803aca 100644
--- a/tests/third_party/cupy/math_tests/test_misc.py
+++ b/tests/third_party/cupy/math_tests/test_misc.py
@@ -26,6 +26,7 @@ def check_binary(self, name, xp, dtype, no_bool=False):
 
     @testing.for_dtypes(["?", "b", "h", "i", "q", "e", "f", "d", "F", "D"])
     @testing.numpy_cupy_allclose(atol=1e-5)
+    # TODO: remove no_complex=True, once adapted to numpy 2.0
     def check_unary_negative(
         self, name, xp, dtype, no_bool=False, no_complex=False
     ):
@@ -184,13 +185,13 @@ def test_absolute_negative(self):
         self.check_unary_negative("absolute")
 
     @testing.for_all_dtypes(no_complex=True)
-    @testing.numpy_cupy_allclose(atol=1e-5)
+    @testing.numpy_cupy_allclose(atol=1e-5, type_check=has_support_aspect64())
     def test_fabs(self, xp, dtype):
         a = xp.array([2, 3, 4], dtype=dtype)
         return xp.fabs(a)
 
     @testing.for_all_dtypes(no_complex=True)
-    @testing.numpy_cupy_allclose(atol=1e-5)
+    @testing.numpy_cupy_allclose(atol=1e-5, type_check=has_support_aspect64())
     def test_fabs_negative(self, xp, dtype):
         a = xp.array([-2.0, -4.0, 0.0, 4.0], dtype=dtype)
         return xp.fabs(a)
@@ -198,7 +199,7 @@ def test_fabs_negative(self):
     def test_sign(self):
         self.check_unary("sign", no_bool=True)
 
-    # TODO: remove no_comlex=True, when numpy 2.0.0 will release
+    # TODO: remove no_complex=True, once adapted to numpy 2.0
     def test_sign_negative(self):
         self.check_unary_negative("sign", no_bool=True, no_complex=True)
 
@@ -504,6 +505,7 @@ def test_heaviside_nan_inf(self, xp, dtype_1, dtype_2):
         }
     )
 )
+@pytest.mark.skip("convolve() is not implemented yet")
 class TestConvolveShapeCombination:
     @testing.for_all_dtypes(no_float16=True)
     @testing.numpy_cupy_allclose(rtol=1e-3)
@@ -513,6 +515,7 @@ def test_convolve(self, xp, dtype):
         return xp.convolve(a, b, mode=self.mode)
 
 
+@pytest.mark.skip("convolve() is not implemented yet")
 @pytest.mark.parametrize("mode", ["valid", "same", "full"])
 class TestConvolve:
     @testing.for_all_dtypes(no_float16=True)
@@ -537,6 +540,7 @@ def test_convolve_diff_types(self, xp, dtype1, dtype2, mode):
         return xp.convolve(a, b, mode=mode)
 
 
+@pytest.mark.skip("convolve() is not implemented yet")
 @testing.parameterize(*testing.product({"mode": ["valid", "same", "full"]}))
 class TestConvolveInvalid:
     @testing.for_all_dtypes()

From a01f21f8dcb1b3d843caed7811d14cf87592209b Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sun, 16 Jun 2024 11:11:54 +0200
Subject: [PATCH 26/49] Bump github/codeql-action from 3.25.8 to 3.25.10 (#1885)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.8 to 3.25.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/2e230e8fe0ad3a14a340ad0815ddb96d599d2aff...23acc5c183826b7a8a97bce3cecc52db901f8251)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 .github/workflows/openssf-scorecard.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml
index 5d0d13d45fb..09b21df1350 100644
--- a/.github/workflows/openssf-scorecard.yml
+++ b/.github/workflows/openssf-scorecard.yml
@@ -68,6 +68,6 @@ jobs:
 
       # Upload the results to GitHub's code scanning dashboard.
       - name: "Upload to code-scanning"
-        uses: github/codeql-action/upload-sarif@2e230e8fe0ad3a14a340ad0815ddb96d599d2aff # v3.25.8
+        uses: github/codeql-action/upload-sarif@23acc5c183826b7a8a97bce3cecc52db901f8251 # v3.25.10
         with:
           sarif_file: results.sarif

From af601c60d1c043b4056c661107d8e108a4a247bd Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sun, 16 Jun 2024 18:36:14 +0200
Subject: [PATCH 27/49] Bump actions/checkout from 4.1.6 to 4.1.7 (#1886)

Bumps [actions/checkout](https://github.com/actions/checkout) from 4.1.6 to 4.1.7.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/a5ac7e51b41094c92402da3b24376905380afc29...692973e3d937129bcbf40652eb9f2f61becf3332)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 .github/workflows/build-sphinx.yml | 4 ++--
 .github/workflows/conda-package.yml | 4 ++--
 .github/workflows/generate_coverage.yaml | 2 +-
 .github/workflows/openssf-scorecard.yml | 2 +-
 .github/workflows/pre-commit.yml | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/.github/workflows/build-sphinx.yml b/.github/workflows/build-sphinx.yml
index 02d4be09541..13c84de50e7 100644
--- a/.github/workflows/build-sphinx.yml
+++ b/.github/workflows/build-sphinx.yml
@@ -91,7 +91,7 @@ jobs:
           sudo apt-get install -y nvidia-cuda-toolkit clinfo
 
       - name: Checkout repo
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           fetch-depth: 0
 
@@ -221,7 +221,7 @@
     runs-on: ubuntu-20.04
 
     steps:
-      - uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+      - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           fetch-depth: 0

diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml
index 83c657a77c5..8f474e5398e 100644
--- a/.github/workflows/conda-package.yml
+++ b/.github/workflows/conda-package.yml
@@ -90,7 +90,7 @@ jobs:
         access_token: ${{ github.token }}
 
       - name: Checkout DPNP repo
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           fetch-depth: 0
 
@@ -515,7 +515,7 @@
         run: mamba install anaconda-client
 
       - name: Checkout repo
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           repository: IntelPython/devops-tools
           fetch-depth: 0

diff --git a/.github/workflows/generate_coverage.yaml b/.github/workflows/generate_coverage.yaml
index 22ec13da23a..1fa71fb479d 100644
--- a/.github/workflows/generate_coverage.yaml
+++ b/.github/workflows/generate_coverage.yaml
@@ -32,7 +32,7 @@ jobs:
         access_token: ${{ github.token }}
 
      - name: Checkout repo
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           fetch-depth: 0

diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml
index 09b21df1350..803f20d284b 100644
--- a/.github/workflows/openssf-scorecard.yml
+++ b/.github/workflows/openssf-scorecard.yml
@@ -33,7 +33,7 @@ jobs:
 
     steps:
       - name: "Checkout code"
-        uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
         with:
           persist-credentials: false

diff --git a/.github/workflows/pre-commit.yml b/.github/workflows/pre-commit.yml
index 0a5f91f89bf..3b2f1c9e215 100644
--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -26,7 +26,7 @@ jobs:
           pylint
 
     - name: Checkout DPNP repo
-      uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29 # v4.1.6
+      uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
 
     - name: Set up python
       uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5.1.0

From a813fae646ec784e0be994568ae63c342ee6288c Mon Sep 17 00:00:00 2001
From: vtavana <120411540+vtavana@users.noreply.github.com>
Date: Mon, 17 Jun 2024 06:10:05 -0500
Subject: [PATCH 28/49] update BLAS extension routines (#1884)

Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 dpnp/backend/extensions/blas/blas_py.cpp | 57 +++++------
 dpnp/backend/extensions/blas/dot.hpp | 21 ++---
 dpnp/backend/extensions/blas/dot_common.hpp | 44 ++++-----
 dpnp/backend/extensions/blas/dotc.hpp | 21 ++---
 dpnp/backend/extensions/blas/dotu.hpp | 21 ++---
 dpnp/backend/extensions/blas/gemm.cpp |
73 +++++++------- dpnp/backend/extensions/blas/gemm.hpp | 27 ++---- dpnp/backend/extensions/blas/gemm_batch.cpp | 94 +++++++++---------- dpnp/backend/extensions/blas/gemv.cpp | 62 ++++++------ dpnp/backend/extensions/blas/gemv.hpp | 31 +++--- dpnp/backend/extensions/blas/types_matrix.hpp | 16 +--- 11 files changed, 192 insertions(+), 275 deletions(-) diff --git a/dpnp/backend/extensions/blas/blas_py.cpp b/dpnp/backend/extensions/blas/blas_py.cpp index b5d83375f23..54fde4f4fea 100644 --- a/dpnp/backend/extensions/blas/blas_py.cpp +++ b/dpnp/backend/extensions/blas/blas_py.cpp @@ -37,17 +37,17 @@ #include "gemm.hpp" #include "gemv.hpp" -namespace blas_ext = dpnp::backend::ext::blas; +namespace blas_ns = dpnp::extensions::blas; namespace py = pybind11; -namespace dot_ext = blas_ext::dot; -using dot_ext::dot_impl_fn_ptr_t; +namespace dot_ns = blas_ns::dot; +using dot_ns::dot_impl_fn_ptr_t; // populate dispatch vectors and tables void init_dispatch_vectors_tables(void) { - blas_ext::init_gemm_batch_dispatch_table(); - blas_ext::init_gemm_dispatch_table(); - blas_ext::init_gemv_dispatch_vector(); + blas_ns::init_gemm_batch_dispatch_table(); + blas_ns::init_gemm_dispatch_table(); + blas_ns::init_gemv_dispatch_vector(); } static dot_impl_fn_ptr_t dot_dispatch_vector[dpctl_td_ns::num_types]; @@ -62,14 +62,15 @@ PYBIND11_MODULE(_blas_impl, m) using event_vecT = std::vector; { - dot_ext::init_dot_dispatch_vector( + dot_ns::init_dot_dispatch_vector( dot_dispatch_vector); - auto dot_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return dot_ext::dot_func(exec_q, src1, src2, dst, depends, - dot_dispatch_vector); + auto dot_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { + return dot_ns::dot_func(exec_q, src1, src2, dst, depends, + dot_dispatch_vector); }; m.def("_dot", dot_pyapi, @@ -80,14 +81,15 @@ PYBIND11_MODULE(_blas_impl, m) } { - 
dot_ext::init_dot_dispatch_vector( + dot_ns::init_dot_dispatch_vector( dotc_dispatch_vector); - auto dotc_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return dot_ext::dot_func(exec_q, src1, src2, dst, depends, - dotc_dispatch_vector); + auto dotc_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { + return dot_ns::dot_func(exec_q, src1, src2, dst, depends, + dotc_dispatch_vector); }; m.def("_dotc", dotc_pyapi, @@ -99,14 +101,15 @@ PYBIND11_MODULE(_blas_impl, m) } { - dot_ext::init_dot_dispatch_vector( + dot_ns::init_dot_dispatch_vector( dotu_dispatch_vector); - auto dotu_pyapi = [&](sycl::queue exec_q, arrayT src1, arrayT src2, - arrayT dst, const event_vecT &depends = {}) { - return dot_ext::dot_func(exec_q, src1, src2, dst, depends, - dotu_dispatch_vector); + auto dotu_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { + return dot_ns::dot_func(exec_q, src1, src2, dst, depends, + dotu_dispatch_vector); }; m.def("_dotu", dotu_pyapi, @@ -117,7 +120,7 @@ PYBIND11_MODULE(_blas_impl, m) } { - m.def("_gemm", &blas_ext::gemm, + m.def("_gemm", &blas_ns::gemm, "Call `gemm` from OneMKL BLAS library to compute " "the matrix-matrix product with 2-D matrices.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("matrixB"), @@ -125,7 +128,7 @@ PYBIND11_MODULE(_blas_impl, m) } { - m.def("_gemm_batch", &blas_ext::gemm_batch, + m.def("_gemm_batch", &blas_ns::gemm_batch, "Call `gemm_batch` from OneMKL BLAS library to compute " "the matrix-matrix product for a batch of 2-D matrices.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("matrixB"), @@ -133,7 +136,7 @@ PYBIND11_MODULE(_blas_impl, m) } { - m.def("_gemv", &blas_ext::gemv, + m.def("_gemv", &blas_ns::gemv, "Call `gemv` from OneMKL BLAS library to compute " "the matrix-vector product with a general 
matrix.", py::arg("sycl_queue"), py::arg("matrixA"), py::arg("vectorX"), diff --git a/dpnp/backend/extensions/blas/dot.hpp b/dpnp/backend/extensions/blas/dot.hpp index 7e665b1f74d..e700f983097 100644 --- a/dpnp/backend/extensions/blas/dot.hpp +++ b/dpnp/backend/extensions/blas/dot.hpp @@ -27,13 +27,7 @@ #include "dot_common.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace type_utils = dpctl::tensor::type_utils; @@ -41,17 +35,17 @@ namespace type_utils = dpctl::tensor::type_utils; template static sycl::event dot_impl(sycl::queue &exec_q, const std::int64_t n, - char *vectorX, + const char *vectorX, const std::int64_t incx, - char *vectorY, + const char *vectorY, const std::int64_t incy, char *result, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); - T *x = reinterpret_cast(vectorX); - T *y = reinterpret_cast(vectorY); + const T *x = reinterpret_cast(vectorX); + const T *y = reinterpret_cast(vectorY); T *res = reinterpret_cast(result); std::stringstream error_msg; @@ -99,7 +93,4 @@ struct DotContigFactory } } }; -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/dot_common.hpp b/dpnp/backend/extensions/blas/dot_common.hpp index 15e7c694f74..4ee8201338c 100644 --- a/dpnp/backend/extensions/blas/dot_common.hpp +++ b/dpnp/backend/extensions/blas/dot_common.hpp @@ -36,21 +36,13 @@ #include "types_matrix.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas -{ -namespace dot +namespace dpnp::extensions::blas::dot { typedef sycl::event (*dot_impl_fn_ptr_t)(sycl::queue &, const std::int64_t, - char *, + const char *, const std::int64_t, - char *, + const char *, const std::int64_t, char *, const std::vector &); @@ -61,9 +53,9 @@ namespace py = pybind11; template std::pair dot_func(sycl::queue 
&exec_q, - dpctl::tensor::usm_ndarray vectorX, - dpctl::tensor::usm_ndarray vectorY, - dpctl::tensor::usm_ndarray result, + const dpctl::tensor::usm_ndarray &vectorX, + const dpctl::tensor::usm_ndarray &vectorY, + const dpctl::tensor::usm_ndarray &result, const std::vector &depends, const dispatchT &dot_dispatch_vector) { @@ -109,22 +101,22 @@ std::pair "USM allocations are not compatible with the execution queue."); } - size_t src_nelems = 1; + const int src_nelems = 1; dpctl::tensor::validation::CheckWritable::throw_if_not_writable(result); dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(result, src_nelems); - py::ssize_t x_size = vectorX.get_size(); - py::ssize_t y_size = vectorY.get_size(); + const py::ssize_t x_size = vectorX.get_size(); + const py::ssize_t y_size = vectorY.get_size(); const std::int64_t n = x_size; if (x_size != y_size) { throw py::value_error("The size of the first input array must be " "equal to the size of the second input array."); } - int vectorX_typenum = vectorX.get_typenum(); - int vectorY_typenum = vectorY.get_typenum(); - int result_typenum = result.get_typenum(); + const int vectorX_typenum = vectorX.get_typenum(); + const int vectorY_typenum = vectorY.get_typenum(); + const int result_typenum = result.get_typenum(); if (result_typenum != vectorX_typenum || result_typenum != vectorY_typenum) { @@ -132,7 +124,7 @@ std::pair } auto array_types = dpctl_td_ns::usm_ndarray_types(); - int type_id = array_types.typenum_to_lookup_id(vectorX_typenum); + const int type_id = array_types.typenum_to_lookup_id(vectorX_typenum); dot_impl_fn_ptr_t dot_fn = dot_dispatch_vector[type_id]; if (dot_fn == nullptr) { @@ -144,8 +136,8 @@ std::pair char *y_typeless_ptr = vectorY.get_data(); char *r_typeless_ptr = result.get_data(); - std::vector x_stride = vectorX.get_strides_vector(); - std::vector y_stride = vectorY.get_strides_vector(); + const std::vector x_stride = vectorX.get_strides_vector(); + const std::vector y_stride = 
vectorY.get_strides_vector(); const int x_elemsize = vectorX.get_elemsize(); const int y_elemsize = vectorY.get_elemsize(); @@ -184,8 +176,4 @@ void init_dot_dispatch_vector(dispatchT dot_dispatch_vector[]) contig; contig.populate_dispatch_vector(dot_dispatch_vector); } -} // namespace dot -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas::dot diff --git a/dpnp/backend/extensions/blas/dotc.hpp b/dpnp/backend/extensions/blas/dotc.hpp index 8ca78c20343..417c832bf06 100644 --- a/dpnp/backend/extensions/blas/dotc.hpp +++ b/dpnp/backend/extensions/blas/dotc.hpp @@ -27,13 +27,7 @@ #include "dot_common.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace type_utils = dpctl::tensor::type_utils; @@ -41,17 +35,17 @@ namespace type_utils = dpctl::tensor::type_utils; template static sycl::event dotc_impl(sycl::queue &exec_q, const std::int64_t n, - char *vectorX, + const char *vectorX, const std::int64_t incx, - char *vectorY, + const char *vectorY, const std::int64_t incy, char *result, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); - T *x = reinterpret_cast(vectorX); - T *y = reinterpret_cast(vectorY); + const T *x = reinterpret_cast(vectorX); + const T *y = reinterpret_cast(vectorY); T *res = reinterpret_cast(result); std::stringstream error_msg; @@ -100,7 +94,4 @@ struct DotcContigFactory } }; -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/dotu.hpp b/dpnp/backend/extensions/blas/dotu.hpp index 832e99fff5e..51c30735d22 100644 --- a/dpnp/backend/extensions/blas/dotu.hpp +++ b/dpnp/backend/extensions/blas/dotu.hpp @@ -27,13 +27,7 @@ #include "dot_common.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace 
dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace type_utils = dpctl::tensor::type_utils; @@ -41,17 +35,17 @@ namespace type_utils = dpctl::tensor::type_utils; template static sycl::event dotu_impl(sycl::queue &exec_q, const std::int64_t n, - char *vectorX, + const char *vectorX, const std::int64_t incx, - char *vectorY, + const char *vectorY, const std::int64_t incy, char *result, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); - T *x = reinterpret_cast(vectorX); - T *y = reinterpret_cast(vectorY); + const T *x = reinterpret_cast(vectorX); + const T *y = reinterpret_cast(vectorY); T *res = reinterpret_cast(result); std::stringstream error_msg; @@ -99,7 +93,4 @@ struct DotuContigFactory } } }; -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/gemm.cpp b/dpnp/backend/extensions/blas/gemm.cpp index c1005f797b1..f47f8ebe7ae 100644 --- a/dpnp/backend/extensions/blas/gemm.cpp +++ b/dpnp/backend/extensions/blas/gemm.cpp @@ -35,13 +35,7 @@ #include "dpnp_utils.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace py = pybind11; @@ -53,13 +47,13 @@ typedef sycl::event (*gemm_impl_fn_ptr_t)(sycl::queue &, const std::int64_t, const std::int64_t, const std::int64_t, - char *, + const char *, const std::int64_t, - char *, + const char *, const std::int64_t, char *, const std::int64_t, - bool, + const bool, const std::vector &); static gemm_impl_fn_ptr_t gemm_dispatch_table[dpctl_td_ns::num_types] @@ -72,20 +66,20 @@ static sycl::event gemm_impl(sycl::queue &exec_q, const std::int64_t m, const std::int64_t n, const std::int64_t k, - char *matrixA, + const char *matrixA, const std::int64_t lda, - char *matrixB, + const char *matrixB, const std::int64_t ldb, char *resultC, const std::int64_t ldc, - bool 
is_row_major, + const bool is_row_major, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); type_utils::validate_type_for_device(exec_q); - Tab *a = reinterpret_cast(matrixA); - Tab *b = reinterpret_cast(matrixB); + const Tab *a = reinterpret_cast(matrixA); + const Tab *b = reinterpret_cast(matrixB); Tc *res = reinterpret_cast(resultC); std::stringstream error_msg; @@ -95,10 +89,10 @@ static sycl::event gemm_impl(sycl::queue &exec_q, try { auto gemm_func = [&](sycl::queue &q, oneapi::mkl::transpose transA, - oneapi::mkl::transpose transB, std::int64_t m, std::int64_t n, - std::int64_t k, Tab alpha, const Tab *a, std::int64_t lda, - const Tab *b, std::int64_t ldb, Tab beta, Tc *c, - std::int64_t ldc, + oneapi::mkl::transpose transB, const std::int64_t m, + const std::int64_t n, const std::int64_t k, Tab alpha, + const Tab *a, const std::int64_t lda, const Tab *b, + const std::int64_t ldb, Tab beta, Tc *c, const std::int64_t ldc, const std::vector &deps) -> sycl::event { if (is_row_major) { return mkl_blas::row_major::gemm(q, transA, transB, m, n, k, @@ -152,9 +146,9 @@ static sycl::event gemm_impl(sycl::queue &exec_q, std::tuple gemm(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray matrixB, - dpctl::tensor::usm_ndarray resultC, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &matrixB, + const dpctl::tensor::usm_ndarray &resultC, const std::vector &depends) { const int matrixA_nd = matrixA.get_ndim(); @@ -204,17 +198,17 @@ std::tuple "the number of columns in result array."); } - size_t src_nelems = m * n; + const std::size_t src_nelems = m * n; dpctl::tensor::validation::CheckWritable::throw_if_not_writable(resultC); dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(resultC, src_nelems); - bool is_matrixA_f_contig = matrixA.is_f_contiguous(); - bool is_matrixB_f_contig = matrixB.is_f_contiguous(); - bool is_resultC_f_contig = resultC.is_f_contiguous(); - bool 
is_matrixA_c_contig = matrixA.is_c_contiguous(); - bool is_matrixB_c_contig = matrixB.is_c_contiguous(); - bool is_resultC_c_contig = resultC.is_c_contiguous(); + const bool is_matrixA_f_contig = matrixA.is_f_contiguous(); + const bool is_matrixB_f_contig = matrixB.is_f_contiguous(); + const bool is_resultC_f_contig = resultC.is_f_contiguous(); + const bool is_matrixA_c_contig = matrixA.is_c_contiguous(); + const bool is_matrixB_c_contig = matrixB.is_c_contiguous(); + const bool is_resultC_c_contig = resultC.is_c_contiguous(); if (!is_matrixA_f_contig and !is_matrixA_c_contig) { throw py::value_error( @@ -267,17 +261,19 @@ std::tuple } const std::int64_t ldc = is_row_major ? n : m; - int matrixA_typenum = matrixA.get_typenum(); - int matrixB_typenum = matrixB.get_typenum(); - int resultC_typenum = resultC.get_typenum(); + const int matrixA_typenum = matrixA.get_typenum(); + const int matrixB_typenum = matrixB.get_typenum(); + const int resultC_typenum = resultC.get_typenum(); if (matrixA_typenum != matrixB_typenum) { throw py::value_error("matrixA and matrixB must be of the same type."); } auto array_types = dpctl_td_ns::usm_ndarray_types(); - int matrixAB_type_id = array_types.typenum_to_lookup_id(matrixA_typenum); - int resultC_type_id = array_types.typenum_to_lookup_id(resultC_typenum); + const int matrixAB_type_id = + array_types.typenum_to_lookup_id(matrixA_typenum); + const int resultC_type_id = + array_types.typenum_to_lookup_id(resultC_typenum); gemm_impl_fn_ptr_t gemm_fn = gemm_dispatch_table[matrixAB_type_id][resultC_type_id]; @@ -286,8 +282,8 @@ std::tuple "Types of input matrices and result matrix are mismatched."); } - char *a_typeless_ptr = matrixA.get_data(); - char *b_typeless_ptr = matrixB.get_data(); + const char *a_typeless_ptr = matrixA.get_data(); + const char *b_typeless_ptr = matrixB.get_data(); char *r_typeless_ptr = resultC.get_data(); sycl::event gemm_ev = gemm_fn(exec_q, transA, transB, m, n, k, @@ -321,7 +317,4 @@ void 
init_gemm_dispatch_table(void) contig; contig.populate_dispatch_table(gemm_dispatch_table); } -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/gemm.hpp b/dpnp/backend/extensions/blas/gemm.hpp index 6e3a5840269..ee14400ae25 100644 --- a/dpnp/backend/extensions/blas/gemm.hpp +++ b/dpnp/backend/extensions/blas/gemm.hpp @@ -25,36 +25,27 @@ #pragma once -#include #include +#include #include -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { extern std::tuple gemm(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray matrixB, - dpctl::tensor::usm_ndarray resultC, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &matrixB, + const dpctl::tensor::usm_ndarray &resultC, const std::vector &depends); extern std::tuple gemm_batch(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray matrixB, - dpctl::tensor::usm_ndarray resultC, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &matrixB, + const dpctl::tensor::usm_ndarray &resultC, const std::vector &depends); extern void init_gemm_dispatch_table(void); extern void init_gemm_batch_dispatch_table(void); -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/gemm_batch.cpp b/dpnp/backend/extensions/blas/gemm_batch.cpp index 689ef77b786..640cc779120 100644 --- a/dpnp/backend/extensions/blas/gemm_batch.cpp +++ b/dpnp/backend/extensions/blas/gemm_batch.cpp @@ -35,13 +35,7 @@ #include "dpnp_utils.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace py = pybind11; @@ -56,15 +50,15 @@ typedef sycl::event 
(*gemm_batch_impl_fn_ptr_t)( const std::int64_t, const std::int64_t, const std::int64_t, - size_t, - size_t, - size_t, + const std::int64_t, + const std::int64_t, + const std::int64_t, oneapi::mkl::transpose, oneapi::mkl::transpose, + const char *, + const char *, char *, - char *, - char *, - bool, + const bool, const std::vector &); static gemm_batch_impl_fn_ptr_t @@ -79,22 +73,22 @@ static sycl::event gemm_batch_impl(sycl::queue &exec_q, const std::int64_t lda, const std::int64_t ldb, const std::int64_t ldc, - size_t stridea, - size_t strideb, - size_t stridec, + const std::int64_t stridea, + const std::int64_t strideb, + const std::int64_t stridec, oneapi::mkl::transpose transA, oneapi::mkl::transpose transB, - char *matrixA, - char *matrixB, + const char *matrixA, + const char *matrixB, char *resultC, - bool is_row_major, + const bool is_row_major, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); type_utils::validate_type_for_device(exec_q); - Tab *a = reinterpret_cast(matrixA); - Tab *b = reinterpret_cast(matrixB); + const Tab *a = reinterpret_cast(matrixA); + const Tab *b = reinterpret_cast(matrixB); Tc *res = reinterpret_cast(resultC); std::stringstream error_msg; @@ -104,11 +98,13 @@ static sycl::event gemm_batch_impl(sycl::queue &exec_q, try { auto gemm_batch_func = [&](sycl::queue &q, oneapi::mkl::transpose transA, - oneapi::mkl::transpose transB, std::int64_t m, std::int64_t n, - std::int64_t k, Tab alpha, const Tab *a, std::int64_t lda, - std::int64_t stridea, const Tab *b, std::int64_t ldb, - std::int64_t strideb, Tab beta, Tc *c, std::int64_t ldc, - std::int64_t stridec, std::int64_t batch_size, + oneapi::mkl::transpose transB, const std::int64_t m, + const std::int64_t n, const std::int64_t k, Tab alpha, + const Tab *a, const std::int64_t lda, + const std::int64_t stridea, const Tab *b, + const std::int64_t ldb, const std::int64_t strideb, Tab beta, + Tc *c, const std::int64_t ldc, const std::int64_t stridec, + const 
std::int64_t batch_size, const std::vector &deps) -> sycl::event { if (is_row_major) { return mkl_blas::row_major::gemm_batch( @@ -172,9 +168,10 @@ void standardize_strides_to_nonzero(std::vector &strides, // When shape of an array along any particular dimension is 1, the stride // along that dimension is undefined. This function standardize the strides // by calculating the non-zero value of the strides. - std::size_t ndim = strides.size(); - bool has_zero_stride = std::accumulate(strides.begin(), strides.end(), 1, - std::multiplies{}) == 0; + const std::size_t ndim = strides.size(); + const bool has_zero_stride = + std::accumulate(strides.begin(), strides.end(), 1, + std::multiplies{}) == 0; if (has_zero_stride) { for (std::size_t i = 0; i < ndim - 1; ++i) { @@ -196,9 +193,9 @@ void standardize_strides_to_zero(std::vector &strides, // instead of copying the array into the additional dimension for batch // multiplication, we choose to use zero as the stride between different // matrices. Therefore, the same array is used repeatedly. 
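The comment above describes the stride-standardization step used before batched GEMM: when a dimension has extent 1 its stride is undefined (often zero), and the helper rewrites it to the value a C-contiguous layout would have. A Python sketch of that logic (a hypothetical mirror of `standardize_strides_to_nonzero`; the loop body past `ndim - 1` is not fully visible in the hunk, so the trailing-product fill is an assumption):

```python
from functools import reduce
from operator import mul


def standardize_strides_to_nonzero(strides, shape):
    """Replace undefined (zero) strides of size-1 dimensions with the
    values a C-contiguous array of `shape` would have."""
    ndim = len(strides)
    # any zero factor means at least one stride is undefined
    has_zero_stride = reduce(mul, strides, 1) == 0
    if not has_zero_stride:
        return list(strides)
    out = list(strides)
    for i in range(ndim - 1):
        # stride of axis i spans all elements of the trailing axes
        out[i] = reduce(mul, shape[i + 1:], 1)
    out[-1] = 1
    return out
```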
- std::size_t ndim = strides.size(); + const std::size_t ndim = strides.size(); - for (size_t i = 0; i < ndim; ++i) { + for (std::size_t i = 0; i < ndim; ++i) { if (shape[i] <= 1) { strides[i] = 0; } @@ -207,9 +204,9 @@ void standardize_strides_to_zero(std::vector &strides, std::tuple gemm_batch(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray matrixB, - dpctl::tensor::usm_ndarray resultC, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &matrixB, + const dpctl::tensor::usm_ndarray &resultC, const std::vector &depends = {}) { const int matrixA_nd = matrixA.get_ndim(); @@ -257,7 +254,7 @@ std::tuple throw py::value_error("The number of columns in B must be equal to " "the number of columns in result array."); } - std::int64_t src_nelems = batch_size * m * n; + const std::int64_t src_nelems = batch_size * m * n; dpctl::tensor::validation::CheckWritable::throw_if_not_writable(resultC); dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(resultC, src_nelems); @@ -274,8 +271,10 @@ std::tuple standardize_strides_to_nonzero(a_stride, a_shape); standardize_strides_to_nonzero(b_stride, b_shape); - bool A_base_is_f_contig = a_stride[1] == 1 && a_stride[2] == a_shape[1]; - bool B_base_is_f_contig = b_stride[1] == 1 && b_stride[2] == b_shape[1]; + const bool A_base_is_f_contig = + a_stride[1] == 1 && a_stride[2] == a_shape[1]; + const bool B_base_is_f_contig = + b_stride[1] == 1 && b_stride[2] == b_shape[1]; bool is_row_major = true; if (A_base_is_f_contig && B_base_is_f_contig) { @@ -317,17 +316,19 @@ std::tuple } const std::int64_t ldc = is_row_major ? 
n : m; - int matrixA_typenum = matrixA.get_typenum(); - int matrixB_typenum = matrixB.get_typenum(); - int resultC_typenum = resultC.get_typenum(); + const int matrixA_typenum = matrixA.get_typenum(); + const int matrixB_typenum = matrixB.get_typenum(); + const int resultC_typenum = resultC.get_typenum(); if (matrixA_typenum != matrixB_typenum) { throw py::value_error("matrixA and matrixB must be of the same type."); } auto array_types = dpctl_td_ns::usm_ndarray_types(); - int matrixAB_type_id = array_types.typenum_to_lookup_id(matrixA_typenum); - int resultC_type_id = array_types.typenum_to_lookup_id(resultC_typenum); + const int matrixAB_type_id = + array_types.typenum_to_lookup_id(matrixA_typenum); + const int resultC_type_id = + array_types.typenum_to_lookup_id(resultC_typenum); gemm_batch_impl_fn_ptr_t gemm_batch_fn = gemm_batch_dispatch_table[matrixAB_type_id][resultC_type_id]; @@ -336,8 +337,8 @@ std::tuple "Types of input matrices and result matrix are mismatched."); } - char *a_typeless_ptr = matrixA.get_data(); - char *b_typeless_ptr = matrixB.get_data(); + const char *a_typeless_ptr = matrixA.get_data(); + const char *b_typeless_ptr = matrixB.get_data(); char *r_typeless_ptr = resultC.get_data(); sycl::event gemm_batch_ev = @@ -374,7 +375,4 @@ void init_gemm_batch_dispatch_table(void) contig; contig.populate_dispatch_table(gemm_batch_dispatch_table); } -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/gemv.cpp b/dpnp/backend/extensions/blas/gemv.cpp index c325299aa03..7104c9023f8 100644 --- a/dpnp/backend/extensions/blas/gemv.cpp +++ b/dpnp/backend/extensions/blas/gemv.cpp @@ -35,13 +35,7 @@ #include "dpnp_utils.hpp" -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { namespace mkl_blas = oneapi::mkl::blas; namespace py = pybind11; @@ -51,13 +45,13 @@ typedef sycl::event 
(*gemv_impl_fn_ptr_t)(sycl::queue &, oneapi::mkl::transpose, const std::int64_t, const std::int64_t, - char *, + const char *, const std::int64_t, - char *, + const char *, const std::int64_t, char *, const std::int64_t, - bool, + const bool, const std::vector &); static gemv_impl_fn_ptr_t gemv_dispatch_vector[dpctl_td_ns::num_types]; @@ -67,19 +61,19 @@ static sycl::event gemv_impl(sycl::queue &exec_q, oneapi::mkl::transpose transA, const std::int64_t m, const std::int64_t n, - char *matrixA, + const char *matrixA, const std::int64_t lda, - char *vectorX, + const char *vectorX, const std::int64_t incx, char *vectorY, const std::int64_t incy, - bool is_row_major, + const bool is_row_major, const std::vector &depends) { type_utils::validate_type_for_device(exec_q); - T *a = reinterpret_cast(matrixA); - T *x = reinterpret_cast(vectorX); + const T *a = reinterpret_cast(matrixA); + const T *x = reinterpret_cast(vectorX); T *y = reinterpret_cast(vectorY); std::stringstream error_msg; @@ -88,9 +82,10 @@ static sycl::event gemv_impl(sycl::queue &exec_q, sycl::event gemv_event; try { auto gemv_func = - [&](sycl::queue &q, oneapi::mkl::transpose transA, std::int64_t m, - std::int64_t n, T alpha, const T *a, std::int64_t lda, - const T *x, std::int64_t incx, T beta, T *y, std::int64_t incy, + [&](sycl::queue &q, oneapi::mkl::transpose transA, + const std::int64_t m, const std::int64_t n, T alpha, const T *a, + const std::int64_t lda, const T *x, const std::int64_t incx, + T beta, T *y, const std::int64_t incy, const std::vector &deps) -> sycl::event { if (is_row_major) { return mkl_blas::row_major::gemv(q, transA, m, n, alpha, a, lda, @@ -141,10 +136,10 @@ static sycl::event gemv_impl(sycl::queue &exec_q, std::pair gemv(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray vectorX, - dpctl::tensor::usm_ndarray vectorY, - bool transpose, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &vectorX, + const 
dpctl::tensor::usm_ndarray &vectorY, + const bool transpose, const std::vector &depends) { const int matrixA_nd = matrixA.get_ndim(); @@ -173,8 +168,8 @@ std::pair "USM allocations are not compatible with the execution queue."); } - bool is_matrixA_f_contig = matrixA.is_f_contiguous(); - bool is_matrixA_c_contig = matrixA.is_c_contiguous(); + const bool is_matrixA_f_contig = matrixA.is_f_contiguous(); + const bool is_matrixA_c_contig = matrixA.is_c_contiguous(); if (!is_matrixA_f_contig and !is_matrixA_c_contig) { throw py::value_error( @@ -194,7 +189,7 @@ std::pair const std::int64_t lda = is_row_major ? n : m; oneapi::mkl::transpose transA; - size_t src_nelems; + std::size_t src_nelems; if (transpose) { transA = oneapi::mkl::transpose::T; src_nelems = n; @@ -223,9 +218,9 @@ std::pair dpctl::tensor::validation::AmpleMemory::throw_if_not_ample(vectorY, src_nelems); - int matrixA_typenum = matrixA.get_typenum(); - int vectorX_typenum = vectorX.get_typenum(); - int vectorY_typenum = vectorY.get_typenum(); + const int matrixA_typenum = matrixA.get_typenum(); + const int vectorX_typenum = vectorX.get_typenum(); + const int vectorY_typenum = vectorY.get_typenum(); if (matrixA_typenum != vectorX_typenum || matrixA_typenum != vectorY_typenum) { @@ -233,7 +228,7 @@ std::pair } auto array_types = dpctl_td_ns::usm_ndarray_types(); - int type_id = array_types.typenum_to_lookup_id(matrixA_typenum); + const int type_id = array_types.typenum_to_lookup_id(matrixA_typenum); gemv_impl_fn_ptr_t gemv_fn = gemv_dispatch_vector[type_id]; if (gemv_fn == nullptr) { @@ -245,8 +240,8 @@ std::pair char *x_typeless_ptr = vectorX.get_data(); char *y_typeless_ptr = vectorY.get_data(); - std::vector x_stride = vectorX.get_strides_vector(); - std::vector y_stride = vectorY.get_strides_vector(); + const std::vector x_stride = vectorX.get_strides_vector(); + const std::vector y_stride = vectorY.get_strides_vector(); const int x_elemsize = vectorX.get_elemsize(); const int y_elemsize = 
vectorY.get_elemsize(); const std::int64_t incx = x_stride[0]; @@ -289,7 +284,4 @@ void init_gemv_dispatch_vector(void) contig; contig.populate_dispatch_vector(gemv_dispatch_vector); } -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/gemv.hpp b/dpnp/backend/extensions/blas/gemv.hpp index 703f9c4cc0a..bb5aff87748 100644 --- a/dpnp/backend/extensions/blas/gemv.hpp +++ b/dpnp/backend/extensions/blas/gemv.hpp @@ -25,38 +25,29 @@ #pragma once -#include #include +#include #include -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas +namespace dpnp::extensions::blas { extern std::pair gemv(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray vectorX, - dpctl::tensor::usm_ndarray vectorY, - bool transpose, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &vectorX, + const dpctl::tensor::usm_ndarray &vectorY, + const bool transpose, const std::vector &depends); extern std::pair gemv_batch(sycl::queue &exec_q, - dpctl::tensor::usm_ndarray matrixA, - dpctl::tensor::usm_ndarray vectorX, - dpctl::tensor::usm_ndarray vectorY, - bool transpose, + const dpctl::tensor::usm_ndarray &matrixA, + const dpctl::tensor::usm_ndarray &vectorX, + const dpctl::tensor::usm_ndarray &vectorY, + const bool transpose, const std::vector &depends); extern void init_gemv_dispatch_vector(void); extern void init_gemv_batch_dispatch_vector(void); -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas diff --git a/dpnp/backend/extensions/blas/types_matrix.hpp b/dpnp/backend/extensions/blas/types_matrix.hpp index a33fa42b971..1d9bf637780 100644 --- a/dpnp/backend/extensions/blas/types_matrix.hpp +++ b/dpnp/backend/extensions/blas/types_matrix.hpp @@ -33,15 +33,7 @@ // dpctl namespace for operations with types namespace dpctl_td_ns = 
dpctl::tensor::type_dispatch; -namespace dpnp -{ -namespace backend -{ -namespace ext -{ -namespace blas -{ -namespace types +namespace dpnp::extensions::blas::types { /** * @brief A factory to define pairs of supported types for which @@ -190,8 +182,4 @@ struct GemvTypePairSupportFactory // fall-through dpctl_td_ns::NotDefinedEntry>::is_defined; }; -} // namespace types -} // namespace blas -} // namespace ext -} // namespace backend -} // namespace dpnp +} // namespace dpnp::extensions::blas::types From 96a2e4111fe025d66d4a6b795cd6ff5f45de8afd Mon Sep 17 00:00:00 2001 From: vlad-perevezentsev Date: Mon, 17 Jun 2024 14:34:33 +0200 Subject: [PATCH 29/49] Support usm_ndarray batched input for dpnp.linalg (#1880) * Add usm_ndarray input support for linalg * Add test_usm_ndarray_input_batch to test_linalg.py * Add usm_ndarray input support for dpnp_iface_linearalgebra * Add test_usm_ndarray_linearalgebra_batch to test_linalg.py * Apply comments --------- Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com> --- dpnp/dpnp_iface_linearalgebra.py | 12 ++--- dpnp/linalg/dpnp_iface_linalg.py | 6 +-- dpnp/linalg/dpnp_utils_linalg.py | 20 ++++---- tests/test_linalg.py | 81 ++++++++++++++++++++++++++++++++ 4 files changed, 100 insertions(+), 19 deletions(-) diff --git a/dpnp/dpnp_iface_linearalgebra.py b/dpnp/dpnp_iface_linearalgebra.py index f674c96040a..aef9203746f 100644 --- a/dpnp/dpnp_iface_linearalgebra.py +++ b/dpnp/dpnp_iface_linearalgebra.py @@ -892,13 +892,13 @@ def outer(a, b, out=None): dpnp.check_supported_arrays_type(a, b, scalar_type=True, all_scalars=False) if dpnp.isscalar(a): x1 = a - x2 = b.ravel()[None, :] + x2 = dpnp.ravel(b)[None, :] elif dpnp.isscalar(b): - x1 = a.ravel()[:, None] + x1 = dpnp.ravel(a)[:, None] x2 = b else: - x1 = a.ravel() - x2 = b.ravel() + x1 = dpnp.ravel(a) + x2 = dpnp.ravel(b) return dpnp.multiply.outer(x1, x2, out=out) @@ -1056,8 +1056,8 @@ def tensordot(a, b, axes=2): newshape_b = (n1, n2) oldb = [b_shape[axis] for 
axis in notin] - at = a.transpose(newaxes_a).reshape(newshape_a) - bt = b.transpose(newaxes_b).reshape(newshape_b) + at = dpnp.transpose(a, newaxes_a).reshape(newshape_a) + bt = dpnp.transpose(b, newaxes_b).reshape(newshape_b) res = dpnp.matmul(at, bt) return res.reshape(olda + oldb) diff --git a/dpnp/linalg/dpnp_iface_linalg.py b/dpnp/linalg/dpnp_iface_linalg.py index 72d79ad329d..5342daa1758 100644 --- a/dpnp/linalg/dpnp_iface_linalg.py +++ b/dpnp/linalg/dpnp_iface_linalg.py @@ -1354,7 +1354,7 @@ def tensorinv(a, ind=2): old_shape = a.shape inv_shape = old_shape[ind:] + old_shape[:ind] prod = numpy.prod(old_shape[ind:]) - a = a.reshape(prod, -1) + a = dpnp.reshape(a, (prod, -1)) a_inv = inv(a) return a_inv.reshape(*inv_shape) @@ -1428,7 +1428,7 @@ def tensorsolve(a, b, axes=None): "prod(a.shape[b.ndim:]) == prod(a.shape[:b.ndim])" ) - a = a.reshape(-1, prod) - b = b.ravel() + a = dpnp.reshape(a, (-1, prod)) + b = dpnp.ravel(b) res = solve(a, b) return res.reshape(old_shape) diff --git a/dpnp/linalg/dpnp_utils_linalg.py b/dpnp/linalg/dpnp_utils_linalg.py index 22aa396c7fe..10e297937ee 100644 --- a/dpnp/linalg/dpnp_utils_linalg.py +++ b/dpnp/linalg/dpnp_utils_linalg.py @@ -99,7 +99,7 @@ def _batched_eigh(a, UPLO, eigen_mode, w_type, v_type): is_cpu_device = a.sycl_device.has_aspect_cpu orig_shape = a.shape # get 3d input array by reshape - a = a.reshape(-1, orig_shape[-2], orig_shape[-1]) + a = dpnp.reshape(a, (-1, orig_shape[-2], orig_shape[-1])) a_usm_arr = dpnp.get_usm_ndarray(a) # allocate a memory for dpnp array of eigenvalues @@ -191,7 +191,7 @@ def _batched_inv(a, res_type): orig_shape = a.shape # get 3d input arrays by reshape - a = a.reshape(-1, orig_shape[-2], orig_shape[-1]) + a = dpnp.reshape(a, (-1, orig_shape[-2], orig_shape[-1])) batch_size = a.shape[0] a_usm_arr = dpnp.get_usm_ndarray(a) a_sycl_queue = a.sycl_queue @@ -280,11 +280,11 @@ def _batched_solve(a, b, exec_q, res_usm_type, res_type): if a.ndim > 3: # get 3d input arrays by reshape if 
a.ndim == b.ndim: - b = b.reshape(-1, b_shape[-2], b_shape[-1]) + b = dpnp.reshape(b, (-1, b_shape[-2], b_shape[-1])) else: - b = b.reshape(-1, b_shape[-1]) + b = dpnp.reshape(b, (-1, b_shape[-1])) - a = a.reshape(-1, a_shape[-2], a_shape[-1]) + a = dpnp.reshape(a, (-1, a_shape[-2], a_shape[-1])) a_usm_arr = dpnp.get_usm_ndarray(a) b_usm_arr = dpnp.get_usm_ndarray(b) @@ -386,7 +386,7 @@ def _batched_qr(a, mode="reduced"): a_sycl_queue = a.sycl_queue # get 3d input arrays by reshape - a = a.reshape(-1, m, n) + a = dpnp.reshape(a, (-1, m, n)) a = a.swapaxes(-2, -1) a_usm_arr = dpnp.get_usm_ndarray(a) @@ -537,7 +537,7 @@ def _batched_svd( if a.ndim > 3: # get 3d input arrays by reshape - a = a.reshape(prod(a.shape[:-2]), a.shape[-2], a.shape[-1]) + a = dpnp.reshape(a, (prod(a.shape[:-2]), a.shape[-2], a.shape[-1])) reshape = True batch_size = a.shape[0] @@ -830,7 +830,7 @@ def _lu_factor(a, res_type): if a.ndim > 2: orig_shape = a.shape # get 3d input arrays by reshape - a = a.reshape(-1, n, n) + a = dpnp.reshape(a, (-1, n, n)) batch_size = a.shape[0] a_usm_arr = dpnp.get_usm_ndarray(a) @@ -1743,7 +1743,7 @@ def dpnp_cholesky_batch(a, upper_lower, res_type): orig_shape = a.shape # get 3d input arrays by reshape - a = a.reshape(-1, n, n) + a = dpnp.reshape(a, (-1, n, n)) batch_size = a.shape[0] a_usm_arr = dpnp.get_usm_ndarray(a) @@ -2171,7 +2171,7 @@ def dpnp_matrix_power(a, n): # `result` will hold the final matrix power, # while `acc` serves as an accumulator for the intermediate matrix powers. 
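The comment above refers to the repeated-squaring loop in `dpnp_matrix_power`: `result` holds the answer so far and `acc` the successive squarings, so only O(log n) multiplications are needed. A self-contained sketch of that loop with a plain list-of-lists multiply standing in for `dpnp.matmul` (illustrative, not the actual dpnp code path):

```python
def matmul2(x, y):
    # tiny list-of-lists matrix multiply standing in for dpnp.matmul
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def matrix_power(a, n, matmul=matmul2):
    """Raise square matrix `a` to integer power n >= 1 by binary
    exponentiation: consume one bit of n per iteration, squaring `acc`
    and folding it into `result` when the bit is set."""
    result = None
    acc = a
    while n > 0:
        n, bit = divmod(n, 2)
        if bit:
            result = acc if result is None else matmul(result, acc)
        if n > 0:
            acc = matmul(acc, acc)  # next power of two of `a`
    return result
```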
result = None - acc = a.copy() + acc = dpnp.copy(a) while n > 0: n, bit = divmod(n, 2) if bit: diff --git a/tests/test_linalg.py b/tests/test_linalg.py index 48a4891034c..b718a2cec87 100644 --- a/tests/test_linalg.py +++ b/tests/test_linalg.py @@ -57,6 +57,87 @@ def vvsort(val, vec, size, xp): vec[:, imax] = temp +# check linear algebra functions from dpnp.linalg +# with multidimensional usm_ndarray as input +@pytest.mark.parametrize( + "func, gen_kwargs, func_kwargs", + [ + pytest.param("cholesky", {"hermitian": True}, {}), + pytest.param("cond", {}, {}), + pytest.param("det", {}, {}), + pytest.param("eig", {}, {}), + pytest.param("eigh", {"hermitian": True}, {}), + pytest.param("eigvals", {}, {}), + pytest.param("eigvalsh", {"hermitian": True}, {}), + pytest.param("inv", {}, {}), + pytest.param("matrix_power", {}, {"n": 4}), + pytest.param("matrix_rank", {}, {}), + pytest.param("norm", {}, {}), + pytest.param("pinv", {}, {}), + pytest.param("qr", {}, {}), + pytest.param("slogdet", {}, {}), + pytest.param("solve", {}, {}), + pytest.param("svd", {}, {}), + pytest.param("tensorinv", {}, {"ind": 1}), + pytest.param("tensorsolve", {}, {}), + ], +) +def test_usm_ndarray_linalg_batch(func, gen_kwargs, func_kwargs): + shape = ( + (2, 2, 3, 3) if func not in ["tensorinv", "tensorsolve"] else (4, 2, 2) + ) + + if func == "tensorsolve": + shape_b = (4,) + dpt_args = [ + dpt.asarray( + generate_random_numpy_array(shape, seed_value=81, **gen_kwargs) + ), + dpt.asarray( + generate_random_numpy_array( + shape_b, seed_value=81, **gen_kwargs + ) + ), + ] + elif func in ["lstsq", "solve"]: + dpt_args = [ + dpt.asarray( + generate_random_numpy_array(shape, seed_value=81, **gen_kwargs) + ) + for _ in range(2) + ] + else: + dpt_args = [ + dpt.asarray(generate_random_numpy_array(shape, **gen_kwargs)) + ] + + result = getattr(inp.linalg, func)(*dpt_args, **func_kwargs) + + if isinstance(result, tuple): + for res in result: + assert isinstance(res, inp.ndarray) + else: + assert 
isinstance(result, inp.ndarray) + + +# check linear algebra functions from dpnp +# with multidimensional usm_ndarray as input +@pytest.mark.parametrize( + "func", ["dot", "inner", "kron", "matmul", "outer", "tensordot", "vdot"] +) +def test_usm_ndarray_linearalgebra_batch(func): + shape = (2, 2, 2, 2) + + dpt_args = [ + dpt.asarray(generate_random_numpy_array(shape, seed_value=81)) + for _ in range(2) + ] + + result = getattr(inp, func)(*dpt_args) + + assert isinstance(result, inp.ndarray) + + class TestCholesky: @pytest.mark.parametrize( "array", From edcaaa592eced4c77b22034304d48f2a5aa98510 Mon Sep 17 00:00:00 2001 From: Natalia Polina Date: Wed, 19 Jun 2024 03:25:39 -0700 Subject: [PATCH 30/49] Clean up legacy linalg implementation from the backend (#1887) * Clean up legacy linalg implementation from the backend * fix pre-commit --- dpnp/backend/CMakeLists.txt | 1 - dpnp/backend/kernels/dpnp_krnl_common.cpp | 502 ------------ dpnp/backend/kernels/dpnp_krnl_linalg.cpp | 914 ---------------------- dpnp/backend/src/dpnp_fptr.hpp | 1 - dpnp/backend/src/dpnp_iface_fptr.cpp | 1 - 5 files changed, 1419 deletions(-) delete mode 100644 dpnp/backend/kernels/dpnp_krnl_linalg.cpp diff --git a/dpnp/backend/CMakeLists.txt b/dpnp/backend/CMakeLists.txt index 2ce0dfd5c04..f1f5b447772 100644 --- a/dpnp/backend/CMakeLists.txt +++ b/dpnp/backend/CMakeLists.txt @@ -30,7 +30,6 @@ set(DPNP_SRC kernels/dpnp_krnl_elemwise.cpp kernels/dpnp_krnl_fft.cpp kernels/dpnp_krnl_indexing.cpp - kernels/dpnp_krnl_linalg.cpp kernels/dpnp_krnl_logic.cpp kernels/dpnp_krnl_manipulation.cpp kernels/dpnp_krnl_mathematical.cpp diff --git a/dpnp/backend/kernels/dpnp_krnl_common.cpp b/dpnp/backend/kernels/dpnp_krnl_common.cpp index 423851e4bfd..b1d864327e6 100644 --- a/dpnp/backend/kernels/dpnp_krnl_common.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_common.cpp @@ -38,69 +38,6 @@ namespace mkl_blas_cm = oneapi::mkl::blas::column_major; namespace mkl_blas_rm = oneapi::mkl::blas::row_major; namespace mkl_lapack 
= oneapi::mkl::lapack; -template -class dpnp_astype_c_kernel; - -template -DPCTLSyclEventRef dpnp_astype_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - void *result1, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, size); - const _DataType *array_in = input1_ptr.get_ptr(); - _ResultType *result = reinterpret_cast<_ResultType *>(result1); - - if ((array_in == nullptr) || (result == nullptr)) { - return event_ref; - } - - if (size == 0) { - return event_ref; - } - - sycl::range<1> gws(size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - size_t i = global_id[0]; - result[i] = array_in[i]; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_astype_c(const void *array1_in, void *result1, const size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_astype_c<_DataType, _ResultType>( - q_ref, array1_in, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_astype_default_c)(const void *, void *, const size_t) = - dpnp_astype_c<_DataType, _ResultType>; - template @@ -521,200 +458,6 @@ DPCTLSyclEventRef (*dpnp_dot_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_dot_c<_DataType_output, _DataType_input1, _DataType_input2>; -template -DPCTLSyclEventRef dpnp_eig_c(DPCTLSyclQueueRef q_ref, - const void *array_in, - void *result1, - void *result2, - size_t size, - const DPCTLEventVectorRef 
dep_event_vec_ref) -{ - // TODO this kernel works with square 2-D array only - - // Kernel Type for calculation is double type - // because interface requires float type but calculations are expected in - // double type - - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array_in, size * size, true); - DPNPC_ptr_adapter<_ResultType> result1_ptr(q_ref, result1, size, true, - true); - DPNPC_ptr_adapter<_ResultType> result2_ptr(q_ref, result2, size * size, - true, true); - const _DataType *array = input1_ptr.get_ptr(); - _ResultType *result_val = result1_ptr.get_ptr(); - _ResultType *result_vec = result2_ptr.get_ptr(); - - double *result_val_kern = reinterpret_cast( - sycl::malloc_shared(size * sizeof(double), q)); - double *result_vec_kern = reinterpret_cast( - sycl::malloc_shared(size * size * sizeof(double), q)); - - // type conversion. Also, math library requires copy memory because override - for (size_t it = 0; it < (size * size); ++it) { - result_vec_kern[it] = - array[it]; // TODO use memcpy_c or input1_ptr(array_in, size, true) - } - - const std::int64_t lda = std::max(1UL, size); - - const std::int64_t scratchpad_size = - mkl_lapack::syevd_scratchpad_size( - q, oneapi::mkl::job::vec, oneapi::mkl::uplo::upper, size, lda); - - // https://github.com/IntelPython/dpnp/issues/1005 - // Test tests/test_linalg.py::test_eig_arange raises 2 issues in dpnp_eig_c - // on CPU - // 1. Call of mkl_lapack::syevd_scratchpad_size returns wrong value - // that causes out of memory issue. - // 2. Call of the function oneapi::mkl::lapack::syevd causes segfault. 
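The legacy `dpnp_eig_c` kernel removed in this patch finished by converting LAPACK's column-major eigenvector buffer back to a row-major result with an explicit copy-plus-transpose loop. That post-processing step can be sketched on a flat buffer like so (illustrative only; the real kernel also performed the type conversion to `double`):

```python
def transpose_copy(flat, n):
    """Copy an n*n flat buffer while swapping row/column order, as the
    legacy kernel did for LAPACK's column-major eigenvectors."""
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            out[j * n + i] = flat[i * n + j]
    return out
```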
- // Example of the command to reproduce the issues: - // SYCL_DEVICE_FILTER=cpu pytest - // tests/test_linalg.py::test_eig_arange[2-float64] High-level reason of the - // issues is numpy is imported before dpnp in third party tests. Low-level - // reason of the issues could be related to MKL runtime library loaded - // during numpy import. - - double *scratchpad = reinterpret_cast( - sycl::malloc_shared(scratchpad_size * sizeof(double), q)); - - event = mkl_lapack::syevd( - q, // queue - oneapi::mkl::job::vec, // jobz - oneapi::mkl::uplo::upper, // uplo - size, // The order of the matrix A (0 <= n) - result_vec_kern, // will be overwritten with eigenvectors - lda, result_val_kern, scratchpad, scratchpad_size); - event.wait(); - - sycl::free(scratchpad, q); - - for (size_t it1 = 0; it1 < size; ++it1) { - result_val[it1] = - result_val_kern[it1]; // TODO use memcpy_c or dpnpc_transpose_c - for (size_t it2 = 0; it2 < size; ++it2) { - // copy + transpose - result_vec[it2 * size + it1] = result_vec_kern[it1 * size + it2]; - } - } - - sycl::free(result_val_kern, q); - sycl::free(result_vec_kern, q); - - return event_ref; -} - -template -void dpnp_eig_c(const void *array_in, void *result1, void *result2, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_eig_c<_DataType, _ResultType>( - q_ref, array_in, result1, result2, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_eig_default_c)(const void *, void *, void *, size_t) = - dpnp_eig_c<_DataType, _ResultType>; - -template -DPCTLSyclEventRef dpnp_eigvals_c(DPCTLSyclQueueRef q_ref, - const void *array_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // TODO this kernel works with square 2-D array only - - // Kernel Type for calculation is double type - // because interface requires float type but 
calculations are expected in - // double type - - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array_in, size * size, true); - DPNPC_ptr_adapter<_ResultType> result1_ptr(q_ref, result1, size, true, - true); - const _DataType *array = input1_ptr.get_ptr(); - _ResultType *result_val = result1_ptr.get_ptr(); - - double *result_val_kern = reinterpret_cast( - sycl::malloc_shared(size * sizeof(double), q)); - double *result_vec_kern = reinterpret_cast( - sycl::malloc_shared(size * size * sizeof(double), q)); - - // type conversion. Also, math library requires copy memory because override - for (size_t it = 0; it < (size * size); ++it) { - result_vec_kern[it] = array[it]; // TODO same as previous kernel - } - - const std::int64_t lda = std::max(1UL, size); - - const std::int64_t scratchpad_size = - mkl_lapack::syevd_scratchpad_size( - q, oneapi::mkl::job::vec, oneapi::mkl::uplo::upper, size, lda); - - double *scratchpad = reinterpret_cast( - sycl::malloc_shared(scratchpad_size * sizeof(double), q)); - - event = mkl_lapack::syevd(q, // queue - oneapi::mkl::job::vec, // jobz - oneapi::mkl::uplo::upper, // uplo - size, // The order of the matrix A (0 <= n) - result_vec_kern, lda, result_val_kern, scratchpad, - scratchpad_size); - event.wait(); - - sycl::free(scratchpad, q); - - for (size_t it1 = 0; it1 < size; ++it1) { - result_val[it1] = result_val_kern[it1]; - } - - sycl::free(result_val_kern, q); - - return event_ref; -} - -template -void dpnp_eigvals_c(const void *array_in, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_eigvals_c<_DataType, _ResultType>( - q_ref, array_in, result1, size, dep_event_vec_ref); - 
DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_eigvals_default_c)(const void *, - void *, - size_t) = dpnp_eigvals_c<_DataType, _ResultType>; - template class dpnp_initval_c_kernel; @@ -769,226 +512,8 @@ DPCTLSyclEventRef (*dpnp_initval_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_initval_c<_DataType>; -template -class dpnp_matmul_c_kernel; - -template -DPCTLSyclEventRef dpnp_matmul_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input1_size, - const size_t input1_ndim, - const shape_elem_type *input1_shape, - const shape_elem_type *input1_strides, - const void *input2_in, - const size_t input2_size, - const size_t input2_ndim, - const shape_elem_type *input2_shape, - const shape_elem_type *input2_strides, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - (void)result_size; - (void)result_ndim; - (void)result_shape; - (void)result_strides; - (void)input1_size; - (void)input1_ndim; - (void)input1_strides; - (void)input2_size; - (void)input2_ndim; - (void)input2_strides; - - DPCTLSyclEventRef event_ref = nullptr; - - size_t size_m = input1_shape[0]; - size_t size_n = input2_shape[1]; - size_t size_k = input1_shape[1]; - - if (!size_m || !size_n || !size_k) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - std::vector dep_events = cast_event_vector(dep_event_vec_ref); - sycl::event event; - - _DataType *array_1 = - reinterpret_cast<_DataType *>(const_cast(input1_in)); - _DataType *array_2 = - reinterpret_cast<_DataType *>(const_cast(input2_in)); - _DataType *result = reinterpret_cast<_DataType *>(result_out); - - if constexpr (std::is_same<_DataType, double>::value || - std::is_same<_DataType, float>::value) - { - // using std::max for these ldx variables is required by math library - const std::int64_t 
ld_array_2 = - std::max(1UL, size_n); // First dimensions of array_2 - const std::int64_t ld_array_1 = - std::max(1UL, size_k); // First dimensions of array_1 - const std::int64_t ld_result = - std::max(1UL, size_n); // Fast dimensions of result - - event = mkl_blas::gemm(q, oneapi::mkl::transpose::nontrans, - oneapi::mkl::transpose::nontrans, size_n, size_m, - size_k, _DataType(1), array_2, ld_array_2, - array_1, ld_array_1, _DataType(0), result, - ld_result, dep_events); - } - else { - // input1: M x K - // input2: K x N - // result: M x N - const size_t dim_m = - size_m; // shape1.front(); // First dimensions of array1 - const size_t dim_n = - size_n; // shape2.back(); // Last dimensions of array2 - const size_t dim_k = - size_k; // shape1.back(); // First dimensions of array2 - - sycl::range<2> gws(dim_m, dim_n); // dimensions are: "i" and "j" - - auto kernel_parallel_for_func = [=](sycl::id<2> global_id) { - size_t i = global_id[0]; // for (size_t i = 0; i < size; ++i) - { - size_t j = global_id[1]; // for (size_t j = 0; j < size; ++j) - { - _DataType acc = _DataType(0); - for (size_t k = 0; k < dim_k; ++k) { - const size_t index_1 = i * dim_k + k; - const size_t index_2 = k * dim_n + j; - acc += array_1[index_1] * array_2[index_2]; - } - const size_t index_result = i * dim_n + j; - result[index_result] = acc; - } - } - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.depends_on(dep_events); - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - } - - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_matmul_c(void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input1_size, - const size_t input1_ndim, - const shape_elem_type *input1_shape, - const shape_elem_type *input1_strides, - const void *input2_in, - const size_t 
input2_size, - const size_t input2_ndim, - const shape_elem_type *input2_shape, - const shape_elem_type *input2_strides) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_matmul_c<_DataType>( - q_ref, result_out, result_size, result_ndim, result_shape, - result_strides, input1_in, input1_size, input1_ndim, input1_shape, - input1_strides, input2_in, input2_size, input2_ndim, input2_shape, - input2_strides, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_matmul_default_c)(void *, - const size_t, - const size_t, - const shape_elem_type *, - const shape_elem_type *, - const void *, - const size_t, - const size_t, - const shape_elem_type *, - const shape_elem_type *, - const void *, - const size_t, - const size_t, - const shape_elem_type *, - const shape_elem_type *) = - dpnp_matmul_c<_DataType>; - void func_map_init_linalg(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_BLN][eft_INT] = { - eft_INT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_BLN][eft_LNG] = { - eft_LNG, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_BLN][eft_FLT] = { - eft_FLT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_BLN][eft_DBL] = { - eft_DBL, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_INT][eft_BLN] = { - eft_BLN, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_INT][eft_FLT] = { - eft_FLT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_INT][eft_DBL] = { - 
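For reference, the removed `dpnp_matmul_c` fallback path (taken when `_DataType` is neither `float` nor `double`, so `mkl_blas::gemm` cannot be used) accumulated each output element independently. A plain host-side C++ sketch of that row-major accumulation (hypothetical helper name, not part of the dpnp API):

```cpp
#include <cstddef>
#include <vector>

// Naive row-major matmul: a is (m x k), b is (k x n), result is (m x n).
// Mirrors the per-element accumulation of the deleted SYCL fallback kernel,
// which mapped the two outer loops onto a 2-D sycl::range.
template <typename T>
std::vector<T> naive_matmul(const std::vector<T> &a, const std::vector<T> &b,
                            std::size_t m, std::size_t k, std::size_t n)
{
    std::vector<T> result(m * n, T(0));
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            T acc = T(0);
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            result[i * n + j] = acc;
        }
    }
    return result;
}
```

In the deleted kernel, one work-item computed one `(i, j)` element of `result`, with the `k` loop run sequentially inside the work-item.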
eft_DBL, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_LNG][eft_BLN] = { - eft_BLN, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_LNG][eft_INT] = { - eft_INT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_LNG][eft_FLT] = { - eft_FLT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_FLT][eft_BLN] = { - eft_BLN, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_FLT][eft_INT] = { - eft_INT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_FLT][eft_LNG] = { - eft_LNG, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_DBL][eft_BLN] = { - eft_BLN, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_DBL][eft_INT] = { - eft_INT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_DBL][eft_LNG] = { - eft_LNG, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_DBL][eft_FLT] = { - eft_FLT, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_astype_default_c}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_C64][eft_C64] = { - eft_C64, - (void *) - dpnp_astype_default_c, std::complex>}; - fmap[DPNPFuncName::DPNP_FN_ASTYPE][eft_C128][eft_C128] = { - eft_C128, - (void *) - dpnp_astype_default_c, std::complex>}; fmap[DPNPFuncName::DPNP_FN_DOT][eft_INT][eft_INT] = { eft_INT, (void *)dpnp_dot_default_c}; @@ -1057,24 +582,6 @@ void func_map_init_linalg(func_map_t &fmap) 
fmap[DPNPFuncName::DPNP_FN_DOT_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_dot_ext_c}; - fmap[DPNPFuncName::DPNP_FN_EIG][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_eig_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIG][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_eig_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIG][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_eig_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIG][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_eig_default_c}; - - fmap[DPNPFuncName::DPNP_FN_EIGVALS][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_eigvals_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIGVALS][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_eigvals_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIGVALS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_eigvals_default_c}; - fmap[DPNPFuncName::DPNP_FN_EIGVALS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_eigvals_default_c}; - fmap[DPNPFuncName::DPNP_FN_INITVAL][eft_BLN][eft_BLN] = { eft_BLN, (void *)dpnp_initval_default_c}; fmap[DPNPFuncName::DPNP_FN_INITVAL][eft_INT][eft_INT] = { @@ -1103,14 +610,5 @@ void func_map_init_linalg(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_INITVAL_EXT][eft_C128][eft_C128] = { eft_C128, (void *)dpnp_initval_ext_c>}; - fmap[DPNPFuncName::DPNP_FN_MATMUL][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_matmul_default_c}; - fmap[DPNPFuncName::DPNP_FN_MATMUL][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_matmul_default_c}; - fmap[DPNPFuncName::DPNP_FN_MATMUL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_matmul_default_c}; - fmap[DPNPFuncName::DPNP_FN_MATMUL][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_matmul_default_c}; - return; } diff --git a/dpnp/backend/kernels/dpnp_krnl_linalg.cpp b/dpnp/backend/kernels/dpnp_krnl_linalg.cpp deleted file mode 100644 index 1dc2783d48c..00000000000 --- a/dpnp/backend/kernels/dpnp_krnl_linalg.cpp +++ /dev/null @@ -1,914 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel 
Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. 
-//***************************************************************************** - -#include -#include - -#include "dpnp_fptr.hpp" -#include "dpnp_utils.hpp" -#include "dpnpc_memory_adapter.hpp" -#include "queue_sycl.hpp" -#include - -namespace mkl_blas = oneapi::mkl::blas::row_major; -namespace mkl_lapack = oneapi::mkl::lapack; - -template -DPCTLSyclEventRef dpnp_cholesky_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - const size_t size, - const size_t data_size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - if (!data_size) { - return event_ref; - } - sycl::queue q = *(reinterpret_cast(q_ref)); - - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, size, true, true); - _DataType *in_array = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - size_t iters = size / (data_size * data_size); - - // math lib func overrides input - _DataType *in_a = reinterpret_cast<_DataType *>( - sycl::malloc_shared(data_size * data_size * sizeof(_DataType), q)); - - for (size_t k = 0; k < iters; ++k) { - for (size_t it = 0; it < data_size * data_size; ++it) { - in_a[it] = in_array[k * (data_size * data_size) + it]; - } - - const std::int64_t n = data_size; - - const std::int64_t lda = std::max(1UL, n); - - const std::int64_t scratchpad_size = - mkl_lapack::potrf_scratchpad_size<_DataType>( - q, oneapi::mkl::uplo::upper, n, lda); - - _DataType *scratchpad = reinterpret_cast<_DataType *>( - sycl::malloc_shared(scratchpad_size * sizeof(_DataType), q)); - - event = mkl_lapack::potrf(q, oneapi::mkl::uplo::upper, n, in_a, lda, - scratchpad, scratchpad_size); - - event.wait(); - - for (size_t i = 0; i < data_size; i++) { - bool arg = false; - for (size_t j = 0; j < data_size; j++) { - if (i == j - 1) { - arg = true; - } - if (arg) { - in_a[i * 
data_size + j] = 0; - } - } - } - - sycl::free(scratchpad, q); - - for (size_t t = 0; t < data_size * data_size; ++t) { - result[k * (data_size * data_size) + t] = in_a[t]; - } - } - - sycl::free(in_a, q); - - return event_ref; -} - -template -void dpnp_cholesky_c(void *array1_in, - void *result1, - const size_t size, - const size_t data_size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_cholesky_c<_DataType>( - q_ref, array1_in, result1, size, data_size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_cholesky_default_c)(void *, void *, const size_t, const size_t) = - dpnp_cholesky_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_det_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - const size_t input_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (!input_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - size_t n = shape[ndim - 1]; - size_t size_out = 1; - if (ndim != 2) { - for (size_t i = 0; i < ndim - 2; i++) { - size_out *= shape[i]; - } - } - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, input_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, size_out, true, - true); - _DataType *array_1 = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - _DataType *matrix = new _DataType[n * n]; - _DataType *elems = new _DataType[n * n]; - - for (size_t i = 0; i < size_out; i++) { - if (size_out > 1) { - for (size_t j = i * n * n; j < (i + 1) * n * n; j++) { - elems[j - i * n * n] = array_1[j]; - } - - for (size_t j = 0; j < n; j++) { - for (size_t k = 0; k < n; k++) { - 
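The removed `dpnp_cholesky_c` above called `oneapi::mkl::lapack::potrf` per batch element and then zeroed one triangle of the factor. As an illustrative stand-in for the factorization itself (textbook Cholesky–Banachiewicz, lower-triangular form; not the MKL call the kernel used):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Lower-triangular Cholesky factor of a symmetric positive-definite
// row-major n x n matrix a, so that a = l * l^T.
inline std::vector<double> cholesky_lower(const std::vector<double> &a,
                                          std::size_t n)
{
    std::vector<double> l(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double sum = 0.0; // dot product of the already-computed rows
            for (std::size_t k = 0; k < j; ++k)
                sum += l[i * n + k] * l[j * n + k];
            if (i == j)
                l[i * n + j] = std::sqrt(a[i * n + i] - sum);
            else
                l[i * n + j] = (a[i * n + j] - sum) / l[j * n + j];
        }
    }
    return l;
}
```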
matrix[j * n + k] = elems[j * n + k]; - } - } - } - else { - for (size_t j = 0; j < n; j++) { - for (size_t k = 0; k < n; k++) { - matrix[j * n + k] = array_1[j * n + k]; - } - } - } - - _DataType det_val = 1; - for (size_t l = 0; l < n; l++) { - if (matrix[l * n + l] == 0) { - for (size_t j = l; j < n; j++) { - if (matrix[j * n + l] != 0) { - for (size_t k = l; k < n; k++) { - _DataType c = matrix[l * n + k]; - matrix[l * n + k] = -1 * matrix[j * n + k]; - matrix[j * n + k] = c; - } - break; - } - if (j == n - 1 and matrix[j * n + l] == 0) { - det_val = 0; - } - } - } - if (det_val != 0) { - for (size_t j = l + 1; j < n; j++) { - _DataType quotient = - -(matrix[j * n + l] / matrix[l * n + l]); - for (size_t k = l + 1; k < n; k++) { - matrix[j * n + k] += quotient * matrix[l * n + k]; - } - } - } - } - - if (det_val != 0) { - for (size_t l = 0; l < n; l++) { - det_val *= matrix[l * n + l]; - } - } - - result[i] = det_val; - } - - delete[] elems; - delete[] matrix; - return event_ref; -} - -template -void dpnp_det_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_det_c<_DataType>( - q_ref, array1_in, result1, shape, ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_det_default_c)(void *, void *, shape_elem_type *, size_t) = - dpnp_det_c<_DataType>; - -template -DPCTLSyclEventRef (*dpnp_det_ext_c)(DPCTLSyclQueueRef, - void *, - void *, - shape_elem_type *, - size_t, - const DPCTLEventVectorRef) = - dpnp_det_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_inv_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)ndim; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = 
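The removed `dpnp_det_c` computed each determinant by Gaussian elimination: when a pivot is zero, swap in a lower row with a sign flip (which leaves the determinant unchanged overall), eliminate below the pivot, then multiply the diagonal. A condensed host-side sketch of that same scheme (hypothetical helper name):

```cpp
#include <cstddef>
#include <vector>

// Determinant of a row-major n x n matrix via Gaussian elimination,
// following the deleted kernel's pivoting: a row swap combined with a
// sign flip preserves det, so the final diagonal product is the answer.
inline double det_gauss(std::vector<double> m, std::size_t n)
{
    for (std::size_t l = 0; l < n; ++l) {
        if (m[l * n + l] == 0.0) {
            std::size_t j = l;
            while (j < n && m[j * n + l] == 0.0)
                ++j;
            if (j == n)
                return 0.0; // whole column is zero: singular matrix
            for (std::size_t k = l; k < n; ++k) {
                const double c = m[l * n + k];
                m[l * n + k] = -m[j * n + k]; // swap + negate: det unchanged
                m[j * n + k] = c;
            }
        }
        for (std::size_t j = l + 1; j < n; ++j) {
            const double q = -(m[j * n + l] / m[l * n + l]);
            for (std::size_t k = l + 1; k < n; ++k)
                m[j * n + k] += q * m[l * n + k];
        }
    }
    double det = 1.0;
    for (std::size_t l = 0; l < n; ++l)
        det *= m[l * n + l];
    return det;
}
```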
nullptr; - - const size_t input_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (!input_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, input_size, true); - DPNPC_ptr_adapter<_ResultType> result_ptr(q_ref, result1, input_size, true, - true); - - _DataType *array_1 = input1_ptr.get_ptr(); - _ResultType *result = result_ptr.get_ptr(); - - size_t n = shape[0]; - - _ResultType *a_arr = new _ResultType[n * n]; - _ResultType *e_arr = new _ResultType[n * n]; - - for (size_t i = 0; i < n; ++i) { - for (size_t j = 0; j < n; ++j) { - a_arr[i * n + j] = array_1[i * n + j]; - if (i == j) { - e_arr[i * n + j] = 1; - } - else { - e_arr[i * n + j] = 0; - } - } - } - - for (size_t k = 0; k < n; ++k) { - if (a_arr[k * n + k] == 0) { - for (size_t i = k; i < n; ++i) { - if (a_arr[i * n + k] != 0) { - for (size_t j = 0; j < n; ++j) { - float c = a_arr[k * n + j]; - a_arr[k * n + j] = a_arr[i * n + j]; - a_arr[i * n + j] = c; - float c_e = e_arr[k * n + j]; - e_arr[k * n + j] = e_arr[i * n + j]; - e_arr[i * n + j] = c_e; - } - break; - } - } - } - - float temp = a_arr[k * n + k]; - - for (size_t j = 0; j < n; ++j) { - a_arr[k * n + j] = a_arr[k * n + j] / temp; - e_arr[k * n + j] = e_arr[k * n + j] / temp; - } - - for (size_t i = k + 1; i < n; ++i) { - temp = a_arr[i * n + k]; - for (size_t j = 0; j < n; j++) { - a_arr[i * n + j] = a_arr[i * n + j] - a_arr[k * n + j] * temp; - e_arr[i * n + j] = e_arr[i * n + j] - e_arr[k * n + j] * temp; - } - } - } - - for (size_t k = 0; k < n - 1; ++k) { - size_t ind_k = n - 1 - k; - for (size_t i = 0; i < ind_k; ++i) { - size_t ind_i = ind_k - 1 - i; - - float temp = a_arr[ind_i * n + ind_k]; - for (size_t j = 0; j < n; ++j) { - a_arr[ind_i * n + j] = - a_arr[ind_i * n + j] - a_arr[ind_k * n + j] * temp; - e_arr[ind_i * n + j] = - e_arr[ind_i * n + j] - e_arr[ind_k * n + j] * temp; - } - } - } - - for (size_t i = 0; i < n; ++i) 
{ - for (size_t j = 0; j < n; ++j) { - result[i * n + j] = e_arr[i * n + j]; - } - } - - delete[] a_arr; - delete[] e_arr; - return event_ref; -} - -template -void dpnp_inv_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_inv_c<_DataType, _ResultType>( - q_ref, array1_in, result1, shape, ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_inv_default_c)(void *, void *, shape_elem_type *, size_t) = - dpnp_inv_c<_DataType, _ResultType>; - -template -class dpnp_kron_c_kernel; - -template -DPCTLSyclEventRef dpnp_kron_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *array2_in, - void *result1, - shape_elem_type *in1_shape, - shape_elem_type *in2_shape, - shape_elem_type *res_shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - const size_t input1_size = std::accumulate( - in1_shape, in1_shape + ndim, 1, std::multiplies()); - const size_t input2_size = std::accumulate( - in2_shape, in2_shape + ndim, 1, std::multiplies()); - const size_t result_size = std::accumulate( - res_shape, res_shape + ndim, 1, std::multiplies()); - if (!(result_size && input1_size && input2_size)) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType1> input1_ptr(q_ref, array1_in, input1_size); - DPNPC_ptr_adapter<_DataType2> input2_ptr(q_ref, array2_in, input2_size); - DPNPC_ptr_adapter<_ResultType> result_ptr(q_ref, result1, result_size); - - _DataType1 *array1 = input1_ptr.get_ptr(); - _DataType2 *array2 = input2_ptr.get_ptr(); - _ResultType *result = result_ptr.get_ptr(); - - shape_elem_type *_in1_shape = reinterpret_cast( - sycl::malloc_shared(ndim * 
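The removed `dpnp_inv_c` inverted the matrix by Gauss–Jordan elimination on an identity-augmented copy: reduce `a` to the identity while applying the same row operations to `e`, then read the inverse off `e`. A compact single-sweep variant of that scheme (illustrative only; the deleted kernel used a forward pass followed by a separate backward pass, and worked in `float` intermediates):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Gauss-Jordan inversion of a row-major n x n matrix: eliminate every row
// except the pivot row in one sweep per column.
inline std::vector<double> gauss_jordan_inverse(std::vector<double> a,
                                                std::size_t n)
{
    std::vector<double> e(n * n, 0.0); // starts as the identity
    for (std::size_t i = 0; i < n; ++i)
        e[i * n + i] = 1.0;

    for (std::size_t k = 0; k < n; ++k) {
        if (a[k * n + k] == 0.0) { // swap in a row with a nonzero pivot
            for (std::size_t i = k + 1; i < n; ++i) {
                if (a[i * n + k] != 0.0) {
                    for (std::size_t j = 0; j < n; ++j) {
                        std::swap(a[k * n + j], a[i * n + j]);
                        std::swap(e[k * n + j], e[i * n + j]);
                    }
                    break;
                }
            }
        }
        const double piv = a[k * n + k];
        for (std::size_t j = 0; j < n; ++j) { // normalize the pivot row
            a[k * n + j] /= piv;
            e[k * n + j] /= piv;
        }
        for (std::size_t i = 0; i < n; ++i) { // clear column k elsewhere
            if (i == k)
                continue;
            const double f = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                a[i * n + j] -= a[k * n + j] * f;
                e[i * n + j] -= e[k * n + j] * f;
            }
        }
    }
    return e;
}
```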
sizeof(shape_elem_type), q)); - shape_elem_type *_in2_shape = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - - q.memcpy(_in1_shape, in1_shape, ndim * sizeof(shape_elem_type)).wait(); - q.memcpy(_in2_shape, in2_shape, ndim * sizeof(shape_elem_type)).wait(); - - shape_elem_type *in1_offsets = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - shape_elem_type *in2_offsets = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - shape_elem_type *res_offsets = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - - get_shape_offsets_inkernel(in1_shape, ndim, in1_offsets); - get_shape_offsets_inkernel(in2_shape, ndim, in2_offsets); - get_shape_offsets_inkernel(res_shape, ndim, res_offsets); - - sycl::range<1> gws(result_size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - const size_t idx = global_id[0]; - - size_t idx1 = 0; - size_t idx2 = 0; - size_t reminder = idx; - for (size_t axis = 0; axis < ndim; ++axis) { - const size_t res_axis = reminder / res_offsets[axis]; - reminder = reminder - res_axis * res_offsets[axis]; - - const size_t in1_axis = res_axis / _in2_shape[axis]; - const size_t in2_axis = res_axis - in1_axis * _in2_shape[axis]; - - idx1 += in1_axis * in1_offsets[axis]; - idx2 += in2_axis * in2_offsets[axis]; - } - - result[idx] = array1[idx1] * array2[idx2]; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for< - class dpnp_kron_c_kernel<_DataType1, _DataType2, _ResultType>>( - gws, kernel_parallel_for_func); - }; - - sycl::event event = q.submit(kernel_func); - - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_kron_c(void *array1_in, - void *array2_in, - void *result1, - shape_elem_type *in1_shape, - shape_elem_type *in2_shape, - shape_elem_type *res_shape, - size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef 
dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_kron_c<_DataType1, _DataType2, _ResultType>( - q_ref, array1_in, array2_in, result1, in1_shape, in2_shape, - res_shape, ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_kron_default_c)(void *, - void *, - void *, - shape_elem_type *, - shape_elem_type *, - shape_elem_type *, - size_t) = - dpnp_kron_c<_DataType1, _DataType2, _ResultType>; - -template -DPCTLSyclEventRef - dpnp_matrix_rank_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - const size_t input_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (!input_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, input_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, 1, true, true); - _DataType *array_1 = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - shape_elem_type elems = 1; - if (ndim > 1) { - elems = shape[0]; - for (size_t i = 1; i < ndim; i++) { - if (shape[i] < elems) { - elems = shape[i]; - } - } - } - - _DataType acc = 0; - for (size_t i = 0; i < static_cast(elems); i++) { - size_t ind = 0; - for (size_t j = 0; j < ndim; j++) { - ind += (shape[j] - 1) * i; - } - acc += array_1[ind]; - } - result[0] = acc; - - return event_ref; -} - -template -void dpnp_matrix_rank_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_matrix_rank_c<_DataType>( - q_ref, array1_in, result1, shape, ndim, dep_event_vec_ref); - 
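The removed `dpnp_kron_c` kernel derived, for every flat result index, the two source indices by dividing out the result strides axis by axis and splitting each coordinate by the second operand's extent on that axis. The same index arithmetic in standalone C++ (hypothetical helper; row-major strides are computed inline here instead of via `get_shape_offsets_inkernel`):

```cpp
#include <cstddef>
#include <vector>

// Kronecker product of two same-ndim arrays: for each axis, the result
// coordinate r splits into (r / extent_b, r % extent_b), addressing a and b
// respectively -- the per-work-item arithmetic of the deleted SYCL kernel.
template <typename T>
std::vector<T> naive_kron(const std::vector<T> &a, const std::vector<T> &b,
                          const std::vector<std::size_t> &shape_a,
                          const std::vector<std::size_t> &shape_b)
{
    const std::size_t ndim = shape_a.size();
    std::vector<std::size_t> res_shape(ndim), str_a(ndim), str_b(ndim),
        str_r(ndim);
    std::size_t size = 1;
    for (std::size_t ax = 0; ax < ndim; ++ax) {
        res_shape[ax] = shape_a[ax] * shape_b[ax];
        size *= res_shape[ax];
    }
    std::size_t sa = 1, sb = 1, sr = 1; // row-major strides, last axis = 1
    for (std::size_t ax = ndim; ax-- > 0;) {
        str_a[ax] = sa; sa *= shape_a[ax];
        str_b[ax] = sb; sb *= shape_b[ax];
        str_r[ax] = sr; sr *= res_shape[ax];
    }
    std::vector<T> result(size);
    for (std::size_t idx = 0; idx < size; ++idx) {
        std::size_t idx1 = 0, idx2 = 0, rem = idx;
        for (std::size_t ax = 0; ax < ndim; ++ax) {
            const std::size_t r = rem / str_r[ax]; // coordinate on this axis
            rem -= r * str_r[ax];
            idx1 += (r / shape_b[ax]) * str_a[ax];
            idx2 += (r % shape_b[ax]) * str_b[ax];
        }
        result[idx] = a[idx1] * b[idx2];
    }
    return result;
}
```

The deleted kernel ran the `idx` loop as one work-item per result element, with the shapes and strides staged in shared USM allocations.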
DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_matrix_rank_default_c)(void *, void *, shape_elem_type *, size_t) = - dpnp_matrix_rank_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_qr_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - void *result2, - void *result3, - size_t size_m, - size_t size_n, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - if (!size_m || !size_n) { - return event_ref; - } - sycl::queue q = *(reinterpret_cast(q_ref)); - - sycl::event event; - - DPNPC_ptr_adapter<_InputDT> input1_ptr(q_ref, array1_in, size_m * size_n, - true); - _InputDT *in_array = input1_ptr.get_ptr(); - - // math lib func overrides input - _ComputeDT *in_a = reinterpret_cast<_ComputeDT *>( - sycl::malloc_shared(size_m * size_n * sizeof(_ComputeDT), q)); - - for (size_t i = 0; i < size_m; ++i) { - for (size_t j = 0; j < size_n; ++j) { - // TODO transpose? 
use dpnp_transpose_c() - in_a[j * size_m + i] = in_array[i * size_n + j]; - } - } - - const size_t min_size_m_n = std::min(size_m, size_n); - DPNPC_ptr_adapter<_ComputeDT> result1_ptr( - q_ref, result1, size_m * min_size_m_n, true, true); - DPNPC_ptr_adapter<_ComputeDT> result2_ptr( - q_ref, result2, min_size_m_n * size_n, true, true); - DPNPC_ptr_adapter<_ComputeDT> result3_ptr(q_ref, result3, min_size_m_n, - true, true); - _ComputeDT *res_q = result1_ptr.get_ptr(); - _ComputeDT *res_r = result2_ptr.get_ptr(); - _ComputeDT *tau = result3_ptr.get_ptr(); - - const std::int64_t lda = size_m; - - const std::int64_t geqrf_scratchpad_size = - mkl_lapack::geqrf_scratchpad_size<_ComputeDT>(q, size_m, size_n, lda); - - _ComputeDT *geqrf_scratchpad = reinterpret_cast<_ComputeDT *>( - sycl::malloc_shared(geqrf_scratchpad_size * sizeof(_ComputeDT), q)); - - std::vector depends(1); - set_barrier_event(q, depends); - - event = mkl_lapack::geqrf(q, size_m, size_n, in_a, lda, tau, - geqrf_scratchpad, geqrf_scratchpad_size, depends); - event.wait(); - - if (!depends.empty()) { - verbose_print("oneapi::mkl::lapack::geqrf", depends.front(), event); - } - - sycl::free(geqrf_scratchpad, q); - - // R - size_t mrefl = min_size_m_n; - for (size_t i = 0; i < mrefl; ++i) { - for (size_t j = 0; j < size_n; ++j) { - if (j >= i) { - res_r[i * size_n + j] = in_a[j * size_m + i]; - } - else { - res_r[i * size_n + j] = _ComputeDT(0); - } - } - } - - // Q - const size_t nrefl = min_size_m_n; - const std::int64_t orgqr_scratchpad_size = - mkl_lapack::orgqr_scratchpad_size<_ComputeDT>(q, size_m, nrefl, nrefl, - lda); - - _ComputeDT *orgqr_scratchpad = reinterpret_cast<_ComputeDT *>( - sycl::malloc_shared(orgqr_scratchpad_size * sizeof(_ComputeDT), q)); - - set_barrier_event(q, depends); - - event = mkl_lapack::orgqr(q, size_m, nrefl, nrefl, in_a, lda, tau, - orgqr_scratchpad, orgqr_scratchpad_size, depends); - event.wait(); - - if (!depends.empty()) { - verbose_print("oneapi::mkl::lapack::orgqr", 
depends.front(), event); - } - - sycl::free(orgqr_scratchpad, q); - - for (size_t i = 0; i < size_m; ++i) { - for (size_t j = 0; j < nrefl; ++j) { - res_q[i * nrefl + j] = in_a[j * size_m + i]; - } - } - - sycl::free(in_a, q); - - return event_ref; -} - -template -void dpnp_qr_c(void *array1_in, - void *result1, - void *result2, - void *result3, - size_t size_m, - size_t size_n) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_qr_c<_InputDT, _ComputeDT>( - q_ref, array1_in, result1, result2, result3, size_m, size_n, - dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_qr_default_c)(void *, void *, void *, void *, size_t, size_t) = - dpnp_qr_c<_InputDT, _ComputeDT>; - -template -DPCTLSyclEventRef dpnp_svd_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - void *result2, - void *result3, - size_t size_m, - size_t size_n, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - sycl::queue q = *(reinterpret_cast(q_ref)); - - sycl::event event; - - DPNPC_ptr_adapter<_InputDT> input1_ptr( - q_ref, array1_in, size_m * size_n, - true); // TODO no need this if use dpnp_copy_to() - _InputDT *in_array = input1_ptr.get_ptr(); - - // math lib gesvd func overrides input - _ComputeDT *in_a = reinterpret_cast<_ComputeDT *>( - sycl::malloc_shared(size_m * size_n * sizeof(_ComputeDT), q)); - for (size_t it = 0; it < size_m * size_n; ++it) { - in_a[it] = in_array[it]; // TODO Type conversion. memcpy can not be used - // directly. dpnp_copy_to() ? 
-    }
-
-    DPNPC_ptr_adapter<_ComputeDT> result1_ptr(q_ref, result1, size_m * size_m,
-                                              true, true);
-    DPNPC_ptr_adapter<_SVDT> result2_ptr(q_ref, result2,
-                                         std::min(size_m, size_n), true, true);
-    DPNPC_ptr_adapter<_ComputeDT> result3_ptr(q_ref, result3, size_n * size_n,
-                                              true, true);
-    _ComputeDT *res_u = result1_ptr.get_ptr();
-    _SVDT *res_s = result2_ptr.get_ptr();
-    _ComputeDT *res_vt = result3_ptr.get_ptr();
-
-    const std::int64_t m = size_m;
-    const std::int64_t n = size_n;
-
-    const std::int64_t lda = std::max(1UL, n);
-    const std::int64_t ldu = std::max(1UL, m);
-    const std::int64_t ldvt = std::max(1UL, n);
-
-    const std::int64_t scratchpad_size =
-        mkl_lapack::gesvd_scratchpad_size<_ComputeDT>(
-            q, oneapi::mkl::jobsvd::vectors, oneapi::mkl::jobsvd::vectors, n, m,
-            lda, ldvt, ldu);
-
-    _ComputeDT *scratchpad = reinterpret_cast<_ComputeDT *>(
-        sycl::malloc_shared(scratchpad_size * sizeof(_ComputeDT), q));
-
-    event =
-        mkl_lapack::gesvd(q,
-                          oneapi::mkl::jobsvd::vectors, // onemkl::job jobu,
-                          oneapi::mkl::jobsvd::vectors, // onemkl::job jobvt,
-                          n, m, in_a, lda, res_s, res_vt, ldvt, res_u, ldu,
-                          scratchpad, scratchpad_size);
-
-    event.wait();
-
-    sycl::free(scratchpad, q);
-
-    return event_ref;
-}
-
-template <typename _InputDT, typename _ComputeDT, typename _SVDT>
-void dpnp_svd_c(void *array1_in,
-                void *result1,
-                void *result2,
-                void *result3,
-                size_t size_m,
-                size_t size_n)
-{
-    DPCTLSyclQueueRef q_ref = reinterpret_cast<DPCTLSyclQueueRef>(&DPNP_QUEUE);
-    DPCTLEventVectorRef dep_event_vec_ref = nullptr;
-    DPCTLSyclEventRef event_ref = dpnp_svd_c<_InputDT, _ComputeDT, _SVDT>(
-        q_ref, array1_in, result1, result2, result3, size_m, size_n,
-        dep_event_vec_ref);
-    DPCTLEvent_WaitAndThrow(event_ref);
-    DPCTLEvent_Delete(event_ref);
-}
-
-template <typename _InputDT, typename _ComputeDT, typename _SVDT>
-void (*dpnp_svd_default_c)(void *, void *, void *, void *, size_t, size_t) =
-    dpnp_svd_c<_InputDT, _ComputeDT, _SVDT>;
-
-void func_map_init_linalg_func(func_map_t &fmap)
-{
-    fmap[DPNPFuncName::DPNP_FN_CHOLESKY][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_cholesky_default_c<float>};
-    fmap[DPNPFuncName::DPNP_FN_CHOLESKY][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_cholesky_default_c<double>};
-
-    fmap[DPNPFuncName::DPNP_FN_DET][eft_INT][eft_INT] = {
-        eft_INT, (void *)dpnp_det_default_c<int32_t>};
-    fmap[DPNPFuncName::DPNP_FN_DET][eft_LNG][eft_LNG] = {
-        eft_LNG, (void *)dpnp_det_default_c<int64_t>};
-    fmap[DPNPFuncName::DPNP_FN_DET][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_det_default_c<float>};
-    fmap[DPNPFuncName::DPNP_FN_DET][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_det_default_c<double>};
-
-    fmap[DPNPFuncName::DPNP_FN_INV][eft_INT][eft_INT] = {
-        eft_DBL, (void *)dpnp_inv_default_c<int32_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_INV][eft_LNG][eft_LNG] = {
-        eft_DBL, (void *)dpnp_inv_default_c<int64_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_INV][eft_FLT][eft_FLT] = {
-        eft_DBL, (void *)dpnp_inv_default_c<float, double>};
-    fmap[DPNPFuncName::DPNP_FN_INV][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_inv_default_c<double, double>};
-
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_INT][eft_INT] = {
-        eft_INT, (void *)dpnp_kron_default_c<int32_t, int32_t, int32_t>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_INT][eft_LNG] = {
-        eft_LNG, (void *)dpnp_kron_default_c<int32_t, int64_t, int64_t>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_INT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_kron_default_c<int32_t, float, float>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_INT][eft_DBL] = {
-        eft_DBL, (void *)dpnp_kron_default_c<int32_t, double, double>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_INT][eft_C128] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<int32_t, std::complex<double>,
-    //     std::complex<double>>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_LNG][eft_INT] = {
-        eft_LNG, (void *)dpnp_kron_default_c<int64_t, int32_t, int64_t>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_LNG][eft_LNG] = {
-        eft_LNG, (void *)dpnp_kron_default_c<int64_t, int64_t, int64_t>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_LNG][eft_FLT] = {
-        eft_FLT, (void *)dpnp_kron_default_c<int64_t, float, float>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_LNG][eft_DBL] = {
-        eft_DBL, (void *)dpnp_kron_default_c<int64_t, double, double>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_LNG][eft_C128] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<int64_t, std::complex<double>,
-    //     std::complex<double>>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_FLT][eft_INT] = {
-        eft_FLT, (void *)dpnp_kron_default_c<float, int32_t, float>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_FLT][eft_LNG] = {
-        eft_FLT, (void *)dpnp_kron_default_c<float, int64_t, float>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_kron_default_c<float, float, float>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_FLT][eft_DBL] = {
-        eft_DBL, (void *)dpnp_kron_default_c<float, double, double>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_FLT][eft_C128] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<float, std::complex<double>,
-    //     std::complex<double>>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_DBL][eft_INT] = {
-        eft_DBL, (void *)dpnp_kron_default_c<double, int32_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_DBL][eft_LNG] = {
-        eft_DBL, (void *)dpnp_kron_default_c<double, int64_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_DBL][eft_FLT] = {
-        eft_DBL, (void *)dpnp_kron_default_c<double, float, double>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_kron_default_c<double, double, double>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_DBL][eft_C128] = {
-        eft_C128, (void *)dpnp_kron_default_c<double, std::complex<double>,
-                                              std::complex<double>>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_C128][eft_INT] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<std::complex<double>, int32_t,
-    //     std::complex<double>>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_C128][eft_LNG] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<std::complex<double>, int64_t,
-    //     std::complex<double>>};
-    // fmap[DPNPFuncName::DPNP_FN_KRON][eft_C128][eft_FLT] = {
-    //     eft_C128, (void*)dpnp_kron_default_c<std::complex<double>, float,
-    //     std::complex<double>>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_C128][eft_DBL] = {
-        eft_C128, (void *)dpnp_kron_default_c<std::complex<double>, double,
-                                              std::complex<double>>};
-    fmap[DPNPFuncName::DPNP_FN_KRON][eft_C128][eft_C128] = {
-        eft_C128,
-        (void *)dpnp_kron_default_c<std::complex<double>, std::complex<double>,
-                                    std::complex<double>>};
-
-    fmap[DPNPFuncName::DPNP_FN_MATRIX_RANK][eft_INT][eft_INT] = {
-        eft_INT, (void *)dpnp_matrix_rank_default_c<int32_t>};
-    fmap[DPNPFuncName::DPNP_FN_MATRIX_RANK][eft_LNG][eft_LNG] = {
-        eft_LNG, (void *)dpnp_matrix_rank_default_c<int64_t>};
-    fmap[DPNPFuncName::DPNP_FN_MATRIX_RANK][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_matrix_rank_default_c<float>};
-    fmap[DPNPFuncName::DPNP_FN_MATRIX_RANK][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_matrix_rank_default_c<double>};
-
-    fmap[DPNPFuncName::DPNP_FN_QR][eft_INT][eft_INT] = {
-        eft_DBL, (void *)dpnp_qr_default_c<int32_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_QR][eft_LNG][eft_LNG] = {
-        eft_DBL, (void *)dpnp_qr_default_c<int64_t, double>};
-    fmap[DPNPFuncName::DPNP_FN_QR][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_qr_default_c<float, float>};
-    fmap[DPNPFuncName::DPNP_FN_QR][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_qr_default_c<double, double>};
-    // fmap[DPNPFuncName::DPNP_FN_QR][eft_C128][eft_C128] = {
-    //     eft_C128, (void*)dpnp_qr_c<std::complex<double>,
-    //     std::complex<double>>};
-
-    fmap[DPNPFuncName::DPNP_FN_SVD][eft_INT][eft_INT] = {
-        eft_DBL, (void *)dpnp_svd_default_c<int32_t, double, double>};
-    fmap[DPNPFuncName::DPNP_FN_SVD][eft_LNG][eft_LNG] = {
-        eft_DBL, (void *)dpnp_svd_default_c<int64_t, double, double>};
-    fmap[DPNPFuncName::DPNP_FN_SVD][eft_FLT][eft_FLT] = {
-        eft_FLT, (void *)dpnp_svd_default_c<float, float, float>};
-    fmap[DPNPFuncName::DPNP_FN_SVD][eft_DBL][eft_DBL] = {
-        eft_DBL, (void *)dpnp_svd_default_c<double, double, double>};
-    fmap[DPNPFuncName::DPNP_FN_SVD][eft_C128][eft_C128] = {
-        eft_C128, (void *)dpnp_svd_default_c<std::complex<double>,
-                                             std::complex<double>, double>};
-
-    return;
-}
diff --git a/dpnp/backend/src/dpnp_fptr.hpp b/dpnp/backend/src/dpnp_fptr.hpp
index 022e844319d..20fc5305e9a 100644
--- a/dpnp/backend/src/dpnp_fptr.hpp
+++ b/dpnp/backend/src/dpnp_fptr.hpp
@@ -331,7 +331,6 @@ void func_map_init_elemwise(func_map_t &fmap);
 void func_map_init_fft_func(func_map_t &fmap);
 void func_map_init_indexing_func(func_map_t &fmap);
 void func_map_init_linalg(func_map_t &fmap);
-void func_map_init_linalg_func(func_map_t &fmap);
 void func_map_init_logic(func_map_t &fmap);
 void func_map_init_manipulation(func_map_t &fmap);
 void func_map_init_mathematical(func_map_t &fmap);
diff --git a/dpnp/backend/src/dpnp_iface_fptr.cpp b/dpnp/backend/src/dpnp_iface_fptr.cpp
index a0683d44a96..460896bfa2d 100644
--- a/dpnp/backend/src/dpnp_iface_fptr.cpp
+++ b/dpnp/backend/src/dpnp_iface_fptr.cpp
@@ -172,7 +172,6 @@ static func_map_t func_map_init()
 func_map_init_fft_func(fmap);
 func_map_init_indexing_func(fmap);
 func_map_init_linalg(fmap);
-func_map_init_linalg_func(fmap);
 func_map_init_logic(fmap);
 func_map_init_manipulation(fmap);
 func_map_init_mathematical(fmap);

From 3a742d174482d5f94d99c034b9178cee1b4c17ba Mon Sep 17 00:00:00 2001
From: Anton <100830759+antonwolfy@users.noreply.github.com>
Date: Tue, 25 Jun 2024 23:39:51 +0200
Subject: [PATCH 31/49] Remove the w/a which breaks the build on Windows with
 python `3.12` (#1896)

* Remove a temporary w/a to unblock Windows build

* Channel OVERRIDE_INTEL_IPO env. variable

Set variable in public CI to override using interprocedural optimization
in public CI to avoid insufficient resources failure during compilation
on Windows.
---
 conda-recipe/bld.bat   | 37 ++++++++++++++++++-------------------
 conda-recipe/meta.yaml |  1 +
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/conda-recipe/bld.bat b/conda-recipe/bld.bat
index 960b254bd39..922b6949d2f 100644
--- a/conda-recipe/bld.bat
+++ b/conda-recipe/bld.bat
@@ -1,14 +1,7 @@
 REM A workaround for activate-dpcpp.bat issue to be addressed in 2021.4
-set "LIB=%BUILD_PREFIX%\Library\lib;%BUILD_PREFIX%\compiler\lib;%LIB%"
+SET "LIB=%BUILD_PREFIX%\Library\lib;%BUILD_PREFIX%\compiler\lib;%LIB%"
 SET "INCLUDE=%BUILD_PREFIX%\include;%INCLUDE%"

-REM Since the 60.0.0 release, setuptools includes a local, vendored copy
-REM of distutils (from late copies of CPython) that is enabled by default.
-REM It breaks build for Windows, so use distutils from "stdlib" as before.
-REM @TODO: remove the setting, once transition to build backend on Windows
-REM to cmake is complete.
-SET "SETUPTOOLS_USE_DISTUTILS=stdlib"
-
 "%PYTHON%" setup.py clean --all

 set "MKLROOT=%PREFIX%/Library"
@@ -18,10 +11,15 @@ set "DPL_ROOT_HINT=%PREFIX%/Library"
 set "SKBUILD_ARGS=-G Ninja -- -DCMAKE_C_COMPILER:PATH=icx -DCMAKE_CXX_COMPILER:PATH=icx -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON"
 set "SKBUILD_ARGS=%SKBUILD_ARGS% -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON"

+REM Overriding IPO is useful for building in resources constrained VMs (public CI)
+if DEFINED OVERRIDE_INTEL_IPO (
+    set "SKBUILD_ARGS=%SKBUILD_ARGS% -DCMAKE_INTERPROCEDURAL_OPTIMIZATION:BOOL=FALSE"
+)
+
 FOR %%V IN (14.0.0 14 15.0.0 15 16.0.0 16 17.0.0 17) DO @(
     REM set DIR_HINT if directory exists
     IF EXIST "%BUILD_PREFIX%\Library\lib\clang\%%V\" (
-        SET "SYCL_INCLUDE_DIR_HINT=%BUILD_PREFIX%\Library\lib\clang\%%V"
+       SET "SYCL_INCLUDE_DIR_HINT=%BUILD_PREFIX%\Library\lib\clang\%%V"
     )
 )
@@ -40,19 +38,20 @@ if EXIST "%PLATFORM_DIR%" (
 )

 if NOT "%WHEELS_OUTPUT_FOLDER%"=="" (
-    rem Install and assemble wheel package from the build bits
-    "%PYTHON%" setup.py install bdist_wheel %SKBUILD_ARGS%
-    if errorlevel 1 exit 1
-    copy dist\dpnp*.whl %WHEELS_OUTPUT_FOLDER%
-    if errorlevel 1 exit 1
+   rem Install and assemble wheel package from the build bits
+   "%PYTHON%" setup.py install bdist_wheel %SKBUILD_ARGS%
+   if errorlevel 1 exit 1
+   copy dist\dpnp*.whl %WHEELS_OUTPUT_FOLDER%
+   if errorlevel 1 exit 1
 ) ELSE (
-    rem Only install
-    "%PYTHON%" setup.py install %SKBUILD_ARGS%
-    if errorlevel 1 exit 1
+   rem Only install
+   "%PYTHON%" setup.py install %SKBUILD_ARGS%
+   if errorlevel 1 exit 1
 )

 rem copy back
 if EXIST "%PLATFORM_DIR%" (
-    copy /Y "%FN%" "%PLATFORM_DIR%\%FN%"
-    if errorlevel 1 exit 1
+    rem copy back
+    copy /Y "%FN%" "%PLATFORM_DIR%\%FN%"
+    if errorlevel 1 exit 1
 )
diff --git a/conda-recipe/meta.yaml b/conda-recipe/meta.yaml
index c10cd061345..6e12e122e17 100644
--- a/conda-recipe/meta.yaml
+++ b/conda-recipe/meta.yaml
@@ -42,6 +42,7 @@ build:
   include_recipe: False
   script_env:
     - WHEELS_OUTPUT_FOLDER
+    - OVERRIDE_INTEL_IPO # [win]

 test:
   requires:

From c78f28a97c0889b7630df479ca9aeeaab99c18fc Mon Sep 17 00:00:00 2001
From: vlad-perevezentsev
Date: Wed, 26 Jun 2024 18:56:30 +0200
Subject: [PATCH 32/49] Skip test_distr in TestNormal and TestRandN (#1899)

* Skip test_distr in TestNormal and TestRandN

* Add jira ticket number
---
 tests/test_random_state.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/test_random_state.py b/tests/test_random_state.py
index ed56dbdf730..2f9b76a43e8 100644
--- a/tests/test_random_state.py
+++ b/tests/test_random_state.py
@@ -35,6 +35,9 @@ def get_default_floating():

 class TestNormal:
+    # TODO: Temporary skip due to incorrect results in public CI
+    # (ARM architecture) with the new MKL package 2024.2.0 (SAT-7080)
+    @pytest.mark.skipif(is_cpu_device(), reason="SAT-7080")
     @pytest.mark.parametrize(
         "dtype",
         [dpnp.float32, dpnp.float64, dpnp.float, None],
@@ -605,6 +608,9 @@ def test_invalid_usm_type(self, usm_type):

 class TestRandN:
+    # TODO: Temporary skip due to incorrect results in public CI
+    # (ARM architecture) with the new MKL package 2024.2.0 (SAT-7080)
+    @pytest.mark.skipif(is_cpu_device(), reason="SAT-7080")
     @pytest.mark.parametrize(
         "usm_type",
         ["host", "device", "shared"],

From 323ce509c3d81f7325179e18e6ca5ac80d23b67d Mon Sep 17 00:00:00 2001
From: vlad-perevezentsev
Date: Wed, 26 Jun 2024 20:44:29 +0200
Subject: [PATCH 33/49] Support `out` parameter for `dpnp.all/any()` (#1893)

* Update dpnp.all/any with support out param

* Update cupy tests

* Add TestAllAny

* Update dpnp tests

* Apply comments

---------

Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 dpnp/dpnp_iface_logic.py           | 194 +++++++++++-------
 tests/test_logic.py                | 104 +++++++---
 tests/test_sycl_queue.py           |   2 +
 tests/test_usm_type.py             |   2 +
 .../cupy/logic_tests/test_truth.py |   3 -
 5 files changed, 200 insertions(+), 105 deletions(-)

diff --git a/dpnp/dpnp_iface_logic.py b/dpnp/dpnp_iface_logic.py
index d780cf578bf..6dfa1a15dcc 100644
---
a/dpnp/dpnp_iface_logic.py +++ b/dpnp/dpnp_iface_logic.py @@ -46,7 +46,7 @@ import dpctl.tensor as dpt -import dpctl.tensor._tensor_elementwise_impl as ti +import dpctl.tensor._tensor_elementwise_impl as tei import numpy import dpnp @@ -76,25 +76,48 @@ ] -def all(x, /, axis=None, out=None, keepdims=False, *, where=True): +def all(a, /, axis=None, out=None, keepdims=False, *, where=True): """ Test whether all array elements along a given axis evaluate to True. For full documentation refer to :obj:`numpy.all`. + Parameters + ---------- + a : {dpnp.ndarray, usm_ndarray} + Input array. + axis : {None, int, tuple of ints}, optional + Axis or axes along which a logical AND reduction is performed. + The default is to perform a logical AND over all the dimensions + of the input array.`axis` may be negative, in which case it counts + from the last to the first axis. + Default: ``None``. + out : {None, dpnp.ndarray, usm_ndarray}, optional + Alternative output array in which to place the result. It must have + the same shape as the expected output but the type (of the returned + values) will be cast if necessary. + Default: ``None``. + keepdims : bool, optional + If ``True``, the reduced axes (dimensions) are included in the result + as singleton dimensions, so that the returned array remains + compatible with the input array according to Array Broadcasting + rules. Otherwise, if ``False``, the reduced axes are not included in + the returned array. + Default: ``False``. + Returns ------- out : dpnp.ndarray An array with a data type of `bool` - containing the results of the logical AND reduction. + containing the results of the logical AND reduction is returned + unless `out` is specified. Otherwise, a reference to `out` is returned. + The result has the same shape as `a` if `axis` is not ``None`` + or `a` is a 0-d array. Limitations ----------- - Parameters `x` is supported either as :class:`dpnp.ndarray` - or :class:`dpctl.tensor.usm_ndarray`. 
- Parameters `out` and `where` are supported with default value. - Input array data types are limited by supported DPNP :ref:`Data types`. - Otherwise the function will be executed sequentially on CPU. + Parameters `where` is only supported with its default value. + Otherwise ``NotImplementedError`` exception will be raised. See Also -------- @@ -105,7 +128,7 @@ def all(x, /, axis=None, out=None, keepdims=False, *, where=True): Notes ----- Not a Number (NaN), positive infinity and negative infinity - evaluate to `True` because these are not equal to zero. + evaluate to ``True`` because these are not equal to zero. Examples -------- @@ -125,22 +148,27 @@ def all(x, /, axis=None, out=None, keepdims=False, *, where=True): >>> np.all(x3) array(True) + >>> o = np.array(False) + >>> z = np.all(x2, out=o) + >>> z, o + (array(True), array(True)) + >>> # Check now that `z` is a reference to `o` + >>> z is o + True + >>> id(z), id(o) # identity of `z` and `o` + (139884456208480, 139884456208480) # may vary + """ - if dpnp.is_supported_array_type(x): - if out is not None: - pass - elif where is not True: - pass - else: - dpt_array = dpnp.get_usm_ndarray(x) - return dpnp_array._create_from_usm_ndarray( - dpt.all(dpt_array, axis=axis, keepdims=keepdims) - ) + dpnp.check_limitations(where=where) - return call_origin( - numpy.all, x, axis=axis, out=out, keepdims=keepdims, where=where + dpt_array = dpnp.get_usm_ndarray(a) + result = dpnp_array._create_from_usm_ndarray( + dpt.all(dpt_array, axis=axis, keepdims=keepdims) ) + # TODO: temporary solution until dpt.all supports out parameter + result = dpnp.get_result_array(result, out) + return result def allclose(a, b, rtol=1.0e-5, atol=1.0e-8, **kwargs): @@ -238,25 +266,48 @@ def allclose(a, b, rtol=1.0e-5, atol=1.0e-8, **kwargs): return call_origin(numpy.allclose, a, b, rtol=rtol, atol=atol, **kwargs) -def any(x, /, axis=None, out=None, keepdims=False, *, where=True): +def any(a, /, axis=None, out=None, keepdims=False, *, 
where=True): """ Test whether any array element along a given axis evaluates to True. For full documentation refer to :obj:`numpy.any`. + Parameters + ---------- + a : {dpnp.ndarray, usm_ndarray} + Input array. + axis : {None, int, tuple of ints}, optional + Axis or axes along which a logical OR reduction is performed. + The default is to perform a logical OR over all the dimensions + of the input array.`axis` may be negative, in which case it counts + from the last to the first axis. + Default: ``None``. + out : {None, dpnp.ndarray, usm_ndarray}, optional + Alternative output array in which to place the result. It must have + the same shape as the expected output but the type (of the returned + values) will be cast if necessary. + Default: ``None``. + keepdims : bool, optional + If ``True``, the reduced axes (dimensions) are included in the result + as singleton dimensions, so that the returned array remains + compatible with the input array according to Array Broadcasting + rules. Otherwise, if ``False``, the reduced axes are not included in + the returned array. + Default: ``False``. + Returns ------- out : dpnp.ndarray An array with a data type of `bool` - containing the results of the logical OR reduction. + containing the results of the logical OR reduction is returned + unless `out` is specified. Otherwise, a reference to `out` is returned. + The result has the same shape as `a` if `axis` is not ``None`` + or `a` is a 0-d array. Limitations ----------- - Parameters `x` is supported either as :class:`dpnp.ndarray` - or :class:`dpctl.tensor.usm_ndarray`. - Parameters `out` and `where` are supported with default value. - Input array data types are limited by supported DPNP :ref:`Data types`. - Otherwise the function will be executed sequentially on CPU. + Parameters `where` is only supported with its default value. + Otherwise ``NotImplementedError`` exception will be raised. 
See Also -------- @@ -267,7 +318,7 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): Notes ----- Not a Number (NaN), positive infinity and negative infinity evaluate - to `True` because these are not equal to zero. + to ``True`` because these are not equal to zero. Examples -------- @@ -279,30 +330,35 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): >>> np.any(x, axis=0) array([ True, True]) - >>> x2 = np.array([0, 0, 0]) + >>> x2 = np.array([-1, 0, 5]) >>> np.any(x2) - array(False) + array(True) >>> x3 = np.array([1.0, np.nan]) >>> np.any(x3) array(True) + >>> o = np.array(False) + >>> z = np.any(x2, out=o) + >>> z, o + (array(True), array(True)) + >>> # Check now that `z` is a reference to `o` + >>> z is o + True + >>> id(z), id(o) # identity of `z` and `o` + >>> (140053638309840, 140053638309840) # may vary + """ - if dpnp.is_supported_array_type(x): - if out is not None: - pass - elif where is not True: - pass - else: - dpt_array = dpnp.get_usm_ndarray(x) - return dpnp_array._create_from_usm_ndarray( - dpt.any(dpt_array, axis=axis, keepdims=keepdims) - ) + dpnp.check_limitations(where=where) - return call_origin( - numpy.any, x, axis=axis, out=out, keepdims=keepdims, where=where + dpt_array = dpnp.get_usm_ndarray(a) + result = dpnp_array._create_from_usm_ndarray( + dpt.any(dpt_array, axis=axis, keepdims=keepdims) ) + # TODO: temporary solution until dpt.any supports out parameter + result = dpnp.get_result_array(result, out) + return result _EQUAL_DOCSTRING = """ @@ -368,8 +424,8 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): equal = DPNPBinaryFunc( "equal", - ti._equal_result_type, - ti._equal, + tei._equal_result_type, + tei._equal, _EQUAL_DOCSTRING, ) @@ -431,8 +487,8 @@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): greater = DPNPBinaryFunc( "greater", - ti._greater_result_type, - ti._greater, + tei._greater_result_type, + tei._greater, _GREATER_DOCSTRING, ) @@ -495,8 +551,8 
@@ def any(x, /, axis=None, out=None, keepdims=False, *, where=True): greater_equal = DPNPBinaryFunc( "greater", - ti._greater_equal_result_type, - ti._greater_equal, + tei._greater_equal_result_type, + tei._greater_equal, _GREATER_EQUAL_DOCSTRING, ) @@ -597,8 +653,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): isfinite = DPNPUnaryFunc( "isfinite", - ti._isfinite_result_type, - ti._isfinite, + tei._isfinite_result_type, + tei._isfinite, _ISFINITE_DOCSTRING, ) @@ -650,8 +706,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): isinf = DPNPUnaryFunc( "isinf", - ti._isinf_result_type, - ti._isinf, + tei._isinf_result_type, + tei._isinf, _ISINF_DOCSTRING, ) @@ -704,8 +760,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): isnan = DPNPUnaryFunc( "isnan", - ti._isnan_result_type, - ti._isnan, + tei._isnan_result_type, + tei._isnan, _ISNAN_DOCSTRING, ) @@ -767,8 +823,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): less = DPNPBinaryFunc( "less", - ti._less_result_type, - ti._less, + tei._less_result_type, + tei._less, _LESS_DOCSTRING, ) @@ -830,8 +886,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): less_equal = DPNPBinaryFunc( "less_equal", - ti._less_equal_result_type, - ti._less_equal, + tei._less_equal_result_type, + tei._less_equal, _LESS_EQUAL_DOCSTRING, ) @@ -895,8 +951,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): logical_and = DPNPBinaryFunc( "logical_and", - ti._logical_and_result_type, - ti._logical_and, + tei._logical_and_result_type, + tei._logical_and, _LOGICAL_AND_DOCSTRING, ) @@ -947,8 +1003,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): logical_not = DPNPUnaryFunc( "logical_not", - ti._logical_not_result_type, - ti._logical_not, + tei._logical_not_result_type, + tei._logical_not, _LOGICAL_NOT_DOCSTRING, ) @@ -1012,8 +1068,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): logical_or = DPNPBinaryFunc( "logical_or", 
- ti._logical_or_result_type, - ti._logical_or, + tei._logical_or_result_type, + tei._logical_or, _LOGICAL_OR_DOCSTRING, ) @@ -1075,8 +1131,8 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): logical_xor = DPNPBinaryFunc( "logical_xor", - ti._logical_xor_result_type, - ti._logical_xor, + tei._logical_xor_result_type, + tei._logical_xor, _LOGICAL_XOR_DOCSTRING, ) @@ -1138,7 +1194,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): not_equal = DPNPBinaryFunc( "not_equal", - ti._not_equal_result_type, - ti._not_equal, + tei._not_equal_result_type, + tei._not_equal, _NOT_EQUAL_DOCSTRING, ) diff --git a/tests/test_logic.py b/tests/test_logic.py index 1b8e34a6fe8..e4f103e22c2 100644 --- a/tests/test_logic.py +++ b/tests/test_logic.py @@ -1,47 +1,85 @@ import numpy import pytest -from numpy.testing import assert_allclose, assert_equal +from numpy.testing import assert_allclose, assert_equal, assert_raises import dpnp from .helper import ( get_all_dtypes, get_float_complex_dtypes, - has_support_aspect64, ) -@pytest.mark.parametrize("type", get_all_dtypes()) -@pytest.mark.parametrize( - "shape", - [(0,), (4,), (2, 3), (2, 2, 2)], - ids=["(0,)", "(4,)", "(2,3)", "(2,2,2)"], -) -def test_all(type, shape): - size = 1 - for i in range(len(shape)): - size *= shape[i] - - for i in range(2**size): - t = i - - a = numpy.empty(size, dtype=type) - - for j in range(size): - a[j] = 0 if t % 2 == 0 else j + 1 - t = t >> 1 - - a = a.reshape(shape) - - ia = dpnp.array(a) - - np_res = numpy.all(a) - dpnp_res = dpnp.all(ia) - assert_allclose(dpnp_res, np_res) - - np_res = a.all() - dpnp_res = ia.all() - assert_allclose(dpnp_res, np_res) +class TestAllAny: + @pytest.mark.parametrize("func", ["all", "any"]) + @pytest.mark.parametrize("dtype", get_all_dtypes()) + @pytest.mark.parametrize("axis", [None, 0, 1, (0, 1)]) + @pytest.mark.parametrize("keepdims", [True, False]) + def test_all_any(self, func, dtype, axis, keepdims): + dp_array = dpnp.array([[0, 1, 2], [3, 
4, 0]], dtype=dtype) + np_array = dpnp.asnumpy(dp_array) + + expected = getattr(numpy, func)(np_array, axis=axis, keepdims=keepdims) + result = getattr(dpnp, func)(dp_array, axis=axis, keepdims=keepdims) + assert_allclose(result, expected) + + @pytest.mark.parametrize("func", ["all", "any"]) + @pytest.mark.parametrize("a_dtype", get_all_dtypes()) + @pytest.mark.parametrize("out_dtype", get_all_dtypes()) + def test_all_any_out(self, func, a_dtype, out_dtype): + dp_array = dpnp.array([[0, 1, 2], [3, 4, 0]], dtype=a_dtype) + np_array = dpnp.asnumpy(dp_array) + + expected = getattr(numpy, func)(np_array) + out = dpnp.empty(expected.shape, dtype=out_dtype) + result = getattr(dpnp, func)(dp_array, out=out) + assert out is result + assert_allclose(result, expected) + + @pytest.mark.parametrize("func", ["all", "any"]) + @pytest.mark.parametrize("axis", [None, 0, 1, (0, 1)]) + @pytest.mark.parametrize("shape", [(2, 3), (2, 0), (0, 3)]) + def test_all_any_empty(self, func, axis, shape): + dp_array = dpnp.empty(shape, dtype=dpnp.int64) + np_array = dpnp.asnumpy(dp_array) + + result = getattr(dpnp, func)(dp_array, axis=axis) + expected = getattr(numpy, func)(np_array, axis=axis) + assert_allclose(result, expected) + + @pytest.mark.parametrize("func", ["all", "any"]) + def test_all_any_scalar(self, func): + dp_array = dpnp.array(0) + np_array = dpnp.asnumpy(dp_array) + + result = getattr(dp_array, func)() + expected = getattr(np_array, func)() + assert_allclose(result, expected) + + @pytest.mark.parametrize("func", ["all", "any"]) + @pytest.mark.parametrize("axis", [None, 0, 1]) + @pytest.mark.parametrize("keepdims", [True, False]) + def test_all_any_nan_inf(self, func, axis, keepdims): + dp_array = dpnp.array([[dpnp.nan, 1, 2], [dpnp.inf, -dpnp.inf, 0]]) + np_array = dpnp.asnumpy(dp_array) + + expected = getattr(numpy, func)(np_array, axis=axis, keepdims=keepdims) + result = getattr(dpnp, func)(dp_array, axis=axis, keepdims=keepdims) + assert_allclose(result, expected) + + 
@pytest.mark.parametrize("func", ["all", "any"]) + def test_all_any_error(self, func): + def check_raises(func_name, exception, *args, **kwargs): + assert_raises( + exception, lambda: getattr(dpnp, func_name)(*args, **kwargs) + ) + + a = dpnp.arange(5) + # unsupported where parameter + check_raises(func, NotImplementedError, a, where=False) + # unsupported type + check_raises(func, TypeError, dpnp.asnumpy(a)) + check_raises(func, TypeError, [0, 1, 2, 3]) @pytest.mark.parametrize("dtype", get_all_dtypes(no_bool=True, no_complex=True)) diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 99334cfabfc..3349c013428 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -394,6 +394,8 @@ def test_meshgrid(device_x, device_y): @pytest.mark.parametrize( "func,data", [ + pytest.param("all", [-1.0, 0.0, 1.0]), + pytest.param("any", [-1.0, 0.0, 1.0]), pytest.param("average", [1.0, 2.0, 4.0, 7.0]), pytest.param("abs", [-1.2, 1.2]), pytest.param("angle", [[1.0 + 1.0j, 2.0 + 3.0j]]), diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index 4f7314ff2db..427151dcc51 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -510,6 +510,8 @@ def test_norm(usm_type, ord, axis): @pytest.mark.parametrize( "func,data", [ + pytest.param("all", [-1.0, 0.0, 1.0]), + pytest.param("any", [-1.0, 0.0, 1.0]), pytest.param("average", [1.0, 2.0, 4.0, 7.0]), pytest.param("abs", [-1.2, 1.2]), pytest.param("angle", [[1.0 + 1.0j, 2.0 + 3.0j]]), diff --git a/tests/third_party/cupy/logic_tests/test_truth.py b/tests/third_party/cupy/logic_tests/test_truth.py index e715aa24405..c76ccd48aa5 100644 --- a/tests/third_party/cupy/logic_tests/test_truth.py +++ b/tests/third_party/cupy/logic_tests/test_truth.py @@ -1,7 +1,6 @@ import unittest import numpy -import pytest from tests.third_party.cupy import testing @@ -47,7 +46,6 @@ def test_without_out(self, xp, dtype): x = xp.asarray(self.x).astype(dtype) return getattr(xp, self.f)(x, self.axis, None, 
self.keepdims) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_with_out(self, xp, dtype): @@ -80,7 +78,6 @@ def test_without_out(self, xp, dtype): x = xp.asarray(self.x).astype(dtype) return getattr(xp, self.f)(x, self.axis, None, self.keepdims) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_dtypes((*testing._loops._float_dtypes, numpy.bool_)) @testing.numpy_cupy_array_equal() def test_with_out(self, xp, dtype): From 73ace1269c5c45fdde499234529f88a421ac6380 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Wed, 26 Jun 2024 23:17:09 +0200 Subject: [PATCH 34/49] Rework implementation of `dpnp.fmod` function (#1883) * Preparation to reuse common dpctl f/w for VM functions * PoC to decouple abs implementation to separate source file * Reuse typedef for function poiter from dpctl.tensor * Define populating vectors by a separate macro * Move implementation of utility functions from headers to source to resolve link issues * Separated implementation of acos function * Separated implementation of acosh function * Use function to simplify strides from dpctl tensor headers * PoC to decouple add implementation to separate source file * Separated implementation of asin function * Separated implementation of asinh function * Separated implementation of atan, atan2, atanh functions * Resolve issue with calling MKL function for undefined types * Separated implementation of cbrt, ceil, conj, cos and cosh functions * Separated implementation of div, exp, exp2, expm1, floor and hypot functions * Separated implementation of ln, log1p, log2 and log10 functions * Separated implementation of mul, pow, rint, sin and sinh functions * Separated implementation of sqr, sqrt, sub, tan, tanh and trunc functions * Removed unused header with types matrix * Remove unused functions * Use passing by reference in unary and binary funcs * Implement dpnp.fabs 
function * Create an instance of DPNPUnaryFunc for fabs * Enable and add relating tests * Decouple populate logic to a macro * Resolve compilation failure on Win * Implement dpnp.fmod function * Add vector implementation and dedicated kernel for boolean inputs * Update python implementation part * Add MKL function to the VM extension * Add tests * Add a link to gh issue in arithmetic tests * Suppress divide warning * Resolve compilation warning * Updated docstring description of inputs per review comment --- dpnp/backend/extensions/ufunc/CMakeLists.txt | 1 + .../ufunc/elementwise_functions/common.cpp | 2 + .../ufunc/elementwise_functions/fmod.cpp | 193 ++++++++ .../ufunc/elementwise_functions/fmod.hpp | 35 ++ .../ufunc/elementwise_functions/populate.hpp | 110 ++++- dpnp/backend/extensions/vm/CMakeLists.txt | 1 + dpnp/backend/extensions/vm/fmod.cpp | 161 ++++++ dpnp/backend/extensions/vm/fmod.hpp | 35 ++ dpnp/backend/extensions/vm/vm_py.cpp | 2 + dpnp/backend/include/dpnp_iface_fptr.hpp | 10 +- dpnp/backend/kernels/dpnp_krnl_elemwise.cpp | 14 - .../kernels/elementwise_functions/fmod.hpp | 61 +++ dpnp/dpnp_algo/dpnp_algo.pxd | 1 - dpnp/dpnp_algo/dpnp_algo_mathematical.pxi | 9 - dpnp/dpnp_iface_bitwise.py | 43 +- dpnp/dpnp_iface_logic.py | 66 ++- dpnp/dpnp_iface_mathematical.py | 257 +++++----- dpnp/dpnp_iface_trigonometric.py | 49 +- tests/test_mathematical.py | 51 +- tests/test_usm_type.py | 1 + .../cupy/core_tests/test_ndarray_math.py | 3 +- .../cupy/math_tests/test_arithmetic.py | 458 ++++++++++++++---- 22 files changed, 1263 insertions(+), 300 deletions(-) create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmod.cpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmod.hpp create mode 100644 dpnp/backend/extensions/vm/fmod.cpp create mode 100644 dpnp/backend/extensions/vm/fmod.hpp create mode 100644 dpnp/backend/kernels/elementwise_functions/fmod.hpp diff --git a/dpnp/backend/extensions/ufunc/CMakeLists.txt 
b/dpnp/backend/extensions/ufunc/CMakeLists.txt index 7f9a240271b..1d140b06658 100644 --- a/dpnp/backend/extensions/ufunc/CMakeLists.txt +++ b/dpnp/backend/extensions/ufunc/CMakeLists.txt @@ -26,6 +26,7 @@ set(_elementwise_sources ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/common.cpp ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fabs.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fmod.cpp ) set(python_module_name _ufunc_impl) diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp index 44173fc764f..b915f9a299a 100644 --- a/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp @@ -26,6 +26,7 @@ #include #include "fabs.hpp" +#include "fmod.hpp" namespace py = pybind11; @@ -37,5 +38,6 @@ namespace dpnp::extensions::ufunc void init_elementwise_functions(py::module_ m) { init_fabs(m); + init_fmod(m); } } // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.cpp new file mode 100644 index 00000000000..dbc215ec1f4 --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.cpp @@ -0,0 +1,193 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include + +#include "dpctl4pybind11.hpp" + +#include "fmod.hpp" +#include "kernels/elementwise_functions/fmod.hpp" +#include "populate.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +namespace impl +{ +/** + * @brief A factory to define pairs of supported types for which + * sycl::fmod function is available. 
+ * + * @tparam T1 Type of input vectors `a` + * @tparam T2 Type of input vectors `b` + */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +using dpnp::kernels::fmod::FmodFunctor; + +template +using ContigFunctor = + ew_cmn_ns::BinaryContigFunctor, + vec_sz, + n_vecs, + enable_sg_loadstore>; + +template +using StridedFunctor = + ew_cmn_ns::BinaryStridedFunctor>; + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static binary_contig_impl_fn_ptr_t fmod_contig_dispatch_table[td_ns::num_types] + [td_ns::num_types]; +static int fmod_output_typeid_table[td_ns::num_types][td_ns::num_types]; +static binary_strided_impl_fn_ptr_t + fmod_strided_dispatch_table[td_ns::num_types][td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(fmod); +} // namespace impl + +void init_fmod(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + { + impl::populate_fmod_dispatch_tables(); + using impl::fmod_contig_dispatch_table; + using impl::fmod_output_typeid_table; + using impl::fmod_strided_dispatch_table; + + auto fmod_pyapi = [&](const arrayT &src1, const arrayT &src2, + const arrayT &dst, sycl::queue &exec_q, + const event_vecT &depends = {}) { + return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, fmod_output_typeid_table, + 
fmod_contig_dispatch_table, fmod_strided_dispatch_table, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmod", fmod_pyapi, "", py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("sycl_queue"), + py::arg("depends") = py::list()); + + auto fmod_result_type_pyapi = [&](const py::dtype &dtype1, + const py::dtype &dtype2) { + return py_int::py_binary_ufunc_result_type( + dtype1, dtype2, fmod_output_typeid_table); + }; + m.def("_fmod_result_type", fmod_result_type_pyapi); + } +} +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.hpp new file mode 100644 index 00000000000..cfc61ba218f --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmod.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +void init_fmod(py::module_ m); +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp index 6261fcc08eb..0b3cc8dac15 100644 --- a/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/populate.hpp @@ -26,7 +26,8 @@ #pragma once /** - * @brief A macro used to define factories and a populating universal functions. + * @brief A macro used to define factories and populating functions for unary + * universal functions. 
*/ #define MACRO_POPULATE_DISPATCH_VECTORS(__name__) \ template \ + class __name__##_contig_kernel; \ + \ + template \ + sycl::event __name__##_contig_impl( \ + sycl::queue &exec_q, size_t nelems, const char *arg1_p, \ + py::ssize_t arg1_offset, const char *arg2_p, py::ssize_t arg2_offset, \ + char *res_p, py::ssize_t res_offset, \ + const std::vector &depends = {}) \ + { \ + return ew_cmn_ns::binary_contig_impl( \ + exec_q, nelems, arg1_p, arg1_offset, arg2_p, arg2_offset, res_p, \ + res_offset, depends); \ + } \ + \ + template \ + struct ContigFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v< \ + typename OutputType::value_type, void>) \ + { \ + \ + fnT fn = nullptr; \ + return fn; \ + } \ + else { \ + fnT fn = __name__##_contig_impl; \ + return fn; \ + } \ + } \ + }; \ + \ + template \ + struct TypeMapFactory \ + { \ + std::enable_if_t::value, int> get() \ + { \ + using rT = typename OutputType::value_type; \ + return td_ns::GetTypeid{}.get(); \ + } \ + }; \ + \ + template \ + class __name__##_strided_kernel; \ + \ + template \ + sycl::event __name__##_strided_impl( \ + sycl::queue &exec_q, size_t nelems, int nd, \ + const py::ssize_t *shape_and_strides, const char *arg1_p, \ + py::ssize_t arg1_offset, const char *arg2_p, py::ssize_t arg2_offset, \ + char *res_p, py::ssize_t res_offset, \ + const std::vector &depends, \ + const std::vector &additional_depends) \ + { \ + return ew_cmn_ns::binary_strided_impl( \ + exec_q, nelems, nd, shape_and_strides, arg1_p, arg1_offset, \ + arg2_p, arg2_offset, res_p, res_offset, depends, \ + additional_depends); \ + } \ + \ + template \ + struct StridedFactory \ + { \ + fnT get() \ + { \ + if constexpr (std::is_same_v< \ + typename OutputType::value_type, void>) \ + { \ + fnT fn = nullptr; \ + return fn; \ + } \ + else { \ + fnT fn = __name__##_strided_impl; \ + return fn; \ + } \ + } \ + }; \ + \ + void populate_##__name__##_dispatch_tables(void) \ + { \ + td_ns::DispatchTableBuilder \ + dvb1; \ + 
dvb1.populate_dispatch_table(__name__##_contig_dispatch_table); \ + \ + td_ns::DispatchTableBuilder \ + dvb2; \ + dvb2.populate_dispatch_table(__name__##_strided_dispatch_table); \ + \ + td_ns::DispatchTableBuilder \ + dvb3; \ + dvb3.populate_dispatch_table(__name__##_output_typeid_table); \ + }; diff --git a/dpnp/backend/extensions/vm/CMakeLists.txt b/dpnp/backend/extensions/vm/CMakeLists.txt index ba1e46ea0ed..de6262581f5 100644 --- a/dpnp/backend/extensions/vm/CMakeLists.txt +++ b/dpnp/backend/extensions/vm/CMakeLists.txt @@ -43,6 +43,7 @@ set(_elementwise_sources ${CMAKE_CURRENT_SOURCE_DIR}/exp2.cpp ${CMAKE_CURRENT_SOURCE_DIR}/expm1.cpp ${CMAKE_CURRENT_SOURCE_DIR}/floor.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/fmod.cpp ${CMAKE_CURRENT_SOURCE_DIR}/hypot.cpp ${CMAKE_CURRENT_SOURCE_DIR}/ln.cpp ${CMAKE_CURRENT_SOURCE_DIR}/log10.cpp diff --git a/dpnp/backend/extensions/vm/fmod.cpp b/dpnp/backend/extensions/vm/fmod.cpp new file mode 100644 index 00000000000..e985492de04 --- /dev/null +++ b/dpnp/backend/extensions/vm/fmod.cpp @@ -0,0 +1,161 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include +#include + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "fmod.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::fmod function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. 
+ */ +template +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::BinaryTypeMapResultEntry, + td_ns::BinaryTypeMapResultEntry, + td_ns::DefaultResultEntry>::result_type; +}; + +template +static sycl::event fmod_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + py::ssize_t a_offset, + const char *in_b, + py::ssize_t b_offset, + char *out_y, + py::ssize_t out_offset, + const std::vector &depends) +{ + tu_ns::validate_type_for_device(exec_q); + tu_ns::validate_type_for_device(exec_q); + + if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) { + throw std::runtime_error("Array offsets have to be equal to 0"); + } + + std::int64_t n = static_cast(in_n); + const T1 *a = reinterpret_cast(in_a); + const T2 *b = reinterpret_cast(in_b); + + using resTy = typename OutputType::value_type; + resTy *y = reinterpret_cast(out_y); + + return mkl_vm::fmod(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing 1st input vector of size n + b, // pointer `b` containing 2nd input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types][td_ns::num_types]; +static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types] + [td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(fmod); +} // namespace impl + +void init_fmod(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + + impl::populate_dispatch_tables(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto fmod_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { + 
return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrTable{}, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmod", fmod_pyapi, + "Call `fmod` function from OneMKL VM library to perform element " + "by element computation of the modulus function of vector `src1` " + "with respect to vector `src2`, writing results to vector `dst`", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("depends") = py::list()); + + auto fmod_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst) { + return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst, + output_typeid_vector, + contig_dispatch_vector); + }; + m.def("_mkl_fmod_to_call", fmod_need_to_call_pyapi, + "Check input arguments to answer if `fmod` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/fmod.hpp b/dpnp/backend/extensions/vm/fmod.hpp new file mode 100644 index 00000000000..492ac8f9889 --- /dev/null +++ b/dpnp/backend/extensions/vm/fmod.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2023-2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. 
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include + +namespace py = pybind11; + +namespace dpnp::extensions::vm +{ +void init_fmod(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/vm_py.cpp b/dpnp/backend/extensions/vm/vm_py.cpp index 791a8f6d656..b78ae51ddc3 100644 --- a/dpnp/backend/extensions/vm/vm_py.cpp +++ b/dpnp/backend/extensions/vm/vm_py.cpp @@ -46,6 +46,7 @@ #include "exp2.hpp" #include "expm1.hpp" #include "floor.hpp" +#include "fmod.hpp" #include "hypot.hpp" #include "ln.hpp" #include "log10.hpp" @@ -86,6 +87,7 @@ PYBIND11_MODULE(_vm_impl, m) vm_ns::init_exp2(m); vm_ns::init_expm1(m); vm_ns::init_floor(m); + vm_ns::init_fmod(m); vm_ns::init_hypot(m); vm_ns::init_ln(m); vm_ns::init_log10(m); diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index 0f6ef51bc7c..1172bcbe4f5 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -140,12 +140,10 @@ enum class DPNPFuncName : size_t DPNP_FN_FLOOR, /**< Used in numpy.floor() impl */ DPNP_FN_FLOOR_DIVIDE, /**< Used in numpy.floor_divide() impl */ DPNP_FN_FMOD, /**< Used in numpy.fmod() impl */ - DPNP_FN_FMOD_EXT, /**< Used in numpy.fmod() impl, requires extra parameters - */ - DPNP_FN_FULL, /**< Used in numpy.full() impl */ - DPNP_FN_FULL_LIKE, /**< Used in numpy.full_like() impl */ - DPNP_FN_HYPOT, /**< Used in numpy.hypot() impl */ - DPNP_FN_IDENTITY, /**< Used in numpy.identity() impl */ + DPNP_FN_FULL, /**< Used in numpy.full() impl */ + DPNP_FN_FULL_LIKE, /**< Used in numpy.full_like() impl */ + DPNP_FN_HYPOT, /**< Used in numpy.hypot() impl */ + DPNP_FN_IDENTITY, /**< Used in numpy.identity() impl */ DPNP_FN_INITVAL, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ DPNP_FN_INITVAL_EXT, /**< Used in numpy ones, ones_like, zeros, zeros_like diff --git a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp 
b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp index 122a3ccdedd..486851516dc 100644 --- a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp @@ -1401,20 +1401,6 @@ static void func_map_elemwise_2arg_3type_core(func_map_t &fmap) template static void func_map_elemwise_2arg_3type_short_core(func_map_t &fmap) { - ((fmap[DPNPFuncName::DPNP_FN_FMOD_EXT][FT1][FTs] = - {get_floating_res_type(), - (void *) - dpnp_fmod_c_ext()>, - func_type_map_t::find_type, - func_type_map_t::find_type>, - get_floating_res_type(), - (void *)dpnp_fmod_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, - func_type_map_t::find_type>}), - ...); ((fmap[DPNPFuncName::DPNP_FN_MAXIMUM_EXT][FT1][FTs] = {get_floating_res_type(), (void *)dpnp_maximum_c_ext< diff --git a/dpnp/backend/kernels/elementwise_functions/fmod.hpp b/dpnp/backend/kernels/elementwise_functions/fmod.hpp new file mode 100644 index 00000000000..e97b257cb06 --- /dev/null +++ b/dpnp/backend/kernels/elementwise_functions/fmod.hpp @@ -0,0 +1,61 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include + +namespace dpnp::kernels::fmod +{ +template +struct FmodFunctor +{ + using supports_sg_loadstore = typename std::true_type; + using supports_vec = std::negation< + std::conjunction, std::is_integral>>; + + resT operator()(const argT1 &in1, const argT2 &in2) const + { + if constexpr (std::is_integral::value && + std::is_integral::value) { + if (in2 == argT2(0)) { + return resT(0); + } + return in1 % in2; + } + else { + return sycl::fmod(in1, in2); + } + } + + template + sycl::vec + operator()(const sycl::vec &in1, + const sycl::vec &in2) const + { + return sycl::fmod(in1, in2); + } +}; +} // namespace dpnp::kernels::fmod diff --git a/dpnp/dpnp_algo/dpnp_algo.pxd b/dpnp/dpnp_algo/dpnp_algo.pxd index f6df42981a9..4e91151697c 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pxd +++ b/dpnp/dpnp_algo/dpnp_algo.pxd @@ -42,7 +42,6 @@ cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncName": # need this na DPNP_FN_ERF_EXT DPNP_FN_FFT_FFT_EXT DPNP_FN_FFT_RFFT_EXT - DPNP_FN_FMOD_EXT DPNP_FN_MAXIMUM_EXT DPNP_FN_MEDIAN_EXT DPNP_FN_MINIMUM_EXT diff --git a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi index 405037da782..fca1e6dc303 100644 --- a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi +++ b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi @@ -37,7 +37,6 @@ and the rest of the library __all__ += [ 
"dpnp_ediff1d", - "dpnp_fmod", "dpnp_fmax", "dpnp_fmin", "dpnp_modf", @@ -109,14 +108,6 @@ cpdef utils.dpnp_descriptor dpnp_ediff1d(utils.dpnp_descriptor x1): return result -cpdef utils.dpnp_descriptor dpnp_fmod(utils.dpnp_descriptor x1_obj, - utils.dpnp_descriptor x2_obj, - object dtype=None, - utils.dpnp_descriptor out=None, - object where=True): - return call_fptr_2in_1out_strides(DPNP_FN_FMOD_EXT, x1_obj, x2_obj, dtype, out, where) - - cpdef utils.dpnp_descriptor dpnp_fmax(utils.dpnp_descriptor x1_obj, utils.dpnp_descriptor x2_obj, object dtype=None, diff --git a/dpnp/dpnp_iface_bitwise.py b/dpnp/dpnp_iface_bitwise.py index 21ee7cc3d82..6a9c44b813e 100644 --- a/dpnp/dpnp_iface_bitwise.py +++ b/dpnp/dpnp_iface_bitwise.py @@ -65,14 +65,16 @@ Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have integer or boolean data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have integer or boolean data - type. + type. Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -132,14 +134,16 @@ Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have integer or boolean data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have integer or boolean data - type. + type. Both inputs `x1` and `x2` can not be scalars at the same time. 
out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -194,14 +198,16 @@ Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have integer or boolean data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have integer or boolean data - type. + type. Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -264,6 +270,7 @@ out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -326,14 +333,17 @@ Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have integer data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have integer data type. - Each element must be greater than or equal to 0. + Each element must be greater than or equal to ``0``. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -384,14 +394,17 @@ Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have integer data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have integer data type. - Each element must be greater than or equal to 0. + Each element must be greater than or equal to ``0``. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. diff --git a/dpnp/dpnp_iface_logic.py b/dpnp/dpnp_iface_logic.py index 6dfa1a15dcc..70f92830637 100644 --- a/dpnp/dpnp_iface_logic.py +++ b/dpnp/dpnp_iface_logic.py @@ -369,10 +369,12 @@ def any(a, /, axis=None, out=None, keepdims=False, *, where=True): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array have the correct shape and the expected data type. 
@@ -438,13 +440,16 @@ def any(a, /, axis=None, out=None, keepdims=False, *, where=True): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -501,13 +506,16 @@ def any(a, /, axis=None, out=None, keepdims=False, *, where=True): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -612,6 +620,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
@@ -671,6 +680,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -724,6 +734,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -774,13 +785,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -837,13 +851,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. 
+ Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -900,13 +917,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -969,6 +989,7 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1017,13 +1038,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1082,13 +1106,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1145,13 +1172,16 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index 2f34be46312..1fe7839f596 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -63,7 +63,6 @@ dpnp_ediff1d, dpnp_fmax, dpnp_fmin, - dpnp_fmod, dpnp_modf, dpnp_trapz, ) @@ -343,6 +342,7 @@ def _gradient_num_diff_edges( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -404,13 +404,16 @@ def _gradient_num_diff_edges( Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -480,6 +483,7 @@ def _gradient_num_diff_edges( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -537,6 +541,7 @@ def around(x, /, decimals=0, out=None): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. 
Returns ------- @@ -573,6 +578,7 @@ def around(x, /, decimals=0, out=None): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -699,6 +705,7 @@ def clip(a, a_min, a_max, *, out=None, order="K", **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -762,14 +769,17 @@ def convolve(a, v, mode="full"): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have a real floating-point data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have a real floating-point data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1236,13 +1246,16 @@ def diff(a, n=1, axis=-1, prepend=None, append=None): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. 
+ Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1363,6 +1376,7 @@ def ediff1d(x1, to_end=None, to_begin=None): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1411,6 +1425,7 @@ def ediff1d(x1, to_end=None, to_begin=None): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1463,13 +1478,16 @@ def ediff1d(x1, to_end=None, to_begin=None): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
@@ -1748,116 +1766,78 @@ def fmin(x1, x2, /, out=None, *, where=True, dtype=None, subok=True, **kwargs): ) -def fmod(x1, x2, /, out=None, *, where=True, dtype=None, subok=True, **kwargs): - """ - Returns the element-wise remainder of division. +_FMOD_DOCSTRING = """ +Calculates the remainder of division for each element `x1_i` of the input array +`x1` with the respective element `x2_i` of the input array `x2`. - For full documentation refer to :obj:`numpy.fmod`. +This function is equivalent to the Matlab(TM) ``rem`` function and should not +be confused with the Python modulus operator ``x1 % x2``. - Returns - ------- - out : dpnp.ndarray - The remainder of the division of `x1` by `x2`. +For full documentation refer to :obj:`numpy.fmod`. - Limitations - ----------- - Parameters `x1` and `x2` are supported as either scalar, - :class:`dpnp.ndarray` or :class:`dpctl.tensor.usm_ndarray`, but both `x1` - and `x2` can not be scalars at the same time. - Parameters `where`, `dtype` and `subok` are supported with their default - values. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by supported DPNP :ref:`Data types`. - - See Also - -------- - :obj:`dpnp.remainder` : Remainder complementary to floor_divide. - :obj:`dpnp.divide` : Standard division. - - Examples - -------- - >>> import dpnp as np - >>> a = np.array([-3, -2, -1, 1, 2, 3]) - >>> np.fmod(a, 2) - array([-1, 0, -1, 1, 0, 1]) - >>> np.remainder(a, 2) - array([1, 0, 1, 1, 0, 1]) - - >>> a = np.array([5, 3]) - >>> b = np.array([2, 2.]) - >>> np.fmod(a, b) - array([1., 1.]) - - >>> a = np.arange(-3, 3).reshape(3, 2) - >>> a - array([[-3, -2], - [-1, 0], - [ 1, 2]]) - >>> b = np.array([2, 2]) - >>> np.fmod(a, b) - array([[-1, 0], - [-1, 0], - [ 1, 0]]) +Parameters +---------- +x1 : {dpnp.ndarray, usm_ndarray, scalar} + First input array, expected to have a real-valued data type. 
+ Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} + Second input array, also expected to have a real-valued data type. + Both inputs `x1` and `x2` can not be scalars at the same time. +out : {None, dpnp.ndarray, usm_ndarray}, optional + Output array to populate. + Array must have the correct shape and the expected data type. + Default: ``None``. +order : {"C", "F", "A", "K"}, optional + Memory layout of the newly output array, if parameter `out` is ``None``. + Default: ``"K"``. - """ +Returns +------- +out : dpnp.ndarray + An array containing the element-wise remainders. The data type of the + returned array is determined by the Type Promotion Rules. - if kwargs: - pass - elif where is not True: - pass - elif dtype is not None: - pass - elif subok is not True: - pass - elif dpnp.isscalar(x1) and dpnp.isscalar(x2): - # at least either x1 or x2 has to be an array - pass - else: - # get USM type and queue to copy scalar from the host memory into - # a USM allocation - usm_type, queue = ( - get_usm_allocations([x1, x2]) - if dpnp.isscalar(x1) or dpnp.isscalar(x2) - else (None, None) - ) +Limitations +---------- +Parameters `where` and `subok` are supported with their default values. +Keyword argument `kwargs` is currently unsupported. +Otherwise ``NotImplementedError`` exception will be raised. 
- x1_desc = dpnp.get_dpnp_descriptor( - x1, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - x2_desc = dpnp.get_dpnp_descriptor( - x2, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - if x1_desc and x2_desc: - if out is not None: - if not dpnp.is_supported_array_type(out): - raise TypeError( - "return array must be of supported array type" - ) - out_desc = ( - dpnp.get_dpnp_descriptor( - out, copy_when_nondefault_queue=False - ) - or None - ) - else: - out_desc = None +See Also +-------- +:obj:`dpnp.remainder` : Equivalent to the Python ``%`` operator. +:obj:`dpnp.divide` : Standard division. - return dpnp_fmod( - x1_desc, x2_desc, dtype=dtype, out=out_desc, where=where - ).get_pyobj() +Examples +-------- +>>> import dpnp as np +>>> a = np.array([-3, -2, -1, 1, 2, 3]) +>>> np.fmod(a, 2) +array([-1, 0, -1, 1, 0, 1]) +>>> np.remainder(a, 2) +array([1, 0, 1, 1, 0, 1]) + +>>> np.fmod(np.array([5, 3]), np.array([2, 2.])) +array([1., 1.]) +>>> a = np.arange(-3, 3).reshape(3, 2) +>>> a +array([[-3, -2], + [-1, 0], + [ 1, 2]]) +>>> np.fmod(a, np.array([2, 2])) +array([[-1, 0], + [-1, 0], + [ 1, 0]]) +""" - return call_origin( - numpy.fmod, x1, x2, dtype=dtype, out=out, where=where, **kwargs - ) +fmod = DPNPBinaryFunc( + "fmod", + ufi._fmod_result_type, + ufi._fmod, + _FMOD_DOCSTRING, + mkl_fn_to_call=vmi._mkl_fmod_to_call, + mkl_impl_fn=vmi._fmod, +) def gradient(f, *varargs, axis=None, edge_order=1): @@ -2074,6 +2054,7 @@ def gradient(f, *varargs, axis=None, edge_order=1): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
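The rewritten ``_FMOD_DOCSTRING`` above keeps the key semantic distinction: `fmod` behaves like the C library `fmod` / Matlab ``rem`` (the result takes the sign of the dividend), while `remainder` behaves like Python's ``%`` operator (the result takes the sign of the divisor). A minimal NumPy sketch of that contrast, mirroring the docstring's own examples:

```python
import numpy as np

a = np.array([-3, -2, -1, 1, 2, 3])

# fmod: result carries the sign of the dividend (x1), like C fmod / Matlab rem
print(np.fmod(a, 2))       # [-1  0 -1  1  0  1]

# remainder: result carries the sign of the divisor (x2), like Python's %
print(np.remainder(a, 2))  # [1 0 1 1 0 1]
```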
@@ -2124,13 +2105,16 @@ def gradient(f, *varargs, axis=None, edge_order=1): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2196,13 +2180,16 @@ def gradient(f, *varargs, axis=None, edge_order=1): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2298,13 +2285,16 @@ def modf(x1, **kwargs): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. 
+ Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2371,6 +2361,7 @@ def modf(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2426,6 +2417,7 @@ def modf(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2481,13 +2473,16 @@ def modf(x1, **kwargs): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional - Output array to populate. Array must have the correct - shape and the expected data type. + Output array to populate. Array must have the correct shape and + the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2668,6 +2663,7 @@ def prod( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2744,13 +2740,16 @@ def prod( Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have a real-valued data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have a real-valued data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2826,6 +2825,7 @@ def prod( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2885,6 +2885,7 @@ def prod( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. Returns ------- @@ -2941,6 +2942,7 @@ def prod( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -2995,6 +2997,7 @@ def prod( out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. 
+ Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -3041,13 +3044,16 @@ def prod( Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have numeric data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -3331,6 +3337,7 @@ def trapz(y1, x1=None, dx=1.0, axis=-1): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. diff --git a/dpnp/dpnp_iface_trigonometric.py b/dpnp/dpnp_iface_trigonometric.py index d38af96ea2c..4d5703cfc6c 100644 --- a/dpnp/dpnp_iface_trigonometric.py +++ b/dpnp/dpnp_iface_trigonometric.py @@ -121,6 +121,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -175,6 +176,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -229,6 +231,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -282,7 +285,8 @@ def _get_accumulation_res_dt(a, dtype, _out): Input array, expected to have numeric data type. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. - Array must have the correct shape and the expected data type.. + Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -336,7 +340,8 @@ def _get_accumulation_res_dt(a, dtype, _out): Input array, expected to have numeric data type. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. - Array must have the correct shape and the expected data type.. + Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -390,15 +395,18 @@ def _get_accumulation_res_dt(a, dtype, _out): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have a real-valued floating-point data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have a real-valued floating-point data type. 
+ Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -466,6 +474,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -520,6 +529,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -571,6 +581,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -624,6 +635,7 @@ def _get_accumulation_res_dt(a, dtype, _out): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -820,6 +832,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. 
order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -872,6 +885,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -927,6 +941,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -982,13 +997,16 @@ def degrees(x1, **kwargs): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have a real-valued data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have a real-valued data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1049,6 +1067,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
@@ -1103,6 +1122,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1162,6 +1182,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1221,6 +1242,7 @@ def degrees(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1278,15 +1300,18 @@ def degrees(x1, **kwargs): Parameters ---------- -x1 : {dpnp.ndarray, usm_ndarray} +x1 : {dpnp.ndarray, usm_ndarray, scalar} First input array, expected to have a real-valued floating-point data type. -x2 : {dpnp.ndarray, usm_ndarray} + Both inputs `x1` and `x2` can not be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} Second input array, also expected to have a real-valued floating-point data type. + Both inputs `x1` and `x2` can not be scalars at the same time. out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1426,6 +1451,7 @@ def logsumexp(x, /, *, axis=None, dtype=None, keepdims=False, out=None): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. 
Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1556,6 +1582,7 @@ def reduce_hypot(x, /, *, axis=None, dtype=None, keepdims=False, out=None): out : ({None, dpnp.ndarray, usm_ndarray}, optional): Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : ({'C', 'F', 'A', 'K'}, optional): Memory layout of the newly output array, if parameter `out` is `None`. Default: ``"K"`` @@ -1660,6 +1687,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1713,6 +1741,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1765,6 +1794,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1820,6 +1850,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. 
@@ -1874,6 +1905,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. @@ -1927,6 +1959,7 @@ def radians(x1, **kwargs): out : {None, dpnp.ndarray, usm_ndarray}, optional Output array to populate. Array must have the correct shape and the expected data type. + Default: ``None``. order : {"C", "F", "A", "K"}, optional Memory layout of the newly output array, if parameter `out` is ``None``. Default: ``"K"``. diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index 6cf52e91deb..ae2c73748b5 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -1040,12 +1040,11 @@ def test_fmax(self, dtype, lhs, rhs): def test_fmin(self, dtype, lhs, rhs): self._test_mathematical("fmin", dtype, lhs, rhs, check_type=False) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @pytest.mark.parametrize( "dtype", get_all_dtypes(no_bool=True, no_complex=True) ) def test_fmod(self, dtype, lhs, rhs): - if rhs == 0.3: + if rhs == 0.3 and not has_support_aspect64(): """ Due to accuracy reason, the results are different for `float32` and `float64` >>> numpy.fmod(numpy.array([3.9], dtype=numpy.float32), 0.3) @@ -1053,7 +1052,7 @@ def test_fmod(self, dtype, lhs, rhs): >>> numpy.fmod(numpy.array([3.9], dtype=numpy.float64), 0.3) array([9.53674318e-08]) - On a gpu without support for `float64`, dpnp produces results similar to the second one. + On a gpu without fp64 support, dpnp produces results similar to the second one. 
""" pytest.skip("Due to accuracy reason, the results are different.") self._test_mathematical("fmod", dtype, lhs, rhs, check_type=False) @@ -1300,6 +1299,52 @@ def test_positive_boolean(): dpnp.positive(dpnp_a) +@pytest.mark.parametrize("dtype", get_float_dtypes(no_float16=False)) +def test_float_remainder_magnitude(dtype): + b = numpy.array(1.0, dtype=dtype) + a = numpy.nextafter(numpy.array(0.0, dtype=dtype), -b) + + ia = dpnp.array(a) + ib = dpnp.array(b) + + result = dpnp.remainder(ia, ib) + expected = numpy.remainder(a, b) + assert_equal(result, expected) + + result = dpnp.remainder(-ia, -ib) + expected = numpy.remainder(-a, -b) + assert_equal(result, expected) + + +@pytest.mark.usefixtures("suppress_divide_numpy_warnings") +@pytest.mark.usefixtures("suppress_invalid_numpy_warnings") +@pytest.mark.parametrize("func", ["remainder", "fmod"]) +@pytest.mark.parametrize("dtype", get_float_dtypes(no_float16=False)) +@pytest.mark.parametrize( + "lhs, rhs", + [ + pytest.param(1.0, 0.0, id="one-zero"), + pytest.param(1.0, numpy.inf, id="one-inf"), + pytest.param(numpy.inf, 1.0, id="inf-one"), + pytest.param(numpy.inf, numpy.inf, id="inf-inf"), + pytest.param(numpy.inf, 0.0, id="inf-zero"), + pytest.param(1.0, numpy.nan, id="one-nan"), + pytest.param(numpy.nan, 0.0, id="nan-zero"), + pytest.param(numpy.nan, 1.0, id="nan-one"), + ], +) +def test_float_remainder_fmod_nans_inf(func, dtype, lhs, rhs): + a = numpy.array(lhs, dtype=dtype) + b = numpy.array(rhs, dtype=dtype) + + ia = dpnp.array(a) + ib = dpnp.array(b) + + result = getattr(dpnp, func)(ia, ib) + expected = getattr(numpy, func)(a, b) + assert_equal(result, expected) + + class TestProd: @pytest.mark.parametrize("axis", [None, 0, 1, -1, 2, -2, (1, 2), (0, -2)]) @pytest.mark.parametrize("keepdims", [False, True]) diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index 427151dcc51..5dafcfb7582 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -627,6 +627,7 @@ def test_1in_1out(func, data, 
usm_type): pytest.param("dot", [3 + 2j, 4 + 1j, 5], [1, 2 + 3j, 3]), pytest.param("fmax", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), pytest.param("fmin", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), + pytest.param("fmod", [5, 3], [2, 2.0]), pytest.param( "gradient", [1, 2, 4, 7, 11, 16], [0.0, 1.0, 1.5, 3.5, 4.0, 6.0] ), diff --git a/tests/third_party/cupy/core_tests/test_ndarray_math.py b/tests/third_party/cupy/core_tests/test_ndarray_math.py index 81caf2c8ceb..40ae44174ae 100644 --- a/tests/third_party/cupy/core_tests/test_ndarray_math.py +++ b/tests/third_party/cupy/core_tests/test_ndarray_math.py @@ -3,7 +3,6 @@ import numpy import pytest -import dpnp as cupy from tests.helper import has_support_aspect64 from tests.third_party.cupy import testing @@ -87,7 +86,7 @@ def test_round_halfway_int(self, xp, dtype): a -= a.size + 1 scale = 10 ** abs(self.decimals) if self.decimals < 0: - a *= xp.array(scale, dtype=dtype) + a *= xp.array(scale).astype(dtype) a >>= 1 return a.round(self.decimals) diff --git a/tests/third_party/cupy/math_tests/test_arithmetic.py b/tests/third_party/cupy/math_tests/test_arithmetic.py index 36593a2a99e..7a7d1014388 100644 --- a/tests/third_party/cupy/math_tests/test_arithmetic.py +++ b/tests/third_party/cupy/math_tests/test_arithmetic.py @@ -1,29 +1,28 @@ import itertools -import unittest import warnings import numpy import pytest import dpnp as cupy -from tests.helper import has_support_aspect64 +from tests.helper import has_support_aspect16, has_support_aspect64 from tests.third_party.cupy import testing -float_types = list(testing._loops._float_dtypes) -complex_types = [] -signed_int_types = [numpy.int32, numpy.int64] -unsigned_int_types = [] +float_types = [numpy.float16, numpy.float32, numpy.float64] +complex_types = [numpy.complex64, numpy.complex128] +signed_int_types = [numpy.int8, numpy.int16, numpy.int32, numpy.int64] +unsigned_int_types = [numpy.uint8, numpy.uint16, numpy.uint32, numpy.uint64] int_types = signed_int_types + 
unsigned_int_types -all_types = float_types + int_types + complex_types +all_types = [numpy.bool_] + float_types + int_types + complex_types +negative_types = [numpy.bool_] + float_types + signed_int_types + complex_types negative_types_wo_fp16 = ( [numpy.bool_] - + float_types + + [numpy.float32, numpy.float64] + [numpy.int16, numpy.int32, numpy.int64] + complex_types ) -negative_types = float_types + signed_int_types + complex_types -negative_no_complex_types = float_types + signed_int_types -no_complex_types = float_types + int_types +negative_no_complex_types = [numpy.bool_] + float_types + signed_int_types +no_complex_types = [numpy.bool_] + float_types + int_types @testing.parameterize( @@ -31,12 +30,7 @@ testing.product( { "nargs": [1], - "name": [ - "reciprocal", - "conj", - "conjugate", - "angle", - ], + "name": ["reciprocal", "conj", "conjugate", "angle"], } ) + testing.product( @@ -52,7 +46,6 @@ "floor_divide", "fmod", "remainder", - "mod", ], } ) @@ -128,47 +121,38 @@ class TestArithmeticUnary: @testing.numpy_cupy_allclose(atol=1e-5, type_check=has_support_aspect64()) def test_unary(self, xp): arg1 = self.arg1 - arg1 = xp.asarray(arg1) + if isinstance(arg1, numpy.ndarray): + arg1 = xp.asarray(arg1) if self.name in ("reciprocal") and xp is numpy: # In NumPy, for integer arguments with absolute value larger than 1 the result is always zero. # We need to convert the input data type to float then compare the output with DPNP. 
- if isinstance(arg1, numpy.ndarray) and numpy.issubdtype( - arg1.dtype, numpy.integer - ): - np_dtype = ( - numpy.float64 if has_support_aspect64() else numpy.float32 - ) + if numpy.issubdtype(arg1.dtype, numpy.integer): + if arg1.dtype.char in "bB": # int8 + np_dtype = numpy.float16 + elif arg1.dtype.char in "hH": # int16 + np_dtype = numpy.float32 + else: # int32, int64 + if has_support_aspect64(): + np_dtype = numpy.float64 + else: + np_dtype = numpy.float32 arg1 = xp.asarray(arg1, dtype=np_dtype) if self.name in {"angle"}: y = getattr(xp, self.name)(arg1, self.deg) - # In NumPy, for boolean arguments the output data type is always default floating data type. - # while data type of output in DPNP is determined by Type Promotion Rules. - if ( - isinstance(arg1, cupy.ndarray) - and cupy.issubdtype(arg1.dtype, cupy.bool) - and has_support_aspect64() - ): - y = y.astype(cupy.float64) + if isinstance(arg1, cupy.ndarray): + if arg1.dtype == cupy.bool and has_support_aspect64(): + # In NumPy, for boolean input the output data type is always default floating data type. + # while data type of output in DPNP is determined by Type Promotion Rules. + y = y.astype(cupy.float64) + elif arg1.dtype.char in "bBe" and has_support_aspect16(): + # In NumPy, for int8, uint8 and float16 inputs the output data type is always float16. + # while data type of output in DPNP is float32. + y = y.astype(cupy.float16) else: y = getattr(xp, self.name)(arg1) - # if self.name in ("real", "imag"): - # Some NumPy functions return Python scalars for Python scalar - # inputs. - # We need to convert them to arrays to compare with CuPy outputs. - # if xp is numpy and isinstance(arg1, (bool, int, float, complex)): - # y = xp.asarray(y) - - # TODO(niboshi): Fix this - # numpy.real and numpy.imag return Python int if the input is - # Python bool. CuPy should return an array of dtype.int32 or - # dtype.int64 (depending on the platform) in such cases, instead - # of an array of dtype.bool. 
- # if xp is cupy and isinstance(arg1, bool): - # y = y.astype(int) - return y @@ -210,9 +194,61 @@ def test_imag_nocomplex(self, xp, dtype): imag = xp.imag(x) return imag + @pytest.mark.skip("'dpnp_array' object has no attribute 'base' yet") + @testing.for_complex_dtypes() + @testing.numpy_cupy_array_equal() + def test_real_ndarray_complex(self, xp, dtype): + x = testing.shaped_arange(self.shape, xp, dtype=dtype) + x_ = x.copy() + real = x_.real + # real returns a view + assert real.base is x_ + x_ += 1 + 1j + testing.assert_array_equal(real, x.real + 1) + return real + + @pytest.mark.skip("'dpnp_array' object has no attribute 'base' yet") + @testing.for_complex_dtypes() + @testing.numpy_cupy_array_equal() + def test_real_complex(self, xp, dtype): + x = testing.shaped_arange(self.shape, xp, dtype=dtype) + x_ = x.copy() + real = xp.real(x_) + # real returns a view + assert real.base is x_ + x_ += 1 + 1j + testing.assert_array_equal(real, x.real + 1) + return real + + @pytest.mark.skip("'dpnp_array' object has no attribute 'base' yet") + @testing.for_complex_dtypes() + @testing.numpy_cupy_array_equal() + def test_imag_ndarray_complex(self, xp, dtype): + x = testing.shaped_arange(self.shape, xp, dtype=dtype) + x_ = x.copy() + imag = x_.imag + # imag returns a view + assert imag.base is x_ + x_ += 1 + 1j + testing.assert_array_equal(imag, x.imag + 1) + return imag + + @pytest.mark.skip("'dpnp_array' object has no attribute 'base' yet") + @testing.for_complex_dtypes() + @testing.numpy_cupy_array_equal() + def test_imag_complex(self, xp, dtype): + x = testing.shaped_arange(self.shape, xp, dtype=dtype) + x_ = x.copy() + imag = xp.imag(x_) + # imag returns a view + assert imag.base is x_ + x_ += 1 + 1j + testing.assert_array_equal(imag, x.imag + 1) + return imag + class ArithmeticBinaryBase: - @testing.numpy_cupy_allclose(atol=1e-4, type_check=False) + @testing.numpy_cupy_allclose(rtol=1e-4, type_check=has_support_aspect64()) def check_binary(self, xp): arg1 = self.arg1 
arg2 = self.arg2 @@ -221,15 +257,37 @@ def check_binary(self, xp): dtype1 = np1.dtype dtype2 = np2.dtype - # TODO(niboshi): Fix this: xp.add(0j, xp.array([2.], 'f')).dtype - # numpy => complex64 - # # cupy => complex128 - # if isinstance(arg1, complex): - # if dtype2 in (numpy.float16, numpy.float32): - # return xp.array(True) - - arg1 = xp.asarray(arg1) - arg2 = xp.asarray(arg2) + if xp.isscalar(arg1) and xp.isscalar(arg2): + pytest.skip("both scalar inputs is not supported") + + if self.name == "power": + # TODO(niboshi): Fix this: power(0, 1j) + # numpy => 1+0j + # cupy => 0j + if dtype2 in complex_types and (np1 == 0).any(): + return xp.array(True) + # TODO: Fix this: power(0j, 0) + # numpy => 1+0j + # cupy => nan+nanj + elif dtype1 in complex_types and (np2 == 0).any(): + return xp.array(True) + + if self.name in ("true_divide", "floor_divide", "fmod", "remainder"): + if dtype1.kind in "u" and xp.isscalar(arg2) and arg2 < 0: + # TODO: Fix this: array(3, dtype=uint) / -2 + # numpy => -1.5 + # cupy => 0.01181102 + pytest.skip("due to dpctl gh-1711") + if dtype2.kind in "u" and xp.isscalar(arg1) and arg1 < 0: + # TODO: Fix this: 2 / array(3, dtype=uint) + # numpy => -0.666667 + # cupy => 84.666667 + pytest.skip("due to dpctl gh-1711") + + if isinstance(arg1, numpy.ndarray): + arg1 = xp.asarray(arg1) + if isinstance(arg2, numpy.ndarray): + arg2 = xp.asarray(arg2) # Subtraction between booleans is not allowed. if ( @@ -255,15 +313,6 @@ def check_binary(self, xp): if dtype1 in (numpy.float16, numpy.float32): y = y.astype(numpy.complex64) - # NumPy returns an output array of another type than DPNP when input ones have different types. 
- if xp is numpy and dtype1 != dtype2: - is_array_arg1 = not xp.isscalar(arg1) - is_array_arg2 = not xp.isscalar(arg2) - - is_int_float = lambda _x, _y: numpy.issubdtype( - _x, numpy.integer - ) and numpy.issubdtype(_y, numpy.floating) - return y @@ -271,16 +320,17 @@ def check_binary(self, xp): *( testing.product( { + # TODO(unno): boolean subtract causes DeprecationWarning in numpy>=1.13 "arg1": [ testing.shaped_arange((2, 3), numpy, dtype=d) for d in all_types ] - + [0, 0.0, 2, 2.0], + + [0, 0.0, 0j, 2, 2.0, 2j, True, False], "arg2": [ testing.shaped_reverse_arange((2, 3), numpy, dtype=d) for d in all_types ] - + [0, 0.0, 2, 2.0], + + [0, 0.0, 0j, 2, 2.0, 2j, True, False], "name": ["add", "multiply", "power", "subtract"], } ) @@ -290,19 +340,18 @@ def check_binary(self, xp): numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) for d in negative_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 0j, 2, 2.0, 2j, -2, -2.0, -2j, True, False], "arg2": [ numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) for d in negative_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 0j, 2, 2.0, 2j, -2, -2.0, -2j, True, False], "name": ["divide", "true_divide", "subtract"], } ) ) ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") -class TestArithmeticBinary(ArithmeticBinaryBase, unittest.TestCase): +class TestArithmeticBinary(ArithmeticBinaryBase): def test_binary(self): self.use_dtype = False self.check_binary() @@ -311,19 +360,36 @@ def test_binary(self): @testing.parameterize( *( testing.product( + { + "arg1": [ + numpy.array([3, 2, 1, 1, 2, 3], dtype=d) + for d in unsigned_int_types + ] + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], + "arg2": [ + numpy.array([3, 2, 1, 1, 2, 3], dtype=d) + for d in unsigned_int_types + ] + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], + "name": ["true_divide"], + "dtype": [cupy.default_float_type()], + "use_dtype": [True, False], + } + ) + + testing.product( { "arg1": [ numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) - for d in int_types + for d in 
signed_int_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], "arg2": [ numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) - for d in int_types + for d in signed_int_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], "name": ["true_divide"], - "dtype": float_types, + "dtype": [cupy.default_float_type()], "use_dtype": [True, False], } ) @@ -340,7 +406,7 @@ def test_binary(self): ] + [0.0, 2.0, -2.0], "name": ["power", "true_divide", "subtract"], - "dtype": float_types, + "dtype": [cupy.default_float_type()], "use_dtype": [True, False], } ) @@ -350,14 +416,14 @@ def test_binary(self): testing.shaped_arange((2, 3), numpy, dtype=d) for d in no_complex_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], "arg2": [ testing.shaped_reverse_arange((2, 3), numpy, dtype=d) for d in no_complex_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], - "name": ["floor_divide", "fmod", "remainder", "mod"], - "dtype": float_types, + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], + "name": ["floor_divide", "fmod", "remainder"], + "dtype": [cupy.default_float_type()], "use_dtype": [True, False], } ) @@ -367,31 +433,229 @@ def test_binary(self): numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) for d in negative_no_complex_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], "arg2": [ numpy.array([-3, -2, -1, 1, 2, 3], dtype=d) for d in negative_no_complex_types ] - + [0, 0.0, 2, 2.0, -2, -2.0], - "name": ["floor_divide", "fmod", "remainder", "mod"], - "dtype": float_types, + + [0, 0.0, 2, 2.0, -2, -2.0, True, False], + "name": ["floor_divide", "fmod", "remainder"], + "dtype": [cupy.default_float_type()], "use_dtype": [True, False], } ) ) ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") -class TestArithmeticBinary2(ArithmeticBinaryBase, unittest.TestCase): +class TestArithmeticBinary2(ArithmeticBinaryBase): def test_binary(self): - if ( - self.use_dtype - and 
numpy.lib.NumpyVersion(numpy.__version__) < "1.10.0" - ): - raise unittest.SkipTest("NumPy>=1.10") self.check_binary() -class TestArithmeticModf(unittest.TestCase): +@pytest.mark.skip("'casting' keyword is not supported yet") +class UfuncTestBase: + @testing.numpy_cupy_allclose(accept_error=TypeError) + def check_casting_out(self, in0_type, in1_type, out_type, casting, xp): + a = testing.shaped_arange((2, 3), xp, in0_type) + b = testing.shaped_arange((2, 3), xp, in1_type) + c = xp.zeros((2, 3), out_type) + if casting != "unsafe": + # may raise TypeError + return xp.add(a, b, out=c, casting=casting) + + with warnings.catch_warnings(record=True) as ws: + warnings.simplefilter("always") + ret = xp.add(a, b, out=c, casting=casting) + ws = [w.category for w in ws] + assert all([w == numpy.ComplexWarning for w in ws]), str(ws) + return ret, xp.array(len(ws)) + + @testing.numpy_cupy_allclose(accept_error=TypeError) + def check_casting_dtype(self, in0_type, in1_type, dtype, casting, xp): + a = testing.shaped_arange((2, 3), xp, in0_type) + b = testing.shaped_arange((2, 3), xp, in1_type) + if casting != "unsafe": + # may raise TypeError + return xp.add(a, b, dtype=dtype, casting=casting) + + with warnings.catch_warnings(record=True) as ws: + warnings.simplefilter("always") + ret = xp.add(a, b, dtype=dtype, casting="unsafe") + ws = [w.category for w in ws] + assert all([w == numpy.ComplexWarning for w in ws]), str(ws) + return ret, xp.array(len(ws)) + + # delete this, once check_casting_dtype passes + @testing.numpy_cupy_allclose() + def check_casting_dtype_unsafe_ignore_warnings( + self, in0_type, in1_type, dtype, xp + ): + a = testing.shaped_arange((2, 3), xp, in0_type) + b = testing.shaped_arange((2, 3), xp, in1_type) + with warnings.catch_warnings(): + warnings.simplefilter("ignore") + return xp.add(a, b, dtype=dtype, casting="unsafe") + + +class TestUfunc(UfuncTestBase): + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + "safe", + "same_kind", + "unsafe", 
+ ], + ) + @testing.for_all_dtypes_combination(names=["in_type", "out_type"]) + def test_casting_out_only(self, in_type, out_type, casting): + self.check_casting_out(in_type, in_type, out_type, casting) + + @pytest.mark.parametrize( + "casting", + [ + pytest.param("no", marks=pytest.mark.skip("flaky xfail")), + pytest.param("equiv", marks=pytest.mark.skip("flaky xfail")), + "safe", + "same_kind", + "unsafe", + ], + ) + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "out_type"], full=False + ) + def test_casting_in_out(self, in0_type, in1_type, out_type, casting): + self.check_casting_out(in0_type, in1_type, out_type, casting) + + @pytest.mark.xfail() + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + ], + ) + @pytest.mark.parametrize( + ("in0_type", "in1_type", "out_type"), + [ + (numpy.int16, numpy.int32, numpy.int32), + ], + ) + def test_casting_in_xfail1(self, in0_type, in1_type, out_type, casting): + self.check_casting_out(in0_type, in1_type, out_type, casting) + + @pytest.mark.skip("flaky xfail") + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + "safe", + "same_kind", + "unsafe", + ], + ) + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "dtype"], full=False + ) + def test_casting_dtype(self, in0_type, in1_type, dtype, casting): + self.check_casting_dtype(in0_type, in1_type, dtype, casting) + + @pytest.mark.xfail() + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + ], + ) + @pytest.mark.parametrize( + ("in0_type", "in1_type", "dtype"), + [ + (numpy.int16, numpy.int32, numpy.int32), + ], + ) + def test_casting_dtype_xfail1(self, in0_type, in1_type, dtype, casting): + self.check_casting_dtype(in0_type, in1_type, dtype, casting) + + @pytest.mark.xfail() + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + "safe", + "same_kind", + ], + ) + @pytest.mark.parametrize( + ("in0_type", "in1_type", "dtype"), + [ + (numpy.int32, numpy.int32, numpy.bool_), + 
(numpy.float64, numpy.float64, numpy.int32), + ], + ) + def test_casting_dtype_xfail2(self, in0_type, in1_type, dtype, casting): + self.check_casting_dtype(in0_type, in1_type, dtype, casting) + + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "dtype"], full=False + ) + def test_casting_dtype_unsafe_ignore_warnings( + self, in0_type, in1_type, dtype + ): + self.check_casting_dtype_unsafe_ignore_warnings( + in0_type, in1_type, dtype + ) + + +@testing.slow +class TestUfuncSlow(UfuncTestBase): + @pytest.mark.parametrize( + "casting", + [ + pytest.param("no", marks=pytest.mark.xfail()), + pytest.param("equiv", marks=pytest.mark.xfail()), + "safe", + "same_kind", + "unsafe", + ], + ) + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "out_type"], full=True + ) + def test_casting_out(self, in0_type, in1_type, out_type, casting): + self.check_casting_out(in0_type, in1_type, out_type, casting) + + @pytest.mark.xfail() + @pytest.mark.parametrize( + "casting", + [ + "no", + "equiv", + "safe", + "same_kind", + "unsafe", + ], + ) + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "dtype"], full=True + ) + def test_casting_dtype(self, in0_type, in1_type, dtype, casting): + self.check_casting_dtype(in0_type, in1_type, dtype, casting) + + @testing.for_all_dtypes_combination( + names=["in0_type", "in1_type", "dtype"], full=True + ) + def test_casting_dtype_unsafe_ignore_warnings( + self, in0_type, in1_type, dtype + ): + self.check_casting_dtype_unsafe_ignore_warnings( + in0_type, in1_type, dtype + ) + + +class TestArithmeticModf: @testing.for_float_dtypes() @testing.numpy_cupy_allclose() def test_modf(self, xp, dtype): @@ -406,11 +670,9 @@ def test_modf(self, xp, dtype): @testing.parameterize( *testing.product({"xp": [numpy, cupy], "shape": [(3, 2), (), (3, 0, 2)]}) ) -class TestBoolSubtract(unittest.TestCase): +class TestBoolSubtract: def test_bool_subtract(self): xp = self.xp - if xp is numpy and not 
testing.numpy_satisfies(">=1.14.0"): - raise unittest.SkipTest("NumPy<1.14.0") shape = self.shape x = testing.shaped_random(shape, xp, dtype=numpy.bool_) y = testing.shaped_random(shape, xp, dtype=numpy.bool_) From 0b7c230a8c3880e628d5f2b860ecd67d1966e850 Mon Sep 17 00:00:00 2001 From: vlad-perevezentsev Date: Thu, 27 Jun 2024 00:49:13 +0200 Subject: [PATCH 35/49] Implement `dpnp.isneginf` and `dpnp.isposinf` (#1888) * Implement dpnp.isneginf() * Add tests for dpnp.isneginf() * Implement dpnp.isposinf() * Add tests for dpnp.isposinf() * Add new functions to gen docs * Add additional checks * Add test_infinity_sign_errors * Add sycl_queue/usm tests for logic functions * Update tests * Remove out dtype check * Add TODO with support different out dtype * Update test_logic_op_2in --------- Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com> --- doc/reference/logic.rst | 2 + dpnp/dpnp_iface_logic.py | 146 ++++++++++++++++++ tests/test_logic.py | 47 ++++++ tests/test_sycl_queue.py | 83 ++++++++++ tests/test_usm_type.py | 34 ++-- .../cupy/logic_tests/test_content.py | 13 ++ 6 files changed, 314 insertions(+), 11 deletions(-) diff --git a/doc/reference/logic.rst b/doc/reference/logic.rst index f5b3e646e66..57133259c71 100644 --- a/doc/reference/logic.rst +++ b/doc/reference/logic.rst @@ -26,6 +26,8 @@ Infinities and NaNs dpnp.isfinite dpnp.isinf dpnp.isnan + dpnp.isneginf + dpnp.isposinf Array type testing diff --git a/dpnp/dpnp_iface_logic.py b/dpnp/dpnp_iface_logic.py index 70f92830637..8203d4e2ad1 100644 --- a/dpnp/dpnp_iface_logic.py +++ b/dpnp/dpnp_iface_logic.py @@ -66,6 +66,8 @@ "isfinite", "isinf", "isnan", + "isneginf", + "isposinf", "less", "less_equal", "logical_and", @@ -777,6 +779,150 @@ def isclose(x1, x2, rtol=1e-05, atol=1e-08, equal_nan=False): ) +def isneginf(x, out=None): + """ + Test element-wise for negative infinity, return result as bool array. + + For full documentation refer to :obj:`numpy.isneginf`. 
+ + Parameters + ---------- + x : {dpnp.ndarray, usm_ndarray} + Input array. + out : {None, dpnp.ndarray, usm_ndarray}, optional + A location into which the result is stored. If provided, it must have a + shape that the input broadcasts to and a boolean data type. + If not provided or ``None``, a freshly-allocated boolean array + is returned. + Default: ``None``. + + Returns + ------- + out : dpnp.ndarray + Boolean array of same shape as ``x``. + + See Also + -------- + :obj:`dpnp.isinf` : Test element-wise for positive or negative infinity. + :obj:`dpnp.isposinf` : Test element-wise for positive infinity, + return result as bool array. + :obj:`dpnp.isnan` : Test element-wise for NaN and + return result as a boolean array. + :obj:`dpnp.isfinite` : Test element-wise for finiteness. + + Examples + -------- + >>> import dpnp as np + >>> x = np.array(np.inf) + >>> np.isneginf(-x) + array(True) + >>> np.isneginf(x) + array(False) + + >>> x = np.array([-np.inf, 0., np.inf]) + >>> np.isneginf(x) + array([ True, False, False]) + + >>> x = np.array([-np.inf, 0., np.inf]) + >>> y = np.zeros(x.shape, dtype='bool') + >>> np.isneginf(x, y) + array([ True, False, False]) + >>> y + array([ True, False, False]) + + """ + + dpnp.check_supported_arrays_type(x) + + if out is not None: + dpnp.check_supported_arrays_type(out) + + x_dtype = x.dtype + if dpnp.issubdtype(x_dtype, dpnp.complexfloating): + raise TypeError( + f"This operation is not supported for {x_dtype} values " + "because it would be ambiguous." + ) + + is_inf = dpnp.isinf(x) + signbit = dpnp.signbit(x) + + # TODO: support different out dtype #1717(dpctl) + return dpnp.logical_and(is_inf, signbit, out=out) + + +def isposinf(x, out=None): + """ + Test element-wise for positive infinity, return result as bool array. + + For full documentation refer to :obj:`numpy.isposinf`. + + Parameters + ---------- + x : {dpnp.ndarray, usm_ndarray} + Input array. 
+ out : {None, dpnp.ndarray, usm_ndarray}, optional + A location into which the result is stored. If provided, it must have a + shape that the input broadcasts to and a boolean data type. + If not provided or ``None``, a freshly-allocated boolean array + is returned. + Default: ``None``. + + Returns + ------- + out : dpnp.ndarray + Boolean array of same shape as ``x``. + + See Also + -------- + :obj:`dpnp.isinf` : Test element-wise for positive or negative infinity. + :obj:`dpnp.isneginf` : Test element-wise for negative infinity, + return result as bool array. + :obj:`dpnp.isnan` : Test element-wise for NaN and + return result as a boolean array. + :obj:`dpnp.isfinite` : Test element-wise for finiteness. + + Examples + -------- + >>> import dpnp as np + >>> x = np.array(np.inf) + >>> np.isposinf(x) + array(True) + >>> np.isposinf(-x) + array(False) + + >>> x = np.array([-np.inf, 0., np.inf]) + >>> np.isposinf(x) + array([False, False, True]) + + >>> x = np.array([-np.inf, 0., np.inf]) + >>> y = np.zeros(x.shape, dtype='bool') + >>> np.isposinf(x, y) + array([False, False, True]) + >>> y + array([False, False, True]) + + """ + + dpnp.check_supported_arrays_type(x) + + if out is not None: + dpnp.check_supported_arrays_type(out) + + x_dtype = x.dtype + if dpnp.issubdtype(x_dtype, dpnp.complexfloating): + raise TypeError( + f"This operation is not supported for {x_dtype} values " + "because it would be ambiguous." + ) + + is_inf = dpnp.isinf(x) + signbit = ~dpnp.signbit(x) + + # TODO: support different out dtype #1717(dpctl) + return dpnp.logical_and(is_inf, signbit, out=out) + + _LESS_DOCSTRING = """ Computes the less-than test results for each element `x1_i` of the input array `x1` with the respective element `x2_i` of the input array `x2`. 
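The `dpnp.isneginf`/`dpnp.isposinf` implementations above are built by composing `isinf` with `signbit` through `logical_and`. The same composition reproduces `numpy.isneginf`, which makes for a quick host-side sanity check of the approach (NumPy shown here; the dpnp code path is identical in structure):

```python
import numpy as np

def isneginf_sketch(x, out=None):
    # negative infinity = value is infinite AND its sign bit is set;
    # `out` is forwarded to logical_and, mirroring the dpnp implementation
    return np.logical_and(np.isinf(x), np.signbit(x), out=out)

x = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf, np.nan])
assert np.array_equal(isneginf_sketch(x), np.isneginf(x))

# a preallocated boolean `out` is filled in place and returned
y = np.zeros(x.shape, dtype=bool)
res = isneginf_sketch(x, out=y)
assert res is y
```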
diff --git a/tests/test_logic.py b/tests/test_logic.py index e4f103e22c2..2b5e68d7d72 100644 --- a/tests/test_logic.py +++ b/tests/test_logic.py @@ -7,6 +7,7 @@ from .helper import ( get_all_dtypes, get_float_complex_dtypes, + get_float_dtypes, ) @@ -432,3 +433,49 @@ def test_finite(op, data, dtype): dpnp_res = getattr(dpnp, op)(x, out=dp_out) assert dp_out is dpnp_res assert_equal(dpnp_res, np_res) + + +@pytest.mark.parametrize("func", ["isneginf", "isposinf"]) +@pytest.mark.parametrize( + "data", + [ + [dpnp.inf, -1, 0, 1, dpnp.nan, -dpnp.inf], + [[dpnp.inf, dpnp.nan], [dpnp.nan, 0], [1, -dpnp.inf]], + ], + ids=[ + "1D array", + "2D array", + ], +) +@pytest.mark.parametrize("dtype", get_float_dtypes()) +def test_infinity_sign(func, data, dtype): + x = dpnp.asarray(data, dtype=dtype) + np_res = getattr(numpy, func)(x.asnumpy()) + dpnp_res = getattr(dpnp, func)(x) + assert_equal(dpnp_res, np_res) + + dp_out = dpnp.empty(np_res.shape, dtype=dpnp.bool) + dpnp_res = getattr(dpnp, func)(x, out=dp_out) + assert dp_out is dpnp_res + assert_equal(dpnp_res, np_res) + + +@pytest.mark.parametrize("func", ["isneginf", "isposinf"]) +def test_infinity_sign_errors(func): + data = [dpnp.inf, 0, -dpnp.inf] + + # unsupported data type + x = dpnp.asarray(data, dtype="c8") + x_np = dpnp.asnumpy(x) + assert_raises(TypeError, getattr(dpnp, func), x) + assert_raises(TypeError, getattr(numpy, func), x_np) + + # unsupported type + assert_raises(TypeError, getattr(dpnp, func), data) + assert_raises(TypeError, getattr(dpnp, func), x_np) + + # unsupported `out` data type + x = dpnp.asarray(data, dtype=dpnp.default_float_type()) + out = dpnp.empty_like(x, dtype="int32") + with pytest.raises(ValueError): + getattr(dpnp, func)(x, out=out) diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 3349c013428..378ecaf9b19 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -501,6 +501,40 @@ def test_1in_1out(func, data, device): 
assert_sycl_queue_equal(result_queue, expected_queue) +@pytest.mark.parametrize( + "op", + [ + "all", + "any", + "isfinite", + "isinf", + "isnan", + "isneginf", + "isposinf", + "logical_not", + ], +) +@pytest.mark.parametrize( + "device", + valid_devices, + ids=[device.filter_string for device in valid_devices], +) +def test_logic_op_1in(op, device): + x = dpnp.array( + [-dpnp.inf, -1.0, 0.0, 1.0, dpnp.inf, dpnp.nan], device=device + ) + result = getattr(dpnp, op)(x) + + x_orig = dpnp.asnumpy(x) + expected = getattr(numpy, op)(x_orig) + assert_dtype_allclose(result, expected) + + expected_queue = x.get_array().sycl_queue + result_queue = result.get_array().sycl_queue + + assert_sycl_queue_equal(result_queue, expected_queue) + + @pytest.mark.parametrize( "device", valid_devices, @@ -705,6 +739,55 @@ def test_2in_1out(func, data1, data2, device): assert_sycl_queue_equal(result.sycl_queue, x2.sycl_queue) +@pytest.mark.parametrize( + "op", + [ + "equal", + "greater", + "greater_equal", + # TODO: unblock when dpnp.isclose() is updated + # "isclose", + "less", + "less_equal", + "logical_and", + "logical_or", + "logical_xor", + "not_equal", + ], +) +@pytest.mark.parametrize( + "device", + valid_devices, + ids=[device.filter_string for device in valid_devices], +) +def test_logic_op_2in(op, device): + x1 = dpnp.array( + [-dpnp.inf, -1.0, 0.0, 1.0, dpnp.inf, dpnp.nan], device=device + ) + x2 = dpnp.array( + [dpnp.inf, 1.0, 0.0, -1.0, -dpnp.inf, dpnp.nan], device=device + ) + # Remove NaN value from input arrays because numpy raises RuntimeWarning + if op in [ + "greater", + "greater_equal", + "less", + "less_equal", + ]: + x1 = x1[:-1] + x2 = x2[:-1] + result = getattr(dpnp, op)(x1, x2) + + x1_orig = dpnp.asnumpy(x1) + x2_orig = dpnp.asnumpy(x2) + expected = getattr(numpy, op)(x1_orig, x2_orig) + + assert_dtype_allclose(result, expected) + + assert_sycl_queue_equal(result.sycl_queue, x1.sycl_queue) + assert_sycl_queue_equal(result.sycl_queue, x2.sycl_queue) + + 
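The `test_logic_op_2in` test above drops the NaN element before the ordering comparisons because some NumPy builds emit a `RuntimeWarning` when ordering NaNs. A host-side NumPy sketch of the comparisons it exercises, on the same mixed `+/-inf` inputs with NaN omitted:

```python
import numpy as np

x1 = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])
x2 = np.array([np.inf, 1.0, 0.0, -1.0, -np.inf])

eq = np.equal(x1, x2)        # only the 0.0 == 0.0 pair matches
gt = np.greater(x1, x2)      # inf compares greater than any finite value
lx = np.logical_xor(x1, x2)  # truthiness agrees elementwise, so all False

assert eq.tolist() == [False, False, True, False, False]
assert gt.tolist() == [False, False, False, True, True]
assert lx.tolist() == [False, False, False, False, False]
```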
@pytest.mark.parametrize( "func, data, scalar", [ diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index 5dafcfb7582..8d43bccd75a 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -357,20 +357,32 @@ def test_tril_triu(func, usm_type): @pytest.mark.parametrize( "op", [ - "equal", - "greater", - "greater_equal", - "less", - "less_equal", - "logical_and", - "logical_or", - "logical_xor", - "not_equal", + "all", + "any", + "isfinite", + "isinf", + "isnan", + "isneginf", + "isposinf", + "logical_not", ], - ids=[ +) +@pytest.mark.parametrize("usm_type_x", list_of_usm_types, ids=list_of_usm_types) +def test_coerced_usm_types_logic_op_1in(op, usm_type_x): + x = dp.arange(-10, 10, usm_type=usm_type_x) + res = getattr(dp, op)(x) + + assert x.usm_type == res.usm_type == usm_type_x + + +@pytest.mark.parametrize( + "op", + [ "equal", "greater", "greater_equal", + # TODO: unblock when dpnp.isclose() is updated + # "isclose", "less", "less_equal", "logical_and", @@ -381,7 +393,7 @@ def test_tril_triu(func, usm_type): ) @pytest.mark.parametrize("usm_type_x", list_of_usm_types, ids=list_of_usm_types) @pytest.mark.parametrize("usm_type_y", list_of_usm_types, ids=list_of_usm_types) -def test_coerced_usm_types_logic_op(op, usm_type_x, usm_type_y): +def test_coerced_usm_types_logic_op_2in(op, usm_type_x, usm_type_y): x = dp.arange(100, usm_type=usm_type_x) y = dp.arange(100, usm_type=usm_type_y)[::-1] diff --git a/tests/third_party/cupy/logic_tests/test_content.py b/tests/third_party/cupy/logic_tests/test_content.py index fe2446d68b2..3f0a88c6781 100644 --- a/tests/third_party/cupy/logic_tests/test_content.py +++ b/tests/third_party/cupy/logic_tests/test_content.py @@ -29,3 +29,16 @@ def test_isinf(self): def test_isnan(self): self.check_unary_nan("isnan") + + +class TestUfuncLike(unittest.TestCase): + @testing.numpy_cupy_array_equal() + def check_unary(self, name, xp): + a = xp.array([-3, xp.inf, -1, -xp.inf, 0, 1, 2, xp.nan]) + return getattr(xp, 
name)(a) + + def test_isneginf(self): + self.check_unary("isneginf") + + def test_isposinf(self): + self.check_unary("isposinf") From acb74b98d8e3c22c4c6880d156ef23c4d835efdf Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 27 Jun 2024 13:52:03 +0200 Subject: [PATCH 36/49] Remove MKL_VERSION_2024 variable from cmake files (#1889) * Get rid of MKL_VERSION_2024 variable * Return back a quite discovery --- CMakeLists.txt | 13 +++---------- dpnp/backend/CMakeLists.txt | 7 +------ dpnp/backend/extensions/blas/CMakeLists.txt | 6 +----- dpnp/backend/extensions/lapack/CMakeLists.txt | 6 +----- dpnp/backend/extensions/vm/CMakeLists.txt | 6 +----- 5 files changed, 7 insertions(+), 31 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 9d061b8020c..dfcb1667438 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -26,18 +26,11 @@ endif() set(MKL_ARCH "intel64") set(MKL_LINK "dynamic") set(MKL_THREADING "tbb_thread") -set(MKL_VERSION_2024 FALSE) +set(MKL_INTERFACE "ilp64") find_package(MKL QUIET) if(MKL_FOUND) - if(MKL_VERSION VERSION_GREATER_EQUAL "2024.0.0") - set(MKL_VERSION_2024 TRUE) - set(MKL_INTERFACE "ilp64") - find_package(MKL REQUIRED) - endif() -endif() - -if(NOT MKL_VERSION_2024) - set(MKL_INTERFACE_FULL "intel_ilp64") + find_package(MKL REQUIRED) +else() find_package(MKL REQUIRED PATHS ${CMAKE_SOURCE_DIR}/dpnp/backend/cmake/Modules NO_DEFAULT_PATH) endif() diff --git a/dpnp/backend/CMakeLists.txt b/dpnp/backend/CMakeLists.txt index f1f5b447772..f9eb1a35d28 100644 --- a/dpnp/backend/CMakeLists.txt +++ b/dpnp/backend/CMakeLists.txt @@ -84,12 +84,7 @@ if(DPNP_GENERATE_COVERAGE) target_link_options(${_trgt} PRIVATE -fprofile-instr-generate -fcoverage-mapping) endif() -if (MKL_VERSION_2024) - target_link_libraries(${_trgt} PUBLIC MKL::MKL_SYCL) -else() - target_link_libraries(${_trgt} PUBLIC MKL::MKL_DPCPP) -endif() - +target_link_libraries(${_trgt} PUBLIC MKL::MKL_SYCL) target_link_libraries(${_trgt} PUBLIC 
oneDPL) if (UNIX) diff --git a/dpnp/backend/extensions/blas/CMakeLists.txt b/dpnp/backend/extensions/blas/CMakeLists.txt index 8ef4e7d79e1..7e2ce831870 100644 --- a/dpnp/backend/extensions/blas/CMakeLists.txt +++ b/dpnp/backend/extensions/blas/CMakeLists.txt @@ -69,11 +69,7 @@ if (DPNP_GENERATE_COVERAGE) target_link_options(${python_module_name} PRIVATE -fprofile-instr-generate -fcoverage-mapping) endif() -if (MKL_VERSION_2024) - target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::BLAS) -else() - target_link_libraries(${python_module_name} PUBLIC MKL::MKL_DPCPP) -endif() +target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::BLAS) install(TARGETS ${python_module_name} DESTINATION "dpnp/backend/extensions/blas" diff --git a/dpnp/backend/extensions/lapack/CMakeLists.txt b/dpnp/backend/extensions/lapack/CMakeLists.txt index c25ef1d97bc..f21f61c84df 100644 --- a/dpnp/backend/extensions/lapack/CMakeLists.txt +++ b/dpnp/backend/extensions/lapack/CMakeLists.txt @@ -82,11 +82,7 @@ if (DPNP_GENERATE_COVERAGE) target_link_options(${python_module_name} PRIVATE -fprofile-instr-generate -fcoverage-mapping) endif() -if (MKL_VERSION_2024) - target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::LAPACK) -else() - target_link_libraries(${python_module_name} PUBLIC MKL::MKL_DPCPP) -endif() +target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::LAPACK) install(TARGETS ${python_module_name} DESTINATION "dpnp/backend/extensions/lapack" diff --git a/dpnp/backend/extensions/vm/CMakeLists.txt b/dpnp/backend/extensions/vm/CMakeLists.txt index de6262581f5..0a7646cfc57 100644 --- a/dpnp/backend/extensions/vm/CMakeLists.txt +++ b/dpnp/backend/extensions/vm/CMakeLists.txt @@ -109,11 +109,7 @@ if (DPNP_GENERATE_COVERAGE) target_link_options(${python_module_name} PRIVATE -fprofile-instr-generate -fcoverage-mapping) endif() -if (MKL_VERSION_2024) - target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::VM) -else() - 
target_link_libraries(${python_module_name} PUBLIC MKL::MKL_DPCPP) -endif() +target_link_libraries(${python_module_name} PUBLIC MKL::MKL_SYCL::VM) install(TARGETS ${python_module_name} DESTINATION "dpnp/backend/extensions/vm" From 090ae64568b0a415742003207883ad1ff019a79a Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 27 Jun 2024 15:25:01 +0200 Subject: [PATCH 37/49] Bump `test_basic.py` with the latest content (#1901) * Bump test_basic.py with the latest content * Add test scope to public CI --- .github/workflows/conda-package.yml | 1 + tests/skipped_tests.tbl | 18 -- tests/skipped_tests_gpu.tbl | 18 -- .../cupy/creation_tests/test_basic.py | 201 +++++++++++------- 4 files changed, 124 insertions(+), 114 deletions(-) diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml index 8f474e5398e..d11f1a3038c 100644 --- a/.github/workflows/conda-package.yml +++ b/.github/workflows/conda-package.yml @@ -49,6 +49,7 @@ env: test_umath.py test_usm_type.py third_party/cupy/core_tests + third_party/cupy/creation_tests third_party/cupy/indexing_tests/test_indexing.py third_party/cupy/lib_tests third_party/cupy/linalg_tests diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl index c86b0d848c5..37285be810f 100644 --- a/tests/skipped_tests.tbl +++ b/tests/skipped_tests.tbl @@ -89,24 +89,6 @@ tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_pa tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_param_7_{order='F', shape=(10, 20, 30, 40)}::test_cub_max tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_param_7_{order='F', shape=(10, 20, 30, 40)}::test_cub_min -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_0_{shape=4}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_1_{shape=(4,)}::test_empty_like_K_strides_reshape 
-tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_2_{shape=(4, 2)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_3_{shape=(4, 2, 3)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_4_{shape=(5, 4, 2, 3)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_huge_size -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_huge_size_fill0 -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_int_huge_size -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_int_huge_size_fill0 -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_invalid_order -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_K_strides -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_zero_sized_array_strides -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_full_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_ones_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_zeros_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_zeros_strides - tests/third_party/cupy/indexing_tests/test_generate.py::TestAxisConcatenator::test_AxisConcatenator_init1 tests/third_party/cupy/indexing_tests/test_generate.py::TestAxisConcatenator::test_len tests/third_party/cupy/indexing_tests/test_generate.py::TestC_::test_c_1 diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index 45b41f2dafb..55fd91b0def 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -115,24 +115,6 @@ 
tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_pa tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_param_7_{order='F', shape=(10, 20, 30, 40)}::test_cub_max tests/third_party/cupy/core_tests/test_ndarray_reduction.py::TestCubReduction_param_7_{order='F', shape=(10, 20, 30, 40)}::test_cub_min -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_0_{shape=4}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_1_{shape=(4,)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_2_{shape=(4, 2)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_3_{shape=(4, 2, 3)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasicReshape_param_4_{shape=(5, 4, 2, 3)}::test_empty_like_K_strides_reshape -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_huge_size -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_huge_size_fill0 -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_int_huge_size -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_int_huge_size_fill0 -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_invalid_order -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_K_strides -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_empty_zero_sized_array_strides -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_full_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_ones_like_subok -tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_zeros_like_subok 
-tests/third_party/cupy/creation_tests/test_basic.py::TestBasic::test_zeros_strides - tests/third_party/cupy/fft_tests/test_fft.py::TestFft2_param_1_{axes=None, norm=None, s=(1, None), shape=(3, 4)}::test_fft2 tests/third_party/cupy/fft_tests/test_fft.py::TestFft2_param_7_{axes=(), norm=None, s=None, shape=(3, 4)}::test_fft2 tests/third_party/cupy/fft_tests/test_fft.py::TestFft2_param_7_{axes=(), norm=None, s=None, shape=(3, 4)}::test_ifft2 diff --git a/tests/third_party/cupy/creation_tests/test_basic.py b/tests/third_party/cupy/creation_tests/test_basic.py index f2fe44b9fac..4623a39d383 100644 --- a/tests/third_party/cupy/creation_tests/test_basic.py +++ b/tests/third_party/cupy/creation_tests/test_basic.py @@ -1,4 +1,4 @@ -import unittest +import warnings import numpy import pytest @@ -7,7 +7,7 @@ from tests.third_party.cupy import testing -class TestBasic(unittest.TestCase): +class TestBasic: @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() @@ -20,19 +20,17 @@ def test_empty(self, xp, dtype, order): def test_empty_huge_size(self): a = cupy.empty((1024, 2048, 1024), dtype="b") a.fill(123) - self.assertTrue((a == 123).all()) + assert (a == 123).all() # Free huge memory for slow test del a - cupy.get_default_memory_pool().free_all_blocks() @testing.slow def test_empty_huge_size_fill0(self): a = cupy.empty((1024, 2048, 1024), dtype="b") a.fill(0) - self.assertTrue((a == 0).all()) + assert (a == 0).all() # Free huge memory for slow test del a - cupy.get_default_memory_pool().free_all_blocks() @testing.for_CF_orders() @testing.for_all_dtypes() @@ -42,6 +40,17 @@ def test_empty_scalar(self, xp, dtype, order): a.fill(0) return a + @pytest.mark.skip("passing 'None' into shape arguments is not supported") + @testing.with_requires("numpy>=1.20") + @testing.for_CF_orders() + @testing.for_all_dtypes() + @testing.numpy_cupy_array_equal() + def test_empty_scalar_none(self, xp, dtype, order): + with testing.assert_warns(DeprecationWarning): + 
a = xp.empty(None, dtype=dtype, order=order) + a.fill(0) + return a + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() @@ -54,7 +63,7 @@ def test_empty_int(self, xp, dtype, order): def test_empty_int_huge_size(self): a = cupy.empty(2**31, dtype="b") a.fill(123) - self.assertTrue((a == 123).all()) + assert (a == 123).all() # Free huge memory for slow test del a cupy.get_default_memory_pool().free_all_blocks() @@ -63,12 +72,12 @@ def test_empty_int_huge_size(self): def test_empty_int_huge_size_fill0(self): a = cupy.empty(2**31, dtype="b") a.fill(0) - self.assertTrue((a == 0).all()) + assert (a == 0).all() # Free huge memory for slow test del a cupy.get_default_memory_pool().free_all_blocks() - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like(self, xp, dtype, order): @@ -77,7 +86,7 @@ def test_empty_like(self, xp, dtype, order): b.fill(0) return b - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_contiguity(self, xp, dtype, order): @@ -85,12 +94,12 @@ def test_empty_like_contiguity(self, xp, dtype, order): b = xp.empty_like(a, order=order) b.fill(0) if order in ["f", "F"]: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous return b - @testing.for_orders("C") + @testing.for_orders("CF") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_contiguity2(self, xp, dtype, order): @@ -99,12 +108,12 @@ def test_empty_like_contiguity2(self, xp, dtype, order): b = xp.empty_like(a, order=order) b.fill(0) if order in ["c", "C"]: - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous else: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous return b - @testing.for_orders("C") + @testing.for_orders("CF") @testing.for_all_dtypes() 
@testing.numpy_cupy_array_equal() def test_empty_like_contiguity3(self, xp, dtype, order): @@ -114,16 +123,17 @@ def test_empty_like_contiguity3(self, xp, dtype, order): b = xp.empty_like(a, order=order) b.fill(0) if order in ["k", "K", None]: - self.assertFalse(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert not b.flags.f_contiguous elif order in ["f", "F"]: - self.assertFalse(b.flags.c_contiguous) - self.assertTrue(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert b.flags.c_contiguous + assert not b.flags.f_contiguous return b + @pytest.mark.skip("order 'K' is not supported") @testing.for_all_dtypes() def test_empty_like_K_strides(self, dtype): # test strides that are both non-contiguous and non-descending @@ -139,31 +149,37 @@ def test_empty_like_K_strides(self, dtype): bg.fill(0) # make sure NumPy and CuPy strides agree - self.assertEqual(b.strides, bg.strides) + assert b.strides == bg.strides return + @testing.with_requires("numpy>=1.19") @testing.for_all_dtypes() def test_empty_like_invalid_order(self, dtype): for xp in (numpy, cupy): a = testing.shaped_arange((2, 3, 4), xp, dtype) - with pytest.raises(TypeError): + with pytest.raises(ValueError): xp.empty_like(a, order="Q") + @pytest.mark.skip("subok keyword is not supported") def test_empty_like_subok(self): a = testing.shaped_arange((2, 3, 4), cupy) with pytest.raises(TypeError): cupy.empty_like(a, subok=True) + @pytest.mark.skip("strides for zero sized array is different") @testing.for_CF_orders() + @testing.with_requires("numpy>=1.23") def test_empty_zero_sized_array_strides(self, order): a = numpy.empty((1, 0, 2), dtype="d", order=order) b = cupy.empty((1, 0, 2), dtype="d", order=order) - self.assertEqual(b.strides, a.strides) + assert b.strides == a.strides + @pytest.mark.parametrize("offset", [1, -1, 1 << 63, -(1 
<< 63)]) + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() - def test_eye(self, xp, dtype): - return xp.eye(5, 4, k=1, dtype=dtype) + def test_eye(self, xp, dtype, order, offset): + return xp.eye(5, 4, offset, dtype, order=order) @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() @@ -182,6 +198,15 @@ def test_zeros(self, xp, dtype, order): def test_zeros_scalar(self, xp, dtype, order): return xp.zeros((), dtype=dtype, order=order) + @pytest.mark.skip("passing 'None' into shape arguments is not supported") + @testing.with_requires("numpy>=1.20") + @testing.for_CF_orders() + @testing.for_all_dtypes() + @testing.numpy_cupy_array_equal() + def test_zeros_scalar_none(self, xp, dtype, order): + with testing.assert_warns(DeprecationWarning): + return xp.zeros(None, dtype=dtype, order=order) + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() @@ -190,61 +215,80 @@ def test_zeros_int(self, xp, dtype, order): @testing.for_CF_orders() def test_zeros_strides(self, order): - a = numpy.zeros((2, 3), dtype="d", order=order) - b = cupy.zeros((2, 3), dtype="d", order=order) - self.assertEqual(b.strides, a.strides) + a = numpy.zeros((2, 3), dtype="f", order=order) + b = cupy.zeros((2, 3), dtype="f", order=order) + b_strides = tuple(x * b.itemsize for x in b.strides) + assert b_strides == a.strides - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_zeros_like(self, xp, dtype, order): a = xp.ndarray((2, 3, 4), dtype=dtype) return xp.zeros_like(a, order=order) + @pytest.mark.skip("subok keyword is not supported") def test_zeros_like_subok(self): a = cupy.ndarray((2, 3, 4)) with pytest.raises(TypeError): cupy.zeros_like(a, subok=True) + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() - def test_ones(self, xp, dtype): - return xp.ones((2, 3, 4), dtype=dtype) + def test_ones(self, xp, dtype, order): + 
return xp.ones((2, 3, 4), dtype=dtype, order=order) - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_ones_like(self, xp, dtype, order): a = xp.ndarray((2, 3, 4), dtype=dtype) return xp.ones_like(a, order=order) + @pytest.mark.skip("subok keyword is not supported") def test_ones_like_subok(self): a = cupy.ndarray((2, 3, 4)) with pytest.raises(TypeError): cupy.ones_like(a, subok=True) + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() - def test_full(self, xp, dtype): - return xp.full((2, 3, 4), 1, dtype=dtype) + def test_full(self, xp, dtype, order): + return xp.full((2, 3, 4), 1, dtype=dtype, order=order) + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() - def test_full_default_dtype(self, xp, dtype): - return xp.full((2, 3, 4), xp.array(1, dtype=dtype)) + def test_full_default_dtype(self, xp, dtype, order): + return xp.full((2, 3, 4), xp.array(1, dtype=dtype), order=order) - @testing.for_all_dtypes() + @testing.for_all_dtypes_combination(("dtype1", "dtype2")) @testing.numpy_cupy_array_equal() - def test_full_default_dtype_cpu_input(self, xp, dtype): - return xp.full((2, 3, 4), numpy.array(1, dtype=dtype)) + def test_full_dtypes_cpu_input(self, xp, dtype1, dtype2): + with warnings.catch_warnings(): + warnings.simplefilter("ignore", numpy.ComplexWarning) + return xp.full( + (2, 3, 4), numpy.array(1, dtype=dtype1), dtype=dtype2 + ) - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_full_like(self, xp, dtype, order): a = xp.ndarray((2, 3, 4), dtype=dtype) return xp.full_like(a, 1, order=order) + @testing.for_all_dtypes_combination(("dtype1", "dtype2")) + @testing.numpy_cupy_array_equal() + def test_full_like_dtypes_cpu_input(self, xp, dtype1, dtype2): + a = xp.ndarray((2, 3, 4), dtype=dtype1) + with warnings.catch_warnings(): + 
warnings.simplefilter("ignore", numpy.ComplexWarning) + return xp.full_like(a, numpy.array(1, dtype=dtype1)) + + @pytest.mark.skip("subok keyword is not supported") def test_full_like_subok(self): a = cupy.ndarray((2, 3, 4)) with pytest.raises(TypeError): @@ -258,9 +302,9 @@ def test_full_like_subok(self): } ) ) -class TestBasicReshape(unittest.TestCase): +class TestBasicReshape: @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_reshape(self, xp, dtype, order): @@ -281,7 +325,7 @@ def test_empty_like_reshape_cupy_only(self, dtype, order): testing.assert_array_equal(b, c) @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_reshape_contiguity(self, xp, dtype, order): @@ -289,12 +333,12 @@ def test_empty_like_reshape_contiguity(self, xp, dtype, order): b = xp.empty_like(a, order=order, shape=self.shape) b.fill(0) if order in ["f", "F"]: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous return b - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() def test_empty_like_reshape_contiguity_cupy_only(self, dtype, order): a = testing.shaped_arange((2, 3, 4), cupy, dtype) @@ -303,13 +347,13 @@ def test_empty_like_reshape_contiguity_cupy_only(self, dtype, order): c = cupy.empty(self.shape) c.fill(0) if order in ["f", "F"]: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous testing.assert_array_equal(b, c) @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_orders("CF") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_reshape_contiguity2(self, xp, dtype, order): 
@@ -321,12 +365,12 @@ def test_empty_like_reshape_contiguity2(self, xp, dtype, order): if order in ["c", "C"] or ( order in ["k", "K", None] and len(shape) != a.ndim ): - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous else: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous return b - @testing.for_orders("C") + @testing.for_orders("CF") @testing.for_all_dtypes() def test_empty_like_reshape_contiguity2_cupy_only(self, dtype, order): a = testing.shaped_arange((2, 3, 4), cupy, dtype) @@ -339,13 +383,13 @@ def test_empty_like_reshape_contiguity2_cupy_only(self, dtype, order): if order in ["c", "C"] or ( order in ["k", "K", None] and len(shape) != a.ndim ): - self.assertTrue(b.flags.c_contiguous) + assert b.flags.c_contiguous else: - self.assertTrue(b.flags.f_contiguous) + assert b.flags.f_contiguous testing.assert_array_equal(b, c) @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_orders("CF") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_empty_like_reshape_contiguity3(self, xp, dtype, order): @@ -356,20 +400,20 @@ def test_empty_like_reshape_contiguity3(self, xp, dtype, order): b.fill(0) shape = self.shape if not numpy.isscalar(self.shape) else (self.shape,) if len(shape) == 1: - self.assertTrue(b.flags.c_contiguous) - self.assertTrue(b.flags.f_contiguous) + assert b.flags.c_contiguous + assert b.flags.f_contiguous elif order in ["k", "K", None] and len(shape) == a.ndim: - self.assertFalse(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert not b.flags.f_contiguous elif order in ["f", "F"]: - self.assertFalse(b.flags.c_contiguous) - self.assertTrue(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert b.flags.c_contiguous + assert not b.flags.f_contiguous return b - @testing.for_orders("C") + 
@testing.for_orders("CF") @testing.for_all_dtypes() def test_empty_like_reshape_contiguity3_cupy_only(self, dtype, order): a = testing.shaped_arange((2, 3, 4), cupy, dtype) @@ -379,22 +423,23 @@ def test_empty_like_reshape_contiguity3_cupy_only(self, dtype, order): b.fill(0) shape = self.shape if not numpy.isscalar(self.shape) else (self.shape,) if len(shape) == 1: - self.assertTrue(b.flags.c_contiguous) - self.assertTrue(b.flags.f_contiguous) + assert b.flags.c_contiguous + assert b.flags.f_contiguous elif order in ["k", "K", None] and len(shape) == a.ndim: - self.assertFalse(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert not b.flags.f_contiguous elif order in ["f", "F"]: - self.assertFalse(b.flags.c_contiguous) - self.assertTrue(b.flags.f_contiguous) + assert not b.flags.c_contiguous + assert b.flags.f_contiguous else: - self.assertTrue(b.flags.c_contiguous) - self.assertFalse(b.flags.f_contiguous) + assert b.flags.c_contiguous + assert not b.flags.f_contiguous c = cupy.zeros(self.shape) c.fill(0) testing.assert_array_equal(b, c) + @pytest.mark.skip("order 'K' is not supported") @testing.with_requires("numpy>=1.17.0") @testing.for_all_dtypes() def test_empty_like_K_strides_reshape(self, dtype): @@ -411,11 +456,11 @@ def test_empty_like_K_strides_reshape(self, dtype): bg.fill(0) # make sure NumPy and CuPy strides agree - self.assertEqual(b.strides, bg.strides) + assert b.strides == bg.strides return @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_zeros_like_reshape(self, xp, dtype, order): @@ -432,7 +477,7 @@ def test_zeros_like_reshape_cupy_only(self, dtype, order): testing.assert_array_equal(b, c) @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_ones_like_reshape(self, xp, dtype, 
order): @@ -448,7 +493,7 @@ def test_ones_like_reshape_cupy_only(self, dtype): testing.assert_array_equal(b, c) @testing.with_requires("numpy>=1.17.0") - @testing.for_orders("C") + @testing.for_CF_orders() @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_full_like_reshape(self, xp, dtype, order): From 067a7849835b3e9add5418cd8ae2b81e08f2bd60 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 27 Jun 2024 17:05:21 +0200 Subject: [PATCH 38/49] Bump conda-index from 0.4.0 to 0.5.0 (#1902) --- .github/workflows/conda-package.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml index d11f1a3038c..c67b7487429 100644 --- a/.github/workflows/conda-package.yml +++ b/.github/workflows/conda-package.yml @@ -13,7 +13,7 @@ env: MODULE_NAME: dpnp CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' CONDA_BUILD_VERSION: '24.5.1' - CONDA_INDEX_VERSION: '0.4.0' + CONDA_INDEX_VERSION: '0.5.0' RUN_TESTS_MAX_ATTEMPTS: 2 TEST_ENV_NAME: 'test' TEST_SCOPE: >- From 437f0468fd8fa9afd8e34c305d4df60218e0c6d1 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Fri, 28 Jun 2024 12:37:54 +0200 Subject: [PATCH 39/49] Check type of input in `dpnp.repeat` to raise a proper validation exception if any (#1894) * Check type of input to raise a proper validation exception if any * Update dpnp/dpnp_iface_manipulation.py Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --------- Co-authored-by: vtavana <120411540+vtavana@users.noreply.github.com> --- dpnp/dpnp_iface_manipulation.py | 27 +- tests/test_arraymanipulation.py | 111 -------- tests/test_manipulation.py | 236 ++++++++++++++++-- .../cupy/manipulation_tests/test_tiling.py | 4 - 4 files changed, 237 insertions(+), 141 deletions(-) diff --git a/dpnp/dpnp_iface_manipulation.py b/dpnp/dpnp_iface_manipulation.py index 
056ac790720..bf3c66d7fda 100644 --- a/dpnp/dpnp_iface_manipulation.py +++ b/dpnp/dpnp_iface_manipulation.py @@ -1248,12 +1248,16 @@ def repeat(a, repeats, axis=None): ---------- x : {dpnp.ndarray, usm_ndarray} Input array. - repeat : int or array of int + repeats : {int, tuple, list, range, dpnp.ndarray, usm_ndarray} The number of repetitions for each element. `repeats` is broadcasted to fit the shape of the given axis. - axis : int, optional + If `repeats` is an array, it must have an integer data type. + Otherwise, `repeats` must be a Python integer or sequence of Python + integers (i.e., a tuple, list, or range). + axis : {None, int}, optional The axis along which to repeat values. By default, use the flattened input array, and return a flat output array. + Default: ``None``. Returns ------- @@ -1263,8 +1267,8 @@ def repeat(a, repeats, axis=None): See Also -------- - :obj:`dpnp.tile` : Construct an array by repeating A the number of times - given by reps. + :obj:`dpnp.tile` : Tile an array. + :obj:`dpnp.unique` : Find the unique elements of an array. 
Examples -------- @@ -1286,14 +1290,15 @@ def repeat(a, repeats, axis=None): """ - rep = repeats - if isinstance(repeats, dpnp_array): - rep = dpnp.get_usm_ndarray(repeats) + dpnp.check_supported_arrays_type(a) + if not isinstance(repeats, (int, tuple, list, range)): + repeats = dpnp.get_usm_ndarray(repeats) + if axis is None and a.ndim > 1: - usm_arr = dpnp.get_usm_ndarray(a.flatten()) - else: - usm_arr = dpnp.get_usm_ndarray(a) - usm_arr = dpt.repeat(usm_arr, rep, axis=axis) + a = dpnp.ravel(a) + + usm_arr = dpnp.get_usm_ndarray(a) + usm_arr = dpt.repeat(usm_arr, repeats, axis=axis) return dpnp_array._create_from_usm_ndarray(usm_arr) diff --git a/tests/test_arraymanipulation.py b/tests/test_arraymanipulation.py index 12f14bf4109..a6bbfd0e987 100644 --- a/tests/test_arraymanipulation.py +++ b/tests/test_arraymanipulation.py @@ -1016,114 +1016,3 @@ def test_can_cast(): assert dpnp.can_cast(X, "float32") == numpy.can_cast(X_np, "float32") assert dpnp.can_cast(X, dpnp.int32) == numpy.can_cast(X_np, numpy.int32) assert dpnp.can_cast(X, dpnp.int64) == numpy.can_cast(X_np, numpy.int64) - - -def test_repeat_scalar_sequence_agreement(): - x = dpnp.arange(5, dtype="i4") - expected_res = dpnp.empty(10, dtype="i4") - expected_res[1::2], expected_res[::2] = x, x - - # scalar case - reps = 2 - res = dpnp.repeat(x, reps) - assert dpnp.all(res == expected_res) - - # tuple - reps = (2, 2, 2, 2, 2) - res = dpnp.repeat(x, reps) - assert dpnp.all(res == expected_res) - - -def test_repeat_as_broadcasting(): - reps = 5 - x = dpnp.arange(reps, dtype="i4") - x1 = x[:, dpnp.newaxis] - expected_res = dpnp.broadcast_to(x1, (reps, reps)) - - res = dpnp.repeat(x1, reps, axis=1) - assert dpnp.all(res == expected_res) - - x2 = x[dpnp.newaxis, :] - expected_res = dpnp.broadcast_to(x2, (reps, reps)) - - res = dpnp.repeat(x2, reps, axis=0) - assert dpnp.all(res == expected_res) - - -def test_repeat_axes(): - reps = 2 - x = dpnp.reshape(dpnp.arange(5 * 10, dtype="i4"), (5, 10)) - expected_res = 
dpnp.empty((x.shape[0] * 2, x.shape[1]), dtype=x.dtype) - expected_res[::2, :], expected_res[1::2] = x, x - res = dpnp.repeat(x, reps, axis=0) - assert dpnp.all(res == expected_res) - - expected_res = dpnp.empty((x.shape[0], x.shape[1] * 2), dtype=x.dtype) - expected_res[:, ::2], expected_res[:, 1::2] = x, x - res = dpnp.repeat(x, reps, axis=1) - assert dpnp.all(res == expected_res) - - -def test_repeat_size_0_outputs(): - x = dpnp.ones((3, 0, 5), dtype="i4") - reps = 10 - res = dpnp.repeat(x, reps, axis=0) - assert res.size == 0 - assert res.shape == (30, 0, 5) - - res = dpnp.repeat(x, reps, axis=1) - assert res.size == 0 - assert res.shape == (3, 0, 5) - - res = dpnp.repeat(x, (2, 2, 2), axis=0) - assert res.size == 0 - assert res.shape == (6, 0, 5) - - x = dpnp.ones((3, 2, 5)) - res = dpnp.repeat(x, 0, axis=1) - assert res.size == 0 - assert res.shape == (3, 0, 5) - - x = dpnp.ones((3, 2, 5)) - res = dpnp.repeat(x, (0, 0), axis=1) - assert res.size == 0 - assert res.shape == (3, 0, 5) - - -def test_repeat_strides(): - reps = 2 - x = dpnp.reshape(dpnp.arange(10 * 10, dtype="i4"), (10, 10)) - x1 = x[:, ::-2] - expected_res = dpnp.empty((10, 10), dtype="i4") - expected_res[:, ::2], expected_res[:, 1::2] = x1, x1 - res = dpnp.repeat(x1, reps, axis=1) - assert dpnp.all(res == expected_res) - res = dpnp.repeat(x1, (reps,) * x1.shape[1], axis=1) - assert dpnp.all(res == expected_res) - - x1 = x[::-2, :] - expected_res = dpnp.empty((10, 10), dtype="i4") - expected_res[::2, :], expected_res[1::2, :] = x1, x1 - res = dpnp.repeat(x1, reps, axis=0) - assert dpnp.all(res == expected_res) - res = dpnp.repeat(x1, (reps,) * x1.shape[0], axis=0) - assert dpnp.all(res == expected_res) - - -def test_repeat_casting(): - x = dpnp.arange(5, dtype="i4") - # i4 is cast to i8 - reps = dpnp.ones(5, dtype="i4") - res = dpnp.repeat(x, reps) - assert res.shape == x.shape - assert dpnp.all(res == x) - - -def test_repeat_strided_repeats(): - x = dpnp.arange(5, dtype="i4") - reps = 
dpnp.ones(10, dtype="i8") - reps[::2] = 0 - reps = reps[::-2] - res = dpnp.repeat(x, reps) - assert res.shape == x.shape - assert dpnp.all(res == x) diff --git a/tests/test_manipulation.py b/tests/test_manipulation.py index 9c0869024a5..0178ff9a28b 100644 --- a/tests/test_manipulation.py +++ b/tests/test_manipulation.py @@ -1,6 +1,7 @@ +import dpctl.tensor as dpt import numpy import pytest -from numpy.testing import assert_array_equal +from numpy.testing import assert_array_equal, assert_raises import dpnp @@ -58,20 +59,6 @@ def test_copyto_where_raises(where): dpnp.copyto(a, b, where=where) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") -@pytest.mark.parametrize( - "arr", - [[], [1, 2, 3, 4], [[1, 2], [3, 4]], [[[1], [2]], [[3], [4]]]], - ids=["[]", "[1, 2, 3, 4]", "[[1, 2], [3, 4]]", "[[[1], [2]], [[3], [4]]]"], -) -def test_repeat(arr): - a = numpy.array(arr) - dpnp_a = dpnp.array(arr) - expected = numpy.repeat(a, 2) - result = dpnp.repeat(dpnp_a, 2) - assert_array_equal(expected, result) - - def test_result_type(): X = [dpnp.ones((2), dtype=dpnp.int64), dpnp.int32, "float32"] X_np = [numpy.ones((2), dtype=numpy.int64), numpy.int32, "float32"] @@ -114,6 +101,225 @@ def test_unique(array): assert_array_equal(expected, result) +class TestRepeat: + @pytest.mark.parametrize( + "data", + [[], [1, 2, 3, 4], [[1, 2], [3, 4]], [[[1], [2]], [[3], [4]]]], + ids=[ + "[]", + "[1, 2, 3, 4]", + "[[1, 2], [3, 4]]", + "[[[1], [2]], [[3], [4]]]", + ], + ) + @pytest.mark.parametrize("dtype", get_all_dtypes()) + def test_data(self, data, dtype): + a = numpy.array(data, dtype=dtype) + ia = dpnp.array(a) + + expected = numpy.repeat(a, 2) + result = dpnp.repeat(ia, 2) + assert_array_equal(expected, result) + + @pytest.mark.parametrize( + "repeats", [2, (2, 2, 2, 2, 2)], ids=["scalar", "tuple"] + ) + def test_scalar_sequence_agreement(self, repeats): + a = numpy.arange(5, dtype="i4") + ia = dpnp.array(a) + + expected = numpy.repeat(a, repeats) + result = dpnp.repeat(ia, 
repeats) + assert_array_equal(expected, result) + + @pytest.mark.parametrize("axis", [0, 1]) + def test_broadcasting(self, axis): + reps = 5 + a = numpy.arange(reps, dtype="i4") + if axis == 0: + sh = (reps, 1) + else: + sh = (1, reps) + a = a.reshape(sh) + ia = dpnp.array(a) + + expected = numpy.repeat(a, reps) + result = dpnp.repeat(ia, reps) + assert_array_equal(expected, result) + + @pytest.mark.parametrize("axis", [0, 1]) + def test_axes(self, axis): + reps = 2 + a = numpy.arange(5 * 10, dtype="i4").reshape((5, 10)) + ia = dpnp.array(a) + + expected = numpy.repeat(a, reps, axis=axis) + result = dpnp.repeat(ia, reps, axis=axis) + assert_array_equal(expected, result) + + def test_size_0_outputs(self): + reps = 10 + a = dpnp.ones((3, 0, 5), dtype="i4") + ia = dpnp.array(a) + + expected = numpy.repeat(a, reps, axis=0) + result = dpnp.repeat(ia, reps, axis=0) + assert_array_equal(expected, result) + + expected = numpy.repeat(a, reps, axis=1) + result = dpnp.repeat(ia, reps, axis=1) + assert_array_equal(expected, result) + + reps = (2, 2, 2) + expected = numpy.repeat(a, reps, axis=0) + result = dpnp.repeat(ia, reps, axis=0) + assert_array_equal(expected, result) + + a = numpy.ones((3, 2, 5)) + ia = dpnp.array(a) + + reps = 0 + expected = numpy.repeat(a, reps, axis=1) + result = dpnp.repeat(ia, reps, axis=1) + assert_array_equal(expected, result) + + reps = (0, 0) + expected = numpy.repeat(a, reps, axis=1) + result = dpnp.repeat(ia, reps, axis=1) + assert_array_equal(expected, result) + + def test_strides_0(self): + reps = 2 + a = numpy.arange(10 * 10, dtype="i4").reshape((10, 10)) + ia = dpnp.array(a) + + a = a[::-2, :] + ia = ia[::-2, :] + + expected = numpy.repeat(a, reps, axis=0) + result = dpnp.repeat(ia, reps, axis=0) + assert_array_equal(expected, result) + + expected = numpy.repeat(a, (reps,) * a.shape[0], axis=0) + result = dpnp.repeat(ia, (reps,) * ia.shape[0], axis=0) + assert_array_equal(expected, result) + + def test_strides_1(self): + reps = 2 + a = 
numpy.arange(10 * 10, dtype="i4").reshape((10, 10)) + ia = dpnp.array(a) + + a = a[:, ::-2] + ia = ia[:, ::-2] + + expected = numpy.repeat(a, reps, axis=1) + result = dpnp.repeat(ia, reps, axis=1) + assert_array_equal(expected, result) + + expected = numpy.repeat(a, (reps,) * a.shape[1], axis=1) + result = dpnp.repeat(ia, (reps,) * ia.shape[1], axis=1) + assert_array_equal(expected, result) + + def test_casting(self): + a = numpy.arange(5, dtype="i4") + ia = dpnp.array(a) + + # i4 is cast to i8 + reps = numpy.ones(5, dtype="i4") + ireps = dpnp.array(reps) + + expected = numpy.repeat(a, reps) + result = dpnp.repeat(ia, ireps) + assert_array_equal(expected, result) + + def test_strided_repeats(self): + a = numpy.arange(5, dtype="i4") + ia = dpnp.array(a) + + reps = numpy.ones(10, dtype="i8") + reps[::2] = 0 + ireps = dpnp.array(reps) + + reps = reps[::-2] + ireps = ireps[::-2] + + expected = numpy.repeat(a, reps) + result = dpnp.repeat(ia, ireps) + assert_array_equal(expected, result) + + def test_usm_ndarray_as_input_array(self): + reps = [1, 3, 2, 1, 1, 2] + a = numpy.array([[1, 2, 3, 4, 5, 6]]) + ia = dpt.asarray(a) + + expected = numpy.repeat(a, reps) + result = dpnp.repeat(ia, reps) + assert_array_equal(expected, result) + assert isinstance(result, dpnp.ndarray) + + def test_scalar_as_input_array(self): + assert_raises(TypeError, dpnp.repeat, 3, 2) + + def test_usm_ndarray_as_repeats(self): + a = numpy.array([1, 2, 3, 4, 5, 6]).reshape((2, 3)) + ia = dpnp.asarray(a) + + reps = numpy.array([1, 3, 2]) + ireps = dpt.asarray(reps) + + expected = a.repeat(reps, axis=1) + result = ia.repeat(ireps, axis=1) + assert_array_equal(expected, result) + assert isinstance(result, dpnp.ndarray) + + def test_unsupported_array_as_repeats(self): + assert_raises(TypeError, dpnp.arange(5, dtype="i4"), numpy.array(3)) + + @pytest.mark.parametrize( + "data, dtype", + [ + pytest.param([1, 2**7 - 1, -(2**7)], numpy.int8, id="int8"), + pytest.param([1, 2**15 - 1, -(2**15)], numpy.int16, 
id="int16"), + pytest.param([1, 2**31 - 1, -(2**31)], numpy.int32, id="int32"), + pytest.param([1, 2**63 - 1, -(2**63)], numpy.int64, id="int64"), + ], + ) + def test_maximum_signed_integers(self, data, dtype): + reps = 129 + a = numpy.array(data, dtype=dtype) + ia = dpnp.asarray(a) + + expected = a.repeat(reps) + result = ia.repeat(reps) + assert_array_equal(expected, result) + + @pytest.mark.parametrize( + "data, dtype", + [ + pytest.param( + [1, -(2**7), -(2**7) + 1, 2**7 - 1], numpy.int8, id="int8" + ), + pytest.param( + [1, -(2**15), -(2**15) + 1, 2**15 - 1], numpy.int16, id="int16" + ), + pytest.param( + [1, -(2**31), -(2**31) + 1, 2**31 - 1], numpy.int32, id="int32" + ), + pytest.param( + [1, -(2**63), -(2**63) + 1, 2**63 - 1], numpy.int64, id="int64" + ), + ], + ) + def test_minimum_signed_integers(self, data, dtype): + reps = 129 + a = numpy.array(data, dtype=dtype) + ia = dpnp.asarray(a) + + expected = a.repeat(reps) + result = ia.repeat(reps) + assert_array_equal(expected, result) + + class TestTranspose: @pytest.mark.parametrize("axes", [(0, 1), (1, 0), [0, 1]]) def test_2d_with_axes(self, axes): diff --git a/tests/third_party/cupy/manipulation_tests/test_tiling.py b/tests/third_party/cupy/manipulation_tests/test_tiling.py index eb29036d248..365a01f7e14 100644 --- a/tests/third_party/cupy/manipulation_tests/test_tiling.py +++ b/tests/third_party/cupy/manipulation_tests/test_tiling.py @@ -16,7 +16,6 @@ {"repeats": [1, 2, 3], "axis": 1}, {"repeats": [1, 2, 3], "axis": -2}, ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") class TestRepeat(unittest.TestCase): @testing.numpy_cupy_array_equal() def test_array_repeat(self, xp): @@ -42,7 +41,6 @@ def test_method(self): {"repeats": [2], "axis": None}, {"repeats": [2], "axis": 1}, ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") class TestRepeatListBroadcast(unittest.TestCase): """Test for `repeats` argument using single element list. 
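The new `TestRepeat` cases above all check that `dpnp.repeat` agrees with `numpy.repeat`. The core semantics being exercised — a scalar repeat count agreeing with an equal-valued sequence, `axis=None` flattening the input first, and a `0` entry dropping elements — can be sketched with NumPy alone (used here only as a stand-in for dpnp, which mirrors this behavior):

```python
import numpy as np

x = np.arange(5, dtype="i4")

# A scalar repeat count and an equal-valued sequence produce the same result.
scalar_res = np.repeat(x, 2)
tuple_res = np.repeat(x, (2, 2, 2, 2, 2))
assert np.array_equal(scalar_res, tuple_res)

# With axis=None the input is flattened before repeating,
# matching the dpnp.ravel(a) branch in the patched dpnp.repeat above.
m = np.arange(6, dtype="i4").reshape(2, 3)
flat_res = np.repeat(m, 2)  # same as repeating m.ravel()
assert np.array_equal(flat_res, np.repeat(m.ravel(), 2))

# Per-row repeats along an axis; a 0 entry drops that row entirely.
rows = np.repeat(m, (0, 2), axis=0)  # first row dropped, second row doubled
assert rows.shape == (2, 3)
```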
@@ -62,7 +60,6 @@ def test_array_repeat(self, xp): {"repeats": [1, 2, 3, 4], "axis": None}, {"repeats": [1, 2, 3, 4], "axis": 0}, ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") class TestRepeat1D(unittest.TestCase): @testing.numpy_cupy_array_equal() def test_array_repeat(self, xp): @@ -91,7 +88,6 @@ def test_array_repeat(self, xp): {"repeats": 2, "axis": -4}, {"repeats": 2, "axis": 3}, ) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") class TestRepeatFailure(unittest.TestCase): def test_repeat_failure(self): for xp in (numpy, cupy): From a265663d211cb074625781dee5e4b3ba7328cca0 Mon Sep 17 00:00:00 2001 From: Natalia Polina Date: Fri, 28 Jun 2024 04:44:50 -0700 Subject: [PATCH 40/49] Clean up legacy element-wise implementation from the backend (#1890) * Clean up legacy element-wise implementation from the backend * return legacy copy implementation for partition function * Apply comments * Fix pre-commit * Fix pre-commit * Clean-up MACRO_2ARG_2TYPES_LOGIC_OP. Clean-up /backend/include --------- Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com> Co-authored-by: Anton Volkov --- dpnp/backend/CMakeLists.txt | 1 - .../include/dpnp_gen_1arg_1type_tbl.hpp | 15 - .../include/dpnp_gen_1arg_2type_tbl.hpp | 80 -- .../include/dpnp_gen_2arg_1type_tbl.hpp | 105 -- .../include/dpnp_gen_2arg_2type_tbl.hpp | 94 -- .../include/dpnp_gen_2arg_3type_tbl.hpp | 52 - dpnp/backend/include/dpnp_iface.hpp | 613 ---------- dpnp/backend/include/dpnp_iface_fptr.hpp | 95 +- dpnp/backend/kernels/dpnp_krnl_bitwise.cpp | 438 ------- dpnp/backend/kernels/dpnp_krnl_elemwise.cpp | 690 +---------- dpnp/backend/kernels/dpnp_krnl_logic.cpp | 272 ----- .../kernels/dpnp_krnl_mathematical.cpp | 1034 ----------------- dpnp/backend/src/dpnp_fptr.hpp | 1 - dpnp/backend/src/dpnp_iface_fptr.cpp | 1 - dpnp/dpnp_algo/CMakeLists.txt | 1 - dpnp/dpnp_algo/dpnp_algo.pxd | 6 - dpnp/dpnp_algo/dpnp_algo.pyx | 1 - dpnp/dpnp_algo/dpnp_algo_arraycreation.pxi | 78 -- 
dpnp/dpnp_algo/dpnp_algo_sorting.pxi | 2 +- 19 files changed, 16 insertions(+), 3563 deletions(-) delete mode 100644 dpnp/backend/include/dpnp_gen_2arg_1type_tbl.hpp delete mode 100644 dpnp/backend/include/dpnp_gen_2arg_2type_tbl.hpp delete mode 100644 dpnp/backend/kernels/dpnp_krnl_bitwise.cpp delete mode 100644 dpnp/dpnp_algo/dpnp_algo_arraycreation.pxi diff --git a/dpnp/backend/CMakeLists.txt b/dpnp/backend/CMakeLists.txt index f9eb1a35d28..d96320bf0ac 100644 --- a/dpnp/backend/CMakeLists.txt +++ b/dpnp/backend/CMakeLists.txt @@ -25,7 +25,6 @@ set(DPNP_SRC kernels/dpnp_krnl_arraycreation.cpp - kernels/dpnp_krnl_bitwise.cpp kernels/dpnp_krnl_common.cpp kernels/dpnp_krnl_elemwise.cpp kernels/dpnp_krnl_fft.cpp diff --git a/dpnp/backend/include/dpnp_gen_1arg_1type_tbl.hpp b/dpnp/backend/include/dpnp_gen_1arg_1type_tbl.hpp index ea1c477173f..32df8aeda72 100644 --- a/dpnp/backend/include/dpnp_gen_1arg_1type_tbl.hpp +++ b/dpnp/backend/include/dpnp_gen_1arg_1type_tbl.hpp @@ -83,24 +83,9 @@ #endif // _SECTION_DOCUMENTATION_GENERATION_ -MACRO_1ARG_1TYPE_OP(dpnp_conjugate_c, - std::conj(input_elem), - q.submit(kernel_func)) -MACRO_1ARG_1TYPE_OP(dpnp_copy_c, input_elem, q.submit(kernel_func)) MACRO_1ARG_1TYPE_OP(dpnp_erf_c, dispatch_erf_op(input_elem), oneapi::mkl::vm::erf(q, input1_size, input1_data, result)) -MACRO_1ARG_1TYPE_OP(dpnp_negative_c, -input_elem, q.submit(kernel_func)) -MACRO_1ARG_1TYPE_OP( - dpnp_recip_c, - _DataType(1) / input_elem, - q.submit(kernel_func)) // error: no member named 'recip' in namespace 'sycl' -MACRO_1ARG_1TYPE_OP(dpnp_sign_c, - dispatch_sign_op(input_elem), - q.submit(kernel_func)) // no sycl::sign for int and long -MACRO_1ARG_1TYPE_OP(dpnp_square_c, - input_elem *input_elem, - oneapi::mkl::vm::sqr(q, input1_size, input1_data, result)) #undef MACRO_1ARG_1TYPE_OP diff --git a/dpnp/backend/include/dpnp_gen_1arg_2type_tbl.hpp b/dpnp/backend/include/dpnp_gen_1arg_2type_tbl.hpp index 3abc54c7212..a27353866d2 100644 --- 
a/dpnp/backend/include/dpnp_gen_1arg_2type_tbl.hpp +++ b/dpnp/backend/include/dpnp_gen_1arg_2type_tbl.hpp @@ -85,95 +85,15 @@ #endif -MACRO_1ARG_2TYPES_OP(dpnp_acos_c, - sycl::acos(input_elem), - oneapi::mkl::vm::acos(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_acosh_c, - sycl::acosh(input_elem), - oneapi::mkl::vm::acosh(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_asin_c, - sycl::asin(input_elem), - oneapi::mkl::vm::asin(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_asinh_c, - sycl::asinh(input_elem), - oneapi::mkl::vm::asinh(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_atan_c, - sycl::atan(input_elem), - oneapi::mkl::vm::atan(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_atanh_c, - sycl::atanh(input_elem), - oneapi::mkl::vm::atanh(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_cbrt_c, - sycl::cbrt(input_elem), - oneapi::mkl::vm::cbrt(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_ceil_c, - sycl::ceil(input_elem), - oneapi::mkl::vm::ceil(q, input1_size, input1_data, result)) MACRO_1ARG_2TYPES_OP(dpnp_copyto_c, input_elem, q.submit(kernel_func)) -MACRO_1ARG_2TYPES_OP(dpnp_cos_c, - sycl::cos(input_elem), - oneapi::mkl::vm::cos(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_cosh_c, - sycl::cosh(input_elem), - oneapi::mkl::vm::cosh(q, input1_size, input1_data, result)) MACRO_1ARG_2TYPES_OP(dpnp_degrees_c, sycl::degrees(input_elem), q.submit(kernel_func)) -MACRO_1ARG_2TYPES_OP(dpnp_exp2_c, - sycl::exp2(input_elem), - oneapi::mkl::vm::exp2(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_exp_c, - sycl::exp(input_elem), - oneapi::mkl::vm::exp(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_expm1_c, - sycl::expm1(input_elem), - oneapi::mkl::vm::expm1(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_fabs_c, - sycl::fabs(input_elem), - oneapi::mkl::vm::abs(q, 
input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_floor_c, - sycl::floor(input_elem), - oneapi::mkl::vm::floor(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_log10_c, - sycl::log10(input_elem), - oneapi::mkl::vm::log10(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_log1p_c, - sycl::log1p(input_elem), - oneapi::mkl::vm::log1p(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_log2_c, - sycl::log2(input_elem), - oneapi::mkl::vm::log2(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_log_c, - sycl::log(input_elem), - oneapi::mkl::vm::ln(q, input1_size, input1_data, result)) MACRO_1ARG_2TYPES_OP(dpnp_radians_c, sycl::radians(input_elem), q.submit(kernel_func)) -MACRO_1ARG_2TYPES_OP(dpnp_sin_c, - sycl::sin(input_elem), - oneapi::mkl::vm::sin(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_sinh_c, - sycl::sinh(input_elem), - oneapi::mkl::vm::sinh(q, input1_size, input1_data, result)) MACRO_1ARG_2TYPES_OP(dpnp_sqrt_c, sycl::sqrt(input_elem), oneapi::mkl::vm::sqrt(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_tan_c, - sycl::tan(input_elem), - oneapi::mkl::vm::tan(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP(dpnp_tanh_c, - sycl::tanh(input_elem), - oneapi::mkl::vm::tanh(q, input1_size, input1_data, result)) -MACRO_1ARG_2TYPES_OP( - dpnp_trunc_c, - sycl::trunc(input_elem), - oneapi::mkl::vm::trunc(q, input1_size, input1_data, result)) #undef MACRO_1ARG_2TYPES_OP diff --git a/dpnp/backend/include/dpnp_gen_2arg_1type_tbl.hpp b/dpnp/backend/include/dpnp_gen_2arg_1type_tbl.hpp deleted file mode 100644 index 130283e5834..00000000000 --- a/dpnp/backend/include/dpnp_gen_2arg_1type_tbl.hpp +++ /dev/null @@ -1,105 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. 
-// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. 
-//***************************************************************************** - -/* - * This header file contains single argument bitwise functions definitions - * - * Macro `MACRO_2ARG_1TYPE_OP` must be defined before usage - * - * Parameters: - * - public name of the function and kernel name - * - operation used to calculate the result - * - */ - -#ifndef MACRO_2ARG_1TYPE_OP -#error "MACRO_2ARG_1TYPE_OP is not defined" -#endif - -#ifdef _SECTION_DOCUMENTATION_GENERATION_ - -#define MACRO_2ARG_1TYPE_OP(__name__, __operation__) \ - /** @ingroup BACKEND_API */ \ - /** @brief Element wise operation function __name__ */ \ - /** */ \ - /** Function "__name__" executes operator "__operation__" over \ - * corresponding elements of input arrays */ \ - /** */ \ - /** @param[in] q_ref Reference to SYCL queue. */ \ - /** @param[out] result_out Output array. */ \ - /** @param[in] result_size Output array size. */ \ - /** @param[in] result_ndim Number of output array dimensions. \ - */ \ - /** @param[in] result_shape Output array shape. */ \ - /** @param[in] result_strides Output array strides. */ \ - /** @param[in] input1_in Input array 1. */ \ - /** @param[in] input1_size Input array 1 size. */ \ - /** @param[in] input1_ndim Number of input array 1 dimensions. \ - */ \ - /** @param[in] input1_shape Input array 1 shape. */ \ - /** @param[in] input1_strides Input array 1 strides. */ \ - /** @param[in] input2_in Input array 2. */ \ - /** @param[in] input2_size Input array 2 size. */ \ - /** @param[in] input2_ndim Number of input array 2 dimensions. \ - */ \ - /** @param[in] input2_shape Input array 2 shape. */ \ - /** @param[in] input2_strides Input array 2 strides. */ \ - /** @param[in] where Where condition. */ \ - /** @param[in] dep_event_vec_ref Reference to vector of SYCL events. 
\ - */ \ - template \ - DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref); \ - \ - template \ - void __name__( \ - void *result_out, const size_t result_size, const size_t result_ndim, \ - const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where); - -#endif - -MACRO_2ARG_1TYPE_OP(dpnp_bitwise_and_c, input1_elem &input2_elem) -MACRO_2ARG_1TYPE_OP(dpnp_bitwise_or_c, input1_elem | input2_elem) -MACRO_2ARG_1TYPE_OP(dpnp_bitwise_xor_c, input1_elem ^ input2_elem) -MACRO_2ARG_1TYPE_OP(dpnp_left_shift_c, input1_elem << input2_elem) -MACRO_2ARG_1TYPE_OP(dpnp_right_shift_c, input1_elem >> input2_elem) - -#undef MACRO_2ARG_1TYPE_OP diff --git a/dpnp/backend/include/dpnp_gen_2arg_2type_tbl.hpp b/dpnp/backend/include/dpnp_gen_2arg_2type_tbl.hpp deleted file mode 100644 index d84accb0757..00000000000 --- a/dpnp/backend/include/dpnp_gen_2arg_2type_tbl.hpp +++ /dev/null @@ -1,94 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2023-2024, Intel Corporation -// All rights reserved. 
-// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. 
-//***************************************************************************** - -/* - * This header file contains single argument element wise functions definitions - * - * Macro `MACRO_2ARG_2TYPES_LOGIC_OP` must be defined before usage - * - * Parameters: - * - public name of the function and kernel name - * - operation used to calculate the result - * - */ - -#ifndef MACRO_2ARG_2TYPES_LOGIC_OP -#error "MACRO_2ARG_2TYPES_LOGIC_OP is not defined" -#endif - -#ifdef _SECTION_DOCUMENTATION_GENERATION_ - -#define MACRO_2ARG_2TYPES_LOGIC_OP(__name__, __operation__) \ - /** @ingroup BACKEND_API */ \ - /** @brief Per element operation function __name__ */ \ - /** */ \ - /** Function "__name__" executes operator "__operation__" over \ - * corresponding elements of input arrays */ \ - /** */ \ - /** @param[in] q_ref Reference to SYCL queue. */ \ - /** @param[out] result_out Output array. */ \ - /** @param[in] result_size Output array size. */ \ - /** @param[in] result_ndim Number of output array dimensions. \ - */ \ - /** @param[in] result_shape Output array shape. */ \ - /** @param[in] result_strides Output array strides. */ \ - /** @param[in] input1_in Input array 1. */ \ - /** @param[in] input1_size Input array 1 size. */ \ - /** @param[in] input1_ndim Number of input array 1 dimensions. \ - */ \ - /** @param[in] input1_shape Input array 1 shape. */ \ - /** @param[in] input1_strides Input array 1 strides. */ \ - /** @param[in] input2_in Input array 2. */ \ - /** @param[in] input2_size Input array 2 size. */ \ - /** @param[in] input2_ndim Number of input array 2 dimensions. \ - */ \ - /** @param[in] input2_shape Input array 2 shape. */ \ - /** @param[in] input2_strides Input array 2 strides. */ \ - /** @param[in] where Where condition. */ \ - /** @param[in] dep_event_vec_ref Reference to vector of SYCL events. 
\ - */ \ - template \ - DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref); - -#endif - -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_equal_c, input1_elem == input2_elem) -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_greater_c, input1_elem > input2_elem) -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_greater_equal_c, input1_elem >= input2_elem) -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_less_c, input1_elem < input2_elem) -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_less_equal_c, input1_elem <= input2_elem) -MACRO_2ARG_2TYPES_LOGIC_OP(dpnp_not_equal_c, input1_elem != input2_elem) - -#undef MACRO_2ARG_2TYPES_LOGIC_OP diff --git a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp index 7423085d659..dcec3f8192b 100644 --- a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp +++ b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp @@ -103,40 +103,6 @@ #endif -MACRO_2ARG_3TYPES_OP(dpnp_add_c, - input1_elem + input2_elem, - x1 + x2, - MACRO_UNPACK_TYPES(bool, std::int32_t, std::int64_t), - oneapi::mkl::vm::add, - MACRO_UNPACK_TYPES(float, - double, - std::complex, - std::complex)) - -MACRO_2ARG_3TYPES_OP(dpnp_arctan2_c, - sycl::atan2(input1_elem, input2_elem), - sycl::atan2(x1, x2), - MACRO_UNPACK_TYPES(float, double), - oneapi::mkl::vm::atan2, - MACRO_UNPACK_TYPES(float, double)) - -MACRO_2ARG_3TYPES_OP(dpnp_copysign_c, - sycl::copysign(input1_elem, input2_elem), - sycl::copysign(x1, x2), - MACRO_UNPACK_TYPES(float, double), - 
oneapi::mkl::vm::copysign, - MACRO_UNPACK_TYPES(float, double)) - -MACRO_2ARG_3TYPES_OP(dpnp_divide_c, - input1_elem / input2_elem, - x1 / x2, - MACRO_UNPACK_TYPES(bool, std::int32_t, std::int64_t), - oneapi::mkl::vm::div, - MACRO_UNPACK_TYPES(float, - double, - std::complex, - std::complex)) - MACRO_2ARG_3TYPES_OP( dpnp_fmod_c, dispatch_fmod_op(input1_elem, input2_elem), @@ -145,13 +111,6 @@ MACRO_2ARG_3TYPES_OP( oneapi::mkl::vm::fmod, MACRO_UNPACK_TYPES(float, double)) -MACRO_2ARG_3TYPES_OP(dpnp_hypot_c, - sycl::hypot(input1_elem, input2_elem), - sycl::hypot(x1, x2), - MACRO_UNPACK_TYPES(float, double), - oneapi::mkl::vm::hypot, - MACRO_UNPACK_TYPES(float, double)) - MACRO_2ARG_3TYPES_OP(dpnp_maximum_c, sycl::max(input1_elem, input2_elem), nullptr, @@ -181,17 +140,6 @@ MACRO_2ARG_3TYPES_OP(dpnp_multiply_c, std::complex, std::complex)) -MACRO_2ARG_3TYPES_OP(dpnp_power_c, - static_cast<_DataType_output>(std::pow(input1_elem, - input2_elem)), - sycl::pow(x1, x2), - MACRO_UNPACK_TYPES(float, double), - oneapi::mkl::vm::pow, - MACRO_UNPACK_TYPES(float, - double, - std::complex, - std::complex)) - MACRO_2ARG_3TYPES_OP(dpnp_subtract_c, input1_elem - input2_elem, x1 - x2, diff --git a/dpnp/backend/include/dpnp_iface.hpp b/dpnp/backend/include/dpnp_iface.hpp index ccbf6fa8536..324e7a612b1 100644 --- a/dpnp/backend/include/dpnp_iface.hpp +++ b/dpnp/backend/include/dpnp_iface.hpp @@ -202,28 +202,6 @@ template INP_DLLEXPORT void dpnp_arange_c(size_t start, size_t step, void *result1, size_t size); -/** - * @ingroup BACKEND_API - * @brief Copy of the array, cast to a specified type. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array. - * @param [out] result Output array. - * @param [in] size Number of input elements in `array`. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_astype_c(DPCTLSyclQueueRef q_ref, - const void *array, - void *result, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_astype_c(const void *array, void *result, const size_t size); - /** * @ingroup BACKEND_API * @brief Implementation of full function @@ -266,67 +244,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_full_like_c(void *array_in, void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief Matrix multiplication. - * - * Matrix multiplication procedure. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result_out Output array. - * @param [in] result_size Size of output array. - * @param [in] result_ndim Number of output array dimensions. - * @param [in] result_shape Shape of output array. - * @param [in] result_strides Strides of output array. - * @param [in] input1_in First input array. - * @param [in] input1_size Size of first input array. - * @param [in] input1_ndim Number of first input array dimensions. - * @param [in] input1_shape Shape of first input array. - * @param [in] input1_strides Strides of first input array. - * @param [in] input2_in Second input array. - * @param [in] input2_size Size of second input array. - * @param [in] input2_ndim Number of second input array dimensions. - * @param [in] input2_shape Shape of second input array. - * @param [in] input2_strides Strides of second input array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_matmul_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input1_size, - const size_t input1_ndim, - const shape_elem_type *input1_shape, - const shape_elem_type *input1_strides, - const void *input2_in, - const size_t input2_size, - const size_t input2_ndim, - const shape_elem_type *input2_shape, - const shape_elem_type *input2_strides, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_matmul_c(void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input1_size, - const size_t input1_ndim, - const shape_elem_type *input1_shape, - const shape_elem_type *input1_strides, - const void *input2_in, - const size_t input2_size, - const size_t input2_ndim, - const shape_elem_type *input2_shape, - const shape_elem_type *input2_strides); - /** * @ingroup BACKEND_API * @brief Compute the variance along the specified axis, while ignoring NaNs. @@ -388,28 +305,6 @@ INP_DLLEXPORT void dpnp_nonzero_c(const void *array1, const size_t ndim, const size_t j); -/** - * @ingroup BACKEND_API - * @brief absolute function. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] input1_in Input array. - * @param [out] result1 Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_elemwise_absolute_c(DPCTLSyclQueueRef q_ref, - const void *input1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_elemwise_absolute_c(const void *input1_in, void *result1, size_t size); - /** * @ingroup BACKEND_API * @brief Custom implementation of dot function @@ -473,98 +368,6 @@ INP_DLLEXPORT void dpnp_dot_c(void *result_out, const shape_elem_type *input2_shape, const shape_elem_type *input2_strides); -/** - * @ingroup BACKEND_API - * @brief Custom implementation of cross function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result_out Output array. - * @param [in] input1_in First input array. - * @param [in] input1_size Size of first input array. - * @param [in] input1_shape Shape of first input array. - * @param [in] input1_shape_ndim Number of first array dimensions. - * @param [in] input2_in Second input array. - * @param [in] input2_size Shape of second input array. - * @param [in] input2_shape Shape of first input array. - * @param [in] input2_shape_ndim Number of second array dimensions. - * @param [in] where Mask array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_cross_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_cross_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where); - -/** - * @ingroup BACKEND_API - * @brief Custom implementation of cumprod function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array. - * @param [out] result1 Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - * - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_cumprod_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_cumprod_c(void *array1_in, void *result1, size_t size); - -/** - * @ingroup BACKEND_API - * @brief Custom implementation of cumsum function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array. - * @param [out] result1 Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- * - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_cumsum_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_cumsum_c(void *array1_in, void *result1, size_t size); - /** * @ingroup BACKEND_API * @brief The differences between consecutive elements of an array. @@ -910,54 +713,6 @@ INP_DLLEXPORT void dpnp_put_along_axis_c(void *arr_in, size_t size_indices, size_t values_size); -/** - * @ingroup BACKEND_API - * @brief Compute the eigenvalues and right eigenvectors of a square array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input array[size][size] - * @param [out] result1 The eigenvalues, each repeated according to - * its multiplicity - * @param [out] result2 The normalized (unit "length") eigenvectors - * @param [in] size One dimension of square [size][size] array - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_eig_c(DPCTLSyclQueueRef q_ref, - const void *array_in, - void *result1, - void *result2, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_eig_c(const void *array_in, void *result1, void *result2, size_t size); - -/** - * @ingroup BACKEND_API - * @brief Compute the eigenvalues of a square array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input array[size][size] - * @param [out] result1 The eigenvalues, each repeated according to - * its multiplicity - * @param [in] size One dimension of square [size][size] array - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_eigvals_c(DPCTLSyclQueueRef q_ref, - const void *array_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_eigvals_c(const void *array_in, void *result1, size_t size); - /** * @ingroup BACKEND_API * @brief Return a 2-D array with ones on the diagonal and zeros elsewhere. @@ -1056,32 +811,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_sort_c(void *array, void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief math library implementation of cholesky function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] data_size Last element of shape arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_cholesky_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - const size_t size, - const size_t data_size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_cholesky_c(void *array1_in, - void *result1, - const size_t size, - const size_t data_size); - /** * @ingroup BACKEND_API * @brief correlate function @@ -1154,32 +883,6 @@ template INP_DLLEXPORT void dpnp_cov_c(void *array1_in, void *result1, size_t nrows, size_t ncols); -/** - * @ingroup BACKEND_API - * @brief math library implementation of det function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] shape Shape of input array. - * @param [in] ndim Number of elements in shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_det_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_det_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim); - /** * @ingroup BACKEND_API * @brief Construct an array from an index array and a list of arrays to choose @@ -1344,58 +1047,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_initval_c(void *result1, void *value, size_t size); -/** - * @ingroup BACKEND_API - * @brief math library implementation of inv function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array with data. - * @param [out] result1 Output array. - * @param [in] shape Shape of input array. - * @param [in] ndim Number of elements in shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_inv_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_inv_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim); - -/** - * @ingroup BACKEND_API - * @brief math library implementation of matrix_rank function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array with data. - * @param [out] result1 Output array. - * @param [in] shape Shape of input array. - * @param [in] ndim Number of elements in shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_matrix_rank_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_matrix_rank_c(void *array1_in, - void *result1, - shape_elem_type *shape, - size_t ndim); - /** * @ingroup BACKEND_API * @brief math library implementation of max function @@ -1572,33 +1223,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_argmin_c(void *array, void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief math library implementation of around function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] input_in Input array with data. - * @param [out] result_out Output array with indices. - * @param [in] input_size Number of elements in input arrays. - * @param [in] decimals Number of decimal places to round. Support - * only with default value 0. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_around_c(DPCTLSyclQueueRef q_ref, - const void *input_in, - void *result_out, - const size_t input_size, - const int decimals, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_around_c(const void *input_in, - void *result_out, - const size_t input_size, - const int decimals); - /** * @ingroup BACKEND_API * @brief math library implementation of std function @@ -1820,55 +1444,6 @@ INP_DLLEXPORT void dpnp_var_c(void *array, size_t naxis, size_t ddof); -/** - * @ingroup BACKEND_API - * @brief Implementation of invert function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array. - * @param [out] result1 Output array. - * @param [in] size Number of elements in the input array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_invert_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_invert_c(void *array1_in, void *result, size_t size); - -#define MACRO_2ARG_1TYPE_OP(__name__, __operation__) \ - template \ - INP_DLLEXPORT DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref); \ - \ - template \ - INP_DLLEXPORT void __name__( \ - void *result_out, const size_t result_size, const size_t result_ndim, \ - const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where); - -#include - #define MACRO_1ARG_1TYPE_OP(__name__, __operation1__, __operation2__) \ template \ INP_DLLEXPORT DPCTLSyclEventRef __name__( \ @@ -1913,23 +1488,6 @@ INP_DLLEXPORT void dpnp_invert_c(void *array1_in, void *result, size_t size); #include -#define MACRO_2ARG_2TYPES_LOGIC_OP(__name__, __operation__) \ - template \ - INP_DLLEXPORT DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t 
result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref); - -#include - #define MACRO_2ARG_3TYPES_OP(__name__, __operation__, __vec_operation__, \ __vec_types__, __mkl_operation__, __mkl_types__) \ template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_floor_divide_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_floor_divide_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where); - /** * @ingroup BACKEND_API * @brief modf function. @@ -2099,54 +1609,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_ones_like_c(void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief remainder function. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result_out Output array. - * @param [in] input1_in First input array. - * @param [in] input1_size Size of first input array. - * @param [in] input1_shape Shape of first input array. - * @param [in] input1_shape_ndim Number of first array dimensions. 
- * @param [in] input2_in Second input array. - * @param [in] input2_size Size of second input array. - * @param [in] input2_shape Shape of second input array. - * @param [in] input2_shape_ndim Number of second array dimensions. - * @param [in] where Mask array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_remainder_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_remainder_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where); - /** * @ingroup BACKEND_API * @brief repeat elements of an array. @@ -2173,81 +1635,6 @@ INP_DLLEXPORT void dpnp_repeat_c(const void *array_in, const size_t repeats, const size_t size); -/** - * @ingroup BACKEND_API - * @brief transpose function. Permute axes of the input to the output with - * elements permutation. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array. - * @param [in] input_shape Input shape. - * @param [in] result_shape Output shape. - * @param [in] permute_axes Order of axes, by id, as they should be - * presented in the output. - * @param [in] ndim Number of elements in shapes and axes. - * @param [out] result1 Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_elemwise_transpose_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - const shape_elem_type *input_shape, - const shape_elem_type *result_shape, - const shape_elem_type *permute_axes, - size_t ndim, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_elemwise_transpose_c(void *array1_in, - const shape_elem_type *input_shape, - const shape_elem_type *result_shape, - const shape_elem_type *permute_axes, - size_t ndim, - void *result1, - size_t size); - -/** - * @ingroup BACKEND_API - * @brief Custom implementation of trapz function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in First input array. - * @param [in] array2_in Second input array. - * @param [out] result1 Output array. - * @param [in] dx The spacing between sample points. - * @param [in] array1_size Number of elements in first input array. - * @param [in] array2_size Number of elements in second input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- * - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_trapz_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - const void *array2_in, - void *result1, - double dx, - size_t array1_size, - size_t array2_size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_trapz_c(const void *array1_in, - const void *array2_in, - void *result1, - double dx, - size_t array1_size, - size_t array2_size); - /** * @ingroup BACKEND_API * @brief Implementation of vander function diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index 1172bcbe4f5..a39174931fe 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -58,77 +58,42 @@ */ enum class DPNPFuncName : size_t { - DPNP_FN_NONE, /**< Very first element of the enumeration */ - DPNP_FN_ABSOLUTE, /**< Used in numpy.absolute() impl */ - DPNP_FN_ADD, /**< Used in numpy.add() impl */ - DPNP_FN_ALL, /**< Used in numpy.all() impl */ - DPNP_FN_ALLCLOSE, /**< Used in numpy.allclose() impl */ - DPNP_FN_ALLCLOSE_EXT, /**< Used in numpy.allclose() impl, requires extra - parameters */ - DPNP_FN_ANY, /**< Used in numpy.any() impl */ - DPNP_FN_ARANGE, /**< Used in numpy.arange() impl */ - DPNP_FN_ARCCOS, /**< Used in numpy.arccos() impl */ - DPNP_FN_ARCCOSH, /**< Used in numpy.arccosh() impl */ - DPNP_FN_ARCSIN, /**< Used in numpy.arcsin() impl */ - DPNP_FN_ARCSINH, /**< Used in numpy.arcsinh() impl */ - DPNP_FN_ARCTAN, /**< Used in numpy.arctan() impl */ - DPNP_FN_ARCTAN2, /**< Used in numpy.arctan2() impl */ - DPNP_FN_ARCTANH, /**< Used in numpy.arctanh() impl */ - DPNP_FN_ARGMAX, /**< Used in numpy.argmax() impl */ - DPNP_FN_ARGMIN, /**< Used in numpy.argmin() impl */ - DPNP_FN_ARGSORT, /**< Used in numpy.argsort() impl */ - DPNP_FN_AROUND, /**< Used in numpy.around() impl */ - DPNP_FN_ASTYPE, /**< Used in numpy.astype() impl */ - DPNP_FN_BITWISE_AND, /**< Used in numpy.bitwise_and() impl */ - 
DPNP_FN_BITWISE_OR, /**< Used in numpy.bitwise_or() impl */ - DPNP_FN_BITWISE_XOR, /**< Used in numpy.bitwise_xor() impl */ - DPNP_FN_CBRT, /**< Used in numpy.cbrt() impl */ - DPNP_FN_CEIL, /**< Used in numpy.ceil() impl */ - DPNP_FN_CHOLESKY, /**< Used in numpy.linalg.cholesky() impl */ - DPNP_FN_CONJUGATE, /**< Used in numpy.conjugate() impl */ - DPNP_FN_CHOOSE, /**< Used in numpy.choose() impl */ - DPNP_FN_CHOOSE_EXT, /**< Used in numpy.choose() impl, requires extra - parameters */ - DPNP_FN_COPY, /**< Used in numpy.copy() impl */ - DPNP_FN_COPY_EXT, /**< Used in numpy.copy() impl, requires extra parameters - */ - DPNP_FN_COPYSIGN, /**< Used in numpy.copysign() impl */ - DPNP_FN_COPYTO, /**< Used in numpy.copyto() impl */ + DPNP_FN_NONE, /**< Very first element of the enumeration */ + DPNP_FN_ALL, /**< Used in numpy.all() impl */ + DPNP_FN_ALLCLOSE, /**< Used in numpy.allclose() impl */ + DPNP_FN_ALLCLOSE_EXT, /**< Used in numpy.allclose() impl, requires extra + parameters */ + DPNP_FN_ANY, /**< Used in numpy.any() impl */ + DPNP_FN_ARANGE, /**< Used in numpy.arange() impl */ + DPNP_FN_ARGMAX, /**< Used in numpy.argmax() impl */ + DPNP_FN_ARGMIN, /**< Used in numpy.argmin() impl */ + DPNP_FN_ARGSORT, /**< Used in numpy.argsort() impl */ + DPNP_FN_CHOOSE, /**< Used in numpy.choose() impl */ + DPNP_FN_CHOOSE_EXT, /**< Used in numpy.choose() impl, requires extra + parameters */ + DPNP_FN_COPYTO, /**< Used in numpy.copyto() impl */ DPNP_FN_COPYTO_EXT, /**< Used in numpy.copyto() impl, requires extra parameters */ DPNP_FN_CORRELATE, /**< Used in numpy.correlate() impl */ DPNP_FN_CORRELATE_EXT, /**< Used in numpy.correlate() impl, requires extra parameters */ - DPNP_FN_COS, /**< Used in numpy.cos() impl */ - DPNP_FN_COSH, /**< Used in numpy.cosh() impl */ DPNP_FN_COUNT_NONZERO, /**< Used in numpy.count_nonzero() impl */ DPNP_FN_COV, /**< Used in numpy.cov() impl */ - DPNP_FN_CROSS, /**< Used in numpy.cross() impl */ - DPNP_FN_CUMPROD, /**< Used in numpy.cumprod() impl 
*/ - DPNP_FN_CUMSUM, /**< Used in numpy.cumsum() impl */ DPNP_FN_DEGREES, /**< Used in numpy.degrees() impl */ DPNP_FN_DEGREES_EXT, /**< Used in numpy.degrees() impl, requires extra parameters */ - DPNP_FN_DET, /**< Used in numpy.linalg.det() impl */ DPNP_FN_DIAG, /**< Used in numpy.diag() impl */ DPNP_FN_DIAG_INDICES, /**< Used in numpy.diag_indices() impl */ DPNP_FN_DIAGONAL, /**< Used in numpy.diagonal() impl */ - DPNP_FN_DIVIDE, /**< Used in numpy.divide() impl */ DPNP_FN_DOT, /**< Used in numpy.dot() impl */ DPNP_FN_DOT_EXT, /**< Used in numpy.dot() impl, requires extra parameters */ DPNP_FN_EDIFF1D, /**< Used in numpy.ediff1d() impl */ DPNP_FN_EDIFF1D_EXT, /**< Used in numpy.ediff1d() impl, requires extra parameters */ - DPNP_FN_EIG, /**< Used in numpy.linalg.eig() impl */ - DPNP_FN_EIGVALS, /**< Used in numpy.linalg.eigvals() impl */ DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra parameters */ DPNP_FN_EYE, /**< Used in numpy.eye() impl */ - DPNP_FN_EXP, /**< Used in numpy.exp() impl */ - DPNP_FN_EXP2, /**< Used in numpy.exp2() impl */ - DPNP_FN_EXPM1, /**< Used in numpy.expm1() impl */ - DPNP_FN_FABS, /**< Used in numpy.fabs() impl */ DPNP_FN_FFT_FFT, /**< Used in numpy.fft.fft() impl */ DPNP_FN_FFT_FFT_EXT, /**< Used in numpy.fft.fft() impl, requires extra parameters */ @@ -136,30 +101,15 @@ enum class DPNPFuncName : size_t DPNP_FN_FFT_RFFT_EXT, /**< Used in numpy.fft.rfft() impl, requires extra parameters */ DPNP_FN_FILL_DIAGONAL, /**< Used in numpy.fill_diagonal() impl */ - DPNP_FN_FLATTEN, /**< Used in numpy.flatten() impl */ - DPNP_FN_FLOOR, /**< Used in numpy.floor() impl */ - DPNP_FN_FLOOR_DIVIDE, /**< Used in numpy.floor_divide() impl */ - DPNP_FN_FMOD, /**< Used in numpy.fmod() impl */ DPNP_FN_FULL, /**< Used in numpy.full() impl */ DPNP_FN_FULL_LIKE, /**< Used in numpy.full_like() impl */ - DPNP_FN_HYPOT, /**< Used in numpy.hypot() impl */ DPNP_FN_IDENTITY, /**< Used in 
numpy.identity() impl */ DPNP_FN_INITVAL, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ DPNP_FN_INITVAL_EXT, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ - DPNP_FN_INV, /**< Used in numpy.linalg.inv() impl */ DPNP_FN_INVERT, /**< Used in numpy.invert() impl */ - DPNP_FN_KRON, /**< Used in numpy.kron() impl */ - DPNP_FN_LEFT_SHIFT, /**< Used in numpy.left_shift() impl */ - DPNP_FN_LOG, /**< Used in numpy.log() impl */ - DPNP_FN_LOG10, /**< Used in numpy.log10() impl */ - DPNP_FN_LOG2, /**< Used in numpy.log2() impl */ - DPNP_FN_LOG1P, /**< Used in numpy.log1p() impl */ - DPNP_FN_MATMUL, /**< Used in numpy.matmul() impl */ - DPNP_FN_MATRIX_RANK, /**< Used in numpy.linalg.matrix_rank() impl */ DPNP_FN_MAX, /**< Used in numpy.max() impl */ - DPNP_FN_MAXIMUM, /**< Used in numpy.fmax() impl */ DPNP_FN_MAXIMUM_EXT, /**< Used in numpy.fmax() impl , requires extra parameters */ DPNP_FN_MEAN, /**< Used in numpy.mean() impl */ @@ -167,7 +117,6 @@ enum class DPNPFuncName : size_t DPNP_FN_MEDIAN_EXT, /**< Used in numpy.median() impl, requires extra parameters */ DPNP_FN_MIN, /**< Used in numpy.min() impl */ - DPNP_FN_MINIMUM, /**< Used in numpy.fmin() impl */ DPNP_FN_MINIMUM_EXT, /**< Used in numpy.fmax() impl, requires extra parameters */ DPNP_FN_MODF, /**< Used in numpy.modf() impl */ @@ -175,7 +124,6 @@ enum class DPNPFuncName : size_t */ DPNP_FN_MULTIPLY, /**< Used in numpy.multiply() impl */ DPNP_FN_NANVAR, /**< Used in numpy.nanvar() impl */ - DPNP_FN_NEGATIVE, /**< Used in numpy.negative() impl */ DPNP_FN_NONZERO, /**< Used in numpy.nonzero() impl */ DPNP_FN_ONES, /**< Used in numpy.ones() impl */ DPNP_FN_ONES_LIKE, /**< Used in numpy.ones_like() impl */ @@ -183,19 +131,14 @@ enum class DPNPFuncName : size_t DPNP_FN_PARTITION_EXT, /**< Used in numpy.partition() impl, requires extra parameters */ DPNP_FN_PLACE, /**< Used in numpy.place() impl */ - DPNP_FN_POWER, /**< Used in numpy.power() impl */ DPNP_FN_PROD, /**< Used in numpy.prod() 
impl */ DPNP_FN_PTP, /**< Used in numpy.ptp() impl */ DPNP_FN_PUT, /**< Used in numpy.put() impl */ DPNP_FN_PUT_ALONG_AXIS, /**< Used in numpy.put_along_axis() impl */ - DPNP_FN_QR, /**< Used in numpy.linalg.qr() impl */ DPNP_FN_RADIANS, /**< Used in numpy.radians() impl */ DPNP_FN_RADIANS_EXT, /**< Used in numpy.radians() impl, requires extra parameters */ - DPNP_FN_REMAINDER, /**< Used in numpy.remainder() impl */ - DPNP_FN_RECIP, /**< Used in numpy.recip() impl */ DPNP_FN_REPEAT, /**< Used in numpy.repeat() impl */ - DPNP_FN_RIGHT_SHIFT, /**< Used in numpy.right_shift() impl */ DPNP_FN_RNG_BETA, /**< Used in numpy.random.beta() impl */ DPNP_FN_RNG_BETA_EXT, /**< Used in numpy.random.beta() impl, requires extra parameters */ @@ -314,32 +257,22 @@ enum class DPNPFuncName : size_t DPNP_FN_RNG_ZIPF_EXT, /**< Used in numpy.random.zipf() impl, requires extra parameters */ DPNP_FN_SEARCHSORTED, /**< Used in numpy.searchsorted() impl */ - DPNP_FN_SIGN, /**< Used in numpy.sign() impl */ - DPNP_FN_SIN, /**< Used in numpy.sin() impl */ - DPNP_FN_SINH, /**< Used in numpy.sinh() impl */ DPNP_FN_SORT, /**< Used in numpy.sort() impl */ DPNP_FN_SQRT, /**< Used in numpy.sqrt() impl */ DPNP_FN_SQRT_EXT, /**< Used in numpy.sqrt() impl, requires extra parameters */ - DPNP_FN_SQUARE, /**< Used in numpy.square() impl */ DPNP_FN_STD, /**< Used in numpy.std() impl */ - DPNP_FN_SUBTRACT, /**< Used in numpy.subtract() impl */ DPNP_FN_SUBTRACT_EXT, /**< Used in numpy.subtract() impl, requires extra parameters */ DPNP_FN_SUM, /**< Used in numpy.sum() impl */ - DPNP_FN_SVD, /**< Used in numpy.linalg.svd() impl */ DPNP_FN_TAKE, /**< Used in numpy.take() impl */ - DPNP_FN_TAN, /**< Used in numpy.tan() impl */ - DPNP_FN_TANH, /**< Used in numpy.tanh() impl */ DPNP_FN_TRANSPOSE, /**< Used in numpy.transpose() impl */ DPNP_FN_TRACE, /**< Used in numpy.trace() impl */ - DPNP_FN_TRAPZ, /**< Used in numpy.trapz() impl */ DPNP_FN_TRAPZ_EXT, /**< Used in numpy.trapz() impl, requires extra parameters 
*/ DPNP_FN_TRI, /**< Used in numpy.tri() impl */ DPNP_FN_TRIL, /**< Used in numpy.tril() impl */ DPNP_FN_TRIU, /**< Used in numpy.triu() impl */ - DPNP_FN_TRUNC, /**< Used in numpy.trunc() impl */ DPNP_FN_VANDER, /**< Used in numpy.vander() impl */ DPNP_FN_VAR, /**< Used in numpy.var() impl */ DPNP_FN_ZEROS, /**< Used in numpy.zeros() impl */ diff --git a/dpnp/backend/kernels/dpnp_krnl_bitwise.cpp b/dpnp/backend/kernels/dpnp_krnl_bitwise.cpp deleted file mode 100644 index 9db8425f6de..00000000000 --- a/dpnp/backend/kernels/dpnp_krnl_bitwise.cpp +++ /dev/null @@ -1,438 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -#include - -#include "dpnp_fptr.hpp" -#include "dpnp_iface.hpp" -#include "dpnp_iterator.hpp" -#include "dpnp_utils.hpp" -#include "dpnpc_memory_adapter.hpp" -#include "queue_sycl.hpp" - -// dpctl tensor headers -#include "kernels/alignment.hpp" - -using dpctl::tensor::kernels::alignment_utils::is_aligned; -using dpctl::tensor::kernels::alignment_utils::required_alignment; - -template -class dpnp_invert_c_kernel; - -template -DPCTLSyclEventRef dpnp_invert_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - _DataType *input_data = static_cast<_DataType *>(array1_in); - _DataType *result = static_cast<_DataType *>(result1); - - constexpr size_t lws = 64; - constexpr unsigned int vec_sz = 8; - - auto gws_range = - sycl::range<1>(((size + lws * vec_sz - 1) / (lws * vec_sz)) * lws); - auto lws_range = sycl::range<1>(lws); - - auto kernel_parallel_for_func = [=](sycl::nd_item<1> nd_it) { - auto sg = nd_it.get_sub_group(); - const auto max_sg_size = sg.get_max_local_range()[0]; - const size_t start = - vec_sz * (nd_it.get_group(0) * nd_it.get_local_range(0) + - sg.get_group_id()[0] * max_sg_size); - - if 
(is_aligned(input_data) && - is_aligned(result) && - (start + static_cast(vec_sz) * max_sg_size < size)) - { - auto input_multi_ptr = sycl::address_space_cast< - sycl::access::address_space::global_space, - sycl::access::decorated::yes>(&input_data[start]); - auto result_multi_ptr = sycl::address_space_cast< - sycl::access::address_space::global_space, - sycl::access::decorated::yes>(&result[start]); - - sycl::vec<_DataType, vec_sz> x = sg.load(input_multi_ptr); - sycl::vec<_DataType, vec_sz> res_vec; - - if constexpr (std::is_same_v<_DataType, bool>) { -#pragma unroll - for (size_t k = 0; k < vec_sz; ++k) { - res_vec[k] = !(x[k]); - } - } - else { - res_vec = ~x; - } - - sg.store(result_multi_ptr, res_vec); - } - else { - for (size_t k = start + sg.get_local_id()[0]; k < size; - k += max_sg_size) { - if constexpr (std::is_same_v<_DataType, bool>) { - result[k] = !(input_data[k]); - } - else { - result[k] = ~(input_data[k]); - } - } - } - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - sycl::nd_range<1>(gws_range, lws_range), kernel_parallel_for_func); - }; - event = q.submit(kernel_func); - - event_ref = reinterpret_cast(&event); - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_invert_c(void *array1_in, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_invert_c<_DataType>( - q_ref, array1_in, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_invert_default_c)(void *, - void *, - size_t) = dpnp_invert_c<_DataType>; - -static void func_map_init_bitwise_1arg_1type(func_map_t &fmap) -{ - fmap[DPNPFuncName::DPNP_FN_INVERT][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_invert_default_c}; - fmap[DPNPFuncName::DPNP_FN_INVERT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_invert_default_c}; - 
fmap[DPNPFuncName::DPNP_FN_INVERT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_invert_default_c}; - - return; -} - -#define MACRO_2ARG_1TYPE_OP(__name__, __operation__) \ - template \ - class __name__##_kernel; \ - \ - template \ - class __name__##_strides_kernel; \ - \ - template \ - class __name__##_broadcast_kernel; \ - \ - template \ - DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref) \ - { \ - /* avoid warning unused variable*/ \ - (void)result_shape; \ - (void)where; \ - (void)dep_event_vec_ref; \ - \ - DPCTLSyclEventRef event_ref = nullptr; \ - \ - if (!input1_size || !input2_size) { \ - return event_ref; \ - } \ - \ - sycl::queue q = *(reinterpret_cast(q_ref)); \ - \ - _DataType *input1_data = \ - static_cast<_DataType *>(const_cast(input1_in)); \ - _DataType *input2_data = \ - static_cast<_DataType *>(const_cast(input2_in)); \ - _DataType *result = static_cast<_DataType *>(result_out); \ - \ - bool use_broadcasting = !array_equal(input1_shape, input1_ndim, \ - input2_shape, input2_ndim); \ - \ - shape_elem_type *input1_shape_offsets = \ - new shape_elem_type[input1_ndim]; \ - \ - get_shape_offsets_inkernel(input1_shape, input1_ndim, \ - input1_shape_offsets); \ - bool use_strides = !array_equal(input1_strides, input1_ndim, \ - input1_shape_offsets, input1_ndim); \ - delete[] input1_shape_offsets; \ - \ - shape_elem_type *input2_shape_offsets = \ - new shape_elem_type[input2_ndim]; \ - \ - 
get_shape_offsets_inkernel(input2_shape, input2_ndim, \ - input2_shape_offsets); \ - use_strides = \ - use_strides || !array_equal(input2_strides, input2_ndim, \ - input2_shape_offsets, input2_ndim); \ - delete[] input2_shape_offsets; \ - \ - sycl::event event; \ - sycl::range<1> gws(result_size); \ - \ - if (use_broadcasting) { \ - DPNPC_id<_DataType> *input1_it; \ - const size_t input1_it_size_in_bytes = \ - sizeof(DPNPC_id<_DataType>); \ - input1_it = reinterpret_cast *>( \ - dpnp_memory_alloc_c(q_ref, input1_it_size_in_bytes)); \ - new (input1_it) \ - DPNPC_id<_DataType>(q_ref, input1_data, input1_shape, \ - input1_strides, input1_ndim); \ - \ - input1_it->broadcast_to_shape(result_shape, result_ndim); \ - \ - DPNPC_id<_DataType> *input2_it; \ - const size_t input2_it_size_in_bytes = \ - sizeof(DPNPC_id<_DataType>); \ - input2_it = reinterpret_cast *>( \ - dpnp_memory_alloc_c(q_ref, input2_it_size_in_bytes)); \ - new (input2_it) \ - DPNPC_id<_DataType>(q_ref, input2_data, input2_shape, \ - input2_strides, input2_ndim); \ - \ - input2_it->broadcast_to_shape(result_shape, result_ndim); \ - \ - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { \ - const size_t i = global_id[0]; /* for (size_t i = 0; i < \ - result_size; ++i) */ \ - { \ - const _DataType input1_elem = (*input1_it)[i]; \ - const _DataType input2_elem = (*input2_it)[i]; \ - result[i] = __operation__; \ - } \ - }; \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.parallel_for< \ - class __name__##_broadcast_kernel<_DataType>>( \ - gws, kernel_parallel_for_func); \ - }; \ - \ - q.submit(kernel_func).wait(); \ - \ - input1_it->~DPNPC_id(); \ - input2_it->~DPNPC_id(); \ - \ - return event_ref; \ - } \ - else if (use_strides) { \ - if ((result_ndim != input1_ndim) || (result_ndim != input2_ndim)) \ - { \ - throw std::runtime_error( \ - "Result ndim=" + std::to_string(result_ndim) + \ - " mismatches with either input1 ndim=" + \ - std::to_string(input1_ndim) + \ - " or input2 ndim=" + 
std::to_string(input2_ndim)); \ - } \ - \ - /* memory transfer optimization, use USM-host for temporary speeds \ - * up transfer to device */ \ - using usm_host_allocatorT = \ - sycl::usm_allocator; \ - \ - size_t strides_size = 3 * result_ndim; \ - shape_elem_type *dev_strides_data = \ - sycl::malloc_device(strides_size, q); \ - \ - /* create host temporary for packed strides managed by shared \ - * pointer */ \ - auto strides_host_packed = \ - std::vector( \ - strides_size, usm_host_allocatorT(q)); \ - \ - /* packed vector is concatenation of result_strides, \ - * input1_strides and input2_strides */ \ - std::copy(result_strides, result_strides + result_ndim, \ - strides_host_packed.begin()); \ - std::copy(input1_strides, input1_strides + result_ndim, \ - strides_host_packed.begin() + result_ndim); \ - std::copy(input2_strides, input2_strides + result_ndim, \ - strides_host_packed.begin() + 2 * result_ndim); \ - \ - auto copy_strides_ev = q.copy( \ - strides_host_packed.data(), dev_strides_data, \ - strides_host_packed.size()); \ - \ - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { \ - const size_t output_id = \ - global_id[0]; /* for (size_t i = 0; i < result_size; ++i) \ - */ \ - { \ - const shape_elem_type *result_strides_data = \ - &dev_strides_data[0]; \ - const shape_elem_type *input1_strides_data = \ - &dev_strides_data[result_ndim]; \ - const shape_elem_type *input2_strides_data = \ - &dev_strides_data[2 * result_ndim]; \ - \ - size_t input1_id = 0; \ - size_t input2_id = 0; \ - \ - for (size_t i = 0; i < result_ndim; ++i) { \ - const size_t output_xyz_id = \ - get_xyz_id_by_id_inkernel(output_id, \ - result_strides_data, \ - result_ndim, i); \ - input1_id += output_xyz_id * input1_strides_data[i]; \ - input2_id += output_xyz_id * input2_strides_data[i]; \ - } \ - \ - const _DataType input1_elem = \ - (input1_size == 1) ? input1_data[0] \ - : input1_data[input1_id]; \ - const _DataType input2_elem = \ - (input2_size == 1) ? 
input2_data[0] \ - : input2_data[input2_id]; \ - result[output_id] = __operation__; \ - } \ - }; \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.depends_on(copy_strides_ev); \ - cgh.parallel_for>( \ - gws, kernel_parallel_for_func); \ - }; \ - \ - q.submit(kernel_func).wait(); \ - \ - sycl::free(dev_strides_data, q); \ - return event_ref; \ - } \ - else { \ - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { \ - size_t i = global_id[0]; /* for (size_t i = 0; i < \ - result_size; ++i) */ \ - const _DataType input1_elem = \ - (input1_size == 1) ? input1_data[0] : input1_data[i]; \ - const _DataType input2_elem = \ - (input2_size == 1) ? input2_data[0] : input2_data[i]; \ - result[i] = __operation__; \ - }; \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.parallel_for>( \ - gws, kernel_parallel_for_func); \ - }; \ - event = q.submit(kernel_func); \ - } \ - \ - event_ref = reinterpret_cast(&event); \ - return DPCTLEvent_Copy(event_ref); \ - } \ - \ - template \ - void __name__( \ - void *result_out, const size_t result_size, const size_t result_ndim, \ - const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where) \ - { \ - DPCTLSyclQueueRef q_ref = \ - reinterpret_cast(&DPNP_QUEUE); \ - DPCTLEventVectorRef dep_event_vec_ref = nullptr; \ - DPCTLSyclEventRef event_ref = __name__<_DataType>( \ - q_ref, result_out, result_size, result_ndim, result_shape, \ - result_strides, input1_in, input1_size, input1_ndim, input1_shape, \ - input1_strides, input2_in, input2_size, input2_ndim, input2_shape, \ - input2_strides, where, dep_event_vec_ref); \ - DPCTLEvent_WaitAndThrow(event_ref); \ - 
DPCTLEvent_Delete(event_ref); \ - } \ - \ - template \ - void (*__name__##_default)( \ - void *, const size_t, const size_t, const shape_elem_type *, \ - const shape_elem_type *, const void *, const size_t, const size_t, \ - const shape_elem_type *, const shape_elem_type *, const void *, \ - const size_t, const size_t, const shape_elem_type *, \ - const shape_elem_type *, const size_t *) = __name__<_DataType>; - -#include - -static void func_map_init_bitwise_2arg_1type(func_map_t &fmap) -{ - fmap[DPNPFuncName::DPNP_FN_BITWISE_AND][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_bitwise_and_c_default}; - fmap[DPNPFuncName::DPNP_FN_BITWISE_AND][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_bitwise_and_c_default}; - - fmap[DPNPFuncName::DPNP_FN_BITWISE_OR][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_bitwise_or_c_default}; - fmap[DPNPFuncName::DPNP_FN_BITWISE_OR][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_bitwise_or_c_default}; - - fmap[DPNPFuncName::DPNP_FN_BITWISE_XOR][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_bitwise_xor_c_default}; - fmap[DPNPFuncName::DPNP_FN_BITWISE_XOR][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_bitwise_xor_c_default}; - - fmap[DPNPFuncName::DPNP_FN_LEFT_SHIFT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_left_shift_c_default}; - fmap[DPNPFuncName::DPNP_FN_LEFT_SHIFT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_left_shift_c_default}; - - fmap[DPNPFuncName::DPNP_FN_RIGHT_SHIFT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_right_shift_c_default}; - fmap[DPNPFuncName::DPNP_FN_RIGHT_SHIFT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_right_shift_c_default}; - - return; -} - -void func_map_init_bitwise(func_map_t &fmap) -{ - func_map_init_bitwise_1arg_1type(fmap); - func_map_init_bitwise_2arg_1type(fmap); - - return; -} diff --git a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp index 486851516dc..20be65f53ca 100644 --- a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp +++ 
b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp @@ -230,77 +230,6 @@ using dpctl::tensor::kernels::alignment_utils::required_alignment; static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_ARCCOS][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_acos_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOS][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_acos_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_acos_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_acos_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCCOSH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_acosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOSH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_acosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOSH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_acosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCCOSH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_acosh_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCSIN][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_asin_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSIN][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_asin_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSIN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_asin_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSIN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_asin_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCSINH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_asinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSINH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_asinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSINH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_asinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCSINH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_asinh_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCTAN][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_atan_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN][eft_LNG][eft_LNG] = { - eft_DBL, (void 
*)dpnp_atan_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_atan_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_atan_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCTANH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_atanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTANH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_atanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTANH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_atanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTANH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_atanh_c_default}; - - fmap[DPNPFuncName::DPNP_FN_CBRT][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_cbrt_c_default}; - fmap[DPNPFuncName::DPNP_FN_CBRT][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_cbrt_c_default}; - fmap[DPNPFuncName::DPNP_FN_CBRT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cbrt_c_default}; - fmap[DPNPFuncName::DPNP_FN_CBRT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cbrt_c_default}; - - fmap[DPNPFuncName::DPNP_FN_CEIL][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_ceil_c_default}; - fmap[DPNPFuncName::DPNP_FN_CEIL][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_ceil_c_default}; - fmap[DPNPFuncName::DPNP_FN_CEIL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_ceil_c_default}; - fmap[DPNPFuncName::DPNP_FN_CEIL][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_ceil_c_default}; fmap[DPNPFuncName::DPNP_FN_COPYTO][eft_BLN][eft_BLN] = { eft_BLN, (void *)dpnp_copyto_c_default}; @@ -378,24 +307,6 @@ static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_COPYTO_EXT][eft_FLT][eft_FLT] = { eft_FLT, (void *)dpnp_copyto_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COS][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_cos_c_default}; - fmap[DPNPFuncName::DPNP_FN_COS][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_cos_c_default}; - fmap[DPNPFuncName::DPNP_FN_COS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cos_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_COS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cos_c_default}; - - fmap[DPNPFuncName::DPNP_FN_COSH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_cosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_COSH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_cosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_COSH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_COSH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cosh_c_default}; - fmap[DPNPFuncName::DPNP_FN_DEGREES][eft_INT][eft_INT] = { eft_DBL, (void *)dpnp_degrees_c_default}; fmap[DPNPFuncName::DPNP_FN_DEGREES][eft_LNG][eft_LNG] = { @@ -426,87 +337,6 @@ static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_DEGREES_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_degrees_c_ext}; - fmap[DPNPFuncName::DPNP_FN_EXP2][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_exp2_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP2][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_exp2_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP2][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_exp2_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP2][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_exp2_c_default}; - - fmap[DPNPFuncName::DPNP_FN_EXP][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_exp_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_exp_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_exp_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXP][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_exp_c_default}; - - fmap[DPNPFuncName::DPNP_FN_EXPM1][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_expm1_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXPM1][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_expm1_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXPM1][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_expm1_c_default}; - fmap[DPNPFuncName::DPNP_FN_EXPM1][eft_DBL][eft_DBL] = { - eft_DBL, (void 
*)dpnp_expm1_c_default}; - - fmap[DPNPFuncName::DPNP_FN_FABS][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_fabs_c_default}; - fmap[DPNPFuncName::DPNP_FN_FABS][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_fabs_c_default}; - fmap[DPNPFuncName::DPNP_FN_FABS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_fabs_c_default}; - fmap[DPNPFuncName::DPNP_FN_FABS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_fabs_c_default}; - - fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_floor_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_floor_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_floor_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLOOR][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_floor_c_default}; - - fmap[DPNPFuncName::DPNP_FN_LOG10][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_log10_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG10][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_log10_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG10][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_log10_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG10][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_log10_c_default}; - - fmap[DPNPFuncName::DPNP_FN_LOG1P][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_log1p_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG1P][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_log1p_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG1P][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_log1p_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG1P][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_log1p_c_default}; - - fmap[DPNPFuncName::DPNP_FN_LOG2][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_log2_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG2][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_log2_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG2][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_log2_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG2][eft_DBL][eft_DBL] = { - eft_DBL, (void 
*)dpnp_log2_c_default}; - - fmap[DPNPFuncName::DPNP_FN_LOG][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_log_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_log_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_log_c_default}; - fmap[DPNPFuncName::DPNP_FN_LOG][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_log_c_default}; - fmap[DPNPFuncName::DPNP_FN_RADIANS][eft_INT][eft_INT] = { eft_DBL, (void *)dpnp_radians_c_default}; fmap[DPNPFuncName::DPNP_FN_RADIANS][eft_LNG][eft_LNG] = { @@ -537,24 +367,6 @@ static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_RADIANS_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_radians_c_ext}; - fmap[DPNPFuncName::DPNP_FN_SIN][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_sin_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIN][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_sin_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_sin_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_sin_c_default}; - - fmap[DPNPFuncName::DPNP_FN_SINH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_sinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_SINH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_sinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_SINH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_sinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_SINH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_sinh_c_default}; - fmap[DPNPFuncName::DPNP_FN_SQRT][eft_INT][eft_INT] = { eft_DBL, (void *)dpnp_sqrt_c_default}; fmap[DPNPFuncName::DPNP_FN_SQRT][eft_LNG][eft_LNG] = { @@ -570,33 +382,6 @@ static void func_map_init_elemwise_1arg_2type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_SQRT_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_sqrt_c_ext}; - fmap[DPNPFuncName::DPNP_FN_TAN][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_tan_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_TAN][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_tan_c_default}; - fmap[DPNPFuncName::DPNP_FN_TAN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_tan_c_default}; - fmap[DPNPFuncName::DPNP_FN_TAN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_tan_c_default}; - - fmap[DPNPFuncName::DPNP_FN_TANH][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_tanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_TANH][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_tanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_TANH][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_tanh_c_default}; - fmap[DPNPFuncName::DPNP_FN_TANH][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_tanh_c_default}; - - fmap[DPNPFuncName::DPNP_FN_TRUNC][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_trunc_c_default}; - fmap[DPNPFuncName::DPNP_FN_TRUNC][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_trunc_c_default}; - fmap[DPNPFuncName::DPNP_FN_TRUNC][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_trunc_c_default}; - fmap[DPNPFuncName::DPNP_FN_TRUNC][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_trunc_c_default}; - return; } @@ -612,21 +397,6 @@ constexpr T dispatch_erf_op(T elem) } } -template -constexpr T dispatch_sign_op(T elem) -{ - if constexpr (is_any_v) { - if (elem > 0) - return T(1); - if (elem < 0) - return T(-1); - return elem; // elem is 0 - } - else { - return sycl::sign(elem); - } -} - template constexpr auto dispatch_fmod_op(T elem1, T elem2) { @@ -837,45 +607,6 @@ constexpr auto dispatch_fmod_op(T elem1, T elem2) static void func_map_init_elemwise_1arg_1type(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_CONJUGATE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_CONJUGATE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_CONJUGATE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_CONJUGATE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_copy_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_CONJUGATE][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_conjugate_c_default>}; - - fmap[DPNPFuncName::DPNP_FN_COPY][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPY][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPY][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPY][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPY][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPY][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_copy_c_default>}; - - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_copy_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_copy_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_copy_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_copy_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_copy_c_ext}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_C64][eft_C64] = { - eft_C64, (void *)dpnp_copy_c_ext>}; - fmap[DPNPFuncName::DPNP_FN_COPY_EXT][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_copy_c_ext>}; - fmap[DPNPFuncName::DPNP_FN_ERF][eft_INT][eft_INT] = { eft_INT, (void *)dpnp_erf_c_default}; fmap[DPNPFuncName::DPNP_FN_ERF][eft_LNG][eft_LNG] = { @@ -894,55 +625,6 @@ static void func_map_init_elemwise_1arg_1type(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_ERF_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_erf_c_ext}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_LNG][eft_LNG] = { - eft_LNG, 
(void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_copy_c_default}; - fmap[DPNPFuncName::DPNP_FN_FLATTEN][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_copy_c_default>}; - - fmap[DPNPFuncName::DPNP_FN_NEGATIVE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_negative_c_default}; - fmap[DPNPFuncName::DPNP_FN_NEGATIVE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_negative_c_default}; - fmap[DPNPFuncName::DPNP_FN_NEGATIVE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_negative_c_default}; - fmap[DPNPFuncName::DPNP_FN_NEGATIVE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_negative_c_default}; - - fmap[DPNPFuncName::DPNP_FN_RECIP][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_recip_c_default}; - fmap[DPNPFuncName::DPNP_FN_RECIP][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_recip_c_default}; - fmap[DPNPFuncName::DPNP_FN_RECIP][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_recip_c_default}; - fmap[DPNPFuncName::DPNP_FN_RECIP][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_recip_c_default}; - - fmap[DPNPFuncName::DPNP_FN_SIGN][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_sign_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIGN][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_sign_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIGN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_sign_c_default}; - fmap[DPNPFuncName::DPNP_FN_SIGN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_sign_c_default}; - - fmap[DPNPFuncName::DPNP_FN_SQUARE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_square_c_default}; - fmap[DPNPFuncName::DPNP_FN_SQUARE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_square_c_default}; - fmap[DPNPFuncName::DPNP_FN_SQUARE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_square_c_default}; - fmap[DPNPFuncName::DPNP_FN_SQUARE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_square_c_default}; - return; } @@ -1344,47 
+1026,6 @@ static void func_map_init_elemwise_1arg_1type(func_map_t &fmap) #include -template -static constexpr DPNPFuncType get_divide_res_type() -{ - constexpr auto widest_type = populate_func_types(); - constexpr auto shortes_type = (widest_type == FT1) ? FT2 : FT1; - - if constexpr (widest_type == DPNPFuncType::DPNP_FT_CMPLX128 || - widest_type == DPNPFuncType::DPNP_FT_DOUBLE) - { - return widest_type; - } - else if constexpr (widest_type == DPNPFuncType::DPNP_FT_CMPLX64) { - if constexpr (shortes_type == DPNPFuncType::DPNP_FT_DOUBLE) { - return DPNPFuncType::DPNP_FT_CMPLX128; - } - else if constexpr (has_fp64::value && - (shortes_type == DPNPFuncType::DPNP_FT_INT || - shortes_type == DPNPFuncType::DPNP_FT_LONG)) - { - return DPNPFuncType::DPNP_FT_CMPLX128; - } - } - else if constexpr (widest_type == DPNPFuncType::DPNP_FT_FLOAT) { - if constexpr (has_fp64::value && - (shortes_type == DPNPFuncType::DPNP_FT_INT || - shortes_type == DPNPFuncType::DPNP_FT_LONG)) - { - return DPNPFuncType::DPNP_FT_DOUBLE; - } - } - else if constexpr (has_fp64::value) { - return DPNPFuncType::DPNP_FT_DOUBLE; - } - else { - return DPNPFuncType::DPNP_FT_FLOAT; - } - return widest_type; -} - template static void func_map_elemwise_2arg_3type_core(func_map_t &fmap) { @@ -1445,270 +1086,7 @@ static void func_map_elemwise_2arg_3type_short_helper(func_map_t &fmap) static void func_map_init_elemwise_2arg_3type(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_ADD][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_LNG][eft_LNG] = { - eft_LNG, (void 
*)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_add_c_default}; - fmap[DPNPFuncName::DPNP_FN_ADD][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_add_c_default}; - - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_FLT][eft_LNG] = { - eft_DBL, (void 
*)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - fmap[DPNPFuncName::DPNP_FN_ARCTAN2][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_arctan2_c_default}; - - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_DBL][eft_INT] = { - eft_DBL, 
(void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - fmap[DPNPFuncName::DPNP_FN_COPYSIGN][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_copysign_c_default}; - - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_divide_c_default}; - fmap[DPNPFuncName::DPNP_FN_DIVIDE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_divide_c_default}; - - 
fmap[DPNPFuncName::DPNP_FN_FMOD][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - fmap[DPNPFuncName::DPNP_FN_FMOD][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_fmod_c_default}; - - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - fmap[DPNPFuncName::DPNP_FN_HYPOT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_hypot_c_default}; - - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_LNG][eft_DBL] = { - eft_DBL, 
(void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MAXIMUM][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_maximum_c_default}; - - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_FLT][eft_FLT] = { - eft_FLT, (void 
*)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - fmap[DPNPFuncName::DPNP_FN_MINIMUM][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_minimum_c_default}; - + // Used in dpnp_dot_c fmap[DPNPFuncName::DPNP_FN_MULTIPLY][eft_BLN][eft_BLN] = { eft_BLN, (void *)dpnp_multiply_c_default}; fmap[DPNPFuncName::DPNP_FN_MULTIPLY][eft_BLN][eft_INT] = { @@ -1811,72 +1189,6 @@ static void func_map_init_elemwise_2arg_3type(func_map_t &fmap) (void *)dpnp_multiply_c_default< std::complex, std::complex, std::complex>}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_power_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_POWER][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_power_c_default}; - fmap[DPNPFuncName::DPNP_FN_POWER][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_power_c_default}; - - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - 
fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - fmap[DPNPFuncName::DPNP_FN_SUBTRACT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_subtract_c_default}; - func_map_elemwise_2arg_3type_helper(fmap); diff --git a/dpnp/backend/kernels/dpnp_krnl_logic.cpp b/dpnp/backend/kernels/dpnp_krnl_logic.cpp index 0174b47339a..c8197d72889 100644 --- a/dpnp/backend/kernels/dpnp_krnl_logic.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_logic.cpp @@ -405,278 +405,6 @@ DPCTLSyclEventRef (*dpnp_any_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_any_c<_DataType, _ResultType>; -#define MACRO_2ARG_2TYPES_LOGIC_OP(__name__, __operation__) \ - template \ - class __name__##_kernel; \ - \ - template \ - class __name__##_broadcast_kernel; \ - \ - template \ - class __name__##_strides_kernel; \ - \ - template \ - DPCTLSyclEventRef __name__( \ - DPCTLSyclQueueRef q_ref, void *result_out, const size_t result_size, \ - const size_t result_ndim, const shape_elem_type *result_shape, \ - const shape_elem_type *result_strides, const void *input1_in, \ - const size_t input1_size, const size_t input1_ndim, \ - const shape_elem_type *input1_shape, \ - const shape_elem_type *input1_strides, const void *input2_in, \ - const size_t input2_size, const size_t input2_ndim, \ - const shape_elem_type *input2_shape, \ - const shape_elem_type *input2_strides, const size_t *where, \ - const DPCTLEventVectorRef dep_event_vec_ref) \ - { \ - /* avoid warning unused variable*/ \ - (void)where; \ - (void)dep_event_vec_ref; \ - \ - DPCTLSyclEventRef event_ref = nullptr; \ - \ - if (!input1_size || !input2_size) { \ - return event_ref; \ - } \ - \ - sycl::queue q = *(reinterpret_cast(q_ref)); \ - \ - _DataType_input1 *input1_data = \ - static_cast<_DataType_input1 *>(const_cast(input1_in)); \ - _DataType_input2 *input2_data = \ - static_cast<_DataType_input2 *>(const_cast(input2_in)); \ - bool *result = static_cast(result_out); \ - \ - bool use_broadcasting 
= !array_equal(input1_shape, input1_ndim, \ - input2_shape, input2_ndim); \ - \ - shape_elem_type *input1_shape_offsets = \ - new shape_elem_type[input1_ndim]; \ - \ - get_shape_offsets_inkernel(input1_shape, input1_ndim, \ - input1_shape_offsets); \ - bool use_strides = !array_equal(input1_strides, input1_ndim, \ - input1_shape_offsets, input1_ndim); \ - delete[] input1_shape_offsets; \ - \ - shape_elem_type *input2_shape_offsets = \ - new shape_elem_type[input2_ndim]; \ - \ - get_shape_offsets_inkernel(input2_shape, input2_ndim, \ - input2_shape_offsets); \ - use_strides = \ - use_strides || !array_equal(input2_strides, input2_ndim, \ - input2_shape_offsets, input2_ndim); \ - delete[] input2_shape_offsets; \ - \ - sycl::event event; \ - sycl::range<1> gws(result_size); /* used only when use_broadcasting or \ - use_strides is true */ \ - \ - if (use_broadcasting) { \ - DPNPC_id<_DataType_input1> *input1_it; \ - const size_t input1_it_size_in_bytes = \ - sizeof(DPNPC_id<_DataType_input1>); \ - input1_it = reinterpret_cast *>( \ - dpnp_memory_alloc_c(q_ref, input1_it_size_in_bytes)); \ - new (input1_it) \ - DPNPC_id<_DataType_input1>(q_ref, input1_data, input1_shape, \ - input1_strides, input1_ndim); \ - \ - input1_it->broadcast_to_shape(result_shape, result_ndim); \ - \ - DPNPC_id<_DataType_input2> *input2_it; \ - const size_t input2_it_size_in_bytes = \ - sizeof(DPNPC_id<_DataType_input2>); \ - input2_it = reinterpret_cast *>( \ - dpnp_memory_alloc_c(q_ref, input2_it_size_in_bytes)); \ - new (input2_it) \ - DPNPC_id<_DataType_input2>(q_ref, input2_data, input2_shape, \ - input2_strides, input2_ndim); \ - \ - input2_it->broadcast_to_shape(result_shape, result_ndim); \ - \ - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { \ - const size_t i = global_id[0]; /* for (size_t i = 0; i < \ - result_size; ++i) */ \ - { \ - const _DataType_input1 input1_elem = (*input1_it)[i]; \ - const _DataType_input2 input2_elem = (*input2_it)[i]; \ - result[i] = 
__operation__; \ - } \ - }; \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.parallel_for>( \ - gws, kernel_parallel_for_func); \ - }; \ - \ - q.submit(kernel_func).wait(); \ - \ - input1_it->~DPNPC_id(); \ - input2_it->~DPNPC_id(); \ - \ - return event_ref; \ - } \ - else if (use_strides) { \ - if ((result_ndim != input1_ndim) || (result_ndim != input2_ndim)) \ - { \ - throw std::runtime_error( \ - "Result ndim=" + std::to_string(result_ndim) + \ - " mismatches with either input1 ndim=" + \ - std::to_string(input1_ndim) + \ - " or input2 ndim=" + std::to_string(input2_ndim)); \ - } \ - \ - /* memory transfer optimization, use USM-host for temporary speeds \ - * up transfer to device */ \ - using usm_host_allocatorT = \ - sycl::usm_allocator; \ - \ - size_t strides_size = 3 * result_ndim; \ - shape_elem_type *dev_strides_data = \ - sycl::malloc_device(strides_size, q); \ - \ - /* create host temporary for packed strides managed by shared \ - * pointer */ \ - auto strides_host_packed = \ - std::vector( \ - strides_size, usm_host_allocatorT(q)); \ - \ - /* packed vector is concatenation of result_strides, \ - * input1_strides and input2_strides */ \ - std::copy(result_strides, result_strides + result_ndim, \ - strides_host_packed.begin()); \ - std::copy(input1_strides, input1_strides + result_ndim, \ - strides_host_packed.begin() + result_ndim); \ - std::copy(input2_strides, input2_strides + result_ndim, \ - strides_host_packed.begin() + 2 * result_ndim); \ - \ - auto copy_strides_ev = q.copy( \ - strides_host_packed.data(), dev_strides_data, \ - strides_host_packed.size()); \ - \ - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { \ - const size_t output_id = \ - global_id[0]; /* for (size_t i = 0; i < result_size; ++i) \ - */ \ - { \ - const shape_elem_type *result_strides_data = \ - &dev_strides_data[0]; \ - const shape_elem_type *input1_strides_data = \ - &dev_strides_data[result_ndim]; \ - const shape_elem_type *input2_strides_data = \ - 
&dev_strides_data[2 * result_ndim]; \ - \ - size_t input1_id = 0; \ - size_t input2_id = 0; \ - \ - for (size_t i = 0; i < result_ndim; ++i) { \ - const size_t output_xyz_id = \ - get_xyz_id_by_id_inkernel(output_id, \ - result_strides_data, \ - result_ndim, i); \ - input1_id += output_xyz_id * input1_strides_data[i]; \ - input2_id += output_xyz_id * input2_strides_data[i]; \ - } \ - \ - const _DataType_input1 input1_elem = \ - input1_data[input1_id]; \ - const _DataType_input2 input2_elem = \ - input2_data[input2_id]; \ - result[output_id] = __operation__; \ - } \ - }; \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.depends_on(copy_strides_ev); \ - cgh.parallel_for>( \ - gws, kernel_parallel_for_func); \ - }; \ - \ - q.submit(kernel_func).wait(); \ - \ - sycl::free(dev_strides_data, q); \ - return event_ref; \ - } \ - else { \ - constexpr size_t lws = 64; \ - constexpr unsigned int vec_sz = 8; \ - \ - auto gws_range = sycl::range<1>( \ - ((result_size + lws * vec_sz - 1) / (lws * vec_sz)) * lws); \ - auto lws_range = sycl::range<1>(lws); \ - \ - auto kernel_parallel_for_func = [=](sycl::nd_item<1> nd_it) { \ - auto sg = nd_it.get_sub_group(); \ - const auto max_sg_size = sg.get_max_local_range()[0]; \ - const size_t start = \ - vec_sz * (nd_it.get_group(0) * nd_it.get_local_range(0) + \ - sg.get_group_id()[0] * max_sg_size); \ - \ - if (is_aligned(input1_data) && \ - is_aligned(input2_data) && \ - is_aligned(result) && \ - (start + static_cast(vec_sz) * max_sg_size < \ - result_size)) \ - { \ - auto input1_multi_ptr = sycl::address_space_cast< \ - sycl::access::address_space::global_space, \ - sycl::access::decorated::yes>(&input1_data[start]); \ - auto input2_multi_ptr = sycl::address_space_cast< \ - sycl::access::address_space::global_space, \ - sycl::access::decorated::yes>(&input2_data[start]); \ - auto result_multi_ptr = sycl::address_space_cast< \ - sycl::access::address_space::global_space, \ - sycl::access::decorated::yes>(&result[start]); \ - \ - 
sycl::vec<_DataType_input1, vec_sz> x1 = \ - sg.load(input1_multi_ptr); \ - sycl::vec<_DataType_input2, vec_sz> x2 = \ - sg.load(input2_multi_ptr); \ - sycl::vec res_vec; \ - \ - for (size_t k = 0; k < vec_sz; ++k) { \ - const _DataType_input1 input1_elem = x1[k]; \ - const _DataType_input2 input2_elem = x2[k]; \ - res_vec[k] = __operation__; \ - } \ - sg.store(result_multi_ptr, res_vec); \ - } \ - else { \ - for (size_t k = start; k < result_size; ++k) { \ - const _DataType_input1 input1_elem = input1_data[k]; \ - const _DataType_input2 input2_elem = input2_data[k]; \ - result[k] = __operation__; \ - } \ - } \ - }; \ - \ - auto kernel_func = [&](sycl::handler &cgh) { \ - cgh.parallel_for>( \ - sycl::nd_range<1>(gws_range, lws_range), \ - kernel_parallel_for_func); \ - }; \ - event = q.submit(kernel_func); \ - } \ - \ - event_ref = reinterpret_cast(&event); \ - return DPCTLEvent_Copy(event_ref); \ - } \ - \ - template \ - DPCTLSyclEventRef (*__name__##_ext)( \ - DPCTLSyclQueueRef, void *, const size_t, const size_t, \ - const shape_elem_type *, const shape_elem_type *, const void *, \ - const size_t, const size_t, const shape_elem_type *, \ - const shape_elem_type *, const void *, const size_t, const size_t, \ - const shape_elem_type *, const shape_elem_type *, const size_t *, \ - const DPCTLEventVectorRef) = \ - __name__<_DataType_input1, _DataType_input2>; - void func_map_init_logic(func_map_t &fmap) { fmap[DPNPFuncName::DPNP_FN_ALL][eft_BLN][eft_BLN] = { diff --git a/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp b/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp index b485701154c..44cd91854df 100644 --- a/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp @@ -44,379 +44,6 @@ using dpctl::tensor::kernels::alignment_utils::required_alignment; static_assert(__SYCL_COMPILER_VERSION >= __SYCL_COMPILER_VECTOR_ABS_CHANGED, "SYCL DPC++ compiler does not meet minimum version requirement"); -template -class 
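The `MACRO_2ARG_2TYPES_LOGIC_OP` kernel removed above processes aligned data in sub-group vectors of `vec_sz = 8` elements and falls back to a scalar loop for the unaligned tail. A serial sketch of that blocking scheme, with no SYCL and a `<=` comparison standing in for `__operation__` (function name is illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial model of the vec_sz blocking in the deleted logic kernels:
// whole blocks of vec_sz elements take the "vector" path, the remaining
// n % vec_sz elements take a scalar tail loop.
constexpr std::size_t vec_sz = 8;

inline std::vector<bool> less_equal_blocked(const std::vector<double> &a,
                                            const std::vector<double> &b)
{
    const std::size_t n = a.size();
    std::vector<bool> out(n);
    std::size_t k = 0;
    // "vectorized" path: full blocks of vec_sz elements
    for (; k + vec_sz <= n; k += vec_sz)
        for (std::size_t j = 0; j < vec_sz; ++j)
            out[k + j] = (a[k + j] <= b[k + j]);
    // scalar tail, as in the kernel's unaligned else-branch
    for (; k < n; ++k)
        out[k] = (a[k] <= b[k]);
    return out;
}
```

In the real kernel the "vector" path additionally checks pointer alignment and uses sub-group `load`/`store`; this sketch only shows the block/tail split.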
dpnp_around_c_kernel; - -template -DPCTLSyclEventRef dpnp_around_c(DPCTLSyclQueueRef q_ref, - const void *input_in, - void *result_out, - const size_t input_size, - const int decimals, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - (void)decimals; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!input_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, input_in, input_size); - _DataType *input = input1_ptr.get_ptr(); - _DataType *result = reinterpret_cast<_DataType *>(result_out); - - if constexpr (std::is_same<_DataType, double>::value || - std::is_same<_DataType, float>::value) - { - event = oneapi::mkl::vm::rint(q, input_size, input, result); - } - else { - sycl::range<1> gws(input_size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - size_t i = global_id[0]; - { - result[i] = std::rint(input[i]); - } - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - } - - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_around_c(const void *input_in, - void *result_out, - const size_t input_size, - const int decimals) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_around_c<_DataType>( - q_ref, input_in, result_out, input_size, decimals, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_around_default_c)(const void *, void *, const size_t, const int) = - dpnp_around_c<_DataType>; - -template -class dpnp_elemwise_absolute_c_kernel; - -template -DPCTLSyclEventRef - dpnp_elemwise_absolute_c(DPCTLSyclQueueRef q_ref, - const void *input1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning 
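Note that the deleted `dpnp_around_c` discards its `decimals` argument (`(void)decimals;`) and rounds to the nearest integer via `std::rint` / `oneapi::mkl::vm::rint`. Under the default rounding mode this is round-half-to-even, not the away-from-zero rounding of `std::round`. A small scalar illustration (function names are ours, for contrast only):

```cpp
#include <cassert>
#include <cmath>

// std::rint follows the current rounding mode (round-half-to-even by
// default), matching the removed dpnp_around_c kernel path.
inline double round_like_dpnp_around(double x)
{
    return std::rint(x);
}

// std::round, by contrast, always rounds halfway cases away from zero.
inline double round_away_from_zero(double x)
{
    return std::round(x);
}
```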
unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - _DataType_input *array1 = - static_cast<_DataType_input *>(const_cast(input1_in)); - _DataType_output *result = static_cast<_DataType_output *>(result1); - - if constexpr (is_any_v<_DataType_input, float, double, std::complex, - std::complex>) - { - event = oneapi::mkl::vm::abs(q, size, array1, result); - } - else { - static_assert( - is_any_v<_DataType_input, int32_t, int64_t>, - "Integer types are only expected to pass in 'abs' kernel"); - static_assert(std::is_same_v<_DataType_input, _DataType_output>, - "Result type must match a type of input data"); - - constexpr size_t lws = 64; - constexpr unsigned int vec_sz = 8; - - auto gws_range = - sycl::range<1>(((size + lws * vec_sz - 1) / (lws * vec_sz)) * lws); - auto lws_range = sycl::range<1>(lws); - - auto kernel_parallel_for_func = [=](sycl::nd_item<1> nd_it) { - auto sg = nd_it.get_sub_group(); - const auto max_sg_size = sg.get_max_local_range()[0]; - const size_t start = - vec_sz * (nd_it.get_group(0) * nd_it.get_local_range(0) + - sg.get_group_id()[0] * max_sg_size); - - if (is_aligned(array1) && - is_aligned(result) && - (start + static_cast(vec_sz) * max_sg_size < size)) - { - auto array_multi_ptr = sycl::address_space_cast< - sycl::access::address_space::global_space, - sycl::access::decorated::yes>(&array1[start]); - auto result_multi_ptr = sycl::address_space_cast< - sycl::access::address_space::global_space, - sycl::access::decorated::yes>(&result[start]); - - sycl::vec<_DataType_input, vec_sz> data_vec = - sg.load(array_multi_ptr); - - sycl::vec<_DataType_output, vec_sz> res_vec = - sycl::abs(data_vec); - - sg.store(result_multi_ptr, res_vec); - } - else { - for (size_t k = start + sg.get_local_id()[0]; k < size; - k += max_sg_size) { - result[k] = std::abs(array1[k]); - } - } - }; - - auto kernel_func = 
[&](sycl::handler &cgh) { - cgh.parallel_for>( - sycl::nd_range<1>(gws_range, lws_range), - kernel_parallel_for_func); - }; - event = q.submit(kernel_func); - } - - event_ref = reinterpret_cast(&event); - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_elemwise_absolute_c(const void *input1_in, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_elemwise_absolute_c<_DataType, _DataType>( - q_ref, input1_in, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_elemwise_absolute_default_c)(const void *, void *, size_t) = - dpnp_elemwise_absolute_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_cross_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - (void)input1_size; // avoid warning unused variable - (void)input1_shape; - (void)input1_shape_ndim; - (void)input2_size; - (void)input2_shape; - (void)input2_shape_ndim; - (void)where; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType_input1> input1_ptr(q_ref, input1_in, - input1_size, true); - DPNPC_ptr_adapter<_DataType_input2> input2_ptr(q_ref, input2_in, - input2_size, true); - DPNPC_ptr_adapter<_DataType_output> result_ptr(q_ref, result_out, - input1_size, true, true); - const _DataType_input1 *input1 = input1_ptr.get_ptr(); - const _DataType_input2 *input2 = input2_ptr.get_ptr(); - _DataType_output *result = result_ptr.get_ptr(); - - result[0] = input1[1] * input2[2] - input1[2] * 
input2[1]; - - result[1] = input1[2] * input2[0] - input1[0] * input2[2]; - - result[2] = input1[0] * input2[1] - input1[1] * input2[0]; - - return event_ref; -} - -template -void dpnp_cross_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_cross_c<_DataType_output, _DataType_input1, _DataType_input2>( - q_ref, result_out, input1_in, input1_size, input1_shape, - input1_shape_ndim, input2_in, input2_size, input2_shape, - input2_shape_ndim, where, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_cross_default_c)(void *, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const size_t *) = - dpnp_cross_c<_DataType_output, _DataType_input1, _DataType_input2>; - -template -class dpnp_cumprod_c_kernel; - -template -DPCTLSyclEventRef dpnp_cumprod_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType_input> input1_ptr(q_ref, array1_in, size, true); - DPNPC_ptr_adapter<_DataType_output> result_ptr(q_ref, result1, size, true, - true); - _DataType_input *array1 = input1_ptr.get_ptr(); - _DataType_output *result = result_ptr.get_ptr(); - - _DataType_output cur_res = 1; - - for (size_t i = 0; i < size; ++i) { - cur_res *= array1[i]; - result[i] = cur_res; - } 
- - return event_ref; -} - -template -void dpnp_cumprod_c(void *array1_in, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_cumprod_c<_DataType_input, _DataType_output>( - q_ref, array1_in, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_cumprod_default_c)(void *, void *, size_t) = - dpnp_cumprod_c<_DataType_input, _DataType_output>; - -template -class dpnp_cumsum_c_kernel; - -template -DPCTLSyclEventRef dpnp_cumsum_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType_input> input1_ptr(q_ref, array1_in, size, true); - DPNPC_ptr_adapter<_DataType_output> result_ptr(q_ref, result1, size, true, - true); - _DataType_input *array1 = input1_ptr.get_ptr(); - _DataType_output *result = result_ptr.get_ptr(); - - _DataType_output cur_res = 0; - - for (size_t i = 0; i < size; ++i) { - cur_res += array1[i]; - result[i] = cur_res; - } - - return event_ref; -} - -template -void dpnp_cumsum_c(void *array1_in, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_cumsum_c<_DataType_input, _DataType_output>( - q_ref, array1_in, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_cumsum_default_c)(void *, void *, size_t) = - dpnp_cumsum_c<_DataType_input, _DataType_output>; - template class dpnp_ediff1d_c_kernel; @@ -541,176 +168,6 @@ DPCTLSyclEventRef (*dpnp_ediff1d_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = 
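The deleted `dpnp_cumprod_c` and `dpnp_cumsum_c` kernels are plain serial inclusive scans: each output element is the running product (respectively sum) of all inputs up to and including that index. Equivalent loops, simplified to a concrete element type:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Inclusive scans matching the removed serial loops, starting from the
// operation's identity element (1 for product, 0 for sum).
inline std::vector<std::int64_t> cumprod(const std::vector<std::int64_t> &in)
{
    std::vector<std::int64_t> out(in.size());
    std::int64_t cur = 1; // multiplicative identity
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = cur *= in[i];
    return out;
}

inline std::vector<std::int64_t> cumsum(const std::vector<std::int64_t> &in)
{
    std::vector<std::int64_t> out(in.size());
    std::int64_t cur = 0; // additive identity
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = cur += in[i];
    return out;
}
```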
dpnp_ediff1d_c<_DataType_input, _DataType_output>; -template -class dpnp_floor_divide_c_kernel; - -template -DPCTLSyclEventRef - dpnp_floor_divide_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)where; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!input1_size || !input2_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType_input1> input1_ptr(q_ref, input1_in, - input1_size); - DPNPC_ptr_adapter<_DataType_input2> input2_ptr(q_ref, input2_in, - input2_size); - _DataType_input1 *input1_data = input1_ptr.get_ptr(); - _DataType_input2 *input2_data = input2_ptr.get_ptr(); - _DataType_output *result = reinterpret_cast<_DataType_output *>(result_out); - - std::vector result_shape = get_result_shape( - input1_shape, input1_shape_ndim, input2_shape, input2_shape_ndim); - - DPNPC_id<_DataType_input1> *input1_it; - const size_t input1_it_size_in_bytes = sizeof(DPNPC_id<_DataType_input1>); - input1_it = reinterpret_cast *>( - dpnp_memory_alloc_c(q_ref, input1_it_size_in_bytes)); - new (input1_it) DPNPC_id<_DataType_input1>(q_ref, input1_data, input1_shape, - input1_shape_ndim); - - input1_it->broadcast_to_shape(result_shape); - - DPNPC_id<_DataType_input2> *input2_it; - const size_t input2_it_size_in_bytes = sizeof(DPNPC_id<_DataType_input2>); - input2_it = reinterpret_cast *>( - dpnp_memory_alloc_c(q_ref, input2_it_size_in_bytes)); - new (input2_it) DPNPC_id<_DataType_input2>(q_ref, input2_data, input2_shape, - input2_shape_ndim); - - input2_it->broadcast_to_shape(result_shape); - - const size_t result_size = 
input1_it->get_output_size(); - - sycl::range<1> gws(result_size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - const size_t i = - global_id[0]; /* for (size_t i = 0; i < result_size; ++i) */ - const _DataType_output input1_elem = (*input1_it)[i]; - const _DataType_output input2_elem = (*input2_it)[i]; - - double div = (double)input1_elem / (double)input2_elem; - result[i] = static_cast<_DataType_output>(sycl::floor(div)); - }; - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - sycl::event event; - - if (input1_size == input2_size) { - if constexpr ((std::is_same<_DataType_input1, double>::value || - std::is_same<_DataType_input1, float>::value) && - std::is_same<_DataType_input2, _DataType_input1>::value) - { - event = oneapi::mkl::vm::div(q, input1_size, input1_data, - input2_data, result); - event.wait(); - event = oneapi::mkl::vm::floor(q, input1_size, result, result); - } - else { - event = q.submit(kernel_func); - } - } - else { - event = q.submit(kernel_func); - } - - event.wait(); - - input1_it->~DPNPC_id(); - input2_it->~DPNPC_id(); - - sycl::free(input1_it, q); - sycl::free(input2_it, q); - - return event_ref; -} - -template -void dpnp_floor_divide_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_floor_divide_c<_DataType_output, _DataType_input1, - _DataType_input2>( - q_ref, result_out, input1_in, input1_size, input1_shape, - input1_shape_ndim, input2_in, input2_size, input2_shape, - input2_shape_ndim, where, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - 
DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_floor_divide_default_c)(void *, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const size_t *) = - dpnp_floor_divide_c<_DataType_output, _DataType_input1, _DataType_input2>; - -template -DPCTLSyclEventRef (*dpnp_floor_divide_ext_c)(DPCTLSyclQueueRef, - void *, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const size_t *, - const DPCTLEventVectorRef) = - dpnp_floor_divide_c<_DataType_output, _DataType_input1, _DataType_input2>; - template class dpnp_modf_c_kernel; @@ -796,363 +253,8 @@ DPCTLSyclEventRef (*dpnp_modf_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_modf_c<_DataType_input, _DataType_output>; -template -class dpnp_remainder_c_kernel; - -template -DPCTLSyclEventRef dpnp_remainder_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)where; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!input1_size || !input2_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType_input1> input1_ptr(q_ref, input1_in, - input1_size); - DPNPC_ptr_adapter<_DataType_input2> input2_ptr(q_ref, input2_in, - input2_size); - _DataType_input1 *input1_data = input1_ptr.get_ptr(); - _DataType_input2 *input2_data = input2_ptr.get_ptr(); - _DataType_output *result = reinterpret_cast<_DataType_output *>(result_out); - - std::vector result_shape = get_result_shape( - 
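The removed `dpnp_floor_divide_c` computes `floor(a / b)` in double precision (`sycl::floor` of the quotient), which matches NumPy's `floor_divide` and differs from C++'s integer `/` operator — which truncates toward zero — whenever the operands have opposite signs. A scalar equivalent of the kernel body:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scalar equivalent of the removed kernel body:
//   double div = (double)a / (double)b;  result = floor(div);
inline std::int64_t floor_divide(std::int64_t a, std::int64_t b)
{
    return static_cast<std::int64_t>(
        std::floor(static_cast<double>(a) / static_cast<double>(b)));
}
```

For example, `floor_divide(-7, 2)` yields `-4`, while C++'s `-7 / 2` truncates to `-3`.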
input1_shape, input1_shape_ndim, input2_shape, input2_shape_ndim); - - DPNPC_id<_DataType_input1> *input1_it; - const size_t input1_it_size_in_bytes = sizeof(DPNPC_id<_DataType_input1>); - input1_it = reinterpret_cast *>( - dpnp_memory_alloc_c(q_ref, input1_it_size_in_bytes)); - new (input1_it) DPNPC_id<_DataType_input1>(q_ref, input1_data, input1_shape, - input1_shape_ndim); - - input1_it->broadcast_to_shape(result_shape); - - DPNPC_id<_DataType_input2> *input2_it; - const size_t input2_it_size_in_bytes = sizeof(DPNPC_id<_DataType_input2>); - input2_it = reinterpret_cast *>( - dpnp_memory_alloc_c(q_ref, input2_it_size_in_bytes)); - new (input2_it) DPNPC_id<_DataType_input2>(q_ref, input2_data, input2_shape, - input2_shape_ndim); - - input2_it->broadcast_to_shape(result_shape); - - const size_t result_size = input1_it->get_output_size(); - - sycl::range<1> gws(result_size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - const size_t i = global_id[0]; - const _DataType_output input1_elem = (*input1_it)[i]; - const _DataType_output input2_elem = (*input2_it)[i]; - double fmod_res = sycl::fmod((double)input1_elem, (double)input2_elem); - double add = fmod_res + input2_elem; - result[i] = sycl::fmod(add, (double)input2_elem); - }; - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - sycl::event event; - - if (input1_size == input2_size) { - if constexpr ((std::is_same<_DataType_input1, double>::value || - std::is_same<_DataType_input1, float>::value) && - std::is_same<_DataType_input2, _DataType_input1>::value) - { - event = oneapi::mkl::vm::fmod(q, input1_size, input1_data, - input2_data, result); - event.wait(); - event = oneapi::mkl::vm::add(q, input1_size, result, input2_data, - result); - event.wait(); - event = oneapi::mkl::vm::fmod(q, input1_size, result, input2_data, - result); - } - else { - event = q.submit(kernel_func); - } - } - else { - event = q.submit(kernel_func); - } - - 
event.wait(); - - input1_it->~DPNPC_id(); - input2_it->~DPNPC_id(); - - return event_ref; -} - -template -void dpnp_remainder_c(void *result_out, - const void *input1_in, - const size_t input1_size, - const shape_elem_type *input1_shape, - const size_t input1_shape_ndim, - const void *input2_in, - const size_t input2_size, - const shape_elem_type *input2_shape, - const size_t input2_shape_ndim, - const size_t *where) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_remainder_c<_DataType_output, _DataType_input1, _DataType_input2>( - q_ref, result_out, input1_in, input1_size, input1_shape, - input1_shape_ndim, input2_in, input2_size, input2_shape, - input2_shape_ndim, where, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_remainder_default_c)(void *, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const void *, - const size_t, - const shape_elem_type *, - const size_t, - const size_t *) = - dpnp_remainder_c<_DataType_output, _DataType_input1, _DataType_input2>; - -template -class dpnp_trapz_c_kernel; - -template -DPCTLSyclEventRef dpnp_trapz_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - const void *array2_in, - void *result1, - double dx, - size_t array1_size, - size_t array2_size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if ((array1_in == nullptr) || (array2_in == nullptr && array2_size > 1)) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType_input1> input1_ptr(q_ref, array1_in, - array1_size); - DPNPC_ptr_adapter<_DataType_input2> input2_ptr(q_ref, array2_in, - array2_size); - _DataType_input1 *array1 = input1_ptr.get_ptr(); - _DataType_input2 *array2 = 
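The remainder kernel removed above uses the identity `fmod(fmod(a, b) + b, b)`: plain C `fmod` takes the sign of the dividend, and the extra add-and-fmod folds the result onto the sign of the divisor, which is the NumPy/Python convention. A small sketch of that formula (hypothetical helper name):

```python
import math

def remainder_ref(a, b):
    """Reference for the removed dpnp_remainder_c kernel:
    fmod(fmod(a, b) + b, b) yields a remainder with the sign of
    the divisor (NumPy semantics), unlike plain C fmod."""
    fmod_res = math.fmod(a, b)      # sign follows the dividend a
    return math.fmod(fmod_res + b, b)  # fold onto the sign of b
```

For example, `math.fmod(-7, 3)` is `-1.0`, but `remainder_ref(-7, 3)` is `2.0`, matching `numpy.remainder`.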
input2_ptr.get_ptr(); - _DataType_output *result = reinterpret_cast<_DataType_output *>(result1); - - if (array1_size < 2) { - const _DataType_output init_val = 0; - q.memcpy(result, &init_val, sizeof(_DataType_output)) - .wait(); // result[0] = 0; - - return event_ref; - } - - if (array1_size == array2_size) { - size_t cur_res_size = array1_size - 2; - - _DataType_output *cur_res = reinterpret_cast<_DataType_output *>( - sycl::malloc_shared((cur_res_size) * sizeof(_DataType_output), q)); - - sycl::range<1> gws(cur_res_size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - size_t i = global_id[0]; - { - cur_res[i] = array1[i + 1] * (array2[i + 2] - array2[i]); - } - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - - event.wait(); - - shape_elem_type _shape = cur_res_size; - dpnp_sum_c<_DataType_output, _DataType_output>(result, cur_res, &_shape, - 1, NULL, 0, NULL, NULL); - - sycl::free(cur_res, q); - - result[0] += array1[0] * (array2[1] - array2[0]) + - array1[array1_size - 1] * - (array2[array2_size - 1] - array2[array2_size - 2]); - - result[0] *= 0.5; - } - else { - shape_elem_type _shape = array1_size; - dpnp_sum_c<_DataType_output, _DataType_input1>(result, array1, &_shape, - 1, NULL, 0, NULL, NULL); - - result[0] -= (array1[0] + array1[array1_size - 1]) * 0.5; - result[0] *= dx; - } - return event_ref; -} - -template -void dpnp_trapz_c(const void *array1_in, - const void *array2_in, - void *result1, - double dx, - size_t array1_size, - size_t array2_size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_trapz_c<_DataType_input1, _DataType_input2, _DataType_output>( - q_ref, array1_in, array2_in, result1, dx, array1_size, array2_size, - dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template 
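The `dpnp_trapz_c` kernel removed above implements the trapezoidal rule in two forms: with sample positions it sums interior terms `y[i+1] * (x[i+2] - x[i])`, adds the two endpoint terms, and halves the total; with uniform spacing it uses `dx * (sum(y) - (y[0] + y[-1]) / 2)`. A hedged Python sketch of both branches (hypothetical helper name):

```python
def trapz_ref(y, x=None, dx=1.0):
    """Reference for the removed dpnp_trapz_c kernel (trapezoidal rule).
    With sample positions x it matches the interior-sum formulation of
    the kernel; with uniform spacing dx it subtracts half the endpoints."""
    n = len(y)
    if n < 2:
        return 0.0  # kernel writes result[0] = 0 for short inputs
    if x is not None:
        # interior terms: cur_res[i] = y[i+1] * (x[i+2] - x[i])
        s = sum(y[i + 1] * (x[i + 2] - x[i]) for i in range(n - 2))
        # endpoint corrections, then scale by 0.5
        s += y[0] * (x[1] - x[0]) + y[-1] * (x[-1] - x[-2])
        return 0.5 * s
    return dx * (sum(y) - 0.5 * (y[0] + y[-1]))
```

Both branches reduce to the usual `sum(0.5 * (y[i] + y[i+1]) * (x[i+1] - x[i]))` after regrouping terms.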
-void (*dpnp_trapz_default_c)(const void *, - const void *, - void *, - double, - size_t, - size_t) = - dpnp_trapz_c<_DataType_input1, _DataType_input2, _DataType_output>; - -template -DPCTLSyclEventRef (*dpnp_trapz_ext_c)(DPCTLSyclQueueRef, - const void *, - const void *, - void *, - double, - size_t, - size_t, - const DPCTLEventVectorRef) = - dpnp_trapz_c<_DataType_input1, _DataType_input2, _DataType_output>; - void func_map_init_mathematical(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_ABSOLUTE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_elemwise_absolute_default_c}; - fmap[DPNPFuncName::DPNP_FN_ABSOLUTE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_elemwise_absolute_default_c}; - fmap[DPNPFuncName::DPNP_FN_ABSOLUTE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_elemwise_absolute_default_c}; - fmap[DPNPFuncName::DPNP_FN_ABSOLUTE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_elemwise_absolute_default_c}; - - fmap[DPNPFuncName::DPNP_FN_AROUND][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_around_default_c}; - fmap[DPNPFuncName::DPNP_FN_AROUND][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_around_default_c}; - fmap[DPNPFuncName::DPNP_FN_AROUND][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_around_default_c}; - fmap[DPNPFuncName::DPNP_FN_AROUND][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_around_default_c}; - - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_LNG][eft_FLT] = { 
- eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_cross_default_c}; - fmap[DPNPFuncName::DPNP_FN_CROSS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cross_default_c}; - - fmap[DPNPFuncName::DPNP_FN_CUMPROD][eft_INT][eft_INT] = { - eft_LNG, (void *)dpnp_cumprod_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMPROD][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_cumprod_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMPROD][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cumprod_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMPROD][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cumprod_default_c}; - - fmap[DPNPFuncName::DPNP_FN_CUMSUM][eft_INT][eft_INT] = { - eft_LNG, (void *)dpnp_cumsum_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMSUM][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_cumsum_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMSUM][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_cumsum_default_c}; - fmap[DPNPFuncName::DPNP_FN_CUMSUM][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_cumsum_default_c}; fmap[DPNPFuncName::DPNP_FN_EDIFF1D][eft_INT][eft_INT] = { eft_LNG, (void *)dpnp_ediff1d_default_c}; @@ -1172,43 +274,6 @@ void func_map_init_mathematical(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_EDIFF1D_EXT][eft_DBL][eft_DBL] 
= { eft_DBL, (void *)dpnp_ediff1d_ext_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_INT][eft_INT] = { - eft_INT, - (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_INT][eft_LNG] = { - eft_LNG, - (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_LNG][eft_INT] = { - eft_LNG, - (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_LNG][eft_LNG] = { - eft_LNG, - (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_FLOOR_DIVIDE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_floor_divide_default_c}; - fmap[DPNPFuncName::DPNP_FN_MODF][eft_INT][eft_INT] = { eft_DBL, (void *)dpnp_modf_default_c}; 
fmap[DPNPFuncName::DPNP_FN_MODF][eft_LNG][eft_LNG] = { @@ -1227,104 +292,5 @@ void func_map_init_mathematical(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_MODF_EXT][eft_DBL][eft_DBL] = { eft_DBL, (void *)dpnp_modf_ext_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - fmap[DPNPFuncName::DPNP_FN_REMAINDER][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_remainder_default_c}; - - 
fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_INT][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_INT][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_INT][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; 
- fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_LNG][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_LNG][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_LNG][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_FLT][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_FLT][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_DBL][eft_FLT] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - fmap[DPNPFuncName::DPNP_FN_TRAPZ_EXT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_trapz_ext_c}; - return; } diff --git a/dpnp/backend/src/dpnp_fptr.hpp b/dpnp/backend/src/dpnp_fptr.hpp index 20fc5305e9a..2a9c42eb172 100644 --- a/dpnp/backend/src/dpnp_fptr.hpp +++ b/dpnp/backend/src/dpnp_fptr.hpp @@ -326,7 +326,6 @@ static constexpr DPNPFuncType get_floating_res_type() * FPTR interface initialization functions */ void func_map_init_arraycreation(func_map_t &fmap); -void func_map_init_bitwise(func_map_t &fmap); void func_map_init_elemwise(func_map_t &fmap); void func_map_init_fft_func(func_map_t &fmap); void func_map_init_indexing_func(func_map_t &fmap); diff --git a/dpnp/backend/src/dpnp_iface_fptr.cpp b/dpnp/backend/src/dpnp_iface_fptr.cpp index 460896bfa2d..f8214212728 100644 --- a/dpnp/backend/src/dpnp_iface_fptr.cpp +++ b/dpnp/backend/src/dpnp_iface_fptr.cpp @@ -167,7 +167,6 @@ 
static func_map_t func_map_init() func_map_t fmap; func_map_init_arraycreation(fmap); - func_map_init_bitwise(fmap); func_map_init_elemwise(fmap); func_map_init_fft_func(fmap); func_map_init_indexing_func(fmap); diff --git a/dpnp/dpnp_algo/CMakeLists.txt b/dpnp/dpnp_algo/CMakeLists.txt index 2c3a49c6be4..1aea452c5d9 100644 --- a/dpnp/dpnp_algo/CMakeLists.txt +++ b/dpnp/dpnp_algo/CMakeLists.txt @@ -3,7 +3,6 @@ set(dpnp_algo_pyx_deps ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_statistics.pxi ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_trigonometric.pxi ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_sorting.pxi - ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_arraycreation.pxi ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_mathematical.pxi ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_indexing.pxi ${CMAKE_CURRENT_SOURCE_DIR}/dpnp_algo_logic.pxi diff --git a/dpnp/dpnp_algo/dpnp_algo.pxd b/dpnp/dpnp_algo/dpnp_algo.pxd index 4e91151697c..37663bee834 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pxd +++ b/dpnp/dpnp_algo/dpnp_algo.pxd @@ -35,7 +35,6 @@ cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncName": # need this na cdef enum DPNPFuncName "DPNPFuncName": DPNP_FN_ALLCLOSE_EXT DPNP_FN_CHOOSE_EXT - DPNP_FN_COPY_EXT DPNP_FN_CORRELATE_EXT DPNP_FN_DEGREES_EXT DPNP_FN_EDIFF1D_EXT @@ -172,11 +171,6 @@ cpdef dpnp_descriptor dpnp_isclose(dpnp_descriptor input1, dpnp_descriptor input double rtol=*, double atol=*, cpp_bool equal_nan=*) -""" -Array creation routines -""" -cpdef dpnp_descriptor dpnp_copy(dpnp_descriptor x1) - """ Mathematical functions """ diff --git a/dpnp/dpnp_algo/dpnp_algo.pyx b/dpnp/dpnp_algo/dpnp_algo.pyx index c8d99c56912..4c560d50e0b 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pyx +++ b/dpnp/dpnp_algo/dpnp_algo.pyx @@ -58,7 +58,6 @@ __all__ = [ ] -include "dpnp_algo_arraycreation.pxi" include "dpnp_algo_indexing.pxi" include "dpnp_algo_logic.pxi" include "dpnp_algo_mathematical.pxi" diff --git a/dpnp/dpnp_algo/dpnp_algo_arraycreation.pxi b/dpnp/dpnp_algo/dpnp_algo_arraycreation.pxi deleted file mode 100644 
index bd86a461848..00000000000 --- a/dpnp/dpnp_algo/dpnp_algo_arraycreation.pxi +++ /dev/null @@ -1,78 +0,0 @@ -# cython: language_level=3 -# cython: linetrace=True -# -*- coding: utf-8 -*- -# ***************************************************************************** -# Copyright (c) 2016-2024, Intel Corporation -# All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions are met: -# - Redistributions of source code must retain the above copyright notice, -# this list of conditions and the following disclaimer. -# - Redistributions in binary form must reproduce the above copyright notice, -# this list of conditions and the following disclaimer in the documentation -# and/or other materials provided with the distribution. -# -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -# THE POSSIBILITY OF SUCH DAMAGE. -# ***************************************************************************** - -"""Module Backend (array creation part) - -This module contains interface functions between C backend layer -and the rest of the library - -""" - -# NO IMPORTs here. 
All imports must be placed into main "dpnp_algo.pyx" file - -__all__ += [ - "dpnp_copy", -] - - -ctypedef c_dpctl.DPCTLSyclEventRef(*custom_1in_1out_func_ptr_t)(c_dpctl.DPCTLSyclQueueRef, - void *, - void * , - const int , - shape_elem_type * , - shape_elem_type * , - const size_t, - const size_t, - const c_dpctl.DPCTLEventVectorRef) -ctypedef c_dpctl.DPCTLSyclEventRef(*ftpr_custom_vander_1in_1out_t)(c_dpctl.DPCTLSyclQueueRef, - void * , void * , size_t, size_t, int, - const c_dpctl.DPCTLEventVectorRef) except + -ctypedef c_dpctl.DPCTLSyclEventRef(*custom_arraycreation_1in_1out_func_ptr_t)(c_dpctl.DPCTLSyclQueueRef, - void *, - const size_t, - const size_t, - const shape_elem_type*, - const shape_elem_type*, - void *, - const size_t, - const size_t, - const shape_elem_type*, - const shape_elem_type*, - const shape_elem_type *, - const size_t, - const c_dpctl.DPCTLEventVectorRef) -ctypedef c_dpctl.DPCTLSyclEventRef(*custom_indexing_1out_func_ptr_t)(c_dpctl.DPCTLSyclQueueRef, - void * , - const size_t , - const size_t , - const int, - const c_dpctl.DPCTLEventVectorRef) except + - - -cpdef utils.dpnp_descriptor dpnp_copy(utils.dpnp_descriptor x1): - return call_fptr_1in_1out_strides(DPNP_FN_COPY_EXT, x1) diff --git a/dpnp/dpnp_algo/dpnp_algo_sorting.pxi b/dpnp/dpnp_algo/dpnp_algo_sorting.pxi index 4947fa9e41d..5da472a246b 100644 --- a/dpnp/dpnp_algo/dpnp_algo_sorting.pxi +++ b/dpnp/dpnp_algo/dpnp_algo_sorting.pxi @@ -58,7 +58,7 @@ cpdef utils.dpnp_descriptor dpnp_partition(utils.dpnp_descriptor arr, int kth, a cdef DPNPFuncData kernel_data = get_dpnp_function_ptr(DPNP_FN_PARTITION_EXT, param1_type, param1_type) - cdef utils.dpnp_descriptor arr2 = dpnp_copy(arr) + cdef utils.dpnp_descriptor arr2 = dpnp.get_dpnp_descriptor(arr.get_pyobj().copy(), copy_when_nondefault_queue=False) arr_obj = arr.get_array() From 805d50218142833206af2d7441a845942c01440a Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Sun, 30 Jun 
2024 12:36:40 +0200 Subject: [PATCH 41/49] Bump github/codeql-action from 3.25.10 to 3.25.11 (#1904) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.25.10 to 3.25.11. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](https://github.com/github/codeql-action/compare/23acc5c183826b7a8a97bce3cecc52db901f8251...b611370bb5703a7efb587f9d136a52ea24c5c38c) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --- .github/workflows/openssf-scorecard.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml index 803f20d284b..9658c7e3b2f 100644 --- a/.github/workflows/openssf-scorecard.yml +++ b/.github/workflows/openssf-scorecard.yml @@ -68,6 +68,6 @@ jobs: # Upload the results to GitHub's code scanning dashboard. 
- name: "Upload to code-scanning" - uses: github/codeql-action/upload-sarif@23acc5c183826b7a8a97bce3cecc52db901f8251 # v3.25.10 + uses: github/codeql-action/upload-sarif@b611370bb5703a7efb587f9d136a52ea24c5c38c # v3.25.11 with: sarif_file: results.sarif From 1a4f8a4b8ff9223425f13b83af5728b7ba56d396 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Tue, 2 Jul 2024 16:50:35 +0200 Subject: [PATCH 42/49] Adopt dpnp interface to asynchronous dpctl execution (Part #1) (#1897) * Update manipulation functions * Update functions from the array creation container * Update dpnp array methods * Implement backward compatible solution * dpnp.meshgrid has to follow CFD and prohibit input arrays allocating on different SYCL queues * updated linspace, logspace and geomspace functions * Updated elementwise functions and astype * Updated counting and histogram functions * Switched back to use dppy/label/dev for coverage GH action * Removed dpnp_container.linspace since unused * Return dpnp ndarray for linspace, logspace and geomspace internal functions --- .github/workflows/generate_coverage.yaml | 2 +- dpnp/dpnp_algo/dpnp_arraycreation.py | 114 ++++++++++---------- dpnp/dpnp_algo/dpnp_elementwise_common.py | 125 +++++++++++++--------- dpnp/dpnp_array.py | 3 + dpnp/dpnp_container.py | 42 ++------ dpnp/dpnp_iface.py | 19 +++- dpnp/dpnp_iface_arraycreation.py | 77 +++++++++---- dpnp/dpnp_iface_counting.py | 7 +- dpnp/dpnp_iface_histograms.py | 68 ++++++++---- dpnp/dpnp_iface_manipulation.py | 64 +++++++---- tests/test_sycl_queue.py | 13 +-- 11 files changed, 318 insertions(+), 216 deletions(-) diff --git a/.github/workflows/generate_coverage.yaml b/.github/workflows/generate_coverage.yaml index 1fa71fb479d..5a0480235a7 100644 --- a/.github/workflows/generate_coverage.yaml +++ b/.github/workflows/generate_coverage.yaml @@ -21,7 +21,7 @@ jobs: env: python-ver: '3.10' - CHANNELS: '-c dppy/label/coverage -c intel -c conda-forge --override-channels' 
+ CHANNELS: '-c dppy/label/dev -c intel -c conda-forge --override-channels' # Install the latest oneAPI compiler to work around an issue INSTALL_ONE_API: 'yes' diff --git a/dpnp/dpnp_algo/dpnp_arraycreation.py b/dpnp/dpnp_algo/dpnp_arraycreation.py index 83cd9da4acf..b493efac993 100644 --- a/dpnp/dpnp_algo/dpnp_arraycreation.py +++ b/dpnp/dpnp_algo/dpnp_arraycreation.py @@ -1,12 +1,13 @@ import math import operator +import dpctl.tensor as dpt import dpctl.utils as dpu import numpy import dpnp -import dpnp.dpnp_container as dpnp_container import dpnp.dpnp_utils as utils +from dpnp.dpnp_array import dpnp_array __all__ = [ "dpnp_geomspace", @@ -16,6 +17,12 @@ ] +def _as_usm_ndarray(a, usm_type, sycl_queue): + if isinstance(a, dpnp_array): + return a.get_array() + return dpt.asarray(a, usm_type=usm_type, sycl_queue=sycl_queue) + + def dpnp_geomspace( start, stop, @@ -40,14 +47,8 @@ def dpnp_geomspace( else: _usm_type = usm_type - if not dpnp.is_supported_array_type(start): - start = dpnp.asarray( - start, usm_type=_usm_type, sycl_queue=sycl_queue_normalized - ) - if not dpnp.is_supported_array_type(stop): - stop = dpnp.asarray( - stop, usm_type=_usm_type, sycl_queue=sycl_queue_normalized - ) + start = _as_usm_ndarray(start, _usm_type, sycl_queue_normalized) + stop = _as_usm_ndarray(stop, _usm_type, sycl_queue_normalized) dt = numpy.result_type(start, stop, float(num)) dt = utils.map_dtype_to_device(dt, sycl_queue_normalized.sycl_device) @@ -57,8 +58,8 @@ def dpnp_geomspace( if dpnp.any(start == 0) or dpnp.any(stop == 0): raise ValueError("Geometric sequence cannot include zero") - out_sign = dpnp.ones( - dpnp.broadcast_arrays(start, stop)[0].shape, + out_sign = dpt.ones( + dpt.broadcast_arrays(start, stop)[0].shape, dtype=dt, usm_type=_usm_type, sycl_queue=sycl_queue_normalized, @@ -72,15 +73,15 @@ def dpnp_geomspace( stop[all_imag] = stop[all_imag].imag out_sign[all_imag] = 1j - both_negative = (dpnp.sign(start) == -1) & (dpnp.sign(stop) == -1) + both_negative = 
(dpt.sign(start) == -1) & (dpt.sign(stop) == -1) if dpnp.any(both_negative): - dpnp.negative(start[both_negative], out=start[both_negative]) - dpnp.negative(stop[both_negative], out=stop[both_negative]) - dpnp.negative(out_sign[both_negative], out=out_sign[both_negative]) + dpt.negative(start[both_negative], out=start[both_negative]) + dpt.negative(stop[both_negative], out=stop[both_negative]) + dpt.negative(out_sign[both_negative], out=out_sign[both_negative]) - log_start = dpnp.log10(start) - log_stop = dpnp.log10(stop) - result = dpnp_logspace( + log_start = dpt.log10(start) + log_stop = dpt.log10(stop) + res = dpnp_logspace( log_start, log_stop, num=num, @@ -89,19 +90,20 @@ def dpnp_geomspace( dtype=dtype, usm_type=_usm_type, sycl_queue=sycl_queue_normalized, - ) + ).get_array() if num > 0: - result[0] = start + res[0] = start if num > 1 and endpoint: - result[-1] = stop + res[-1] = stop - result = out_sign * result + res = out_sign * res if axis != 0: - result = dpnp.moveaxis(result, 0, axis) + res = dpt.moveaxis(res, 0, axis) - return result.astype(dtype, copy=False) + res = dpt.astype(res, dtype, copy=False) + return dpnp_array._create_from_usm_ndarray(res) def dpnp_linspace( @@ -129,14 +131,11 @@ def dpnp_linspace( else: _usm_type = usm_type - if not hasattr(start, "dtype") and not dpnp.isscalar(start): - start = dpnp.asarray( - start, usm_type=_usm_type, sycl_queue=sycl_queue_normalized - ) - if not hasattr(stop, "dtype") and not dpnp.isscalar(stop): - stop = dpnp.asarray( - stop, usm_type=_usm_type, sycl_queue=sycl_queue_normalized - ) + if not dpnp.isscalar(start): + start = _as_usm_ndarray(start, _usm_type, sycl_queue_normalized) + + if not dpnp.isscalar(stop): + stop = _as_usm_ndarray(stop, _usm_type, sycl_queue_normalized) dt = numpy.result_type(start, stop, float(num)) dt = utils.map_dtype_to_device(dt, sycl_queue_normalized.sycl_device) @@ -155,7 +154,7 @@ def dpnp_linspace( if dpnp.isscalar(start) and dpnp.isscalar(stop): # Call linspace() function 
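The `dpnp_geomspace` diff above reduces the geometric sequence to a logspace over `log10(start)..log10(stop)` and then pins the exact endpoints back into the result. A simplified scalar sketch of that approach (hypothetical helper name; the sign handling for negative and complex inputs from the diff is omitted, and `num >= 2` is assumed):

```python
import math

def geomspace_ref(start, stop, num, endpoint=True):
    """Sketch of the dpnp_geomspace strategy in the diff: logspace over
    the log10 of the endpoints, then pin res[0] and res[-1] exactly.
    Assumes positive scalar start/stop and num >= 2."""
    div = (num - 1) if endpoint else num
    step = (math.log10(stop) - math.log10(start)) / div
    res = [10.0 ** (math.log10(start) + i * step) for i in range(num)]
    res[0] = start              # res[0] = start, as in the diff
    if num > 1 and endpoint:
        res[-1] = stop          # res[-1] = stop
    return res
```

Pinning the endpoints matters because the round trip through `log10`/`pow` can perturb them by a few ULPs.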
for scalars. - res = dpnp_container.linspace( + usm_res = dpt.linspace( start, stop, num, @@ -167,17 +166,17 @@ def dpnp_linspace( if retstep is True and step_nan is False: step = (stop - start) / step_num else: - _start = dpnp.asarray( + usm_start = dpt.asarray( start, dtype=dt, usm_type=_usm_type, sycl_queue=sycl_queue_normalized, ) - _stop = dpnp.asarray( + usm_stop = dpt.asarray( stop, dtype=dt, usm_type=_usm_type, sycl_queue=sycl_queue_normalized ) - res = dpnp_container.arange( + usm_res = dpt.arange( 0, stop=num, step=1, @@ -187,28 +186,29 @@ def dpnp_linspace( ) if step_nan is False: - step = (_stop - _start) / step_num - res = res.reshape((-1,) + (1,) * step.ndim) - res = res * step + _start + step = (usm_stop - usm_start) / step_num + usm_res = dpt.reshape(usm_res, (-1,) + (1,) * step.ndim, copy=False) + usm_res = usm_res * step + usm_res += usm_start if endpoint and num > 1: - res[-1] = dpnp_container.full(step.shape, _stop) + usm_res[-1] = dpt.full(step.shape, usm_stop) if axis != 0: - res = dpnp.moveaxis(res, 0, axis) + usm_res = dpt.moveaxis(usm_res, 0, axis) if numpy.issubdtype(dtype, dpnp.integer): - dpnp.floor(res, out=res) + dpt.floor(usm_res, out=usm_res) - res = res.astype(dtype, copy=False) + res = dpt.astype(usm_res, dtype, copy=False) + res = dpnp_array._create_from_usm_ndarray(res) if retstep is True: if dpnp.isscalar(step): - step = dpnp.asarray( + step = dpt.asarray( step, usm_type=res.usm_type, sycl_queue=res.sycl_queue ) - return (res, step) - + return res, dpnp_array._create_from_usm_ndarray(step) return res @@ -239,12 +239,15 @@ def dpnp_logspace( usm_type = "device" if usm_type_alloc is None else usm_type_alloc else: usm_type = usm_type - start = dpnp.asarray(start, usm_type=usm_type, sycl_queue=sycl_queue) - stop = dpnp.asarray(stop, usm_type=usm_type, sycl_queue=sycl_queue) - base = dpnp.asarray(base, usm_type=usm_type, sycl_queue=sycl_queue) - [start, stop, base] = dpnp.broadcast_arrays(start, stop, base) - base = 
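For non-scalar inputs the updated `dpnp_linspace` builds the result as `arange(num) * step + start` and, when `endpoint` is set, overwrites the last element with the exact stop value. A scalar sketch of that arange-based scheme (hypothetical helper name):

```python
def linspace_ref(start, stop, num, endpoint=True, retstep=False):
    """Sketch of the arange-based linspace in the diff:
    res = arange(num) * step + start, with the endpoint pinned exactly."""
    div = (num - 1) if endpoint else num
    step = (stop - start) / div if div > 0 else float("nan")
    res = [start + i * step for i in range(num)] if div > 0 else [start] * num
    if endpoint and num > 1:
        res[-1] = stop  # usm_res[-1] = dpt.full(step.shape, usm_stop)
    return (res, step) if retstep else res
```

As in the diff, `retstep=True` returns the computed step alongside the samples.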
dpnp.expand_dims(base, axis=axis) + start = _as_usm_ndarray(start, usm_type, sycl_queue) + stop = _as_usm_ndarray(stop, usm_type, sycl_queue) + base = _as_usm_ndarray(base, usm_type, sycl_queue) + + [start, stop, base] = dpt.broadcast_arrays(start, stop, base) + base = dpt.expand_dims(base, axis=axis) + + # assume res as not a tuple, because retstep is False res = dpnp_linspace( start, stop, @@ -254,11 +257,12 @@ def dpnp_logspace( sycl_queue=sycl_queue, endpoint=endpoint, axis=axis, - ) + ).get_array() - if dtype is None: - return dpnp.power(base, res) - return dpnp.power(base, res).astype(dtype, copy=False) + dpt.pow(base, res, out=res) + if dtype is not None: + res = dpt.astype(res, dtype, copy=False) + return dpnp_array._create_from_usm_ndarray(res) class dpnp_nd_grid: diff --git a/dpnp/dpnp_algo/dpnp_elementwise_common.py b/dpnp/dpnp_algo/dpnp_elementwise_common.py index 374981a6303..b13ea56bc32 100644 --- a/dpnp/dpnp_algo/dpnp_elementwise_common.py +++ b/dpnp/dpnp_algo/dpnp_elementwise_common.py @@ -24,6 +24,7 @@ # THE POSSIBILITY OF SUCH DAMAGE. # ***************************************************************************** +import dpctl.tensor as dpt import numpy from dpctl.tensor._elementwise_common import ( BinaryElementwiseFunc, @@ -161,24 +162,27 @@ def __call__( f"Requested function={self.name_} only takes `out` or `dtype`" "as an argument, but both were provided." 
) + + if order is None: + order = "K" + elif order in "afkcAFKC": + order = order.upper() else: - if order is None: - order = "K" - elif order in "afkcAFKC": - order = order.upper() - else: - raise ValueError( - "order must be one of 'C', 'F', 'A', or 'K' " - f"(got '{order}')" - ) - if dtype is not None: - x = dpnp.astype(x, dtype=dtype, copy=False) - x_usm = dpnp.get_usm_ndarray(x) - out_usm = None if out is None else dpnp.get_usm_ndarray(out) - res_usm = super().__call__(x_usm, out=out_usm, order=order) - if out is not None and isinstance(out, dpnp_array): - return out - return dpnp_array._create_from_usm_ndarray(res_usm) + raise ValueError( + "order must be one of 'C', 'F', 'A', or 'K' " f"(got '{order}')" + ) + + x_usm = dpnp.get_usm_ndarray(x) + if dtype is not None: + x_usm = dpt.astype(x_usm, dtype, copy=False) + + out_usm = None if out is None else dpnp.get_usm_ndarray(out) + res_usm = super().__call__(x_usm, out=out_usm, order=order) + + dpnp.synchronize_array_data(res_usm) + if out is not None and isinstance(out, dpnp_array): + return out + return dpnp_array._create_from_usm_ndarray(res_usm) class DPNPBinaryFunc(BinaryElementwiseFunc): @@ -311,35 +315,47 @@ def __call__( f"Requested function={self.name_} only takes `out` or `dtype`" "as an argument, but both were provided." 
) + + if order is None: + order = "K" + elif order in "afkcAFKC": + order = order.upper() else: - if order is None: - order = "K" - elif order in "afkcAFKC": - order = order.upper() - else: - raise ValueError( - "order must be one of 'C', 'F', 'A', or 'K' " - f"(got '{order}')" + raise ValueError( + f"order must be one of 'C', 'F', 'A', or 'K' (got '{order}')" + ) + + x1_usm = dpnp.get_usm_ndarray_or_scalar(x1) + x2_usm = dpnp.get_usm_ndarray_or_scalar(x2) + + if dtype is not None: + if dpnp.isscalar(x1): + x1_usm = dpt.asarray( + x1, + dtype=dtype, + sycl_queue=x2.sycl_queue, + usm_type=x2.usm_type, ) - if dtype is not None: - if dpnp.isscalar(x1): - x1 = dpnp.asarray(x1, dtype=dtype) - x2 = dpnp.astype(x2, dtype=dtype, copy=False) - elif dpnp.isscalar(x2): - x1 = dpnp.astype(x1, dtype=dtype, copy=False) - x2 = dpnp.asarray(x2, dtype=dtype) - else: - x1 = dpnp.astype(x1, dtype=dtype, copy=False) - x2 = dpnp.astype(x2, dtype=dtype, copy=False) - - x1_usm = dpnp.get_usm_ndarray_or_scalar(x1) - x2_usm = dpnp.get_usm_ndarray_or_scalar(x2) + x2_usm = dpt.astype(x2_usm, dtype, copy=False) + elif dpnp.isscalar(x2): + x1_usm = dpt.astype(x1_usm, dtype, copy=False) + x2_usm = dpt.asarray( + x2, + dtype=dtype, + sycl_queue=x1.sycl_queue, + usm_type=x1.usm_type, + ) + else: + x1_usm = dpt.astype(x1_usm, dtype, copy=False) + x2_usm = dpt.astype(x2_usm, dtype, copy=False) - out_usm = None if out is None else dpnp.get_usm_ndarray(out) - res_usm = super().__call__(x1_usm, x2_usm, out=out_usm, order=order) - if out is not None and isinstance(out, dpnp_array): - return out - return dpnp_array._create_from_usm_ndarray(res_usm) + out_usm = None if out is None else dpnp.get_usm_ndarray(out) + res_usm = super().__call__(x1_usm, x2_usm, out=out_usm, order=order) + + dpnp.synchronize_array_data(res_usm) + if out is not None and isinstance(out, dpnp_array): + return out + return dpnp_array._create_from_usm_ndarray(res_usm) def outer( self, @@ -463,7 +479,7 @@ def __init__( def 
__call__(self, x, deg=False): res = super().__call__(x) if deg is True: - res = res * (180 / dpnp.pi) + res *= 180 / dpnp.pi return res @@ -513,14 +529,21 @@ def __init__( def __call__(self, x, decimals=0, out=None, dtype=None): if decimals != 0: - if dpnp.issubdtype(x.dtype, dpnp.integer) and dtype is None: - dtype = x.dtype - res = dpnp.true_divide( - dpnp.rint(x * 10**decimals, out=out), 10**decimals, out=out - ) + x_usm = dpnp.get_usm_ndarray(x) + if dpnp.issubdtype(x_usm.dtype, dpnp.integer) and dtype is None: + dtype = x_usm.dtype + + out_usm = None if out is None else dpnp.get_usm_ndarray(out) + x_usm = dpt.round(x_usm * 10**decimals, out=out_usm) + res_usm = dpt.divide(x_usm, 10**decimals, out=out_usm) + if dtype is not None: - res = res.astype(dtype) - return res + res_usm = dpt.astype(res_usm, dtype, copy=False) + + dpnp.synchronize_array_data(res_usm) + if out is not None and isinstance(out, dpnp_array): + return out + return dpnp_array._create_from_usm_ndarray(res_usm) else: return super().__call__(x, out=out, dtype=dtype) diff --git a/dpnp/dpnp_array.py b/dpnp/dpnp_array.py index fd2d06f7428..d9936872a89 100644 --- a/dpnp/dpnp_array.py +++ b/dpnp/dpnp_array.py @@ -258,6 +258,8 @@ def __getitem__(self, key): res = self.__new__(dpnp_array) res._array_obj = item + if self._array_obj.usm_data is not res._array_obj.usm_data: + dpnp.synchronize_array_data(self) return res def __gt__(self, other): @@ -454,6 +456,7 @@ def __setitem__(self, key, val): val = val.get_array() self._array_obj.__setitem__(key, val) + dpnp.synchronize_array_data(self) # '__setstate__', # '__sizeof__', diff --git a/dpnp/dpnp_container.py b/dpnp/dpnp_container.py index 5322df3324b..8f70e015393 100644 --- a/dpnp/dpnp_container.py +++ b/dpnp/dpnp_container.py @@ -47,7 +47,6 @@ "empty", "eye", "full", - "linspace", "ones", "tril", "triu", @@ -81,6 +80,7 @@ def arange( sycl_queue=sycl_queue_normalized, ) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, 
buffer=array_obj) @@ -133,6 +133,7 @@ def asarray( if array_obj is x1_obj and isinstance(x1, dpnp_array): return x1 + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order=order) @@ -142,6 +143,7 @@ def copy(x1, /, *, order="K"): order = "K" array_obj = dpt.copy(dpnp.get_usm_ndarray(x1), order=order) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order="K") @@ -203,6 +205,7 @@ def eye( usm_type=usm_type, sycl_queue=sycl_queue_normalized, ) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order=order) @@ -237,40 +240,10 @@ def full( usm_type=usm_type, sycl_queue=sycl_queue_normalized, ) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order=order) -def linspace( - start, - stop, - /, - num, - *, - dtype=None, - device=None, - usm_type="device", - sycl_queue=None, - endpoint=True, -): - """Validate input parameters before passing them into `dpctl.tensor` module""" - dpu.validate_usm_type(usm_type, allow_none=False) - sycl_queue_normalized = dpnp.get_normalized_queue_device( - sycl_queue=sycl_queue, device=device - ) - - """Creates `dpnp_array` with evenly spaced numbers of specified interval.""" - array_obj = dpt.linspace( - start, - stop, - num, - dtype=dtype, - usm_type=usm_type, - sycl_queue=sycl_queue_normalized, - endpoint=endpoint, - ) - return dpnp_array(array_obj.shape, buffer=array_obj) - - def ones( shape, *, @@ -296,18 +269,21 @@ def ones( usm_type=usm_type, sycl_queue=sycl_queue_normalized, ) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order=order) def tril(x1, /, *, k=0): """Creates `dpnp_array` as lower triangular part of an input array.""" array_obj = dpt.tril(dpnp.get_usm_ndarray(x1), k=k) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order="K") def triu(x1, /, *, k=0): 
"""Creates `dpnp_array` as upper triangular part of an input array.""" array_obj = dpt.triu(dpnp.get_usm_ndarray(x1), k=k) + dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order="K") @@ -336,4 +312,6 @@ def zeros( usm_type=usm_type, sycl_queue=sycl_queue_normalized, ) + # TODO: uncomment once dpctl implements asynchronous call + # dpnp.synchronize_array_data(array_obj) return dpnp_array(array_obj.shape, buffer=array_obj, order=order) diff --git a/dpnp/dpnp_iface.py b/dpnp/dpnp_iface.py index 49e7b41c01c..b3103869e8d 100644 --- a/dpnp/dpnp_iface.py +++ b/dpnp/dpnp_iface.py @@ -42,6 +42,7 @@ import dpctl import dpctl.tensor as dpt +import dpctl.utils as dpu import numpy from dpctl.tensor._device import normalize_queue_device @@ -69,6 +70,7 @@ "get_usm_ndarray_or_scalar", "is_supported_array_or_scalar", "is_supported_array_type", + "synchronize_array_data", ] from dpnp import float64, isscalar @@ -238,10 +240,10 @@ def astype(x1, dtype, order="K", casting="unsafe", copy=True, device=None): x1_obj, dtype, order=order, casting=casting, copy=copy, device=device ) - # return x1 if dpctl returns a zero copy of x1_obj + dpnp.synchronize_array_data(x1) if array_obj is x1_obj and isinstance(x1, dpnp_array): + # return x1 if dpctl returns a zero copy of x1_obj return x1 - return dpnp_array._create_from_usm_ndarray(array_obj) @@ -699,3 +701,16 @@ def is_supported_array_type(a): """ return isinstance(a, (dpnp_array, dpt.usm_ndarray)) + + +def synchronize_array_data(a): + """ + The dpctl interface was reworked to make asynchronous execution. + That function makes a synchronization call to ensure array data is valid + before exit from dpnp interface function. 
+ + """ + + if hasattr(dpu, "SequentialOrderManager"): + check_supported_arrays_type(a) + dpu.SequentialOrderManager[a.sycl_queue].wait() diff --git a/dpnp/dpnp_iface_arraycreation.py b/dpnp/dpnp_iface_arraycreation.py index 5cf63ea0fca..6698f3f782e 100644 --- a/dpnp/dpnp_iface_arraycreation.py +++ b/dpnp/dpnp_iface_arraycreation.py @@ -40,6 +40,7 @@ import operator +import dpctl.tensor as dpt import numpy import dpnp @@ -51,6 +52,10 @@ dpnp_logspace, dpnp_nd_grid, ) +from .dpnp_array import dpnp_array + +# pylint: disable=no-name-in-module +from .dpnp_utils import get_usm_allocations, map_dtype_to_device __all__ = [ "arange", @@ -2183,7 +2188,7 @@ def geomspace( """ - return dpnp_geomspace( + res = dpnp_geomspace( start, stop, num, @@ -2195,6 +2200,9 @@ def geomspace( axis=axis, ) + dpnp.synchronize_array_data(res) + return res + def identity( n, @@ -2402,7 +2410,7 @@ def linspace( """ - return dpnp_linspace( + res = dpnp_linspace( start, stop, num, @@ -2415,6 +2423,12 @@ def linspace( axis=axis, ) + if isinstance(res, tuple): # (result, step) is returning + dpnp.synchronize_array_data(res[0]) + else: + dpnp.synchronize_array_data(res) + return res + def loadtxt( fname, @@ -2629,7 +2643,7 @@ def logspace( """ - return dpnp_logspace( + res = dpnp_logspace( start, stop, num=num, @@ -2642,6 +2656,9 @@ def logspace( axis=axis, ) + dpnp.synchronize_array_data(res) + return res + # pylint: disable=redefined-outer-name def meshgrid(*xi, copy=True, sparse=False, indexing="xy"): @@ -2720,21 +2737,30 @@ def meshgrid(*xi, copy=True, sparse=False, indexing="xy"): "Unrecognized indexing keyword value, expecting 'xy' or 'ij'." 
) + if ndim < 1: + return [] + s0 = (1,) * ndim output = [ - dpnp.reshape(x, s0[:i] + (-1,) + s0[i + 1 :]) for i, x in enumerate(xi) + dpt.reshape(dpnp.get_usm_ndarray(x), s0[:i] + (-1,) + s0[i + 1 :]) + for i, x in enumerate(xi) ] + # input arrays must be allocated on the same queue + _, _ = get_usm_allocations(output) + if indexing == "xy" and ndim > 1: - output[0] = output[0].reshape((1, -1) + s0[2:]) - output[1] = output[1].reshape((-1, 1) + s0[2:]) + output[0] = dpt.reshape(output[0], (1, -1) + s0[2:]) + output[1] = dpt.reshape(output[1], (-1, 1) + s0[2:]) if not sparse: - output = dpnp.broadcast_arrays(*output) + output = dpt.broadcast_arrays(*output) if copy: - output = [x.copy() for x in output] + output = [dpt.copy(x) for x in output] + dpnp.synchronize_array_data(output[0]) + output = [dpnp_array._create_from_usm_ndarray(x) for x in output] return output @@ -3261,7 +3287,10 @@ def tri( _dtype = dpnp.default_float_type() if dtype in (dpnp.float, None) else dtype - m = dpnp.ones( + if usm_type is None: + usm_type = "device" + + m = dpt.ones( (N, M), dtype=_dtype, device=device, @@ -3469,28 +3498,34 @@ def vander( [125, 25, 5, 1]]), Device(level_zero:gpu:0), 'host') """ - x = dpnp.asarray(x, device=device, usm_type=usm_type, sycl_queue=sycl_queue) + if dpnp.is_supported_array_type(x): + x = dpnp.get_usm_ndarray(x) + usm_x = dpt.asarray( + x, device=device, usm_type=usm_type, sycl_queue=sycl_queue + ) + + x_sycl_queue = usm_x.sycl_queue + x_usm_type = usm_x.usm_type if N is not None and not isinstance(N, int): raise TypeError(f"An integer is required, but got {type(N)}") - if x.ndim != 1: + if usm_x.ndim != 1: raise ValueError("`x` must be a one-dimensional array or sequence.") if N is None: - N = x.size + N = usm_x.size + + _dtype = numpy.promote_types(usm_x.dtype, int) + _dtype = map_dtype_to_device(_dtype, x_sycl_queue.sycl_device) + m = dpnp.empty_like(usm_x, shape=(usm_x.size, N), dtype=_dtype) - _dtype = int if x.dtype == bool else x.dtype - m = empty( 
- (x.size, N), - dtype=_dtype, - usm_type=x.usm_type, - sycl_queue=x.sycl_queue, - ) tmp = m[:, ::-1] if not increasing else m dpnp.power( - x.reshape(-1, 1), - dpnp.arange(N, dtype=_dtype, sycl_queue=x.sycl_queue), + dpt.reshape(usm_x, (-1, 1)), + dpt.arange( + N, dtype=_dtype, usm_type=x_usm_type, sycl_queue=x_sycl_queue + ), out=tmp, ) return m diff --git a/dpnp/dpnp_iface_counting.py b/dpnp/dpnp_iface_counting.py index 8a90601ce8f..515cad08a06 100644 --- a/dpnp/dpnp_iface_counting.py +++ b/dpnp/dpnp_iface_counting.py @@ -37,6 +37,8 @@ """ +import dpctl.tensor as dpt + import dpnp __all__ = ["count_nonzero"] @@ -87,5 +89,6 @@ def count_nonzero(a, axis=None, *, keepdims=False): # TODO: might be improved by implementing an extension # with `count_nonzero` kernel - a = dpnp.astype(a, dpnp.bool, copy=False) - return a.sum(axis=axis, dtype=dpnp.intp, keepdims=keepdims) + usm_a = dpnp.get_usm_ndarray(a) + usm_a = dpt.astype(usm_a, dpnp.bool, copy=False) + return dpnp.sum(usm_a, axis=axis, dtype=dpnp.intp, keepdims=keepdims) diff --git a/dpnp/dpnp_iface_histograms.py b/dpnp/dpnp_iface_histograms.py index 1a1b4daf740..24c8b6aaf78 100644 --- a/dpnp/dpnp_iface_histograms.py +++ b/dpnp/dpnp_iface_histograms.py @@ -40,11 +40,17 @@ import operator import warnings +import dpctl.tensor as dpt import dpctl.utils as dpu import numpy import dpnp +from .dpnp_algo.dpnp_arraycreation import ( + dpnp_linspace, +) +from .dpnp_array import dpnp_array + __all__ = [ "digitize", "histogram", @@ -60,7 +66,7 @@ def _ravel_check_a_and_weights(a, weights): """Check input `a` and `weights` arrays, and ravel both.""" # ensure that `a` array has supported type - dpnp.check_supported_arrays_type(a) + a = dpnp.get_usm_ndarray(a) usm_type = a.usm_type # ensure that the array is a "subtractable" dtype @@ -71,11 +77,11 @@ def _ravel_check_a_and_weights(a, weights): RuntimeWarning, stacklevel=3, ) - a = a.astype(numpy.uint8) + a = dpt.astype(a, numpy.uint8) if weights is not None: # check that 
`weights` array has supported type - dpnp.check_supported_arrays_type(weights) + weights = dpnp.get_usm_ndarray(weights) usm_type = dpu.get_coerced_usm_type([usm_type, weights.usm_type]) # check that arrays have the same allocation queue @@ -86,8 +92,9 @@ def _ravel_check_a_and_weights(a, weights): if weights.shape != a.shape: raise ValueError("weights should have the same shape as a.") - weights = weights.ravel() - a = a.ravel() + weights = dpt.reshape(weights, -1) + + a = dpt.reshape(a, -1) return a, weights, usm_type @@ -113,7 +120,7 @@ def _get_outer_edges(a, range): first_edge, last_edge = 0, 1 else: - first_edge, last_edge = a.min(), a.max() + first_edge, last_edge = dpt.min(a), dpt.max(a) if not (dpnp.isfinite(first_edge) and dpnp.isfinite(last_edge)): raise ValueError( f"autodetected range of [{first_edge}, {last_edge}] " @@ -157,9 +164,9 @@ def _get_bin_edges(a, bins, range, usm_type): "a and bins must be allocated on the same SYCL queue" ) - bin_edges = bins + bin_edges = dpnp.get_usm_ndarray(bins) else: - bin_edges = dpnp.asarray( + bin_edges = dpt.asarray( bins, sycl_queue=sycl_queue, usm_type=usm_type ) @@ -183,7 +190,7 @@ def _get_bin_edges(a, bins, range, usm_type): ) # bin edges must be computed - bin_edges = dpnp.linspace( + bin_edges = dpnp_linspace( first_edge, last_edge, n_equal_bins + 1, @@ -191,7 +198,7 @@ def _get_bin_edges(a, bins, range, usm_type): dtype=bin_type, sycl_queue=sycl_queue, usm_type=usm_type, - ) + ).get_array() return bin_edges, (first_edge, last_edge, n_equal_bins) return bin_edges, None @@ -204,8 +211,11 @@ def _search_sorted_inclusive(a, v): """ - return dpnp.concatenate( - (a.searchsorted(v[:-1], "left"), a.searchsorted(v[-1:], "right")) + return dpt.concat( + ( + dpt.searchsorted(a, v[:-1], side="left"), + dpt.searchsorted(a, v[-1:], side="right"), + ) ) @@ -297,8 +307,14 @@ def digitize(x, bins, right=False): # Use dpnp.searchsorted directly if bins are increasing return dpnp.searchsorted(bins, x, side=side) + usm_x = 
dpnp.get_usm_ndarray(x) + usm_bins = dpnp.get_usm_ndarray(bins) + # Reverse bins and adjust indices if bins are decreasing - return bins.size - dpnp.searchsorted(bins[::-1], x, side=side) + usm_res = usm_bins.size - dpt.searchsorted(usm_bins[::-1], usm_x, side=side) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def histogram(a, bins=10, range=None, density=None, weights=None): @@ -412,26 +428,36 @@ def histogram(a, bins=10, range=None, density=None, weights=None): else: # Compute via cumulative histogram if weights is None: - sa = dpnp.sort(a) + sa = dpt.sort(a) cum_n = _search_sorted_inclusive(sa, bin_edges) else: - zero = dpnp.zeros( + zero = dpt.zeros( 1, dtype=ntype, sycl_queue=a.sycl_queue, usm_type=usm_type ) - sorting_index = dpnp.argsort(a) + sorting_index = dpt.argsort(a) sa = a[sorting_index] sw = weights[sorting_index] - cw = dpnp.concatenate((zero, sw.cumsum(dtype=ntype))) + cw = dpt.concat((zero, dpt.cumulative_sum(sw, dtype=ntype))) bin_index = _search_sorted_inclusive(sa, bin_edges) cum_n = cw[bin_index] n = dpnp.diff(cum_n) + # convert bin_edges to dpnp.ndarray + bin_edges = dpnp_array._create_from_usm_ndarray(bin_edges) + if density: # pylint: disable=possibly-used-before-assignment - db = dpnp.diff(bin_edges).astype(dpnp.default_float_type()) - return n / db / n.sum(), bin_edges + db = dpnp.diff(bin_edges) + db = dpt.astype(db.get_array(), dpnp.default_float_type()) + + usm_n = n.get_array() + hist = usm_n / db / dpt.sum(usm_n) + dpnp.synchronize_array_data(hist) + return dpnp_array._create_from_usm_ndarray(hist), bin_edges + + dpnp.synchronize_array_data(n) return n, bin_edges @@ -517,4 +543,6 @@ def histogram_bin_edges(a, bins=10, range=None, weights=None): a, weights, usm_type = _ravel_check_a_and_weights(a, weights) bin_edges, _ = _get_bin_edges(a, bins, range, usm_type) - return bin_edges + + dpnp.synchronize_array_data(bin_edges) + return dpnp_array._create_from_usm_ndarray(bin_edges) diff 
--git a/dpnp/dpnp_iface_manipulation.py b/dpnp/dpnp_iface_manipulation.py index bf3c66d7fda..a4b7352d4e6 100644 --- a/dpnp/dpnp_iface_manipulation.py +++ b/dpnp/dpnp_iface_manipulation.py @@ -668,12 +668,15 @@ def concatenate( usm_arrays = [dpnp.get_usm_ndarray(x) for x in arrays] usm_res = dpt.concat(usm_arrays, axis=axis) + res = dpnp_array._create_from_usm_ndarray(usm_res) if dtype is not None: res = res.astype(dtype, casting=casting, copy=False) elif out is not None: dpnp.copyto(out, res, casting=casting) return out + + dpnp.synchronize_array_data(res) return res @@ -907,10 +910,11 @@ def expand_dims(a, axis): """ - usm_array = dpnp.get_usm_ndarray(a) - return dpnp_array._create_from_usm_ndarray( - dpt.expand_dims(usm_array, axis=axis) - ) + usm_a = dpnp.get_usm_ndarray(a) + usm_res = dpt.expand_dims(usm_a, axis=axis) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def flip(m, axis=None): @@ -1298,8 +1302,10 @@ def repeat(a, repeats, axis=None): a = dpnp.ravel(a) usm_arr = dpnp.get_usm_ndarray(a) - usm_arr = dpt.repeat(usm_arr, repeats, axis=axis) - return dpnp_array._create_from_usm_ndarray(usm_arr) + usm_res = dpt.repeat(usm_arr, repeats, axis=axis) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def reshape(a, /, newshape, order="C", copy=None): @@ -1374,9 +1380,11 @@ def reshape(a, /, newshape, order="C", copy=None): elif order not in "cfCF": raise ValueError(f"order must be one of 'C' or 'F' (got {order})") - usm_arr = dpnp.get_usm_ndarray(a) - usm_arr = dpt.reshape(usm_arr, shape=newshape, order=order, copy=copy) - return dpnp_array._create_from_usm_ndarray(usm_arr) + usm_a = dpnp.get_usm_ndarray(a) + usm_res = dpt.reshape(usm_a, shape=newshape, order=order, copy=copy) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def result_type(*arrays_and_dtypes): @@ -1483,10 +1491,12 @@ def roll(x, shift, axis=None): """ if 
axis is None: return roll(x.reshape(-1), shift, 0).reshape(x.shape) - usm_array = dpnp.get_usm_ndarray(x) - return dpnp_array._create_from_usm_ndarray( - dpt.roll(usm_array, shift=shift, axis=axis) - ) + + usm_x = dpnp.get_usm_ndarray(x) + usm_res = dpt.roll(usm_x, shift=shift, axis=axis) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def rollaxis(x, axis, start=0): @@ -1633,10 +1643,11 @@ def squeeze(a, /, axis=None): """ - usm_array = dpnp.get_usm_ndarray(a) - return dpnp_array._create_from_usm_ndarray( - dpt.squeeze(usm_array, axis=axis) - ) + usm_a = dpnp.get_usm_ndarray(a) + usm_res = dpt.squeeze(usm_a, axis=axis) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def stack(arrays, /, *, axis=0, out=None, dtype=None, casting="same_kind"): @@ -1714,12 +1725,15 @@ def stack(arrays, /, *, axis=0, out=None, dtype=None, casting="same_kind"): usm_arrays = [dpnp.get_usm_ndarray(x) for x in arrays] usm_res = dpt.stack(usm_arrays, axis=axis) + res = dpnp_array._create_from_usm_ndarray(usm_res) if dtype is not None: res = res.astype(dtype, casting=casting, copy=False) elif out is not None: dpnp.copyto(out, res, casting=casting) return out + + dpnp.synchronize_array_data(res) return res @@ -1772,10 +1786,11 @@ def swapaxes(a, axis1, axis2): """ - usm_array = dpnp.get_usm_ndarray(a) - return dpnp_array._create_from_usm_ndarray( - dpt.swapaxes(usm_array, axis1=axis1, axis2=axis2) - ) + usm_a = dpnp.get_usm_ndarray(a) + usm_res = dpt.swapaxes(usm_a, axis1=axis1, axis2=axis2) + + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) # pylint: disable=invalid-name @@ -1853,8 +1868,11 @@ def tile(A, reps): """ - usm_array = dpnp.get_usm_ndarray(A) - return dpnp_array._create_from_usm_ndarray(dpt.tile(usm_array, reps)) + usm_a = dpnp.get_usm_ndarray(A) + usm_res = dpt.tile(usm_a, reps) + + dpnp.synchronize_array_data(usm_res) + return 
dpnp_array._create_from_usm_ndarray(usm_res) def transpose(a, axes=None): diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index 378ecaf9b19..f7c70320dbf 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -373,18 +373,13 @@ def test_array_creation_load_txt(device): @pytest.mark.parametrize( - "device_x", - valid_devices, - ids=[device.filter_string for device in valid_devices], -) -@pytest.mark.parametrize( - "device_y", + "device", valid_devices, ids=[device.filter_string for device in valid_devices], ) -def test_meshgrid(device_x, device_y): - x = dpnp.arange(100, device=device_x) - y = dpnp.arange(100, device=device_y) +def test_meshgrid(device): + x = dpnp.arange(100, device=device) + y = dpnp.arange(100, device=device) z = dpnp.meshgrid(x, y) assert_sycl_queue_equal(z[0].sycl_queue, x.sycl_queue) assert_sycl_queue_equal(z[1].sycl_queue, y.sycl_queue) From 2fff1f12fea976836b7a0a66fad4fbf760a578f2 Mon Sep 17 00:00:00 2001 From: Natalia Polina Date: Wed, 3 Jul 2024 12:29:39 -0700 Subject: [PATCH 43/49] Clean up legacy array creation and manipulation implementation from the backend (#1903) * Clean up legacy element-wise implementation from the backend * return legacy copy implementation for partition function * Apply comments * Fix pre-commit * Fix pre-commit * Clean up legacy array creation implementation from the backend * Clean-up MACRO_2ARG_2TYPES_LOGIC_OP. 
Clean-up /backend/include * Removed backend/examples for removed functions * address comments * address comments --------- Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com> Co-authored-by: Anton Volkov --- dpnp/backend/CMakeLists.txt | 1 - dpnp/backend/examples/example11.cpp | 85 -- dpnp/backend/examples/example3.cpp | 79 -- dpnp/backend/examples/example7.cpp | 77 -- dpnp/backend/examples/example_bs.cpp | 282 ----- .../examples/example_experimental_iface.cpp | 63 - .../include/dpnp_gen_2arg_3type_tbl.hpp | 10 - dpnp/backend/include/dpnp_iface.hpp | 392 ------ dpnp/backend/include/dpnp_iface_fptr.hpp | 42 +- .../kernels/dpnp_krnl_arraycreation.cpp | 1128 +---------------- dpnp/backend/kernels/dpnp_krnl_elemwise.cpp | 22 - dpnp/backend/kernels/dpnp_krnl_indexing.cpp | 34 - .../kernels/dpnp_krnl_manipulation.cpp | 235 ---- dpnp/backend/src/dpnp_fptr.hpp | 1 - dpnp/backend/src/dpnp_iface_fptr.cpp | 40 - dpnp/backend/src/queue_sycl.cpp | 30 - dpnp/dpnp_algo/dpnp_algo.pxd | 1 - dpnp/dpnp_algo/dpnp_algo_mathematical.pxi | 42 - dpnp/dpnp_iface_mathematical.py | 31 - 19 files changed, 9 insertions(+), 2586 deletions(-) delete mode 100644 dpnp/backend/examples/example11.cpp delete mode 100644 dpnp/backend/examples/example3.cpp delete mode 100644 dpnp/backend/examples/example7.cpp delete mode 100644 dpnp/backend/examples/example_bs.cpp delete mode 100644 dpnp/backend/examples/example_experimental_iface.cpp delete mode 100644 dpnp/backend/kernels/dpnp_krnl_manipulation.cpp diff --git a/dpnp/backend/CMakeLists.txt b/dpnp/backend/CMakeLists.txt index d96320bf0ac..7ed57fd929d 100644 --- a/dpnp/backend/CMakeLists.txt +++ b/dpnp/backend/CMakeLists.txt @@ -30,7 +30,6 @@ set(DPNP_SRC kernels/dpnp_krnl_fft.cpp kernels/dpnp_krnl_indexing.cpp kernels/dpnp_krnl_logic.cpp - kernels/dpnp_krnl_manipulation.cpp kernels/dpnp_krnl_mathematical.cpp kernels/dpnp_krnl_random.cpp kernels/dpnp_krnl_reduction.cpp diff --git a/dpnp/backend/examples/example11.cpp 
b/dpnp/backend/examples/example11.cpp deleted file mode 100644 index 3a16991bae6..00000000000 --- a/dpnp/backend/examples/example11.cpp +++ /dev/null @@ -1,85 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -/** - * Example 11. - * - * This example shows simple usage of the DPNP C++ Backend library RNG shuffle - * function for one and ndim arrays. 
- * - * Possible compile line: - * g++ -g dpnp/backend/examples/example11.cpp -Idpnp -Idpnp/backend/include - * -Ldpnp -Wl,-rpath='$ORIGIN'/dpnp -ldpnp_backend_c -o example11 - * - */ - -#include - -#include - -template -void print_dpnp_array(T *arr, size_t size) -{ - std::cout << std::endl; - for (size_t i = 0; i < size; ++i) { - std::cout << arr[i] << ", "; - } - std::cout << std::endl; -} - -int main(int, char **) -{ - // Two cases: - // 1) array size = 100, ndim = 1, high_dim_size = 10 (aka ndarray with shape - // (100,) ) 2) array size = 100, ndim = 2, high_dim_size = 20 (e.g. ndarray - // with shape (20, 5) and len(array) = 20 ) - const size_t ndim_cases = 2; - const size_t itemsize = sizeof(double); - const size_t ndim[ndim_cases] = {1, 2}; - const size_t high_dim_size[ndim_cases] = {100, 20}; - const size_t size = 100; - const size_t seed = 1234; - - // DPNPC dpnp_rng_shuffle_c - // DPNPC interface - double *array_1 = - reinterpret_cast(dpnp_memory_alloc_c(size * sizeof(double))); - for (size_t i = 0; i < ndim_cases; i++) { - std::cout << "\nREPRODUCE: DPNPC dpnp_rng_shuffle_c:"; - std::cout << "\nDIMS: " << ndim[i] << std::endl; - // init array 0, 1, 2, 3, 4, 5, 6, .... - dpnp_arange_c(0, 1, array_1, size); - // print before shuffle - std::cout << "\nINPUT array:"; - print_dpnp_array(array_1, size); - dpnp_rng_srand_c(seed); - dpnp_rng_shuffle_c(array_1, itemsize, ndim[i], high_dim_size[i], - size); - // print shuffle result - std::cout << "\nSHUFFLE INPUT array:"; - print_dpnp_array(array_1, size); - } - dpnp_memory_free_c(array_1); -} diff --git a/dpnp/backend/examples/example3.cpp b/dpnp/backend/examples/example3.cpp deleted file mode 100644 index 2d516dc0b8d..00000000000 --- a/dpnp/backend/examples/example3.cpp +++ /dev/null @@ -1,79 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. 
-// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -/** - * Example 3. - * - * This example shows simple usage of the DPNP C++ Backend library - * to calculate cos of input vector elements - * - * Possible compile line: - * . 
/opt/intel/oneapi/setvars.sh - * g++ -g dpnp/backend/examples/example3.cpp -Idpnp -Idpnp/backend/include - * -Ldpnp -Wl,-rpath='$ORIGIN'/dpnp -ldpnp_backend_c -o example3 - * - */ - -#include <iostream> - -#include "dpnp_iface.hpp" - -int main(int, char **) -{ - const size_t size = 256; - - std::cout << "SYCL queue is CPU: " << dpnp_queue_is_cpu_c() << std::endl; - - int *array1 = (int *)dpnp_memory_alloc_c(size * sizeof(int)); - double *result = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - - for (size_t i = 0; i < 10; ++i) { - array1[i] = i; - result[i] = 0; - std::cout << ", " << array1[i]; - } - std::cout << std::endl; - - const long ndim = 1; - shape_elem_type *shape = reinterpret_cast<shape_elem_type *>( - dpnp_memory_alloc_c(ndim * sizeof(shape_elem_type))); - shape[0] = size; - shape_elem_type *strides = reinterpret_cast<shape_elem_type *>( - dpnp_memory_alloc_c(ndim * sizeof(shape_elem_type))); - strides[0] = 1; - - dpnp_cos_c(result, size, ndim, shape, strides, array1, size, - ndim, shape, strides, NULL); - - for (size_t i = 0; i < 10; ++i) { - std::cout << ", " << result[i]; - } - std::cout << std::endl; - - dpnp_memory_free_c(result); - dpnp_memory_free_c(array1); - - return 0; -} diff --git a/dpnp/backend/examples/example7.cpp b/dpnp/backend/examples/example7.cpp deleted file mode 100644 index 49c12c5dd51..00000000000 --- a/dpnp/backend/examples/example7.cpp +++ /dev/null @@ -1,77 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer.
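The deleted `example3.cpp` computes an elementwise cosine of an integer input buffer into a double output buffer via `dpnp_cos_c`. The underlying computation, sketched in Python (the helper name is hypothetical):

```python
import math

def elementwise_cos(values):
    """Elementwise cosine over an input sequence, as example3 computed."""
    return [math.cos(v) for v in values]

result = elementwise_cos(range(10))  # mirrors the 0..9 init loop in example3
```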
-// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -/** - * Example 7. - * - * This example shows simple usage of the DPNP C++ Backend library - * to calculate eigenvalues and eigenvectors of a symmetric matrix - * - * Possible compile line: - * . 
/opt/intel/oneapi/setvars.sh - * g++ -g dpnp/backend/examples/example7.cpp -Idpnp -Idpnp/backend/include - * -Ldpnp -Wl,-rpath='$ORIGIN'/dpnp -ldpnp_backend_c -o example7 - * - */ - -#include - -#include "dpnp_iface.hpp" - -int main(int, char **) -{ - const size_t size = 2; - size_t len = size * size; - - float *array = (float *)dpnp_memory_alloc_c(len * sizeof(float)); - float *result1 = (float *)dpnp_memory_alloc_c(size * sizeof(float)); - float *result2 = (float *)dpnp_memory_alloc_c(len * sizeof(float)); - - /* init input diagonal array like: - 1, 0, 0, - 0, 2, 0, - 0, 0, 3 - */ - for (size_t i = 0; i < len; ++i) { - array[i] = 0; - } - for (size_t i = 0; i < size; ++i) { - array[size * i + i] = i + 1; - } - - dpnp_eig_c(array, result1, result2, size); - - std::cout << "eigen values" << std::endl; - for (size_t i = 0; i < size; ++i) { - std::cout << result1[i] << ", "; - } - std::cout << std::endl; - - dpnp_memory_free_c(result2); - dpnp_memory_free_c(result1); - dpnp_memory_free_c(array); - - return 0; -} diff --git a/dpnp/backend/examples/example_bs.cpp b/dpnp/backend/examples/example_bs.cpp deleted file mode 100644 index c20c6d27e29..00000000000 --- a/dpnp/backend/examples/example_bs.cpp +++ /dev/null @@ -1,282 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. 
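The deleted `example7.cpp` calls `dpnp_eig_c` on the 2x2 diagonal matrix diag(1, 2), whose eigenvalues are just its diagonal entries. For any symmetric 2x2 matrix [[a, b], [b, d]] the eigenvalues have a closed form (roots of x^2 - (a+d)x + (ad - b^2)); the sketch below illustrates the math only, not the backend implementation:

```python
import math

def sym2x2_eigvals(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], ascending."""
    mean = (a + d) / 2.0
    # Distance of each eigenvalue from the mean: sqrt(((a-d)/2)^2 + b^2)
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

small, large = sym2x2_eigvals(1.0, 0.0, 2.0)  # the diag(1, 2) case above
```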
-// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -/** - * Example BS. - * - * This example shows simple usage of the DPNP C++ Backend library - * to calculate black scholes algorithm like in Python version - * - * Possible compile line: - * . 
/opt/intel/oneapi/setvars.sh - * g++ -g dpnp/backend/examples/example_bs.cpp -Idpnp -Idpnp/backend/include - * -Ldpnp -Wl,-rpath='$ORIGIN'/dpnp -ldpnp_backend_c -o example_bs - */ - -#include -#include - -#include "dpnp_iface.hpp" - -void black_scholes(double *price, - double *strike, - double *t, - const double rate, - const double vol, - double *call, - double *put, - const size_t size) -{ - const size_t ndim = 1; - const size_t scalar_size = 1; - - double *mr = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - mr[0] = -rate; - - double *vol_vol_two = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - vol_vol_two[0] = vol * vol * 2; - - double *quarter = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - quarter[0] = 0.25; - - double *one = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - one[0] = 1.; - - double *half = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - half[0] = 0.5; - - double *P = price; - double *S = strike; - double *T = t; - - double *p_div_s = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // p_div_s = P / S - dpnp_divide_c(p_div_s, P, size, &size, ndim, S, - size, &size, ndim, NULL); - double *a = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - dpnp_log_c(p_div_s, a, size); // a = np.log(p_div_s) - dpnp_memory_free_c(p_div_s); - - double *b = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // b = T * mr - dpnp_multiply_c( - b, T, size, &size, ndim, mr, scalar_size, &scalar_size, ndim, NULL); - dpnp_memory_free_c(mr); - double *z = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // z = T * vol_vol_twos - dpnp_multiply_c(z, T, size, &size, ndim, - vol_vol_two, scalar_size, - &scalar_size, ndim, NULL); - dpnp_memory_free_c(vol_vol_two); - - double *c = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // c = quarters * z - dpnp_multiply_c(c, quarter, scalar_size, - &scalar_size, ndim, z, size, &size, - ndim, NULL); - dpnp_memory_free_c(quarter); - - double *sqrt_z = (double 
*)dpnp_memory_alloc_c(size * sizeof(double)); - dpnp_sqrt_c(z, sqrt_z, size); // sqrt_z = np.sqrt(z) - dpnp_memory_free_c(z); - double *y = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // y = ones / np.sqrt(z) - dpnp_divide_c(y, one, scalar_size, &scalar_size, - ndim, sqrt_z, size, &size, ndim, - NULL); - dpnp_memory_free_c(sqrt_z); - dpnp_memory_free_c(one); - - double *a_sub_b = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // a_sub_b = a - b - dpnp_subtract_c(a_sub_b, a, size, &size, ndim, b, - size, &size, ndim, NULL); - dpnp_memory_free_c(a); - double *a_sub_b_add_c = - (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // a_sub_b_add_c = a_sub_b + c - dpnp_add_c(a_sub_b_add_c, a_sub_b, size, &size, - ndim, c, size, &size, ndim, NULL); - double *w1 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // w1 = a_sub_b_add_c * y - dpnp_multiply_c(w1, a_sub_b_add_c, size, &size, - ndim, y, size, &size, ndim, NULL); - dpnp_memory_free_c(a_sub_b_add_c); - - double *a_sub_b_sub_c = - (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // a_sub_b_sub_c = a_sub_b - c - dpnp_subtract_c(a_sub_b_sub_c, a_sub_b, size, &size, - ndim, c, size, &size, ndim, NULL); - dpnp_memory_free_c(a_sub_b); - dpnp_memory_free_c(c); - double *w2 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // w2 = a_sub_b_sub_c * y - dpnp_multiply_c(w2, a_sub_b_sub_c, size, &size, - ndim, y, size, &size, ndim, NULL); - dpnp_memory_free_c(a_sub_b_sub_c); - dpnp_memory_free_c(y); - - double *erf_w1 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - dpnp_erf_c(w1, erf_w1, size); // erf_w1 = np.erf(w1) - dpnp_memory_free_c(w1); - double *halfs_mul_erf_w1 = - (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // halfs_mul_erf_w1 = half * erf_w1 - dpnp_multiply_c(halfs_mul_erf_w1, half, scalar_size, - &scalar_size, ndim, erf_w1, size, - &size, ndim, NULL); - dpnp_memory_free_c(erf_w1); - double *d1 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); 
- // d1 = half + halfs_mul_erf_w1 - dpnp_add_c(d1, half, scalar_size, &scalar_size, - ndim, halfs_mul_erf_w1, size, &size, - ndim, NULL); - dpnp_memory_free_c(halfs_mul_erf_w1); - - double *erf_w2 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - dpnp_erf_c(w2, erf_w2, size); // erf_w2 = np.erf(w2) - dpnp_memory_free_c(w2); - double *halfs_mul_erf_w2 = - (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // halfs_mul_erf_w2 = half * erf_w2 - dpnp_multiply_c(halfs_mul_erf_w2, half, scalar_size, - &scalar_size, ndim, erf_w2, size, - &size, ndim, NULL); - dpnp_memory_free_c(erf_w2); - double *d2 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // d2 = half + halfs_mul_erf_w2 - dpnp_add_c(d2, half, scalar_size, &scalar_size, - ndim, halfs_mul_erf_w2, size, &size, - ndim, NULL); - dpnp_memory_free_c(halfs_mul_erf_w2); - dpnp_memory_free_c(half); - - double *exp_b = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - dpnp_exp_c(b, exp_b, size); // exp_b = np.exp(b) - double *Se = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // Se = exp_b * S - dpnp_multiply_c(Se, exp_b, size, &size, ndim, S, - size, &size, ndim, NULL); - dpnp_memory_free_c(exp_b); - dpnp_memory_free_c(b); - - double *P_mul_d1 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // P_mul_d1 = P * d1 - dpnp_multiply_c(P_mul_d1, P, size, &size, ndim, d1, - size, &size, ndim, NULL); - dpnp_memory_free_c(d1); - double *Se_mul_d2 = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // Se_mul_d2 = Se * d2 - dpnp_multiply_c(Se_mul_d2, Se, size, &size, ndim, - d2, size, &size, ndim, NULL); - dpnp_memory_free_c(d2); - double *r = (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // r = P_mul_d1 - Se_mul_d2 - dpnp_subtract_c(r, P_mul_d1, size, &size, ndim, - Se_mul_d2, size, &size, ndim, NULL); - dpnp_memory_free_c(Se_mul_d2); - dpnp_memory_free_c(P_mul_d1); - - dpnp_copyto_c(call, r, size); // call[:] = r - double *r_sub_P = (double *)dpnp_memory_alloc_c(size * 
sizeof(double)); - // r_sub_P = r - P - dpnp_subtract_c(r_sub_P, r, size, &size, ndim, P, - size, &size, ndim, NULL); - dpnp_memory_free_c(r); - double *r_sub_P_add_Se = - (double *)dpnp_memory_alloc_c(size * sizeof(double)); - // r_sub_P_add_Se = r_sub_P + Se - dpnp_add_c(r_sub_P_add_Se, r_sub_P, size, &size, - ndim, Se, size, &size, ndim, NULL); - dpnp_memory_free_c(r_sub_P); - dpnp_memory_free_c(Se); - dpnp_copyto_c(put, r_sub_P_add_Se, - size); // put[:] = r_sub_P_add_Se - dpnp_memory_free_c(r_sub_P_add_Se); -} - -int main(int, char **) -{ - const size_t SIZE = 256; - - const size_t SEED = 7777777; - const long PL = 10, PH = 50; - const long SL = 10, SH = 50; - const long TL = 1, TH = 2; - const double RISK_FREE = 0.1; - const double VOLATILITY = 0.2; - - std::cout << "SYCL queue is CPU: " << dpnp_queue_is_cpu_c() << std::endl; - - double *price = (double *)dpnp_memory_alloc_c(SIZE * sizeof(double)); - double *strike = (double *)dpnp_memory_alloc_c(SIZE * sizeof(double)); - double *t = (double *)dpnp_memory_alloc_c(SIZE * sizeof(double)); - - dpnp_rng_srand_c(SEED); // np.random.seed(SEED) - dpnp_rng_uniform_c(price, PL, PH, - SIZE); // np.random.uniform(PL, PH, SIZE) - dpnp_rng_uniform_c(strike, SL, SH, - SIZE); // np.random.uniform(SL, SH, SIZE) - dpnp_rng_uniform_c(t, TL, TH, - SIZE); // np.random.uniform(TL, TH, SIZE) - - double *zero = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - zero[0] = 0.; - - double *mone = (double *)dpnp_memory_alloc_c(1 * sizeof(double)); - mone[0] = -1.; - - double *call = (double *)dpnp_memory_alloc_c(SIZE * sizeof(double)); - double *put = (double *)dpnp_memory_alloc_c(SIZE * sizeof(double)); - - dpnp_full_c(zero, call, SIZE); // np.full(SIZE, 0., dtype=DTYPE) - dpnp_full_c(mone, put, SIZE); // np.full(SIZE, -1., dtype=DTYPE) - - dpnp_memory_free_c(mone); - dpnp_memory_free_c(zero); - - black_scholes(price, strike, t, RISK_FREE, VOLATILITY, call, put, SIZE); - - std::cout << "call: "; - for (size_t i = 0; i < 10; ++i) { - 
std::cout << call[i] << ", "; - } - std::cout << "..." << std::endl; - std::cout << "put: "; - for (size_t i = 0; i < 10; ++i) { - std::cout << put[i] << ", "; - } - std::cout << "..." << std::endl; - - dpnp_memory_free_c(put); - dpnp_memory_free_c(call); - - dpnp_memory_free_c(t); - dpnp_memory_free_c(strike); - dpnp_memory_free_c(price); - - return 0; -} diff --git a/dpnp/backend/examples/example_experimental_iface.cpp b/dpnp/backend/examples/example_experimental_iface.cpp deleted file mode 100644 index 4454a34b9a4..00000000000 --- a/dpnp/backend/examples/example_experimental_iface.cpp +++ /dev/null @@ -1,63 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. 
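The deleted `example_bs.cpp` chains dpnp primitives (`dpnp_log_c`, `dpnp_multiply_c`, `dpnp_erf_c`, ...) to evaluate the Black-Scholes call/put formulas, building d1 and d2 as 0.5 + 0.5*erf(w). A direct scalar transcription of that dataflow in Python (a sketch of the arithmetic, not of the backend API):

```python
import math

def black_scholes(price, strike, t, rate, vol):
    """Scalar Black-Scholes call/put, following example_bs.cpp step by step."""
    a = math.log(price / strike)      # a = log(P / S)
    b = t * -rate                     # b = T * mr
    z = t * vol * vol * 2.0           # z = T * vol_vol_two
    c = 0.25 * z                      # c = quarter * z
    y = 1.0 / math.sqrt(z)            # y = one / sqrt(z)
    w1 = (a - b + c) * y
    w2 = (a - b - c) * y
    d1 = 0.5 + 0.5 * math.erf(w1)
    d2 = 0.5 + 0.5 * math.erf(w2)
    se = math.exp(b) * strike         # discounted strike, Se = exp(b) * S
    call = price * d1 - se * d2
    put = call - price + se
    return call, put

call, put = black_scholes(30.0, 25.0, 1.5, 0.1, 0.2)
```

Note that `put = call - price + se` is exactly put-call parity, so the sketch can be checked against call - put = P - S*exp(-rT) without reference values.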
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. -//***************************************************************************** - -/** - * Example of experimental interface. - * - * This example shows how to get a runtime pointer from DPNP C++ Backend library - * - * Possible compile line: - * . /opt/intel/oneapi/setvars.sh - * g++ -g dpnp/backend/examples/example_experimental_iface.cpp -Idpnp - * -Idpnp/backend/include -Ldpnp -Wl,-rpath='$ORIGIN'/dpnp -ldpnp_backend_c -o - * example_experimental_iface - */ - -#include - -#include -// TODO #include - -int main(int, char **) -{ - void *result = get_backend_function_name("dpnp_dot", "float"); - std::cout << "Result Dot() function pointer (by old interface): " << result - << std::endl; - - DPNPFuncData_t dpnp_dot_f = get_dpnp_function_ptr( - DPNPFuncName::DPNP_FN_DOT, DPNPFuncType::DPNP_FT_LONG); - std::cout << "Result Dot() function pointer: " << dpnp_dot_f.ptr - << " with return datatype " << (size_t)dpnp_dot_f.return_type - << std::endl; - - DPNPFuncData_t dpnp_add_f = get_dpnp_function_ptr( - DPNPFuncName::DPNP_FN_ADD, DPNPFuncType::DPNP_FT_FLOAT, - DPNPFuncType::DPNP_FT_INT); - std::cout << "Result Add() function pointer: " << dpnp_add_f.ptr - << " with return datatype " << (size_t)dpnp_add_f.return_type - << std::endl; - - return 0; -} diff --git a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp index dcec3f8192b..e5a2c924653 100644 --- 
a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp +++ b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp @@ -140,14 +140,4 @@ MACRO_2ARG_3TYPES_OP(dpnp_multiply_c, std::complex, std::complex)) -MACRO_2ARG_3TYPES_OP(dpnp_subtract_c, - input1_elem - input2_elem, - x1 - x2, - MACRO_UNPACK_TYPES(bool, std::int32_t, std::int64_t), - oneapi::mkl::vm::sub, - MACRO_UNPACK_TYPES(float, - double, - std::complex, - std::complex)) - #undef MACRO_2ARG_3TYPES_OP diff --git a/dpnp/backend/include/dpnp_iface.hpp b/dpnp/backend/include/dpnp_iface.hpp index 324e7a612b1..0fc5595041c 100644 --- a/dpnp/backend/include/dpnp_iface.hpp +++ b/dpnp/backend/include/dpnp_iface.hpp @@ -176,74 +176,6 @@ template INP_DLLEXPORT void dpnp_any_c(const void *array, void *result, const size_t size); -/** - * @ingroup BACKEND_API - * @brief Array initialization - * - * Input array, step based, initialization procedure. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] start Start of initialization sequence - * @param [in] step Step for initialization sequence - * @param [out] result1 Output array. - * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_arange_c(DPCTLSyclQueueRef q_ref, - size_t start, - size_t step, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_arange_c(size_t start, size_t step, void *result1, size_t size); - -/** - * @ingroup BACKEND_API - * @brief Implementation of full function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input one-element array. - * @param [out] result Output array. - * @param [in] size Number of elements in the output array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_full_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_full_c(void *array_in, void *result, const size_t size); - -/** - * @ingroup BACKEND_API - * @brief Implementation of full_like function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input one-element array. - * @param [out] result Output array. - * @param [in] size Number of elements in the output array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_full_like_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_full_like_c(void *array_in, void *result, size_t size); - /** * @ingroup BACKEND_API * @brief Compute the variance along the specified axis, while ignoring NaNs. @@ -591,56 +523,6 @@ INP_DLLEXPORT void dpnp_prod_c(void *result_out, const void *initial, const long *where); -/** - * @ingroup BACKEND_API - * @brief Range of values (maximum - minimum) along an axis. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result_out Output array. - * @param [in] result_size Size of output array. - * @param [in] result_ndim Number of output array dimensions. - * @param [in] result_shape Shape of output array. - * @param [in] result_strides Strides of output array. - * @param [in] input_in First input array. - * @param [in] input_size Size of first input array. - * @param [in] input_ndim Number of first input array dimensions. - * @param [in] input_shape Shape of first input array. - * @param [in] input_strides Strides of first input array. - * @param [in] axis Axis. - * @param [in] naxis Number of elements in axis. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_ptp_c(DPCTLSyclQueueRef q_ref, - void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input_in, - const size_t input_size, - const size_t input_ndim, - const shape_elem_type *input_shape, - const shape_elem_type *input_strides, - const shape_elem_type *axis, - const size_t naxis, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_ptp_c(void *result_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input_in, - const size_t input_size, - const size_t input_ndim, - const shape_elem_type *input_shape, - const shape_elem_type *input_strides, - const shape_elem_type *axis, - const size_t naxis); - /** * @ingroup BACKEND_API * @brief Replaces specified elements of an array with given values. @@ -715,29 +597,7 @@ INP_DLLEXPORT void dpnp_put_along_axis_c(void *arr_in, /** * @ingroup BACKEND_API - * @brief Return a 2-D array with ones on the diagonal and zeros elsewhere. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result The eigenvalues, each repeated according to - * its multiplicity - * @param [in] k Index of the diagonal - * @param [in] shape Shape of result - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_eye_c(DPCTLSyclQueueRef q_ref, - void *result, - int k, - const shape_elem_type *res_shape, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_eye_c(void *result, int k, const shape_elem_type *res_shape); -/** - * @ingroup BACKEND_API * @brief math library implementation of argsort function * * @param [in] q_ref Reference to SYCL queue. 
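The removed `dpnp_ptp_c` computes the range of values (maximum minus minimum) along an axis, and the removed `dpnp_eye_c` builds a 2-D array with ones on the k-th diagonal and zeros elsewhere. Both semantics reduce to a few lines; a pure-Python sketch of the 1-D/nested-list case (helper names are illustrative):

```python
def ptp(values):
    """Range of values: max - min, as numpy.ptp defines it."""
    return max(values) - min(values)

def eye(n, m, k=0):
    """n x m matrix with ones on the k-th diagonal (j - i == k)."""
    return [[1.0 if j - i == k else 0.0 for j in range(m)] for i in range(n)]
```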
@@ -916,60 +776,6 @@ INP_DLLEXPORT void dpnp_choose_c(void *result1, size_t choices_size, size_t choice_size); -/** - * @ingroup BACKEND_API - * @brief Extract a diagonal or construct a diagonal array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] k Diagonal in question. - * @param [in] shape Shape of input array. - * @param [in] res_shape Shape of result array. - * @param [in] ndim Number of elements in shape of input array. - * @param [in] res_ndim Number of elements in shape of result array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_diag_c(DPCTLSyclQueueRef q_ref, - void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_diag_c(void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim); - -/** - * @ingroup BACKEND_API - * @brief Return the indices to access the main diagonal of an array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result1 Output array. - * @param [in] size Size of array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_diag_indices_c(DPCTLSyclQueueRef q_ref, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_diag_indices_c(void *result1, size_t size); - /** * @ingroup BACKEND_API * @brief math library implementation of diagonal function @@ -1006,26 +812,6 @@ INP_DLLEXPORT void dpnp_diagonal_c(void *array1_in, shape_elem_type *res_shape, const size_t res_ndim); -/** - * @ingroup BACKEND_API - * @brief Implementation of identity function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result1 Output array. - * @param [in] n Number of rows (and columns) in n x n - * output. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_identity_c(DPCTLSyclQueueRef q_ref, - void *result1, - const size_t n, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_identity_c(void *result1, const size_t n); - /** * @ingroup BACKEND_API * @brief implementation of creating filled with value array function @@ -1287,128 +1073,6 @@ INP_DLLEXPORT void dpnp_take_c(void *array, void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief math library implementation of trace function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] shape Shape of input array. - * @param [in] ndim Number of elements in array.shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
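The removed `dpnp_diag_indices_c` returns the index arrays addressing the main diagonal of an n x n array, and the removed `dpnp_identity_c` fills an n x n identity matrix. Sketched in Python for reference (illustrative helpers, not dpnp API):

```python
def diag_indices(n):
    """Index arrays (rows, cols) of the main diagonal of an n x n array."""
    idx = list(range(n))
    return idx, idx[:]

def identity(n):
    """n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
```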
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_trace_c(DPCTLSyclQueueRef q_ref, - const void *array, - void *result, - const shape_elem_type *shape, - const size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_trace_c(const void *array, - void *result, - const shape_elem_type *shape, - const size_t ndim); - -/** - * @ingroup BACKEND_API - * @brief An array with ones at and below the given diagonal and zeros - * elsewhere. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [out] result Output array. - * @param [in] N Number of rows in the array. - * @param [in] M Number of columns in the array. - * @param [in] k The sub-diagonal at and below which the - * array is filled. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_tri_c(DPCTLSyclQueueRef q_ref, - void *result, - const size_t N, - const size_t M, - const int k, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void - dpnp_tri_c(void *result, const size_t N, const size_t M, const int k); - -/** - * @ingroup BACKEND_API - * @brief Lower triangle of an array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] k Diagonal above which to zero elements. - * @param [in] shape Shape of input array. - * @param [in] res_shape Shape of result array. - * @param [in] ndim Number of elements in array.shape. - * @param [in] res_ndim Number of elements in res_shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_tril_c(DPCTLSyclQueueRef q_ref, - void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_tril_c(void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim); - -/** - * @ingroup BACKEND_API - * @brief Upper triangle of an array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [out] result Output array. - * @param [in] k Diagonal above which to zero elements. - * @param [in] shape Shape of input array. - * @param [in] res_shape Shape of result array. - * @param [in] ndim Number of elements in array.shape. - * @param [in] res_ndim Number of elements in res_shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_triu_c(DPCTLSyclQueueRef q_ref, - void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_triu_c(void *array, - void *result, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim); - /** * @ingroup BACKEND_API * @brief math library implementation of var function @@ -1609,62 +1273,6 @@ INP_DLLEXPORT DPCTLSyclEventRef template INP_DLLEXPORT void dpnp_ones_like_c(void *result, size_t size); -/** - * @ingroup BACKEND_API - * @brief repeat elements of an array. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input array. - * @param [out] result Output array. - * @param [in] repeats The number of repetitions for each element. 
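The removed `dpnp_tril_c`/`dpnp_triu_c` pair zero out elements above (respectively below) the k-th diagonal; the masking rule `j - i <= k` vs. `j - i >= k` is the whole algorithm. A pure-Python sketch of both (nested-list stand-ins for the strided backend arrays):

```python
def tril(matrix, k=0):
    """Keep elements on or below the k-th diagonal (j - i <= k)."""
    return [[v if j - i <= k else 0 for j, v in enumerate(row)]
            for i, row in enumerate(matrix)]

def triu(matrix, k=0):
    """Keep elements on or above the k-th diagonal (j - i >= k)."""
    return [[v if j - i >= k else 0 for j, v in enumerate(row)]
            for i, row in enumerate(matrix)]
```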
- * @param [in] size Number of elements in input arrays. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_repeat_c(DPCTLSyclQueueRef q_ref, - const void *array_in, - void *result, - const size_t repeats, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_repeat_c(const void *array_in, - void *result, - const size_t repeats, - const size_t size); - -/** - * @ingroup BACKEND_API - * @brief Implementation of vander function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array_in Input array. - * @param [out] result Output array. - * @param [in] size_in Number of elements in the input array. - * @param [in] N Number of columns in the output. - * @param [in] increasing Order of the powers of the columns. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - * - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_vander_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - void *result1, - const size_t size_in, - const size_t N, - const int increasing, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_vander_c(const void *array1_in, - void *result1, - const size_t size_in, - const size_t N, - const int increasing); - /** * @ingroup BACKEND_API * @brief Implementation of zeros function diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index a39174931fe..d62e5998583 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -64,7 +64,6 @@ enum class DPNPFuncName : size_t DPNP_FN_ALLCLOSE_EXT, /**< Used in numpy.allclose() impl, requires extra parameters */ DPNP_FN_ANY, /**< Used in numpy.any() impl */ - DPNP_FN_ARANGE, /**< Used in numpy.arange() impl */ DPNP_FN_ARGMAX, /**< Used in numpy.argmax() impl */ DPNP_FN_ARGMIN, /**< Used in numpy.argmin() impl */ DPNP_FN_ARGSORT, /**< 
Used in numpy.argsort() impl */ @@ -82,8 +81,6 @@ enum class DPNPFuncName : size_t DPNP_FN_DEGREES, /**< Used in numpy.degrees() impl */ DPNP_FN_DEGREES_EXT, /**< Used in numpy.degrees() impl, requires extra parameters */ - DPNP_FN_DIAG, /**< Used in numpy.diag() impl */ - DPNP_FN_DIAG_INDICES, /**< Used in numpy.diag_indices() impl */ DPNP_FN_DIAGONAL, /**< Used in numpy.diagonal() impl */ DPNP_FN_DOT, /**< Used in numpy.dot() impl */ DPNP_FN_DOT_EXT, /**< Used in numpy.dot() impl, requires extra parameters */ @@ -93,7 +90,6 @@ enum class DPNPFuncName : size_t DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra parameters */ - DPNP_FN_EYE, /**< Used in numpy.eye() impl */ DPNP_FN_FFT_FFT, /**< Used in numpy.fft.fft() impl */ DPNP_FN_FFT_FFT_EXT, /**< Used in numpy.fft.fft() impl, requires extra parameters */ @@ -101,14 +97,10 @@ enum class DPNPFuncName : size_t DPNP_FN_FFT_RFFT_EXT, /**< Used in numpy.fft.rfft() impl, requires extra parameters */ DPNP_FN_FILL_DIAGONAL, /**< Used in numpy.fill_diagonal() impl */ - DPNP_FN_FULL, /**< Used in numpy.full() impl */ - DPNP_FN_FULL_LIKE, /**< Used in numpy.full_like() impl */ - DPNP_FN_IDENTITY, /**< Used in numpy.identity() impl */ DPNP_FN_INITVAL, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ DPNP_FN_INITVAL_EXT, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ - DPNP_FN_INVERT, /**< Used in numpy.invert() impl */ DPNP_FN_MAX, /**< Used in numpy.max() impl */ DPNP_FN_MAXIMUM_EXT, /**< Used in numpy.fmax() impl , requires extra parameters */ @@ -132,13 +124,11 @@ enum class DPNPFuncName : size_t parameters */ DPNP_FN_PLACE, /**< Used in numpy.place() impl */ DPNP_FN_PROD, /**< Used in numpy.prod() impl */ - DPNP_FN_PTP, /**< Used in numpy.ptp() impl */ DPNP_FN_PUT, /**< Used in numpy.put() impl */ DPNP_FN_PUT_ALONG_AXIS, /**< Used in numpy.put_along_axis() impl */ DPNP_FN_RADIANS, /**< Used in numpy.radians() impl */ 
DPNP_FN_RADIANS_EXT, /**< Used in numpy.radians() impl, requires extra parameters */ - DPNP_FN_REPEAT, /**< Used in numpy.repeat() impl */ DPNP_FN_RNG_BETA, /**< Used in numpy.random.beta() impl */ DPNP_FN_RNG_BETA_EXT, /**< Used in numpy.random.beta() impl, requires extra parameters */ @@ -262,22 +252,12 @@ enum class DPNPFuncName : size_t DPNP_FN_SQRT_EXT, /**< Used in numpy.sqrt() impl, requires extra parameters */ DPNP_FN_STD, /**< Used in numpy.std() impl */ - DPNP_FN_SUBTRACT_EXT, /**< Used in numpy.subtract() impl, requires extra - parameters */ - DPNP_FN_SUM, /**< Used in numpy.sum() impl */ - DPNP_FN_TAKE, /**< Used in numpy.take() impl */ - DPNP_FN_TRANSPOSE, /**< Used in numpy.transpose() impl */ - DPNP_FN_TRACE, /**< Used in numpy.trace() impl */ - DPNP_FN_TRAPZ_EXT, /**< Used in numpy.trapz() impl, requires extra - parameters */ - DPNP_FN_TRI, /**< Used in numpy.tri() impl */ - DPNP_FN_TRIL, /**< Used in numpy.tril() impl */ - DPNP_FN_TRIU, /**< Used in numpy.triu() impl */ - DPNP_FN_VANDER, /**< Used in numpy.vander() impl */ - DPNP_FN_VAR, /**< Used in numpy.var() impl */ - DPNP_FN_ZEROS, /**< Used in numpy.zeros() impl */ - DPNP_FN_ZEROS_LIKE, /**< Used in numpy.zeros_like() impl */ - DPNP_FN_LAST, /**< The latest element of the enumeration */ + DPNP_FN_SUM, /**< Used in numpy.sum() impl */ + DPNP_FN_TAKE, /**< Used in numpy.take() impl */ + DPNP_FN_VAR, /**< Used in numpy.var() impl */ + DPNP_FN_ZEROS, /**< Used in numpy.zeros() impl */ + DPNP_FN_ZEROS_LIKE, /**< Used in numpy.zeros_like() impl */ + DPNP_FN_LAST, /**< The latest element of the enumeration */ }; /** @@ -381,14 +361,4 @@ void *get_dpnp_function_ptr1( DPNPFuncType first_type, DPNPFuncType second_type = DPNPFuncType::DPNP_FT_NONE); -/** - * DEPRECATED. - * Experimental interface. DO NOT USE IT! 
- * - * parameter @ref type_name will be converted into var_args or char *[] with - * extra length parameter - */ -INP_DLLEXPORT -void *get_backend_function_name(const char *func_name, const char *type_name); - #endif // BACKEND_IFACE_FPTR_H diff --git a/dpnp/backend/kernels/dpnp_krnl_arraycreation.cpp b/dpnp/backend/kernels/dpnp_krnl_arraycreation.cpp index 175eb3d7698..ebcffa944c0 100644 --- a/dpnp/backend/kernels/dpnp_krnl_arraycreation.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_arraycreation.cpp @@ -31,355 +31,6 @@ #include "dpnpc_memory_adapter.hpp" #include "queue_sycl.hpp" -template -class dpnp_arange_c_kernel; - -template -DPCTLSyclEventRef dpnp_arange_c(DPCTLSyclQueueRef q_ref, - size_t start, - size_t step, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // parameter `size` used instead `stop` to avoid dependency on array length - // calculation algorithm - // TODO: floating point (and negatives) types from `start` and `step` - - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - validate_type_for_device<_DataType>(q); - - _DataType *result = reinterpret_cast<_DataType *>(result1); - - sycl::range<1> gws(size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - size_t i = global_id[0]; - - result[i] = start + i * step; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_arange_c(size_t start, size_t step, void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_arange_c<_DataType>( - q_ref, start, step, result1, size, 
dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_arange_default_c)(size_t, size_t, void *, size_t) = - dpnp_arange_c<_DataType>; - -// Explicit instantiation of the function, since dpnp_arange_c() is used by -// other template functions, but implicit instantiation is not applied anymore. -template DPCTLSyclEventRef dpnp_arange_c(DPCTLSyclQueueRef, - size_t, - size_t, - void *, - size_t, - const DPCTLEventVectorRef); - -template DPCTLSyclEventRef dpnp_arange_c(DPCTLSyclQueueRef, - size_t, - size_t, - void *, - size_t, - const DPCTLEventVectorRef); - -template DPCTLSyclEventRef dpnp_arange_c(DPCTLSyclQueueRef, - size_t, - size_t, - void *, - size_t, - const DPCTLEventVectorRef); - -template DPCTLSyclEventRef dpnp_arange_c(DPCTLSyclQueueRef, - size_t, - size_t, - void *, - size_t, - const DPCTLEventVectorRef); - -template -DPCTLSyclEventRef dpnp_diag_c(DPCTLSyclQueueRef q_ref, - void *v_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)res_ndim; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - const size_t input1_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - const size_t result_size = std::accumulate( - res_shape, res_shape + res_ndim, 1, std::multiplies()); - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, v_in, input1_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, result_size, true, - true); - _DataType *v = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - size_t init0 = std::max(0, -k); - size_t init1 = std::max(0, k); - - if (ndim == 1) { - for (size_t i = 0; i < static_cast(shape[0]); ++i) { - size_t ind = (init0 + i) * res_shape[1] + 
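The removed `dpnp_arange_c` kernel above fills each element with `start + i * step`. A minimal host-side sketch of that fill, with no SYCL dependency (the helper name `arange_fill` is hypothetical, not part of dpnp):

```cpp
#include <cstddef>
#include <vector>

// Host-side sketch of the removed dpnp_arange_c fill: element i is
// start + i * step, using the same integer parameters as the kernel.
std::vector<size_t> arange_fill(size_t start, size_t step, size_t size)
{
    std::vector<size_t> result(size);
    for (size_t i = 0; i < size; ++i) {
        result[i] = start + i * step; // same expression as the SYCL kernel body
    }
    return result;
}
```

As the kernel's own TODO noted, this integer-only formulation does not cover floating-point or negative `start`/`step` values.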
init1 + i; - result[ind] = v[i]; - } - } - else { - for (size_t i = 0; i < static_cast(res_shape[0]); ++i) { - size_t ind = (init0 + i) * shape[1] + init1 + i; - result[i] = v[ind]; - } - } - return event_ref; -} - -template -void dpnp_diag_c(void *v_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_diag_c<_DataType>(q_ref, v_in, result1, k, shape, res_shape, ndim, - res_ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_diag_default_c)(void *, - void *, - const int, - shape_elem_type *, - shape_elem_type *, - const size_t, - const size_t) = dpnp_diag_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_eye_c(DPCTLSyclQueueRef q_ref, - void *result1, - int k, - const shape_elem_type *res_shape, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (result1 == nullptr) { - return event_ref; - } - - if (res_shape == nullptr) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - size_t result_size = res_shape[0] * res_shape[1]; - - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, result_size, true, - true); - _DataType *result = result_ptr.get_ptr(); - - int diag_val_; - diag_val_ = std::min((int)res_shape[0], (int)res_shape[1]); - diag_val_ = std::min(diag_val_, ((int)res_shape[0] + k)); - diag_val_ = std::min(diag_val_, ((int)res_shape[1] - k)); - - size_t diag_val = (diag_val_ < 0) ? 0 : (size_t)diag_val_; - - for (size_t i = 0; i < result_size; ++i) { - result[i] = 0; - for (size_t j = 0; j < diag_val; ++j) { - size_t ind = (k >= 0) ? 
(j * res_shape[1] + j + k) - : (j - k) * res_shape[1] + j; - if (i == ind) { - result[i] = 1; - break; - } - } - } - - return event_ref; -} - -template -void dpnp_eye_c(void *result1, int k, const shape_elem_type *res_shape) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_eye_c<_DataType>(q_ref, result1, k, res_shape, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_eye_default_c)(void *, - int, - const shape_elem_type *) = dpnp_eye_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_full_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - return dpnp_initval_c<_DataType>(q_ref, result, array_in, size, - dep_event_vec_ref); -} - -template -void dpnp_full_c(void *array_in, void *result, const size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_full_c<_DataType>( - q_ref, array_in, result, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_full_default_c)(void *, - void *, - const size_t) = dpnp_full_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_full_like_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - return dpnp_full_c<_DataType>(q_ref, array_in, result, size, - dep_event_vec_ref); -} - -template -void dpnp_full_like_c(void *array_in, void *result, const size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_full_like_c<_DataType>( - q_ref, array_in, result, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - 
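The removed `dpnp_eye_c` above clamps the diagonal length three times before writing ones. That clamp can be isolated as a small pure function (the name `eye_diag_len` is a hypothetical helper introduced here for illustration):

```cpp
#include <algorithm>

// Sketch of the clamped diagonal length computed by the removed dpnp_eye_c:
// the number of ones placed on diagonal k of a (rows x cols) matrix.
int eye_diag_len(int rows, int cols, int k)
{
    int d = std::min(rows, cols);
    d = std::min(d, rows + k); // diagonals below the main one are shortened by |k|
    d = std::min(d, cols - k); // diagonals above the main one likewise
    return std::max(d, 0);     // fully out-of-range diagonals hold no ones
}
```

For a 3x3 matrix this yields 3 ones on the main diagonal, 2 on the first super- or sub-diagonal, and 0 once `|k|` reaches 3.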
DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_full_like_default_c)(void *, - void *, - const size_t) = dpnp_full_like_c<_DataType>; - -template -class dpnp_identity_c_kernel; - -template -DPCTLSyclEventRef dpnp_identity_c(DPCTLSyclQueueRef q_ref, - void *result1, - const size_t n, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (n == 0) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - validate_type_for_device<_DataType>(q); - - _DataType *result = static_cast<_DataType *>(result1); - - sycl::range<2> gws(n, n); - auto kernel_parallel_for_func = [=](sycl::id<2> global_id) { - size_t i = global_id[0]; - size_t j = global_id[1]; - result[i * n + j] = i == j; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_identity_c(void *result1, const size_t n) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_identity_c<_DataType>(q_ref, result1, n, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_identity_default_c)(void *, - const size_t) = dpnp_identity_c<_DataType>; - template class dpnp_ones_c_kernel; @@ -442,632 +93,6 @@ void dpnp_ones_like_c(void *result, size_t size) template void (*dpnp_ones_like_default_c)(void *, size_t) = dpnp_ones_like_c<_DataType>; -template -DPCTLSyclEventRef dpnp_ptp_c(DPCTLSyclQueueRef q_ref, - void *result1_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input_size, - const 
size_t input_ndim, - const shape_elem_type *input_shape, - const shape_elem_type *input_strides, - const shape_elem_type *axis, - const size_t naxis, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)result_strides; - (void)input_strides; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - DPCTLSyclEventRef e1_ref = nullptr; - DPCTLSyclEventRef e2_ref = nullptr; - DPCTLSyclEventRef e3_ref = nullptr; - - if ((input1_in == nullptr) || (result1_out == nullptr)) { - return event_ref; - } - - if (input_ndim < 1) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, input1_in, input_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1_out, result_size, - false, true); - _DataType *arr = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - _DataType *min_arr = reinterpret_cast<_DataType *>( - sycl::malloc_shared(result_size * sizeof(_DataType), q)); - _DataType *max_arr = reinterpret_cast<_DataType *>( - sycl::malloc_shared(result_size * sizeof(_DataType), q)); - - e1_ref = dpnp_min_c<_DataType>(q_ref, arr, min_arr, result_size, - input_shape, input_ndim, axis, naxis, NULL); - e2_ref = dpnp_max_c<_DataType>(q_ref, arr, max_arr, result_size, - input_shape, input_ndim, axis, naxis, NULL); - - shape_elem_type *_strides = reinterpret_cast( - sycl::malloc_shared(result_ndim * sizeof(shape_elem_type), q)); - get_shape_offsets_inkernel(result_shape, result_ndim, _strides); - - e3_ref = dpnp_subtract_c<_DataType, _DataType, _DataType>( - q_ref, result, result_size, result_ndim, result_shape, result_strides, - max_arr, result_size, result_ndim, result_shape, _strides, min_arr, - result_size, result_ndim, result_shape, _strides, NULL, NULL); - - DPCTLEvent_Wait(e1_ref); - DPCTLEvent_Wait(e2_ref); - DPCTLEvent_Wait(e3_ref); - DPCTLEvent_Delete(e1_ref); - 
DPCTLEvent_Delete(e2_ref); - DPCTLEvent_Delete(e3_ref); - - sycl::free(min_arr, q); - sycl::free(max_arr, q); - sycl::free(_strides, q); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_ptp_c(void *result1_out, - const size_t result_size, - const size_t result_ndim, - const shape_elem_type *result_shape, - const shape_elem_type *result_strides, - const void *input1_in, - const size_t input_size, - const size_t input_ndim, - const shape_elem_type *input_shape, - const shape_elem_type *input_strides, - const shape_elem_type *axis, - const size_t naxis) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_ptp_c<_DataType>( - q_ref, result1_out, result_size, result_ndim, result_shape, - result_strides, input1_in, input_size, input_ndim, input_shape, - input_strides, axis, naxis, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_ptp_default_c)(void *, - const size_t, - const size_t, - const shape_elem_type *, - const shape_elem_type *, - const void *, - const size_t, - const size_t, - const shape_elem_type *, - const shape_elem_type *, - const shape_elem_type *, - const size_t) = dpnp_ptp_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_vander_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - void *result1, - const size_t size_in, - const size_t N, - const int increasing, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - DPCTLSyclEventRef event_ref = nullptr; - - if ((array1_in == nullptr) || (result1 == nullptr)) - return event_ref; - - if (!size_in || !N) - return event_ref; - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType_input>(q); - validate_type_for_device<_DataType_output>(q); - - DPNPC_ptr_adapter<_DataType_input> input1_ptr(q_ref, array1_in, size_in, - true); - DPNPC_ptr_adapter<_DataType_output> result_ptr(q_ref, result1, size_in * N, - 
true, true); - const _DataType_input *array_in = input1_ptr.get_ptr(); - _DataType_output *result = result_ptr.get_ptr(); - - if (N == 1) { - return dpnp_ones_c<_DataType_output>(q_ref, result, size_in, - dep_event_vec_ref); - } - - if (increasing) { - for (size_t i = 0; i < size_in; ++i) { - result[i * N] = 1; - } - for (size_t i = 1; i < N; ++i) { - for (size_t j = 0; j < size_in; ++j) { - result[j * N + i] = result[j * N + i - 1] * array_in[j]; - } - } - } - else { - for (size_t i = 0; i < size_in; ++i) { - result[i * N + N - 1] = 1; - } - for (size_t i = N - 2; i > 0; --i) { - for (size_t j = 0; j < size_in; ++j) { - result[j * N + i] = result[j * N + i + 1] * array_in[j]; - } - } - - for (size_t i = 0; i < size_in; ++i) { - result[i * N] = result[i * N + 1] * array_in[i]; - } - } - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_vander_c(const void *array1_in, - void *result1, - const size_t size_in, - const size_t N, - const int increasing) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_vander_c<_DataType_input, _DataType_output>( - q_ref, array1_in, result1, size_in, N, increasing, - dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_vander_default_c)(const void *, - void *, - const size_t, - const size_t, - const int) = - dpnp_vander_c<_DataType_input, _DataType_output>; - -template -class dpnp_trace_c_kernel; - -template -DPCTLSyclEventRef dpnp_trace_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - void *result_in, - const shape_elem_type *shape_, - const size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!array1_in || !result_in || !shape_ || !ndim) { - return event_ref; - } - - const size_t last_dim = shape_[ndim - 1]; - const size_t size = 
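The increasing branch of the removed `dpnp_vander_c` above seeds the first column with ones and builds each later column by cumulative multiplication. A host-side sketch of that branch, assuming `double` data and row-major storage as in the kernel (`vander_increasing` is a hypothetical name):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the removed dpnp_vander_c fill for increasing order:
// column i of row j holds x[j]^i, built by cumulative multiplication
// so no pow() call is needed.
std::vector<double> vander_increasing(const std::vector<double> &x, size_t N)
{
    const size_t size_in = x.size();
    std::vector<double> result(size_in * N);
    for (size_t j = 0; j < size_in; ++j)
        result[j * N] = 1.0; // first column is x^0
    for (size_t i = 1; i < N; ++i)
        for (size_t j = 0; j < size_in; ++j)
            result[j * N + i] = result[j * N + i - 1] * x[j];
    return result;
}
```

The decreasing branch in the kernel mirrors this, seeding the last column and multiplying right to left.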
std::accumulate(shape_, shape_ + (ndim - 1), 1, - std::multiplies()); - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - validate_type_for_device<_ResultType>(q); - - const _DataType *input = static_cast(array1_in); - _ResultType *result = static_cast<_ResultType *>(result_in); - - sycl::range<1> gws(size); - auto kernel_parallel_for_func = [=](auto index) { - size_t i = index[0]; - _ResultType acc = _ResultType(0); - - for (size_t j = 0; j < last_dim; ++j) { - acc += input[i * last_dim + j]; - } - - result[i] = acc; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - auto event = q.submit(kernel_func); - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_trace_c(const void *array1_in, - void *result_in, - const shape_elem_type *shape_, - const size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_trace_c<_DataType, _ResultType>( - q_ref, array1_in, result_in, shape_, ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_trace_default_c)(const void *, - void *, - const shape_elem_type *, - const size_t) = - dpnp_trace_c<_DataType, _ResultType>; - -template -class dpnp_tri_c_kernel; - -template -DPCTLSyclEventRef dpnp_tri_c(DPCTLSyclQueueRef q_ref, - void *result1, - const size_t N, - const size_t M, - const int k, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - sycl::event event; - - if (!result1 || !N || !M) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - _DataType *result = static_cast<_DataType *>(result1); - 
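The removed `dpnp_trace_c` kernel above is, despite its name, a plain reduction: it flattens the input to `(size, last_dim)` and sums along the last axis. A host-side sketch under that reading (`sum_last_dim` is a hypothetical name):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the removed dpnp_trace_c reduction: for input flattened to
// (size, last_dim), each output element accumulates one row. The kernel
// itself performs no diagonal extraction.
std::vector<double> sum_last_dim(const std::vector<double> &input,
                                 size_t size, size_t last_dim)
{
    std::vector<double> result(size, 0.0);
    for (size_t i = 0; i < size; ++i)
        for (size_t j = 0; j < last_dim; ++j)
            result[i] += input[i * last_dim + j];
    return result;
}
```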
- size_t idx = N * M; - sycl::range<1> gws(idx); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - size_t ind = global_id[0]; - size_t i = ind / M; - size_t j = ind % M; - - int val = i + k + 1; - size_t diag_idx_ = (val > 0) ? (size_t)val : 0; - size_t diag_idx = (M < diag_idx_) ? M : diag_idx_; - - if (j < diag_idx) { - result[ind] = 1; - } - else { - result[ind] = 0; - } - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_tri_c(void *result1, const size_t N, const size_t M, const int k) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_tri_c<_DataType>(q_ref, result1, N, M, k, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_tri_default_c)(void *, const size_t, const size_t, const int) = - dpnp_tri_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_tril_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if ((array_in == nullptr) || (result1 == nullptr)) { - return event_ref; - } - - if ((shape == nullptr) || (res_shape == nullptr)) { - return event_ref; - } - - if ((ndim == 0) || (res_ndim == 0)) { - return event_ref; - } - - const size_t res_size = std::accumulate(res_shape, res_shape + res_ndim, 1, - std::multiplies()); - if (res_size == 0) { - return event_ref; - } - - const size_t input_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (input_size == 0) { - return 
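The removed `dpnp_tri_c` kernel above decides each element with a clamped column bound: element `(i, j)` is 1 exactly when `j <= i + k`. That per-element predicate can be sketched on its own (the name `tri_element` is hypothetical):

```cpp
#include <cstddef>

// Sketch of the per-element predicate in the removed dpnp_tri_c kernel:
// element (i, j) of an N x M matrix is 1 when j <= i + k, expressed via the
// same clamped bound the kernel computes from the flattened index.
int tri_element(size_t i, size_t j, size_t M, int k)
{
    int val = static_cast<int>(i) + k + 1;
    size_t bound = (val > 0) ? static_cast<size_t>(val) : 0; // clamp below at 0
    if (bound > M)
        bound = M;                                           // clamp above at M
    return (j < bound) ? 1 : 0;
}
```

The removed `dpnp_tril_c`/`dpnp_triu_c` below apply the same diagonal comparison, but copy input elements (or zero) instead of writing a constant mask.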
event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array_in, input_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, res_size, true, - true); - _DataType *array_m = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - int *ids = new int[res_ndim]; - - if (ndim == 1) { - for (size_t i = 0; i < res_size; ++i) { - size_t n = res_size; - size_t val = i; - for (size_t j = 0; j < res_ndim; ++j) { - n /= res_shape[j]; - size_t p = val / n; - ids[j] = p; - if (p != 0) { - val = val - p * n; - } - } - - int diag_idx_ = - (ids[res_ndim - 2] + k > -1) ? (ids[res_ndim - 2] + k) : -1; - int values = res_shape[res_ndim - 1]; - int diag_idx = (values < diag_idx_) ? values : diag_idx_; - - if (ids[res_ndim - 1] <= diag_idx) { - result[i] = array_m[ids[res_ndim - 1]]; - } - else { - result[i] = 0; - } - } - } - else { - for (size_t i = 0; i < res_size; ++i) { - size_t n = res_size; - size_t val = i; - for (size_t j = 0; j < res_ndim; ++j) { - n /= res_shape[j]; - size_t p = val / n; - ids[j] = p; - if (p != 0) { - val = val - p * n; - } - } - - int diag_idx_ = - (ids[res_ndim - 2] + k > -1) ? (ids[res_ndim - 2] + k) : -1; - int values = res_shape[res_ndim - 1]; - int diag_idx = (values < diag_idx_) ? 
values : diag_idx_; - - if (ids[res_ndim - 1] <= diag_idx) { - result[i] = array_m[i]; - } - else { - result[i] = 0; - } - } - } - - delete[] ids; - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_tril_c(void *array_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_tril_c<_DataType>(q_ref, array_in, result1, k, shape, res_shape, - ndim, res_ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_tril_default_c)(void *, - void *, - const int, - shape_elem_type *, - shape_elem_type *, - const size_t, - const size_t) = dpnp_tril_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_triu_c(DPCTLSyclQueueRef q_ref, - void *array_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if ((array_in == nullptr) || (result1 == nullptr)) { - return event_ref; - } - - if ((shape == nullptr) || (res_shape == nullptr)) { - return event_ref; - } - - if ((ndim == 0) || (res_ndim == 0)) { - return event_ref; - } - - const size_t res_size = std::accumulate(res_shape, res_shape + res_ndim, 1, - std::multiplies()); - if (res_size == 0) { - return event_ref; - } - - const size_t input_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (input_size == 0) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - validate_type_for_device<_DataType>(q); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array_in, input_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, res_size, 
true, - true); - _DataType *array_m = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - int *ids = new int[res_ndim]; - - if (ndim == 1) { - for (size_t i = 0; i < res_size; ++i) { - size_t n = res_size; - size_t val = i; - for (size_t j = 0; j < res_ndim; ++j) { - n /= res_shape[j]; - size_t p = val / n; - ids[j] = p; - if (p != 0) { - val = val - p * n; - } - } - - int diag_idx_ = - (ids[res_ndim - 2] + k > -1) ? (ids[res_ndim - 2] + k) : -1; - int values = res_shape[res_ndim - 1]; - int diag_idx = (values < diag_idx_) ? values : diag_idx_; - - if (ids[res_ndim - 1] >= diag_idx) { - result[i] = array_m[ids[res_ndim - 1]]; - } - else { - result[i] = 0; - } - } - } - else { - for (size_t i = 0; i < res_size; ++i) { - size_t n = res_size; - size_t val = i; - for (size_t j = 0; j < res_ndim; ++j) { - n /= res_shape[j]; - size_t p = val / n; - ids[j] = p; - if (p != 0) { - val = val - p * n; - } - } - - int diag_idx_ = - (ids[res_ndim - 2] + k > -1) ? (ids[res_ndim - 2] + k) : -1; - int values = res_shape[res_ndim - 1]; - int diag_idx = (values < diag_idx_) ? 
values : diag_idx_; - - if (ids[res_ndim - 1] >= diag_idx) { - result[i] = array_m[i]; - } - else { - result[i] = 0; - } - } - } - - delete[] ids; - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_triu_c(void *array_in, - void *result1, - const int k, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t ndim, - const size_t res_ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_triu_c<_DataType>(q_ref, array_in, result1, k, shape, res_shape, - ndim, res_ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_triu_default_c)(void *, - void *, - const int, - shape_elem_type *, - shape_elem_type *, - const size_t, - const size_t) = dpnp_triu_c<_DataType>; - template DPCTLSyclEventRef dpnp_zeros_c(DPCTLSyclQueueRef q_ref, void *result, @@ -1130,72 +155,7 @@ void (*dpnp_zeros_like_default_c)(void *, void func_map_init_arraycreation(func_map_t &fmap) { - fmap[DPNPFuncName::DPNP_FN_ARANGE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_arange_default_c}; - fmap[DPNPFuncName::DPNP_FN_ARANGE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_arange_default_c}; - fmap[DPNPFuncName::DPNP_FN_ARANGE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_arange_default_c}; - fmap[DPNPFuncName::DPNP_FN_ARANGE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_arange_default_c}; - - fmap[DPNPFuncName::DPNP_FN_DIAG][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_diag_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_diag_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_diag_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_diag_default_c}; - - fmap[DPNPFuncName::DPNP_FN_EYE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_eye_default_c}; - 
fmap[DPNPFuncName::DPNP_FN_EYE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_eye_default_c}; - fmap[DPNPFuncName::DPNP_FN_EYE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_eye_default_c}; - fmap[DPNPFuncName::DPNP_FN_EYE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_eye_default_c}; - - fmap[DPNPFuncName::DPNP_FN_FULL][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_full_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_full_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_full_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_full_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_full_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_full_default_c>}; - - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_full_like_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_full_like_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_full_like_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_full_like_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_full_like_default_c}; - fmap[DPNPFuncName::DPNP_FN_FULL_LIKE][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_full_like_default_c>}; - - fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_identity_default_c}; - fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_identity_default_c}; - fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_identity_default_c}; - fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_identity_default_c}; - 
fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_BLN][eft_BLN] = { - eft_BLN, (void *)dpnp_identity_default_c}; - fmap[DPNPFuncName::DPNP_FN_IDENTITY][eft_C128][eft_C128] = { - eft_C128, (void *)dpnp_identity_default_c>}; - + // Used in dpnp_rng_geometric_c fmap[DPNPFuncName::DPNP_FN_ONES][eft_INT][eft_INT] = { eft_INT, (void *)dpnp_ones_default_c}; fmap[DPNPFuncName::DPNP_FN_ONES][eft_LNG][eft_LNG] = { @@ -1222,90 +182,8 @@ void func_map_init_arraycreation(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_ONES_LIKE][eft_C128][eft_C128] = { eft_C128, (void *)dpnp_ones_like_default_c>}; - fmap[DPNPFuncName::DPNP_FN_PTP][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_ptp_default_c}; - fmap[DPNPFuncName::DPNP_FN_PTP][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_ptp_default_c}; - fmap[DPNPFuncName::DPNP_FN_PTP][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_ptp_default_c}; - fmap[DPNPFuncName::DPNP_FN_PTP][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_ptp_default_c}; - - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_INT][eft_INT] = { - eft_LNG, (void *)dpnp_vander_default_c}; - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_vander_default_c}; - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_FLT][eft_FLT] = { - eft_DBL, (void *)dpnp_vander_default_c}; - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_vander_default_c}; - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_BLN][eft_BLN] = { - eft_LNG, (void *)dpnp_vander_default_c}; - fmap[DPNPFuncName::DPNP_FN_VANDER][eft_C128][eft_C128] = { - eft_C128, - (void *) - dpnp_vander_default_c, std::complex>}; - - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_LNG][eft_INT] = { - eft_INT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_FLT][eft_INT] = { - eft_INT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_DBL][eft_INT] = { - eft_INT, (void 
*)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_INT][eft_LNG] = { - eft_LNG, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_FLT][eft_LNG] = { - eft_LNG, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_DBL][eft_LNG] = { - eft_LNG, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_INT][eft_FLT] = { - eft_FLT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_LNG][eft_FLT] = { - eft_FLT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_DBL][eft_FLT] = { - eft_FLT, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_INT][eft_DBL] = { - eft_DBL, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_LNG][eft_DBL] = { - eft_DBL, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_FLT][eft_DBL] = { - eft_DBL, (void *)dpnp_trace_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRACE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_trace_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TRI][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_tri_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRI][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_tri_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRI][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_tri_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRI][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_tri_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TRIL][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_tril_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIL][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_tril_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_tril_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIL][eft_DBL][eft_DBL] = { - eft_DBL, (void 
*)dpnp_tril_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TRIU][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_triu_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIU][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_triu_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIU][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_triu_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRIU][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_triu_default_c}; - + // Used in dpnp_rng_binomial_c, dpnp_rng_gamma_c, dpnp_rng_hypergeometric_c + // dpnp_rng_laplace_c, dpnp_rng_multinomial_c, dpnp_rng_weibull_c fmap[DPNPFuncName::DPNP_FN_ZEROS][eft_INT][eft_INT] = { eft_INT, (void *)dpnp_zeros_default_c}; fmap[DPNPFuncName::DPNP_FN_ZEROS][eft_LNG][eft_LNG] = { diff --git a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp index 20be65f53ca..e3797bd22e6 100644 --- a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp @@ -1026,19 +1026,6 @@ static void func_map_init_elemwise_1arg_1type(func_map_t &fmap) #include -template -static void func_map_elemwise_2arg_3type_core(func_map_t &fmap) -{ - // dpnp_subtract_c_ext is implicitly used by dpnp_ptp_c - ((fmap[DPNPFuncName::DPNP_FN_SUBTRACT_EXT][FT1][FTs] = - {populate_func_types(), - (void *)dpnp_subtract_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, - func_type_map_t::find_type>}), - ...); -} - template static void func_map_elemwise_2arg_3type_short_core(func_map_t &fmap) { @@ -1072,12 +1059,6 @@ static void func_map_elemwise_2arg_3type_short_core(func_map_t &fmap) ...); } -template -static void func_map_elemwise_2arg_3type_helper(func_map_t &fmap) -{ - ((func_map_elemwise_2arg_3type_core(fmap)), ...); -} - template static void func_map_elemwise_2arg_3type_short_helper(func_map_t &fmap) { @@ -1189,9 +1170,6 @@ static void func_map_init_elemwise_2arg_3type(func_map_t &fmap) (void *)dpnp_multiply_c_default< std::complex, std::complex, std::complex>}; - 
func_map_elemwise_2arg_3type_helper(fmap); - func_map_elemwise_2arg_3type_short_helper(fmap); diff --git a/dpnp/backend/kernels/dpnp_krnl_indexing.cpp b/dpnp/backend/kernels/dpnp_krnl_indexing.cpp index dcbf6ca906c..523acd447c6 100644 --- a/dpnp/backend/kernels/dpnp_krnl_indexing.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_indexing.cpp @@ -125,31 +125,6 @@ DPCTLSyclEventRef (*dpnp_choose_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_choose_c<_DataType1, _DataType2>; -template -DPCTLSyclEventRef - dpnp_diag_indices_c(DPCTLSyclQueueRef q_ref, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - return dpnp_arange_c<_DataType>(q_ref, 0, 1, result1, size, - dep_event_vec_ref); -} - -template -void dpnp_diag_indices_c(void *result1, size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_diag_indices_c<_DataType>(q_ref, result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_diag_indices_default_c)(void *, - size_t) = dpnp_diag_indices_c<_DataType>; - template DPCTLSyclEventRef dpnp_diagonal_c(DPCTLSyclQueueRef q_ref, void *array1_in, @@ -873,15 +848,6 @@ void func_map_init_indexing_func(func_map_t &fmap) fmap[DPNPFuncName::DPNP_FN_CHOOSE_EXT][eft_LNG][eft_DBL] = { eft_DBL, (void *)dpnp_choose_ext_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG_INDICES][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_diag_indices_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG_INDICES][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_diag_indices_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG_INDICES][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_diag_indices_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAG_INDICES][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_diag_indices_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_INT][eft_INT] = { eft_INT, (void *)dpnp_diagonal_default_c}; 
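The `fmap` registrations removed and retained in the hunks above all follow one backend pattern: a table keyed by function name and the element types of the operands, holding a result type (which may be promoted, as in the `DPNP_FN_VANDER` entries where `eft_INT` inputs yield an `eft_LNG` result) together with a type-erased kernel pointer. A minimal Python sketch of that dispatch scheme, with hypothetical names that are not part of the dpnp API, is:

```python
# Sketch of the backend's fmap dispatch: (func, in_type, in_type) maps
# to (result_type, kernel). All names here are illustrative only.

def vander_kernel_long(x, n):
    # Vandermonde rows computed with integer ("long") arithmetic.
    return [[int(v) ** p for p in range(n)] for v in x]

def vander_kernel_double(x, n):
    # Vandermonde rows computed with float ("double") arithmetic.
    return [[float(v) ** p for p in range(n)] for v in x]

fmap = {
    # INT inputs are promoted to a LNG result, as in the real table.
    ("VANDER", "INT", "INT"): ("LNG", vander_kernel_long),
    ("VANDER", "DBL", "DBL"): ("DBL", vander_kernel_double),
}

result_type, kernel = fmap[("VANDER", "INT", "INT")]
print(result_type, kernel([1, 2, 3], 3))
# → LNG [[1, 1, 1], [1, 2, 4], [1, 3, 9]]
```

The real table stores `(DPNPFuncType, void *)` pairs and the caller casts the pointer back to the templated kernel signature; the dictionary above captures only the lookup-and-promote shape of that design.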
fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_LNG][eft_LNG] = { diff --git a/dpnp/backend/kernels/dpnp_krnl_manipulation.cpp b/dpnp/backend/kernels/dpnp_krnl_manipulation.cpp deleted file mode 100644 index aaaa5a179dd..00000000000 --- a/dpnp/backend/kernels/dpnp_krnl_manipulation.cpp +++ /dev/null @@ -1,235 +0,0 @@ -//***************************************************************************** -// Copyright (c) 2016-2024, Intel Corporation -// All rights reserved. -// -// Redistribution and use in source and binary forms, with or without -// modification, are permitted provided that the following conditions are met: -// - Redistributions of source code must retain the above copyright notice, -// this list of conditions and the following disclaimer. -// - Redistributions in binary form must reproduce the above copyright notice, -// this list of conditions and the following disclaimer in the documentation -// and/or other materials provided with the distribution. -// -// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE -// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF -// THE POSSIBILITY OF SUCH DAMAGE. 
-//***************************************************************************** - -#include -#include -#include - -#include - -#include "dpnp_fptr.hpp" -#include "dpnp_utils.hpp" -#include "dpnpc_memory_adapter.hpp" -#include "queue_sycl.hpp" - -template -class dpnp_repeat_c_kernel; - -template -DPCTLSyclEventRef dpnp_repeat_c(DPCTLSyclQueueRef q_ref, - const void *array1_in, - void *result1, - const size_t repeats, - const size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!array1_in || !result1) { - return event_ref; - } - - if (!size || !repeats) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, size); - const _DataType *array_in = input1_ptr.get_ptr(); - _DataType *result = reinterpret_cast<_DataType *>(result1); - - sycl::range<2> gws(size, repeats); - auto kernel_parallel_for_func = [=](sycl::id<2> global_id) { - size_t idx1 = global_id[0]; - size_t idx2 = global_id[1]; - result[(idx1 * repeats) + idx2] = array_in[idx1]; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - event = q.submit(kernel_func); - - event_ref = reinterpret_cast(&event); - - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_repeat_c(const void *array1_in, - void *result1, - const size_t repeats, - const size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_repeat_c<_DataType>( - q_ref, array1_in, result1, repeats, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_repeat_default_c)(const void *, - void *, - const size_t, - const size_t) = dpnp_repeat_c<_DataType>; - -template -class dpnp_elemwise_transpose_c_kernel; - -template 
-DPCTLSyclEventRef - dpnp_elemwise_transpose_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - const shape_elem_type *input_shape, - const shape_elem_type *result_shape, - const shape_elem_type *permute_axes, - size_t ndim, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - sycl::event event; - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, size); - _DataType *array1 = input1_ptr.get_ptr(); - _DataType *result = reinterpret_cast<_DataType *>(result1); - - shape_elem_type *input_offset_shape = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - get_shape_offsets_inkernel(input_shape, ndim, input_offset_shape); - - shape_elem_type *temp_result_offset_shape = - reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - get_shape_offsets_inkernel(result_shape, ndim, temp_result_offset_shape); - - shape_elem_type *result_offset_shape = reinterpret_cast( - sycl::malloc_shared(ndim * sizeof(shape_elem_type), q)); - for (size_t axis = 0; axis < ndim; ++axis) { - result_offset_shape[permute_axes[axis]] = - temp_result_offset_shape[axis]; - } - - sycl::range<1> gws(size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - const size_t idx = global_id[0]; - - size_t output_index = 0; - size_t reminder = idx; - for (size_t axis = 0; axis < ndim; ++axis) { - /* reconstruct [x][y][z] from given linear idx */ - size_t xyz_id = reminder / input_offset_shape[axis]; - reminder = reminder % input_offset_shape[axis]; - - /* calculate destination index based on reconstructed [x][y][z] */ - output_index += (xyz_id * result_offset_shape[axis]); - } - - result[output_index] = array1[idx]; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); 
- }; - - event = q.submit(kernel_func); - - event.wait(); - - sycl::free(input_offset_shape, q); - sycl::free(temp_result_offset_shape, q); - sycl::free(result_offset_shape, q); - - return event_ref; -} - -template -void dpnp_elemwise_transpose_c(void *array1_in, - const shape_elem_type *input_shape, - const shape_elem_type *result_shape, - const shape_elem_type *permute_axes, - size_t ndim, - void *result1, - size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_elemwise_transpose_c<_DataType>( - q_ref, array1_in, input_shape, result_shape, permute_axes, ndim, - result1, size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_elemwise_transpose_default_c)(void *, - const shape_elem_type *, - const shape_elem_type *, - const shape_elem_type *, - size_t, - void *, - size_t) = - dpnp_elemwise_transpose_c<_DataType>; - -void func_map_init_manipulation(func_map_t &fmap) -{ - fmap[DPNPFuncName::DPNP_FN_REPEAT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_repeat_default_c}; - fmap[DPNPFuncName::DPNP_FN_REPEAT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_repeat_default_c}; - fmap[DPNPFuncName::DPNP_FN_REPEAT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_repeat_default_c}; - fmap[DPNPFuncName::DPNP_FN_REPEAT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_repeat_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TRANSPOSE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_elemwise_transpose_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRANSPOSE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_elemwise_transpose_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRANSPOSE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_elemwise_transpose_default_c}; - fmap[DPNPFuncName::DPNP_FN_TRANSPOSE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_elemwise_transpose_default_c}; - return; -} diff --git a/dpnp/backend/src/dpnp_fptr.hpp 
b/dpnp/backend/src/dpnp_fptr.hpp index 2a9c42eb172..73d627812a5 100644 --- a/dpnp/backend/src/dpnp_fptr.hpp +++ b/dpnp/backend/src/dpnp_fptr.hpp @@ -331,7 +331,6 @@ void func_map_init_fft_func(func_map_t &fmap); void func_map_init_indexing_func(func_map_t &fmap); void func_map_init_linalg(func_map_t &fmap); void func_map_init_logic(func_map_t &fmap); -void func_map_init_manipulation(func_map_t &fmap); void func_map_init_mathematical(func_map_t &fmap); void func_map_init_random(func_map_t &fmap); void func_map_init_reduction(func_map_t &fmap); diff --git a/dpnp/backend/src/dpnp_iface_fptr.cpp b/dpnp/backend/src/dpnp_iface_fptr.cpp index f8214212728..f80c5b35863 100644 --- a/dpnp/backend/src/dpnp_iface_fptr.cpp +++ b/dpnp/backend/src/dpnp_iface_fptr.cpp @@ -96,45 +96,6 @@ void (*dpnp_dot_default_c)(void *, const shape_elem_type *) = dpnp_dot_c<_DataType_output, _DataType_input1, _DataType_input2>; -void *get_backend_function_name(const char *func_name, const char *type_name) -{ - /** Implement it in this way to allow easier play with it */ - const char *supported_func_name = "dpnp_dot"; - const char *supported_type1_name = "double"; - const char *supported_type2_name = "float"; - const char *supported_type3_name = "long"; - const char *supported_type4_name = "int"; - - /** of coerce it will be converted into std::map later */ - if (!strncmp(func_name, supported_func_name, strlen(supported_func_name))) { - if (!strncmp(type_name, supported_type1_name, - strlen(supported_type1_name))) { - return reinterpret_cast( - dpnp_dot_default_c); - } - else if (!strncmp(type_name, supported_type2_name, - strlen(supported_type2_name))) - { - return reinterpret_cast( - dpnp_dot_default_c); - } - else if (!strncmp(type_name, supported_type3_name, - strlen(supported_type3_name))) - { - return reinterpret_cast( - dpnp_dot_default_c); - } - else if (!strncmp(type_name, supported_type4_name, - strlen(supported_type4_name))) - { - return reinterpret_cast( - dpnp_dot_default_c); - } - } - 
- throw std::runtime_error("DPNP Error: Unsupported function call"); -} - /** * This operator is needed for compatibility with Cython 0.29 which has a bug in * Enum handling @@ -172,7 +133,6 @@ static func_map_t func_map_init() func_map_init_indexing_func(fmap); func_map_init_linalg(fmap); func_map_init_logic(fmap); - func_map_init_manipulation(fmap); func_map_init_mathematical(fmap); func_map_init_random(fmap); func_map_init_reduction(fmap); diff --git a/dpnp/backend/src/queue_sycl.cpp b/dpnp/backend/src/queue_sycl.cpp index 5e6df29d21d..786752facd6 100644 --- a/dpnp/backend/src/queue_sycl.cpp +++ b/dpnp/backend/src/queue_sycl.cpp @@ -80,36 +80,6 @@ } #endif -#if defined(DPNPC_TOUCH_KERNEL_TO_LINK) -/** - * Function push the SYCL kernels to be linked (final stage of the compilation) - * for the current queue - * - * TODO it is not the best idea to just a call some kernel. Needs better - * solution. - */ -static long dpnp_kernels_link() -{ - /* must use memory pre-allocated at the current queue */ - long *value_ptr = - reinterpret_cast(dpnp_memory_alloc_c(1 * sizeof(long))); - long *result_ptr = - reinterpret_cast(dpnp_memory_alloc_c(1 * sizeof(long))); - long result = 1; - - *value_ptr = 2; - - dpnp_square_c(value_ptr, result_ptr, 1); - - result = *result_ptr; - - dpnp_memory_free_c(result_ptr); - dpnp_memory_free_c(value_ptr); - - return result; -} -#endif - size_t dpnp_queue_is_cpu_c() { const auto &be = backend_sycl::get(); diff --git a/dpnp/dpnp_algo/dpnp_algo.pxd b/dpnp/dpnp_algo/dpnp_algo.pxd index 37663bee834..0c8bd1134a7 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pxd +++ b/dpnp/dpnp_algo/dpnp_algo.pxd @@ -84,7 +84,6 @@ cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncName": # need this na DPNP_FN_RNG_WALD_EXT DPNP_FN_RNG_WEIBULL_EXT DPNP_FN_RNG_ZIPF_EXT - DPNP_FN_TRAPZ_EXT cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncType": # need this namespace for Enum import cdef enum DPNPFuncType "DPNPFuncType": diff --git 
a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi index fca1e6dc303..28b89ce60a1 100644 --- a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi +++ b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi @@ -40,16 +40,12 @@ __all__ += [ "dpnp_fmax", "dpnp_fmin", "dpnp_modf", - "dpnp_trapz", ] ctypedef c_dpctl.DPCTLSyclEventRef(*fptr_1in_2out_t)(c_dpctl.DPCTLSyclQueueRef, void * , void * , void * , size_t, const c_dpctl.DPCTLEventVectorRef) -ctypedef c_dpctl.DPCTLSyclEventRef(*ftpr_custom_trapz_2in_1out_with_2size_t)(c_dpctl.DPCTLSyclQueueRef, - void *, void * , void * , double, size_t, size_t, - const c_dpctl.DPCTLEventVectorRef) cpdef utils.dpnp_descriptor dpnp_ediff1d(utils.dpnp_descriptor x1): @@ -166,41 +162,3 @@ cpdef tuple dpnp_modf(utils.dpnp_descriptor x1): c_dpctl.DPCTLEvent_Delete(event_ref) return (result1.get_pyobj(), result2.get_pyobj()) - - -cpdef utils.dpnp_descriptor dpnp_trapz(utils.dpnp_descriptor y1, utils.dpnp_descriptor x1, double dx): - - cdef DPNPFuncType param1_type = dpnp_dtype_to_DPNPFuncType(y1.dtype) - cdef DPNPFuncType param2_type = dpnp_dtype_to_DPNPFuncType(x1.dtype) - cdef DPNPFuncData kernel_data = get_dpnp_function_ptr(DPNP_FN_TRAPZ_EXT, param1_type, param2_type) - - result_sycl_device, result_usm_type, result_sycl_queue = utils.get_common_usm_allocation(y1, x1) - - # create result array with type given by FPTR data - cdef shape_type_c result_shape = (1,) - cdef utils.dpnp_descriptor result = utils.create_output_descriptor(result_shape, - kernel_data.return_type, - None, - device=result_sycl_device, - usm_type=result_usm_type, - sycl_queue=result_sycl_queue) - - result_sycl_queue = result.get_array().sycl_queue - - cdef c_dpctl.SyclQueue q = result_sycl_queue - cdef c_dpctl.DPCTLSyclQueueRef q_ref = q.get_queue_ref() - - cdef ftpr_custom_trapz_2in_1out_with_2size_t func = kernel_data.ptr - cdef c_dpctl.DPCTLSyclEventRef event_ref = func(q_ref, - y1.get_data(), - x1.get_data(), - result.get_data(), - dx, - y1.size, 
- x1.size, - NULL) # dep_events_ref - - with nogil: c_dpctl.DPCTLEvent_WaitAndThrow(event_ref) - c_dpctl.DPCTLEvent_Delete(event_ref) - - return result diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index 1fe7839f596..1caf1359be3 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -64,7 +64,6 @@ dpnp_fmax, dpnp_fmin, dpnp_modf, - dpnp_trapz, ) from .dpnp_algo.dpnp_elementwise_common import ( DPNPAngle, @@ -3287,36 +3286,6 @@ def trapz(y1, x1=None, dx=1.0, axis=-1): """ - y_desc = dpnp.get_dpnp_descriptor(y1, copy_when_nondefault_queue=False) - if y_desc: - if y_desc.ndim > 1: - pass - else: - y_obj = y_desc.get_array() - if x1 is None: - x_obj = dpnp.empty( - y_desc.shape, - dtype=y_desc.dtype, - device=y_obj.sycl_device, - usm_type=y_obj.usm_type, - sycl_queue=y_obj.sycl_queue, - ) - else: - x_obj = x1 - - x_desc = dpnp.get_dpnp_descriptor( - x_obj, copy_when_nondefault_queue=False - ) - # TODO: change to "not x_desc" - if x_desc: - pass - elif y_desc.size != x_desc.size: - pass - elif y_desc.shape != x_desc.shape: - pass - else: - return dpnp_trapz(y_desc, x_desc, dx).get_pyobj() - return call_origin(numpy.trapz, y1, x1, dx, axis) From 03b585b09145525e21edb688436e14cf9a19d484 Mon Sep 17 00:00:00 2001 From: Natalia Polina Date: Thu, 4 Jul 2024 04:04:20 -0700 Subject: [PATCH 44/49] Clean up legacy indexing implementation from the backend (#1908) * Clean up legacy indexing implementation from the backend * fix pre-commit --- dpnp/backend/include/dpnp_iface.hpp | 224 ------ dpnp/backend/include/dpnp_iface_fptr.hpp | 49 +- dpnp/backend/kernels/dpnp_krnl_indexing.cpp | 767 -------------------- 3 files changed, 21 insertions(+), 1019 deletions(-) diff --git a/dpnp/backend/include/dpnp_iface.hpp b/dpnp/backend/include/dpnp_iface.hpp index 0fc5595041c..4efea15a38b 100644 --- a/dpnp/backend/include/dpnp_iface.hpp +++ b/dpnp/backend/include/dpnp_iface.hpp @@ -205,38 +205,6 @@ INP_DLLEXPORT void 
dpnp_nanvar_c(void *array, const size_t result_size, size_t size); -/** - * @ingroup BACKEND_API - * @brief Return the indices of the elements that are non-zero. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1 Input array. - * @param [out] result1 Output array. - * @param [in] result_size Output array size. - * @param [in] shape Shape of input array. - * @param [in] ndim Number of elements in shape. - * @param [in] j Number input array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_nonzero_c(DPCTLSyclQueueRef q_ref, - const void *array1, - void *result1, - const size_t result_size, - const shape_elem_type *shape, - const size_t ndim, - const size_t j, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_nonzero_c(const void *array1, - void *result1, - const size_t result_size, - const shape_elem_type *shape, - const size_t ndim, - const size_t j); - /** * @ingroup BACKEND_API * @brief Custom implementation of dot function @@ -448,35 +416,6 @@ INP_DLLEXPORT void dpnp_partition_c(void *array, const shape_elem_type *shape, const size_t ndim); -/** - * @ingroup BACKEND_API - * @brief Place of array elements - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] arr Input array. - * @param [in] mask Mask array. - * @param [in] vals Vals array. - * @param [in] arr_size Number of input elements in `arr`. - * @param [in] vals_size Number of input elements in `vals`. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_place_c(DPCTLSyclQueueRef q_ref, - void *arr, - long *mask, - void *vals, - const size_t arr_size, - const size_t vals_size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_place_c(void *arr, - long *mask, - void *vals, - const size_t arr_size, - const size_t vals_size); - /** * @ingroup BACKEND_API * @brief Compute Product of input array elements. @@ -523,78 +462,6 @@ INP_DLLEXPORT void dpnp_prod_c(void *result_out, const void *initial, const long *where); -/** - * @ingroup BACKEND_API - * @brief Replaces specified elements of an array with given values. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array. - * @param [in] ind Target indices, interpreted as integers. - * @param [in] v Values to place in array at target indices. - * @param [in] size Number of input elements in `array`. - * @param [in] size_ind Number of input elements in `ind`. - * @param [in] size_v Number of input elements in `v`. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_put_c(DPCTLSyclQueueRef q_ref, - void *array, - void *ind, - void *v, - const size_t size, - const size_t size_ind, - const size_t size_v, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_put_c(void *array, - void *ind, - void *v, - const size_t size, - const size_t size_ind, - const size_t size_v); - -/** - * @ingroup BACKEND_API - * @brief Put values into the destination array by matching 1d index and data - * slices. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] arr_in Input array. - * @param [in] indices_in Indices to change along each 1d slice of - * arr. - * @param [in] values_in Values to insert at those indices. - * @param [in] axis The axis to take 1d slices along. - * @param [in] shape Shape of input array. 
- * @param [in] ndim Number of input array dimensions. - * @param [in] size_indices Size of indices. - * @param [in] values_size Size of values. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_put_along_axis_c(DPCTLSyclQueueRef q_ref, - void *arr_in, - long *indices_in, - void *values_in, - size_t axis, - const shape_elem_type *shape, - size_t ndim, - size_t size_indices, - size_t values_size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_put_along_axis_c(void *arr_in, - long *indices_in, - void *values_in, - size_t axis, - const shape_elem_type *shape, - size_t ndim, - size_t size_indices, - size_t values_size); - /** * @ingroup BACKEND_API @@ -776,42 +643,6 @@ INP_DLLEXPORT void dpnp_choose_c(void *result1, size_t choices_size, size_t choice_size); -/** - * @ingroup BACKEND_API - * @brief math library implementation of diagonal function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array with data. - * @param [in] input1_size Input1 data size. - * @param [out] result1 Output array. - * @param [in] offset Offset of the diagonal from the main - * diagonal. - * @param [in] shape Shape of input array. - * @param [in] res_shape Shape of output array. - * @param [in] res_ndim Number of elements in shape. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. 
- */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_diagonal_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - const size_t input1_size, - void *result1, - const size_t offset, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_diagonal_c(void *array1_in, - const size_t input1_size, - void *result1, - const size_t offset, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t res_ndim); - /** * @ingroup BACKEND_API * @brief implementation of creating filled with value array function @@ -1044,35 +875,6 @@ INP_DLLEXPORT void dpnp_std_c(void *array, size_t naxis, size_t ddof); -/** - * @ingroup BACKEND_API - * @brief math library implementation of take function - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array Input array with data. - * @param [in] array1_size Input array size. - * @param [in] indices Input array with indices. - * @param [out] result Output array. - * @param [in] size Number of elements in the input array. - * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_take_c(DPCTLSyclQueueRef q_ref, - void *array, - const size_t array1_size, - void *indices, - void *result, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_take_c(void *array, - const size_t array1_size, - void *indices, - void *result, - size_t size); - /** * @ingroup BACKEND_API * @brief math library implementation of var function @@ -1183,32 +985,6 @@ INP_DLLEXPORT void dpnp_var_c(void *array, #include -/** - * @ingroup BACKEND_API - * @brief fill_diagonal function. - * - * @param [in] q_ref Reference to SYCL queue. - * @param [in] array1_in Input array. - * @param [in] val Value to write on the diagonal. - * @param [in] shape Input shape. - * @param [in] ndim Number of elements in shape. 
- * @param [in] dep_event_vec_ref Reference to vector of SYCL events. - */ -template -INP_DLLEXPORT DPCTLSyclEventRef - dpnp_fill_diagonal_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *val, - shape_elem_type *shape, - const size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref); - -template -INP_DLLEXPORT void dpnp_fill_diagonal_c(void *array1_in, - void *val, - shape_elem_type *shape, - const size_t ndim); - /** * @ingroup BACKEND_API * @brief modf function. diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index d62e5998583..aaaf90c27bb 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -81,22 +81,20 @@ enum class DPNPFuncName : size_t DPNP_FN_DEGREES, /**< Used in numpy.degrees() impl */ DPNP_FN_DEGREES_EXT, /**< Used in numpy.degrees() impl, requires extra parameters */ - DPNP_FN_DIAGONAL, /**< Used in numpy.diagonal() impl */ DPNP_FN_DOT, /**< Used in numpy.dot() impl */ DPNP_FN_DOT_EXT, /**< Used in numpy.dot() impl, requires extra parameters */ DPNP_FN_EDIFF1D, /**< Used in numpy.ediff1d() impl */ - DPNP_FN_EDIFF1D_EXT, /**< Used in numpy.ediff1d() impl, requires extra - parameters */ - DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ - DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra - parameters */ - DPNP_FN_FFT_FFT, /**< Used in numpy.fft.fft() impl */ - DPNP_FN_FFT_FFT_EXT, /**< Used in numpy.fft.fft() impl, requires extra - parameters */ - DPNP_FN_FFT_RFFT, /**< Used in numpy.fft.rfft() impl */ - DPNP_FN_FFT_RFFT_EXT, /**< Used in numpy.fft.rfft() impl, requires extra - parameters */ - DPNP_FN_FILL_DIAGONAL, /**< Used in numpy.fill_diagonal() impl */ + DPNP_FN_EDIFF1D_EXT, /**< Used in numpy.ediff1d() impl, requires extra + parameters */ + DPNP_FN_ERF, /**< Used in scipy.special.erf impl */ + DPNP_FN_ERF_EXT, /**< Used in scipy.special.erf impl, requires extra + parameters */ + DPNP_FN_FFT_FFT, /**< Used in 
numpy.fft.fft() impl */ + DPNP_FN_FFT_FFT_EXT, /**< Used in numpy.fft.fft() impl, requires extra + parameters */ + DPNP_FN_FFT_RFFT, /**< Used in numpy.fft.rfft() impl */ + DPNP_FN_FFT_RFFT_EXT, /**< Used in numpy.fft.rfft() impl, requires extra + parameters */ DPNP_FN_INITVAL, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ DPNP_FN_INITVAL_EXT, /**< Used in numpy ones, ones_like, zeros, zeros_like @@ -116,23 +114,19 @@ enum class DPNPFuncName : size_t */ DPNP_FN_MULTIPLY, /**< Used in numpy.multiply() impl */ DPNP_FN_NANVAR, /**< Used in numpy.nanvar() impl */ - DPNP_FN_NONZERO, /**< Used in numpy.nonzero() impl */ DPNP_FN_ONES, /**< Used in numpy.ones() impl */ DPNP_FN_ONES_LIKE, /**< Used in numpy.ones_like() impl */ DPNP_FN_PARTITION, /**< Used in numpy.partition() impl */ - DPNP_FN_PARTITION_EXT, /**< Used in numpy.partition() impl, requires extra - parameters */ - DPNP_FN_PLACE, /**< Used in numpy.place() impl */ - DPNP_FN_PROD, /**< Used in numpy.prod() impl */ - DPNP_FN_PUT, /**< Used in numpy.put() impl */ - DPNP_FN_PUT_ALONG_AXIS, /**< Used in numpy.put_along_axis() impl */ - DPNP_FN_RADIANS, /**< Used in numpy.radians() impl */ - DPNP_FN_RADIANS_EXT, /**< Used in numpy.radians() impl, requires extra - parameters */ - DPNP_FN_RNG_BETA, /**< Used in numpy.random.beta() impl */ - DPNP_FN_RNG_BETA_EXT, /**< Used in numpy.random.beta() impl, requires extra - parameters */ - DPNP_FN_RNG_BINOMIAL, /**< Used in numpy.random.binomial() impl */ + DPNP_FN_PARTITION_EXT, /**< Used in numpy.partition() impl, requires extra + parameters */ + DPNP_FN_PROD, /**< Used in numpy.prod() impl */ + DPNP_FN_RADIANS, /**< Used in numpy.radians() impl */ + DPNP_FN_RADIANS_EXT, /**< Used in numpy.radians() impl, requires extra + parameters */ + DPNP_FN_RNG_BETA, /**< Used in numpy.random.beta() impl */ + DPNP_FN_RNG_BETA_EXT, /**< Used in numpy.random.beta() impl, requires extra + parameters */ + DPNP_FN_RNG_BINOMIAL, /**< Used in numpy.random.binomial() impl */ 
DPNP_FN_RNG_BINOMIAL_EXT, /**< Used in numpy.random.binomial() impl, requires extra parameters */ DPNP_FN_RNG_CHISQUARE, /**< Used in numpy.random.chisquare() impl */ @@ -253,7 +247,6 @@ enum class DPNPFuncName : size_t */ DPNP_FN_STD, /**< Used in numpy.std() impl */ DPNP_FN_SUM, /**< Used in numpy.sum() impl */ - DPNP_FN_TAKE, /**< Used in numpy.take() impl */ DPNP_FN_VAR, /**< Used in numpy.var() impl */ DPNP_FN_ZEROS, /**< Used in numpy.zeros() impl */ DPNP_FN_ZEROS_LIKE, /**< Used in numpy.zeros_like() impl */ diff --git a/dpnp/backend/kernels/dpnp_krnl_indexing.cpp b/dpnp/backend/kernels/dpnp_krnl_indexing.cpp index 523acd447c6..5400da81758 100644 --- a/dpnp/backend/kernels/dpnp_krnl_indexing.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_indexing.cpp @@ -125,693 +125,6 @@ DPCTLSyclEventRef (*dpnp_choose_ext_c)(DPCTLSyclQueueRef, const DPCTLEventVectorRef) = dpnp_choose_c<_DataType1, _DataType2>; -template -DPCTLSyclEventRef dpnp_diagonal_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - const size_t input1_size, - void *result1, - const size_t offset, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t res_ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - const size_t res_size = std::accumulate(res_shape, res_shape + res_ndim, 1, - std::multiplies()); - if (!(res_size && input1_size)) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, array1_in, input1_size, - true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, result1, res_size, true, - true); - _DataType *array_1 = input1_ptr.get_ptr(); - _DataType *result = result_ptr.get_ptr(); - - const size_t res_shape_ndim_sub_1 = - static_cast(res_shape[res_ndim - 1]); - - if (res_ndim <= 1) { - for (size_t i = 0; i < res_shape_ndim_sub_1; ++i) { - result[i] = array_1[i * shape[res_ndim] + i + offset]; - } - } - else 
{ - std::map> xyz; - for (size_t i = 0; i < static_cast(res_shape[0]); i++) { - xyz[i] = {i}; - } - - size_t index = 1; - while (index < res_ndim - 1) { - size_t shape_element = res_shape[index]; - std::map> new_shape_array; - size_t ind = 0; - for (size_t i = 0; i < shape_element; i++) { - for (size_t j = 0; j < xyz.size(); j++) { - std::vector new_shape; - std::vector list_ind = xyz[j]; - for (size_t k = 0; k < list_ind.size(); k++) { - new_shape.push_back(list_ind.at(k)); - } - new_shape.push_back(i); - new_shape_array[ind] = new_shape; - ind += 1; - } - } - size_t len_new_shape_array = new_shape_array.size() * (index + 1); - - for (size_t k = 0; k < len_new_shape_array; k++) { - xyz[k] = new_shape_array[k]; - } - index += 1; - } - - for (size_t i = 0; i < res_shape_ndim_sub_1; i++) { - for (size_t j = 0; j < xyz.size(); j++) { - std::vector ind_list = xyz[j]; - if (ind_list.size() == 0) { - continue; - } - else { - std::vector ind_input_{i, i + offset}; - ind_input_.insert(ind_input_.end(), ind_list.begin(), - ind_list.end()); - - std::vector ind_output_ = ind_list; - ind_output_.push_back(i); - - const size_t ind_output_size = ind_output_.size(); - size_t ind_output = 0; - size_t n = 1; - for (size_t k = 0; k < ind_output_size; k++) { - size_t ind = ind_output_size - 1 - k; - ind_output += n * ind_output_[ind]; - n *= res_shape[ind]; - } - - const size_t ind_input_size = ind_input_.size(); - size_t ind_input = 0; - size_t m = 1; - for (size_t k = 0; k < ind_input_size; k++) { - size_t ind = ind_input_size - 1 - k; - ind_input += m * ind_input_[ind]; - m *= shape[ind]; - } - - result[ind_output] = array_1[ind_input]; - } - } - } - } - - return event_ref; -} - -template -void dpnp_diagonal_c(void *array1_in, - const size_t input1_size, - void *result1, - const size_t offset, - shape_elem_type *shape, - shape_elem_type *res_shape, - const size_t res_ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = 
nullptr; - DPCTLSyclEventRef event_ref = dpnp_diagonal_c<_DataType>( - q_ref, array1_in, input1_size, result1, offset, shape, res_shape, - res_ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_diagonal_default_c)(void *, - const size_t, - void *, - const size_t, - shape_elem_type *, - shape_elem_type *, - const size_t) = dpnp_diagonal_c<_DataType>; - -template -DPCTLSyclEventRef - dpnp_fill_diagonal_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *val_in, - shape_elem_type *shape, - const size_t ndim, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - const size_t result_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - if (!(result_size && array1_in)) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, array1_in, result_size, true, - true); - DPNPC_ptr_adapter<_DataType> val_ptr(q_ref, val_in, 1, true); - _DataType *array_1 = result_ptr.get_ptr(); - _DataType *val_arr = val_ptr.get_ptr(); - - shape_elem_type min_shape = shape[0]; - for (size_t i = 0; i < ndim; ++i) { - if (shape[i] < min_shape) { - min_shape = shape[i]; - } - } - - _DataType val = val_arr[0]; - - for (size_t i = 0; i < static_cast(min_shape); ++i) { - size_t ind = 0; - size_t n = 1; - for (size_t k = 0; k < ndim; k++) { - size_t ind_ = ndim - 1 - k; - ind += n * i; - n *= shape[ind_]; - } - array_1[ind] = val; - } - - return event_ref; -} - -template -void dpnp_fill_diagonal_c(void *array1_in, - void *val_in, - shape_elem_type *shape, - const size_t ndim) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_fill_diagonal_c<_DataType>( - q_ref, array1_in, val_in, shape, ndim, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void 
(*dpnp_fill_diagonal_default_c)(void *, - void *, - shape_elem_type *, - const size_t) = - dpnp_fill_diagonal_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_nonzero_c(DPCTLSyclQueueRef q_ref, - const void *in_array1, - void *result1, - const size_t result_size, - const shape_elem_type *shape, - const size_t ndim, - const size_t j, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if ((in_array1 == nullptr) || (result1 == nullptr)) { - return event_ref; - } - - if (ndim == 0) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - const size_t input1_size = std::accumulate( - shape, shape + ndim, 1, std::multiplies()); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, in_array1, input1_size, - true); - DPNPC_ptr_adapter result_ptr(q_ref, result1, result_size, true, true); - const _DataType *arr = input1_ptr.get_ptr(); - long *result = result_ptr.get_ptr(); - - size_t idx = 0; - size_t *ids = new size_t[ndim]; - - for (size_t i = 0; i < input1_size; ++i) { - if (arr[i] != 0) { - size_t ind1 = input1_size; - size_t ind2 = i; - - for (size_t k = 0; k < ndim; ++k) { - ind1 = ind1 / shape[k]; - ids[k] = ind2 / ind1; - ind2 = ind2 % ind1; - } - - result[idx] = ids[j]; - idx += 1; - } - } - delete[] ids; - - return event_ref; -} - -template -void dpnp_nonzero_c(const void *in_array1, - void *result1, - const size_t result_size, - const shape_elem_type *shape, - const size_t ndim, - const size_t j) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_nonzero_c<_DataType>(q_ref, in_array1, result1, result_size, shape, - ndim, j, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_nonzero_default_c)(const void *, - void *, - const size_t, - const shape_elem_type *, - const size_t, 
- const size_t) = dpnp_nonzero_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_place_c(DPCTLSyclQueueRef q_ref, - void *arr_in, - long *mask_in, - void *vals_in, - const size_t arr_size, - const size_t vals_size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if (!arr_size) { - return event_ref; - } - - if (!vals_size) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - - DPNPC_ptr_adapter<_DataType> input1_ptr(q_ref, vals_in, vals_size, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, arr_in, arr_size, true, - true); - _DataType *vals = input1_ptr.get_ptr(); - _DataType *arr = result_ptr.get_ptr(); - - DPNPC_ptr_adapter mask_ptr(q_ref, mask_in, arr_size, true); - long *mask = mask_ptr.get_ptr(); - - size_t counter = 0; - for (size_t i = 0; i < arr_size; ++i) { - if (mask[i]) { - arr[i] = vals[counter % vals_size]; - counter += 1; - } - } - - return event_ref; -} - -template -void dpnp_place_c(void *arr_in, - long *mask_in, - void *vals_in, - const size_t arr_size, - const size_t vals_size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_place_c<_DataType>(q_ref, arr_in, mask_in, vals_in, arr_size, - vals_size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_place_default_c)(void *, - long *, - void *, - const size_t, - const size_t) = dpnp_place_c<_DataType>; - -template -DPCTLSyclEventRef dpnp_put_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - void *ind_in, - void *v_in, - const size_t size, - const size_t size_ind, - const size_t size_v, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - - if ((array1_in == nullptr) || (ind_in == nullptr) || 
(v_in == nullptr)) { - return event_ref; - } - - if (size_v == 0) { - return event_ref; - } - - sycl::queue q = *(reinterpret_cast(q_ref)); - DPNPC_ptr_adapter input1_ptr(q_ref, ind_in, size_ind, true); - DPNPC_ptr_adapter<_DataType> input2_ptr(q_ref, v_in, size_v, true); - DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, array1_in, size, true, true); - size_t *ind = input1_ptr.get_ptr(); - _DataType *v = input2_ptr.get_ptr(); - _DataType *array_1 = result_ptr.get_ptr(); - - for (size_t i = 0; i < size; ++i) { - for (size_t j = 0; j < size_ind; ++j) { - if (i == ind[j] || (i == (size + ind[j]))) { - array_1[i] = v[j % size_v]; - } - } - } - - return event_ref; -} - -template -void dpnp_put_c(void *array1_in, - void *ind_in, - void *v_in, - const size_t size, - const size_t size_ind, - const size_t size_v) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = - dpnp_put_c<_DataType, _IndecesType, _ValueType>( - q_ref, array1_in, ind_in, v_in, size, size_ind, size_v, - dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_put_default_c)(void *, - void *, - void *, - const size_t, - const size_t, - const size_t) = - dpnp_put_c<_DataType, _IndecesType, _ValueType>; - -template -DPCTLSyclEventRef - dpnp_put_along_axis_c(DPCTLSyclQueueRef q_ref, - void *arr_in, - long *indices_in, - void *values_in, - size_t axis, - const shape_elem_type *shape, - size_t ndim, - size_t size_indices, - size_t values_size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - sycl::queue q = *(reinterpret_cast(q_ref)); - - const size_t size_arr = std::accumulate(shape, shape + ndim, 1, - std::multiplies()); - - DPNPC_ptr_adapter input1_ptr(q_ref, indices_in, size_indices, true); - DPNPC_ptr_adapter<_DataType> input2_ptr(q_ref, values_in, values_size, - true); - 
DPNPC_ptr_adapter<_DataType> result_ptr(q_ref, arr_in, size_arr, true, - true); - size_t *indices = input1_ptr.get_ptr(); - _DataType *values = input2_ptr.get_ptr(); - _DataType *arr = result_ptr.get_ptr(); - - if (axis != (ndim - 1)) { - std::vector res_shape; - for (size_t i = 0; i < ndim; i++) { - if (axis != i) { - res_shape.push_back(shape[i]); - } - } - size_t res_ndim = res_shape.size(); - - size_t prod = 1; - for (size_t i = 0; i < res_ndim; ++i) { - if (res_shape[i] != 0) { - prod *= res_shape[i]; - } - } - - size_t *ind_array = new size_t[prod]; - bool *bool_ind_array = new bool[prod]; - for (size_t i = 0; i < prod; ++i) { - bool_ind_array[i] = true; - } - - size_t *arr_shape_offsets = new size_t[ndim]; - size_t acc = 1; - for (size_t i = ndim - 1; i > 0; --i) { - arr_shape_offsets[i] = acc; - acc *= shape[i]; - } - arr_shape_offsets[0] = acc; - - size_t *output_shape_offsets = new size_t[res_ndim]; - acc = 1; - if (res_ndim > 0) { - for (size_t i = res_ndim - 1; i > 0; --i) { - output_shape_offsets[i] = acc; - acc *= res_shape[i]; - } - } - output_shape_offsets[0] = acc; - - size_t size_result = 1; - for (size_t i = 0; i < res_ndim; ++i) { - size_result *= res_shape[i]; - } - - // init result array - size_t *xyz = new size_t[res_ndim]; - for (size_t result_idx = 0; result_idx < size_result; ++result_idx) { - size_t remainder = result_idx; - for (size_t i = 0; i < res_ndim; ++i) { - xyz[i] = remainder / output_shape_offsets[i]; - remainder = remainder - xyz[i] * output_shape_offsets[i]; - } - - // FIXME: computed and unused. 
Commented out per compiler warning - // size_t source_axis[ndim]; - // size_t result_axis_idx = 0; - // for (size_t idx = 0; idx < ndim; ++idx) { - // bool found = false; - // if (axis == idx) { - // found = true; - // } - // if (found) { - // source_axis[idx] = 0; - // } - // else { - // source_axis[idx] = xyz[result_axis_idx]; - // result_axis_idx++; - // } - // } - - // size_t source_idx = 0; - // for (size_t i = 0; i < static_cast(ndim); ++i) - // { - // source_idx += arr_shape_offsets[i] * source_axis[i]; - // } - } - - for (size_t source_idx = 0; source_idx < size_arr; ++source_idx) { - // reconstruct x,y,z from linear source_idx - size_t remainder = source_idx; - for (size_t i = 0; i < ndim; ++i) { - xyz[i] = remainder / arr_shape_offsets[i]; - remainder = remainder - xyz[i] * arr_shape_offsets[i]; - } - - // extract result axis - std::vector result_axis; - for (size_t idx = 0; idx < ndim; ++idx) { - // try to find current idx in axis array - if (axis != idx) { - result_axis.push_back(xyz[idx]); - } - } - - // Construct result offset - size_t result_offset = 0; - for (size_t i = 0; i < res_ndim; ++i) { - result_offset += output_shape_offsets[i] * result_axis[i]; - } - - if (bool_ind_array[result_offset]) { - ind_array[result_offset] = 0; - bool_ind_array[result_offset] = false; - } - else { - ind_array[result_offset] += 1; - } - - if ((ind_array[result_offset] % size_indices) == - indices[result_offset % size_indices]) - { - arr[source_idx] = values[source_idx % values_size]; - } - } - - delete[] ind_array; - delete[] bool_ind_array; - delete[] arr_shape_offsets; - delete[] output_shape_offsets; - delete[] xyz; - } - else { - for (size_t i = 0; i < size_arr; ++i) { - size_t ind = - size_indices * (i / size_indices) + indices[i % size_indices]; - arr[ind] = values[i % values_size]; - } - } - return event_ref; -} - -template -void dpnp_put_along_axis_c(void *arr_in, - long *indices_in, - void *values_in, - size_t axis, - const shape_elem_type *shape, - size_t 
ndim, - size_t size_indices, - size_t values_size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_put_along_axis_c<_DataType>( - q_ref, arr_in, indices_in, values_in, axis, shape, ndim, size_indices, - values_size, dep_event_vec_ref); - DPCTLEvent_WaitAndThrow(event_ref); -} - -template -void (*dpnp_put_along_axis_default_c)(void *, - long *, - void *, - size_t, - const shape_elem_type *, - size_t, - size_t, - size_t) = - dpnp_put_along_axis_c<_DataType>; - -template -class dpnp_take_c_kernel; - -template -DPCTLSyclEventRef dpnp_take_c(DPCTLSyclQueueRef q_ref, - void *array1_in, - const size_t array1_size, - void *indices1, - void *result1, - size_t size, - const DPCTLEventVectorRef dep_event_vec_ref) -{ - // avoid warning unused variable - (void)array1_size; - (void)dep_event_vec_ref; - - DPCTLSyclEventRef event_ref = nullptr; - sycl::queue q = *(reinterpret_cast(q_ref)); - - _DataType *array_1 = reinterpret_cast<_DataType *>(array1_in); - _IndecesType *indices = reinterpret_cast<_IndecesType *>(indices1); - _DataType *result = reinterpret_cast<_DataType *>(result1); - - sycl::range<1> gws(size); - auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { - const size_t idx = global_id[0]; - result[idx] = array_1[indices[idx]]; - }; - - auto kernel_func = [&](sycl::handler &cgh) { - cgh.parallel_for>( - gws, kernel_parallel_for_func); - }; - - sycl::event event = q.submit(kernel_func); - - event_ref = reinterpret_cast(&event); - return DPCTLEvent_Copy(event_ref); -} - -template -void dpnp_take_c(void *array1_in, - const size_t array1_size, - void *indices1, - void *result1, - size_t size) -{ - DPCTLSyclQueueRef q_ref = reinterpret_cast(&DPNP_QUEUE); - DPCTLEventVectorRef dep_event_vec_ref = nullptr; - DPCTLSyclEventRef event_ref = dpnp_take_c<_DataType, _IndecesType>( - q_ref, array1_in, array1_size, indices1, result1, size, - dep_event_vec_ref); - 
DPCTLEvent_WaitAndThrow(event_ref); - DPCTLEvent_Delete(event_ref); -} - -template -void (*dpnp_take_default_c)(void *, const size_t, void *, void *, size_t) = - dpnp_take_c<_DataType, _IndecesType>; - -template -DPCTLSyclEventRef (*dpnp_take_ext_c)(DPCTLSyclQueueRef, - void *, - const size_t, - void *, - void *, - size_t, - const DPCTLEventVectorRef) = - dpnp_take_c<_DataType, _IndecesType>; - void func_map_init_indexing_func(func_map_t &fmap) { fmap[DPNPFuncName::DPNP_FN_CHOOSE][eft_INT][eft_INT] = { @@ -847,85 +160,5 @@ void func_map_init_indexing_func(func_map_t &fmap) eft_FLT, (void *)dpnp_choose_ext_c}; fmap[DPNPFuncName::DPNP_FN_CHOOSE_EXT][eft_LNG][eft_DBL] = { eft_DBL, (void *)dpnp_choose_ext_c}; - - fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_DIAGONAL][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_diagonal_default_c}; - - fmap[DPNPFuncName::DPNP_FN_FILL_DIAGONAL][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_fill_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_FILL_DIAGONAL][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_fill_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_FILL_DIAGONAL][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_fill_diagonal_default_c}; - fmap[DPNPFuncName::DPNP_FN_FILL_DIAGONAL][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_fill_diagonal_default_c}; - - fmap[DPNPFuncName::DPNP_FN_NONZERO][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_nonzero_default_c}; - fmap[DPNPFuncName::DPNP_FN_NONZERO][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_nonzero_default_c}; - fmap[DPNPFuncName::DPNP_FN_NONZERO][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_nonzero_default_c}; - fmap[DPNPFuncName::DPNP_FN_NONZERO][eft_DBL][eft_DBL] = { - eft_DBL, (void 
*)dpnp_nonzero_default_c}; - - fmap[DPNPFuncName::DPNP_FN_PLACE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_place_default_c}; - fmap[DPNPFuncName::DPNP_FN_PLACE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_place_default_c}; - fmap[DPNPFuncName::DPNP_FN_PLACE][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_place_default_c}; - fmap[DPNPFuncName::DPNP_FN_PLACE][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_place_default_c}; - - fmap[DPNPFuncName::DPNP_FN_PUT][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_put_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_put_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_put_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_put_default_c}; - - fmap[DPNPFuncName::DPNP_FN_PUT_ALONG_AXIS][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_put_along_axis_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT_ALONG_AXIS][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_put_along_axis_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT_ALONG_AXIS][eft_FLT][eft_FLT] = { - eft_FLT, (void *)dpnp_put_along_axis_default_c}; - fmap[DPNPFuncName::DPNP_FN_PUT_ALONG_AXIS][eft_DBL][eft_DBL] = { - eft_DBL, (void *)dpnp_put_along_axis_default_c}; - - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_BLN][eft_INT] = { - eft_BLN, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_INT][eft_INT] = { - eft_INT, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_LNG][eft_INT] = { - eft_LNG, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_FLT][eft_INT] = { - eft_FLT, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_DBL][eft_INT] = { - eft_DBL, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_C128][eft_INT] = { - eft_C128, (void *)dpnp_take_default_c, int32_t>}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_BLN][eft_LNG] = { - eft_BLN, (void *)dpnp_take_default_c}; - 
fmap[DPNPFuncName::DPNP_FN_TAKE][eft_INT][eft_LNG] = { - eft_INT, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_LNG][eft_LNG] = { - eft_LNG, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_FLT][eft_LNG] = { - eft_FLT, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_DBL][eft_LNG] = { - eft_DBL, (void *)dpnp_take_default_c}; - fmap[DPNPFuncName::DPNP_FN_TAKE][eft_C128][eft_LNG] = { - eft_C128, (void *)dpnp_take_default_c, int64_t>}; - return; } From 740b08bef1b936e1c450db609fe60b6f679891f5 Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 4 Jul 2024 14:19:38 +0200 Subject: [PATCH 45/49] Update `dpnp.extract` implementation to get rid of limitations for input arguments (#1906) * Remove limitations from dpnp.extract implementation * Add more tests * Tune rtol and atol for a histogram test, since might fail on Windows * Fix a typo in description * Add test to cover condition as list --- doc/reference/sorting.rst | 2 +- dpnp/dpnp_iface_indexing.py | 92 +++++-- tests/skipped_tests.tbl | 5 - tests/skipped_tests_gpu.tbl | 5 - tests/test_histogram.py | 2 +- tests/test_indexing.py | 240 ++++++++++++------ tests/test_sycl_queue.py | 1 + tests/test_usm_type.py | 2 + .../cupy/indexing_tests/test_indexing.py | 47 +++- 9 files changed, 276 insertions(+), 120 deletions(-) diff --git a/doc/reference/sorting.rst b/doc/reference/sorting.rst index d0a966c6731..ead79b1098a 100644 --- a/doc/reference/sorting.rst +++ b/doc/reference/sorting.rst @@ -31,10 +31,10 @@ Searching dpnp.nanargmax dpnp.argmin dpnp.nanargmin + dpnp.argwhere dpnp.nonzero dpnp.flatnonzero dpnp.where - dpnp.argwhere dpnp.searchsorted dpnp.extract diff --git a/dpnp/dpnp_iface_indexing.py b/dpnp/dpnp_iface_indexing.py index 0a1c8529c42..20a046c82c1 100644 --- a/dpnp/dpnp_iface_indexing.py +++ b/dpnp/dpnp_iface_indexing.py @@ -490,42 +490,86 @@ def diagonal(a, offset=0, axis1=0, axis2=1): ) -def 
extract(condition, x): +def extract(condition, a): """ Return the elements of an array that satisfy some condition. + This is equivalent to + ``dpnp.compress(dpnp.ravel(condition), dpnp.ravel(a))``. If `condition` + is boolean, :obj:`dpnp.extract` is equivalent to ``a[condition]``. + + Note that :obj:`dpnp.place` does the exact opposite of :obj:`dpnp.extract`. + For full documentation refer to :obj:`numpy.extract`. + Parameters + ---------- + condition : {array_like, scalar} + An array whose non-zero or ``True`` entries indicate the elements of `a` + to extract. + a : {dpnp_array, usm_ndarray} + Input array of the same size as `condition`. + Returns ------- out : dpnp.ndarray - Rank 1 array of values from `x` where `condition` is True. + Rank 1 array of values from `a` where `condition` is ``True``. + + See Also + -------- + :obj:`dpnp.take` : Take elements from an array along an axis. + :obj:`dpnp.put` : Replaces specified elements of an array with given values. + :obj:`dpnp.copyto` : Copies values from one array to another, broadcasting + as necessary. + :obj:`dpnp.compress` : Return selected slices of an array along a given axis. + :obj:`dpnp.place` : Change elements of an array based on conditional and + input values. + + Examples + -------- + >>> import dpnp as np + >>> a = np.arange(12).reshape((3, 4)) + >>> a + array([[ 0, 1, 2, 3], + [ 4, 5, 6, 7], + [ 8, 9, 10, 11]]) + >>> condition = np.mod(a, 3) == 0 + >>> condition + array([[ True, False, False, True], + [False, False, True, False], + [False, True, False, False]]) + >>> np.extract(condition, a) + array([0, 3, 6, 9]) + + If `condition` is boolean: + + >>> a[condition] + array([0, 3, 6, 9]) - Limitations - ----------- - Parameters `condition` and `x` are supported either as - :class:`dpnp.ndarray` or :class:`dpctl.tensor.usm_ndarray`. - Parameter `x` must be the same shape as `condition`. - Otherwise the function will be executed sequentially on CPU. 
""" - if dpnp.is_supported_array_type(condition) and dpnp.is_supported_array_type( - x - ): - if condition.shape != x.shape: - pass - else: - dpt_condition = ( - condition.get_array() - if isinstance(condition, dpnp_array) - else condition - ) - dpt_array = x.get_array() if isinstance(x, dpnp_array) else x - return dpnp_array._create_from_usm_ndarray( - dpt.extract(dpt_condition, dpt_array) - ) + usm_a = dpnp.get_usm_ndarray(a) + if not dpnp.is_supported_array_type(condition): + usm_cond = dpt.asarray( + condition, usm_type=a.usm_type, sycl_queue=a.sycl_queue + ) + else: + usm_cond = dpnp.get_usm_ndarray(condition) + + if usm_cond.size != usm_a.size: + usm_a = dpt.reshape(usm_a, -1) + usm_cond = dpt.reshape(usm_cond, -1) + + usm_res = dpt.take(usm_a, dpt.nonzero(usm_cond)[0]) + else: + if usm_cond.shape != usm_a.shape: + usm_a = dpt.reshape(usm_a, -1) + usm_cond = dpt.reshape(usm_cond, -1) + + usm_res = dpt.extract(usm_cond, usm_a) - return call_origin(numpy.extract, condition, x) + dpnp.synchronize_array_data(usm_res) + return dpnp_array._create_from_usm_ndarray(usm_res) def fill_diagonal(a, val, wrap=False): diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl index 37285be810f..199566295a3 100644 --- a/tests/skipped_tests.tbl +++ b/tests/skipped_tests.tbl @@ -124,11 +124,6 @@ tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_i tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_invalid_index tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_invalid_order -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_empty_1dim -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_empty_1dim_no_axis -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_no_axis 
-tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_no_bool tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select_1D_choicelist tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select_choicelist_condlist_broadcast diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index 55fd91b0def..26b52190539 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -174,11 +174,6 @@ tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_i tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_invalid_index tests/third_party/cupy/indexing_tests/test_generate.py::TestUnravelIndex::test_invalid_order -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_empty_1dim -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_empty_1dim_no_axis -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_no_axis -tests/third_party/cupy/indexing_tests/test_indexing.py::TestIndexing::test_compress_no_bool tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select_1D_choicelist tests/third_party/cupy/indexing_tests/test_indexing.py::TestSelect::test_select_choicelist_condlist_broadcast diff --git a/tests/test_histogram.py b/tests/test_histogram.py index da58a4ac2f8..0e6e33fd99c 100644 --- a/tests/test_histogram.py +++ b/tests/test_histogram.py @@ -182,7 +182,7 @@ def test_density(self, dtype): result_hist, result_edges = dpnp.histogram(iv, density=True) if numpy.issubdtype(dtype, numpy.inexact): - tol = numpy.finfo(dtype).resolution + tol = 4 * numpy.finfo(dtype).resolution 
assert_allclose(result_hist, expected_hist, rtol=tol, atol=tol) assert_allclose(result_edges, expected_edges, rtol=tol, atol=tol) else: diff --git a/tests/test_indexing.py b/tests/test_indexing.py index f001f994dbd..8b54bc482ce 100644 --- a/tests/test_indexing.py +++ b/tests/test_indexing.py @@ -7,6 +7,7 @@ assert_array_equal, assert_equal, assert_raises, + assert_raises_regex, ) import dpnp @@ -29,6 +30,169 @@ def wrapped(a, axis, **kwargs): return wrapped +class TestDiagonal: + @pytest.mark.parametrize("dtype", get_all_dtypes(no_bool=True)) + @pytest.mark.parametrize("offset", [-3, -1, 0, 1, 3]) + @pytest.mark.parametrize( + "shape", + [(2, 2), (3, 3), (2, 5), (3, 2, 2), (2, 2, 2, 2), (2, 2, 2, 3)], + ids=[ + "(2,2)", + "(3,3)", + "(2,5)", + "(3,2,2)", + "(2,2,2,2)", + "(2,2,2,3)", + ], + ) + def test_diagonal_offset(self, shape, dtype, offset): + a = numpy.arange(numpy.prod(shape), dtype=dtype).reshape(shape) + a_dp = dpnp.array(a) + expected = numpy.diagonal(a, offset) + result = dpnp.diagonal(a_dp, offset) + assert_array_equal(expected, result) + + @pytest.mark.parametrize("dtype", get_all_dtypes(no_bool=True)) + @pytest.mark.parametrize( + "shape, axis_pairs", + [ + ((3, 4), [(0, 1), (1, 0)]), + ((3, 4, 5), [(0, 1), (1, 2), (0, 2)]), + ((4, 3, 5, 2), [(0, 1), (1, 2), (2, 3), (0, 3)]), + ], + ) + def test_diagonal_axes(self, shape, axis_pairs, dtype): + a = numpy.arange(numpy.prod(shape), dtype=dtype).reshape(shape) + a_dp = dpnp.array(a) + for axis1, axis2 in axis_pairs: + expected = numpy.diagonal(a, axis1=axis1, axis2=axis2) + result = dpnp.diagonal(a_dp, axis1=axis1, axis2=axis2) + assert_array_equal(expected, result) + + def test_diagonal_errors(self): + a = dpnp.arange(12).reshape(3, 4) + + # unsupported type + a_np = dpnp.asnumpy(a) + assert_raises(TypeError, dpnp.diagonal, a_np) + + # a.ndim < 2 + a_ndim_1 = a.flatten() + assert_raises(ValueError, dpnp.diagonal, a_ndim_1) + + # unsupported type `offset` + assert_raises(TypeError, dpnp.diagonal, a, 
offset=1.0) + assert_raises(TypeError, dpnp.diagonal, a, offset=[0]) + + # axes are out of bounds + assert_raises(numpy.AxisError, a.diagonal, axis1=0, axis2=5) + assert_raises(numpy.AxisError, a.diagonal, axis1=5, axis2=0) + assert_raises(numpy.AxisError, a.diagonal, axis1=5, axis2=5) + + # same axes + assert_raises(ValueError, a.diagonal, axis1=1, axis2=1) + assert_raises(ValueError, a.diagonal, axis1=1, axis2=-1) + + +class TestExtins: + @pytest.mark.parametrize("a_dt", get_all_dtypes(no_none=True)) + @pytest.mark.parametrize("cond_dt", get_all_dtypes(no_none=True)) + def test_extract_diff_dtypes(self, a_dt, cond_dt): + a = numpy.array([-2, -1, 0, 1, 2, 3], dtype=a_dt) + cond = numpy.array([1, -1, 2, 0, -2, 3], dtype=cond_dt) + ia, icond = dpnp.array(a), dpnp.array(cond) + + result = dpnp.extract(icond, ia) + expected = numpy.extract(cond, a) + assert_array_equal(result, expected) + + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True)) + def test_extract(self, dt): + a = numpy.array([1, 3, 2, 1, 2, 3, 3], dtype=dt) + ia = dpnp.array(a) + + result = dpnp.extract(ia > 1, ia) + expected = numpy.extract(a > 1, a) + assert_array_equal(result, expected) + + @pytest.mark.parametrize("a_dt", get_all_dtypes(no_none=True)) + def test_extract_list_cond(self, a_dt): + a = numpy.array([-2, -1, 0, 1, 2, 3], dtype=a_dt) + cond = [1, -1, 2, 0, -2, 3] + ia = dpnp.array(a) + + result = dpnp.extract(cond, ia) + expected = numpy.extract(cond, a) + assert_array_equal(result, expected) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True)) + def test_place(self, dt): + a = numpy.array([1, 4, 3, 2, 5, 8, 7], dtype=dt) + ia = dpnp.array(a) + + dpnp.place(ia, [0, 1, 0, 1, 0, 1, 0], [2, 4, 6]) + numpy.place(a, [0, 1, 0, 1, 0, 1, 0], [2, 4, 6]) + assert_array_equal(ia, a) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + def test_place_broadcast_vals(self): + a = numpy.array([1, 4, 3, 2, 5, 8, 7]) + ia = 
dpnp.array(a) + + dpnp.place(ia, [1, 0, 1, 0, 1, 0, 1], [8, 9]) + numpy.place(a, [1, 0, 1, 0, 1, 0, 1], [8, 9]) + assert_array_equal(ia, a) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + def test_place_empty_vals(self): + a = numpy.array([1, 4, 3, 2, 5, 8, 7]) + mask = numpy.zeros(7) + ia, imask = dpnp.array(a), dpnp.array(mask) + vals = [] + + dpnp.place(ia, imask, vals) + numpy.place(a, mask, vals) + assert_array_equal(ia, a) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_place_insert_from_empty_vals(self, xp): + a = xp.array([1, 4, 3, 2, 5, 8, 7]) + assert_raises_regex( + ValueError, + "Cannot insert from an empty array", + lambda: xp.place(a, [0, 0, 0, 0, 0, 1, 0], []), + ) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + @pytest.mark.parametrize("xp", [numpy, dpnp]) + def test_place_wrong_array_type(self, xp): + assert_raises(TypeError, xp.place, [1, 2, 3], [True, False], [0, 1]) + + @pytest.mark.usefixtures("allow_fall_back_on_numpy") + @pytest.mark.parametrize("dt", get_all_dtypes(no_none=True)) + def test_both(self, dt): + a = numpy.random.rand(10).astype(dt) + mask = a > 0.5 + ia, imask = dpnp.array(a), dpnp.array(mask) + + result = dpnp.extract(imask, ia) + expected = numpy.extract(mask, a) + assert_array_equal(result, expected) + + ic = dpnp.extract(imask, ia) + c = numpy.extract(mask, a) + assert_array_equal(ic, c) + + dpnp.place(ia, imask, 0) + dpnp.place(ia, imask, ic) + + numpy.place(a, mask, 0) + numpy.place(a, mask, c) + assert_array_equal(ia, a) + + class TestIndexing: def test_ellipsis_index(self): a = dpnp.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) @@ -477,82 +641,6 @@ def test_choose(): assert_array_equal(expected, result) -class TestDiagonal: - @pytest.mark.parametrize("dtype", get_all_dtypes(no_bool=True)) - @pytest.mark.parametrize("offset", [-3, -1, 0, 1, 3]) - @pytest.mark.parametrize( - "shape", - [(2, 2), (3, 3), (2, 5), (3, 2, 2), (2, 2, 2, 2), (2, 
2, 2, 3)], - ids=[ - "(2,2)", - "(3,3)", - "(2,5)", - "(3,2,2)", - "(2,2,2,2)", - "(2,2,2,3)", - ], - ) - def test_diagonal_offset(self, shape, dtype, offset): - a = numpy.arange(numpy.prod(shape), dtype=dtype).reshape(shape) - a_dp = dpnp.array(a) - expected = numpy.diagonal(a, offset) - result = dpnp.diagonal(a_dp, offset) - assert_array_equal(expected, result) - - @pytest.mark.parametrize("dtype", get_all_dtypes(no_bool=True)) - @pytest.mark.parametrize( - "shape, axis_pairs", - [ - ((3, 4), [(0, 1), (1, 0)]), - ((3, 4, 5), [(0, 1), (1, 2), (0, 2)]), - ((4, 3, 5, 2), [(0, 1), (1, 2), (2, 3), (0, 3)]), - ], - ) - def test_diagonal_axes(self, shape, axis_pairs, dtype): - a = numpy.arange(numpy.prod(shape), dtype=dtype).reshape(shape) - a_dp = dpnp.array(a) - for axis1, axis2 in axis_pairs: - expected = numpy.diagonal(a, axis1=axis1, axis2=axis2) - result = dpnp.diagonal(a_dp, axis1=axis1, axis2=axis2) - assert_array_equal(expected, result) - - def test_diagonal_errors(self): - a = dpnp.arange(12).reshape(3, 4) - - # unsupported type - a_np = dpnp.asnumpy(a) - assert_raises(TypeError, dpnp.diagonal, a_np) - - # a.ndim < 2 - a_ndim_1 = a.flatten() - assert_raises(ValueError, dpnp.diagonal, a_ndim_1) - - # unsupported type `offset` - assert_raises(TypeError, dpnp.diagonal, a, offset=1.0) - assert_raises(TypeError, dpnp.diagonal, a, offset=[0]) - - # axes are out of bounds - assert_raises(numpy.AxisError, a.diagonal, axis1=0, axis2=5) - assert_raises(numpy.AxisError, a.diagonal, axis1=5, axis2=0) - assert_raises(numpy.AxisError, a.diagonal, axis1=5, axis2=5) - - # same axes - assert_raises(ValueError, a.diagonal, axis1=1, axis2=1) - assert_raises(ValueError, a.diagonal, axis1=1, axis2=-1) - - -@pytest.mark.parametrize("arr_dtype", get_all_dtypes()) -@pytest.mark.parametrize("cond_dtype", get_all_dtypes()) -def test_extract_1d(arr_dtype, cond_dtype): - a = numpy.array([-2, -1, 0, 1, 2, 3], dtype=arr_dtype) - ia = dpnp.array(a) - cond = numpy.array([1, -1, 2, 0, -2, 3], 
dtype=cond_dtype) - icond = dpnp.array(cond) - expected = numpy.extract(cond, a) - result = dpnp.extract(icond, ia) - assert_array_equal(expected, result) - - @pytest.mark.parametrize("val", [-1, 0, 1], ids=["-1", "0", "1"]) @pytest.mark.parametrize( "array", diff --git a/tests/test_sycl_queue.py b/tests/test_sycl_queue.py index f7c70320dbf..1ea5592ecc4 100644 --- a/tests/test_sycl_queue.py +++ b/tests/test_sycl_queue.py @@ -647,6 +647,7 @@ def test_reduce_hypot(device): pytest.param("dot", [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]), pytest.param("dot", [3, 4, 5], [1, 2, 3]), pytest.param("dot", [3 + 2j, 4 + 1j, 5], [1, 2 + 3j, 3]), + pytest.param("extract", [False, True, True, False], [0, 1, 2, 3]), pytest.param( "floor_divide", [1.0, 2.0, 3.0, 4.0], [2.5, 2.5, 2.5, 2.5] ), diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index 8d43bccd75a..d38acc4a657 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -637,6 +637,8 @@ def test_1in_1out(func, data, usm_type): pytest.param("dot", [3.0, 4.0, 5.0], [1.0, 2.0, 3.0]), pytest.param("dot", [3, 4, 5], [1, 2, 3]), pytest.param("dot", [3 + 2j, 4 + 1j, 5], [1, 2 + 3j, 3]), + # TODO: uncomment once resolved in gh-1723 by dpctl + # pytest.param("extract", [False, True, True, False], [0, 1, 2, 3]), pytest.param("fmax", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), pytest.param("fmin", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), pytest.param("fmod", [5, 3], [2, 2.0]), diff --git a/tests/third_party/cupy/indexing_tests/test_indexing.py b/tests/third_party/cupy/indexing_tests/test_indexing.py index 7d05eedd2c3..6696bc47087 100644 --- a/tests/third_party/cupy/indexing_tests/test_indexing.py +++ b/tests/third_party/cupy/indexing_tests/test_indexing.py @@ -4,6 +4,7 @@ import pytest import dpnp as cupy +from tests.helper import has_support_aspect64 from tests.third_party.cupy import testing @@ -35,8 +36,10 @@ def test_take_no_axis(self, xp): return a.take(b) # see cupy#3017 + # mark slow as NumPy could go OOM on the Windows CI 
+ @testing.slow @testing.for_int_dtypes(no_bool=True) - @testing.numpy_cupy_array_equal() + @testing.numpy_cupy_array_equal(type_check=has_support_aspect64()) def test_take_index_range_overflow(self, xp, dtype): # Skip for too large dimensions if numpy.dtype(dtype) in (numpy.int64, numpy.uint64): @@ -46,7 +49,7 @@ def test_take_index_range_overflow(self, xp, dtype): if dtype in (numpy.int32, numpy.uint32): pytest.skip() iinfo = numpy.iinfo(dtype) - a = xp.broadcast_to(xp.ones(1, dtype=dtype), (iinfo.max + 1,)) + a = xp.broadcast_to(xp.ones(1), (iinfo.max + 1,)) b = xp.array([0], dtype=dtype) return a.take(b) @@ -62,18 +65,21 @@ def test_take_along_axis_none_axis(self, xp): b = testing.shaped_random((30,), xp, dtype="int64", scale=24) return xp.take_along_axis(a, b, axis=None) + @pytest.mark.skip("compress() is not implemented yet") @testing.numpy_cupy_array_equal() def test_compress(self, xp): a = testing.shaped_arange((3, 4, 5), xp) b = xp.array([True, False, True]) return xp.compress(b, a, axis=1) + @pytest.mark.skip("compress() is not implemented yet") @testing.numpy_cupy_array_equal() def test_compress_no_axis(self, xp): a = testing.shaped_arange((3, 4, 5), xp) b = xp.array([True, False, True]) return xp.compress(b, a) + @pytest.mark.skip("compress() is not implemented yet") @testing.for_int_dtypes() @testing.numpy_cupy_array_equal() def test_compress_no_bool(self, xp, dtype): @@ -81,18 +87,34 @@ def test_compress_no_bool(self, xp, dtype): b = testing.shaped_arange((3,), xp, dtype) return xp.compress(b, a, axis=1) + @pytest.mark.skip("compress() is not implemented yet") + @testing.numpy_cupy_array_equal() + def test_compress_overrun_false(self, xp): + a = testing.shaped_arange((3,), xp) + b = xp.array([True, False, True, False, False, False]) + return xp.compress(b, a) + + @pytest.mark.skip("compress() is not implemented yet") @testing.numpy_cupy_array_equal() def test_compress_empty_1dim(self, xp): a = testing.shaped_arange((3, 4, 5), xp) b = xp.array([]) 
return xp.compress(b, a, axis=1) + @pytest.mark.skip("compress() is not implemented yet") @testing.numpy_cupy_array_equal() def test_compress_empty_1dim_no_axis(self, xp): a = testing.shaped_arange((3, 4, 5), xp) b = xp.array([]) return xp.compress(b, a) + @pytest.mark.skip("compress() is not implemented yet") + @testing.numpy_cupy_array_equal() + def test_compress_0dim(self, xp): + a = xp.array(3) + b = xp.array([True]) + return xp.compress(b, a) + @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_diagonal(self, xp, dtype): @@ -162,28 +184,24 @@ def test_extract_no_bool(self, xp, dtype): b = xp.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]], dtype=dtype) return xp.extract(b, a) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.numpy_cupy_array_equal() def test_extract_shape_mismatch(self, xp): a = testing.shaped_arange((2, 3), xp) b = xp.array([[True, False], [True, False], [True, False]]) return xp.extract(b, a) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.numpy_cupy_array_equal() def test_extract_size_mismatch(self, xp): a = testing.shaped_arange((3, 3), xp) b = xp.array([[True, False, True], [False, True, False]]) return xp.extract(b, a) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.numpy_cupy_array_equal() def test_extract_size_mismatch2(self, xp): a = testing.shaped_arange((3, 3), xp) b = xp.array([[True, False, True, False], [False, True, False, True]]) return xp.extract(b, a) - @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.numpy_cupy_array_equal() def test_extract_empty_1dim(self, xp): a = testing.shaped_arange((3, 3), xp) @@ -191,7 +209,6 @@ def test_extract_empty_1dim(self, xp): return xp.extract(b, a) -@pytest.mark.usefixtures("allow_fall_back_on_numpy") class TestChoose(unittest.TestCase): @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() @@ -200,13 +217,15 @@ def test_choose(self, xp, dtype): c = testing.shaped_arange((3, 4), xp, dtype) return a.choose(c) + 
@pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_choose_broadcast(self, xp, dtype): a = xp.array([[1, 0, 1], [0, 1, 0], [1, 0, 1]]) - c = xp.array([-10, 10], dtype=dtype) + c = xp.array([-10, 10]).astype(dtype) return a.choose(c) + @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_choose_broadcast2(self, xp, dtype): @@ -214,6 +233,7 @@ def test_choose_broadcast2(self, xp, dtype): c = testing.shaped_arange((3, 5, 2), xp, dtype) return a.choose(c) + @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_choose_wrap(self, xp, dtype): @@ -221,6 +241,7 @@ def test_choose_wrap(self, xp, dtype): c = testing.shaped_arange((3, 4), xp, dtype) return a.choose(c, mode="wrap") + @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() @testing.numpy_cupy_array_equal() def test_choose_clip(self, xp, dtype): @@ -228,6 +249,7 @@ def test_choose_clip(self, xp, dtype): c = testing.shaped_arange((3, 4), xp, dtype) return a.choose(c, mode="clip") + @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.with_requires("numpy>=1.19") def test_unknown_clip(self): for xp in (numpy, cupy): @@ -236,12 +258,14 @@ def test_unknown_clip(self): with pytest.raises(ValueError): a.choose(c, mode="unknown") + @pytest.mark.usefixtures("allow_fall_back_on_numpy") def test_raise(self): a = cupy.array([2]) c = cupy.array([[0, 1]]) with self.assertRaises(ValueError): a.choose(c) + @pytest.mark.usefixtures("allow_fall_back_on_numpy") @testing.for_all_dtypes() def test_choose_broadcast_fail(self, dtype): for xp in (numpy, cupy): @@ -370,3 +394,10 @@ def test_select_default_scalar(self, dtype): choicelist = [a, b] with pytest.raises(TypeError): cupy.select(condlist, choicelist, [dtype(2)]) + + @pytest.mark.skip("as_strided() is not implemented yet") + 
@testing.numpy_cupy_array_equal() + def test_indexing_overflows(self, xp): + a = xp.arange(2, dtype=xp.int32) + a = xp.lib.stride_tricks.as_strided(a, shape=(2, 2**32), strides=(4, 0)) + return a[xp.array([1]), xp.array([1])] From 05e1bb6366d09b2c230d4433fb1a2f1a627e582d Mon Sep 17 00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Thu, 4 Jul 2024 18:14:42 +0200 Subject: [PATCH 46/49] Rework implementation of `dpnp.fmax` and `dpnp.fmin` functions (#1905) * Implement dpnp.fmax and dpnp.fmin functions * Updated existing tests and added new ones * Removed unused code from cython backend * Removed a reference to original descriptor --- doc/reference/ufunc.rst | 2 + dpnp/backend/extensions/ufunc/CMakeLists.txt | 2 + .../ufunc/elementwise_functions/common.cpp | 4 + .../ufunc/elementwise_functions/fmax.cpp | 137 +++++++ .../ufunc/elementwise_functions/fmax.hpp | 35 ++ .../ufunc/elementwise_functions/fmin.cpp | 137 +++++++ .../ufunc/elementwise_functions/fmin.hpp | 35 ++ dpnp/backend/extensions/vm/CMakeLists.txt | 2 + dpnp/backend/extensions/vm/fmax.cpp | 161 ++++++++ dpnp/backend/extensions/vm/fmax.hpp | 35 ++ dpnp/backend/extensions/vm/fmin.cpp | 161 ++++++++ dpnp/backend/extensions/vm/fmin.hpp | 35 ++ dpnp/backend/extensions/vm/vm_py.cpp | 4 + .../include/dpnp_gen_2arg_3type_tbl.hpp | 22 -- dpnp/backend/include/dpnp_iface_fptr.hpp | 4 - dpnp/backend/kernels/dpnp_krnl_elemwise.cpp | 42 -- .../kernels/elementwise_functions/fmax.hpp | 83 ++++ .../kernels/elementwise_functions/fmin.hpp | 83 ++++ .../kernels/elementwise_functions/fmod.hpp | 3 +- dpnp/dpnp_algo/dpnp_algo.pxd | 11 - dpnp/dpnp_algo/dpnp_algo.pyx | 96 ----- dpnp/dpnp_algo/dpnp_algo_mathematical.pxi | 18 - dpnp/dpnp_iface.py | 8 +- dpnp/dpnp_iface_mathematical.py | 362 ++++++++---------- dpnp/dpnp_utils/dpnp_algo_utils.pxd | 8 - dpnp/dpnp_utils/dpnp_algo_utils.pyx | 78 +--- tests/skipped_tests.tbl | 2 - tests/skipped_tests_gpu.tbl | 2 - tests/test_mathematical.py | 75 ++++ 
tests/test_usm_type.py | 8 +- 30 files changed, 1154 insertions(+), 501 deletions(-) create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmax.cpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmax.hpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmin.cpp create mode 100644 dpnp/backend/extensions/ufunc/elementwise_functions/fmin.hpp create mode 100644 dpnp/backend/extensions/vm/fmax.cpp create mode 100644 dpnp/backend/extensions/vm/fmax.hpp create mode 100644 dpnp/backend/extensions/vm/fmin.cpp create mode 100644 dpnp/backend/extensions/vm/fmin.hpp create mode 100644 dpnp/backend/kernels/elementwise_functions/fmax.hpp create mode 100644 dpnp/backend/kernels/elementwise_functions/fmin.hpp diff --git a/doc/reference/ufunc.rst b/doc/reference/ufunc.rst index 2dffca15e88..a5b64852bd4 100644 --- a/doc/reference/ufunc.rst +++ b/doc/reference/ufunc.rst @@ -105,10 +105,12 @@ Comparison functions dpnp.less_equal dpnp.not_equal dpnp.equal + dpnp.logical_and dpnp.logical_or dpnp.logical_xor dpnp.logical_not + dpnp.maximum dpnp.minimum dpnp.fmax diff --git a/dpnp/backend/extensions/ufunc/CMakeLists.txt b/dpnp/backend/extensions/ufunc/CMakeLists.txt index 1d140b06658..077710cb55c 100644 --- a/dpnp/backend/extensions/ufunc/CMakeLists.txt +++ b/dpnp/backend/extensions/ufunc/CMakeLists.txt @@ -26,6 +26,8 @@ set(_elementwise_sources ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/common.cpp ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fabs.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fmax.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fmin.cpp ${CMAKE_CURRENT_SOURCE_DIR}/elementwise_functions/fmod.cpp ) diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp index b915f9a299a..e4af134f46d 100644 --- a/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp +++ 
b/dpnp/backend/extensions/ufunc/elementwise_functions/common.cpp @@ -26,6 +26,8 @@ #include #include "fabs.hpp" +#include "fmax.hpp" +#include "fmin.hpp" #include "fmod.hpp" namespace py = pybind11; @@ -38,6 +40,8 @@ namespace dpnp::extensions::ufunc void init_elementwise_functions(py::module_ m) { init_fabs(m); + init_fmax(m); + init_fmin(m); init_fmod(m); } } // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.cpp new file mode 100644 index 00000000000..64f68d146be --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.cpp @@ -0,0 +1,137 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include + +#include "dpctl4pybind11.hpp" + +#include "fmax.hpp" +#include "kernels/elementwise_functions/fmax.hpp" +#include "populate.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "kernels/elementwise_functions/maximum.hpp" +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace max_ns = dpctl::tensor::kernels::maximum; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +namespace impl +{ +// Supports the same types table as for maximum function in dpctl +template +using OutputType = max_ns::MaximumOutputType; + +using dpnp::kernels::fmax::FmaxFunctor; + +template +using ContigFunctor = + ew_cmn_ns::BinaryContigFunctor, + vec_sz, + n_vecs, + enable_sg_loadstore>; + +template +using StridedFunctor = + ew_cmn_ns::BinaryStridedFunctor>; + +using 
ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static binary_contig_impl_fn_ptr_t fmax_contig_dispatch_table[td_ns::num_types] + [td_ns::num_types]; +static int fmax_output_typeid_table[td_ns::num_types][td_ns::num_types]; +static binary_strided_impl_fn_ptr_t + fmax_strided_dispatch_table[td_ns::num_types][td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(fmax); +} // namespace impl + +void init_fmax(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + { + impl::populate_fmax_dispatch_tables(); + using impl::fmax_contig_dispatch_table; + using impl::fmax_output_typeid_table; + using impl::fmax_strided_dispatch_table; + + auto fmax_pyapi = [&](const arrayT &src1, const arrayT &src2, + const arrayT &dst, sycl::queue &exec_q, + const event_vecT &depends = {}) { + return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, fmax_output_typeid_table, + fmax_contig_dispatch_table, fmax_strided_dispatch_table, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmax", fmax_pyapi, "", py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("sycl_queue"), + py::arg("depends") = py::list()); + + auto fmax_result_type_pyapi = [&](const py::dtype &dtype1, + const py::dtype &dtype2) { + return py_int::py_binary_ufunc_result_type( + dtype1, dtype2, fmax_output_typeid_table); + }; + m.def("_fmax_result_type", fmax_result_type_pyapi); + } +} +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.hpp new file 
mode 100644 index 00000000000..70d0baac314 --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmax.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +void init_fmax(py::module_ m); +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.cpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.cpp new file mode 100644 index 00000000000..0972ffde922 --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.cpp @@ -0,0 +1,137 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include + +#include "dpctl4pybind11.hpp" + +#include "fmin.hpp" +#include "kernels/elementwise_functions/fmin.hpp" +#include "populate.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "kernels/elementwise_functions/minimum.hpp" +#include "utils/type_dispatch.hpp" + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace min_ns = dpctl::tensor::kernels::minimum; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; + +using ew_cmn_ns::unary_contig_impl_fn_ptr_t; +using ew_cmn_ns::unary_strided_impl_fn_ptr_t; + +namespace impl +{ +// Supports the same types table as for minimum function in dpctl +template +using OutputType = min_ns::MinimumOutputType; + +using dpnp::kernels::fmin::FminFunctor; + +template +using ContigFunctor = + ew_cmn_ns::BinaryContigFunctor, + vec_sz, + n_vecs, + enable_sg_loadstore>; + +template +using StridedFunctor = + ew_cmn_ns::BinaryStridedFunctor>; + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static binary_contig_impl_fn_ptr_t fmin_contig_dispatch_table[td_ns::num_types] + [td_ns::num_types]; +static int fmin_output_typeid_table[td_ns::num_types][td_ns::num_types]; +static binary_strided_impl_fn_ptr_t + fmin_strided_dispatch_table[td_ns::num_types][td_ns::num_types]; + 
+MACRO_POPULATE_DISPATCH_TABLES(fmin); +} // namespace impl + +void init_fmin(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector; + { + impl::populate_fmin_dispatch_tables(); + using impl::fmin_contig_dispatch_table; + using impl::fmin_output_typeid_table; + using impl::fmin_strided_dispatch_table; + + auto fmin_pyapi = [&](const arrayT &src1, const arrayT &src2, + const arrayT &dst, sycl::queue &exec_q, + const event_vecT &depends = {}) { + return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, fmin_output_typeid_table, + fmin_contig_dispatch_table, fmin_strided_dispatch_table, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmin", fmin_pyapi, "", py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("sycl_queue"), + py::arg("depends") = py::list()); + + auto fmin_result_type_pyapi = [&](const py::dtype &dtype1, + const py::dtype &dtype2) { + return py_int::py_binary_ufunc_result_type( + dtype1, dtype2, fmin_output_typeid_table); + }; + m.def("_fmin_result_type", fmin_result_type_pyapi); + } +} +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.hpp b/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.hpp new file mode 100644 index 00000000000..9c2ca9baab3 --- /dev/null +++ b/dpnp/backend/extensions/ufunc/elementwise_functions/fmin.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. 
+// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include <pybind11/pybind11.h> + +namespace py = pybind11; + +namespace dpnp::extensions::ufunc +{ +void init_fmin(py::module_ m); +} // namespace dpnp::extensions::ufunc diff --git a/dpnp/backend/extensions/vm/CMakeLists.txt b/dpnp/backend/extensions/vm/CMakeLists.txt index 0a7646cfc57..159ca57993c 100644 --- a/dpnp/backend/extensions/vm/CMakeLists.txt +++ b/dpnp/backend/extensions/vm/CMakeLists.txt @@ -43,6 +43,8 @@ set(_elementwise_sources ${CMAKE_CURRENT_SOURCE_DIR}/exp2.cpp ${CMAKE_CURRENT_SOURCE_DIR}/expm1.cpp ${CMAKE_CURRENT_SOURCE_DIR}/floor.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/fmax.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/fmin.cpp ${CMAKE_CURRENT_SOURCE_DIR}/fmod.cpp ${CMAKE_CURRENT_SOURCE_DIR}/hypot.cpp ${CMAKE_CURRENT_SOURCE_DIR}/ln.cpp diff --git a/dpnp/backend/extensions/vm/fmax.cpp b/dpnp/backend/extensions/vm/fmax.cpp new file mode 100644 index 00000000000..b711516f679 --- /dev/null +++ b/dpnp/backend/extensions/vm/fmax.cpp @@ -0,0 +1,161 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#include <oneapi/mkl.hpp> +#include <sycl/sycl.hpp> + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "fmax.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::fmax function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. + */ +template <typename T> +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::BinaryTypeMapResultEntry<T, double, T, double, double>, + td_ns::BinaryTypeMapResultEntry<T, float, T, float, float>, + td_ns::DefaultResultEntry<void>>::result_type; +}; + +template <typename T1, typename T2> +static sycl::event fmax_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + py::ssize_t a_offset, + const char *in_b, + py::ssize_t b_offset, + char *out_y, + py::ssize_t out_offset, + const std::vector<sycl::event> &depends) +{ + tu_ns::validate_type_for_device<T1>(exec_q); + tu_ns::validate_type_for_device<T2>(exec_q); + + if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) { + throw std::runtime_error("Array offsets have to be equal to 0"); + } + + std::int64_t n = static_cast<std::int64_t>(in_n); + const T1 *a = reinterpret_cast<const T1 *>(in_a); + const T2 *b = reinterpret_cast<const T2 *>(in_b); + + using resTy = typename OutputType<T1>::value_type; + resTy *y = reinterpret_cast<resTy *>(out_y); + + return mkl_vm::fmax(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing 1st input vector of size n + b, // pointer `b` containing 2nd input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types][td_ns::num_types]; +static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types] + [td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(fmax); +} // namespace impl + +void init_fmax(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector<sycl::event>; + + impl::populate_dispatch_tables(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto fmax_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { +
return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{}, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmax", fmax_pyapi, + "Call `fmax` function from OneMKL VM library to perform element " + "by element computation of the maximum of vector `src1` and " + "vector `src2`, storing the result in vector `dst`", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("depends") = py::list()); + + auto fmax_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst) { + return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst, + output_typeid_vector, + contig_dispatch_vector); + }; + m.def("_mkl_fmax_to_call", fmax_need_to_call_pyapi, + "Check input arguments to answer if `fmax` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/fmax.hpp b/dpnp/backend/extensions/vm/fmax.hpp new file mode 100644 index 00000000000..13d8ccad9ff --- /dev/null +++ b/dpnp/backend/extensions/vm/fmax.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2023-2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include <pybind11/pybind11.h> + +namespace py = pybind11; + +namespace dpnp::extensions::vm +{ +void init_fmax(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/fmin.cpp b/dpnp/backend/extensions/vm/fmin.cpp new file mode 100644 index 00000000000..3b288216c92 --- /dev/null +++ b/dpnp/backend/extensions/vm/fmin.cpp @@ -0,0 +1,161 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#include <oneapi/mkl.hpp> +#include <sycl/sycl.hpp> + +#include "dpctl4pybind11.hpp" + +#include "common.hpp" +#include "fmin.hpp" + +// include a local copy of elementwise common header from dpctl tensor: +// dpctl/tensor/libtensor/source/elementwise_functions/elementwise_functions.hpp +// TODO: replace by including dpctl header once available +#include "../elementwise_functions/elementwise_functions.hpp" + +// dpctl tensor headers +#include "kernels/elementwise_functions/common.hpp" +#include "utils/type_dispatch.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::extensions::vm +{ +namespace ew_cmn_ns = dpctl::tensor::kernels::elementwise_common; +namespace py = pybind11; +namespace py_int = dpnp::extensions::py_internal; +namespace td_ns = dpctl::tensor::type_dispatch; +namespace tu_ns = dpctl::tensor::type_utils; + +namespace impl +{ +// OneMKL namespace with VM functions +namespace mkl_vm = oneapi::mkl::vm; + +/** + * @brief A factory to define pairs of supported types for which + * MKL VM library provides support in oneapi::mkl::vm::fmin function. + * + * @tparam T Type of input vectors `a` and `b` and of result vector `y`. + */ +template <typename T> +struct OutputType +{ + using value_type = typename std::disjunction< + td_ns::BinaryTypeMapResultEntry<T, double, T, double, double>, + td_ns::BinaryTypeMapResultEntry<T, float, T, float, float>, + td_ns::DefaultResultEntry<void>>::result_type; +}; + +template <typename T1, typename T2> +static sycl::event fmin_contig_impl(sycl::queue &exec_q, + std::size_t in_n, + const char *in_a, + py::ssize_t a_offset, + const char *in_b, + py::ssize_t b_offset, + char *out_y, + py::ssize_t out_offset, + const std::vector<sycl::event> &depends) +{ + tu_ns::validate_type_for_device<T1>(exec_q); + tu_ns::validate_type_for_device<T2>(exec_q); + + if ((a_offset != 0) || (b_offset != 0) || (out_offset != 0)) { + throw std::runtime_error("Array offsets have to be equal to 0"); + } + + std::int64_t n = static_cast<std::int64_t>(in_n); + const T1 *a = reinterpret_cast<const T1 *>(in_a); + const T2 *b = reinterpret_cast<const T2 *>(in_b); + + using resTy = typename OutputType<T1>::value_type; + resTy *y = reinterpret_cast<resTy *>(out_y); + + return mkl_vm::fmin(exec_q, + n, // number of elements to be calculated + a, // pointer `a` containing 1st input vector of size n + b, // pointer `b` containing 2nd input vector of size n + y, // pointer `y` to the output vector of size n + depends); +} + +using ew_cmn_ns::binary_contig_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t; +using ew_cmn_ns::binary_strided_impl_fn_ptr_t; + +static int output_typeid_vector[td_ns::num_types][td_ns::num_types]; +static binary_contig_impl_fn_ptr_t contig_dispatch_vector[td_ns::num_types] + [td_ns::num_types]; + +MACRO_POPULATE_DISPATCH_TABLES(fmin); +} // namespace impl + +void init_fmin(py::module_ m) +{ + using arrayT = dpctl::tensor::usm_ndarray; + using event_vecT = std::vector<sycl::event>; + + impl::populate_dispatch_tables(); + using impl::contig_dispatch_vector; + using impl::output_typeid_vector; + + auto fmin_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst, + const event_vecT &depends = {}) { +
return py_int::py_binary_ufunc( + src1, src2, dst, exec_q, depends, output_typeid_vector, + contig_dispatch_vector, + // no support of strided implementation in OneMKL + td_ns::NullPtrTable<impl::binary_strided_impl_fn_ptr_t>{}, + // no support of C-contig row with broadcasting in OneMKL + td_ns::NullPtrTable< + impl:: + binary_contig_matrix_contig_row_broadcast_impl_fn_ptr_t>{}, + td_ns::NullPtrTable< + impl:: + binary_contig_row_contig_matrix_broadcast_impl_fn_ptr_t>{}); + }; + m.def("_fmin", fmin_pyapi, + "Call `fmin` function from OneMKL VM library to perform element " + "by element computation of the minimum of vector `src1` and " + "vector `src2`, storing the result in vector `dst`", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst"), py::arg("depends") = py::list()); + + auto fmin_need_to_call_pyapi = [&](sycl::queue &exec_q, const arrayT &src1, + const arrayT &src2, const arrayT &dst) { + return py_internal::need_to_call_binary_ufunc(exec_q, src1, src2, dst, + output_typeid_vector, + contig_dispatch_vector); + }; + m.def("_mkl_fmin_to_call", fmin_need_to_call_pyapi, + "Check input arguments to answer if `fmin` function from " + "OneMKL VM library can be used", + py::arg("sycl_queue"), py::arg("src1"), py::arg("src2"), + py::arg("dst")); +} +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/fmin.hpp b/dpnp/backend/extensions/vm/fmin.hpp new file mode 100644 index 00000000000..d1eefe5eccb --- /dev/null +++ b/dpnp/backend/extensions/vm/fmin.hpp @@ -0,0 +1,35 @@ +//***************************************************************************** +// Copyright (c) 2023-2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer.
+// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. 
+//***************************************************************************** + +#pragma once + +#include <pybind11/pybind11.h> + +namespace py = pybind11; + +namespace dpnp::extensions::vm +{ +void init_fmin(py::module_ m); +} // namespace dpnp::extensions::vm diff --git a/dpnp/backend/extensions/vm/vm_py.cpp b/dpnp/backend/extensions/vm/vm_py.cpp index b78ae51ddc3..4491703957a 100644 --- a/dpnp/backend/extensions/vm/vm_py.cpp +++ b/dpnp/backend/extensions/vm/vm_py.cpp @@ -46,6 +46,8 @@ #include "exp2.hpp" #include "expm1.hpp" #include "floor.hpp" +#include "fmax.hpp" +#include "fmin.hpp" #include "fmod.hpp" #include "hypot.hpp" #include "ln.hpp" @@ -87,6 +89,8 @@ PYBIND11_MODULE(_vm_impl, m) vm_ns::init_exp2(m); vm_ns::init_expm1(m); vm_ns::init_floor(m); + vm_ns::init_fmax(m); + vm_ns::init_fmin(m); vm_ns::init_fmod(m); vm_ns::init_hypot(m); vm_ns::init_ln(m); diff --git a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp index e5a2c924653..11aed0ebac2 100644 --- a/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp +++ b/dpnp/backend/include/dpnp_gen_2arg_3type_tbl.hpp @@ -103,28 +103,6 @@ #endif -MACRO_2ARG_3TYPES_OP( - dpnp_fmod_c, - dispatch_fmod_op(input1_elem, input2_elem), - dispatch_fmod_op(x1, x2), - MACRO_UNPACK_TYPES(std::int32_t, std::int64_t, float, double), - oneapi::mkl::vm::fmod, - MACRO_UNPACK_TYPES(float, double)) - -MACRO_2ARG_3TYPES_OP(dpnp_maximum_c, - sycl::max(input1_elem, input2_elem), - nullptr, - std::false_type, - oneapi::mkl::vm::fmax, - MACRO_UNPACK_TYPES(float, double)) - -MACRO_2ARG_3TYPES_OP(dpnp_minimum_c, - sycl::min(input1_elem, input2_elem), - nullptr, - std::false_type, - oneapi::mkl::vm::fmin, - MACRO_UNPACK_TYPES(float, double)) - // "multiply" needs to be standalone kernel (not autogenerated) due to complex // algorithm. This is not an element wise.
pytest // "tests/third_party/cupy/creation_tests/test_ranges.py::TestMgrid::test_mgrid3" diff --git a/dpnp/backend/include/dpnp_iface_fptr.hpp b/dpnp/backend/include/dpnp_iface_fptr.hpp index aaaf90c27bb..9f9b7a89143 100644 --- a/dpnp/backend/include/dpnp_iface_fptr.hpp +++ b/dpnp/backend/include/dpnp_iface_fptr.hpp @@ -100,15 +100,11 @@ enum class DPNPFuncName : size_t DPNP_FN_INITVAL_EXT, /**< Used in numpy ones, ones_like, zeros, zeros_like impls */ DPNP_FN_MAX, /**< Used in numpy.max() impl */ - DPNP_FN_MAXIMUM_EXT, /**< Used in numpy.fmax() impl , requires extra - parameters */ DPNP_FN_MEAN, /**< Used in numpy.mean() impl */ DPNP_FN_MEDIAN, /**< Used in numpy.median() impl */ DPNP_FN_MEDIAN_EXT, /**< Used in numpy.median() impl, requires extra parameters */ DPNP_FN_MIN, /**< Used in numpy.min() impl */ - DPNP_FN_MINIMUM_EXT, /**< Used in numpy.fmax() impl, requires extra - parameters */ DPNP_FN_MODF, /**< Used in numpy.modf() impl */ DPNP_FN_MODF_EXT, /**< Used in numpy.modf() impl, requires extra parameters */ diff --git a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp index e3797bd22e6..75413cc5e60 100644 --- a/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_elemwise.cpp @@ -1026,45 +1026,6 @@ static void func_map_init_elemwise_1arg_1type(func_map_t &fmap) #include -template -static void func_map_elemwise_2arg_3type_short_core(func_map_t &fmap) -{ - ((fmap[DPNPFuncName::DPNP_FN_MAXIMUM_EXT][FT1][FTs] = - {get_floating_res_type(), - (void *)dpnp_maximum_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, - func_type_map_t::find_type>, - get_floating_res_type(), - (void *)dpnp_maximum_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, - func_type_map_t::find_type>}), - ...); - ((fmap[DPNPFuncName::DPNP_FN_MINIMUM_EXT][FT1][FTs] = - {get_floating_res_type(), - (void *)dpnp_minimum_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, 
- func_type_map_t::find_type>, - get_floating_res_type(), - (void *)dpnp_minimum_c_ext< - func_type_map_t::find_type()>, - func_type_map_t::find_type, - func_type_map_t::find_type>}), - ...); -} - -template -static void func_map_elemwise_2arg_3type_short_helper(func_map_t &fmap) -{ - ((func_map_elemwise_2arg_3type_short_core(fmap)), ...); -} - static void func_map_init_elemwise_2arg_3type(func_map_t &fmap) { // Used in dpnp_dot_c @@ -1170,9 +1131,6 @@ static void func_map_init_elemwise_2arg_3type(func_map_t &fmap) (void *)dpnp_multiply_c_default< std::complex, std::complex, std::complex>}; - func_map_elemwise_2arg_3type_short_helper(fmap); - return; } diff --git a/dpnp/backend/kernels/elementwise_functions/fmax.hpp b/dpnp/backend/kernels/elementwise_functions/fmax.hpp new file mode 100644 index 00000000000..6b0ebb81ec6 --- /dev/null +++ b/dpnp/backend/kernels/elementwise_functions/fmax.hpp @@ -0,0 +1,83 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include <sycl/sycl.hpp> + +// dpctl tensor headers +#include "utils/math_utils.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::kernels::fmax +{ +namespace mu_ns = dpctl::tensor::math_utils; +namespace tu_ns = dpctl::tensor::type_utils; + +template <typename argT1, typename argT2, typename resT> +struct FmaxFunctor +{ + using supports_sg_loadstore = std::negation< + std::disjunction<tu_ns::is_complex<argT1>, tu_ns::is_complex<argT2>>>; + using supports_vec = + std::conjunction<std::is_same<argT1, argT2>, + std::disjunction<std::is_floating_point<argT1>, + std::is_same<argT1, sycl::half>>>; + + resT operator()(const argT1 &in1, const argT2 &in2) const + { + if constexpr (std::is_integral_v<argT1> && std::is_integral_v<argT2>) { + return in1 >= in2 ? in1 : in2; + } + else if constexpr (tu_ns::is_complex<argT1>::value && + tu_ns::is_complex<argT2>::value) + { + static_assert(std::is_same_v<argT1, argT2>); + + using realT = typename argT1::value_type; + const realT in2r = std::real(in2); + const realT in2i = std::imag(in2); + + if (sycl::isnan(in2r) || sycl::isnan(in2i) || + mu_ns::greater_equal_complex<argT1>(in1, in2)) + { + return in1; + } + return in2; + } + else { + return sycl::fmax(in1, in2); + } + } + + template <int vec_sz> + sycl::vec<resT, vec_sz> + operator()(const sycl::vec<argT1, vec_sz> &in1, + const sycl::vec<argT2, vec_sz> &in2) const + { + return sycl::fmax(in1, in2); + } +}; +} // namespace dpnp::kernels::fmax diff --git a/dpnp/backend/kernels/elementwise_functions/fmin.hpp b/dpnp/backend/kernels/elementwise_functions/fmin.hpp new file mode 100644 index 00000000000..30e4af8884f --- /dev/null +++ b/dpnp/backend/kernels/elementwise_functions/fmin.hpp @@ -0,0 +1,83 @@ +//***************************************************************************** +// Copyright (c) 2024, Intel Corporation +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions are met: +// - Redistributions of source code must retain the above copyright notice, +// this list of conditions and the following disclaimer. +// - Redistributions in binary form must reproduce the above copyright notice, +// this list of conditions and the following disclaimer in the documentation +// and/or other materials provided with the distribution. +// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +// ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +// THE POSSIBILITY OF SUCH DAMAGE. +//***************************************************************************** + +#pragma once + +#include <sycl/sycl.hpp> + +// dpctl tensor headers +#include "utils/math_utils.hpp" +#include "utils/type_utils.hpp" + +namespace dpnp::kernels::fmin +{ +namespace mu_ns = dpctl::tensor::math_utils; +namespace tu_ns = dpctl::tensor::type_utils; + +template <typename argT1, typename argT2, typename resT> +struct FminFunctor +{ + using supports_sg_loadstore = std::negation< + std::disjunction<tu_ns::is_complex<argT1>, tu_ns::is_complex<argT2>>>; + using supports_vec = + std::conjunction<std::is_same<argT1, argT2>, + std::disjunction<std::is_floating_point<argT1>, + std::is_same<argT1, sycl::half>>>; + + resT operator()(const argT1 &in1, const argT2 &in2) const + { + if constexpr (std::is_integral_v<argT1> && std::is_integral_v<argT2>) { + return in1 <= in2 ? in1 : in2; + } + else if constexpr (tu_ns::is_complex<argT1>::value && + tu_ns::is_complex<argT2>::value) + { + static_assert(std::is_same_v<argT1, argT2>); + + using realT = typename argT1::value_type; + const realT in2r = std::real(in2); + const realT in2i = std::imag(in2); + + if (sycl::isnan(in2r) || sycl::isnan(in2i) || + mu_ns::less_equal_complex<argT1>(in1, in2)) + { + return in1; + } + return in2; + } + else { + return sycl::fmin(in1, in2); + } + } + + template <int vec_sz> + sycl::vec<resT, vec_sz> + operator()(const sycl::vec<argT1, vec_sz> &in1, + const sycl::vec<argT2, vec_sz> &in2) const + { + return sycl::fmin(in1, in2); + } +}; +} // namespace dpnp::kernels::fmin diff --git a/dpnp/backend/kernels/elementwise_functions/fmod.hpp b/dpnp/backend/kernels/elementwise_functions/fmod.hpp index e97b257cb06..bf60bd09564 100644 --- a/dpnp/backend/kernels/elementwise_functions/fmod.hpp +++ b/dpnp/backend/kernels/elementwise_functions/fmod.hpp @@ -38,8 +38,7 @@ struct FmodFunctor resT operator()(const argT1 &in1, const argT2 &in2) const { - if constexpr (std::is_integral<argT1>::value && - std::is_integral<argT2>::value) { + if constexpr (std::is_integral_v<argT1> && std::is_integral_v<argT2>) { if (in2 == argT2(0)) { return resT(0); } diff --git a/dpnp/dpnp_algo/dpnp_algo.pxd b/dpnp/dpnp_algo/dpnp_algo.pxd index 0c8bd1134a7..3b5b2383226 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pxd +++ b/dpnp/dpnp_algo/dpnp_algo.pxd @@ -41,9 +41,7 @@ cdef extern from "dpnp_iface_fptr.hpp" namespace "DPNPFuncName": # need this na DPNP_FN_ERF_EXT DPNP_FN_FFT_FFT_EXT DPNP_FN_FFT_RFFT_EXT - DPNP_FN_MAXIMUM_EXT DPNP_FN_MEDIAN_EXT - DPNP_FN_MINIMUM_EXT DPNP_FN_MODF_EXT DPNP_FN_PARTITION_EXT DPNP_FN_RADIANS_EXT @@ -170,15 +168,6 @@ cpdef dpnp_descriptor dpnp_isclose(dpnp_descriptor input1, dpnp_descriptor input double rtol=*, double atol=*, cpp_bool equal_nan=*) -""" -Mathematical functions -""" -cpdef dpnp_descriptor dpnp_fmax(dpnp_descriptor x1_obj, dpnp_descriptor x2_obj, object dtype=*, - dpnp_descriptor out=*, object where=*) -cpdef dpnp_descriptor dpnp_fmin(dpnp_descriptor
x2_obj, object dtype=*, - dpnp_descriptor out=*, object where=*) - - """ Trigonometric functions """ diff --git a/dpnp/dpnp_algo/dpnp_algo.pyx b/dpnp/dpnp_algo/dpnp_algo.pyx index 4c560d50e0b..d304f1d32d3 100644 --- a/dpnp/dpnp_algo/dpnp_algo.pyx +++ b/dpnp/dpnp_algo/dpnp_algo.pyx @@ -219,99 +219,3 @@ cdef utils.dpnp_descriptor call_fptr_1in_1out_strides(DPNPFuncName fptr_name, c_dpctl.DPCTLEvent_Delete(event_ref) return result - - -cdef utils.dpnp_descriptor call_fptr_2in_1out_strides(DPNPFuncName fptr_name, - utils.dpnp_descriptor x1_obj, - utils.dpnp_descriptor x2_obj, - object dtype=None, - utils.dpnp_descriptor out=None, - object where=True, - func_name=None): - - # Convert type (x1_obj.dtype) to C enum DPNPFuncType - cdef DPNPFuncType x1_c_type = dpnp_dtype_to_DPNPFuncType(x1_obj.dtype) - cdef DPNPFuncType x2_c_type = dpnp_dtype_to_DPNPFuncType(x2_obj.dtype) - - # get the FPTR data structure - cdef DPNPFuncData kernel_data = get_dpnp_function_ptr(fptr_name, x1_c_type, x2_c_type) - - result_sycl_device, result_usm_type, result_sycl_queue = utils.get_common_usm_allocation(x1_obj, x2_obj) - - # get FPTR function and return type - cdef (DPNPFuncType, void *) ret_type_and_func = utils.get_ret_type_and_func(kernel_data, - result_sycl_device.has_aspect_fp64) - cdef DPNPFuncType return_type = ret_type_and_func[0] - cdef fptr_2in_1out_strides_t func = < fptr_2in_1out_strides_t > ret_type_and_func[1] - - # Create result array - cdef shape_type_c x1_shape = x1_obj.shape - - cdef shape_type_c x1_strides = utils.strides_to_vector(x1_obj.strides, x1_shape) - cdef shape_type_c x2_shape = x2_obj.shape - cdef shape_type_c x2_strides = utils.strides_to_vector(x2_obj.strides, x2_shape) - - cdef shape_type_c result_shape = utils.get_common_shape(x1_shape, x2_shape) - cdef utils.dpnp_descriptor result - - # check 'out' parameter data - if out is not None: - if out.shape != result_shape: - utils.checker_throw_value_error(func_name, 'out.shape', out.shape, result_shape) - - 
utils.get_common_usm_allocation(x1_obj, out) # check USM allocation is common - - if out is None or out.is_array_overlapped(x1_obj) or out.is_array_overlapped(x2_obj) or not out.match_ctype(return_type): - """ - Create result array with type given by FPTR data. - If 'out' array has another dtype than expected or overlaps a memory from any input array, - we have to create a temporary array and to copy data from the temporary into 'out' array, - once the computation is completed. - Otherwise simultaneously access to the same memory may cause a race condition issue - which will result into undefined behaviour. - """ - is_result_memory_allocated = True - result = utils.create_output_descriptor(result_shape, - return_type, - None, - device=result_sycl_device, - usm_type=result_usm_type, - sycl_queue=result_sycl_queue) - else: - is_result_memory_allocated = False - result = out - - cdef shape_type_c result_strides = utils.strides_to_vector(result.strides, result_shape) - - result_obj = result.get_array() - - cdef c_dpctl.SyclQueue q = < c_dpctl.SyclQueue > result_obj.sycl_queue - cdef c_dpctl.DPCTLSyclQueueRef q_ref = q.get_queue_ref() - - """ Call FPTR function """ - cdef c_dpctl.DPCTLSyclEventRef event_ref = func(q_ref, - result.get_data(), - result.size, - result.ndim, - result_shape.data(), - result_strides.data(), - x1_obj.get_data(), - x1_obj.size, - x1_obj.ndim, - x1_shape.data(), - x1_strides.data(), - x2_obj.get_data(), - x2_obj.size, - x2_obj.ndim, - x2_shape.data(), - x2_strides.data(), - NULL, - NULL) # dep_events_ref) - - with nogil: c_dpctl.DPCTLEvent_WaitAndThrow(event_ref) - c_dpctl.DPCTLEvent_Delete(event_ref) - - if out is not None and is_result_memory_allocated: - return out.get_result_desc(result) - - return result.get_result_desc() diff --git a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi index 28b89ce60a1..84b004856bd 100644 --- a/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi +++ 
b/dpnp/dpnp_algo/dpnp_algo_mathematical.pxi @@ -37,8 +37,6 @@ and the rest of the library __all__ += [ "dpnp_ediff1d", - "dpnp_fmax", - "dpnp_fmin", "dpnp_modf", ] @@ -104,22 +102,6 @@ cpdef utils.dpnp_descriptor dpnp_ediff1d(utils.dpnp_descriptor x1): return result -cpdef utils.dpnp_descriptor dpnp_fmax(utils.dpnp_descriptor x1_obj, - utils.dpnp_descriptor x2_obj, - object dtype=None, - utils.dpnp_descriptor out=None, - object where=True): - return call_fptr_2in_1out_strides(DPNP_FN_MAXIMUM_EXT, x1_obj, x2_obj, dtype, out, where) - - -cpdef utils.dpnp_descriptor dpnp_fmin(utils.dpnp_descriptor x1_obj, - utils.dpnp_descriptor x2_obj, - object dtype=None, - utils.dpnp_descriptor out=None, - object where=True): - return call_fptr_2in_1out_strides(DPNP_FN_MINIMUM_EXT, x1_obj, x2_obj, dtype, out, where) - - cpdef tuple dpnp_modf(utils.dpnp_descriptor x1): """ Convert string type names (array.dtype) to C enum DPNPFuncType """ cdef DPNPFuncType param1_type = dpnp_dtype_to_DPNPFuncType(x1.dtype) diff --git a/dpnp/dpnp_iface.py b/dpnp/dpnp_iface.py index b3103869e8d..3402f7d23a8 100644 --- a/dpnp/dpnp_iface.py +++ b/dpnp/dpnp_iface.py @@ -438,11 +438,6 @@ def get_dpnp_descriptor( if use_origin_backend(): return False - # It's required to keep track of input object if a non-strided copy is - # going to be created. Thus there will be an extra descriptor allocated - # to refer on original input. - orig_desc = None - # If input object is a scalar, it means it was allocated on host memory. # We need to copy it to USM memory according to compute follows data. 
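The removed `call_fptr_2in_1out_strides` helper computed into a temporary whenever writing directly into `out` was unsafe (wrong result dtype, or memory overlapping an input array) and copied the data back into `out` once the kernel finished. A minimal pure-Python sketch of that temporary-and-copy-back pattern (the names here are illustrative, not the dpnp API):

```python
def apply_binary(op, x1, x2, out=None):
    """Apply ``op`` element-wise, computing into a temporary when
    writing straight into ``out`` would be unsafe (illustrative sketch).
    """
    # With strided or broadcast views, writing into a buffer that
    # overlaps an input can clobber elements before they are read,
    # so a temporary is used whenever the buffers may alias.
    unsafe = out is not None and (out is x1 or out is x2)
    result = [None] * len(x1) if (out is None or unsafe) else out
    for i in range(len(x1)):
        result[i] = op(x1[i], x2[i])
    if unsafe:
        out[:] = result  # copy the temporary back into the requested out
        return out
    return result
```

For example, `apply_binary(max, a, b, out=a)` still produces correct results because the computation happens in a scratch list first and is copied back at the end.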
if isscalar(ext_obj): @@ -473,7 +468,6 @@ def get_dpnp_descriptor( ext_obj_offset = 0 if ext_obj.strides != shape_offsets or ext_obj_offset != 0: - orig_desc = dpnp_descriptor(ext_obj) ext_obj = array(ext_obj, order="C") # while dpnp functions are based on DPNP_QUEUE @@ -490,7 +484,7 @@ def get_dpnp_descriptor( if not queue_is_default: ext_obj = array(ext_obj, sycl_queue=default_queue) - dpnp_desc = dpnp_descriptor(ext_obj, orig_desc) + dpnp_desc = dpnp_descriptor(ext_obj) if dpnp_desc.is_valid: # pylint: disable=using-constant-test return dpnp_desc diff --git a/dpnp/dpnp_iface_mathematical.py b/dpnp/dpnp_iface_mathematical.py index 1caf1359be3..51d7f2ceddc 100644 --- a/dpnp/dpnp_iface_mathematical.py +++ b/dpnp/dpnp_iface_mathematical.py @@ -61,8 +61,6 @@ from .backend.extensions.sycl_ext import _sycl_ext_impl from .dpnp_algo import ( dpnp_ediff1d, - dpnp_fmax, - dpnp_fmin, dpnp_modf, ) from .dpnp_algo.dpnp_elementwise_common import ( @@ -1537,232 +1535,174 @@ def ediff1d(x1, to_end=None, to_begin=None): ) -def fmax(x1, x2, /, out=None, *, where=True, dtype=None, subok=True, **kwargs): - """ - Element-wise maximum of array elements. +_FMAX_DOCSTRING = """ +Compares two input arrays `x1` and `x2` and returns a new array containing the +element-wise maxima. - For full documentation refer to :obj:`numpy.fmax`. +If one of the elements being compared is a NaN, then the non-nan element is +returned. If both elements are NaNs then the first is returned. The latter +distinction is important for complex NaNs, which are defined as at least one of +the real or imaginary parts being a NaN. The net effect is that NaNs are +ignored when possible. - Returns - ------- - out : dpnp.ndarray - The maximum of `x1` and `x2`, element-wise, ignoring NaNs. +For full documentation refer to :obj:`numpy.fmax`. 
- Limitations - ----------- - Parameters `x1` and `x2` are supported as either scalar, - :class:`dpnp.ndarray` or :class:`dpctl.tensor.usm_ndarray`, but both `x1` - and `x2` can not be scalars at the same time. - Parameters `where`, `dtype` and `subok` are supported with their default - values. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by real-valued data types. +Parameters +---------- +x1 : {dpnp.ndarray, usm_ndarray, scalar} + First input array, expected to have numeric data type. + Both inputs `x1` and `x2` cannot be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} + Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` cannot be scalars at the same time. +out : {None, dpnp.ndarray, usm_ndarray}, optional + Output array to populate. + Array must have the correct shape and the expected data type. + Default: ``None``. +order : {"C", "F", "A", "K"}, optional + Memory layout of the newly created output array, if parameter `out` is ``None``. + Default: ``"K"``. - See Also - -------- - :obj:`dpnp.maximum` : Element-wise maximum of array elements, propagates - NaNs. - :obj:`dpnp.fmin` : Element-wise minimum of array elements, ignores NaNs. - :obj:`dpnp.max` : The maximum value of an array along a given axis, - propagates NaNs.. - :obj:`dpnp.nanmax` : The maximum value of an array along a given axis, - ignores NaNs. - :obj:`dpnp.minimum` : Element-wise minimum of array elements, propagates - NaNs. - :obj:`dpnp.fmod` : Calculate the element-wise remainder of division. +Returns +------- +out : dpnp.ndarray + An array containing the element-wise maxima. The data type of + the returned array is determined by the Type Promotion Rules.
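`fmax` broadcasts `x1` and `x2` to a common shape before comparing elements; the shape rule is the standard NumPy one, and it is the same logic as the `get_common_shape` helper removed later in this patch. A short sketch of how the common shape can be derived:

```python
def broadcast_shape(shape1, shape2):
    """Compute the common broadcast shape of two shapes, or raise."""
    # Right-align the shorter shape by padding it with leading 1s,
    # e.g. (8, 1, 6, 1) and (7, 1, 5) -> (8, 1, 6, 1) and (1, 7, 1, 5).
    ndim = max(len(shape1), len(shape2))
    s1 = (1,) * (ndim - len(shape1)) + tuple(shape1)
    s2 = (1,) * (ndim - len(shape2)) + tuple(shape2)
    result = []
    for d1, d2 in zip(s1, s2):
        if d1 == d2 or d2 == 1:
            result.append(d1)
        elif d1 == 1:
            result.append(d2)
        else:
            raise ValueError(
                "operands could not be broadcast together with shapes "
                f"{shape1} {shape2}"
            )
    return tuple(result)
```

For instance, shapes `(8, 1, 6, 1)` and `(7, 1, 5)` combine to `(8, 7, 6, 5)`, matching the worked example in the removed Cython helper.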
- Examples - -------- - >>> import dpnp as np - >>> x1 = np.array([2, 3, 4]) - >>> x2 = np.array([1, 5, 2]) - >>> np.fmax(x1, x2) - array([2, 5, 4]) - - >>> x1 = np.eye(2) - >>> x2 = np.array([0.5, 2]) - >>> np.fmax(x1, x2) # broadcasting - array([[1. , 2. ], - [0.5, 2. ]]) - - >>> x1 = np.array([np.nan, 0, np.nan]) - >>> x2 = np.array([0, np.nan, np.nan]) - >>> np.fmax(x1, x2) - array([ 0., 0., nan]) +Limitations +----------- +Parameters `where` and `subok` are supported with their default values. +Keyword argument `kwargs` is currently unsupported. +Otherwise a ``NotImplementedError`` exception will be raised. - """ +See Also +-------- +:obj:`dpnp.fmin` : Element-wise minimum of two arrays, ignores NaNs. +:obj:`dpnp.maximum` : Element-wise maximum of two arrays, propagates NaNs. +:obj:`dpnp.max` : The maximum value of an array along a given axis, propagates NaNs. +:obj:`dpnp.nanmax` : The maximum value of an array along a given axis, ignores NaNs. +:obj:`dpnp.minimum` : Element-wise minimum of two arrays, propagates NaNs. +:obj:`dpnp.min` : The minimum value of an array along a given axis, propagates NaNs. +:obj:`dpnp.nanmin` : The minimum value of an array along a given axis, ignores NaNs. - if kwargs: - pass - elif where is not True: - pass - elif dtype is not None: - pass - elif subok is not True: - pass - elif dpnp.isscalar(x1) and dpnp.isscalar(x2): - # at least either x1 or x2 has to be an array - pass - else: - # get USM type and queue to copy scalar from the host memory - # into a USM allocation - usm_type, queue = ( - get_usm_allocations([x1, x2]) - if dpnp.isscalar(x1) or dpnp.isscalar(x2) - else (None, None) - ) +Notes +----- +The fmax is equivalent to ``dpnp.where(x1 >= x2, x1, x2)`` when neither +`x1` nor `x2` are NaNs, but it is faster and does proper broadcasting.
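The equivalence stated in the fmax Notes above is easy to verify; NumPy is used in this sketch since dpnp mirrors its `fmax` semantics (it is not run against dpnp itself):

```python
import numpy as np

# Without NaNs, fmax agrees with where(x1 >= x2, x1, x2).
a = np.array([2.0, 3.0, 4.0])
b = np.array([1.0, 5.0, 2.0])
assert np.array_equal(np.fmax(a, b), np.where(a >= b, a, b))

# With NaNs the two differ: fmax ignores the NaN operand,
# while maximum propagates it.
x1 = np.array([2.0, np.nan, 1.0])
x2 = np.array([1.0, 3.0, np.nan])
print(np.fmax(x1, x2))     # NaNs ignored where possible
print(np.maximum(x1, x2))  # any NaN operand makes the result NaN
```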
- x1_desc = dpnp.get_dpnp_descriptor( - x1, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - x2_desc = dpnp.get_dpnp_descriptor( - x2, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - if x1_desc and x2_desc: - if out is not None: - if not dpnp.is_supported_array_type(out): - raise TypeError( - "return array must be of supported array type" - ) - out_desc = ( - dpnp.get_dpnp_descriptor( - out, copy_when_nondefault_queue=False - ) - or None - ) - else: - out_desc = None +Examples +-------- +>>> import dpnp as np +>>> x1 = np.array([2, 3, 4]) +>>> x2 = np.array([1, 5, 2]) +>>> np.fmax(x1, x2) +array([2, 5, 4]) - return dpnp_fmax( - x1_desc, x2_desc, dtype=dtype, out=out_desc, where=where - ).get_pyobj() +>>> x1 = np.eye(2) +>>> x2 = np.array([0.5, 2]) +>>> np.fmax(x1, x2) +array([[1. , 2. ], + [0.5, 2. ]]) - return call_origin( - numpy.fmax, x1, x2, dtype=dtype, out=out, where=where, **kwargs - ) +>>> x1 = np.array([np.nan, 0, np.nan]) +>>> x2 = np.array([0, np.nan, np.nan]) +>>> np.fmax(x1, x2) +array([ 0., 0., nan]) +""" +fmax = DPNPBinaryFunc( + "fmax", + ufi._fmax_result_type, + ufi._fmax, + _FMAX_DOCSTRING, + mkl_fn_to_call=vmi._mkl_fmax_to_call, + mkl_impl_fn=vmi._fmax, +) -def fmin(x1, x2, /, out=None, *, where=True, dtype=None, subok=True, **kwargs): - """ - Element-wise minimum of array elements. - For full documentation refer to :obj:`numpy.fmin`. +_FMIN_DOCSTRING = """ +Compares two input arrays `x1` and `x2` and returns a new array containing the +element-wise minima. - Returns - ------- - out : dpnp.ndarray - The minimum of `x1` and `x2`, element-wise, ignoring NaNs. +If one of the elements being compared is a NaN, then the non-nan element is +returned. If both elements are NaNs then the first is returned. 
The latter +distinction is important for complex NaNs, which are defined as at least one of +the real or imaginary parts being a NaN. The net effect is that NaNs are +ignored when possible. - Limitations - ----------- - Parameters `x1` and `x2` are supported as either scalar, - :class:`dpnp.ndarray` or :class:`dpctl.tensor.usm_ndarray`, but both `x1` - and `x2` can not be scalars at the same time. - Parameters `where`, `dtype` and `subok` are supported with their default - values. - Keyword argument `kwargs` is currently unsupported. - Otherwise the function will be executed sequentially on CPU. - Input array data types are limited by real-valued data types. +For full documentation refer to :obj:`numpy.fmin`. - See Also - -------- - :obj:`dpnp.minimum` : Element-wise minimum of array elements, propagates - NaNs. - :obj:`dpnp.fmax` : Element-wise maximum of array elements, ignores NaNs. - :obj:`dpnp.min` : The minimum value of an array along a given axis, - propagates NaNs. - :obj:`dpnp.nanmin` : The minimum value of an array along a given axis, - ignores NaNs. - :obj:`dpnp.maximum` : Element-wise maximum of array elements, propagates - NaNs. - :obj:`dpnp.fmod` : Calculate the element-wise remainder of division. +Parameters +---------- +x1 : {dpnp.ndarray, usm_ndarray, scalar} + First input array, expected to have numeric data type. + Both inputs `x1` and `x2` cannot be scalars at the same time. +x2 : {dpnp.ndarray, usm_ndarray, scalar} + Second input array, also expected to have numeric data type. + Both inputs `x1` and `x2` cannot be scalars at the same time. +out : {None, dpnp.ndarray, usm_ndarray}, optional + Output array to populate. + Array must have the correct shape and the expected data type. + Default: ``None``. +order : {"C", "F", "A", "K"}, optional + Memory layout of the newly created output array, if parameter `out` is ``None``. + Default: ``"K"``.
- Examples - -------- - >>> import dpnp as np - >>> x1 = np.array([2, 3, 4]) - >>> x2 = np.array([1, 5, 2]) - >>> np.fmin(x1, x2) - array([1, 3, 2]) - - >>> x1 = np.eye(2) - >>> x2 = np.array([0.5, 2]) - >>> np.fmin(x1, x2) # broadcasting - array([[0.5, 0. ], - [0. , 1. ]] - - >>> x1 = np.array([np.nan, 0, np.nan]) - >>> x2 = np.array([0, np.nan, np.nan]) - >>> np.fmin(x1, x2) - array([ 0., 0., nan]) +Returns +------- +out : dpnp.ndarray + An array containing the element-wise minima. The data type of + the returned array is determined by the Type Promotion Rules. - """ +Limitations +----------- +Parameters `where` and `subok` are supported with their default values. +Keyword argument `kwargs` is currently unsupported. +Otherwise a ``NotImplementedError`` exception will be raised. - if kwargs: - pass - elif where is not True: - pass - elif dtype is not None: - pass - elif subok is not True: - pass - elif dpnp.isscalar(x1) and dpnp.isscalar(x2): - # at least either x1 or x2 has to be an array - pass - else: - # get USM type and queue to copy scalar from the host memory into - # a USM allocation - usm_type, queue = ( - get_usm_allocations([x1, x2]) - if dpnp.isscalar(x1) or dpnp.isscalar(x2) - else (None, None) - ) +See Also +-------- +:obj:`dpnp.fmax` : Element-wise maximum of two arrays, ignores NaNs. +:obj:`dpnp.minimum` : Element-wise minimum of two arrays, propagates NaNs. +:obj:`dpnp.min` : The minimum value of an array along a given axis, propagates NaNs. +:obj:`dpnp.nanmin` : The minimum value of an array along a given axis, ignores NaNs. +:obj:`dpnp.maximum` : Element-wise maximum of two arrays, propagates NaNs. +:obj:`dpnp.max` : The maximum value of an array along a given axis, propagates NaNs. +:obj:`dpnp.nanmax` : The maximum value of an array along a given axis, ignores NaNs.
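Mirroring the fmin docstring above: a NaN is only returned where both operands are NaN, whereas `minimum` propagates any NaN it sees. A quick NumPy check of that behavior (dpnp follows the same rules):

```python
import numpy as np

a = np.array([np.nan, 0.0, np.nan])
b = np.array([0.0, np.nan, np.nan])

print(np.fmin(a, b))     # NaN survives only where both inputs are NaN
print(np.minimum(a, b))  # any NaN operand makes the result NaN
```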
- x1_desc = dpnp.get_dpnp_descriptor( - x1, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - x2_desc = dpnp.get_dpnp_descriptor( - x2, - copy_when_strides=False, - copy_when_nondefault_queue=False, - alloc_usm_type=usm_type, - alloc_queue=queue, - ) - if x1_desc and x2_desc: - if out is not None: - if not dpnp.is_supported_array_type(out): - raise TypeError( - "return array must be of supported array type" - ) - out_desc = ( - dpnp.get_dpnp_descriptor( - out, copy_when_nondefault_queue=False - ) - or None - ) - else: - out_desc = None +Notes +----- +The fmin is equivalent to ``dpnp.where(x1 <= x2, x1, x2)`` when neither +`x1` nor `x2` are NaNs, but it is faster and does proper broadcasting. - return dpnp_fmin( - x1_desc, x2_desc, dtype=dtype, out=out_desc, where=where - ).get_pyobj() +Examples +-------- +>>> import dpnp as np +>>> x1 = np.array([2, 3, 4]) +>>> x2 = np.array([1, 5, 2]) +>>> np.fmin(x1, x2) +array([1, 3, 2]) - return call_origin( - numpy.fmin, x1, x2, dtype=dtype, out=out, where=where, **kwargs - ) +>>> x1 = np.eye(2) +>>> x2 = np.array([0.5, 2]) +>>> np.fmin(x1, x2) +array([[0.5, 0. ], + [0. , 1. ]]) + +>>> x1 = np.array([np.nan, 0, np.nan]) +>>> x2 = np.array([0, np.nan, np.nan]) +>>> np.fmin(x1, x2) +array([ 0., 0., nan]) +""" + +fmin = DPNPBinaryFunc( + "fmin", + ufi._fmin_result_type, + ufi._fmin, + _FMIN_DOCSTRING, + mkl_fn_to_call=vmi._mkl_fmin_to_call, + mkl_impl_fn=vmi._fmin, +) _FMOD_DOCSTRING = """ @@ -2100,6 +2040,11 @@ def gradient(f, *varargs, axis=None, edge_order=1): Compares two input arrays `x1` and `x2` and returns a new array containing the element-wise maxima. +If one of the elements being compared is a NaN, then that element is returned. +If both elements are NaNs then the first is returned. The latter distinction is +important for complex NaNs, which are defined as at least one of the real or +imaginary parts being a NaN. 
The net effect is that NaNs are propagated. + For full documentation refer to :obj:`numpy.maximum`. Parameters @@ -2175,6 +2120,11 @@ def gradient(f, *varargs, axis=None, edge_order=1): Compares two input arrays `x1` and `x2` and returns a new array containing the element-wise minima. +If one of the elements being compared is a NaN, then that element is returned. +If both elements are NaNs then the first is returned. The latter distinction is +important for complex NaNs, which are defined as at least one of the real or +imaginary parts being a NaN. The net effect is that NaNs are propagated. + For full documentation refer to :obj:`numpy.minimum`. Parameters diff --git a/dpnp/dpnp_utils/dpnp_algo_utils.pxd b/dpnp/dpnp_utils/dpnp_algo_utils.pxd index 4d4272ac9fb..23714b5218c 100644 --- a/dpnp/dpnp_utils/dpnp_algo_utils.pxd +++ b/dpnp/dpnp_utils/dpnp_algo_utils.pxd @@ -91,19 +91,11 @@ cdef class dpnp_descriptor: cdef public: # TODO remove "public" as python accessible attribute object origin_pyobj - dpnp_descriptor origin_desc dict descriptor Py_ssize_t dpnp_descriptor_data_size cpp_bool dpnp_descriptor_is_scalar cdef void * get_data(self) - cdef cpp_bool match_ctype(self, DPNPFuncType ctype) - - -cdef shape_type_c get_common_shape(shape_type_c input1_shape, shape_type_c input2_shape) except * -""" -Calculate common shape from input shapes -""" cdef dpnp_descriptor create_output_descriptor(shape_type_c output_shape, DPNPFuncType c_type, diff --git a/dpnp/dpnp_utils/dpnp_algo_utils.pyx b/dpnp/dpnp_utils/dpnp_algo_utils.pyx index 1e3a793d868..ad9a2f10ff4 100644 --- a/dpnp/dpnp_utils/dpnp_algo_utils.pyx +++ b/dpnp/dpnp_utils/dpnp_algo_utils.pyx @@ -33,8 +33,6 @@ This module contains different helpers and utilities """ import dpctl -import dpctl.tensor._copy_utils as dpt_cu -import dpctl.tensor._tensor_impl as dpt_ti import dpctl.utils as dpu import numpy @@ -381,32 +379,6 @@ cpdef long _get_linear_index(key, tuple shape, int ndim): return li -cdef shape_type_c 
get_common_shape(shape_type_c input1_shape, shape_type_c input2_shape) except *: - cdef shape_type_c input1_shape_orig = input1_shape - cdef shape_type_c input2_shape_orig = input2_shape - cdef shape_type_c result_shape - - # ex (8, 1, 6, 1) and (7, 1, 5) -> (8, 1, 6, 1) and (1, 7, 1, 5) - cdef size_t max_shape_size = max(input1_shape.size(), input2_shape.size()) - input1_shape.insert(input1_shape.begin(), max_shape_size - input1_shape.size(), 1) - input2_shape.insert(input2_shape.begin(), max_shape_size - input2_shape.size(), 1) - - # ex result (8, 7, 6, 5) - for it in range(max_shape_size): - if input1_shape[it] == input2_shape[it]: - result_shape.push_back(input1_shape[it]) - elif input1_shape[it] == 1: - result_shape.push_back(input2_shape[it]) - elif input2_shape[it] == 1: - result_shape.push_back(input1_shape[it]) - else: - err_msg = f"{ERROR_PREFIX} in function get_common_shape(): " - err_msg += f"operands could not be broadcast together with shapes {input1_shape_orig} {input2_shape_orig}" - raise ValueError(err_msg) - - return result_shape - - cdef dpnp_descriptor create_output_descriptor(shape_type_c output_shape, DPNPFuncType c_type, dpnp_descriptor requested_out, @@ -572,10 +544,9 @@ cdef (DPNPFuncType, void *) get_ret_type_and_func(DPNPFuncData kernel_data, cdef class dpnp_descriptor: - def __init__(self, obj, dpnp_descriptor orig_desc=None): + def __init__(self, obj): """ Initialize variables """ self.origin_pyobj = None - self.origin_desc = None self.descriptor = None self.dpnp_descriptor_data_size = 0 self.dpnp_descriptor_is_scalar = True @@ -594,10 +565,6 @@ cdef class dpnp_descriptor: self.origin_pyobj = obj - """ Keep track of a descriptor with original data """ - if orig_desc is not None and orig_desc.is_valid: - self.origin_desc = orig_desc - """ array size calculation """ cdef Py_ssize_t shape_it = 0 self.dpnp_descriptor_data_size = 1 @@ -657,14 +624,6 @@ cdef class dpnp_descriptor: def is_scalar(self): return self.dpnp_descriptor_is_scalar - 
@property - def is_temporary(self): - """ - Non-none descriptor of original data means the current descriptor - holds a temporary allocated data. - """ - return self.origin_desc is not None - @property def data(self): if self.is_valid: @@ -696,15 +655,6 @@ cdef class dpnp_descriptor: return interface_dict - def _copy_array_from(self, other_desc): - """ - Fill array data with usm_ndarray of the same shape from other DPNP descriptor - """ - if not isinstance(other_desc, dpnp_descriptor): - raise TypeError("expected dpnp_descriptor, got {}".format(type(other_desc))) - - dpt_cu._copy_same_shape(self.get_array(), other_desc.get_array()) - def get_pyobj(self): return self.origin_pyobj @@ -718,29 +668,6 @@ cdef class dpnp_descriptor: "expected either dpctl.tensor.usm_ndarray or dpnp.dpnp_array.dpnp_array, got {}" "".format(type(self.origin_pyobj))) - def get_result_desc(self, result_desc=None): - """ - Copy the result data into an original array - """ - if self.is_temporary: - # Original descriptor is not None, so copy the array data into it and return - from_desc = self if result_desc is None else result_desc - self.origin_desc._copy_array_from(from_desc) - return self.origin_desc - elif result_desc is not None: - # A temporary result descriptor was allocated, needs to copy data back into 'out' descriptor - self._copy_array_from(result_desc) - return self - - def is_array_overlapped(self, other_desc): - """ - Check if usm_ndarray overlaps an array from other DPNP descriptor - """ - if not isinstance(other_desc, dpnp_descriptor): - raise TypeError("expected dpnp_descriptor, got {}".format(type(other_desc))) - - return dpt_ti._array_overlap(self.get_array(), other_desc.get_array()) - cdef void * get_data(self): cdef Py_ssize_t item_size = 0 cdef Py_ssize_t elem_offset = 0 @@ -755,9 +682,6 @@ cdef class dpnp_descriptor: return < void * > val - cdef cpp_bool match_ctype(self, DPNPFuncType ctype): - return self.dtype == dpnp_DPNPFuncType_to_dtype(< size_t > ctype) - def 
__bool__(self): return self.is_valid diff --git a/tests/skipped_tests.tbl b/tests/skipped_tests.tbl index 199566295a3..944a4bd122d 100644 --- a/tests/skipped_tests.tbl +++ b/tests/skipped_tests.tbl @@ -222,8 +222,6 @@ tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_ldexp tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_nextafter_combination tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_nextafter_float -tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fmax_nan -tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fmin_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_negative tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_for_old_numpy diff --git a/tests/skipped_tests_gpu.tbl b/tests/skipped_tests_gpu.tbl index 26b52190539..61f981c2b9c 100644 --- a/tests/skipped_tests_gpu.tbl +++ b/tests/skipped_tests_gpu.tbl @@ -273,8 +273,6 @@ tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_ldexp tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_nextafter_combination tests/third_party/cupy/math_tests/test_floating.py::TestFloating::test_nextafter_float -tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fmax_nan -tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_fmin_nan tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_negative tests/third_party/cupy/math_tests/test_misc.py::TestMisc::test_nan_to_num_for_old_numpy diff --git a/tests/test_mathematical.py b/tests/test_mathematical.py index ae2c73748b5..54bc03d0a3f 100644 --- a/tests/test_mathematical.py +++ b/tests/test_mathematical.py @@ -24,6 +24,7 @@ get_float_complex_dtypes, get_float_dtypes, get_integer_dtypes, + has_support_aspect16, 
has_support_aspect64, ) from .test_umath import ( @@ -1953,6 +1954,80 @@ def test_invalid_out(self, out): assert_raises(TypeError, numpy.divide, a.asnumpy(), 2, out) +class TestFmaxFmin: + @pytest.mark.skipif(not has_support_aspect16(), reason="no fp16 support") + @pytest.mark.parametrize("func", ["fmax", "fmin"]) + def test_half(self, func): + a = numpy.array([0, 1, 2, 4, 2], dtype=numpy.float16) + b = numpy.array([-2, 5, 1, 4, 3], dtype=numpy.float16) + c = numpy.array([0, -1, -numpy.inf, numpy.nan, 6], dtype=numpy.float16) + ia, ib, ic = dpnp.array(a), dpnp.array(b), dpnp.array(c) + + result = getattr(dpnp, func)(ia, ib) + expected = getattr(numpy, func)(a, b) + assert_equal(result, expected) + + result = getattr(dpnp, func)(ib, ic) + expected = getattr(numpy, func)(b, c) + assert_equal(result, expected) + + @pytest.mark.parametrize("func", ["fmax", "fmin"]) + @pytest.mark.parametrize("dtype", get_float_dtypes()) + def test_float_nans(self, func, dtype): + a = numpy.array([0, numpy.nan, numpy.nan], dtype=dtype) + b = numpy.array([numpy.nan, 0, numpy.nan], dtype=dtype) + ia, ib = dpnp.array(a), dpnp.array(b) + + result = getattr(dpnp, func)(ia, ib) + expected = getattr(numpy, func)(a, b) + assert_equal(result, expected) + + @pytest.mark.parametrize("func", ["fmax", "fmin"]) + @pytest.mark.parametrize("dtype", get_complex_dtypes()) + @pytest.mark.parametrize( + "nan_val", + [ + complex(numpy.nan, 0), + complex(0, numpy.nan), + complex(numpy.nan, numpy.nan), + ], + ids=["nan+0j", "nanj", "nan+nanj"], + ) + def test_complex_nans(self, func, dtype, nan_val): + a = numpy.array([0, nan_val, nan_val], dtype=dtype) + b = numpy.array([nan_val, 0, nan_val], dtype=dtype) + ia, ib = dpnp.array(a), dpnp.array(b) + + result = getattr(dpnp, func)(ia, ib) + expected = getattr(numpy, func)(a, b) + assert_equal(result, expected) + + @pytest.mark.parametrize("func", ["fmax", "fmin"]) + @pytest.mark.parametrize("dtype", get_float_dtypes(no_float16=False)) + def test_precision(self, 
func, dtype): + dtmin = numpy.finfo(dtype).min + dtmax = numpy.finfo(dtype).max + d1 = dtype(0.1) + d1_next = numpy.nextafter(d1, numpy.inf) + + test_cases = [ + # v1 v2 + (dtmin, -numpy.inf), + (dtmax, -numpy.inf), + (d1, d1_next), + (dtmax, numpy.nan), + ] + + for v1, v2 in test_cases: + a = numpy.array([v1]) + b = numpy.array([v2]) + ia, ib = dpnp.array(a), dpnp.array(b) + + result = getattr(dpnp, func)(ia, ib) + expected = getattr(numpy, func)(a, b) + assert_allclose(result, expected) + + class TestFloorDivide: @pytest.mark.usefixtures("suppress_divide_numpy_warnings") @pytest.mark.parametrize( diff --git a/tests/test_usm_type.py b/tests/test_usm_type.py index d38acc4a657..44311813b18 100644 --- a/tests/test_usm_type.py +++ b/tests/test_usm_type.py @@ -639,8 +639,8 @@ def test_1in_1out(func, data, usm_type): pytest.param("dot", [3 + 2j, 4 + 1j, 5], [1, 2 + 3j, 3]), # TODO: uncomment once resolved in gh-1723 by dpctl # pytest.param("extract", [False, True, True, False], [0, 1, 2, 3]), - pytest.param("fmax", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), - pytest.param("fmin", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), + pytest.param("fmax", [0.0, 1.0, 2.0], [3.0, 4.0, 5.0]), + pytest.param("fmin", [0.0, 1.0, 2.0], [3.0, 4.0, 5.0]), pytest.param("fmod", [5, 3], [2, 2.0]), pytest.param( "gradient", [1, 2, 4, 7, 11, 16], [0.0, 1.0, 1.5, 3.5, 4.0, 6.0] @@ -651,8 +651,8 @@ def test_1in_1out(func, data, usm_type): pytest.param("inner", [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]), pytest.param("kron", [3.0, 4.0, 5.0], [1.0, 2.0]), pytest.param("logaddexp", [[-1, 2, 5, 9]], [[4, -3, 2, -8]]), - pytest.param("maximum", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), - pytest.param("minimum", [[0.0, 1.0, 2.0]], [[3.0, 4.0, 5.0]]), + pytest.param("maximum", [0.0, 1.0, 2.0], [3.0, 4.0, 5.0]), + pytest.param("minimum", [0.0, 1.0, 2.0], [3.0, 4.0, 5.0]), pytest.param("searchsorted", [11, 12, 13, 14, 15], [-10, 20, 12, 13]), pytest.param( "tensordot", From e6cf9d7192cf5ac1c5421b12ed7130963edffaa5 Mon Sep 17 
00:00:00 2001 From: Anton <100830759+antonwolfy@users.noreply.github.com> Date: Sat, 6 Jul 2024 21:46:15 +0200 Subject: [PATCH 47/49] Resolve compilation issues with new DPC++ 2025.0 compiler (#1907) * CL/sycl.hpp is deprecated, use sycl/sycl.hpp * Explicitly include complex header * Use explicit type casting in a function from sycl namespace * Use proper sycl namespace * Use multi_ptr instead of raw pointer in sycl::modf * Applied pre-commit hook for clang-format --- dpnp/backend/examples/example10.cpp | 2 +- dpnp/backend/extensions/lapack/geqrf.hpp | 2 +- dpnp/backend/extensions/lapack/gesv.hpp | 2 +- dpnp/backend/extensions/lapack/gesvd.hpp | 2 +- dpnp/backend/extensions/lapack/getrf.hpp | 2 +- dpnp/backend/extensions/lapack/getri.hpp | 2 +- dpnp/backend/extensions/lapack/getrs.hpp | 2 +- dpnp/backend/extensions/lapack/heevd.hpp | 2 +- dpnp/backend/extensions/lapack/orgqr.hpp | 2 +- dpnp/backend/extensions/lapack/potrf.hpp | 2 +- dpnp/backend/extensions/lapack/syevd.hpp | 2 +- dpnp/backend/extensions/lapack/ungqr.hpp | 2 +- dpnp/backend/extensions/sycl_ext/sum_mean.hpp | 2 +- dpnp/backend/kernels/dpnp_krnl_mathematical.cpp | 16 ++++++++++------ dpnp/backend/src/dpnp_fptr.hpp | 2 +- dpnp/backend/src/dpnp_utils.hpp | 3 ++- dpnp/backend/src/verbose.hpp | 2 +- 17 files changed, 27 insertions(+), 22 deletions(-) diff --git a/dpnp/backend/examples/example10.cpp b/dpnp/backend/examples/example10.cpp index b09ea9b335d..6607bbfd7ab 100644 --- a/dpnp/backend/examples/example10.cpp +++ b/dpnp/backend/examples/example10.cpp @@ -35,8 +35,8 @@ #include #include -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/geqrf.hpp b/dpnp/backend/extensions/lapack/geqrf.hpp index 4ab65286b29..2ef15ba7a89 100644 --- a/dpnp/backend/extensions/lapack/geqrf.hpp +++ b/dpnp/backend/extensions/lapack/geqrf.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/gesv.hpp 
b/dpnp/backend/extensions/lapack/gesv.hpp index 12486fae787..057d839e941 100644 --- a/dpnp/backend/extensions/lapack/gesv.hpp +++ b/dpnp/backend/extensions/lapack/gesv.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/gesvd.hpp b/dpnp/backend/extensions/lapack/gesvd.hpp index 17ebd0edbe7..891a041c89b 100644 --- a/dpnp/backend/extensions/lapack/gesvd.hpp +++ b/dpnp/backend/extensions/lapack/gesvd.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/getrf.hpp b/dpnp/backend/extensions/lapack/getrf.hpp index fee9b209426..cd96f73bb50 100644 --- a/dpnp/backend/extensions/lapack/getrf.hpp +++ b/dpnp/backend/extensions/lapack/getrf.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/getri.hpp b/dpnp/backend/extensions/lapack/getri.hpp index 75e9b16d4ef..870a2936252 100644 --- a/dpnp/backend/extensions/lapack/getri.hpp +++ b/dpnp/backend/extensions/lapack/getri.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/getrs.hpp b/dpnp/backend/extensions/lapack/getrs.hpp index ca78ed8b80d..551c607c1e1 100644 --- a/dpnp/backend/extensions/lapack/getrs.hpp +++ b/dpnp/backend/extensions/lapack/getrs.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/heevd.hpp b/dpnp/backend/extensions/lapack/heevd.hpp index 7b3bfc05d87..3eae78bde24 100644 --- a/dpnp/backend/extensions/lapack/heevd.hpp +++ b/dpnp/backend/extensions/lapack/heevd.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/orgqr.hpp b/dpnp/backend/extensions/lapack/orgqr.hpp index 9cc4f530d03..83b9cdebe62 100644 --- a/dpnp/backend/extensions/lapack/orgqr.hpp +++ b/dpnp/backend/extensions/lapack/orgqr.hpp @@ -25,8 +25,8 @@ #pragma once 
-#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/potrf.hpp b/dpnp/backend/extensions/lapack/potrf.hpp index f0850b3fd98..c377820e1d1 100644 --- a/dpnp/backend/extensions/lapack/potrf.hpp +++ b/dpnp/backend/extensions/lapack/potrf.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/syevd.hpp b/dpnp/backend/extensions/lapack/syevd.hpp index 9dfaba08ae1..1b6750487fd 100644 --- a/dpnp/backend/extensions/lapack/syevd.hpp +++ b/dpnp/backend/extensions/lapack/syevd.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/lapack/ungqr.hpp b/dpnp/backend/extensions/lapack/ungqr.hpp index 1a9b68e94f9..06729e82eee 100644 --- a/dpnp/backend/extensions/lapack/ungqr.hpp +++ b/dpnp/backend/extensions/lapack/ungqr.hpp @@ -25,8 +25,8 @@ #pragma once -#include #include +#include #include diff --git a/dpnp/backend/extensions/sycl_ext/sum_mean.hpp b/dpnp/backend/extensions/sycl_ext/sum_mean.hpp index 5333456b0c7..fe935752b03 100644 --- a/dpnp/backend/extensions/sycl_ext/sum_mean.hpp +++ b/dpnp/backend/extensions/sycl_ext/sum_mean.hpp @@ -26,7 +26,7 @@ #pragma once #include "dispatcher_utils.hpp" -#include +#include #include #include "utils/memory_overlap.hpp" diff --git a/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp b/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp index 44cd91854df..7e358a8a710 100644 --- a/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp +++ b/dpnp/backend/kernels/dpnp_krnl_mathematical.cpp @@ -89,10 +89,10 @@ DPCTLSyclEventRef dpnp_ediff1d_c(DPCTLSyclQueueRef q_ref, _DataType_input *input1_data = input1_ptr.get_ptr(); _DataType_output *result = result_ptr.get_ptr(); - cl::sycl::event event; - cl::sycl::range<1> gws(result_size); + sycl::event event; + sycl::range<1> gws(result_size); - auto kernel_parallel_for_func = [=](cl::sycl::id<1> global_id) { + auto kernel_parallel_for_func = [=](sycl::id<1> global_id) { 
         size_t output_id = global_id[0];
         /*for (size_t i = 0; i < result_size; ++i)*/ {
@@ -101,7 +101,7 @@ DPCTLSyclEventRef dpnp_ediff1d_c(DPCTLSyclQueueRef q_ref,
             result[output_id] = next_elem - curr_elem;
         }
     };
-    auto kernel_func = [&](cl::sycl::handler &cgh) {
+    auto kernel_func = [&](sycl::handler &cgh) {
         cgh.parallel_for<
             class dpnp_ediff1d_c_kernel<_DataType_input, _DataType_output>>(
             gws, kernel_parallel_for_func);
@@ -205,8 +205,12 @@ DPCTLSyclEventRef dpnp_modf_c(DPCTLSyclQueueRef q_ref,
     auto kernel_parallel_for_func = [=](sycl::id<1> global_id) {
         size_t i = global_id[0];
         /*for (size_t i = 0; i < size; ++i)*/ {
-            _DataType_input input_elem1 = array1[i];
-            result2[i] = sycl::modf(double(input_elem1), &result1[i]);
+            double input_elem1 = static_cast<double>(array1[i]);
+            auto res_multi_ptr = sycl::address_space_cast<
+                sycl::access::address_space::global_space,
+                sycl::access::decorated::yes>(&result1[i]);
+
+            result2[i] = sycl::modf(input_elem1, res_multi_ptr);
         }
     };
diff --git a/dpnp/backend/src/dpnp_fptr.hpp b/dpnp/backend/src/dpnp_fptr.hpp
index 73d627812a5..5e07b11542d 100644
--- a/dpnp/backend/src/dpnp_fptr.hpp
+++ b/dpnp/backend/src/dpnp_fptr.hpp
@@ -35,7 +35,7 @@
 #include
 #include
-#include <CL/sycl.hpp>
+#include <sycl/sycl.hpp>
 #include
diff --git a/dpnp/backend/src/dpnp_utils.hpp b/dpnp/backend/src/dpnp_utils.hpp
index 88e993a0a20..89b8a733153 100644
--- a/dpnp/backend/src/dpnp_utils.hpp
+++ b/dpnp/backend/src/dpnp_utils.hpp
@@ -29,10 +29,11 @@
 #include
 #include
+#include
 #include
 #include
-#include <CL/sycl.hpp>
+#include <sycl/sycl.hpp>
 #include
diff --git a/dpnp/backend/src/verbose.hpp b/dpnp/backend/src/verbose.hpp
index ae67dbe56fa..20a106ced3e 100644
--- a/dpnp/backend/src/verbose.hpp
+++ b/dpnp/backend/src/verbose.hpp
@@ -27,7 +27,7 @@
 #ifndef VERBOSE_H // Cython compatibility
 #define VERBOSE_H
-#include <CL/sycl.hpp>
+#include <sycl/sycl.hpp>
 
 bool is_verbose_mode();
 void set_barrier_event(sycl::queue queue, std::vector<sycl::event> &depends);

From 637b4c56fc9639f3e73cc1c384f6c44ef7360c21 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]"
 <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sat, 6 Jul 2024 22:47:30 +0200
Subject: [PATCH 48/49] Bump actions/upload-artifact from 4.3.3 to 4.3.4 (#1911)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.3 to 4.3.4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/65462800fd760344b1a7b4382951275a0abb4808...0b2256b8c012f0828dc542b3febcab082c67f72b)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 .github/workflows/conda-package.yml     | 4 ++--
 .github/workflows/openssf-scorecard.yml | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml
index c67b7487429..29ba04a40b6 100644
--- a/.github/workflows/conda-package.yml
+++ b/.github/workflows/conda-package.yml
@@ -141,13 +141,13 @@ jobs:
         run: conda build --no-test --python ${{ matrix.python }} --numpy 1.24 ${{ env.CHANNELS }} conda-recipe
 
       - name: Upload artifact
-        uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3
+        uses: actions/upload-artifact@0b2256b8c012f0828dc542b3febcab082c67f72b # v4.3.4
         with:
           name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Python ${{ matrix.python }}
           path: ${{ env.CONDA_BLD }}${{ env.PACKAGE_NAME }}-*.tar.bz2
 
       - name: Upload wheels artifact
-        uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3
+        uses: actions/upload-artifact@0b2256b8c012f0828dc542b3febcab082c67f72b # v4.3.4
         with:
           name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Wheels Python ${{ matrix.python }}
           path: ${{ env.WHEELS_OUTPUT_FOLDER }}${{ env.PACKAGE_NAME }}-*.whl
diff --git
a/.github/workflows/openssf-scorecard.yml b/.github/workflows/openssf-scorecard.yml
index 9658c7e3b2f..df1ff4f9590 100644
--- a/.github/workflows/openssf-scorecard.yml
+++ b/.github/workflows/openssf-scorecard.yml
@@ -60,7 +60,7 @@ jobs:
       # Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
       # format to the repository Actions tab.
       - name: "Upload artifact"
-        uses: actions/upload-artifact@65462800fd760344b1a7b4382951275a0abb4808 # v4.3.3
+        uses: actions/upload-artifact@0b2256b8c012f0828dc542b3febcab082c67f72b # v4.3.4
         with:
           name: SARIF file
           path: results.sarif

From b64442b314b3b24f4a5a1f27fc1fb08b2db695d9 Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sun, 7 Jul 2024 13:13:57 +0200
Subject: [PATCH 49/49] Bump actions/download-artifact from 4.1.7 to 4.1.8 (#1910)

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4.1.7 to 4.1.8.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/65a9edc5881444af0b9093a5e628f2fe47ea3b2e...fa0a91b85d4f404e444e00e005971372dc801d16)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
---
 .github/workflows/conda-package.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/conda-package.yml b/.github/workflows/conda-package.yml
index 29ba04a40b6..d1e2ebf5492 100644
--- a/.github/workflows/conda-package.yml
+++ b/.github/workflows/conda-package.yml
@@ -180,7 +180,7 @@ jobs:
     steps:
       - name: Download artifact
-        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Python ${{ matrix.python }}
          path: ${{ env.pkg-path-in-channel }}
@@ -301,7 +301,7 @@ jobs:
     steps:
       - name: Download artifact
-        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Python ${{ matrix.python }}
          path: ${{ env.pkg-path-in-channel }}
@@ -453,12 +453,12 @@ jobs:
     steps:
       - name: Download artifact
-        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Python ${{ matrix.python }}

      - name: Download wheels artifact
-        uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          name: ${{ env.PACKAGE_NAME }} ${{ runner.os }} Wheels Python ${{ matrix.python }}