Feature/reproducible #1446

Open

wants to merge 40 commits into develop

Commits (40)
All 40 commits are authored by maddyscientist.

ccb1c73 (Jul 18, 2023) Fix compiler warning with dbldble
2049be6 (Jul 19, 2023) Add array copy assignment from one type of array to another
81566c8 (Jul 19, 2023) Remove use of zero function and fix caxpyxmazMR functor for when the …
ce5d396 (Jul 19, 2023) Make math_helper.cuh safe to include in non CUDA-aware compiler
7a4e04f (Jul 19, 2023) Add doubledouble support for host, add complex-number support, remove…
2d67d97 (Jul 20, 2023) Modify reduction kernels to use device_reduce_t and not double for in…
feccf89 (Jul 20, 2023) Use same underlying reduction type on host as device
d70303a (Jul 20, 2023) Move get_scalar<deviation_t> overload to float_Vector.h
4a7061a (Jul 20, 2023) Add *= and /= overloads for doubledouble
7e40280 (Jul 20, 2023) Fix heavy quark residual norm for non-double reduction type
2a80b2f (Jul 20, 2023) Add various functions to doubledouble needed for generic deployment
e9089e1 (Jul 20, 2023) Commence the slog that generalizes the host-side scalar precision: intr…
a4e8f76 (Jul 21, 2023) Add isfinite method for doubledouble
a7cc5f7 (Jul 22, 2023) 99% of double -> real_t replacement now done (MPI reductions not yet …
008c632 (Jul 22, 2023) Updated ReduceArg::complete function to work when real_t and device_r…
dc62b01 (Jul 22, 2023) Remove some legacy code
3324b05 (Jul 23, 2023) Fix some issues
a16ff6c (Jul 25, 2023) Add missing cast operator to deviation_t::operator= when copying from…
2b5bac8 (Jul 25, 2023) Add ostream << overload for doubledouble type
9d69abd (Jul 25, 2023) Update CUDA block_reduce_helper.h atomic types to work with doubledou…
7e21a5b (Jul 25, 2023) Add support for doubledouble heterogeneous-atomic reductions. This a…
d5f914d (Jul 25, 2023) transform_reduce now respects device_reduce_t and real_t
1a73132 (Jul 25, 2023) Add initial support for multi-process doubledouble reductions: only Q…
d76e57c (Jul 25, 2023) Multi-process reduction now uses device_reduce_t with the conversion …
27ba8de (Jul 25, 2023) Updates for blas_test: use same basis for host and device to allow fo…
4b5aa52 (Jul 26, 2023) Minor comment clean up
bcde6ad (Jul 28, 2023) Add single gpu support for doubledouble
2ee73a6 (Jul 28, 2023) Small fix for doubledouble::operator>
9789820 (Aug 15, 2023) Initial version of reproducible reductions, fully works but a few lim…
67514d0 (Aug 15, 2023) Merge branch 'feature/gaugefield_unity' of github.com:lattice/quda in…
d455000 (Aug 15, 2023) Fix io_test when not all precisions are compiled
030836d (Aug 15, 2023) Fix compiler warning
08b9776 (Aug 15, 2023) Re-enable explicit zero support with rfa_t (fixes dilution_test)
64ed607 (Aug 15, 2023) Fix gauge loop trace when using doubledouble precision reductions
ba96720 (Aug 15, 2023) Fix doubledouble multi-GPU compilation (missing comm_allreduce_max fu…
b7687b4 (Aug 15, 2023) Fix gauge_path_test loop trace test when using doubledouble reduction…
bc74e7b (Aug 23, 2023) Rework of reproducible reductions to pre-compute the bins when initia… (sketched below)
6a60bc3 (Aug 23, 2023) Minor optimization of det_trace kernel
a8085dc (Aug 23, 2023) Fix compiler warning
a413153 (Aug 23, 2023) Merge branch 'feature/gaugefield_unity' of github.com:lattice/quda in…
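The core of the branch (commits 9789820 and bc74e7b) is the reproducible reduction itself: each value is deposited exactly into a set of pre-computed, fixed-exponent bins, so partial sums combine exactly and the final answer is bit-identical no matter how the work is split across threads, blocks, or ranks. The sketch below illustrates that binned-accumulator idea in self-contained C++; the BinnedSum name, bin layout, and bias are invented here for exposition and are not QUDA's rfa_t.

// Illustrative sketch of order-invariant binned summation (hypothetical,
// not QUDA's rfa_t). Finite inputs assumed; each double is split exactly
// across fixed-width exponent bins held as integers, so deposits are
// associative and any summation order yields identical bin state.
#include <cmath>
#include <cstdint>

struct BinnedSum {
  // 32-bit-wide bins spanning double's exponent range. Each deposit adds
  // less than 2^33 to a bin, so ~2^30 deposits fit in int64 without overflow.
  static constexpr int num_bins = 70;
  static constexpr int bias = 1128; // smallest subnormal lands in bin 0
  std::int64_t bins[num_bins] = {};

  // Deposit x exactly: the 53-bit significand spreads over three bins.
  void add(double x)
  {
    if (x == 0.0) return;
    int e;
    double m = std::frexp(x, &e);                     // x = m * 2^e, 0.5 <= |m| < 1
    std::int64_t q = std::llround(std::ldexp(m, 53)); // exact integer significand
    int B = (e - 53) + bias;                          // biased exponent of the ulp
    int i = B >> 5, s = B & 31;                       // target bin, shift within it
    std::int64_t sign = q < 0 ? -1 : 1, a = sign * q;
    std::int64_t lo = (a & 0xFFFFFFFF) << s;          // low 32 significand bits, shifted
    std::int64_t hi = (a >> 32) << s;                 // high significand bits, shifted
    bins[i]     += sign * (lo & 0xFFFFFFFF);
    bins[i + 1] += sign * (((lo >> 32) & 0xFFFFFFFF) + (hi & 0xFFFFFFFF));
    bins[i + 2] += sign * (hi >> 32);
  }

  // Partial sums combine exactly: bin-wise integer addition.
  void merge(const BinnedSum &other)
  {
    for (int i = 0; i < num_bins; i++) bins[i] += other.bins[i];
  }

  // Fixed-order conversion back to double: rounds, but deterministically.
  double result() const
  {
    double s = 0.0;
    for (int i = num_bins - 1; i >= 0; --i)
      s += std::ldexp(static_cast<double>(bins[i]), 32 * i - bias);
    return s;
  }
};

Because add() deposits each value exactly and merge() is plain integer addition, accumulation is associative and commutative; the only rounding happens in the final fixed-order result() conversion, which is deterministic, hence the reproducibility.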
6 changes: 6 additions & 0 deletions include/array.h
@@ -24,6 +24,12 @@ namespace quda

   array<T, n> &operator=(const array<T, n> &) = default;
   array<T, n> &operator=(array<T, n> &&) = default;
+
+  template <typename U> constexpr array<T, n> &operator=(const array<U, n> &other)
+  {
+    for (int i = 0; i < n; i++) data[i] = other[i];
+    return *this;
+  }
 };
 
 template <typename T, int n> std::ostream &operator<<(std::ostream &output, const array<T, n> &a)
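The converting assignment added above (commit 2049be6) permits element-wise copies between arrays of the same length but different value types, as needed when results move between reduction precisions (e.g. a device_reduce_t array into a real_t array). A hypothetical usage sketch, assuming quda::array's usual operator[]:

// Hypothetical usage of the new converting operator= from include/array.h.
#include "array.h" // quda::array<T, n>

void convert_example()
{
  quda::array<float, 4> src;
  for (int i = 0; i < 4; i++) src[i] = i + 1.0f;

  quda::array<double, 4> dst;
  dst = src; // element-wise conversion, float -> double, via data[i] = other[i]
}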
42 changes: 0 additions & 42 deletions include/blas_helper.cuh
@@ -23,48 +23,6 @@ namespace quda
 static constexpr bool V = V_;
 };
 
-__host__ __device__ inline double set(double &x) { return x; }
-__host__ __device__ inline double2 set(double2 &x) { return x; }
-__host__ __device__ inline double3 set(double3 &x) { return x; }
-__host__ __device__ inline double4 set(double4 &x) { return x; }
-__host__ __device__ inline void sum(double &a, double &b) { a += b; }
-__host__ __device__ inline void sum(double2 &a, double2 &b)
-{
-  a.x += b.x;
-  a.y += b.y;
-}
-__host__ __device__ inline void sum(double3 &a, double3 &b)
-{
-  a.x += b.x;
-  a.y += b.y;
-  a.z += b.z;
-}
-__host__ __device__ inline void sum(double4 &a, double4 &b)
-{
-  a.x += b.x;
-  a.y += b.y;
-  a.z += b.z;
-  a.w += b.w;
-}
-
-#ifdef QUAD_SUM
-__host__ __device__ inline double set(doubledouble &a) { return a.head(); }
-__host__ __device__ inline double2 set(doubledouble2 &a) { return make_double2(a.x.head(), a.y.head()); }
-__host__ __device__ inline double3 set(doubledouble3 &a) { return make_double3(a.x.head(), a.y.head(), a.z.head()); }
-__host__ __device__ inline void sum(double &a, doubledouble &b) { a += b.head(); }
-__host__ __device__ inline void sum(double2 &a, doubledouble2 &b)
-{
-  a.x += b.x.head();
-  a.y += b.y.head();
-}
-__host__ __device__ inline void sum(double3 &a, doubledouble3 &b)
-{
-  a.x += b.x.head();
-  a.y += b.y.head();
-  a.z += b.z.head();
-}
-#endif
-
 // Vector types used for AoS load-store on CPU
 template <> struct VectorType<double, 24> {
   using type = array<double, 24>;
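For context, the doubledouble type whose head() these deleted helpers truncated stores a value as an unevaluated pair head + tail, giving roughly 106 significand bits. Below is a minimal sketch of the standard building blocks (Knuth's two-sum plus a renormalizing add, as in the qd library); it is illustrative only, not QUDA's implementation, and it assumes the compiler does not reassociate or contract these expressions (i.e. no -ffast-math).

// Illustrative double-double arithmetic sketch (not QUDA's doubledouble).
struct dd {
  double head, tail; // value = head + tail, with |tail| <= 0.5 * ulp(head)
};

// Knuth two-sum: a + b = s + err exactly, no assumption on |a| vs |b|.
inline dd two_sum(double a, double b)
{
  double s = a + b;
  double bb = s - a;                     // part of b absorbed into s
  return {s, (a - (s - bb)) + (b - bb)}; // exact rounding error of a + b
}

// Fast two-sum: same result, but requires |a| >= |b|.
inline dd quick_two_sum(double a, double b)
{
  double s = a + b;
  return {s, b - (s - a)};
}

// Double-double addition: two doubles carry ~106 significand bits, which is
// why accumulating a reduction this way loses far less than plain double.
inline dd add(const dd &x, const dd &y)
{
  dd s = two_sum(x.head, y.head);
  return quick_two_sum(s.head, s.tail + x.tail + y.tail);
}

With the reduction pipeline now carried end-to-end in device_reduce_t, the double-truncating set()/sum() overloads have no remaining callers, which is consistent with this hunk being pure deletion (0 additions, 42 deletions).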