datatype: map builtin MPI datatypes to internal types #7264

Open
wants to merge 42 commits into
base: main

hzhou
Contributor

@hzhou hzhou commented Jan 15, 2025

Pull Request Description

Many MPI datatypes are redundant, for example MPI_INT, MPI_INT32_T, and MPI_INTEGER. The new ABI proposal requires setting some of these types at runtime, for example via MPI_Abi_set_fortran_info. In this PR, we create a set of fixed-size internal datatypes and map all external builtin types to internal ones. This allows any builtin type to be set and reset at runtime, while internally we only need to support a fixed set.

[image: diagram relating MPI datatypes, C types, internal types, and handle bits]

  • MPI datatypes are defined by the MPI Forum. They mix alias types, different languages, and unnamed types.
  • C types are defined by the C language. They include aliases, mixed availability, and compiler variations.
  • Internal types are defined by MPICH. They are stable and deterministic, and can be used in switch cases.
  • The handle bits work with the current bit logic for builtin and type size, and also contain an index to the user-input type.
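
To make the last point concrete, here is a minimal sketch of a handle that carries a builtin tag, a type size, and an index back to the user-facing type. The tag value, the field widths, and every `sketch_` name are assumptions made for illustration; they are not taken from the PR and may differ from MPICH's real encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 *   bits 31..16  fixed tag marking a builtin datatype handle
 *   bits 15..8   type size in bytes
 *   bits  7..0   index of the user-facing (external) type
 */
#define SKETCH_BUILTIN_TAG 0x4c000000u
#define SKETCH_TAG_MASK    0xffff0000u

/* Pack a size and an index into one handle. */
static inline uint32_t sketch_make_handle(unsigned size, unsigned index)
{
    assert(size < 256 && index < 256);
    return SKETCH_BUILTIN_TAG | ((uint32_t) size << 8) | (uint32_t) index;
}

/* The "is this a builtin?" test only looks at the tag bits. */
static inline int sketch_is_builtin(uint32_t handle)
{
    return (handle & SKETCH_TAG_MASK) == SKETCH_BUILTIN_TAG;
}

/* The size can be read without any table lookup. */
static inline unsigned sketch_size(uint32_t handle)
{
    return (handle >> 8) & 0xffu;
}

/* The index recovers which external type the user passed in. */
static inline unsigned sketch_index(uint32_t handle)
{
    return handle & 0xffu;
}
```

With a layout like this, two aliases of the same internal type can share size bits while still carrying distinct indices for error reporting.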

Discussion

  • For communication routines, such as MPI_Send and MPI_Bcast, we can replace builtin datatypes with MPIR_FIXED#, since all we care about is the data size.
  • For reduction routines, such as MPI_Reduce and MPI_Accumulate, we can replace builtin datatypes with internal types (but not MPIR_FIXED#).
  • For datatype creation routines, I think we can replace all builtin "oldtype" with MPIR_FIXED#. We don't need to worry about reduction ops because we'll always rely on the user op for them.
  • NOTE: once we convert to internal types, we can no longer perform strict type-matching validation, such as catching MPI_INT matched against MPI_FLOAT. But we don't perform such validation today anyway; it is extra overhead that we can't afford. In principle, we could perform strict type matching under e.g. --enable-error-checking=2.
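
The split between communication and reduction paths could be sketched roughly as follows. All enum and function names here are hypothetical stand-ins rather than MPICH symbols, and the initial table assumes a platform where C int and the default Fortran INTEGER are both 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for user-facing builtin types. */
typedef enum { EXT_INT, EXT_INT32_T, EXT_INTEGER, EXT_FLOAT, EXT_COUNT } ext_type_t;

/* Hypothetical internal types with fixed size and fixed semantics. */
typedef enum { IT_NULL, IT_INT32, IT_INT64, IT_FLOAT32 } internal_type_t;

/* Mapping table that can be rewritten at runtime. */
static internal_type_t ext_map[EXT_COUNT] = {
    [EXT_INT] = IT_INT32,
    [EXT_INT32_T] = IT_INT32,
    [EXT_INTEGER] = IT_INT32,
    [EXT_FLOAT] = IT_FLOAT32,
};

/* Re-point one external type, e.g. after a Fortran integer size change. */
static void set_mapping(ext_type_t ext, internal_type_t it)
{
    assert(ext < EXT_COUNT);
    ext_map[ext] = it;
}

/* Reduction path: needs the full internal type (size and semantics). */
static internal_type_t type_for_reduction(ext_type_t ext)
{
    return ext_map[ext];
}

/* Communication path: only the byte count matters. */
static size_t size_for_communication(ext_type_t ext)
{
    switch (ext_map[ext]) {
        case IT_INT32:
        case IT_FLOAT32:
            return 4;
        case IT_INT64:
            return 8;
        default:
            return 0;
    }
}
```

In this sketch EXT_INT and EXT_FLOAT are indistinguishable on the communication path, which is exactly why strict type matching is lost once the conversion happens.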

[skip warnings]

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou hzhou force-pushed the 2501_type_swap branch 3 times, most recently from 99abfbc to 4617ce2 Compare January 22, 2025 04:43
@dalcinl
Contributor

dalcinl commented Jan 22, 2025

@hzhou Is Intel 32-bit (i386) something you no longer care about at all? If you still do, adding an MPIR_FLOAT12 to support long double may not be that hard.

@hzhou
Contributor Author

hzhou commented Jan 22, 2025

> @hzhou Is Intel 32-bit (i386) something you no longer care about at all? If you still do, adding an MPIR_FLOAT12 to support long double may not be that hard.

We still want to support i386. I am not sure about MPIR_FLOAT12. "long double" is 16-byte, 80-bit, with 4-byte alignment on i386. I think we need to make it its own category, e.g. just call it MPIR_LONG_DOUBLE.

@dalcinl
Contributor

dalcinl commented Jan 22, 2025

> "long double" is 16-byte

IIRC, "long double" is 12 bytes on i386 (at least on Linux, but not Windows).
That's why I proposed MPIR_FLOAT12. This type would be NULL on x86_64.
However, MPIR_LONG_DOUBLE would work totally fine, maybe even better, and its meaning would just be "whatever the platform's C long double is", so there is no ambiguity.

@hzhou
Contributor Author

hzhou commented Jan 22, 2025

Here is my current scheme:

MPIR_FLOAT[2,4,8,16]  /* for IEEE 754 floating points */
MPIR_FLOAT_ALT[2,12,16] /* for alternative format, e.g. BFLOAT16, long double on i386 and x86_64 */

Hopefully we will only ever need to support one alternative format. If more are needed, we'll need new category names.

As you pointed out, long double may have different sizes depending on the platform, but my design goal is to have fixed handle bits and semantics for internal types.
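
As a rough sketch of how a platform's long double could be sorted into these categories, the code below classifies it from its storage size and mantissa width. The enum and function names are hypothetical, and treating a 64-bit mantissa as the x87 extended format is an assumption of this sketch, not something the PR specifies.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical category names mirroring the scheme above. */
typedef enum {
    SK_FLOAT_NONE,      /* no matching internal type in this sketch */
    SK_FLOAT8,          /* IEEE 754 binary64 */
    SK_FLOAT16,         /* IEEE 754 binary128 */
    SK_FLOAT_ALT12,     /* 80-bit extended stored in 12 bytes */
    SK_FLOAT_ALT16      /* 80-bit extended stored in 16 bytes */
} sk_float_t;

/* Classify a floating type from its storage size and mantissa digits
 * (as reported by LDBL_MANT_DIG for long double). */
static sk_float_t classify_long_double(size_t size, int mant_dig)
{
    assert(size > 0);
    if (mant_dig == 53 && size == 8)
        return SK_FLOAT8;       /* long double is just double */
    if (mant_dig == 113 && size == 16)
        return SK_FLOAT16;
    if (mant_dig == 64 && size == 12)
        return SK_FLOAT_ALT12;  /* e.g. i386 Linux */
    if (mant_dig == 64 && size == 16)
        return SK_FLOAT_ALT16;  /* e.g. x86_64 Linux */
    return SK_FLOAT_NONE;
}

/* Category of the long double used by the current compiler. */
static sk_float_t native_long_double(void)
{
    return classify_long_double(sizeof(long double), LDBL_MANT_DIG);
}
```

The size alone is not enough: a 16-byte long double may be either binary128 or padded 80-bit extended, so the sketch checks the mantissa width as well.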

@hzhou hzhou force-pushed the 2501_type_swap branch 2 times, most recently from 18bfb0c to 37c10f1 Compare January 23, 2025 00:42
@dalcinl
Contributor

dalcinl commented Jan 23, 2025

> MPIR_FLOAT[2,4,8,16]  /* for IEEE 754 floating points */
> MPIR_FLOAT_ALT[2,12,16] /* for alternative format, e.g. BFLOAT16, long double on i386 and x86_64 */

This is great for the time being!

@hzhou hzhou force-pushed the 2501_type_swap branch 14 times, most recently from 21c8585 to a0d258c Compare January 30, 2025 05:02
@hzhou hzhou force-pushed the 2501_type_swap branch 3 times, most recently from fc9b7fb to 0127291 Compare February 4, 2025 15:48
@hzhou
Contributor Author

hzhou commented Feb 4, 2025

test:mpich/ch3/most
test:mpich/ch4/most

hzhou added 24 commits February 4, 2025 14:19
* in yaksa, map internal types to yaksa builtin types

* use internal types in MPIR_Datatype_builtintype_alignment
* Replace e.g. MPI_INT with MPIR_INT_INTERNAL.

* Use new groups in MPIR_OP_TYPE_GROUP.
Directly use fi_datatype and fi_op to index
MPIDI_OFI_global.win_op_table and dtypes_max_count in
MPIDI_OFI_win_acc_hint_t. We'll directly convert MPI datatypes and MPI
ops to fi_datatype and fi_op.

At the device layer we only need to deal with internal datatypes.
The external builtin datatypes, e.g. MPI_INT, may be reconfigured at
runtime. This won't happen in practice, but it is possible by
design, so that all MPI builtin datatypes, whether MPI_INT or
MPI_INTEGER, are treated the same.

It is unnecessary, and internal types don't have corresponding
datatype structures.

Refactor MPIR_Type_match_size_impl and support MPIX_TYPECLASS_LOGICAL.

NOTE: now that the fixed-width types are always available, we only need
to match one of the fixed-width types. Check whether the reduction op is
available in case, for example, we don't have a matching C native type.

* External32 format converts the original types rather than the
internal types.

* Error reporting needs to report the original datatypes to the user
rather than the internal ones.

* Reduce_local needs to call the user op function with the original
datatypes.

Update for the new internal pairtypes.

The MPI standard doesn't list MPI_BYTE as a valid type for MPI_MAX.
However, I bet any coder would think it is sensible to compare the
max/min of two byte values. Thus, we (the MPICH implementation) will
allow it.

Modify the test to check the error case of MPI_FLOAT and MPI_LAND
instead.
Add a missing newline in error messages in reduce_local.c.

Limit the number of error messages in atomic_rmw_cas.c.

This has been passed and merged by the MPI Forum. Assuming the next MPI
standard will be ratified before the next major MPICH release, we are
directly using the MPI_ prefix rather than the MPIX_ prefix.

Also add MPI_TYPECLASS_LOGICAL for MPI_Type_match_size API.

reference:
mpi-forum/mpi-issues#699
mpi-forum/mpi-standard#963

This serves as an example of how we add a new builtin MPI datatype.
    1. define the constant in mpi.h.in
    2. (optional) define the internal datatype in mpir_datatype.h if there isn't one already
        2a. add alignment in MPIR_Datatype_builtintype_alignment
        2b. add mapping in MPII_Typerep_get_yaksa_type
    3. define the mapping in configure.ac
    4. (optional) define case for the supported reduction op
Provide half-precision float sum operation by casting to C float.
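
A minimal sketch of the cast-to-float idea, assuming the data is stored as raw IEEE 754 binary16 bit patterns in uint16_t. The helper names and the manual bit conversions are illustrative assumptions; the actual commit may instead rely on a compiler-provided half type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an IEEE 754 binary16 bit pattern into a C float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t) (h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t man = h & 0x3ffu;
    uint32_t bits;
    float f;

    if (exp == 0x1fu) {             /* infinity or NaN */
        bits = sign | 0x7f800000u | (man << 13);
    } else if (exp != 0) {          /* normal: rebias exponent 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {          /* signed zero */
        bits = sign;
    } else {                        /* subnormal: normalize the mantissa */
        uint32_t e = 113u;
        while (!(man & 0x400u)) {
            man <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((man & 0x3ffu) << 13);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Encode a C float as binary16, rounding to nearest even. */
static uint16_t float_to_half(float f)
{
    uint32_t x, q, rem, half;
    memcpy(&x, &f, sizeof x);
    uint16_t sign = (uint16_t) ((x >> 16) & 0x8000u);
    uint32_t mag = x & 0x7fffffffu;
    uint32_t e = mag >> 23;         /* biased float exponent */

    if (mag > 0x7f800000u)
        return (uint16_t) (sign | 0x7e00u);     /* NaN */
    if (e >= 143u)
        return (uint16_t) (sign | 0x7c00u);     /* too large or inf */
    if (e < 102u)
        return sign;                            /* below 2^-25: zero */

    if (e >= 113u) {                /* normal half: drop 13 mantissa bits */
        q = ((e - 112u) << 10) | ((mag & 0x7fffffu) >> 13);
        rem = mag & 0x1fffu;
        half = 0x1000u;
    } else {                        /* subnormal half: keep the implicit 1 */
        uint32_t m = (mag & 0x7fffffu) | 0x800000u;
        uint32_t s = 126u - e;      /* between 14 and 24 */
        q = m >> s;
        rem = m & ((1u << s) - 1u);
        half = 1u << (s - 1u);
    }
    if (rem > half || (rem == half && (q & 1u)))
        q++;                        /* a carry may bump the exponent */
    return (uint16_t) (sign | q);
}

/* Element-wise inout[i] += in[i] on binary16 data, computed in float. */
static void half_sum(const uint16_t *in, uint16_t *inout, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++)
        inout[i] = float_to_half(half_to_float(in[i]) + half_to_float(inout[i]));
}
```

The arithmetic itself happens entirely in float; only the load and store steps deal with the 16-bit encoding.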

Create MPIR_op_dt_check for reduction op-type validation in the
binding layer, where the datatype is a user-input MPI datatype, and use
MPIR_Internal_op_dt_check for the op-type check where the datatype is an
internal datatype.

The binding-layer validation follows the text of the MPI standard
literally, while the internal checks are more flexible. For example,
internally byte/char is an integer and is allowed for all integer
operations. But externally, users can only use byte for bit-logic
operations, and char is invalid for any reduction op.
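
The two-level check described above might look roughly like this. The enums and function names are hypothetical simplifications (not the signatures of MPIR_op_dt_check or MPIR_Internal_op_dt_check), only four ops and four types are modeled, and the sketch follows the strict reading without any backward-compatibility exceptions.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { OP_SUM, OP_MAX, OP_LAND, OP_BAND } sk_op_t;
typedef enum { X_INT, X_FLOAT, X_BYTE, X_CHAR } sk_ext_t;       /* user-facing */
typedef enum { I_INT8, I_INT32, I_FLOAT32 } sk_internal_t;      /* internal */

/* Internal check: byte and char are plain integers, so every
 * integer operation is accepted for them. */
static bool internal_op_dt_check(sk_op_t op, sk_internal_t t)
{
    switch (t) {
        case I_INT8:
        case I_INT32:
            return true;
        case I_FLOAT32:
            return op == OP_SUM || op == OP_MAX;
    }
    return false;
}

/* Binding-layer check: byte only with bit-logic ops, char with no
 * builtin op at all, float with arithmetic and min/max only. */
static bool binding_op_dt_check(sk_op_t op, sk_ext_t t)
{
    switch (t) {
        case X_INT:
            return true;
        case X_FLOAT:
            return op == OP_SUM || op == OP_MAX;
        case X_BYTE:
            return op == OP_BAND;
        case X_CHAR:
            return false;
    }
    return false;
}

/* Map a user-facing type to its internal type (assumes 32-bit int). */
static sk_internal_t to_internal(sk_ext_t t)
{
    assert(t >= X_INT && t <= X_CHAR);
    switch (t) {
        case X_FLOAT:
            return I_FLOAT32;
        case X_BYTE:
        case X_CHAR:
            return I_INT8;
        default:
            return I_INT32;
    }
}
```

Here a byte paired with OP_MAX is rejected at the binding layer but would pass the internal check once mapped to an 8-bit integer.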

With internal types, the matched_datatype, if matched, will always be
good for the op. Thus check_dtype is no longer necessary.

We centralized the op/type check in MPIR_op_dt_check and
MPIR_Internal_op_dt_check.
Now that the op functions are quite simple and share a lot in common,
they are easier to maintain in a single source file.

The MPI standard does not list MPI_CHAR as a valid type for reduction
using builtin ops.

Also, multi-language types such as MPI_AINT are not allowed in logical
ops.

Allowing MPI_CHAR with op, or MPI_AINT etc. with a logical op, is
non-standard. However, MPICH used to allow it. Thus, this commit
preserves the old behavior.
@hzhou
Contributor Author

hzhou commented Feb 4, 2025

test:mpich/ch3/most
test:mpich/ch4/most
