Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Introduce U<address space> to DataLayout to represent undesirable address space if a target has it #108786

Closed
wants to merge 1 commit into from

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Sep 16, 2024

This patch adds undesirable address space to DataLayout if a target supports it. It can be set via U<address space>.

Motivation

First and foremost: The flat address space is an existing concept in LLVM.

/// Returns the address space ID for a target's 'flat' address space. Note

It is not necessarily the same as addrspace 0, which LLVM sometimes refers to as the generic address space. The flat address space is a generic address space that can be used access multiple segments of memory with different address spaces.

Having the query function getFlatAddressSpace as a member function of TargetTransformInfo means if we want to query it, we need to have a Function & because both TargetTransformInfoWrapperPass and TargetIRAnalysis are function pass. However, this is not always feasible. For example, in the module pass Attributor, those target independent AttributorAttributes can potentially just analyze a Constant (which means it only has access to Constant *). In this case, we don't have a Function & to get TTI. A detailed discussion and use case can be found here.

If the above issue is the symptom, then the root cause is, it should not be in TTI at the first place, because it is really a target or module level property. We can't put it in TargetMachine either because that would introduce circular dependence (see here for more information).

Data layout is the perfect place to encode this information because the definition of various address spaces are defined there already. In this way, we can query the information just through a reference to DataLayout, which is widely available almost everywhere.

Q&A

  1. Isn't address space 0 flat?

    Based on existing code, not necessarily.

  2. Can there be more than one flat address space?

    No.

  3. If you specify a flat address space, does that mean that all other address spaces are not flat, and thus cannot alias with other address spaces?

    Yes, all other address spaces are not flat. A flat address space pointer can still point to the same place as a non-flat address space pointer, so it doesn't guarantee no alias.

  4. Is it optional for a target to provide a flat address space?

    Yes. Similar to existing address space concept, a target doesn't need to have a flat address space.


Update:

The PR has been updated to call it undesirable address space w/o giving it a concrete definition. It is up to targets how to interpret undesirable address space. The LangRef only defines the minimum requirement to be qualified as undesirable address space.

Copy link
Contributor Author

shiltian commented Sep 16, 2024

@shiltian shiltian marked this pull request as ready for review September 16, 2024 02:45
@llvmbot llvmbot added clang Clang issues not falling into any other category lld backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" lld:ELF mlir:gpu mlir backend:NVPTX llvm:ir labels Sep 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 16, 2024

@llvm/pr-subscribers-backend-spir-v
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-nvptx
@llvm/pr-subscribers-lld

@llvm/pr-subscribers-clang

Author: Shilei Tian (shiltian)

Changes

Patch is 23.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/108786.diff

15 Files Affected:

  • (modified) clang/lib/Basic/Targets/AMDGPU.cpp (+1-1)
  • (modified) clang/lib/Basic/Targets/NVPTX.cpp (+4-4)
  • (modified) clang/test/CodeGen/target-data.c (+4-4)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl (+1-1)
  • (modified) lld/test/ELF/lto/amdgcn-oses.ll (+3-3)
  • (modified) lld/test/ELF/lto/amdgcn.ll (+1-1)
  • (modified) llvm/docs/LangRef.rst (+30-17)
  • (modified) llvm/docs/ReleaseNotes.rst (+3)
  • (modified) llvm/include/llvm/IR/DataLayout.h (+2)
  • (modified) llvm/lib/IR/AutoUpgrade.cpp (+10)
  • (modified) llvm/lib/IR/DataLayout.cpp (+9)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+1-1)
  • (modified) llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp (+1-1)
  • (modified) llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp (+20-20)
  • (modified) mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp (+1-1)
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 3b748d0249d57b..0ee56848a6cb98 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -36,7 +36,7 @@ static const char *const DataLayoutStringAMDGCN =
     "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:"
     "32-v48:64-v96:128"
     "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1"
-    "-ni:7:8:9";
+    "-ni:7:8:9-T0";
 
 const LangASMap AMDGPUTargetInfo::AMDGPUDefIsGenMap = {
     llvm::AMDGPUAS::FLAT_ADDRESS,     // Default
diff --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index 43b653dc52ce0d..59344c3c71aee2 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -66,12 +66,12 @@ NVPTXTargetInfo::NVPTXTargetInfo(const llvm::Triple &Triple,
   HasFloat16 = true;
 
   if (TargetPointerWidth == 32)
-    resetDataLayout("e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64");
+    resetDataLayout("e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64-T0");
   else if (Opts.NVPTXUseShortPointers)
-    resetDataLayout(
-        "e-p3:32:32-p4:32:32-p5:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64");
+    resetDataLayout("e-p3:32:32-p4:32:32-p5:32:32-i64:64-i128:128-v16:16-v32:"
+                    "32-n16:32:64-T0");
   else
-    resetDataLayout("e-i64:64-i128:128-v16:16-v32:32-n16:32:64");
+    resetDataLayout("e-i64:64-i128:128-v16:16-v32:32-n16:32:64-T0");
 
   // If possible, get a TargetInfo for our host triple, so we can match its
   // types.
diff --git a/clang/test/CodeGen/target-data.c b/clang/test/CodeGen/target-data.c
index 41cbd5a0219d5e..a3d1a8cb9ee234 100644
--- a/clang/test/CodeGen/target-data.c
+++ b/clang/test/CodeGen/target-data.c
@@ -160,11 +160,11 @@
 
 // RUN: %clang_cc1 -triple nvptx-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=NVPTX
-// NVPTX: target datalayout = "e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+// NVPTX: target datalayout = "e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64-T0"
 
 // RUN: %clang_cc1 -triple nvptx64-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=NVPTX64
-// NVPTX64: target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+// NVPTX64: target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64-T0"
 
 // RUN: %clang_cc1 -triple r600-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=R600
@@ -176,12 +176,12 @@
 
 // RUN: %clang_cc1 -triple amdgcn-unknown -target-cpu hawaii -o - -emit-llvm %s \
 // RUN: | FileCheck %s -check-prefix=R600SI
-// R600SI: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+// R600SI: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9-T0"
 
 // Test default -target-cpu
 // RUN: %clang_cc1 -triple amdgcn-unknown -o - -emit-llvm %s \
 // RUN: | FileCheck %s -check-prefix=R600SIDefault
-// R600SIDefault: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+// R600SIDefault: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9-T0"
 
 // RUN: %clang_cc1 -triple arm64-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=AARCH64
diff --git a/clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl b/clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
index bb52f87615214b..b4b246cc082e00 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
@@ -1,5 +1,5 @@
 // RUN: %clang_cc1 %s -O0 -triple amdgcn -emit-llvm -o - | FileCheck %s
 // RUN: %clang_cc1 %s -O0 -triple amdgcn---opencl -emit-llvm -o - | FileCheck %s
 
-// CHECK: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+// CHECK: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9-T0"
 void foo(void) {}
diff --git a/lld/test/ELF/lto/amdgcn-oses.ll b/lld/test/ELF/lto/amdgcn-oses.ll
index 7a74d0317f2b9e..8903b45565b41f 100644
--- a/lld/test/ELF/lto/amdgcn-oses.ll
+++ b/lld/test/ELF/lto/amdgcn-oses.ll
@@ -25,7 +25,7 @@
 
 ;--- amdhsa.ll
 target triple = "amdgcn-amd-amdhsa"
-target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-T0"
 
 !llvm.module.flags = !{!0}
 !0 = !{i32 1, !"amdhsa_code_object_version", i32 500}
@@ -36,7 +36,7 @@ define void @_start() {
 
 ;--- amdpal.ll
 target triple = "amdgcn-amd-amdpal"
-target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-T0"
 
 define amdgpu_cs void @_start() {
   ret void
@@ -44,7 +44,7 @@ define amdgpu_cs void @_start() {
 
 ;--- mesa3d.ll
 target triple = "amdgcn-amd-mesa3d"
-target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-T0"
 
 define void @_start() {
   ret void
diff --git a/lld/test/ELF/lto/amdgcn.ll b/lld/test/ELF/lto/amdgcn.ll
index 4281e209fd9789..bcfbc272afdf9a 100644
--- a/lld/test/ELF/lto/amdgcn.ll
+++ b/lld/test/ELF/lto/amdgcn.ll
@@ -5,7 +5,7 @@
 ; Make sure the amdgcn triple is handled
 
 target triple = "amdgcn-amd-amdhsa"
-target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5"
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-T0"
 
 define void @_start() {
   ret void
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 144b4497ca63ce..dc9b6b9c9ccbf8 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -354,7 +354,7 @@ added in the future:
     not be used lightly but only for specific situations such as an
     alternative to the *register pinning* performance technique often
     used when implementing functional programming languages. At the
-    moment only X86, AArch64, and RISCV support this convention. The 
+    moment only X86, AArch64, and RISCV support this convention. The
     following limitations exist:
 
     -  On *X86-32* only up to 4 bit type parameters are supported. No
@@ -685,10 +685,10 @@ implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
-such as :ref:`llvm.ptrmask<int_ptrmask>`. 
+such as :ref:`llvm.ptrmask<int_ptrmask>`.
 
 The alignment information provided by the frontend for a non-integral pointer
-(typically using attributes or metadata) must be valid for every possible 
+(typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
 .. _globalvars:
@@ -1677,10 +1677,10 @@ Currently, only the following parameter attributes are defined:
     -  The range is allowed to wrap.
     -  The empty range is represented using ``0,0``.
     -  Otherwise, ``a`` and ``b`` are not allowed to be equal.
-    
-    This attribute may only be applied to parameters or return values with integer 
+
+    This attribute may only be applied to parameters or return values with integer
     or vector of integer types.
-    
+
     For vector-typed parameters, the range is applied element-wise.
 
 .. _gc:
@@ -3050,6 +3050,19 @@ as follows:
     address space 0, this property only affects the default value to be used
     when creating globals without additional contextual information (e.g. in
     LLVM passes).
+``T<address space>``
+    Specifies the address space for a target's 'flat' address space. Note this
+    is not necessarily the same as addrspace 0, which LLVM sometimes refers to
+    as the generic address space. The flat address space is a generic address
+    space that can be used access multiple segments of memory with different
+    address spaces. Access of a memory location through a pointer with this
+    address space is expected to be legal but slower compared to the same memory
+    location accessed through a pointer with a different address space. This is
+    for targets with different pointer representations which can be converted
+    with the addrspacecast instruction. If a pointer is converted to this
+    address space, optimizations should attempt to replace the access with the
+    source address space. The absence of this specification indicates the target
+    does not have such a flat address space to optimize away.
 
 .. _alloca_addrspace:
 
@@ -14346,7 +14359,7 @@ Arguments:
 """"""""""
 The first 4 arguments are similar to ``llvm.instrprof.increment``. The indexing
 is specific to callsites, meaning callsites are indexed from 0, independent from
-the indexes used by the other intrinsics (such as 
+the indexes used by the other intrinsics (such as
 ``llvm.instrprof.increment[.step]``).
 
 The last argument is the called value of the callsite this intrinsic precedes.
@@ -14360,7 +14373,7 @@ a buffer LLVM can use to perform counter increments (i.e. the lowering of
 ``llvm.instrprof.increment[.step]``. The address range following the counter
 buffer, ``<num-counters>`` x ``sizeof(ptr)`` - sized, is expected to contain
 pointers to contexts of functions called from this function ("subcontexts").
-LLVM does not dereference into that memory region, just calculates GEPs. 
+LLVM does not dereference into that memory region, just calculates GEPs.
 
 The lowering of ``llvm.instrprof.callsite`` consists of:
 
@@ -14929,8 +14942,8 @@ integer bit width or any vector of integer elements.
 Overview:
 """""""""
 
-Return ``-1`` if ``%a`` is signed less than ``%b``, ``0`` if they are equal, and 
-``1`` if ``%a`` is signed greater than ``%b``. Vector intrinsics operate on a per-element basis. 
+Return ``-1`` if ``%a`` is signed less than ``%b``, ``0`` if they are equal, and
+``1`` if ``%a`` is signed greater than ``%b``. Vector intrinsics operate on a per-element basis.
 
 Arguments:
 """"""""""
@@ -14958,8 +14971,8 @@ integer bit width or any vector of integer elements.
 Overview:
 """""""""
 
-Return ``-1`` if ``%a`` is unsigned less than ``%b``, ``0`` if they are equal, and 
-``1`` if ``%a`` is unsigned greater than ``%b``. Vector intrinsics operate on a per-element basis. 
+Return ``-1`` if ``%a`` is unsigned less than ``%b``, ``0`` if they are equal, and
+``1`` if ``%a`` is unsigned greater than ``%b``. Vector intrinsics operate on a per-element basis.
 
 Arguments:
 """"""""""
@@ -21556,9 +21569,9 @@ Semantics:
 """"""""""
 
 The '``llvm.vp.minimum``' intrinsic performs floating-point minimum (:ref:`minimum <i_minimum>`)
-of the first and second vector arguments on each enabled lane, the result being 
+of the first and second vector arguments on each enabled lane, the result being
 NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
-intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`. 
+intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
 The operation is performed in the default floating-point environment.
 
 Examples:
@@ -29191,7 +29204,7 @@ Semantics:
 """"""""""
 
 The intrinsic ``@llvm.allow.ubsan.check()`` returns either ``true`` or
-``false``, depending on compiler options. 
+``false``, depending on compiler options.
 
 For each evaluation of a call to this intrinsic, the program must be valid and
 correct both if it returns ``true`` and if it returns ``false``.
@@ -29250,13 +29263,13 @@ Semantics:
 """"""""""
 
 The intrinsic ``@llvm.allow.runtime.check()`` returns either ``true`` or
-``false``, depending on compiler options. 
+``false``, depending on compiler options.
 
 For each evaluation of a call to this intrinsic, the program must be valid and
 correct both if it returns ``true`` and if it returns ``false``.
 
 When used in a branch condition, it allows us to choose between
-two alternative correct solutions for the same problem. 
+two alternative correct solutions for the same problem.
 
 If the intrinsic is evaluated as ``true``, program should execute a guarded
 check. If the intrinsic is evaluated as ``false``, the program should avoid any
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 52456896f2fc6c..c4edbc6daff774 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -56,6 +56,9 @@ Changes to the LLVM IR
 
 * Added ``usub_cond`` and ``usub_sat`` operations to ``atomicrmw``.
 
+* Added ``T<address space>`` to data layout to represent flat address space if a
+  target has it.
+
 Changes to LLVM infrastructure
 ------------------------------
 
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 8f7ab2f9df389e..5ee388a5d059dd 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -94,6 +94,7 @@ class DataLayout {
   unsigned AllocaAddrSpace = 0;
   unsigned ProgramAddrSpace = 0;
   unsigned DefaultGlobalsAddrSpace = 0;
+  unsigned FlatAddressSpace = ~0U;
 
   MaybeAlign StackNaturalAlign;
   MaybeAlign FunctionPtrAlign;
@@ -245,6 +246,7 @@ class DataLayout {
   unsigned getDefaultGlobalsAddressSpace() const {
     return DefaultGlobalsAddrSpace;
   }
+  unsigned getFlatAddressSpace() const { return FlatAddressSpace; }
 
   bool hasMicrosoftFastStdCallMangling() const {
     return ManglingMode == MM_WinCOFFX86;
diff --git a/llvm/lib/IR/AutoUpgrade.cpp b/llvm/lib/IR/AutoUpgrade.cpp
index 69dae5e32dbbe8..a08700d4585985 100644
--- a/llvm/lib/IR/AutoUpgrade.cpp
+++ b/llvm/lib/IR/AutoUpgrade.cpp
@@ -5451,6 +5451,10 @@ std::string llvm::UpgradeDataLayoutString(StringRef DL, StringRef TT) {
     if (!DL.contains("-p9") && !DL.starts_with("p9"))
       Res.append("-p9:192:256:256:32");
 
+    // Add flat address space.
+    if (!DL.contains("-T0") && !DL.starts_with("T0"))
+      Res.append("-T0");
+
     return Res;
   }
 
@@ -5501,6 +5505,12 @@ std::string llvm::UpgradeDataLayoutString(StringRef DL, StringRef TT) {
       Res = (Ref.take_front(I) + "-f80:128-" + Ref.drop_front(I + 8)).str();
   }
 
+  if (T.isNVPTX()) {
+    // Add flat address space.
+    if (!DL.contains("-T0") && !DL.starts_with("T0"))
+      Res.append("-T0");
+  }
+
   return Res;
 }
 
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index d295d1f5785eb9..51396241515647 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -228,6 +228,7 @@ DataLayout &DataLayout::operator=(const DataLayout &Other) {
   AllocaAddrSpace = Other.AllocaAddrSpace;
   ProgramAddrSpace = Other.ProgramAddrSpace;
   DefaultGlobalsAddrSpace = Other.DefaultGlobalsAddrSpace;
+  FlatAddressSpace = Other.FlatAddressSpace;
   StackNaturalAlign = Other.StackNaturalAlign;
   FunctionPtrAlign = Other.FunctionPtrAlign;
   TheFunctionPtrAlignType = Other.TheFunctionPtrAlignType;
@@ -250,6 +251,7 @@ bool DataLayout::operator==(const DataLayout &Other) const {
          AllocaAddrSpace == Other.AllocaAddrSpace &&
          ProgramAddrSpace == Other.ProgramAddrSpace &&
          DefaultGlobalsAddrSpace == Other.DefaultGlobalsAddrSpace &&
+         FlatAddressSpace == Other.FlatAddressSpace &&
          StackNaturalAlign == Other.StackNaturalAlign &&
          FunctionPtrAlign == Other.FunctionPtrAlign &&
          TheFunctionPtrAlignType == Other.TheFunctionPtrAlignType &&
@@ -568,6 +570,13 @@ Error DataLayout::parseSpecification(StringRef Spec) {
       return Err;
     break;
   }
+  case 'T': { // Flat address space.
+    if (Rest.empty())
+      return createSpecFormatError("T<address space>");
+    if (Error Err = parseAddrSpace(Rest, FlatAddressSpace))
+      return Err;
+    break;
+  }
   case 'm':
     if (!Rest.consume_front(":") || Rest.empty())
       return createSpecFormatError("m:<mangling>");
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index f860b139945122..2e26cb969206f5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -579,7 +579,7 @@ static StringRef computeDataLayout(const Triple &TT) {
          "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-"
          "v32:32-v48:64-v96:"
          "128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-"
-         "G1-ni:7:8:9";
+         "G1-ni:7:8:9-T0";
 }
 
 LLVM_READNONE
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index 57b7fa783c14a7..eb7f29655e6441 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -117,7 +117,7 @@ static std::string computeDataLayout(bool is64Bit, bool UseShortPointers) {
   else if (UseShortPointers)
     Ret += "-p3:32:32-p4:32:32-p5:32:32";
 
-  Ret += "-i64:64-i128:128-v16:16-v32:32-n16:32:64";
+  Ret += "-i64:64-i128:128-v16:16-v32:32-n16:32:64-T0";
 
   return Ret;
 }
diff --git a/llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp b/llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp
index ca50187e5e5ee0..85029297a72df2 100644
--- a/llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp
+++ b/llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp
@@ -36,12 +36,12 @@ TEST(DataLayoutUpgradeTest, ValidDataLayoutUpgrade) {
   // Check that AMDGPU targets add -G1 if it's not present.
   EXPECT_EQ(UpgradeDataLayoutString("e-p:32:32", "r600"), "e-p:32:32-G1");
   // and that ANDGCN adds p7 and p8 as well.
-  EXPECT_EQ(
-      UpgradeDataLayoutString("e-p:64:64", "amdgcn"),
-      "e-p:64:64-G1-ni:7:8:9-p7:160:256:256:32-p8:128:128-p9:192:256:256:32");
-  EXPECT_EQ(
-      UpgradeDataLayoutString("e-p:64:64-G1", "amdgcn"),
-      "e-p:64:64-G1-ni:7:8:9-p7:160:256:256:32-p8:128:128-p9:192:256:256:32");
+  EXPECT_EQ(UpgradeDataLayoutString("e-p:64:64", "amdgcn"),
+            "e-p:64:64-G1-ni:7:8:9-p7:160:256:256:32-p8:128:128-p9:192:256:256:"
+            "32-T0");
+  EXPECT_EQ(UpgradeDataLayoutString("e-p:64:64-G1", "amdgcn"),
+            "e-p:64:64-G1-ni:7:8:9-p7:160:256:256:32-p8:128:128-p9:192:256:256:"
+            "32-T0");
   // but that r600 does not.
   EXPECT_EQ(UpgradeDataLayoutString("e-p:32:32-G1", "r600"), "e-p:32:32-G1");
 
@@ -56,7 +56,7 @@ TEST(DataLayoutUpgradeTest, ValidDataLayoutUpgrade) {
       "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-"
       "v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:"
     ...
[truncated]

@@ -579,7 +579,7 @@ static StringRef computeDataLayout(const Triple &TT) {
"-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-"
"v32:32-v48:64-v96:"
"128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-"
"G1-ni:7:8:9";
"G1-ni:7:8:9-T0";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does r600 have flat address space?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but yes. We probably should just define 0 to be the flat address space and take the same numbers as amdgcn. Flat will just be unsupported in codegen (but theoretically someone could go implement software tagged pointers)

llvm/docs/LangRef.rst Outdated Show resolved Hide resolved
@gonzalobg
Copy link
Contributor

gonzalobg commented Sep 16, 2024

Is there some github issue or discourse thread that provides context for this change?
What problem does this PR solve?

My understanding is that the default LLVM IR address space (*) is flat. Is that currently the case or is my understanding incorrect? Is this PR proposing that addrspace(0) should be allowed to no longer be flat, and that whether a target provides a flat address space shall be optional? What alternatives were considered? For example, why isn't it sufficient for targets that cannot provide a flat address pace to just not implement support for addrspace(0)? Why does LLVM IR need to support a non-flat addrspace(0) and allow a target to provide a flat addresspace at a different identifier?

(*) The address space C++/Rust/... pointers map to, i.e., addrspace(0).

@shiltian
Copy link
Contributor Author

I updated the PR description. Hopefully it can make the motivation of this patch more clear. I think the central question is, why it can't be just address space 0. I don't have a clear answer to that. @arsenm first introduced this concept to LLVM at least 8 years ago. I agree that the only two targets that have this concept are AMDGPU and NVPTX, and they both set it to address space 0. My understanding is, LLVM doesn't force any other characteristics to address 0 in addition to just making it as the default one if there is no other specified for the allocation of certain variables, so we can't assert or assume how a target needs to implement it.

@AlexVlx
Copy link
Contributor

AlexVlx commented Sep 16, 2024

I updated the PR description. Hopefully it can make the motivation of this patch more clear. I think the central question is, why it can't be just address space 0. I don't have a clear answer to that. @arsenm first introduced this concept to LLVM at least 8 years ago. I agree that the only two targets that have this concept are AMDGPU and NVPTX, and they both set it to address space 0. My understanding is, LLVM doesn't force any other characteristics to address 0 in addition to just making it as the default one if there is no other specified for the allocation of certain variables, so we can't assert or assume how a target needs to implement it.

There are targets that use a different integer to denote flat (e.g. see SPIR & SPIR-V). Whilst I know that there are objections to that, the fact remains that they had historical reason (wanted to make legacy OCL convention that the default is private work, and given that IR defaults to 0 this was an easy, if possibly costly, way out; AMDGPU also borks this for legacy OCL reasons, which has been a source of pain). I think the core matter, which we also talked about in #95728, is that LLVM defaults to 0 and posits some guarantees around that as per LangRef, but it doesn't say much of anything about guaranteed convertibility to / from 0. Which turns out to be a problem for targets / pseudo-targets which do not also use 0 to denote flat/generic, as a number of constructs that merely rely on getting the default end up needing a cast. @efriedma-quic made the point that in general there should / would always be a correct AS and the solution is to emit these in the right AS from the get-go, but the problem is somewhat pervasive.

@Artem-B
Copy link
Member

Artem-B commented Sep 16, 2024

The description of the flat address space in the TargetTransformInfo.h is somewhat vague and both, soo specific and not precise enough, IMO:

The flat address space is a
  /// generic address space that can be used access multiple segments of memory
  /// with different address spaces. Access of a memory location through a
  /// pointer with this address space is expected to be legal but slower
  /// compared to the same memory location accessed through a pointer with a
  /// different address space.
  ///
  /// This is for targets with different pointer representations which can
  /// be converted with the addrspacecast instruction. If a pointer is converted
  /// to this address space, optimizations should attempt to replace the access
  /// with the source address space.
  • "can be used access multiple segments of memory with different AS" -- does it mean all of them? If it covers only some of the segments, which ones?
  • "access expected to be slower" -- that's also not necessarily a universal property. It's conceivable that on some architectures, it's conversion between address spaces that may be non-free, but once you have the pointer, actual accesses work equally fast.
  • Can there be more than one flat address space?
  • No

Why not? E.g. if we have a processor with Harvard architecture, there may conceivably be multiple 'flat', non-overlapping address spaces, each covering a set of specific ASes.

I think conceptually we should describe AS hierarchy explicitly, and avoid the assumptions on their number or layout.
E.g. T0:1,2,3,4,5 may mean AS 0 is a flat superset of AS 1,2,3,4,5.

... and then change the API to getParentAS(AS) which would return AS itself if it's not contained in any other AS, or the higher level AS.

@jayfoad
Copy link
Contributor

jayfoad commented Sep 19, 2024

+1 to @efriedma-quic and @jdoerfert's comments. DataLayout should remain as generic as possible. Trying to encode a concept of "the flat address space" in it seems way too specific to one optimization for one or two targets.

@shiltian
Copy link
Contributor Author

It's a good idea to make it more generic. Potentially we can add an API like isAddressSpaceOptimizable(unsigned AS). If we want to do that, the question would be, where to put this API? Like I mentioned in the description, I tried some similar approach but they didn't pan out well. So far we know that, neither TTI nor TargetMachine is good place. Where else can we do this, other than putting it into DataLayout? If DataLayout is still a good place, then it might be just about whether we call it a flat address space, or optimizable address space, and nothing would be different from what is done in this PR.

@AlexVlx
Copy link
Contributor

AlexVlx commented Sep 19, 2024

+1 to @efriedma-quic and @jdoerfert's comments. DataLayout should remain as generic as possible. Trying to encode a concept of "the flat address space" in it seems way too specific to one optimization for one or two targets.

This isn't purely a nice to have optimisation aid though, is it? There are cases where you need a safe (portable?) default that the target can perhaps optimise, but, absent that, would at least work, and at the moment there's no handy way to query that generically (or from Clang). We do handwave 0 as being that safe default, and hope for the best, but as mentioned that relies on targets using 0 to correspond to flat/generic/whatever we call it, which they are not bound to do. To me, adding this is not entirely different from encoding the Alloca AS, which we already do.

@efriedma-quic
Copy link
Collaborator

If DataLayout is still a good place, then it might be just about whether we call it a flat address space, or optimizable address space, and nothing would be different from what is done in this PR.

We've avoided putting optimization properties in the DataLayout in the past. The interactions with LTO mean we want the DataLayout for a given subtarget to be stable.

In this particular case, a target's "flat" address-space can't really ever change, though, so it's probably fine.

There are cases where you need a safe (portable?) default that the target can perhaps optimise, but, absent that, would at least work, and at the moment there's no handy way to query that generically (or from Clang).

The use-cases currently under discussion, and LangRef itself, don't require the flat address-space to have any particular semantics; the only property it has is "operations using it are slower". What semantics do you need, and where do you need them?

@jdoerfert
Copy link
Member

+1 to @efriedma-quic and @jdoerfert's comments. DataLayout should remain as generic as possible. Trying to encode a concept of "the flat address space" in it seems way too specific to one optimization for one or two targets.

This isn't purely a nice to have optimisation aid though, is it? There are cases where you need a safe (portable?) default that the target can perhaps optimise, but, absent that, would at least work, and at the moment there's no handy way to query that generically (or from Clang). We do handwave 0 as being that safe default, and hope for the best, but as mentioned that relies on targets using 0 to correspond to flat/generic/whatever we call it, which they are not bound to do. To me, adding this is not entirely different from encoding the Alloca AS, which we already do.

Your argument is my point 3), right? #108786 (comment)

Why is that a DL property and not just a target property? Or even just "computable" by asking query 1) for different ASs?

@shiltian
Copy link
Contributor Author

Why is that a DL property and not just a target property?

It is indeed a target property. DL is one choice here (if it is not the most preferable one). Like I mentioned before, TTI and TargetMachine could have been one, but not feasible. Where else do you suggest?

Or even just "computable" by asking query 1) for different ASs?

IIUC, the query 1) you were referring to was Type::canLosslesslyBitCastTo. If that's the case, it does nothing for AS cast.

@shiltian
Copy link
Contributor Author

ping

@jdoerfert
Copy link
Member

IIUC, the query 1) you were referring to was Type::canLosslesslyBitCastTo. If that's the case, it does nothing for AS cast.

I don't know what I was thinking of, the above or CastInst::castIsValid, neither is very helpful.

I am personally not opposed to the DL solution, but I guess it's a big change that feels too much for me to approve. I'm also worried the "flat" concept is a special case of something more generic and we should instead define that. E.g.,

  • Does flat imply all AS can cast to it? If so, what happens if that is not true anymore in the future. If it's not implied, how do we query what's allowed and what is not?
  • What about other AS casts that do not involve flat, we can't restrict them even if we introduce the flat idea in the DL, right?
  • Is there a value in not defining 0 as flat? It's what we basically assume everywhere already, isn't it?

@AlexVlx
Copy link
Contributor

AlexVlx commented Sep 24, 2024

I am personally not opposed to the DL solution, but I guess it's a big change that feels too much for me to approve. I'm also worried the "flat" concept is a special case of something more generic and we should instead define that. E.g.,

In reply to your question above re whether this is a DL or a Target property, I don't have a strong opinion there (it appears @shiltian and @arsenm might). I do believe that this is a necessary bit of query-able information, especially from a Clang, for correctness reasons (more on that below).

  • Does flat imply all AS can cast to it? If so, what happens if that is not true anymore in the future. If it's not implied, how do we query what's allowed and what is not?

To make this generically useful, it probably should, but I don't think the current words imply it.

  • Is there a value in not defining 0 as flat? It's what we basically assume everywhere already, isn't it?

Ah, this is part of the challenge - we do indeed assume that 0 is flat, but Targets aren't bound by LangRef to use 0 to denote flat (and some, like SPIR / SPIR-V) do not, but rather problematically use 0 to denote Private, which is extremely constrained. This leads to problematic IR, due to the assumption you mentioned. An example I ran into the other day is emitCalloc - note it just uses the default pointer type for its return, which is problematic unless 0 is flat(-ish). It is, of course, fixable via other means, but it's just an example of where we rely on 0 just working.

Perhaps another solution(?) here would be to tweak LangRef to be binding in that it spells out that iff a particular Target has a Flat AS, that will be reflected in LLVM as AS0?

@shiltian
Copy link
Contributor Author

shiltian commented Sep 24, 2024

Is there a value in not defining 0 as flat?

Like @AlexVlx mentioned, as well as the wording I put into LangRef.rst, SPIR/SPIR-V uses AS 4 as flat AS.

It's what we basically assume everywhere already, isn't it?

I don't think this is technically true, but it is unfortunately practically true. There is no formal definition of what a flat AS is (except the one in TTI which is not widely used at all), which is the central question of this PR. That means all the "assumption everywhere" is basically assumption of any means for its own purpose. The current AS 0 is nothing special but just taken as the default AS for various object allocation if no explicit AS is used. That's all.

Perhaps another solution(?) here would be to tweak LangRef to be binding in that it spells out that iff a particular Target has a Flat AS, that will be reflected in LLVM as AS0?

I'm fine with adding the enforcement in LLVM that AS0 needs to be the flat AS, if a target has it, but the definition of a flat AS still needs to be set. If we do that, how will SPIR/SPIR-V work?


I think to have a clear definition of what a flat AS is critical, no matter if we want to add a new specification, as what is done is this PR, or add the wording into LLVM LangRef, enforcing it to AS 0. I'd like to give it a try here. ⬇️


A flat address space is an address space with the following characteristics:

  1. It must correspond to an existing address space; defining a flat address space does not create a new address space.
  2. Pointers in this address space can be used to access multiple memory segments, each associated with distinct address spaces.
  3. Accessing memory through a pointer in this address space may be slower than accessing memory through a pointer in a more specific address space.
  4. Pointers in this address space must be able to be losslessly converted to other address spaces.

All other characteristics of the flat address space will align with those of the original existing address space.


This is the most generic wording I can come up with so far. Happy to hear more feedbacks.

@arsenm
Copy link
Contributor

arsenm commented Sep 25, 2024

There are targets that use a different integer to denote flat (e.g. see SPIR & SPIR-V). Whilst I know that there are objections to that, the fact remains that they had historical reason (wanted to make legacy OCL convention that the default is private work, and given that IR defaults to 0 this was an easy, if possibly costly, way out;

The SPIRV IR would be better off changing its numbers around like we did in AMDGPU ages ago. The only concern would be bitcode compatibility, but given it's still an "experimental target" that shouldn't be an issue.

AMDGPU also borks this for legacy OCL reasons, which has been a source of pain).

This is only a broken in-clang hack, the backend IR always uses the correct address space

@arsenm
Copy link
Contributor

arsenm commented Sep 25, 2024

Just to clarify, does this mean any two non-flat address space pointers cannot alias?

This should change nothing about aliasing. The IR assumption is any address space may alias any other

@arsenm
Copy link
Contributor

arsenm commented Sep 25, 2024

Both in InferAddressSpaces, and in Attributor, you don't really care about whether a flat address-space exists.

Right, this is more of an undesirable address space. Optimizations don't need to know anything about its behavior beyond that.

In reply to your question above re whether this is a DL or a Target property, I don't have a strong opinion there (it appears @shiltian and @arsenm might).

I don't really like putting this in the DataLayout. My original idea was to move it to TargetMachine, but we want to avoid the dependence on CodeGen. The DataLayout is just the other place we have that defines module level target information. The simple solution is just have a switch over the target architecture in Attributor.

I do believe that this is a necessary bit of query-able information, especially from a Clang, for correctness reasons (more on that below).

I don't think this buys frontends much. Clang still needs to understand the full language address space -> target address space mapping. This would just allow populating one entry generically

Ah, this is part of the challenge - we do indeed assume that 0 is flat, but Targets aren't bound by LangRef to use 0 to denote flat (and some, like SPIR / SPIR-V) do not

As I mentioned above, SPIRV can just work its way out of this problem for its IR. SPIR's only reason for existence is bitcode compatibility, so doing anything with there will be quite a lot of work which will never realistically happen.

I'm fine with adding the enforcement in LLVM that AS0 needs to be the flat AS, if a target has it, but the definition of a flat AS still needs to be set. If we do that, how will SPIR/SPIR-V work?
This is the most generic wording I can come up with so far. Happy to hear more feedbacks.

I would like to avoid adding additional special properties to AS0, or defining the flat concept.

@shiltian
Copy link
Contributor Author

I would like to avoid adding additional special properties to AS0, or defining the flat concept.

How can we add a new specification w/o defining it?

The simple solution is just have a switch over the target architecture in Attributor.

That means in the future when a new (GPU/accelerator) target is added, someone has to know that is one place to add a new case. FWIW, it is not just Attributor. The situation applies to all passes that potentially don't have access to TTI/TM.

I don't really like the solution, but that is what I'm doing to move other PRs forward w/o being blocked by this one.

@arsenm
Copy link
Contributor

arsenm commented Oct 1, 2024

I would like to avoid adding additional special properties to AS0, or defining the flat concept.

How can we add a new specification w/o defining it?

By not defining it in terms of flat addressing. Just make it the undesirable address space

@shiltian
Copy link
Contributor Author

shiltian commented Oct 1, 2024

I would like to avoid adding additional special properties to AS0, or defining the flat concept.

How can we add a new specification w/o defining it?

By not defining it in terms of flat addressing. Just make it the undesirable address space

But that will still require to define, what is undesirable address space right?

@arsenm
Copy link
Contributor

arsenm commented Oct 10, 2024

But that will still require to define, what is undesirable address space right?

The LangRef doesn't need to know why it's undesirable. It's like the n field

@shiltian
Copy link
Contributor Author

The LangRef doesn't need to know why it's undesirable. It's like the n field

n field? The following?

n<size1>:<size2>:<size3>...
This specifies a set of native integer widths for the target CPU in bits. For example, it might contain n32 for 32-bit PowerPC, n32:64 for PowerPC 64, or n8:16:32:64 for X86-64. Elements of this set are considered to support most general arithmetic operations efficiently.

@arsenm
Copy link
Contributor

arsenm commented Oct 11, 2024

The LangRef doesn't need to know why it's undesirable. It's like the n field

n field? The following?

Yes. It's an optimization hint

@shiltian shiltian force-pushed the users/shiltian/data-layout-flat-address-space branch from cad9ac7 to c2be824 Compare October 14, 2024 19:30
@shiltian shiltian changed the title [IR] Introduce T<address space> to DataLayout to represent flat address space if a target supports it [IR] Introduce U<address space> to DataLayout to represent undesirable address space if a target has it Oct 14, 2024
@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 20, 2024

I do believe that this is a necessary bit of query-able information, especially from a Clang, for correctness reasons (more on that below).

I don't think this buys frontends much. Clang still needs to understand the full language address space -> target address space mapping. This would just allow populating one entry generically

Ah, this is part of the challenge - we do indeed assume that 0 is flat, but Targets aren't bound by LangRef to use 0 to denote flat (and some, like SPIR / SPIR-V) do not

As I mentioned above, SPIRV can just work its way out of this problem for its IR. SPIR's only reason for existence is bitcode compatibility, so doing anything with there will be quite a lot of work which will never realistically happen.

I'm fine with adding the enforcement in LLVM that AS0 needs to be the flat AS, if a target has it, but the definition of a flat AS still needs to be set. If we do that, how will SPIR/SPIR-V work?
This is the most generic wording I can come up with so far. Happy to hear more feedbacks.

I would like to avoid adding additional special properties to AS0, or defining the flat concept.

This is all fine, let's ignore SPIR-V for a second (SPIR being relevant is debatable, and I'm not sure what its reason for being was, but I'm pretty sure nothing is being done with it), and assume that that gets fixed. What is there to prevent some other target coming in and doing the same thing (using 0 to represent some constrained / problematic AS), without placing any sort of additional constraints on 0 (the absence thereof being an issue as it stands), and without giving the FE a mechanism to at least attempt to get flat/flat-like AS? More specifically, how do you expect Clang to figure this out when e.g. compiling C++ for some AS rich target that decided to default to something odd (so querying the correspondent for LangAS::Default is not useful) and uses 0 to denote a special purpose AS (so the many uses of getUnqual/get with the AS argument defaulted end up doing something wrong)? There isn't an awful lot of guidance being provided, you don't want to update LangRef, and the "undesirable" wording is rather opaque to put it mildly.

@arsenm
Copy link
Contributor

arsenm commented Oct 20, 2024

More specifically, how do you expect Clang to figure this out when e.g. compiling C++ for some AS rich target that decided to default to something odd

I do not expect clang to be using the datalayout to decide anything. That is not really the purpose of the datalayout. It does not provide semantic information to the frontend. The frontend needs to directly understand the source language semantics and the address spaces of the target.

@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 21, 2024

More specifically, how do you expect Clang to figure this out when e.g. compiling C++ for some AS rich target that decided to default to something odd

I do not expect clang to be using the datalayout to decide anything. That is not really the purpose of the datalayout. It does not provide semantic information to the frontend. The frontend needs to directly understand the source language semantics and the address spaces of the target.

So your solution here is to simply "infect" every language with full address space awareness? Seems problematic. Furthermore, this stance on DataLayout is somewhat curious, considering we do use it precisely for that, in Clang, around things like globals and allocas. What's your suggestion there?

@efriedma-quic
Copy link
Collaborator

My impression is that the treatment of address-spaces varies so much across different languages and different targets that it doesn't make sense to encode the entire address-space layout the datalayout. Narrowly encoding just the specific properties optimizations need means we don't need to rework optimizations and the datalayout for each new generation of GPU/accelerator. And developers working on IR optimizations don't need to try to decipher the implications of different address-spaces.

@arsenm
Copy link
Contributor

arsenm commented Oct 21, 2024

So your solution here is to simply "infect" every language with full address space awareness?

It's similar to, but a simpler version of the ABI emission problem. The frontend has to know properties of the target to emit the IR for it. It's infeasible to encode all of the possible address space information in a simple datalayout string.

Furthermore, this stance on DataLayout is somewhat curious, considering we do use it precisely for that, in Clang, around things like globals and allocas. What's your suggestion there?

Where is the datalayout providing semantics in clang? The clang usage of the datalayout should be limited to computing type sizes and other low level ABI details (not semantics)

@shiltian
Copy link
Contributor Author

It seems like we can’t reach consensus on how to move forward with this. I already worked my issue around so I guess we no longer need this PR.

@shiltian shiltian closed this Oct 27, 2024
@shiltian shiltian deleted the users/shiltian/data-layout-flat-address-space branch October 27, 2024 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU backend:NVPTX backend:SPIR-V clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category lld:ELF lld llvm:ir mlir:gpu mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.