Conform non-suffixed integer literals #5717

Merged
merged 13 commits into from
Dec 3, 2024
41 changes: 31 additions & 10 deletions docs/64bit-type-support.md
@@ -5,7 +5,8 @@ Slang 64-bit Type Support

* Not all targets support 64 bit types, or all 64 bit types
* 64 bit integers generally require later APIs/shader models
* When specifying 64 bit literals *always* use the type suffixes (ie `L`, `ULL`, `LL`)
* When specifying 64 bit floating-point literals *always* use the type suffixes (ie `L`)
* An integer literal will be interpreted as 64 bits if it cannot fit in a 32 bit value.
* GPU target/s generally do not support all double intrinsics
* Typically missing are transcendentals (sin, cos etc), logarithm and exponential functions
* CUDA is the exception supporting nearly all double intrinsics
@@ -28,7 +29,7 @@ This also applies to vector and matrix versions of these types.

Unfortunately, whether a specific target supports the type or the typical HLSL intrinsic functions (such as sin/cos/max/min etc) depends very much on the target.

Special attention has to be made with respect to literal 64 bit types. By default float and integer literals if they do not have an explicit suffix are assumed to be 32 bit. There is a variety of reasons for this design choice - the main one being around by default behavior of getting good performance. The suffixes required for 64 bit types are as follows
Special attention has to be paid to 64 bit literals. By default, float literals that do not have an explicit suffix are assumed to be 32 bit. There are a variety of reasons for this design choice, the main one being that the default behavior should give good performance. The suffixes required for 64 bit types are as follows:

```
// double - 'l' or 'L'
Expand All @@ -40,27 +41,47 @@ double b = 1.34e-200;
// int64_t - 'll' or 'LL' (or combination of upper/lower)

int64_t c = -5436365345345234ll;
// WRONG!: This is the same as d = int64_t(int32_t(-5436365345345234)) which means d ! = -5436365345345234LL.
// Will produce a warning.
int64_t d = -5436365345345234;

int64_t e = ~0LL; // Same as 0xffffffffffffffff
// Does produce the same result as 'e' because equivalent int64_t(~int32_t(0))
int64_t f = ~0;

// uint64_t - 'ull' or 'ULL' (or combination of upper/lower)

uint64_t g = 0x8000000000000000ull;
// WRONG!: This is the same as h = uint64_t(uint32_t(0x8000000000000000)) which means h = 0
// Will produce a warning.
uint64_t h = 0x8000000000000000u;

uint64_t i = ~0ull; // Same as 0xffffffffffffffff
uint64_t j = ~0; // Equivalent to 'i' because uint64_t(int64_t(~int32_t(0)));
```

These issues are discussed more on issue [#1185](https://github.com/shader-slang/slang/issues/1185)

The type of a decimal non-suffixed integer literal is the first integer type from the list [`int`, `int64_t`]
which can represent the specified literal value. If the value cannot fit, the literal is represented as a `uint64_t`
and a warning is given.
The type of a hexadecimal non-suffixed integer literal is the first type from the list [`int`, `uint`, `int64_t`, `uint64_t`]
that can represent the specified literal value. A non-suffixed integer literal will be 64 bit if it cannot fit in 32 bits.
```
// Same as int64_t a = int(1), the value can fit into a 32 bit integer.
int64_t a = 1;

// Same as int64_t b = int64_t(2147483648), the value cannot fit into a 32 bit integer.
int64_t b = 2147483648;

// Same as uint64_t c = uint64_t(18446744073709551615), the value is larger than the maximum value of a signed 64 bit
// integer, and is interpreted as an unsigned 64 bit integer. A warning is given.
uint64_t c = 18446744073709551615;

// Same as uint64_t d = int(0x7FFFFFFF), the value can fit into a 32 bit integer.
uint64_t d = 0x7FFFFFFF;

// Same as uint64_t e = int64_t(0x7FFFFFFFFFFFFFFF), the value cannot fit into an unsigned 32 bit integer but
// can fit into a signed 64 bit integer.
uint64_t e = 0x7FFFFFFFFFFFFFFF;

// Same as uint64_t f = uint64_t(0xFFFFFFFFFFFFFFFF), the value cannot fit into a signed 64 bit integer, and
// is interpreted as an unsigned 64 bit integer.
uint64_t f = 0xFFFFFFFFFFFFFFFF;
```
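
The selection rule described above can be modeled in a few lines of standalone C++. This is only a sketch for readers of the docs, not the compiler's code: `LiteralType`, `LiteralClassification` and `classifyLiteral` are hypothetical names introduced here, and the input is assumed to be the literal's magnitude without any leading minus sign.

```cpp
#include <cstdint>

// Hypothetical stand-in for the compiler's type enum; only the four
// integer types relevant to non-suffixed literals are modeled.
enum class LiteralType { Int, UInt, Int64, UInt64 };

struct LiteralClassification
{
    LiteralType type;
    bool warns; // true when a decimal literal falls through to uint64_t
};

// `magnitude` is the literal value without any leading minus sign.
// `isDecimal` is false for hexadecimal (and other non-decimal) literals.
inline LiteralClassification classifyLiteral(std::uint64_t magnitude, bool isDecimal)
{
    if (magnitude <= static_cast<std::uint64_t>(INT32_MAX))
        return {LiteralType::Int, false};
    // `uint` is only a candidate for non-decimal literals.
    if (!isDecimal && magnitude <= static_cast<std::uint64_t>(UINT32_MAX))
        return {LiteralType::UInt, false};
    if (magnitude <= static_cast<std::uint64_t>(INT64_MAX))
        return {LiteralType::Int64, false};
    // Nothing signed fits: fall back to uint64_t, warning for decimal only.
    return {LiteralType::UInt64, isDecimal};
}
```

Under this model, `classifyLiteral(2147483648, true)` selects `Int64`, while `classifyLiteral(0x80000000, false)` selects `UInt`, which matches the difference between the decimal and hexadecimal lists.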

Double support
==============

7 changes: 5 additions & 2 deletions docs/user-guide/02-conventional-features.md
@@ -39,8 +39,11 @@ The following integer types are provided:

All targets support the 32-bit `int` and `uint` types, but support for the other types depends on the capabilities of each target platform.

Integer literals can be both decimal and hexadecimal, and default to the `int` type.
A literal can be explicitly made unsigned with a `u` suffix.
Integer literals can be both decimal and hexadecimal. An integer literal can be explicitly made unsigned
with a `u` suffix, and explicitly made 64-bit with the `ll` suffix. The type of a decimal non-suffixed integer literal is the first integer type from
the list [`int`, `int64_t`] which can represent the specified literal value. If the value cannot fit, the literal is represented as
a `uint64_t` and a warning is given. The type of a hexadecimal non-suffixed integer literal is the first type from the list
[`int`, `uint`, `int64_t`, `uint64_t`] that can represent the specified literal value. For more information on 64 bit integer literals see the documentation on [64 bit type support](../64bit-type-support.md).

The following floating-point types are provided:

10 changes: 9 additions & 1 deletion source/compiler-core/slang-lexer.cpp
@@ -673,7 +673,10 @@ static int _readOptionalBase(char const** ioCursor)
}


IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlice* outSuffix)
IntegerLiteralValue getIntegerLiteralValue(
Token const& token,
UnownedStringSlice* outSuffix,
bool* outIsDecimalBase)
{
IntegerLiteralValue value = 0;

@@ -698,6 +701,11 @@ IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlic
*outSuffix = UnownedStringSlice(cursor, end);
}

if (outIsDecimalBase)
{
*outIsDecimalBase = (base == 10);
}

return value;
}
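
The optional out-parameter pattern used by this change can be illustrated with a simplified, self-contained reader. This is a sketch under assumptions, not the Slang lexer: `readIntegerLiteral` is a hypothetical function, it works on `std::string` rather than `Token`, and it only recognizes decimal and `0x`-prefixed hexadecimal literals.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical, simplified analogue of an integer-literal reader.
// Both out-parameters are optional; callers that only need the value
// can omit them.
inline std::uint64_t readIntegerLiteral(
    const std::string& text,
    std::string* outSuffix = nullptr,
    bool* outIsDecimalBase = nullptr)
{
    std::size_t pos = 0;
    unsigned base = 10;
    if (text.size() >= 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
    {
        base = 16;
        pos = 2;
    }

    std::uint64_t value = 0;
    for (; pos < text.size(); ++pos)
    {
        const char c = text[pos];
        unsigned digit;
        if (c >= '0' && c <= '9')
            digit = static_cast<unsigned>(c - '0');
        else if (base == 16 && c >= 'a' && c <= 'f')
            digit = 10u + static_cast<unsigned>(c - 'a');
        else if (base == 16 && c >= 'A' && c <= 'F')
            digit = 10u + static_cast<unsigned>(c - 'A');
        else
            break; // the first non-digit character starts the suffix
        value = value * base + digit; // unsigned arithmetic, wraps on overflow
    }

    if (outSuffix)
        *outSuffix = text.substr(pos);
    if (outIsDecimalBase)
        *outIsDecimalBase = (base == 10);
    return value;
}
```

Defaulting the new parameter to null keeps existing call sites unchanged, which is why the bit-field width call site in the parser can drop its unused `suffix` variable and pass only the token.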

5 changes: 4 additions & 1 deletion source/compiler-core/slang-lexer.h
@@ -172,7 +172,10 @@ String getFileNameTokenValue(Token const& token);
typedef int64_t IntegerLiteralValue;
typedef double FloatingPointLiteralValue;

IntegerLiteralValue getIntegerLiteralValue(Token const& token, UnownedStringSlice* outSuffix = 0);
IntegerLiteralValue getIntegerLiteralValue(
Token const& token,
UnownedStringSlice* outSuffix = 0,
bool* outIsDecimalBase = 0);
FloatingPointLiteralValue getFloatingPointLiteralValue(
Token const& token,
UnownedStringSlice* outSuffix = 0);
6 changes: 6 additions & 0 deletions source/slang/slang-diagnostic-defs.h
@@ -1574,6 +1574,12 @@ DIAGNOSTIC(
Error,
invalidFloatingPointLiteralSuffix,
"invalid suffix '$0' on floating-point literal")
DIAGNOSTIC(
39999,
Warning,
integerLiteralTooLarge,
"integer literal is too large to be represented in a signed integer type, interpreting as "
"unsigned")

DIAGNOSTIC(
39999,
83 changes: 73 additions & 10 deletions source/slang/slang-parser.cpp
@@ -3136,8 +3136,7 @@ static Modifier* ParseSemantic(Parser* parser)
BitFieldModifier* bitWidthMod = parser->astBuilder->create<BitFieldModifier>();
parser->FillPosition(bitWidthMod);
const auto token = parser->tokenReader.advanceToken();
UnownedStringSlice suffix;
bitWidthMod->width = getIntegerLiteralValue(token, &suffix);
bitWidthMod->width = getIntegerLiteralValue(token);
return bitWidthMod;
}
else if (parser->LookAheadToken(TokenType::CompletionRequest))
@@ -6638,6 +6637,64 @@ static IntegerLiteralValue _fixIntegerLiteral(
return value;
}

static BaseType _determineNonSuffixedIntegerLiteralType(
IntegerLiteralValue value,
bool isDecimalBase,
Token* token,
DiagnosticSink* sink)
{
const uint64_t rawValue = (uint64_t)value;

/// Non-suffixed integer literal types
///
/// The type is the first from the following list in which the value can fit:
/// - For decimal bases:
/// - `int`
/// - `int64_t`
/// - For non-decimal bases:
/// - `int`
/// - `uint`
/// - `int64_t`
/// - `uint64_t`
///
/// The lexer scans the negative(-) part of literal separately, and the value part here
/// is always positive hence it is sufficient to only compare with the maximum limits.
BaseType baseType;
if (rawValue <= INT32_MAX)
{
baseType = BaseType::Int;
}
else if ((rawValue <= UINT32_MAX) && !isDecimalBase)
{
baseType = BaseType::UInt;
}
else if (rawValue <= INT64_MAX)
{
baseType = BaseType::Int64;
}
else
{
baseType = BaseType::UInt64;

if (isDecimalBase)
{
// There is an edge case here where 9223372036854775808 or INT64_MAX + 1
// brings us here, but the complete literal is -9223372036854775808 or INT64_MIN and is
// valid. Unfortunately because the lexer handles the negative(-) part of the literal
// separately it is impossible to know whether the literal has a negative sign or not.
// We emit the warning and initially process it as a uint64 anyways, and the negative
// sign will be properly parsed and the value will still be properly stored as a
// negative INT64_MIN.

// Decimal integer is too large to be represented as signed.
// Output warning that it is represented as unsigned instead.
sink->diagnose(*token, Diagnostics::integerLiteralTooLarge);
}
}

return baseType;
}

static bool _isCast(Parser* parser, Expr* expr)
{
if (as<PointerTypeExpr>(expr))
@@ -6925,20 +6982,18 @@ static Expr* parseAtomicExpr(Parser* parser)
constExpr->token = token;

UnownedStringSlice suffix;
IntegerLiteralValue value = getIntegerLiteralValue(token, &suffix);
bool isDecimalBase;
IntegerLiteralValue value = getIntegerLiteralValue(token, &suffix, &isDecimalBase);

// Look at any suffix on the value
char const* suffixCursor = suffix.begin();
const char* const suffixEnd = suffix.end();
const bool suffixExists = (suffixCursor != suffixEnd);

// If no suffix is defined go with the default
BaseType suffixBaseType = BaseType::Int;

if (suffixCursor < suffixEnd)
// Mark as void, taken as an error
BaseType suffixBaseType = BaseType::Void;
if (suffixExists)
{
// Mark as void, taken as an error
suffixBaseType = BaseType::Void;

int lCount = 0;
int uCount = 0;
int zCount = 0;
@@ -7008,6 +7063,14 @@ static Expr* parseAtomicExpr(Parser* parser)
suffixBaseType = BaseType::Int;
}
}
else
{
suffixBaseType = _determineNonSuffixedIntegerLiteralType(
value,
isDecimalBase,
&token,
parser->sink);
}

value = _fixIntegerLiteral(suffixBaseType, value, &token, parser->sink);

25 changes: 18 additions & 7 deletions tests/diagnostics/int-literal.slang
@@ -2,8 +2,8 @@

int doSomething(int a)
{
// Warning can't fit
int c0 = 0x800000000;
// No warning, literal will be interpreted as 64 bit.
uint64_t c0 = 0x800000000;

// No warning as top bits are just ignored
int c1 = -1ll;
@@ -13,19 +13,30 @@ int doSomething(int a)
// Should sign extend
int c3 = 0x80000000;

// Should give a warning (ideally including the preceeding -)
// Currently we don't have the -, because the lexer lexes - independently
int c4 = -0xfffffffff;
// No warning, the hex literal fits in a signed 64 bit integer and is then negated by the unary minus operator.
int64_t c4 = -0xfffffffff;

//
a += c0 + c1 + c2;
a += (int)c0 + c1 + c2;

int64_t b = 0;

// Ok
b += 0x800000000ll;

uint64_t c5 = -2ull;

// Warning, integer literal is too large for signed 64 bit, must be interpreted as unsigned.
uint64_t d0 = 18446744073709551615;

// Warning, integer literal is too small for signed 64 bit, must be interpreted as unsigned.
uint64_t d1 = -9223372036854775809;

// This is INT64_MIN and valid negative signed integer, but warning will be emitted as negative(-) is scanned
// separately in the lexer, and the positive literal portion will emit a warning.
// The final value will still be correctly set as INT64_MIN.
//
// To not have this warning the lexer must scan the negative operator and number together.
uint64_t d2 = -9223372036854775808;

return a + int(b);
}
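
The `INT64_MIN` case in the test above relies on two's complement wraparound. The sketch below shows that arithmetic in plain C++; `negateMagnitude` is a hypothetical helper, and whether Slang's constant evaluation follows exactly this path is an assumption. The conversion from `uint64_t` to `int64_t` is guaranteed to be modular since C++20 and behaves this way on mainstream compilers before that.

```cpp
#include <cstdint>

// Models unary minus applied to a literal that was first stored as an
// unsigned 64 bit magnitude. Unsigned subtraction wraps modulo 2^64.
inline std::int64_t negateMagnitude(std::uint64_t magnitude)
{
    const std::uint64_t wrapped = std::uint64_t{0} - magnitude;
    // Reinterpret the bit pattern as a signed two's complement value.
    return static_cast<std::int64_t>(wrapped);
}
```

With this model, a magnitude of 9223372036854775808 negates to `INT64_MIN`, so the final value is correct even though the warning was emitted for the positive part. A magnitude of 9223372036854775809 has no signed 64 bit negation and wraps around to `INT64_MAX`, which is why a warning is appropriate for `d1`.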
15 changes: 9 additions & 6 deletions tests/diagnostics/int-literal.slang.expected
@@ -1,11 +1,14 @@
result code = 0
standard error = {
tests/diagnostics/int-literal.slang(6): warning 39999: integer literal '0x800000000' too large for type 'int' truncated to '0'
int c0 = 0x800000000;
^~~~~~~~~~~
tests/diagnostics/int-literal.slang(18): warning 39999: integer literal '0xfffffffff' too large for type 'int' truncated to '-1'
int c4 = -0xfffffffff;
^~~~~~~~~~~
tests/diagnostics/int-literal.slang(29): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
uint64_t d0 = 18446744073709551615;
^~~~~~~~~~~~~~~~~~~~
tests/diagnostics/int-literal.slang(32): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
uint64_t d1 = -9223372036854775809;
^~~~~~~~~~~~~~~~~~~
tests/diagnostics/int-literal.slang(39): warning 39999: integer literal is too large to be represented in a signed integer type, interpreting as unsigned
uint64_t d2 = -9223372036854775808;
^~~~~~~~~~~~~~~~~~~
}
standard output = {
}