8 bit floating point #1910

Open: wants to merge 19 commits into base: master
104 changes: 104 additions & 0 deletions softfloat/f16_to_f8.c
@@ -0,0 +1,104 @@

/*============================================================================

This C source file is part of the SoftFloat IEEE Floating-Point Arithmetic
Package, Release 3d, by John R. Hauser.

Copyright 2011, 2012, 2013, 2014, 2015 The Regents of the University of
California. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the University nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS", AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=============================================================================*/

#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "specialize.h"
#include "softfloat.h"

float8_t f16_to_f8( float16_t a )
{
    union ui16_f16 uA;
    uint_fast16_t uiA;
    bool sign;
    int_fast8_t exp;
    uint_fast16_t frac;
    struct commonNaN commonNaN;
    uint_fast8_t uiZ;
    uint_fast16_t frac8;
    union ui8_f8 uZ;

    /*------------------------------------------------------------------------
    *------------------------------------------------------------------------*/
    uA.f = a;
    uiA = uA.ui;
    sign = signF16UI( uiA );
    exp = expF16UI( uiA );
    frac = fracF16UI( uiA );
    /*------------------------------------------------------------------------
    | Infinity or NaN: the binary16 exponent field is 5 bits wide, so the
    | all-ones exponent is 0x1F.
    *------------------------------------------------------------------------*/
    if ( exp == 0x1F ) {
        if ( frac ) {
            softfloat_f16UIToCommonNaN( uiA, &commonNaN );
            switch ( softfloat_fp8Mode ) {
             case softfloat_fp8_e4m3:
                uiZ = softfloat_commonNaNToE4M3F8UI( &commonNaN );
                break;
             case softfloat_fp8_e5m2:
                uiZ = softfloat_commonNaNToE5M2F8UI( &commonNaN );
                break;
             default:
                uiZ = softfloat_commonNaNToF8UI( &commonNaN );
                break;
            }
        } else {
            switch ( softfloat_fp8Mode ) {
             case softfloat_fp8_e4m3:
                // Assuming overflow mode (Inf --> NaN).
                // commonNaN is not filled in on this path; this assumes the
                // E4M3 conversion ignores its argument.
                uiZ = softfloat_commonNaNToE4M3F8UI( &commonNaN );
                break;
             case softfloat_fp8_e5m2:
                uiZ = signInfE5M2F8UI( sign );
                break;
             default:
                uiZ = signInfF8UI( sign );
                break;
            }
        }
        goto uiZ;
    }
    /*------------------------------------------------------------------------
    *------------------------------------------------------------------------*/
    // Drop the two low fraction bits, folding them into a sticky bit
    frac8 = frac>>2 | ((frac & 0x3) != 0);
    if ( ! (exp | frac8) ) {
        uiZ = packToF8UI( sign, 0, 0 );   // zero, keeping the sign of the input
        goto uiZ;
    }
    /*------------------------------------------------------------------------
    *------------------------------------------------------------------------*/
    return softfloat_roundPackToF8( sign, exp - 0xC, frac8 | 0x100 );
 uiZ:
    uZ.ui = uiZ;
    return uZ.f;

}
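The field extraction and the sticky shift used in this conversion can be sketched in isolation. This is a minimal standalone illustration, not code from the patch: `f16_fields`, `f16_unpack_sketch` and `shift_right_sticky_sketch` are hypothetical names, and only the standard binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits) is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of SoftFloat. */
struct f16_fields {
    bool     sign;   /* bit 15 */
    uint8_t  exp;    /* bits 14..10, range 0..0x1F */
    uint16_t frac;   /* bits 9..0 */
};

/* Split a raw binary16 bit pattern into its three fields. */
static struct f16_fields f16_unpack_sketch( uint16_t ui )
{
    struct f16_fields f;
    f.sign = (ui >> 15) != 0;
    f.exp  = (uint8_t) ((ui >> 10) & 0x1F);
    f.frac = (uint16_t) (ui & 0x03FF);
    return f;
}

/* Shift right by n bits; if any discarded bit was set, set the LSB of the
 * result so that a later rounding step still sees the value as inexact. */
static uint16_t shift_right_sticky_sketch( uint16_t v, unsigned n )
{
    assert( n >= 1 && n < 16 );
    uint16_t lost = (uint16_t) (v & ((1u << n) - 1u));
    return (uint16_t) ((v >> n) | (lost != 0));
}
```

For example, the pattern 0x7C00 (+infinity) unpacks to exponent 0x1F with a zero fraction, and shifting 0x3F9 right by two gives 0xFF because the discarded low bits were nonzero.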

46 changes: 46 additions & 0 deletions softfloat/f8_add.c
@@ -0,0 +1,46 @@
/*============================================================================

This C source file is part of the SoftFloat IEEE Floating-Point Arithmetic
Package, Release 3d, by John R. Hauser.

Copyright 2011, 2012, 2013, 2014, 2015, 2016 The Regents of the University of
California. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the University nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS", AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=============================================================================*/

#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "specialize.h"
#include "softfloat.h"

float8_t f8_add( float8_t a, float8_t b )
{
    return f8_emulation_2_operands(a, b, f16_add);
}
34 changes: 34 additions & 0 deletions softfloat/f8_classify.c
@@ -0,0 +1,34 @@
#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "specialize.h"
#include "softfloat.h"

uint_fast16_t f8_classify( float8_t a )
{
    union ui8_f8 uA;
    uint_fast16_t uiA;

    uA.f = a;
    uiA = uA.ui;

    uint_fast16_t infOrNaN = isInfF8UI(uiA) || isNaNF8UI(uiA);
    uint_fast16_t subnormalOrZero = expF8UI( uiA ) == 0;
    bool sign = signF8UI( uiA );
    bool fracZero = fracF8UI( uiA ) == 0;
    bool isNaN = isNaNF8UI( uiA );
    bool isSNaN = false;   // every NaN is reported as quiet

    // One-hot result using the RISC-V FCLASS bit assignment
    return
        (  sign && infOrNaN && fracZero )          << 0 |   // -infinity
        (  sign && !infOrNaN && !subnormalOrZero ) << 1 |   // negative normal
        (  sign && subnormalOrZero && !fracZero )  << 2 |   // negative subnormal
        (  sign && subnormalOrZero && fracZero )   << 3 |   // -0
        ( !sign && infOrNaN && fracZero )          << 7 |   // +infinity
        ( !sign && !infOrNaN && !subnormalOrZero ) << 6 |   // positive normal
        ( !sign && subnormalOrZero && !fracZero )  << 5 |   // positive subnormal
        ( !sign && subnormalOrZero && fracZero )   << 4 |   // +0
        ( isNaN &&  isSNaN )                       << 8 |   // signaling NaN
        ( isNaN && !isSNaN )                       << 9;    // quiet NaN
}
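The classification result is a one-hot mask using the RISC-V FCLASS bit positions (bit 0 for negative infinity up to bit 9 for quiet NaN). A caller might decode it as in the sketch below; `fclass_name_sketch` is a hypothetical helper written for this illustration and does not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a one-hot FCLASS-style mask to a readable name.
 * Hypothetical helper for illustration only. */
static const char *fclass_name_sketch( uint_fast16_t mask )
{
    static const char *const names[10] = {
        "-inf", "-normal", "-subnormal", "-0",
        "+0", "+subnormal", "+normal", "+inf",
        "signaling NaN", "quiet NaN"
    };
    for ( unsigned bit = 0; bit < 10; ++bit ) {
        if ( mask == ((uint_fast16_t) 1 << bit) ) return names[bit];
    }
    return "invalid";   /* zero bits set, or more than one */
}
```

Passing the value returned by `f8_classify` to such a helper gives a quick way to print the class in a test harness.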
46 changes: 46 additions & 0 deletions softfloat/f8_div.c
@@ -0,0 +1,46 @@
/*============================================================================

This C source file is part of the SoftFloat IEEE Floating-Point Arithmetic
Package, Release 3d, by John R. Hauser.

Copyright 2011, 2012, 2013, 2014, 2015, 2016 The Regents of the University of
California. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the University nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS", AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=============================================================================*/

#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "specialize.h"
#include "softfloat.h"

float8_t f8_div( float8_t a, float8_t b )
{
    return f8_emulation_2_operands(a, b, f16_div);
}
74 changes: 74 additions & 0 deletions softfloat/f8_emulation.c
@@ -0,0 +1,74 @@
/*============================================================================

This C source file is part of the SoftFloat IEEE Floating-Point Arithmetic
Package, Release 3d, by John R. Hauser.

Copyright 2011, 2012, 2013, 2014, 2015, 2016 The Regents of the University of
California. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the University nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS", AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=============================================================================*/

#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "specialize.h"
#include "softfloat.h"

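/*----------------------------------------------------------------------------
| Each helper widens its f8 operands to f16, runs the f16 operation with the
| rounding mode temporarily set to round-to-odd, restores the caller's
| rounding mode, and then narrows the result back to f8. Rounding the
| intermediate result to odd is intended to keep the final narrowing free of
| double-rounding errors.
*----------------------------------------------------------------------------*/

float8_t f8_emulation_3_operands(float8_t a8, float8_t b8, float8_t c8, float16_t (*operation)(float16_t, float16_t, float16_t)) {
    uint_fast8_t roundingMode = softfloat_roundingMode;   // save caller's mode
    softfloat_roundingMode = softfloat_round_odd;
    float16_t a16 = f8_to_f16(a8);
    float16_t b16 = f8_to_f16(b8);
    float16_t c16 = f8_to_f16(c8);
    float16_t z16 = operation(a16, b16, c16);
    softfloat_roundingMode = roundingMode;                // restore before narrowing
    float8_t z = f16_to_f8(z16);
    return z;
}

float8_t f8_emulation_2_operands(float8_t a8, float8_t b8, float16_t (*operation)(float16_t, float16_t)) {
    uint_fast8_t roundingMode = softfloat_roundingMode;   // save caller's mode
    softfloat_roundingMode = softfloat_round_odd;
    float16_t a16 = f8_to_f16(a8);
    float16_t b16 = f8_to_f16(b8);
    float16_t z16 = operation(a16, b16);
    softfloat_roundingMode = roundingMode;                // restore before narrowing
    float8_t z = f16_to_f8(z16);
    return z;
}

float8_t f8_emulation_1_operand(float8_t a8, float16_t (*operation)(float16_t)) {
    uint_fast8_t roundingMode = softfloat_roundingMode;   // save caller's mode
    softfloat_roundingMode = softfloat_round_odd;
    float16_t a16 = f8_to_f16(a8);
    float16_t z16 = operation(a16);
    softfloat_roundingMode = roundingMode;                // restore before narrowing
    float8_t z = f16_to_f8(z16);
    return z;
}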
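The reason for switching the intermediate step to round-to-odd can be shown with plain integers. The sketch below is an illustration under stated assumptions, not part of the patch: values are fixed-point numbers with 8 fraction bits, the intermediate format keeps 3 of them, and the final format keeps none. All function names are hypothetical. Rounding to nearest twice can disagree with rounding once, while rounding to odd first and to nearest second agrees with the direct result for every input in the tested range.

```c
#include <assert.h>
#include <stdint.h>

/* Drop `drop` low bits, rounding to nearest with ties to even. */
static uint32_t round_nearest_even( uint32_t v, unsigned drop )
{
    assert( drop >= 1 && drop < 32 );
    uint32_t q    = v >> drop;
    uint32_t rem  = v & ((UINT32_C(1) << drop) - 1);
    uint32_t half = UINT32_C(1) << (drop - 1);
    if ( rem > half || (rem == half && (q & 1)) ) ++q;
    return q;
}

/* Drop `drop` low bits, rounding to odd: truncate, then force the LSB to 1
 * whenever any discarded bit was set. */
static uint32_t round_odd( uint32_t v, unsigned drop )
{
    assert( drop >= 1 && drop < 32 );
    uint32_t q = v >> drop;
    if ( v & ((UINT32_C(1) << drop) - 1) ) q |= 1;
    return q;
}

/* One rounding step straight to the final precision (reference result). */
static uint32_t direct( uint32_t x )
{
    return round_nearest_even( x, 8 );
}

/* Two steps, both to nearest: subject to double rounding. */
static uint32_t via_nearest( uint32_t x )
{
    return round_nearest_even( round_nearest_even( x, 5 ), 3 );
}

/* Two steps, the first to odd: matches the direct result. */
static uint32_t via_odd( uint32_t x )
{
    return round_nearest_even( round_odd( x, 5 ), 3 );
}

/* Count inputs below `limit` where a two-step scheme differs from direct. */
static unsigned count_mismatches( uint32_t (*two_step)( uint32_t ), uint32_t limit )
{
    unsigned n = 0;
    for ( uint32_t x = 0; x < limit; ++x ) {
        if ( two_step( x ) != direct( x ) ) ++n;
    }
    return n;
}
```

Take x = 383, which is 1 + 127/256. The direct result is 1. Rounding to nearest first turns the fraction into an exact tie, which then rounds up to 2. Rounding to odd first keeps the intermediate value below the tie, so the second step returns 1.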
59 changes: 59 additions & 0 deletions softfloat/f8_eq.c
@@ -0,0 +1,59 @@
/*============================================================================

This C source file is part of the SoftFloat IEEE Floating-Point Arithmetic
Package, Release 3d, by John R. Hauser.

Copyright 2011, 2012, 2013, 2014, 2015 The Regents of the University of
California. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions, and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the University nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS", AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=============================================================================*/

#include <stdbool.h>
#include <stdint.h>
#include "platform.h"
#include "internals.h"
#include "softfloat.h"
#include "specialize.h"

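bool f8_eq( float8_t a, float8_t b )
{
    union ui8_f8 uA;
    uint_fast8_t uiA;
    union ui8_f8 uB;
    uint_fast8_t uiB;

    uA.f = a;
    uiA = uA.ui;
    uB.f = b;
    uiB = uB.ui;
    // Any NaN operand compares unequal; the invalid flag is raised for every
    // NaN because no signaling/quiet distinction is made here.
    if ( isNaNF8UI( uiA ) || isNaNF8UI( uiB ) ) {
        softfloat_raiseFlags( softfloat_flag_invalid );
        return false;
    }
    // Equal bit patterns, or both operands are zero (+0 == -0)
    return (uiA == uiB) || ! (uint8_t) ((uiA | uiB)<<1);
}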
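IEEE 754 equality treats +0 and -0 as equal and makes every comparison involving a NaN false. The sketch below works on raw 8-bit patterns and assumes the E5M2 layout (1 sign bit, 5 exponent bits, 2 fraction bits, NaN encoded as an all-ones exponent with a nonzero fraction). The names `e5m2_is_nan_sketch` and `e5m2_eq_sketch` are hypothetical, and exception flags are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: E5M2 encoding. With the sign bit masked off, infinity is 0x7C
 * and every larger magnitude pattern is a NaN. */
static bool e5m2_is_nan_sketch( uint8_t ui )
{
    return (ui & 0x7F) > 0x7C;
}

/* Quiet equality on raw E5M2 bit patterns. */
static bool e5m2_eq_sketch( uint8_t a, uint8_t b )
{
    if ( e5m2_is_nan_sketch( a ) || e5m2_is_nan_sketch( b ) ) return false;
    /* Same pattern, or both are zeros once the sign bit is shifted out. */
    return (a == b) || (uint8_t) ((a | b) << 1) == 0;
}
```

With this encoding, 0x00 and 0x80 (the two zeros) compare equal, 0x3C and 0xBC (1.0 and -1.0) do not, and 0x7E never equals itself.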