AOCL-LibMem: AMD-optimized memory/string functions
Details:
 * Initial Public Version of AOCL LibMem Library
 * This corresponds to AOCL 4.1 Release
 * Supports memcpy, mempcpy, memmove, memset, memcmp and strcpy
SajanKarumanchi committed Aug 7, 2023
0 parents commit b13b857
Showing 75 changed files with 10,584 additions and 0 deletions.
134 changes: 134 additions & 0 deletions BUILD_RUN.md
@@ -0,0 +1,134 @@
# Build & Run Guide for **_AOCL-LibMem_**

## Requirements
* **cmake 3.11**
* **python 3.10**
* **gcc 12.2.0**
* **aocc 4.0**

## Build Procedure:
### Shared Library:
```sh
$ mkdir build
$ cd build

# Configure for a GCC build (run exactly one of the following)
# Default native build
$ cmake -D CMAKE_C_COMPILER=gcc ../aocl-libmem
# Cross-compile an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx2 ../aocl-libmem
# Cross-compile an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx512 ../aocl-libmem
# Enable tunable parameters
$ cmake -D CMAKE_C_COMPILER=gcc -D ENABLE_TUNABLES=Y ../aocl-libmem

# Configure for an AOCC (Clang) build (run exactly one of the following)
# Default native build
$ cmake -D CMAKE_C_COMPILER=clang ../aocl-libmem
# Cross-compile an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx2 ../aocl-libmem
# Cross-compile an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx512 ../aocl-libmem
# Enable tunable parameters
$ cmake -D CMAKE_C_COMPILER=clang -D ENABLE_TUNABLES=Y ../aocl-libmem

# Build
$ cmake --build .
# Install
$ make install
```

The build generates the shared library `libaocl-libmem.so` under `build/lib/shared/`.


### Static Library:
```sh
$ mkdir build
$ cd build

# Configure for a GCC build (run exactly one of the following)
# Default native build
$ cmake -D CMAKE_C_COMPILER=gcc -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compile an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx2 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compile an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx512 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Enable tunable parameters
$ cmake -D CMAKE_C_COMPILER=gcc -D ENABLE_TUNABLES=Y -D BUILD_SHARED_LIBS=N ../aocl-libmem

# Configure for an AOCC (Clang) build (run exactly one of the following)
# Default native build
$ cmake -D CMAKE_C_COMPILER=clang -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compile an AVX2 binary on an AVX512 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx2 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Cross-compile an AVX512 binary on an AVX2 machine
$ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx512 -D BUILD_SHARED_LIBS=N ../aocl-libmem
# Enable tunable parameters
$ cmake -D CMAKE_C_COMPILER=clang -D ENABLE_TUNABLES=Y -D BUILD_SHARED_LIBS=N ../aocl-libmem

# Build
$ cmake --build .
# Install
$ make install
```

The build generates the static library `libaocl-libmem.a` under `build/lib/static/`.

## Debug Build:
To enable logging, configure the build as follows:
```sh
$ cmake -D ENABLE_LOGGING=Y ../aocl-libmem
```
Logs will be stored in the `/tmp/libmem.log` file.

To enable debug-level logs, uncomment the following line in the root `CMakeLists.txt`:
_debugging logs_: `add_definitions(-DLOG_LEVEL=4)`

## Running application:
``Run the application by preloading the shared 'libaocl-libmem.so' generated from the above build procedure.``
```sh
$ LD_PRELOAD=<path to build/lib/shared/libaocl-libmem.so> <executable> <params>
```
* **`WARNING: Do not load or run the AVX512 library on a non-AVX512 machine. Doing so will crash the application with invalid-instruction errors.`**
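
The warning above can be turned into a pre-flight check. The following is a minimal sketch, assuming a Linux host where CPU features are listed on the `flags` line of `/proc/cpuinfo`; the helper names `cpu_has_flag` and `host_almem_arch` are hypothetical and not part of the library. It tests for `avx512f` (the AVX512 foundation flag); which AVX512 subsets the library actually needs is not stated in this guide.

```sh
# Return 0 if the CPU flag named in $1 appears on the "flags" line of the
# cpuinfo file given in $2 (default: /proc/cpuinfo, Linux only).
cpu_has_flag() {
    grep -Eq "^flags[[:space:]]*:(.*[[:space:]])?$1([[:space:]]|\$)" \
        "${2:-/proc/cpuinfo}"
}

# Print the ALMEM_ARCH value (avx512 or avx2) that the host can execute.
# Return 1 if neither flag is present. $1 is an optional cpuinfo file.
host_almem_arch() {
    if cpu_has_flag avx512f "$1"; then
        echo avx512
    elif cpu_has_flag avx2 "$1"; then
        echo avx2
    else
        return 1
    fi
}
```

A launcher script could call `host_almem_arch` and refuse to set `LD_PRELOAD` to an AVX512 build when the result is not `avx512`.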

## User Config:
### 1. Default State Run:
``Best fit implementation for the underlying ZEN microarchitecture will be chosen by the library.``


### 2. Tunable State Run:

_There are two tunables that will be parsed by libmem._
* **`LIBMEM_OPERATION`** :- instruction based on alignment and cacheability
* **`LIBMEM_THRESHOLD`** :- the threshold for ERMS and Non-Temporal instructions

The library will choose the implementation based on the tuned parameter at run time.

#### 2.1. _LIBMEM_OPERATION_ :
**Setting this tunable lets you choose an implementation that combines a move-instruction type with the alignment of the source and destination addresses.**

**LIBMEM_OPERATION** format: **`<operations>,<source_alignment>,<destination_alignment>`**

##### Valid options:
* `<operations> = [avx2|avx512|erms]`
* `<source_alignment> = [b|w|d|q|x|y|n]`
* `<destination_alignment> = [b|w|d|q|x|y|n]`

e.g.: To use only AVX2-based move operations with both source and destination addresses unaligned:
```sh
LD_PRELOAD=<build/lib/shared/libaocl-libmem.so> LIBMEM_OPERATION=avx2,b,b <executable>
```
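
Because the value is a plain comma-separated string, it can be checked before the application starts. The sketch below validates a candidate value against the valid options listed above; `valid_libmem_operation` is a hypothetical helper, and how the library itself treats malformed values is not covered here.

```sh
# Return 0 if $1 has the form <operations>,<source_alignment>,<destination_alignment>
# using only the options documented above; return 1 otherwise.
valid_libmem_operation() {
    case "$1" in
        avx2,[bwdqxyn],[bwdqxyn])   return 0 ;;
        avx512,[bwdqxyn],[bwdqxyn]) return 0 ;;
        erms,[bwdqxyn],[bwdqxyn])   return 0 ;;
        *)                          return 1 ;;
    esac
}
```

For example, `valid_libmem_operation "avx2,b,b"` succeeds, while a value with a missing field such as `avx2,b` is rejected.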

#### 2.2. _LIBMEM_THRESHOLD_ :
**Setting this tunable lets you configure the threshold values for the supported instruction set.**

**LIBMEM_THRESHOLD** format: **`<repmov_start_threshold>,<repmov_stop_threshold>,<nt_start_threshold>,<nt_stop_threshold>`**

##### Valid options:
* `<repmov_start_threshold> = [0, +ve integers]`
* `<repmov_stop_threshold> = [0, +ve integers, -1]`
* `<nt_start_threshold> = [0, +ve integers]`
* `<nt_stop_threshold> = [0, +ve integers, -1]`

Make sure that each start and stop pair forms a valid range.
To set a stop threshold to the maximum length, pass `-1`.

e.g.: To use **REP MOVE** instructions for sizes from 1 KB to 2 KB and non-temporal instructions for sizes of 512 KB and above:
```sh
LD_PRELOAD=<build/lib/shared/libaocl-libmem.so> LIBMEM_THRESHOLD=1024,2048,524288,-1 <executable>
```
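
The threshold values in the example are byte counts (1 KB = 1024, 2 KB = 2048, 512 KB = 524288). As a rough sketch, the string can be built from sizes in KB to avoid arithmetic slips. The helper names `kb_to_bytes` and `libmem_threshold_kb` are hypothetical, and the start-not-greater-than-stop check is an assumption about what a "valid range" means.

```sh
# Convert a size in KB to bytes; -1 ("maximum length") passes through unchanged.
kb_to_bytes() {
    if [ "$1" -eq -1 ]; then
        echo -1
    else
        echo $(( $1 * 1024 ))
    fi
}

# Build a LIBMEM_THRESHOLD value from four sizes in KB:
#   repmov_start repmov_stop nt_start nt_stop
# Prints nothing and returns 1 if a start value exceeds its stop value.
libmem_threshold_kb() {
    _rs=$(kb_to_bytes "$1"); _re=$(kb_to_bytes "$2")
    _ns=$(kb_to_bytes "$3"); _ne=$(kb_to_bytes "$4")
    if [ "$_re" -ne -1 ] && [ "$_rs" -gt "$_re" ]; then return 1; fi
    if [ "$_ne" -ne -1 ] && [ "$_ns" -gt "$_ne" ]; then return 1; fi
    echo "$_rs,$_re,$_ns,$_ne"
}
```

With these helpers, `libmem_threshold_kb 1 2 512 -1` prints `1024,2048,524288,-1`, the same value used in the example above.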
**`Refer to the User Guide (docs/User_Guide.md) for detailed parameter tuning.`**
104 changes: 104 additions & 0 deletions CMakeLists.txt
@@ -0,0 +1,104 @@
# Copyright (C) 2022-23 Advanced Micro Devices, Inc. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification,
# are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
# INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
# OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

cmake_minimum_required(VERSION 3.10)

# avoid build in root directory
if(${CMAKE_SOURCE_DIR} STREQUAL ${CMAKE_BINARY_DIR})
    message(FATAL_ERROR "In-source build detected!")
endif()


# set the project name and version
set(LIBMEM_VERSION_STRING 4.1.0)

project(aocl-libmem VERSION ${LIBMEM_VERSION_STRING} LANGUAGES C DESCRIPTION
"Library of AMD optimized string/memory functions")

string(TIMESTAMP BUILD_DATE "%Y%m%d")

set(LIBMEM_BUILD_VERSION_STR
"AOCL-LibMem ${LIBMEM_VERSION_STRING} Build ${BUILD_DATE}")

add_definitions(-DLIBMEM_BUILD_VERSION="${LIBMEM_BUILD_VERSION_STR}")

set(DEFAULT_BUILD_TYPE "Release")

option(BUILD_SHARED_LIBS "Build shared libraries" ON)

option(ENABLE_LOGGING "Enable Logger" OFF)

option(ENABLE_TUNABLES "Enable user input" OFF)

if (ENABLE_LOGGING)
    add_definitions(-DENABLE_LOGGER)
    # uncomment the below for debug logs, LOG_LEVEL=DEBUG.
    #add_definitions(-DLOG_LEVEL=4)
endif ()

option(ALMEM_ARCH "ISA_ARCH_TYPE" ON)

# grep exits with 0 when the feature is listed, so a zero result means "present"
execute_process(COMMAND bash "-c" "lscpu | grep erms"
                RESULT_VARIABLE ERMS_FEATURE OUTPUT_QUIET)

if (NOT ${ERMS_FEATURE})
    # uncomment after adding ERMS support for all funcs
    #add_definitions(-DERMS_FEATURE_ENABLED)
    message("ERMS Feature Enabled on Build machine.")
endif ()

execute_process(COMMAND bash "-c" "lscpu | grep avx512"
                RESULT_VARIABLE AVX512_FEATURE OUTPUT_QUIET)

if (NOT ${AVX512_FEATURE})
    message("AVX512 Feature Enabled on Build machine.")
    if (NOT ${ALMEM_ARCH} STREQUAL "avx2")
        message("Setting Arch to AVX512")
        set(ALMEM_ARCH "avx512")
    endif ()
endif ()

if (ALMEM_ARCH MATCHES "avx512")
    set(AVX512_FEATURE_AVAILABLE true)
    add_definitions(-DAVX512_FEATURE_ENABLED)
    if (${AVX512_FEATURE})
        message("Cross-Compiling for AVX512 Arch...")
    else ()
        message("Native-Compiling for AVX512 Arch...")
    endif ()
else ()
    if (NOT ${AVX512_FEATURE})
        message("Cross-Compiling for AVX2 Arch...")
    else ()
        message("Native-Compiling for AVX2 Arch...")
    endif ()
endif ()

# option for building shared lib
option(BUILD_SHARED_LIBS "Build using shared libraries" ON)

# let the build system know the source directory
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/src)

file(WRITE ${CMAKE_BINARY_DIR}/version.h ${LIBMEM_BUILD_VERSION_STR})
153 changes: 153 additions & 0 deletions COPYRIGHT.txt
@@ -0,0 +1,153 @@
(C) 2022-23 Advanced Micro Devices, Inc. All Rights Reserved.

Advanced Micro Devices, Inc.
Software License Agreement

IMPORTANT-READ CAREFULLY: Do not load or use the Software until you have
carefully read and agreed to the following terms and conditions. This is a
legal agreement ("Agreement") between you (either an individual or an entity)
("Licensee") and Advanced Micro Devices, Inc. ("AMD"). If Licensee does not
agree to the terms of this Agreement, do not install or use this software or
any portion thereof. By loading or using the object code version only of the
software obtained herewith, which may include associated install scripts and
online or electronic documentation or any portion thereof, that is made
available by AMD to download from any media ("Software"), Licensee agrees to
all of the terms of this Agreement.

1. LICENSE:

a. Subject to the terms and conditions of this Agreement, AMD grants
Licensee the following non-exclusive, non-transferable, royalty-free,
limited copyright license to download, copy, use, distribute and sublicense
the foregoing rights through multiple tiers of sublicenses the object code
version of the Software and materials associated with this Agreement,
including without limitation printed documentation, (collectively,
"Materials"), provided that Licensee agrees to include all copyright
legends and other legal notices that may appear in the Materials. The
foregoing license is conditioned upon Licensee distributing the object code
version of the Software only and under this software license agreement.
Except for the limited license granted herein, Licensee shall have no other
rights in the Materials, whether express, implied, arising by estoppel or
otherwise.

b. Except as expressly set forth in Section 1(a), Licensee does not have
the right to (i) distribute, rent, lease, sell, sublicense, assign, or
otherwise transfer the Materials, in whole or in part, to third parties for
commercial or for non-commercial use; or (ii) modify, disassemble, reverse
engineer, or decompile the Software, or otherwise reduce any part of the
Software to any human readable form. All rights in and to the Materials
not expressly granted to Licensee in this Agreement are reserved to AMD.

2. FEEDBACK: Licensee may provide AMD feedback, suggestions or opinions as to
the Software, its features, and desired enhancements or changes. If Licensee
provides feedback, suggestions or opinions to AMD regarding any new features,
use, functionality, or change to the Software or any materials related to the
Software, Licensee hereby agrees to grant, and does grant, AMD all rights
needed for AMD to incorporate, modify, distribute, use and commercialize any
new feature, use, functionality, or change at no charge or encumbrance to AMD.
Licensee agrees that AMD may disclose such feedback, suggestions or opinions to
any third party in any manner, and Licensee agrees that AMD has the ability to
sublicense any of the foregoing rights in any feedback, suggestions or opinions
or AMD products or services in any form to any third party without restriction.

3. OWNERSHIP AND COPYRIGHT OF MATERIALS: Licensee agrees that the Materials
are owned by AMD and are protected by United States and foreign intellectual
property laws (e.g. patent and copyright laws) and international treaty
provisions. Licensee will not remove the copyright notice from the Materials.
Licensee agrees to prevent any unauthorized copying of the Materials. All
title and copyrights in and to the Materials, all copies thereof (in whole or
in part, and in any form), and all rights therein shall remain vested in AMD.
Except as expressly provided herein, AMD does not grant any express or implied
right to Licensee under AMD patents, copyrights, trademarks, or trade secret
information.

4. WARRANTY DISCLAIMER: THE MATERIALS ARE PROVIDED "AS IS" WITHOUT ANY EXPRESS
OR IMPLIED WARRANTY OF ANY KIND INCLUDING WARRANTIES OF MERCHANTABILITY,
NONINFRINGEMENT OF THIRD-PARTY INTELLECTUAL PROPERTY, TITLE, OR FITNESS FOR ANY
PARTICULAR PURPOSE, OR THOSE ARISING FROM CUSTOM OF TRADE OR COURSE OF USAGE.
THE ENTIRE RISK ARISING OUT OF USE OR PERFORMANCE OF THE MATERIALS REMAINS WITH
LICENSEE. AMD DOES NOT WARRANT, GUARANTEE, OR MAKE ANY REPRESENTATIONS AS TO
THE CORRECTNESS, ACCURACY, COMPLETENESS, QUALITY, OR RELIABILITY OF THE
MATERIALS.

AMD DOES NOT WARRANT THAT OPERATION OF THE MATERIALS WILL BE UNINTERRUPTED OR
ERROR-FREE. YOU ARE RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING
THE SOFTWARE AND ASSUME ALL RISKS ASSOCIATED WITH THE USE OF THE MATERIALS,
INCLUDING BUT NOT LIMITED TO THE RISKS OF PROGRAM ERRORS, DAMAGE TO OR LOSS OF
DATA, PROGRAMS OR EQUIPMENT, AND UNAVAILABILITY OR INTERRUPTION OF OPERATIONS.
SOME JURISDICTIONS DO NOT ALLOW FOR THE EXCLUSION OR LIMITATION OF IMPLIED
WARRANTIES, SO THE ABOVE LIMITATIONS OR EXCLUSIONS MAY NOT APPLY TO LICENSEE.

5. LIMITATION OF LIABILITY: IN NO EVENT SHALL AMD OR ITS DIRECTORS, OFFICERS,
EMPLOYEES AND AGENTS, ITS SUPPLIERS OR ITS LICENSORS BE LIABLE TO LICENSEE OR
ANY THIRD PARTIES IN RECEIPT OF THE MATERIALS FOR CONSEQUENTIAL, INCIDENTAL,
PUNITIVE OR SPECIAL DAMAGES, INCLUDING, BUT NOT LIMITED TO LOSS OF PROFITS,
BUSINESS INTERRUPTION, OR LOSS OF INFORMATION ARISING OUT OF THE USE OF OR
INABILITY TO USE THE MATERIALS, EVEN IF AMD HAS BEEN ADVISED OF THE POSSIBILITY
OF SUCH DAMAGES. AMD DOES NOT ASSUME ANY RESPONSIBILITY TO SUPPORT OR UPDATE
THE MATERIALS. BY USING THE MATERIALS WITHOUT CHARGE, YOU ACCEPT THIS
ALLOCATION OF RISK. BECAUSE SOME JURISDICTIONS PROHIBIT THE EXCLUSION OR
LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE
LIMITATION MAY NOT APPLY TO LICENSEE.

6. U.S. GOVERNMENT RESTRICTED RIGHTS: The Materials are provided with
"RESTRICTED RIGHTS." Use, duplication or disclosure by the Government is
subject to restrictions as set forth in FAR52.227-14 and DFAR252.227-7013, et
seq., or its successor. Use of the Materials by the Government constitutes
acknowledgment of AMD's proprietary rights in them.

7. TERMINATION OF LICENSE: This Agreement will terminate immediately without
notice from AMD or judicial resolution if Licensee fails to comply with any
provisions of this Agreement. Upon termination of this Agreement, Licensee
must delete or destroy all copies of the Materials.

8. SUPPORT. Under this Agreement, AMD is under no obligation to assist in the
use of the Materials, to provide support to licensees of the Materials, or to
provide maintenance, correction, modification, enhancement, or upgrades to the
Materials. If AMD determines, in its sole discretion, to support, maintain,
correct, modify, enhance, or upgrade the Software, such support, maintenance,
correction, modification, enhancement or upgrade shall be considered part of
the Materials, and shall be subject to this Agreement.

9. SURVIVAL: Sections 1(b), 2, 3, 4, 5, 6, and 8 through 14 shall survive any
expiration or termination of this Agreement.

10. APPLICABLE LAWS: Any claim arising under or relating to this Agreement
shall be governed by and construed in accordance with the substantive laws of
the State of California, without regard to principles of conflict of laws.
Each party hereto submits to the jurisdiction of the state and federal courts
of Santa Clara County and the Northern District of California for the purposes
of all legal proceedings arising out of or relating to this Agreement or the
subject matter hereof. Each party waives any objection which it may have to
contest such forum.

11. IMPORT/EXPORT/RE-EXPORT/USE/RELEASE/TRANSFER RESTRICTIONS AND COMPLIANCE
WITH LAWS: Licensee is hereby provided notice, and agrees and acknowledges,
that the Software, its source code, any accompanying media, material or
information, and any product of the foregoing, may be subject to restrictions
on use, release, transfer, importation, exportation and/or re-exportation
under the laws and regulations of the United States or other countries
("Applicable Laws"), which include but are not limited to U.S. export control
laws such as the Export Administration Regulations and national security
controls as defined thereunder, as well as State Department controls under the
U.S. Munitions List. Licensee further agrees that the Software, its source
code, any accompanying media, material or information, and any product of the
foregoing, will not be used, released, transferred, imported, exported and/or
re-exported in any manner prohibited under Applicable Laws, including U.S.
export control laws regarding specifically designated persons, countries and
nationals of countries subject to national security controls as provided in
License Exception TSR of the Export Administration Regulations and any
successor regulations.

12. SEVERABILITY: Should any term of this Agreement be declared void or
unenforceable by any court of competent jurisdiction, such declaration shall
have no effect on the remaining terms hereof.

13. NO WAIVER: The failure of either party to enforce any rights granted
hereunder or to take action against the other party in the event of any breach
hereunder shall not be deemed a waiver by that party as to subsequent
enforcement of rights or subsequent actions in the event of future breaches.

14. ENTIRE AGREEMENT: This Agreement constitutes the entire agreement between
the parties and supersedes any prior or contemporaneous oral or written
agreements with respect to the subject matter of this Agreement.