Installation

These installation instructions are for running Retina on a bare metal Ubuntu server with a Mellanox server NIC, and have been tested on the following platforms:

| CPU | OS | NIC |
| --- | --- | --- |
| Intel Xeon Gold 6154R | Ubuntu 18.04 | Mellanox ConnectX-5 100G MCX516A-CCAT |
| Intel Xeon Gold 6248R | Ubuntu 20.04 | Mellanox ConnectX-5 100G MCX516A-CCAT |
| Intel Xeon Silver 4314 | Ubuntu 20.04 | Mellanox ConnectX-5 100G MCX516A-CCA_Ax |
| AMD EPYC 7452 32-Core | Ubuntu 20.04 | Mellanox ConnectX-5 Ex Dual Port 100 GbE |

We have also tested Retina in offline mode on both x86 and ARM-based Ubuntu VMs.

Retina can run on other platforms as well; details to come.

Hardware Recommendations

Retina should work on any commodity x86 server, but the more cores and memory the better. For real-time operation in 100G network environments, we recommend at least 64GB of memory and a 100G Mellanox ConnectX-5 or similar, but any DPDK-compatible NIC should work.

Installing Dependencies

On Ubuntu, install dependencies with the following command:

sudo apt install build-essential meson pkg-config libnuma-dev python3-pyelftools libpcap-dev libclang-dev python3-pip

Building and Installing DPDK

Retina currently requires DPDK 21.08. The latest LTS release (21.11) contains breaking API changes, while 20.11 LTS has a bug that causes inaccurate packet drop metrics on some NICs.

System Configuration

To get high performance from DPDK applications, we recommend the following system configuration steps. More details from the DPDK docs can be found here.

Allocate 1GB hugepages

Edit the GRUB boot settings /etc/default/grub to reserve 1GB hugepages and isolate CPU cores that will be used for Retina. For example, to reserve 64 1GB hugepages and isolate cores 1-32:

GRUB_CMDLINE_LINUX="default_hugepagesz=1G hugepagesz=1G hugepages=64 iommu=pt intel_iommu=on isolcpus=1-32"

Update the GRUB settings and reboot:

sudo update-grub
sudo reboot now
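After rebooting, it is worth sanity-checking that the kernel picked up the new parameters (the expected count below assumes the example GRUB line above):

cat /proc/cmdline                    # should include the hugepage and isolcpus arguments
grep HugePages_Total /proc/meminfo   # should report 64 (1GB pages, since default_hugepagesz=1G)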

Mount hugepages to make them available for DPDK use:

sudo mkdir /mnt/huge
sudo mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge
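This mount does not persist across reboots. Optionally, one way to make it permanent is to add an fstab entry (a minimal sketch; adjust the mount point if yours differs):

echo 'nodev /mnt/huge hugetlbfs pagesize=1G 0 0' | sudo tee -a /etc/fstab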

Install MLX5 PMD Dependencies

If using a Mellanox ConnectX-5 (recommended), you will need to separately install some dependencies that do not come with DPDK (details). This can be done by installing Mellanox OFED. DPDK recommends MLNX_OFED 5.4-1.0.3.0 in combination with DPDK 21.08.

Download the MLNX_OFED from the MLNX_OFED downloads page, then run the following commands to install:

tar xvf MLNX_OFED_LINUX-5.4-1.0.3.0-ubuntu20.04-x86_64.tgz
cd MLNX_OFED_LINUX-5.4-1.0.3.0-ubuntu20.04-x86_64/
sudo ./mlnxofedinstall --dpdk --upstream-libs --with-mft --with-kernel-mft
ibv_devinfo    # verify the firmware version and that the port link layer is set to Ethernet
sudo /etc/init.d/openibd restart

This may update the firmware on your NIC; if it does, a reboot should complete the update.
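After the restart (and reboot, if one was needed), you can confirm that the Mellanox driver is loaded and the ports report an Ethernet link layer:

lsmod | grep mlx5               # expect mlx5_core to be listed
ibv_devinfo | grep link_layer   # each port should report Ethernet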

Install DPDK from source

We recommend a local DPDK install from source. Download version 21.08 from the DPDK downloads page:

wget http://fast.dpdk.org/rel/dpdk-21.08.tar.xz
tar xJf dpdk-21.08.tar.xz

Set environment variables:

export DPDK_PATH=/path/to/dpdk/dpdk-21.08
export LD_LIBRARY_PATH=$DPDK_PATH/lib/x86_64-linux-gnu
export PKG_CONFIG_PATH=$LD_LIBRARY_PATH/pkgconfig
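These exports only apply to the current shell; if you want them to persist, one option is to append them to your shell profile, reusing the same placeholder path as above:

echo 'export DPDK_PATH=/path/to/dpdk/dpdk-21.08' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$DPDK_PATH/lib/x86_64-linux-gnu' >> ~/.bashrc
echo 'export PKG_CONFIG_PATH=$LD_LIBRARY_PATH/pkgconfig' >> ~/.bashrc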

Compile DPDK

From $DPDK_PATH, run:

meson --prefix=$DPDK_PATH build
cd build
sudo ninja install
sudo ldconfig

More information on compiling DPDK can be found here.
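To verify that the installed library and its pkg-config metadata are visible (this assumes the PKG_CONFIG_PATH export from the previous step):

pkg-config --modversion libdpdk   # should report 21.08.0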

Troubleshooting: Building DPDK 21.08 with Meson

Meson >= 0.60 may fail to build DPDK 21.08. You can either apply the fix to the DPDK source or build Meson < 0.60 from source (after downloading and extracting it, run python3 setup.py build && sudo python3 setup.py install).
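Alternatively, a pinned pip install of an older Meson avoids the source build, assuming the pip-installed binary takes precedence on your PATH:

pip3 install --user 'meson<0.60'
meson --version   # confirm a 0.5x version is picked up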

(Optional) Binding network interfaces to DPDK-compatible driver

Depending on your NIC and the associated DPDK poll mode driver (PMD), you may need to bind the device/interface to a DPDK-compatible driver in order to make it work properly. Note: this step does not need to be done for the Mellanox PMD (mlx5). Details on binding and unbinding to drivers can be found here.

Example bind to a DPDK-compatible driver:

sudo modprobe vfio-pci  # Load the vfio-pci module
sudo $DPDK_PATH/usertools/dpdk-devbind.py --bind=vfio-pci <interface_name/pci_address>   # Unbinds from kernel module, binds to vfio-pci
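To confirm which driver each device is bound to:

sudo $DPDK_PATH/usertools/dpdk-devbind.py --status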

Installing Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

More information on Rust installation can be found here.
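To confirm the toolchain is on your PATH:

rustc --version && cargo --version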

Building and Running Retina

Retina should be built and run from source. Clone the main git repository:

git clone git@github.com:stanford-esrg/retina.git

Build all applications:

cargo build --release
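If the repository is organized as a Cargo workspace (as the build-all command above suggests), a single application can be built with cargo's package flag; my_app here is the same placeholder name used in the run command below:

cargo build --release -p my_app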

Run (sudo clears the environment by default, so LD_LIBRARY_PATH must be passed explicitly via env):

sudo env LD_LIBRARY_PATH=$LD_LIBRARY_PATH RUST_LOG=error ./target/release/my_app

Troubleshooting: Bindgen

Retina uses bindgen to generate bindings to DPDK functions implemented in C. As of 06/2024, we have encountered issues when using bindgen with clang/llvm versions newer than 13, apparently due to newly introduced APIs for SIMD intrinsics.

If you are using clang and building Retina fails with an error such as the one below, downgrade clang/llvm to <=13.

error: invalid conversion between vector type '__m128i' (vector of 2 'long long' values) and integer type 'int' of different size
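On Ubuntu, one way to pin an older toolchain is shown below; package names and availability vary by release, and LIBCLANG_PATH only needs to be set if several libclang versions are installed side by side:

sudo apt install clang-13 libclang-13-dev
export LIBCLANG_PATH=/usr/lib/llvm-13/lib   # point bindgen's clang-sys at libclang 13
cargo clean && cargo build --release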

Testing Retina (Offline) on a VM

We have deployed Retina in offline mode (streaming pcaps) on both ARM- and x86-based Ubuntu VMs. This can be useful for getting started, development, and functional testing.

The main branch of Retina may specify "mlx5" as a default feature, as this is the recommended setup. Remove this feature from core/Cargo.toml if the Mellanox libraries are not present on the VM.

For an x86 architecture, no other changes are needed.

For ARM vCPU:

  • When building DPDK, add a meson build option to configure for a generic or native SoC (the command below selects generic):
meson --prefix=$DPDK_PATH -Dplatform=generic build
  • Let LD_LIBRARY_PATH point to aarch64-linux-gnu:
export LD_LIBRARY_PATH=$DPDK_PATH/lib/aarch64-linux-gnu

Troubleshooting: Mempool Capacity

When running applications using the provided offline config file, a mempool creation error may occur:

Error: Mempool mempool_0 creation failed

This can be resolved by reducing the mempool capacity in the config file.