Compute node configuration

This page is obsolete.

The OpenVIM and OpenMANO projects have been contributed to the open source community project Open Source MANO (OSM), hosted by ETSI.

Visit osm.etsi.org to learn more about OSM.

#Introduction#

This article contains general guidelines for configuring a compute node for NFV, based on a 64-bit Linux OS with KVM, qemu and libvirt (e.g. RHEL7.1, RHEL7.0, CentOS 7.1, Ubuntu Server 14.04).

This article applies to Linux systems in general and tries to gather all the configuration steps. The steps have not been thoroughly tested on all Linux distros, so there is no guarantee that they are 100% accurate.

For specifics of the installation procedure for a specific distro, follow these links:

Note: Openvim has been tested with servers based on Xeon E5-based Intel processors with Ivy Bridge architecture, and with Intel X520 NICs based on Intel 82599 controller. No tests have been carried out with Intel Core i3, i5 and i7 families, so there are no guarantees that the integration will be seamless.

The configuration that must be applied to the compute node is the following:

- BIOS setup
- Install virtualization packages (kvm, qemu, libvirt, etc.)
- Use a kernel with support of huge page TLB cache in IOMMU
- Enable IOMMU
- Enable 1G hugepages, and reserve enough hugepages for running the VNFs
- Isolate CPUs so that the host OS is restricted to run on the first core of each NUMA node
- Enable SR-IOV
- Enable all processor virtualization features in the BIOS
- Enable hyperthreading in the BIOS (optional)
- Deactivate KSM
- Pre-provision Linux bridges
- Additional configuration to allow access from openvim, including the configuration to access the image repository and the creation of appropriate folders for image on-boarding

A full description of this configuration is detailed below.

#BIOS setup#

Ensure that the virtualization options are active. If they are, the following command should give a non-empty output:
```
  egrep "(vmx|svm)" /proc/cpuinfo
```
It is also recommended to activate hyperthreading. If it is active, the following command should give a non-empty output:
```
  egrep ht /proc/cpuinfo
```
Ensure no power saving option is enabled.
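
The two checks above can be wrapped in a small helper that matches a flag as a whole word, so that a flag which merely contains the searched string does not count. This is only a sketch: the function name `has_cpu_flag` is hypothetical, and it assumes an x86 /proc/cpuinfo layout with a `flags` line.

```shell
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on a "flags" line of FILE
# (default: /proc/cpuinfo). The function name is illustrative only.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Keep only the flags lines, put one word per line, match exactly.
    grep -E '^flags[[:space:]]*:' "$file" | tr -s ' \t' '\n\n' | grep -Fqx -- "$flag"
}

# Usage:
#   has_cpu_flag vmx || has_cpu_flag svm || echo "virtualization flag not found"
#   has_cpu_flag ht  || echo "ht flag not found"
```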

#Installation of virtualization packages#

Install the following packages in your host OS: qemu-kvm libvirt-bin bridge-utils virt-viewer virt-manager

#IOMMU TLB cache support#

Use a kernel with support for huge page TLB cache in IOMMU, for example RHEL7.1, Ubuntu 14.04, or a vanilla kernel 3.14 or higher. If your kernel lacks this support, you should update it. For instance, you can use the following kernel for RHEL7.0 (not needed for RHEL7.1):
```
  wget http://people.redhat.com/~mtosatti/qemu-kvm-take5/kernel-3.10.0-123.el7gig2.x86_64.rpm
  rpm -Uvh kernel-3.10.0-123.el7gig2.x86_64.rpm --oldpackage
```

#Enabling IOMMU#

Enable IOMMU by adding the following to the grub command line:
```
  intel_iommu=on 
```
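
Several steps on this page add a parameter to the grub command line. One way to do that is sketched below, under the assumption that the distro keeps the kernel arguments in a `GRUB_CMDLINE_LINUX="..."` line of /etc/default/grub. The function name `add_kernel_arg` is hypothetical, and the grub configuration still has to be regenerated afterwards.

```shell
# add_kernel_arg ARG [FILE]
# Appends ARG to the GRUB_CMDLINE_LINUX="..." line of FILE
# (default: /etc/default/grub, an assumption) unless it is already there.
add_kernel_arg() {
    arg=$1
    file=${2:-/etc/default/grub}
    tmp=$(mktemp) || return 1
    awk -v arg="$arg" '
        /^GRUB_CMDLINE_LINUX="/ {
            line = $0
            sub(/^GRUB_CMDLINE_LINUX="/, "", line)   # drop the prefix
            sub(/"[ \t]*$/, "", line)                # drop the closing quote
            n = split(line, words, " ")
            found = 0
            for (i = 1; i <= n; i++) if (words[i] == arg) found = 1
            if (!found) line = (line == "" ? arg : line " " arg)
            print "GRUB_CMDLINE_LINUX=\"" line "\""
            next
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root), then regenerate the grub configuration and reboot:
#   add_kernel_arg intel_iommu=on
#   grub2-mkconfig -o /boot/grub2/grub.cfg   # RHEL/CentOS 7 with BIOS boot
#   update-grub                              # Ubuntu
```

Calling the function twice with the same argument leaves the line unchanged, so it can be reused for the other parameters described on this page.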

#Enabling 1G hugepages#

Enable 1G hugepages by adding the following to the grub command line:
```
  default_hugepagesz=1G hugepagesz=1G
```
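
After rebooting, it can be useful to confirm that the kernel was actually started with the expected parameters. The helper below is a sketch that looks for an exact word in /proc/cmdline; the function name `cmdline_has` is hypothetical.

```shell
# cmdline_has ARG [FILE]
# Succeeds if ARG is one of the space-separated words of FILE
# (default: /proc/cmdline). Substrings do not match.
cmdline_has() {
    arg=$1
    file=${2:-/proc/cmdline}
    tr -s ' \t' '\n\n' < "$file" | grep -Fqx -- "$arg"
}

# Usage:
#   for a in intel_iommu=on default_hugepagesz=1G hugepagesz=1G; do
#       cmdline_has "$a" || echo "missing kernel parameter: $a"
#   done
```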

There are several ways to specify how much memory to reserve:

As a boot option, by adding hugepages=24 to the grub command line (reserves 24GB).

With a hugetlb-gigantic-pages.service unit on modern kernels. On a RHEL-based Linux system, create the configuration file /usr/lib/systemd/system/hugetlb-gigantic-pages.service with this content:

  [Unit]
  Description=HugeTLB Gigantic Pages Reservation
  DefaultDependencies=no
  Before=dev-hugepages.mount
  ConditionPathExists=/sys/devices/system/node
  ConditionKernelCommandLine=hugepagesz=1G
  
  [Service]
  Type=oneshot
  RemainAfterExit=yes
  ExecStart=/usr/lib/systemd/hugetlb-reserve-pages
  
  [Install]
  WantedBy=sysinit.target

Then set the number of huge pages on each NUMA node. For instance, on a system with 2 NUMA nodes, to reserve 4GB for the host OS (2GB on each NUMA node) and all remaining memory for hugepages:

  totalmem=$(dmidecode --type 17|grep Size |grep MB |gawk '{suma+=$2} END {print suma/1024}')
  hugepages=$(($totalmem-4))
  echo $((hugepages/2)) > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
  echo $((hugepages/2)) > /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages

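Copy the last two lines, with the computed number of pages substituted for the arithmetic expression, into the file /usr/lib/systemd/hugetlb-reserve-pages so that they are executed automatically after boot. The file must be executable, and the hugetlb-gigantic-pages service must be enabled.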
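
The arithmetic in the example above (total memory minus the host reservation, split evenly across NUMA nodes) can be generalized to any number of nodes. The following is a sketch only: the function names are hypothetical, the memory sizes are passed in as whole GB rather than read from dmidecode, and the sysfs directory is a parameter so the logic can be tried outside a real host.

```shell
# pages_per_node TOTAL_GB HOST_GB NODES
# Prints how many 1G pages to reserve on each NUMA node.
pages_per_node() {
    echo $(( ($1 - $2) / $3 ))
}

# reserve_1g_pages PAGES [NODE_DIR]
# Writes PAGES into the 1G nr_hugepages file of every NUMA node found
# under NODE_DIR (default: /sys/devices/system/node).
reserve_1g_pages() {
    pages=$1
    node_dir=${2:-/sys/devices/system/node}
    for nr_file in "$node_dir"/node*/hugepages/hugepages-1048576kB/nr_hugepages; do
        [ -e "$nr_file" ] || continue   # no node matched the pattern
        echo "$pages" > "$nr_file"
    done
}

# Usage (as root), for a 64GB host with 2 NUMA nodes and 4GB left to the OS:
#   reserve_1g_pages "$(pages_per_node 64 4 2)"
```

With the figures in the usage comment, each node gets (64 - 4) / 2 = 30 pages.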

#CPU isolation#

Isolate CPUs so that the host OS is restricted to run on the first core of each NUMA node, by adding the isolcpus field to the grub command line. For instance:
```
  isolcpus=1-9,11-19,21-29,31-39
```
The exact CPU numbers might differ depending on the CPU numbers presented by the host OS. In the previous example, CPUs 0, 10, 20 and 30 are excluded because CPU 0 and its sibling 20 correspond to the first core of NUMA node 0, and CPU 10 and its sibling 30 correspond to the first core of NUMA node 1.

Running this gawk script suggests the value to use on your compute node:
```
  gawk 'BEGIN{pre=-2;} ($1=="processor"){pro=$3;} ($1=="core" && $4!=0){ if (pre+1==pro){endrange="-" pro} else{cpus=cpus endrange sep pro; sep=","; endrange="";}; pre=pro;} END{printf("isolcpus=%s\n",cpus endrange);}' /proc/cpuinfo
```
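
The one-liner above is dense, so here is a more readable sketch of the same idea: collect every CPU whose core id is not 0, then print the list with consecutive numbers collapsed into ranges. The function name `suggest_isolcpus` is hypothetical, and the sketch assumes the usual x86 /proc/cpuinfo fields `processor` and `core id`.

```shell
# suggest_isolcpus [FILE]
# Prints an isolcpus= value listing every CPU of FILE (default:
# /proc/cpuinfo) whose "core id" is not 0, compressed into ranges.
suggest_isolcpus() {
    awk '
        $1 == "processor"          { cpu = $3 }
        $1 == "core" && $2 == "id" { if ($4 + 0 != 0) cpus[++n] = cpu }
        END {
            out = ""
            i = 1
            while (i <= n) {
                first = cpus[i]; last = first
                # Extend the range while the next CPU number is consecutive.
                while (i < n && cpus[i + 1] + 0 == last + 1) { i++; last = cpus[i] }
                out = out (out == "" ? "" : ",") (first == last ? first : first "-" last)
                i++
            }
            print "isolcpus=" out
        }
    ' "${1:-/proc/cpuinfo}"
}

# Usage:
#   suggest_isolcpus
```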

#Deactivating KSM#

KSM enables the kernel to examine two or more already running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page. This page is then marked copy on write. If the contents of the page are modified by a guest virtual machine, a new page is created for that guest virtual machine.

KSM has a performance overhead which may be too large for certain environments or host physical machine systems.

KSM can be deactivated by stopping the ksmtuned and ksm services. Stopping the services deactivates KSM, but the change does not persist across reboots.

  # service ksmtuned stop
  Stopping ksmtuned:                                         [  OK  ]
  # service ksm stop
  Stopping ksm:                                              [  OK  ]

Persistently deactivate KSM with the chkconfig command. To turn off the services, run the following commands:

  # chkconfig ksm off
  # chkconfig ksmtuned off

See the RHEL 7 documentation section "The KSM Tuning Service" for more information.
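
To confirm that KSM is really off, the kernel exposes its state in /sys/kernel/mm/ksm/run (0 means stopped, 1 means running, 2 means stopped with all merged pages unmerged). The helper below is a small sketch for reading it; the function name `ksm_state` is hypothetical.

```shell
# ksm_state [FILE]
# Prints the KSM state read from FILE (default: /sys/kernel/mm/ksm/run).
ksm_state() {
    run_file=${1:-/sys/kernel/mm/ksm/run}
    if [ ! -r "$run_file" ]; then
        echo "unavailable"      # kernel built without KSM, or no access
        return 0
    fi
    case $(cat "$run_file") in
        0) echo "stopped" ;;
        1) echo "running" ;;
        2) echo "stopped, pages unmerged" ;;
        *) echo "unknown" ;;
    esac
}

# Usage:
#   ksm_state
```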

#Enabling SR-IOV#

We assume that you are using Intel X520 NICs (based on Intel 82599 controller) or Intel Fortville NICs. In case you are using other NICs, the configuration might be different.

Configure 8 virtual functions on each 10G network interface. A larger number can be configured if desired. (This paragraph is provisional, because the procedure does not always work for all NICs.)

  for iface in `ifconfig -a | grep ": " | cut -f 1 -d":" | grep -v -e "_" -e "\." -e "lo" -e "virbr" -e "tap"`
  do
      driver=`ethtool -i $iface| awk '($0~"driver"){print $2}'`
      if [ "$driver" == "i40e" -o "$driver" == "ixgbe" ]
      then
          #Create 8 VFs per PF (the VF count is reset to 0 first)
          echo 0 >  /sys/bus/pci/devices/`ethtool -i $iface | awk '($0~"bus-info"){print $2}'`/sriov_numvfs
          echo 8 >  /sys/bus/pci/devices/`ethtool -i $iface | awk '($0~"bus-info"){print $2}'`/sriov_numvfs
      fi
  done

For Niantic X520 NICs, the max_vfs parameter must be set to work around a bug in the ixgbe driver when VFs are managed through the sysfs interface:
```
  echo "options ixgbe max_vfs=8" >> /etc/modprobe.d/ixgbe.conf
```
Blacklist the ixgbevf module by adding the following to the grub command line. This driver is blacklisted because, when it is loaded, the VLAN tag of broadcast packets is not properly removed when they are received by an SR-IOV port.
```
  modprobe.blacklist=ixgbevf
```
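
Since VF creation does not always succeed, it helps to check the result. Each virtual function shows up as a `virtfnN` link inside the PCI device directory of its physical function, so counting those links gives the number of VFs actually created. This is a sketch; the function name `count_vfs` is hypothetical.

```shell
# count_vfs PCI_ADDR [BASE]
# Prints the number of virtfn* entries of the PCI device PCI_ADDR
# under BASE (default: /sys/bus/pci/devices).
count_vfs() {
    addr=$1
    base=${2:-/sys/bus/pci/devices}
    count=0
    for vf in "$base/$addr"/virtfn*; do
        # When nothing matches, the pattern itself is returned: skip it.
        [ -e "$vf" ] || [ -L "$vf" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Usage, for an interface named em1 (example name):
#   count_vfs "$(ethtool -i em1 | awk '($0~"bus-info"){print $2}')"
```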

#Pre-provision of Linux bridges#

Openvim relies on Linux bridges to interconnect VMs when there are no high performance requirements for I/O. This is the case of control plane VNF interfaces that are expected to carry a small amount of traffic.

A set of Linux bridges must be pre-provisioned on every host. Every Linux bridge must be attached to a physical host interface with a specific VLAN. In addition, an external switch must be used to interconnect those physical host interfaces. Bear in mind that the host interfaces used for data plane VM interfaces will be different from the host interfaces used for control plane VM interfaces.

For example, in RHEL7.0, to create a bridge associated with the physical "em1" interface, two files per bridge must be added to the /etc/sysconfig/network-scripts folder:

File with name ifcfg-virbrManX with the content:

   DEVICE=virbrManX
   TYPE=Bridge
   ONBOOT=yes
   DELAY=0
   NM_CONTROLLED=no
   USERCTL=no

File with name ifcfg-em1.200X (uses VLAN tag 200X) with the content:
```
   DEVICE=em1.200X
   ONBOOT=yes
   NM_CONTROLLED=no
   USERCTL=no
   VLAN=yes
   BOOTPROTO=none
   BRIDGE=virbrManX
```
The name of the bridge and the VLAN tag can be different. In case you use a different name for the bridge, you should take it into account in 'openvimd.cfg'.
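
When several bridges are needed, the pair of files can be generated rather than written by hand. The sketch below reproduces the two file contents shown above for a given bridge name, physical interface and VLAN tag. The function name `write_bridge_cfg` is hypothetical, and the `ifcfg-` prefix on the VLAN file name is an assumption based on how the network-scripts folder names interface files.

```shell
# write_bridge_cfg BRIDGE IFACE VLAN [DIR]
# Writes ifcfg-BRIDGE and ifcfg-IFACE.VLAN into DIR
# (default: /etc/sysconfig/network-scripts).
write_bridge_cfg() {
    bridge=$1
    iface=$2
    vlan=$3
    dir=${4:-/etc/sysconfig/network-scripts}
    cat > "$dir/ifcfg-$bridge" <<EOF
DEVICE=$bridge
TYPE=Bridge
ONBOOT=yes
DELAY=0
NM_CONTROLLED=no
USERCTL=no
EOF
    cat > "$dir/ifcfg-${iface}.${vlan}" <<EOF
DEVICE=${iface}.${vlan}
ONBOOT=yes
NM_CONTROLLED=no
USERCTL=no
VLAN=yes
BOOTPROTO=none
BRIDGE=$bridge
EOF
}

# Usage (as root); bridge names and VLAN tags are examples:
#   for i in 1 2 3; do write_bridge_cfg "virbrMan$i" em1 "200$i"; done
```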

#Additional configuration to allow access from openvim#

Uncomment the following lines of /etc/libvirt/libvirtd.conf to allow external connection to libvirtd:

  unix_sock_group = "libvirt"
  unix_sock_rw_perms = "0770"
  unix_sock_dir = "/var/run/libvirt"
  auth_unix_rw = "none"
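
The lines can be uncommented with an editor or, as sketched below, with sed. The sketch assumes each setting appears in the file as `#key = value` with the `#` in the first column, and it keeps whatever value the file already holds. The function name `uncomment_keys` is hypothetical.

```shell
# uncomment_keys FILE KEY...
# Removes the leading "#" from lines of FILE of the form "#KEY = value".
# Ordinary comments that merely mention KEY are left untouched.
uncomment_keys() {
    file=$1
    shift
    for key in "$@"; do
        tmp=$(mktemp) || return 1
        sed "s/^#[[:space:]]*\($key[[:space:]]*=\)/\1/" "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    done
}

# Usage (as root), then restart libvirtd so the change takes effect:
#   uncomment_keys /etc/libvirt/libvirtd.conf \
#       unix_sock_group unix_sock_rw_perms unix_sock_dir auth_unix_rw
```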

Create and configure a user for openvim access:
- A new user must be created to access the compute node from openvim. The user must belong to group libvirt
```
  #create a new user
  useradd -m -G libvirt <user>
  #or modify an existing user
  usermod -a -G libvirt <user>
```
- Allow the user to obtain root privileges without a password, for example by granting this to all members of group libvirt:
```
  sudo visudo # add the line:   %libvirt ALL=(ALL) NOPASSWD: ALL
```
Copy the ssh key of openvim to the compute node. From the machine where openvim is running (not from the compute node), run:
```
  ssh-keygen  #generates the ssh keys; only needed if not done before
  ssh-copy-id <user>@<compute host>
```
After that, ensure that you can access directly without password prompt from openvim to compute host:
```
ssh <user>@<compute host>
```
Configure access to the image repository:

The way that openvim deals with images is a bit different from other CMS. Instead of copying the images when doing the on-boarding, openvim assumes that images are locally accessible on each compute node on a local folder, identical for all compute nodes. This does not mean that the images are forced to be copied on each compute node disk.

Typically, this is done by storing all images in a remote shared location accessible by all compute nodes through a NAS file system, and mounting the shared folder locally via NFS on a specific local folder with an identical path on each compute node.

VNF descriptors contain image paths pointing to a location on that folder. When doing the on-boarding, the image will be copied from the image path (accessible through NFS) to the on-boarding folder, whose configuration is described next.

Create a local folder for image on-boarding and grant access from openvim:

A local folder for image on-boarding must be created on each compute node (in the default configuration, we assume that the folder is /opt/VNF/images). This folder must be created on a disk with enough space to store the images of the active VMs. If there is only a root partition in the server, the recommended procedure is to link the folder required by openvim to the standard libvirt folder for holding images:
```
mkdir -p /opt/VNF/
ln -s /var/lib/libvirt/images /opt/VNF/images
chown -R <user>:nfvgroup /opt/VNF
chown -R root:nfvgroup /var/lib/libvirt/images
chmod g+rwx /var/lib/libvirt/images
```
In case there is a "/home" partition that contains more disk space than the "/" partition, the folder should be created at "/home" although a soft link can be created anywhere else. As an example, this is what our script for automatic installation in RHEL7.0 does:
```
mkdir -p /home/<user>/VNF_images
rm -f /opt/VNF/images
mkdir -p /opt/VNF/
ln -s /home/<user>/VNF_images /opt/VNF/images
chown -R <user> /opt/VNF
```
In addition, on an SELinux system, access to that folder must be granted to the libvirt group:
```
# SELinux management
semanage fcontext -a -t virt_image_t "/home/<user>/VNF_images(/.*)?"
cat /etc/selinux/targeted/contexts/files/file_contexts.local |grep virt_image
restorecon -R -v /home/<user>/VNF_images
```

#Compute node configuration in special cases#

##Datacenter with different types of compute nodes##

In a datacenter with different types of compute nodes, it might happen that compute nodes use different interface naming schemes. In that case, you can take the most used interface naming scheme as the default one, and make an additional configuration in the compute nodes that do not follow the default naming scheme.

To do that, create the file hostinfo.yaml inside the local image folder (typically /opt/VNF/images). It contains entries of the form:

openvim-expected-name: local-iface-name

For example, if openvim contains a network using macvtap to the physical interface em1 (macvtap:em1) but on this compute node the interface is called eth1, create a local-image-folder/hostinfo.yaml file with this content:

em1: eth1
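
To make the meaning of such an entry concrete, the sketch below resolves an openvim-expected interface name to the local one, falling back to the same name when no entry exists. It only illustrates the mapping. It is not how openvim itself reads the file, the function name `local_iface` is hypothetical, and it handles only flat `key: value` lines.

```shell
# local_iface NAME [FILE]
# Prints the local interface name mapped to NAME in FILE
# (default: /opt/VNF/images/hostinfo.yaml), or NAME itself if unmapped.
local_iface() {
    name=$1
    map_file=${2:-/opt/VNF/images/hostinfo.yaml}
    mapped=""
    if [ -r "$map_file" ]; then
        mapped=$(awk -F': *' -v key="$name" '$1 == key { print $2; exit }' "$map_file")
    fi
    echo "${mapped:-$name}"
}

# Usage:
#   local_iface em1
```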

##Configure compute node in 'developer' mode##

To test a VM, a full NFV environment with 10G data plane interfaces and Openflow switches is not strictly required. If the VM is able to run with virtio interfaces, you can configure a compute node in a simpler way and use the 'developer mode' in openvim. In that mode, during the instantiation phase, VMs are deployed without hugepages and with all data plane interfaces changed to virtio interfaces. Note that openvim flavors are not modified and remain identical (including all EPA attributes); openvim performs the translation during the instantiation phase.

The configuration of a compute node to be used in 'developer mode' removes the configuration that is not needed for testing purposes, that is:

- IOMMU configuration is not required, since no passthrough or SR-IOV interfaces will be used.
- Huge pages configuration is unnecessary. All memory will be assigned in 4KB pages, allowing oversubscription (as in traditional clouds).
- No configuration of data plane interfaces (e.g. SR-IOV) is required.

A VNF developer will typically use the developer mode to test a VNF on their own computer. Although part of the configuration is not required, the rest of the compute node configuration is still necessary. To prepare your own computer or a separate one as a compute node for development purposes, you can use the configure-compute-node-develop.sh script.

In order to execute the script, just run this command:

sudo ./configure-compute-node-develop.sh <user> <iface>
