
LDMSCON2020 single node container

valleydlr edited this page Sep 23, 2020 · 2 revisions

The Docker Image

Get the ovishpc/ldmscon2020-single image from dockerhub as follows:

$ docker pull ovishpc/ldmscon2020-single

Run The Container

Run the container with bash to get the interactive shell as follows:

$ docker run -it -d --name ldmscon2020 --hostname ldmscon2020 ovishpc/ldmscon2020-single bash

The command names the container ldmscon2020 so that we can refer to it by name later, for example when opening more interactive shells or when removing it. For consistency, it also sets the container's hostname to ldmscon2020. The -d option detaches the container from the terminal, while -it allocates a TTY and keeps standard input open, so the bash process stays alive and keeps the container running.

Use docker ps to see that the container is running.
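The check can also be scripted. The following is a minimal sketch, not part of the image: container_listed is a hypothetical helper that reads one container name per line on standard input and succeeds only if the given name matches a whole line.

```shell
# Hypothetical helper: succeed if NAME appears as a whole line on stdin.
# -x matches the entire line and -F treats NAME as a fixed string, so
# "ldmscon2020" does not match "ldmscon2020-old".
container_listed() {
    grep -qxF -- "$1"
}

# Intended use on the host (docker ps prints one name per line with this format):
#   docker ps --format '{{.Names}}' | container_listed ldmscon2020 && echo running
```

Because the helper only reads standard input, it can be fed the output of docker ps -a as well, to check for stopped containers.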

Interactive Shell

Use the following command to get an interactive shell on the running container:

$ docker exec -it ldmscon2020 bash

Run LDMSD (LDMS Daemon) Inside The Container

Run the following command inside the container to start ldmsd:

[ldmscon2020] $ ldmsd -c /opt/ovis/etc/ldms/sampler.conf -l /var/log/ldmsd.log

The configuration file (sampler.conf) directs ldmsd to listen on the sock transport on port 411 and to load the meminfo sampler, which it starts with a sampling interval of 1000000 microseconds (one second).

The following is the contents of the sampler.conf configuration file:

# Listen on port 411 (default)
listen xprt=sock port=411
# Load the meminfo sampler
load name=meminfo
config name=meminfo producer=${HOSTNAME} instance=${HOSTNAME}/meminfo
start name=meminfo interval=1000000 offset=0
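To experiment with a different sampling rate, a configuration like the one above can be generated rather than edited by hand. The sketch below assumes the interval is given in microseconds; gen_sampler_conf is a hypothetical helper, not something shipped in the image, and it reproduces only the four directives shown above.

```shell
# Hypothetical helper: print a sampler configuration like the one above,
# taking the sampling interval in whole seconds.
gen_sampler_conf() {
    # Assumption: ldmsd expects the interval in microseconds.
    interval_us=$(( $1 * 1000000 ))
    # \${HOSTNAME} is escaped so that it is written literally into the
    # configuration, exactly as it appears in the original file.
    cat <<EOF
listen xprt=sock port=411
load name=meminfo
config name=meminfo producer=\${HOSTNAME} instance=\${HOSTNAME}/meminfo
start name=meminfo interval=${interval_us} offset=0
EOF
}

# Example: a configuration that samples every 5 seconds.
gen_sampler_conf 5
```

The output can be redirected to a file of your choice and passed to ldmsd with the -c option in place of the stock sampler.conf.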

Use pgrep to check whether ldmsd is running as follows:

[ldmscon2020] $ pgrep -a ldmsd

ldms_ls Inside The Container

Use ldms_ls to check that ldmsd is running properly as follows:

[ldmscon2020] $ ldms_ls -h localhost -x sock -p 411

This should report ldmscon2020/meminfo as the single set hosted on the daemon. The values -h localhost, -x sock, and -p 411 are the defaults, so any two of the three options can be omitted. At least one must be given, however, and we recommend specifying all three.

To obtain metric values, use ldms_ls with -l option as follows:

[ldmscon2020] $ ldms_ls -h localhost -x sock -p 411 -l

The following is an example of the output with -l option:

ldmscon2020/meminfo: consistent, last update: Mon Aug 24 22:19:05 2020 -0500 [1748us] 
M u64        component_id                               0    
D u64        job_id                                     0    
D u64        app_id                                     0
D u64        MemTotal                                   16285096
D u64        MemFree                                    2023092
D u64        MemAvailable                               10595664
D u64        Buffers                                    468900  
D u64        Cached                                     8515732    
D u64        SwapCached                                 252  
D u64        Active                                     6955336
D u64        Inactive                                   5864704
D u64        Active(anon)                               3602080
D u64        Inactive(anon)                             1238604
D u64        Active(file)                               3353256
D u64        Inactive(file)                             4626100
D u64        Unevictable                                136920
D u64        Mlocked                                    96
D u64        SwapTotal                                  2097148
D u64        SwapFree                                   2070268
D u64        Dirty                                      484
D u64        Writeback                                  0
D u64        AnonPages                                  3972008
D u64        Mapped                                     1091664
D u64        Shmem                                      1131116
D u64        KReclaimable                               935332
D u64        Slab                                       1153008
D u64        SReclaimable                               935332  
D u64        SUnreclaim                                 217676
D u64        KernelStack                                17088
D u64        PageTables                                 52860
D u64        NFS_Unstable                               0
D u64        Bounce                                     0
D u64        WritebackTmp                               0
D u64        CommitLimit                                10239696
D u64        Committed_AS                               15567388
D u64        VmallocTotal                               34359738367
D u64        VmallocUsed                                43984
D u64        VmallocChunk                               0
D u64        Percpu                                     6800
D u64        HardwareCorrupted                          0
D u64        AnonHugePages                              0
D u64        ShmemHugePages                             0
D u64        ShmemPmdMapped                             0
D u64        FileHugePages                              0
D u64        FilePmdMapped                              0
D u64        CmaTotal                                   0
D u64        CmaFree                                    0
D u64        HugePages_Total                            0
D u64        HugePages_Free                             0
D u64        HugePages_Rsvd                             0
D u64        HugePages_Surp                             0
D u64        Hugepagesize                               2048
D u64        Hugetlb                                    0
D u64        DirectMap4k                                743536
D u64        DirectMap2M                                15921152
D u64        DirectMap1G                                0
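To pull a single value out of this listing, for example in a monitoring script, the rows can be filtered by metric name. The following is a minimal sketch that assumes each metric row has the four whitespace-separated fields shown above (flag, type, name, value); metric_value is a hypothetical helper, and the sample input is a few rows copied from the example output.

```shell
# Hypothetical helper: print the value of one metric from `ldms_ls -l`
# output read on standard input. The third field is the metric name and
# the fourth is its value.
metric_value() {
    awk -v name="$1" '$3 == name { print $4 }'
}

# A few rows copied from the example output, used here as sample input.
sample_output='ldmscon2020/meminfo: consistent, last update: Mon Aug 24 22:19:05 2020 -0500 [1748us]
M u64        component_id                               0
D u64        MemTotal                                   16285096
D u64        MemFree                                    2023092'

# Prints 2023092 for the sample rows.
printf '%s\n' "$sample_output" | metric_value MemFree
```

Inside the container, the same helper can read live data: ldms_ls -h localhost -x sock -p 411 -l | metric_value MemFree.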

Stop & Remove The Container

Issue the following command on the host machine to kill (stop) the container:

$ docker kill ldmscon2020

The kill command stops all processes in the container, but the container's storage (overlay fs) persists. In other words, the ldmscon2020 container created by the docker run command still exists, but in a stopped state.

To completely remove the container, issue the following command:

$ docker rm ldmscon2020

Then use docker ps -a to confirm that the container has been removed (it should no longer appear in the list).
