Writing a sampler

oceandlr edited this page Oct 19, 2019 · 84 revisions

Under construction

Basics

The easiest way to write a sampler is to start from an existing one, such as meminfo.

Required Functions

The functions required of every sampler are named in its ldmsd_sampler struct, which is defined near the bottom of the sampler's source file. The struct tells ldmsd which function to call at each point in the plugin's life. The get_plugin function returns the struct for this plugin.

     static struct ldmsd_sampler meminfo_plugin = {
        .base = {
                .name = SAMP,
                .type = LDMSD_PLUGIN_SAMPLER,
                .term = term,
                .config = config,
                .usage = usage,
        },
        .get_set = get_set,
        .sample = sample,
     };
     struct ldmsd_plugin *get_plugin(ldmsd_msg_log_f pf)
     {
        msglog = pf;
        set = NULL;
        return &meminfo_plugin.base;
     }

About the functions:

  • usage -- returns the help text that describes how to configure the sampler.
  • config -- called when the sampler is configured.
  • sample -- performs the actual sampling.
  • get_set -- still exists in v4 but is no longer called.
  • term -- called when the sampler terminates.

Other:

  • msglog -- wrapper call for logging. Log levels are: LDMSD_LDEBUG, LDMSD_LINFO, LDMSD_LWARNING, LDMSD_LERROR, LDMSD_LCRITICAL, LDMSD_LALL.
  • SAMP -- #define for the sampler name, for convenience (e.g., "meminfo").

Some of these are described in more detail below.

get_plugin

get_plugin is called when the plugin is loaded with the load command.

schema uniqueness

Schema names must map uniquely to their data definition; even array lengths must be the same. Nothing checks this, and nothing generates unique schema names for you, yet the rest of the system expects memory sizes, pointers, etc. for a given schema name to be consistent. A common convention for /proc-related sources is to obtain the architecture information and encode it into the schema name.

There is a metadata generation number, which changes when the metadata for a schema changes. It should not be used to detect schema changes as a way of reusing a schema name.
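The architecture-in-the-name convention can be sketched in plain C. This is only an illustration: schema_name_with_arch and schema_name_for_host are hypothetical helpers, not LDMS functions, and the "<base>_<arch>" layout is an assumed format rather than one the code base mandates.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Build "<base>_<arch>" into out.
 * Returns 0 on success, -1 if the result would not fit in out.
 * Hypothetical helper, not an LDMS call.
 */
int schema_name_with_arch(const char *base, const char *arch,
                          char *out, size_t out_len)
{
        int n;

        assert(base && arch && out);
        if (out_len == 0)
                return -1;
        n = snprintf(out, out_len, "%s_%s", base, arch);
        if (n < 0 || (size_t)n >= out_len)
                return -1;      /* truncated: do not use a partial name */
        return 0;
}

/* Convenience wrapper that asks the kernel for the machine type. */
int schema_name_for_host(const char *base, char *out, size_t out_len)
{
        struct utsname u;

        if (uname(&u) != 0)
                return -1;
        return schema_name_with_arch(base, u.machine, out, out_len);
}
```

A sampler could call schema_name_for_host(SAMP, buf, sizeof(buf)) in config and use the result as its default schema name, so that hosts with different architectures do not share a name by accident.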

The schema name uniqueness limitation will be obviated in v5.

Config

config is invoked by the config command. Arguments from the config line are passed in the attr_value_list *avl and can be extracted by name, as shown below. av_value returns NULL if there is no attribute with that name:

     value = av_value(avl, "num_metrics");
     if (value)
                num_metrics = atoi(value);
     else
                num_metrics = -1;
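Note that atoi cannot tell "0" apart from input that is not a number. If that matters for an argument, a stricter parse based on strtol is one option. The sketch below is standalone C; parse_int_attr is a hypothetical helper, not part of LDMS, and it takes the string that av_value would return.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse an optional integer attribute.
 * value : string from av_value(), or NULL if the attribute was absent
 * dflt  : value to use when the attribute is absent
 * out   : receives the parsed or default value
 * Returns 0 on success, EINVAL if the string is not a clean integer.
 */
int parse_int_attr(const char *value, int dflt, int *out)
{
        char *end;
        long v;

        assert(out);
        if (!value) {
                *out = dflt;
                return 0;
        }
        errno = 0;
        v = strtol(value, &end, 10);
        /* Reject overflow, empty input, trailing junk, and out-of-range values */
        if (errno || end == value || *end != '\0' || v < INT_MIN || v > INT_MAX)
                return EINVAL;
        *out = (int)v;
        return 0;
}
```

With this helper, config could return EINVAL for num_metrics=10x instead of silently treating it as 10.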

Certain metrics are common to all samplers, for example instance_name, producer_name, and component_id. Other items, such as schema, can also be specified to any sampler even though they are not metrics. Samplers call various base functions to create these metrics and to use these values. config should call the following function to configure the base:

    base = base_config(avl, SAMP, SAMP, msglog);

You may want to prevent config from being called multiple times if your code is not prepared to handle a change of configuration. For example, if your config creates a metricset, then a later call to config on a running sampler would have to handle how the changes affect the current metricset, particularly while sampling is occurring. In many, but not all, cases a sampler has only a single metricset, so checking for the existence of the set at the beginning of config guards against this scenario:

    if (set) {
            msglog(LDMSD_LERROR, SAMP ": Set already created.\n");
            return EINVAL;
    }

If you are using a configuration file, a non-zero return from config will abort processing the configuration file.

Creating A MetricSet

Generally, a metricset is created by creating the schema and then adding metrics to the schema. In many samplers with a well-known data source (e.g., /proc/meminfo), the metricset is created by reading the data source once to get the names of the metrics and adding them to the schema.

The process proceeds as follows:

a) Create the schema from the base. This will also add the base metrics to the schema:

    schema = base_schema_new(base);
    metric_offset = ldms_schema_metric_count_get(schema);

The last call tells you how many metrics were added from the base. This can be a useful number to retain if you later set values in the metricset by their position in the schema, as the sample function described below does.

b) Open the data source, obtain the metric names, and add them to the schema, including their type:

    mf = fopen(procfile, "r");
    if (!mf) {
            rc = ENOENT;
            goto err;
    }
    do {
            /* Stop at end of file before parsing a stale buffer */
            s = fgets(lbuf, sizeof(lbuf), mf);
            if (!s)
                    break;
            rc = sscanf(lbuf, "%s %" PRIu64,
                        metric_name, &metric_value);
            if (rc < 2)
                    break;

            /* Strip the colon from metric name if present */
            i = strlen(metric_name);
            if (i && metric_name[i-1] == ':')
                    metric_name[i-1] = '\0';

            rc = ldms_schema_metric_add(schema, metric_name, LDMS_V_U64);
            if (rc < 0) {
                    rc = ENOMEM;
                    goto err;
            }
    } while (s);

If ldms_schema_metric_add fails, it will return -1. Otherwise, it will return the positional number of the added metric.
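The per-line parsing in step b) can be isolated into a small function that is easy to test outside ldmsd. The sketch below is standalone C and makes no LDMS calls; parse_proc_line and METRIC_NAME_LEN are hypothetical names, and the "Name: value [unit]" line shape is assumed from /proc/meminfo.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#define METRIC_NAME_LEN 128

/*
 * Split one "Name:   value [unit]" line into a metric name and a value.
 * name must point to at least METRIC_NAME_LEN bytes.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int parse_proc_line(const char *line, char *name, uint64_t *value)
{
        size_t len;

        assert(line && name && value);
        /* %127s bounds the write into name to METRIC_NAME_LEN - 1 chars */
        if (sscanf(line, "%127s %" SCNu64, name, value) != 2)
                return -1;

        /* Strip the trailing colon from the metric name if present */
        len = strlen(name);
        if (len && name[len - 1] == ':')
                name[len - 1] = '\0';
        return 0;
}
```

Bounding the %s conversion keeps an unexpectedly long token from overrunning the name buffer, which the unbounded %s in the excerpt above does not guard against.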

c) Once all the metrics have been added, get the set:

    set = base_set_new(base);

Sample

The sample function sets the values of the metrics in the set. In addition, some values need to be obtained and set for the base, for example the timestamp of the set.

a) We first check to make sure we still have a set:

    if (!set) {
            msglog(LDMSD_LDEBUG, SAMP ": plugin not initialized\n");
            return EINVAL;
    }

b) Next, make the call for any sample-related activities in the base (e.g., the timestamp of the set):

    base_sample_begin(base);

c) In this case, we will add values by position, so use the index taking into account the base metrics, determined back when we created the metricset:

    metric_no = metric_offset;

d) Now process the data source to get the values and set them by position:

    fseek(mf, 0, SEEK_SET);
    do {
            /* Stop at end of file before parsing a stale buffer */
            s = fgets(lbuf, sizeof(lbuf), mf);
            if (!s)
                    break;
            rc = sscanf(lbuf, "%s %" PRIu64, metric_name, &v.v_u64);
            if (rc != 2)
                    break;

            ldms_metric_set(set, metric_no, &v);
            metric_no++;
    } while (s);

e) Finally, end any transactional information in the base (e.g., the duration of the sample):

    base_sample_end(base);

A non-zero return code from sample will stop the plugin. If an error case may be temporary or may be a case for only some of the variables in the set, a common methodology is to return 0, even in these cases.
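The position-based update in steps c) and d) can be modeled without LDMS, which makes the role of the base offset easier to see. In the sketch below a plain uint64_t array stands in for the metric set, and sample_by_position is a hypothetical function. A malformed line ends the pass early rather than reporting an error, in line with the advice above about temporary failures.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Re-read the data source from the start and store each value by position.
 * values : stand-in for the metric set's value slots
 * offset : number of base metrics that occupy the first slots
 * count  : total number of slots in values
 * Returns the number of values stored.
 */
size_t sample_by_position(FILE *src, uint64_t *values, size_t offset,
                          size_t count)
{
        char line[256];
        char name[128];
        size_t slot = offset;
        uint64_t v;

        assert(src && values);
        if (fseek(src, 0, SEEK_SET) != 0)
                return 0;
        while (slot < count && fgets(line, sizeof(line), src)) {
                if (sscanf(line, "%127s %" SCNu64, name, &v) != 2)
                        break;          /* malformed line: end this pass */
                values[slot++] = v;     /* base slots below offset are untouched */
        }
        return slot - offset;
}
```

Because values are matched to metrics purely by order, this approach relies on the data source listing the same entries in the same order on every read.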

Usage

usage returns information on the configuration of a sampler. Be sure to include the macro providing information on the base usage, even if you have no other additional configuration arguments:

    return  "config name=" SAMP BASE_CONFIG_USAGE;

TODO: add how to view usage, other than looking at the source

Term

term is called when the plugin is terminated. In term, be sure to close any file handles, delete the base, and delete the set, and to free any other memory you have allocated in your code:

        /* Close the data source opened when the set was created */
        if (mf)
                fclose(mf);
        mf = NULL;
        if (base)
                base_del(base);
        if (set)
                ldms_set_delete(set);
        set = NULL;

Directory and Supporting files

Contributed samplers go under the contrib directory, each in its own directory, as described in Contributing. Include not only the sampler, but also the Makefile.am, the man page for the sampler, and any tests.

The following assumes a directory structure of:

    XXX/ldms/src/sampler/contrib/mysite

where all of the samplers for mysite will be in subdirectories under mysite. For example:

    XXX/ldms/src/sampler/contrib/mysite/mysampler

A per-sampler directory structure is also in progress for the main-line samplers.

Makefile.am

Add lines to the following files:

XXX/ldms/configure.ac

    OPTION_DEFAULT_ENABLE([mysampler], [ENABLE_MYSAMPLER])
    AC_CONFIG_FILES([Makefile src/Makefile src/core/Makefile...
                src/sampler/contrib/mysite/Makefile
                src/sampler/contrib/mysite/mysampler/Makefile

XXX/ldms/src/sampler/contrib/Makefile.am

    SUBDIRS += mysite

XXX/ldms/src/sampler/contrib/mysite/Makefile.am

    if ENABLE_MYSAMPLER
    SUBDIRS += mysampler
    endif

XXX/ldms/src/sampler/contrib/mysite/mysampler/Makefile.am

    pkglib_LTLIBRARIES =
    lib_LTLIBRARIES =
    check_PROGRAMS =
    dist_man7_MANS =
        
    CORE = ../../../core
    SAMPLER= ../../../sampler
    AM_CFLAGS = -I$(srcdir)/$(CORE) -I$(top_srcdir) -I../../.. @OVIS_LIB_INCDIR_FLAG@ \
                -I$(srcdir)/../../../ldmsd
    AM_LDFLAGS = @OVIS_LIB_LIB64DIR_FLAG@ @OVIS_LIB_LIBDIR_FLAG@
    
    BASE_LIBADD = ../../libsampler_base.la
    COMMON_LIBADD = $(CORE)/libldms.la libfoo.la \
            @LDFLAGS_GETTIME@ -lovis_util -lcoll
    
    if ENABLE_MYSAMPLER
        lib_LTLIBRARIES += libfoo.la
        libfoo_la_SOURCES = foo.c foo.h
      
        libmysampler_la_SOURCES = mysampler.c
        libmysampler_la_LIBADD = $(BASE_LIBADD) $(COMMON_LIBADD)
        libmysampler_la_CFLAGS = $(AM_CFLAGS) -I$(srcdir)/$(SAMPLER)
        pkglib_LTLIBRARIES += libmysampler.la
        dist_man7_MANS += Plugin_mysampler.man
        
        check_PROGRAMS += mysampler_test
        mysampler_test_SOURCES = mysampler.c mysampler.h
        mysampler_test_CFLAGS = -DMAIN
    endif

man page

Naming convention for plugins: Plugin_mysampler.man

Canonical headers and formatting for Linux man pages are used.

Advanced Topics

Multiple metric sets

test_sampler is a sampler that supports creation of multiple sets, with either the same or different schemas, in the same sampler. It keeps a list of schemas and a list of sets, and the sample function iterates through the set list.

The sets have different instance names and separate metadata. From the point of view of ldms_ls or the store these are no different than if the sets came from different samplers.

Its simplest form is selected with the parameter action=default, which creates one or more sets of the same schema with some default variables.

Configuring:

    config name=test_sampler producer=localhost1 action=default schema=test1 num_sets=2 component_id=1
    start name=test_sampler interval=1000000  

Querying:

    XXXX> ldms_ls -x sock -p 61101 -a ovis -A conf=/tmp/secret -l
    set_0: consistent, last update: Thu Oct 10 09:28:07 2019 -0600 [345791us] 
    D u64        metric_0                                   0
    D u64        metric_1                                   1
    D u64        metric_2                                   2
    D u64        metric_3                                   3
    D u64        metric_4                                   4
    D u64        metric_5                                   5
    D u64        metric_6                                   6
    D u64        metric_7                                   7
    D u64        metric_8                                   8
    D u64        metric_9                                   9
               
    set_1: consistent, last update: Thu Oct 10 09:28:07 2019 -0600 [345790us] 
    D u64        metric_0                                   0
    D u64        metric_1                                   1
    D u64        metric_2                                   2
    D u64        metric_3                                   3
    D u64        metric_4                                   4
    D u64        metric_5                                   5
    D u64        metric_6                                   6
    D u64        metric_7                                   7
    D u64        metric_8                                   8
    D u64        metric_9                                   9
    

Querying:

    XXXX> ldms_ls -x sock -p 61101 -a ovis -A conf=/tmp/secret -v
    Schema         Instance                 Flags  Msize  Dsize  UID    GID    Perm       Update            Duration          Info    
    -------------- ------------------------ ------ ------ ------ ------ ------ ---------- ----------------- ----------------- --------
    test1          set_1                       CL     488    136  22398  22398 -rwxrwxrwx 1570721297.355649              0.000002 "updt_hint_us"="1000000:" 
    test1          set_0                       CL     488    136  22398  22398 -rwxrwxrwx 1570721297.355650          0.000000 "updt_hint_us"="1000000:" 
    -------------- ------------------------ ------ ------ ------ ------ ------ ---------- ----------------- ----------------- --------
    Total Sets: 2, Meta Data (kB): 0.98, Data (kB) 0.27, Memory (kB): 1.25

In a more complex form, you can define the schema for each set to be different:

Configuring:

    config name=test_sampler action=add_schema schema=test1 num_metrics=10
    config name=test_sampler action=add_set schema=test1 instance=localhost9/test_sampler component_id=9 push=1 producer=localhost9 jobid=666
    config name=test_sampler action=add_schema schema=test2 num_metrics=9
    config name=test_sampler action=add_set schema=test2 instance=localhost10/test_sampler component_id=10 push=1 producer=localhost10 jobid=666
    start name=test_sampler interval=1000000

Querying:

    XXXX> ldms_ls -x sock -p 61101 -a ovis -A conf=/tmp/secret -l
    localhost10/test_sampler: consistent, last update: Thu Oct 10 10:00:46 2019 -0600 [790147us] 
    M u64        component_id                               10
    D u64        job_id                                     666
    D u64        metric_0                                   5
    D u64        metric_1                                   5
    D u64        metric_2                                   5
    D u64        metric_3                                   5
    D u64        metric_4                                   5
    D u64        metric_5                                   5
    D u64        metric_6                                   5
    D u64        metric_7                                   5
    D u64        metric_8                                   5
      
    localhost9/test_sampler: consistent, last update: Thu Oct 10 10:00:46 2019 -0600 [790149us] 
    M u64        component_id                               9
    D u64        job_id                                     666
    D u64        metric_0                                   5
    D u64        metric_1                                   5
    D u64        metric_2                                   5
    D u64        metric_3                                   5
    D u64        metric_4                                   5
    D u64        metric_5                                   5
    D u64        metric_6                                   5
    D u64        metric_7                                   5
    D u64        metric_8                                   5
    D u64        metric_9                                   5

Querying:

    XXXX> ldms_ls -x sock -p 61101 -a ovis -A conf=/tmp/secret -v
    Schema         Instance                 Flags  Msize  Dsize  UID    GID    Perm       Update            Duration          Info    
    -------------- ------------------------ ------ ------ ------ ------ ------ ---------- ----------------- ----------------- --------
    test1          localhost9/test_sampler     CL     592    144  22398  22398 -rwxrwxrwx 1570723250.794460          0.000001 "updt_hint_us"="1000000:" 
    test2          localhost10/test_sampler    CL     560    136  22398  22398 -rwxrwxrwx 1570723250.794458          0.000003 "updt_hint_us"="1000000:" 
    -------------- ------------------------ ------ ------ ------ ------ ------ ---------- ----------------- ----------------- --------
    Total Sets: 2, Meta Data (kB): 1.15, Data (kB) 0.28, Memory (kB): 1.43
                                                                                                                      

Multiples of the same sampler

Multiples of the same sampler will not be supported until v5.

Changing the metrics in a sampler's metric set

If you want to change the metrics in a sampler's metric set, you should create a new set with a different schema name. See Writing-a-Store for the implications on the store of creating sets with a new schema.

Starting/Stopping/Changing Rates of a sampler

To change the sampling rate, you have to stop the sampler and start it again with the new interval.

Robustness and Error Handling

a) config -- Multiple calls to config are possible and sometimes desirable. In these cases, the different configuration steps are typically distinguished by an action parameter in the config line. One example is the case above, where the sampler supports multiple schemas and sets. Another is a sampler for performance counters, which may have one config line that specifies one or more performance counters to be set and another config line that finalizes the set once all the performance counters have been specified.

    static int config(struct ldmsd_plugin *self, struct attr_value_list *kwl, struct attr_value_list *avl)
    {
            char *action;
            int rc;

            /* Dispatch on the action parameter, if one was given */
            action = av_value(avl, "action");
            if (action) {
                    rc = 0;
                    if (0 == strcmp(action, "add_schema")) {
                            rc = config_add_schema(avl);
                    } else if (0 == strcmp(action, "add_set")) {
                            rc = config_add_set(avl);
                    } else if (0 == strcmp(action, "default")) {
                            rc = config_add_default(avl);
                    } else {
                            msglog(LDMSD_LERROR, "test_sampler: Unrecognized "
                                    "action '%s'.\n", action);
                            rc = EINVAL;
                    }
                    return rc;
            }
            /* ... handling of a config line with no action is not shown ... */
    }
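When a sampler accepts many actions, a lookup table is one alternative to a long if/else chain. The sketch below is standalone C and is not how test_sampler is written; dispatch_action, struct action_entry, and the stub handlers are all hypothetical, and ctx stands in for whatever the real handlers would take (for example the attr_value_list).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*action_fn)(void *ctx);

struct action_entry {
        const char *name;       /* value of the action= parameter */
        action_fn fn;           /* handler to run for that action */
};

/*
 * Look up action in table and run its handler with ctx.
 * Returns the handler's result, or EINVAL for an unknown action.
 */
int dispatch_action(const struct action_entry *table, size_t n,
                    const char *action, void *ctx)
{
        size_t i;

        assert(table && action);
        for (i = 0; i < n; i++) {
                if (strcmp(table[i].name, action) == 0)
                        return table[i].fn(ctx);
        }
        return EINVAL;
}

/* Hypothetical handlers that only record that they ran. */
int add_schema_stub(void *ctx) { *(int *)ctx += 1; return 0; }
int add_set_stub(void *ctx)    { *(int *)ctx += 10; return 0; }

const struct action_entry demo_actions[] = {
        { "add_schema", add_schema_stub },
        { "add_set",    add_set_stub },
};
```

Adding an action then means adding one table row, and the unknown-action error path stays in a single place.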

In many cases, once the set has been established, you no longer want to change the configuration. A common methodology in these cases is to check for the existence of the set in the config, and return an error if the set exists. If you are using a configuration file, a non-zero return from config will abort processing the configuration file.

b) sample -- sample is called repeatedly. A non-zero return code from sample will stop the sampler. If an error case may be temporary or may be a case for only some of the variables in the set, a common methodology is to return 0, even in these cases.

Job Data
