Writing a store

oceandlr edited this page Nov 17, 2019 · 58 revisions

Basics

The easiest way to write a store_plugin is to look at an existing one, such as store_flatfile.

    About store_flatfile: Stores each metric into its own file
    Run script:
         load name=store_flatfile
         config name=store_flatfile path=/tmp/gentile/ldmstest/26249/store 
         strgp_add name=store_flatfile plugin=store_flatfile schema=meminfo container=fmeminfo 
         strgp_prdcr_add name=store_flatfile regex=.*      
         strgp_start name=store_flatfile     
    Stores: > ls XXX/fmeminfo/meminfo
         Active
         Active(anon)
         Active(file)
         ...
    Contents:
         > more Active
         1571455657.385547 localhost1 0 2432912
         1571455658.386768 localhost1 0 2435064
         1571455659.387970 localhost1 0 2435768
         1571455660.389186 localhost1 0 2434616
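
The sample contents above suggest one line per sample: a seconds.microseconds timestamp, the producer name, a third field (0 here), and the metric value. A minimal sketch of formatting such a line follows. The helper name format_flatfile_line is hypothetical, and treating the third field as a component id is an assumption made for illustration, not something taken from store_flatfile.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one sample line in the layout shown above.
 * The third field is ASSUMED to be a component id.
 * Returns the number of characters written, as snprintf does.
 */
static int format_flatfile_line(char *buf, size_t len,
                                uint32_t sec, uint32_t usec,
                                const char *producer,
                                uint64_t comp_id, uint64_t value)
{
        return snprintf(buf, len,
                        "%" PRIu32 ".%06" PRIu32 " %s %" PRIu64 " %" PRIu64 "\n",
                        sec, usec, producer, comp_id, value);
}
```

Padding the microseconds to six digits keeps the timestamp column sortable as text.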

Required Functions

Required functions for all stores are defined in the ldmsd_store struct definition, near the bottom of the plugin's source file. This struct defines the specific functions to be called at various points. The get_plugin function returns the struct for this plugin.

    static struct ldmsd_store store_csv = {
        .base = {
                        .name = "csv",
                        .type = LDMSD_PLUGIN_STORE,
                        .term = term,
                        .config = config,
                        .usage = usage,
        },
        .open = open_store,
        .get_context = get_ucontext,
        .store = store,
        .flush = flush_store,
        .close = close_store,
    };

    /* Called when the plugin is loaded; returns the plugin's base struct. */
    struct ldmsd_plugin *get_plugin(ldmsd_msg_log_f pf)
    {
        msglog = pf;
        PG.msglog = pf;
        PG.pname = PNAME;
        return &store_csv.base;
    }

About the functions:

  • usage -- output for the help, which defines the usage.
  • config -- called at configuration.
  • store -- performs the actual storing.
  • get_context -- stores have a void* context, which is returned by this call.
  • term -- called when the store plugin terminates.
  • open -- the store group opens individual stores; individual stores (e.g., csv files) have the option of opening on demand.
  • close -- the store group closes individual stores.
  • flush -- individual stores (e.g., csv files) may want the option of flushing. This is useful if you have been relying on system buffering but want to trigger a flush to ensure that the data is written to the store.
Other:
  • msglog -- wrapper call for logging. Log levels are: LDMSD_LDEBUG, LDMSD_LINFO, LDMSD_LWARNING, LDMSD_LERROR, LDMSD_LCRITICAL, LDMSD_LALL.
  • PNAME -- #define for the store plugin name, for convenience; the csv excerpts on this page use it in get_plugin and as a prefix in log messages.
Some details of interest are described in more detail below.
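
To illustrate how a struct of function pointers lets the daemon call into a plugin, here is a minimal sketch. It uses simplified stand-in types (demo_store, demo_handle) rather than the real ldmsd headers, and every name in it is hypothetical; only the overall shape follows the excerpt above.

```c
#include <stddef.h>

/* Stand-in for the real ldmsd_store struct. The field names mirror the
 * excerpt above, but the types are simplified and hypothetical. */
struct demo_store {
        const char *name;
        void *(*open)(const char *container, const char *schema);
        int (*store)(void *handle, int value);
        void (*close)(void *handle);
};

/* Hypothetical per-store state returned by open. */
struct demo_handle { int samples; };
static struct demo_handle the_handle;

static void *demo_open(const char *container, const char *schema)
{
        (void)container;
        (void)schema;
        the_handle.samples = 0;
        return &the_handle;
}

static int demo_store_fn(void *handle, int value)
{
        struct demo_handle *h = handle;
        (void)value;
        h->samples++;   /* a real store would write the value somewhere */
        return 0;
}

static void demo_close(void *handle)
{
        (void)handle;
}

static struct demo_store demo = {
        .name = "demo",
        .open = demo_open,
        .store = demo_store_fn,
        .close = demo_close,
};

/* Analogous to get_plugin: hand the function table back to the caller. */
static struct demo_store *demo_get_plugin(void)
{
        return &demo;
}
```

A caller that holds only the returned pointer can open, store, and close without knowing which plugin it is talking to.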

get_plugin

get_plugin is called when the plugin is loaded with the load command.

Config

Config is invoked by the config command. Arguments from the config are passed in the attr_value_list *avl. They can be extracted by name, as shown below. av_value returns null if there is no attribute with that name:

    char *value;
    value = av_value(avl, "path");
    if (!value)
        goto err;
    

You may want to prevent config from being called multiple times if your code is not prepared to handle the implications of configuration changes. For example, if your config creates a file, then subsequent calls to config on a running store would have to handle how these changes affect the current file, particularly while storing is occurring.
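
One way to refuse a second configuration is to set the value only once under a lock. The following is a minimal sketch, assuming a single root path; the helper set_root_path_once is hypothetical, and returning EBUSY on a repeated call is just one possible policy.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static char *root_path;   /* set by the first successful call only */

/* Hypothetical helper, meant to be called from config() with the value
 * of the "path" attribute. Returns 0 on success or an errno value. */
static int set_root_path_once(const char *value)
{
        int rc = 0;

        if (!value)
                return EINVAL;
        pthread_mutex_lock(&cfg_lock);
        if (root_path) {
                rc = EBUSY;             /* already configured: refuse */
        } else {
                size_t n = strlen(value) + 1;
                root_path = malloc(n);
                if (root_path)
                        memcpy(root_path, value, n);
                else
                        rc = ENOMEM;
        }
        pthread_mutex_unlock(&cfg_lock);
        return rc;
}
```

Because a non-zero return from config aborts processing of a configuration file, a store that takes this approach should document that config may appear only once.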

At the time of config, information about the sets to be stored is unknown. This limits what config can do beyond parsing general variables.

If you are using a configuration file, a non-zero return from config will abort processing the configuration file.

Open

open on a store is called (roughly) when the store_group's store_handle is null, which is the case when the store_group is started or when the store_group has been explicitly closed. In many cases, therefore, open may be called only once.

Arguments to open include the container and schema name. This enables opening the container (e.g., setting a full file path). Via the metric_list, the metric names and types are also known; however, if a metric is an array, its size is not known. This limits the ability to write some headers at this point. Only at the store call are the values, and hence the number of values in an array, known.

Return value: a pointer to the store, or NULL on failure.
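
As a sketch of what open might do with its container and schema arguments, the helper below joins them with a root path in the <root>/<container>/<schema> layout seen in Basics. The name build_store_path is hypothetical, and a real open would go on to create the directory and open files, returning NULL if any step fails.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: join root path, container, and schema as
 * <root>/<container>/<schema>. Returns 0 on success, ENAMETOOLONG if the
 * result does not fit in buf, or EINVAL on a formatting error. */
static int build_store_path(char *buf, size_t len, const char *root,
                            const char *container, const char *schema)
{
        int n = snprintf(buf, len, "%s/%s/%s", root, container, schema);

        if (n < 0)
                return EINVAL;
        if ((size_t)n >= len)
                return ENAMETOOLONG;    /* output was truncated */
        return 0;
}
```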

Store

store receives as parameters the store_handle, the set, and an array of metric indices together with the length of that array.

a) Only at the store call are the values, and hence the number of values in an array, known. Thus some stores that print a header (e.g., store_csv) do so at this point.

b) get the timestamp for the set

    const struct ldms_timestamp ts = ldms_transaction_timestamp_get(set);

c) get the name of the producer

    pname = ldms_set_producer_name_get(set);

d) iterate through the metrics, check the type to write the value correctly

    for (i = 0; i != metric_count; i++) {
            enum ldms_value_type metric_type = ldms_metric_type_get(set, metric_array[i]);
            /* use the same formats as ldms_ls */
            switch (metric_type) {
            case LDMS_V_U64:
                    rc = fprintf(s_handle->file, ",%"PRIu64,
                                 ldms_metric_get_u64(set, metric_array[i]));
                    if (rc < 0)
                            msglog(LDMSD_LERROR, PNAME ": Error %d writing to '%s'\n",
                                   rc, s_handle->path);
                    break;
            /* ... cases for the other metric types ... */
            default:
                    break;
            }
    }

Return code: store is called repeatedly, and the store_group does not check its return value. The caller therefore has no way to know whether a store call completed properly, so the store should log its own errors.
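
Since the caller ignores the return value, a store may want to keep its own record of failures. The following is a minimal sketch, assuming a file-backed store; the struct demo_store_handle and the helper demo_write_u64 are hypothetical and are not the real store handle type.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-store state; not the real store handle type. */
struct demo_store_handle {
        FILE *file;
        unsigned long write_errors; /* failed writes seen so far */
        int last_rc;                /* return code of the latest failure */
};

/* Write one u64 value in csv style and remember any failure, so it can
 * be logged or reported later (for example, at close). */
static int demo_write_u64(struct demo_store_handle *sh, uint64_t v)
{
        int rc = fprintf(sh->file, ",%" PRIu64, v);

        if (rc < 0) {
                sh->write_errors++;
                sh->last_rc = rc;
        }
        return rc;
}
```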

Close

close can be used to close a store, for example by closing a file handle. Closing a store manually does not result in that state being transmitted to the store group. When the store_group is stopped, it calls close on each of its stores.

Flush

flush can be used to trigger a flush to a store on demand. The store_group does not have a flush; however, you can take flush information as part of config and use it to trigger the flush at appropriate moments (store_csv does this). This is useful if you have been relying on system buffering but want to trigger a flush to ensure that the data is written to the store.

        struct flatfile_store_instance *si = _sh;
        struct flatfile_metric_store *ms;
        int lrc, rc = 0, eno; /* rc keeps the last fflush failure, if any */
        LIST_FOREACH(ms, &si->ms_list, entry) {
                pthread_mutex_lock(&ms->lock);
                lrc = fflush(ms->file);
                if (lrc) {
                        rc = lrc;
                        eno = errno;
                        msglog(LDMSD_LERROR, "Error %d: %s at %s:%d\n", eno, strerror(eno),
                                        __FILE__, __LINE__);
                }
                pthread_mutex_unlock(&ms->lock);
        }
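
Taking flush information from config and acting on it during storing could look like the sketch below. It flushes after every N store calls, which is only one possible policy; the struct demo_flush_state, its flush_every field, and the helper demo_maybe_flush are hypothetical and do not describe how store_csv implements this.

```c
#include <stdio.h>

/* Hypothetical per-store state; flush_every would be parsed in config(). */
struct demo_flush_state {
        FILE *file;
        unsigned int flush_every;  /* 0 means rely on system buffering */
        unsigned int since_flush;  /* store calls since the last flush */
};

/* Call at the end of each store(). Returns 1 if a flush was performed,
 * 0 if none was due, and -1 if fflush failed. */
static int demo_maybe_flush(struct demo_flush_state *st)
{
        if (st->flush_every == 0)
                return 0;
        if (++st->since_flush < st->flush_every)
                return 0;
        st->since_flush = 0;
        return fflush(st->file) == 0 ? 1 : -1;
}
```

In a multi-threaded store this call would be made while holding the same per-target lock that protects the writes.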

Usage

usage returns the usage information (in most cases, the config options):

    return
    "    config name=store_flatfile path=<path>\n"
    "              - Set the root path for the storage of flatfiles.\n"
    "              path      The path to the root of the flatfile directory\n";

Term

term is called when the plugin is terminated.

Multiple Storage Targets and Locks

While samplers might frequently have only a single set, it is far more common that store_plugins will have multiple targets to which they are storing. The store_plugin may define a way to store (e.g., csv files) for a variety of sets (or schema) or even for variables within a set. Thus, more bookkeeping may be required for a store_plugin. A typical construction is that a store_plugin has an array/list of each schema and its associated target.
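
The bookkeeping described above can be sketched as a small fixed-size table that maps a schema name to its target. All names here (demo_target, demo_target_find, demo_target_get, DEMO_MAX_TARGETS) are hypothetical, and a real store would guard the table with its configuration lock.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_TARGETS 8

/* Hypothetical bookkeeping entry: one storage target per schema. */
struct demo_target {
        char schema[64];
        char path[256];
};

static struct demo_target demo_targets[DEMO_MAX_TARGETS];
static size_t demo_target_count;

/* Return the target registered for a schema, or NULL if there is none. */
static struct demo_target *demo_target_find(const char *schema)
{
        size_t i;

        for (i = 0; i < demo_target_count; i++) {
                if (strcmp(demo_targets[i].schema, schema) == 0)
                        return &demo_targets[i];
        }
        return NULL;
}

/* Return the existing target for a schema, or register a new one.
 * Returns NULL if the table is full or a name does not fit. */
static struct demo_target *demo_target_get(const char *schema, const char *path)
{
        struct demo_target *t = demo_target_find(schema);

        if (t)
                return t;
        if (demo_target_count == DEMO_MAX_TARGETS)
                return NULL;
        if (strlen(schema) >= sizeof(t->schema) || strlen(path) >= sizeof(t->path))
                return NULL;
        t = &demo_targets[demo_target_count++];
        strcpy(t->schema, schema);  /* lengths were checked above */
        strcpy(t->path, path);
        return t;
}
```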

Typically two types of locks are needed in implementing a store.

One is for configuration changes and possibly related changes, such as maintaining the array of schema and targets:

    static int config(struct ldmsd_plugin *self, struct attr_value_list *kwl, struct attr_value_list *avl)
    {
            char *value = av_value(avl, "path");
            ...
            pthread_mutex_lock(&cfg_lock);
            root_path = strdup(value);
            pthread_mutex_unlock(&cfg_lock);
            return 0;
    }

The other is a per-target lock, which is held when writing to a store and when making changes such as rolling over a store:

     static int store(ldmsd_store_handle_t _sh, ldms_set_t set, int *metric_arry, size_t metric_count)
     {
        struct flatfile_store_instance *si;
              
        ...
        
        for (i=0; i<metric_count; i++) {
                pthread_mutex_lock(&si->ms[i]->lock);
              
                enum ldms_value_type metric_type =
                        ldms_metric_type_get(set, metric_arry[i]);
                switch (metric_type) {
                        ...
                        case LDMS_V_U64:
                        rc = fprintf(si->ms[i]->file, " %"PRIu64"\n",
                             ldms_metric_get_u64(set, metric_arry[i]));
                        break;
                }
                pthread_mutex_unlock(&si->ms[i]->lock);
        }
        ...
     }

Other implementation considerations

  • The metric_sets have a single timestamp for a group of variables. This can facilitate analysis. The store can split up, or further process, the variables as desired.
  • rollover - stores that write to files can produce large files. Features like rollover can limit file sizes and make it easier to find files of interest. Log management tools such as logrotate could handle some of this; however, for sufficiently large files the copy performed by logrotate can be time intensive, and a file handle swap can invalidate the store's open file handle. It may therefore be more efficient to do the file management directly in the store (many of the stores do this).
  • performance - A variety of considerations around write performance, effect on ldmsd daemons, and open file handles are discussed at Configuration-Considerations-and-Best-Practices-(v4)
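
The bookkeeping side of size-based rollover can be sketched as follows. This is an illustration only: the struct demo_roll, the helpers demo_roll_account and demo_roll_name, and the <base>.<epoch> naming scheme are all assumptions, not the behavior of any existing store.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical rollover state for one file-backed target. */
struct demo_roll {
        size_t bytes_written; /* bytes written to the current file */
        size_t limit;         /* roll over at this size; 0 disables it */
};

/* Account for a write. Returns 1 when the current file should be rolled
 * over, otherwise 0. */
static int demo_roll_account(struct demo_roll *r, size_t nbytes)
{
        r->bytes_written += nbytes;
        return r->limit != 0 && r->bytes_written >= r->limit;
}

/* Build the next file name as <base>.<epoch>. Returns 0 on success or -1
 * if the name does not fit. The caller would then open the new file, swap
 * it in while holding the per-target lock, close the old handle, and
 * reset bytes_written. */
static int demo_roll_name(char *buf, size_t len, const char *base, long epoch)
{
        int n = snprintf(buf, len, "%s.%ld", base, epoch);

        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Doing the swap inside the store, under the same lock used for writes, avoids the stale file handle problem mentioned for external tools.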

Headerfiles and Utilities

Directory and Supporting Files

Contributed store_plugins will go under a contrib directory in their own directory, as described in Contributing. Include not only the store_plugin, but also the Makefile.am, and the man page for the plugin.

NOTE: this directory structure has changed slightly since the LDMSCON2019 tutorial slides.

The following assumes a directory structure of:

    XXX/ldms/src/contrib/store/mysite

where all of the stores for mysite will be in subdirectories under mysite. E.g.:

    XXX/ldms/src/contrib/store/mysite/mystore

A per-store directory structure is also in the works for the main-line stores.

Makefile.am

XXX/ldms/configure.ac

    OPTION_DEFAULT_ENABLE([mystore], [ENABLE_MYSITESTORE])
    AC_CONFIG_FILES([Makefile src/Makefile src/core/Makefile...
                src/contrib/store/mysite/Makefile
                src/contrib/store/mysite/mystore/Makefile

XXX/ldms/src/contrib/store/Makefile.am

    if ENABLE_MYSITESTORE
        MAYBE_MYSITESTORE = mysite
    endif
    SUBDIRS += $(MAYBE_MYSITESTORE)

XXX/ldms/src/contrib/store/mysite/Makefile.am

    if ENABLE_MYSITESTORE
        MAYBE_MYSITESTORE = mystore
    endif
    SUBDIRS += $(MAYBE_MYSITESTORE) 

XXX/ldms/src/contrib/store/mysite/mystore/Makefile.am

    SUBDIRS =
    lib_LTLIBRARIES =
    pkglib_LTLIBRARIES =
    CORE = ../../../core
    LDMSD = ../../../ldmsd
    AM_CFLAGS = -I$(srcdir)/$(CORE) -I$(top_srcdir) -I../../.. @OVIS_LIB_INCDIR_FLAG@ \
    	    -I$(srcdir)/$(LDMSD)
    STORE_LIBADD = $(CORE)/libldms.la \
    		-lcoll -lovis_util @OVIS_LIB_LIB64DIR_FLAG@ \
    	       @OVIS_LIB_LIBDIR_FLAG@
        
    ldmsstoreincludedir = $(includedir)/ldms
    ldmsstoreinclude_HEADERS = 
                
    if ENABLE_MYSITESTORE
    libmystore_la_SOURCES = mystore.c
    libmystore_la_CFLAGS = $(AM_CFLAGS)
    libmystore_la_LIBADD = $(STORE_LIBADD)
    pkglib_LTLIBRARIES += libmystore.la
    endif
    EXTRA_DIST = Plugin_mystore.man

man page

Naming convention for plugins: Plugin_mystore.man

Canonical headers and formatting for Linux man pages are used.




Advanced Topics

New Schemas After the Store Startup

Currently in v4, the schema names have to be supplied explicitly on the strgp_add line:

    strgp_add name=store_flatfile plugin=store_flatfile schema=meminfo container=fmeminfo

This means that dynamically creating a schema in the sampler will not automatically result in the store_plugin handling the new schema. Options for this are under consideration for v4.

New Instances of A Schema After the Store Startup

New instances of a schema are discovered by the daemon hosting the store plugin, which passes them to the store_plugin. The stores and the samplers can be started in either order.

Changing Schemas and Meta Data Generation Numbers

Robustness and Error Handling

a) config -- Multiple calls to config are possible, and, possibly, even desirable. In these cases, different actions in configuration are typically distinguished by including an action parameter in the config line. Examples may be when you want to provide a set of default configuration parameters and some overrides for particular stores.

b) store -- store is called repeatedly. The store_group does not check the return value of the store. Therefore there is no way to know if the store call actually completed properly.

c) schema names -- neither ldmsd nor any current store checks that a schema has not changed. A change may happen in any of the following ways:

  • samplers on two different types of nodes might generate different schema contents under the same name (e.g., if the nodes have different architectures and hence different entries in a source in /proc)
  • a sampler plugin in a running daemon has been explicitly stopped and restarted with different schema contents but the same name
  • a sampler daemon has died and been restarted, again with the sampler plugin defining different contents under the same schema name.
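
A store that wants to guard against this could remember the metric names and types it saw at open and compare them on later store calls. The sketch below shows only the comparison, using a hypothetical stand-in struct demo_metric with an integer type code; it is not an existing ldmsd facility and does not use the meta data generation number.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one metric's description. */
struct demo_metric {
        const char *name;
        int type;       /* stand-in for the metric's value type */
};

/* Return 1 if both metric lists have the same length and the same names
 * and types in the same order, otherwise 0. */
static int demo_schema_matches(const struct demo_metric *a, size_t na,
                               const struct demo_metric *b, size_t nb)
{
        size_t i;

        if (na != nb)
                return 0;
        for (i = 0; i < na; i++) {
                if (a[i].type != b[i].type || strcmp(a[i].name, b[i].name) != 0)
                        return 0;
        }
        return 1;
}
```

On a mismatch the store could log an error and skip the sample, or open a new target, depending on what its users expect.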

Current limitations on store_csv

under construction
