Lustre OST Monitoring

The Lustre OST monitoring detects Lustre OSTs that show very low performance
and can therefore be considered no longer responsive.

Since the monitoring task itself performs active IO on the OSTs, it is recommended to use a small amount of test data
to keep the impact on the production system as low as possible. Tests should also not run too often, e.g. once a day.

The monitoring task results are saved into a MySQL database, but it would also be feasible to push those metrics
via the Pushgateway client into the Prometheus monitoring system.
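
The Prometheus alternative mentioned above could look roughly like the following sketch. It is not part of this project: the metric name, the job name and the gateway address are hypothetical, and it talks to the Pushgateway HTTP endpoint directly with the standard library instead of using a client library.

```python
import urllib.request


def format_metrics(throughput_by_ost):
    """Render {ost_idx: bytes_per_second} in the Prometheus text exposition format."""
    name = "lustre_ost_write_bytes_per_second"  # hypothetical metric name
    lines = [f"# TYPE {name} gauge"]
    for ost_idx, value in sorted(throughput_by_ost.items()):
        lines.append(f'{name}{{ost_idx="{ost_idx}"}} {value}')
    # The exposition format requires a trailing newline.
    return "\n".join(lines) + "\n"


def push_metrics(body, gateway="http://localhost:9091", job="lustre_ost_monitoring"):
    """PUT the payload to the Pushgateway, replacing all metrics of this job.

    The gateway address and job name are assumptions for illustration.
    """
    request = urllib.request.Request(
        f"{gateway}/metrics/job/{job}",
        data=body.encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the payload; call push_metrics() once a Pushgateway is reachable.
    print(format_metrics({0: 150000000, 1: 90000000}), end="")
```

Using one gauge with an `ost_idx` label keeps all OSTs in a single metric family, so a slow OST can be found with one query.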

Configuration

Example config file for the task generator

Section: control

| Name | Type | Value | Description |
|------|------|-------|-------------|
| local_mode | String | yes/no, on/off, true/false and 1/0 | Specifies if local or productive mode is enabled |
| measure_interval | Int | n>=0 | Specifies the task creation time in seconds |
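
As an illustration of how the `control` section might be read, here is a minimal sketch. It assumes the config file uses INI syntax as understood by Python's `configparser`, whose `getboolean()` accepts exactly the value pairs listed for `local_mode`. The sample values and the `load_control` helper are hypothetical.

```python
import configparser

# Hypothetical sample; 86400 seconds corresponds to one run per day.
SAMPLE = """
[control]
local_mode = yes
measure_interval = 86400
"""


def load_control(text):
    """Return (local_mode, measure_interval) from the [control] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # getboolean() understands yes/no, on/off, true/false and 1/0.
    local_mode = parser.getboolean("control", "local_mode")
    measure_interval = parser.getint("control", "measure_interval")
    if measure_interval < 0:
        raise ValueError("measure_interval must be >= 0")
    return local_mode, measure_interval


if __name__ == "__main__":
    print(load_control(SAMPLE))
```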

Section: task

| Name | Type | Value | Description |
|------|------|-------|-------------|
| task_file | String | Path | Path to task config file |
| task_name | String | Name | Name of task to load |

Section: lustre

| Name | Type | Value | Description |
|------|------|-------|-------------|
| lfs_bin | String | Path | Path to Lustre lfs binary |
| target | String | Name | Target name of Lustre filesystem |
| ost_select_list | RangeSet | n>=0 | Comma-separated list of decimal OST indexes, with ranges defined by a hyphen. Leave empty for all available OSTs. |
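
To make the `ost_select_list` syntax concrete, the following standalone sketch expands such a value into a sorted list of OST indexes. It is not the parser used by the task generator; the function name `parse_ost_select_list` and the choice to return `None` for "all OSTs" are assumptions made for illustration.

```python
def parse_ost_select_list(text):
    """Expand e.g. '0-3,7' into [0, 1, 2, 3, 7].

    An empty value returns None, meaning "all available OSTs".
    """
    text = text.strip()
    if not text:
        return None
    indexes = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as '2-5' includes both end points.
            first, last = (int(p) for p in part.split("-", 1))
        else:
            first = last = int(part)
        if first < 0 or first > last:
            raise ValueError(f"invalid OST range: {part!r}")
        indexes.update(range(first, last + 1))
    return sorted(indexes)


if __name__ == "__main__":
    print(parse_ost_select_list("0-3,7"))
```

Collecting the indexes in a set means overlapping entries such as `0-3,2` select each OST only once.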

Example config file for the task

| Name | Type | Value | Description |
|------|------|-------|-------------|
| ost_idx | Int | - | Placeholder, filled during runtime |
| block_size_bytes | Int | n>0 | Block size in bytes for test data |
| total_size_bytes | Int | n>0 | Total size in bytes for test data |
| write_file_sync | Bool | on/off | Sets file sync for writing test |
| target_dir | String | Path | Path of target directory for test data on Lustre |
| lfs_bin | String | Path | Path to Lustre lfs binary |
| lfs_target | String | Name | Target name of Lustre filesystem |
| db_proxy_target | String | Host | Host name of database proxy target |
| db_proxy_port | Int | Port | Port of database proxy target |
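
The interplay of `block_size_bytes`, `total_size_bytes` and `write_file_sync` can be sketched as a timed write. This is an illustration, not the task's actual implementation: the helper `timed_write` is hypothetical, and `write_file_sync` is interpreted here as an `fsync` before the file is closed, which is an assumption. On a real Lustre file system the test file would also have to be placed on the OST under test (for example with `lfs setstripe`), which this sketch omits.

```python
import os
import tempfile
import time


def timed_write(path, block_size_bytes, total_size_bytes, write_file_sync):
    """Write total_size_bytes in blocks and return (bytes_written, seconds)."""
    if block_size_bytes <= 0 or total_size_bytes <= 0:
        raise ValueError("block_size_bytes and total_size_bytes must be > 0")
    block = b"\0" * block_size_bytes
    written = 0
    start = time.monotonic()
    with open(path, "wb") as handle:
        while written < total_size_bytes:
            # The last block is truncated so the file has exactly the total size.
            chunk = block[: total_size_bytes - written]
            handle.write(chunk)
            written += len(chunk)
        if write_file_sync:
            # Assumed meaning of write_file_sync: force the data to storage.
            handle.flush()
            os.fsync(handle.fileno())
    return written, time.monotonic() - start


if __name__ == "__main__":
    # A temporary directory stands in for target_dir in this local demo.
    with tempfile.TemporaryDirectory() as target_dir:
        size, seconds = timed_write(
            os.path.join(target_dir, "ost_test.tmp"), 1_000_000, 10_000_000, True
        )
        print(f"wrote {size} bytes in {seconds:.3f} s")
```

Without the sync, a small test file may be absorbed by the client cache and the measured time would say little about the OST itself.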

Example config file for the task

This task inherits all parameters from the LustreIOTask, but adds alerting.

| Name | Type | Value | Description |
|------|------|-------|-------------|
| mail_server | String | Host | Host name of the mail server |
| mail_sender | String | Email | Sender email address |
| mail_receiver | String | Email | Receiver email address |
| mail_threshold | Int | n>0 | Threshold in seconds for sending an email when OST performance degradation is detected |
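
The alerting parameters could be combined as in the sketch below. It assumes that `mail_threshold` is compared against the duration of an IO test, which the table does not state explicitly, and the helper `build_alert` as well as the subject and body texts are hypothetical.

```python
from email.message import EmailMessage


def build_alert(ost_idx, duration_seconds, mail_threshold, mail_sender, mail_receiver):
    """Return an EmailMessage if the test took longer than the threshold, else None."""
    if duration_seconds <= mail_threshold:
        return None
    message = EmailMessage()
    message["From"] = mail_sender
    message["To"] = mail_receiver
    message["Subject"] = f"Lustre OST {ost_idx} performance degradation"
    message.set_content(
        f"IO test on OST index {ost_idx} took {duration_seconds:.1f} s "
        f"(threshold: {mail_threshold} s)."
    )
    return message


if __name__ == "__main__":
    alert = build_alert(12, 95.0, 60, "monitor@example.org", "admin@example.org")
    print(alert)
```

A message built this way could then be handed to `smtplib.SMTP(mail_server).send_message(alert)`; sending is left out here so the sketch runs without a mail server.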