sysmon: System Health Tracker

Version 1

    Introduction

    Linux provides a range of system tools, each capturing a different kind of diagnostic information. sysmon is a wrapper script that invokes the needed commands and collects their diagnostic output from the Linux server.

    Use Case / Scenario

    • sar/nmon reports are usually scheduled to collect system stats at 10-minute intervals. Such a long interval is often too coarse to be useful, and collection has to be tuned to a shorter interval such as 60s or 30s.
    • In addition, further details are often needed, such as process-level CPU/memory consumption, the count of open files, and so on.
    • These scenarios call for a custom stat collector that can be tailored to the situation at hand.

    Features

    The following stats/commands are enabled by default:
    1. top command output
    2. vmstat command output
    3. netstat command output
    4. load average on the system
    5. list of processes that are in the Running state or performing disk I/O
    6. ps command output
    7. lsof command output
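
    Items 4 and 5 in the list above are not single named commands, so a rough sketch of how such data could be gathered may help. The function names below are hypothetical and are not taken from sysmon.sh; the sketch assumes a Linux host with /proc mounted and the procps version of ps.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not the actual sysmon.sh source.

# Print the 1, 5 and 15 minute load averages from /proc/loadavg (Linux only).
print_loadavg() {
    cut -d ' ' -f 1-3 /proc/loadavg
}

# Read "STATE PID COMMAND" rows on stdin and keep only processes whose state
# starts with R (running or runnable) or D (uninterruptible sleep, usually disk I/O).
filter_running_or_io() {
    awk '$1 ~ /^[RD]/'
}

# List such processes on the live system. The trailing "=" on each column
# name suppresses the header row.
list_running_or_io() {
    ps -e -o state= -o pid= -o comm= | filter_running_or_io
}
```

    Calling print_loadavg and list_running_or_io once per collection cycle would give output comparable to the loadavg and runnables steps, assuming that is how the script defines them.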

     

    Note:
    If any of the above commands is not available on the server, its output is not generated. Make sure that all the tools/commands listed above are installed on the Linux server.
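
    Before a first run, a quick check along the following lines can show which tools are missing. This is a minimal sketch: missing_tools is a hypothetical helper, not part of sysmon.sh, and the tool list simply repeats the commands named in the feature list above.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of sysmon.sh.

# Print the name of every command given as an argument that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands named in the feature list; prints nothing when all are installed.
missing_tools top vmstat netstat ps lsof
```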

     

    Usage / Example

    sysmon.sh -f <frequency-in-seconds>  -i <number-of-iterations>

    Options:

    -f  : interval (in seconds) between two collection cycles

    -i  : number of times the diagnostics are collected
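
    To illustrate how the two options interact, here is a minimal sketch of a collection loop driven by -f and -i. It is not the actual sysmon.sh source: run_cycles is a hypothetical name, the default values are assumptions, and the echo line stands in for the real collectors.

```shell
#!/bin/sh
# Hypothetical sketch of option handling; not the actual sysmon.sh source.

run_cycles() {
    frequency=60     # assumed default interval, in seconds
    iterations=1     # assumed default number of cycles
    OPTIND=1         # reset so the function can be called more than once

    while getopts "f:i:" opt; do
        case "$opt" in
            f) frequency=$OPTARG ;;
            i) iterations=$OPTARG ;;
            *) echo "usage: sysmon.sh -f <seconds> -i <iterations>" >&2
               return 2 ;;
        esac
    done

    count=1
    while [ "$count" -le "$iterations" ]; do
        echo "cycle $count of $iterations"    # placeholder for the collectors
        # Sleep between cycles, but not after the last one.
        if [ "$count" -lt "$iterations" ]; then
            sleep "$frequency"
        fi
        count=$((count + 1))
    done
}
```

    With this sketch, run_cycles -f 30 -i 10 would run ten cycles spaced 30 seconds apart, finishing after roughly 270 seconds plus collection time.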

    The script writes its output to the /tmp/sysmon_HOSTNAME directory, where HOSTNAME is the hostname of the Linux server. To change the output directory, modify the following line in the script to the desired path:

    OUTPUT_PREFIX=/tmp/sysmon_`hostname`

    If the script needs to run every day, once every 60 seconds, the following crontab entry can be added:

    * * * * * <PATH_TO_SYSMON_SCRIPT>/sysmon.sh 1 1

     

    Sample Run Output

    [UserA@HostABC]$ sysmon.sh 1 1

    snippet from sysmon.log:

    2020-01-16 07:00:08: Woke up!
    2020-01-16 07:00:08: START top
    2020-01-16 07:00:09: DONE top
    2020-01-16 07:00:09: START vmstat
    2020-01-16 07:00:10: DONE vmstat
    2020-01-16 07:00:10: START netstat
    2020-01-16 07:00:10: DONE netstat
    2020-01-16 07:00:10: START loadavg
    2020-01-16 07:00:10: DONE loadavg
    2020-01-16 07:00:10: START runnables
    2020-01-16 07:00:10: DONE runnables
    2020-01-16 07:00:10: START iostat
    2020-01-16 07:00:11: DONE iostat
    2020-01-16 07:00:11: Zzzzz
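
    The START/DONE pairs in the snippet above suggest that each collector is bracketed by timestamped log lines. A minimal sketch of that pattern follows; log_msg, collect, and the LOG_FILE default are hypothetical names chosen for illustration, and the real script may be organized differently.

```shell
#!/bin/sh
# Hypothetical logging helpers; names and paths are assumptions.

LOG_FILE=${LOG_FILE:-/tmp/sysmon_example.log}

# Append one message prefixed with a "YYYY-MM-DD HH:MM:SS: " timestamp.
log_msg() {
    printf '%s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}

# Run one collector command, bracketing it with START/DONE log lines and
# saving its output (stdout and stderr) to the given file.
# Usage: collect <name> <output-file> <command> [args...]
collect() {
    name=$1
    outfile=$2
    shift 2
    log_msg "START $name"
    "$@" > "$outfile" 2>&1
    log_msg "DONE $name"
}
```

    For example, collect vmstat /tmp/vmstat.out vmstat 1 2 would write two log lines around a two-sample vmstat run.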

     

    Files (Runnable & Source Code)

    • Tool on OneDrive Location
    • Location of tool on Informatica server: INFA_HOME/tools/debugtools/sysmon/linux_sysmon.sh
    • P4 Locations: NA

    System Requirements

    • Informatica Server files