Notice

This document is for a development version of Ceph.

Metrics

CephFS uses perf counters to track metrics. The counters can be labeled (see Labeled Perf Counters).

Client Metrics

CephFS exports client metrics as Labeled Perf Counters, which can be used to monitor client performance. CephFS exports the following client metrics.

Client Metrics

Name                    Type    Description
--------------------    -----   -----------
num_clients             Gauge   Number of client sessions
cap_hits                Gauge   Percentage of file capability hits over total number of caps
cap_miss                Gauge   Percentage of file capability misses over total number of caps
avg_read_latency        Gauge   Mean value of the read latencies
avg_write_latency       Gauge   Mean value of the write latencies
avg_metadata_latency    Gauge   Mean value of the metadata latencies
dentry_lease_hits       Gauge   Percentage of dentry lease hits handed out over the total dentry lease requests
dentry_lease_miss       Gauge   Percentage of dentry lease misses handed out over the total dentry lease requests
opened_files            Gauge   Number of opened files
opened_inodes           Gauge   Number of opened inodes
pinned_icaps            Gauge   Number of pinned inode caps
total_inodes            Gauge   Total number of inodes
total_read_ops          Gauge   Total number of read operations generated by all processes
total_read_size         Gauge   Number of bytes read in input/output operations generated by all processes
total_write_ops         Gauge   Total number of write operations generated by all processes
total_write_size        Gauge   Number of bytes written in input/output operations generated by all processes

Subvolume Metrics

CephFS exports subvolume metrics as Labeled Perf Counters, which can be used to monitor subvolume performance. CephFS exports the following subvolume metrics. Subvolume metrics are aggregated within a sliding window of 30 seconds (the default value, configurable via the subv_metrics_window_interval parameter; see the MDS Config Reference). In large Ceph clusters with tens of thousands of subvolumes, this parameter also helps clean up stale metrics: when a subvolume's sliding window becomes empty, its metrics are removed rather than reported as "zero" values, reducing memory usage and computational overhead.
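For example, the window size can be inspected or adjusted at runtime through the standard configuration interface; the 60-second value below is purely illustrative:

```shell
# Show the current sliding-window interval (seconds)
ceph config get mds subv_metrics_window_interval

# Widen the window to 60 seconds (illustrative value)
ceph config set mds subv_metrics_window_interval 60
```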

Subvolume Metrics

Name                    Type    Description
--------------------    -----   -----------
avg_read_iops           Gauge   Average read IOPS (input/output operations per second) over the sliding window
avg_read_tp_Bps         Gauge   Average read throughput in bytes per second
avg_read_lat_msec       Gauge   Average read latency in milliseconds
avg_write_iops          Gauge   Average write IOPS over the sliding window
avg_write_tp_Bps        Gauge   Average write throughput in bytes per second
avg_write_lat_msec      Gauge   Average write latency in milliseconds

Getting Metrics

The metrics can be scraped from the MDS admin socket as well as through the tell interface. The mds_client_metrics-<fsname> section in the output of the counter dump command displays the metrics for each client, as shown below:
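For instance, assuming a running MDS daemon, either of the following should produce the dump (the daemon name is a placeholder):

```shell
# Via the admin socket on the MDS host
ceph daemon mds.<name> counter dump

# Or remotely, via the tell interface
ceph tell mds.<name> counter dump
```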

"mds_client_metrics": [
    {
        "labels": {
            "fs_name": "<fsname>",
            "id": "14213"
        },
        "counters": {
            "num_clients": 2
        }
    }
],
"mds_client_metrics-<fsname>": [
    {
        "labels": {
            "client": "client.0",
            "rank": "0"
        },
        "counters": {
            "cap_hits": 5149,
            "cap_miss": 1,
            "avg_read_latency": 0.000000000,
            "avg_write_latency": 0.000000000,
            "avg_metadata_latency": 0.000000000,
            "dentry_lease_hits": 0,
            "dentry_lease_miss": 0,
            "opened_files": 1,
            "opened_inodes": 2,
            "pinned_icaps": 2,
            "total_inodes": 2,
            "total_read_ops": 0,
            "total_read_size": 0,
            "total_write_ops": 4836,
            "total_write_size": 633864192
        }
    },
    {
        "labels": {
            "client": "client.1",
            "rank": "0"
        },
        "counters": {
            "cap_hits": 3375,
            "cap_miss": 8,
            "avg_read_latency": 0.000000000,
            "avg_write_latency": 0.000000000,
            "avg_metadata_latency": 0.000000000,
            "dentry_lease_hits": 0,
            "dentry_lease_miss": 0,
            "opened_files": 1,
            "opened_inodes": 2,
            "pinned_icaps": 2,
            "total_inodes": 2,
            "total_read_ops": 0,
            "total_read_size": 0,
            "total_write_ops": 3169,
            "total_write_size": 415367168
        }
    }
]
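As a sketch of post-processing, the per-client counters can be consumed from the dump with any JSON tooling. The snippet below (illustrative, not part of Ceph; the fs name "a" and the counter values are taken from an abridged version of the output above) computes each client's capability hit rate:

```python
import json

# Abridged counter dump fragment, as produced by "counter dump".
dump = json.loads("""
{
  "mds_client_metrics-a": [
    {"labels": {"client": "client.0", "rank": "0"},
     "counters": {"cap_hits": 5149, "cap_miss": 1}},
    {"labels": {"client": "client.1", "rank": "0"},
     "counters": {"cap_hits": 3375, "cap_miss": 8}}
  ]
}
""")

# Hit rate = hits / (hits + misses), per client.
for entry in dump["mds_client_metrics-a"]:
    c = entry["counters"]
    total = c["cap_hits"] + c["cap_miss"]
    rate = 100.0 * c["cap_hits"] / total if total else 0.0
    print(f'{entry["labels"]["client"]}: {rate:.2f}% cap hit rate')
```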

The subvolume metrics are dumped as part of the same command. The mds_subvolume_metrics section in the output of the counter dump command displays the metrics for each subvolume, as shown below:

"mds_subvolume_metrics": [
    {
        "labels": {
            "fs_name": "a",
            "subvolume_path": "/volumes/_nogroup/test_subvolume"
        },
        "counters": {
            "avg_read_iops": 0,
            "avg_read_tp_Bps": 11,
            "avg_read_lat_msec": 0,
            "avg_write_iops": 1564,
            "avg_write_tp_Bps": 6408316,
            "avg_write_lat_msec": 338
        }
    }
]
