Month: May 2014

Cacti, RRDs, and Disk Block Sizes

Abstract

In this test, the disk block size of an ext2/3/4 file system on Linux did not affect the amount of data written to disk.  Apparently, every allowed blocksize (1024, 2048 or the default of 4096 bytes) results in 4096 bytes being written to disk for each byte changed.

Introduction

Cacti is a network monitoring program which gathers statistics every minute or so, and writes them to disk.  When monitoring large systems, you may find that the volume of disk IO becomes a problem.

Cacti writes its data to RRD files (Round-Robin Databases).  Each RRD holds related values from one piece of equipment.  These values may be stored for every minute, and then aggregated every 30 minutes and every 24 hours, for example.  As a result, in an RRD of one megabyte, only a handful of bytes are updated at each interval of a minute or so.

Unfortunately, although only a few bytes are updated, disk IO happens in much larger blocks.  Thus disk IO can be 100 times the theoretical minimum, or more.
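
As a rough worked example of that ratio, the sketch below divides the size of one disk write by the number of bytes an update actually changes.  The 8-byte update and the 4096-byte write unit are assumptions chosen for illustration, not figures measured from Cacti, and write_amplification is a name made up here.

```python
def write_amplification(bytes_changed, io_unit=4096):
    """Return how many times more bytes reach the disk than were changed.

    Assumes each update dirties exactly one io_unit-sized block.
    """
    return io_unit / float(bytes_changed)

# Hypothetical update: 8 bytes changed inside an RRD file.
print(write_amplification(8))  # 512.0
```

Under these assumptions a single small update already costs several hundred times its own size in IO.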

To help with this, I attempted to verify that reducing the blocksize of a filesystem would reduce the IO.  This failed.  Modifying the blocksize had no effect on IO.

Method

Creating the Filesystem

A file system was created to test each blocksize, with a command similar to the following:

mke2fs -t ext4 -b 4096 /dev/sdb1

where /dev/sdb1 is the disk partition and 4096 is the blocksize (the others tested were 2048 and 1024).
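
Before running the test it may be worth confirming that the mounted filesystem really has the intended blocksize.  The sketch below reads the block size the kernel reports through os.statvfs; the path "." is a placeholder for wherever the test partition is mounted, and filesystem_block_size is a helper name invented for this example.  I would expect the value to match the -b argument on ext2/3/4, but treat that as an assumption to check.

```python
import os

def filesystem_block_size(path):
    """Return the block size reported by the filesystem that holds path."""
    return os.statvfs(path).f_bsize

# "." is a placeholder; point this at the mount point of the test partition.
print(filesystem_block_size("."))
```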

Writing to the File

The ‘dd’ command was used to create an initial 1MB file filled with null bytes.  The test then wrote into this file.

dd if=/dev/zero of=test.fil bs=1024 count=1024
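
If dd is not at hand, the same starting file can be produced from Python.  This is only an equivalent sketch, not what was used in the test; create_zero_file is a made-up helper name, and the size matches the 1024 x 1024 bytes that dd writes above.

```python
import os

def create_zero_file(path, size=0x100000):
    """Create (or overwrite) path with `size` zero bytes."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)

create_zero_file("test.fil")
print(os.path.getsize("test.fil"))  # 1048576
```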

The actual test was done with this short Python program:

import sys
import mmap
import random

NSIZE = 0x100000    #  File size (1 MB)
fname = "test.fil"  #  File name

#  Open the file and map it into memory
f = open(fname, "r+b")
mm = mmap.mmap(f.fileno(), 0)

#  Initialize random number generator object
r = random.Random()

#  Get the number of bytes to write from the command line
n = int(sys.argv[1])
for i in range(n):
    pos = r.randint(0, NSIZE-1)  #  Choose a file offset at random
    mm[pos:pos+1] = b"X"         #  Store one X byte in the mapped file
mm.close()
f.close()
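
One thing to keep in mind when reading the results: because the positions are random, two writes can occasionally land in the same stretch of the file, which would make the IO for that run look smaller than expected.  The sketch below counts how many distinct 4096-byte regions a set of positions touches.  The 4096-byte region size is an assumption taken from the result reported in this post, and regions_touched is a helper name invented here.

```python
import random

REGION = 4096       # Assumed write granularity in bytes (illustrative constant)
NSIZE = 0x100000    # File size, as in the test program

def regions_touched(positions, region=REGION):
    """Return the number of distinct region-sized chunks the positions hit."""
    return len(set(pos // region for pos in positions))

r = random.Random()
positions = [r.randint(0, NSIZE - 1) for _ in range(3)]
print(regions_touched(positions))  # usually 3, occasionally fewer
```

With a 1MB file there are 256 such regions, so for two or three writes a collision is rare.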

Measuring the Disk IO

The Python program was run with values for n of 0, 1, 2, 3, while watching the output from iostat to determine the number of KBytes written to disk.  Here’s an example of using iostat:

> iostat -p sdb 5
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb               1.00         0.00         1.60          0          8
sdb1              1.00         0.00         1.60          0          8
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.40    0.00    1.30    1.70    0.00   93.60
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb               1.20         0.00         2.40          0         12
sdb1              1.20         0.00         2.40          0         12
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.41    0.00    1.30    0.50    0.00   93.79
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb               1.40         0.00         3.20          0         16
sdb1              1.40         0.00         3.20          0         16

Here “-p sdb” tells iostat to report only on the device /dev/sdb and its partitions, and the “5” tells iostat to print a new report every 5 seconds.

While “iostat” is running, the test Python program is run.  Iostat will show how many Kbytes are written to disk.
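
On Linux the counters behind these figures are exposed in /proc/diskstats, so the same measurement can be scripted instead of read off the screen.  The sketch below is one way to do that under stated assumptions: it takes two snapshots of the file's text and converts the change in the sectors-written counter to KB.  The function names are made up for this example, and the two snapshot lines are invented data, not output from the test machine.

```python
def sectors_written(diskstats_text, device):
    """Return the 'sectors written' counter for device, or None if absent.

    diskstats_text is the content of /proc/diskstats; the counter is the
    tenth whitespace-separated field on each line.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    return None

def kb_written(before, after, device):
    """KB written to device between two /proc/diskstats snapshots."""
    delta = sectors_written(after, device) - sectors_written(before, device)
    return delta * 512 // 1024   # diskstats sectors are 512 bytes each

# Made-up snapshot lines, for illustration only.
before = "   8      17 sdb1 10 0 80 5 20 0 160 9 0 12 14"
after = "   8      17 sdb1 10 0 80 5 22 0 176 11 0 14 16"
print(kb_written(before, after, "sdb1"))  # 8
```

In a real run, each snapshot would be the text read from /proc/diskstats, taken just before the test program and just after the sync.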

testwrite.py 1; sync   #  Wait five seconds
testwrite.py 2; sync   #  Wait five seconds
testwrite.py 3; sync

The lines above write 1, 2 and 3 bytes to the file.  Waiting 5 seconds between commands lets iostat report the number of kilobytes written for each one separately; the iostat output shown above is from such a test.  It shows that 8, 12 and 16 kilobytes were written to disk when the program wrote 1, 2 and 3 bytes.  Each additional byte added 4 KB, so these readings correspond to a constant overhead of 4 KB per run plus 4 KB written to disk for each byte written by the test program.  The 4 KB per byte persisted whether the filesystem blocksize was 1024, 2048 or 4096 (in each case only the overhead grew, not the amount of data written to disk per byte written by the program).
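
The split between per-byte cost and fixed overhead can be checked with a straight-line fit through the three readings in the iostat output shown earlier.  This is a small sketch of that arithmetic; fit_line is a helper written for this example and the sample pairs are copied from that output.

```python
# (bytes written by the program, kB_wrtn reported by iostat)
samples = [(1, 8), (2, 12), (3, 16)]

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = float(len(points))
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

slope, intercept = fit_line(samples)
print(slope, intercept)
```

For these three readings the fit gives a slope of 4 KB per byte written and an intercept of 4 KB.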