The arrow of time

Ivan Voras' blog

Writing a GEOM GATE module, part 2

This is the second part of my short tutorial on writing a GEOM gate module for FreeBSD. If you missed it, see the first part for an introduction.

In this part, I'd like to talk about some of the mechanisms, ideas and constraints in GEOM and GEOM GATE (ggate).

GEOM deals with the type of devices which were formerly known as "block devices". Though the distinction (between "block" and "character" devices) is no longer there, for clarification and out of habit it is somewhat common to call all storage devices "block devices", mostly because they impose a constraint that all access to and from the device is done at block granularity. These blocks are usually called "sectors".

All IO on the GEOM level is done at the granularity of device sectors. This means that all offsets and sizes used in accessing the device are always multiples of the sector size (offsets are non-negative, sizes strictly positive). Some operating systems do not enforce this behaviour, but on FreeBSD you can check it by attempting to write (for example) a single byte to a disk drive with dd: such an operation will always fail. File systems are the abstraction layered on top of storage devices that provides arbitrary byte-level addressing.

On sector sizes

IO requests passing through GEOM are described by an absolute offset in bytes from the beginning of the device (a signed 64-bit integer, typed as off_t) and a buffer size in bytes (which is not entirely consistently typed, but is at least a 32-bit signed integer). Both numbers are required to be multiples of the sector size.
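To make the rule concrete, here is a minimal sketch of the kind of check it implies. The function name request_is_aligned() is made up for this post and is not part of GEOM; it only illustrates the arithmetic, using modulo instead of bit masks so that sector sizes which are not powers of two also work.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a GEOM function): returns true when a request
 * described by a byte offset and a byte length respects the sector size.
 */
static bool
request_is_aligned(off_t offset, off_t length, uint32_t sectorsize)
{
	/* Reject nonsensical input before doing any arithmetic. */
	if (sectorsize == 0 || offset < 0 || length <= 0)
		return (false);
	/* Both the start and the size must fall on sector boundaries. */
	return (offset % sectorsize == 0 && length % sectorsize == 0);
}
```

With a 512-byte sector size, a request at offset 1024 for 4096 bytes passes this check, while a single-byte write at offset 0 does not.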

Sector sizes are whatever the GEOM classes involved decide they should be, and the decision process goes from the lowest-level class upwards. Remember that GEOM supports a graph of classes which can consume an arbitrary number of lower-level devices and produce an arbitrary number of new devices. A fancy RAID level or some other transformation could, as an example, consume several devices and produce one device with a different sector size. If the lowest class through which a request passes is a disk device driver, the process will usually start with a sector size of 512 bytes (unfortunately even 4K-sector drives still announce themselves as having 512 byte sectors).

The sector size is treated opaquely by GEOM itself, and GEOM supports arbitrary sector sizes which do not have to be powers of two. It is usually more efficient when they are, though, and some file systems require that the sector size be a power of two. Though GEOM classes can produce devices with any sector size, there is an effective upper limit of 64 KiB which comes from outside GEOM. GEOM will use (and enforce) whatever sector size the classes announce for the device, and the classes themselves must handle any mismatches between what they require and what the underlying (consumed) devices support.
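One way a class might deal with such a mismatch is to widen a request so that it covers whole sectors of the underlying device. The sketch below shows only that arithmetic; struct aligned_range and widen_to_sectors() are hypothetical names invented for illustration, and a real class would also have to do the read-modify-write work for partially covered sectors.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical description of a sector-aligned byte range. */
struct aligned_range {
	off_t offset;	/* start, rounded down to a sector boundary */
	off_t length;	/* length covering the whole original request */
};

/* Assumes offset >= 0, length > 0 and sectorsize > 0. */
static struct aligned_range
widen_to_sectors(off_t offset, off_t length, uint32_t sectorsize)
{
	struct aligned_range r;
	off_t end = offset + length;

	/* Round the start down to the previous sector boundary. */
	r.offset = offset - offset % sectorsize;
	/* Round the end up, unless it already sits on a boundary. */
	if (end % sectorsize != 0)
		end += sectorsize - end % sectorsize;
	r.length = end - r.offset;
	return (r);
}
```

For example, a 100-byte request at offset 1000 on a device with 512-byte sectors becomes a 1024-byte request at offset 512.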

The anatomy of ggatel

ggatel is the simplest useful ggate module: its function is to make an arbitrary file accessible as a disk drive. At a high level, this function is exactly the same as that of mdconfig, with the distinction that mdconfig does it entirely in the kernel and is much faster because it avoids system calls and data copying to and from the kernel.

Its structure is simple: the main() function at line 212 simply parses command line arguments and sets up the other parts of the system. At this point, the most important commands ggatel can perform are CREATE and DESTROY.

Note that there is a sector size argument "-s", and that there is a unit number argument "-u". Sector sizes are arbitrary (as the ggatel module accesses a file which can be addressed with byte granularity) and were discussed above. Unit numbers are the numbers suffixed to the ggate device names, /dev/ggate%d. If not supplied (e.g. if it's -1), the kernel will supply a unique number.

The g_gatel_serve() function at line 84 is where the main part of the module is implemented. At the start it may call daemon(3) to optionally fork the process into the background before any serious work is done, and then it basically implements the GEOM GATE loop described in the first part of the tutorial.
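The per-request work inside that loop can be sketched roughly as follows. This is a portable stand-in, not the code from ggatel.c: the enum, the struct and the serve_one() function are hypothetical names, and the real module receives its requests from the kernel through g_gate_ioctl() rather than from a plain struct. The idea is simply that a read or write request on the ggate device turns into a pread(2) or pwrite(2) on the backing file at the same offset.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-ins for the kernel's request types; the names are hypothetical. */
enum req_cmd { REQ_READ, REQ_WRITE, REQ_OTHER };

struct request {
	enum req_cmd cmd;
	off_t offset;	/* byte offset into the backing file */
	size_t length;	/* number of bytes to transfer */
	void *data;	/* buffer of at least `length` bytes */
};

/* Serve one request against the backing file; returns 0 or an errno value. */
static int
serve_one(int fd, const struct request *req)
{
	ssize_t n;

	switch (req->cmd) {
	case REQ_READ:
		n = pread(fd, req->data, req->length, req->offset);
		break;
	case REQ_WRITE:
		n = pwrite(fd, req->data, req->length, req->offset);
		break;
	default:
		/* Anything this sketch does not understand is refused. */
		return (EOPNOTSUPP);
	}
	if (n == -1)
		return (errno);
	/* A short transfer is reported as an I/O error in this sketch. */
	return ((size_t)n == req->length ? 0 : EIO);
}
```

The returned value plays the role of the error code that a ggate module hands back to the kernel when it completes a request.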

The g_gate_ioctl() call is used to communicate with the kernel part of GEOM GATE, and there are a couple more g_gate_* utility functions which help with common tasks:

  • g_gate_mediasize(fd) - if the fd is another storage device, returns the size of the device; if the fd is a file, it returns the size of the file
  • g_gate_sectorsize(fd) - similar to the above, for sector sizes. As files in file systems do not actually have "sector sizes", some arbitrary number which the system thinks should be efficient enough to use as sector size will be returned (e.g. 4096).
  • g_gate_load_module() - ensures the ggate kernel module is loaded
  • g_gate_open_device() - establishes communication with the ggate kernel module
  • g_gate_close_device() - drops the communication with the ggate kernel module
  • g_gate_destroy(unit, force) - sends a G_GATE_CMD_DESTROY command to a (previously created) ggate device
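To give a feeling for what such a helper involves, here is a sketch of the regular-file case of a media size lookup. It is not the actual g_gate_mediasize() implementation: file_mediasize() is a hypothetical name, it uses only fstat(2), and it deliberately leaves out the storage device case, which needs a device-specific query.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns the size in bytes of the regular file
 * behind fd, or -1 if fd is not a regular file or fstat() fails.
 */
static off_t
file_mediasize(int fd)
{
	struct stat sb;

	if (fstat(fd, &sb) == -1)
		return (-1);
	/* Only plain files are handled in this sketch. */
	if (!S_ISREG(sb.st_mode))
		return (-1);
	return (sb.st_size);
}
```

A caller would typically pass the descriptor of the backing file it has just opened and use the result as the size of the device to be created.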

A ggate device is created in ggatel.c in g_gatel_create() at line 168, which contains boilerplate code. Basically, a struct g_gate_ctl_create is filled in with information describing the new device and a G_GATE_CMD_CREATE command is issued via g_gate_ioctl().

In the next part, I will describe what I want to create for an example ggate module and post a simple ggate module template .c file.

All content is © Ivan Voras.