7.7 Adding Parsers to Getstats

Getstats has no parsers built into its core. Instead, it searches the directories listed in Perl's @INC array (the include path) for files matching the pattern gs_parser_* and loads them. Three parsers ship with Getstats by default: one for Auto-pilot results files, one for CSV files, and one for a sequence of GNU time outputs. To add your own parser, copy one of these and modify it to suit your needs.
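The search described above can be modeled in a few lines. This is only a sketch in Python for illustration (Getstats itself is written in Perl and walks @INC); the function name find_parser_files and its arguments are hypothetical, not part of Getstats.

```python
import glob
import os


def find_parser_files(include_paths, pattern="gs_parser_*"):
    """Return every file matching `pattern` in each include directory.

    `include_paths` stands in for Perl's @INC. Directories are visited
    in the order given, and matches within a directory are sorted.
    """
    found = []
    for directory in include_paths:
        # glob.escape keeps special characters in the directory name literal.
        candidates = glob.glob(os.path.join(glob.escape(directory), pattern))
        found.extend(sorted(p for p in candidates if os.path.isfile(p)))
    return found
```

Under this model, dropping a new gs_parser_* file into any directory on the include path is enough for it to be picked up.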

Each parser defines two functions and three parameters:

Detection Function
The detection function takes a list argument. The first item is the filename, and the remaining items are the file's lines. Detection should not rely on the filename alone, because Getstats can read from stdin, which has a filename of "-". For example, the CSV detection function checks that each line is a valid CSV line. If the file is of the correct type, the detection function returns true; otherwise it returns false. If the named file does not exist, the detection function is called with the name but no lines, so that parsers expecting filename globs still get a chance to run. If no detection function returns true, an error is reported. If a detection function does return true, the corresponding parsing function is called.
Parsing Function
The parsing function takes the same arguments as the detection function, but now the file is assumed to be of the correct type (because the corresponding detection function has already returned true). The parsing function returns a two-dimensional array containing the relation represented by the input file. The first row of the array is a "label" row that has a short description of each column (e.g., "elapsed" for elapsed time). Each additional row of the array is a test result.
Description
A short description of the file type this parser supports (e.g., "Comma separated values.").
Extension
The default file extension (e.g., ".csv" is used for the CSV parser). This is not used by Auto-pilot internally, except to derive the basename of an input file.
Priority
The priority controls the order in which detection and parsing are attempted. Lower-numbered priorities are tried first. Auto-pilot results files have a priority of 64, GNU time files have a priority of 96, and CSV files have a priority of 128. If there is a chance that a parser will accept a file that is really of another type, it should have a high-numbered priority. For example, the CSV parser accepts more input than it probably should, because a single-column CSV file without any commas is valid. This means that the Auto-pilot results and GNU time parsers should get a chance to read the file first.
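
The way these five pieces fit together can be sketched as follows. This is a model in Python, not the Getstats source (real parsers are Perl files). The Parser record, the read_table function, and the "#results" marker format are all hypothetical; only the argument convention (filename first, then lines), the label row, and the priorities 64 and 128 come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, List

Table = List[List[str]]


@dataclass
class Parser:
    detect: Callable[..., bool]   # detect(filename, *lines) -> bool
    parse: Callable[..., Table]   # parse(filename, *lines) -> label row, then results
    description: str
    extension: str
    priority: int                 # lower numbers are tried first


def marked_detect(filename, *lines):
    # Hypothetical strict format: the first line must be "#results".
    return bool(lines) and lines[0].strip() == "#results"


def marked_parse(filename, *lines):
    # Skip the marker; the next line is the label row.
    return [line.split() for line in lines[1:]]


def csv_detect(filename, *lines):
    # Deliberately permissive: any non-empty line is accepted,
    # even a single column with no commas.
    return bool(lines) and all(line.strip() for line in lines)


def csv_parse(filename, *lines):
    return [[cell.strip() for cell in line.split(",")] for line in lines]


PARSERS = [
    Parser(csv_detect, csv_parse, "Comma separated values.", ".csv", 128),
    Parser(marked_detect, marked_parse, "Marked results (hypothetical).", ".res", 64),
]


def read_table(filename, lines, parsers=PARSERS):
    """Try each parser in priority order; the first detector to accept wins."""
    for parser in sorted(parsers, key=lambda p: p.priority):
        if parser.detect(filename, *lines):
            return parser.parse(filename, *lines)
    raise ValueError(f"no parser recognizes {filename!r}")
```

In this model the permissive CSV detector would also accept a "#results" file, so the stricter parser only gets the file because its priority number is lower. A file that does not exist is passed with no lines; here both detectors reject it and an error is raised.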