7.7 Adding Parsers to Getstats
Getstats itself has no built-in parsers. Instead, it searches the
directories listed in Perl's @INC array (the possible include paths) for
files matching the pattern gs_parser_* and includes them. Getstats ships
with three parsers by default: one for Auto-pilot results files, one for
CSV files, and one for a sequence of GNU time outputs. To add your own
parser, copy one of these parsers and modify it to suit your needs.
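The search described above can be sketched roughly as follows. This is only
an illustration of scanning include paths for gs_parser_* files, not
Getstats' actual code: the function name find_parser_files is hypothetical,
and the sketch only lists candidate files without loading them.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every file named gs_parser_* found in the
# given directories, in directory order, sorted by name within each one.
sub find_parser_files {
    my @dirs = @_;
    my @found;
    for my $dir (@dirs) {
        next if ref $dir;       # @INC may hold code references (hooks)
        next unless -d $dir;    # skip include paths that do not exist
        opendir(my $dh, $dir) or next;
        push @found,
            map  { File::Spec->catfile($dir, $_) }
            sort grep { /^gs_parser_/ } readdir $dh;
        closedir $dh;
    }
    return @found;
}

# List the parser files visible on the current include path.
print "$_\n" for find_parser_files(@INC);
```

Under this reading, a copied and modified parser only needs to be placed
in a directory that appears in @INC for the search to find it.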
Each parser defines two functions and three parameters:
- Detection Function
- The detection function takes a list as its argument. The first item is the
filename and the remaining items are the file's lines. Detection should not
rely on the filename alone, because Getstats can read from stdin, which has
a filename of "-". For example, the CSV detection function checks that each
line is a valid CSV line. If the file is of the correct type, the detection
function returns true; otherwise it returns false. If the named file does
not exist, the detection function is called with the name but no lines, so
that parsers expecting filename globs are still executed. If no detection
function returns true, an error is reported. If a detection function does
return true, the corresponding parsing function is called.
- Parsing Function
- The parsing function takes the same arguments as the detection function, but
now the file is assumed to be of the correct type (because the corresponding
detection function has already returned true). The parsing function returns a
two-dimensional array containing the relation represented by the input file.
The first row of the array is a "label" row that holds a short description of
each column (e.g., "elapsed" for elapsed time). Each additional row of the
array is a test result.
- A short description of the file type this parser supports (e.g., "Comma
- The default file extension (e.g., ".csv" is used for the CSV parser).
This is not used by Auto-pilot internally, except to derive the basename
of an input file.
- The priority controls the order in which detection and parsing are
performed. Lower-numbered priorities are tried first. Auto-pilot results
files have a priority of 64, GNU time files have a priority of 96, and CSV
files have a priority of 128. If there is a chance that a parser will
accept a file that is not really of its type, it should have a
high-numbered priority. For example, the CSV parser accepts more input
than it probably should, because a single-column CSV file without any
commas is valid. For this reason, the Auto-pilot results and GNU time
parsers should get a chance to read the file first.
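
The pieces above can be put together in a small sketch. It is not taken
from the Getstats sources. The names detect_csv, parse_csv, detect_toy,
parse_toy and parse_with_first_match, the hash keys, and the "#toy-results"
format are all hypothetical, and the way a real parser registers itself
with Getstats is not shown. The sketch also assumes that a two-dimensional
array is returned as a reference to an array of rows, and the CSV check is
deliberately simplified (no quoted fields).

```perl
use strict;
use warnings;

# Detection: first argument is the filename, the rest are the file's lines.
sub detect_csv {
    my ($filename, @lines) = @_;
    return 0 unless @lines;    # called with a name but no lines: decline
    for my $line (@lines) {
        (my $text = $line) =~ s/\r?\n\z//;
        # Simplified check: comma-separated fields with no quoting.
        return 0 unless $text =~ /\A[^,"]*(?:,[^,"]*)*\z/;
    }
    return 1;
}

# Parsing: same arguments; returns a reference to an array of rows.
# Row 0 is the label row, every later row is one test result.
sub parse_csv {
    my ($filename, @lines) = @_;
    my @table;
    for my $line (@lines) {
        (my $text = $line) =~ s/\r?\n\z//;
        push @table, [ split /,/, $text, -1 ];
    }
    return \@table;
}

# A stricter, made-up format: a "#toy-results" header line followed by
# whitespace-separated rows.
sub detect_toy {
    my ($filename, @lines) = @_;
    return (@lines && $lines[0] =~ /\A#toy-results\s*\z/) ? 1 : 0;
}

sub parse_toy {
    my ($filename, @lines) = @_;
    shift @lines;    # drop the header line
    return [ map { [ split ' ', $_ ] } @lines ];
}

# Illustrative parser table; the priority of 64 for the toy format is
# simply chosen to be lower than the CSV priority of 128.
my @parsers = (
    { description => 'Comma-separated (sketch)', extension => '.csv',
      priority => 128, detect => \&detect_csv, parse => \&parse_csv },
    { description => 'Toy results (hypothetical)', extension => '.toy',
      priority => 64, detect => \&detect_toy, parse => \&parse_toy },
);

# Try detection in ascending priority order; parse with the first match.
sub parse_with_first_match {
    my ($filename, @lines) = @_;
    for my $p (sort { $a->{priority} <=> $b->{priority} } @parsers) {
        return $p->{parse}->($filename, @lines)
            if $p->{detect}->($filename, @lines);
    }
    die "no parser recognized '$filename'\n";
}

my $table = parse_with_first_match('-',
    "#toy-results\n", "elapsed user\n", "1.5 0.9\n");
print join(',', @$_), "\n" for @$table;
```

The sample input contains no commas, so detect_csv would accept it as a
single-column CSV file. Because the stricter toy parser has the
lower-numbered priority, it is tried first and the file is parsed as
intended, which is the reason a permissive parser should be given a
high-numbered priority.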