As a policy writer, I would like to read a line-based file (such as CSV), filtered by class expressions, into a data container, where the first line of the delimited file contains the field names.
- Return a data container that uses column headings as keys instead of positional indices.
- Filter the returned data set by class expression (ifvarclass).
  - If you thought you wanted to use classmatch(), use it to define a class that is then used by ifvarclass.
- Lexically sort the data container by a specified field name.
  - This is a really nice feature to have: when a template rendered from this data iterates over the key-value pairs, the order is retained, which makes diffs more readable.
Why some people prefer line-based files:
- One row with bad data doesn't invalidate the entire file (as happens with JSON).
- Easy for people
- Easy for spreadsheets
- Invalid data rows result in a warning and are discarded
- Empty rows are silently ignored.
- classexpresson_filterdata( "path to datafile", "Class expression Column/Key", "DELIM", "Has heading", "sort by")
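The requested behavior can be sketched in Python to pin down the semantics. This is only an illustration, not CFEngine code: the names `class_expr_true` and `filter_data` are hypothetical, the function takes an iterable of lines plus a set of defined classes instead of a file path, and the class expression grammar is reduced to class names, `any`, `!`, `.`, `&`, `|`, and parentheses.

```python
import re
import warnings

# One token: a class name or one of the operators ! . & | ( )
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[!.&|()])")
PY_OPS = {"!": "not", ".": "and", "&": "and", "|": "or", "(": "(", ")": ")"}


def class_expr_true(expr, defined):
    """Return True if the class expression holds for the set of defined classes."""
    expr = expr.strip()
    parts, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"bad class expression: {expr!r}")
        token = match.group(1)
        pos = match.end()
        # Operators map to Python keywords; class names become True/False.
        parts.append(PY_OPS.get(token) or str(token == "any" or token in defined))
    try:
        # Only True/False/not/and/or/parentheses can reach eval here.
        return bool(eval(" ".join(parts), {"__builtins__": {}}))
    except SyntaxError:
        raise ValueError(f"bad class expression: {expr!r}") from None


def filter_data(lines, class_column, delim, has_heading, sort_by, defined):
    """Parse delimited lines into a list of records filtered by class expression."""
    rows = [ln.rstrip("\n") for ln in lines]
    rows = [ln for ln in rows if ln.strip()]  # empty rows are silently ignored
    if not rows:
        return []
    first = rows[0].split(delim)
    if has_heading:
        fields, body = [f.strip() for f in first], rows[1:]
    else:
        # Assumption: without a heading, keys fall back to positions "0", "1", ...
        fields, body = [str(i) for i in range(len(first))], rows
    records = []
    for ln in body:
        values = [v.strip() for v in ln.split(delim)]
        if len(values) != len(fields):
            warnings.warn(f"discarding invalid row: {ln!r}")
            continue
        record = dict(zip(fields, values))
        try:
            keep = class_expr_true(record[class_column], defined)
        except ValueError as exc:
            warnings.warn(f"discarding row {ln!r}: {exc}")
            continue
        if keep:
            records.append(record)
    if sort_by:
        records.sort(key=lambda r: r[sort_by])  # lexical sort on one field
    return records
```

Splitting with `str.split` rather than a CSV parser is a deliberate simplification: it handles multi-character delimiters such as `;;` but does not handle quoted fields.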
Here is an example with delimiter ,.
Here is an example with delimiter ;;.
With the classes debian and production defined, it should be translated into this data container:
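As a hedged illustration with made-up rows, the Python sketch below shows the kind of translation meant here for the `;;` delimiter. The column names `key`, `value`, and `context`, the sample values, and the helper `to_container` are all assumptions, and the class column is simplified to a single class name rather than a full class expression.

```python
import json

# Hypothetical input, invented for illustration only.
RAW = """\
key;;value;;context
motd;;Welcome to production;;production
motd;;Welcome to test;;test
pkg_manager;;apt;;debian
pkg_manager;;yum;;redhat
"""


def to_container(raw, delim, defined):
    """Map each row to header-keyed values, keeping rows whose class is defined."""
    lines = [ln for ln in raw.splitlines() if ln.strip()]
    header = lines[0].split(delim)
    container = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split(delim)))
        if row["context"] in defined:  # simplified: one class name per row
            container.append(row)
    return container


container = to_container(RAW, ";;", {"debian", "production"})
print(json.dumps(container, indent=2))
```

Only the rows tagged `production` and `debian` survive, and each one is keyed by the column headings rather than by position.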
Example template to be used with:
NOTE: This example template structure does NOT match the suggested returned data format!
For this template to work, the following data structure would be necessary:
This example shows how data can be sharded, which can help with execution speed and may help align the data with the different groups that manage different aspects.
In the example the data is sharded into defaults, datacenter, application, and security. We want each shard to be able to override the keys of the previous one, so that everyone is given sensible defaults and settings are adjusted for environmental factors (datacenter, application/role, etc.), but security has the final word and can set mandatory values.
Sharding the data allows for separation of concerns. Global IT can control the data set for default settings, while facility and application admins can override it with settings that are appropriate for their location and application. The policy writer controls the model and the merge strategy, which determines which data will be used to configure the system in the end. Each sharded data set can leverage CFEngine class expressions to determine which data is loaded from the file.
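The merge strategy described here can be sketched in Python as a last-writer-wins merge over an ordered list of shards. The shard contents and the helper name `merge_shards` are hypothetical; only the shard names and their precedence come from the description above.

```python
# Shards from lowest to highest precedence, as described in the text.
SHARD_ORDER = ("defaults", "datacenter", "application", "security")


def merge_shards(shards, order=SHARD_ORDER):
    """Merge key/value shards so that later shards override earlier ones."""
    merged = {}
    for name in order:
        merged.update(shards.get(name, {}))  # a missing shard contributes nothing
    return merged


# Hypothetical shard contents, invented for illustration.
shards = {
    "defaults": {
        "ntp_server": "pool.example.org",
        "ssh_root_login": "yes",
        "log_level": "info",
    },
    "datacenter": {"ntp_server": "ntp.dc1.example.org"},
    "application": {"log_level": "debug", "ssh_root_login": "yes"},
    "security": {"ssh_root_login": "no"},
}

settings = merge_shards(shards)
print(settings)
```

Because the shards are plain key/value maps, an override replaces a value in place and no key appears twice in the merged result.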
These are identical examples, but they use data that, as originally requested, is parsed into named key-value pairs based on the column header. This seems less desirable if data is to be merged.
Note how the keys are duplicated in the final data set.
For simple cases where no data merging is involved, it may be OK if the function that loads the data ensures that keys are unique in the returned data.
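To make the duplication concrete, here is a small Python sketch under the assumption that each row becomes a record with a `key` field (a hypothetical column name). Concatenating such row lists keeps both rows for the same key, whereas a loader that enforces uniqueness keeps only the last one. The helpers `merge_rows` and `unique_by_key` are illustrative names, not proposed functions.

```python
def merge_rows(*shards):
    """Concatenate row lists; rows that share a key are all kept."""
    merged = []
    for rows in shards:
        merged.extend(rows)
    return merged


def unique_by_key(rows, key_field="key"):
    """Keep only the last row seen for each key, in first-seen key order."""
    latest = {}
    for row in rows:
        latest[row[key_field]] = row
    return list(latest.values())


# Hypothetical rows, invented for illustration.
defaults = [
    {"key": "ntp_server", "value": "pool.example.org"},
    {"key": "ssh_root_login", "value": "yes"},
]
security = [{"key": "ssh_root_login", "value": "no"}]

merged = merge_rows(defaults, security)   # ssh_root_login appears twice
deduplicated = unique_by_key(merged)      # the security row wins
```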
- Ignore lines
  - Like we have today for many functions that parse data files.
  - Don't consider lines matching a regular expression (such as comments).
- Additional filter
  - Let the function further restrict which data is allowed.
  - Example: the data file has linux = value x and, later and therefore more specific, debian = value y. The function could load it with !debian to filter out the debian-specific line.
  - Because people are crazy. The flexibility allows the policy writer to work around issues in the incoming data set.
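Both options can be sketched in Python. This is a rough illustration only: `load_rows` and its parameters are hypothetical, the row layout `context,key,value` is assumed, the ignore pattern is assumed to match the whole line, and the additional filter is simplified to a set of class names to exclude (the `!debian` case from the example).

```python
import re


def load_rows(lines, delim=",", ignore=None, exclude_contexts=()):
    """Load context/key/value rows, skipping ignored lines and excluded contexts."""
    skip = re.compile(ignore) if ignore else None
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines that fully match the ignore pattern.
        if not line or (skip and skip.fullmatch(line)):
            continue
        context, key, value = line.split(delim)
        if context in exclude_contexts:  # additional filter, e.g. "!debian"
            continue
        rows.append({"context": context, "key": key, "value": value})
    return rows


# Hypothetical data mirroring the linux/debian example above.
lines = [
    "# editor settings",
    "linux,editor,vi",
    "debian,editor,nano",
    "",
]

rows = load_rows(lines, ignore=r"#.*", exclude_contexts={"debian"})
```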