Processing project config files using a formal language approach, implemented in Python
Piotr Nikiel
ATLAS Central DCS
Which config files is this about?
WinCC OA project config files, like this:
[general]
pvss_path = "/opt/WinCC_OA/3.11"
proj_path = "/det/dcs/Production/ATLAS_DCS_MUO"
proj_path = "/det/dcs/Production/ATLAS_DCS_MDT"
proj_path = "/det/dcs/Production/ATLAS_DCS_MDT/ATLMDTMDM"
proj_path = "/det/dcs/fwInstallation/fwInstallation-3.11"
proj_path = "/localdisk/winccoa/fwComponents_ATLMDTMDM1"
proj_path = "/localdisk/winccoa/ATLMDTMDM1"
proj_version = "3.11"
userName = "root"
password = ""
langs = "en_US.iso88591"
distributed = 1
useRDBArchive = 1
useRDBGroups = 1
maxConnectMessageSize = 0
[dist]
[ValueArchiveRDB]
DbUser = "ATLAS_PVSSMDT_W"
Db = "ATONR_PVSSPROD"
DbType = "ORACLE"
writeWithBulk = 1
queryOverBounds = 0
sendMaxTS = 0
maxRequestThreads = 1
bufferToDisk = 2
The real problem – ATLAS Example
● Process 150 WinCC OA projects for the 3.15 update
  – we migrate by recreating projects in 3.15; the new config file is based on a template
  – we import the old config into the new one (made from a template) to make the final one
  – custom rules to apply, for example:
    ● some things to go out, some to go in, remove duplicated settings, warn about conflicting settings, etc.
    ● drop old version fwInstallation path
    ● preserve manager port numbers from old config, ignore those from the new one
    ● remove trailing comments
    ● warn when “LoadCtrlLibs” is used (ATLAS DCS policy)
    ● merge duplicated sections
    ● erase particular keys from particular sections
    ● remove contents of DbPass setting in ValueArchiveRDB section (but keep the key)
● We want it automated: too risky to process 150+150 config files by hand
So why is it not so easy?
● parsers in use have a “text processor” rather than a “syntax tree” perspective
  – no grammar-based processing
  – output correctness depends on the programmer's skills (and mood) rather than on “by-design” principles
  – results in overcomplicated code that takes too much effort to write
● the WinCC OA config parser gets it wrong sometimes
  – pacfg functions
  – the installation tool gets confused sometimes (order reversal, etc.)
● even simpler problems are not so easy to code without a proper (= syntax-tree based) approach:
  – move a component-inserted config entry to another section
        #begin fwComponent
        mySetting = myValue
        #end fwComponent
  – parse recursive blocks of config statements
  – merge duplicated sections
  – ensure there are no trailing comments
The theory
The state of the art
● define a grammar
  – e.g. in Backus-Naur Form
    https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form
● use a corresponding parser to open the input text as a syntax tree (high-level representation)
● process by operating on syntax tree nodes
● use a composer (the inverse of the parser) to transform the syntax tree back to text
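The parse, process, compose cycle can be tried out without any library. The sketch below is not the pypeg2-based tool: it is a deliberately tiny stand-in that understands only `[section]` headers and `key = value` lines, and the names `parse`, `compose` and the list/tuple layout of the tree are made up for this illustration.

```python
import re

SECTION = re.compile(r"\[(\w+)\]$")
SETTING = re.compile(r"(\w+) = (.*)$")


def parse(text):
    """Turn config text into a tree: a list of [section_name, [(key, value), ...]]."""
    tree = []
    for line in text.splitlines():
        header = SECTION.match(line)
        setting = SETTING.match(line)
        if header:
            tree.append([header.group(1), []])
        elif setting and tree:
            tree[-1][1].append((setting.group(1), setting.group(2)))
        else:
            raise ValueError("line does not fit the toy grammar: %r" % line)
    return tree


def compose(tree):
    """Inverse of parse: turn the tree back into config text."""
    lines = []
    for name, settings in tree:
        lines.append("[%s]" % name)
        lines.extend("%s = %s" % (key, value) for key, value in settings)
    return "\n".join(lines) + "\n"


text = '[general]\ndistributed = 1\nuseRDBArchive = 1\n[dist]\n'
tree = parse(text)
assert compose(tree) == text          # the round trip leaves the text unchanged

# Process on the tree, not on the text: drop one key from one section.
for name, settings in tree:
    if name == "general":
        settings[:] = [(k, v) for k, v in settings if k != "useRDBArchive"]

result = compose(tree)
```

The edit in the middle never touches a line of text; anything that does not fit the grammar is rejected at parse time instead of being silently passed through.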
Why is it robust?
● You don't work on lines of text, but on nodes of a syntax tree:

Configuration (list of section objects)
  Section “general” (has a name, may contain settings, comments, or recursive blocks of settings and comments)
    SettingLine key: “distributed”, value: “1” (has a key, value and a trailing comment)
    CommentLine (has a comment string)
    SettingLine key: “maxConnectMessageSize”, value: “0” (has a key, value and a trailing comment)
    Block (typ. introduced by fwComp) (list of Blocks, Settings or Comments)
      SettingLine ...
  Section “ValueArchiveRDB” (has a name, may contain settings, comments, or recursive blocks of settings and comments)

● The text representation follows, respecting new lines, indents, grammar rules...
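The node types of this tree could be modelled as plain Python classes. The following is a minimal sketch using standard-library dataclasses, not the classes of the actual tool; the field names, the helper `iter_settings` and the sample comment text are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class CommentLine:
    comment: str


@dataclass
class SettingLine:
    key: str
    value: str
    comment: Optional[str] = None      # trailing comment, if any


@dataclass
class Block:
    # A block may hold settings, comments and further (nested) blocks.
    elements: List[Union["Block", SettingLine, CommentLine]] = field(default_factory=list)


@dataclass
class Section:
    name: str
    contents: Block = field(default_factory=Block)


@dataclass
class Configuration:
    sections: List[Section] = field(default_factory=list)


def iter_settings(block):
    """Yield every SettingLine in a block, descending into nested blocks."""
    for element in block.elements:
        if isinstance(element, Block):
            yield from iter_settings(element)
        elif isinstance(element, SettingLine):
            yield element


cfg = Configuration([
    Section("general", Block([
        SettingLine("distributed", "1"),
        CommentLine("# a comment line"),
        Block([SettingLine("maxConnectMessageSize", "0")]),   # nested block
    ])),
    Section("ValueArchiveRDB"),
])

keys = [s.key for section in cfg.sections for s in iter_settings(section.contents)]
```

Because a `Block` can contain other `Block` objects, any operation that must reach every setting is naturally written as a recursive walk like `iter_settings`.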
The practice
Resources used
● pypeg2, as a grammar-based parser
  https://fdik.org/pyPEG/
● grammar defined in terms of pypeg2:
  – symbols (just 4) as regular expressions:
        plain_comment = re.compile(r"#(?!begin|end)[^\n]*")
        newline = re.compile(r'\n+')
        whitespace_sameline = re.compile(r'[ \t]+')
        restline_before_comment = re.compile(r'[^#\n]*')
  – production rules as Python classes with a pypeg2 grammar attribute, e.g.:
        class SettingLine(str):
            grammar = (name(), blank, '=', blank,
                       attr('arg_value', restline_before_comment),
                       attr('comment', optional(plain_comment)), newline)

        class Section(Concat):
            grammar = ('[', name(), ']', attr('comment', optional(Comment)), newline,
                       attr('contents', Block))
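To see what the four symbols accept, they can be exercised directly with Python's `re` module. The patterns below are copied from the slide; the sample lines are taken from the examples in this talk, and the way the value is located (skipping past `= `) is a simplification made only for this sketch.

```python
import re

# The four symbols, as defined above.
plain_comment = re.compile(r"#(?!begin|end)[^\n]*")
newline = re.compile(r'\n+')
whitespace_sameline = re.compile(r'[ \t]+')
restline_before_comment = re.compile(r'[^#\n]*')

# An ordinary comment matches; block markers do not, because of the negative lookahead.
is_comment = plain_comment.fullmatch("#This should not be edited manually") is not None
is_begin_comment = plain_comment.match("#begin fwDIM") is not None
is_end_comment = plain_comment.match("#end fwDIM") is not None

# The value of a setting is everything up to the first '#' or the end of the line.
line = 'distPeer = "pcatlpixiblfsm:5103" 335 #IBL fsm\n'
start = line.index('=') + 2            # skip "= " (simplification for this sketch)
value = restline_before_comment.match(line, start).group()
comment = plain_comment.match(line, start + len(value)).group()
```

One consequence of the lookahead is worth knowing: it has no word boundary, so a comment whose text starts with `end` or `begin` right after the `#` (for example `#endpoint ...`) is not recognised as a plain comment either.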
Class-to-item matching table (excerpt)
Class name      Description                       Example(s)

SettingLine     “regular” key-value pair          smoothBit = "_exp_inv"
                (with optional comment)           or
                                                  distPeer = "pcatlpixiblfsm:5103" 335 #IBL fsm

Comment

Block           group of settings as brought      #begin fwDIM
                in by a component                 #This should not be edited manually
                (recursion supported)             LoadCtrlLibs = "fwDIM"
                                                  #end fwDIM

Section         a named section                   [ctrl_50]
                                                  distributed = 0
                                                  #begin fwNode
                                                  #This should not be edited manually
                                                  LoadCtrlLibs = "fwNode/fwNode.ctl"
                                                  #end fwNode
                                                  #begin fwDevice
                                                  #This should not be edited manually
                                                  LoadCtrlLibs = "fwDevice/fwDevice.ctl"
                                                  #end fwDevice

Configuration   whole config file
Opening and saving the config
● to open:
      grammar2.initialize_grammar()
      cfg = grammar2.open_configuration(fileName)
● to save:
      out = pypeg2.compose(cfg, autoblank=False)
      with open(fn, 'w') as out_f:
          out_f.write(out)
● the cfg object above (of Configuration type) is the root of your syntax tree
Example application #1
remove all trailing comments
Visit every SettingLine in every section and remove the trailing comment by blanking out its comment attribute:
    cfg = grammar2.open_configuration(fileName)
    for section in cfg.sections:
        for element in section.contents.elements:
            if isinstance(element, grammar2.SettingLine):
                element.comment = None
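The loop above looks only at the elements directly inside each section. Since blocks can be nested, settings inside `#begin ... #end` blocks would need a recursive walk. The sketch below shows one way to do that; `SettingLine` and `Block` here are small hypothetical stand-ins (the real classes live in `grammar2`), and it assumes that a nested block appears in `elements` as a `Block` with its own `elements` list.

```python
class SettingLine:
    """Stand-in for grammar2.SettingLine (hypothetical, for illustration only)."""
    def __init__(self, name, arg_value, comment=None):
        self.name = name
        self.arg_value = arg_value
        self.comment = comment


class Block:
    """Stand-in for grammar2.Block: a list of settings, comments and nested blocks."""
    def __init__(self, elements):
        self.elements = elements


def strip_trailing_comments(block):
    """Blank out the comment of every SettingLine, descending into nested blocks.

    Returns the number of comments removed.
    """
    removed = 0
    for element in block.elements:
        if isinstance(element, Block):
            removed += strip_trailing_comments(element)
        elif isinstance(element, SettingLine) and element.comment is not None:
            element.comment = None
            removed += 1
    return removed


inner = SettingLine("LoadCtrlLibs", '"fwDIM"', "#added by hand")
outer = SettingLine("distributed", "1", "#keep in sync")
plain = SettingLine("useRDBArchive", "1")
contents = Block([outer, Block([inner]), plain])

count = strip_trailing_comments(contents)
```

Returning the count makes it easy to log how many comments were dropped per file when the rule is run over many projects.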
Example application #2
merge all duplicated sections
Iterate over all pairs of sections. If the two sections of a pair have the same name, the second one is a duplicate: append its contents to the first and remove it.
A pair is expressed as indices (i, j) into the list of sections. Removing an element shifts the following ones down, so j advances only when nothing was removed.
    def merge_duplicated_sections(sections):
        i = 0
        while i < len(sections):
            j = i + 1
            while j < len(sections):
                if sections[j].name == sections[i].name:
                    sections[i].contents.elements.extend(sections[j].contents.elements)
                    sections.pop(j)  # next candidate is now at index j
                else:
                    j += 1
            i += 1
Please note that standard Python list operations (extend, pop, ...) can be used; you don't have to learn anything new ;-)
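The same merge can also be done in a single pass with a dictionary that remembers the first section seen for each name. This is an alternative sketch, not the code from the tool; `make_section` builds hypothetical stand-in objects that only mimic the `name` and `contents.elements` attributes used above.

```python
from types import SimpleNamespace


def make_section(name, elements):
    """Hypothetical stand-in for a parsed Section (name + contents.elements)."""
    return SimpleNamespace(name=name, contents=SimpleNamespace(elements=list(elements)))


def merge_duplicated_sections_single_pass(sections):
    """Merge same-named sections into their first occurrence, in place."""
    first_seen = {}          # section name -> first section with that name
    kept = []
    for section in sections:
        first = first_seen.get(section.name)
        if first is None:
            first_seen[section.name] = section
            kept.append(section)
        else:
            first.contents.elements.extend(section.contents.elements)
    sections[:] = kept       # replace the list contents, keep the same list object


sections = [
    make_section("general", ["a"]),
    make_section("dist", ["b"]),
    make_section("general", ["c"]),
    make_section("general", ["d"]),
]
merge_duplicated_sections_single_pass(sections)
names = [s.name for s in sections]
merged = sections[0].contents.elements
```

The test data contains the same section three times on purpose: that is the case where removing list items while looping over precomputed indices tends to go wrong.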
(Other) ATLAS Examples
● Moving the “server” entry from the opcua section to the opcua_9 section (on condition that it was inserted by fwElmb)
  – the script automating this task was run on all production DCS projects
  – in addition, it identified many config files where various problems (e.g. duplicated sections) had accumulated over the years...
● Maintaining a common layout and look of config files
  – our 150 config files are edited by 20+ people...
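The first task above, moving an entry only when a given component inserted it, could look roughly like the sketch below. It is not the script that was run on the production projects. All classes are hypothetical stand-ins; in particular the `component` attribute on `Block` (the name after `#begin`) and the placeholder server value are assumptions, and the sketch presumes duplicated sections have already been merged.

```python
class SettingLine:
    def __init__(self, name, arg_value):
        self.name = name
        self.arg_value = arg_value


class Block:
    def __init__(self, elements, component=None):
        self.elements = elements
        self.component = component   # assumed: "fwElmb" for "#begin fwElmb", None for a section body


class Section:
    def __init__(self, name, contents):
        self.name = name
        self.contents = contents


def move_component_setting(sections, key, component, source, target):
    """Move `key` from section `source` to section `target`, only if `component` inserted it."""
    by_name = {section.name: section for section in sections}
    src, dst = by_name.get(source), by_name.get(target)
    if src is None or dst is None:
        return 0
    moved = 0
    for block in src.contents.elements:
        if isinstance(block, Block) and block.component == component:
            hits = [e for e in block.elements
                    if isinstance(e, SettingLine) and e.name == key]
            for hit in hits:
                block.elements.remove(hit)
                dst.contents.elements.append(hit)
                moved += 1
    return moved


manual = SettingLine("server", '"hand-written"')             # not inside a component block
elmb = Block([SettingLine("server", '"placeholder-server"')], component="fwElmb")
sections = [
    Section("opcua", Block([manual, elmb])),
    Section("opcua_9", Block([])),
]

moved = move_component_setting(sections, "server", "fwElmb", "opcua", "opcua_9")
```

The hand-written `server` entry stays where it is because it does not sit inside an `fwElmb` block. The sketch leaves the now-empty block behind in the source section and appends the bare setting to the target; whether the block markers should travel with the setting is a policy decision this example does not make.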
Where to get it from?
https://gitlab.cern.ch/atlas-dcs-common-software/ProjectConfigFileMigration