#+TITLE: Support for the Hanoi Omega Automata (HOA) Format #+DESCRIPTION: Details about support of the HOA format in Spot #+INCLUDE: setup.org #+HTML_LINK_UP: tools.html #+PROPERTY: header-args:sh :results verbatim :exports both The [[http://adl.github.io/hoaf/][Hanoi Omega-Automata format]] is a textual representation of ω-automata labeled by Boolean formulas over a set of atomic propositions, and using an arbitrary acceptance condition. The typical acceptances conditions like Büchi, generalized-Büchi, co-Büchi, Rabin, Streett, parity, ... are all supported, but the main advantage of this format is that any arbitrary acceptance condition can be defined. The HOA format has support for many features such as non-determinism, alternation, multiple initial states, transition or state-based acceptance, named states, and a range of property flags making it possible to store additional information about the automaton. The HOA format is already supported in [[http://adl.github.io/hoaf/support.html][several tools]]. The goal of this page is to detail the support of this format in Spot. It contains some information that are useful to better understand the behavior of the tools distributed by Spot, and it also look at some lower-level, discussing details that are interesting when programming with Spot. Spot can read files written using either version 1 or version 1.1 of the HOA format. It currently outputs version 1 by default, but version 1.1 can be requested from the command-line using option =-H1.1=. Future version of Spot are likely to switch to version 1.1 of HOA by default, so version 1 can already be requested explicitly using =-H1=. Note that version 1.1 is not yet published at the time of writing, and that there are discussion about calling it version 2 instead. * Format, files, and TωA Some note about the abbreviation first. We usually write "HOA format" or occasionally HOAF to denote the format (as a specification), and HOA or "HOA file" to denote an automaton in that format. In most examples involving HOA files, we use =*.hoa= as a filename extension (even if the actual extension does not matter). When an HOA file is loaded by Spot, it is stored into the data-structure used by Spot to represent ω-Automata. This structure is called Transition-based ω-Automaton, henceforth abbreviated TωA. Such a TωA can be saved back as an HOA file. If you run a command such as =autfilt input.hoa >output.hoa= this is exactly what happens: the file =input.hoa= is parsed to create a TωA, and this TωA is then printed in the HOA format into =output.hoa=. Since the TωA structure is not a perfect one-to-one representation of the HOA format, the output may not be exactly the same as the input. * Features of the HOA format with no or limited support in Spot :PROPERTIES: :CUSTOM_ID: restrictions :END: - Automata using explicit alphabet (introduced in version 1.1 of the format via =Alphabet:=) are not supported. - The maximum number of acceptance sets used is limited to 32. In the past, this limitation has forced us to improve some of our algorithms to be less wasteful and not introduce useless acceptance sets. This hard-coded limit can be augmented at configure time using option `--enable-max-accsets=N`, but doing so will consume more memory and time. - Multiple (or missing) initial states are emulated. The internal TωA representation used by Spot supports only a single initial state. When an HOA with multiple initial states is read, it is transformed into an equivalent TωA by merging the initial states into a single one. The merged state can either be one of the original initial states (if one of those has no incoming edge) or a new state introduced for that purpose. Similarly, when an automaton with no initial state is loaded (this includes the case where the automaton has no state), a disconnected initial state is added. As a consequence, Spot's HOA output always contains at least one state, even when the input had no state. - =Fin(!x)= and =Inf(!x)= are rewritten away. Internally Spot only deals with acceptance conditions involving the primitives =Fin(x)= or =Inf(x)=. When the parser encounters the variants =Fin(!x)= or =Inf(!x)=, it automatically complements the set =x= so that the resulting acceptance uses only =Fin(x)= and =Inf(x)=. For instance =Fin(0)&Inf(!1)= gets rewritten into =Fin(0)&Inf(1)= and the membership of all transitions to the set =1= is reversed. If =x= was already used without complementation in another primitive, then a new set has to be created. For instance the acceptance =Inf(0)&Inf(!0)= can only be fixed by adding a new set, =1=, that stores the complement of set =0=, and using =Inf(0)&Inf(1)=. * Internal representations of some features In this section we discuss features of the format that are fully supported, but in a way that so people could find unexpected. These design choices do not affect the semantics of the HOA format in any way. ** State-based vs. transition-based acceptance A Transition-based ω-Automaton (TωA), as its name implies, uses transition-based acceptance sets. Each edge is stored as a quadruplet $(s,d,\ell,F)$ where $s$ and $d$ are the source and destination state numbers, $\ell$ is a Binary Decision Diagram (BDD) representing the Boolean function labeling the edge, and $F$ is a bit-vector representing the membership of the transition to each declared acceptance set. States are just numbers, and may not belong to any accepting set. When reading a HOA file that use state-based acceptance (or even a mix of state-based and transitions-based acceptance), all the acceptance are pushed onto the outgoing transitions. So an automaton represented as an HOA file with this transition structure: #+BEGIN_SRC sh :results silent :exports results cat >stvstracc.hoa <sba.hoa <1$) or =Buchi= (corresponding to $n=1$) or =all= (corresponding to $n=0$). The use of =Buchi= or =all= instead of =generalized-Buchi n= follow the same idea as our use of state-based acceptance whenever possible. By using the name of these inferior acceptance conditions, we hope that the resulting automaton can be easier to use with tools that only deal with such inferior acceptance conditions. However, unlike for state vs. transition-based acceptance, there is currently no means to request another acceptance name to be used. The [[http://adl.github.io/hoaf/#canonical-acceptance-specifications-for-classical-conditions][canonical encodings for acceptance conditions]] are specified quite strictly in the HOA format. For instance =generalized-Buchi 2= corresponds to =Inf(0)&Inf(1)=, not to =Inf(1)&Inf(0)=, even though the two formulas are equivalent. Spot's HOA output routine contains some limited form of equivalence check (based mostly on associativity and commutativity of the Boolean operators), so that if it detects such a simple inversion, it will output it in the order required to be allowed to name the acceptance condition. In the following example, you can see =autfilt= removing the duplicate Rabin pair, and reordering the remaining pair to fit the syntax corresponding to =Rabin 1=. #+BEGIN_SRC sh :wrap SRC hoa autfilt <stvstrlab.hoa <prop_complete()= because that only checks the property bits, and it might return =maybe= even if =aut= is deterministic. Instead, call the function =is_complete(aut)=. This function will first test the property bits, and do the actual check in case it is unknown. Algorithms that update a TωA should call the method =prop_keep()= and use the argument to specify which of the properties they preserve. Algorithms that input a TωA and output a new one may call the method =prop_copy()= to copy over the subset of properties they preserve. Using these two functions ensure that in the future, whenever a new property is added to the TωA class, we cannot forget to update all the calls =prop_copy()= or =prop_keep()= (because these functions will take a new argument). The =HOA= printer also tries to not bloat the output with many redundant and useless properties. For instance =deterministic= automata are necessarily =unambiguous=, and people interested in unambiguous automata know that, so Spot only outputs the =unambiguous= property if an unambiguous automaton is non-deterministic. Similarly, while Spot may output alternating automata, it does not output the =no-univ-branch= property because we cannot think of a situation where this would be useful. This decision can be overridden by passing the =-Hv= (or =--hoa=v=) option to the command-line tools: this requests "verbose" properties. The following table summarizes how supported properties are handled. In particular: - For the parser, =checked= means that the property is always inferred and checked against any declaration (if present), =trusted= means that the property will be stored without being checked (unless =--trust-hoa=no= is specified). - Stored properties are those represented as bits in the automaton. - The printer will sometime check some properties when it can do it as part of its initial "survey scan" of the automaton; in that case the stored property is not used. This makes it possible to detect deterministic automata that have been output by algorithms that do not try to output deterministic automata. | property | parser | stored | printer | notes | |----------------------+---------+--------+---------------------------------------------------------+------------------------------------------------------------------| | =state-labels= | checked | no | checked if =-Hk= | state labels are converted to transition labels when reading TωA | | =trans-labels= | checked | no | always, unless =-Hi= or =-Hk= | | | =implicit-labels= | checked | no | if =-Hi= | =-Hi= only works for deterministic automata | | =explicit-labels= | checked | no | always, unless =-Hi= | | | =state-acc= | checked | yes | checked, unless =-Ht= or =-Hm= | | | =trans-acc= | checked | no | if not =state-acc= and not =-Hm= | | | =no-univ-branch= | ignored | no | only if =-Hv= | | | =univ-branch= | checked | no | checked | | | =deterministic= | checked | yes | checked | | | =complete= | checked | yes | checked | | | =unambiguous= | trusted | yes | as stored if (=-Hv= or not =deterministic=) | can be checked with =--check=unambiguous= | | =semi-deterministic= | trusted | yes | as stored if (=-Hv= or not =deterministic=) | can be checked with =--check=semi-deterministic= | | =stutter-invariant= | trusted | yes | as stored | can be checked with =--check=stuttering= | | =stutter-sensitive= | trusted | yes | as stored (opposite of =stutter-invariant=) | can be checked with =--check=stuttering= | | =terminal= | trusted | yes | as stored | can be checked with =--check=strength= | | =very-weak= | trusted | yes | as stored if (=-Hv= or not =terminal=) | can be checked with =--check=strength= | | =weak= | trusted | yes | as stored if (=-Hv= or not (=terminal= or =very-weak=)) | can be checked with =--check=strength= | | =inherently-weak= | trusted | yes | as stored if (=-Hv= or not =weak=) | can be checked with =--check=strength= | | =colored= | ignored | no | checked | | The above table is for version 1 of the format. When version 1.1 is selected (using =-H1.1=), some negated properties may be output. In particular, =stutter-sensitive= is replaced by =!stutter-invariant=. The logic of not cluttering the output with all of =!terminal=, =!weak=, and =!inherently-weak= is similar to the positive versions: =!inherently-weak= implies =!weak= which in turn implies =!terminal=, so only one of those is emitted unless =-Hv= is used. ** Named properties :PROPERTIES: :CUSTOM_ID: named-properties :END: In addition to the bit properties discussed above, a TωA can carry named properties of any type. When attaching a property to a TωA, you only supply a name for the property, a pointer, and an optional destructor function. There are currently two [[file:concepts.org::#named-properties][named properties]] related to the HOA format. - =automaton-name= :: Is a string that stores the name of the automaton (the one given after =name:= in the HOA format) - =state-names= :: is a vector of strings that stores the name of the states (in case states are named in the HOA format) You can see these properties being preserved when an automaton is read and then immediately output: #+NAME: hello-world #+BEGIN_SRC sh :wrap SRC hoa cat >hw.hoa <decorate.hoa <