Commit dee73ee3 authored by Alexandre Duret-Lutz's avatar Alexandre Duret-Lutz

parse_aut: simplify the interface

* src/parseaut/public.hh, src/parseaut/parseaut.yy,
src/parseaut/fmterror.cc: Add a raise_errors options.  Remove the
parse_strict() method.  Store parse errors and filename in the output
parsed_aut to simplify usage.
* doc/org/tut20.org, doc/org/tut21.org, doc/org/tut30.org,
src/bin/autfilt.cc, src/bin/common_hoaread.cc, src/bin/dstar2tgba.cc,
src/bin/ltlcross.cc, src/bin/ltldo.cc, src/tests/complementation.cc,
src/tests/ikwiad.cc, src/tests/ltlcross3.test, src/tests/ltldo.test,
wrap/python/spot.py, wrap/python/tests/parsetgba.py: Adjust usage.
* NEWS: Mention the changes.
parent 3d5d1606
......@@ -22,11 +22,11 @@ New in spot 1.99.4a (not yet released)
Library:
* Rename dtgba_complement() to dtwa_complement(), rename the header
as complement.hh, and restrict the purpose of this function to
just complete the automaton and complement its acceptance
condition. Any further acceptance condition transformation
can be done with to_generalized_buchi() or remove_fin().
* dtgba_complement() was renamed to dtwa_complement(), moved to
complement.hh, and its purpose was restricted to just completing
the automaton and complementing its acceptance condition. Any
further acceptance condition transformation can be done with
to_generalized_buchi() or remove_fin().
* The remove_fin() has learnt how to better deal with automata that
are declared as weak. This code was previously in
......@@ -37,7 +37,7 @@ New in spot 1.99.4a (not yet released)
acceptance. The most visible effect is in the output of "ltl2tgba
-s XXXa": it used to have 5 accepting states, it now has only one.
(Changing removing acceptance of those 4 states has no effect on
the language, but it speedup some algorithms like NDFS-based
the language, but it speeds up some algorithms like NDFS-based
emptiness checks, as discussed in our Spin'15 paper.)
* The HOA parser will diagnose any version that is not v1, unless it
......@@ -47,6 +47,12 @@ New in spot 1.99.4a (not yet released)
make it easier to introduce new options. One such new option is
"trust_hoa": when true (the default) supported properties declared
in HOA files are trusted even if they cannot be easily be verified.
Another option "raise_errors" now replaces the method
automaton_stream_parser::parse_strict().
* The output of the automaton parser now include the list of parse
errors (that does not have to be passed as a parameters) and the
input filename (making the output of error message easier).
* renamings:
ltl_simplifier -> tl_simplifier
......@@ -54,9 +60,10 @@ New in spot 1.99.4a (not yet released)
tgba_sub_statistics::sub_transitions -> twa_sub_statistics::transitions
tgba_run -> twa_run
reduce_run -> twa_run::reduce
replay_twa_run -> twa_run::replay
print_twa_run -> operator<<
twa_run_to_tgba -> twa_run::as_twa
replay_tgba_run -> twa_run::replay
print_tgba_run -> operator<<
tgba_run_to_tgba -> twa_run::as_twa
format_parse_aut_errors -> parsed_aut::format_errors
Python:
......
......@@ -124,16 +124,15 @@ State: 4
* C++
Parsing an automaton is almost similar to [[file:tut01.org][parsing an LTL formula]]. The
=parse_aut()= function takes a filename, a reference to a
=parse_aut_error_list= object to populate (should errors be found) and
a BDD dictionary (to be discussed later on this page). It returns a
shared pointer to a structure that has two fields: =aborted= is a
=parse_aut()= function takes a filename and a BDD dictionary (to be
discussed later on this page). It returns a shared pointer to a
structure that has a couple of important fields: =aborted= is a
Boolean telling if the input automaton was voluntarily aborted (a
feature of [[file:hoa.org][the HOA format]]), and =aut= is the actual automaton. The
shared pointer returned by =parse_aut()= might be null (in which case
the the =parse_aut_error_list= is guaranteed not to be empty), but
since the parser performs some error recovery it is likely that an
automaton is returned even in the presence of parse errors.
feature of [[file:hoa.org][the HOA format]]), =errors= is a list of syntax errors that
occurred while parsing the automaton (printing these errors is up to
you), and =aut= is the actual automaton. The parser usually tries to
recover from errors, so =aut= may not be null even if =errors= is
non-empty.
#+BEGIN_SRC C++ :results verbatim :exports both
#include <string>
......@@ -144,10 +143,9 @@ automaton is returned even in the presence of parse errors.
int main()
{
std::string input = "tut20.never";
spot::parse_aut_error_list pel;
spot::bdd_dict_ptr dict = spot::make_bdd_dict();
spot::parsed_aut_ptr pa = parse_aut(input, pel, dict);
if (spot::format_parse_aut_errors(std::cerr, input, pel))
spot::parsed_aut_ptr pa = parse_aut(input, dict);
if (pa->format_errors(std::cerr))
return 1;
// This cannot occur when reading a never claim, but
// it could while reading a HOA file.
......
......@@ -71,10 +71,8 @@ corresponding BDD variable number, and then use for instance
int main()
{
std::string input = "tut21.hoa";
spot::parse_aut_error_list pel;
spot::bdd_dict_ptr dict = spot::make_bdd_dict();
spot::parsed_aut_ptr pa = parse_aut(input, pel, dict);
if (spot::format_parse_aut_errors(std::cerr, input, pel))
spot::parsed_aut_ptr pa = parse_aut(input, spot::make_bdd_dict());
if (pa->format_errors(std::cerr))
return 1;
// This cannot occur when reading a never claim, but
// it could while reading a HOA file.
......
......@@ -240,10 +240,8 @@ automaton to process.
int main()
{
std::string input = "tut30.hoa";
spot::parse_aut_error_list pel;
spot::bdd_dict_ptr dict = spot::make_bdd_dict();
spot::parsed_aut_ptr pa = parse_aut(input, pel, dict);
if (spot::format_parse_aut_errors(std::cerr, input, pel))
spot::parsed_aut_ptr pa = parse_aut(input, spot::make_bdd_dict());
if (pa->format_errors(std::cerr))
return 1;
if (pa->aborted)
{
......
......@@ -676,27 +676,22 @@ namespace
int
process_file(const char* filename)
{
spot::parse_aut_error_list pel;
auto hp = spot::automaton_stream_parser(filename, opt_parse);
int err = 0;
while (!abort_run)
{
pel.clear();
auto haut = hp.parse(pel, opt->dict);
if (!haut && pel.empty())
auto haut = hp.parse(opt->dict);
if (!haut->aut && haut->errors.empty())
break;
if (spot::format_parse_aut_errors(std::cerr, filename, pel))
if (haut->format_errors(std::cerr))
err = 2;
if (!haut)
if (!haut->aut)
error(2, 0, "failed to read automaton from %s", filename);
else if (haut->aborted)
err = std::max(err, aborted(haut, filename));
else
process_automaton(haut, filename);
}
return err;
}
};
......
......@@ -40,13 +40,13 @@ spot::automaton_parser_options opt_parse;
spot::twa_graph_ptr
read_automaton(const char* filename, spot::bdd_dict_ptr& dict)
{
spot::parse_aut_error_list pel;
auto p = spot::parse_aut(filename, pel, dict,
auto p = spot::parse_aut(filename, dict,
spot::default_environment::instance(),
opt_parse);
if (spot::format_parse_aut_errors(std::cerr, filename, pel)
|| !p || p->aborted)
if (p->format_errors(std::cerr))
error(2, 0, "failed to read automaton from %s", filename);
if (p->aborted)
error(2, 0, "failed to read automaton from %s (--ABORT-- read)", filename);
return std::move(p->aut);
}
......
......@@ -171,27 +171,22 @@ namespace
int
process_file(const char* filename)
{
spot::parse_aut_error_list pel;
auto hp = spot::automaton_stream_parser(filename, opt_parse);
int err = 0;
while (!abort_run)
{
pel.clear();
auto haut = hp.parse(pel, spot::make_bdd_dict());
if (!haut && pel.empty())
auto haut = hp.parse(spot::make_bdd_dict());
if (!haut->aut && haut->errors.empty())
break;
if (spot::format_parse_aut_errors(std::cerr, filename, pel))
if (haut->format_errors(std::cerr))
err = 2;
if (!haut)
if (!haut->aut)
error(2, 0, "failed to read automaton from %s", filename);
else if (haut->aborted)
err = std::max(err, aborted(haut, filename));
else
process_automaton(haut, filename);
}
return err;
}
};
......
......@@ -547,28 +547,17 @@ namespace
problem = false;
es = 0;
spot::parse_aut_error_list pel;
std::string filename = output.val()->name();
auto aut = spot::parse_aut(filename, pel, dict,
auto aut = spot::parse_aut(output.val()->name(), dict,
spot::default_environment::instance(),
opt_parse);
if (!pel.empty())
if (!aut->errors.empty())
{
status_str = "parse error";
problem = true;
es = -1;
std::ostream& err = global_error();
err << "error: failed to parse the produced automaton.\n";
spot::format_parse_aut_errors(err, filename, pel);
end_error();
res = nullptr;
}
else if (!aut)
{
status_str = "empty output";
problem = true;
es = -1;
global_error() << "error: empty output.\n";
aut->format_errors(err);
end_error();
res = nullptr;
}
......
......@@ -168,25 +168,15 @@ namespace
else if (output.val())
{
problem = false;
spot::parse_aut_error_list pel;
std::string filename = output.val()->name();
auto aut = spot::parse_aut(filename, pel, dict,
auto aut = spot::parse_aut(output.val()->name(), dict,
spot::default_environment::instance(),
opt_parse);
if (!pel.empty())
if (!aut->errors.empty())
{
problem = true;
std::cerr << "error: failed to parse the automaton "
"produced by \"" << cmd << "\".\n";
spot::format_parse_aut_errors(std::cerr, filename, pel);
res = nullptr;
}
else if (!aut)
{
problem = true;
std::cerr << "error: command \"" << cmd
<< "\" produced an empty output.\n";
aut->format_errors(std::cerr);
res = nullptr;
}
else if (aut->aborted)
......
......@@ -23,18 +23,16 @@
namespace spot
{
bool
format_parse_aut_errors(std::ostream& os,
const std::string& filename,
parse_aut_error_list& error_list)
parsed_aut::format_errors(std::ostream& os)
{
bool printed = false;
spot::parse_aut_error_list::iterator it;
for (it = error_list.begin(); it != error_list.end(); ++it)
for (auto& err : errors)
{
if (filename != "-")
if (!filename.empty() && filename != "-")
os << filename << ':';
os << it->first << ": ";
os << it->second << std::endl;
os << err.first << ": ";
os << err.second << std::endl;
printed = true;
}
return printed;
......
......@@ -24,7 +24,7 @@
%name-prefix "hoayy"
%debug
%error-verbose
%lex-param { spot::parse_aut_error_list& error_list }
%lex-param { PARSE_ERROR_LIST }
%define api.location.type "spot::location"
%code requires
......@@ -46,6 +46,10 @@
extern "C" int strverscmp(const char *s1, const char *s2);
#endif
// Work around Bison not letting us write
// %lex-param { res.h->errors }
#define PARSE_ERROR_LIST res.h->errors
inline namespace hoayy_support
{
typedef std::map<int, bdd> map_t;
......@@ -142,7 +146,6 @@ extern "C" int strverscmp(const char *s1, const char *s2);
}
}
%parse-param {spot::parse_aut_error_list& error_list}
%parse-param {result_& res}
%parse-param {spot::location initial_loc}
......@@ -1433,7 +1436,7 @@ nc-formula: nc-formula-or-ident
here.end.column = here.begin.column + j.first.end.column - 1;
here.begin.line += j.first.begin.line - 1;
here.begin.column += j.first.begin.column - 1;
error_list.emplace_back(here, j.second);
res.h->errors.emplace_back(here, j.second);
}
bdd cond = bddfalse;
if (f)
......@@ -1614,7 +1617,7 @@ lbtt-guard: STRING
here.end.column = here.begin.column + j.first.end.column - 1;
here.begin.line += j.first.begin.line - 1;
here.begin.column += j.first.begin.column - 1;
error_list.emplace_back(here, j.second);
res.h->errors.emplace_back(here, j.second);
}
if (!f)
{
......@@ -1679,7 +1682,7 @@ void
hoayy::parser::error(const location_type& location,
const std::string& message)
{
error_list.emplace_back(location, message);
res.h->errors.emplace_back(location, message);
}
static spot::acc_cond::acc_code
......@@ -1867,15 +1870,14 @@ static void fix_properties(result_& r)
r.h->aut->prop_state_based_acc();
}
static void check_version(const result_& r,
spot::parse_aut_error_list& error_list)
static void check_version(const result_& r)
{
if (r.h->type != spot::parsed_aut_type::HOA)
return;
auto& v = r.format_version;
if (v.size() < 2 || v[0] != 'v' || v[1] < '1' || v[1] > '9')
{
error_list.emplace_front(r.format_version_loc, "unknown HOA version");
r.h->errors.emplace_front(r.format_version_loc, "unknown HOA version");
return;
}
const char* beg = &v[1];
......@@ -1883,17 +1885,18 @@ static void check_version(const result_& r,
long int vers = strtol(beg, &end, 10);
if (vers != 1)
{
error_list.emplace_front(r.format_version_loc, "unsupported HOA version");
r.h->errors.emplace_front(r.format_version_loc,
"unsupported HOA version");
return;
}
constexpr const char supported[] = "1";
if (strverscmp(supported, beg) < 0 && !error_list.empty())
if (strverscmp(supported, beg) < 0 && !r.h->errors.empty())
{
std::ostringstream s;
s << "we can read HOA v" << supported
<< " but this file uses " << v << "; this might "
<< "cause the following errors";
error_list.emplace_front(r.format_version_loc, s.str());
r.h->errors.emplace_front(r.format_version_loc, s.str());
return;
}
}
......@@ -1930,19 +1933,31 @@ namespace spot
hoayyclose();
}
void raise_parse_error(const parsed_aut_ptr& pa)
{
if (pa->aborted)
pa->errors.emplace_back(pa->loc, "parsing aborted");
if (!pa->errors.empty())
{
std::ostringstream s;
if (pa->format_errors(s))
throw parse_error(s.str());
}
// It is possible that pa->aut == nullptr if we reach the end of a
// stream. It is not necessarily an error.
}
parsed_aut_ptr
automaton_stream_parser::parse(parse_aut_error_list& error_list,
const bdd_dict_ptr& dict,
automaton_stream_parser::parse(const bdd_dict_ptr& dict,
environment& env)
{
restart:
result_ r;
r.opts = opts_;
r.h = std::make_shared<spot::parsed_aut>();
r.h = std::make_shared<spot::parsed_aut>(filename_);
r.h->aut = make_twa_graph(dict);
r.env = &env;
hoayy::parser parser(error_list, r, last_loc);
hoayy::parser parser(r, last_loc);
static bool env_debug = !!getenv("SPOT_DEBUG_PARSER");
parser.set_debug_level(opts_.debug || env_debug);
hoayyreset();
......@@ -1957,7 +1972,7 @@ namespace spot
// Bison 3.0.2 lacks a += operator for locations.
r.h->loc = r.h->loc + e.pos;
}
check_version(r, error_list);
check_version(r);
last_loc = r.h->loc;
last_loc.step();
if (r.h->aborted)
......@@ -1966,8 +1981,10 @@ namespace spot
goto restart;
return r.h;
}
if (opts_.raise_errors)
raise_parse_error(r.h);
if (!r.h->aut)
return nullptr;
return r.h;
if (r.state_names)
r.h->aut->set_named_prop("state-names", r.state_names);
fix_acceptance(r);
......@@ -1976,27 +1993,34 @@ namespace spot
return r.h;
};
twa_graph_ptr
automaton_stream_parser::parse_strict(const bdd_dict_ptr& dict,
environment& env)
parsed_aut_ptr
parse_aut(const std::string& filename, const bdd_dict_ptr& dict,
environment& env, automaton_parser_options opts)
{
parse_aut_error_list pel;
auto a = parse(pel, dict, env);
if (!pel.empty())
auto localopts = opts;
localopts.raise_errors = false;
parsed_aut_ptr pa;
try
{
std::ostringstream s;
if (format_parse_aut_errors(s, filename_, pel))
throw parse_error(s.str());
automaton_stream_parser p(filename, localopts);
pa = p.parse(dict, env);
}
if (!a)
return nullptr;
catch (std::runtime_error& e)
{
if (opts.raise_errors)
throw;
parsed_aut_ptr pa = std::make_shared<spot::parsed_aut>(filename);
pa->errors.emplace_back(spot::location(), e.what());
return pa;
}
if (!pa->aut && pa->errors.empty())
pa->errors.emplace_back(pa->loc, "no automaton read (empty input?)");
if (opts.raise_errors)
raise_parse_error(pa);
return pa;
}
if (a->aborted)
throw parse_error("parsing aborted");
return a->aut;
}
}
// Local Variables:
......
......@@ -45,111 +45,140 @@ namespace spot
enum class parsed_aut_type { HOA, NeverClaim, LBTT, DRA, DSA, Unknown };
/// \brief Temporary encoding of an omega automaton produced by
/// the parser.
struct SPOT_API parsed_aut
/// \brief Result of the automaton parser
struct SPOT_API parsed_aut final
{
// Transition structure of the automaton.
// This is encoded as a TGBA without acceptance condition.
/// \brief The parsed automaton.
///
/// May be null if the parser reached the end of the stream or a
/// serious error. In the latter case, \c errors is non-empty.
twa_graph_ptr aut;
/// Whether an HOA file was termined with <code>--ABORT</code>
bool aborted = false;
/// Location of the automaton in the stream.
spot::location loc;
/// Format of the automaton.
parsed_aut_type type = parsed_aut_type::Unknown;
/// Name of the stream (used for displaying syntax errors)
const std::string filename;
/// \brief Syntax errors that occurred during parsing.
///
/// Note that the parser does not print any diagnostic.
/// Deciding how to output those errors is up to you.
parse_aut_error_list errors;
parsed_aut(const std::string& str)
: filename(str)
{
}
/// \brief Format diagnostics produced by spot::parse_aut.
/// \param os Where diagnostics should be output.
/// \return \c true iff any diagnostic was output.
bool format_errors(std::ostream& os);
};
typedef std::shared_ptr<parsed_aut> parsed_aut_ptr;
typedef std::shared_ptr<const parsed_aut> const_parsed_aut_ptr;
struct automaton_parser_options
struct automaton_parser_options final
{
bool ignore_abort = false; ///< Skip aborted automata
bool debug = false; ///< Run the parser in debug mode?
bool trust_hoa = true; ///< Trust properties in HOA files
bool raise_errors = false; ///< Raise errors as exceptions.
};
class SPOT_API automaton_stream_parser
/// \brief Parse a stream of automata
///
/// This object should be constructed for a given stream (a file, a
/// file descriptor, or a raw buffer), and then it parse() method
/// may be called in a loop to parse each automaton in the stream.
///
/// Several input formats are supported, and automatically
/// recognized: HOA, LBTT, DSTAR, or neverclaim. We recommend
/// using the HOA format, because it is the most general.
///
/// The specification of the HOA format can be found at
/// http://adl.github.io/hoaf/
///
/// The grammar of neverclaim will not accept every possible
/// neverclaim output. It has been tuned to accept the output of
/// spin -f, ltl2ba, ltl3ba, and modella. If you know of some other
/// tool that produce Büchi automata in the form of a neverclaim,
/// but is not understood by this parser, please report it to
/// spot@lrde.epita.fr.
class SPOT_API automaton_stream_parser final
{
spot::location last_loc;
std::string filename_;
automaton_parser_options opts_;
public:
/// \brief Parse from a file opened file descriptor.
///
/// \param filename The file to read from.
/// \param opts Parser options.
automaton_stream_parser(const std::string& filename,
automaton_parser_options opts = {});
// Read from an already open file descriptor.
// Use filename in error messages.
/// \brief Parse from an already opened file descriptor.
///
/// \param fd The file descriptor to read from.
/// \param filename What to display in error messages.
/// \param opts Parser options.
automaton_stream_parser(int fd, const std::string& filename,
automaton_parser_options opts = {});
// Read from a buffer
/// \brief Parse from a buffer
///
/// \param data The buffer to read from.
/// \param filename What to display in error messages.
/// \param opts Parser options.
automaton_stream_parser(const char* data,
const std::string& filename,
automaton_parser_options opts = {});
~automaton_stream_parser();
parsed_aut_ptr parse(parse_aut_error_list& error_list,
const bdd_dict_ptr& dict,
/// \brief Parse the next automaton in the stream.
///
/// Note that the parser usually tries to recover from errors. It
/// can return an automaton even if it encountered an error during
/// parsing. If you want to make sure the input was parsed
/// successfully, make sure \c errors is empty and \c aborted is
/// false in the result. (Testing \c aborted is obviously
/// superfluous if the parser is configured to skip aborted
/// automata.)
///
/// The \c aut field of the result can be null in two conditions:
/// some serious error occurred (in this case \c errors is non
/// empty), or the end of the stream was reached.
///
/// \warning This function is not reentrant.
parsed_aut_ptr parse(const bdd_dict_ptr& dict,
environment& env =
default_environment::instance());
// Raises a parse_error on any syntax error
twa_graph_ptr parse_strict(const bdd_dict_ptr& dict,
environment& env =
default_environment::instance());
};
/// \brief Build a spot::twa_graph from a HOA file or a neverclaim.
/// \brief Read the first spot::twa_graph from a file.
/// \param filename The name of the file to parse.
/// \param error_list A list that will be filled with
/// parse errors that occured during parsing.
/// \param dict The BDD dictionary where to use.
/// \param env The environment of atomic proposition into which parsing
/// should take place.
/// \param opts Additional options to pass to the parser.
/// \return A pointer to the tgba built from \a filename, or
/// 0 if the file could not be opened.
/// \return A pointer to a \c parsed_aut structure.
///
/// Note that the parser usually tries to recover from errors. It can
/// return a non zero value even if it encountered error during the
/// parsing of \a filename. If you want to make sure \a filename
/// was parsed succesfully, check \a error_list for emptiness.
///
/// The specification of the HOA format can be found at
/// http://adl.github.io/hoaf/
///
/// The grammar of neverclaim will not accept every possible
/// neverclaim output. It has been tuned to accept the output of
/// spin -f, ltl2ba, ltl3ba, and modella. If you know of some other
/// tool that produce Büchi automata in the form of a neverclaim,
/// but is not understood by this parser, please report it to
/// spot@lrde.epita.fr.
/// This is a wrapper around spot::automaton_stream_parser that returns
/// the first automaton of the file. Empty inputs are reported as
/// syntax errors, so the \c aut field of the result is guaranteed not
/// to be null if \c errors is empty. (This is unlike
/// automaton_stream_parser::parse() where a null \c aut denots the
/// end of a stream.)
///
/// \warning This function is not reentrant.
inline parsed_aut_ptr
SPOT_API parsed_aut_ptr