Commit 6739151e authored by Raphael Poss

2006-03-17 Raphael Poss <raph@lrde.epita.fr>

	Make Vaucanson a bit more usable when Xerces-C++ is not available.
	
	* config/vaucanson_xml.m4: Add AC_DEFINE and AM_CONDITIONAL for
	VCSN_USE_XML. Ensure both are set properly depending on
	--enable-xml.

	* check_xml.sh: Add. 
	
	* boostrap.sh: Run check_xml.sh. Run find_tests.sh using sh (for
	systems where scripts are not executables).

	* include/vaucanson/xml/XML.hh,
	* include/vaucanson/xml/error_handler.hh,
	* include/vaucanson/xml/ios.hh,
	* include/vaucanson/xml/node.hh,
	* include/vaucanson/xml/session.hh,
	* include/vaucanson/xml/strings.hh,
	* include/vaucanson/xml/tools.hh,
	* include/vaucanson/xml/xml_chooser.hh,
	* include/vaucanson/xml/xml_converter.hh,
	* include/vaucanson/xml/xerces_parser-hh.in,
	* include/vaucanson/tools/xml_display.hh,
	* include/vaucanson/tools/xml_load.hh,	
	* include/vaucanson/tools/xml_dump.hh: Complain early if
	  VCSN_USE_XML is not set.
	
	* include/vaucanson/tools/usual_macros.hh: Do not set STR2XML if
	  VCSN_USE_XML is not set.

	As a proof of concept, make the "automaton library" available even
	when XML support is disabled. Make utilities more generic.
	
	* include/vaucanson/tools/simple_dump.hh,
	* include/vaucanson/tools/simple_dump.hxx: Add.
	* include/Makefile.am: Update accordingly.

	* src/demos/automaton_library/dumper.hcc: Add logic to output
	  automata to different formats, ignoring XML if it is not available.
	* src/demos/automaton_library/Makefile.am: Update accordingly.
	
	* src/demos/automaton_library/a1.cc,
	* src/demos/automaton_library/b1.cc,
	* src/demos/automaton_library/c1.cc,
	* src/demos/automaton_library/divkbaseb.cc,
	* src/demos/automaton_library/ladybird.cc,
	* src/demos/automaton_library/double_ring.cc: Change to use dumper.hcc.
	
	* doc/README-IO.txt: Add.
	* doc/Makefile.am: Update accordingly.
	
	* doc/README.txt: Mention README-IO. Mention FAQ.
parent caa72d22
2006-03-17 Raphael Poss <raph@lrde.epita.fr>
Make Vaucanson a bit more usable when Xerces-C++ is not available.
* config/vaucanson_xml.m4: Add AC_DEFINE and AM_CONDITIONAL for
VCSN_USE_XML. Ensure both are set properly depending on
--enable-xml.
* check_xml.sh: Add.
* boostrap.sh: Run check_xml.sh. Run find_tests.sh using sh (for
systems where scripts are not executables).
* include/vaucanson/xml/XML.hh,
* include/vaucanson/xml/error_handler.hh,
* include/vaucanson/xml/ios.hh,
* include/vaucanson/xml/node.hh,
* include/vaucanson/xml/session.hh,
* include/vaucanson/xml/strings.hh,
* include/vaucanson/xml/tools.hh,
* include/vaucanson/xml/xml_chooser.hh,
* include/vaucanson/xml/xml_converter.hh,
* include/vaucanson/xml/xerces_parser-hh.in,
* include/vaucanson/tools/xml_display.hh,
* include/vaucanson/tools/xml_load.hh,
* include/vaucanson/tools/xml_dump.hh: Complain early if
VCSN_USE_XML is not set.
* include/vaucanson/tools/usual_macros.hh: Do not set STR2XML if
VCSN_USE_XML is not set.
As a proof of concept, make the "automaton library" available even
when XML support is disabled. Make utilities more generic.
* include/vaucanson/tools/simple_dump.hh,
* include/vaucanson/tools/simple_dump.hxx: Add.
* include/Makefile.am: Update accordingly.
* src/demos/automaton_library/dumper.hcc: Add logic to output
automata to different formats, ignoring XML if it is not available.
* src/demos/automaton_library/Makefile.am: Update accordingly.
* src/demos/automaton_library/a1.cc,
* src/demos/automaton_library/b1.cc,
* src/demos/automaton_library/c1.cc,
* src/demos/automaton_library/divkbaseb.cc,
* src/demos/automaton_library/ladybird.cc,
* src/demos/automaton_library/double_ring.cc: Change to use dumper.hcc.
* doc/README-IO.txt: Add.
* doc/Makefile.am: Update accordingly.
* doc/README.txt: Mention README-IO. Mention FAQ.
2006-03-17 Florent Terrones <terron_f@lrde.epita.fr>
 
* doc/xml/xml_proposal.tex: New version. Change structure and
......
......@@ -11,7 +11,9 @@ fi
(cd src/tests/sanity && /bin/sh ./generate_files.sh .)
(cd src/demos/vaucanswig && /bin/sh ./expand.sh .)
(cd src/benchs && /bin/sh ./generate_all_benchs.sh)
./find_tests.sh
sh ./find_tests.sh
sh ./check_xml.sh
$AUTORECONF -v -f -i
# disabled temporarily
......
#! /bin/sh
for f in $(find . -type f \
-and \( -name \*.am -or -name \*.cc -or -name \*.hh -or -name \*.hxx \) \
-and -exec grep -q -i xml '{}' \; \
-and -not -exec grep -q VCSN_USE_XML '{}' \; \
-and -not -path \*include/vaucanson/xml\* \
-print); do
echo "Booh! $f uses XML I/O but does not test VCSN_USE_XML."
done
......@@ -104,6 +104,10 @@ AC_DEFUN([VCSN_XML],
else
enable_xml_tests=no
fi
if test x$enable_vcsn_xml = xyes; then
AC_DEFINE([VCSN_USE_XML], 1, [Define this if you want to use XML I/O])
fi
AM_CONDITIONAL([VCSN_USE_XML], [test x$enable_vcsn_xml = xyes])
AM_CONDITIONAL([XML_CHECK], [test x$enable_xml_tests = xyes])
])
SUBDIRS = makefiles xml manual
SUBDIRS = makefiles
include $(srcdir)/Makefile.doc
dist_doc_DATA = ref.tar.gz
......@@ -16,8 +16,9 @@ MAINTAINERCLEANFILES = ref
dist_doc_DATA += README.pdf README.tex README.html \
FAQ.pdf FAQ.tex FAQ.html \
NEWS.pdf NEWS.tex NEWS.html
NEWS.pdf NEWS.tex NEWS.html \
README-IO.pdf README-IO.tex README-IO.html
EXTRA_DIST = README.txt FAQ.txt NEWS.txt
EXTRA_DIST = README.txt FAQ.txt NEWS.txt README-IO.txt
MAINTAINERCLEANFILES += $(dist_doc_DATA)
================
Vaucanson_ I/O
================
:Date: January 2005
Here is some information about input and output of automata in
Vaucanson_.
.. _Vaucanson: http://www.lrde.epita.fr/vaucanson
.. contents::
Introduction
============
As usual, the structure of the data representing an automaton in a flat
file is called the file format.
There are several input and output formats for Vaucanson
automata. Obviously:
- input formats are those that can be read from, i.e. from which an
automaton can be loaded.
- output formats are those that can be written to, i.e. to which an
automaton can be dumped.
Given these definitions, here is the meat:
- Vaucanson supports Graphviz (dot) as an output format. Most kinds of
automata can be dumped as dot-files. Throughout the library this format
is simply called ``dot``.
- Vaucanson supports XML as an input and output format. Most kinds of
automata can be read from and written to XML streams, which
Vaucanson does by using the Xerces-C++ library. Throughout the library
this format is simply called ``xml``.
- Vaucanson supports the FSM toolkit I/O format as an input and output
format. This allows for basic FSM interaction. Only certain kinds of
weighted automata can be meaningfully input and output with this
format. Throughout the library this format is simply called ``fsm``.
- Vaucanson supports a simple informative textual format as an input
and output format. Most kinds of automata can be read from and written
to this format. Throughout the library this format is simply called
``simple``.
Dot format
==========
This format provides an easy way to produce a graphical representation
of an automaton.
Output using this format can be given as input to the Graphviz ``dot``
command, which can in turn produce graphical representations in
Encapsulated PostScript, PNG, JPEG, and many others.
It uses Graphviz' "directed graph" subformat.
If you want to see what it looks like go to the
``src/demos/automaton_library`` subdirectory, build the examples and run
them with the ``dot`` argument.
For Graphviz users:
Each graph generated by Vaucanson can be named with a string that also
prefixes each state name. If done so, several automata can be grouped
in a single graph by simply concatenating the Vaucanson outputs.
XML format
==========
This format is intended to be an all-purpose strongly typed input and
output format for automata.
Using it requires:
- that the Xerces-C++ library is installed and ready to use by the C++
compiler that is used to compile Vaucanson.
- configuring Vaucanson to use XML.
- computer resources and time.
What you gain:
- support for the Greater and Better I/O format. See documentation in
the ``doc/xml`` subdirectory for further information.
If you want to see what it looks like go to the
``src/demos/automaton_library`` subdirectory, build the examples and run
them with the ``xml`` argument.
FSM format
==========
This format is intended to provide a basic level of compatibility with
the FSM tool kit. (FIXME: references needed)
Like FSM, support for this format in Vaucanson is limited to
deterministic automata. It probably does not work with transducers,
either.
It is not meant to be used that much apart from performance comparison
with FSM. Some code exists to simulate FSM, in
``src/demos/utilities/fsm``.
If you want to see what it looks like go to the
``src/demos/automaton_library`` subdirectory, build the examples and run
them with the ``fsm`` argument.
Simple format
=============
Initially intended to be a quick and dirty debugging input and output
format, this format actually proves to be a useful, compact and
efficient textual representation of automata.
Advantages over XML:
- does not require additional third-party software,
- simple and efficient (designed to be read and written to streams
with very low memory footprint and minimum complexity),
- fewer bytes in file,
- not strongly typed (can be dumped from one automaton type and
loaded to another).
Drawbacks compared to XML:
- not strongly typed (one cannot know what automaton type to build by
only looking at the raw data).
- currently does not (probably) support transducers.
If you want to see what it looks like go to the
``src/demos/automaton_library`` subdirectory, build the examples and run
them with the ``simple`` argument.
Using input and output
======================
The library provides an infrastructure for generic I/O, which
(hopefully) will help supporting more formats in the future.
The basis for this infrastructure is the way a C++ developer using the
library will use it::
#include <vaucanson/tools/io.hh>
/* to save an automaton */
output_stream << automaton_saver(automaton, converter, format);
/* to load an automaton */
input_stream >> automaton_loader(automaton, converter, format, merge_states);
Where:
- ``automaton`` is the automaton undergoing input or output. Note that
the object must already be constructed, even to be read into.
- ``converter`` is a helper class that is able to convert automaton
transitions (edges) to character strings and possibly vice-versa.
- ``format`` is a helper class that is able to convert the automaton
to (and possibly from) a character string, using the converter as an
argument.
- ``merge_states`` is an optional argument that should be omitted in
most cases. For advanced users, it allows loading a single automaton
from several different streams that share the same state set.
About converters
----------------
The ``converter`` argument is mandatory. There are several converter
types already available in Vaucanson. See below.
An I/O converter is a function object with one or both of the following:
- an operation that takes an automaton, a transition label and
converts the transition label to a character string
(std::string). This is called the output conversion.
- an operation that takes an automaton, a character string and
converts the character string to a transition label. This is called
the input conversion.
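To make the two conversions concrete, here is a minimal, self-contained sketch of a converter-like function object. It does not use Vaucanson itself: ``ToyAutomaton`` and ``ToyIntConverter`` are hypothetical names invented for this illustration, labels are plain integers, and the real converter interface may differ in its details.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for an automaton type (assumption, not a
// Vaucanson class); only its label type matters here.
struct ToyAutomaton
{
  typedef int label_t;
};

// Hypothetical converter offering both conversions as overloads of
// operator(), following the description above.
struct ToyIntConverter
{
  // Output conversion: (automaton, label) -> character string.
  template <typename Auto>
  std::string
  operator()(const Auto&, const typename Auto::label_t& label) const
  {
    std::ostringstream os;
    os << label;
    return os.str();
  }

  // Input conversion: (automaton, character string) -> label.
  template <typename Auto>
  typename Auto::label_t
  operator()(const Auto&, const std::string& text) const
  {
    assert(!text.empty());
    std::istringstream is(text);
    typename Auto::label_t label = typename Auto::label_t();
    is >> label;
    return label;
  }
};
```

With these definitions, ``ToyIntConverter()(a, 42)`` selects the output conversion and ``ToyIntConverter()(a, std::string("7"))`` selects the input conversion. Overloading on the second argument only works while the label type is not itself ``std::string``; a converter for string labels would need differently named operations.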
Vaucanson already provides these converters:
``vcsn::io::string_out``, bundled with ``io.hh``.
Provides the output conversion only. Uses the C++ operator << to
create a textual representation of transition labels. Should work
with all label types.
``vcsn::io::usual_converter_exp``, defined in ``tools/usual_io.hh``.
Provides both input and output conversions. Uses the C++ operator <<
to create a textual representation of transition labels, but
requires also that algebra::parse can read back that representation
into a variable of the same type. It is mostly used for generalized
automata where transitions are labeled by rational expressions,
hence the name.
``vcsn::io::usual_converter_poly<ExpType>``, defined in ``tools/usual_io.hh``.
Provides both input and output conversions. Converts transition labels
to ExpType before output and from ExpType after input. The implementation
is meant to be used when labels are polynomials, with the
generalized (expression) type as ExpType.
Notes about XML and converters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When the XML I/O format was implemented, the initial converter system
was not used. Instead, a dedicated converter system was designed
for this format.
(FIXME: explain why!)
(FIXME: why hasn't the generic converter for XML been ported back to
fsm and simple formats?)
Because of this, when using XML I/O the ``converter`` argument is
completely ignored by the format processor. You will usually see
``vcsn::io::string_out`` passed in that position.
(FIXME: this is terrible! it must be patched to use an empty
vcsn::io::xml_converter_placeholder or something like it).
About formats
-------------
The ``format`` argument is mandatory. It specifies an instance of the
object in charge of the actual input or output.
A format object is a function object that provides one or both of the
following operations:
- an operation that takes an output stream, the caller ``automaton_saver``
object, and the ``converter`` object. This is called the output operation.
- an operation that takes an input stream and the caller
``automaton_loader`` object. This is called the input operation. Note
that this operation does not use the ``converter`` object, because
it should call back the ``automaton_loader`` object to actually perform
string to transition label conversions.
Format objects may require arguments to be constructed, such as the
title of the automaton in the output.
Format objects for a format should be defined in a
``tools/xxx_format.hh`` file.
Vaucanson already provides the following format objects:
``vcsn::io::dot(const std::string& digraph_title)``, in ``tools/dot_format.hh``.
Provides an output operation for the Graphviz ``dot`` subformat. The title
provided when building the ``dot`` object in Vaucanson becomes the title
of the graph in the output data and a prefix for state names. Therefore
the title must contain only alphanumeric characters or the underscore (_),
and no spaces.
``vcsn::io::simple()``, in ``tools/simple_format.hh``.
Provides both input and output operations for a simple text format.
``vcsn::xml::XML(const std::string& xml_title)``, in ``xml/XML.hh``.
Provides both input and output operations for the Vaucanson XML I/O format.
(FIXME: why not tools/xml_format.hh with proper includes of headers in xml/?)
(FIXME: really the FSM format should have a format object too.)
Examples
========
Create a simple dot output for an automaton a1::
std::ofstream fout("output.dot");
fout << automaton_saver(a1, vcsn::io::string_out(), vcsn::io::dot("a1"));
fout.close();
Output automaton a1 to XML, read it back into another automaton a2
(possibly of another type)::
std::ofstream fout("file.xml");
fout << automaton_saver(a1, NULL, vcsn::xml::XML());
fout.close();
std::ifstream fin("file.xml");
fin >> automaton_loader(a2, NULL, vcsn::xml::XML());
fin.close();
Do the same, but this time using the simple format. The automata are
generalized, i.e. labeled by expressions::
std::ofstream fout("file.txt");
fout << automaton_saver(a1, vcsn::io::usual_converter_exp(), vcsn::io::simple());
fout.close();
std::ifstream fin("file.txt");
fin >> automaton_loader(a2, vcsn::io::usual_converter_exp(), vcsn::io::simple());
fin.close();
Internal scenario
=================
What happens in Vaucanson when you write::
fin >> automaton_loader(a1, c1, f1)
?
1. function ``automaton_loader`` creates an object AL1 of type
``automaton_loader_`` that memorizes its arguments.
2. ``automaton_loader()`` returns AL1.
3. ``operator>>(fin, AL1)`` is called.
4. ``operator>>`` says to format object f1: "hi, please use fin to load
something with AL1".
5. f1 scans input stream fin. Things may happen then:
- f1 finds a state numbered N. Then it says to AL1: "hey, make a new
state into the output automaton, keep its handler s1 for yourself
and remember it is associated to N". (callback ``AL1.add_state``)
- f1 finds a transition from state numbered N to state P, labeled
with character string S. Then it says to AL1: "hey, create a
transition with N, P, and S." (callback ``AL1.add_edge``). Then:
- AL1 remembers handler for state N (s1)
- AL1 remembers handler for state P (s2)
- AL1 says to converter c1: "hey, make me a transition label from S"
- AL1 creates transition from s1 to s2 using converted label into
output automaton.
6. when f1 is finished, it returns control to ``operator>>`` and then
to the calling code.
Since everything is resolved statically through templates, the
compiler can inline the callbacks, so their intensive use should add
little or no run-time overhead.
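The steps above can be sketched end to end without Vaucanson. Everything below is a toy written for this illustration: ``ToyAutomaton``, ``ToyConverter``, ``ToyFormat``, ``ToyLoader`` and ``toy_loader`` are hypothetical names, and the line-based input syntax (``s N`` for a state, ``t N P S`` for a transition) is an assumption, not the actual ``simple`` format.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy automaton: states are indices, transitions carry an int label.
struct ToyAutomaton
{
  struct Edge { std::size_t from, to; int label; };
  std::size_t n_states;
  std::vector<Edge> edges;
  ToyAutomaton() : n_states(0) {}
  std::size_t add_state() { return n_states++; }
  void add_edge(std::size_t from, std::size_t to, int label)
  {
    Edge e = { from, to, label };
    edges.push_back(e);
  }
};

// Toy converter (c1): input conversion only, string -> label.
struct ToyConverter
{
  int operator()(const ToyAutomaton&, const std::string& s) const
  {
    std::istringstream is(s);
    int v = 0;
    is >> v;
    return v;
  }
};

// Plays the role of AL1: memorizes its arguments and serves callbacks.
template <typename Auto, typename Converter, typename Format>
struct ToyLoader
{
  Auto& a;
  const Converter& conv;
  const Format& format;
  std::map<unsigned, std::size_t> handles; // number in file -> state handle

  ToyLoader(Auto& a_, const Converter& c, const Format& f)
    : a(a_), conv(c), format(f) {}

  // Step 5, first case: create a state and remember it under number n.
  void add_state(unsigned n) { handles[n] = a.add_state(); }

  // Step 5, second case: look up both handles, convert the label, add it.
  void add_edge(unsigned from, unsigned to, const std::string& label)
  {
    assert(handles.count(from) == 1 && handles.count(to) == 1);
    a.add_edge(handles[from], handles[to], conv(a, label));
  }
};

// Toy format (f1): scans the stream and calls back into the loader.
struct ToyFormat
{
  template <typename Loader>
  void operator()(std::istream& in, Loader& l) const
  {
    std::string kind;
    while (in >> kind)
      if (kind == "s") { unsigned n; in >> n; l.add_state(n); }
      else if (kind == "t")
      {
        unsigned n, p; std::string s;
        in >> n >> p >> s;
        l.add_edge(n, p, s);
      }
  }
};

// Steps 1 and 2: build and return the loader object.
template <typename Auto, typename Converter, typename Format>
ToyLoader<Auto, Converter, Format>
toy_loader(Auto& a, const Converter& c, const Format& f)
{
  return ToyLoader<Auto, Converter, Format>(a, c, f);
}

// Steps 3, 4 and 6: hand the stream and the loader to the format.
template <typename Auto, typename Converter, typename Format>
std::istream&
operator>>(std::istream& in, ToyLoader<Auto, Converter, Format> l)
{
  l.format(in, l);
  return in;
}
```

Note that the format never sees the converter: it only passes raw strings to the loader, which owns both the state-number map and the label conversion.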
Convenience utilities
=====================
For most formats the following (relatively) tedious piece of code::
output_stream << automaton_saver(a, CONVERTER(), FORMAT(...))
is also available as::
FORMAT_dump(output_stream, a, ...)
If available, this convenience utility is defined in ``tools/XXX_dump.hh``.
Conversely, the following piece of code::
input_stream >> automaton_loader(a, CONVERTER(), FORMAT(...))
is usually also available as::
FORMAT_load(input_stream, a, ...)
If available, this convenience utility is defined in ``tools/XXX_load.hh``.
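As an illustration of how such a convenience function can wrap the long form, here is a self-contained sketch of the output side. It is not Vaucanson code: ``ToyAutomaton``, ``ToyStringOut``, ``ToyFormat``, ``ToySaver``, ``toy_saver`` and ``toy_dump`` are hypothetical names, and the one-line output layout is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy automaton: just a list of integer transition labels.
struct ToyAutomaton
{
  std::vector<int> labels;
};

// Toy converter offering the output conversion only.
struct ToyStringOut
{
  std::string operator()(const ToyAutomaton&, int label) const
  {
    std::ostringstream os;
    os << label;
    return os.str();
  }
};

// Toy format whose constructor takes a title argument.
struct ToyFormat
{
  std::string title;
  explicit ToyFormat(const std::string& t) : title(t) { assert(!t.empty()); }

  // Output operation: (stream, saver object, converter).
  template <typename Saver, typename Converter>
  void
  operator()(std::ostream& out, const Saver& s, const Converter& conv) const
  {
    out << title << ':';
    for (std::size_t i = 0; i < s.a.labels.size(); ++i)
      out << ' ' << conv(s.a, s.a.labels[i]);
    out << '\n';
  }
};

// Saver object: memorizes its arguments until operator<< is called.
template <typename Auto, typename Converter, typename Format>
struct ToySaver
{
  const Auto& a;
  const Converter& conv;
  const Format& format;
  ToySaver(const Auto& a_, const Converter& c, const Format& f)
    : a(a_), conv(c), format(f) {}
};

template <typename Auto, typename Converter, typename Format>
ToySaver<Auto, Converter, Format>
toy_saver(const Auto& a, const Converter& c, const Format& f)
{
  return ToySaver<Auto, Converter, Format>(a, c, f);
}

template <typename Auto, typename Converter, typename Format>
std::ostream&
operator<<(std::ostream& out, const ToySaver<Auto, Converter, Format>& s)
{
  s.format(out, s, s.conv);
  return out;
}

// Convenience form: hides the converter and the format behind one call.
template <typename Auto>
void toy_dump(std::ostream& out, const Auto& a, const std::string& title)
{
  out << toy_saver(a, ToyStringOut(), ToyFormat(title));
}
```

The wrapper fixes the converter and format choice for the caller, which is why it is shorter to write but less flexible than the explicit saver expression.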
(FIXME: move fsm_load away from fsm_dump.hh!)
As of today (2006-03-17) the FSM format is only available using the
fsm_load() and fsm_dump() interface.
This is Vaucanson, a C++ generic library for weighted finite state
machine.
===========================
Introduction to Vaucanson
===========================
:Date: 2005-06-23
Vaucanson_, a C++ generic library for weighted finite state machine.
Vaucanson_, a C++ generic library for weighted finite state machines.
.. _Vaucanson: http://vaucanson.lrde.epita.fr
......@@ -146,6 +142,11 @@ There are other sources of interest in the distribution.
- Headline news about the project can be found in the file ``NEWS`` at
the root of the source tree.
- Frequently asked questions are answered in the file ``FAQ``.
- Some information about input and output of automata can be found in
the file ``README-IO``.
- Documentation about the XML I/O subsystem can be found in the
``doc/xml`` subdirectory.
......
......@@ -73,6 +73,8 @@ nobase_include_HEADERS = \
vaucanson/tools/io.hh \
vaucanson/tools/simple_format.hxx \
vaucanson/tools/simple_format.hh \
vaucanson/tools/simple_dump.hxx \
vaucanson/tools/simple_dump.hh \
vaucanson/tools/usual_io.hxx \
vaucanson/tools/usual_io.hh \
vaucanson/tools/dot_dump.hxx \
......
// simple_dump.hh: this file is part of the Vaucanson project.
//
// Vaucanson, a generic library for finite state machines.
//
// Copyright (C) 2001, 2002, 2003, 2004 The Vaucanson Group.
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// The complete GNU General Public Licence Notice can be found as the
// `COPYING' file in the root directory.
//
// The Vaucanson Group consists of people listed in the `AUTHORS' file.
//
#ifndef VCSN_TOOLS_SIMPLE_DUMP_HH
# define VCSN_TOOLS_SIMPLE_DUMP_HH
# include <iostream>
namespace vcsn
{
namespace tools
{
template <typename Auto, typename Converter>
void simple_dump(std::ostream& o, const Auto& a, const Converter& conv);
}
}
# include <vaucanson/tools/simple_dump.hxx>
#endif // ! VCSN_TOOLS_SIMPLE_DUMP_HH
// simple_dump.hxx: this file is part of the Vaucanson project.
//
// Vaucanson, a generic library for finite state machines.
//
// Copyright (C) 2001, 2002, 2003, 2004 The Vaucanson Group.
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License
// as published by the Free Software Foundation; either version 2
// of the License, or (at your option) any later version.
//
// The complete GNU General Public Licence Notice can be found as the
// `COPYING' file in the root directory.
//
// The Vaucanson Group consists of people listed in the `AUTHORS' file.
//
#ifndef VCSN_TOOLS_SIMPLE_DUMP_HXX
# define VCSN_TOOLS_SIMPLE_DUMP_HXX
# include <string>
# include <vaucanson/tools/io.hh>
# include <vaucanson/tools/simple_format.hh>
# include <vaucanson/automata/concept/automata_base.hh>
# include <vaucanson/automata/concept/transducer_base.hh>
namespace vcsn {
namespace tools {
template <class S, class Auto, class Converter>
void simple_dump(const AutomataBase<S>&,
std::ostream& o,
const Auto& a,
const Converter& conv)
{
o << automaton_saver(a, conv, io::simple());
}
template <typename Auto, typename Converter>
void simple_dump(std::ostream& o, const Auto& a, const Converter& conv)
{
simple_dump(a.structure(), o, a, conv);
}
} // tools
} // vcsn
#endif // ! VCSN_TOOLS_SIMPLE_DUMP_HXX
......@@ -184,7 +184,10 @@ typedef typename alphabet_t::letter_t letter_t;
# define XML_FAIL(S)
#endif
#define FAIL(S) { std::cerr << (S) << std::endl; exit(1); }
#ifdef VCSN_USE_XML
#define STR2XML(S) xercesc::XMLString::transcode(S)
#endif
//
......
......@@ -17,6 +17,11 @@
#ifndef VCSN_TOOLS_XML_DISPLAY_HH
# define VCSN_TOOLS_XML_DISPLAY_HH
#include <vaucanson/config/system.hh>
#ifndef VCSN_USE_XML
# error Vaucanson XML support is disabled.
#endif
/**
* @file xml_display.hh
*
......
......@@ -17,6 +17,11 @@
#ifndef VCSN_TOOLS_XML_DUMP_HH
# define VCSN_TOOLS_XML_DUMP_HH