Commit c02d497b authored by Akim Demaille

Vcsn 2.7

parent 722688c0
......@@ -7,7 +7,13 @@ This file describes user visible changes in the course of the development of
Vcsn, in reverse chronological order. On occasions, significant changes in
the internal API may also be documented.
# Vcsn 2.7 (To be released)
# Vcsn 2.7 (2018-03-25)
We are happy to announce the release of Vcsn 2.7. This is mostly a bug fix
release, with improvements in the documentation, based on user feedback.
Most of our efforts are currently devoted to Vcsn 3.0.
For more information see the detailed news below.
## New features
### Improved compatibility between single- and multitape expressions
......@@ -73,8 +79,8 @@ Option `-s`/`--sort` sorts the benchmarks before running them.
Several errors were fixed. The page `expression.compose.ipynb` is new.
### Examples of C++
The directories `tests/demo` and `tests/benchmarks` contain more example of
how to use Vcsn as a C++ library.
The directories `tests/demo` and `tests/benchmarks` contain more examples
using the Vcsn C++ library.
## Bug fixes
### Incorrect order for 8bit characters
......
......@@ -15,7 +15,7 @@ m4_pattern_forbid([^(AX|BOOST|TC|URBI|VCSN)_])
AC_PREREQ([2.69])
AC_INIT([Vcsn], [2.6.dev],
AC_INIT([Vcsn], [2.7],
[vcsn-bugs@lrde.epita.fr], [],
[http://vcsn.lrde.epita.fr/])
AC_CONFIG_AUX_DIR([build-aux/bin])
......
%% Cell type:markdown id: tags:
# Welcome to Vcsn 2.6
# Welcome to Vcsn 2.7
Vcsn is a platform for weighted automata and rational expressions. it is composed of C++ libraries with various interfaces, a Python binding, some specific features for IPython and a set of command line tools.
Vcsn is a platform for weighted automata and rational expressions. It is composed of C++ libraries with various interfaces, Python bindings, some specific features for IPython, and a set of command line tools.
This short tutorial guides you through the Python binding of Vcsn, and more specifically the IPython interface. If you are not a Python programmer, rest assured that there is not much to know, and if you are a Python programmer, rest assured that its conventions have been respected, and you will be able to take the full benefit from both Vcsn and Python.
This tutorial guides you through the Python binding of Vcsn, and more specifically the IPython interface. If you are not a Python programmer, rest assured that there is not much to know, and if you are a Python programmer, rest assured that its conventions have been respected, and you will be able to take the full benefit from both Vcsn and Python.
Once you read this page, you should explore these:
* [Read me first](!Read-me-first.ipynb) - This page.
* Python
* [Contexts](Contexts.ipynb) - The typing system for automata, expressions, etc.
* [Automata](Automata.ipynb) - How to define or edit automata.
* [Expressions](Expressions.ipynb) - The definition of the rational expressions.
* [Algorithms](Algorithms.ipynb) - Index of the available operations on automata, expressions, etc.
* [The C++ Library](C++-Library.ipynb) - Tutorial on how to use the Vcsn C++ library.
* [Tools](Tools.ipynb) - Executables to run from the shell, easy to chain via pipes.
* [Tools](Tools.ipynb) - Executables to run from the shell, easy to chain via pipes (e.g., `vcsn thompson -Ee '[ab]*c' | vcsn proper | vcsn determinize`).
* [Troubleshooting](Troubleshooting.ipynb) - Some known problems (during build or use), and a few known solutions.
* Examples
* [Sms2fr](Sms2fr.ipynb) - Translate SMS (i.e., text messages) to proper French.
* [Spell-checker](Spell-checker.ipynb) - From a dictionary, build a spell checker that fixes errors.
* [Stackoverflow](Stackoverflow.ipynb) - Some questions asked on Stackoverflow where Vcsn can compute the answer.
* Research
* [References](References.ipynb) - Publications on the algorithms and constructs used in Vcsn.
* [ICTAC-2016](ICTAC-2016.ipynb) - Examples taken from a paper presented to ICTAC 2016.
* [ICTAC-2017](ICTAC-2017.ipynb) - Examples taken from a paper presented to ICTAC 2017.
* [CIAA-2016](CIAA-2016.ipynb) - Examples taken from a paper presented to CIAA 2016.
* [Hacking](Hacking.ipynb) - Random notes, badly written, obsolete, meant for Vcsn developers.
Additional material is available on the web:
* The [Vcsn Website](http://vcsn.lrde.epita.fr) contains more information, tarballs, etc.
* The [Vcsn Sandbox](http://vcsn-sandbox.lrde.epita.fr) offers a playground to experiment with Vcsn from a web browser.
* The [Vcsn Gitlab](http://gitlab.lrde.epita.fr/vcsn/vcsn/issues) is the right place to submit bug reports.
%% Cell type:markdown id: tags:
## Videos
If you prefer watching videos to scrolling through a web page, you might want to look at these short introductory videos (English and French):
%% Cell type:code id: tags:
``` python
%%HTML
<iframe width="300" height="169" src="https://www.youtube.com/embed/LzbXsEmqyC0" frameborder="0" allowfullscreen></iframe>
<iframe width="300" height="169" src="https://www.youtube.com/embed/LFYVBNbStZU" frameborder="0" allowfullscreen></iframe>
```
%% Output
%% Cell type:markdown id: tags:
## Quick Start
Vcsn offers several interfaces:
- a fast, efficient C++ templated library dubbed `static`
- a dynamic and flexible C++ interface dubbed `dyn` on top of `static`
- a Python interface on top of `dyn`
- an IPython interface built on top of the Python API (which is used to generate this very document)
- and also a shell interface ([Tools](Tools.ipynb)).
This documentation shows how to use the IPython interactive environment. Provided that Vcsn was properly deployed on your platform, to launch it run the following command from your shell:
$ vcsn notebook &
A web browser should open on a list of files. Click on the "New" menu, and select "Python" (or "Python 3"), which should open a new sheet. The remainder of this documentation is about such sheets.
First, import Vcsn into Python, and define the "context" in which you want to work. Do not worry about the (ugly!) syntax, just see where the alphabet (the set of letters, $\{a, b, c\}$) is defined. The last line (`ctx`) is here so that IPython displays what this variable contains.
%% Cell type:code id: tags:
``` python
import vcsn
ctx = vcsn.context("lal(abc), b")
ctx
```
%% Output
$\{a, b, c\}\to\mathbb{B}$
{abc} -> B
%% Cell type:markdown id: tags:
This object, the context, defines the types of the various entities. To build a rational expression on this alphabet, use `ctx.expression` as follows:
%% Cell type:code id: tags:
``` python
e1 = ctx.expression("ab*")
e1
```
%% Output
$a \, {b}^{*}$
ab*
%% Cell type:markdown id: tags:
The syntax for rational expressions is as follows (with increasing precedence):
- `\z` denotes the empty language
- `\e` denotes the language of the empty word
- `a` denotes the language of the word `a`
- `e+f` denotes the union of the languages of `e` and `f` (note the use of `+`; `|` is not accepted)
- `ef` denotes the concatenation of the languages of `e` and `f`
- `e*` denotes the Kleene closure of the language of `e`
For more details, please see [the documentation of expressions](Expressions.ipynb).
So for instance `e1` denotes the words starting with a single `a` followed by any number of `b`s.
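To make the semantics above concrete, here is a minimal sketch in plain Python, independent of Vcsn. It relies on Python's standard `re` module (which writes union as `|` where Vcsn writes `+`) and on two hypothetical helpers, `shortlex_words` and `first_words`, introduced only for this illustration. It is a brute-force enumeration, not how Vcsn computes anything.

```python
import itertools
import re

# Python's `re` syntax, not Vcsn's: union would be written `|` here.
E1 = re.compile(r"ab*")

def shortlex_words(alphabet, max_len):
    """Yield every word over `alphabet`, shortest first, then alphabetically."""
    for n in range(max_len + 1):
        for letters in itertools.product(sorted(alphabet), repeat=n):
            yield "".join(letters)

def first_words(pattern, alphabet, count, max_len=12):
    """Return the first `count` words (in shortlex order) matched entirely by `pattern`."""
    matches = (w for w in shortlex_words(alphabet, max_len) if pattern.fullmatch(w))
    return list(itertools.islice(matches, count))

print(first_words(E1, "abc", 4))  # ['a', 'ab', 'abb', 'abbb']
```

The word `b` is rejected because every word of the language must start with exactly one `a`.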
Rational expressions are objects that feature methods. One such method is [expression.shortest(_number_)](expression.shortest.ipynb), which lists the first _`number`_ words (in shortlex order) of the language defined by the rational expression:
%% Cell type:code id: tags:
``` python
e1.shortest(10)
```
%% Output
$\mathit{a} \oplus \mathit{ab} \oplus \mathit{abb} \oplus \mathit{abbb} \oplus \mathit{abbbb} \oplus \mathit{abbbbb} \oplus \mathit{abbbbbb} \oplus \mathit{abbbbbbb} \oplus \mathit{abbbbbbbb} \oplus \mathit{abbbbbbbbb}$
a + ab + abb + abbb + abbbb + abbbbb + abbbbbb + abbbbbbb + abbbbbbbb + abbbbbbbbb
%% Cell type:markdown id: tags:
You may compose rational expressions using Python operators such as `+` for sum, `*` for multiplication (concatenation):
%% Cell type:code id: tags:
``` python
e1 + e1 * e1
```
%% Output
$a \, {b}^{*} + a \, {b}^{*} \, a \, {b}^{*}$
ab*+ab*ab*
%% Cell type:markdown id: tags:
Vcsn features different means to build an automaton from a rational expression. The [expression.standard](expression.standard.ipynb) method builds the "standard automaton", also known as the "position automaton", or the "Glushkov automaton":
%% Cell type:code id: tags:
``` python
e1.standard()
```
%% Output
mutable_automaton<context<letterset<char_letters>, b>>
%% Cell type:markdown id: tags:
When it comes to displaying automata as graphs, there are several "traditions". In Vcsn, initial states are denoted by an entering arrow, and final (or "accepting") states by an exiting arrow. This automaton has one initial state, and two final states.
The [expression.derived_term](expression.derived_term.ipynb) method builds the "derived-term automaton", also known as the Antimirov automaton.
%% Cell type:code id: tags:
``` python
a1 = e1.derived_term()
a1
```
%% Output
expression_automaton<mutable_automaton<context<letterset<char_letters>, b>>>
%% Cell type:markdown id: tags:
Python operators that are accepted by rational expressions are also accepted by automata, with matching semantics.
%% Cell type:code id: tags:
``` python
a2 = (e1 + e1*e1).derived_term()
a2
```
%% Output
expression_automaton<mutable_automaton<context<letterset<char_letters>, b>>>
%% Cell type:code id: tags:
``` python
a3 = a1 + a1 * a1
a3
```
%% Output
mutable_automaton<context<letterset<char_letters>, b>>
%% Cell type:markdown id: tags:
Well, those two automata are not equal (or more rigorously "isomorphic"), but they are equivalent:
%% Cell type:code id: tags:
``` python
a2.is_equivalent(a3)
```
%% Output
True
%% Cell type:markdown id: tags:
All the classical algorithms about automata are implemented:
%% Cell type:code id: tags:
``` python
a3
```
%% Output
mutable_automaton<context<letterset<char_letters>, b>>
%% Cell type:code id: tags:
``` python
a3.determinize()
```
%% Output
determinized_automaton<mutable_automaton<context<letterset<char_letters>, b>>, vcsn::wet_kind_t::bitset, false>
%% Cell type:markdown id: tags:
The states of this automaton are decorated with metadata: the corresponding set of states of the input automaton. Use [automaton.strip](automaton.strip.ipynb) to remove this decoration.
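The idea of states labeled by sets of input states can be sketched with the classical subset construction, written here in plain Python without Vcsn. The automaton below is a hand-written, hypothetical NFA for `ab*+ab*ab*` (not the one Vcsn builds), and the names `determinize` and `accepts` are illustrative helpers, not Vcsn's API.

```python
from collections import deque

# Hypothetical NFA for ab* + ab*ab*, with two initial states.
INITIAL = {0, 2}
FINAL = {1, 4}
TRANSITIONS = {
    (0, "a"): {1}, (1, "b"): {1},                                # branch ab*
    (2, "a"): {3}, (3, "b"): {3}, (3, "a"): {4}, (4, "b"): {4},  # branch ab*ab*
}

def determinize(initial, final, transitions, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start = frozenset(initial)
    dfa = {}
    todo = deque([start])
    while todo:
        src = todo.popleft()
        if src in dfa:
            continue
        dfa[src] = {}
        for letter in alphabet:
            dst = frozenset(s for q in src for s in transitions.get((q, letter), ()))
            if dst:  # the empty set is omitted, so the result is not complete
                dfa[src][letter] = dst
                todo.append(dst)
    finals = {state for state in dfa if state & final}
    return start, finals, dfa

def accepts(start, finals, dfa, word):
    """Follow the unique path labeled by `word`, if there is one."""
    state = start
    for letter in word:
        state = dfa[state].get(letter)
        if state is None:
            return False
    return state in finals

start, finals, dfa = determinize(INITIAL, FINAL, TRANSITIONS, "abc")
print(accepts(start, finals, dfa, "abab"))  # True
print(accepts(start, finals, dfa, "b"))     # False
```

Each key of `dfa` is the "decoration" of a deterministic state: the set of NFA states reached so far. Dropping these sets and renumbering the states is, conceptually, what stripping does.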
%% Cell type:code id: tags:
``` python
a3.determinize().strip().complete()
```
%% Output
mutable_automaton<context<letterset<char_letters>, b>>
%% Cell type:markdown id: tags:
Note that useless states and transitions are grayed.
%% Cell type:markdown id: tags:
To evaluate a word on an automaton, use `evaluate()` or, simpler yet, use the automaton as if it were a function:
%% Cell type:code id: tags:
``` python
a3.evaluate("a")
```
%% Output
$\top$
1
%% Cell type:code id: tags:
``` python
a3("b")
```
%% Output
$\bot$
0
%% Cell type:markdown id: tags:
To see the first 10 accepted words (if there are that many), use [automaton.shortest](automaton.shortest.ipynb):
%% Cell type:code id: tags:
``` python
a3.shortest(10)
```
%% Output
$\mathit{a} \oplus \mathit{aa} \oplus \mathit{ab} \oplus \mathit{aab} \oplus \mathit{aba} \oplus \mathit{abb} \oplus \mathit{aabb} \oplus \mathit{abab} \oplus \mathit{abba} \oplus \mathit{abbb}$
a + aa + ab + aab + aba + abb + aabb + abab + abba + abbb
%% Cell type:markdown id: tags:
To extract a rational expression from the automaton, use `expression()`:
%% Cell type:code id: tags:
``` python
a3.expression()
```
%% Output
$a \, {b}^{*} + a \, {b}^{*} \, a \, {b}^{*}$
ab*+ab*ab*
%% Cell type:markdown id: tags:
This concludes this quick overview of Vcsn's IPython interface. You should now proceed to discover other features in other notebooks.
......
%% Cell type:markdown id: tags:
Transducers
=========
# Transducers
*Transducers*, also called *k-tape automata*, are finite state machines where transitions are labeled on several tapes. The labelset of a transducer is a cartesian product of the labelsets of each tape: $L = L_1 \times \dots \times L_k$.
*Transducers*, also called *k-tape automata*, are finite state machines whose transitions are labeled on several tapes. The labelset of a transducer is a Cartesian product of the labelsets of each tape: $L = L_1 \times \dots \times L_k$.
It is common to manipulate 2-tape transducers, and to consider one tape as the *input* tape and the other as the *output* tape. For example, we can define a 2-tape transducer with the first tape accepting letters in [a-c], and the same for the second tape:
%% Cell type:code id: tags:
``` python
import vcsn
ctx = vcsn.context("lat<lal_char(abc), lal_char(abc)>, b")
ctx
```
%% Output
$\{a, b, c\} \times \{a, b, c\}\to\mathbb{B}$
{abc} x {abc} -> B
%% Cell type:markdown id: tags:
Now we can define a transducer that will transform every *a* into *b*, and keep the rest of the letters. When writing the expression, to delimit the labels (a letter for each tape), we have to use simple quotes.
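As an informal picture of what such a transducer computes, here is a sketch in plain Python that does not use Vcsn. It models a hypothetical one-state machine whose transitions carry an (input letter, output letter) pair; the names `TRANSITIONS` and `transduce` are assumptions made for this illustration.

```python
# Hypothetical one-state, letter-to-letter transducer over {a, b, c}:
# (state, input letter) -> (output letter, next state).
TRANSITIONS = {
    (0, "a"): ("b", 0),  # every a is rewritten as b
    (0, "b"): ("b", 0),  # other letters are copied
    (0, "c"): ("c", 0),
}
INITIAL, FINAL = 0, {0}

def transduce(word):
    """Return the output for `word`, or None if the input is rejected."""
    state, out = INITIAL, []
    for letter in word:
        step = TRANSITIONS.get((state, letter))
        if step is None:  # letter outside the input alphabet
            return None
        output, state = step
        out.append(output)
    return "".join(out) if state in FINAL else None

print(transduce("abca"))  # bbcb
```

This sketch is functional (one output per input); a transducer in general denotes a relation between words, so an input may have zero or several outputs.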
%% Cell type:code id: tags:
``` python
r = ctx.expression("(a|b+b|b+c|c)*")
r
```
%% Output
$\left(a|b + b|b + c|c\right)^{*}$
(a|b+b|b+c|c)*
%% Cell type:code id: tags:
``` python
r.automaton()
```
%% Output
mutable_automaton<context<lat<letterset<char_letters>, letterset<char_letters>>, b>>
%% Cell type:markdown id: tags:
Similarly, it is possible to define *weighted* transducers, as for weighted automata:
%% Cell type:code id: tags:
``` python
import vcsn
ctxw = vcsn.context("lat<lan_char(ab), lan_char(xy)>, z")
ctxw
```
%% Output
$(\{a, b\})^? \times (\{x, y\})^?\to\mathbb{Z}$
{ab}? x {xy}? -> Z
%% Cell type:code id: tags:
``` python
r = ctxw.expression("(a|x)*((a|y)(b|x))*(b|y)*")
r
```
%% Output
$\left(a|x\right)^{*} \, \left(\left(a|y\right) \, \left(b|x\right)\right)^{*} \, \left(b|y\right)^{*}$
(a|x)*((a|y)(b|x))*(b|y)*
%% Cell type:code id: tags:
``` python
r.thompson()
```
%% Output
mutable_automaton<context<lat<nullableset<letterset<char_letters>>, nullableset<letterset<char_letters>>>, z>>
%% Cell type:markdown id: tags:
This transducer transforms the *a*s at the beginning into *x*s, then *ab* into *yx*, then *b*s into *y*s. As you can see, it's possible to have $\varepsilon$-transitions in a transducer.
Keep in mind that while it is the common use-case, transducers are not limited to 2 tapes, but can have an arbitrary number of tapes. The notion of input tape and output tape becomes fuzzy, and the problem will have to be addressed in the algorithms' interface.
......