twacube/distributed
[March 27th]
Following our meeting, I sum up here the next steps (for now) of your internship:
STEP 1.
- Integrate MPI into Spot. You have to build a set of functions that abstract the manipulation of MPI (barrier, display, etc.), as sketched below.
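A thin wrapper of this kind could look like the following sketch (the spot_mpi namespace and every function name below are illustrative, not existing Spot API):

    // mpi_utils.hh -- hypothetical abstraction layer over MPI.
    #include <mpi.h>
    #include <iostream>
    #include <sstream>
    #include <string>

    namespace spot_mpi
    {
      // To be called once at the beginning and once at the end of main().
      inline void init(int* argc, char*** argv) { MPI_Init(argc, argv); }
      inline void finalize() { MPI_Finalize(); }

      // Rank of the current process and size of MPI_COMM_WORLD.
      inline int rank()
      {
        int r;
        MPI_Comm_rank(MPI_COMM_WORLD, &r);
        return r;
      }
      inline int size()
      {
        int s;
        MPI_Comm_size(MPI_COMM_WORLD, &s);
        return s;
      }

      // Synchronize all processes.
      inline void barrier() { MPI_Barrier(MPI_COMM_WORLD); }

      // Rank-prefixed display, so that the outputs of different
      // processes can be told apart.
      inline void display(const std::string& msg)
      {
        std::ostringstream os;
        os << '[' << rank() << '/' << size() << "] " << msg << '\n';
        std::cerr << os.str();
      }
    }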
STEP 2.
- First implement a simple has_deadlock for MPI. In this version you only have to share the information about whether or not a deadlock has been found.
- Doing this implies writing an mc::has_deadlock_mpi function in mc.hh (a minimal sketch is given after this list). This function must not launch threads, so that the performance comparison with the multithreaded version stays meaningful.
- Moreover, in modelcheck.cc you have to add your own option that fixes the number of threads to 1 when creating the model, i.e. modelcube = spot::ltsmin_model::load(mc_options.model).kripkecube({}, deadf, mc_options.compress, 1); Doing this will reduce the amount of memory used per process when creating an instance of the model (have a look at line 241 of spins_kripke.hxx to understand why).
- Ensure that you have a correct display for STEP 2.
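To fix the ideas, here is a minimal, self-contained sketch of what mc::has_deadlock_mpi could look like. The explore_for_deadlock stub stands for the sequential exploration of the local instance of the model; its name and empty signature are assumptions made for the sketch, not existing code.

    // Sketch of the MPI deadlock check: only one Boolean is exchanged.
    #include <mpi.h>
    #include <cstdio>

    // Stub standing for the sequential deadlock search run by each
    // process on its local instance of the kripkecube.
    static bool explore_for_deadlock()
    {
      return false;
    }

    static bool has_deadlock_mpi()
    {
      int local = explore_for_deadlock() ? 1 : 0;
      int global = 0;
      // Logical OR over all processes: the run reports a deadlock as
      // soon as at least one process has found one.
      MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_LOR,
                    MPI_COMM_WORLD);
      return global != 0;
    }

    int main(int argc, char** argv)
    {
      MPI_Init(&argc, &argv);
      bool res = has_deadlock_mpi();
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0)
        std::printf("deadlock: %s\n", res ? "yes" : "no");
      MPI_Finalize();
      return 0;
    }

A later refinement could replace the final MPI_Allreduce by a non-blocking notification (e.g. with MPI_Iprobe) so that processes can stop exploring as soon as some other process has found a deadlock.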
STEP 3.
- Modify the existing has_deadlock (make a copy in another file) so that no information is shared (except for the deadlock detection).
- Compare the relative performance of this algorithm to the one of STEP 2. At this point you will have to set up a benchmark and use the cluster; this will allow you to compare the multithreaded implementation to the distributed one. For this benchmark I expect: (1) a comparison between multithreaded and MPI (all processes running on the same computer), (2) a comparison between multithreaded and MPI (processes running on different computers), and (3) a speedup curve for MPI (on multiple computers). A timing sketch is given after this list.
- At this point we do not expect a good speedup, since no information is shared. Nonetheless, doing this will help to set up the benchmarks for the rest of your internship.
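For the timings, MPI_Wtime can be used on each process and reduced to the slowest one, which is the time the user actually waits for; the speedup on n processes is then t(1)/t(n). A sketch (timed_run and dummy_run are hypothetical helpers for the example, to be adapted to the actual driver in modelcheck.cc):

    #include <mpi.h>
    #include <cstdio>

    // Time a run and keep the wall-clock time of the slowest process.
    static double timed_run(void (*run)())
    {
      MPI_Barrier(MPI_COMM_WORLD);      // start everyone together
      double start = MPI_Wtime();
      run();                            // e.g. the has_deadlock_mpi call
      double local = MPI_Wtime() - start;
      double slowest = 0.0;
      MPI_Reduce(&local, &slowest, 1, MPI_DOUBLE, MPI_MAX, 0,
                 MPI_COMM_WORLD);
      return slowest;                   // meaningful on rank 0 only
    }

    static void dummy_run() {}          // placeholder for the real run

    int main(int argc, char** argv)
    {
      MPI_Init(&argc, &argv);
      double t = timed_run(dummy_run);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0)
        std::printf("wall time: %g s\n", t);
      MPI_Finalize();
      return 0;
    }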
STEP 4.
- Share information between processes. Partitioning the state space may help, but will probably have a huge communication complexity. Sharing only CLOSED states may be relevant, but may also introduce an overhead. Another idea: if a state has two variables a and b, share the fact that all states with a in [100..200] and b in [8..12] are closed (see the sketch after this step). To my knowledge no such idea has been explored yet... Another idea could be to make each MPI process responsible for the progress of one process of the system.
- At this point you will have to do some bibliographic work; http://www.cs.au.dk/~evangeli/doc/ss-partitioning.pdf [2009] could be a good starting point for you, even if the problem it addresses is more general than the one we focus on here. This reference talks about partitioning the state space, and its Section 3 offers a good overview of various partitioning techniques (even if somewhat outdated).
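To make the interval idea of this step concrete, here is a purely illustrative sketch (all names hypothetical) of a summary that marks whole boxes of the state space as closed, so that a single small message can stand for many states:

    #include <vector>

    // A box [a_lo..a_hi] x [b_lo..b_hi]: every state whose two
    // variables fall in these ranges is known to be closed.
    struct closed_box
    {
      int a_lo, a_hi;
      int b_lo, b_hi;

      bool contains(int a, int b) const
      {
        return a_lo <= a && a <= a_hi && b_lo <= b && b <= b_hi;
      }
    };

    struct closed_summary
    {
      std::vector<closed_box> boxes;

      // Broadcasting the single box {100, 200, 8, 12} replaces
      // sending 101 * 5 = 505 individual closed states.
      void add(const closed_box& b) { boxes.push_back(b); }

      // True means "certainly closed"; false means "unknown", in
      // which case the state must still be explored locally. The
      // summary may under-approximate the closed set, but a false
      // positive would make the exploration incomplete.
      bool certainly_closed(int a, int b) const
      {
        for (const auto& box : boxes)
          if (box.contains(a, b))
            return true;
        return false;
      }
    };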