Commit 2be722fd authored by Florent D'Halluin's avatar Florent D'Halluin
Browse files

Update cbs.

CBS version: f2cc74d682bddc0da7386a5ce1017b591c2c6e47.
See cbs/Changelog.
parent 49b1d5b5
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Update cbs.
CBS version: f2cc74d682bddc0da7386a5ce1017b591c2c6e47.
See cbs/Changelog.
2009-06-14 d-halluin <d-halluin@lrde.epita.fr>
 
Update cbs.
......
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Timer: Fix text ouput.
* lib/timer.cc: Add missing endlines.
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Update README.
* README: Add info on including CBS to an autoconf project.
* README_LIBBENCH: Reorganize, add compilation flags info.
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Demos: Update plot scripts.
Make them more robust to changes in libbench.
* demo/memplot-bench/generate-figures.sh,
* demo/timer-bench/generate-figures.sh: Handpick parameters/results.
* demo/memplot-bench/results-time.plot.in,
* demo/timer-bench/timer-many-tasks-time.plot.in,
* demo/timer-bench/timer-recursion-time.plot.in,
* demo/timer-bench/timer-scoped-time.plot.in: Use the handpicked values.
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Memplot: Select probe method via macro.
* lib/memplot.cc: Use sbrk() or parse /proc/self/stat
according to MEMPLOT_USE_PROC_SELF_STAT.
2009-06-17 d-halluin <d-halluin@lrde.epita.fr>
Libbench: add time spent in children processes.
* include/cbs/bench/internal/timer_gathering.hh,
* include/cbs/bench/internal/timer_graph.hh,
* include/cbs/bench/timer.hh: Add to interface.
* include/cbs/bench/internal/timer_gathering.hxx,
* include/cbs/bench/timer.hxx,
* lib/timer.cc,
* lib/timer_gathering.cc,
* lib/timer_graph.cc: Implement.
* lib/bench.cc: Add in results.
2009-06-14 d-halluin <d-halluin@lrde.epita.fr>
Libbench: add total number of timer measures.
......
......@@ -7,3 +7,21 @@ CBS contains:
* libbench: measure performance of equiped C++ code.
* plot helpers: extract results from xml files in gnuplot format.
=======================================
-- Adding CBS to an autoconf project --
=======================================
1. From the CBS archive:
Copy the cbs directory to the project directory.
Add AC_CONFIG_SUBDIRS([cbs]) to configure.ac.
2. From the CBS git repository:
Generate the CBS archive:
git archive --prefix=cbs/ HEAD | gzip > cbs-<sha1>.tar.gz
Copy the archive to the project directory.
Extract the archive.
Add AC_CONFIG_SUBDIRS([cbs]) to configure.ac.
Libbench: measure performance of equiped C++ code.
## Influence on instrumented code performance ##
Measures made on a 2GHz Intel Centrino, 1GB RAM.
-- CPU Time measures (BENCH_TASK) --
For 1 million measures:
2.5s for 1 task (1.25s in parent, 1.25s in child).
+3.5s for 100k different tasks.
i.e. using CBS to measure 100 different tasks for 100000 measures takes:
100 000 * 2.5s / 1 000 000 (= 250 ms)
+ 100 000 * 3.5s / 1 000 000 * 100 / 100 000 (= 0.350ms)
total: ~250ms.
-- Memory measures (BENCH_MEMPLOT) --
Time:
2s per 100 000 measures.
Memory:
10MB per 100 000 measures.
=======================
-- Compilation flags --
=======================
-- Benchmarking on another machine --
Adapt and use demo/generate-plot-files.sh to generate proper plot files
for your system.
Use the timer-bench and memplot-bench demos to compute those figures
on your system.
Note on the memory measures: For each module, although all the measures
are written in a single source file, the test process uses fork()
to make sure measures are independent.
-- Vaucanson Benchmark --
Some helper scripts are provided to use Vaucanson benchmark results:
BENCH_DISABLED: Disable Libbench.
BENCH_CALLGRAPH_DISABLED: Disable callgraph.
BENCH_MEMPLOT_DISABLED: Disable Memplot.
1) Adapt and run demo/generate-plot-files.sh
2) Look in demo/vcsn, adapt and use .sh files.
MEMPLOT_USE_PROC_SELF_STAT: Parse /proc/self/stat instead of using sbrk(0).
Note: This is a much slower alternative, but the rss value is only available
in this mode. The rss value is only stored in xml result files.
## Typical usage ##
===================
-- Usage example --
===================
(see demo/bench/main.cc)
......@@ -112,3 +82,57 @@ Use explicit cast or std::stringstream to convert other values:
n_str << n;
BENCH_PARAMETER("n (string)", n_str.str());
BENCH_PARAMETER("n (long)", (long) n));
================================================
-- Influence on instrumented code performance --
================================================
Measures made on a 2GHz Intel Centrino, 1GB RAM.
## CPU Time measures (BENCH_TASK) ##
For 1 million measures:
2.5s for 1 task (1.25s in parent, 1.25s in child).
+3.5s for 100k different tasks.
i.e. using CBS to measure 100 different tasks for 100000 measures takes:
100 000 * 2.5s / 1 000 000 (= 250 ms)
+ 100 000 * 3.5s / 1 000 000 * 100 / 100 000 (= 0.350ms)
total: ~250ms.
## Memory measures (BENCH_MEMPLOT) ##
Note: using MEMPLOT_USE_POC_SELF_STAT.
Time:
2s per 100 000 measures.
Memory:
10MB per 100 000 measures.
=====================================
-- Benchmarking on another machine --
=====================================
Adapt and use demo/generate-plot-files.sh to generate proper plot files
for your system.
Use the timer-bench and memplot-bench demos to compute those figures
on your system.
Note on the memory measures: For each module, although all the measures
are written in a single source file, the test process uses fork()
to make sure measures are independent.
============================
-- Benchmarking Vaucanson --
============================
Some helper scripts are provided to use Vaucanson benchmark results:
1) Adapt and run demo/generate-plot-files.sh
2) Look in demo/vcsn, adapt and use .sh files.
#!/bin/sh
../../bin/plot.pl -a ../../_build/demo/memplot-bench/memplot_*.xml > memplot-bench.data
../../bin/plot.pl \
-p "empty tasks" \
-r "memory peak" -r "time" -r "time (system)" -r "time (user)" \
../../_build/demo/memplot-bench/memplot_*.xml \
> memplot-bench.data
gnuplot results-memory.plot
gnuplot results-time.plot
#!/bin/sh
../../bin/plot.pl -a ../../_build/demo/timer-bench/timer-start-stop_*.xml > timer-start-stop.data
../../bin/plot.pl -a ../../_build/demo/timer-bench/timer-scoped_*.xml > timer-scoped.data
../../bin/plot.pl -a ../../_build/demo/timer-bench/timer-many-tasks_*.xml > timer-many-tasks.data
../../bin/plot.pl -a ../../_build/demo/timer-bench/timer-recursion_*.xml > timer-recursion.data
../../bin/plot.pl \
-p "empty scopes" -p "tasks" \
-r "memory peak" -r "time" -r "time (system)" -r "time (user)" \
../../_build/demo/timer-bench/timer-start-stop_*.xml \
> timer-start-stop.data
../../bin/plot.pl \
-p "empty scopes" -p "tasks" \
-r "memory peak" -r "time" -r "time (system)" -r "time (user)" \
../../_build/demo/timer-bench/timer-scoped_*.xml \
> timer-scoped.data
../../bin/plot.pl \
-p "empty scopes" -p "tasks" \
-r "memory peak" -r "time" -r "time (system)" -r "time (user)" \
../../_build/demo/timer-bench/timer-many-tasks_*.xml \
> timer-many-tasks.data
../../bin/plot.pl \
-p "depth" \
-r "memory peak" -r "time" -r "time (system)" -r "time (user)" \
../../_build/demo/timer-bench/timer-recursion_*.xml \
> timer-recursion.data
gnuplot timer-many-tasks-memory.plot
gnuplot timer-many-tasks-time.plot
......
......@@ -152,6 +152,12 @@ namespace timer
// Get the total system time in ms
long system_to_ms () const;
// Get the total user time in ms for children processes
long user_children_to_ms () const;
// Get the total system time in ms for children processes
long system_children_to_ms () const;
// Get the total system time in ms
long wall_to_ms () const;
......@@ -164,9 +170,11 @@ namespace timer
bool operator< (const TimeStamp& rhs) const;
private:
TimeVal user_; // Time spent in user code
TimeVal sys_; // Time spent in system calls
TimeVal wall_; // Wall clock (observed time)
TimeVal user_; // Time spent in user code
TimeVal sys_; // Time spent in system calls
TimeVal cuser_; // Time spent in user code (children processes)
TimeVal csys_; // Time spent in system calls (children processes)
TimeVal wall_; // Wall clock (observed time)
};
// Data collection classes
......
......@@ -276,21 +276,22 @@ namespace timer
std::ostream&
TimeStamp::print (std::ostream& o) const
{
return o << '(' << user_ << ", " << sys_ << ", " << wall_ << ')';
return o << '(' << user_ + cuser_ << ", " << sys_ + csys_<< ", " << wall_ << ')';
}
inline
bool
TimeStamp::operator< (const TimeStamp& rhs) const
{
return user_ + sys_ < rhs.user_ + rhs.sys_;
return user_ + cuser_ + sys_ + csys_
< rhs.user_ + rhs.cuser_ + rhs.sys_ + rhs.csys_;
}
inline
long
TimeStamp::to_ms () const
{
return user_.to_ms () + sys_.to_ms ();
return user_.to_ms () + sys_.to_ms () + cuser_.to_ms () + csys_.to_ms ();
}
inline
......@@ -307,6 +308,20 @@ namespace timer
return sys_.to_ms ();
}
inline
long
TimeStamp::user_children_to_ms () const
{
return cuser_.to_ms ();
}
inline
long
TimeStamp::system_children_to_ms () const
{
return csys_.to_ms ();
}
inline
long
TimeStamp::wall_to_ms () const
......
......@@ -70,6 +70,8 @@ namespace timer
{
TimeVal user;
TimeVal system;
TimeVal cuser;
TimeVal csystem;
TimeVal cpu;
TimeVal wall;
TimeVal average;
......
......@@ -386,7 +386,9 @@ namespace timer
/// The timer must be running.
void pop ();
/// Return the measured time, in ms.
/// Return the measured cpu time, in ms.
/// Includes cpu time for the current process and its children
/// (started and terminated while the timer was running).
/// The time is recorded after the timer has been stopped.
long time () const;
......@@ -398,6 +400,14 @@ namespace timer
/// The time is recorded after the timer has been stopped.
long time_system () const;
/// Return the measured user time in terminated children processes, in ms.
/// The time is recorded after the timer has been stopped.
long time_children_user () const;
/// Return the measured system time in terminated children processes, in ms.
/// The time is recorded after the timer has been stopped.
long time_children_system () const;
/// Return the measured wall clock (observed) time, in ms.
/// The time is recorded after the timer has been stopped.
long time_wall () const;
......
......@@ -68,6 +68,20 @@ namespace timer
return last_time_.system_to_ms ();
}
inline
long
Timer::time_children_user () const
{
return last_time_.user_children_to_ms ();
}
inline
long
Timer::time_children_system () const
{
return last_time_.system_children_to_ms ();
}
inline
long
Timer::time_wall () const
......
......@@ -138,42 +138,14 @@ namespace bench
started = false;
result("timer measures", (long) timer.total_calls());
{
std::stringstream time;
time << timer.time();
result("time", time.str());
}
{
std::stringstream time;
time << timer.time_user();
result("time (user)", time.str());
}
{
std::stringstream time;
time << timer.time_system();
result("time (system)", time.str());
}
{
std::stringstream time;
time << timer.time_wall();
result("time (wall)", time.str());
}
{
std::stringstream memory;
memory << mp.max().memory;
result("memory peak", memory.str());
}
{
std::stringstream memory;
memory << mp.relative_max().memory;
result("relative memory usage", memory.str());
}
result("time", (long) timer.time());
result("time (user)", (long) timer.time_user());
result("time (system)", (long) timer.time_system());
result("time (children, user)", (long) timer.time_children_user());
result("time (children, system)", (long) timer.time_children_system());
result("time (wall)", (long) timer.time_wall());
result("memory peak", (long) mp.max().memory);
result("relative memory usage", (long) mp.relative_max().memory);
}
// Restart the benchmark using the same parameters and results.
......
......@@ -36,6 +36,10 @@
// Internal
#include <cbs/bench/memplot.hh>
// Unset to use sbrk (fast execution time but no rss value)
// Set to parse /proc/self/stat (slow, but rss value available)
//#define MEMPLOT_USE_PROC_SELF_STAT
namespace memplot
{
Memplot::Plot::Plot(std::string task_, std::string description_)
......@@ -50,10 +54,10 @@ namespace memplot
time = wall.tv_usec / 1000 + 1000 * wall.tv_sec;
// FIXME: choose between this and /proc/self/stat
#ifndef MEMPLOT_USE_PROC_SELF_STAT
memory = (unsigned long) sbrk(0);
return;
#else
std::ifstream file;
std::string str_;
unsigned val_;
......@@ -94,6 +98,7 @@ namespace memplot
memory = vm_size_;
memory_rss = rss_;
#endif // MEMPLOT_USE_SBRK
}
Memplot::Plot::Plot(const Memplot::Plot& plot)
......
......@@ -358,13 +358,17 @@ namespace timer
vd = *vx;
vd2 = *vx2;
graph_[vd].total.user += rhs.graph_[vd2].total.user;
graph_[vd].total.system += rhs.graph_[vd2].total.system;
graph_[vd].total.cpu += rhs.graph_[vd2].total.cpu;
graph_[vd].self.user += rhs.graph_[vd2].self.user;
graph_[vd].self.system += rhs.graph_[vd2].self.system;
graph_[vd].self.cpu += rhs.graph_[vd2].self.cpu;
graph_[vd].total.user += rhs.graph_[vd2].total.user;
graph_[vd].total.system += rhs.graph_[vd2].total.system;
graph_[vd].total.cuser += rhs.graph_[vd2].total.cuser;
graph_[vd].total.csystem += rhs.graph_[vd2].total.csystem;
graph_[vd].total.cpu += rhs.graph_[vd2].total.cpu;
graph_[vd].self.user += rhs.graph_[vd2].self.user;
graph_[vd].self.system += rhs.graph_[vd2].self.system;
graph_[vd].self.cuser += rhs.graph_[vd2].self.cuser;
graph_[vd].self.csystem += rhs.graph_[vd2].self.csystem;
graph_[vd].self.cpu += rhs.graph_[vd2].self.cpu;
graph_[vd].count += rhs.graph_[vd2].count;
graph_[vd].recursive_count += rhs.graph_[vd2].recursive_count;
......@@ -380,13 +384,17 @@ namespace timer
ed = *ei;
ed2 = *ei2;
graph_[ed].total.user += rhs.graph_[ed2].total.user;
graph_[ed].total.system += rhs.graph_[ed2].total.system;
graph_[ed].total.cpu += rhs.graph_[ed2].total.cpu;
graph_[ed].total.user += rhs.graph_[ed2].total.user;
graph_[ed].total.system += rhs.graph_[ed2].total.system;
graph_[ed].total.cuser += rhs.graph_[ed2].total.cuser;
graph_[ed].total.csystem += rhs.graph_[ed2].total.csystem;
graph_[ed].total.cpu += rhs.graph_[ed2].total.cpu;
graph_[ed].self.user += rhs.graph_[ed2].self.user;
graph_[ed].self.system += rhs.graph_[ed2].self.system;
graph_[ed].self.cpu += rhs.graph_[ed2].self.cpu;
graph_[ed].self.user += rhs.graph_[ed2].self.user;
graph_[ed].self.system += rhs.graph_[ed2].self.system;
graph_[ed].self.cuser += rhs.graph_[ed2].self.cuser;
graph_[ed].self.csystem += rhs.graph_[ed2].self.csystem;
graph_[ed].self.cpu += rhs.graph_[ed2].self.cpu;
graph_[ed].compute_average (total_cpu);
}
......@@ -425,13 +433,17 @@ namespace timer
{
vd = *vx;
graph_[vd].total.user /= rhs;
graph_[vd].total.system /= rhs;
graph_[vd].total.cpu /= rhs;
graph_[vd].total.user /= rhs;
graph_[vd].total.system /= rhs;
graph_[vd].total.cuser /= rhs;
graph_[vd].total.csystem /= rhs;
graph_[vd].total.cpu /= rhs;
graph_[vd].self.user /= rhs;
graph_[vd].self.system /= rhs;
graph_[vd].self.cpu /= rhs;
graph_[vd].self.user /= rhs;
graph_[vd].self.system /= rhs;
graph_[vd].self.cuser /= rhs;
graph_[vd].self.csystem /= rhs;
graph_[vd].self.cpu /= rhs;
graph_[vd].count /= rhs;
graph_[vd].recursive_count /= rhs;
......@@ -445,15 +457,19 @@ namespace timer
{
ed = *ei;
graph_[ed].total.user /= rhs;
graph_[ed].total.system /= rhs;
graph_[ed].total.cpu /= rhs;
graph_[ed].total.user /= rhs;
graph_[ed].total.system /= rhs;
graph_[ed].total.cuser /= rhs;
graph_[ed].total.csystem /= rhs;
graph_[ed].total.cpu /= rhs;
graph_[ed].self.user /= rhs;
graph_[ed].self.system /= rhs;
graph_[ed].self.cpu /= rhs;
graph_[ed].self.user /= rhs;
graph_[ed].self.system /= rhs;
graph_[ed].self.cuser /= rhs;
graph_[ed].self.csystem /= rhs;
graph_[ed].self.cpu /= rhs;
graph_[ed].count /= rhs;
graph_[ed].count /= rhs;
graph_[ed].compute_average (total_cpu);
}
......@@ -806,6 +822,8 @@ namespace timer
}
o << std::endl;
if (vd < timer::VERBOSE_NORMAL)
return;
......@@ -895,6 +913,8 @@ namespace timer
}
}
o << std::endl;
o << "__________________________________"
"______________________________________________\n\n\n";
......@@ -1032,6 +1052,9 @@ namespace timer
o << std::endl;
}
}
o << std::endl;
if (vd < timer::VERBOSE_MAXIMAL)
return;
......@@ -1051,55 +1074,67 @@ namespace timer
o << '[' << task.id << "] " << task.name
<< "\n\n";
o << "Calls from exterior: " << std::setw (10) << task.count
o << "Calls from exterior: " << std::setw (10) << task.count
<< '\n';
o << "Calls from within cycle:" << std::setw (10) << task.int_count
o << "Calls from within cycle: " << std::setw (10) << task.int_count
<< '\n';
o << "Recursive calls: " << std::setw (10)
o << "Recursive calls: " << std::setw (10)
<< task.recursive_count << '\n';
o << '\n';
o << std::setiosflags (std::ios::fixed);
o << "Self cpu time: " << std::setw (10)
o << "Self cpu time: " << std::setw (10)
<< std::setprecision (2) << task.self.cpu
<< '\n';
o << "Self user time: " << std::setw (10)
o << "Self user time: " << std::setw (10)
<< std::setprecision (2) << task.self.user