Backmp11 back-end (C++17, experimental)

Backmp11 is a new back-end that is mostly backwards-compatible with back. It is currently experimental, so some compatibility details may still change (feedback welcome!). It is named after the metaprogramming library Boost Mp11, the main contributor to the optimizations. Usages of MPL are replaced with Mp11 to get rid of the costly C++03 emulation of variadic templates.

The new back-end has the following goals:

  • reduce compilation time and RAM usage

  • reduce state machine runtime

  • provide new features and customization options

It offers a significant reduction in compilation time and RAM usage, as can be seen in these benchmarks:

Large state machine

                               Compile / sec   RAM / MB   Runtime / sec
back                           17              961        2.8
back_favor_compile_time        22              1006       3.5
back11                         40              2802       2.7
backmp11                       3               218        1.1
backmp11_favor_compile_time    3               206        6.0
sml                            6               273        0.5

Large hierarchical state machine

                                        Compile / sec   RAM / MB   Runtime / sec
back                                    64              2902       13.2
back_favor_compile_time                 75              2612       > 300
backmp11                                9               370        5.2
backmp11_favor_compile_time             6               282        20.3
backmp11_favor_compile_time_multi_cu    4               ~917       20.7
sml                                     24              706        5.7

The full benchmark code and more information about the benchmarks are available in this repository. A few more optimizations are still to come, and the tables in the repository are updated frequently with the latest results.

Deprecation information

Deprecations of features, APIs, and other changes are listed below with additional context. Each entry gives the feature, the versions of its deprecation / removal, and a description.

Support for event deferral as action and public access to the deferred event container

1.90 / 1.91

Deferring an event as an action triggered by the same event is not foreseen in UML and leads to ambiguities (the event is both consumed and deferred).

Change: The public API to defer an event will be changed to protected. If needed, users can inherit from state_machine and make the API public again, but without guarantees about correct functionality and API consistency.

Due to a merge of queued and deferred events into a single event pool, defer_event can be used in more contexts and the API remains public. It’s still recommended to configure event deferral as a state property if possible. This enables processing of deferred events in FIFO order, provides full support for event deferral in orthogonal regions, and it is more performant because events don’t need to be dispatched for evaluation.

Public access to the event container

1.90 / 1.91

The event container can be accessed and manipulated via public APIs. Manipulation of the container outside of the library code can lead to undefined behavior.

Change: The public API to access the event container will be changed to protected. If needed, users can inherit from state_machine to access the event container, but without guarantees about correct functionality and API consistency.

Renaming of transition_owner to local_transition_owner

1.91 / 1.92

The default setting of the Fsm parameter was named transition_owner, which was intended to reflect the same behavior as in back. However, transition_owner behaves differently from back, and it was also not implemented correctly.

Change: The behavior will be corrected to match back and the default setting will be renamed to local_transition_owner in 1.91. The previous setting name transition_owner will be deprecated in 1.91 and removed in 1.92.

Removal of APIs to process queued events

1.91 / 1.92

The event containers for queued events and deferred events have been merged into a single event pool. The APIs for processing queued events are obsolete.

Change: The APIs process_queued_events and process_single_queued_event will be removed.

Occurrences of the old APIs can be replaced as follows:

  • to process a single event in the event pool, use process_event_pool(1)

  • to process all events in the event pool, use process_event_pool()

Removal of automatic enqueuing in the process_event API

1.91 / 1.92

The API process_event automatically enqueues events when the state machine is already processing. This requires instantiating enqueue_event for every event type, even though most instances are never called. They are only needed when process_event is called in actions.

Change: The API process_event will reject an event and return HANDLED_FALSE when called during event processing, instead of automatically enqueuing it.

Calls to process_event in actions have to be replaced with enqueue_event.
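The difference between rejecting and enqueuing can be illustrated with a toy model. The sketch below is not the backmp11 implementation: ToyMachine, Result, and the string-based events are hypothetical stand-ins that only mimic the rule described above (a nested process_event call is rejected, while enqueue_event stores the event for later processing). The pool is processed explicitly here to keep the example short.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the HANDLED_TRUE / HANDLED_FALSE results.
enum class Result { handled_true, handled_false };

// Toy model, NOT the library: events are plain strings.
class ToyMachine
{
  public:
    // Action that runs for every processed event.
    std::function<void(ToyMachine&, const std::string&)> action;
    std::vector<std::string> log; // events that were actually processed

    Result process_event(const std::string& event)
    {
        if (m_processing)
        {
            return Result::handled_false; // rejected, not enqueued
        }
        m_processing = true;
        log.push_back(event);
        if (action) { action(*this, event); }
        m_processing = false;
        return Result::handled_true;
    }

    void enqueue_event(const std::string& event) { m_pool.push_back(event); }

    // Processes all events currently stored in the pool.
    void process_event_pool()
    {
        while (!m_pool.empty())
        {
            const std::string event = m_pool.front();
            m_pool.pop_front();
            process_event(event);
        }
    }

  private:
    bool m_processing = false;
    std::deque<std::string> m_pool;
};

inline void enqueue_example()
{
    ToyMachine machine;
    Result nested = Result::handled_true;
    machine.action = [&nested](ToyMachine& fsm, const std::string& event) {
        if (event == "a")
        {
            nested = fsm.process_event("b"); // rejected: already processing
            fsm.enqueue_event("c");          // stored for later processing
        }
    };
    machine.process_event("a");
    assert(nested == Result::handled_false);
    machine.process_event_pool();
    assert((machine.log == std::vector<std::string>{"a", "c"}));
}
```

In this model the event "b" is lost, which is why actions need to switch to enqueue_event.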

New features

Universal visitor API

It is no longer necessary to define a BaseState, accept_sig, and accept method in the front-end.

Instead there is a universal visitor API that supports traversing through the state machine in multiple modes:

  • only the active states or all states

  • non-recursive or recursive

// API:
enum class visit_mode
{
    // State selection (mutually exclusive).
    active_states = 0b001,
    all_states    = 0b010,

    // Traversal mode (not set = non-recursive).
    recursive     = 0b100,

    // All valid combinations.
    active_non_recursive = active_states,
    active_recursive     = active_states | recursive,
    all_non_recursive    = all_states,
    all_recursive        = all_states | recursive
};
template<typename Visitor>
void state_machine::visit(Visitor&& visitor); // Same as active_states | recursive
template<visit_mode Mode, typename Visitor>
void state_machine::visit(Visitor&& visitor);

// Assemble your mode...
state_machine machine;
machine.visit
    <visit_mode::all_states | visit_mode::recursive>
    ([](auto &state) {/*...*/});
// ... or use the pre-defined constants
machine.visit
    <visit_mode::all_recursive>
    ([](auto &state) {/*...*/});

The visitor needs to fulfill the following signature requirement for all sub-states present in the state machine:

template<typename State>
void operator()(State& state);
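As a rough illustration of the signature requirement, the following self-contained sketch shows a functor with a templated call operator being applied to states of different types. Everything in it is a stand-in: the states live in a std::tuple, visit_all is a hypothetical helper rather than the library's visit method, and the visit_mode enum is re-declared locally (only its enumerator values are taken from the API listing above) to show how the bit flags combine.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Local re-declaration for illustration; values as in the API listing.
enum class visit_mode
{
    active_states = 0b001,
    all_states    = 0b010,
    recursive     = 0b100,
    all_recursive = all_states | recursive
};

// Combining scoped enum flags needs an explicit operator.
constexpr visit_mode operator|(visit_mode lhs, visit_mode rhs)
{
    using U = std::underlying_type_t<visit_mode>;
    return static_cast<visit_mode>(static_cast<U>(lhs) | static_cast<U>(rhs));
}

static_assert((visit_mode::all_states | visit_mode::recursive)
              == visit_mode::all_recursive);

// Hypothetical states with unrelated types.
struct Idle    { int visits = 0; };
struct Running { int visits = 0; };

// A visitor must be callable with every state type.
struct CountingVisitor
{
    std::size_t visited = 0;

    template <typename State>
    void operator()(State& state)
    {
        ++visited;
        ++state.visits;
    }
};

// Toy helper: calls the visitor once per state in the tuple.
template <typename Visitor, typename... States>
void visit_all(std::tuple<States...>& states, Visitor&& visitor)
{
    std::apply([&visitor](auto&... state) { (visitor(state), ...); }, states);
}

inline void visitor_example()
{
    std::tuple<Idle, Running> states;
    CountingVisitor visitor;
    visit_all(states, visitor);
    assert(visitor.visited == 2);
    assert(std::get<Running>(states).visits == 1);
}
```

A generic lambda such as [](auto& state) {/*...*/} satisfies the same requirement, because its call operator is a template as well.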

The following bugs are also fixed:

  • If the SM has not been started yet, no state is visited as active (previously the initial state(s) were visited)

  • If the SM has been stopped, no state is visited as active (previously the last active state(s) were visited)

Method to check whether a state is active

A new method is_state_active can be used to check whether a state is currently active:

template <typename State>
bool state_machine::is_state_active() const;

If the type of the state appears multiple times in a hierarchical state machine, the method returns true if any of the states are active.

Simplified state machine signature

The signature has been simplified to facilitate sharing configurations between state machines. The new signature looks as follows (pseudo-code, the implementation looks a little different):

template <
    class FrontEnd,
    class Config = default_state_machine_config,
    class Derived = state_machine
>
class state_machine;

You can define state machine back-ends with a using MyStateMachine = state_machine<...>; declaration or by inheriting from state_machine.

All settings are bundled in one Config parameter

The configuration of the state machine can be defined with a config structure. The default config looks as follows:

// Default config:
struct default_state_machine_config
{
    // Tune characteristics related to compile time, runtime performance,
    // code size, and available features.
    using compile_policy = favor_runtime_speed;
    // A common context that is shared by all SMs
    // in hierarchical state machines.
    using context = no_context;
    // Identifier for the upper-most SM
    // in hierarchical state machines.
    using root_sm = no_root_sm;
    // Type of the Fsm parameter passed in actions and guards.
    using fsm_parameter = local_transition_owner;
    // Which container to use for the event pool.
    template <typename T>
    using event_container = std::deque<T>;
};

using state_machine_config = default_state_machine_config;

...

// Custom config:
struct CustomStateMachineConfig : public state_machine_config
{
    using compile_policy = favor_compile_time;
};
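The reason a custom config only needs to list the settings it changes is ordinary C++ name lookup: a member alias declared in the derived struct hides the one from the base, and all other aliases are inherited. The following sketch demonstrates this with stand-in types (default_config, custom_config, and toy_machine are hypothetical and much smaller than the real config).

```cpp
#include <cassert>
#include <deque>
#include <type_traits>

// Hypothetical tag types standing in for the library's settings.
struct favor_runtime_speed {};
struct favor_compile_time {};
struct no_context {};

struct default_config
{
    using compile_policy = favor_runtime_speed;
    using context = no_context;
    template <typename T>
    using event_container = std::deque<T>;
};

// Only the changed setting is redeclared; the rest is inherited.
struct custom_config : default_config
{
    using compile_policy = favor_compile_time;
};

// A consumer reads every setting through its Config parameter.
template <typename Config>
struct toy_machine
{
    using compile_policy = typename Config::compile_policy;
    typename Config::template event_container<int> event_pool;
};

static_assert(std::is_same_v<custom_config::compile_policy, favor_compile_time>);
static_assert(std::is_same_v<custom_config::context, no_context>);
static_assert(std::is_same_v<decltype(toy_machine<custom_config>::event_pool),
                             std::deque<int>>);

inline void config_example()
{
    toy_machine<custom_config> machine;
    machine.event_pool.push_back(42);
    assert(machine.event_pool.size() == 1);
}
```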

New state machine setting for defining a context

The setting context sets up a context member in the state machine for dependency injection.

If using context = Context; is defined in the config, a reference to it has to be passed to the state machine constructor as first argument. The following API becomes available to access it in the state machine:

Context& state_machine::get_context();
const Context& state_machine::get_context() const;
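The pattern behind this setting can be sketched without the library. In the toy class below, which is a hypothetical stand-in and not the backmp11 state_machine, the context reference is the first constructor argument and all remaining arguments are forwarded to the front-end. Logger and Player are made-up example types.

```cpp
#include <cassert>
#include <utility>

// Hypothetical context and front-end types, used only for this sketch.
struct Logger { int messages = 0; };

struct Player
{
    explicit Player(int initial_volume) : volume(initial_volume) {}
    int volume;
};

// Toy stand-in for a back-end that has a context configured.
template <typename FrontEnd, typename Context>
class toy_machine : public FrontEnd
{
  public:
    // The context comes first; the remaining arguments are forwarded
    // to the front-end's constructor.
    template <typename... Args>
    explicit toy_machine(Context& context, Args&&... args)
        : FrontEnd(std::forward<Args>(args)...), m_context(context)
    {
    }

    Context& get_context() { return m_context; }
    const Context& get_context() const { return m_context; }

  private:
    Context& m_context; // not owned; must outlive the machine
};

inline void context_example()
{
    Logger logger;
    toy_machine<Player, Logger> machine(logger, 5);
    ++machine.get_context().messages;
    assert(logger.messages == 1); // the machine refers to the same object
    assert(machine.volume == 5);
}
```

Because the machine only stores a reference, the injected object has to outlive the machine.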

New state machine setting for defining a root sm

The setting root_sm defines the type of the root state machine of hierarchical state machines. The root sm is the uppermost state machine.

If using root_sm = RootSm; is defined in the config, the following API becomes available to access it from any sub-state machine:

RootSm& state_machine::get_root_sm();
const RootSm& state_machine::get_root_sm() const;

It is highly recommended to always configure the root_sm in hierarchical state machines, even if access to it is not required. This reduces the compilation time, because it enables the back-end to instantiate the full set of construction-related methods only for the root and it can omit them for sub-state machines.

If a context is used in a hierarchical state machine, the root_sm should be set as well. The context is then automatically accessed through the root_sm.

New state machine setting for defining the Fsm parameter of actions and guards

The setting fsm_parameter defines the instance of the Fsm& fsm parameter that is passed to actions and guards in hierarchical state machines.

By default it is set to local_transition_owner, which reflects the same behavior as in back:

  • Actions and guards for transitions in the same transition table receive the SM instance that owns the transition (the one processing the event)

  • Entry and exit actions receive the "local" transition owner from the perspective of the state being entered/exited (the immediate parent SM)

In UML, the "transition owner" is the region or state machine that contains the transition. The term "local transition owner" extends this UML terminology, because the Fsm parameter in entry and exit actions is not necessarily the state machine containing the transition. It refers to the immediate parent state machine of the state being entered/exited, which is the transition owner from the local perspective of that state.

Example: Consider a hierarchical state machine with nested state machines:

SM1 (root)
  └─ SM2
      └─ SM3
          └─ State S0 (active)

When a transition defined in SM1 causes SM2 and SM3 to exit:

  • S0’s exit action receives SM3 as Fsm (S0’s immediate parent)

  • SM3’s exit action receives SM2 as Fsm (SM3’s immediate parent)

  • SM2’s exit action receives SM1 as Fsm (SM2’s immediate parent)

Alternatively, you can set it to root_sm, in which case the root sm is always passed as the Fsm parameter.

If the fsm_parameter is set to root_sm, the root_sm setting must be set as well.

Generic support for serializers

The state_machine allows access to its private members for serialization purposes with a friend:

// Class boost::msm::backmp11::state_machine
template<typename T, typename A0, typename A1, typename A2>
friend void serialize(T&, state_machine<A0, A1, A2>&);

A similar friend declaration is available in the history_impl classes.

This design allows you to provide any serializer implementation, but because the serializer needs to access private members, there is no guarantee that your implementation will not break in a new version of the back-end.
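A minimal sketch of how such a friend declaration can be used, assuming a toy class instead of the real back-end: toy_machine, its member m_active_state, and the two text archives are hypothetical and only illustrate the mechanism of a free serialize function template that is granted access to private members.

```cpp
#include <cassert>
#include <sstream>

// Toy stand-in for a class that befriends a free serialize function.
template <typename A0>
class toy_machine
{
  public:
    explicit toy_machine(int active_state) : m_active_state(active_state) {}
    int active_state() const { return m_active_state; }

  private:
    int m_active_state; // hypothetical private member

    template <typename T, typename B0>
    friend void serialize(T&, toy_machine<B0>&);
};

// Hypothetical minimal archives; both expose operator& like many
// serialization libraries do.
struct text_writer
{
    std::ostringstream out;
    template <typename V>
    void operator&(V& value) { out << value << ' '; }
};

struct text_reader
{
    std::istringstream in;
    template <typename V>
    void operator&(V& value) { in >> value; }
};

// User-provided serializer: works for saving and loading alike.
template <typename T, typename B0>
void serialize(T& archive, toy_machine<B0>& machine)
{
    archive & machine.m_active_state; // allowed through the friend declaration
}

inline void serialize_example()
{
    toy_machine<void> original(3);
    text_writer writer;
    serialize(writer, original);

    toy_machine<void> restored(0);
    text_reader reader{std::istringstream(writer.out.str())};
    serialize(reader, restored);
    assert(restored.active_state() == 3);
}
```

The names of the private members in the real back-end are implementation details, which is where the risk of breakage comes from.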

Unified event pool for queued and deferred events

The containers for queued and deferred events have been merged into a single event pool. This unification results in improved processing performance if both types of events are used, because only one container needs to be traversed.

It is also no longer required to set up activate_deferred_events in a state machine’s front-end to use Defer actions.

Enhanced capabilities of the deferred_events property

Better performance:

Deferred events are inspected with a recursive visitor to decide whether they remain deferred or are ready to be dispatched. This avoids the overhead of dispatching and re-queuing them into the event pool for re-evaluation.

Improved support in hierarchical state machines:

Prior to the recursive visitor mechanism, an event could have been forwarded to a submachine for processing and then stored in its event pool. Events stored in a submachine’s event pool can only be consumed by that submachine and its descendants; other submachines at the same hierarchy level are unable to receive them.

With the recursive visitor mechanism, deferred events are stored in the event pool of the state machine that was requested to process the event. This is usually the root state machine, in which case all submachines can receive the event upon dispatch.

Conditional event deferral:

In back and back11, the events listed in a state’s deferred_events property are always deferred. In backmp11, deferral can be made conditional by defining an is_event_deferred method in the state:

struct MyState : boost::msm::front::state<>
{
    using deferred_events = mp11::mp_list<MyEvent>;

    template <typename Fsm>
    bool is_event_deferred(const MyEvent& event, Fsm& fsm) const
    {
        // Return true or false to decide
        // whether the event shall be deferred.
        ...
    }
};

Resolved issues with respect to back

Deferring events in orthogonal regions

Event deferral in orthogonal regions behaves as described in the UML standard:

  • If one active region decides to defer an event, then it is deferred for all regions instead of being processed

  • The event gets processed once no more active region decides to defer it

In back the event is evaluated by all regions independently. This leads to the same event being processed multiple times (and in the worst case to infinite recursion).
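The rule for orthogonal regions can be written down as a small predicate. This is only a sketch of the decision described above, not library code; region_vote and is_deferred_for_all are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// What a single active region wants to do with an event (hypothetical).
enum class region_vote { defers, handles, ignores };

// The event is deferred for the whole machine as soon as one active
// region defers it; otherwise it is processed.
inline bool is_deferred_for_all(const std::vector<region_vote>& votes)
{
    return std::any_of(votes.begin(), votes.end(), [](region_vote vote) {
        return vote == region_vote::defers;
    });
}

inline void deferral_example()
{
    // One region defers: no region processes the event yet.
    assert(is_deferred_for_all({region_vote::handles, region_vote::defers}));
    // No region defers anymore: the event gets processed.
    assert(!is_deferred_for_all({region_vote::handles, region_vote::ignores}));
}
```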

Other changes with respect to back

The required minimum C++ version is C++17

C++11 brings the much-needed variadic template support for MSM, but later C++ versions provide other important features, for example C++17’s if constexpr (...).

The signature of the state machine is changed

Please refer to the simplified state machine signature above for more information.

The history policy of a state machine is defined in the front-end instead of the back-end

The definition of the history policy is more closely related to the front-end, and defining it there ensures that state machine configs can be shared between back-ends. The definition looks as follows:

struct no_history {};

template <typename... Events>
struct shallow_history {};

struct always_shallow_history {};

...

// User-defined state machine
struct Playing_ : public msm::front::state_machine_def<Playing_>
{
    using history = msm::front::shallow_history<end_pause>;
    ...
};

The public API of state_machine is refactored

All methods that should not be part of the public API have been removed from it, as have redundant methods. A few other methods have been renamed. The following adapter pseudo-code showcases the differences from the back API:

class state_machine_adapter
{
    using Flag_AND = backmp11::flag_and;

    // The new API returns a const std::array<...>&.
    const int* current_state() const
    {
        return &this->get_active_state_ids()[0];
    }

    // The history can be accessed like this,
    // but it has to be configured in the front-end.
    auto& get_history()
    {
        return this->m_history;
    }

    auto& get_message_queue()
    {
        return this->get_event_pool().events;
    }

    size_t get_message_queue_size() const
    {
        return this->get_event_pool().events.size();
    }

    void execute_queued_events()
    {
        this->process_event_pool();
    }

    void execute_single_queued_event()
    {
        this->process_event_pool(1);
    }

    auto& get_deferred_queue()
    {
        return this->get_event_pool().events;
    }

    void clear_deferred_queue()
    {
        this->get_event_pool().events.clear();
    }

    // No adapter.
    // Superseded by the visitor API.
    // void visit_current_states(...) {...}

    // No adapter.
    // States can be set with `get_state<...>() = ...` or the visitor API.
    // void set_states(...) {...}

    // No adapter.
    // Could be implemented with the visitor API.
    // auto get_state_by_id(int id) {...}
};

A working code example of such an adapter is available in the tests. It can be copied and adapted if needed, though this class is internal to the tests and not planned to be supported officially.

Further details about the applied API changes:

The dependency to boost::serialization is removed

The back-end aims to support serialization in general, but without providing a concrete implementation for a specific serialization library. If you want to use boost::serialization for your state machine, you can look into the state machine adapter from the tests for an example of how to set it up.

The back-end’s constructor does not allow initialization of states and set_states is removed

There were some caveats with using one constructor for different use cases: on the one hand, some arguments were immediately forwarded to the front-end’s constructor; on the other hand, the stream operator was used to identify other arguments in the constructor as states and copy them into the state machine. Besides the syntax of the latter being rather unusual, doing both at once makes the syntax too difficult to understand, even more so if states within hierarchical sub-state machines were initialized in this fashion.

In order to keep the API of the constructor simpler and less ambiguous, it only supports forwarding arguments to the front-end and no more. Also the set_states API is removed. If setting a state is required, this can still be done (in a little more verbose, but also more direct & explicit fashion) by getting a reference to the desired state via get_state and then assigning the desired new state to it.

The method get_state_by_id is removed

If you really need to get a state by id, please use the universal visitor API to implement the function on your own. The backmp11 state_machine has a new method to support getting the id of a state in the visitor:

template<typename State>
static constexpr int state_machine::get_state_id(const State&);
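One possible way to build such a lookup on top of a visitor is sketched below. The toy_machine here is a stand-in (states in a std::tuple, a hand-written get_state_id, and a visit_all helper instead of the real visit method), and the common base class state_base is an assumption made so that the function has a single return type.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Assumed common base so that one pointer type can refer to any state.
struct state_base { virtual ~state_base() = default; };
struct Idle : state_base {};
struct Running : state_base {};

// Toy stand-in for a state machine back-end.
struct toy_machine
{
    std::tuple<Idle, Running> states;

    // Stand-in for get_state_id: ids are assigned by hand here.
    template <typename State>
    static constexpr int get_state_id(const State&)
    {
        if constexpr (std::is_same_v<State, Idle>) { return 0; }
        else
        {
            static_assert(std::is_same_v<State, Running>);
            return 1;
        }
    }

    // Stand-in for visiting all states non-recursively.
    template <typename Visitor>
    void visit_all(Visitor&& visitor)
    {
        std::apply([&visitor](auto&... state) { (visitor(state), ...); },
                   states);
    }
};

// User-side replacement for get_state_by_id; nullptr if the id is unknown.
inline state_base* get_state_by_id(toy_machine& machine, int id)
{
    state_base* result = nullptr;
    machine.visit_all([&result, id](auto& state) {
        if (toy_machine::get_state_id(state) == id) { result = &state; }
    });
    return result;
}

inline void state_by_id_example()
{
    toy_machine machine;
    assert(get_state_by_id(machine, 1) == &std::get<Running>(machine.states));
    assert(get_state_by_id(machine, 7) == nullptr);
}
```

If the states do not share a base class, the visitor can act on the matching state directly instead of returning a pointer.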

The pointer overload of get_state is removed

Similarly to the STL’s std::get for a tuple, the only sensible template parameter for get_state is T, returning a T&. The overload for the template parameter T* is removed; the template parameter T& is discouraged, although still supported. If you need a pointer to a state, apply the address-of operator to the reference returned by get_state.

boost::any as Kleene event is replaced by std::any

To reduce the number of necessary header inclusions, backmp11 uses std::any instead of boost::any for defining Kleene events. You can still opt in to use boost::any by explicitly including boost/msm/event_traits.h.

The eUML front-end support is removed

The support for eUML induces longer compilation times because it requires including the Boost Proto headers and applying C++03 variadic template emulation. If you want to use a UML-like syntax, please try out the new PUML front-end.

The fsm check and find region support is removed

The implementation of these two features depends on mpl_graph, which induces high compilation times.

sm_ptr support is removed

sm_ptr is not needed with the functor front-end and was already deprecated, so it is removed in backmp11.

How to use it

The back-end with both its compile policies favor_runtime_speed and favor_compile_time should be mostly compatible with existing code.

Required replacements to try it out:

  • for the state machine, use boost::msm::backmp11::state_machine in place of boost::msm::back::state_machine

  • for configuring the compile policy and more, use boost::msm::backmp11::state_machine_config

  • if you encounter API incompatibilities, please check the details above for reference

Since the back-end should compile very fast for most machines, the manual generation of state machines with the favor_compile_time policy has become an opt-in feature. If you want to build your state machine across multiple compilation units, you need to do the following:

  • set up a preprocessor define BOOST_MSM_BACKMP11_MANUAL_GENERATION before including msm/backmp11/favor_compile_time.hpp

  • then generate your state machine(s) in the compilation units with the macro BOOST_MSM_BACKMP11_GENERATE_STATE_MACHINE(<smname>)

You can find an example for this in the visitor test.

Applied optimizations

Below you can find some insights into how the compile-time and runtime optimizations were achieved.

  • Replaced CPU-intensive MPL calls (due to C++03 recursion) with Mp11

  • Replaced O(N) algorithms with O(1) alternatives (using additional dispatch tables)

  • Added more type filters prior to template instantiations

  • Applied type punning where useful (to reduce template instantiations, e.g. std::deque & other things around the dispatch table)

favor_runtime_speed policy

Summary:

  • The dispatch table is first default-initialized unconditionally; afterwards only the transition cells that belong to rows are initialized

  • Optimized cell initialization with initializer arrays (to reduce template instantiations)

favor_compile_time policy

Once an event is given to the FSM for processing, it is immediately wrapped in a std::any, and processing continues with this any event. The structure of the dispatch table has been reworked: one dispatch table is created per state as a hash map. The state dispatch tables are designed to work directly with the any event; they use the event’s type index, obtained via its type() function, as the hash key.

This mechanism enables SMs to forward events to sub-SMs without requiring additional template instantiations just for forwarding, as was needed with the process_any_event mechanism. The new mechanism enables forwarding of events to sub-SMs in O(1) instead of O(N).

Summary:

  • Use one dispatch table per state to reduce compiler processing time

  • These dispatch tables are hash tables with type_id as key

  • The algorithms for processing the STT and states are optimized to go through rows and states only once

  • Apply type erasure with std::any as early as possible and do further processing only with any events

  • Each dispatch table only has to cover the events it handles; no template instantiations are required for forwarding events to sub-SMs
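The per-state dispatch tables described above can be sketched with standard library types only. This is a simplified illustration under assumptions, not the actual backmp11 data structure: the events play and stop, the table for a "Stopped" state, and the integer state ids are hypothetical.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical events.
struct play {};
struct stop {};

// One table per state: event type -> handler for the type-erased event.
// The handler returns the id of the next state.
using dispatch_table =
    std::unordered_map<std::type_index, std::function<int(const std::any&)>>;

// Table of a hypothetical "Stopped" state, which only handles play.
inline dispatch_table make_stopped_table()
{
    dispatch_table table;
    table.emplace(std::type_index(typeid(play)), [](const std::any& event) {
        (void)std::any_cast<const play&>(event); // recover the concrete event
        return 1; // id of the hypothetical target state "Playing"
    });
    return table;
}

// Average O(1) lookup by the event's runtime type.
// Returns -1 if the state has no entry for the event.
inline int dispatch(const dispatch_table& table, const std::any& event)
{
    const auto it = table.find(std::type_index(event.type()));
    return it == table.end() ? -1 : it->second(event);
}

inline void dispatch_example()
{
    const dispatch_table stopped = make_stopped_table();
    assert(dispatch(stopped, std::any(play{})) == 1);
    assert(dispatch(stopped, std::any(stop{})) == -1); // not handled here
}
```

Because the event is already type-erased when it reaches dispatch, passing it on to another table (for example that of a sub-SM) requires no code specific to the event type.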