SmartAnthill: An open IoT system

SmartAnthill is an open IoT system which allows easy control over multiple microcontroller-powered devices, creating a home- or office-wide heterogeneous network out of these devices.

A SmartAnthill system can be pretty much anything, from a system controlling a model railway network to an office-wide heating control and security system. As an open system, SmartAnthill can integrate a wide range of devices, from embedded development boards to off-the-shelf sensors and actuators. They can be connected via very different communication means, from wired (currently Serial, with CAN bus and Ethernet planned soon) to wireless (currently IEEE 802.15.4, with low-cost RF, Bluetooth Smart, ZigBee and WiFi planned soon).

All SmartAnthill devices within a system are controlled from one place (such as a PC or a credit-card-sized computer like Raspberry Pi, BeagleBoard or CubieBoard), with optional access via the Internet.

From a programming point of view, SmartAnthill provides a clear separation between microcontroller programming (such as “how to get temperature from this sensor”) and system integration logic (such as “how we should heat this particular house to reduce the heating bill”). Microcontroller programming usually requires C/asm programming, and C/asm programs are notoriously difficult to customize. SmartAnthill allows you to customize a device with pre-defined capabilities via a GUI and to generate compatible firmware, which is then flashed to the device automatically. System integration logic, on the other hand, needs to be highly customizable for the needs and properties of a specific house or office; within SmartAnthill this is done via a rich suite of development instruments: Generic Protocols (HTTP, Sockets, WebSockets), a High Level API (REST API) and SDKs for popular languages, which allow for easy development and customization.

SmartAnthill Overall Architecture

Warning

SmartAnthill has not been released yet. The steps described below will install the development version of SmartAnthill. Don’t forget to upgrade SmartAnthill from time to time using the same installation steps.

Installation

SmartAnthill is written in Python and works on Mac OS X, Linux, Windows OS and ARM-based credit-card sized computers (Raspberry Pi, BeagleBone, CubieBoard).

System requirements

All commands below should be executed in a command-line application:

  • On Mac OS X / Linux, this is the Terminal application.
  • On Windows, this is the Command Prompt (cmd.exe) application.

Note

Linux Users: Don’t forget to install the “udev” rules file 99-platformio-udev.rules (instructions are located in the file).

Windows Users: Please check that you have correctly installed the USB driver from the board manufacturer.

Installation Methods

Please choose one of the following installation methods:

Super-Quick (Mac / Linux)

To install or upgrade SmartAnthill, paste the following at a Terminal prompt (you MIGHT need to prefix the command with sudo, just for installation):

[sudo] python -c "$(curl -fsSL https://raw.githubusercontent.com/smartanthill/smartanthill2_0/develop/scripts/get-smartanthill.py)"

Installer Script (Mac / Linux / Windows)

To install or upgrade SmartAnthill, download (save as...) the get-smartanthill.py script. Then run the following (you MIGHT need to prefix the command with sudo, just for installation):

# change directory to the folder where the downloaded "get-smartanthill.py" is located
cd /path/to/dir/where/is/located/get-smartanthill.py/script

# run it
python get-smartanthill.py

On Windows OS it may look like:

REM change directory to the folder where the downloaded "get-smartanthill.py" is located
cd C:\path\to\dir\where\is\located\get-smartanthill.py\script

REM run it
C:\Python27\python.exe get-smartanthill.py

Full Guide

  1. Check the Python version (only 2.6-2.7 are supported):
$ python --version

Windows Users only:

  1. Download Python 2.7 and install it.
  2. Download and install the latest Python for Windows extensions (PyWin32).
  3. Append ;C:\Python27;C:\Python27\Scripts; to the PATH system variable and reopen the Command Prompt (cmd.exe) application. Please read the article “How to set the path and environment variables in Windows”.
  2. Check that the pip tool for installing and managing Python packages is available:
pip search smartanthill

You should see short information about the smartanthill package.

If your computer does not recognize the pip command, try to install it first using these instructions.

  3. Install smartanthill and related packages:

Warning

SmartAnthill has not been published in the PyPI registry. Please use the installation script get-smartanthill.py mentioned above.

pip install smartanthill && pip install --egg scons

To upgrade smartanthill to a new version, use this command:

pip install -U smartanthill

Launching

SmartAnthill is based on Twisted and can be launched either as a Foreground Process or as a Background Process.

Foreground Process

The full list of usage options for SmartAnthill is available via:

smartanthill --help

Quick launching (the user’s home directory ${HOME} will be used as Workspace Directory):

smartanthill

Launching with specific Workspace Directory:

smartanthill --workspacedir=/path/to/workspace/directory

Check the Configuration page for detailed configuration options.

Background Process

Launching as a Background Process is implemented through the twistd utility. The full list of usage options for twistd is available via the twistd --help command. The final SmartAnthill command looks like:

twistd smartanthill

Dashboard (GUI)

SmartAnthill Dashboard is accessible by default on http://localhost:8138.

You can change the TCP/IP port later on the Dashboard > Settings page.

Configuration

SmartAnthill uses the human-readable JSON format for data serialization. This syntax is easy to read and write.

The SmartAnthill Configuration Parser gathers data in the following order (steps):

  1. Loads predefined Base Configuration options.
  2. Loads options from Workspace Directory.
  3. Loads Console Options.

Note

The Configuration Parser redefines options step by step (from #1 to #3). The Console Options step has the highest priority.
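
The override order can be pictured with a minimal Python sketch. It assumes that each step yields a flat dictionary of options; the helper merge_options and the option names used here are hypothetical and are not taken from config_base.json or from the real Configuration Parser.

```python
def merge_options(base, workspace, console):
    """Apply the three configuration steps in order; later steps win."""
    merged = {}
    for layer in (base, workspace, console):
        merged.update(layer)  # a later layer redefines options set earlier
    return merged


# Hypothetical option names, for illustration only.
base = {"dashboard_port": 8138, "log_level": "info"}   # step 1: Base Configuration
workspace = {"log_level": "debug"}                     # step 2: e.g. smartanthill.json
console = {"dashboard_port": 9000}                     # step 3: Console Options

options = merge_options(base, workspace, console)
print(options)
```

In this sketch the console value for dashboard_port wins over the base value, while log_level keeps the workspace value because no console option redefines it. Whether the real parser merges nested structures key by key is not covered here.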

Base Configuration

The Base Configuration is predefined in SmartAnthill System. See config_base.json.

Workspace Directory

SmartAnthill uses --workspacedir for:

  • finding user-specific start-up configuration options. They must be located in the smartanthill.json file. (Check the list of the available options here)
  • finding the Plugins for SmartAnthill System (should be located in the plugins directory, see examples)
  • storing the settings for embedded boards/MCUs
  • storing other working data.

Warning

The Workspace Directory must be writable.

Console Options

The simple options that are defined in Base Configuration can be redefined through console options for SmartAnthill Application.

The full list of usage options for SmartAnthill is available via:

smartanthill --help

SmartAnthill Plugins

Version:v0.5.0

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture document, please make sure to read it before proceeding.

SmartAnthill Devices use SmartAnthill Plugins to communicate with specific devices.

SmartAnthill Plugins are generally written in C programming language.

Each SmartAnthill Plugin is represented by its Plugin Handler and Plugin Manifest.

SmartAnthill Plugin Handler

Each SmartAnthill Plugin has Plugin Handler, usually implemented as two C functions which have the following prototypes:

byte plugin_handler_init(const void* plugin_config, void* plugin_state);

byte plugin_handler(const void* plugin_config, void* plugin_state, ZEPTO_PARSER* command, REPLY_HANDLE reply, WAITING_FOR* waiting_for);

See details on REPLY_HANDLE in the ‘Plugin API’ section below.

SmartAnthill Plugin Manifest

Each SmartAnthill Plugin has Plugin Manifest, which describes input and output of the plugin.

Plugin Manifest is an XML file, with structure which looks as follows:

<smartanthill.plugin id="my" name="My Plugin" version="1.0">

  <description>Short description of plugin's capabilities</description>

  <request>
    <field name="abc" type="encoded-int[max=2]" />
    <field name="level" type="encoded-uint[max=1]" min="0" max="1" default="0" title="Level">
      <values>
        <item value="0" title="LOW" />
        <item value="1" title="HIGH" />
       </values>
    </field>
  </request>

  <response>
    <field name="xyz" type="encoded-int[max=2]" min="0" max="255">
      <meaning type="float">
        <linear-conversion input-point0="0" output-point0="20.0"
                           input-point1="100" output-point1="40.0" />
      </meaning>
    </field>
  </response>

  <configuration>
    <peripheral>
      <pin type="spi[sclk]" name="pin_spi_sclk" title="SPI SCLK Pin" />
      <pin type="spi[mosi]" name="pin_spi_mosi" title="SPI MOSI Pin" />
      <pin type="spi[miso]" name="pin_spi_miso" title="SPI MISO Pin" />
      <pin type="spi[ss]"   name="pin_spi_ss"   title="SPI SS Pin" />
      <pin type="digital"   name="pin_led1"     title="LED 1 Pin" />
      <pin type="digital"   name="pin_led2"     title="LED 2 Pin" />
    </peripheral>
    <options>
      <option type="uint[2]" name="delay_blink_ms" default="150" title="Delay between blinks, ms" />
      <option type="char[30]" name="welcome_to" default="Welcome to SmartAnthill" title="Welcome text" />
      <option type="uint[max=1]" name="power" min="0" max="2" default="1" title="Transmitter power">
        <values>
          <item value="0" title="MIN" />
          <item value="1" title="NORMAL" />
          <item value="2" title="MAX" />
         </values>
      </option>
    </options>
  </configuration>

</smartanthill.plugin>
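
Because the Plugin Manifest is plain XML, tooling on the controller side can read it with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper list_fields is hypothetical, the embedded manifest is a shortened copy of the example above, and this is not how the Dashboard Service is known to process manifests.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the manifest example (nested <values>/<meaning> omitted).
MANIFEST = """
<smartanthill.plugin id="my" name="My Plugin" version="1.0">
  <description>Short description of plugin's capabilities</description>
  <request>
    <field name="abc" type="encoded-int[max=2]" />
    <field name="level" type="encoded-uint[max=1]" min="0" max="1" default="0" title="Level" />
  </request>
  <response>
    <field name="xyz" type="encoded-int[max=2]" min="0" max="255" />
  </response>
</smartanthill.plugin>
"""


def list_fields(manifest_xml, section):
    """Return (name, type) pairs for every <field> in <request> or <response>."""
    root = ET.fromstring(manifest_xml)
    parent = root.find(section)
    if parent is None:
        return []
    return [(f.get("name"), f.get("type")) for f in parent.findall("field")]


request_fields = list_fields(MANIFEST, "request")
response_fields = list_fields(MANIFEST, "response")
print(request_fields)
print(response_fields)
```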

Currently supported <field> types are:

<meaning> tag

<meaning> tag specifies that while a field has a type such as integer, its meaning for the programmer and end-user is different, and can be, for example, a float. This often arises when a plugin, for example, measures temperature in the range between 35 and 40 degrees Celsius as an integer from 0 to 255. <meaning> tag in Plugin Manifest allows the developer to write something along the lines of:

if(TemperatureSensor.Temperature > 38.9) {...}

instead of

if(TemperatureSensor.Temperature > 200) {...}

which would be necessary without <meaning> tag.

To enable much more intuitive first form, an appropriate fragment of Plugin Manifest should be written as

...
  <field name="Temperature" type="encoded-int[max=1]">
    <meaning type="float">
      <linear-conversion input-point0="0" output-point0="35.0"
                         input-point1="255" output-point1="40.0" />
    </meaning>
...

or as

...
  <field name="Temperature" type="encoded-int[max=1]" min="0" max="99">
    <meaning type="float">
      <linear-conversion a="0.0196" b="35." />
    </meaning>
...

where meaning is calculated as meaning=a*field+b.

Currently supported <meaning> types are “float” and “int”. If <meaning> type is ‘int’, then all the relevant calculations are performed as floats, and then rounded to the nearest integer.
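
The two forms of <linear-conversion> describe the same straight line. As a minimal sketch (the function names are hypothetical and not part of SmartAnthill), the coefficients a and b can be derived from the two points, and the meaning computed from a raw field value:

```python
def linear_from_points(x0, y0, x1, y1):
    """Derive (a, b) of meaning = a*field + b from two (input, output) points."""
    a = (y1 - y0) / (x1 - x0)
    b = y0 - a * x0
    return a, b


def meaning(field, a, b, meaning_type="float"):
    """Compute the meaning of a raw field value.

    For "int" the calculation is done in floats and then rounded. Python's
    round() resolves exact .5 ties to the even neighbour; how ties should be
    handled is not specified in this document, so treat that as an assumption.
    """
    value = a * field + b
    return round(value) if meaning_type == "int" else value


# Points from the first manifest fragment: 0 -> 35.0 and 255 -> 40.0.
a, b = linear_from_points(0, 35.0, 255, 40.0)
print(a, b)
print(meaning(200, a, b))
print(meaning(200, a, b, "int"))
```

With these points a comes out at roughly 0.0196, which matches the a="0.0196" b="35." form, and a raw value of 200 corresponds to a temperature slightly above 38.9.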

Each <meaning> tag MUST specify conversion. Currently supported conversions are: <linear-conversion> and <piecewise-linear-conversion> [TODO].

<meaning> tags can be used both for <request> fields and for <response> fields.

<configuration> tag

<configuration> tag specifies the list of required peripherals, pin numbers, plugin options, etc. This information will be used by Dashboard Service for configuring a SmartAnthill device.

Allowed field types:

Peripheral
  • <pin type="i2c[*]"> - Inter-Integrated Circuit

    • <pin type="i2c[sda]"> - Serial Data Line (SDA)
    • <pin type="i2c[scl]"> - Serial Clock Line (SCL)
  • <pin type="spi[*]"> - Serial Peripheral Interface Bus

    • <pin type="spi[sclk]"> - Serial Clock (SCLK, output from master)
    • <pin type="spi[mosi]"> - Master Output, Slave Input (MOSI, output from master)
    • <pin type="spi[miso]"> - Master Input, Slave Output (MISO, output from slave)
    • <pin type="spi[ss]"> - Slave Select (SS, active low, output from master)
  • <pin type="analog">

  • <pin type="digital">

  • <pin type="pwm"> - Pulse-width modulation

Options
  • <option type="int[n]"> , where int[1] is equal to byte type
  • <option type="uint[n]">
  • <option type="char[n]">

SmartAnthill Plugin Handler Without a State Machine

Simple SA plugins MAY be written without being a State Machine, for example:

typedef struct _my_plugin_config { //constant structure filled with a configuration
                                   //  for specific 'ant body part'
  byte bodypart_id;//always present
  byte request_pin_number;//pin to request sensor read
  byte ack_pin_number;//pin to wait for to see when sensor has provided the data
  byte reply_pin_numbers[4];//pins to read when ack_pin_number shows that the data is ready
} my_plugin_config;

byte my_plugin_handler_init(const void* plugin_config,void* plugin_state) {
  const my_plugin_config* pc = (const my_plugin_config*) plugin_config;
  zepto_set_pin(pc->request_pin_number,0);
  return 0;
}

//TODO: reinit? (via deinit, or directly, or implicitly)

byte my_plugin_handler(const void* plugin_config, void* plugin_state,
  ZEPTO_PARSER* command, REPLY_HANDLE reply, WAITING_FOR* waiting_for) {
  const my_plugin_config* pc = (const my_plugin_config*) plugin_config;

  //requesting sensor to perform read, using pc->request_pin_number
  zepto_set_pin(pc->request_pin_number,1);

  //waiting for sensor to indicate that data is ready (blocks until the pin is set)
  zepto_wait_for_pin(pc->ack_pin_number,1);

  //reading the 4 reply pins and appending the result to the reply
  uint16_t data_read = zepto_read_from_pins(pc->reply_pin_numbers,4);
  zepto_reply_append_byte(reply,data_read);
  return 0;
}

SmartAnthill Plugin Handler as a State Machine

Implementation above is not ideal; in fact, it blocks execution at the point of zepto_wait_for_pin() call, which under restrictions of Zepto OS means that nothing else can be processed. Ideally, SmartAnthill Plugin Handler SHOULD be implemented as a state machine; for example, the very same plugin SHOULD be rewritten as follows:

typedef struct _my_plugin_config { //constant structure filled with a configuration
                                   //  for specific 'ant body part'
  byte bodypart_id;//always present
  byte request_pin_number;//pin to request sensor read
  byte ack_pin_number;//pin to wait for to see when sensor has provided the data
  byte reply_pin_numbers[4];//pins to read when ack_pin_number shows that the data is ready
} my_plugin_config;

typedef struct _my_plugin_state {
  byte state; //'0' means 'initial state', '1' means 'requested sensor to perform read'
} my_plugin_state;

byte my_plugin_handler_init(const void* plugin_config,void* plugin_state) {
  my_plugin_state* ps = (my_plugin_state*)plugin_state;
  const my_plugin_config* pc = (const my_plugin_config*) plugin_config;
  zepto_set_pin(pc->request_pin_number,0);
  ps->state = 0;
  return 0;
}

//TODO: reinit? (via deinit, or directly, or implicitly)

byte my_plugin_handler(const void* plugin_config, void* plugin_state,
  ZEPTO_PARSER* command, REPLY_HANDLE reply, WAITING_FOR* waiting_for) {
  const my_plugin_config* pc = (const my_plugin_config*) plugin_config;
  my_plugin_state* ps = (my_plugin_state*)plugin_state;

  switch(ps->state) {
    case 0:
      //requesting sensor to perform read, using pc->request_pin_number
      zepto_set_pin(pc->request_pin_number,1);

      //waiting for sensor to indicate that data is ready
      zepto_indicate_waiting_for_pin(waiting_for,pc->ack_pin_number,1);
      ps->state = 1; //the next call continues from 'case 1'
      return WAITING_FOR; //stands for the 'handler is waiting' return code

    case 1: {
      uint16_t data_read = zepto_read_from_pins(pc->reply_pin_numbers,4);
      zepto_reply_append_byte(reply,data_read);
      ps->state = 0; //ready for the next request
      return 0;
    }

    default:
      ZEPTO_ASSERT(0); //unknown state
  }
  return 0;
}

Such an approach allows a SmartAnthill implementation (such as Zepto VM) to perform proper pausing when long waits are needed (with the ability for SmartAnthill Client to interrupt processing by sending a new command before it has received an answer to the previous one). It also enables parallel processing of the plugins (see PARALLEL instruction of Zepto VM in Zepto VM document for details).
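
To make the control flow easier to follow, here is a small Python model of the same idea. It is only a simulation under stated assumptions: the class SensorPlugin, the function run, the status values and the dictionary of pins are all hypothetical stand-ins and do not correspond to real Zepto OS or Zepto VM interfaces.

```python
WAITING, DONE = "waiting", "done"


class SensorPlugin:
    """Toy model of a handler that never blocks; it only remembers its state."""

    def __init__(self):
        self.state = 0  # 0: initial, 1: read requested

    def handle(self, pins, reply):
        if self.state == 0:
            pins["request"] = 1      # ask the sensor to perform a read
            self.state = 1
            return WAITING           # caller re-invokes once the ack pin is high
        if self.state == 1:
            reply.append(pins["data"])
            self.state = 0
            return DONE
        raise AssertionError("unknown state")


def run(plugin, pins, ack_after_ticks):
    """Toy scheduler: does other work while the plugin is waiting."""
    reply, other_work, ticks = [], 0, 0
    status = plugin.handle(pins, reply)
    while status == WAITING:
        ticks += 1
        other_work += 1              # time that a blocking handler would waste
        if ticks >= ack_after_ticks:
            pins["ack"] = 1          # simulated sensor signals 'data ready'
        if pins["ack"] == 1:
            status = plugin.handle(pins, reply)
    return reply, other_work


result = run(SensorPlugin(), {"request": 0, "ack": 0, "data": 42}, 3)
print(result)
```

The point of the model is that the handler returns immediately in both states, so the caller stays free to process other events between the two invocations.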

Plugin API

SmartAnthill implementation MUST provide the following APIs to be used by plugins.

Zepto Exceptions

As SmartAnthill plugins operate in very restricted environments, SmartAnthill uses a very simplified version of exceptions, which can be implemented completely in C, without any support from the compiler or underlying libraries. This is known as Zepto Exceptions and should be used as follows:

Try-catch block:

if(ZEPTO_TRY()) {
  do_something();
}

if(ZEPTO_CATCH()) {
  //exception handling here
  //ZEPTO_CATCH() returns exception code passed in ZEPTO_THROW()
}

Throwing exception:

ZEPTO_THROW(exception_code);
//exception_code has type 'byte'

Intermediate processing (MUST be written after each and every call to a function-able-to-throw-exception; this is necessary to handle platforms where setjmp/longjmp is not available, but MUST be written regardless of the target platform):

function_able_to_throw_exception();
ZEPTO_UNWIND(-1); //returns '-1' in case of exception unwinding

ZEPTO_UNWIND MUST be issued after each function call (except for those function calls which are known not to throw any exceptions) for all valid SmartAnthill Plugins.

Exception Codes

Some Exception Codes are reserved for SmartAnthill. To avoid collisions, user exception codes MUST start from ZEPTO_USER_EXCEPTION.

ZEPTO_ASSERT

ZEPTO_ASSERT is a way to have trackable assertions in plugin code. ZEPTO_ASSERT(condition) effectively causes ZEPTO_THROW(1) if condition fails. ZEPTO_ASSERT() SHOULD be used instead of usual C assert() calls.

zeptoerr

zeptoerr is a pseudo-stream, somewhat similar to traditional stderr. However, due to hardware limitations, zeptoerr capabilities are very limited, and should be used sparingly.

zeptoerr is intended to be used as follows:

ZEPTOERR(plugin_config->bodypart_id,"Error: %d",error);

It compiles differently depending on compile-time settings, but generally should have an effect similar to fprintf(stderr,"Error: %d\n", error);. To facilitate automated stream decoding in certain modes, the following SHOULD be added to the Plugin Manifest:

<zeptoerr>
  <line>Error: %d</line> <!-- text within SHOULD be an EXACT match of the text in ZEPTOERR() call -->
  <line>Error 2: %f</line> <!-- text within SHOULD be an EXACT match of the text in ZEPTOERR() call -->
</zeptoerr>

ZEPTOERR has very limited support for data types: only %d (and its synonym %i), %x, and %f are supported. Formatting modifiers (such as “%02d”) are currently not supported at all.

Note that in some cases (for example, if SmartAnthill Device runs out of RAM), SmartAnthill Device MAY truncate zeptoerr pseudo-stream.

For implementation details of zeptoerr, please refer to SmartAnthill Zepto OS document.

Data Types

REPLY_HANDLE

REPLY_HANDLE is an encapsulation of request/reply block, which allows plugin to call zepto_reply_append_*() (see below). REPLY_HANDLE is normally obtained by plugin as a parameter from plugin_handler() call.

Caution: Plugins MUST treat REPLY_HANDLE as completely opaque and MUST NOT try to use it to access reply buffer directly; doing so may easily result in memory corruption when running certain Zepto VM programs (for example, when PARALLEL instruction is used).

For information on possible implementations of REPLY_HANDLE, see SmartAnthill Zepto OS document.

ZEPTO_PARSER structure

ZEPTO_PARSER is an opaque structure (which can be seen as a sort of object where all data should be considered as private). It is used as follows:

uint16_t sz = zepto_parse_encodeduint2(parser);
byte b = zepto_parse_byte(parser,sz);

TODO: WAITING_FOR

TODO: half-float library

Functions

Names of all functions within the plugin interface start with papi_; use by plugins of any function whose name does not start with papi_ is not supported. All such calls should be declared in a single papi.h file and, if possible, this file should not include any other file that declares function calls not related to the plugin API.

Parsing request and writing response functions
Request parsing functions:
uint8_t papi_parser_read_byte( ZEPTO_PARSER* po );
uint16_t papi_parser_read_encoded_uint16( ZEPTO_PARSER* po );
uint16_t papi_parser_read_encoded_signed_int16( ZEPTO_PARSER* po );
TODO: add vector-related functions
Writing functions:
void papi_reply_write_byte( REPLY_HANDLE mem_h, uint8_t val );
void papi_reply_write_encoded_uint16( REPLY_HANDLE mem_h, uint16_t num );
void papi_reply_write_encoded_signed_int16( REPLY_HANDLE mem_h, int16_t sx );
TODO: add vector-related functions
Misc functions:
void papi_init_parser_with_parser( ZEPTO_PARSER* po, const ZEPTO_PARSER* po_base );
bool papi_parser_is_parsing_done( ZEPTO_PARSER* po );
uint16_t papi_parser_get_remaining_size( ZEPTO_PARSER* po );
EEPROM access
bool papi_eeprom_write( uint16_t plugin_id, const uint8_t* data );
bool papi_eeprom_read( uint16_t plugin_id, uint8_t* data );

plugin_id should eventually be converted to slot_id; data_size must be declared by plugin writer in advance (that is, in plugin manifest); mapping of plugin_id to slot_id must be done at time of firmware code generation (exact details are TBD).

void papi_eeprom_flush();

When this function returns, results of previous ‘write’ operations are guaranteed to be actually stored in EEPROM. Note: depending on the particular architecture, this may result in an actually-empty call.

Non-blocking calls to access hardware

Here are calls to access pins.

bool papi_read_digital_pin( uint16_t pin_num );
void papi_write_digital_pin( uint16_t pin_num, bool value );

The following calls implement access to devices sitting behind SPI and I2C interfaces. Each size is in bits. TODO: discuss the order of bits within an unsigned int representing command/data

void papi_start_sending_spi_command_16( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz);
void papi_start_sending_spi_command_32( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint32_t command, uint8_t command_sz);
void papi_start_sending_i2c_command_16( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz);
void papi_start_sending_i2c_command_32( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint32_t command, uint8_t command_sz);

Each of the above papi_start_sending_*() calls starts an operation and returns immediately; to know that the request has been performed, wait for the respective spi_id / i2c_id.

void papi_start_receiving_spi_data_16( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t* data);
void papi_start_receiving_spi_data_32( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint32_t* data);
void papi_start_receiving_i2c_data_16( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t* data);
void papi_start_receiving_i2c_data_32( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint32_t* data);

Each of the above papi_start_receiving_*() calls starts an operation and returns immediately; to know that the data is available, wait for the respective spi_id / i2c_id.

void papi_cancel_spi_send( uint8_t spi_id );
void papi_cancel_spi_receive( uint8_t spi_id );
void papi_cancel_i2c_send( uint8_t i2c_id );
void papi_cancel_i2c_receive( uint8_t i2c_id );

Each of the above papi_cancel_*() calls returns immediately.

Blocking calls

All calls in this group are pseudo-functions that will be compiled to a proper sequence of calls which initiates the corresponding operation and then starts waiting for the result.

Blocking calls to wait for timeout
void papi_sleep( uint16_t millisec );//TODO?: SA_TIME_VAL instead of ms?
Blocking calls to access hardware
void papi_wait_for_spi_send( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz );
void papi_wait_for_i2c_send( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz );
void papi_wait_for_spi_receive( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t* data );
void papi_wait_for_i2c_receive( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t* data );
void papi_wait_for_wait_handler( WAITING_FOR* wf );//see helper functions below
Helper functions to fill WAITING_FOR structure
void papi_init_wait_handler( WAITING_FOR* wf );
void papi_wait_handler_add_wait_for_spi_send( WAITING_FOR* wf, uint8_t spi_id );
void papi_wait_handler_add_wait_for_i2c_send( WAITING_FOR* wf, uint8_t i2c_id );
void papi_wait_handler_add_wait_for_spi_receive( WAITING_FOR* wf, uint8_t spi_id );
void papi_wait_handler_add_wait_for_i2c_receive( WAITING_FOR* wf, uint8_t i2c_id );
void papi_wait_handler_add_wait_for_timeout( WAITING_FOR* wf, SA_TIME_VAL tv );
bool papi_wait_handler_is_waiting_for_spi_send( const WAITING_FOR* wf, uint8_t spi_id );
bool papi_wait_handler_is_waiting_for_i2c_send( const WAITING_FOR* wf, uint8_t i2c_id );
bool papi_wait_handler_is_waiting_for_spi_receive( const WAITING_FOR* wf, uint8_t spi_id );
bool papi_wait_handler_is_waiting_for_i2c_receive( const WAITING_FOR* wf, uint8_t i2c_id );
bool papi_wait_handler_is_waiting_for_timeout( SA_TIME_VAL* remaining, const WAITING_FOR* wf );

TODO: think about parameters

Yet unsorted calls
void papi_gravely_power_inefficient_micro_sleep( SA_TIME_VAL* timeval );

Zepto Programming Patterns

Version:v0.2

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture , Zepto VM , and SmartAnthill Plugins documents, please make sure to read them before proceeding.

Zepto VM does not intend to provide highly sophisticated and/or mathematics-oriented functionality; instead, it intends to support very limited but highly practical scenarios, which make it possible to minimize communications between SmartAnthill Central Controller and SmartAnthill Device, and therefore to minimize power consumption on the side of SmartAnthill Device.

Currently, Zepto programming patterns are described in terms of Zepto JavaScript. This is an extremely limited version of normal JavaScript (“zepto” means “10^-21”, so you get about 10^-21 of the functionality of the original language). Basically, whatever is not in these examples, you shouldn’t use.

Pattern 1. Simple Request/Response

The very primitive (though, when aided with a program on Central Controller side, sufficient to implement any kind of logic) program:

return TemperatureSensor.Execute();

This program should compile to all Zepto VM Levels.

Pattern 2. Sleep and Measure

Sleep for several minutes (turning off transmitter), then report back.

mcu_sleep(5*60); //5*60 is a compile-time constant,
                 //  so no multiplication is performed on MCU here
return TemperatureSensor.Execute();

This program should compile to all Zepto VM Levels.

Pattern 3. Measure and Report If

The same thing, but asking to report only if measurements exceed certain bounds. Still, once per 5 cycles, SmartAnthill Device reports back, so that Central Controller knows that the Device is still alive.

for (var i = 0; i < 5; i++) {
    temp = TemperatureSensor.Execute();
    if (temp.Temperature < 36.0 ||
            temp.Temperature > 38.9)
              //Note that both comparisons should compile
              //  into integer comparisons, using Plugin Manifest
        return temp;
    mcu_sleep(5*60);
}
return TemperatureSensor.Execute();

This program should compile to all Zepto VM Levels, starting from Zepto VM Small.
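
The comment about integer comparisons can be illustrated with a short Python sketch. It shows one way a compiler could turn the float thresholds into bounds on the raw field, assuming the linear conversion 0 -> 35.0, 255 -> 40.0 from the Plugin Manifest example and a positive slope; the function names are hypothetical, and the actual Zepto compiler may work differently.

```python
from fractions import Fraction
import math


def int_bound_for_greater(threshold, a, b):
    """For a > 0: a*field + b > threshold  <=>  field > returned bound."""
    return math.floor((Fraction(threshold) - b) / a)


def int_bound_for_less(threshold, a, b):
    """For a > 0: a*field + b < threshold  <=>  field < returned bound."""
    return math.ceil((Fraction(threshold) - b) / a)


# Exact rational coefficients avoid rounding surprises at the boundaries.
a = Fraction(40 - 35, 255 - 0)
b = Fraction(35)

upper = int_bound_for_greater("38.9", a, b)  # temp.Temperature > 38.9
lower = int_bound_for_less("36.0", a, b)     # temp.Temperature < 36.0
print(lower, upper)
```

The device then only needs to evaluate field < lower || field > upper on the raw integer, so no floating-point arithmetic is required on the MCU.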

Pattern 4. Implicit parallelism

temp = TemperatureSensor.Execute();
humi = HumiditySensor.Execute();
return [temp, humi];

or

return [TemperatureSensor.Execute(), HumiditySensor.Execute()];

In all these (equivalent) cases compiler, if possible, SHOULD implicitly call both sensor Execute() functions in parallel (see PARALLEL Zepto VM instruction), reducing processing time.

Combined Example

Now let’s consider an example where we want to perform temperature measurements more frequently than humidity ones, and to report back early when both readings exceed their thresholds:

humi = HumiditySensor.Execute();
for (var i = 0; i < 5; i++) {
    if (i%2 == 0) //SHOULD compile into '&1' to avoid division
        humi = HumiditySensor.Execute();
    temp = TemperatureSensor.Execute(); //SHOULD be performed in parallel
                                        //  with HumiditySensor() when applicable
    if (humi.HumiditySensor > 80 &&
           temp.Temperature > 30.0)
        return [temp, humi];

    mcu_sleep(5*60);
}

return [TemperatureSensor.Execute(), HumiditySensor.Execute()];

TODO: calculation plugins(?)

SimpleIoT

SimpleIoT Protocol Stack

Version:0.1

SimpleIoT protocol stack is intended to provide communication services over heterogeneous IoT networks, allowing SimpleIoT Clients to control SimpleIoT Devices. These communication services are implemented as request-response services within the OSI/ISO network model. SimpleIoT Devices and Clients, connected together, form a SimpleIoT Personal Area Network (PAN).

SimpleIoT protocol stack consists of several protocols, which adhere to a common naming schema, and currently include SimpleIoT/VM, SimpleIoT/CCP (which includes SimpleIoT/Pairing and SimpleIoT/OtAProgramming subprotocols), SimpleIoT/GDP, SimpleIoT/SP, SimpleIoT/oIP, SimpleIoT/MP, and SimpleIoT/DLP-*. Details of these protocols are described below.

Design Goals

SimpleIoT protocol stack is aimed to be used as a communication stack of an operating system such as Zepto OS, FreeRTOS or Contiki. As a result the following design goals apply:

  • SimpleIoT is a heterogeneous network protocol, aimed to provide interoperability between Devices which have very different connectivity means
  • SimpleIoT aims to support a wide range of communication protocols, including both wired and wireless
    • SimpleIoT aims to support various wireless protocols, including FSK-modulated RF (specifying details of such modulation to ensure sufficient interoperability), 802.15.4, and Bluetooth.
    • SimpleIoT aims to support various wired protocols, including RS-232, CAN bus, and USB CDC ACM.
  • SimpleIoT aims to support a wide range of MCUs, from 8-bit MCUs such as those AVR8-based, to 32-bit MCUs such as ARM Cortex-M0 to ARM Cortex-M4.
    • SimpleIoT for Terminating Devices should be small enough to fit into 16K-24K Flash (i.e. Zepto OS running on MCUs with 32K Flash should be perfectly usable, accounting for HAL, device drivers, etc.), and into 512 bytes RAM (i.e. Zepto OS running on MCUs with 1K RAM should be perfectly usable).
    • SimpleIoT for Retransmitting Devices should be small enough to fit into 24-32K Flash (i.e. Zepto OS running on MCUs with 64K Flash should be perfectly usable, accounting for HAL, device drivers, etc.), and into 3K bytes RAM (i.e. Zepto OS running on MCUs with 4K RAM should be perfectly usable).
  • SimpleIoT aims to provide explicit support for battery-powered Terminating Devices, optimizing power consumption as much as possible, and allowing a Terminating Device to operate for months and years from a single 20-mm button cell (assuming that power consumption of the attached sensor is negligible).
    • In particular, one very important scenario is enabling a (Terminating) SimpleIoT Device to turn off its receiver for a time period which is specified by SimpleIoT Client; during this time, the Device is completely inaccessible to the rest of the SimpleIoT network.
    • In addition, other power-optimization measures need to be employed where applicable, including reducing the number of packets transmitted, reducing waiting times in the “listening” state, reducing transmission power wherever possible, and using forward error correction to allow for transmission with less emitted power and fewer retransmits.

Actors

In SimpleIoT Protocol Stack, there are three distinct actors:

  • SimpleIoT Client. Whoever needs to control SimpleIoT Device(s). Each SimpleIoT Client needs to keep a SimpleIoT DB of those SimpleIoT Devices it needs to control. SimpleIoT Clients are usually implemented by Internet-connected PCs, though this is not strictly required.
  • SimpleIoT Router. SimpleIoT Router allows control of the SimpleIoT Devices connected to it. It provides compression on SimpleIoT/oIP protocols, and implements SimpleIoT/DLP-* protocols (see below). Each SimpleIoT Router keeps a SimpleIoT Routing Table, which is necessary to operate a SimpleIoT mesh PAN (see [SimpleIoT/MP] document for details on SimpleIoT Mesh Protocol).
  • SimpleIoT Device. Physical device (containing sensor(s) and/or actuator(s)), which implements at least some parts of SimpleIoT protocol stack. Every SimpleIoT Device runs its own (usually minimal and optimized for SimpleIoT tasks) IP stack (but does not necessarily run TCP stack). As described in [SimpleIoT/MP] document, SimpleIoT Devices are divided into Retransmitting Devices and Terminating Devices.
    • Terminating SimpleIoT Devices represent Devices which perform specific tasks (such as sensors and/or actuators). Terminating Devices do not retransmit packets in a SimpleIoT mesh network. Terminating Devices MAY turn off their receiver(s), and are good candidates to be battery-powered Devices.
    • Retransmitting SimpleIoT Devices MAY perform the same functions as Terminating Devices (i.e. controlling sensors and/or actuators), plus they SHOULD be able to retransmit packets to other SimpleIoT Devices within SimpleIoT mesh network. Retransmitting Devices MUST NOT turn off their receiver(s), and usually should have easy access to quite significant power sources.

Assumptions

SimpleIoT operates under assumption that most of communication in SimpleIoT PAN will happen between Device and Client (i.e. not between two Devices). Communication between Devices is currently supported only via a SimpleIoT Client; while this MIGHT change in the future versions, inter-Device communication will still be considered as a rare occurrence.

Addressing

In SimpleIoT, each SimpleIoT Device is assigned its own IPv6 address (usually generated pseudo-randomly as specified in RFC4193). When transferring SimpleIoT/oIP packets over SimpleIoT/MP, IPv6 and UDP headers MUST be compressed (as described in [SimpleIoT/oIP] document; techniques described there are similar to those of 6LoWPAN, but are more specific to SimpleIoT tasks, and are more efficient for our purposes as a result).

In addition, each Device within PAN is assigned a NODE-ID (it happens as a part of “pairing” procedure, see [SimpleIoT/Pairing] document for details), which is used as a shortcut to IPv6 address whenever possible.

Relation between SimpleIoT protocol stack and OSI/ISO network model

| Layer | OSI-Model | SimpleIoT Protocol Stack | Function | Implementation on Clients | Implementation on Routers | Implementation on Devices |
| 7 | Application | SimpleIoT/VM | Device Control | Byte-code Compiler | | SimpleIoT/VM |
| | | SimpleIoT/CCP | Command/Reply Handling | SimpleIoT/CCP | | SimpleIoT/CCP |
| 5 | Session | SimpleIoT/GDP | Guaranteed Delivery | SimpleIoT/GDP (“Master”) | | SimpleIoT/GDP (“Slave”) |
| | | SimpleIoT/SP | Encryption and Authentication | SimpleIoT/SP | SimpleIoT/SP (optional) | SimpleIoT/SP |
| 4 | Transport | SimpleIoT/oIP | Transport over IP Networks | SimpleIoT/oIP | IP side: SimpleIoT/oIP; SIoT side: SimpleIoT/oUDP (compressed) | SimpleIoT/UDP (compressed) |
| | | UDP | As usual for UDP | UDP | IP side: UDP | |
| 3 | Network | SimpleIoT/MP or IP | Mesh for SimpleIoT/MP, as usual for IP | IP | IP side: IP; SIoT side: SimpleIoT/MP | SimpleIoT/MP |
| 2 | Datalink | SimpleIoT/DLP-* | Intra-bus addressing, Fragmentation (if applicable), Forward Error Correction | – (standard network capabilities) | IP side: – (standard network capabilities); SIoT side: SimpleIoT/DLP-* | SimpleIoT/DLP-* |
| 1 | Physical | Physical | | – (standard network capabilities) | IP side: – (standard network capabilities); SIoT side: Physical | Physical |

SimpleIoT protocol stack consists of the following protocols:

  • SimpleIoT/VM. Essentially a byte-code interpreter, where byte-code is optimized for extremely resource-constrained devices. SimpleIoT/VM handles generic commands and routes device-specific commands to device-specific plug-ins. Belongs to Layer 7 of OSI/ISO network model.
  • SimpleIoT/CCP – SimpleIoT Command&Control Protocol. Also belongs to Layer 7 of OSI/ISO network model.
  • SimpleIoT/GDP – SimpleIoT Guaranteed Delivery Protocol. Belongs to Layer 5 of OSI/ISO network model. Provides guaranteed command/reply delivery. Flow control is implemented, but is quite rudimentary (only one outstanding packet is normally allowed for each virtual link, see details below). On the other hand, SimpleIoT/GDP provides efficient support for scenarios such as temporary disabling receiver on the SimpleIoT Device side; as noted above, such scenarios are very important to ensure energy efficiency.
  • SimpleIoT/SP – SimpleIoT Security Protocol. Due to several considerations (including resource constraints) SimpleIoT protocol stack implements security on a layer right below SimpleIoT/GDP, so SimpleIoT/SP essentially belongs to Layer 5 of OSI/ISO network model.
  • SimpleIoT/oIP – “SimpleIoT over IP” Protocol. MAY have different flavours, though currently only SimpleIoT/oUDP is supported. In the future support for SimpleIoT/oTCP MIGHT be added, but it won’t be mandatory for Devices.
  • SimpleIoT/MP - SimpleIoT Mesh Protocol. Aims to provide heterogeneous mesh network with an explicit “storm” control within applicable collision domains.
  • SimpleIoT/DLP-* – SimpleIoT DataLink Protocol family. Belongs to Layer 2 of OSI/ISO network model. SimpleIoT/DLP-* is specific to an underlying transfer technology (so for CAN bus SimpleIoT/DLP-CAN is used, for IEEE 802.15.4 SimpleIoT/DLP-IEEE802.15.4 is used). Protocols from SimpleIoT/DLP-* family handle fragmentation and forward error correction if necessary, and in general provide non-guaranteed packet transfer.

Error Handling Philosophy and Asymmetric Nature

In real-world operation, it is inevitable that from time to time a mismatch occurs between the states of SimpleIoT Client and SimpleIoT Device. While such mismatches should never occur as long as the SimpleIoT protocols are strictly adhered to, they may still occur for many practical reasons, such as a reboot or restore-from-backup of SimpleIoT Client, or a transient failure of the SimpleIoT Device (for example, due to a power surge, a near-depleted battery, a RAM soft error due to X-rays, etc.).

SimpleIoT protocol stack attempts to clear as many such scenarios as possible ‘automagically’, without the need to reprogram SimpleIoT Device. To achieve this goal, the following approach is used: SimpleIoT protocol stack assumes that whenever there is any kind of mismatch, it is the SimpleIoT Client that is “right”. In addition, if such a decision is not sufficient to recover from the mismatch, SimpleIoT Device will perform complete re-initialization.

It means that certain SimpleIoT protocols (such as SimpleIoT/CCP and SimpleIoT/GDP) are inherently asymmetrical; details are provided in their respective documents ([SimpleIoT/CCP] and [SimpleIoT/GDP] respectively).

TODO: recommend on-device watchdog?

Packet Chains

SimpleIoT protocol stack is intended to provide various services between two entities: SimpleIoT Client and SimpleIoT Device. Most of these services are of request-response nature. To implement them while imposing the least requirements on the resource-constrained SimpleIoT Device, all interactions within SimpleIoT protocol stack at the levels between SimpleIoT/VM and SimpleIoT/GDP (inclusive) are considered as “packet chains”, in which one of the parties initiates communication by sending a packet P1, the other party responds with a packet P2, then the first party may respond to P2 with P3, and so on.

Packet chains are initiated at SimpleIoT/VM layer, and are supported by all the layers between SimpleIoT/VM and SimpleIoT/GDP (inclusive). Whenever SimpleIoT/VM issues a packet to an underlying protocol, it MUST specify whether a packet is a first, intermediate, or last within a “packet chain”. This information allows underlying protocols (down to SimpleIoT/GDP) to arrange for proper retransmission if some packets are lost during communication, see [SimpleIoT/GDP] document for details.

Packet Size Guarantees, DEVICECAPS instruction, SIMPLEIOT_VM_GUARANTEED_PAYLOAD, and Fragmentation

All SimpleIoT Devices MUST support receiving SimpleIoT/VM commands and sending SimpleIoT/VM replies with at-least-8-bytes payload; all underlying protocols MUST support it (taking into account appropriate header sizes, so, for example, SimpleIoT/SP MUST be able to pass at least 8_bytes+SimpleIoT_VM_headers+SimpleIoT_CCP_headers+SimpleIoT_GDP_headers as payload). If Client needs to send a command which is larger than 8 bytes, it SHOULD first obtain information about device capabilities. It SHOULD be done via SimpleIoT/VM DEVICECAPS request (see [SimpleIoT/VM] for details). When Client doesn’t have information about Device, its SimpleIoT/VM request with the DEVICECAPS instruction MUST be <= 8 bytes in size; VM’s reply to a DEVICECAPS instruction MAY be larger than 8 bytes if it is specified in the instruction (and if the Device itself is capable of sending it). The information obtained from DEVICECAPS request SHOULD be stored in Client’s SimpleIoT DB.

One of DeviceCapabilities fields is SIMPLEIOT_VM_GUARANTEED_PAYLOAD (which is conceptually similar to MTU from IP stack, but includes header sizes to provide information which is appropriate for Layer 7). When a SimpleIoT Device fills in SIMPLEIOT_VM_GUARANTEED_PAYLOAD in response to DEVICECAPS request, it MUST take into account capabilities of its L1/L2 protocol; that is, if a SimpleIoT Device supports IEEE 802.15.4 and an L2 protocol which doesn’t perform packet fragmentation and re-assembly, then the Device won’t be able to send/receive payloads which are larger than roughly 80 bytes (exact size depends on headers and needs to be calculated depending on protocol specifics), and it MUST NOT report DeviceCapabilities.SIMPLEIOT_VM_GUARANTEED_PAYLOAD which is more than this amount. TODO: separate _COMMAND/_REPLY instead of _PAYLOAD?

In SimpleIoT, fragmentation and re-assembly is a responsibility of SimpleIoT/DLP-* family of protocols. If implemented, it may allow device to increase reported (and sent/received) SIMPLEIOT_VM_GUARANTEED_PAYLOAD.

All SimpleIoT Retransmitting Devices MUST support SIMPLEIOT_VM payload sizes of at least 384 bytes. Therefore, after obtaining Device Capabilities for a SimpleIoT Device, SimpleIoT Client MAY rely on min(DeviceCapabilities.SIMPLEIOT_VM_GUARANTEED_PAYLOAD,384) being guaranteed to be delivered to the Device.

Stack-Wide Encodings

There are some encodings and encoding conventions which are used throughout SimpleIoT Protocol Stack.

SimpleIoT Encoded-Unsigned-Int

In several places in SimpleIoT Protocol Stack, there is a need to encode integers, which happen to be small most of the time (one such example is sizes, another example is some kinds of incrementally-increased ids such as NODE-ID). To encode them efficiently, SimpleIoT Protocol Stack uses a compact encoding, which encodes small integers with smaller number of bytes. Encoded-Unsigned-Int is very close to Variable-length quantity (VLQ) (see http://en.wikipedia.org/wiki/Variable-length_quantity), however, SimpleIoT Encoded-Unsigned-Int<> encoding enforces “canonical” VLQ representation, prohibiting non-optimal encodings such as two-byte encoding of ‘0’. Also note that other encodings such as Encoded-Signed-Int are different from what is described on VLQ Wikipedia page.

Encoded-Unsigned-Int is a variable-length encoding of unsigned integers. Namely:

  • if the first byte of Encoded-Unsigned-Int is c1 <= 127, then the value of Encoded-Unsigned-Int is equal to c1
  • if the first byte of Encoded-Unsigned-Int is c1 >= 128, then the next byte c2 is needed:
    • if the second byte of Encoded-Unsigned-Int is c2 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint16)(c1&0x7F) | ((uint16)c2 << 7)).
    • if the second byte of Encoded-Unsigned-Int is c2 >= 128, then the next byte c3 is needed:
      • if the third byte of Encoded-Unsigned-Int is c3 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint32)(c1&0x7F) | ((uint32)(c2&0x7F) << 7) | ((uint32)c3 << 14)).
      • if the third byte of Encoded-Unsigned-Int is c3 >= 128, then the next byte c4 is needed:
        • if the fourth byte of Encoded-Unsigned-Int is c4 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint32)(c1&0x7F) | ((uint32)(c2&0x7F) << 7) | ((uint32)(c3&0x7F) << 14) | ((uint32)c4 << 21)).
        • if the fourth byte of Encoded-Unsigned-Int is c4 >= 128, then the next byte c5 is needed.
          • for nth byte:
            • if the nth byte of Encoded-Unsigned-Int is cn <= 127, then the value of Encoded-Unsigned-Int is equal to ((uintNN)(c1&0x7F) | ((uintNN)(c2&0x7F) << 7) | ((uintNN)(c3&0x7F) << 14) | ... | ((uintNN)(c<n-1>&0x7F) << (7*(n-2))) | ((uintNN)cn << (7*(n-1)))), where uintNN is sufficient to store the result. NB: in practice, for Encoded-Unsigned-Ints over 4 bytes, implementation is likely to be quite different from, but equivalent to, the formula given.
            • if the nth byte of Encoded-Unsigned-Int is cn >= 128, then the <n+1>th byte is needed.

IMPORTANT: Encoded-Unsigned-Int enforces “canonical” representation. It means that all integers MUST be encoded with the smallest number of bytes possible. This requirement is equivalent to a requirement that for encodings with length > 1, last byte of encoding MUST NOT be equal to zero. This MUST be checked by compliant implementations (and MUST generate invalid-encoding exception, with effects depending on the point where it has occurred).
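The decoding rules and the canonical-form check can be sketched in C roughly as follows. This is a minimal illustration rather than a reference implementation: the function name `eui_decode`, its return convention (0 on any error), and the 9-byte limit (values up to 63 bits) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name and return convention are assumptions).
 * Decodes an Encoded-Unsigned-Int from buf[0..len-1] into *value.
 * Returns the number of bytes consumed, or 0 if the input is truncated,
 * longer than 9 bytes, or not canonical. */
size_t eui_decode(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    size_t i;

    assert(buf != NULL && value != NULL);
    for (i = 0; i < len && i < 9; i++) {
        uint8_t c = buf[i];
        /* low-to-high order: byte i contributes bits 7*i .. 7*i+6 */
        result |= (uint64_t)(c & 0x7F) << (7 * i);
        if (c <= 127) {
            /* canonical form: a multi-byte encoding must not end in 0 */
            if (i > 0 && c == 0)
                return 0;
            *value = result;
            return i + 1;
        }
    }
    return 0; /* ran out of input, or more than 9 bytes */
}
```

With this sketch, the bytes 0x80 0x01 decode to 128, while 0x80 0x00 (a two-byte encoding of 0) is rejected as non-canonical.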

The following table shows how many Encoded-Unsigned-Int bytes are necessary to encode ranges of Encoded-Unsigned-Int values:

| Encoded-Unsigned-Int Values | Encoded-Unsigned-Int Bytes | Fully Covers | Result fits in |
| 0-127 | 1 | 7 bits | 1 byte |
| 128-16 383 | 2 | 14 bits | 2 bytes |
| 16 384-2 097 151 | 3 | 21 bits | 3 bytes |
| 2 097 152-268 435 455 | 4 | 28 bits | 4 bytes |
| 268 435 456-34 359 738 367 | 5 | 35 bits | 5 bytes |
| 34 359 738 368-4 398 046 511 103 | 6 | 42 bits | 6 bytes |
| 4 398 046 511 104-562 949 953 421 311 | 7 | 49 bits | 7 bytes |
| 562 949 953 421 312-72 057 594 037 927 935 | 8 | 56 bits | 8 bytes |
| 72 057 594 037 927 936-9 223 372 036 854 775 807 | 9 | 63 bits | 8 bytes |
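The byte counts in the table follow directly from emitting 7 bits per byte, lowest bits first. A possible encoder is sketched below; the names `eui_encoded_size` and `eui_encode` are hypothetical, and the caller is assumed to provide a large enough output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes the canonical encoding of v takes
 * (one byte per started group of 7 significant bits). */
size_t eui_encoded_size(uint64_t v)
{
    size_t n = 1;
    while (v > 127) {
        v >>= 7;
        n++;
    }
    return n;
}

/* Hypothetical helper: writes the canonical encoding of v to out, which
 * must have room for eui_encoded_size(v) bytes; returns the byte count. */
size_t eui_encode(uint64_t v, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    while (v > 127) {
        out[n++] = (uint8_t)((v & 0x7F) | 0x80); /* high bit: more bytes follow */
        v >>= 7;
    }
    out[n++] = (uint8_t)v; /* last byte: high bit clear, never 0 if n > 1 */
    return n;
}
```

Because the loop stops as soon as the remaining value fits in 7 bits, the last byte of a multi-byte encoding is never zero, so the output is always canonical.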

IMPORTANT: Encoded-Unsigned-Int encoding (specifically, low-to-high byte encoding order) guarantees that for even numbers, the first byte of the encoded value is always even. This property MAY be relied on in other places in protocol stack, specifically, in “indicate an error in an unknown-length field” scenarios (so if we decide to change order of bytes in the encoding, we need to change logic in those places too).

Table of correspondence of “max=” parameter and maximum possible encoding length:

| max= | maximum Encoded-Unsigned-Int bytes |
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
| 5 | 6 |
| 6 | 7 |
| 7 | 8 |
| 8 | 10 |

Encoded-Signed-Int

Encoded-Signed-Int is an encoding for signed integers, based on Zig-Zag conversion from signed integer to unsigned integer, and subsequent Encoded-Unsigned-Int encoding of unsigned integer.

Zig-Zag conversion is the same as described here: https://developers.google.com/protocol-buffers/docs/encoding?csw=1#types. For example, to convert int16_t sx to uint16_t ux, the following C language expression is used:

ux = (uint16_t)((sx << 1) ^ (sx>>15))

To convert int32_t sx to uint32_t ux, expression becomes ux = (uint32_t)((sx << 1) ^ (sx>>31)), and so on.

Note that right shift in these expressions is a signed shift, making it equivalent to creating a bitmask of appropriate length, consisting either of all ‘0’s or of all ‘1’s (equal to the sign bit of the original signed integer). This allows, for example, calculating one byte of this mask by signed-shifting the highest byte of sx to the right by 7, and then using this byte for XORing with all the bytes of left-shifted sx; this trick should speed up implementations on 8-bit MCUs.
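The byte-wise trick for 8-bit MCUs might look roughly like the sketch below for the 16-bit case. The function name `zigzag16_bytewise` is made up for this example, and the mask is derived from the sign bit with a conditional instead of a signed shift (a substitution made to keep the code portable C); the resulting mask byte is the same.

```c
#include <stdint.h>

/* Zig-Zag conversion of a 16-bit value, done one byte at a time
 * (hypothetical helper, illustrating the single-mask-byte trick). */
uint16_t zigzag16_bytewise(int16_t sx)
{
    uint16_t u = (uint16_t)sx;                /* two's-complement bit pattern */
    uint8_t hi = (uint8_t)(u >> 8);
    uint8_t lo = (uint8_t)(u & 0xFF);
    /* one mask byte: sign bit replicated 8 times, same bits as a signed
     * shift of the highest byte to the right by 7 */
    uint8_t mask = (hi & 0x80) ? 0xFF : 0x00;
    /* shift the 16-bit value left by 1 across the two bytes, then XOR
     * every byte with the same mask byte */
    uint8_t out_hi = (uint8_t)(((hi << 1) | (lo >> 7)) ^ mask);
    uint8_t out_lo = (uint8_t)((lo << 1) ^ mask);
    return (uint16_t)(((uint16_t)out_hi << 8) | out_lo);
}
```

For wider integers the same single mask byte would be reused for every byte of the left-shifted value.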

After ux is calculated, it is stored as an Encoded-Unsigned-Int of the appropriate size, as described above.

To perform Zig-Zag conversion back (from Zig-Zag-encoded unsigned ux to original signed sx), the following expression may be used (shown for 16-bit conversions; expressions for other sizes are very similar):

sx = (int16_t)((ux >> 1) ^ (-(ux & 1)))

Note that once again, all bits (and therefore bytes) of (-(ux&1)) are the same, so one byte can be calculated (this time - based on lowest byte) and then used for XORing with all the bytes of right-shifted ux.
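Both directions of the 16-bit conversion can be put together as a small round-trip sketch. The function names are hypothetical, and the code deliberately avoids shifting negative values and out-of-range unsigned-to-signed casts (which plain C leaves undefined or implementation-defined), so it is a portable variant of the expressions in the text rather than a literal copy of them.

```c
#include <stdint.h>

/* Signed -> Zig-Zag unsigned: 0, -1, 1, -2, ... map to 0, 1, 2, 3, ... */
uint16_t zigzag16_encode(int16_t sx)
{
    uint16_t u = (uint16_t)sx;
    uint16_t mask = (sx < 0) ? 0xFFFFu : 0x0000u; /* same bits as signed sx >> 15 */
    return (uint16_t)((uint16_t)(u << 1) ^ mask);
}

/* Zig-Zag unsigned -> signed (inverse of the function above). */
int16_t zigzag16_decode(uint16_t ux)
{
    uint16_t mask = (uint16_t)(0u - (ux & 1u));    /* all ones if the low bit is set */
    uint16_t u = (uint16_t)((ux >> 1) ^ mask);     /* two's-complement bit pattern */
    if (u & 0x8000u)                               /* negative: rebuild the value explicitly */
        return (int16_t)(-(int32_t)(uint16_t)(0xFFFFu - u) - 1);
    return (int16_t)u;
}
```

After `zigzag16_encode`, the result would be stored as an Encoded-Unsigned-Int, so small negative and small positive numbers both end up in a single byte.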

Encoded-*-Int<max=>

Wherever SimpleIoT specification mentions Encoded-Unsigned-Int or Encoded-Signed-Int, it MUST specify it in the form of Encoded-Unsigned-Int<max=...> or Encoded-Signed-Int<max=...>. The “max=” parameter specifies the maximum number of bytes which are necessary to represent the number being encoded (that is, the size of the value itself, not the length of its encoding). For example, Encoded-Unsigned-Int<max=2> specifies that the number is between 0 and 65535 (and therefore from one to three bytes may be used to encode it). The high bit of the last possible byte of Encoded-*-Int is always 0; this ensures an option for an easy expansion in the future.

Currently supported values of “max=” parameter are from 1 to 8.

When parsing Encoded-*-Int, if high bit in the last-possible byte is 1, then Encoded-*-Int is considered invalid. Handling of invalid Encoded-*-Ints SHOULD be specified in the appropriate place of documentation.
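The length check for a given “max=” can be sketched as follows. The specification gives the maximum encoding lengths only as a table; the closed formula used here (ceil(8*max/7)) is an assumption that merely reproduces that table for max= 1 to 8, and both function names are hypothetical. The sketch checks only the length rule, not the value range or canonical form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Longest encoding allowed for Encoded-Unsigned-Int<max=max_bytes>.
 * Assumption: ceil(8*max/7), which matches the table for max = 1..8. */
size_t eui_max_encoded_len(unsigned max_bytes)
{
    return (8u * max_bytes + 6u) / 7u;
}

/* Returns 1 if buf holds an encoding that terminates within the length
 * allowed for the given max=, and 0 otherwise (including truncated input). */
int eui_length_is_valid(const uint8_t *buf, size_t len, unsigned max_bytes)
{
    size_t limit = eui_max_encoded_len(max_bytes);
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len && i < limit; i++) {
        if ((buf[i] & 0x80) == 0)
            return 1; /* terminating byte found in time */
    }
    return 0; /* high bit still set in the last possible byte, or input ended */
}
```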

SimpleIoT Endianness

In most cases, SimpleIoT Protocol Stack uses SimpleIoT Encoded-*-Int<max=...> to encode integers. However, there are some cases where we need an exact number of bytes, and have no idea about their statistical distribution. In such cases, using Encoded-*-Int<> would be a waste.

In such cases, SimpleIoT uses SimpleIoT Endianness, which is LITTLE-ENDIAN.

Rationale for using LITTLE-ENDIAN encoding (rather than “network byte order” which is traditionally big-endian) is based on the observation that the most resource-constrained MCUs out of the target group (such as PIC and AVR8) are little-endian. For them, the savings from not doing conversion between protocol-order and MCU-order might be important; as the other MCUs are not that much constrained, we don’t expect the cost of conversion to be significant. In other words, this LITTLE-ENDIAN decision favours poorer-resource MCUs at the cost of richer-resource MCUs.
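As an illustration, a fixed-size 2-byte field in SimpleIoT Endianness could be written and read as sketched below. The helper names are hypothetical; the code works byte by byte, so it gives the same result regardless of the endianness of the MCU it runs on.

```c
#include <stdint.h>

/* Store a 16-bit value in little-endian order: low byte first. */
void put_u16_le(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit little-endian value. */
uint16_t get_u16_le(const uint8_t *in)
{
    return (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
}
```

On a little-endian MCU a compiler can typically reduce both helpers to a plain 16-bit store or load, which is the saving the rationale refers to.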

SimpleIoT Bitfields

In some cases, SimpleIoT Protocols use bitfields; in such cases:

  • bitfields MUST use 1-byte, 2-byte, Encoded-Unsigned-Int<max=>, or Encoded-Signed-Int<max=> field as a ‘substrate’. ‘Bitfield Substrate’ is composed/parsed as an ordinary field, which is encoded using appropriate encodings described in this document.
  • as soon as ‘substrate’ is parsed, it is treated as an integer, out of which specific bits can be used; these bits are specified as [3] (specifying that single bit #3 is used), or [2..4] (specifying that bits from 2 to 4 - inclusive - are used). Bit[0] means the least significant bit, i.e. (substrate&0x01), bit[1] - the next bit, i.e. ((substrate>>1)&0x01), and so on.
  • if ‘substrate’ is an Encoded-Unsigned-Int field, then one of bitfields MAY be specified as [2..] - specifying that all the bits from 2 to the highest available one, are used for the bitfield.
  • if ‘substrate’ is an Encoded-Signed-Int field, then one of bitfields MAY be specified as [2..] - specifying that all the bits from 2 to the highest available one, are used for the bitfield; in this example, the bitfield in question MUST be calculated as substrate>>2, where substrate is treated as signed (i.e. ‘>>’ operator works extending sign bit).
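The bitfield notation can be illustrated with a few helper functions operating on an already-parsed substrate. All names here are hypothetical, a 32-bit substrate is assumed for simplicity, and `lo` and `hi` are assumed to be valid bit numbers (below 32, with lo <= hi).

```c
#include <stdint.h>

/* Bits [lo..hi] (inclusive) of an unsigned substrate; bit 0 is the LSB. */
uint32_t bits_range(uint32_t substrate, unsigned lo, unsigned hi)
{
    unsigned width = hi - lo + 1;
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (substrate >> lo) & mask;
}

/* Open-ended bitfield [lo..] of an unsigned substrate. */
uint32_t bits_from(uint32_t substrate, unsigned lo)
{
    return substrate >> lo;
}

/* Open-ended bitfield [lo..] of a signed substrate, extending the sign bit.
 * Written without right-shifting a negative value, which C leaves
 * implementation-defined. */
int32_t bits_from_signed(int32_t substrate, unsigned lo)
{
    if (substrate >= 0)
        return substrate >> lo;
    return ~((~substrate) >> lo);
}
```

For example, with substrate 0xB4 (binary 1011 0100), bits [2..4] are binary 101, and the open-ended field [2..] is 0x2D.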

SimpleIoT Half-Float

Some SimpleIoT packets (in particular, some of the commands within SimpleIoT/VM) use ‘Half-Float’ data as described here: http://en.wikipedia.org/wiki/Half-precision_floating-point_format. SimpleIoT serializes such data as 2-byte substrate (encoded according to SimpleIoT Endianness), then considering Sign-Bit bitfield as bit [15], Exponent bitfield as bits [10..14], and Fraction bitfield as bits [0..9]. This representation is strictly equivalent to the one described in Wikipedia (TODO: check).
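A decoder for this layout might look like the sketch below. The function name is hypothetical, and the interpretation of the three bitfields follows the usual IEEE 754 binary16 rules (bias 15, subnormals, infinities and NaNs), which the text references only via Wikipedia; treat it as an illustration under that assumption.

```c
#include <math.h>   /* INFINITY, NAN */
#include <stdint.h>

/* Decode a 2-byte Half-Float substrate (little-endian) into a float. */
float half_float_decode(const uint8_t bytes[2])
{
    uint16_t s = (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
    unsigned sign = (s >> 15) & 0x1;       /* bit [15] */
    unsigned exponent = (s >> 10) & 0x1F;  /* bits [10..14] */
    unsigned fraction = s & 0x3FF;         /* bits [0..9] */
    float value;
    int e;

    if (exponent == 31) {
        value = fraction ? NAN : INFINITY;
    } else {
        /* significand scaled by 2^10; subnormals (exponent 0) have no
         * implicit leading 1 and use a fixed exponent of -14 */
        value = (float)((exponent ? 1024u : 0u) + fraction);
        e = (exponent ? (int)exponent - 15 : -14) - 10;
        for (; e > 0; e--) value *= 2.0f;
        for (; e < 0; e++) value *= 0.5f;
    }
    return sign ? -value : value;
}
```

The scaling loops avoid any dependency on a floating-point library, which may matter on small MCUs; a target with `ldexpf` available could use that instead.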

SimpleIoT Security Protocol (SimpleIoT/SP)

Version: 0.1a

NB: this document relies on certain terms and concepts introduced in SimpleIoT Protocol Stack document, please make sure to read it before proceeding.

SimpleIoT/SP (SimpleIoT Security Protocol) aims to provide security guarantees for communications within SimpleIoT environments, in particular, message confidentiality, message integrity guarantees, and protection from replay attacks.

1. Definitions

1.1. Packet. A unit of data exchange with other levels/protocols. For the sake of clarity two types of packets are distinguished:

  • HLP packet: a packet that is sent to or received from a higher-level protocol. HLP packet data is a payload for SimpleIoT/SP, as discussed in more detail below.
  • SP packet: a packet that is formed by SimpleIoT/SP and is sent to or received from the communication peer (using an underlying protocol).
  • Internally valid SP packet: a packet that has passed authentication based solely on packet data (see also “intra-packet authentication”).

1.2. SP Packet structure

SP packet structure looks as follows:

| SP Header | Security Tag | Encrypted Data |

where:

  • SP Header is a non-encrypted part of the packet that contains flags and certain bits of the packet nonce. Header takes 6 bytes.
  • Security Tag: data related to encryption and authentication process. Security Tag takes 16 bytes.
  • Encrypted Data: encrypted data, which includes certain SimpleIoT/SP information as well as SimpleIoT/SP payload. The same data before encryption (or after decryption) is referred to as “Data Under Encryption”.

1.3. Packet Nonce: all data used as a packet nonce for purposes of encryption/authentication. Packet Nonce (also referred to as Packet Full Nonce, PFN) consists of:

  • Nonce Varying Part (Nonce VP): a fixed-size bit sequence uniquely generated by a sending device for each new packet; Nonce VP is 47 bits; as defined herein, in certain contexts it can be treated as a(n unsigned) integer. It can serve as a part of Packet ID when such ID is required.

  • Destination Flag: a bit that indicates whether the packet is intended solely for SimpleIoT/SP itself (such as a packet with Error “Old Nonce” Message), or for its higher level protocol. Values of the flag have the following meaning:
    • 0: packet is intended for a higher level protocol
    • 1: packet is intended for SimpleIoT/SP itself
  • Peer-Distinguishing Flag: a bit that is set to 0 for one communication peer and to 1 for another peer.

    Nonce VP and Destination Flag must be packed as a 6-byte sequence b0 | b1 | ... | b5 in the order of increasing addresses in memory. Then the numerical value of Nonce VP is calculated as follows: (uint48)b0 + (((uint48)b1)<<8) + (((uint48)b2)<<16) + (((uint48)b3)<<24) + (((uint48)b4)<<32) + ((((uint48)b5)&0x7F)<<40), and the value of Destination Flag is calculated as b5>>7.
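Extracting the two fields from the 6-byte sequence can be sketched as below. The function names are hypothetical, and `uint64_t` stands in for the 48-bit type used in the text. Since Nonce VP is 47 bits wide, only the low 7 bits of b5 (mask 0x7F) contribute to it; the remaining top bit is the Destination Flag.

```c
#include <stdint.h>

/* Numerical value of Nonce VP from the 6-byte sequence b0..b5. */
uint64_t sp_nonce_vp(const uint8_t b[6])
{
    return (uint64_t)b[0]
         | ((uint64_t)b[1] << 8)
         | ((uint64_t)b[2] << 16)
         | ((uint64_t)b[3] << 24)
         | ((uint64_t)b[4] << 32)
         | ((uint64_t)(b[5] & 0x7F) << 40); /* low 7 bits of b5: bits 40..46 */
}

/* Destination Flag: the highest bit of b5. */
unsigned sp_destination_flag(const uint8_t b[6])
{
    return (unsigned)(b[5] >> 7);
}
```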

1.4. Packet ID (PID): a unique identifier of a packet, when such ID is required.

PID is formed using Nonce VP and Peer-Distinguishing Flag and must be packed as a 6-byte sequence b0 | b1 | ... | b5 in the order of increasing addresses in memory, so that b0 = (uint8)(Nonce_VP); b1 = (uint8)(Nonce_VP >> 8); b2 = (uint8)(Nonce_VP >> 16); b3 = (uint8)(Nonce_VP >> 24); b4 = (uint8)(Nonce_VP >> 32); b5 = (uint8)(Nonce_VP >> 40) | (Peer_Distinguishing_Flag << 7).

1.5. Nonce Lower Watermark (NLW): a value maintained by a packet receiving side that is used to determine whether a value of Packet Nonce VP (i) has never been used before (if a new packet is received); (ii) has been used with the last received packet (for instance, in case of packet resending); or (iii) a de-synchronization in communication has happened.

1.6. Nonce to use For Sending (NFS): a value maintained by a packet sending side that is used to generate a value of Packet Nonce VP that would have never been used before, and that would be verifiable by the communication peer.
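The PID packing can be written out as a short function. The name `sp_pack_pid` and the use of `uint64_t` for the 47-bit Nonce VP are assumptions for this sketch; the explicit 0x7F mask only guards against a caller passing a value wider than 47 bits.

```c
#include <stdint.h>

/* Pack Nonce VP (47 bits) and Peer-Distinguishing Flag (1 bit) into a
 * 6-byte Packet ID, lowest byte first. */
void sp_pack_pid(uint64_t nonce_vp, unsigned peer_flag, uint8_t pid[6])
{
    pid[0] = (uint8_t)(nonce_vp);
    pid[1] = (uint8_t)(nonce_vp >> 8);
    pid[2] = (uint8_t)(nonce_vp >> 16);
    pid[3] = (uint8_t)(nonce_vp >> 24);
    pid[4] = (uint8_t)(nonce_vp >> 32);
    /* top byte: 7 remaining nonce bits plus the flag in the highest bit */
    pid[5] = (uint8_t)(((nonce_vp >> 40) & 0x7F) | ((peer_flag & 1u) << 7));
}
```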

1.7. Last Received Packet Signature: [TODO: check whether it is indeed required]

1.8. Packet validation process: a core task of SimpleIoT/SP. The main purpose of packet validation is to ensure that a received packet actually comes from the intended communication partner, has not been modified by a third party on the way, and that its content (unless specified otherwise) is protected from reading by unintended parties. On the sending side of communication the packet validation process results in encryption and adding authentication data. On the receiving side the process can logically be divided into two steps:

  • intra-packet authentication, which is done using solely packet data such as respective headers, nonces, tags, etc, and not using NLW;
  • in-sequence authentication, which is based on comparison of a packet nonce Varying Part with the Nonce Lower Watermark.

1.9. Error “Old Nonce” Message: a packet that represents an “old nonce” error report with the lowest possible value of a valid nonce VP (which is equal to a current value of Nonce Lower Watermark plus 1). This packet can be sent, if an otherwise valid packet is received with an “old” nonce VP, that is, with a nonce VP that is less than the Nonce Lower Watermark.

2. Security choices

The core of SimpleIoT/SP is packet encryption/decryption and authentication. These processes are based on EAX algorithm (see [EAX]). Design choices with respect to the above-mentioned algorithm are:

  • Encryption method: AES-128

  • Tag size: 128 bit

  • EAX Nonce size: 49 bit, consisting of:

    • Nonce Varying Part: 47 bit [1]
    • Destination Flag: 1 bit
    • Peer-Distinguishing Flag: 1 bit

To reduce the amount of data transferred, Peer-Distinguishing Flag is not actually transferred: the packet header contains only Nonce Varying Part and Destination Flag, and Peer-Distinguishing Flag is appended to them locally to get a Packet Full Nonce:

  • SP Header: 48 bit, consisting of:

    • Nonce Varying Part: 47 bit
    • Destination Flag: 1 bit

Rationale: In order to use the same encryption key in both directions of communication, each nonce should be unique across packets going in both directions, too. Uniqueness of the nonce going in a particular direction is enforced by packet sender (using nonce VP generation based on NFS). To separate the sets of nonces generated by each of two communication peers, a separate bit in the nonce value (Peer-Distinguishing Flag) is used to distinguish between peers so that this bit is set for all nonces generated by one peer and is not set for nonces generated by the other peer. Which peer should have this bit set can be determined, in particular, during set up of communication between two specific devices (for instance, at the same time when encryption key exchange is done), or can be a predefined choice for some types of the devices, if devices of different type participate in communication (for instance, in communication of a Master device with a Slave device Master device may always have the flag set, and Slave device may always have the flag not set).

[1] If a 47-bit nonce VP is used, then the available nonces will last for 10 years even at a rate of one packet every 2.24 microseconds: 10*365*24*60*60*1000000/2^47 = 2.24

2.1 SP Nonces

In SimpleIoT/SP, nonce varying part is always increased, and never goes back. This is a critical requirement for SimpleIoT/SP to be secure (both to guarantee nonce being unique, which is required for EAX to be secure, and to avoid replay attacks).

3. Security Guarantees

Security of SimpleIoT/SP relies on security of EAX, which is proven as long as underlying cipher (AES128) is secure, and as long as nonces are unique per key.

Within SimpleIoT/SP, keys MUST be unique for each communication pair, and uniqueness of nonces for the pair is guaranteed by:

  • Peer-Distinguishing Flag
  • for packets sent by each peer, by “Nonce to use for Sending” (NFS)

EAX as such doesn’t guarantee protection from replay attacks, however as nonces are unique, replay attack is not possible as long as SimpleIoT/SP drops packets with repeated nonces. SimpleIoT/SP does drop packets with repeated nonces, with the following exception:

  • Error “Old Nonce” Message. For an Error “Old Nonce” Message, SimpleIoT/SP does not check the nonce (this is necessary to avoid potential deadlocks). However, replay attack based on these messages is not possible, because SimpleIoT/SP does not allow NLW to decrease, and therefore all replay packets will be ignored by SimpleIoT/SP.

Therefore, SimpleIoT/SP is secure (because of EAX and AES128 being secure) and also provides protection from replay attacks.

4. Scenarios

4.1. Normal packet processing

Two devices, A and B, participate in packet exchange. Each packet sent is encrypted and authenticated in a way to both guarantee packet integrity and protect from replay attacks. Each packet received has a respective authentication data. Correspondingly, when an HLP packet is being prepared for sending, it is encrypted by an encryption key known to both communication peers, and authentication data is added. It is important that a nonce used for encryption/authentication could be recognized as such (that is, as a value actually used once) by the other communication peer. This is achieved by using Nonce to use For Sending (NFS) on the sending side and Nonce Lower Watermark (NLW) on receiving side.

4.1.1. How NFS / NLW pair works

To avoid replay attacks nonces are commonly used to distinguish between an original message and a message with otherwise the same content that is being replayed. A problem with nonces is to check that a particular value is actually new and has not yet been used ever before. To address this problem, SimpleIoT/SP treats VP of nonces as numerical values and compares a nonce VP from a received packet with a current value of the NLW. If the value of nonce VP is greater than a current value of the NLW, the nonce is considered as new; in this case the value of NLW is set to the value of the nonce VP, and its reuse becomes impossible.

To be economical with the set of values that are greater than a current value of NLW (within a certain range), it is desired that a value of a new nonce VP received be as close (from above) to NLW as possible, ideally, greater by 1. NFS is used to keep track of nonces on the sending side. Initially (for example, at the same time when secret keys are exchanged between the sides) communication partners set NLW on receiving side to the same value as NFS on sending side (namely, NLW = 0, and NFS = 0). Before a new packet is being sent, NFS is incremented, and packet nonce VP is set to a value of NFS. On the receiving side, upon reception of the packet, the value of NLW will become the value of the nonce VP, that is, again equal to NFS on the sending side. The process may be continued until all space of NFS/NLW values is exhausted.
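The NFS/NLW mechanism described above can be sketched in a few lines. The struct and function names are hypothetical, both counters are assumed to start at 0 as in the text, and nonce exhaustion is deliberately not handled (the specification leaves it as a TODO).

```c
#include <stdint.h>

typedef struct { uint64_t nfs; } sp_sender;   /* Nonce to use For Sending */
typedef struct { uint64_t nlw; } sp_receiver; /* Nonce Lower Watermark */

/* Sender: increment NFS and use the new value as the packet's nonce VP. */
uint64_t sp_next_nonce_vp(sp_sender *s)
{
    s->nfs += 1;
    return s->nfs;
}

/* Receiver: accept the nonce VP only if it is greater than NLW, then
 * raise NLW so the same value can never be accepted again.
 * Returns 1 if accepted, 0 if the nonce is old. */
int sp_accept_nonce_vp(sp_receiver *r, uint64_t vp)
{
    if (vp <= r->nlw)
        return 0;
    r->nlw = vp;
    return 1;
}
```

In normal operation each accepted packet leaves NLW on the receiving side equal to NFS on the sending side, and a replayed packet fails the comparison.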

TODO: Nonce Exhaustion/Overflow handling

4.2. Processing packet with an obsolete nonce

If a packet is internally valid, but its nonce VP is less than or equal to a current value of NLW, it may indicate that states of the communication peers are out of sync (and not necessarily that a third party attack is detected). In this case, to resynchronize communication process an Error “Old Nonce” Message is formed with the lowest possible acceptable nonce VP, and a packet with this message is sent to a communication partner.

If an Error “Old Nonce” Message is received, the receiving party compares its NFS with the lowest possible value of the nonce within the message, and if NFS is less than that value, NFS is set to the value specified in the message; using such a value of NFS for sending packets will ensure that the packet will pass the NLW test at the receiving party.
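The resynchronization step amounts to taking a maximum. A minimal sketch is given below; the function names are hypothetical, and since the exact message format is still a TODO in the specification, the message is represented only by the nonce value it carries.

```c
#include <stdint.h>

/* Side that detected the old nonce: lowest nonce VP it would still
 * accept, as reported in the Error "Old Nonce" Message (NLW plus 1). */
uint64_t sp_lowest_acceptable_vp(uint64_t nlw)
{
    return nlw + 1;
}

/* Side that receives the Error "Old Nonce" Message: NFS is raised to the
 * reported value if it is currently lower, and never decreased. */
uint64_t sp_resync_nfs(uint64_t nfs, uint64_t lowest_acceptable_vp)
{
    return (nfs < lowest_acceptable_vp) ? lowest_acceptable_vp : nfs;
}
```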

TODO: exact format of ‘Error “Old Nonce” Message’

5. SP padding

5.1. SimpleIoT/SP data under encryption and payload

SimpleIoT/SP data under encryption is organized as follows:

| First Byte | (opt) complementary size | byte sequence | (opt) padding |

where:

  • First Byte is a 1 byte field that is treated as follows:

    • MSB bit: padding size flag, which is set to 1, if padding is present, and 0 otherwise. Presence of padding implies presence of padding size field as well.
    • Remaining 7 bits: a part of payload.
  • complementary size: SimpleIoT Encoded-Unsigned-Int<max=2> variable-size field, as described in SimpleIoT Protocol Stack; this field is present only if padding size flag is set; in this case the field contains encoded value of a sum of the size of this field and the size of padding (if any). If Encoded-Unsigned-Int has an invalid value (as defined in SimpleIoT Protocol Stack), then SimpleIoT/SP receiving side MUST treat such a packet as invalid (as one which didn’t pass internal validation). Note: unless “enforced padding” (see below) is used, SimpleIoT/SP pads data only to the block size; it means that unless “enforced padding” is used, padding size is always <= 15, and therefore Encoded-Unsigned-Int cannot be longer than 1 byte.

  • byte sequence: variable size field; data that is defined by a higher level protocol.

  • padding: variable size field; this field is present only if padding size flag is set and complementary size represents a value greater than 1; contains padding up to a target size.

Correspondingly, SimpleIoT/SP payload consists of:

  • Remaining 7 bits of the First Byte
  • byte sequence

A higher-level protocol is free to use the “partial byte” (7 bits) of SimpleIoT/SP payload, or to ignore it; however, this “partial byte” might be useful, for example, to store some bitflags of the higher-level protocol, which may save 1 byte of payload.
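The layout above can be sketched as follows. This is a minimal illustration, not the reference implementation. It covers only padding to the 16-byte block size (no enforced padding), it assumes that an Encoded-Unsigned-Int value in the range 1..15 is encoded as a single byte equal to the value (the actual encoding is defined in SimpleIoT Protocol Stack), and it uses os.urandom as a stand-in for the Non-Key Random Stream. The function names are hypothetical.

```python
import os

BLOCK = 16  # AES block size in bytes


def build_plaintext(flags7: int, byte_sequence: bytes, rand=os.urandom) -> bytes:
    """Lay out | First Byte | (opt) complementary size | byte sequence | (opt) padding |."""
    if not 0 <= flags7 < 0x80:
        raise ValueError("only 7 payload bits fit into the First Byte")
    unpadded = 1 + len(byte_sequence)
    comp = (-unpadded) % BLOCK  # bytes missing up to the next block boundary
    if comp == 0:
        # Already block-aligned: padding size flag is 0, no optional fields.
        return bytes([flags7]) + byte_sequence
    # comp counts the 1-byte size field itself plus (comp - 1) padding bytes.
    # ASSUMPTION: values 1..15 are encoded as one byte equal to the value.
    return bytes([0x80 | flags7, comp]) + byte_sequence + rand(comp - 1)


def parse_plaintext(data: bytes):
    """Return (7-bit partial byte, byte sequence); raise ValueError if invalid."""
    if not data:
        raise ValueError("empty data")
    first = data[0]
    if not first & 0x80:
        return first & 0x7F, data[1:]
    if len(data) < 2:
        raise ValueError("missing complementary size")
    comp = data[1]
    if comp == 0 or comp >= 0x80 or comp > len(data) - 1:
        raise ValueError("invalid complementary size")
    end = len(data) - (comp - 1)  # strip the padding bytes
    return first & 0x7F, data[2:end]
```

For example, a 5-byte byte sequence yields a 16-byte block in which the complementary size is 10 (1 byte for the size field plus 9 bytes of padding).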

5.2. SimpleIoT/SP padding data

SimpleIoT/SP padding data MUST be generated using Non-Key Random Stream as described in SimpleIoT Random Number Generation and Key Generation.

5.3. SimpleIoT/SP enforced padding

In certain scenarios, some information might be extracted from the packet length even though information is encrypted. To support the cases when this is important, SimpleIoT/SP supports a concept of “enforced padding”, which works as follows:

  • When sending an HLP, a high-level protocol is allowed to specify enforce-pad-to. For each packet length len, SimpleIoT/SP guarantees that all HLPs which either have their own size = len and are sent without enforce-pad-to, or are sent with enforce-pad-to = len, produce SimpleIoT/SP packets of exactly the same length (thereby preventing any length-based information leak).

To implement it, on receiving such a request SimpleIoT/SP MUST do the following:

  • check that enforce-pad-to is greater than or equal to the size of the packet itself. TODO: specify what to do if it is not (probably different for Master and Slave)
  • calculate required-size, the size of the SimpleIoT/SP packet which an HLP with a size of enforce-pad-to would produce
  • calculate the size of enforced-padding for current packet (so that SimpleIoT/SP packet produced from current packet, would have size= required-size)
  • pad packet, using calculated enforced-padding, and producing ‘enforced-padded’ SimpleIoT/SP packet
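The size calculation in these steps can be sketched as follows. This is only an illustration under stated assumptions: the HLP “size” is taken to be the length of the byte sequence, the packet overhead is 6 bytes of nonce plus 16 bytes of tag, and the function names are hypothetical. The result is the combined size of the complementary size field and the padding; how that value is encoded is outside this sketch.

```python
NONCE_LEN = 6   # bytes
TAG_LEN = 16    # bytes
BLOCK = 16      # AES block size in bytes


def sp_packet_size(hlp_len: int) -> int:
    """Size of the SP packet produced by an HLP of hlp_len bytes without enforced padding."""
    blocks = -(-(1 + hlp_len) // BLOCK)  # ceiling division; +1 for the First Byte
    return NONCE_LEN + TAG_LEN + blocks * BLOCK


def enforced_complementary_size(hlp_len: int, enforce_pad_to: int) -> int:
    """Bytes (size field plus padding) to add so the packet matches required-size."""
    if enforce_pad_to < hlp_len:
        # The specification leaves this case as a TODO; raising is an assumption.
        raise ValueError("enforce-pad-to is smaller than the packet itself")
    required_size = sp_packet_size(enforce_pad_to)
    return required_size - NONCE_LEN - TAG_LEN - (1 + hlp_len)
```

With these assumptions, a 5-byte HLP sent with enforce-pad-to = 40 and a 40-byte HLP sent without it both end up as 70-byte SP packets.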

TODO: specify handling of enforce-pad-to for the layers between SimpleIoT/SP and SimpleIoT/CCP.

6. SP state

For its operations SimpleIoT/SP keeps the following state on both sides of communication:

  • Nonce Lower Watermark (NLW)
  • Nonce to use For Sending (NFS)

7. Events

There are three events that SimpleIoT/SP processes:

  1. receiving an SP packet from the communication peer
  2. receiving a packet from a higher level protocol (HLP packet)
  3. receiving a request from a higher level protocol for a nonce variable part (nonce VP)

7.1. Receiving an HLP packet

A packet from a higher level protocol is received together with a nonce VP. Once the received nonce VP has been verified to be numerically greater than NFS, the packet is encrypted and authentication data is added using a new nonce based on the received nonce VP. The resulting SP packet is then passed to the communication peer (using the underlying protocol).

7.2. Receiving an SP packet

An SP packet from the communication peer is received (via underlying protocol). The packet can be:

  • valid new packet, which means that the packet data passed the validation process, and the packet nonce VP is greater than the Nonce Lower Watermark;
  • old-nonce packet, an otherwise valid packet with a nonce VP less than or equal to the Nonce Lower Watermark, which means either de-synchronization in communication or an attack attempt;
  • packet with Error “Old Nonce” Message (intended for SimpleIoT/SP itself);
  • invalid packet, in particular a corrupted packet, an attacker’s packet, etc.

7.3. Receiving a request for nonce VP

A higher level protocol can request a nonce VP, which will later be returned together with an HLP packet for sending to the communication peer. The nonce VP returned must be greater than the current value of NFS.

8. Event processing

Further details of event processing are placed below.

8.1. Receiving an HLP packet

A packet from a higher level protocol is received together with a nonce VP. Nonce VP is compared to the current value of NFS.

  • Nonce VP is less than or equal to NFS: no processing is done and an error is reported [TODO: should we provide more details on what such error should result in]
  • Nonce VP is greater than NFS: NFS is set to the value of nonce VP; the HLP packet is encrypted and authenticated using a new nonce based on the received nonce VP to form an SP packet. This SP packet is sent to the communication peer using the underlying protocol.

8.2. Receiving an SP packet

On receipt of an SP packet, first, an intra-packet authentication is performed as follows:

  • TODO!

Then:

  • if intra-packet authentication has failed: the packet is silently dropped as being either corrupted or an attacker’s packet;

  • if intra-packet authentication is passed: it can be either an error message packet directed to SP itself, or a “regular” packet with payload intended for a higher level protocol.

    • if a packet is with Error Old Nonce Message [+++structure and detection]: packet nonce VP is not compared to NLW (reason: replay attack is impossible since NFS cannot be decreased as a result of this message, and performing comparison may lead to a deadlock); a value of the lowest possible valid nonce from the packet is compared to the current value of NFS.

      • if NFS is less than the value of the lowest possible valid nonce: NFS is set to the value of the lowest possible valid nonce.
      • if NFS is greater than or equal to the value of the lowest possible valid nonce: NFS is not changed; the packet is ignored.
    • if the packet is not an Error Old Nonce Message: packet nonce VP is compared to the Nonce Lower Watermark (NLW). Two cases are possible:

      • if nonce VP is less than or equal to NLW: a packet with Error Old Nonce Message is prepared with the lowest possible valid nonce set to a current value of NLW; the packet is authenticated and sent to the communication peer.
      • if nonce VP is greater than NLW: a new packet is received: NLW is set to the value of nonce VP of the received packet; LRPS is set to packet signature [TODO: check whether we use it elsewhere]; an HLP packet with payload of the received packet is passed to the higher level protocol together with the nonce VP of the packet nonce.
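The decision logic above can be sketched as follows. This is a minimal sketch, not the normative algorithm: intra-packet authentication is still a TODO in this document, so its result is passed in as a flag; LRPS handling is omitted; and the function name, argument names and returned action labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpState:
    nlw: int = 0  # Nonce Lower Watermark
    nfs: int = 0  # Nonce to use For Sending


def on_sp_packet(state, authenticated, nonce_vp, old_nonce_value=None):
    """Return (action, value) for a received SP packet.

    old_nonce_value is the "lowest possible valid nonce" carried by an
    Error "Old Nonce" Message, or None for a regular packet.
    """
    if not authenticated:
        return ("drop", None)  # corrupted or attacker's packet
    if old_nonce_value is not None:
        # Error message: nonce VP is deliberately NOT compared to NLW.
        if state.nfs < old_nonce_value:
            state.nfs = old_nonce_value
            return ("nfs_updated", old_nonce_value)
        return ("ignore", None)
    if nonce_vp <= state.nlw:
        # Peer is out of sync (or replaying): report the current NLW.
        return ("send_old_nonce_error", state.nlw)
    state.nlw = nonce_vp
    return ("deliver", nonce_vp)  # payload goes to the higher level protocol
```

Note that NFS can only grow as a result of an Error “Old Nonce” Message, which is why skipping the NLW comparison for such messages does not open a replay path.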

TODO!: sending packets (encryption etc.)

8.3. Receiving a request for nonce VP

A nonce VP is generated based on the current value of NFS so that the numerical value of the nonce VP is greater than the numerical value of NFS. Such generation can be as simple as the numerical value of NFS plus 1.
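The sending side (the nonce VP request here, together with the HLP packet check in 8.1) can be sketched as follows. This is an illustrative sketch only; the class and method names are hypothetical, the tracked value is the sending-side nonce from section 6 (NFS), and encryption is left out because it is not yet specified in this document.

```python
class SpSender:
    """Tracks the sending-side nonce value (NFS in section 6)."""

    def __init__(self, nfs: int = 0):
        self.nfs = nfs

    def request_nonce_vp(self) -> int:
        # Simplest generation rule mentioned in the text: current value plus 1.
        return self.nfs + 1

    def on_hlp_packet(self, nonce_vp: int) -> int:
        """Accept an HLP packet and return the nonce VP to encrypt it with."""
        if nonce_vp <= self.nfs:
            # The text only says "an error is reported"; raising is an assumption.
            raise ValueError("nonce VP is not greater than the stored value")
        self.nfs = nonce_vp
        return nonce_vp
```

A nonce VP obtained from request_nonce_vp() can be used exactly once: a second packet carrying the same value is rejected.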

9. Payload Size and SP Packet Size

As SimpleIoT/SP uses a 48-bit (= 6 bytes) nonce, a block cipher (AES128) with a block size of 128 bits (= 16 bytes), and a tag size chosen as the maximum of 128 bits (= 16 bytes), the SimpleIoT/SP packet size is always 6+16+k*16 = 22+k*16 bytes, where k >= 1.

The following table shows relations between SP packet sizes and SP payload [2] not including “remaining 7 bits” part (that is, a size of byte sequence part only):

| SimpleIoT/SP packet size, bytes | SimpleIoT/SP payload, bytes    |
| 38                              | 7bits+0bytes to 7bits+15bytes  |
| 54                              | 7bits+16bytes to 7bits+31bytes |
| 70                              | 7bits+32bytes to 7bits+47bytes |
| 86                              | 7bits+48bytes to 7bits+63bytes |
| 102                             | 7bits+64bytes to 7bits+79bytes |
| 118                             | 7bits+80bytes to 7bits+95bytes |

[2] Note that SimpleIoT/SP payload is not the same as, say, SimpleIoT/GDP payload or SimpleIoT/CCP payload: for example, if SimpleIoT/GDP lies right on top of SimpleIoT/SP, then SIoT_GDP_Payload_Size = SIoT_SP_Payload_size - Size_of_SIoT_GDP_Headers.
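The table can be reproduced from the formula 22+k*16 with a few lines of arithmetic. The sketch below is only an illustration; the function names are hypothetical, and “payload” here means the byte sequence part (the 7 bits of the First Byte are always available in addition).

```python
OVERHEAD = 6 + 16  # nonce + tag, bytes
BLOCK = 16         # AES block size, bytes


def sp_packet_size(byte_sequence_len: int) -> int:
    """Packet size for a given byte sequence length (First Byte adds 1 byte)."""
    k = -(-(1 + byte_sequence_len) // BLOCK)  # ceiling division, k >= 1
    return OVERHEAD + k * BLOCK


def payload_range(packet_size: int):
    """(min, max) byte sequence length that produces the given packet size."""
    k, rem = divmod(packet_size - OVERHEAD, BLOCK)
    if rem or k < 1:
        raise ValueError("not a valid SimpleIoT/SP packet size")
    return (k - 1) * BLOCK, k * BLOCK - 1


if __name__ == "__main__":
    for size in range(38, 119, 16):
        low, high = payload_range(size)
        print(f"{size:>4}  7bits+{low}bytes to 7bits+{high}bytes")
```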

10. Implementation notes

10.1 Incrementing nonces

For SimpleIoT/SP security, it is critical that nonces are never re-used and are always incremented (never going back). Therefore, implementation MUST enforce it (both for sending side and for receiving side).

10.1.1 Basic Implementation

Basic secure implementation is rather simple:

  • Whenever a new packet is sent, an updated value of NFS MUST be saved and committed to persistent storage; this commit MUST be performed before the packet is actually sent over the air. This is necessary to keep EAX security guarantees.
  • Whenever a packet with status “new” is received, an updated value of NLW MUST be saved and committed in persistent storage; this commit MUST be performed before further message processing. This is necessary to avoid using an obsolete value of NLW in case of “dirty” reboot (and thus to avoid a potential for replay attacks).

10.1.2 Optimized Implementation

In cases where the basic secure implementation is too resource-intensive (causing too many writes to persistent storage, which can be undesirable, in particular for EEPROM), the following optimizations MAY be used without affecting security. Note that the optimizations described below are acceptable if and only if all of the steps are implemented (or none is implemented, falling back to the basic scheme described above): [TODO: check that boundary handling (‘<’ vs ‘<=’ etc. etc.) is described correctly]

  • On program start:
    • both NFS and NLW are read from the persistent storage, and stored into the RAM (as ‘Current_NFS’ and ‘Current_NLW’ respectively).
    • both NFS and NLW in persistent storage are incremented by a certain value DELTA; this change MUST be committed to persistent storage before any further processing. The value of DELTA can be, for example, 100; DELTA SHOULD NOT be too large, as having it too large, combined with frequent “dirty” reboots, may cause exhaustion of nonce space.
    • These incremented values are also stored in RAM (as ‘Last_NFS’ and ‘Last_NLW’).
  • Whenever a new value of NFS is needed (for the reasons stated above), if ‘Current_NFS’ is less than ‘Last_NFS’, then the new value of NFS is taken as ‘Current_NFS’ and ‘Current_NFS’ is incremented in RAM. This is ok from a security perspective, because in case of a “dirty reboot” NFS will still be increased, and never repeated.
  • Whenever a new value of NFS is needed (for the reasons stated above), and ‘Current_NFS’ is greater than or equal to ‘Last_NFS’, then:
    • NFS in persistent storage is incremented by DELTA (or other similar value); this new value MUST be committed to persistent storage before proceeding further
    • ‘Last_NFS’ is set to the new value of NFS in persistent storage
    • ‘Current_NFS’ is returned as the new NFS value, and then incremented
  • Whenever a new value of NLW is needed (for the reasons stated above), if ‘Current_NLW’ is less than ‘Last_NLW’, then the new value of NLW is taken as ‘Current_NLW’ and ‘Current_NLW’ is incremented in RAM. This is ok from a security perspective, because in case of a “dirty reboot” NLW will still be increased, and never repeated. Using such a policy for NLW might cause an extra ‘Error “Old Nonce” Message’, but this situation will be quickly recovered from.
  • Whenever a new value of NLW is needed (for the reasons stated above), and ‘Current_NLW’ is greater than or equal to ‘Last_NLW’, then:
    • NLW in persistent storage is incremented by DELTA (or other similar value); this new value MUST be committed to persistent storage before proceeding further
    • ‘Last_NLW’ is set to the new value of NLW in persistent storage
    • ‘Current_NLW’ is returned as the new NLW value, and then incremented
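The reservation scheme above can be sketched as follows, shown for the sending-side nonce only. This is a minimal sketch under stated assumptions: DictStorage is a hypothetical in-memory stand-in for EEPROM, ReservedCounter is a hypothetical name, and the boundary handling follows the text literally (the text itself carries a TODO about boundaries).

```python
DELTA = 100  # example value from the text


class DictStorage:
    """Hypothetical stand-in for persistent storage; counts commits."""

    def __init__(self):
        self.values = {}
        self.commits = 0

    def read(self, key):
        return self.values.get(key, 0)

    def write_and_commit(self, key, value):
        self.values[key] = value
        self.commits += 1  # a real device would flush to EEPROM here


class ReservedCounter:
    def __init__(self, storage, key, delta=DELTA):
        self.storage, self.key, self.delta = storage, key, delta
        # On program start: read the stored value, then reserve DELTA values.
        self.current = storage.read(key)
        self.last = self.current + delta
        storage.write_and_commit(key, self.last)

    def next(self):
        if self.current >= self.last:
            # Reservation exhausted: reserve another DELTA before handing out.
            self.last += self.delta
            self.storage.write_and_commit(self.key, self.last)
        value = self.current
        self.current += 1
        return value
```

After a “dirty” reboot a new ReservedCounter starts from the last committed value, which is always greater than any value already handed out, so values are skipped but never repeated. With DELTA = 100 this costs roughly one persistent write per 100 nonces.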

10.1.3 Restoring from Backup

Whenever an actor-implementing-SimpleIoT/SP (such as “SimpleIoT Client” or “SimpleIoT Device”) is restored from backup, it MUST take care to avoid duplicate nonces, in particular:

  • amount of time dT (in seconds) between backup and restore MUST be calculated
  • if dT is less than min-backup-restore-time, it MUST be set to min-backup-restore-time; normally min-backup-restore-time should be set to a value such as 24 hours.
  • if dT is larger than max-backup-restore-time, restore SHOULD be interrupted, the problem SHOULD be explained to the person who’s performing restore, and confirmation SHOULD be obtained before proceeding. This is intended to prevent restores with erroneous clock, which might lead to the erroneous exhaustion of the nonce space. Normally, max-backup-restore-time should be set to a value such as 30*24 hours.
  • both NLW and NFS, as stored in persistent storage, MUST be increased by a number equal to: dT*max_number_of_packets_per_second. This increased number MUST be stored and committed to persistent storage before proceeding further. Here, max_number_of_packets_per_second is a constant estimating the maximum feasible number of packets which might be sent per second; in general, it depends on the higher-level protocols, but for basic SimpleIoT/CCP it usually can be taken between 100,000 (1e5) and 1,000,000 (1e6).
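The restore arithmetic can be sketched as follows. This is only an illustration: the constants use the example values from the text (24 hours and 30*24 hours), the default of 1e5 packets per second is the lower bound quoted above, and the function name and the confirmed flag (standing in for the operator's confirmation) are hypothetical.

```python
MIN_BACKUP_RESTORE_TIME = 24 * 3600        # seconds, example value from the text
MAX_BACKUP_RESTORE_TIME = 30 * 24 * 3600   # seconds, example value from the text


def restore_nonce_increment(dt_seconds: int,
                            max_packets_per_second: int = 100_000,
                            confirmed: bool = False) -> int:
    """Amount to add to both stored nonce values when restoring from backup."""
    dt = max(dt_seconds, MIN_BACKUP_RESTORE_TIME)
    if dt > MAX_BACKUP_RESTORE_TIME and not confirmed:
        # Possibly an erroneous clock: stop and ask the operator first.
        raise RuntimeError("backup is older than max-backup-restore-time")
    return dt * max_packets_per_second
```

For example, a restore performed one minute after the backup is still treated as 24 hours old and advances both values by 8,640,000,000.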

SimpleIoT Random Number Generation and Key Generation

Version:0.1a

NB: this document relies on certain terms and concepts introduced in SimpleIoT Protocol Stack document, please make sure to read it before proceeding.

Random Number Generation is vital for ensuring security. This document describes requirements for Random Number Generation for SimpleIoT Devices.

Poor-Man’s PRNG

Each device with Poor-Man’s PRNG has its own AES-128 secret key (this key MUST NOT be stored outside of the device), and additionally keeps a counter. This counter MUST be kept in a way which guarantees that the same value of the counter is never reused; this includes both having a counter of sufficient size, and proper commits to persistent storage to avoid re-use of the counter in case of accidental device reboot. As for commits to persistent storage, two such implementations are discussed in the SimpleIoT Security Protocol (SimpleIoT/SP) document, in the ‘Implementation Notes’ section, with respect to storing nonces.

Then, Poor-Man’s PRNG simply encrypts current value of the counter with AES-128, increments counter (see note above about guarantees of no-reuse), and returns encrypted value of the counter as next 16 bytes of the random output.
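The structure of this generator can be sketched as follows. The Python standard library has no AES, so the block cipher is injected as a callable; the toy_block_encrypt function below is NOT AES, it is a hypothetical stand-in (HMAC-SHA256 truncated to 16 bytes) used only to keep the sketch runnable. On a real Device, AES-128 encryption of the 16-byte counter block would be plugged in instead. All names are assumptions made for this illustration.

```python
import hashlib
import hmac


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES-128 encryption of one block. NOT AES; illustration only."""
    return hmac.new(key, block, hashlib.sha256).digest()[:16]


class PoorMansPrng:
    def __init__(self, key, counter, encrypt_block, commit_counter):
        self.key = key                        # secret key, never leaves the device
        self.counter = counter                # value restored from persistent storage
        self.encrypt_block = encrypt_block    # AES-128 on a real device
        self.commit_counter = commit_counter  # persists the counter

    def random16(self) -> bytes:
        value = self.counter
        self.counter += 1
        # Commit BEFORE releasing output, so a reboot can never reuse `value`.
        self.commit_counter(self.counter)
        return self.encrypt_block(self.key, value.to_bytes(16, "big"))
```

Committing on every call corresponds to the basic implementation in the SimpleIoT/SP implementation notes; the optimized reservation scheme described there could be used for the counter as well.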

Devices with uniquely-pre-initialized Poor-Man’s PRNG

Resource-constrained SimpleIoT Devices which don’t have their own crypto-safe RNG, MUST use Poor-Man’s PRNG. On such Devices, Poor-Man’s PRNG MUST be pre-populated during Device manufacturing, with a random key and random initial counter, generated outside of Device. Both key and counter MUST be crypto-safe random numbers, and MUST be statistically unique for each Device.

Devices with hardware-assisted Fortuna

If Device doesn’t have a uniquely-pre-initialized Poor-Man’s PRNG, the following approach based on hardware-assisted Fortuna PRNG, MAY be used (ONLY for certain types of Devices, see ‘Restrictions for Secure and non-Secure Devices’ section below). For such Devices with hardware-assisted Fortuna, the following conditions MUST be met:

  • Device MUST implement Fortuna PRNG, with multiple entropy sources to feed to Fortuna as described below
    • details and implementation options are specified below
    • Device MUST comply with seed file requirements as specified below
  • Device MUST implement hardware entropy gathering, and RNG additional seeding procedure, as described below

Fortuna Implementation in SimpleIoT

There are two approaches to implementing Fortuna in SimpleIoT: ‘Radical’ and ‘Conservative’. ‘Radical’ is not strictly compliant with the Fortuna description from [Fortuna], but we feel it should perform significantly better for our special circumstances. ‘Conservative’ is fully compliant with the description in [Fortuna], with really minor tweaks (within the spirit of Fortuna) to reduce resource requirements. Currently, and until it is shown otherwise, both implementations are acceptable for SimpleIoT.

In any case, pool size for SimpleIoT Fortuna implementations is 128*3 bytes; effectively it means that we’re making a guesstimate that each event (encoded as 3 bytes per ADDRANDOMEVENT() description) carries one bit of entropy.

Currently, Fortuna implementation is estimated to require 32 (state of first SHA256 in SHAd256)+32 (state of second SHA256 in SHAd256)+64 (512-bit chunk buffer) = 128 bytes per pool, plus 32 bytes regardless of pools (generator state).

‘Radical’ SimpleIoT Fortuna

‘Radical’ SimpleIoT Fortuna has the following changes from [Fortuna] description:

  • only one pool is used. Rationale: for a generic PC-based RNG this may be seen as a major deficiency, but we feel that for SimpleIoT purposes, where the most important random generation (the one for pairing purposes) is ‘imminent’, long-term recovery is of significantly less interest than making key material really random. Under these circumstances, spreading entropy across multiple pools, where it won’t be used for imminent security-critical key generation, is considered a waste.

‘Conservative’ SimpleIoT Fortuna

‘Conservative’ SimpleIoT Fortuna has the following changes from [Fortuna] description:

  • for non-Secure SimpleIoT Devices, number of pools MAY be reduced to 16 (from 32 in original Fortuna); for Secure SimpleIoT Devices number of pools MUST be at least 24
  • minimum time between reseeds MUST be increased to 1 minute (from 100ms in original Fortuna). Rationale: given our limited entropy sources and rare events, we’re not likely to get 128 bits of entropy more frequently anyway

These changes bring time-needed-for-attacker-to-exhaust-pools from 13 years as in original Fortuna, down to 1.5 months; we feel that this number is prudent enough for non-Secure devices. For Secure Devices 24 pools with 1 minute minimum reseeds, provide 31 years.
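The figures quoted here are consistent with estimating the exhaustion time as 2^pools multiplied by the minimum time between reseeds. The sketch below checks that arithmetic; the formula is an inference from the numbers in the text, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def exhaust_seconds(pools: int, min_reseed_interval_s: float) -> float:
    """Estimated time until every pool has been used, assuming 2**pools reseeds."""
    return (2 ** pools) * min_reseed_interval_s


def exhaust_years(pools: int, min_reseed_interval_s: float) -> float:
    return exhaust_seconds(pools, min_reseed_interval_s) / SECONDS_PER_YEAR


def exhaust_days(pools: int, min_reseed_interval_s: float) -> float:
    return exhaust_seconds(pools, min_reseed_interval_s) / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"32 pools, 100 ms: {exhaust_years(32, 0.1):.1f} years")
    print(f"16 pools, 1 min:  {exhaust_days(16, 60):.1f} days")
    print(f"24 pools, 1 min:  {exhaust_years(24, 60):.1f} years")
```

This gives roughly 13.6 years for original Fortuna, about 45.5 days (1.5 months) for 16 pools with 1-minute reseeds, and about 31.9 years for 24 pools with 1-minute reseeds.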

Fortuna Seed File

[Fortuna] specifies a 64-byte ‘seed file’ to keep Fortuna state between reboots. SimpleIoT Fortuna implementations MUST implement a ‘seed file’ (normally in EEPROM), with all atomicity requirements specified in [Fortuna]. If the ‘seed file’ cannot be read on Device start, then Device MUST perform the following (depending on Device ‘pairing state’ as described in SimpleIoT Pairing document):

  • if Device is in PRE-PAIRING state, necessary entropy will be gathered during normal “pairing” procedure, so Fortuna may start without seed file.
  • if Device is in PAIRING-MITM-CHECK state, Device MUST switch to PRE-PAIRING state and require “pairing” to be repeated (TODO: provide appropriate Client-side errors and user messages)
  • if Device is in PAIRING-COMPLETED state, Device MUST use “SimpleIoT/CCP Entropy Recovery” procedure as described in SimpleIoT Command and Control Protocol (SimpleIoT/CCP) document (this procedure is different from “Entropy Gathering” procedure used as a part of “pairing”); in practice, it MAY be sufficient to get a single entropy recovery packet to re-initialize Fortuna (as it is after-pairing, packet is transferred encrypted, so there is no risk for it to be known to adversary; also, if key material will be needed, Fortuna will be fed with additional entropy which is sufficient for such generation, according to SimpleIoT Pairing).

Fortuna ‘seed file’ MUST be written before any MCUSLEEP operation (TODO: what if MCUSLEEP is memory-preserving?), and MUST be written at least every 10 minutes of Device operation.

Fortuna uniquely-pre-initialized seed file

To improve security, a Device MAY be pre-populated with a Fortuna seed file during manufacturing; if implemented, this seed file MUST be a file consisting of 64 random crypto-safe bytes. Presence of a uniquely-pre-initialized “seed file” does NOT ease any of the other requirements for Fortuna and/or random number generation.

Device Operation for Devices with hardware-assisted Fortuna

NB: when “feeding entropy to Fortuna”, exact bit representation doesn’t matter, as long as all the data bits are fed to ADDRANDOMEVENT() Fortuna function

  • Device MUST have at least one MCU ADC channel which is either connected to an entropy source (such as Zener diode, details TBD), or just being not connected at all. This ADC is named “noise ADC”
    • it is acceptable to disconnect ADC channel only temporarily (for example, using an analogue switch); in this case, ADC channel MUST be disconnected for the whole duration of RNG additional seeding (i.e. it is not acceptable to disconnect it only for one measurement and to connect it back right afterwards).
  • During each “pairing” (IMPORTANT: it applies to any “pairing”, not just first “pairing”), the following procedure of RNG additional seeding MUST be performed:
    • When pairing procedure starts, Device MUST initialize two internal variables (Network-Time-Change-Count and ADC-Change-Count) as zeros
    • Device MUST implement “Entropy Gathering” procedure as defined in SimpleIoT Pairing document
    • On receiving each packet with entropy, Device MUST:
      • feed received ENTROPY to the Fortuna (NB: this ENTROPY is not really required, but it costs pretty much nothing to add it, and in case if attacker missed at least a part of the exchange, it certainly improves security, even if all the hardware entropy data turns out to be 100% deterministic, which shouldn’t really happen, but...)
      • feed entropy which is based on pseudo-measured time since the request has been sent, with at least 1 µs precision; for the purposes of pseudo-measurement of time, exact time isn’t important; what is important is that two different times with a 1 µs difference produce two different results with a probability of at least 50%.
        • in particular, time MAY be pseudo-measured using “tight loops” (increment-pseudo-time-check-packet-arrival-repeat-until-packet-arrives), provided that the 1 µs requirement is satisfied (i.e. that “tight loop” time is less than 1 µs, i.e. tight-loop-clock-count / MCU-frequency < 1 µs). Device MAY perform some non-time-measured operations (for example, some measurements and/or calculations) after sending a packet and before going into the time-pseudo-measuring “tight loop”, as long as maximum-possible-time-before-tight-loop < minimum-possible-packet-round-trip-time.
        • if pseudo-measured time is different from last pseudo-measured time, increment Network-Time-Change-Count. NB: even if Network-Time-Change-Count is not incremented, time data SHOULD still be fed to Fortuna PRNG
        • additionally, if another independent timer (such as WDT on AVR) is available, it SHOULD be read on packet arrival, and the data from the timer SHOULD be fed to Fortuna PRNG
    • in addition, if bare-metal implementation is used, whenever an interrupt happens (this includes interrupt on receiving packets, and/or any other interrupts), Device SHOULD feed “program-counter-before-interrupt has been called” (which is usually readily available as [SP-some_constant], and usually has 1 or more bits of entropy if the MCU is actively running at the moment) to Fortuna PRNG.
      • regardless of handling interrupts in such a manner, Device still MUST pseudo-measure time in a tight loop as described above
      • in addition, if another independent timer (such as WDT on AVR) is available, it SHOULD be read on all the interrupts, and the data from the timer SHOULD be fed to Fortuna PRNG. If independent timer is read-and-fed-to-Fortuna on interrupt, and all packet arrivals are handled via interrupts, then independent timer SHOULD NOT be read-and-fed-to-Fortuna outside of interrupt (tight-loop pseudo-measure of time outside of interrupt is still necessary)
      • to pass entropy from interrupt handler to Fortuna, entropy MAY be combined within different calls to interrupt handlers; in particular, the entropy MAY be accumulated via XOR-ing (with or without rotations, or using some other mixing function which doesn’t affect bit balance; good mixing functions examples include addition/subtraction modulo 2^n, XOR, rotations, CRC functions, and crypto hash functions; bad examples include AND, OR, and shifts without rotations, which may lose information from some bits completely) incoming entropy in a fixed-size buffer until it is atomically-read-and-removed-from-fixed-size-buffer (TODO: is atomicity strictly required here?) outside of the interrupt handler and is fed to Fortuna PRNG. Regardless of mixing function, implementations MUST provide a DEBUG compile-time flag which will ensure that each entropy component is passed separately without any mixing, and is never overwritten until it is read-and-removed; this is necessary to validate that the implementation returns what is expected (PC and/or timer) and to evaluate the amount of entropy they produce.
    • Device MUST continue “Entropy Gathering” procedure at least until Network-Time-Change-Count reaches 250 * number-of-Fortuna-pools.
    • in addition, Device MUST perform measurements of “noise ADC” and feed the results to the Fortuna PRNG
      • on every such measurement, if measurement result is neither maximum nor minimum possible value for the ADC in question (usually, but not necessarily, minimum is all-zeros, and maximum is all-ones), and measurement result doesn’t match previous measurement from “noise ADC”, ADC-Change-Count is incremented. NB: even if ADC-Change-Count is not incremented, entropy still SHOULD be fed to Fortuna PRNG. NB2: “neither maximum nor minimum” requirement effectively rules out using 1-bit ADCs as “noise ADCs”.
      • these measurements MUST be performed in parallel with “Entropy Gathering” network exchange; at least one ADC measurement per “Entropy Gathering” packet MUST be performed; more than one is fine.
    • in addition, Device SHOULD perform measurements of all the other ADCs in the system (e.g. one measurement for each other ADC for one measurement of “noise ADC”) and feed the results to Fortuna PRNG
    • Device MUST continue measurements of “noise ADC” at least until ADC-Change-Count reaches 250 * number-of-Fortuna-pools.
    • if hardware RNG (for example, accessible via a special MCU instruction) is available, Device SHOULD feed its output to Fortuna
    • after both ADC-Change-Count and Network-Time-Change-Count reach 250 * number-of-Fortuna-pools, Device MAY decide to complete RNG additional seeding
    • to complete RNG additional seeding, Device MUST explicitly call Fortuna’s RESEED() (see [Fortuna] for details), and then MUST skip at least TODO bits of Fortuna output
  • Until RNG additional seeding is completed, RNG output MUST NOT be used in any manner
  • after RNG additional seeding is completed, Devices still SHOULD feed all the available entropy (as described above) to the Fortuna PRNG
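The bookkeeping for the two counters in the procedure above can be sketched as follows. This is a simplified sketch: the class and method names are hypothetical, feed stands in for Fortuna's ADDRANDOMEVENT(), the very first sample of each kind is not counted as a change (an assumption, since the text does not cover that case), and the completion threshold of 250 * number-of-Fortuna-pools is taken from the “MUST continue” requirements.

```python
class SeedingProgress:
    def __init__(self, pools, adc_min, adc_max, feed):
        self.target = 250 * pools
        self.adc_min, self.adc_max = adc_min, adc_max
        self.feed = feed  # stand-in for Fortuna's ADDRANDOMEVENT()
        self.network_time_change_count = 0
        self.adc_change_count = 0
        self._last_time = None
        self._last_adc = None

    def on_packet_time(self, pseudo_time):
        self.feed(("time", pseudo_time))  # fed even if the count does not change
        if self._last_time is not None and pseudo_time != self._last_time:
            self.network_time_change_count += 1
        self._last_time = pseudo_time

    def on_noise_adc(self, sample):
        self.feed(("adc", sample))  # fed even if the count does not change
        in_range = self.adc_min < sample < self.adc_max  # neither min nor max
        if in_range and self._last_adc is not None and sample != self._last_adc:
            self.adc_change_count += 1
        self._last_adc = sample

    def may_complete(self):
        return min(self.network_time_change_count,
                   self.adc_change_count) >= self.target
```

Once may_complete() returns True, the Device would still have to call Fortuna's RESEED() and skip the initial output as required above; that part is outside this sketch.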

Fortuna State and re-pairing

When Device is to be re-paired (i.e. Device pairing state is changed to PRE-PAIRING, see SimpleIoT Pairing document for details), Fortuna PRNG state (both seed file and in-memory state) MUST NOT be affected. The only process which MAY rewrite Fortuna persistent state while ignoring the existing Fortuna state, is Device re-programming (but not OtA re-programming).

Devices with hardware RNG

To qualify as a ‘Device with hardware RNG’, Device MUST comply with all the following requirements:

  • Device MUST have a hardware entropy source, which provides a hardware-generated bit stream
  • Device MUST implement on-line testing of hardware-generated bit stream (monobit test, poker test, runs test, and long runs test, as they were specified in FIPS140-2 after Change Notice 1 and before Change Notice 2; testing should be performed on each 20000-bit block before this block is fed to Fortuna). TODO: adaptation to streaming?
  • on-line testing MUST be performed on a bit stream before any cryptographic primitives are applied (but SHOULD be performed after von Neumann bias removal)
  • Device MUST implement Fortuna PRNG (as specified above).
    • this includes implementing Fortuna seed file as described above
  • on the first launch of the Device (i.e. if Fortuna seed file is not present, and Device is in PRE-PAIRING state), at least 3 of hardware-generated bit stream blocks, with on-line test above being successful, MUST be fed to a Fortuna PRNG during Fortuna initialization:
    • until such an initialization is completed, Device MUST NOT be operational
    • bit stream blocks with online test failed, still SHOULD be fed to Fortuna PRNG
    • RNG MUST skip at least first TODO bits of the Fortuna output bit stream (before starting to output Fortuna output as RNG output)
  • Device MUST continue feeding output from hardware entropy source to Fortuna PRNG, without applying the online tests, at a rate of at least 1 bit per second (as long as Device is running during at least some portion of the 1 second and not in a hardware sleep mode)
  • Device SHOULD feed additional available entropy (timings, ADC etc. as described above) to Fortuna PRNG
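Part of the on-line testing above can be sketched as follows: von Neumann bias removal plus two of the four tests (monobit and long runs) on a 20000-bit block. The thresholds (9725 < ones < 10275, and no run of 26 or more identical bits) are quoted from memory of FIPS 140-2 with Change Notice 1 and should be verified against the standard before use. The poker and runs tests are omitted, and all function names are hypothetical.

```python
BLOCK_BITS = 20000


def von_neumann(bits):
    """Bias removal: for each pair, 01 -> 0, 10 -> 1, 00 and 11 are discarded."""
    return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]


def monobit_ok(block):
    """Monobit test: the number of ones must lie strictly between the bounds."""
    if len(block) != BLOCK_BITS:
        raise ValueError("block must contain exactly 20000 bits")
    return 9725 < sum(block) < 10275


def long_run_ok(block):
    """Long runs test: fails if any run of identical bits is 26 bits or longer."""
    if len(block) != BLOCK_BITS:
        raise ValueError("block must contain exactly 20000 bits")
    longest = run = 1
    for prev, cur in zip(block, block[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest < 26


def block_passes(block):
    return monobit_ok(block) and long_run_ok(block)
```

As the requirements above state, a block that fails these tests still SHOULD be fed to Fortuna; the test result only decides whether the block counts towards the blocks required for initialization.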

Restrictions for Secure and non-Secure Devices

non-Secure SimpleIoT Devices MAY use one of the following RNGs (as long as all requirements for respective RNG, as specified above, are complied with):

  • uniquely-pre-initialized Poor-Man’s PRNGs
  • hardware-assisted Fortuna
  • hardware-assisted Fortuna with uniquely-pre-initialized seed file
  • hardware RNG
  • hardware RNG with Fortuna having uniquely-pre-initialized seed file

Secure SimpleIoT Devices MAY use one of the following RNGs (as long as all requirements for respective RNG, as specified above, are complied with):

  • uniquely-pre-initialized Poor-Man’s PRNGs
  • hardware-assisted Fortuna
  • hardware-assisted Fortuna with uniquely-pre-initialized seed file
  • hardware RNG
  • hardware RNG with Fortuna having uniquely-pre-initialized seed file (RECOMMENDED)

SimpleIoT Client (and Devices with Crypto-Safe RNG)

Even if the system where the SimpleIoT stack is running has a supposedly crypto-safe RNG (such as built-in crypto-safe /dev/urandom), SimpleIoT implementations still MUST employ Poor-Man’s PRNG (as described above) in addition to the system-provided crypto-safe PRNG. In such cases, each byte of SimpleIoT RNG (which is provided to the rest of SimpleIoT) SHOULD be an XOR of 1 byte of system-provided crypto-safe PRNG, and 1 byte of Poor-Man’s PRNG.

Rationale. This approach reduces the impact of catastrophic failures of the system-provided crypto-safe PRNG (for example, it would mitigate effects of the Debian RNG disaster very significantly).

To initialize Poor-Man’s RNG on Client side, SimpleIoT implementation MUST NOT use the same crypto-safe RNG whose output will be used for XOR-ing with Poor-Man’s RNG (as specified above); instead, Poor-Man’s RNG on Client side MUST be initialized independently; valid examples of such independent initialization include XOR-ing of at least two sources, such as an independent Fortuna PRNG with user input (timing of typing or mouse movements), or online generators such as ‘raw bytes’ from random.org or from (TODO); IMPORTANT: all exchanges with online generators MUST be over https, and with server certificate validation.

The same procedure SHOULD also be used for generating random data which is used for SimpleIoT key generation.
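The byte-wise XOR combination described in this section can be sketched as follows. This is only an illustration: the function and parameter names are hypothetical, os.urandom stands in for the system-provided crypto-safe RNG, and the Poor-Man’s PRNG is passed in as a callable that returns the requested number of bytes.

```python
import os


def combined_random(n: int, poor_mans_rng, system_rng=os.urandom) -> bytes:
    """Return n bytes, each the XOR of one system RNG byte and one Poor-Man's PRNG byte."""
    a = system_rng(n)
    b = poor_mans_rng(n)
    if len(a) != n or len(b) != n:
        raise ValueError("both sources must return exactly n bytes")
    return bytes(x ^ y for x, y in zip(a, b))
```

The point of the construction is that the output remains unpredictable as long as at least one of the two independent sources is sound, which is why the Poor-Man’s PRNG must not be seeded from the same system RNG.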

Key Generation

This section describes rules for generating keys (and other key material, such as DH random numbers).

For Devices which support OtA Pairing (see SimpleIoT Pairing document for details), key material needs to be generated. For such Devices the following requirements MUST be met:

  • if Device doesn’t have a hardware-assisted Fortuna PRNG:
    • Device MUST implement at least two uniquely-pre-initialized Poor-Man’s PRNGs: one of them (named ‘POORMAN4KEYS’) MUST NOT be used for any purposes except for key generation as described below. Another one (named ‘NONKEYPOORMAN’) is used to produce ‘non-key Random Stream’.
    • in addition, Device MUST have an additional uniquely-pre-initialized key (KEY4KEYS), which MUST NOT be used except for key generation as described below
    • to generate 128 bits of key material, the following procedure applies:
      • calculate output=AES(key=KEY4KEYS,data=POORMAN4KEYS.Random16bytes())
  • if Device does have a hardware-assisted Fortuna PRNG:
    • Fortuna output (after mandatory RNG additional seeding as described above) is used as a key material
  • if Device (or Client) has a crypto-safe RNG:
    • Device MUST implement at least two uniquely-pre-initialized Poor-Man’s PRNGs: one of them (named ‘POORMAN4KEYS’) MUST NOT be used for any purposes except for key generation as described below. Another one (named ‘NONKEYPOORMAN’) is used to produce ‘non-key Random Stream’.
      • Initialization of both Poor-Man’s PRNGs (as well as initialization of KEY4KEYS and POORMAN4KEYS, see below) MUST be done independently, as specified in “SimpleIoT Client (and Devices with Crypto-Safe RNG)” section above.
    • in addition, Device MUST have an additional uniquely-pre-initialized key (KEY4KEYS), which MUST NOT be used except for key generation as described below
    • to generate 128 bits of key, the following procedure applies:
      • calculate output=CryptoSafeRNG.Random16bytes() XOR AES(key=KEY4KEYS,data=POORMAN4KEYS.Random16bytes())
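
The two procedures above can be sketched as follows. This is a minimal illustration, not a reference implementation: the Python standard library has no AES, so the single-block AES-128 encryption is passed in as a caller-supplied function (aes_encrypt_block, a hypothetical name), and the generators are modelled as plain callables returning 16 bytes.

```python
from typing import Callable, Optional

# Assumption: a single-block AES-128 encryption function supplied by the caller,
# (key: 16 bytes, block: 16 bytes) -> 16 bytes. Not part of the specification.
BlockEncrypt = Callable[[bytes, bytes], bytes]
Random16 = Callable[[], bytes]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def generate_key_material(
    aes_encrypt_block: BlockEncrypt,
    key4keys: bytes,
    poorman4keys_random16: Random16,
    crypto_safe_random16: Optional[Random16] = None,
) -> bytes:
    """Produce 128 bits of key material.

    Without a crypto-safe RNG:
        output = AES(key=KEY4KEYS, data=POORMAN4KEYS.Random16bytes())
    With a crypto-safe RNG:
        output = CryptoSafeRNG.Random16bytes() XOR AES(...)
    """
    whitened = aes_encrypt_block(key4keys, poorman4keys_random16())
    if len(whitened) != 16:
        raise ValueError("block cipher must return exactly 16 bytes")
    if crypto_safe_random16 is None:
        return whitened
    return xor_bytes(crypto_safe_random16(), whitened)
```

The XOR in the second branch means the result stays unpredictable as long as at least one of the two inputs is unpredictable, which is why the two sources need to be initialized independently.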

Non-Key Random Stream

SimpleIoT RNG provides a ‘non-key Random Stream’ for various purposes such as padding, ENTROPY data for the pairing (sic!), etc. Generation of 128 bits of non-key Random Stream is similar to key generation described above, with the following differences:

  • instead of POORMAN4KEYS Poor-Man’s PRNG, NONKEYPOORMAN Poor-Man’s PRNG is used
  • instead of AES(key=KEY4KEYS,data=DATA), DATA is used directly

References

[Fortuna] Niels Ferguson, Bruce Schneier. “Practical Cryptography”. Wiley Publishing, 2003. Sections 10.3 (‘Fortuna’) - 10.7 (‘So What Should I Do?’)

SimpleIoT Pairing

Version:0.1a

NB: this document relies on certain terms and concepts introduced in SimpleIoT Protocol Stack document, please make sure to read it before proceeding.

“Pairing” SimpleIoT Device to SimpleIoT Client is necessary to ensure secure key exchange between SimpleIoT Device and SimpleIoT Client. As soon as “pairing” is completed, both parties have a 128-bit symmetric key shared between them, and can use it for SimpleIoT/SP purposes.

SimpleIoT Pairing comes in several flavours. SimpleIoT Device MUST implement at least one of these flavours. SimpleIoT Client MUST implement all these flavours.

SimpleIoT Pairing flavours are divided into two categories: Zero Pairing (which doesn’t involve communication over SimpleIoT communication channel), and Over-the-Air (OtA) pairing.

SimpleIoT Zero Pairing

Zero pairing doesn’t involve communication over SimpleIoT communication channel.

SimpleIoT Zero Programming Pairing

SimpleIoT Zero Programming Pairing applies only to those devices which can be completely reprogrammed. It is a RECOMMENDED way of pairing for hobbyist-oriented SimpleIoT Devices.

SimpleIoT Zero Programming Pairing consists of:

  • Client generating secret key (see TODO for details)
  • Client preparing (e.g. compiling or linking) a program which includes generated secret key as static data
  • Client storing generated secret key in Client DB
  • Client programming Device using prepared program (which contains generated secret key)

TODO: restrictions on SimpleIoT Device programming-socket key access.

SimpleIoT Zero Paper Pairing

SimpleIoT Zero Paper Pairing MAY be used by those mass-market-oriented SimpleIoT Devices, for which implementing other pairing methods is not feasible. Zero Paper Pairing SHOULD NOT be used if other pairing methods are feasible. Zero Paper Pairing MUST NOT be used by SimpleIoT Security Devices unless it is demonstrated that other pairing methods are not feasible for the Device.

Zero Paper Pairing requires each Device to:

  • have a unique 128-bit crypto-random key programmed in as its SimpleIoT/SP AES key
  • have this 128-bit key printed in the following user-friendly form:
    • 128-bit key is converted to a large unsigned integer (using SimpleIoT Endianness) from 0 to 2^128-1
    • this large unsigned integer is written as an integer using base 36 (i.e. using 36 digits in each position); to write digits 0-9 in this representation, symbols ‘0’-‘9’ are used; to write digits 10-35 in this representation, symbols ‘A’-‘Z’ (upper case) are used. This representation will have at most 25 symbols (as 36^25 > 2^128); if there are fewer than 25 symbols, they’re left-padded with zeros to 25
    • these 25 symbols are written in dash-separated groups of five
    • checksum symbol is calculated as a modulo-36 sum of all the symbols
    • checksum is appended (via dash) to dash-separated groups of five, forming XXXXX-XXXXX-XXXXX-XXXXX-XXXXX-X pattern
    • for example, all-zero key will be written as 00000-00000-00000-00000-00000-0
  • this user-friendly form of 128-bit crypto-random key MUST be provided in a printed form with the device.
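
The printed form described above can be sketched as follows. One assumption is made loudly here: SimpleIoT Endianness is defined in another document, so the byte order is left as a parameter (defaulting to little-endian purely for illustration). The function names are hypothetical.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # digits 0-9, then 10-35


def format_paper_key(key: bytes, byteorder: str = "little") -> str:
    """Render a 128-bit key as XXXXX-XXXXX-XXXXX-XXXXX-XXXXX-X.

    byteorder is an assumption standing in for SimpleIoT Endianness.
    """
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    n = int.from_bytes(key, byteorder)
    digits = []
    for _ in range(25):  # 36^25 > 2^128, so 25 base-36 digits always suffice
        n, d = divmod(n, 36)
        digits.append(d)
    digits.reverse()  # most significant digit first, zero-padded on the left
    checksum = sum(digits) % 36  # modulo-36 sum of all the symbols
    symbols = "".join(ALPHABET[d] for d in digits)
    groups = [symbols[i:i + 5] for i in range(0, 25, 5)]
    return "-".join(groups) + "-" + ALPHABET[checksum]


def parse_paper_key(text: str, byteorder: str = "little") -> bytes:
    """Parse the printed form back into 16 key bytes, verifying the checksum."""
    parts = text.strip().upper().split("-")
    if len(parts) != 6 or any(len(p) != 5 for p in parts[:5]) or len(parts[5]) != 1:
        raise ValueError("expected XXXXX-XXXXX-XXXXX-XXXXX-XXXXX-X")
    digits = [ALPHABET.index(c) for c in "".join(parts[:5])]
    if sum(digits) % 36 != ALPHABET.index(parts[5]):
        raise ValueError("checksum mismatch")
    n = 0
    for d in digits:
        n = n * 36 + d
    if n >= 1 << 128:
        raise ValueError("value does not fit into 128 bits")
    return n.to_bytes(16, byteorder)
```

For the all-zero key this yields 00000-00000-00000-00000-00000-0, matching the example in the list above.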

In addition, to ensure that the Device is still usable if the printed key is lost, Devices using Zero Paper Pairing MUST comply with the following Reprogramming Requirements:

  • Device MUST provide an option to re-program key, using either UART or USB. Wireless programming methods are expressly forbidden; in addition, any way (whether wired or wireless) to read the key from the device, is expressly forbidden.
    • this can be made by one of the following methods:
      • full Device reprogramming; to be compliant, it MUST be done as follows:
        • Device MUST be re-programmable using Platform.IO
        • Manufacturer MUST provide source code for the Device programming in a form which is used by SimpleIoT for programming of hobbyist-oriented SimpleIoT Devices
          • this source code MUST be available free of charge BOTH from manufacturer’s web site, AND from an allowed third-party repository. List of allowed third-party source code repositories TBD (github, sourceforge, something else?)
      • key reprogramming. Protocols for key reprogramming over UART and over USB are TBD.

SimpleIoT OtA Pairing

SimpleIoT OtA pairing provides security (including MITM protection) with minimal complexity involved.

From OtA Pairing perspective, SimpleIoT Device can be in one of the following OtA pairing states:

  • PRE-PAIRING
  • PAIRING-MITM-CHECK
  • PAIRING-COMPLETED

IMPORTANT: Change from any of the states into PRE-PAIRING state MUST be implemented ONLY via physical manipulations of end-user with SimpleIoT Device (and MUST NOT be allowed remotely). Examples of valid user interfaces to perform such a change include on-Device button or buttons (for example, if two buttons are simultaneously kept pressed for over N seconds) and on-Device PCB jumper. When changing Device state to PRE-PAIRING state, state of Device RNG (i.e. data used for random number generation) MUST NOT be affected.

In PRE-PAIRING state, SimpleIoT/SP MUST use ‘zero’ AES-128 key (with AES key consisting of all zeros).

In PRE-PAIRING state, no programs are allowed to be sent to SimpleIoT/CCP; only TODO SimpleIoT/CCP packets are allowed. In PAIRING-MITM-CHECK state, SimpleIoT/CCP programs are allowed; however, in this state, SimpleIoT/CCP restricts EXEC command of SimpleIoT/VM to the only Built-In bodypart (id=BUILTIN_BODYPART_PAIRING).
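
The state-dependent restrictions above can be sketched as a small state holder. This is only an illustration: the class and method names are hypothetical, and the numeric value of BUILTIN_BODYPART_PAIRING is not given in this document, so the value below is a placeholder.

```python
from enum import Enum

# Assumption: placeholder id; the real value of BUILTIN_BODYPART_PAIRING is not given here.
BUILTIN_BODYPART_PAIRING = 0

# 'zero' AES-128 key used by SimpleIoT/SP in PRE-PAIRING state
ZERO_AES_KEY = bytes(16)


class OtaPairingState(Enum):
    PRE_PAIRING = "PRE-PAIRING"
    PAIRING_MITM_CHECK = "PAIRING-MITM-CHECK"
    PAIRING_COMPLETED = "PAIRING-COMPLETED"


class DevicePairingState:
    """Hypothetical holder of the Device OtA pairing state."""

    def __init__(self) -> None:
        self.state = OtaPairingState.PRE_PAIRING

    def reset_to_pre_pairing(self, physical_user_action: bool) -> None:
        # Only a physical manipulation (button, jumper) may trigger this change;
        # a remote request must be refused. RNG state is deliberately not touched.
        if not physical_user_action:
            raise PermissionError("PRE-PAIRING can be entered only via physical manipulation")
        self.state = OtaPairingState.PRE_PAIRING

    def programs_allowed(self) -> bool:
        # No programs may be sent to SimpleIoT/CCP in PRE-PAIRING state.
        return self.state is not OtaPairingState.PRE_PAIRING

    def exec_allowed(self, bodypart_id: int) -> bool:
        # In PAIRING-MITM-CHECK, EXEC is restricted to the built-in pairing bodypart.
        if self.state is OtaPairingState.PRE_PAIRING:
            return False
        if self.state is OtaPairingState.PAIRING_MITM_CHECK:
            return bodypart_id == BUILTIN_BODYPART_PAIRING
        return True
```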

From security perspective, SimpleIoT OtA pairing works as follows:

  • BOTH parties generate DH randoms (a and b - 1024- or 2048-bit ones).
  • parties perform anonymous Diffie-Hellman key exchange, obtaining a 1024- or 2048-bit shared secret Z.
  • parties derive 128-bit key K and 128-bit verification value X out of Z.
  • from this point on, on both sides SimpleIoT/SP starts to use key K, as SimpleIoT/SP AES key
  • parties use verification value X (which is essentially a MITM check key) to perform MITM protection check depending on the OtA pairing flavour. During this exchange, Device is kept in PAIRING-MITM-CHECK Device OtA pairing state.
  • if MITM protection check indicates that everything is fine - Device OtA pairing state is changed to PAIRING-COMPLETED, and normal work can be started.
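
The anonymous Diffie-Hellman step in the sequence above can be illustrated with a toy sketch. Everything about the group here is an assumption made for readability: the modulus is the small Mersenne prime 2^127-1 and the generator is 3, NOT one of the RFC 5114 groups required by the protocol, and the range check on the peer value is a common sanity check rather than something this document specifies.

```python
import secrets

# Toy parameters for illustration only; real pairing uses the RFC 5114 MODP groups.
P_TOY = 2**127 - 1
G_TOY = 3


def dh_keypair(p: int = P_TOY, g: int = G_TOY):
    """Return (private, public), with private drawn from [2, p-2]."""
    private = secrets.randbelow(p - 3) + 2
    public = pow(g, private, p)  # A = g^a mod p (or B = g^b mod p)
    return private, public


def dh_shared_secret(peer_public: int, private: int, p: int = P_TOY) -> int:
    """Compute Z = peer_public^private mod p."""
    # Assumption: reject trivially weak peer values before exponentiation.
    if not 1 < peer_public < p - 1:
        raise ValueError("peer public value out of range")
    return pow(peer_public, private, p)


# Client side generates a, Device side generates b; both arrive at the same Z.
a, A = dh_keypair()
b, B = dh_keypair()
z_client = dh_shared_secret(B, a)  # Z = B^a mod p
z_device = dh_shared_secret(A, b)  # Z = A^b mod p
```

Because the exchange is anonymous, equality of Z on both sides says nothing about who the other party is; that is what the subsequent check with verification value X is for.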

Pre-Programmed Keys and RNGs

It should be understood that to ensure security, Devices MUST comply with at least one of the following two requirements:

  • each device MUST have unique pre-programmed SimpleIoT/SP key:
    • this applies to Zero Pairing Devices

or

SimpleIoT OtA Pairing Protocol

All the messages within one pairing procedure form a single “packet chain”. That is, “packet chain” for a normal OtA Pairing exchange works as follows:

Pairing-Ready-Pseudo-Response - Pairing-Pre-Request - Pairing-Pre-Response - Pairing-DH-Data-Request - Pairing-DH-Data-Response - ... - Pairing-DH-Data-Request - Pairing-DH-Data-Response

When both sides receive the last of Pairing-DH-DATA-* packets (the ones which provide the whole DH data, with size defined according to KEY-EXCHANGE-TYPE field in Pairing-DH-Data-Request), they proceed with calculation of SimpleIoT/SP key.

“Awaiting pairing” mode

To avoid Device connecting to wrong SimpleIoT Client, SimpleIoT Client MUST NOT proceed with “pairing” in response to Pairing-Ready packets unless SimpleIoT Client is in “awaiting pairing” mode. “Awaiting pairing” mode for SimpleIoT Client MUST be user-initiated, and MUST NOT be kept for longer than 1 hour, unless user requests another “awaiting pairing”. This is necessary to reduce “partially paired to wrong SimpleIoT Client” encounters (which MUST have a way to be handled separately; TODO: example in SimpleIoT/RF).
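
A minimal sketch of the time-limited “awaiting pairing” mode on the Client side is shown below. The class and method names are hypothetical, and the clock is injectable only so that the behaviour can be exercised without waiting.

```python
import time

AWAITING_PAIRING_MAX_SECONDS = 3600  # MUST NOT be kept for longer than 1 hour


class AwaitingPairingMode:
    """Hypothetical Client-side holder of the 'awaiting pairing' mode."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._deadline = None  # None means the mode was never started

    def start(self, user_initiated: bool, duration: float = AWAITING_PAIRING_MAX_SECONDS) -> None:
        # The mode MUST be user-initiated; each new user request restarts the timer.
        if not user_initiated:
            raise PermissionError("'awaiting pairing' mode must be user-initiated")
        if not 0 < duration <= AWAITING_PAIRING_MAX_SECONDS:
            raise ValueError("duration must be within (0, 3600] seconds")
        self._deadline = self._clock() + duration

    def is_active(self) -> bool:
        return self._deadline is not None and self._clock() < self._deadline

    def on_pairing_ready(self) -> str:
        # Outside the mode, the Client answers with a Pairing-Error-Request
        # carrying ERROR_NOT_AWAITING_PAIRING instead of proceeding.
        return "PROCEED" if self.is_active() else "ERROR_NOT_AWAITING_PAIRING"
```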

TODO: errors (Z=1 per NIST SP 800-56B, and derived-key=0 to avoid being caught by attacks on misimplementations)!

OtA Pairing Protocol Packets

Pairing-Ready-Pseudo-Response: |ENTROPY-NEEDED-SIZE |

where ENTROPY-NEEDED-SIZE is an Encoded-Unsigned-Int<max=2> field specifying amount of needed entropy in bytes.

Pairing-Ready-Pseudo-Response is not really a response, but a request from Device side which initiates pairing sequence. It is sent as a payload for a CCP-OTA-PAIRING-RESPONSE message (which in turn initiates a new “packet chain”), with 2 “additional bits” being 0x0. If ENTROPY-NEEDED-SIZE is not zero, it indicates that Phase 1 of ‘Entropy Gathering Procedure’ (see below) is necessary before issuing a Pairing-Pre-Request from Client side.

If Client is not in “awaiting pairing” mode, it MUST respond with Pairing-Error-Request with ERROR-CODE = ERROR_NOT_AWAITING_PAIRING.

Pairing-Pre-Request: | OTA-PROTOCOL-VERSION-NUMBER-MAJOR | OTA-PROTOCOL-VERSION-NUMBER-MINOR | CLIENT-RANDOM | PROJECTED-NODE-ID | CLIENT-OTA-AND-SIOT-SP-CAPABILITIES |

where OTA-PROTOCOL-VERSION-NUMBER-* are Encoded-Unsigned-Int<max=2> fields, CLIENT-RANDOM is a 16-byte field with crypto-random data, PROJECTED-NODE-ID is an Encoded-Unsigned-Int<max=2> field, containing NODE-ID which Client intends to assign to the Device if pairing is successful, and CLIENT-OTA-AND-SIOT-SP-CAPABILITIES TBD.

Pairing-Pre-Request is sent as a payload for a SimpleIoT/CCP CCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for CCP-OTA-PAIRING-REQUEST message being 0x0.

Pairing-Pre-Response: | ENTROPY-NEEDED-SIZE | OPTIONAL-DEVICE-RANDOM | OPTIONAL-DEVICE-BUS-TYPE | OPTIONAL-DEVICE-INTRABUS-ID-SIZE | OPTIONAL-DEVICE-INTRABUS-ID | OPTIONAL-DEVICE-OTA-AND-SIOT-SP-CAPABILITIES |

where ENTROPY-NEEDED-SIZE is an Encoded-Unsigned-Int<max=2> field, OPTIONAL-DEVICE-RANDOM is an optional 32-byte field, OPTIONAL-DEVICE-BUS-TYPE is an Encoded-Unsigned-Int<max=1> field representing an enum of bus types (TBD), OPTIONAL-DEVICE-INTRABUS-ID-SIZE is an Encoded-Unsigned-Int<max=1> field, representing size of OPTIONAL-DEVICE-INTRABUS-ID field in bytes, OPTIONAL-DEVICE-INTRABUS-ID depends on the bus type, and OPTIONAL-DEVICE-OTA-AND-SIOT-SP-CAPABILITIES (format TBD); all the OPTIONAL-* fields are present only if this Pairing-Pre-Response packet is the first such packet in current “pairing” exchange.

Pairing-Pre-Response is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for CCP-OTA-PAIRING-RESPONSE message being 0x1.

NB: to comply with key generation requirements as specified in SimpleIoT Random Number Generation and Key Generation document, Device MUST request at least as much entropy as the size of the b parameter for DH key exchange; however, Device MAY request more entropy (up to 256 extra bytes per pairing attempt; these requests MAY be split into packets as small as 1 byte) - for example, to initialize its own Fortuna generator.

If ENTROPY-NEEDED-SIZE is not zero, it means that “Entropy Gathering” Phase 3 is necessary (see below), and that Client MUST reply with a Pairing-Entropy-Provided-Request.

Pairing-Entropy-Provided-Request: | ERROR-CODE | ENTROPY |

where ERROR-CODE is an Encoded-Unsigned-Int<max=2> field, equal to zero, and ENTROPY is an arbitrary-length field with cryptographically safe random data.

Pairing-Entropy-Provided-Request is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for CCP-OTA-PAIRING-REQUEST message being 0x1. Note that “additional bits” for Pairing-Entropy-Provided-Request are the same as for Pairing-Error-Request, and they’re distinguished by the value of ERROR-CODE field.

Client MAY supply less entropy than was requested (and SHOULD do so if the requested data potentially exceeds MTU); in such a case, Device SHOULD request more entropy via replying with an appropriate message with a non-zero ENTROPY-NEEDED-SIZE.

In response to Pairing-Entropy-Provided-Request, Device MUST send another Pairing-Ready-Pseudo-Response or Pairing-Pre-Response packet (depending on the Phase of Entropy Gathering procedure currently in progress), specifying non-zero ENTROPY-NEEDED-SIZE if it still has not enough entropy.
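
The request/provide loop described above can be sketched as a simple simulation. It models only the byte counts: the function name, the per-packet limit parameter (standing in for an MTU-driven limit) and the use of os.urandom on the Client side are assumptions for illustration.

```python
import os


def gather_entropy(needed: int, client_max_per_packet: int, client_rng=os.urandom):
    """Simulate the Device repeatedly asking for entropy until it has enough.

    Returns (collected_bytes, number_of_request_reply_exchanges).
    """
    if needed < 0 or client_max_per_packet < 1:
        raise ValueError("invalid sizes")
    collected = bytearray()
    exchanges = 0
    while len(collected) < needed:
        # Device side: non-zero ENTROPY-NEEDED-SIZE in its next packet.
        entropy_needed_size = needed - len(collected)
        # Client side: MAY supply less than requested, e.g. to stay within MTU.
        entropy = client_rng(min(entropy_needed_size, client_max_per_packet))
        collected += entropy
        exchanges += 1
    return bytes(collected), exchanges
```

For example, a Device needing 128 bytes from a Client that supplies at most 50 bytes per packet completes in three exchanges (50 + 50 + 28).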

Pairing-Error-Request: | ERROR-CODE |

where ERROR-CODE is an Encoded-Unsigned-Int<max=2> field, never equal to zero.

Pairing-Error-Request is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for CCP-OTA-PAIRING-REQUEST message being 0x1. Note that “additional bits” for Pairing-Error-Request are the same as for Pairing-Entropy-Provided-Request, and they’re distinguished by the value of ERROR-CODE field.

Pairing-DH-Data-Request: | OPTIONAL-KEY-EXCHANGE-TYPE | DH-REQUEST-PART |

where OPTIONAL-KEY-EXCHANGE-TYPE is sent only for the very first Pairing-DH-Data-Request within the “pairing”, and is Encoded-Unsigned-Int<max=2> field with values defined below, and DH-REQUEST-PART is a field taking the rest of the packet, and representing first remaining (SimpleIoT-Endianness-wise) bytes of A = g^a mod p from DH key exchange (using SimpleIoT Endianness).

Supported OPTIONAL-KEY-EXCHANGE-TYPEs:

  • value 0:
    • Key Exchange: DH with 1024-bit MODP group with 160-bit Prime Order Subgroup as defined in RFC 5114. This OPTIONAL-KEY-EXCHANGE-TYPE MUST NOT be used for Security SimpleIoT Devices. NB: MODP groups from RFC 5114 are preferred to earlier-defined ones (for example, those from RFC 3526), as they explicitly comply with NIST-suggested restrictions, in particular, restrictions on q.
    • Key Derivation: SHA256-based
  • value 1:
    • Key Exchange: DH with 2048-bit MODP group with 256-bit Prime Order Subgroup as defined in RFC 5114.
    • Key Derivation: SHA256-based
  • others: MAY be added as necessary

TODO: double-check presence of any typical patterns in Z, and decide on split (first-half/second-half or even-bits/odd-bits)

Pairing-DH-Data-Request is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for CCP-OTA-PAIRING-REQUEST message being 0x2.

Pairing-DH-Data-Response: | DH-RESPONSE-PART |

where DH-RESPONSE-PART is a field taking the whole packet; length of DH-RESPONSE-PART MUST be exactly the same as DH-REQUEST-PART in the incoming Pairing-DH-Data-Request message. DH-RESPONSE-PART represents first remaining (SimpleIoT-Endianness-wise) bytes of B = g^b mod p from DH key exchange (using SimpleIoT Endianness).

Pairing-DH-Data-Response is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for CCP-OTA-PAIRING-RESPONSE message being 0x2.
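
Splitting A and B across several Pairing-DH-Data-* packets can be sketched as below. The sketch works on already-serialized values (the serialization order follows SimpleIoT Endianness, which is defined elsewhere), and both function names and the max_part_size parameter are hypothetical.

```python
from typing import List


def split_dh_request_parts(serialized_a: bytes, max_part_size: int) -> List[bytes]:
    """Client side: cut serialized A into consecutive DH-REQUEST-PART fields."""
    if max_part_size < 1:
        raise ValueError("max_part_size must be positive")
    return [serialized_a[i:i + max_part_size]
            for i in range(0, len(serialized_a), max_part_size)]


def next_dh_response_part(serialized_b: bytes, already_sent: int, request_part_len: int) -> bytes:
    """Device side: reply with the first remaining bytes of B.

    The length MUST be exactly the same as that of the incoming DH-REQUEST-PART.
    """
    part = serialized_b[already_sent:already_sent + request_part_len]
    if len(part) != request_part_len:
        raise ValueError("not enough remaining DH data to mirror the request length")
    return part
```

Since every response mirrors the length of its request, both sides finish transferring their DH data in the same packet pair, provided A and B are serialized to the same length.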

Pairing-Ok-Request: | OK-A-ENTROPY-CHECKSUM | NODE-ID |

where OK-A-ENTROPY-CHECKSUM is a 16-byte field containing result of SimpleIoT/SP-tag(nonce=(varying-part=1,direction=from-client-to-device),authenticated-data=All-Sent-ENTROPY-Combined,key=derived-SimpleIoT/SP-key), where nonce is constructed in the same way it is constructed in SimpleIoT/SP, and NODE-ID is an Encoded-Unsigned-Int<max=2> field containing SimpleIoT/MP node ID to be assigned to the Device. NODE-ID is conditional on OK-A-ENTROPY-CHECKSUM check described below, otherwise NODE-ID MUST be ignored.

Pairing-Ok-Request is sent by Client when the last Pairing-DH-Data-Response is received; it is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for CCP-OTA-PAIRING-REQUEST message being 0x3.

On receiving Pairing-Ok-Request, Device calculates its own DEVICE-OK-A-ENTROPY-CHECKSUM with derived-SimpleIoT/SP-key, and compares it to the received OK-A-ENTROPY-CHECKSUM. If the check is Ok, then Device calculates OK-B-ENTROPY-CHECKSUM (the same way as OK-A-ENTROPY-CHECKSUM is calculated, but with direction=from-device-to-client), and sends it back as a part of Pairing-Ok-Response; then Device changes pairing state into PAIRING-MITM-CHECK, sets SimpleIoT/SP key to derived-SimpleIoT/SP-key for all future communications with Client, and sets next SimpleIoT/SP nonce varying-part (including the one stored in persistent storage) to 2.

If DEVICE-OK-A-ENTROPY-CHECKSUM and received OK-A-ENTROPY-CHECKSUM don’t match - Device MUST switch back to PRE-PAIRING state and report TODO error to the Client.

Pairing-Ok-Response: | OK-B-ENTROPY-CHECKSUM |

where OK-B-ENTROPY-CHECKSUM is a 16-byte field.

Pairing-Ok-Response is sent as a payload for SimpleIoT/CCP CCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for CCP-OTA-PAIRING-RESPONSE message being 0x3.

On receiving Pairing-Ok-Response, Client calculates its own CLIENT-OK-B-ENTROPY-CHECKSUM, and compares it with the received OK-B-ENTROPY-CHECKSUM. If everything is fine - “pairing” can be considered completed, and Client sets SimpleIoT/SP key (to be used by SimpleIoT/SP) to derived-SimpleIoT/SP-key for all future communications with this Device, and sets next SimpleIoT/SP nonce varying-part (including the one stored in persistent storage) to 2. After that, Client starts to perform MITM check (using MITM-Check-Program as described below).

If CLIENT-OK-B-ENTROPY-CHECKSUM and received OK-B-ENTROPY-CHECKSUM don’t match - Client reports a potential attack on pairing to the end-user (without such an attack, chances of ENTROPY-CHECKSUM mismatching are on the order of 2^-120), and asks the end-user to re-start pairing by manually switching Device to PRE-PAIRING state (using appropriate UI as described above).
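
The packet definitions above distinguish packet types by message direction and by the 2 “additional bits”, with ERROR-CODE separating the two request packets that share the value 0x1. A sketch of this dispatch, operating on already-decoded values (function and table names are hypothetical), could look like this:

```python
from typing import Optional

# Payloads of CCP-OTA-PAIRING-RESPONSE (Device to Client), keyed by "additional bits"
RESPONSE_PACKETS = {
    0x0: "Pairing-Ready-Pseudo-Response",
    0x1: "Pairing-Pre-Response",
    0x2: "Pairing-DH-Data-Response",
    0x3: "Pairing-Ok-Response",
}

# Payloads of CCP-OTA-PAIRING-REQUEST (Client to Device); 0x1 is handled separately
REQUEST_PACKETS = {
    0x0: "Pairing-Pre-Request",
    0x2: "Pairing-DH-Data-Request",
    0x3: "Pairing-Ok-Request",
}


def classify_response(additional_bits: int) -> str:
    if additional_bits not in RESPONSE_PACKETS:
        raise ValueError("additional bits must be in range 0..3")
    return RESPONSE_PACKETS[additional_bits]


def classify_request(additional_bits: int, error_code: Optional[int] = None) -> str:
    if additional_bits == 0x1:
        # Same additional bits for both packets; ERROR-CODE tells them apart.
        if error_code is None:
            raise ValueError("ERROR-CODE is needed to classify a 0x1 request")
        if error_code == 0:
            return "Pairing-Entropy-Provided-Request"
        return "Pairing-Error-Request"
    if additional_bits not in REQUEST_PACKETS:
        raise ValueError("additional bits must be in range 0..3")
    return REQUEST_PACKETS[additional_bits]
```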

Entropy Gathering

In some cases, as a prerequisite for Device to be able to perform pairing, RNG needs to be supplied with entropy (exact conditions are described in SimpleIoT Random Number Generation and Key Generation document); NB: as described in SimpleIoT Random Number Generation and Key Generation, entropy usually needs to be supplied not only to the first pairing of the Device, but also to any subsequent pairing.

The procedure of Entropy Gathering is performed as follows:

Phase 1 (OPTIONAL, used only if Device ID needs to be generated, hardware-assisted Fortuna PRNG is used, and Fortuna doesn’t have enough entropy):

  • Device sends Pairing-Ready-Pseudo-Response with non-zero ENTROPY-NEEDED-SIZE
  • Client replies with Pairing-Entropy-Provided request, sent as a broadcast (SHOULD be restricted to those Retransmitting Nodes which may reach the Device)
  • this Pairing-Ready-Pseudo-Response - Pairing-Entropy-Provided sequence is repeated until Device has sufficient entropy to generate Device ID (this is the same as for regular “pairing”, as described in SimpleIoT Random Number Generation and Key Generation document)
  • NB: during Phase 1, Pairing-Entropy-Provided packets from Client to Device are sent as SimpleIoT/MP From-Santa packets (see SimpleIoT Heterogeneous Mesh Protocol (SimpleIoT/HMP)) which do not distinguish between target Devices, so there is a chance that more than one Device obtains the same packet. However, these same packets will (with an overwhelming probability) lead to different states within Fortuna PRNGs on different Devices, which will make it possible to distinguish these (originally potentially indistinguishable) Devices.

Phase 2:

  • Device sends Pairing-Pre-Response with non-zero ENTROPY-NEEDED-SIZE, DEVICE-ID-FLAG set, and all Device ID-related fields.
  • Client replies with Pairing-Entropy-Provided request
  • NB: starting from Phase 2, all the packets from Client to Device are sent as SimpleIoT/MP Unicast packets (see SimpleIoT Heterogeneous Mesh Protocol (SimpleIoT/HMP)) and are addressed to specific Device (using Device ID from Phase 2).

Phase 3:

It should be noted that number of packets sent and received is IMPORTANT for security purposes, so combining packets contrary to requirements in SimpleIoT Guaranteed Delivery Protocol (SimpleIoT/GDP) is strictly prohibited.

DH Random Generation

For both Client side and Device side, DH random numbers (a and b respectively) MUST be generated as described in Key Generation section in SimpleIoT Random Number Generation and Key Generation document.

SimpleIoT/SP Key Derivation

When both sides have all the information they need (that is, Client has full B = g^b mod p and Device has full A = g^a mod p), they need to calculate shared secret Z (Z = A^b mod p for Device, and Z = B^a mod p for Client), and generate SimpleIoT/SP Key K (128 bit), as well as verification value X (also 128 bit), from Z.

SimpleIoT/SP Key K and verification value X are calculated as follows:

  • for SHA256-based derivation: K = SHAd256(Z||Info||first-half-of-CLIENT-RANDOM||first-half-of-DEVICE-RANDOM), X = SHAd256(Z||Info||second-half-of-CLIENT-RANDOM||second-half-of-DEVICE-RANDOM), where Info = “SimpleIoT/SP”||KEY-EXCHANGE-TYPE||‘K’-or-‘X’||ROOT-NODE-ID||PROJECTED-NODE-ID (where ROOT-NODE-ID is always 0, and ‘K’-or-‘X’ is equal to the ‘K’ ASCII byte when calculating K, and to the ‘X’ ASCII byte when calculating X). SHAd256(m) is SHA256(SHA256(m)), same as in [Fortuna]. NB: this method differs from the one recommended by NIST, in that we’re deriving both K and X from the same DH keys; as some function of X is exposed (via LED blinking), in theory it might leak some information about K; however, in practice we don’t see any specific attack vectors (especially as obtaining key material from X requires inverting SHAd256, AND as blinking is not just X, but X-encrypted-with-a-random-key-which-is-transferred-over-encrypted-channel, so X itself is not easily accessible). We could use a method of obtaining X which is similar to Simple Secure Pairing, but at this point we do not see it as necessary.
  • other methods MAY be added in the future
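
The SHA256-based derivation can be sketched as follows. Several details are NOT specified in this section and are assumptions of the sketch: Z is taken as already-serialized bytes, KEY-EXCHANGE-TYPE is encoded as a single byte, node IDs as 2-byte little-endian values, and the 256-bit SHAd256 output is truncated to its first 16 bytes to obtain 128-bit values.

```python
import hashlib


def shad256(m: bytes) -> bytes:
    """SHAd256(m) = SHA256(SHA256(m))."""
    return hashlib.sha256(hashlib.sha256(m).digest()).digest()


def derive_k_and_x(z: bytes, key_exchange_type: int, projected_node_id: int,
                   client_random: bytes, device_random: bytes,
                   root_node_id: int = 0):
    """Derive SimpleIoT/SP key K and verification value X from shared secret Z."""

    def info(label: bytes) -> bytes:
        # Assumption: field encodings below are illustrative, not normative.
        return (b"SimpleIoT/SP"
                + bytes([key_exchange_type])
                + label
                + root_node_id.to_bytes(2, "little")
                + projected_node_id.to_bytes(2, "little"))

    c_half = len(client_random) // 2
    d_half = len(device_random) // 2
    # Assumption: truncate the 32-byte hash to its first 16 bytes.
    k = shad256(z + info(b"K") + client_random[:c_half] + device_random[:d_half])[:16]
    x = shad256(z + info(b"X") + client_random[c_half:] + device_random[d_half:])[:16]
    return k, x
```

Both sides feed in the same Z, randoms and node IDs, so they arrive at the same K and X; the ‘K’/‘X’ label and the different halves of the randoms keep the two outputs distinct.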

OtA Pairing MITM-Check Program

After initial “packet chain” consisting of Pairing Request and Pairing Response, Device goes into PAIRING-MITM-CHECK state; MITM check is performed via “MITM-Check Program”.

MITM-Check Program is pretty much a regular SimpleIoT/VM program which goes over SimpleIoT/CCP (normally over SimpleIoT/GDP over SimpleIoT/SP). There is a difference from a regular program though: MITM-Check Program MUST come only in PAIRING-MITM-CHECK Device pairing state. In this state, SimpleIoT/CCP (and/or SimpleIoT/VM) prohibits the program from accessing any bodyparts, except for a Built-In bodypart with id=BUILTIN_BODYPART_PAIRING. This also ensures that although there can be two bodyparts accessing the same LED (one is the ‘pairing’ bodypart, another is a regular bodypart), there is no possible conflict between the two.

OtA Pairing Flavours

All OtA Pairing Flavours run on top of SimpleIoT OtA Pairing Protocol, and differ only in their MITM-Check Programs.

SimpleIoT OtA Single-LED Pairing

SimpleIoT OtA Single-LED Pairing is a pairing mechanism which is semi-automated (i.e. user is not required to enter any data, but will be required to position devices in a certain way), and which requires an absolute minimum of resources on the Device side. Namely, all the Device needs to have (in addition to MCU) is one single LED. This LED MAY be any of the existing LEDs on the Device.

MITM-Check for Single-LED Pairing is performed as follows:

  • User is asked to bring Device close to the webcam which is located on SimpleIoT Central Controller
  • Client sends a MITM-Check program which requests LED to blink, using Blinking-Function(random-nonce-sent-by-Client)=AES(key=verification-value-X,data=random-nonce-sent-by-Client) as a blinking pattern. TODO: Built-in Plugin to produce AES(...) reply.
  • Accordingly, Device starts blinking the LED
  • Client, using webcam, recognizes blinking pattern and makes sure that it matches expectations.
  • If expectations don’t match, program may be repeated with a different random-nonce-sent-by-Client
  • If expectations do match, another program (also technically a MITM-Check program) is sent to change OtA Pairing State of the Device to PAIRING-COMPLETED.
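
The Client-side retry loop of the check above can be sketched as below. The AES computation and the webcam recognition are outside the scope of the sketch and are passed in as callables; the names, the 16-byte nonce size and the max_attempts limit are assumptions (the text only says the program may be repeated).

```python
import hmac
import os
from typing import Callable


def run_single_led_mitm_check(
    expected_pattern: Callable[[bytes], bytes],
    blink_and_observe: Callable[[bytes], bytes],
    max_attempts: int = 3,
    new_nonce: Callable[[], bytes] = lambda: os.urandom(16),
) -> bool:
    """Return True if an observed blinking pattern matches the expected one.

    expected_pattern(nonce): what the Client expects, i.e. AES(key=X, data=nonce).
    blink_and_observe(nonce): sends the MITM-Check program with this nonce and
    returns the pattern recognized via the camera.
    """
    for _ in range(max_attempts):
        nonce = new_nonce()  # a different random nonce for every attempt
        expected = expected_pattern(nonce)
        observed = blink_and_observe(nonce)
        if hmac.compare_digest(expected, observed):
            return True  # next step: program switching Device to PAIRING-COMPLETED
    return False
```

A Device that shares verification value X with the Client produces the expected pattern for any nonce, while a party in the middle holding a different X does not.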

NB: SimpleIoT Client SHOULD support using a smartphone camera in place of the webcam for “pairing” purposes (provided that TODO requirements for securing communication between SimpleIoT Client and smartphone’s app are met).

MITM-Check for Single-LED Pairing being User-OPTIONAL

All SimpleIoT Devices using Single-LED Pairing, MUST implement proper MITM Check procedures as described above. However, devices which are not designated as Security Devices, MAY set PAIRING-USER-OPTIONAL flag in their Device Capabilities (TODO). If Client receives PAIRING-USER-OPTIONAL flag from a Device which also has SECURE-DEVICE flag - it MUST NOT allow using such a Device, with an appropriate report to the end-user.

If Client “pairs” with a Device which has PAIRING-USER-OPTIONAL set, it MAY ask the user whether to perform “pairing”. If PAIRING-USER-OPTIONAL flag is not set, Client MUST NOT allow the Device to be used (i.e. MUST NOT issue a program which resets MITM-CHECK-IN-PROGRESS Device flag, and MUST NOT send any non-pairing programs to the Device) until “pairing” is actually performed.

To re-iterate: being User-OPTIONAL means that Device implementors still MUST implement MITM check; however, under certain circumstances end-user MAY be allowed to skip MITM protection.
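
The Client-side decision described in this section can be summarized in a small function. The function name and the returned labels are hypothetical; only the flag names come from the text above.

```python
def mitm_check_policy(pairing_user_optional: bool, secure_device: bool,
                      user_wants_check: bool = True) -> str:
    """Decide how the Client treats the MITM check for a Single-LED Device."""
    if pairing_user_optional and secure_device:
        # Contradictory flags: Client MUST NOT allow using such a Device
        # (and reports this to the end-user).
        return "REJECT_DEVICE"
    if pairing_user_optional and not user_wants_check:
        # Only here MAY the end-user skip MITM protection.
        return "SKIP_MITM_CHECK"
    # In all other cases the Device stays unusable until the check is performed.
    return "PERFORM_MITM_CHECK"
```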

SINGLE-LED-PAIRING Built-In Plugin

TODO

References

[Fortuna] Niels Ferguson, Bruce Schneier. “Practical Cryptography”. Wiley Publishing, 2003. Sections 6.4 (‘Fixing the Weaknesses’)

SimpleIoT Heterogeneous Mesh Protocol (SimpleIoT/HMP)

Version:0.1.9

NB: this document relies on certain terms and concepts introduced in SimpleIoT Protocol Stack document, please make sure to read it before proceeding.

SimpleIoT/HMP is a part of SimpleIoT protocol stack. It belongs to Level 3 of OSI/ISO Network Model, and is responsible for routing packets within SimpleIoT mesh network.

SimpleIoT mesh network is a heterogeneous network. In particular, on the way from SimpleIoT Client to SimpleIoT Device a packet may traverse different bus types (including all supported types of wired and wireless buses); the same stands for the packet going in the opposite direction.

SimpleIoT/HMP is optimized based on the following assumptions:

  • SimpleIoT/HMP relies on all communications being between Central Controller and Device (no Device-to-Device communications); no other communications are currently supported
  • SimpleIoT/HMP aims to optimize “last mile” traffic (between last Retransmitting Device and target Device) while paying less attention to Central-Controller-to-Retransmitting-Device and Retransmitting-Device-to-Retransmitting-Device traffic. This is based on the assumption that the Retransmitting Devices usually have significantly less power restrictions (for example, are mains-powered rather than battery-powered).
  • SimpleIoT/HMP combines data with route requests
  • SimpleIoT/HMP allows sending “urgent” data packets, which sacrifice traffic and energy consumption for the best possible delivery speed
  • SimpleIoT/HMP relies on pre-existence of Routing Tables (see below) on all relevant Retransmitting Nodes. Communicating Routing Tables MAY be implemented over the upper-layer protocol such as SimpleIoT/CCP
    • This is done because of sensitivity of Routing Tables; with upper-layer protocol, Routing Tables can be communicated securely
    • It doesn’t create a chicken-and-egg problem, as SimpleIoT/HMP provides a way to reach any reachable Retransmitting Node without a Routing Table on it; as soon as Retransmitting Node is reachable via SimpleIoT/HMP, upper-layer protocol such as SimpleIoT/CCP can be used to create/update Routing Table on the Retransmitting Node.
    • Technically, updating Routing Tables is not a part of SimpleIoT/HMP; however, a protocol of updating Routing Tables over CCP_PHY_AND_ROUTING_DATA messages is provided below as an example.
  • SimpleIoT/HMP relies on upper-layer protocol (such as SimpleIoT/GDP) to send retransmits if a packet has not been delivered, and to provide SimpleIoT/HMP with information about the retransmit number (i.e., original packet having retransmit-number=0, first retransmit having retransmit-number=1, and so on).
  • SimpleIoT/HMP relies on upper-layer protocol (such as SimpleIoT/GDP) to provide information on whether the Device on the other side is required to have its transmitter on for upper-layer protocol purposes. For SimpleIoT/GDP, there are states which do guarantee this (in fact, this holds in almost all SimpleIoT/GDP states except for IDLE).

SimpleIoT/HMP has the following types of actors: Root (normally implemented by Central Controller), Retransmitting Device, and non-Retransmitting Device. All these actors are collectively named Nodes.

Underlying Protocol Requirements

SimpleIoT/HMP underlying protocol (normally one of SimpleIoT/DLP-* protocols), MUST support the following operations:

  • bus broadcast (addressed to all the Devices on the bus)
  • bus multi-cast (addressed to a list of the Devices on the bus)
  • bus uni-cast

NB: these operations MAY be implemented using only bus broadcast, without any additional intra-bus addressing information; all HMP packets have sufficient information to ensure further processing of HMP packets without underlying protocol addressing information. If some information within HMP packet becomes redundant given underlying protocol’s addressing information, underlying protocol MAY compress HMP packet when transmitting it, by re-using underlying-protocol information when compressing HMP packet; however, as HMP addresses in normal (post-pairing) communication are usually very short anyway, such compression is not likely to bring substantial benefits.

All SimpleIoT Devices SHOULD, and all SimpleIoT Retransmitting Devices MUST implement some kind of collision avoidance (at least CSMA/CA, a.k.a. “listen before talk with random delay”).
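
A simplified sketch of “listen before talk with random delay” is given below. It is not a full CSMA/CA implementation from any particular standard: the callables for sensing the channel, transmitting and waiting, as well as the attempt and backoff limits, are assumptions made for illustration.

```python
import random
from typing import Callable, Optional


def send_with_listen_before_talk(
    channel_is_busy: Callable[[], bool],
    transmit: Callable[[], None],
    wait_slots: Callable[[int], None],
    max_attempts: int = 5,
    max_backoff_slots: int = 8,
    rng: Optional[random.Random] = None,
) -> bool:
    """Wait a random number of slots, listen, and transmit only if the channel is idle.

    Returns True if the packet was transmitted, False if the channel stayed busy
    for all attempts.
    """
    rng = rng or random.Random()
    for _ in range(max_attempts):
        wait_slots(rng.randint(1, max_backoff_slots))  # random delay
        if not channel_is_busy():                      # listen...
            transmit()                                 # ...before talk
            return True
    return False
```

The random delay makes it unlikely that two Devices which found the channel busy at the same moment will retry at exactly the same time.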

SCRAMBLING and Underlying Protocol Error Correction

HMP packets are usually SCRAMBLED, and after SCRAMBLING are transmitted over some of SimpleIoT/DLP-* protocols.

SimpleIoT/DLP-* protocols SHOULD allow for gradual error correction, starting from the beginning of the packet. Even if the packet cannot be error-corrected completely, information in the first part of the header MAY be of value, and SHOULD be passed to upper layers. SCRAMBLING procedure SHOULD allow for partial descrambling (to the extent possible), and SHOULD return partially descrambled packets back to SimpleIoT/HMP. It will allow SimpleIoT/HMP to get “partially correct” packets, which are to be used as described below, to improve certain SimpleIoT/HMP characteristics. SimpleIoT/HMP uses headers of “partially correct” packets in “promiscuous mode” operations, and in some other cases referred to as “partially correct packet”.

Promiscuous Mode Operations

Wherever possible (in particular, for all kinds of wireless communications unless explicitly prohibited by underlying standard), SimpleIoT Retransmitting Devices SHOULD listen to the network in promiscuous mode; this doesn’t affect security, but provides valuable header information and speeds up message delivery and recovery in certain practical cases.

SimpleIoT Retransmitting Devices

Some SimpleIoT Devices are intended to be “SimpleIoT Retransmitting Devices”. “SimpleIoT Retransmitting Device” has one or more transmitters. Transmitters on SimpleIoT Retransmitting Devices MUST be always-on (except for possible intermittent turning off if described in a corresponding SimpleIoT/DLP* document); turning off transmitter is NOT allowed for SimpleIoT Retransmitting Devices. That is, if MCUSLEEP instruction is executed on a SimpleIoT Retransmitting Device, it simply pauses executing a program, without turning transmitter off (TODO: add to Zepto VM). Normally, SimpleIoT Retransmitting Devices are mains-powered, or are using large batteries. SimpleIoT Protocol Stack (specifically SimpleIoT/HMP) on SimpleIoT Retransmitting Devices requires more resources (specifically RAM) than non-Retransmitting Devices.

Highly mobile Devices SHOULD NOT be Retransmitting Devices. Building a reliable network out of highly mobile Devices is problematic from the outset (and outright impossible if these movements are not synchronized). Therefore, SimpleIoT/HMP assumes that Retransmitting Devices are moved rarely, and is optimized for rarely-moving Retransmitting Devices. While SimpleIoT/HMP does recover from moving one or even all Retransmitting Devices, this scenario is not optimized and recovery from it may take significant time.

Routing Tables

Each Retransmitting Device, after pairing, MUST keep a Routing Table. Routing Table consists of two lists: (a) Links list, with each entry being (LINK-ID,BUS-ID,INTRA-BUS-ID,NEXT-HOP-ACKS,LINK-DELAY-UNIT,LINK-DELAY,LINK-DELAY-ERROR) tuple, and (b) Routes list, with each entry being (TARGET-ID,LINK-ID). LINK-ID is an intra-Routing-Table id, used to map routes into links.

Each entry in Routes list has semantics of “where to route packet addressed to TARGET-ID”. In Links list, INTRA-BUS-ID=NULL means that the entry is for an incoming link. Incoming link entries are relatively rare, and are used to specify LINK-DELAYs.

NEXT-HOP-ACKS is a flag which is set if the nearest hop (over (BUS-ID,INTRA-BUS-ID)) is known to be able not only to receive packets, but to send ACKs back; in general, NEXT-HOP-ACKS cannot be calculated based only on bus type, and may change for the same link during system operation; SimpleIoT/HMP is built to try using links with NEXT-HOP-ACKS as much as possible, but MAY use links without NEXT-HOP-ACKS if there are no alternatives.

TODO: size reporting to Root (as # of unspecified ‘storage units’, plus sizes of Links entry and Routes entry expressed in the same ‘storage units’).

Routing Tables SHOULD be stored in a ‘canonical’ way (Links list ordered from lower LINK-IDs to higher ones, Routes list ordered from lower TARGET-IDs to higher ones; duplicate entries for the same LINK-ID are prohibited, for the same TARGET-ID are currently prohibited); this is necessary to simplify calculations of the Routing Table checksums. TODO: specify Routing-Table-Checksum calculation
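The two-list structure and the ‘canonical’ ordering described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names (Link, Route, RoutingTable, next_link) are hypothetical, the LINK-DELAY-* fields are left out for brevity, and INTRA-BUS-ID=NULL is modelled as None.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    link_id: int
    bus_id: int
    intra_bus_id: Optional[int]  # None models INTRA-BUS-ID=NULL (incoming link)
    next_hop_acks: bool


@dataclass(frozen=True)
class Route:
    target_id: int
    link_id: int


class RoutingTable:
    """Hypothetical in-memory Routing Table kept in 'canonical' order."""

    def __init__(self, links: List[Link], routes: List[Route]) -> None:
        # Canonical form: Links ordered by LINK-ID, Routes ordered by TARGET-ID.
        self.links = sorted(links, key=lambda l: l.link_id)
        self.routes = sorted(routes, key=lambda r: r.target_id)
        link_ids = [l.link_id for l in self.links]
        target_ids = [r.target_id for r in self.routes]
        # Duplicate LINK-IDs and duplicate TARGET-IDs are prohibited.
        if len(set(link_ids)) != len(link_ids):
            raise ValueError("duplicate LINK-ID")
        if len(set(target_ids)) != len(target_ids):
            raise ValueError("duplicate TARGET-ID")

    def next_link(self, target_id: int) -> Optional[Link]:
        """Map TARGET-ID to a Route entry, then to the Link it references."""
        for route in self.routes:
            if route.target_id == target_id:
                for link in self.links:
                    if link.link_id == route.link_id:
                        return link
        return None


table = RoutingTable(
    links=[Link(1, 0, 7, True), Link(0, 0, 3, False)],
    routes=[Route(5, 1), Route(0, 0)],
)
```

Keeping both lists sorted means that two tables with the same content always serialize identically, which is what makes a checksum over the table meaningful.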

On non-Retransmitting Devices, Routing Table is rudimentary: it contains only one Link (LINK-ID=0,BUS-ID,INTRA-BUS-ID,...) and only one Route (TARGET-ID=0,LINK-ID=0). Moreover, on non-Retransmitting Devices Routing Table is OPTIONAL; if non-Retransmitting Device does not keep Routing Table - it MUST be reflected in a TODO CAPABILITIES flag during “pairing”; in this case Root MUST send requests to such devices specifying TODO header extension (which contains BUS-ID,INTRA-BUS-ID for the first hop back from target Device).

All Routing Tables on both Retransmitting and non-Retransmitting Devices are essentially (usually partial) replicas of “Master Routing Tables” which are kept on Root. It is a responsibility of Root to maintain Routing Tables for all the Devices (both Retransmitting and non-Retransmitting); it is up to Root which entries to store in each Routing Table. In some cases, Routing Table might need to be truncated; in this case, it is responsibility of Root to use VIA field in Target-Address (see below) to ensure that the packet can be routed given the Routing Tables present. In any case, Routing Table MUST be able to contain at least one entry, with TARGET-ID=0 (Root). This guarantees that path to Root can always be found without VIA field.

In addition, on Retransmitting Devices the following parameters are kept (and updated by Root): MAX-TTL, FORWARD-TO-SANTA-DELAY-UNIT, FORWARD-TO-SANTA-DELAY, MAX-FORWARD-TO-SANTA-DELAY (using same units as FORWARD-TO-SANTA-DELAY), NODE-MAX-RANDOM-DELAY-UNIT, and NODE-MAX-RANDOM-DELAY. MAX-FORWARD-TO-SANTA-DELAY indicates maximum “forward to santa” delay for all Retransmitting Devices in the PAN.

TODO: no mobile non-Retransmitting (TODO reporting ‘mobile’ in pairing CAPABILITIES, plus heuristics), priorities (low->high): non-Retransmitting, Retransmitting.

Broken Routing Tables

Although Routing Tables are updated only by authenticated upper-layer messages, SimpleIoT/HMP recognizes that Routing Tables may become broken during operation. To deal with this, two separate procedures are used. One procedure is intended for destination Devices (either Retransmitting or non-Retransmitting), and is described within “Unicast” section below. The other is intended for Retransmitting Devices, and is described in “Acknowledged Unicast” section below.

Communicating Routing Table Information over SimpleIoT/CCP

As described above, SimpleIoT/HMP relies on Routing Table information being available on all relevant Retransmitting Nodes. To ensure that this information is transmitted in secure manner, it SHOULD be transmitted by an upper-layer secure (and acknowledged-delivery) protocol such as SimpleIoT/CCP. As described above, this doesn’t create a chicken-and-egg problem, as each Retransmitting Node can be accessed via SimpleIoT/HMP regardless of Routing Tables present (or even badly broken) on the Retransmitting Node in question; and as soon as Retransmitting Node can be accessed via SimpleIoT/HMP - upper-layer protocol such as SimpleIoT/CCP can be used to update Routing Table on the Retransmitting Node.

Technically, protocol for communicating Routing Table information is not a part of SimpleIoT/HMP. However, in this section we provide an example implementation of such protocol over CCP_PHY_AND_ROUTING_DATA packets.

CCP_PHY_AND_ROUTING_DATA supports the following packets:

Route-Update-Request: | FLAGS | OPTIONAL-EXTRA-HEADERS | OPTIONAL-ORIGINAL-RT-CHECKSUM | OPTIONAL-MAX-TTL | OPTIONAL-FORWARD-TO-SANTA-DELAY-UNIT | OPTIONAL-FORWARD-TO-SANTA-DELAY | OPTIONAL-MAX-FORWARD-TO-SANTA-DELAY | OPTIONAL-MAX-NODE-RANDOM-DELAY-UNIT | OPTIONAL-MAX-NODE-RANDOM-DELAY | MODIFICATIONS-LIST | RESULTING-RT-CHECKSUM |

where:

  • FLAGS is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] being DISCARD-RT-FIRST (indicating that before processing MODIFICATIONS-LIST, the whole Routing Table must be discarded), bit[1] being UPDATE-MAX-TTL flag, bit[2] being UPDATE-FORWARD-TO-SANTA-DELAY flag, bit[3] being UPDATE-MAX-NODE-RANDOM-DELAY flag, and bits[4..] reserved (MUST be zeros);
  • OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above;
  • Target-Address is the Target-Address field;
  • OPTIONAL-ORIGINAL-RT-CHECKSUM is present only if DISCARD-RT-FIRST flag is not set; it is a Routing-Table-Checksum, specifying Routing Table checksum before the change is applied; if it doesn’t match that of the Routing Table - it is TODO Routing-Error;
  • OPTIONAL-MAX-TTL is present only if UPDATE-MAX-TTL flag is set, and is a 1-byte field;
  • OPTIONAL-FORWARD-TO-SANTA-DELAY-UNIT, OPTIONAL-FORWARD-TO-SANTA-DELAY, and OPTIONAL-MAX-FORWARD-TO-SANTA-DELAY are present only if UPDATE-FORWARD-TO-SANTA-DELAY flag is set, and all are Encoded-Signed-Int<max=2> fields;
  • OPTIONAL-MAX-NODE-RANDOM-DELAY-UNIT and OPTIONAL-MAX-NODE-RANDOM-DELAY are present only if UPDATE-MAX-NODE-RANDOM-DELAY flag is set, and both are Encoded-Unsigned-Int<max=2> fields;
  • MODIFICATIONS-LIST is described below;
  • RESULTING-RT-CHECKSUM is a Routing-Table-Checksum, specifying Routing Table Checksum after the change has been applied (if RESULTING-RT-CHECKSUM doesn’t match - it is TODO Routing-Error).

Route-Update-Request is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x0 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

MODIFICATIONS-LIST consists of entries, where each entry is one of the following:

  • | ADD-OR-MODIFY-LINK-ENTRY-AND-LINK-ID | BUS-ID | NEXT-HOP | NEXT-HOP-ACKS-AND-INTRA-BUS-ID-PLUS-1 | OPTIONAL-LINK-DELAY-UNIT | OPTIONAL-LINK-DELAY | OPTIONAL-LINK-DELAY-ERROR |

    where ADD-OR-MODIFY-LINK-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant ADD_OR_MODIFY_LINK_ENTRY, bit[3] being LINK-DELAY-PRESENT flag, and bits[4..] equal to LINK-ID; BUS-ID is an Encoded-Unsigned-Int<max=2> field; NEXT-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of the next-hop node; NEXT-HOP-ACKS-AND-INTRA-BUS-ID-PLUS-1 is an Encoded-Unsigned-Int<max=4> bitfield substrate, with bit[0] being a NEXT-HOP-ACKS flag for the Routing Table Entry, and bits[1..] representing INTRA-BUS-ID-PLUS-1 (INTRA-BUS-ID-PLUS-1 == 0 means that INTRA-BUS-ID==NULL, and therefore that the link entry is an incoming link entry; otherwise, INTRA-BUS-ID = INTRA-BUS-ID-PLUS-1 - 1); OPTIONAL-LINK-DELAY-UNIT, OPTIONAL-LINK-DELAY, and OPTIONAL-LINK-DELAY-ERROR are present only if LINK-DELAY-PRESENT flag is set, and are Encoded-Unsigned-Int<max=2> fields. NB: by default, link delays are not set by Root, and are set based on device’s internal per-bus settings.

  • | DELETE-LINK-ENTRY-AND-LINK-ID |

    where DELETE-LINK-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant DELETE_LINK_ENTRY, and bits[3..] equal to LINK-ID.

  • | ADD-OR-MODIFY-ROUTE-ENTRY-AND-LINK-ID | TARGET-ID |

    where ADD-OR-MODIFY-ROUTE-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant ADD_OR_MODIFY_ROUTE_ENTRY, and bits[3..] equal to LINK-ID; TARGET-ID is an Encoded-Unsigned-Int<max=2> field.

  • | DELETE-ROUTE-ENTRY-AND-TARGET-ID |

    where DELETE-ROUTE-ENTRY-AND-TARGET-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant DELETE_ROUTE_ENTRY, and bits[3..] equal to TARGET-ID. Note that DELETE-ROUTE-ENTRY-AND-TARGET-ID is the only MODIFICATIONS-LIST entry first field which includes TARGET-ID rather than LINK-ID.
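The bit layout of the first field of each MODIFICATIONS-LIST entry can be illustrated with a small pack/unpack sketch. Several things here are assumptions rather than part of the text above: the numeric values of the four 2-bit constants are made up for illustration, bit[0]=1 is assumed to mean “this is the last entry”, and the Encoded-Unsigned-Int serialization of the resulting integer is not shown.

```python
# Hypothetical values: the 2-bit constants are named above, but their
# numeric values are not given in this section.
ADD_OR_MODIFY_LINK_ENTRY = 0
DELETE_LINK_ENTRY = 1
ADD_OR_MODIFY_ROUTE_ENTRY = 2
DELETE_ROUTE_ENTRY = 3


def pack_first_field(entry_type, ident, is_last, link_delay_present=False):
    """Build the integer value of an entry's first field.

    ident is LINK-ID, except for DELETE_ROUTE_ENTRY where it is TARGET-ID.
    Assumption: bit[0] == 1 marks the last entry of MODIFICATIONS-LIST.
    """
    value = (1 if is_last else 0) | (entry_type << 1)
    if entry_type == ADD_OR_MODIFY_LINK_ENTRY:
        # bit[3] is LINK-DELAY-PRESENT, the id starts at bit[4]
        value |= (1 if link_delay_present else 0) << 3
        value |= ident << 4
    else:
        # all other entry types carry the id in bits[3..]
        value |= ident << 3
    return value


def unpack_first_field(value):
    """Return (entry_type, ident, is_last, link_delay_present)."""
    is_last = bool(value & 1)
    entry_type = (value >> 1) & 0x3
    if entry_type == ADD_OR_MODIFY_LINK_ENTRY:
        return entry_type, value >> 4, is_last, bool((value >> 3) & 1)
    return entry_type, value >> 3, is_last, False
```

Note that only ADD-OR-MODIFY-LINK-ENTRY-AND-LINK-ID spends bit[3] on a flag, so its id is shifted one bit further than in the other three entry types.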

Route-Update-Request packets always go from Root to Device. Route-Update-Request MAY be sent either to Retransmitting or to non-Retransmitting Device; however (as with any SimpleIoT/CCP packet), if sending it to a non-Retransmitting Device, Root MUST be sure that non-Retransmitting Device has its transmitter turned on (because upper-layer protocol state guarantees it).

Route-Update-Response: | ERROR-CODE | TODO: more error info if any

where ERROR-CODE is an Encoded-Unsigned-Int<max=1> field, containing error code. ERROR-CODE = 0 means that Route-Update-Request has been completed successfully.

Route-Update-Response is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x0 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

Communicating PHY Information over SimpleIoT/CCP

Some of SimpleIoT/DLP-* protocols (as described in corresponding SimpleIoT/DLP-* document) MAY need to communicate information to Central Controller (for example, to calculate optimums using quite complicated methods).

This is done via the following packets:

PHY-Data-Request: | ID-OF-SADLP | DLP-DEPENDENT-PAYLOAD |

where ID-OF-SADLP is an Encoded-Unsigned-Int<max=2> field, specified in respective SimpleIoT/DLP-* document. TODO: list of all IDs in one place to avoid potential for collisions.

PHY-Data-Request is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x1 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Response: | DLP-DEPENDENT-PAYLOAD |

PHY-Data-Response is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x1 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Ready-Request: | (empty)

PHY-Data-Ready-Request is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x2 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Ready-Response: | (empty)

PHY-Data-Ready-Response is always accompanied with SimpleIoT/CCP “additional bits” equal to 0x2 (see SimpleIoT Command and Control Protocol (SimpleIoT/CCP) for details on CCP_PHY_AND_ROUTING_DATA “additional bits”).

To indicate that PHY-level tuning is completed, Device sends PHY-Data-Ready-Response (sic!); this happens at the point specified in respective SimpleIoT/DLP-* document. In response, Root sends PHY-Data-Ready-Request (sic!).

Addressing

SimpleIoT/HMP supports two ways of addressing devices: non-paired and paired.

Non-paired addressing is used for temporarily addressing Devices which are not “paired” with SimpleIoT Root (yet). Non-paired addressing is used ONLY during “Pairing” process, as described in SimpleIoT Pairing document. As soon as “pairing” is completed, Device obtains its own HMP-NODE-ID (TODO: add to pairing document), and all further communications with Device are performed using “paired” addressing. Non-paired addressing is a triplet (NODE-ID,BUS-ID,INTRA-BUS-ID).

Paired addressing is used for addressing Devices which have already been “paired”. It is always a single item, HMP-NODE-ID. Root always has HMP-NODE-ID=0.

HMP Checksums

To validate integrity of HMP headers, and of the whole HMP packets, HMP-CHECKSUM is used.

HMP-CHECKSUM is defined as a Fletcher-16 checksum, as described in https://en.wikipedia.org/wiki/Fletcher%27s_checksum (using modulo 255), stored using “SimpleIoT Endianness”.

Whenever the packet has both header and body, SimpleIoT/HMP uses two HMP-CHECKSUMs: first checksum (referred to as HEADER-CHECKSUM) encompasses only header (i.e. everything before the first checksum), second HMP-CHECKSUM (referred to as FULL-CHECKSUM) is located at the very end and encompasses header+first_checksum+body (i.e. everything before the second checksum).
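As an illustration, the checksum and the two-checksum layout described above might look like this in Python. The fletcher16 function follows the standard modulo-255 algorithm; the helper frame_with_checksums and its byteorder parameter are hypothetical, and the default "little" is only a placeholder, since “SimpleIoT Endianness” is not defined in this section.

```python
def fletcher16(data: bytes) -> int:
    """Fletcher-16 checksum using modulo 255."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def frame_with_checksums(header: bytes, body: bytes,
                         byteorder: str = "little") -> bytes:
    """Lay out header | HEADER-CHECKSUM | body | FULL-CHECKSUM.

    byteorder is an assumption standing in for "SimpleIoT Endianness".
    """
    # HEADER-CHECKSUM covers everything before it, i.e. the header only.
    header_checksum = fletcher16(header).to_bytes(2, byteorder)
    covered = header + header_checksum + body
    # FULL-CHECKSUM covers header + first checksum + body.
    full_checksum = fletcher16(covered).to_bytes(2, byteorder)
    return covered + full_checksum


packet = frame_with_checksums(b"\x01\x02", b"\x03")
```

A receiver that only manages to recover the header of a “partially correct” packet can still validate it with HEADER-CHECKSUM alone, without needing the body.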

DELAYs and DELAY-UNITs

Whenever delay (or more generally - time interval) needs to be calculated, it is always represented as two fields: DELAY itself and corresponding DELAY-UNIT.

To calculate delay for specific DELAY and DELAY-UNIT, the following formula is used (the formula as written is assumed to be in floating-point; other equivalent implementations are possible depending in particular on timer resolution for specific Device): delay = 1 millisecond * DELAY * (2^DELAY-UNIT); that is, DELAY-UNIT=0 and DELAY=1 means 1 millisecond, DELAY-UNIT=1 and DELAY=1 means 2 milliseconds, and DELAY-UNIT=-2 and DELAY=1 means 0.25 milliseconds.
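A direct floating-point transcription of this formula, shown only as a sketch (the function name delay_ms is hypothetical; a real Device would likely use integer arithmetic matched to its timer resolution):

```python
def delay_ms(delay: int, delay_unit: int) -> float:
    """Return the interval in milliseconds: DELAY * 2^DELAY-UNIT."""
    # delay_unit may be negative, which yields sub-millisecond intervals.
    return 1.0 * delay * (2.0 ** delay_unit)
```

For example, delay_ms(1, 0), delay_ms(1, 1) and delay_ms(1, -2) reproduce the three cases listed above.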

Recovery Philosophy

Recovery from route changes/failures is vital for any mesh protocol. SimpleIoT/HMP does it as follows:

  • by default, most of the transfers are not acknowledged at SimpleIoT/HMP level (go as Hmp-Unicast-Data-Packet without ACKNOWLEDGED-DELIVERY flag)
  • however, upper-layer protocol (normally SimpleIoT/GDP) issues its own retransmits and passes retransmit number to SimpleIoT/HMP
  • on retransmit #TODO, SimpleIoT/HMP switches ACKNOWLEDGED-DELIVERY flag on
  • when ACKNOWLEDGED-DELIVERY flag is set, SimpleIoT/HMP uses ‘Acknowledged Uni-Cast’ mode described below
  • if ‘Acknowledged Uni-Cast’ fails for M times (as described below), link failure is assumed
  • link failure (as described above) is reported to the Root, so it can initiate route discovery to the node on the other side of the failed link (using Hmp-From-Santa-Data-Packet)
    • if link failure is detected from the side of the link which is close to Root, link failure reporting is done by sending Routing-Error (which always comes in ACKNOWLEDGED-DELIVERY mode) back to Root
    • if link failure is detected from the side of the link which is far from Root, link failure reporting is done by broadcasting Hmp-To-Santa-Data-Or-Error-Packet, which is then converted into Hmp-Forward-To-Santa-Data-Or-Error-Packet (which is always sent in ACKNOWLEDGED-DELIVERY mode) by all Retransmitting Devices which have received it.

Storm Avoidance

To reduce number of induced collisions during broadcasts, a.k.a. “request storm” and “reply storm” (NB: avoiding “storms” is important even when CSMA/CA is present, because CSMA/CA provides only probabilistic success), SimpleIoT/HMP supports two mechanisms: explicit time-based collision avoidance, and random-delay-based storm avoidance.

Explicit Time-Based Storm Avoidance and Collision Domains

SimpleIoT/HMP explicit time-based collision avoidance works as follows:

  • to avoid “request storm”: when performing a ‘network flood’ (using Hmp-From-Santa-Data-Packet), Root MAY specify explicit time delays for each node.
  • to avoid “reply storm”: Root MAY specify FORWARD-TO-SANTA-DELAY-* parameters; whenever a Hmp-To-Santa-Data-Or-Error-Packet (these are essentially sent as “anybody who can hear this, forward it to Root”) is received by Retransmitting Nodes, each receiving Retransmitting Node waits according to FORWARD-TO-SANTA-DELAY before retransmitting.
  • In addition (to avoid “storms” in general), each HMP packet MAY have ‘Collision-Domain’ restrictions (i.e. “from t0-from-now to t1-from-now, don’t transmit on Collision-Domain #CD”). Retransmitting Devices SHOULD monitor Collision-Domain headers in promiscuous mode and work accordingly, even if the packet is not addressed to this Retransmitting Device.

Random-delay-based Storm Avoidance

If explicit time-based collision avoidance is not used, Retransmitting Devices MUST use random delays (based on NODE-MAX-RANDOM-DELAY-UNIT and NODE-MAX-RANDOM-DELAY) as specified below.

Target-Address, Multiple-Target-Addresses, and Multiple-Target-Addresses-With-Extra-Data

Target-Address can store either a paired address or a non-paired address. Target-Address is encoded as

| FLAG-AND-NODE-ID | OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID | ... | OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID | OPTIONAL-CUSTOM-INTRA-BUS-SIZE | OPTIONAL-INTRA-BUS-ID |

where FLAG-AND-NODE-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, where bit[0] is EXTRA_DATA_FOLLOWS flag, and bits[1..] are NODE-ID.

OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present only if EXTRA_DATA_FOLLOWS is set, and is an Encoded-Unsigned-Int<max=2> bitfield substrate, where bit[0] represents IS_NONPAIRED_ADDRESS flag, and the rest of the bits depend on bit[0]. If IS_NONPAIRED_ADDRESS flag is not set, then bits[1..] represent VIA field (encoded as NODE-ID+1); if VIA field is -1 (because bits[1..] are zero), then no further extra data fields are present. If IS_NONPAIRED_ADDRESS flag is set, then bits[1..3] represent INTRA-BUS-SIZE (with value 0x7 interpreted in a special way, specifying that INTRA-BUS-SIZE is ‘custom’), and bits[4..] represent BUS-ID. If IS_NONPAIRED_ADDRESS flag is not set, and VIA field in it is >=0, it means that another OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID field is present, which is interpreted as above. An OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID field with either IS_NONPAIRED_ADDRESS set, or with VIA field equal to -1, denotes the end of the list.

OPTIONAL-CUSTOM-INTRA-BUS-SIZE is present only if OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present, and flag IS_NONPAIRED_ADDRESS is set, and INTRA-BUS-SIZE field has value ‘custom’; OPTIONAL-INTRA-BUS-ID is present only if OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present, and has INTRA-BUS-SIZE (calculated from OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID and OPTIONAL-CUSTOM-INTRA-BUS-SIZE) size.
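The field-by-field interpretation of a single Target-Address can be sketched as a small parser. To keep the example short it makes some simplifying assumptions: it works on integers that have already been decoded from their Encoded-Unsigned-Int form, it stops before OPTIONAL-CUSTOM-INTRA-BUS-SIZE and OPTIONAL-INTRA-BUS-ID (the mapping from the 3-bit INTRA-BUS-SIZE code to a byte count is not given in this section), and the function name and return shape are hypothetical.

```python
from typing import List, Optional, Tuple

CUSTOM_INTRA_BUS_SIZE = 0x7  # special INTRA-BUS-SIZE code meaning 'custom'


def parse_target_address(
    fields: List[int],
) -> Tuple[int, List[int], Optional[Tuple[int, int]], int]:
    """Parse already-decoded integer fields of one Target-Address.

    Returns (node_id, vias, nonpaired, consumed), where nonpaired is
    either None or (intra_bus_size_code, bus_id), and consumed is the
    number of integer fields read.
    """
    first = fields[0]
    extra_data_follows = bool(first & 1)
    node_id = first >> 1
    vias: List[int] = []
    nonpaired = None
    pos = 1
    while extra_data_follows:
        field = fields[pos]
        pos += 1
        if field & 1:  # IS_NONPAIRED_ADDRESS set: this field ends the list
            size_code = (field >> 1) & 0x7
            bus_id = field >> 4
            nonpaired = (size_code, bus_id)
            break
        via = (field >> 1) - 1  # VIA is stored as NODE-ID + 1
        if via == -1:  # bits[1..] are zero: end of the list
            break
        vias.append(via)
    return node_id, vias, nonpaired, pos
```

For instance, the field sequence [11, 6, 16, 0] describes NODE-ID 5 with VIA fields 2 and 7, followed by the all-zero terminator.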

Multiple-Target-Addresses is essentially a multi-cast address. It is encoded as a list of items, where each item is similar to a Target-Address field, with the following changes:

  • for list entries, within FLAG-AND-NODE-ID field it is NODE-ID + 1 which is stored (instead of simple NODE-ID for single Target-Address). This change does not affect VIA fields.
  • to denote the end of Multiple-Target-Addresses list, FLAG-AND-NODE-ID field with EXTRA_DATA_FOLLOWS=0 and NODE-ID=0, is used
  • value of FLAG-AND-NODE-ID field with EXTRA_DATA_FOLLOWS=1 and NODE-ID=0, is prohibited (reserved)

Multiple-Target-Addresses-With-Extra-Data is the same as Multiple-Target-Addresses, but each item (except for the last one, where NODE-ID=0), additionally contains some extra data (which is specified whenever Multiple-Target-Addresses-With-Extra-Data is mentioned). For example, if we’re speaking about “Multiple-Target-Addresses-With-Extra-Data, where Extra-Data is 1-byte field”, it means that each item of the list (except for the last one) will have both Target-Address field (with changes described in Multiple-Target-Addresses), and 1-byte field of extra data.

Time-To-Live

Time-To-Live (TTL) is a field which is intended to address misconfigured/inconsistent Routing Tables. TTL is set to certain value (default 4) whenever the packet is sent, and is decremented by each Node which retransmits the packet. TTL=0 is valid, but TTL < 0 is not; whenever the packet needs to be retransmitted and it would cause TTL to become < 0 - the packet is dropped (with a Routing-Error, see below).

During normal operation, it SHOULD NOT occur. Whenever the packet is dropped because TTL is down to zero (except for Routing-Error HMP packets), it MUST cause a TODO Routing-Error to be sent to Root.

Uni-Cast Processing

Whenever a Uni-Cast packet (the one with a Target-Address field) is received by Retransmitting Device, the procedure is the following:

  • check if the Target-Address is intended for the Retransmitting Device
    • if it is - process the packet locally and don’t process further
  • if packet TTL is already equal to 0 - drop the packet and send Routing-Error to the Root (see Time-To-Live section above for details)
  • decrement packet TTL
  • using Routing Table, find next hop for the Target-Address
    • if next hop cannot be found for the Target-Address itself, but Target-Address contains VIA field(s) - try to find next hop based on each of VIA fields
    • if next hop cannot be found using Target-Address and all VIA field(s) - drop the packet and send TODO Routing-Error to the Root
  • if any of VIA fields in the Target-Address is the same as the next hop - remove all such VIA fields from the Target-Address
  • find bus for the next hop and send the modified packet (with the TTL and VIA modifications described above) over this bus
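The steps above can be sketched as a single decision function. This is a simplified model, not the protocol implementation: the Routing Table is flattened into a hypothetical dict mapping a target NODE-ID to a (next-hop NODE-ID, BUS-ID) pair, the Packet class is invented for the example, and sending or reporting a Routing-Error is represented by a returned action string instead of real I/O.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Packet:
    target_id: int
    ttl: int
    vias: List[int] = field(default_factory=list)


def forward_unicast(
    self_id: int,
    routes: Dict[int, Tuple[int, int]],
    packet: Packet,
) -> Tuple[str, Optional[int]]:
    """Return (action, bus_id) for a received Uni-Cast packet."""
    if packet.target_id == self_id:
        return ("local", None)  # addressed to us, no further processing
    if packet.ttl == 0:
        return ("drop-routing-error", None)  # TTL would become negative
    packet.ttl -= 1
    # Look up the target itself first, then fall back to each VIA field.
    hop = routes.get(packet.target_id)
    if hop is None:
        for via in packet.vias:
            hop = routes.get(via)
            if hop is not None:
                break
    if hop is None:
        return ("drop-routing-error", None)
    next_hop, bus_id = hop
    # VIA fields equal to the next hop have served their purpose.
    packet.vias = [v for v in packet.vias if v != next_hop]
    return ("send", bus_id)


routes = {0: (0, 1), 9: (9, 2)}
pkt = Packet(target_id=42, ttl=4, vias=[9, 3])
result = forward_unicast(7, routes, pkt)
```

In this run node 7 has no route to node 42, so it falls back to VIA field 9, strips that VIA field from the packet, and forwards over bus 2 with the TTL reduced to 3.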

Processing on Destination and Broken Routing Table

As described above, SimpleIoT/HMP does recognize that Routing Tables may become broken during operation. On a destination Device, whenever Device attempts retransmit #TODO of the message, Device sends it as a Hmp-To-Santa message, ignoring local Routing Table completely; TODO: add optional-header with RT-CHECKSUM for Hmp-To-Santa messages?

Acknowledged Uni-Cast

As described in detail below, Hmp-Unicast-Data-Packets, except for Hmp-Unicast-Data-Packets without ACKNOWLEDGED-DELIVERY flag and Hmp-Loop-Ack-Packet, are sent in ‘Acknowledged Uni-Cast’ mode.

Processing by Retransmitting Devices

If packet is to be delivered to the next hop in ‘Acknowledged’ mode by Retransmitting Device, it is processed in the following manner:

If the packet already has LOOP-ACK extra header (see below), and next hop has NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • sends Hmp-Loop-Ack-Packet (see below) back to the requestor specified in LOOP-ACK extra header
  • removes LOOP-ACK extra header
  • continues processing as specified below

If the next hop has NEXT-HOP-ACKS flag set in the Routing Table, the packet is sent using “uni-cast” bus mechanism and a timer is set. If the timer expires (or Node receives relevant Hmp-Ack-Nack-Packet with IS-NACK flag set), SimpleIoT/HMP retries up to 5 times (with exponentially increasing timeouts - TODO); if all 5 attempts fail - it is treated as ‘Routing-Error’. In particular:

  • if the packet has Root as Target-Address:
    • packet Hmp-To-Santa-Data-Or-Error-Packet containing TBD Routing-Error as PAYLOAD (and with IS_ERROR flag set) is broadcast
    • if possible, the packet which wasn’t delivered, SHOULD be preserved (TODO: what to do if it cannot be?), and retransmitted as soon as route to the Root is restored
  • if the packet has anything except for Root as Target-Address (and therefore is coming from Root):
    • packet Hmp-Routing-Error containing TBD Routing-Error is sent (towards Root)
    • to deal with potentially broken Routing Table on this Retransmitting Device, this Hmp-Routing-Error packet MUST contain TODO optional-header with RT-Checksum
    • the packet which wasn’t delivered, doesn’t need to be preserved (TODO: identify packet which has been lost within Routing-Error)

If the packet doesn’t have LOOP-ACK extra header, and next hop doesn’t have NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • adds LOOP-ACK extra header (which is described below) to the packet (if it is not already present)
  • sends modified packet using “bus unicast” operation
  • and sets timer to TODO
    • if the sender doesn’t receive Hmp-Loop-Ack-Packet before the timer expires - it retransmits the packet at SimpleIoT/HMP level.
      • if such attempts don’t succeed for 5 (TODO) times (with exponentially increasing timeouts - TODO) - it is treated as ‘Routing-Error’ (the same way as described above, depending on packet having Root as a Target-Address).

If the packet already has LOOP-ACK extra header, and next hop doesn’t have NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • keeps LOOP-ACK extra header
  • sends packet using “bus unicast” operation
  • doesn’t set any timers
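The four combinations of “packet has LOOP-ACK extra header” and “next hop has NEXT-HOP-ACKS flag” described in this section can be summarized as a decision table. The sketch below only returns a list of symbolic actions; the function name and the action strings are hypothetical, and retries, timeouts and Routing-Error handling are deliberately left out.

```python
from typing import List


def plan_acknowledged_hop(has_loop_ack: bool, next_hop_acks: bool) -> List[str]:
    """List the actions a Retransmitting Device takes for one hop."""
    actions: List[str] = []
    if next_hop_acks:
        if has_loop_ack:
            # The acknowledging hop closes the loop opened upstream.
            actions.append("send Hmp-Loop-Ack-Packet to requestor")
            actions.append("remove LOOP-ACK header")
        actions.append("send via bus unicast")
        actions.append("set next-hop ack timer")
    elif has_loop_ack:
        # Somebody upstream is already waiting for the loop ack.
        actions.append("send via bus unicast")
    else:
        # This device opens the loop and waits for Hmp-Loop-Ack-Packet.
        actions.append("add LOOP-ACK header")
        actions.append("send via bus unicast")
        actions.append("set loop-ack timer")
    return actions
```

Only the device that adds the LOOP-ACK extra header, or one whose next hop can acknowledge directly, sets a timer; a device that merely passes an existing LOOP-ACK along a non-acknowledging link does not.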

LOOP-ACK on Destination

If packet with LOOP-ACK extra header is received by destination Device, destination Device MUST send Hmp-Loop-Ack-Packet back to the node specified in LOOP-ACK extra header. If destination Device is a non-Retransmitting Device, it will send Hmp-Loop-Ack-Packet with Target-Address specified in LOOP-ACK, but to the next hop specified in Root’s Routing Table entry. TODO: is it possible that Device doesn’t have a route to Root yet?

LOOP-ACK and Routing

As LOOP-ACK currently doesn’t support VIA routing, it means that Root MUST ensure that all the nodes on the “loop” route already know the routes without VIA fields; it applies both to the route from the loop beginning to the loop end, and back from the loop end to the loop beginning (as for request-response cycle, LOOP-ACKs go both directions). When speaking about ‘back from the loop end to the loop beginning’, it MUST be taken into account that, as specified above, non-Retransmitting Device will send a Hmp-Loop-Ack-Packet in the direction of the Root (but with Target-Address equal to the address from LOOP-ACK extra header), so there MUST be an already-defined route from this next-hop-in-direction-of-Root to the loop beginning.

“From-Santa” Packet Processing

Whenever a From-Santa packet (see below) is processed by a Retransmitting Device, the procedure is the following:

  • check if one of addresses within Target-Address is intended for the Retransmitting Device (TODO: if multiple addresses match the Retransmitting Device - it is a TODO Routing-Error, which should never happen)
    • if it is - process the packet as terminating device (as described in more detail while discussing Hmp-From-Santa-Data-Packet below) and do not process any further
  • if packet TTL is already equal to 0 - drop the packet and send Routing-Error to the Root (see Time-To-Live section above for details)
  • decrement packet TTL
  • using Routing Table, find all retransmitters in the MULTIPLE-RETRANSMITTING-ADDRESSES for which routes are known, and exclude the rest from consideration; group found retransmitters according to Bus IDs in their routes; for each group of retransmitters create a new packet with MULTIPLE-RETRANSMITTING-ADDRESSES consisting of retransmitters from the group, and send the packet using respective BUS-ID
  • if the retransmitter is not listed in the MULTIPLE-RETRANSMITTING-ADDRESSES, stop
  • if all bus types in the bus-type-list were used while sending packets during above steps (if any), stop
  • for each remaining bus type prepare and send a packet with the same target list and empty retransmitter list.
    • if the bus supports multi-casting - send the modified packet using multi-cast bus addressing over the bus.
    • otherwise, [TODO: check details] send the packet using uni-cast bus addressing to each of the potential recipients

[TODO: if VIA fields are expected to be used, address this issue]

NOTE: at terminating device the above steps result in

  • check whether the device is in the target-list; if found, process the packet as described in more detail while discussing Hmp-From-Santa-Data-Packet below.

At the Root device, forming a From-Santa packet can be organized as follows:

  • determine a list of devices to be found and form a Target-Address list
  • determine which types of buses have devices to be found and form bus-type-list
  • determine a list of retransmitting devices to be used; ultimately, it can be a list of all retransmitters with known routes, or a subset of this list
  • further processing is done as if the Root were a retransmitting device that has received a From-Santa packet with data formed above and that has found itself in the MULTIPLE-RETRANSMITTING-ADDRESSES.

Note: if more than a single device is selected as a target at Root, the payload of the packet must be empty.

Promiscuous Mode Processing

Retransmitting Devices SHOULD, wherever possible, listen to all the packets in “promiscuous mode”. It allows for the following processing:

  • if Retransmitting Device hears a packet addressed (at underlying protocol level) to another (“next-hop”) Retransmitting Device (which is not Root), and it has a RETRANSMIT-ON-NO-RETRANSMIT flag in Routing Table for the route entry for that Retransmitting Device, and after a TODO timeout it doesn’t hear the next-hop Retransmitting Device retransmit the same packet (neither in full nor as a “partially correct” packet; TODO define “the same packet”), it MUST try to send a TODO packet to the next-hop Retransmitting Device (in “acknowledged mode”); the receiving Device MUST forward the packet to the destination, and send (or attach as a Combined-Packet if the target is Root) a TODO Routing-Error to the Root. If this attempt by our Retransmitting Device doesn’t succeed - our Retransmitting Device MUST send a TODO Routing-Error packet (containing the packet as a payload) to the Root.

OPTIONAL-EXTRA-HEADERS

Most HMP packets have an OPTIONAL-EXTRA-HEADERS field. It has a generic structure, but interpretations depend on the packet type. More specifically, OPTIONAL-EXTRA-HEADERS is a sequence of the following items:

  • | GENERIC-EXTRA-HEADER-FLAGS |

    where GENERIC-EXTRA-HEADER-FLAGS is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..2] equal to 2-bit constant GENERIC_EXTRA_HEADER_FLAGS, and further bits interpreted depending on packet type:

    • bit[3]. MORE-PACKETS-FOLLOW flag.
    • bit[4]. If the packet type is Hmp-To-Santa-Data-Or-Error-Packet or Hmp-Forward-To-Santa-Data-Or-Error-Packet - the bit is IS-ERROR (indicating that PAYLOAD is in fact Routing-Error). If the packet type is Hmp-From-Santa-Data-Packet - it is a TARGET-COLLECT-LAST-HOPS flag. For Hmp-To-Santa-Data-Or-Error-Packet the bit is IS-LOCAL-ECHO flag. For Hmp-Ack-Nack-Packet the bit is IS-NACK flag. For other packet types - RESERVED (MUST be zero)
    • bit[5]. If the packet type is Hmp-From-Santa-Data-Packet, the bit is an EXPLICIT-TIME-SCHEDULING flag. For Hmp-Ack-Nack-Packet - the bit is IS-LOOP-ACK flag. For other packet types - RESERVED (MUST be zero)
    • bit[6]. RESERVED (MUST be zero)
    • bit[7]. If the packet type is Hmp-Unicast-Data-Packet, Hmp-From-Santa-Data-Packet, Hmp-To-Santa-Data-Or-Error-Packet, or Hmp-Forward-To-Santa-Data-Packet - the bit is IS-PROBE flag. For Hmp-Ack-Nack packet - the bit is DELAYS-PRESENT. For other packet types - RESERVED (MUST be zero)
    • bits [8..] - RESERVED (MUST be zeros)
  • | GENERIC-EXTRA-HEADER-COLLISION-DOMAIN | COLLISION-DOMAIN-ID-AND-FLAG | COLLISION-DOMAIN-T0 | COLLISION-DOMAIN-T1 | ... |

    where GENERIC-EXTRA-HEADER-COLLISION-DOMAIN is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..2] equal to 2-bit constant GENERIC_EXTRA_HEADER_COLLISION_DOMAIN, and bits [3..] equal to DELAY-UNIT; COLLISION-DOMAIN-ID-AND-FLAG is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=0 indicating the end of collision-domain list, bits[1..] being COLLISION-DOMAIN-ID; COLLISION-DOMAIN-T0 and COLLISION-DOMAIN-T1 are Encoded-Unsigned-Int<max=2> fields specifying respectively beginning and end of the window (“from now”) when COLLISION-DOMAIN-ID SHOULD NOT be disturbed. There can be multiple GENERIC-EXTRA-HEADER-COLLISION-DOMAIN headers in the same packet.

    GENERIC-EXTRA-HEADER-COLLISION-DOMAIN is a special kind of header; on receiving it, each node SHOULD take information within into account, and SHOULD NOT transfer over corresponding COLLISION-DOMAIN-ID within specified time window. In addition, whenever Retransmitting Device retransmits such a packet, it MUST calculate NEW-COLLISION-DOMAIN-T0 = MAX(0,OLD-COLLISION-DOMAIN-T0 - INCOMING-LINK-DELAY - OUTGOING-LINK-DELAY) and NEW-COLLISION-DOMAIN-T1 = MAX(0,OLD-COLLISION-DOMAIN-T1 - INCOMING-LINK-DELAY - OUTGOING-LINK-DELAY + INCOMING-LINK-DELAY-ERROR + OUTGOING-LINK-DELAY-ERROR) and use NEW-* values in the retransmitted packet; for calculating OLD-COLLISION-DOMAIN-* parameters DELAY-UNIT field is used, *-LINK-DELAY parameters together with their DELAY-UNITs are taken from corresponding entries in Routing Table; after doing these calculations, if both NEW-COLLISION-DOMAIN-T0 and NEW-COLLISION-DOMAIN-T1 become =0, this specific extra header SHOULD be dropped (i.e. not sent further).

  • | UNICAST-EXTRA-HEADER-LOOP-ACK | LOOP-ACK-ID |

    where UNICAST-EXTRA-HEADER-LOOP-ACK is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..2] equal to a 2-bit constant UNICAST_EXTRA_HEADER_LOOP_ACK, and bits[3..] representing NODE-ID of the node to which the LOOP-ACK is to be sent; LOOP-ACK-ID is an Encoded-Unsigned-Int<max=2> field representing ID of the LOOP-ACK to be returned. This extra header MUST NOT be present for packets other than Hmp-Unicast-Data-Packet.

  • | TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP | CONNECTION_QUALITY |

    where TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..3] equal to 3-bit constant TOSANTA_EXTRA_HEADER_LAST_INCOMING_HOP, and bits [4..] being node id; and CONNECTION_QUALITY is an Encoded-Unsigned-Int<max=1> bitfield substrate, with bits[0..3] being signal level (with 0 corresponding to the highest and 15 to the lowest signal level) and bits[4..6] being error count (resulting from error correction of the received packet). This extra header MUST NOT be present for packets other than Hmp-To-Santa-Data-Or-Error-Packet. There can be multiple TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP extra headers within a single packet.
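The window recalculation that a Retransmitting Device performs for GENERIC-EXTRA-HEADER-COLLISION-DOMAIN (described above) can be sketched as follows. This is only an illustration: it assumes that all values have already been converted to one common time unit (the DELAY-UNIT conversion is not shown), and the function and parameter names are hypothetical, not part of the specification.

```python
def adjust_collision_window(old_t0, old_t1, in_delay, out_delay,
                            in_delay_error, out_delay_error):
    """Return (new_t0, new_t1), or None if the extra header should be dropped.

    All arguments are assumed to be expressed in the same time unit.
    """
    link_delay = in_delay + out_delay
    # NEW-T0 = MAX(0, OLD-T0 - INCOMING-LINK-DELAY - OUTGOING-LINK-DELAY)
    new_t0 = max(0, old_t0 - link_delay)
    # NEW-T1 additionally widens the window by the link delay errors
    new_t1 = max(0, old_t1 - link_delay + in_delay_error + out_delay_error)
    if new_t0 == 0 and new_t1 == 0:
        return None  # both ends have expired: header is not sent further
    return new_t0, new_t1
```

Adding the delay errors only to T1 keeps the end of the window conservative, so a node further along the route does not start transmitting too early.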

NB: 2-bit extra header type constants MAY overlap as long as applicable types are different.
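As an illustration of the GENERIC-EXTRA-HEADER-FLAGS layout described above, the sketch below splits an already decoded integer into its bit fields. It assumes that bit[0] is the least significant bit of the decoded Encoded-Unsigned-Int value; the function name and dictionary keys are hypothetical. Bits whose meaning depends on the packet type are returned raw rather than interpreted.

```python
def split_generic_extra_header_flags(value):
    """Split a decoded GENERIC-EXTRA-HEADER-FLAGS integer into bit fields.

    Assumption: bit[0] is the least significant bit of the decoded integer.
    """
    def bit(n):
        return (value >> n) & 1

    return {
        "list_end_bit": bit(0),              # raw bit[0] (end-of-list marker)
        "header_type": (value >> 1) & 0b11,  # bits[1..2], 2-bit type constant
        "more_packets_follow": bool(bit(3)),
        "bit4": bit(4),                      # meaning depends on packet type
        "bit5": bit(5),                      # meaning depends on packet type
        "bit7": bit(7),                      # meaning depends on packet type
        # bit[6] and bits[8..] are RESERVED and MUST be zero
        "reserved_ok": bit(6) == 0 and (value >> 8) == 0,
    }
```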

HMP Combined-Packet

In general, SimpleIoT/HMP passes Combined-Packets over the underlying protocol. An HMP Combined-Packet consists of one or more HMP Packets as described below; all HMP Packets except for the last one in an HMP Combined-Packet have the MORE-PACKETS-FOLLOW flag set (depending on the packet type, this flag is either passed as a part of the first field, or as a part of GENERIC-EXTRA-HEADER-FLAGS, see details below).

When combining packets, SimpleIoT/HMP MUST take into account both “MTU Hard Limits” and “MTU Soft Limits” of the appropriate SimpleIoT/DLP-* protocol.

HMP Packets

Hmp-Unicast-Data-Packet: | HMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL | OPTIONAL-EXTRA-HEADERS | NEXT-HOP | LAST-HOP | Non-Root-Address | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where HMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] equal to 0, bit[1] being ACKNOWLEDGED-DELIVERY flag, bit [2] being reserved (must be 0), bit [3] being EXTRA-HEADERS-PRESENT, bit[4] being DIRECTION-FLAG that is set, if a packet follows from the Root, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT flag is set and is described above; NEXT-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of the next-hop node (based on info from Routing Table), LAST-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of currently transmitting node, Non-Root-Address is a target (recipient) address or a source (sender) address depending on DIRECTION-FLAG and is always a device ID of a communication party other than the Root, OPTIONAL-PAYLOAD-SIZE is present only if optional headers are present and MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field, HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details), PAYLOAD is a payload to be passed to the upper-layer protocol, and FULL-CHECKSUM is a HMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.
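The bit layout of HMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL can be illustrated with a small pack/unpack pair. This is a sketch under assumptions: bit[0] is taken to be the least significant bit, the value is handled as a plain integer (the Encoded-Unsigned-Int<max=2> wire encoding and its size limit are not modelled), and all function and key names are hypothetical.

```python
def pack_unicast_flags_and_ttl(acknowledged_delivery, extra_headers_present,
                               from_root, ttl):
    """Build the first field of an Hmp-Unicast-Data-Packet as an integer."""
    if ttl < 0:
        raise ValueError("TTL must be non-negative")
    value = 0                                # bit[0] = 0, bit[2] reserved = 0
    value |= int(acknowledged_delivery) << 1  # ACKNOWLEDGED-DELIVERY
    value |= int(extra_headers_present) << 3  # EXTRA-HEADERS-PRESENT
    value |= int(from_root) << 4              # DIRECTION-FLAG
    value |= ttl << 5                         # TTL in bits[5..]
    return value


def unpack_unicast_flags_and_ttl(value):
    """Inverse of pack_unicast_flags_and_ttl; rejects non-unicast values."""
    if value & 0b1:
        raise ValueError("bit[0] is 1: not an Hmp-Unicast-Data-Packet")
    if (value >> 2) & 1:
        raise ValueError("reserved bit[2] must be 0")
    return {
        "acknowledged_delivery": bool((value >> 1) & 1),
        "extra_headers_present": bool((value >> 3) & 1),
        "from_root": bool((value >> 4) & 1),
        "ttl": value >> 5,
    }
```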

If NEXT-HOP field doesn’t match ID of the receiving Device - the packet is ignored.

If a packet is addressed to the Root (DIRECTION-FLAG is not set), it MUST NOT contain VIA fields within.

If IS-PROBE flag is set, then PAYLOAD is treated differently. When destination receives Hmp-Unicast-Data-Packet with IS-PROBE flag set, destination doesn’t pass PAYLOAD to upper-layer protocol. Instead, destination parses PAYLOAD as follows: | PROBE-TYPE | OPTIONAL-PROBE-EXTRA-HEADERS | PROBE-PAYLOAD | where PROBE-TYPE is 1-byte bitfield substrate, with bits [0..2] being either PROBE_UNICAST or PROBE_TO_SANTA, bit[3] being PROBE-EXTRA-HEADERS-PRESENT, and bits [4..7] reserved (MUST be zeros); OPTIONAL-PROBE-EXTRA-HEADERS are similar to OPTIONAL-EXTRA-HEADERS, and PROBE-PAYLOAD takes the rest of the PAYLOAD. If PROBE-TYPE==PROBE_UNICAST, then destination Device sends Hmp-Unicast-Data-Packet back to Root, with PAYLOAD copied from PROBE-PAYLOAD, and extra headers formed from PROBE-EXTRA-HEADERS, “as if” this packet is sent in reply to IS-PROBE packet by upper layer, but adding IS-PROBE flag (as a part of GENERIC-EXTRA-HEADER-FLAGS extra header). If PROBE-TYPE==PROBE_TO_SANTA, destination Device sends a Hmp-To-Santa-Data-Or-Error-Packet, with PAYLOAD copied from PROBE-PAYLOAD, “as if” the packet is sent in reply to IS-PROBE packet by upper layer, but adding IS-PROBE flag (as a part of GENERIC-EXTRA-HEADER-FLAGS extra header).
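A minimal sketch of how a destination might read the PROBE-TYPE byte is given below. The numeric values of PROBE_UNICAST and PROBE_TO_SANTA are not given in this section, so the values used here are placeholders; bit[0] is assumed to be the least significant bit, and the function name is hypothetical. Parsing of OPTIONAL-PROBE-EXTRA-HEADERS is not modelled, so the function returns everything after the first byte unchanged.

```python
PROBE_UNICAST = 0   # placeholder value, not taken from the specification
PROBE_TO_SANTA = 1  # placeholder value, not taken from the specification


def parse_probe_type(payload: bytes):
    """Return (probe_type, extra_headers_present, rest_of_payload)."""
    if not payload:
        raise ValueError("empty PAYLOAD")
    first = payload[0]
    if first >> 4:
        raise ValueError("reserved bits [4..7] MUST be zeros")
    probe_type = first & 0b111                 # bits [0..2]
    if probe_type not in (PROBE_UNICAST, PROBE_TO_SANTA):
        raise ValueError("unknown PROBE-TYPE")
    extra_headers_present = bool((first >> 3) & 1)  # bit[3]
    # rest = optional probe extra headers (if flagged) followed by PROBE-PAYLOAD
    return probe_type, extra_headers_present, payload[1:]
```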

Hmp-Unicast-Data-Packet is processed as specified in Uni-Cast Processing section above; if ACKNOWLEDGED-DELIVERY flag is set, packet is sent in ‘Acknowledged Uni-Cast’ mode. In any case, LAST-HOP field is updated every time the packet is re-sent. Processing at the target node (regardless of node type) consists of passing PAYLOAD to the upper-layer protocol.

If Retransmitting Device receives a “partially correct” Hmp-Unicast-Data-Packet, addressed to itself, and it has NACK-PREV-HOP flag set for the source link within Routing Table, it MUST send a Hmp-Ack-Nack-Packet with IS-NACK flag set back to the source of packet.

Hmp-From-Santa-Data-Packet: | HMP-FROM-SANTA-DATA-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | LAST-HOP | LAST-HOP-BUS-ID | REQUEST-ID | OPTIONAL-DELAY-UNIT | MULTIPLE-RETRANSMITTING-ADDRESSES | BROADCAST-BUS-TYPE-LIST | Multiple-Target-Addresses | OPTIONAL-TARGET-REPLY-DELAY | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where HMP-FROM-SANTA-DATA-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant HMP_FROM_SANTA_DATA_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits[5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above, LAST-HOP is an Encoded-Unsigned-Int<max=2> representing node id of the last sender, LAST-HOP-BUS-ID is an Encoded-Unsigned-Int<max=2> representing ID of the Bus used by the last sender to send the packet, REQUEST-ID is an Encoded-Unsigned-Int<max=2> field, OPTIONAL-DELAY-UNIT is present only if EXPLICIT-TIME-SCHEDULING flag is present, and is an Encoded-Signed-Int<max=2> field, which specifies units for subsequent DELAY fields (as described below), MULTIPLE-RETRANSMITTING-ADDRESSES is a Multiple-Target-Addresses-With-Extra-Data field described above (with Extra-Data being either empty if EXPLICIT-TIME-SCHEDULING flag is not present, or otherwise Encoded-Unsigned-Int<max=2> DELAY field, using OPTIONAL-DELAY-UNIT field for delay calculations), BROADCAST-BUS-TYPE-LIST is a zero-terminated list of BUS-TYPE+1 values (enum values for BUS-TYPE TBD), Multiple-Target-Addresses is described above, OPTIONAL-TARGET-REPLY-DELAY has the same type as DELAY fields (and is absent if EXPLICIT-TIME-SCHEDULING flag is not present), and represents delay for the target Device (also using OPTIONAL-DELAY-UNIT field for delay calculations); OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details), PAYLOAD is a payload to be passed to the upper-layer protocol, and FULL-CHECKSUM is a HMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

Hmp-From-Santa-Data-Packet is a packet sent by Root, which is intended to find one or more destinations specified in Multiple-Target-Addresses that are ‘somewhere around’, but exact locations are unknown. When Root needs to pass data to a Node for which it has no valid route, or to build a route to one or more Nodes for any other reason, Root sends HMP-FROM-SANTA-DATA-PACKET (or multiple packets), to each of Retransmitting Devices, in hope to find target Device(s) and to pass the packet. It should be noted that if Root intends to pass data to a node within this type of a packet, the packet can be addressed to only a single device, and, therefore, Multiple-Target-Addresses will have only a single address; and if Root intends to find locations to more than a single device at a time, payload must be empty.

Hmp-From-Santa-Data-Packet is processed as specified in “From Santa” packet Processing section above, up to the point where all the buses for all the next hops are found; note that if Multi-Cast processing generates a Routing-Error, it is not transmitted immediately (see below). Starting from that point, Retransmitting Device processes Hmp-From-Santa-Data-Packet as follows:

  • replaces LAST-HOP field with its own node id
  • replaces LAST-HOP-BUS-ID field with its own bus id to be used for packet retransmission
  • creates a broadcast-bus-list of its own buses which match BROADCAST-BUS-TYPE-LIST
  • for each bus which is on a next-hop-bus list but not on the broadcast-bus-list - continue processing as specified in Multi-Cast Processing section above
    • transmission MUST NOT be made until time specified in DELAY field for current node, passes. If the time in DELAY field (after subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) using their respective DELAY-UNITs) has already passed - node MUST introduce a random delay uniformly distributed from 0 to NODE-MAX-RANDOM-DELAY parameter (using NODE-MAX-RANDOM-DELAY-UNIT for calculations).
    • right before sending each modified packet - further modify all DELAY fields within MULTIPLE-RETRANSMITTING-ADDRESSES by subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) (using their respective DELAY-UNITs). If resulting value is <0, it is made equal to 0.
  • for each bus which is on the broadcast-bus-list - broadcast modified packet over this bus
    • transmission MUST NOT be made until time specified in DELAY field for current node, passes. If the time in DELAY field (after subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) using their respective DELAY-UNITs) has already passed - node MUST introduce a random delay uniformly distributed from 0 to NODE-MAX-RANDOM-DELAY parameter (using NODE-MAX-RANDOM-DELAY-UNIT for calculations).
    • right before broadcasting each modified packet - further modify all DELAY (including TARGET-REPLY-DELAY) fields within MULTIPLE-RETRANSMITTING-ADDRESSES by subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) (using their respective DELAY-UNITs). If resulting value is <0, it is made equal to 0.
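The delay handling in the steps above can be sketched as two small helpers. This is an illustration under assumptions: all values are already converted to one common time unit, the functions are evaluated at the moment the packet is received, a remaining delay of exactly zero is treated as "already passed", and the function names are hypothetical.

```python
import random


def remaining_delay(delay, in_link_delay, out_link_delay):
    """DELAY value to write into the retransmitted packet (never below 0)."""
    return max(0, delay - (in_link_delay + out_link_delay))


def transmit_delay(delay, in_link_delay, out_link_delay,
                   node_max_random_delay, rng=random):
    """How long the node waits before (re)transmitting the packet."""
    remaining = delay - (in_link_delay + out_link_delay)
    if remaining > 0:
        return remaining  # scheduled time has not passed yet
    # scheduled time has passed: random delay in [0, NODE-MAX-RANDOM-DELAY]
    return rng.uniform(0, node_max_random_delay)
```

The random fallback spreads out late retransmissions so that several nodes which all missed their slot do not transmit at the same instant.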

If Retransmitting Device generates Routing-Error, then it MUST be delayed until time of TARGET-REPLY-DELAY + FORWARD-TO-SANTA-DELAY (using corresponding DELAY-UNITs for calculations). If this time has already passed - Routing-Error is transferred with a random delay (from 0 to NODE-MAX-RANDOM-DELAY, using NODE-MAX-RANDOM-DELAY-UNIT for calculations) from now.

On target Device, Hmp-From-Santa-Data-Packet waits until reply payload is ready (which is almost immediately if IS-PROBE is set, including ‘discovery’ packets, see below), then it is processed as follows:

  • if TARGET-DELAY (expressed in DELAY-UNITs) has not passed yet, Device waits until it passes
    • if the incoming packet has TARGET-COLLECT-LAST-HOPS flag set (which is normally set for all the packets which have IS-PROBE flag), then target Device traces all the incoming packets addressed to it and having the same REQUEST-ID and makes a list of extra-last-hops consisting of LAST-HOP and LAST-HOP-BUS-ID headers from all of them
    • when sending Hmp-To-Santa-Data-Or-Error-Packet reply back, target Device adds LAST-INCOMING-HOP extra header for LAST-HOP within incoming packet, plus LAST-INCOMING-HOP headers for extra-last-hops (if such list exists, see above)

If IS-PROBE flag is set, then PAYLOAD is treated differently. When destination receives Hmp-From-Santa-Data-Packet with IS-PROBE flag set, destination doesn’t pass PAYLOAD to upper-layer protocol. Instead, destination processes the packet in the same way as described for the processing of Hmp-Unicast-Data-Packet with IS-PROBE flag set. A special case of Hmp-From-Santa-Data-Packet with IS-PROBE set is when Target-Address is Root (=0). Such packets (a.k.a. ‘discovery’ packets) are ignored by Root, but are replied to only by Devices which are not paired yet (i.e. have no node id). All such ‘discovery’ packets with Target-Address=0 MUST have IS-PROBE flag set.

Hmp-To-Santa-Data-Or-Error-Packet: | HMP-TO-SANTA-DATA-OR-ERROR-PACKET-NO-TTL | OPTIONAL-EXTRA-HEADERS | SOURCE-ID | BUS-ID-AT-SOURCE | REQUEST-ID | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where HMP-TO-SANTA-DATA-OR-ERROR-PACKET-NO-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant HMP_TO_SANTA_DATA_OR_ERROR_PACKET, bit[4] being EXTRA-HEADERS-PRESENT, and bits [5..] reserved (MUST be zero); OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above. Note that Hmp-To-Santa-Data-Or-Error-Packet doesn’t contain TTL (as it is never retransmitted ‘as is’); SOURCE-ID is an Encoded-Unsigned-Int<max=2> ID of the sender; BUS-ID-AT-SOURCE is an Encoded-Unsigned-Int<max=2> Bus ID used by the sender; REQUEST-ID is an Encoded-Unsigned-Int<max=2> field taken from a Hmp-From-Santa-Data-Packet being answered, or 0, if current packet is initiated by device itself; OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details); PAYLOAD is either data or error data depending on IS-ERROR flag; if IS-ERROR flag is set - PAYLOAD format is the same as the body (after OPTIONAL-EXTRA-HEADERS) of Hmp-Routing-Error-Packet; and FULL-CHECKSUM is a HMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

If IS-LOCAL-ECHO flag is set, the packet is ignored, except for Retransmitting Devices sending Hmp-Ack-Nack-Packet back to LAST-HOP. To avoid “packet storms”, these ACKs MUST be sent using FORWARD-TO-SANTA-DELAY (using FORWARD-TO-SANTA-DELAY-UNIT for calculations). In addition, these ACKs SHOULD contain DELAY-UNIT, DELAY-PASSED, and DELAY-LEFT fields, with DELAY-UNIT being FORWARD-TO-SANTA-DELAY-UNIT, DELAY-PASSED being FORWARD-TO-SANTA-DELAY, and DELAY-LEFT calculated as MAX-FORWARD-TO-SANTA-DELAY - FORWARD-TO-SANTA-DELAY. TODO: add RETRANSMITTING-DEVICE-QUALITY?

Hmp-To-Santa-Data-Or-Error-Packet is a packet sent by a Device (either Retransmitting or non-Retransmitting) and intended for Root. It is broadcast by Device in several cases:

  • when the message is marked as Urgent by upper-layer protocol
  • when Device needs to report Routing-Error to Root when it has found that Root is not directly accessible.
  • when requested to do so via a packet with IS-PROBE flag and PROBE-TYPE==PROBE_TO_SANTA

In any case, if Hmp-To-Santa-Data-Or-Error-Packet is sent in response to a Hmp-From-Santa-Data-Packet (regardless of packet being first or not from SimpleIoT/GDP point of view), Device MUST provide TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP extra header, filling it from LAST-HOP field of the Hmp-From-Santa-Data-Packet.

On receiving Hmp-To-Santa-Data-Or-Error-Packet, Retransmitting Device sends a Hmp-Forward-To-Santa-Data-Or-Error-Packet towards Root, in ‘Acknowledged Uni-Cast’ mode. To avoid congestion at this point, each Retransmitting Device delays sending according to FORWARD-TO-SANTA-DELAY (using FORWARD-TO-SANTA-DELAY-UNIT for calculations), where FORWARD-TO-SANTA-DELAY and FORWARD-TO-SANTA-DELAY-UNIT are the values which are locally stored on Retransmitting Device.

Hmp-Forward-To-Santa-Data-Or-Error-Packet: | HMP-FORWARD-TO-SANTA-DATA-OR-ERROR-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | FIRST-HOP | NEXT-HOP | FORWARDED-SOURCE-ID | FORWARDED-BUS-ID-AT-SOURCE | REQUEST-ID | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where HMP-FORWARD-TO-SANTA-DATA-OR-ERROR-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant HMP_FORWARD_TO_SANTA_DATA_OR_ERROR_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above; FIRST-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of a node that has received a respective TO-SANTA packet; NEXT-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of the next-hop node (based on info from Routing Table); FORWARDED-SOURCE-ID is an Encoded-Unsigned-Int<max=2> value of SOURCE-ID from the original To-Santa packet; FORWARDED-BUS-ID-AT-SOURCE is an Encoded-Unsigned-Int<max=2> value of BUS-ID-AT-SOURCE from the original To-Santa packet; REQUEST-ID is an Encoded-Unsigned-Int<max=2> field; OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details); PAYLOAD is data being forwarded (copied from PAYLOAD of Hmp-To-Santa-Data-Or-Error-Packet); and FULL-CHECKSUM is a HMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

If NEXT-HOP field doesn’t match ID of the receiving Device - the packet is ignored.

Hmp-Forward-To-Santa-Data-Or-Error-Packet is sent by Retransmitting Device when it receives Hmp-To-Santa-Data-Or-Error-Packet (with TTL=MAX_TTL-1 to account for original Hmp-To-Santa-Data-Or-Error-Packet). In this case retransmitting device sets FIRST-HOP to its node ID.

On receiving Hmp-Forward-To-Santa-Data-Or-Error-Packet by a Retransmitting Device, it is processed as described in Uni-Cast processing section above (with implicit Target-Address being Root), and is always sent in ‘Acknowledged Uni-Cast’ mode.

Hmp-Routing-Error-Packet: | HMP-ROUTING-ERROR-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | LAST-HOP | ERROR-CODE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where HMP-ROUTING-ERROR-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant HMP_ROUTING_ERROR_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above; LAST-HOP is an Encoded-Unsigned-Int<max=2> representing node id of the last sender; ERROR-CODE is an Encoded-Unsigned-Int<max=1> field, HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details), PAYLOAD is TODO, and FULL-CHECKSUM is a full-packet HMP-CHECKSUM.

On receiving Hmp-Routing-Error-Packet, it is processed as described in Uni-Cast processing section above (with implicit Target-Address being Root), and is always sent in ‘Acknowledged Uni-Cast’ mode.

Hmp-Ack-Nack-Packet: | HMP-ACK-NACK-AND-TTL | OPTIONAL-EXTRA-HEADERS | LAST-HOP | Target-Address | NUMBER-OF-ERRORS | ACK-CHECKSUM | HEADER-CHECKSUM | OPTIONAL-DELAY-UNIT | OPTIONAL-DELAY-PASSED | OPTIONAL-DELAY-LEFT | FULL-CHECKSUM |

where HMP-ACK-NACK-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant HMP_ACK_NACK_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT flag is set, LAST-HOP is an id of the transmitting node, Target-Address is described above, NUMBER-OF-ERRORS is an Encoded-Unsigned-Int<max=2> field, which contains number of bit-errors observed at PHY level for the packet being acknowledged, ACK-CHECKSUM is copied from FULL-CHECKSUM of the packet being acknowledged (with an exception for NACK generated due to “partially correct” packet, see below), and HEADER-CHECKSUM is a header HMP-CHECKSUM (see HMP-CHECKSUM section for details); OPTIONAL-DELAY-UNIT, OPTIONAL-DELAY-PASSED, and OPTIONAL-DELAY-LEFT fields are all Encoded-Unsigned-Int<max=2> fields, all present only if DELAYS-PRESENT flag is set (which is set only in response to packets with IS-LOCAL-ECHO flag set, see above); and FULL-CHECKSUM is a HMP-CHECKSUM of concatenation of the header (without header checksum) and the remaining part of the packet.

NUMBER-OF-ERRORS field allows the receiver to provide feedback about connection quality to the sender; it is a normalized number of bit errors which were error-corrected when the packet being acknowledged was received. If error correction is not employed, this field SHOULD be zero. This information SHOULD be used by sending-side PHY level to optimize power consumption.

Hmp-Ack-Nack-Packet with IS-LOOP-ACK flag is generated either by destination, or by the node which has found that the next hop already has NEXT-HOP-ACKS flag (see details in ‘Acknowledged Uni-Cast’ section above); generating node always specifies itself as a target. Hmp-Ack-Nack-Packet with IS-LOOP-ACK flag MUST NOT have IS-NACK flag.

If Hmp-Ack-Nack-Packet has IS-LOOP-ACK flag, it is processed as specified in ‘Uni-cast processing’ section above; Hmp-Loop-Ack packet is never sent using ‘Acknowledged uni-cast’ delivery. Processing at the target node (regardless of node type) consists of passing PAYLOAD to the upper-layer protocol.

Hmp-Ack-Nack-Packet without IS-LOOP-ACK flag and without IS-NACK flag, is generated as a response to an incoming Hmp-Unicast-Data-Packet with ACKNOWLEDGED-DELIVERY flag, or in response to a packet with IS-LOCAL-ECHO flag (TODO: anything else?). It is not retransmitted, but taken as an acknowledgement that the packet has been received.

In addition, Hmp-Ack-Nack-Packet without IS-LOOP-ACK flag and without IS-NACK flag, MAY be generated by receiver in an “unsolicited” manner, i.e. even if ACK has not been requested, to indicate that received packet has number of errors which is considered to be “too high” for the underlying PHY level. Such an ACK packet (as well as any other ACK packet with high NUMBER-OF-ERRORS) SHOULD lead to adjustments on sending side (for example, it MAY lead to increase in transmission power). Another case for “unsolicited” ACK is for Retransmitting Device, when NUMBER-OF-ERRORS becomes “too low” after being substantially higher, to indicate that the other side is allowed to lower transmission power. In any case, whenever Retransmitting Device sends an “unsolicited” ACK to non-Retransmitting Device, it SHOULD make sure (from upper-layer protocols) that receiving non-Retransmitting Device is expected to have its transceiver on.

Hmp-Ack-Nack-Packet without IS-LOOP-ACK flag and with IS-NACK flag, is generated as a response to a “partially correct” packet (regardless of type and ACKNOWLEDGED-DELIVERY flag); in this case, its ACK-CHECKSUM represents only HEADER-CHECKSUM of the original packet. Such Hmp-Ack-Nack-Packet is not retransmitted itself, but is taken as an indication to perform quick retransmit of the last packet sent.

Type of HMP packet

As described above, type of HMP packet is always defined by bits [0..3] of the first field (which is always Encoded-Unsigned-Int<max=2> bitfield substrate):

| bit[0] | bits[1..3] | HMP packet type |
|--------|------------|-----------------|
| 0 | ANY (used for other purposes) | Hmp-Unicast-Data-Packet |
| 1 | HMP_FROM_SANTA_DATA_PACKET | Hmp-From-Santa-Data-Packet |
| 1 | HMP_TO_SANTA_DATA_OR_ERROR_PACKET | Hmp-To-Santa-Data-Or-Error-Packet |
| 1 | HMP_FORWARD_TO_SANTA_DATA_OR_ERROR_PACKET | Hmp-Forward-To-Santa-Data-Or-Error-Packet |
| 1 | HMP_ROUTING_ERROR_PACKET | Hmp-Routing-Error-Packet |
| 1 | HMP_ACK_NACK_PACKET | Hmp-Ack-Nack-Packet |
| 1 | 3 more values | RESERVED |
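The type rules above translate into a short dispatch on the first field. The sketch below assumes that bit[0] is the least significant bit of the decoded integer. The numeric values of the 3-bit constants are not given in this section, so the mapping used here is a placeholder, and the function name is hypothetical.

```python
# Placeholder numeric values for the 3-bit type constants
# (the actual values are not given in this section).
HMP_TYPE_NAMES = {
    0: "Hmp-From-Santa-Data-Packet",
    1: "Hmp-To-Santa-Data-Or-Error-Packet",
    2: "Hmp-Forward-To-Santa-Data-Or-Error-Packet",
    3: "Hmp-Routing-Error-Packet",
    4: "Hmp-Ack-Nack-Packet",
}


def hmp_packet_type(first_field):
    """Classify an HMP packet from its decoded first field."""
    if first_field & 1 == 0:
        # bit[0] == 0: unicast; bits[1..3] carry flags, not a type constant
        return "Hmp-Unicast-Data-Packet"
    return HMP_TYPE_NAMES.get((first_field >> 1) & 0b111, "RESERVED")
```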

Packet Urgency

From SimpleIoT/HMP point of view, all upper-layer-protocol packets can have one of three urgency levels. If the packet has urgency URGENCY_LAZY, it is first sent as a Hmp-Unicast-Data-Packet without ACKNOWLEDGED-DELIVERY flag (as described above, in case of retries it will be resent with ACKNOWLEDGED-DELIVERY). If the packet has urgency URGENCY_QUITE_URGENT, it is first sent as a Hmp-Unicast-Data-Packet with ACKNOWLEDGED-DELIVERY flag (as described above, in case of retries it will be resent as a Hmp-*-Santa-* packet). If the packet has urgency URGENCY_TRIPLE_GALOP, then it is first sent as a Hmp-From-Santa-Data-Packet or Hmp-To-Santa-Data-Or-Error-Packet (depending on source being Root or Device).
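The mapping from urgency level to the first transmission attempt can be summarized in code. This is only a sketch: urgency levels are represented as strings, the function name is hypothetical, and retry escalation is not modelled.

```python
def first_attempt_packet(urgency, source_is_root):
    """Return (packet type name, ACKNOWLEDGED-DELIVERY flag or None)."""
    if urgency == "URGENCY_LAZY":
        return ("Hmp-Unicast-Data-Packet", False)
    if urgency == "URGENCY_QUITE_URGENT":
        return ("Hmp-Unicast-Data-Packet", True)
    if urgency == "URGENCY_TRIPLE_GALOP":
        # Santa packets do not carry an ACKNOWLEDGED-DELIVERY flag
        if source_is_root:
            return ("Hmp-From-Santa-Data-Packet", None)
        return ("Hmp-To-Santa-Data-Or-Error-Packet", None)
    raise ValueError("unknown urgency level: %r" % (urgency,))
```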

Handling of TERMINAL-ADVERTISING Underlying Protocols

Some of underlying SimpleIoT/DLP-* protocols MAY be designated as TERMINAL-ADVERTISING ones. For these protocols, some of the handling is different from described above. In particular:

  • Hmp-From-Santa packets are never sent with BUS-TYPE being a TERMINAL-ADVERTISING bus.
    • If, according to the normal HMP logic described above, a need arises to send Hmp-From-Santa packet with such a BUS-TYPE, this BUS-TYPE is simply skipped.
    • If, as a result of such filtering, BUS-TYPE-LIST of Hmp-From-Santa packet becomes empty, Hmp-From-Santa packet is not sent at all
  • Whenever TERMINAL-ADVERTISING Device has its transmitter turned on, but it has no connection (as defined in respective SimpleIoT/DLP-* document) to the next hop, it starts to “advertise” itself (as defined in respective SimpleIoT/DLP-* document), using an Hmp-To-Santa packet as a payload. This Hmp-To-Santa packet MAY be a packet-which-needs-to-be-delivered-to-Root, or MAY be an Hmp-To-Santa packet with an empty payload (TODO: define).
    • All Retransmitting Devices which hear this “advertised” Hmp-To-Santa packet, process it as a normal Hmp-To-Santa packet
    • When Hmp-Forward-To-Santa packets reach Root:
      • Root chooses “the best” route, and assumes that all the inter-hops connections are symmetrical (i.e. path from A to B always implies path from B to A).
      • Root updates Routing Tables along the chosen route (the same way as for non-TERMINAL-ADVERTISING Devices)
        • Retransmitting Device which is adjacent to the TERMINAL-ADVERTISING Device which has advertised, establishes connection with the Device (as defined in respective SimpleIoT/DLP-* document). If connection cannot be established, Retransmitting Device sends a TODO Hmp-Routing-Error-Packet to Root.
        • If, at any point, connection is broken (as defined in respective SimpleIoT/DLP-* document), Retransmitting Device sends a TODO Hmp-Routing-Error-Packet to Root.
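The bus filtering rule from the first bullet above (TERMINAL-ADVERTISING bus types are skipped, and the packet is not sent if nothing remains) can be sketched as follows. The bus type values and the function name are hypothetical; the enum for BUS-TYPE is still to be defined.

```python
def filter_bus_types(bus_type_list, terminal_advertising_types):
    """Drop TERMINAL-ADVERTISING bus types from a From-Santa bus type list.

    Returns the filtered list, or None if the list becomes empty,
    in which case the Hmp-From-Santa packet is not sent at all.
    """
    kept = [b for b in bus_type_list if b not in terminal_advertising_types]
    return kept or None
```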

PHY quality measurement over SimpleIoT/HMP

Certain SimpleIoT/DLP-* protocols need to measure connection quality. This can be made using the following procedure:

  • Device sends Hmp-To-Santa packet with IS-LOCAL-ECHO flag
  • Device waits for any Hmp-Ack-Nack packet, validly acknowledging receipt of IS-LOCAL-ECHO packet, OR for 100 milliseconds, whichever comes first
  • If a valid Hmp-Ack-Nack packet is received - Device waits only for DELAY-LEFT specified in the packet from the moment of receiving the packet (more strictly: if multiple packets are received, it is maximum of the DELAY-LEFT-received-since-receiving-each-packet + 10ms (safety margin)).
  • While waiting, all the valid Hmp-Ack-Nack packets are accounted for (to be used as described in respective SimpleIoT/DLP-* document)
  • when wait expires, Device repeats the whole process above; 5 repetitions are usually made to gather required statistics.
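The waiting rule of one measurement round can be expressed as a small calculation. The sketch below assumes that times are in milliseconds measured from the moment the IS-LOCAL-ECHO packet was sent, that DELAY-LEFT has already been converted to milliseconds, and that the list contains only valid Hmp-Ack-Nack packets received while the Device was still waiting. The function and constant names are hypothetical.

```python
INITIAL_TIMEOUT_MS = 100  # wait for the first ACK
SAFETY_MARGIN_MS = 10     # safety margin added to the computed wait


def wait_end_ms(acks):
    """Time (ms since sending) at which one measurement round ends.

    acks: list of (receive_time_ms, delay_left_ms) tuples.
    """
    if not acks:
        return INITIAL_TIMEOUT_MS  # no ACK heard within the timeout
    # latest "receive time + DELAY-LEFT" over all ACKs, plus safety margin
    return max(t + left for t, left in acks) + SAFETY_MARGIN_MS
```

For example, with ACKs received at 5 ms (DELAY-LEFT 30 ms) and 12 ms (DELAY-LEFT 40 ms), the round ends at 62 ms.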

This “quality measurement” procedure MAY be performed ONLY if respective SimpleIoT/DLP-* document specifies using it, and ONLY under circumstances specified there.

Device Discovery and Pairing over SimpleIoT/HMP

Whenever Device is in PRE-PAIRING state (see SimpleIoT Pairing for details on the PRE-PAIRING state), it scans all available channels; if channel is “eligible” (as defined in an appropriate SimpleIoT/DLP-* document), the following basic exchange occurs:

  • Device (after, maybe, performing certain preliminary actions on the channel, as defined in an appropriate SimpleIoT/DLP-* document) sends Pairing-Ready-Pseudo-Response (described in SimpleIoT Pairing document), as a SimpleIoT/CCP packet, addressed to Root. When SimpleIoT/CCP packet reaches SimpleIoT/HMP level (still on Device side), SimpleIoT/HMP doesn’t have a route to Root, so it sends it as a Hmp-To-Santa packet.
  • In response, Root will send a Pairing-Pre-Request (as it has no route to Device, it will be sent as a From-Santa HMP packet)
  • Device will reply with Pairing-Pre-Response (which will be sent as a To-Santa HMP packet, containing DEVICE-INTRABUS-ID)
  • Up to this point in exchange, all the packets from Root to Device at SimpleIoT/HMP level, including optional and not mentioned above Entropy Gathering packets, are always sent as From-Santa packets with Target-Address being ROOT, i.e. broadcast packets. Packets from Device to Root are sent as To-Santa packets.
  • From this point onwards, all the packets from Root to Device at SimpleIoT/HMP level are always addressed to specific Device, using non-paired addressing. Packets from Device to Root are still sent as To-Santa packets.
  • Root will proceed with Pairing procedure as described in SimpleIoT Pairing document, still using HMP From-Santa/To-Santa packets, but from now on From-Santa packets are addressed to specific Device using “non-paired addressing”
  • As soon as Device pairing is completed (and Root sets NODE-ID for the Device), Root SHOULD:
    • calculate optimal route to the Device
    • change Routing Tables for all the Retransmitting Devices alongside the optimal route (for example, using CCP_PHY_AND_ROUTING_DATA packets as described above)
    • as soon as confirmations from all the Retransmitting Devices about route updates are obtained, Root SHOULD start using Device’s “paired addressing” for all the communications onwards with the Device.
    • change Routing Table on the Device, indicating optimal route to the Root. From this point on, Device will start using usual Unicast packets when communicating with Root (unless there are reasons to use other HMP packets, for example, on multiple retransmits or for packets marked URGENT).

  • TODO: merge of To-Santa into Unicast (with NEXT-HOP being -1)?
  • TODO: Hmp-Retransmit (to next-hop Retransmitting Device on RETRANSMIT-ON-NO-RETRANSMIT)
  • TODO: define handling for all “partially correct” packets
  • TODO: what exactly is “header” for the purposes of “partially correct” packets? Is “sub-header” worth the trouble?
  • TODO: NACK-PREV-HOP into Routing Table Links; RETRANSMIT-ON-NO-RETRANSMIT into RT Routes
  • TODO: ?move FORWARD-TO-SANTA-* to links (target ones) too (and specify that it is per-link wherever it is used)
  • TODO: procedure for calibration of LINK-DELAYs?
  • TODO: optional explicit loop begin (alongside VIA?)

SimpleIoT Guaranteed Delivery Protocol (SimpleIoT/GDP)

Version:v0.0

** TODO **

SimpleIoT Command and Control Protocol (SimpleIoT/CCP)

Version:v0.0

** TODO **

SimpleIoT/DLP* Protocols

Specification

SmartAnthill 2.0 Overall Architecture

Version:v0.3.1

SmartAnthill is an open IoT system which allows easy control over multiple microcontroller-powered devices, creating a home- or office-wide heterogeneous network out of these devices.

A SmartAnthill system can be pretty much anything: from a system to control a model railway network to an office-wide heating control and security system. As an open system, SmartAnthill can integrate a wide range of devices, from embedded development boards to off-the-shelf sensors and actuators. They can be connected via very different communication means - from wired (currently Serial, with CAN bus and Ethernet planned soon) to wireless (currently IEEE 802.15.4, with low-cost RF, Bluetooth Smart, ZigBee and WiFi planned soon).

All SmartAnthill devices within a system are controlled from one place (such as a PC or a credit-card sized computer such as Raspberry Pi, BeagleBoard or CubieBoard), with optional access via the Internet.

From a programming point of view, SmartAnthill provides a clear separation between microcontroller programming (such as “how to get temperature from this sensor”) and system integration logic (such as “how we should heat this particular house to reduce the heating bill”). Microcontroller programming usually requires C/asm programming, and C/asm programs are notoriously difficult to customize. SmartAnthill allows you to customise a device with pre-defined capabilities via GUI and to generate compatible firmware which will be flashed to the device automatically. On the other hand, system integration logic needs to be highly customizable for the needs and properties of a specific house or office; within SmartAnthill this can be done via a rich suite of development instruments: Generic Protocols (HTTP, Sockets, WebSockets), High Level API (REST API) and SDK for popular languages, which allow for easy development and customization.

SmartAnthill 2.0 represents further work on SmartAnthill 1.0, which was designed solely by Ivan Kravets, an author of PlatformIO. Improvements in SmartAnthill 2.0 cover several areas, from introducing security, to support of protocols such as ZigBee and improvements aimed at reducing energy consumption. SmartAnthill 2.0 is not intended to be compatible with SmartAnthill 1.0.

Sales pitch (not to be taken seriously)

  • This ... SmartAnthill thing... What does it do?

  • Sir, better you should ask “What doesn’t it do?”

    It crypts, flips, scripts, and strips,
    It loops, groups, hooks, and schnooks,
    It chases, races, faces, and places!

    No home or maximum security prison should be without one!

    With SmartAnthill, each of your devices will get their very own personal IP address (whatever that is)! And for those low-income devices which were able to save only limited amount of RAM and cannot afford running an IP stack, SmartAnthill will simulate an IP address, so nobody from outside world will be able to tell the difference! It is an ultimate tool in keeping your devices’ self-esteem, even when they cannot afford the latest greatest technology through no fault of their own!

    In addition to being an indispensable motivation vehicle to keep your devices from becoming apathetic and irresponsive, SmartAnthill is also an ultimate engine to keep your devices in line. Yes, your very own SmartAnthill keeps a comprehensive secret dossier on each and every of your devices, monitors their behaviour, and takes corrective measures whenever necessary! And yes, you can plausibly deny any knowledge of SmartAnthill’s actions if you feel like it, too!

    But that’s not all! SmartAnthill will go to great lengths to make sure that your devices don’t misuse any watts and milliamp-hours you give them! It will enable suitable devices to run a year or even more when being fed just once! No energy-saving trick in the book is left without SmartAnthill’s attention - from sleep to hibernation, from minimizing data being transmitted to minimizing time when RF oscillator is on (whatever that is)!

    What else you, a true manager of your house, can possibly want? You’ll get your devices motivated, under tight control, using only very minimum amount of food, and with plausible deniability on top! What are you thinking about? Download your very own SmartAnthill today, and we’ll provide 50% discount from your DIY setup price! That’s right, if you install SmartAnthill today, you’ll be able to pay yourself twice less for setting SmartAnthill up!

Aims

SmartAnthill aims to create a viable system of control for Internet of Things (IoT) devices in home and office environments. Environments with higher security requirements or higher risks (such as industrial control, military, etc.) are currently out of scope. Due to SmartAnthill’s roots in the hardware hobbyist movement, special attention is to be paid to hobbyist needs.

Requirements

SmartAnthill is built around the following requirements. They follow from the aims and generally are not negotiable.

  1. Low Cost. In home/office environments SmartAnthill should aim for a single device (such as a sensor) to cost in the range of $10-$20. Rationale: higher costs will deter acceptance greatly.
  2. Support for Devices with Limited Resources. Many of the devices and MPUs intended for use with SmartAnthill are very limited in their resources (which is closely related to their low cost). Currently, the minimal MPU configuration that a minimal SmartAnthill aims to run on is as low as 512 bytes of RAM, 16K bytes of PROM (to store the program) and 256 bytes of EEPROM. [TODO: think about number of rewrites in EEPROM, incl. optimization]
  3. Wireless Support. SmartAnthill needs to support wireless technologies. Wired support is optional. Rationale: without wireless devices, acceptance in home environments is expected to be very low.
  4. Support for Heterogeneous Systems. SmartAnthill should allow creating systems consisting of devices connected via different means. ZigBee and RF technologies are of particular interest.
  5. System Integration should not require asm or C programming. Most MPUs require C or asm programming. This is ok, as long as such programming can be done once per device type and doesn’t need to be repeated when the system integrator needs to adjust system behavior. To achieve it, SmartAnthill should provide clear separation between device developer and system integrator, and system integration should not require C or asm programming skills.
  6. Energy Efficiency. SmartAnthill should aim to achieve best energy efficiency possible. In particular, a wide range of SmartAnthill sensors should be able to run from a single ‘tablet’-size battery for at least a year (more is better).
  7. Security. SmartAnthill should provide adequate protection given the home/office environment. In other words, SmartAnthill as such doesn’t aim to protect from NSA (or any other government agency) or from somebody who’s already obtained physical access to the system. However:
    1. protection from remote attackers (both over the Internet and present within the reach of wireless communications) is a must
    2. level of protection should be sufficient to control home/office physical security systems
    3. protection from local attackers trying to obtain physical entry requires additional physical security measures, which can be aided by SmartAnthill. For example, if the attacker gets entrance to the hardware of SmartAnthill Central Controller, SmartAnthill becomes vulnerable. However, SmartAnthill-enabled sensors may be installed to detect unauthorized entrance to the room where SmartAnthill is installed, and/or to detect unauthorized opening of the SmartAnthill Central Controller physical box, with an appropriate action taken by Central Controller before it becomes vulnerable (for example, notifying authorities).
  8. Openness. All core SmartAnthill technologies should be open. SmartAnthill protocols are intended to be published, and any device compliant with these protocols should be able to interoperate with other compliant devices. SmartAnthill project will provide a reference software stack as an open source code, which will be distributed under GPL v2 [TODO:decide] license.
    1. Openness of SmartAnthill does not mean that all SmartAnthill devices should use open-source software. Any device, whether using open- or closed-source software, is welcome as long as it complies with published SmartAnthill protocols.
    2. Openness of SmartAnthill does not mean that SmartAnthill devices are not allowed to use existing proprietary protocols as a transport.
    3. Position on patents. SmartAnthill Core MUST use patent-free technologies wherever possible. Support for patented technologies as a transport is allowed. All SmartAnthill contributors MUST fill a form with a statement on their knowledge on patents related to their contribution.
  9. Vendor and Technology Neutrality. SmartAnthill should not rely on any single technology/platform (let alone any single vendor). All kinds of suitable technologies and platforms are welcome. Any references to a specific technology should be considered only as an example.
  10. Extensibility. Closely related to technology neutrality is extensibility. SmartAnthill should expect new technologies to emerge, and should allow them to be embraced in a non-intrusive manner. It is especially important to allow easy addition of new communication protocols, and of new devices/MPUs.
  11. Ability to Utilize Resources of More Capable Devices. Notwithstanding Requirement #2 above, it is recognized that some devices have capabilities beyond the minimal ones. Moreover, the share of such more capable devices is expected to grow. Therefore, as long as it helps to achieve any of the goals above, SmartAnthill should allow the capabilities of more sophisticated devices to be utilized. One example is utilizing a device’s ability to sleep and wake up on a timer, which can improve battery life greatly. Another example is combining several commands into one wireless transmission, which reduces the amount of time the wireless module needs to be turned on and should also help to improve battery life.
    1. This doesn’t mean that SmartAnthill is going to increase minimal requirements. However, if minimal requirements are exceeded by a particular device, SmartAnthill should allow those improved capabilities to be used to improve other user-observable characteristics.
  12. Support both for mass-market devices and for hobbyist devices. While SmartAnthill is not limited to hobbyists and aims to become a widely-accepted network for controlling IoT and smart homes, it should consider hobbyists first-class citizens and pay attention to their needs. In particular, compatibility with existing devices and practices is to be taken seriously, as well as any feedback.
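Requirements 2 and 11 together imply a simple check: a device must meet the minimal configuration, and anything it has beyond that may be put to use. A minimal sketch follows, assuming the figures quoted in Requirement 2 and reading “16K bytes” as 16 * 1024. The names `McuResources`, `meets_minimum` and `spare_resources` are hypothetical and not part of SmartAnthill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McuResources:
    ram_bytes: int
    prom_bytes: int
    eeprom_bytes: int


# Minimal configuration from Requirement 2 (assumption: 16K bytes == 16 * 1024).
MINIMAL = McuResources(ram_bytes=512, prom_bytes=16 * 1024, eeprom_bytes=256)


def meets_minimum(mcu, minimum=MINIMAL):
    """True if the MCU has at least the minimal RAM, PROM and EEPROM."""
    return (mcu.ram_bytes >= minimum.ram_bytes
            and mcu.prom_bytes >= minimum.prom_bytes
            and mcu.eeprom_bytes >= minimum.eeprom_bytes)


def spare_resources(mcu, minimum=MINIMAL):
    """Resources beyond the minimum, which Requirement 11 allows the system to exploit."""
    return McuResources(
        ram_bytes=max(0, mcu.ram_bytes - minimum.ram_bytes),
        prom_bytes=max(0, mcu.prom_bytes - minimum.prom_bytes),
        eeprom_bytes=max(0, mcu.eeprom_bytes - minimum.eeprom_bytes),
    )
```

A hypothetical board with 2048 bytes of RAM, 32K of PROM and 1024 bytes of EEPROM passes the check and leaves 1536 bytes of RAM to spare.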

SmartAnthill Architecture

SmartAnthill Overall Architecture

Simple Topology

A simple SmartAnthill system consists of one SmartAnthill Central Controller and one or more SmartAnthill Devices (also known as “Ants”) controlled by it (see Sample SmartAnthill Single-Node System diagram above for an example topology).

SmartAnthill Central Controller is a relatively complex device (such as a PC or a credit-card sized computer such as Raspberry Pi, BeagleBoard or CubieBoard) which normally runs several pieces of software, including the operating system’s TCP/IP stack, 3rd-party System Control Software, and SmartAnthill Core.

System Control Software

System Control Software is intended to be easily customizable according to customer needs. It can be very different from system to system, but we aim to support OpenHAB, and to support DIY programming in pretty much any programming language that supports REST, WebSockets or plain sockets. The SmartAnthill project as such doesn’t provide control software; it is rather a service which can be used by control software.

SmartAnthill Core

SmartAnthill Core is cross-platform software written in Python; it supports all the popular server/desktop operating systems: Mac OS X, Linux (x86 or ARM), and Windows. System requirements of SmartAnthill Core are very low for a modern server-side application:

  • < 1% CPU in IDLE mode
  • < 20MB RAM for service/daemon
  • < 20MB of free disk space (cross-compilers, tool chains, and firmware upload software are not included here)

More detailed information on SmartAnthill Core is provided in a separate document, SmartAnthill 2.0 Core Architecture.

API Service

API Service is responsible for supporting multiple protocols (such as REST, Websocket, or plain socket) and converting them into requests to the other parts of SmartAnthill.

Dashboard Service

Dashboard Service is responsible for providing UI for the SmartAnthill administrator. It allows the administrator to:

  • administer SmartAnthill Core (control running services, view logs, etc.)
  • configure and program/“pair” SmartAnthill Devices so they can be used with a specific SmartAnthill system (see Life Cycle of SmartAnthill Device below for details on configuring, programming, and “pairing”)

Device Service

Device Service provides a device abstraction to the rest of SmartAnthill Core, allowing different devices to be handled in a consistent manner.

Device Firmware Module

Device Firmware Module is used for SmartAnthill Hobbyist Devices (described below). Device Firmware Module is responsible for generating device firmware (for a specific device, based on configuration entered via Dashboard), and for programming it. Device Firmware Module is implemented on top of PlatformIO.

SmartAnthill Router

SmartAnthill Router is responsible for handling so-called SmartAnthill Simple Devices (see below; in a nutshell, a SmartAnthill Simple Device is not able to run its own IP stack).

SmartAnthill Router provides SmartAnthill Simple Devices with a virtual IP address (or more precisely - either with a separate IP address, or with a dedicated port on one of SmartAnthill Central Controller’s IP addresses). While SmartAnthill Simple Device itself knows nothing about IP, SmartAnthill Router completely encapsulates all connected SmartAnthill Simple Devices, so from the point of view of the outside world, these SmartAnthill Simple Devices are completely indistinguishable from fully-fledged SmartAnthill IP-Enabled Devices.
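The two addressing options described above (a separate IP address per Simple Device, or a dedicated port on one of the Central Controller’s addresses) can be pictured as a lookup table kept by the Router. The sketch below is an assumption about how such a table might look, not the actual Router code; `SimpleDeviceRef`, `VirtualEndpointTable` and the bus names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleDeviceRef:
    bus: str          # hypothetical bus name, e.g. "can0"
    bus_address: int  # bus-specific address of the Simple Device


class VirtualEndpointTable:
    """Maps externally visible IP endpoints to Simple Devices behind the Router."""

    def __init__(self):
        self._by_endpoint = {}

    def assign_ip(self, ip, device):
        # Option 1: the device gets a whole virtual IP address (any port).
        self._add((ipaddress.ip_address(ip), None), device)

    def assign_port(self, controller_ip, port, device):
        # Option 2: the device gets a dedicated port on a Central Controller address.
        self._add((ipaddress.ip_address(controller_ip), port), device)

    def _add(self, key, device):
        if key in self._by_endpoint:
            raise ValueError(f"endpoint {key} is already assigned")
        self._by_endpoint[key] = device

    def resolve(self, dst_ip, dst_port):
        """Return the Simple Device for an incoming packet, or None if unknown."""
        ip = ipaddress.ip_address(dst_ip)
        device = self._by_endpoint.get((ip, None))
        if device is None:
            device = self._by_endpoint.get((ip, dst_port))
        return device
```

From the outside, a client only ever sees an IP address and port; whether the packet ends up on a CAN bus or an RF link is decided by this lookup.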

SmartAnthill Database (SA DB)

SmartAnthill Database (SA DB) is a database which stores all the information about SmartAnthill Devices within specific SmartAnthill System. SA DB is used by most of SmartAnthill Core components.

SmartAnthill Database is specific to the Central Controller and SHOULD NOT be shared. In SA DB, at least the following information is stored:

  • device addresses (bus-specific for Simple Devices and IPs for IP-enabled devices)
  • credentials (i.e. symmetric keys)
  • configuration (i.e. which device is connected to which pins)
  • device capabilities (i.e. amount of RAM/PROM/EEPROM available, MPU capabilities etc.)
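As an illustration only, one SA DB entry holding the four kinds of information listed above might look like the record below. The field names, and the rule that a device has exactly one of a bus address or an IP address, are assumptions made for this sketch; the actual SA DB schema is not defined here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DeviceRecord:
    name: str
    bus_address: Optional[str] = None  # set for Simple Devices
    ip_address: Optional[str] = None   # set for IP-enabled Devices
    # Symmetric key; excluded from repr() so it does not leak into logs.
    symmetric_key: bytes = field(default=b"", repr=False)
    # Configuration: which body part is connected to which pins.
    pin_config: Dict[str, str] = field(default_factory=dict)
    # Capabilities: e.g. amount of RAM/PROM/EEPROM available.
    capabilities: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self):
        # Assumption: a device is either Simple or IP-enabled, never both.
        if (self.bus_address is None) == (self.ip_address is None):
            raise ValueError("exactly one of bus_address / ip_address must be set")
```

For example, `DeviceRecord("kitchen-ant", bus_address="can0:3", symmetric_key=key, pin_config={"1": "LED"}, capabilities={"ram_bytes": 512})` describes a Simple Device with a LED on pin 1.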
SmartAnthill Devices
SmartAnthill Devices

TODO: Master-Slave topology!

Each SmartAnthill Device (also known as ‘Ant’) is either SmartAnthill Hobbyist Device, or a SmartAnthill Mass-Market Device. While these devices are similar, there are some differences as outlined below. In addition, in a completely different and independent dimension each SmartAnthill Device is either a Simple Device, or an IP-enabled Device.

These properties are independent of each other, so it is possible to have all four different types of devices: SmartAnthill Hobbyist Simple Device, SmartAnthill Hobbyist IP-enabled Device, SmartAnthill Mass-Market Simple Device, and SmartAnthill Mass-Market IP-enabled Device.
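The two independent dimensions can be modelled as two separate enumerations whose product yields the four device types named above. This is a small illustrative sketch; the enum and function names are hypothetical.

```python
from enum import Enum
from itertools import product


class Origin(Enum):
    HOBBYIST = "Hobbyist"
    MASS_MARKET = "Mass-Market"


class Connectivity(Enum):
    SIMPLE = "Simple"
    IP_ENABLED = "IP-enabled"


def device_type_name(origin, connectivity):
    """Build the device type name used in the text from the two dimensions."""
    return f"SmartAnthill {origin.value} {connectivity.value} Device"


# Every combination of the two dimensions is a valid device type.
ALL_DEVICE_TYPES = [device_type_name(o, c) for o, c in product(Origin, Connectivity)]
```

Keeping the dimensions separate means that code concerned with programming versus pairing only needs `Origin`, while code concerned with routing only needs `Connectivity`.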

SmartAnthill Hobbyist Device

A diagram of a typical SmartAnthill Hobbyist Device is provided in section SmartAnthill Devices. SmartAnthill Hobbyist Device consists of an MCU, persistent storage (such as EEPROM or Flash), communication module, and one or more sensors and/or actuators (which are also known as ‘ant body parts’). TODO: add persistent storage to the diagram. MCU on SmartAnthill Hobbyist Device runs several layers of software:

  • SmartAnthill-Generated Software; it is system-specific, i.e. it is generated for each system
  • Device-Specific Plugins for each type of sensor or actuator present
  • SmartAnthill 2.0 Protocol Stack; it is generic, i.e. it is intended to be pretty much the same for all SmartAnthill Devices. SmartAnthill 2.0 Protocol Stack uses persistent storage, in particular, to provide security guarantees.

An important part of SmartAnthill Hobbyist Device (which is absent on SmartAnthill Mass-Market Devices) is programming interface; for example, it can be some kind of SPI, UART or USB.

SmartAnthill Mass-Market Device

A diagram of a typical SmartAnthill Mass Market Device is also provided in the section SmartAnthill Devices. In addition to the components available on SmartAnthill Hobbyist Device, SmartAnthill Mass-Market Device MAY additionally include:

  • an additional LED to support Single-LED Pairing. In practice, an existing LED MAY be re-used for this purpose.

In addition, Persistent Storage on Mass-Market Devices stores System-specific Data. System-specific Data contains information such as bus-specific addresses and security keys; it is obtained during the “pairing” process, which is described below.

MCU on SmartAnthill Mass-Market Device runs several layers of software (note the differences from SmartAnthill Hobbyist Device):

  • SmartAnthill Configurator, which is responsible for handling “pairing” process and populating system-specific data. SmartAnthill Configurator is generic.
  • Device-Specific Plugins for each type of sensor or actuator present
  • SmartAnthill 2.0 Protocol Stack; as noted above, the protocol stack is generic.

SmartAnthill Simple Device

Many SmartAnthill Devices are expected to have very limited resources, and might be unable to implement an IP stack. Such devices are known as SmartAnthill Simple Devices; they implement a portion of SmartAnthill 2.0 Protocol Stack, with SmartAnthill Router providing the interface to the outside world and the conversion between IP-based requests/replies and Simple Device requests/replies.

SmartAnthill IP-enabled Device

SmartAnthill IP-enabled Device is a device which is able to handle IP requests itself. For example, if SmartAnthill IP-enabled Device uses IEEE 802.15.4 for communication, it may implement 6LoWPAN and IP stack with at least UDP support (TCP stack, which is more resource-intensive than UDP/IP stack, is optional for SmartAnthill IP-enabled Devices). SmartAnthill IP-enabled Devices can and should be accessed without the assistance of SmartAnthill Router.

Life Cycle of SmartAnthill Device

Let’s consider how new devices are added and used within a SmartAnthill system. The life cycle is a bit different for SmartAnthill Hobbyist Device and SmartAnthill Mass-Market Device.

Life Cycle of SmartAnthill Hobbyist-Oriented Device

During its life within SmartAnthill, a hobbyist-oriented device goes through the following stages:

  • Initial State. Initially (when shipped to the customer), Hobbyist-oriented SmartAnthill Device doesn’t need to contain any program. Program will be generated and device will be programmed as a part of ‘Program Generation and Programming’ stage. Therefore, programming connector is a must for hobbyist-oriented devices.
  • Specifying Configuration. Configuration is specified by a user (hobbyist) using a Dashboard Service. User selects board type and then specifies connections of sensors or actuators to different pins of the board. For example, one hobbyist might specify that she has [TODO] board and has a LED connected to pin 1, a temperature sensor connected to pins 2 through 5, and a DAC connected to pins 7 to 10.
  • Program Generation and Programming. Program generation and programming is performed by SmartAnthill Firmware Builder and Uploader automagically based on configuration specified in a previous step. Generated program includes a SmartAnthill stack, credentials necessary to authenticate the device to the network and vice versa (as described in SATP section below, authentication is done via symmetric keys), and subprograms necessary to handle devices specified in a previous step. Currently SmartAnthill supports either UART-programmed devices, or SIP-programmed devices [TODO:check]

After the device is programmed, it is automatically added to a SmartAnthill Database of available devices.

  • Operation. After the device is programmed, it can start operation. Device operation involves receiving and executing commands from Central Controller. Operations can be either device-specific (such as “measure temperature and report”), or generic (such as “wait for XXXX seconds and come back for further instructions”).

Life Cycle of SmartAnthill Mass-Market-Oriented Device

Mass-market devices are expected to be shipped in already programmed state, with a pre-defined configuration. Expected life cycle of a SmartAnthill Mass-market-oriented Device can be described as follows:

  • Initial State. Initially (when shipped to the customer), SmartAnthill mass-market-oriented device contains a program which ensures its operation. Re-programming capability and connector are optional for SmartAnthill mass-market-oriented devices.
  • “Pairing” with Central Controller. “Pairing” includes Central Controller (controlled via SmartAnthill Dashboard) generating and exchanging credentials with device, querying device configuration and capabilities, and entering credentials, configuration and capabilities into SmartAnthill Database. “Pairing” is described in detail in SmartAnthill Pairing document.
    • Physically, “pairing” can be done in one of two different ways:
      • OtA Single-LED Pairing. Requires the user to point a webcam of Central Controller (or a phone camera with SmartAnthill app running - TODO) at the Device intended to be paired. On the Device side, requires only one single LED (an existing LED MAY be re-used for “pairing”).
      • Zero Paper Pairing. Requires the user to enter a 26-symbol key into Central Controller. On the Device side, requires a printed key (unique to the Device); additionally requires the Device to fulfil Reprogramming Requirements as specified in SmartAnthill Pairing.
    • Special considerations: SmartAnthill Device MUST NOT allow keys to be extracted; the only action allowed is to re-pair the device with a different Central Controller, destroying previously existing credentials in the process. In other words, while it is possible to steal a device to use it with a different Central Controller, it should not be possible to impersonate a device without access to Central Controller. In addition, re-pairing MUST be initiated on the Device itself (and Devices MUST NOT allow initiating re-pairing remotely); this is necessary to ensure that, to hijack a Device, an attacker needs to be in physical possession of the Device.
  • Operation. Operation of Mass-market-oriented device is the same as operation of Hobbyist-oriented device.
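The special considerations for “pairing” amount to three rules: keys cannot be read back, re-pairing replaces (and thereby destroys) the old credentials, and re-pairing is refused unless it is initiated on the Device itself. The sketch below illustrates these rules only; the class and method names are hypothetical, the use of HMAC-SHA256 to prove key possession is an assumption made for the example (the actual protocols are described in the SmartAnthill Pairing document), and Python attribute hiding is of course not a real protection mechanism.

```python
import hashlib
import hmac


class DeviceCredentials:
    """Device-side credential store with no method that returns the key."""

    def __init__(self):
        self.__key = None          # never exposed through the public interface
        self.controller_id = None

    def re_pair(self, controller_id, new_key, initiated_locally):
        # Re-pairing MUST be initiated on the Device itself, never remotely.
        if not initiated_locally:
            raise PermissionError("re-pairing must be initiated on the Device")
        # Overwriting the key destroys the previous credentials.
        self.__key = bytes(new_key)
        self.controller_id = controller_id

    def tag(self, message):
        """Prove possession of the key without revealing it (assumed HMAC-SHA256)."""
        if self.__key is None:
            raise RuntimeError("device is not paired")
        return hmac.new(self.__key, message, hashlib.sha256).digest()
```

After a local re-pair with a different Central Controller, tags computed with the old key no longer verify, which matches the statement that a stolen device can be reused but not used to impersonate the original one.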
SmartAnthill protocol stack

SmartAnthill protocol stack is described in detail in a separate document, SmartAnthill 2.0 Protocol Stack.

SmartAnthill 2.0 Core Architecture

Version:v0.1b

SmartAnthill Core is cross-platform software written in Python; it should support the popular operating systems (Mac OS X, Linux (+ARM) and Windows).

Requirements

The system resource requirements of SmartAnthill Core should be very low:

  • < 1% CPU in IDLE mode
  • < 20MB RAM for service/daemon
  • < 20MB of free disk space (the cross-compilers, tool chains and firmware upload software are not included here)

SmartAnthill Core Services

SmartAnthill Core operates on a PC as a system foreground daemon made up of several services of its own.

@TODO more explanation

API Service

API Service is responsible for receiving requests (via REST, WebSocket or Socket) from System Control Software and taking necessary measures to execute them via SmartAnthill Command&Control Protocol (SACCP).

@TODO more explanation

Dashboard Service

Dashboard Service is a web-based GUI (it requires a browser with JavaScript enabled) which allows:

  • to manage SmartAnthill Devices (add, edit or remove them, customise with the specific capabilities/plugins/operations)
  • to generate and upload device-compatible firmware via “TrainIt” wizard (see explanation below in SmartAnthill Firmware Builder and Uploader)
  • to monitor SmartAnthill Heterogeneous Network in real time (operational state of each device, the number of sent/received messages, errors, etc.)
  • to analyze log messages

@TODO more explanation

Device Service

@TODO more explanation

Message Queuing Service

SmartAnthill liteMQ Protocol:

  • Queues

  • Exchanges

    • Direct
    • Fanout
    • Topic
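The liteMQ protocol is not specified here yet, so the following is only a guess at what the three exchange types might mean, borrowing the semantics these names have in AMQP-style brokers (an assumption, not something this document confirms): a fanout exchange delivers to every bound queue, a direct exchange delivers on an exact key match, and a topic exchange matches dot-separated patterns where `*` stands for exactly one word and `#` for zero or more words. The function names and the binding format are hypothetical.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may swallow any number of leading words, including none.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        return bool(k) and p[0] in ("*", k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """bindings is a list of (queue_name, binding_key); returns the receiving queues."""
    if exchange_type == "fanout":
        return [queue for queue, _ in bindings]
    if exchange_type == "direct":
        return [queue for queue, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [queue for queue, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unknown exchange type: {exchange_type}")
```

Under these assumed semantics, a message with routing key `device.ant1.temperature` on a topic exchange reaches queues bound with `device.#` and `device.*.temperature`, but not one bound with `device.*`.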

@TODO more explanation

Network Service

Network Service is based on the SmartAnthill 2.0 Protocol Stack and operates with the network data (messages, packets, fragments) within SmartAnthill Heterogeneous Network.

@TODO more explanation

SmartAnthill Router

SmartAnthill Router is responsible for translating IP-based requests into bus-specific requests for SmartAnthill Simple Devices (also see document SmartAnthill 2.0 Protocol Stack for details).

SmartAnthill Router operates one or more ‘buses’. Each SmartAnthill bus can be either a traditional wired bus (such as CAN bus), or a wireless ‘bus’. Wireless SmartAnthill ‘buses’ do not imply any wired connection; they just represent a certain domain of wireless connections. For example, one wireless ‘bus’ can be an IEEE 802.15.4 ‘bus’ controlling some devices connected via IEEE 802.15.4, and at the same time another wireless ‘bus’ can be a 433 MHz RF ‘bus’ controlling some other devices connected via 433 MHz RF. Each bus (wired or wireless) has one or more simple devices (such as sensors or actuators) connected to it (in case of wireless buses, the connection is wireless). Each device runs an MPU (or in theory a CPU), which runs a SmartAnthill stack (either a reference stack, or some other implementation).

It should be noted that IP-enabled devices do not use SmartAnthill Router to operate; they can and SHOULD be addressed directly via their IP.

SmartAnthill Firmware Builder and Uploader

SmartAnthill Firmware Builder and Uploader is implemented on top of PlatformIO.

@TODO PlatformIO role should be explained here

SmartAnthill Database

  • Board settings
  • Pre-configured plugins
  • Application state
  • Configs

Protocols

SmartAnthill 2.0 Protocol Stack

Version:v0.3.1

IMPORTANT: This document is obsolete. Please DO NOT modify it. Please refer to SimpleIoT Protocol Stack for an up to date version.

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture document, please make sure to read it before proceeding.

SmartAnthill protocol stack is intended to provide communication services between SmartAnthill Clients and SmartAnthill Devices, allowing SmartAnthill Clients to control SmartAnthill Devices. These communication services are implemented as request-response services within the OSI/ISO network model.

Actors

In SmartAnthill Protocol Stack, there are three distinct actors:

  • SmartAnthill Client. Whoever needs to control SmartAnthill Device(s). SmartAnthill Clients are usually implemented by SmartAnthill Central Controllers, though this is not strictly required.
  • SmartAnthill Router. SmartAnthill Router allows SmartAnthill Devices connected to it to be controlled. It performs conversion between SAoIP and SADLP-* protocols (see below).
  • SmartAnthill Device. Physical device (containing sensor(s) and/or actuator(s)), which implements at least some parts of SmartAnthill protocol stack. Every SmartAnthill Device runs its own IP/SAoIP stack (but not necessarily TCP stack).

Addressing

In SmartAnthill, each SmartAnthill Device is assigned its own IPv6 address (usually generated pseudo-randomly as specified in RFC4193). Currently the only supported SAoIP subprotocol (also known as “SAoIP flavour”) is SAoUDP; for further details, please refer to SmartAnthill-over-IP Protocol (SAoIP) and SmartAnthill Router document. When transferring SAoUDP packets over SAMP, IPv6 and UDP headers MUST be compressed (as described in SmartAnthill-over-IP Protocol (SAoIP) and SmartAnthill Router document; techniques described there are similar to those of 6LoWPAN, but are more specific to SmartAnthill tasks, and are more efficient for our purposes as a result).
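For readers unfamiliar with RFC 4193, a Unique Local IPv6 address consists of the fd00::/8 prefix, a 40-bit pseudo-random Global ID, a 16-bit Subnet ID and a 64-bit Interface ID. The sketch below shows how such addresses could be composed. It is a simplification: RFC 4193 suggests deriving the Global ID from a hash of a timestamp and an EUI-64, whereas this example simply draws 40 random bits. The function names, and the way SmartAnthill actually assigns Subnet and Interface IDs, are assumptions.

```python
import ipaddress
import secrets


def random_ula_prefix():
    """Return a /48 Unique Local prefix: fd00::/8 plus a random 40-bit Global ID."""
    global_id = secrets.randbits(40)
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


def device_address(prefix, subnet_id, interface_id):
    """Compose prefix | 16-bit Subnet ID | 64-bit Interface ID into one address."""
    if not 0 <= subnet_id < 2 ** 16:
        raise ValueError("subnet_id must fit in 16 bits")
    if not 0 <= interface_id < 2 ** 64:
        raise ValueError("interface_id must fit in 64 bits")
    base = int(prefix.network_address)
    return ipaddress.IPv6Address(base | (subnet_id << 64) | interface_id)
```

For instance, `device_address(random_ula_prefix(), 1, 0x2A)` yields an address inside the generated /48 prefix that ends in `:1::2a`.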

Relation between SmartAnthill protocol stack and OSI/ISO network model

Layer | OSI-Model | SmartAnthill Protocol Stack | Function | Implementation on Clients | Implementation on Routers (IP side / SA side) | Implementation on Devices
7 | Application | Zepto VM | Device Control | Control Program | | Zepto VM
6 | Presentation | SACCP | Command/Reply Handling | SACCP | | SACCP
5 | Session | SAGDP | Guaranteed Delivery | SAGDP (“Master”) | | SAGDP (“Slave”)
5 | Session | SASP | Encryption and Authentication | SASP | SASP (optional) | SASP
4 | Transport | SAoIP | Transport over IP Networks | SAoIP | SAoIP / SAoUDP+UDP (compressed) | SAoUDP+UDP (compressed)
4 | Transport | UDP | As usual for UDP | UDP | UDP / SAoUDP+UDP (compressed) | SAoUDP+UDP (compressed)
3 | Network | SAMP or IP | Mesh for SAMP, As usual for IP | IP | IP / SAMP | SAMP
2 | Datalink | SADLP-* | Intra-bus addressing, Fragmentation (if applicable) | – (standard network capabilities) | – (std netwk capabilities) / SADLP-* | SADLP-*
1 | Physical | Physical | | – (standard network capabilities) | – (std netwk capabilities) / Physical | Physical

SmartAnthill protocol stack consists of the following protocols:

  • Zepto VM. Essentially a byte-code interpreter, where byte-code is optimized for extremely resource-constrained devices. Zepto VM handles generic commands and routes device-specific commands to device-specific plug-ins.
  • SACCP – SmartAnthill Command&Control Protocol. Corresponds to Layer 7 of OSI/ISO network model.
  • SAGDP – SmartAnthill Guaranteed Delivery Protocol. Belongs to Layer 5 of OSI/ISO network model. Provides guaranteed command/reply delivery. Flow control is implemented, but is quite rudimentary (only one outstanding packet is normally allowed for each virtual link, see details below). On the other hand, SAGDP provides efficient support for scenarios such as temporary disabling receiver on the SmartAnthill Device side; such scenarios are very important to ensure energy efficiency.
  • SASP – SmartAnthill Security Protocol. Due to several considerations (including resource constraints) SmartAnthill protocol stack implements security on a layer right below SAGDP, so SASP essentially belongs to Layer 5 of OSI/ISO network model.
  • SAoIP – SmartAnthill over IP Protocol. Currently only SAoUDP is supported, in the future support for SAoTCP MIGHT be added, but it won’t be mandatory for Devices.
  • SAMP - SmartAnthill Mesh Protocol. EXPERIMENTAL. Aims to provide heterogeneous mesh network with an explicit “storm” control within applicable collision domains.
  • SADLP-* – SmartAnthill DataLink Protocol family. Belongs to Layer 2 of OSI/ISO network model. SADLP-* is specific to an underlying transfer technology (so for CAN bus SADLP-CAN is used, for IEEE 802.15.4 SADLP-IEEE802.15.4 is used). SADLP-* handles fragmentation if necessary and provides non-guaranteed packet transfer.

Error Handling Philosophy and Asymmetric Nature

In real-world operation, it is inevitable that from time to time a mismatch occurs between the states of SmartAnthill Central Controller and SmartAnthill Device. While such mismatches should never occur as long as the SmartAnthill protocols are strictly adhered to, mismatches may still occur for many practical reasons, such as a reboot or restore-from-backup of SmartAnthill Central Controller, or a transient failure of the SmartAnthill Device (for example, due to power surge, near-depleted battery, RAM soft error due to cosmic rays, etc.).

SmartAnthill protocol stack attempts to clear as many such scenarios as possible ‘automagically’, without the need to reprogram SmartAnthill Device. To achieve this goal, the following approach is used: SmartAnthill protocol stack assumes that in any case when there is any kind of the mismatch, it is the SmartAnthill Central Controller who’s “right”. In addition, if such a decision is not sufficient to recover from the mismatch, SmartAnthill Device will perform complete re-initialization.

It means that certain SmartAnthill protocols (such as SACCP and SAGDP) are inherently asymmetrical; details are provided in their respective documents ( SmartAnthill Command&Control Protocol (SACCP) and SmartAnthill Guaranteed Delivery Protocol (SAGDP) ).

TODO: recommend on-device self-recovery circuit?

Packet Chains

SmartAnthill protocol stack is intended to provide various services between two entities: SmartAnthill Central Controller and SmartAnthill Device. Most of these services are of request-response nature. To implement them while imposing the least requirements on the resource-stricken SmartAnthill Device, all interactions within SmartAnthill protocol stack at the levels between SACCP and SAGDP (inclusive) are considered as “packet chains”, when one of the parties initiates communication by sending a packet P1, another party responds with a packet P2, then first party may respond to P2 with P3 and so on.

Chains are initiated by the topmost protocol in the SmartAnthill protocol stack, SACCP, and are supported by all the layers between SACCP and SAGDP (inclusive). Whenever SACCP issues a packet to an underlying protocol, it MUST specify whether a packet is a first, intermediate, or last within a “packet chain” (using ‘is-first’ and ‘is-last’ flags; note that due to “rules of engagement” described below, ‘is-first’ and ‘is-last’ flags are inherently incompatible, which MAY be relied on by implementation). This information allows underlying protocols (down to SAGDP) to arrange for proper retransmission if some packets are lost during communication, see SmartAnthill Guaranteed Delivery Protocol (SAGDP) document for details.

Starting from OSI Layer 2 and above, there is a virtual link established between SmartAnthill Central Controller and SmartAnthill Device. Normally (as guaranteed by SAGDP) only one outstanding packet is allowed on each such virtual link. There is one exception to this rule, which is described below.

Handling of temporary dual “packet chains”

Normally, at each moment for each of the ‘virtual links’ described above, there can be only one “packet chain” active, and within a “packet chain”, all transmissions are always sequential. However, there are scenarios when both SmartAnthill Central Controller and SmartAnthill Device try to initiate their own “packet chains”. One such example is when SmartAnthill Device is sleeping according to instructions received from SmartAnthill Central Controller (and has just woken up to perform a task and report), and meanwhile SmartAnthill Central Controller has made a decision (for example, due to the input from other SmartAnthill Devices or from the end-user) to issue a different set of instructions to the SmartAnthill Device.

Handling of these scenarios is explained in detail in respective documents ( SmartAnthill Command&Control Protocol (SACCP) and SmartAnthill Guaranteed Delivery Protocol (SAGDP) ); as a result of such handling, one of the chains (the one coming from the SmartAnthill Device, according to “Central Controller is always right” principle described above), will be dropped pretty much as if it has never been started.

Packet Size Guarantees, DEVICECAPS instruction, SACCP_GUARANTEED_PAYLOAD, and Fragmentation

In SmartAnthill, SACCP MUST allow sending commands with at-least-8-bytes payload; all underlying protocols MUST support it (taking into account appropriate header sizes, so, for example, SASP MUST be able to pass at least 8_bytes+SACCP_headers+SAGDP_headers as payload). If Client needs to send a command which is larger than 8 bytes, it SHOULD obtain information about device capabilities, before doing it. Currently, SmartAnthill provides two ways to do it:

  • to obtain Device Capabilities information about SmartAnthill Device from SmartAnthill DB (see SmartAnthill 2.0 Overall Architecture document for details) at the time of SmartAnthill Device programming or “pairing”. This method is currently beyond the scope of SmartAnthill Protocols (TODO: should we add it?).
  • to obtain Device Capabilities information via Zepto VM DEVICECAPS instruction (see Zepto VM document for details). When Client doesn’t have information about Device, its SACCP request with Zepto VM’s DEVICECAPS instruction MUST be <= 8 bytes in size; Zepto VM’s SACCP reply to a DEVICECAPS instruction MAY be larger than 8 bytes if it is specified in the instruction (and if the Device itself is capable of sending it).

One of DeviceCapabilities fields is SACCP_GUARANTEED_PAYLOAD (which is conceptually similar to MTU from IP stack, but includes header sizes to provide information which is appropriate for Layer 7). When SmartAnthill Device fills in SACCP_GUARANTEED_PAYLOAD in response to Device Capabilities request, it MUST take into account capabilities of its L1/L2 protocol; that is, if a SmartAnthill Device supports IEEE 802.15.4 and an L2 protocol which doesn’t perform packet fragmentation and re-assembly, then the Device won’t be able to send/receive payloads larger than roughly 80 bytes (exact size depends on headers and needs to be calculated depending on protocol specifics), and it MUST NOT report DeviceCapabilities.SACCP_GUARANTEED_PAYLOAD which is more than this amount.

In SmartAnthill, fragmentation and re-assembly is a responsibility of SADLP-* family of protocols. If implemented, it may allow device to increase reported (and sent/received) SACCP_GUARANTEED_PAYLOAD.

All SmartAnthill Protocols, except for SADLP-*, MUST support SACCP payload sizes of at least 384 bytes. Therefore, after obtaining Device Capabilities for a SmartAnthill Device, SmartAnthill Client MAY calculate min(DeviceCapabilities.SACCP_GUARANTEED_PAYLOAD,384) to determine SACCP payload size which is guaranteed to be delivered to the Device. Alternatively, SmartAnthill Client MAY calculate min(DeviceCapabilities.SACCP_GUARANTEED_PAYLOAD,Client_Side_SACCP_Payload) for the same purpose (here Client_Side_SACCP_Payload will depend on SAoIP protocol in use).
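As an illustration of the first calculation, here is a minimal C sketch. The function name saccp_guaranteed_payload and the macro name are hypothetical (they are not part of the specification); only the 384-byte floor and the min() rule come from the text above.

```c
#include <stdint.h>

/* Payload size which every SmartAnthill protocol except SADLP-* must support. */
#define SACCP_STACK_MIN_PAYLOAD 384u

/* Hypothetical helper: returns the SACCP payload size which is guaranteed
   to be delivered to a Device that reported device_guaranteed_payload
   in DeviceCapabilities.SACCP_GUARANTEED_PAYLOAD. */
uint16_t saccp_guaranteed_payload(uint16_t device_guaranteed_payload)
{
    if (device_guaranteed_payload < SACCP_STACK_MIN_PAYLOAD)
        return device_guaranteed_payload;
    return (uint16_t)SACCP_STACK_MIN_PAYLOAD;
}
```

A Client that knows its own SAoIP-dependent limit could pass that limit instead of the fixed 384 to obtain the second variant of the calculation.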

Stack-Wide Encodings

There are some encodings and encoding conventions which are used throughout SmartAnthill Protocol Stack.

SmartAnthill Encoded-Unsigned-Int

In several places in SmartAnthill Protocol Stack, there is a need to encode integers, which happen to be small most of the time (one such example is sizes, another example is some kinds of incrementally-increased ids). To encode them efficiently, SmartAnthill Protocol Stack uses a compact encoding, which encodes small integers with smaller number of bytes. Encoded-Unsigned-Int is very close to Variable-length quantity (VLQ) (see http://en.wikipedia.org/wiki/Variable-length_quantity), however, SmartAnthill Encoded-Unsigned-Int<> encoding enforces “canonical” VLQ representation, prohibiting non-optimal encodings such as two-byte encoding of ‘0’. Also note that other encodings such as Encoded-Signed-Int are different from what is described on VLQ Wikipedia page.

Encoded-Unsigned-Int is a variable-length encoding of unsigned integers. Namely:

  • if the first byte of Encoded-Unsigned-Int is c1 <= 127, then the value of Encoded-Unsigned-Int is equal to c1
  • if the first byte of Encoded-Unsigned-Int is c1 >= 128, then the next byte c2 is needed:
    • if the second byte of Encoded-Unsigned-Int is c2 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint16)(c1&0x7F) | ((uint16)c2 << 7)).
    • if the second byte of Encoded-Unsigned-Int is c2 >= 128, then the next byte c3 is needed:
      • if the third byte of Encoded-Unsigned-Int is c3 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint32)(c1&0x7F) | ((uint32)(c2&0x7F) << 7) | ((uint32)c3 << 14)).
      • if the third byte of Encoded-Unsigned-Int is c3 >= 128, then the next byte c4 is needed:
        • if the fourth byte of Encoded-Unsigned-Int is c4 <= 127, then the value of Encoded-Unsigned-Int is equal to ((uint32)(c1&0x7F) | ((uint32)(c2&0x7F) << 7) | ((uint32)(c3&0x7F) << 14) | ((uint32)c4 << 21)).
        • if the fourth byte of Encoded-Unsigned-Int is c4 >= 128, then the next byte c5 is needed.
          • for nth byte:
            • if the nth byte of Encoded-Unsigned-Int is cn <= 127, then the value of Encoded-Unsigned-Int is equal to ((uintNN)(c1&0x7F) | ((uintNN)(c2&0x7F) << 7) | ((uintNN)(c3&0x7F) << 14) | ... | ((uintNN)(c<n-1>&0x7F) << (7*(n-2))) | ((uintNN)cn << (7*(n-1)))), where uintNN is sufficient to store the result. NB: in practice, for Encoded-Unsigned-Ints over 4 bytes, implementation is likely to be quite different from, but equivalent to, the formula given.
            • if the nth byte of Encoded-Unsigned-Int is cn >= 128, then the <n+1>th byte is needed.

IMPORTANT: Encoded-Unsigned-Int enforces “canonical” representation. It means that all integers MUST be encoded with the smallest number of bytes possible. This requirement is equivalent to a requirement that for encodings with length > 1, last byte of encoding MUST NOT be equal to zero. This MUST be checked by compliant implementations (and MUST generate invalid-encoding exception, with effects depending on the point where it has occurred).
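The decoding rules and the canonical-representation check above can be sketched in C roughly as follows. This is an illustrative sketch only: the names eui_encode and eui_decode, the use of uint64_t as the working type, and the "return 0 on error" convention are assumptions, not part of the specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes v low-to-high, 7 bits per byte; returns the number of bytes written.
   out must have room for 10 bytes (enough for any uint64_t value). */
size_t eui_encode(uint64_t v, uint8_t *out)
{
    size_t n = 0;
    while (v > 127) {
        out[n++] = (uint8_t)((v & 0x7F) | 0x80); /* high bit set: more bytes follow */
        v >>= 7;
    }
    out[n++] = (uint8_t)v; /* last byte: high bit clear, non-zero if n > 0 */
    return n;
}

/* Decodes at most max_len bytes from in[0..in_len).
   Returns the number of bytes consumed, or 0 if the encoding is truncated,
   longer than max_len bytes, non-canonical, or does not fit in 64 bits. */
size_t eui_decode(const uint8_t *in, size_t in_len, size_t max_len, uint64_t *value)
{
    uint64_t v = 0;
    if (max_len > 10)
        max_len = 10;
    for (size_t i = 0; i < in_len && i < max_len; i++) {
        uint8_t c = in[i];
        if (i == 9 && c > 1)
            return 0; /* 10th byte may only carry bit 63 */
        v |= (uint64_t)(c & 0x7F) << (7 * i);
        if (c <= 127) {
            if (i > 0 && c == 0)
                return 0; /* non-canonical: last byte of a multi-byte encoding is zero */
            *value = v;
            return i + 1;
        }
    }
    return 0; /* ran out of input or exceeded max_len */
}
```

On an 8-bit MCU a real implementation would more likely work byte-by-byte into a fixed-size buffer; the 64-bit accumulator is used here only to keep the sketch short.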

The following table shows how many Encoded-Unsigned-Int bytes are necessary to encode ranges of Encoded-Unsigned-Int values:

Encoded-Unsigned-Int Values                         Encoded-Unsigned-Int Bytes   Fully Covers   Result fits in
0-127                                               1                            7 bits         1 byte
128-16 383                                          2                            14 bits        2 bytes
16 384-2 097 151                                    3                            21 bits        3 bytes
2 097 152-268 435 455                               4                            28 bits        4 bytes
268 435 456-34 359 738 367                          5                            35 bits        5 bytes
34 359 738 368-4 398 046 511 103                    6                            42 bits        6 bytes
4 398 046 511 104-562 949 953 421 311               7                            49 bits        7 bytes
562 949 953 421 312-72 057 594 037 927 935          8                            56 bits        8 bytes
72 057 594 037 927 936-9 223 372 036 854 775 807    9                            63 bits        8 bytes

IMPORTANT: Encoded-Unsigned-Int encoding (specifically, low-to-high byte encoding order) guarantees that for even numbers, first byte of encoded value is always even. This property MAY be relied on in other places in protocol stack, specifically, in “indicate an error in an unknown-length field” scenarios (so if we decide to change order of bytes in the encoding, we need to change logic in those places too).

Table of correspondence of “max=” parameter and maximum possible encoding length:

max= maximum Encoded-Unsigned-Int bytes
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 10

Encoded-Signed-Int

Encoded-Signed-Int is an encoding for signed integers, based on Zig-Zag conversion from signed integer to unsigned integer, and subsequent Encoded-Unsigned-Int encoding of unsigned integer.

Zig-Zag conversion is the same as described here: https://developers.google.com/protocol-buffers/docs/encoding?csw=1#types. For example, to convert int16_t sx to uint16_t ux, the following C language expression is used:

ux = (uint16_t)((sx << 1) ^ (sx>>15))

To convert int32_t sx to uint32_t ux, expression becomes ux = (uint32_t)((sx << 1) ^ (sx>>31)), and so on.

Note that right shift in these expressions is a signed shift, making it equivalent to creating a bitmask of appropriate length, consisting either of all ‘0’s or of all ‘1’s (equal to the sign bit of original signed integer). This allows, for example, to calculate one byte of this mask by signed-shifting highest byte of sx to the right by 7, and then to use this byte for XORing with all the bytes of left-shifted sx; this trick should speed up implementations on 8-bit MCUs.

After ux is calculated, it is stored as an Encoded-Unsigned-Int of the appropriate size, as described above.

To perform Zig-Zag conversion back (from Zig-Zag-encoded unsigned ux to original signed sx), the following expression may be used (shown for 16-bit conversions; expressions for other sizes are very similar):

sx = (int16_t)((ux >> 1) ^ (-(ux & 1)))

Note that once again, all bits (and therefore bytes) of (-(ux&1)) are the same, so one byte can be calculated (this time - based on lowest byte) and then used for XORing with all the bytes of right-shifted ux.
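For reference, the two 16-bit conversions can be wrapped into small C functions as sketched below. The function names are hypothetical. The sketch builds the all-zeros/all-ones mask explicitly instead of relying on a signed right shift (whose result for negative values is implementation-defined in C); on compilers with arithmetic right shift the result is the same as for the expressions above.

```c
#include <stdint.h>

/* Zig-Zag: maps 0,-1,1,-2,2,... to 0,1,2,3,4,... */
uint16_t zigzag_encode16(int16_t sx)
{
    /* Mask with all bits equal to the sign bit of sx (same bits as sx>>15). */
    uint16_t mask = (sx < 0) ? 0xFFFFu : 0u;
    return (uint16_t)(((uint16_t)sx << 1) ^ mask);
}

/* Inverse conversion. */
int16_t zigzag_decode16(uint16_t ux)
{
    /* Mask with all bits equal to the lowest bit of ux (same bits as -(ux&1)). */
    uint16_t mask = (ux & 1u) ? 0xFFFFu : 0u;
    uint16_t bits = (uint16_t)((ux >> 1) ^ mask);
    /* Assumes two's complement conversion, as on all common MCUs. */
    return (int16_t)bits;
}
```

The value returned by zigzag_encode16 would then be stored as an Encoded-Unsigned-Int, as described above.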

Encoded-*-Int<max=>

Wherever SmartAnthill specification mentions Encoded-Unsigned-Int or Encoded-Signed-Int, it MUST specify it in the form of Encoded-Unsigned-Int<max=...> or Encoded-Signed-Int<max=...>. “max=” parameter specifies maximum number of bytes which are necessary to represent the encoded number. For example, Encoded-Unsigned-Int<max=2> specifies that the number is between 0 and 65535 (and therefore from one to three bytes may be used to encode it). The high bit of the last possible byte of Encoded-*-Int is always 0; this ensures an option for an easy expansion in the future.

Currently supported values of “max=” parameter are from 1 to 8.

When parsing Encoded-*-Int, if high bit in the last-possible byte is 1, then Encoded-*-Int is considered invalid. Handling of invalid Encoded-*-Ints SHOULD be specified in the appropriate place of documentation.
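The "last-possible byte" for a given “max=” value follows from the fact that each encoded byte carries 7 bits of payload. A minimal sketch, assuming a hypothetical helper name eint_max_encoded_len, reproduces the table of “max=” values and maximum encoding lengths given earlier:

```c
/* Maximum number of encoded bytes for Encoded-*-Int<max=max_bytes>:
   8*max_bytes bits of value, 7 bits per encoded byte, rounded up. */
unsigned eint_max_encoded_len(unsigned max_bytes)
{
    return (8u * max_bytes + 6u) / 7u;
}
```

A parser can use this value as the limit on the number of bytes it reads before declaring the Encoded-*-Int invalid.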

SmartAnthill Endianness

In most cases, SmartAnthill Protocol Stack uses SmartAnthill Encoded-*-Int<max=...> to encode integers. However, there are some cases where we need an exact number of bytes, and have no idea about their statistical distribution. In such cases, using Encoded-*-Int<> would be a waste.

In such cases, SmartAnthill uses SmartAnthill Endianness, which is LITTLE-ENDIAN.

Rationale for using LITTLE-ENDIAN encoding (rather than “network byte order” which is traditionally big-endian) is based on the observation that the most resource-constrained MPUs out of target group (namely PIC and AVR8), are little-endian. For them, the difference of not doing conversion between protocol-order and MPU-order might be important; as the other MPUs are not that much constrained, we don’t expect the cost of conversion to be significant. In other words, this LITTLE-ENDIAN decision favours poorer-resource MPUs at the cost of richer-resource MPUs.
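A portable way to read and write fixed-size fields in SmartAnthill Endianness, independent of the byte order of the host, is sketched below. The function names are hypothetical; on a little-endian MCU an implementation may simply copy the bytes instead.

```c
#include <stdint.h>

/* Writes v as 2 bytes, least significant byte first. */
void sa_put_u16(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)(v >> 8);
}

/* Reads 2 bytes, least significant byte first. */
uint16_t sa_get_u16(const uint8_t *in)
{
    return (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
}

/* Writes v as 4 bytes, least significant byte first. */
void sa_put_u32(uint8_t *out, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* Reads 4 bytes, least significant byte first. */
uint32_t sa_get_u32(const uint8_t *in)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)in[i] << (8 * i);
    return v;
}
```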

SmartAnthill Bitfields

In some cases, SmartAnthill Protocols use bitfields; in such cases:

  • bitfields MUST use 1-byte, 2-byte, Encoded-Unsigned-Int<max=>, or Encoded-Signed-Int<max=> field as a ‘substrate’. ‘Bitfield Substrate’ is composed/parsed as an ordinary field, which is encoded using appropriate encodings described in this document.
  • as soon as ‘substrate’ is parsed, it is treated as an integer, out of which specific bits can be used; these bits are specified as [3] (specifying that single bit #3 is used), or [2..4] (specifying that bits from 2 to 4 - inclusive - are used)
  • if ‘substrate’ is an Encoded-Unsigned-Int field, then one of bitfields MAY be specified as [2..] - specifying that all the bits from 2 to the highest available one, are used for the bitfield.
  • if ‘substrate’ is an Encoded-Signed-Int field, then one of bitfields MAY be specified as [2..] - specifying that all the bits from 2 to the highest available one, are used for the bitfield; in this example, the bitfield in question MUST be calculated as substrate>>2, where substrate is treated as signed (i.e. ‘>>’ operator works extending sign bit).

SmartAnthill Half-Float

Some SmartAnthill commands use ‘Half-Float’ data as described here: http://en.wikipedia.org/wiki/Half-precision_floating-point_format . SmartAnthill serializes such data as 2-byte substrate (encoded according to SmartAnthill Endianness), then considering Sign-Bit bitfield as bit [15], Exponent bitfield as bits [10..14], and Fraction bitfield as bits [0..9].
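Putting SmartAnthill Endianness and SmartAnthill Bitfields together, extracting the three Half-Float bitfields from the 2-byte substrate might look like the following sketch. The struct and function names are hypothetical; converting the fields into a native float is left out.

```c
#include <stdint.h>

/* Hypothetical container for the three Half-Float bitfields. */
typedef struct {
    uint8_t  sign;      /* bit [15]       */
    uint8_t  exponent;  /* bits [10..14]  */
    uint16_t fraction;  /* bits [0..9]    */
} sa_half_fields;

/* Parses a 2-byte little-endian substrate into its bitfields. */
sa_half_fields sa_half_parse(const uint8_t in[2])
{
    uint16_t substrate = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    sa_half_fields f;
    f.sign     = (uint8_t)((substrate >> 15) & 0x01);
    f.exponent = (uint8_t)((substrate >> 10) & 0x1F);
    f.fraction = (uint16_t)(substrate & 0x03FF);
    return f;
}
```

For example, the half-precision value 1.0 (0x3C00) is serialized as bytes 0x00, 0x3C and parses to sign 0, exponent 15, fraction 0.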

Layering remarks
SACCP and “packet chains”

SACCP is somewhat unusual for an application-level protocol in a sense that SACCP needs to have some knowledge about “packet chains” which are implicitly related to retransmission correctness. This is a conscious design choice of SACCP (and SAGDP) which has been made in face of extremely constrained (and unusual for conventional communication) environments which SmartAnthill protocol stack needs to support. It should also be noted that while some such details are indeed exposed to SACCP, they are formalized as a clear set of “rules of engagement” to be obeyed. As long as these “rules of engagement” are complied with, SACCP does not need to care about retransmission correctness (though the rationale for “rules of engagement” is still provided by retransmission correctness).

SASP below SAGDP

It is somewhat unusual to have encryption layer (SASP) “below” transport/session layer (SAGDP). This is a conscious design choice of SASP/SAGDP. In particular, it makes it possible to:

  • rely on the fact that all the packets reaching SAGDP layer are already authenticated; this makes it possible (at the cost of authenticating potentially malicious packets) to:
    • avoid attacks such as malicious RST sent to disrupt logical connection (TODO: check)
    • avoid attacks similar to “SYN flood” attacks
  • implement “Trusted Router” nodes in a simple manner (without implementing SAGDP on the router).

SmartAnthill Command&Control Protocol (SACCP)

Version:v0.2.17a

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SACCP is a part of SmartAnthill 2.0 protocol stack. It belongs to Level 7 of OSI/ISO Network Model, and is responsible for allowing SmartAnthill Client (usually implemented by SmartAnthill Central Controller) to control SmartAnthill Device.

Within SmartAnthill protocol stack, SACCP is located on top of SAGDP. On the side of SmartAnthill Device, on top of SACCP there is Zepto VM. On the side of SmartAnthill Client, on top of SACCP is Control Program.

Like its underlying protocol (which is usually SAGDP), SACCP is an asymmetric protocol; it means that behaviour of SACCP is somewhat different for SmartAnthill Device and SmartAnthill Client. For the purposes of SACCP underlying protocol, SmartAnthill Client is considered “master device”, and SmartAnthill Device is considered “slave device”.

SACCP Assumptions

It is assumed that authentication, encryption, integrity and reliable delivery should be implemented by protocol layers below SACCP. SACCP operates on data packets which are already defragmented, authenticated, decrypted, and are guaranteed to be reliably delivered (reliable delivery includes guarantees that every data packet is delivered only once; see also the exceptions to guaranteed delivery in cases of “dual packet chains” and “fatal error handling” below).

The underlying protocol of SACCP should support the concept of “packet chain” (see section “Packet Chains” for more details). SACCP, when sending a packet, MUST specify to the underlying protocol whether the packet is the first, intermediate, or last in the “packet chain” (and it receives this information back from the underlying protocol when receiving a packet). One protocol which can be used as SACCP underlying protocol, is SAGDP.

Packet Chains

All interactions in SACCP are considered as “packet chains” (see SmartAnthill 2.0 Protocol Stack document for more details). With “packet chains”, one of the parties initiates communication by sending a packet P1, another party responds with a packet P2, then first party may respond to P2 with P3 and so on. Whenever SACCP issues a packet to an underlying protocol, it MUST specify whether a packet is a first, intermediate, or last within a “packet chain” (using ‘is-first’ and ‘is-last’ flags; note that due to “rules of engagement” described below, ‘is-first’ and ‘is-last’ flags are inherently incompatible, which MAY be relied on by implementation). This information allows underlying protocol to arrange for proper retransmission if some packets are lost during communication.

Handling of Fatal Errors

SACCP is built under the assumption that in case of any inconsistency between SmartAnthill Client and SmartAnthill Device, it is SmartAnthill Client which is right (see SmartAnthill 2.0 Protocol Stack document for more details). Keeping this in mind, implementation of SACCP underlying protocol on the SmartAnthill Device side MUST detect any fatal inconsistencies in the protocol (one example of such inconsistency is authenticated packet which is out-of-chain-order), and MUST invoke re-initialization of the SmartAnthill Device in this case. It is done regardless of the SACCP state and layers above SACCP, and MAY be done without notifying SACCP or any layers above the SACCP.

Layering remarks

SACCP (and its underlying protocol, which is normally SAGDP) is somewhat unusual for an application-level protocol in a sense that SACCP needs to care about details which are implicitly related to retransmission correctness. This is a design choice of SACCP (and SAGDP) which has been made in face of extremely constrained (and unusual for conventional communication) environments. It should also be noted that while some such details are indeed exposed to SACCP, they are formalized as a clear set of “rules of engagement” to be obeyed. As long as these “rules of engagement” are obeyed, SACCP does not need to care about retransmission correctness (though the rationale for “rules of engagement” is provided by retransmission correctness). Any references to retransmission correctness in current document are non-normative and are presented for the purposes of better understanding only.

SACCP Rules of Engagement

To ensure correct operation of an underlying protocol, there are certain rules (referred to as “rules of engagement”) which MUST be obeyed (note that these “rules of engagement” are not specific to SAGDP, but will be a general requirement for any underlying protocol of similar nature):

  1. Each packet belongs to a “chain”, and has associated flags which specify whether the packet ‘is-first’ or ‘is-last’

  2. All “chains” MUST be at least two packets long (this is required by an underlying protocol to ensure retransmission correctness)

    1. From (2) it follows that ‘is-first’ and ‘is-last’ flags are inherently incompatible (which MAY be relied on by implementation)
  3. Multiple replies to a single command are not allowed. Scenarios where a ‘double-reply’ to the same command is needed (for example, for commands which take a long or uncertain time, where a short “ACK” confirming that the command has been received is sent first, then the command is executed, and then the real reply is sent) SHOULD be handled in the same way as scenarios with disabling the receiver (‘last’ packet on the SmartAnthill Device side, then the long command, then SmartAnthill Device initiates a new chain), and MUST be implemented as follows:

    1. first reply MUST be the last packet in the “packet chain” (that is, it MUST have ‘is-last’ flag)
    2. second reply MUST start a new “packet chain” (that is, it MUST have ‘is-first’ flag)
      • TODO: this approach implies that there should be a reply-to-second-reply, need to see if it is restrictive enough in practice to consider adding special handling for double-replies
  4. If a device is going to turn off its receiver as a result of receiving a packet, such a packet MUST be the last packet in the “chain” (again, this is required to ensure retransmission correctness)

    1. From (2) and (4) it follows that if SmartAnthill Client needs to initiate a “packet chain” which requests SmartAnthill Device to turn off its receiver, such a chain MUST be at least 3 packets long. (NB: if such a chain is initiated by SmartAnthill Device, it MAY be 2 packets long).
  5. If the underlying protocol issues a packet with a ‘previous-send-aborted’ flag (which can happen only for SmartAnthill Device, and not for SmartAnthill Client), it means that underlying protocol has canceled a send of previously issued packet. In such cases, SACCP (and all the layers above) MUST NOT assume that previously issued packet was received by counterpart (TODO: maybe we can guarantee that the packet was NOT sent?)

  6. Due to the “Fatal Error Handling” mechanism described above, SACCP (as well as any layers above SACCP) on the SmartAnthill Device MUST assume that re-initialization can occur at any moment of their operation (at least whenever control is passed to the protocol which is an underlying protocol for SACCP). The effect of such re-initialization is that all volatile memory (such as RAM) is re-initialized, but all non-volatile memory (such as EEPROM) is preserved.

    As long as the “rules of engagement” above are obeyed, and SACCP properly informs an underlying protocol whether each packet it sends, is first, intermediary, or last in the chain, retransmission correctness can be provided by an underlying protocol, and SACCP doesn’t need to care about it.
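A very small part of these rules (rule 1 and rule 2.1) can be checked mechanically before a packet is handed to the underlying protocol. The sketch below is only an illustration under simplifying assumptions: the type and function names are hypothetical, it tracks a single chain as seen by one side, and it deliberately ignores temporary dual “packet chains” and the ‘previous-send-aborted’ flag.

```c
#include <stdbool.h>

/* Hypothetical per-link state: whether a "packet chain" is currently open. */
typedef struct {
    bool in_chain;
} saccp_chain_state;

/* Returns true if a packet with the given flags is consistent with the
   current chain state, and updates the state accordingly. */
bool saccp_chain_accept(saccp_chain_state *st, bool is_first, bool is_last)
{
    if (is_first && is_last)
        return false;            /* rule 2.1: flags are incompatible */
    if (is_first) {
        if (st->in_chain)
            return false;        /* a chain is already open */
        st->in_chain = true;
        return true;
    }
    if (!st->in_chain)
        return false;            /* intermediate/last packet without a chain */
    if (is_last)
        st->in_chain = false;    /* chain is complete */
    return true;
}
```

Because a packet can never be both first and last, every chain accepted by this check is automatically at least two packets long, which is what rule 2 requires.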

SACCP Checksum

To re-use the same code which is used for SASP anyway, SACCP uses OMAC (as used for EAX, exact details TBD), using a fixed key consisting of byte 0xA5 repeated 16 times, as “SACCP Checksum”. Further, SACCP Checksum MAY be truncated to required number of bytes (starting from the beginning of 16-byte OMAC tag) as necessary.

SACCP Packets

SACCP packets are divided into SACCP pairing packets, SACCP command packets (from SmartAnthill Client to SmartAnthill Device) and SACCP reply packets (from SmartAnthill Device to SmartAnthill Client).

SACCP Packet Type Constants

SACCP Packet Type Constants are 3-bit constants, used to recognize SACCP packet type.

For SACCP packets coming from Client to Device (a.k.a. “requests”), the following SACCP Packet Type Constants are recognized:

  • SACCP_PAIRING
  • SACCP_OTA_PROGRAMMING
  • SACCP_PHY_AND_ROUTING_DATA
  • SACCP_PROGRAM_TO_EXECUTE
  • SACCP_ENTROPY_PROVIDED

For SACCP packets coming from Device to Client (a.k.a. “responses”), the following SACCP Packet Type Constants are recognized:

  • SACCP_PAIRING
  • SACCP_OTA_PROGRAMMING
  • SACCP_PHY_AND_ROUTING_DATA
  • SACCP_OK
  • SACCP_ERROR

SACCP Pairing Packets

NB: implementing Pairing Packets is NOT REQUIRED for SmartAnthill Devices which use Zero Pairing (such as Hobbyist Devices).

| SACCP-OTA-PAIRING-REQUEST | OTA-PAIRING-REQUEST-BODY |

where SACCP-OTA-PAIRING-REQUEST is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_PAIRING 3-bit constant, bits [3..4] are “additional bits” passed from Pairing Protocol alongside with OTA-PAIRING-REQUEST-BODY, bits [5..7] are reserved (MUST be zeros), and OTA-PAIRING-REQUEST-BODY is described in SmartAnthill Pairing document.

| SACCP-OTA-PAIRING-RESPONSE | OTA-PAIRING-RESPONSE-BODY |

where SACCP-OTA-PAIRING-RESPONSE is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bits [0..2] equal to SACCP_PAIRING 3-bit constant, bits [3..4] are “additional bits” passed from Pairing Protocol alongside with OTA-PAIRING-RESPONSE-BODY, bits [5..] are reserved (MUST be zeros), and OTA-PAIRING-RESPONSE-BODY is described in SmartAnthill Pairing document.

SACCP-OTA-PAIRING-REQUEST is sent from Client to Device, and SACCP-OTA-PAIRING-RESPONSE is sent from Device to Client; they form a “packet chain” as described in SmartAnthill Pairing document.
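As an illustration of the 1-byte SACCP-OTA-PAIRING-REQUEST substrate layout, here is a minimal C sketch. The function names are hypothetical, and the numeric value of the SACCP_PAIRING constant is not given in this document, so it is passed in as a parameter rather than assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Builds the 1-byte substrate: bits [0..2] packet type constant,
   bits [3..4] "additional bits", bits [5..7] reserved (zeros). */
uint8_t saccp_pairing_request_header(uint8_t type3, uint8_t additional2)
{
    return (uint8_t)((type3 & 0x07) | ((additional2 & 0x03) << 3));
}

/* Splits the substrate back into its fields; returns false if any of the
   reserved bits [5..7] is set. */
bool saccp_pairing_request_parse(uint8_t hdr, uint8_t *type3, uint8_t *additional2)
{
    if (hdr & 0xE0)
        return false;
    *type3 = (uint8_t)(hdr & 0x07);
    *additional2 = (uint8_t)((hdr >> 3) & 0x03);
    return true;
}
```

The response substrate uses the same bit positions, but is an Encoded-Unsigned-Int<max=2>, so it has to be decoded first and may have reserved bits above bit 7.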

SACCP OtA Programming Packets

NB: implementing OtA Programming Packets is OPTIONAL for SmartAnthill Devices.

| SACCP-OTA-PROGRAMMING-REQUEST | OTA-PROGRAMMING-REQUEST-BODY |

where SACCP-OTA-PROGRAMMING-REQUEST is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_OTA_PROGRAMMING 3-bit constant, bits [3..5] are “additional bits” passed from SAOtAPP alongside with OTA-PROGRAMMING-REQUEST-BODY, bits [6..7] reserved (MUST be zeros), and OTA-PROGRAMMING-REQUEST-BODY is described in SmartAnthill Programming, Bootloaders and OtA Programming document.

| SACCP-OTA-PROGRAMMING-RESPONSE | OTA-PROGRAMMING-RESPONSE-BODY |

where SACCP-OTA-PROGRAMMING-RESPONSE is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bits [0..2] equal to SACCP_OTA_PROGRAMMING 3-bit constant, bits [3..5] being “additional bits” passed from SAOtAPP alongside with OTA-PROGRAMMING-RESPONSE-BODY, bits [6..] reserved (MUST be zeros), and OTA-PROGRAMMING-RESPONSE-BODY is described in SmartAnthill Programming, Bootloaders and OtA Programming document.

TODO: blocking all other messages (return TODO error) while OtA Programming Session is in progress (i.e. OtA Programming State being OTA_PROGRAMMING_INPROGRESS).

SACCP PHY-and-Routing-Data Packets

| SACCP-PHY-AND-ROUTING-DATA-REQUEST | PHY-AND-ROUTING-DATA-REQUEST-BODY |

where SACCP-PHY-AND-ROUTING-DATA-REQUEST is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_PHY_AND_ROUTING_DATA 3-bit constant, bits [3..5] are “additional bits” passed alongside with PHY-AND-ROUTING-DATA-REQUEST-BODY, bits [6..7] reserved (MUST be zeros), and PHY-AND-ROUTING-DATA-REQUEST-BODY is described in SmartAnthill Mesh Protocol (SAMP) document.

| SACCP-PHY-AND-ROUTING-DATA-RESPONSE | PHY-AND-ROUTING-DATA-RESPONSE-BODY |

where SACCP-PHY-AND-ROUTING-DATA-RESPONSE is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bits [0..2] equal to SACCP_PHY_AND_ROUTING_DATA 3-bit constant, bits [3..5] being “additional bits” passed alongside with PHY-AND-ROUTING-DATA-RESPONSE-BODY, bits [6..] reserved (MUST be zeros), and PHY-AND-ROUTING-DATA-RESPONSE-BODY is described in SmartAnthill Programming, Bootloaders and OtA Programming document.

SACCP Command Packets

IMPORTANT FORMAT CHANGES

SACCP command packets can be one of the following:

| SACCP-ENTROPY-PROVIDED | ENTROPY |

where SACCP-ENTROPY-PROVIDED is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_ENTROPY_PROVIDED 3-bit constant.

SACCP-ENTROPY-PROVIDED command is sent in response to SACCP_ERROR_ENTROPY_RECOVERY_NEEDED error, as a part of “entropy recovery” procedure. In response to SACCP-ENTROPY-PROVIDED, Device responds either with another SACCP_ERROR_ENTROPY_RECOVERY_NEEDED, or with SACCP_ERROR_ENTROPY_RECOVERY_COMPLETED. In response to the former, Client SHOULD send another SACCP-ENTROPY-PROVIDED command packet; in response to the latter, Client SHOULD repeat the original command packet (the one which has failed with SACCP_ERROR_ENTROPY_RECOVERY_NEEDED).
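On the Client side, the “entropy recovery” procedure boils down to a simple decision on each reply. The sketch below uses hypothetical enum and function names; it does not model the actual error codes or packet contents, only which step follows which reply.

```c
/* Hypothetical classification of a reply received from the Device. */
typedef enum {
    REPLY_ENTROPY_RECOVERY_NEEDED,
    REPLY_ENTROPY_RECOVERY_COMPLETED,
    REPLY_OTHER
} saccp_reply_kind;

/* Hypothetical next step for the Client. */
typedef enum {
    ACTION_SEND_ENTROPY_PROVIDED,    /* send (another) SACCP-ENTROPY-PROVIDED */
    ACTION_REPEAT_ORIGINAL_COMMAND,  /* re-send the command which has failed  */
    ACTION_PROCESS_REPLY             /* not part of entropy recovery          */
} client_action;

client_action client_next_action(saccp_reply_kind reply)
{
    switch (reply) {
    case REPLY_ENTROPY_RECOVERY_NEEDED:
        return ACTION_SEND_ENTROPY_PROVIDED;
    case REPLY_ENTROPY_RECOVERY_COMPLETED:
        return ACTION_REPEAT_ORIGINAL_COMMAND;
    default:
        return ACTION_PROCESS_REPLY;
    }
}
```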

| SACCP-NEW-PROGRAM-AND-EXTRA-HEADERS-FLAG | OPTIONAL-EXTRA-HEADERS | Execution-Layer-Program |

where SACCP-NEW-PROGRAM-AND-EXTRA-HEADERS-FLAG is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_PROGRAM_TO_EXECUTE 3-bit constant, bit [3] being EXTRA-HEADERS-FLAG specifying if OPTIONAL-EXTRA-HEADERS are present, bits [4..5] being 0x0, and bits[6..7] being reserved (MUST consist of zeros, otherwise SACCP returns SACCP_ERROR_INVALID_FORMAT), and Execution-Layer-Program is variable-length program.

NEW_PROGRAM command packet indicates that Execution-Layer-Program (normally - Zepto VM program) is requested to be executed on the SmartAnthill Device.

| SACCP-REPEAT-OLD-PROGRAM-AND-EXTRA-HEADERS-FLAG | OPTIONAL-EXTRA-HEADERS | Checksum-Length | Checksum |

where SACCP-REPEAT-OLD-PROGRAM-AND-EXTRA-HEADERS-FLAG is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_PROGRAM_TO_EXECUTE 3-bit constant, bit [3] being EXTRA-HEADERS-FLAG specifying if OPTIONAL-EXTRA-HEADERS are present, bits [4..5] being 0x1, bits[6..7] being reserved (MUST be zeros), Checksum-Length is a 1-byte field, indicating length of Checksum field (Checksum-Length MUST be >= 4 and MUST be <= 16, if it is not - SACCP returns SACCP_ERROR_INVALID_FORMAT error), Checksum has length of Checksum-Length, and is calculated as SACCP Checksum as described above.

SACCP-REPEAT-OLD-PROGRAM command packet indicates that the Execution-Layer program which is already in memory of SmartAnthill Device, needs to be repeated. Checksum field is used to ensure that perceptions of the “program which is already in memory” are the same for SmartAnthill Client and SmartAnthill Device (inconsistencies are possible in several scenarios, such as two SmartAnthill Clients working with the same SmartAnthill Device, accidental reboot of the SmartAnthill Device, and so on). If Checksum does not match the program within SmartAnthill Device, SACCP returns SACCP_ERROR_OLD_PROGRAM_CHECKSUM_DOESNT_MATCH error.

| SACCP-REUSE-OLD-PROGRAM-AND-EXTRA-HEADERS-FLAG | OPTIONAL-EXTRA-HEADERS | Checksum-Length | Checksum | Fragments | TODO: New-Checksum just in case?

where SACCP-REUSE-OLD-PROGRAM-AND-EXTRA-HEADERS-FLAG is a 1-byte bitfield substrate, with bits [0..2] equal to SACCP_PROGRAM_TO_EXECUTE 3-bit constant, bit [3] being EXTRA-HEADERS-FLAG specifying if OPTIONAL-EXTRA-HEADERS are present, bits [4..5] being 0x2, bits[6..7] being reserved (MUST be zeros), Checksum-Length is a 1-byte field, similar to that in the SACCP-REPEAT-OLD-PROGRAM packet, Checksum has length of Checksum-Length, and is calculated as SACCP Checksum which is described above, and Fragments is a sequence of fragments.

SACCP-REUSE-OLD-PROGRAM is used when existing program is mostly the same, but there are some differences. When processing it, SACCP goes through the fragments, and appends data within (or referred to by) the fragment, to the new program, in a sense “assembling” new program from verbatim fragments, and from reference-to-old-program fragments.
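
The first byte of the three packet formats above differs only in bits [4..5], so it can be decoded by one routine. The sketch below is illustrative: the numeric value of SACCP_PROGRAM_TO_EXECUTE is not given in this section and is a placeholder, bit [0] is assumed to be the least significant bit, and the function names are hypothetical.

```python
# Hypothetical value: the text names the 3-bit constant but does not give its value.
SACCP_PROGRAM_TO_EXECUTE = 0x0

# Values of bits [4..5] as listed for the three packet formats above.
SUBTYPES = {0x0: "NEW-PROGRAM", 0x1: "REPEAT-OLD-PROGRAM", 0x2: "REUSE-OLD-PROGRAM"}


def parse_program_first_byte(first_byte):
    """Return (subtype_name, extra_headers_present) for the 1-byte flag field.

    Assumes bit [0] is the least significant bit.
    """
    if first_byte & 0x07 != SACCP_PROGRAM_TO_EXECUTE:
        raise ValueError("bits [0..2] are not SACCP_PROGRAM_TO_EXECUTE")
    if first_byte & 0xC0:
        raise ValueError("reserved bits [6..7] must be zero")
    subtype = (first_byte >> 4) & 0x03
    if subtype not in SUBTYPES:
        raise ValueError("bits [4..5] value 0x3 is not described in the text")
    return SUBTYPES[subtype], bool(first_byte & 0x08)


def checksum_length_is_valid(checksum_length):
    """Checksum-Length MUST be >= 4 and MUST be <= 16."""
    return 4 <= checksum_length <= 16
```

For example, under these assumptions a first byte of 0x18 decodes as a REPEAT-OLD-PROGRAM packet with extra headers present.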

For all SACCP command packets, OPTIONAL-EXTRA-HEADERS is a list of optional headers; each header starts from an Encoded-Unsigned-Int<max=2> bitfield substrate, which is then interpreted as follows:

  • bits [0..2] - header type
  • bits [3..] - header length

Currently, only two types of extra headers are supported:

  • END_OF_HEADERS (with no further data)
  • ENABLE_ZEPTOERR, with further data being | TRUNCATE-MOST-RECENT-AND-RESERVED |, where TRUNCATE-MOST-RECENT-AND-RESERVED is a 1-byte bitfield substrate, where bit [0] is a TRUNCATE-MOST-RECENT flag which specifies that zeptoerr should be truncated at the end if truncation becomes necessary (if this bit is not set, the least recent records are truncated from zeptoerr pseudo-stream), and bits [1..7] are reserved (MUST be zero). By default zeptoerr pseudo-stream is disabled; ENABLE_ZEPTOERR header enables zeptoerr if it is supported by target SmartAnthill Device.
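
As an illustration of the header layout described above, the following sketch splits an already decoded header prefix into type and length, and checks the ENABLE_ZEPTOERR data byte. Decoding of Encoded-Unsigned-Int<max=2> itself is not shown; the function names are hypothetical, and bit [0] is assumed to be the least significant bit.

```python
def split_extra_header_prefix(value):
    """Split a decoded header prefix into (header_type, header_length).

    value is the integer obtained after Encoded-Unsigned-Int<max=2> decoding.
    """
    header_type = value & 0x07   # bits [0..2]
    header_length = value >> 3   # bits [3..]
    return header_type, header_length


def parse_enable_zeptoerr_data(byte):
    """Return the TRUNCATE-MOST-RECENT flag; reject non-zero reserved bits."""
    if byte & 0xFE:
        raise ValueError("reserved bits [1..7] must be zero")
    return bool(byte & 0x01)
```
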

SACCP Reuse Fragments

Each of the fragments in SACCP-REUSE-OLD-PROGRAM command packet is one of the following:

| SACCP_REUSE_FRAGMENT_VERBATIM | Fragment-Length | Fragment |

where SACCP_REUSE_FRAGMENT_VERBATIM is a 1-byte constant, Fragment-Length is Encoded-Size<max=2> field, and Fragment has size of Fragment-Length. TODO: Truncated-Encoded-Size (also for FRAGMENT_REFERENCE)?

| SACCP_REUSE_FRAGMENT_REFERENCE | Fragment-Length | Fragment-Offset |

where SACCP_REUSE_FRAGMENT_REFERENCE is a 1-byte constant, Fragment-Length is Encoded-Size<max=2> field, and Fragment-Offset is Encoded-Size<max=2> field, indicating offset of the fragment within existing program.
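
The “assembling” of a new program from fragments can be sketched as follows. The sketch works on fragments that have already been parsed into tuples; the string tags stand in for the 1-byte constants (whose values are not given here), and the bounds check on reference fragments is an assumption, since the text does not say how an out-of-range reference is handled.

```python
VERBATIM = "verbatim"    # stands in for SACCP_REUSE_FRAGMENT_VERBATIM
REFERENCE = "reference"  # stands in for SACCP_REUSE_FRAGMENT_REFERENCE


def assemble_program(old_program, fragments):
    """Build the new program from verbatim and reference-to-old-program fragments.

    fragments is a list of (VERBATIM, data) or (REFERENCE, length, offset) tuples.
    """
    new_program = bytearray()
    for fragment in fragments:
        kind = fragment[0]
        if kind == VERBATIM:
            _, data = fragment
            new_program += data
        elif kind == REFERENCE:
            _, length, offset = fragment
            if offset + length > len(old_program):
                # Assumed behaviour: treat an out-of-range reference as an error.
                raise ValueError("reference fragment outside old program")
            new_program += old_program[offset:offset + length]
        else:
            raise ValueError("unknown fragment type")
    return bytes(new_program)
```

For instance, with an old program `b"ABCDEFGH"`, the fragments `(REFERENCE, 3, 0)`, `(VERBATIM, b"xy")`, `(REFERENCE, 2, 6)` assemble into `b"ABCxyGH"`.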

SACCP Reply Packets

POTENTIALLY IMPORTANT FORMAT CHANGES

Note that even if Device starts new “packet chain”, at SACCP level it is still considered as a reply (with OK-FLAGS-SIZE, etc.). It also means that (if there is no “packet chain” pending) Device MAY start a new “packet chain” with SACCP-ERROR-CODE (including SACCP_ERROR_ENTROPY_RECOVERY_NEEDED when necessary).

SACCP reply packets can be one of the following:

| OK-FLAGS-SIZE | Execution-Layer-Reply |

where OK-FLAGS-SIZE field is described below, and Execution-Layer-Reply is a variable-length field

OK-FLAGS-SIZE is an Encoded-Unsigned-Int<max=2> bitfield substrate, which is treated as follows:

  • bits [0..2] should be equal to SACCP_OK
  • bit [3] is TRUNCATED-FLAG, an indication that Execution-Layer-Reply has been truncated by SACCP (for example, due to the lack of RAM)
  • bits [4..] is EXECUTION-LAYER-REPLY-SIZE, size of Execution-Layer-Reply field (i.e. size is reported after truncation if there was any)
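
A minimal sketch of packing and unpacking the OK-FLAGS-SIZE value, before Encoded-Unsigned-Int<max=2> encoding is applied. The numeric value of SACCP_OK is a placeholder (it is not given in this section), bit [0] is assumed to be the least significant bit, and the function names are hypothetical.

```python
SACCP_OK = 0x0  # hypothetical value of the 3-bit constant


def pack_ok_flags_size(reply_size, truncated):
    """Combine SACCP_OK, TRUNCATED-FLAG and EXECUTION-LAYER-REPLY-SIZE."""
    return SACCP_OK | (int(truncated) << 3) | (reply_size << 4)


def unpack_ok_flags_size(value):
    """Return (truncated, reply_size) from a decoded OK-FLAGS-SIZE value."""
    if value & 0x07 != SACCP_OK:
        raise ValueError("bits [0..2] are not SACCP_OK")
    return bool(value & 0x08), value >> 4
```
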

| EXCEPTION-FLAGS-SIZE | Exception-Data | TODO: ZEPTO-ERR?

where EXCEPTION-FLAGS-SIZE is described below, and Exception-Data is exception data as passed by Execution Layer; it is optional and present only if IS-EXCEPTION flag is set (see below).

EXCEPTION-FLAGS-SIZE (a.k.a. ERROR-FLAGS-SIZE) is an Encoded-Unsigned-Int<max=2> bitfield substrate, which is treated as follows:

  • bits [0..2] should be equal to SACCP_ERROR
  • bit [3] is IS-EXCEPTION flag

If IS-EXCEPTION flag is set:

  • bit [4] is EXCEPTION-TRUNCATED-FLAG, an indication that Exception-Data has been truncated by SACCP (for example, due to the lack of RAM)
  • bits [5..] is EXCEPTION-DATA-SIZE, size of Exception-Data field (i.e. size is reported after truncation if there was any)
  • Exception-Data field is present

If IS-EXCEPTION flag is not set:

  • bits [4..] is an ERROR-CODE, which takes one of the following values: SACCP_ERROR_INVALID_FORMAT, SACCP_ERROR_OLD_PROGRAM_CHECKSUM_DOESNT_MATCH, SACCP_ERROR_ENTROPY_RECOVERY_NEEDED (in response to which Client replies with SACCP-ENTROPY-PROVIDED), or SACCP_ERROR_ENTROPY_RECOVERY_COMPLETED (sent only in response to SACCP-ENTROPY-PROVIDED, so that Client may repeat the original command).
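
The two interpretations of the error reply bitfield can be sketched as a single decoder. As before, this operates on the value after Encoded-Unsigned-Int<max=2> decoding; the numeric value of SACCP_ERROR is a placeholder, bit [0] is assumed to be the least significant bit, and the dictionary keys are names chosen for this example.

```python
SACCP_ERROR = 0x1  # hypothetical value of the 3-bit constant


def decode_error_flags_size(value):
    """Decode the error/exception bitfield into a small dictionary."""
    if value & 0x07 != SACCP_ERROR:
        raise ValueError("bits [0..2] are not SACCP_ERROR")
    if value & 0x08:  # IS-EXCEPTION flag is set
        return {
            "is_exception": True,
            "truncated": bool(value & 0x10),  # bit [4]
            "data_size": value >> 5,          # bits [5..]
        }
    # IS-EXCEPTION flag is not set: bits [4..] carry the ERROR-CODE.
    return {"is_exception": False, "error_code": value >> 4}
```
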

Device Pins SHOULD NOT be Addressed Directly within Execution-Layer-Program

Execution-Layer-Program may contain EXEC instructions (see Zepto VM document for details). These EXEC instructions address a certain ‘ant body part’, and pass opaque data to the corresponding plugin. While the data passed to the plugin is opaque, it SHOULD NOT contain any device pins in it; which device pins are used by the plugin on this specific device, is considered a part of ‘body part configuration’ and is stored within MCU.

Therefore, data within EXEC instruction normally does not contain pins, but contains only a BODYPART-ID and an action. For example, a command to a plugin which turns on a connected LED SHOULD look as |EXEC|BODYPART-ID|ON|, where ON is a 1-byte field taking values ‘0’ and ‘1’, indicating “what to do with LED”. All mappings of BODYPART-ID to pins SHOULD be described as plugin_config parameter of plugin_handler(), as described in SmartAnthill Zepto OS document.

TODO: ?describe same thing in ‘Zepto VM’?

Execution Layer and Control Program

Whenever SmartAnthill Device receives a SACCP command packet, SACCP invokes Execution Layer and passes the received (or calculated as described above) Execution-Layer-Program to it. After Execution Layer has finished its execution, SACCP passes the reply back to the SmartAnthill Client. One example of a valid Execution Layer is Zepto VM, which is described in a separate document, Zepto VM.

Within SmartAnthill system, Execution Layer exists only on the side of SmartAnthill Device (and not on the side of SmartAnthill Client). Its counterpart on the side of SmartAnthill Client is Control Program.

Execution Layer Restrictions

To comply with SACCP’s “rules of engagement”, SACCP on the side of SmartAnthill Device (a.k.a. Execution Layer) MUST comply with and enforce the following restrictions:

  1. Each reply provided by Execution Layer MUST be accompanied with a flag which signifies if the reply is ‘is-first’ or ‘is-last’ (or neither) in a “packet chain”. This flag is specified by Execution-Layer-Program.
  2. If a reply is sent before the Execution-Layer-Program exit, it MUST have the ‘is-last’ flag set. If it is not the case, Execution Layer MUST generate a “Program Error” exception.
  3. If Execution Layer disables device receiver (such a disabling is always temporary) while processing a program, it MUST check that a reply was not sent before disabling device receiver (if it was, Execution Layer generates a “Program Error” exception, and does not disable receiver). However, after device receiver is re-enabled and Execution Layer execution continues and completes, Execution Layer MUST check that a reply is sent before the Execution-Layer-Program is completed; this reply MUST have ‘is-first’ flag. If any of these conditions is not met, Execution Layer MUST generate a “Program Error” exception.
  4. If Execution Layer does not disable device receiver while processing an Execution-Layer-Program and the program terminates, Execution Layer MUST check that reply was sent before or on program exit; this reply MUST NOT have ‘is-first’ flag. If any of these conditions is not met, Execution Layer MUST generate a “Program Error” exception.
  5. Multiple replies to the same command are NOT allowed
  6. Whenever “Program Error” exception is generated, Execution Layer MUST abort program execution, and MUST send a special packet which indicates that an error has occurred, to the other side of the channel (i.e. to SmartAnthill Client).
  7. If the underlying protocol issues a packet with a ‘previous-send-aborted’ flag, it means that underlying protocol has canceled a send of previously issued packet. In such cases, Execution Layer (and all the layers above) MUST NOT assume that previously issued packet was received by counterpart (TODO: maybe we can guarantee that the packet was NOT sent?)
  8. Due to the “Fatal Error Handling” mechanism described above, Execution Layer MUST assume that re-initialization can occur at any moment of their operation (at least whenever control is passed to the protocol which is an underlying protocol for SACCP). The effect of such re-initialization is that all volatile memory (such as RAM) is re-initialized, but all non-volatile memory (such as EEPROM) is preserved.
  9. TODO: check if these rules are enough.

TODO: timeouts

Control Program Restrictions

To comply with SACCP’s rules of engagement, SACCP on the side of SmartAnthill Client (a.k.a. Control Program) MUST comply with and enforce the following restrictions:

  1. Control Program SHOULD NOT send a program which would cause Execution Layer on the server side to violate Execution Layer rules of engagement
  2. TODO: is this enough?

SmartAnthill Guaranteed Delivery Protocol (SAGDP)

Version: v0.2.2

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents; please make sure to read them before proceeding.

SAGDP (SmartAnthill Guaranteed Delivery Protocol) aims to provide reliable message delivery for SmartAnthill environments; as described in SmartAnthill 2.0 Overall Architecture document, SmartAnthill environments tend to be extremely limited, and tend to require special attention to energy-saving features. In addition, special requirements (such as the ability to turn off the receiver temporarily) need to be taken into account.

SAGDP belongs to Layer 2 of OSI/ISO network model, see SmartAnthill 2.0 Protocol Stack document for details.

1. Main notions and definitions

1.1. Packet. A unit of data exchange with other levels/protocols. For the sake of clarity two types of packets are distinguished:

1.1.1. HLP packet: a packet that is sent to or received from a high level protocol;

1.1.2. UP packet: a packet that is sent to or received from an underlying protocol. HLP packet data is the payload of a UP packet, as discussed in more detail below.

1.1.3. Packet ID (PID): each packet has an associated unique (for communication between two given devices) packet ID. Packet ID must be represented as a 6-byte sequence b0 | b1 | ... | b5 in order of increasing memory address. The numerical value of the Packet ID is then calculated as follows: (uint48)b0 + (((uint48)b1)<<8) + (((uint48)b2)<<16) + (((uint48)b3)<<24) + (((uint48)b4)<<32) + ((((uint48)b5)&0x73f)<<40). It is the responsibility of an underlying protocol to generate an ID, to report the generated ID to SAGDP, and to send a packet together with the generated ID to a communication peer. Packet IDs must be generated in such a way that the numerical value of each next ID is greater than that of the previous ID.
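
The calculation of the numerical value can be sketched as follows. The mask 0x73f is taken verbatim from the formula above; since b5 is a single byte, only the low six bits of the mask have any effect, so the two most significant bits of b5 do not contribute to the value. The function name is hypothetical.

```python
def packet_id_value(pid_bytes):
    """Numerical value of a 6-byte Packet ID, b0 being the least significant byte."""
    if len(pid_bytes) != 6:
        raise ValueError("Packet ID must be exactly 6 bytes")
    b0, b1, b2, b3, b4, b5 = pid_bytes
    return (
        b0
        | (b1 << 8)
        | (b2 << 16)
        | (b3 << 24)
        | (b4 << 32)
        | ((b5 & 0x73F) << 40)  # mask as written in the text
    )
```

Because b0 is the least significant byte, the sequence `00 01 00 00 00 00` has a greater numerical value (256) than `FF 00 00 00 00 00` (255), which matters for the requirement that each next ID be greater than the previous one.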

1.1.4. Preceding packet ID (PPID): a PID of a preceding packet in the chain, if preceding packet exists.

1.2. Chain. An ordered set of packets. Each packet in a chain is of one of mutually exclusive types: “first”, “intermediate”, and “terminating”, wherein “first” is the first packet in the chain, “terminating” is the last packet in the chain, and “intermediate” is neither “first” nor “terminating”.

1.2.1. Chain ID: each chain has an associated unique (for communication between two given devices) chain ID.

1.3. Master and Slave. For reasons discussed below, the parties participating in data exchange and using this protocol are considered non-equivalent to each other, and details of the protocol at each side are somewhat different. To distinguish sides, where applicable, we will use the terms Master and Slave. Usually Master is a device generating and sending some commands, and Slave is a device receiving commands and returning results.

1.4. Error Message. A packet that represents an error report. This packet can be sent by a Slave in context of any or no chain, if the Slave has encountered an error that prevents it from further packet processing. To be distinguished from other packets, a packet containing Error Message must be marked as both “first” and “terminating” since it has no definite context and does not assume any response.

1.5. UP packet structure: UP packet structure looks as follows:

| First Byte | PPID | HLP packet |

where

  • First Byte is a 1 byte field that is treated as follows (starting from LSB):

    • bit 0: “is-first” flag; set to 1 if a packet is marked as “first”, and to 0 otherwise;
    • bit 1: “is-terminating” flag; set to 1 if a packet is marked as “terminating”, and to 0 otherwise;
    • bit 2: “requested-resend” flag; set to 1 if a packet is being re-sent as a result of a repeated receiving of a packet being responded;
    • Remaining 5 bits: reserved; must be set to 0.
  • PPID: 6-byte field with PPID (for “intermediate” or “terminating” packet), or with Chain ID (for “first” packet).

  • HLP packet: variable size field; data that is defined by a higher level protocol.
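
The UP packet layout can be illustrated with a small pack/unpack pair. This is a sketch only: the function and constant names are hypothetical, the PPID (or Chain ID) is handled as raw bytes, and rejecting packets with non-zero reserved bits is an assumed policy (the text only requires that senders set them to 0).

```python
IS_FIRST = 0x01          # bit 0
IS_TERMINATING = 0x02    # bit 1
REQUESTED_RESEND = 0x04  # bit 2


def pack_up_packet(ppid, hlp_packet, is_first=False, is_terminating=False,
                   requested_resend=False):
    """Build | First Byte | PPID | HLP packet | from its parts."""
    if len(ppid) != 6:
        raise ValueError("PPID / Chain ID must be exactly 6 bytes")
    first_byte = ((IS_FIRST if is_first else 0)
                  | (IS_TERMINATING if is_terminating else 0)
                  | (REQUESTED_RESEND if requested_resend else 0))
    return bytes([first_byte]) + bytes(ppid) + bytes(hlp_packet)


def unpack_up_packet(packet):
    """Return (flags, ppid, hlp_packet) for a received UP packet."""
    if len(packet) < 7:
        raise ValueError("UP packet is shorter than its fixed header")
    first_byte = packet[0]
    if first_byte & 0xF8:
        raise ValueError("reserved bits must be 0")  # assumed policy
    flags = {
        "is_first": bool(first_byte & IS_FIRST),
        "is_terminating": bool(first_byte & IS_TERMINATING),
        "requested_resend": bool(first_byte & REQUESTED_RESEND),
    }
    return flags, bytes(packet[1:7]), bytes(packet[7:])
```
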

2. Scenarios
2.1. Normal processing of a packet chain.

Two devices, A and B, participate in packet exchange. Each packet sent, except a packet with status “terminating”, assumes that there is a packet to be received from the opposite side of communication.

If all packets sent are actually delivered to the other side of communication (that is, no packet is lost on the way), a “ping-pong” packet exchange happens starting from a packet marked as “first” and ending with a packet marked “terminating”. To have guaranteed delivery, if no response to non-“terminating” packet is received, the packet is resent.

In more detail, device A sends a non-“terminating” packet P to device B and starts waiting for a reply packet P’ from B. If no packet is received within a certain time interval, A resends packet P to B in the hope that it will go through. If A receives no packet from B in return, two main cases are possible: (1) packet P is lost, and (2) packet P has been delivered successfully, but packet P’ is lost.

In case (1), resending packet P can lead (after one or more repetitions) to reception of P at B. At the same time, while P is not received at B, B (similarly to A) resends its last packet (a predecessor of P in the chain). In case (2), B replies with packet P’ to packet P, and does the same for each additional copy of P received (for instance, because of case (1)).

Thus, after sending a packet P, A can get either a reply to P, or a predecessor of P in the chain. Processing of both options is described in more detail in the discussion of protocol states and events.

2.1.1. Special case: planned turning-off the receiver.

In some cases it may be desirable to turn off the receiver of one of the devices, for instance, for power saving. Since a device with its receiver turned off is not able to receive packets (including a reply to the last packet sent to the other side of communication), chains must be organized in such a way that the last packet received at the side that plans to turn off the receiver is “terminating” (that is, one that does not assume sending a packet in return).

2.2. Motivating differences in protocol for Master and Slave side.

Scenario: Two sides, Master and Slave, start their chains at the same time (that is, they send packets that are “first” ones in their respective chains). This could lead to having two chains at the same time, which is an unusual situation for SAGDP and should be handled separately.

Solution. The protocol is asymmetric for participating parties, that is, incoming packets are processed differently for Master and for Slave side. Particularly, if on the Slave side a “first” packet in a chain is received, current processing on the Slave side (if any) is terminated, and processing of a new chain starts. In turn, on the Master side, if a packet that is not in a chain currently processed by Master, is received, it is ignored. In particular, if a packet with status “first” in the chain is received from the Slave as in the discussed scenario, it will be ignored, and the “first” packet of the Master chain will eventually be resent (by timeout). Upon reception on the Slave side, this packet will cause start of the Master chain processing.

2.3. Inconsistency in order of incoming packets within the chain.

Scenario: a packet that is not “first” in a chain is received, and the ID of the packet to which it is intended to be a reply does not coincide with the ID of the last sent packet. Problem: obvious inconsistency in data exchange. While this shouldn’t happen if both parties adhere to the protocol, in real life it is possible due to events such as reboots, power losses, malfunctions etc.

Solution. On the Slave side this causes a device reset (since no reasonable processing can be continued). On the Master side such a packet is ignored [TODO: do we report it to an upper level?]

2.4. Motivating “requested-resend” flag.

TODO: is ‘requested-resend’ the same as ‘Resent-Packet’ below?

Scenario: Side A has sent an “intermediate” packet in a chain to side B, but B has not received it; both sides are waiting for a packet: side A waits for a reply to the packet sent, and side B waits for a reply to a previous packet in the chain. Both sides can re-send their respective packets by timeout. A problem could appear if both sides send packets by timeout at the same time, as this will cause duplicated sending of all remaining packets in the chain.

(Virtual) Example 1:

...

S1. A <- B: packet #3

S2. A -> B: packet #4 (reply to #3; lost)

S3. A waits for reply to #4; B waits for reply to #3

S4. A -> B: packet #4 (re-send by timeout); A <- B: packet #3 (re-send by timeout)

S5. A -> B: packet #4 (as reply to packet #3 received at S4.)

S6. A <- B: packet #5 (as reply to packet #4 received at S4.)

S7. A <- B: packet #5 (as reply to packet #4 received at S5.)

...

To avoid such duplication, a “requested-resend” flag is set for each packet that is a reply to a packet that is received not for the first time. Then Example 1 is transformed to

(Actual) Example 2:

...

S1. A <- B: packet #3

S2. A -> B: packet #4 (reply to #3; lost)

S3. A waits for reply to #4; B waits for reply to #3

S4. A -> B: packet #4 (re-send by timeout); A <- B: packet #3 (re-send by timeout)

S5. A -> B: packet #4 (as reply to packet #3 received at S4. with flag “requested-resend” set)

S6. A <- B: packet #5 (as reply to packet #4 received at S4.)

S7. B does nothing with respect to packet #4 received at S5 as flag “requested-resend” was found

...

Thus a potential for duplicated packet sending is eliminated.
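
The difference between the two examples can be illustrated with a deliberately simplified model of side B, which only counts how many times packet #5 is sent. The function name and the representation of incoming packets as booleans are assumptions made for this sketch; it does not model SAGDP states or timers.

```python
def count_replies_from_b(copies_of_packet_4):
    """Count how many times B sends packet #5.

    copies_of_packet_4 holds one boolean per copy of packet #4 that reaches B:
    True if that copy carries the "requested-resend" flag.
    """
    replies = 0
    for requested_resend in copies_of_packet_4:
        if requested_resend:
            continue  # B does nothing for a copy marked "requested-resend"
        replies += 1  # B answers an unmarked copy with packet #5
    return replies


# Example 1: the flag does not exist, so both copies of #4 are answered.
example_1 = count_replies_from_b([False, False])
# Example 2: the second copy of #4 is marked and therefore not answered.
example_2 = count_replies_from_b([False, True])
```
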

2.5. Motivating Chain ID.

There are two cases in which a “first” packet can arrive in state “idle”: the packet is the actual beginning of a new chain, or the packet is a re-sent beginning of the previous chain (in this latter case the previous chain is of length two). Processing of these two cases differs accordingly. Chain ID answers the question whether a “first” packet belongs to the previous chain (same chain ID) or to a new chain (otherwise).

3. States

SAGDP has four states.

3.1. “not initialized”

SAGDP is in this state at system start, and can enter it at any time if detected inconsistencies in packet sequencing are such that the context of processing is lost and all existing data, if any, becomes invalid. The only event that can be processed in this state is “initializing”, which results in transition to “idle” state.

This state has no associated data.

3.2. “idle”

If no chain is being processed, the protocol is in state “idle” and waits for a packet that is marked as “first” in a chain, either from a higher level protocol (when the device itself initiates communication) or from an underlying protocol (that is, ultimately, from the communication partner device). The first case results in transition to “wait-remote” state, since after the packet is sent to the other device a response is expected. In the second case it is the communication partner device that initiated communication, and the implementing device is to respond, so the transition is to “wait-local” state. In addition, if a repeated packet is received, the last sent packet must be re-sent (without changing state).

Idle state has no associated data.

3.3. “wait-remote”

When a packet is sent to the communication partner device, a reply packet is expected, and the protocol is in “wait-remote” state. With respect to chain ordering two types of packets can arrive: a reply to the packet sent (which means, in particular, that the last sent packet has been received by a communication partner device), and a previously received packet (which means that the last sent packet has not been delivered successfully). In the first case the payload of the received packet is forwarded to the higher level protocol for processing, and SAGDP transits to “wait-local” state waiting for the reply from the higher level. In the second case a last sent packet is resent, and the protocol remains in the same “wait-remote” state.

Another event that can happen in this state is a timer event. If nothing is received from the communication partner device within a certain time period after the last packet was sent, the last sent packet should be resent. Timer event happens after expiration of that time period. The protocol remains in the same “wait-remote” state after timer event.

“Wait-remote” has the following associated data:

  • last sent packet (LSP);
  • last sent packet ID range (LSPIDR);
  • previous sent packet ID range (PSPIDR);
  • last received chain ID (LRCID);
  • length of the last time interval between re-send attempts (RSP).

LSP is used for packet resending, and RSP is used to set timer. LSPIDR is used to check whether an incoming packet is a reply to the last sent packet, or is a previously received packet. Such check is done by comparison of LSPIDR with PPID of the received packet.
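
The comparison of a received PPID with the ID ranges can be sketched as follows. Ranges are modelled as `(lower, upper)` tuples of numerical Packet ID values with inclusive bounds; the inclusiveness, the function names and the returned labels are assumptions made for this illustration.

```python
def position(ppid, id_range):
    """Return "below", "within" or "above" for ppid relative to (lower, upper)."""
    lower, upper = id_range
    if ppid < lower:
        return "below"
    if ppid > upper:
        return "above"
    return "within"


def classify_received(ppid, lspidr, pspidr):
    """Tell a reply to the last sent packet from a previously received packet."""
    if position(ppid, lspidr) == "within":
        return "reply to last sent packet"
    if position(ppid, lspidr) == "below" and position(ppid, pspidr) == "within":
        return "previously received packet"
    return "other"
```

For example, with LSPIDR = (10, 12) and PSPIDR = (5, 7), a PPID of 11 is classified as a reply to the last sent packet, while a PPID of 6 indicates a previously received packet, in which case the last sent packet is resent.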

3.4. “wait-local”

When payload data of a new packet received from the underlying protocol (and thus, ultimately, from a communication partner device) is forwarded to the higher level protocol, SAGDP starts waiting for a reply from a higher level, and stays in “wait-local” state. In this state the only legitimate event is receiving a packet from a higher level that is not marked as a “first” in chain.

“Wait-local” has the following associated data:

  • last received packet unique identifier (LRPID),

which is to be added to the header of a packet that is to be forwarded to underlying protocol as an indication to which packet in chain the current packet serves as a reply.

4. Events

Here is a full list of events.

4.1. Receiving an UP packet

A packet that has not been received ever before arrives. Unless an error in chaining happened, it is either the first in a new chain, or a reply of a communication partner to the last sent packet. This event is initiated by an underlying protocol. In general, a payload of this packet is to be extracted and passed to a higher level protocol.

4.2. Receiving a request to resend LSP

If, for any reason, an underlying protocol determines that the last sent packet did not go through, it may request to re-send the last sent packet.

4.3. Receiving an HLP packet that is “first”, or is “intermediate”, or is “terminating”

TODO: pls check that the intended meaning didn’t change

A packet from a higher level protocol has been received with a respective status in chain. This packet is to be pre-processed and passed to an underlying protocol to be ultimately sent to a communication partner device.

4.4. Timer

In the context of SAGDP timer event is used for packet resending, if a response has not been received within certain time.

5. Event processing
5.1. Processing events in idle state

In idle state SAGDP is ready to accept a packet marked as “first” from either underlying or higher level protocol.

5.1.1. Receiving an UP packet

Processing of this event is different at Master’s and Slave’s side in the part where the packet is not a subsequent packet within a current chain.

At Master’s side, processing depends on the status of the packet in chain.
  • Error Message: payload of the packet is reported to a higher level protocol with its status, and SAGDP changes its state to idle.

  • “First”: chain id in the packet is compared to LRCID.
    • chain ID in the packet is equal to LRCID: a repeated packet has been received; SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID; SAGDP does not change its state.
    • chain ID in the packet is not equal to LRCID: LRCID is set to the value of chain ID in the packet; packet PID is saved as a current value of LRPID, payload of the packet is reported to a higher level protocol with its status, and SAGDP changes its state to wait-local.
  • “Intermediate”: PPID of the packet is compared to LSPIDR and PSPIDR as follows.
    • PPID is below the LSPIDR and below PSPIDR: packet is ignored; SAGDP does not change its state.
    • PPID is below the LSPIDR and within PSPIDR: the Last Sent Packet must be re-sent (note that in “idle” state it could be only “terminating”); SAGDP does not change its state.
    • PPID is within LSPIDR: unexpected (received packet is a response to the last sent packet, but the last sent packet in state “idle” could be only “terminating”): ignored [TODO: check for necessity of other actions].
    • PPID is above LSPIDR (chain is broken): ignored [TODO: check for necessity of other actions].
  • “Terminating”: PPID of the packet is compared to LSPIDR.
    • PPID is below the LSPIDR: the chain is broken (PPID being below LSPIDR means that this packet has already been replied to, which is impossible since “this” packet is “terminating”); ignored [TODO: check for necessity of other actions].
    • PPID is within LSPIDR: (received packet is a reply to the last sent packet; since SAGDP is in “idle” state, then the last received packet was “terminating”, and thus this packet is already processed): ignored without changing state.
    • PPID is above LSPIDR (chain is broken): ignored [TODO: check for necessity of other actions].
At Slave’s side, processing also depends on the status of the packet in chain.
  • Error Message: unexpected; the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.

  • “First”: chain ID in the packet is compared to LRCID.
    • chain ID in the packet is equal to LRCID: a repeated packet has been received; SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID; SAGDP does not change its state.
    • chain ID in the packet is not equal to LRCID: LRCID is set to the value of chain ID in the packet; packet PID is saved as a current value of LRPID, payload of the packet is reported to a higher level protocol with its status, and SAGDP changes its state to wait-local.
  • “Intermediate”: PPID of the packet is compared to LSPIDR and PSPIDR as follows.
    • PPID is below the LSPIDR and below PSPIDR: unexpected (chain is broken); the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
    • PPID is below the LSPIDR and within PSPIDR: the Last Sent Packet must be re-sent (note that in “idle” state it could be only “terminating”); SAGDP does not change its state.
    • PPID is within LSPIDR: unexpected (received packet is a response to the last sent packet, but the last sent packet in state “idle” could be only “terminating”); the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
    • PPID is above LSPIDR: unexpected (chain is broken); the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
  • “Terminating”: PPID of the packet is compared to LSPIDR.
    • PPID is below the LSPIDR: the chain is broken (PPID being below LSPIDR means that this packet has already been replied to, which is impossible since “this” packet is “terminating”); the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
    • PPID is within LSPIDR: (received packet is a reply to the last sent packet; since SAGDP is in “idle” state, then the last received packet was “terminating”, and thus this packet is already processed): ignored without changing state.
    • PPID is above LSPIDR: unexpected (chain is broken); the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
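
The handling of “intermediate” and “terminating” UP packets in idle state differs between Master and Slave only in what happens when the chain is found to be broken, so it can be summarised as one decision function. This is a sketch under assumptions: ranges are `(lower, upper)` tuples with inclusive bounds, the action labels and function names are hypothetical, and `None` is returned for the combination (PPID below LSPIDR but above PSPIDR) that the lists above do not cover.

```python
IGNORE = "ignore"
RESEND = "re-send LSP"
ERROR = "send Error Message, go to not-initialized"


def _position(ppid, id_range):
    lower, upper = id_range
    if ppid < lower:
        return "below"
    if ppid > upper:
        return "above"
    return "within"


def idle_up_packet_action(role, status, ppid, lspidr, pspidr):
    """Action for an UP packet received in idle state.

    role is "master" or "slave"; status is "intermediate" or "terminating".
    """
    broken = IGNORE if role == "master" else ERROR
    vs_last = _position(ppid, lspidr)
    if status == "intermediate":
        if vs_last == "below":
            vs_prev = _position(ppid, pspidr)
            if vs_prev == "within":
                return RESEND
            if vs_prev == "below":
                return broken
            return None  # between the two ranges: not covered by the text
        return broken  # within or above LSPIDR
    if status == "terminating":
        if vs_last == "within":
            return IGNORE  # already processed
        return broken
    raise ValueError("status must be 'intermediate' or 'terminating'")
```

For example, with LSPIDR = (10, 12) and PSPIDR = (5, 7), an “intermediate” packet with PPID 3 is ignored on Master but leads to an Error Message and re-initialization on Slave.
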
5.1.2. Receiving an HLP packet that is “first”

An UP packet is formed wherein HLP packet becomes a payload data, and a header contains flags regarding the position of the packet in chain (“is-first” flag is set, “is-last” is not set) and the packet PPID that is equal to LRPID. SAGDP requests a new Packet ID; sets PSPIDR to a current value of LSPIDR; and sets both lower and upper bound of LSPIDR to the received Packet ID (note that the upper bound of LSPIDR serves as a last sent packet ID and can be used when necessary as such). The UP packet is saved as LSP. Timer is set to RSP. The UP packet is sent to the underlying protocol. SAGDP changes its state to “wait-remote”.

5.1.3. Receiving a request to resend LSP; or an HLP packet that is “intermediate”; or an HLP packet that is “terminating”

TODO: pls check that the intended meaning didn’t change

If any of these events happens in idle state, consistency of data processing is broken. If implemented on Master, an error must be reported to the higher level protocol, and SAGDP transits to “idle” state. If implemented on Slave, the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.

5.1.4. Timer

Ignored in this state.

5.2. Processing events in wait-local state

In wait-local state SAGDP waits for a packet from a higher level protocol that is not “first” in the chain.

5.2.1. Receiving an HLP packet that is “intermediate”

An UP packet is formed wherein HLP packet becomes a payload data, and a header contains flags regarding the position of the packet in chain (“is-first” flag is not set, “is-last” is not set) and the packet PPID that is equal to LRPID. SAGDP requests a new Packet ID; sets PSPIDR to a current value of LSPIDR; and sets both lower and upper bound of LSPIDR to the received Packet ID (note that the upper bound of LSPIDR serves as a last sent packet ID and can be used when necessary as such). Timer is set to RSP. The UP packet is sent to the underlying protocol. SAGDP changes its state to “wait-remote”.

5.2.2. Receiving an HLP packet that is “terminating”

An UP packet is formed wherein HLP packet becomes a payload data, and a header contains flags regarding the position of the packet in chain (“is-first” flag is not set, “is-last” is set) and the packet PPID that is equal to LRPID. SAGDP requests a new Packet ID; sets PSPIDR to a current value of LSPIDR; and sets both lower and upper bound of LSPIDR to the received Packet ID (note that the upper bound of LSPIDR serves as a last sent packet ID and can be used when necessary as such). The UP packet is sent to the underlying protocol. Since a “terminating” packet does not assume a reply, SAGDP changes its state to “idle”.

5.2.3. Receiving a UP packet with flag “Resent-Packet”

The packet is ignored. SAGDP does not change its state.

5.2.4. Receiving an HLP packet that is “first”; or an UP packet; or Receiving a request to resend LSP

TODO: pls check that the intended meaning didn’t change

If any of these events happens in wait-local state, consistency of data processing is broken. If implemented on Master, an error must be reported to the higher level protocol, and SAGDP transits to “idle” state. If implemented on Slave, the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.

5.2.5. Timer

Ignored in this state.

5.4. Processing events in wait-remote state
5.4.1. Receiving an UP packet

A received UP packet can be either a new packet or a repetition of the last received packet. In the latter case, the last sent packet is resent; in the former case, processing differs between the Master’s and the Slave’s side when the packet is not a subsequent packet within the current chain. The received packet is processed as follows:

At Master’s side, processing depends on the status of the packet in chain.
  • Error Message: payload of the packet is reported to a higher level protocol with its status, and SAGDP changes its state to idle.

  • “First”: chain id in the packet is compared to LRCID.
    • chain ID in the packet is equal to LRCID: a repeated packet has been received; SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID; SAGDP does not change its state.
    • chain ID in the packet is not equal to LRCID: unexpected; ignored; SAGDP does not change its state.
  • “Intermediate”: PPID of the packet is compared to LSPIDR and PSPIDR as follows.
    • PPID is below the LSPIDR and below PSPIDR: packet is ignored; SAGDP does not change its state.
    • PPID is below the LSPIDR and within PSPIDR: SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID; SAGDP does not change its state.
    • PPID is within LSPIDR (received packet is a response to the last sent packet): packet PID is saved as a current value of LRPID, payload of the packet is reported to a higher level protocol with its status in chain, and SAGDP changes its state to wait-local.
    • PPID is above LSPID (chain is broken): the packet is ignored.
  • “Terminating”: chain consistency is verified by comparison of PPID of the packet with LSPID.
    • PPID is below the LSPIDR: unexpected (a repeated packet has been received that is “terminating”, but SAGDP did not respond to a “terminating” packet). Ignored. [TODO: check]
    • PPID is within LSPIDR (received packet is a response to the last sent packet): payload of the packet is reported to a higher level protocol with its status in chain, and SAGDP changes its state to idle.
    • PPID is above LSPIDR (chain is broken): the packet is ignored [+++check]
At Slave side,
  • Error Message: unexpected; system must send a packet with Error Message to its communication partner and then transit to “not initialized” state thus invalidating all current data.

  • “First”: chain id in the packet is compared to LRCID.
    • chain ID in the packet is equal to LRCID: a repeated packet has been received; SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID; SAGDP does not change its state.
    • chain ID in the packet is not equal to LRCID (Master has chosen to start a new chain): system must transit to “not initialized” and then to “idle” state, and then process the packet again.
  • “Intermediate”: PPID of the packet is compared to LSPIDR and PSPIDR as follows.
    • PPID is below the LSPIDR and below PSPIDR: unexpected (chain is broken); system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
    • PPID is below the LSPIDR and within PSPIDR: a repeated packet has been received. SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID, and SAGDP keeps its present state (“wait-remote”).
    • PPID is within LSPIDR (received packet is a response to the last sent packet): packet PID is saved as a current value of LRPID, payload of the packet is reported to a higher level protocol with its status in chain, and SAGDP changes its state to wait-local.
    • PPID is above LSPID (chain is broken): system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
  • “Terminating”: chain consistency is verified by comparison of PPID of the packet with LSPID.
    • PPID is below the LSPIDR: unexpected (a repeated packet has been received that is “terminating”, but SAGDP did not respond to a “terminating” packet). System must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
    • PPID is within LSPIDR (received packet is a response to the last sent packet): payload of the packet is reported to a higher level protocol with its status in chain, and SAGDP changes its state to idle.
    • PPID is above LSPIDR (chain is broken): system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.
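
The range comparisons used above for an “intermediate” packet on the Master side can be illustrated with a minimal sketch. It assumes that Packet IDs can be compared as integers and that LSPIDR and PSPIDR are inclusive (lower, upper) pairs; the function names and the returned action strings are hypothetical and are not part of SAGDP.

```python
from enum import Enum


class Position(Enum):
    BELOW = "below"
    WITHIN = "within"
    ABOVE = "above"


def position(ppid, bounds):
    """Locate ppid relative to an inclusive (lower, upper) range (assumed form)."""
    lower, upper = bounds
    if ppid < lower:
        return Position.BELOW
    if ppid > upper:
        return Position.ABOVE
    return Position.WITHIN


def master_intermediate_action(ppid, lspidr, pspidr):
    """Return a hypothetical action label for an "intermediate" UP packet on Master."""
    in_last = position(ppid, lspidr)
    if in_last is Position.WITHIN:
        # Response to the last sent packet: report payload, go to wait-local.
        return "deliver-and-wait-local"
    if in_last is Position.ABOVE:
        # Chain is broken: Master ignores the packet.
        return "ignore"
    # PPID is below LSPIDR: the decision depends on PSPIDR.
    if position(ppid, pspidr) is Position.WITHIN:
        # Repeated packet: re-send the Last Sent Packet with a new Packet ID.
        return "resend-last-sent-packet"
    # Below PSPIDR. A value in a gap between the two ranges is not covered
    # by the text and is also ignored here (an assumption).
    return "ignore"
```

On the Slave side the same comparisons apply, but the “ignore” outcomes are replaced by sending an Error Message and transiting to “not initialized”.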
5.4.2. Receiving a request to resend LSP

SAGDP requests a new Packet ID, updates upper bound of LSPIDR with received Packet ID, the Last Sent Packet is re-sent together with its Packet ID. Timer is reset [TODO: details on timer reset here and at all applicable places]. SAGDP does not change its state.

5.4.3. Timer

The LSP is sent to the underlying protocol. Timer is set to RSP. SAGDP does not change its state.

5.4.4. Receiving an HLP packet that is “first”; or receiving an HLP packet that is “intermediate”; or receiving an HLP packet that is “terminating”

If any of these events happens in wait-remote state, consistency of data processing is broken. If implemented on Master, an error must be reported to the higher level protocol, and SAGDP transits to “idle” state. If implemented on Slave, the system must send a packet with Error Message to its communication partner and then transit to “not initialized” state, thus invalidating all current data.

[+++ processing around “requested-resend” flag]

... [work in progress]

SmartAnthill Security Protocol (SASP)

Version: v0.2.2

IMPORTANT: This document is obsolete. Please DO NOT modify it. Please refer to SimpleIoT Security Protocol (SimpleIoT/SP) for an up to date version.

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SASP (SmartAnthill Security Protocol) aims to provide security guarantees for communications within SmartAnthill environments, in particular, prevention of unauthorized access to message content, message integrity guarantees, and protection from replay attacks.

1. Definitions

1.1. Packet. A unit of data exchange with other levels/protocols. For the sake of clarity, the following types of packets are distinguished:

  • HLP packet: a packet that is sent to or received from a higher-level protocol. HLP packet data is a payload for SASP, as discussed in more detail below.
  • SASP packet: a packet that is formed by SASP and is sent to or received from the communication peer (using an underlying protocol).
  • Internally valid SASP packet: a packet that has passed authentication based solely on packet data (see also “intra-packet authentication”).

1.2. SASP Packet structure

SASP packet structure looks as follows:

| SASP Header | Security Tag | Encrypted Data |

where:

  • SASP Header is a non-encrypted part of the packet that contains flags and certain bits of the packet nonce. Header takes 6 bytes.
  • Security Tag: data related to encryption and authentication process. Security Tag takes 16 bytes.
  • Encrypted Data: encrypted data, which includes certain SASP information as well as SASP payload. The same data before encryption (or after decryption) is referred to as “Data Under Encryption”.

1.3. Packet Nonce: all data used as a packet nonce for purposes of encryption/authentication. The Packet Full Nonce (PFN) consists of:

  • Nonce Varying Part (Nonce VP): a fixed-size bit sequence uniquely generated by a sending device for each new packet; Nonce VP is 47 bits; as defined herein, in certain contexts it can be treated as a(n unsigned) integer. It can serve as a part of Packet ID when such ID is required.

  • Destination Flag: a bit that indicates whether the packet is intended solely for SASP itself (such as a packet with Error “Old Nonce” Message), or for its higher level protocol. Values of the flag have the following meaning:
    • 0: packet is intended for a higher level protocol
    • 1: packet is intended for SASP itself
  • Peer-Distinguishing Flag: a bit that is set to 0 for one communication peer and to 1 for another peer.

    Nonce VP and Destination Flag must be packed as a 6-byte sequence b0 | b1 | ... | b5 in the order of increasing memory addresses. The numerical value of Nonce VP is then calculated as follows: (uint48)b0 + (((uint48)b1)<<8) + (((uint48)b2)<<16) + (((uint48)b3)<<24) + (((uint48)b4)<<32) + ((((uint48)b5)&0x7f)<<40), and the value of Destination Flag is calculated as b5>>7.
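
As an illustration of this layout, the following minimal sketch extracts Nonce VP and Destination Flag from a 6-byte header. The function name is hypothetical; the only facts taken from the text are the little-endian byte order, the 7 low bits of b5 belonging to Nonce VP, and the top bit of b5 being the Destination Flag.

```python
def parse_sasp_header(header: bytes):
    """Split a 6-byte SASP header into (nonce_vp, destination_flag)."""
    if len(header) != 6:
        raise ValueError("SASP header must be exactly 6 bytes")
    b = header
    nonce_vp = (
        b[0]
        | (b[1] << 8)
        | (b[2] << 16)
        | (b[3] << 24)
        | (b[4] << 32)
        | ((b[5] & 0x7F) << 40)  # only the low 7 bits of b5 belong to Nonce VP
    )
    destination_flag = b[5] >> 7  # top bit of b5
    return nonce_vp, destination_flag
```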

1.4. Packet ID (PID): a unique identifier of a packet, when such ID is required.

PID is formed using Nonce VP and Peer-Distinguishing Flag and must be packed as a 6-byte sequence b0 | b1 | ... | b5 in the order of increasing memory addresses, so that b0 = (uint8)(Nonce_VP); b1 = (uint8)(Nonce_VP >> 8); b2 = (uint8)(Nonce_VP >> 16); b3 = (uint8)(Nonce_VP >> 24); b4 = (uint8)(Nonce_VP >> 32); b5 = (uint8)(Nonce_VP >> 40) | (Peer_Distinguishing_Flag << 7).
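
The PID packing rule can be sketched as follows. The function name and the range checks are assumptions added for illustration; the byte assignments mirror the formulas given for b0 to b5.

```python
def pack_pid(nonce_vp: int, peer_distinguishing_flag: int) -> bytes:
    """Pack a 47-bit Nonce VP and the Peer-Distinguishing Flag into a 6-byte PID."""
    if not 0 <= nonce_vp < (1 << 47):
        raise ValueError("Nonce VP must fit in 47 bits")
    if peer_distinguishing_flag not in (0, 1):
        raise ValueError("Peer-Distinguishing Flag must be 0 or 1")
    # b0..b4: the five low-order bytes of Nonce VP, least significant first
    low = bytes((nonce_vp >> (8 * i)) & 0xFF for i in range(5))
    # b5: the remaining 7 bits of Nonce VP plus the flag in the top bit
    b5 = (nonce_vp >> 40) | (peer_distinguishing_flag << 7)
    return low + bytes([b5])
```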

1.5. Nonce Lower Watermark (NLW): a value supported by a packet receiving side that is used to determine whether a value of Packet Nonce VP (i) has never been used before (if a new packet is received); (ii) has been used with the last received packet (for instance, in case of packet resending); or (iii) a de-synchronization in communication has happened.

1.6. Nonce to use For Sending (NFS): a value supported by a packet sending side that is used to generate a value of Packet Nonce VP that would have never been used before, and that would be verifiable by the communication peer.

1.7. Last Received Packet Signature: [+++check whether it is indeed required]

1.8. Packet validation process: a core task of SASP, the main purpose of which is to ensure that a received packet actually comes from the intended communication partner, has not been modified by a third party on the way, and that its content (unless specified otherwise) is protected from reading by unintended parties. On the sending side of communication, the packet validation process results in encryption and adding authentication data. On the receiving side, the process can logically be divided into two steps:

  • intra-packet authentication, which is done using solely packet data such as respective headers, nonces, tags, etc, and not using NLW;
  • in-sequence authentication, which is based on comparison of a packet nonce Varying Part with the Nonce Lower Watermark.

1.9. Error “Old Nonce” Message: a packet that represents an “old nonce” error report with the lowest possible value of a valid nonce VP (which is equal to the current value of Nonce Lower Watermark plus 1). This packet can be sent if an otherwise valid packet is received with an “old” nonce VP, that is, with a nonce VP that is less than or equal to the Nonce Lower Watermark.

2. Security choices

The core of SASP is packet encryption/decryption and authentication. These processes are based on the EAX algorithm (see [EAX0]). Design choices with respect to the above-mentioned algorithm are:

  • Encryption method: AES-128

  • Tag size: 128 bit

  • EAX Nonce size: 49 bit, in particular:

    • Nonce Varying Part: 47 bit [1]
    • Destination Flag: 1 bit
    • Peer-Distinguishing Flag: 1 bit

To reduce the amount of data transferred, Peer-Distinguishing Flag is not actually transferred but just appended to the packet header that actually contains only Nonce Varying Part and Destination Flag to get a Packet Full Nonce:

  • SASP Header size: 48 bit, in particular:

    • Nonce Varying Part: 47 bit
    • Destination Flag: 1 bit

Rationale: In order to use the same encryption key in both directions of communication, each nonce should be unique for packets going in both directions, too. Uniqueness of the nonce going in a particular direction is enforced by the packet sender (using nonce VP generation based on NFS). To separate the sets of nonces generated by each of the two communication peers, a separate bit in the nonce value (Peer-Distinguishing Flag) is used to distinguish between peers, so that this bit is set for all nonces generated by one peer and is not set for nonces generated by the other peer. Which peer should have this bit set can be determined, in particular, during set up of communication between two specific devices (for instance, at the same time when encryption key exchange is done), or can be a predefined choice for some types of devices, if devices of different types participate in communication (for instance, in communication of a Master device with a Slave device, the Master device may always have the flag set, and the Slave device may always have the flag not set).

[1] If a 47-bit nonce VP is used, the nonce space is sufficient for 10 years at a rate of one packet every 2.24 µs: 10*365*24*60*60*1000000/2^47 ≈ 2.24
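
The estimate in footnote [1] can be reproduced with a few lines of arithmetic. The variable names are illustrative, and the calculation uses the same 365-day year as the footnote.

```python
SECONDS_IN_10_YEARS = 10 * 365 * 24 * 60 * 60   # 365-day years, as in the footnote
NONCE_SPACE = 2 ** 47                           # number of distinct 47-bit nonce VPs

# Shortest average interval between packets (in microseconds) that still
# leaves enough nonces for the whole 10-year period.
interval_us = SECONDS_IN_10_YEARS * 1_000_000 / NONCE_SPACE
print(round(interval_us, 2))
```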
2.1 SASP Nonces

In SASP, nonce varying part is always increased, and never goes back. This is a critical requirement for SASP to be secure (both to guarantee nonce being unique, which is required for EAX to be secure, and to avoid replay attacks).

3. Security Guarantees

Security of SASP relies on security of EAX, which is proven as long as underlying cipher (AES128) is secure, and as long as nonces are unique per key.

Within SASP, keys MUST be unique for each communication pair, and uniqueness of nonces for the pair is guaranteed by:

  • Peer-Distinguishing Flag
  • for packets sent by each peer, by “Nonce to use for Sending” (NFS)

EAX as such doesn’t guarantee protection from replay attacks, however as nonces are unique, replay attack is not possible as long as SASP drops packets with repeated nonces. SASP does drop packets with repeated nonces, with the following exception:

  • Error “Old Nonce” Message. For an Error “Old Nonce” Message, SASP does not check the nonce (this is necessary to avoid potential deadlocks). However, replay attack based on these messages is not possible, because SASP does not allow NLW to decrease, and therefore all replay packets will be ignored by SASP.

Therefore, SASP is secure (because of EAX and AES128 being secure) and also provides protection from replay attacks.

4. Scenarios
4.1. Normal packet processing

Two devices, A and B, participate in packet exchange. Each packet sent is encrypted and authenticated in a way to both guarantee packet integrity and protect from replay attacks. Each packet received has a respective authentication data. Correspondingly, when an HLP packet is being prepared for sending, it is encrypted by an encryption key known to both communication peers, and authentication data is added. It is important that a nonce used for encryption/authentication could be recognized as such (that is, as a value actually used once) by the other communication peer. This is achieved by using Nonce to use For Sending (NFS) on the sending side and Nonce Lower Watermark (NLW) on receiving side.

4.1.1. How NFS / NLW pair works

To avoid replay attacks nonces are commonly used to distinguish between an original message and a message with otherwise the same content that is being replayed. A problem with nonces is to check that a particular value is actually new and has not yet been used ever before. To address this problem SASP treats VP of nonces as numerical values and compares a nonce VP from a received packet with a current value of the NLW. If the value of nonce VP is greater than a current value of the NLW, the nonce is considered as new; in this case the value of NLW is set to the value of the nonce VP, and its reuse becomes impossible.

To be economical with the set of values that are greater than a current value of NLW (within a certain range), it is desired that a value of a new nonce VP received be as close (from above) to NLW as possible, ideally, greater by 1. NFS is used to keep track of nonces on the sending side. Initially (for example, at the same time when secret keys are exchanged between the sides) communication partners set NLW on receiving side to the same value as NFS on sending side (namely, NLW = 0, and NFS = 0). Before a new packet is being sent, NFS is incremented, and packet nonce VP is set to a value of NFS. On the receiving side, upon reception of the packet, the value of NLW will become the value of the nonce VP, that is, again equal to NFS on the sending side. The process may be continued until all space of NFS/NLW values is exhausted.
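
The interplay of NFS and NLW described above can be sketched as two counters. The class and method names are hypothetical; encryption and authentication are left out, and only the nonce bookkeeping is shown.

```python
class Sender:
    """Sending side: keeps the Nonce to use For Sending (NFS)."""

    def __init__(self):
        self.nfs = 0  # initial value agreed with the receiver

    def next_nonce_vp(self):
        # NFS is incremented before each new packet; the packet carries the new value.
        self.nfs += 1
        return self.nfs


class Receiver:
    """Receiving side: keeps the Nonce Lower Watermark (NLW)."""

    def __init__(self):
        self.nlw = 0  # same initial value as NFS on the sending side

    def accept(self, nonce_vp):
        # A nonce VP is new only if it is strictly greater than NLW.
        if nonce_vp > self.nlw:
            self.nlw = nonce_vp  # the same value can never be accepted again
            return True
        return False
```

After each accepted packet, NLW on the receiving side equals NFS on the sending side again, and a replayed packet fails the comparison.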

TODO: Nonce Exhaustion/Overflow handling

4.2. Processing packet with an obsolete nonce

If a packet is internally valid, but its nonce VP is less than or equal to a current value of NLW, it may indicate that states of the communication peers are out of sync (and not necessarily that a third party attack is detected). In this case, to resynchronize communication process an Error “Old Nonce” Message is formed with the lowest possible acceptable nonce VP, and a packet with this message is sent to a communication partner.

If an Error “Old Nonce” Message is received, the receiving party compares its NFS with the lowest possible value of the nonce within the message, and if NFS is less than that value, NFS is set to the value specified in the message; using such a value of NFS for sending packets will ensure that the packet will pass the NLW test at the receiving party.

TODO: exact format of ‘Error “Old Nonce” Message’

5. SASP padding
5.1. SASP data under encryption and payload

SASP data under encryption is organized as follows:

| First Byte | (opt) complementary size | byte sequence | (opt) padding |

where:

  • First Byte is a 1 byte field that is treated as follows:

    • MSB bit: padding size flag, which is set to 1, if padding is present, and 0 otherwise. Presence of padding implies presence of padding size field as well.
    • Remaining 7 bits: a part of payload.
  • complementary size: SmartAnthill Encoded-Unsigned-Int<max=2> variable-size field, as described in SmartAnthill 2.0 Protocol Stack; this field is present only if padding size flag is set; in this case the field contains encoded value of a sum of the size of this field and the size of padding (if any). If Encoded-Unsigned-Int has an invalid value (as defined in SmartAnthill 2.0 Protocol Stack), then SASP receiving side MUST treat such a packet as an invalid (as the one which didn’t pass internal validation). Note: unless “enforced padding” (see below) is used, SASP pads data only to the block size; it means that unless “enforced padding” is used, padding size is always <= 15, and therefore Encoded-Unsigned-Int cannot be longer than 1 byte.

  • byte sequence: variable size field; data that is defined by a higher level protocol.

  • padding: variable size field; this field is present only if padding size flag is set and complementary size represents a value greater than 1; contains padding up to a target size.

Correspondingly, SASP payload consists of:

  • Remaining 7 bits of the First Byte
  • byte sequence

Higher-level protocol is free to use “partial byte” (7 bits) of SASP payload, or to ignore it; however, this “partial byte” might be useful, for example, to store some bitflags of higher-level protocol, which may allow to save 1 byte of payload.
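
A minimal sketch of this layout for the default (non-enforced) padding case is shown below. It relies on two assumptions that are not defined in this section: that a complementary size of at most 15 is encoded as a single byte holding the value itself, and that os.urandom may stand in for the Non-Key Random Stream. All function and parameter names are hypothetical.

```python
import os

BLOCK = 16  # block size in bytes


def build_data_under_encryption(partial7, byte_sequence, rand=os.urandom):
    """Build | First Byte | (opt) complementary size | byte sequence | (opt) padding |."""
    if not 0 <= partial7 < 128:
        raise ValueError("partial byte must fit in 7 bits")
    unpadded = 1 + len(byte_sequence)
    comp = (-unpadded) % BLOCK  # bytes missing up to the next block boundary
    if comp == 0:
        # Already block-aligned: padding size flag (MSB) stays 0.
        return bytes([partial7]) + byte_sequence
    # comp counts the size field itself (1 byte, assumed) plus comp - 1 padding bytes.
    return bytes([0x80 | partial7, comp]) + byte_sequence + rand(comp - 1)


def parse_data_under_encryption(data):
    """Return (partial7, byte_sequence); assumes a 1-byte complementary size."""
    partial7 = data[0] & 0x7F
    if data[0] & 0x80:
        comp = data[1]
        return partial7, data[2:len(data) - (comp - 1)]
    return partial7, data[1:]
```

With this layout, byte sequences of 0 to 15 bytes fit into a single 16-byte block.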

5.2. SASP padding data

SASP padding data MUST be generated using Non-Key Random Stream as described in SmartAnthill Random Number Generation and Key Generation.

5.3. SASP enforced padding

In certain scenarios, some information might be extracted from the packet length even though information is encrypted. To support the cases when this is important, SASP supports a concept of “enforced padding”, which works as follows:

  • When sending an HLP packet, a higher level protocol is allowed to specify enforce-pad-to. For each packet length len, SASP guarantees that all HLP packets which have their own size = len and are sent without enforce-pad-to, and all HLP packets which are sent with enforce-pad-to = len, produce SASP packets of exactly the same length (therefore preventing any length-based information leak).

To implement it, on receiving such a request SASP MUST do the following:

  • check that enforce-pad-to is greater than or equal to the size of the packet itself. TODO: specify what to do if it is not (probably different for Master and Slave)
  • calculate required-size, the size of the SASP packet which an HLP with a size of enforce-pad-to would produce
  • calculate the size of enforced-padding for current packet (so that SASP packet produced from current packet, would have size= required-size)
  • pad packet, using calculated enforced-padding, and producing ‘enforced-padded’ SASP packet
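
The size calculation in these steps can be sketched as follows, assuming that the HLP size is the length of the byte sequence, that the data under encryption needs 1 + size bytes before padding, and that a SASP packet is a 6-byte header plus a 16-byte tag plus whole 16-byte blocks. The function names are hypothetical, and raising an error for a too-small enforce-pad-to is only a placeholder, since that behaviour is still a TODO.

```python
HEADER = 6   # SASP Header size, bytes
TAG = 16     # Security Tag size, bytes
BLOCK = 16   # cipher block size, bytes


def sasp_packet_size(hlp_size):
    """Size of the SASP packet produced by an HLP packet of hlp_size bytes."""
    blocks = -(-(1 + hlp_size) // BLOCK)  # ceiling division; 1 is the First Byte
    return HEADER + TAG + blocks * BLOCK


def enforced_padding_size(hlp_size, enforce_pad_to):
    """Bytes to add (complementary size field plus padding) to reach required-size."""
    if enforce_pad_to < hlp_size:
        raise ValueError("enforce-pad-to is smaller than the packet itself")
    required_size = sasp_packet_size(enforce_pad_to)
    return required_size - (HEADER + TAG) - (1 + hlp_size)
```

For example, a 3-byte HLP packet sent with enforce-pad-to = 40 ends up in a 70-byte SASP packet, the same size a 40-byte HLP packet would produce.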

TODO: specify handling of enforce-pad-to for the layers between SASP and SACCP.

6. SASP data

For its operations SASP uses the following data:

  • Nonce Lower Watermark (NLW)
  • Nonce to use For Sending (NFS)
7. Events

There are three events that SASP processes:

  1. receiving a SASP packet from the communication peer
  2. receiving a packet from a higher level protocol (HLP packet)
  3. receiving a request from a higher level for nonce variable part
7.1. Receiving an HLP packet

A packet from a higher level protocol is received together with a nonce VP. After the received nonce VP is verified to be numerically greater than NFS, the packet is encrypted and authentication data is added using a new nonce based on the received nonce VP; the resulting SASP packet is passed to the communication peer (using the underlying protocol).

7.2. Receiving a SASP packet

A SASP packet from the communication peer is received (via underlying protocol). A packet can be:

  • valid new packet, which means that the packet data passed validation process, and packet nonce VP is greater than the Nonce Lower Watermark;
  • old-nonce packet, an otherwise valid packet with a nonce VP less than or equal to the Nonce Lower Watermark, which means either de-synchronization in communication, or an attack attempt;
  • packet with Error “Old Nonce” Message (intended for SASP itself)
  • invalid packet, in particular, corrupted, an attacker’s packet, etc.
7.3. Receiving a request for nonce VP

A higher level protocol can request a nonce VP that will be returned together with an HLP packet for sending to a communication peer. The nonce VP returned must be greater than the current value of NFS.

8. Event processing

Further details of event processing are placed below.

8.1. Receiving an HLP packet

A packet from a higher level protocol is received together with a nonce VP. Nonce VP is compared to the current value of NFS.

  • Nonce VP is less than or equal to NFS: no processing is done and an error is reported [TODO: should we provide more details on what such error should result in]
  • Nonce VP is greater than NFS: NFS is set to the value of nonce VP; HLP packet is encrypted and authenticated using a new nonce based on a received nonce VP to form a SASP packet. This SASP packet is sent to the communication peer using underlying protocol.
8.2. Receiving a SASP packet

On receipt of a SASP packet, first, an intra-packet authentication is performed as follows:

  • TODO!

Then:

  • if intra-packet authentication has failed: the packet is silently dropped as being either corrupted or an attacker’s packet;

  • if intra-packet authentication is passed: it can be either an error message packet directed to SASP itself, or a “regular” packet with payload intended for a higher level protocol.

    • if a packet is with Error Old Nonce Message [+++structure and detection]: packet nonce VP is not compared to NLW (reason: replay attack is impossible since NFS cannot be decreased as a result of this message, and performing comparison may lead to a deadlock); a value of the lowest possible valid nonce from the packet is compared to the current value of NFS.

      • if NFS is less than the value of the lowest possible valid nonce: NFS is set to the value of the lowest possible valid nonce.
      • if NFS is greater than or equal to the value of the lowest possible valid nonce: no changes to NFS is done; the packet is ignored.
    • if the packet is not an Error Old Nonce Message: packet nonce VP is compared to the Nonce Lower Watermark (NLW). Two cases are possible:

      • if nonce VP is less than or equal to NLW: a packet with Error Old Nonce Message is prepared with the lowest possible valid nonce set to a current value of NLW; the packet is authenticated and sent to the communication peer.
      • if nonce VP is greater than NLW: a new packet is received: NLW is set to the value of nonce VP of the received packet; LRPS is set to packet signature [TODO: check whether we use it elsewhere]; an HLP packet with payload of the received packet is passed to the higher level protocol together with the nonce VP of the packet nonce.
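
The decisions taken after a packet has passed intra-packet authentication can be summarized in a short sketch. The state class, the function signature and the returned action labels are hypothetical; authentication, decryption and the exact format of the Error Old Nonce Message are outside its scope.

```python
class SaspState:
    """Holds the two SASP counters for one communication pair."""

    def __init__(self, nlw=0, nfs=0):
        self.nlw = nlw  # Nonce Lower Watermark
        self.nfs = nfs  # Nonce to use For Sending


def on_authenticated_packet(state, nonce_vp, is_old_nonce_error=False,
                            lowest_valid_nonce=None):
    """Process a packet that already passed intra-packet authentication."""
    if is_old_nonce_error:
        # Nonce VP is deliberately not compared to NLW for this message type.
        if state.nfs < lowest_valid_nonce:
            state.nfs = lowest_valid_nonce  # NFS can only grow
            return "nfs-updated"
        return "ignored"
    if nonce_vp <= state.nlw:
        # Old nonce: report it back so that the peer can resynchronize.
        return "send-old-nonce-error"
    state.nlw = nonce_vp  # new packet: raise the watermark
    return "deliver-to-hlp"
```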

TODO!: sending packets (encryption etc.)

8.3. Receiving a request for nonce VP

A Nonce VP is generated based on the current value of NFS so that the numerical value of the nonce VP is greater than the numerical value of NFS. Such generation can be as simple as the numerical value of NFS plus 1.

9. Payload Size and SASP Packet Size

As SASP uses a 48-bit (= 6 bytes) nonce in the header, a block cipher (AES128) with a block size of 128 bits (= 16 bytes), and a tag size of 128 bits (= 16 bytes), the SASP packet size is always (6+16+k*16)=(22+k*16) bytes, where k >= 1.

The following table shows relations between SASP packet sizes and SASP payload [2] not including “remaining 7 bits” part (that is, a size of byte sequence part only):

| SASP packet size, bytes | SASP payload                           |
| 38                      | 7 bits + 0 bytes to 7 bits + 15 bytes  |
| 54                      | 7 bits + 16 bytes to 7 bits + 31 bytes |
| 70                      | 7 bits + 32 bytes to 7 bits + 47 bytes |
| 86                      | 7 bits + 48 bytes to 7 bits + 63 bytes |
| 102                     | 7 bits + 64 bytes to 7 bits + 79 bytes |
| 118                     | 7 bits + 80 bytes to 7 bits + 95 bytes |

[2] Note that SASP payload is not the same as, say, SAGDP payload or SACCP payload: for example, if SAGDP lies right on top of SASP, then SAGDP_Payload = SASP_Payload - Size_of_SAGDP_Headers.
10. Implementation notes
10.1 Incrementing nonces

For SASP security, it is critical that nonces are never re-used and are always incremented (never going back). Therefore, implementation MUST enforce it (both for sending side and for receiving side).

10.1.1 Basic Implementation

Basic secure implementation is rather simple:

  • Whenever a new packet is sent, an updated value of NFS MUST be saved and committed in persistent storage; this commit MUST be performed before the packet is actually sent over the air. This is necessary to keep EAX security guarantees.
  • Whenever a packet with status “new” is received, an updated value of NLW MUST be saved and committed in persistent storage; this commit MUST be performed before further message processing. This is necessary to avoid using an obsolete value of NLW in case of “dirty” reboot (and thus to avoid a potential for replay attacks).
10.1.2 Optimized Implementation

In cases where the basic secure implementation is too resource-intensive (causing too many writes to persistent storage, which can be undesirable, in particular for EEPROM), the following optimizations MAY be used without affecting security; note that the implementation described below is acceptable if and only if all of the steps are implemented (or none is implemented, falling back to the basic schema described above): [TODO: check that boundary handling (‘<’ vs ‘<=’ etc. etc.) is described correctly]

  • On program start:
    • both NFS and NLW are read from the persistent storage, and stored in RAM (as ‘Current_NFS’ and ‘Current_NLW’ respectively).
    • both NFS and NLW in persistent storage are incremented by a certain value DELTA; this change MUST be committed to persistent storage before any further processing. The value of DELTA can be, for example, 100; DELTA SHOULD NOT be too large, as having it too large, combined with frequent “dirty” reboots, may cause exhaustion of nonce space.
    • These incremented values are also stored in RAM (as ‘Last_NFS’ and ‘Last_NLW’).
  • Whenever a new value of NFS is needed (for the reasons stated above), if ‘Current_NFS’ is less than ‘Last_NFS’, then the new value of NFS is taken as ‘Current_NFS’ and ‘Current_NFS’ is incremented in RAM. This is ok from security perspective, because in case of “dirty” reboot NFS will still be increased, and never repeated.
  • Whenever a new value of NFS is needed (for the reasons stated above), and if ‘Current_NFS’ is greater than or equal to ‘Last_NFS’, then:
    • NFS in persistent storage is incremented by DELTA (or other similar value); this new value MUST be committed to persistent storage before proceeding further
    • ‘Last_NFS’ is set to the new value of NFS in persistent storage
    • ‘Current_NFS’ is returned as the new NFS value, and then incremented
  • Whenever a new value of NLW is needed (for the reasons stated above), if ‘Current_NLW’ is less than ‘Last_NLW’, then the new value of NLW is taken as ‘Current_NLW’ and ‘Current_NLW’ is incremented in RAM. This is ok from security perspective, because in case of “dirty” reboot NLW will still be increased, and never repeated. Using such a policy for NLW might cause an extra ‘Error “Old Nonce” Message’, but this situation will be quickly recovered from.
  • Whenever a new value of NLW is needed (for the reasons stated above), and if ‘Current_NLW’ is greater than or equal to ‘Last_NLW’, then:
    • NLW in persistent storage is incremented by DELTA (or other similar value); this new value MUST be committed to persistent storage before proceeding further
    • ‘Last_NLW’ is set to the new value of NLW in persistent storage
    • ‘Current_NLW’ is returned as the new NLW value, and then incremented
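
The reservation scheme above can be sketched with a small counter class. A plain dict stands in for persistent storage (a real implementation has to commit each write before continuing), and the class, its attribute names and the storage key are hypothetical. The same logic applies to both counters; the sending counter is used in the example.

```python
class ReservedCounter:
    """Counter that reserves DELTA values per persistent-storage write."""

    def __init__(self, storage, key, delta=100):
        self.storage = storage
        self.key = key
        self.delta = delta
        # On program start: read the stored value, then reserve a DELTA-sized range.
        self.current = storage[key]
        storage[key] += delta  # MUST be committed before any further processing
        self.last = storage[key]

    def next(self):
        if self.current >= self.last:
            # Reserved range is used up: reserve the next one before returning.
            self.storage[self.key] += self.delta  # MUST be committed first
            self.last = self.storage[self.key]
        value = self.current
        self.current += 1
        return value


# Usage: values handed out after a "dirty" reboot start above every value
# that could have been handed out before it.
storage = {"NFS": 0}
counter = ReservedCounter(storage, "NFS", delta=3)
before_reboot = [counter.next() for _ in range(4)]
after_reboot = ReservedCounter(storage, "NFS", delta=3).next()
```

The price of the optimization is that up to DELTA unused values are skipped on every restart, which is why DELTA should stay moderate.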
10.1.3 Restoring from Backup

Whenever an entity-implementing-SASP (such as “SmartAnthill Central Controller”) is restored from backup, it MUST take care to avoid duplicate nonces, in particular:

  • amount of time dT (in seconds) between backup and restore MUST be calculated
  • if dT is less than min-backup-restore-time, it MUST be set to min-backup-restore-time; normally min-backup-restore-time should be set to a value such as 24 hours.
  • if dT is larger than max-backup-restore-time, restore SHOULD be interrupted, the problem SHOULD be explained to the person who’s performing restore, and confirmation SHOULD be obtained before proceeding. This is intended to prevent restores with erroneous clock, which might lead to the erroneous exhaustion of the nonce space. Normally, max-backup-restore-time should be set to a value such as 30*24 hours.
  • both NLW and NFS, as stored in persistent storage, MUST be increased by a number equal to dT*max_number_of_packets_per_second. This increased number MUST be stored and committed to persistent storage before proceeding further. Here, max_number_of_packets_per_second is a constant estimating the maximum feasible number of packets which might be sent per second; in general, it depends on the higher-level protocols, but for basic SACCP it can usually be taken between 100,000 (1e5) and 1,000,000 (1e6).
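
The restore rules can be illustrated with a short calculation. The constants use the example values mentioned above (24 hours, 30*24 hours, 1e5 packets per second); the function name and the confirmed parameter, which models the operator's confirmation, are assumptions made for this sketch.

```python
MIN_BACKUP_RESTORE_TIME = 24 * 60 * 60        # seconds, example value from the text
MAX_BACKUP_RESTORE_TIME = 30 * 24 * 60 * 60   # seconds, example value from the text
MAX_PACKETS_PER_SECOND = 100_000              # lower end of the suggested range


def nonce_increment_after_restore(dt_seconds, confirmed=False):
    """Return the amount by which stored NLW and NFS are increased on restore."""
    if dt_seconds > MAX_BACKUP_RESTORE_TIME and not confirmed:
        # Possibly an erroneous clock: stop and ask the operator first.
        raise RuntimeError("backup is older than max-backup-restore-time")
    dt = max(dt_seconds, MIN_BACKUP_RESTORE_TIME)
    return dt * MAX_PACKETS_PER_SECOND
```

A backup restored one minute after it was taken is still treated as 24 hours old, so both counters are advanced by 8,640,000,000.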

SmartAnthill-over-IP Protocol (SAoIP) and SmartAnthill Router

Version:v0.3a

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SAoIP is a part of SmartAnthill 2.0 protocol stack. It belongs to Layer 4 of OSI/ISO Network Model, and is responsible for transferring SAoIP payload (usually SASP packets) between SmartAnthill Client (normally implemented by SmartAnthill Core) and SmartAnthill Device or SmartAnthill Router.

Within SmartAnthill protocol stack, SAoIP is located right below SASP.

SAoIP Flavours

Currently, only one flavour of SAoIP is supported: SAoUDP. In the future, SAoTCP and SAoTLSoTCP may be added, though their support won’t be mandatory for SmartAnthill Devices.

SAoIP Requirements

Generally, SAoIP is a very simple wrapper around SAoIP payload (which is normally a SASP packet). As guaranteed delivery is normally handled by SAGDP, no guarantees are required (and none are provided) by SAoIP in general. Even when/if a certain SAoIP flavour (such as SAoTCP) provides certain delivery guarantees, SAoIP application layer (normally SASP+SAGDP+SACCP) MUST NOT rely on delivery guarantees provided by a specific SAoIP flavour.

SAoIP SCRAMBLING

SCRAMBLING is an optional feature of SAoIP. SAoIP SHOULD use SCRAMBLING whenever SAoIP goes over a non-secure connection; not using SCRAMBLING is not a significant security risk, but it might reveal some information about packet destination and/or might simplify certain DoS attacks. For this purpose, any connection SHOULD be considered as exposed (and therefore SCRAMBLING procedure SHOULD be used) unless proven secure; in particular, all connections which go over Wi-Fi or over the Internet SHOULD be considered as exposed.

SAoIP uses SCRAMBLING procedure as described in SmartAnthill SCRAMBLING procedure document.

SCRAMBLING requires that both parties share the same symmetric key (for details, see SmartAnthill SCRAMBLING procedure document). This symmetric key MUST be completely independent and separate from any other keys, in particular, from SASP keys.

SAoIP SCRAMBLING uses Default SCRAMBLING-Header formatting schema as described in SmartAnthill SCRAMBLING procedure document.

SCRAMBLING being optional

In some cases (for example, if all communication stays within an Intranet without passing through wireless links, or is performed over TLS), SAoIP MAY omit SCRAMBLING procedure. In fact, if there is no information about SCRAMBLING key for the packet sender, both SmartAnthill Router and SmartAnthill IP-Enabled Device SHOULD try to interpret the packet as one without SCRAMBLING applied.

Formally, within SmartAnthill Protocol Stack omitting SCRAMBLING doesn’t affect any security guarantees (as such guarantees are provided by SASP, which is not optional). However, as SCRAMBLING provides some benefits at a very low cost, by default SCRAMBLING procedure SHOULD be applied to all communications which are potentially exposed to the attacker.

SAoUDP

Unless SAoUDP packet is intended to be transferred over SAMP, it is formed as follows:

  • SAoUDP payload is SCRAMBLED
  • it is placed into a UDP packet
SAoUDP and UDP

SAoUDP packet uses UDP as an underlying transport; as such, it also (implicitly) contains standard 8-byte UDP headers as described in RFC 768. SAoUDP only uses unicast UDP.

SAoUDP+UDP (compressed)

When SAoUDP packet is transferred over SAMP, it MUST be combined with UDP/IP packet information, and MUST be encoded (in 6LoWPAN-speak, “compressed”) as follows:

| FOREIGN-IP-TYPE-AND-SOME-DATA | OPTIONAL-FOREIGN-IP-DATA | PAYLOAD |

where FOREIGN-IP-TYPE-AND-SOME-DATA is an Encoded-Unsigned-Int<max=2> bitfield substrate, described in detail below; presence and length of OPTIONAL-FOREIGN-IP-DATA are defined by FOREIGN-IP-TYPE-AND-SOME-DATA (see below); and PAYLOAD is a payload of the upper protocol layer (usually SASP). Note that for over-SAMP communications, SCRAMBLING procedure is not applied within SAoUDP (SCRAMBLING will be performed at SADLP-* level).

“Foreign” address is either a source address (for packets travelling from Central Controller to Device), or destination address (for packets travelling from Device to Central Controller). Another address (non-“foreign” one) can always be derived from SAMP headers and is never transferred at this level.

If bit[0] of FOREIGN-IP-TYPE-AND-SOME-DATA is 0, then:

  • foreign IP address is an address within the current PAN, and bits [1..] of FOREIGN-IP-TYPE-AND-SOME-DATA represent SAMP address. As a consequence, for SmartAnthill Controller’s address, FOREIGN-IP-TYPE-AND-SOME-DATA is encoded as a single byte 0x00.

If bit[0] of FOREIGN-IP-TYPE-AND-SOME-DATA is 1, then:

  • if bits[1..] = 0, then foreign IP address is an IPv4 address, and OPTIONAL-FOREIGN-IP-DATA is 4-byte IPv4 address (encoded as described in https://en.wikipedia.org/wiki/IPv4). It is normally translated to an IPv6 address using SIIT (see https://en.wikipedia.org/wiki/IPv6_transition_mechanism).
  • if bits[1..] = 1, then foreign IP address is a full IPv6 address, bits [2..] MUST be zero, and OPTIONAL-FOREIGN-IP-DATA is 16-byte IPv6 address (encoded as described in https://en.wikipedia.org/wiki/IPv6).
  • if bits[1..] = 2, then foreign IP address is given by the lower 64 bits of an IPv6 address, bits [3..] MUST be zero, and OPTIONAL-FOREIGN-IP-DATA is 8 bytes long (the remaining 64 bits of the IPv6 address being the same as in the IPv6 address of the SmartAnthill Router).
  • other values of bits[1..6] (when bit[0] = 1) are RESERVED.
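
The case analysis above can be sketched as a small classifier. This is an illustration only: it takes the already-decoded integer value of FOREIGN-IP-TYPE-AND-SOME-DATA (decoding of Encoded-Unsigned-Int itself is not shown), it assumes that bit[0] is the least significant bit of that value, and the function name and the returned tuple layout are hypothetical.

```python
def classify_foreign_ip(value):
    """Return (kind, samp_address, optional_data_length_in_bytes).

    `value` is the decoded FOREIGN-IP-TYPE-AND-SOME-DATA integer;
    bit[0] is assumed to be its least significant bit.
    """
    if value & 1 == 0:
        # Address within the current PAN: bits[1..] carry the SAMP address.
        return ("samp", value >> 1, 0)
    kind = value >> 1  # bits[1..]
    if kind == 0:
        return ("ipv4", None, 4)         # 4-byte IPv4 address follows
    if kind == 1:
        return ("ipv6", None, 16)        # full 16-byte IPv6 address follows
    if kind == 2:
        return ("ipv6-low64", None, 8)   # lower 64 bits of IPv6 address follow
    raise ValueError("RESERVED value of FOREIGN-IP-TYPE-AND-SOME-DATA")
```

Under these assumptions, the single byte 0x00 classifies as a SAMP address of 0, which matches the Controller case mentioned above.
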
SmartAnthill Router

SmartAnthill Router is responsible for converting packets from “SAoUDP over UDP” format which travels over the IP network, into “compressed SAoUDP+UDP” format which travels over SmartAnthill PAN (and which can be seen as a compression which is similar to 6LoWPAN, but re-optimized for SmartAnthill needs). After conversion, the packet is sent over SAMP. On the way back (from Device to IP network), Router receives packet over SAMP, converts it into “SAoUDP over UDP” format which travels over the IP network, and sends it over IP network.

Currently, SmartAnthill Router supports only stateless conversion/compression. If necessary, stateful conversion/compression may be added in the future.

In general, SmartAnthill Router can operate either at application level, or at L3 level. Currently, only application-level SmartAnthill Router is implemented.

SmartAnthill Application-Level Router

In addition to the packet format conversion described above, SmartAnthill Application-Level Router can perform NAT and/or PAT.

SmartAnthill Router keeps the following records in SmartAnthill Database (SA DB) table DEVICE_MAPPINGS:

| Device-Key-ID | IPv6 | SAoIP-Flavour | port | SCRAMBLING-Key | Bus ID | Intra-Bus ID | Recrypt-External-Key | Recrypt-Internal-Key |

In addition, there is another SA DB table KEY_MAPPINGS:

| Device-Key-ID | external-SASP-key-ID | internal-SASP-key-ID |

When an incoming SAoIP packet comes in (from a receiving socket), SmartAnthill Router:

  • finds out an address of the receiving socket: (Flavour,IPv6,port). If socket listens on IPv4, IPv4 is first translated into IPv6 using “Stateless IP/ICMP Translation” (SIIT).
  • finds out a ‘from’ address of the packet: (Flavour,IPv6,port); normally, it is taken from the incoming packet of SAoIP underlying protocol (for example, from UDP packet itself). If TCP or UDP operates over IPv4, then IPv4 is first translated into IPv6 using “Stateless IP/ICMP Translation” (SIIT).
  • checks if any filtering rules apply to the ‘from’ address (TODO: define filtering rules a-la IPTables)
  • finds a record in DEVICE_MAPPINGS table, based on (IPv6,Flavour,port); from this record, obtains Device-Key-ID, SCRAMBLING-Key, and (Bus-ID,Intra-Bus-ID) pair
  • if SCRAMBLING-Key is not NULL, DESCRAMBLES incoming packet (using SCRAMBLING-Key)
  • at this point we have a plain (not scrambled) SAoIP packet
  • parses SAoIP packet to get SASP packet, and gets key-ID from SASP packet (it can be extracted without decrypting SASP packet); for SmartAnthill Router, this is external-SASP-key-ID.
  • finds a row in KEY_MAPPINGS based on Device-Key-ID and external-SASP-key-ID; gets internal-SASP-key-ID. TODO: what to do if record is not found
  • if the DEVICE_MAPPINGS record found above contains “re-crypt” information (which is a pair of Recrypt-External-Key and Recrypt-Internal-Key), SmartAnthill Router decrypts SASP packet within SAoIP-Payload (using Recrypt-External-Key) and encrypts it again (using Recrypt-Internal-Key)
  • changes (‘hacks’) SASP packet to use internal-SASP-key-ID instead of external-SASP-key-ID; this can be done without decrypting SASP packet
  • forms “SAoUDP+UDP (compressed)” packet as described above, using SASP ‘hacked’ packet as a payload
  • forms SAMP packet, and then SADLP-* packet (depending on the bus in use) as described in respective documents, using “SAoUDP+UDP (compressed)” packet as a payload
  • sends SADLP-* packet to (Bus-ID, Intra-Bus-ID)
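
The two table lookups in the inbound path above can be sketched as follows. This is a simplified illustration: the tables are modelled as in-memory dictionaries rather than SA DB tables, all names and sample values (address, port, key IDs) are hypothetical, and descrambling, re-crypting and packet building are left out. Returning `None` for a missing record is a placeholder, since the text leaves that case as a TODO.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DeviceMapping:
    device_key_id: int
    scrambling_key: Optional[bytes]  # None models SCRAMBLING-Key being NULL
    bus_id: int
    intra_bus_id: int


# DEVICE_MAPPINGS, keyed by (IPv6, Flavour, port); sample row is hypothetical.
DEVICE_MAPPINGS = {
    ("2001:db8::10", "SAoUDP", 7777): DeviceMapping(1, None, 0, 5),
}

# KEY_MAPPINGS, keyed by (Device-Key-ID, external-SASP-key-ID).
KEY_MAPPINGS = {
    (1, 42): 7,
}


def route_inbound(ipv6: str, flavour: str, port: int,
                  external_key_id: int) -> Optional[Tuple[int, int, int]]:
    """Return (Bus-ID, Intra-Bus-ID, internal-SASP-key-ID), or None if unmapped."""
    mapping = DEVICE_MAPPINGS.get((ipv6, flavour, port))
    if mapping is None:
        return None
    internal_key_id = KEY_MAPPINGS.get((mapping.device_key_id, external_key_id))
    if internal_key_id is None:
        return None  # behaviour is a TODO in the text; placeholder only
    return (mapping.bus_id, mapping.intra_bus_id, internal_key_id)
```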

When an incoming packet from SADLP-* comes in (from certain Bus-ID and Intra-Bus-ID), SmartAnthill Router:

  • processes SADLP-* incoming packet to obtain SAMP packet, and then “SAoUDP+UDP (compressed)” as described in respective documents
  • processes “SAoUDP+UDP (compressed)” packet as described above, to obtain PAYLOAD and FOREIGN-IP-ADDRESS
  • parses PAYLOAD to get SASP packet, and gets key-ID out of it (this can be done without decrypting SASP packet); for SmartAnthill Router, this is internal-SASP-key-ID
  • finds a row in DEVICE_MAPPINGS table, based on (Bus ID, Intra-Bus ID), and obtains Device-Key-ID and SCRAMBLING-Key TODO: what to do if not found
  • finds a row in KEY_MAPPINGS table, based on (Device-Key-ID, internal-SASP-key-ID), and obtains external-SASP-key-ID TODO: what to do if not found
  • changes (‘hacks’) SASP packet to use external-SASP-key-ID instead of internal-SASP-key-ID; this can be done without decrypting SASP packet
  • if the DEVICE_MAPPINGS record found above contains “re-crypt” information, SmartAnthill Router decrypts SASP packet within SAoIP-Payload (using Recrypt-Internal-Key) and encrypts it again (using Recrypt-External-Key)
  • forms a SAoIP packet, using FOREIGN-IP-ADDRESS, and ‘hacked’ SASP packet as a payload
  • if SCRAMBLING-Key is not NULL, SCRAMBLES packet, using SCRAMBLING-Key
  • sends packet to FOREIGN-IP-ADDRESS

SmartAnthill Mesh Protocol (SAMP)

EXPERIMENTAL

Version:v0.0.23

IMPORTANT: This document is obsolete. Please DO NOT modify it. Please refer to SimpleIoT Heterogeneous Mesh Protocol (SimpleIoT/HMP) for an up to date version.

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SAMP is a part of SmartAnthill 2.0 protocol stack (TODO: insert). It belongs to Level 3 of OSI/ISO Network Model, and is responsible for routing packets within SmartAnthill mesh network.

SmartAnthill mesh network is a heterogeneous network. In particular, on the way from SmartAnthill Central Controller to SmartAnthill Device a packet may traverse different bus types (including all supported types of wired and wireless buses); the same stands for the packet going in the opposite direction.

SAMP is optimized based on the following assumptions:

  • SAMP relies on all communications being between Central Controller and Device (no Device-to-Device communications); no other communications are currently supported
  • SAMP aims to optimize “last mile” traffic (between last Retransmitting Device and target Device) while paying less attention to Central-Controller-to-Retransmitting-Device and Retransmitting-Device-to-Retransmitting-Device traffic. This is based on the assumption that the Retransmitting Devices usually have significantly less power restrictions (for example, are mains-powered rather than battery-powered).
  • SAMP combines data with route requests
  • SAMP allows sending “urgent” data packets, which sacrifice traffic and energy consumption for the best possible delivery speed
  • SAMP relies on pre-existence of Routing Tables (see below) on all relevant Retransmitting Nodes. Communicating Routing Tables MAY be implemented over the upper-layer protocol such as SACCP
    • This is done because of sensitivity of Routing Tables; with upper-layer protocol, Routing Tables can be communicated securely
    • It doesn’t create a chicken-and-egg problem, as SAMP provides a way to reach any reachable Retransmitting Node without a Routing Table on it; as soon as Retransmitting Node is reachable via SAMP, upper-layer protocol such as SACCP can be used to create/update Routing Table on the Retransmitting Node.
    • Technically, updating Routing Tables is not a part of SAMP; however, a protocol of updating Routing Tables over SACCP_PHY_AND_ROUTING_DATA messages is provided below as an example.
  • SAMP relies on upper-layer protocol (such as SAGDP) to send retransmits in case if packet has not been delivered, and to provide SAMP with an information about retransmit number (i.e., original packet having retransmit-number=0, first retransmit having retransmit-number=1, and so on).
  • SAMP relies on upper-layer protocol (such as SAGDP) to provide information on whether the Device on the other side is required to have its transmitter on for upper-layer protocol purposes. For SAGDP, there are states which do guarantee this (in fact, it holds in almost all SAGDP states except for IDLE).

SAMP has the following types of actors: Root (normally implemented by Central Controller), Retransmitting Device, and non-Retransmitting Device. All these actors are collectively named Nodes.

Underlying Protocol Requirements

SAMP underlying protocol (normally SADLP-*) MUST support the following operations:

  • bus broadcast (addressed to all the Devices on the bus)
  • bus multi-cast (addressed to a list of the Devices on the bus)
  • bus uni-cast

NB: these operations MAY be implemented using only bus broadcast, without any additional intra-bus addressing information; all SAMP packets have sufficient information to ensure further processing of SAMP packets without underlying protocol addressing information. If some information within SAMP packet becomes redundant given underlying protocol’s addressing information, underlying protocol MAY compress SAMP packet when transmitting it, by re-using underlying-protocol information when compressing SAMP packet; however, as SAMP addresses in normal (post-pairing) communication are usually very short anyway, such compression is not likely to bring substantial benefits.

All SmartAnthill Devices SHOULD, and all SmartAnthill Retransmitting Devices MUST implement some kind of collision avoidance (at least CSMA/CA, a.k.a. “listen before talk with random delay”).

SCRAMBLING and Underlying Protocol Error Correction

SAMP packets are usually SCRAMBLED, and after SCRAMBLING are transmitted over some of SADLP-* protocols.

SADLP-* protocols SHOULD allow for gradual error correction, starting from the beginning of the packet. Even if the packet cannot be error-corrected completely, information in the first part of the header MAY be of value, and SHOULD be passed to upper layers. SCRAMBLING procedure SHOULD allow for partial descrambling (to the extent possible), and SHOULD return partially descrambled packets back to SAMP. It will allow SAMP to get “partially correct” packets, which are to be used as described below, to improve certain SAMP characteristics. SAMP uses headers of “partially correct” packets in “promiscuous mode” operations, and in some other cases referred to as “partially correct packet”.

Promiscuous Mode Operations

Wherever possible (in particular, for all kinds of wireless communications unless explicitly prohibited by underlying standard), SmartAnthill Retransmitting Devices SHOULD listen to the network in promiscuous mode; this doesn’t affect security, but provides valuable header information and speeds up message delivery and recovery in certain practical cases.

SmartAnthill Retransmitting Devices

Some SmartAnthill Devices are intended to be “SmartAnthill Retransmitting Devices”. “SmartAnthill Retransmitting Device” has one or more transmitters. Transmitters on SmartAnthill Retransmitting Devices MUST be always-on; turning off transmitter is NOT allowed for SmartAnthill Retransmitting Devices. That is, if MCUSLEEP instruction is executed on a SmartAnthill Retransmitting Device, it simply pauses executing a program, without turning transmitter off (TODO: add to Zepto VM). Normally, SmartAnthill Retransmitting Devices are mains-powered, or are using large batteries. SmartAnthill Protocol Stack (specifically SAMP) on SmartAnthill Retransmitting Devices requires more resources (specifically RAM) than non-Retransmitting Devices.

Highly mobile Devices SHOULD NOT be Retransmitting Devices. Building a reliable network out of highly mobile Devices is problematic from the outset (and outright impossible if their movements are not synchronized). Therefore, SAMP assumes that Retransmitting Devices are moved rarely, and is optimized for rarely-moving Retransmitting Devices. While SAMP does recover from moving one or even all Retransmitting Devices, this scenario is not optimized and recovery from it may take significant time.

Routing Tables

Each Retransmitting Device, after pairing, MUST keep a Routing Table. Routing Table consists of two lists: (a) Links list, with each entry being (LINK-ID,BUS-ID,INTRA-BUS-ID,NEXT-HOP-ACKS,LINK-DELAY-UNIT,LINK-DELAY,LINK-DELAY-ERROR) tuple, and (b) Routes list, with each entry being (TARGET-ID,LINK-ID). LINK-ID is an intra-Routing-Table id, used to map routes into links.

Each entry in Routes list has semantics of “where to route packet addressed to TARGET-ID”. In Links list, INTRA-BUS-ID=NULL means that the entry is for an incoming link. Incoming link entries are relatively rare, and are used to specify LINK-DELAYs.

NEXT-HOP-ACKS is a flag which is set if the nearest hop (over (BUS-ID,INTRA-BUS-ID)) is known to be able not only to receive packets, but to send ACKs back; in general, NEXT-HOP-ACKS cannot be calculated based only on bus type, and may change for the same link during system operation; SAMP is built to try using links with NEXT-HOP-ACKS as much as possible, but MAY use links without NEXT-HOP-ACKS if there are no alternatives.

TODO: size reporting to Root (as # of unspecified ‘storage units’, plus sizes of Links entry and Routes entry expressed in the same ‘storage units’).

Routing Tables SHOULD be stored in a ‘canonical’ way (Links list ordered from lower LINK-IDs to higher ones, Routes list ordered from lower TARGET-IDs to higher ones; duplicate entries for the same LINK-ID are prohibited, for the same TARGET-ID are currently prohibited); this is necessary to simplify calculations of the Routing Table checksums. TODO: specify Routing-Table-Checksum calculation
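
The Routing Table structure described above can be sketched as follows. This is a minimal in-memory model under stated assumptions: class and method names are hypothetical, `None` stands for INTRA-BUS-ID=NULL, and keying both lists by their IDs is one possible way to rule out duplicate LINK-ID and TARGET-ID entries. Checksum calculation is not shown, as the text leaves it as a TODO.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Link:
    link_id: int
    bus_id: int
    intra_bus_id: Optional[int]  # None models INTRA-BUS-ID=NULL (incoming link)
    next_hop_acks: bool
    link_delay_unit: int = 0
    link_delay: int = 0
    link_delay_error: int = 0


class RoutingTable:
    def __init__(self) -> None:
        self._links: Dict[int, Link] = {}   # LINK-ID -> Link
        self._routes: Dict[int, int] = {}   # TARGET-ID -> LINK-ID

    def set_link(self, link: Link) -> None:
        self._links[link.link_id] = link

    def set_route(self, target_id: int, link_id: int) -> None:
        self._routes[target_id] = link_id

    def canonical(self) -> Tuple[List[Link], List[Tuple[int, int]]]:
        """Links ordered by LINK-ID, Routes ordered by TARGET-ID."""
        links = [self._links[k] for k in sorted(self._links)]
        routes = sorted(self._routes.items())
        return links, routes

    def next_hop(self, target_id: int) -> Optional[Tuple[int, Optional[int]]]:
        """Return (BUS-ID, INTRA-BUS-ID) for TARGET-ID, or None if no route."""
        link_id = self._routes.get(target_id)
        if link_id is None or link_id not in self._links:
            return None
        link = self._links[link_id]
        return link.bus_id, link.intra_bus_id
```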

On non-Retransmitting Devices, Routing Table is rudimentary: it contains only one Link (LINK-ID=0,BUS-ID,INTRA-BUS-ID,...) and only one Route (TARGET-ID=0,LINK-ID=0). Moreover, on non-Retransmitting Devices Routing Table is OPTIONAL; if non-Retransmitting Device does not keep Routing Table - it MUST be reflected in a TODO CAPABILITIES flag during “pairing”; in this case Root MUST send requests to such devices specifying TODO header extension (which contains BUS-ID,INTRA-BUS-ID for the first hop back from target Device).

All Routing Tables on both Retransmitting and non-Retransmitting Devices are essentially (usually partial) replicas of “Master Routing Tables” which are kept on Root. It is a responsibility of Root to maintain Routing Tables for all the Devices (both Retransmitting and non-Retransmitting); it is up to Root which entries to store in each Routing Table. In some cases, Routing Table might need to be truncated; in this case, it is responsibility of Root to use VIA field in Target-Address (see below) to ensure that the packet can be routed given the Routing Tables present. In any case, Routing Table MUST be able to contain at least one entry, with TARGET-ID=0 (Root). This guarantees that path to Root can always be found without VIA field.

In addition, on Retransmitting Devices the following parameters are kept (and updated by Root): MAX-TTL, FORWARD-TO-SANTA-DELAY-UNIT, FORWARD-TO-SANTA-DELAY, MAX-FORWARD-TO-SANTA-DELAY (using same units as FORWARD-TO-SANTA-DELAY), NODE-MAX-RANDOM-DELAY-UNIT, and NODE-MAX-RANDOM-DELAY. MAX-FORWARD-TO-SANTA-DELAY indicates maximum “forward to santa” delay for all Retransmitting Devices in the PAN.

TODO: no mobile non-Retransmitting (TODO reporting ‘mobile’ in pairing CAPABILITIES, plus heuristics), priorities (low->high): non-Retransmitting, Retransmitting.

Broken Routing Tables

Although Routing Tables are updated only by authenticated upper-layer messages, SAMP does recognize that Routing Tables may become broken during operation. To deal with it, two separate procedures are used. One such procedure is intended for destination Devices (either Retransmitting or non-Retransmitting), and is described within “Unicast” section below. Another procedure is intended for Retransmitting Devices, and is described in “Guaranteed Unicast” section below.

Communicating Routing Table Information over SACCP

As described above, SAMP relies on Routing Table information being available on all relevant Retransmitting Nodes. To ensure that this information is transmitted in secure manner, it SHOULD be transmitted by an upper-layer secure (and guaranteed-delivery) protocol such as SACCP. As described above, this doesn’t create a chicken-and-egg problem, as each Retransmitting Node can be accessed via SAMP regardless of Routing Tables present (or even badly broken) on the Retransmitting Node in question; and as soon as Retransmitting Node can be accessed via SAMP - upper-layer protocol such as SACCP can be used to update Routing Table on the Retransmitting Node.

Technically, protocol for communicating Routing Table information is not a part of SAMP. However, in this section we provide an example implementation of such protocol over SACCP_PHY_AND_ROUTING_DATA packets.

SACCP_PHY_AND_ROUTING_DATA supports the following packets:

Route-Update-Request: | FLAGS | OPTIONAL-EXTRA-HEADERS | OPTIONAL-ORIGINAL-RT-CHECKSUM | OPTIONAL-MAX-TTL | OPTIONAL-FORWARD-TO-SANTA-DELAY-UNIT | OPTIONAL-FORWARD-TO-SANTA-DELAY | OPTIONAL-MAX-FORWARD-TO-SANTA-DELAY | OPTIONAL-MAX-NODE-RANDOM-DELAY-UNIT | OPTIONAL-MAX-NODE-RANDOM-DELAY | MODIFICATIONS-LIST | RESULTING-RT-CHECKSUM |

where:

  • FLAGS is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] being DISCARD-RT-FIRST (indicating that before processing MODIFICATIONS-LIST, the whole Routing Table must be discarded), bit[1] being UPDATE-MAX-TTL flag, bit[2] being UPDATE-FORWARD-TO-SANTA-DELAY flag, bit[3] being UPDATE-MAX-NODE-RANDOM-DELAY flag, and bits[4..] reserved (MUST be zeros)
  • OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above; Target-Address is the Target-Address field
  • OPTIONAL-ORIGINAL-RT-CHECKSUM is present only if DISCARD-RT-FIRST flag is not set; it is a Routing-Table-Checksum, specifying Routing Table checksum before the change is applied; if OPTIONAL-ORIGINAL-RT-CHECKSUM doesn’t match that of the Routing Table - it is TODO Routing-Error
  • OPTIONAL-MAX-TTL is present only if UPDATE-MAX-TTL flag is set, and is a 1-byte field
  • OPTIONAL-FORWARD-TO-SANTA-DELAY-UNIT, OPTIONAL-FORWARD-TO-SANTA-DELAY, and OPTIONAL-MAX-FORWARD-TO-SANTA-DELAY are present only if UPDATE-FORWARD-TO-SANTA-DELAY flag is set, and all are Encoded-Signed-Int<max=2> fields
  • OPTIONAL-MAX-NODE-RANDOM-DELAY-UNIT and OPTIONAL-MAX-NODE-RANDOM-DELAY are present only if UPDATE-MAX-NODE-RANDOM-DELAY flag is set, and both are Encoded-Unsigned-Int<max=2> fields
  • MODIFICATIONS-LIST is described below
  • RESULTING-RT-CHECKSUM is a Routing-Table-Checksum, specifying Routing Table checksum after the change has been applied (if RESULTING-RT-CHECKSUM doesn’t match - it is TODO Routing-Error)

Route-Update-Request is always accompanied with SACCP “additional bits” equal to 0x0 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

MODIFICATIONS-LIST consists of entries, where each entry is one of the following:

  • | ADD-OR-MODIFY-LINK-ENTRY-AND-LINK-ID | BUS-ID | NEXT-HOP-ACKS-AND-INTRA-BUS-ID-PLUS-1 | OPTIONAL-LINK-DELAY-UNIT | OPTIONAL-LINK-DELAY | OPTIONAL-LINK-DELAY-ERROR |

    where ADD-OR-MODIFY-LINK-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant ADD_OR_MODIFY_LINK_ENTRY, bit[3] being LINK-DELAY-PRESENT flag, and bits[4..] equal to LINK-ID; BUS-ID is an Encoded-Unsigned-Int<max=2> field; NEXT-HOP-ACKS-AND-INTRA-BUS-ID-PLUS-1 is an Encoded-Unsigned-Int<max=4> bitfield substrate, with bit[0] being a NEXT-HOP-ACKS flag for the Routing Table Entry, and bits[1..] representing INTRA-BUS-ID-PLUS-1 (INTRA-BUS-ID-PLUS-1 == 0 means that INTRA-BUS-ID==NULL, and therefore that the link entry is an incoming link entry; otherwise, INTRA-BUS-ID = INTRA-BUS-ID-PLUS-1 - 1); OPTIONAL-LINK-DELAY-UNIT, OPTIONAL-LINK-DELAY, and OPTIONAL-LINK-DELAY-ERROR are present only if LINK-DELAY-PRESENT flag is set, and are Encoded-Unsigned-Int<max=2> fields. NB: by default, link delays are not set by Root, and are set based on device’s internal per-bus settings.

  • | DELETE-LINK-ENTRY-AND-LINK-ID |

    where DELETE-LINK-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant DELETE_LINK_ENTRY, and bits[3..] equal to LINK-ID.

  • | ADD-OR-MODIFY-ROUTE-ENTRY-AND-LINK-ID | TARGET-ID |

    where ADD-OR-MODIFY-ROUTE-ENTRY-AND-LINK-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant ADD_OR_MODIFY_ROUTE_ENTRY, and bits[3..] equal to LINK-ID; TARGET-ID is an Encoded-Unsigned-Int<max=2> field.

  • | DELETE-ROUTE-ENTRY-AND-TARGET-ID |

    where DELETE-ROUTE-ENTRY-AND-TARGET-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] marking the end of MODIFICATIONS-LIST, bits[1..2] equal to a 2-bit constant DELETE_ROUTE_ENTRY, and bits[3..] equal to TARGET-ID. Note that DELETE-ROUTE-ENTRY-AND-TARGET-ID is the only MODIFICATIONS-LIST entry first field which includes TARGET-ID rather than LINK-ID.
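
The bitfield layouts of the MODIFICATIONS-LIST entries above can be sketched as follows. This is an illustration under several assumptions: the numeric values of the four 2-bit constants are not given in this section and are made up here, bit[0] is assumed to be the least significant bit, bit[0]=1 is assumed to mean "this is the last entry", and the function names are hypothetical. The functions produce the integer values only; their serialization as Encoded-Unsigned-Int is not shown.

```python
from typing import Optional, Tuple

# Hypothetical values: the text names these 2-bit constants without defining them.
ADD_OR_MODIFY_LINK_ENTRY = 0
DELETE_LINK_ENTRY = 1
ADD_OR_MODIFY_ROUTE_ENTRY = 2
DELETE_ROUTE_ENTRY = 3


def pack_add_or_modify_link(link_id: int, link_delay_present: bool,
                            last_entry: bool) -> int:
    """bit[0]=end marker, bits[1..2]=type, bit[3]=LINK-DELAY-PRESENT, bits[4..]=LINK-ID."""
    return (int(last_entry)
            | (ADD_OR_MODIFY_LINK_ENTRY << 1)
            | (int(link_delay_present) << 3)
            | (link_id << 4))


def pack_short_entry(entry_type: int, entry_id: int, last_entry: bool) -> int:
    """Other three entry types: bit[0]=end marker, bits[1..2]=type, bits[3..]=ID."""
    return int(last_entry) | (entry_type << 1) | (entry_id << 3)


def pack_next_hop_acks_and_intra_bus_id(next_hop_acks: bool,
                                        intra_bus_id: Optional[int]) -> int:
    """bit[0]=NEXT-HOP-ACKS, bits[1..]=INTRA-BUS-ID-PLUS-1 (0 means NULL)."""
    plus_1 = 0 if intra_bus_id is None else intra_bus_id + 1
    return int(next_hop_acks) | (plus_1 << 1)


def unpack_next_hop_acks_and_intra_bus_id(value: int) -> Tuple[bool, Optional[int]]:
    plus_1 = value >> 1
    return bool(value & 1), (None if plus_1 == 0 else plus_1 - 1)
```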

Route-Update-Request packets always go from Root to Device. Route-Update-Request MAY be sent either to Retransmitting or to non-Retransmitting Device; however (as with any SACCP packet), if sending it to a non-Retransmitting Device, Root MUST be sure that non-Retransmitting Device has its transmitter turned on (because upper-layer protocol state guarantees it).

Route-Update-Response: | ERROR-CODE | TODO: more error info if any

where ERROR-CODE is an Encoded-Unsigned-Int<max=1> field, containing error code. ERROR-CODE = 0 means that Route-Update-Request has been completed successfully.

Route-Update-Response is always accompanied with SACCP “additional bits” equal to 0x0 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

Communicating PHY Information over SACCP

Some of SADLP-* protocols (as described in corresponding SADLP-* document) MAY need to communicate information to Central Controller (for example, to calculate optimums using quite complicated methods).

This is done via the following packets:

PHY-Data-Request: | ID-OF-SADLP | SADLP-DEPENDENT-PAYLOAD |

where ID-OF-SADLP is an Encoded-Unsigned-Int<max=2> field, specified in respective SADLP-* document. TODO: list of all IDs in one place to avoid potential for collisions.

PHY-Data-Request is always accompanied with SACCP “additional bits” equal to 0x1 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Response: | SADLP-DEPENDENT-PAYLOAD |

PHY-Data-Response is always accompanied with SACCP “additional bits” equal to 0x1 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Ready-Request: | (empty)

PHY-Data-Ready-Request is always accompanied with SACCP “additional bits” equal to 0x2 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

PHY-Data-Ready-Response: | (empty)

PHY-Data-Ready-Response is always accompanied with SACCP “additional bits” equal to 0x2 (see SmartAnthill Command&Control Protocol (SACCP) for details on SACCP_PHY_AND_ROUTING_DATA “additional bits”).

To indicate that PHY-level tuning is completed, Device sends PHY-Data-Ready-Response (sic!); this happens at the point specified in respective SADLP-* document. In response, Root sends PHY-Data-Ready-Request (sic!).

Addressing

SAMP supports two ways of addressing devices: non-paired and paired.

Non-paired addressing is used for temporarily addressing Devices which are not “paired” with SmartAnthill Central Controller (yet). Non-paired addressing is used ONLY during “Pairing” process, as described in SmartAnthill Pairing document. As soon as “pairing” is completed, Device obtains its own SAMP-NODE-ID (TODO: add to pairing document), and all further communication with Device is performed using “paired” addressing. Non-paired addressing is a triplet (NODE-ID,BUS-ID,INTRA-BUS-ID).

Paired addressing is used for addressing Devices which have already been “paired”. It is always one single item SAMP-NODE-ID. Root always has SAMP-NODE-ID=0.

SAMP Checksums

To validate integrity of SAMP headers, and of the whole SAMP packets, SAMP-CHECKSUM is used.

SAMP-CHECKSUM is defined as a Fletcher-16 checksum, as described in https://en.wikipedia.org/wiki/Fletcher%27s_checksum (using modulo 255), stored using “SmartAnthill Endianness”.

Whenever the packet has both header and body, SAMP uses two SAMP-CHECKSUMs: first checksum (referred to as HEADER-CHECKSUM) encompasses only header (i.e. everything before the first checksum), second SAMP-CHECKSUM (referred to as FULL-CHECKSUM) is located at the very end and encompasses header+first_checksum+body (i.e. everything before the second checksum).
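
The checksum and the two-checksum packet layout can be sketched as follows. The Fletcher-16 routine follows the standard modulo-255 definition; the `build_packet` helper and its name are illustrative, and the byte order is a parameter because “SmartAnthill Endianness” is defined elsewhere (the `"little"` default is a placeholder assumption, not a statement about the protocol).

```python
def fletcher16(data: bytes) -> int:
    """Fletcher-16 checksum using modulo 255; returns (sum2 << 8) | sum1."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def build_packet(header: bytes, body: bytes, byteorder: str = "little") -> bytes:
    """Lay out header | HEADER-CHECKSUM | body | FULL-CHECKSUM."""
    # HEADER-CHECKSUM covers everything before the first checksum.
    header_checksum = fletcher16(header).to_bytes(2, byteorder)
    first_part = header + header_checksum + body
    # FULL-CHECKSUM covers header + first checksum + body.
    full_checksum = fletcher16(first_part).to_bytes(2, byteorder)
    return first_part + full_checksum
```

As a sanity check, `fletcher16(b"abcde")` evaluates to 0xC8F0, which agrees with the commonly cited test vector for Fletcher-16.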

DELAYs and DELAY-UNITs

Whenever delay (or more generally - time interval) needs to be calculated, it is always represented as two fields: DELAY itself and corresponding DELAY-UNIT.

To calculate delay for specific DELAY and DELAY-UNIT, the following formula is used (the formula as written is assumed to be in floating-point; other equivalent implementations are possible depending in particular on timer resolution for specific Device): delay = 1 millisecond * DELAY * (2^DELAY-UNIT); that is, DELAY-UNIT=0 and DELAY=1 means 1 millisecond, DELAY-UNIT=1 and DELAY=1 means 2 milliseconds, and DELAY-UNIT=-2 and DELAY=1 means 0.25 milliseconds.
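
A direct floating-point transcription of this formula might look as follows; the function name is hypothetical, and a real Device would likely use an integer implementation matched to its timer resolution.

```python
def delay_ms(delay: int, delay_unit: int) -> float:
    """Delay in milliseconds: DELAY * 2^DELAY-UNIT (DELAY-UNIT may be negative)."""
    return delay * (2.0 ** delay_unit)
```

The three examples from the text correspond to `delay_ms(1, 0)`, `delay_ms(1, 1)` and `delay_ms(1, -2)`.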

Recovery Philosophy

Recovery from route changes/failures is vital for any mesh protocol. SAMP does it as follows:

  • by default, most of the transfers are not acknowledged at SAMP level (go as Samp-Unicast-Data-Packet without GUARANTEED-DELIVERY flag)
  • however, upper-layer protocol (normally SAGDP) issues its own retransmits and passes the retransmit number to SAMP
  • on retransmit #N, SAMP switches GUARANTEED-DELIVERY flag on
  • when GUARANTEED-DELIVERY flag is set, SAMP uses ‘Guaranteed Uni-Cast’ mode described below
  • if ‘Guaranteed Uni-Cast’ fails M times (as described below), link failure is assumed
  • link failure (as described above) is reported to the Root, so it can initiate route discovery to the node on the other side of the failed link (using Samp-From-Santa-Data-Packet)
    • if link failure is detected from the side of the link which is close to Root, link failure reporting is done by sending Routing-Error (which always comes in GUARANTEED-DELIVERY mode) back to Root
    • if link failure is detected from the side of the link which is far from Root, link failure reporting is done by broadcasting Samp-To-Santa-Data-Or-Error-Packet, which is then converted into Samp-Forward-To-Santa-Data-Or-Error-Packet (which is always sent in GUARANTEED-DELIVERY mode) by all Retransmitting Devices which have received it.

Storm Avoidance

To reduce number of induced collisions during broadcasts, a.k.a. “request storm” and “reply storm” (NB: avoiding “storms” is important even when CSMA/CA is present, because CSMA/CA provides only probabilistic success), SAMP supports two mechanisms: explicit time-based collision avoidance, and random-delay-based storm avoidance.

Explicit Time-Based Storm Avoidance and Collision Domains

SAMP explicit time-based collision avoidance works as follows:

  • to avoid “request storm”: when performing a ‘network flood’ (using Samp-From-Santa-Data-Packet), Root MAY specify explicit time delays for each node.
  • to avoid “reply storm”: Root MAY specify FORWARD-TO-SANTA-DELAY-* parameters; whenever a Samp-To-Santa-Data-Or-Error-Packet (essentially sent as “anybody who can hear this, forward it to Root”) is received by Retransmitting Nodes, each receiving Retransmitting Node waits according to FORWARD-TO-SANTA-DELAY before retransmitting.
  • In addition (to avoid “storms” in general), each SAMP packet MAY have ‘Collision-Domain’ restrictions (i.e. “from t0-from-now to t1-from-now, don’t transmit on Collision-Domain #CD”); these restrictions specify . Retransmitting Devices SHOULD monitor Collision-Domain headers in promiscuous mode and act accordingly, even if the packet is not addressed to this Retransmitting Device.

Random-delay-based Storm Avoidance

If explicit time-based collision avoidance is not used, Retransmitting Devices MUST use random delays (based on NODE-MAX-RANDOM-DELAY-UNIT and NODE-MAX-RANDOM-DELAY) as specified below.

Target-Address, Multiple-Target-Addresses, and Multiple-Target-Addresses-With-Extra-Data

Target-Address can hold either a paired address or a non-paired address. Target-Address is encoded as

| FLAG-AND-NODE-ID | OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID | ... | OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID | OPTIONAL-CUSTOM-INTRA-BUS-SIZE | OPTIONAL-INTRA-BUS-ID |

where FLAG-AND-NODE-ID is an Encoded-Unsigned-Int<max=2> bitfield substrate, where bit[0] is EXTRA_DATA_FOLLOWS flag, and bits[1..] are NODE-ID.

OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present only if EXTRA_DATA_FOLLOWS is set, and is an Encoded-Unsigned-Int<max=2> bitfield substrate, where bit[0] represents IS_NONPAIRED_ADDRESS flag, and the rest of the bits depend on bit[0]. If IS_NONPAIRED_ADDRESS flag is not set, then bits[1..] represent VIA field (encoded as NODE-ID+1); if VIA field is -1 (because bits[1..] are zero), then no further extra data fields are present. If IS_NONPAIRED_ADDRESS flag is set, then bits[1..3] represent INTRA-BUS-SIZE (with value 0x7 interpreted in a special way, specifying that INTRA-BUS-SIZE is ‘custom’), and bits[4..] represent BUS-ID. If IS_NONPAIRED_ADDRESS flag is not set, and VIA field in it is >=0, it means that another OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID field is present, which is interpreted as above. An OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID field with either IS_NONPAIRED_ADDRESS set, or with VIA field equal to -1, denotes the end of the list.

OPTIONAL-CUSTOM-INTRA-BUS-SIZE is present only if OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present, and flag IS_NONPAIRED_ADDRESS is set, and INTRA-BUS-SIZE field has value ‘custom’; OPTIONAL-INTRA-BUS-ID is present only if OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID is present, and has INTRA-BUS-SIZE (calculated from OPTIONAL-VIA-OR-INTRA-BUS-SIZE-AND-BUS-ID and OPTIONAL-CUSTOM-INTRA-BUS-SIZE) size.
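
A rough sketch of how the leading fields of a Target-Address might be interpreted is given below. It assumes that each Encoded-Unsigned-Int<max=2> field has already been decoded into a plain integer (that decoding is defined elsewhere), and that bit[0] is the least significant bit. The names `TargetAddressHead` and `parse_target_address_head` are hypothetical, and the trailing OPTIONAL-CUSTOM-INTRA-BUS-SIZE and OPTIONAL-INTRA-BUS-ID fields are deliberately left to the caller.

```python
from typing import Iterator, List, NamedTuple, Optional

CUSTOM_INTRA_BUS_SIZE = 0x7  # INTRA-BUS-SIZE value meaning 'custom'


class TargetAddressHead(NamedTuple):
    node_id: int
    via: List[int]                       # VIA node ids, in order of appearance
    intra_bus_size_code: Optional[int]   # None for a paired address
    bus_id: Optional[int]                # None for a paired address


def parse_target_address_head(fields: Iterator[int]) -> TargetAddressHead:
    """Interpret already-decoded integer fields of a Target-Address."""
    first = next(fields)
    extra_data_follows = first & 1       # bit[0] of FLAG-AND-NODE-ID
    node_id = first >> 1                 # bits[1..]
    via: List[int] = []
    size_code = None
    bus_id = None
    while extra_data_follows:
        field = next(fields)
        if field & 1:                    # IS_NONPAIRED_ADDRESS set: ends the list
            size_code = (field >> 1) & 0x7   # bits[1..3]
            bus_id = field >> 4              # bits[4..]
            break
        via_value = (field >> 1) - 1     # VIA is stored as NODE-ID+1
        if via_value < 0:                # VIA == -1: ends the list
            break
        via.append(via_value)
    return TargetAddressHead(node_id, via, size_code, bus_id)
```

If `intra_bus_size_code` equals `CUSTOM_INTRA_BUS_SIZE`, the caller would read OPTIONAL-CUSTOM-INTRA-BUS-SIZE next, followed by OPTIONAL-INTRA-BUS-ID.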

Multiple-Target-Addresses is essentially a multi-cast address. It is encoded as a list of items, where each item is similar to a Target-Address field, with the following changes:

  • for list entries, within FLAG-AND-NODE-ID field it is NODE-ID + 1 which is stored (instead of simple NODE-ID for single Target-Address). This change does not affect VIA fields.
  • to denote the end of Multiple-Target-Addresses list, FLAG-AND-NODE-ID field with EXTRA_DATA_FOLLOWS=0 and NODE-ID=0, is used
  • value of FLAG-AND-NODE-ID field with EXTRA_DATA_FOLLOWS=1 and NODE-ID=0, is prohibited (reserved)

Multiple-Target-Addresses-With-Extra-Data is the same as Multiple-Target-Addresses, but each item (except for the last one, where NODE-ID=0), additionally contains some extra data (which is specified whenever Multiple-Target-Addresses-With-Extra-Data is mentioned). For example, if we’re speaking about “Multiple-Target-Addresses-With-Extra-Data, where Extra-Data is 1-byte field”, it means that each item of the list (except for the last one) will have both Target-Address field (with changes described in Multiple-Target-Addresses), and 1-byte field of extra data.
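
The NODE-ID + 1 convention and the list terminator can be illustrated with a deliberately restricted sketch: it handles only items without VIA or bus fields and without Extra-Data, and it works on already-decoded integer fields (bit[0] assumed least significant). Function names are hypothetical.

```python
from typing import Iterable, List


def encode_simple_multi_target(node_ids: Iterable[int]) -> List[int]:
    """Encode node ids as FLAG-AND-NODE-ID values with EXTRA_DATA_FOLLOWS=0."""
    fields = [(node_id + 1) << 1 for node_id in node_ids]  # NODE-ID + 1 is stored
    fields.append(0)  # terminator: EXTRA_DATA_FOLLOWS=0 and NODE-ID=0
    return fields


def decode_simple_multi_target(fields: Iterable[int]) -> List[int]:
    """Decode a list produced by encode_simple_multi_target()."""
    node_ids: List[int] = []
    for field in fields:
        extra_data_follows = field & 1
        stored = field >> 1
        if stored == 0:
            if extra_data_follows:
                raise ValueError("reserved: EXTRA_DATA_FOLLOWS=1 with NODE-ID=0")
            return node_ids  # end of list reached
        if extra_data_follows:
            raise NotImplementedError("VIA / bus fields are outside this sketch")
        node_ids.append(stored - 1)
    raise ValueError("list terminator missing")
```

Storing NODE-ID + 1 keeps the value 0 free for the terminator, so node id 0 can still appear as a list entry.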

Time-To-Live

Time-To-Live (TTL) is a field which is intended to address misconfigured/inconsistent Routing Tables. TTL is set to certain value (default 4) whenever the packet is sent, and is decremented by each Node which retransmits the packet. TTL=0 is valid, but TTL < 0 is not; whenever the packet needs to be retransmitted and it would cause TTL to become < 0 - the packet is dropped (with a Routing-Error, see below).

During normal operation, such drops SHOULD NOT occur. Whenever the packet is dropped because TTL is down to zero (except for Routing-Error SAMP packets), it MUST cause a TODO Routing-Error to be sent to Root.

Uni-Cast Processing

Whenever a Uni-Cast packet (the one with a Target-Address field) is received by Retransmitting Device, the procedure is the following:

  • check if the Target-Address is intended for the Retransmitting Device
    • if it is - process the packet locally and don’t process further
  • if packet TTL is already equal to 0 - drop the packet and send Routing-Error to the Root (see Time-To-Live section above for details)
  • decrement packet TTL
  • using Routing Table, find next hop for the Target-Address
    • if next hop cannot be found for the Target-Address itself, but Target-Address contains VIA field(s) - try to find next hop based on each of VIA fields
    • if next hop cannot be found using Target-Address and all VIA field(s) - drop the packet and send TODO Routing-Error to the Root
  • if any of VIA fields in the Target-Address is the same as the next hop - remove all such VIA fields from the Target-Address
  • find bus for the next hop and send modified packet (see on TTL and VIA modifications above) over this bus
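
The procedure above can be sketched roughly as follows. The packet model, the routing-table shape (destination node id mapped to a next hop and a bus name) and the returned action tuples are all assumptions made for illustration; the actual Routing Table format is defined elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class UnicastPacket:
    target: int
    ttl: int
    via: List[int] = field(default_factory=list)


# Hypothetical routing table: destination node id -> (next hop node id, bus name)
RoutingTable = Dict[int, Tuple[int, str]]
Action = Tuple[str, Optional[int], Optional[str]]


def process_unicast(packet: UnicastPacket, my_id: int, routes: RoutingTable) -> Action:
    """Return ('local' | 'drop-routing-error' | 'forward', next_hop, bus)."""
    if packet.target == my_id:
        return ("local", None, None)
    if packet.ttl == 0:
        return ("drop-routing-error", None, None)  # Routing-Error goes to Root
    packet.ttl -= 1
    route = routes.get(packet.target)
    if route is None:
        # Fall back to VIA fields, in order, when the target itself is unknown.
        for via in packet.via:
            route = routes.get(via)
            if route is not None:
                break
    if route is None:
        return ("drop-routing-error", None, None)
    next_hop, bus = route
    # VIA entries equal to the chosen next hop are removed before sending.
    packet.via = [v for v in packet.via if v != next_hop]
    return ("forward", next_hop, bus)
```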

Processing on Destination and Broken Routing Table

As described above, SAMP does recognize that Routing Tables may become broken during operation. On a destination Device, whenever Device attempts retransmit #TODO of the message, Device sends it as a Samp-To-Santa message, ignoring local Routing Table completely; TODO: add optional-header with RT-CHECKSUM for Samp-To-Santa messages?

Guaranteed Uni-Cast

As described in detail below, all SAMP uni-cast packet types, except for Samp-Unicast-Data-Packet without GUARANTEED-DELIVERY flag and Samp-Loop-Ack-Packet, are sent in ‘Guaranteed Uni-Cast’ mode.

Processing by Retransmitting Devices

If packet is to be delivered to the next hop in ‘Guaranteed’ mode by Retransmitting Device, it is processed in the following manner:

If the packet already has LOOP-ACK extra header (see below), and next hop has NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • sends Samp-Loop-Ack-Packet (see below) back to the requestor specified in LOOP-ACK extra header
  • removes LOOP-ACK extra header
  • continues processing as specified below

If the next hop has NEXT-HOP-ACKS flag set in the Routing Table, the packet is sent using “uni-cast” bus mechanism and a timer is set. If the timer expires (or Node receives relevant Samp-Ack-Nack-Packet with IS-NACK flag set), SAMP retries sending up to 5 times (with exponentially increasing timeouts - TODO); if all 5 attempts fail - it is treated as ‘Routing-Error’. In particular:

  • if the packet has Root as Target-Address:
    • packet Samp-To-Santa-Data-Or-Error-Packet containing TBD Routing-Error as PAYLOAD (and with IS_ERROR flag set) is broadcast
    • if possible, the packet which wasn’t delivered, SHOULD be preserved (TODO: what to do if it cannot be?), and retransmitted as soon as route to the Root is restored
  • if the packet has anything except for Root as Target-Address (and therefore is coming from Root):
    • packet Samp-Routing-Error containing TBD Routing-Error is sent (towards Root)
    • to deal with potentially broken Routing Table on this Retransmitting Device, this Samp-Routing-Error packet MUST contain TODO optional-header with RT-Checksum
    • the packet which wasn’t delivered, doesn’t need to be preserved (TODO: identify packet which has been lost within Routing-Error)

If the packet doesn’t have LOOP-ACK extra header, and next hop doesn’t have NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • adds LOOP-ACK extra header (which is described below) to the packet (if it is not already present)
  • sends modified packet using “bus unicast” operation
  • and sets timer to TODO
    • if the sender doesn’t receive Samp-Loop-Ack-Packet before the timer expires - it retransmits the packet at SAMP level.
      • if such attempts don’t succeed for 5 (TODO) times (with exponentially increasing timeouts - TODO) - it is treated as ‘Routing-Error’ (the same way as described above, depending on packet having Root as a Target-Address).

If the packet already has LOOP-ACK extra header, and next hop doesn’t have NEXT-HOP-ACKS flag set in the Routing Table, then Retransmitting Device:

  • keeps LOOP-ACK extra header
  • sends packet using “bus unicast” operation
  • doesn’t set any timers
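
The combinations of the LOOP-ACK extra header and the NEXT-HOP-ACKS flag described above can be summarised in a small decision sketch. The function name and the step strings are hypothetical; the attempt count of 5 is the value quoted in the text, which is itself still marked TODO.

```python
from typing import List

MAX_ATTEMPTS = 5  # value quoted in the text; still marked TODO there


def plan_guaranteed_forward(has_loop_ack: bool, next_hop_acks: bool) -> List[str]:
    """List the steps a Retransmitting Device takes for one 'Guaranteed' hop."""
    steps: List[str] = []
    if next_hop_acks:
        if has_loop_ack:
            # The loop ends here: acknowledge it and strip the header.
            steps.append("send Samp-Loop-Ack-Packet to requestor")
            steps.append("remove LOOP-ACK header")
        steps.append("send via bus unicast")
        steps.append("await Samp-Ack-Nack-Packet, up to %d attempts" % MAX_ATTEMPTS)
    elif not has_loop_ack:
        # The loop starts here: this device becomes the LOOP-ACK requestor.
        steps.append("add LOOP-ACK header")
        steps.append("send via bus unicast")
        steps.append("await Samp-Loop-Ack-Packet, up to %d attempts" % MAX_ATTEMPTS)
    else:
        # Middle of a loop: pass the packet on, no timers.
        steps.append("keep LOOP-ACK header")
        steps.append("send via bus unicast")
    return steps
```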

LOOP-ACK on Destination

If packet with LOOP-ACK extra header is received by destination Device, destination Device MUST send Samp-Loop-Ack-Packet back to the node specified in LOOP-ACK extra header. If destination Device is a non-Retransmitting Device, it will send Samp-Loop-Ack-Packet with Target-Address specified in LOOP-ACK, but to the next hop specified in Root’s Routing Table entry. TODO: is it possible that Device doesn’t have a route to Root yet?

LOOP-ACK and Routing

As LOOP-ACK currently doesn’t support VIA routing, it means that Root MUST ensure that all the nodes on the “loop” route already know the routes without VIA fields; it applies both to the route from the loop beginning to the loop end, and back from the loop end to the loop beginning (as for request-response cycle, LOOP-ACKs go both directions). When speaking about ‘back from the loop end to the loop beginning’, it MUST be taken into account that, as specified above, non-Retransmitting Device will send a Samp-Loop-Ack-Packet in the direction of the Root (but with Target-Address equal to the address from LOOP-ACK extra header), so there MUST be an already-defined route from this next-hop-in-direction-of-Root to the loop beginning.

Multi-Cast Processing

Whenever a Multi-Cast packet (the one with Multiple-Target-Addresses field) is processed by a Retransmitting Device, the procedure is the following:

  • check if one of the addresses within Multiple-Target-Addresses is intended for the Retransmitting Device (TODO: if multiple addresses match the Retransmitting Device - it is a TODO Routing-Error, which should never happen)
    • if it is - process the packet locally (NB: Retransmitting Devices SHOULD schedule processing instead)
    • remove the address of the Retransmitting Device from Multiple-Target-Addresses
      • if Multiple-Target-Addresses became empty - don’t process any further
  • if packet TTL is already equal to 0 - drop the packet and send Routing-Error to the Root (see Time-To-Live section above for details)
  • decrement packet TTL
  • using Routing Table, find next hops for all the Devices on the list of Multiple-Target-Addresses (this search MUST include using VIA field(s) if present, see Uni-Cast Processing above)
  • if at least one of the next hops is not found - send a TODO Routing-Error packet (one packet containing all Routing-Errors for incoming packet) to Root, and continue processing
  • if any of VIA fields in any of the Multiple-Target-Addresses is the same as the next hop - remove all such VIA fields from the Multiple-Target-Addresses
  • find buses for all next hops, forming next-hop-bus-list
  • for each bus on next-hop-bus-list
    • if there is only a single next hop for this bus - send the modified packet to this bus using uni-cast bus addressing
    • if there are multiple next hops for this bus:
      • if the bus supports multi-casting - send the modified packet using multi-cast bus addressing over the bus.
      • otherwise, send the modified packet using uni-cast bus addressing to each of the hops
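
The final bus-selection steps of the procedure above might look like the following sketch. It assumes the next hops have already been resolved to (next hop, bus name) pairs and that the set of multicast-capable buses is known; both inputs and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def plan_multicast_sends(
    next_hops: List[Tuple[int, str]], multicast_buses: Set[str]
) -> List[Tuple[str, str, List[int]]]:
    """Group next hops by bus and pick uni-cast or multi-cast bus addressing."""
    hops_by_bus: Dict[str, List[int]] = defaultdict(list)
    for hop, bus in next_hops:
        if hop not in hops_by_bus[bus]:  # several targets may share a next hop
            hops_by_bus[bus].append(hop)
    sends: List[Tuple[str, str, List[int]]] = []
    for bus, hops in hops_by_bus.items():
        if len(hops) == 1 or bus not in multicast_buses:
            # Single hop, or no multi-cast support: one uni-cast send per hop.
            sends.extend(("unicast", bus, [hop]) for hop in hops)
        else:
            sends.append(("multicast", bus, hops))
    return sends
```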

Promiscuous Mode Processing

Retransmitting Devices SHOULD, wherever possible, listen to all the packets in “promiscuous mode”. It allows for the following processing:

  • if Retransmitting Device hears a packet addressed (at underlying protocol level) to another (“next-hop”) Retransmitting Device (which is not Root), and it has a RETRANSMIT-ON-NO-RETRANSMIT flag in Routing Table for the route entry for that Retransmitting Device, and after a TODO timeout it doesn’t hear a retransmit (neither full nor “partially correct”) by next retransmitting the same packet (TODO define “the same packet”), it MUST try to send a TODO packet to the next-hop Retransmitting Device (in “guaranteed mode”) - receiving Device MUST forward the packet to the destination, and send (or attach as a Combined-Packet if the target is Root) a TODO Routing-Error to the Root. If this attempt by our Retransmitting Device doesn’t succeed - our Retransmitting Device MUST send a TODO Routing-Error packet (containing the packet as a payload) to the Root.

OPTIONAL-EXTRA-HEADERS

Most of SAMP packets have OPTIONAL-EXTRA-HEADERS field. It has a generic structure, but interpretations depend on the packet type. More specifically, OPTIONAL-EXTRA-HEADERS is a sequence of the following items:

  • | GENERIC-EXTRA-HEADER-FLAGS |

    where GENERIC-EXTRA-HEADER-FLAGS is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..2] equal to 2-bit constant GENERIC_EXTRA_HEADER_FLAGS, and further bits interpreted depending on packet type:

    • bit[3]. MORE-PACKETS-FOLLOW flag.
    • bit[4]. If the packet type is Samp-To-Santa-Data-Or-Error-Packet or Samp-Forward-To-Santa-Data-Or-Error-Packet - the bit is IS-ERROR (indicating that PAYLOAD is in fact Routing-Error). If the packet type is Samp-From-Santa-Data-Packet - it is a TARGET-COLLECT-LAST-HOPS flag. For Samp-To-Santa-Data-Or-Error-Packet the bit is IS-LOCAL-ECHO flag. For Samp-Ack-Nack-Packet the bit is IS-NACK flag. For other packet types - RESERVED (MUST be zero)
    • bit[5]. If the packet type is Samp-From-Santa-Data-Packet, the bit is an EXPLICIT-TIME-SCHEDULING flag. For Samp-Ack-Nack-Packet - the bit is IS-LOOP-ACK flag. For other packet types - RESERVED (MUST be zero)
    • bit[6]. RESERVED (MUST be zero)
    • bit[7]. If the packet type is Samp-Unicast-Data-Packet, Samp-From-Santa-Data-Packet, Samp-To-Santa-Data-Or-Error-Packet, or Samp-Forward-To-Santa-Data-Or-Error-Packet - the bit is IS-PROBE flag. For Samp-Ack-Nack-Packet - the bit is DELAYS-PRESENT. For other packet types - RESERVED (MUST be zero)
    • bits [8..] - RESERVED (MUST be zeros)
  • | GENERIC-EXTRA-HEADER-COLLISION-DOMAIN | COLLISION-DOMAIN-ID-AND-FLAG | COLLISION-DOMAIN-T0 | COLLISION-DOMAIN-T1 | ... |

    where GENERIC-EXTRA-HEADER-COLLISION-DOMAIN is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..2] equal to 2-bit constant GENERIC_EXTRA_HEADER_COLLISION_DOMAIN, and bits [3..] equal to DELAY-UNIT; COLLISION-DOMAIN-ID-AND-FLAG is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=0 indicating the end of collision-domain list, bits[1..] being COLLISION-DOMAIN-ID; COLLISION-DOMAIN-T0 and COLLISION-DOMAIN-T1 are Encoded-Unsigned-Int<max=2> fields specifying respectively beginning and end of the window (“from now”) when COLLISION-DOMAIN-ID SHOULD NOT be disturbed. There can be multiple GENERIC-EXTRA-HEADER-COLLISION-DOMAIN headers in the same packet.

    GENERIC-EXTRA-HEADER-COLLISION-DOMAIN is a special kind of header; on receiving it, each node SHOULD take information within into account, and SHOULD NOT transfer over corresponding COLLISION-DOMAIN-ID within specified time window. In addition, whenever Retransmitting Device retransmits such a packet, it MUST calculate NEW-COLLISION-DOMAIN-T0 = MAX(0,OLD-COLLISION-DOMAIN-T0 - INCOMING-LINK-DELAY - OUTGOING-LINK-DELAY) and NEW-COLLISION-DOMAIN-T1 = MAX(0,OLD-COLLISION-DOMAIN-T1 - INCOMING-LINK-DELAY - OUTGOING-LINK-DELAY + INCOMING-LINK-DELAY-ERROR + OUTGOING-LINK-DELAY-ERROR) and use NEW-* values in the retransmitted packet; for calculating OLD-COLLISION-DOMAIN-* parameters DELAY-UNIT field is used, *-LINK-DELAY parameters together with their DELAY-UNITs are taken from corresponding entries in Routing Table; after doing these calculations, if both NEW-COLLISION-DOMAIN-T0 and NEW-COLLISION-DOMAIN-T1 become =0, this specific extra header SHOULD be dropped (i.e. not sent further).
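
The window recalculation above can be written out as a short sketch. It assumes all values have already been converted to a common unit (for example milliseconds) using their respective DELAY-UNITs; the function name and the use of `None` to signal “drop this extra header” are illustrative choices only.

```python
from typing import Optional, Tuple


def retransmit_collision_window(
    old_t0: float,
    old_t1: float,
    incoming_link_delay: float,
    outgoing_link_delay: float,
    incoming_link_delay_error: float,
    outgoing_link_delay_error: float,
) -> Optional[Tuple[float, float]]:
    """Return (NEW-T0, NEW-T1), or None if the extra header should be dropped."""
    link_delay = incoming_link_delay + outgoing_link_delay
    new_t0 = max(0.0, old_t0 - link_delay)
    # The end of the window is widened by the delay errors, to stay conservative.
    new_t1 = max(
        0.0,
        old_t1 - link_delay + incoming_link_delay_error + outgoing_link_delay_error,
    )
    if new_t0 == 0 and new_t1 == 0:
        return None  # window fully elapsed: header is not sent further
    return (new_t0, new_t1)
```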

  • | UNICAST-EXTRA-HEADER-LOOP-ACK | LOOP-ACK-ID |

    where UNICAST-EXTRA-HEADER-LOOP-ACK is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADERS list, bits[1..2] equal to a 2-bit constant UNICAST_EXTRA_HEADER_LOOP_ACK, and bits[3..] representing NODE-ID of the address where to send the LOOP-ACK, and LOOP-ACK-ID is an Encoded-Unsigned-Int<max=2> field representing ID of the LOOP-ACK to be returned. This extra header MUST NOT be present for packets other than Samp-Unicast-Data-Packet.

  • | TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP | CONNECTION_QUALITY |

    where TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] indicating the end of OPTIONAL-EXTRA-HEADER list, bits[1..3] equal to 3-bit constant TOSANTA_EXTRA_HEADER_LAST_INCOMING_HOP, and bits [4..] being node id; and CONNECTION_QUALITY is an Encoded-Unsigned-Int<max=1> bitfield substrate, with bits[0..3] being signal level (with 0 corresponding to the highest and 15 to the lowest signal level) and bits[4..6] being error count (resulting from error correction of the received packet). This extra header MUST NOT be present for packets other than Samp-To-Santa-Data-Or-Error-Packet. There can be multiple TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP extra headers within a single packet.

NB: 2-bit extra header type constants MAY overlap as long as applicable types are different.

SAMP Combined-Packet

In general, SAMP passes SAMP Combined-Packets over underlying protocol. SAMP Combined-Packet consists of one or more SAMP Packets as described below; all SAMP Packets except for the last one in SAMP Combined-Packet have MORE-PACKETS-FOLLOW flag set (depending on the packet type, this flag is either passed as a part of the first field, or as a part of GENERIC-EXTRA-HEADER-FLAGS, see details below).

When combining packets, SAMP MUST take into account both “MTU Hard Limits” and “MTU Soft Limits” of the appropriate SADLP-* protocol.

SAMP Packets

Samp-Unicast-Data-Packet: | SAMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL | OPTIONAL-EXTRA-HEADERS | NEXT-HOP | LAST-HOP | Non-Root-Address | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where SAMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0] equal to 0, bit[1] being GUARANTEED-DELIVERY flag, bit [2] being BACKWARD-GUARANTEED-DELIVERY, bit [3] being EXTRA-HEADERS-PRESENT, bit[4] being DIRECTION-FLAG that is set, if a packet follows from the Root, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT flag is set and is described above; NEXT-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of the next-hop node (based on info from Routing Table), LAST-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of currently transmitting node, Non-Root-Address is a target (recipient) address or a source (sender) address depending on DIRECTION-FLAG and is always a device ID of a communication party other than the Root, OPTIONAL-PAYLOAD-SIZE is present only if optional headers are present and MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field, HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details), PAYLOAD is a payload to be passed to the upper-layer protocol, and FULL-CHECKSUM is a SAMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.
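
The bit layout of SAMP-UNICAST-DATA-PACKET-FLAGS-AND-TTL can be illustrated with a pack/unpack sketch operating on the already-decoded integer value. It assumes bit[0] is the least significant bit; the type and function names are hypothetical.

```python
from typing import NamedTuple


class UnicastFlags(NamedTuple):
    guaranteed_delivery: bool           # bit[1]
    backward_guaranteed_delivery: bool  # bit[2]
    extra_headers_present: bool         # bit[3]
    from_root: bool                     # bit[4], DIRECTION-FLAG
    ttl: int                            # bits[5..]


def pack_unicast_flags_and_ttl(flags: UnicastFlags) -> int:
    """Build the integer value; bit[0] stays 0 for Samp-Unicast-Data-Packet."""
    if flags.ttl < 0:
        raise ValueError("TTL < 0 is not valid")
    return (
        (int(flags.guaranteed_delivery) << 1)
        | (int(flags.backward_guaranteed_delivery) << 2)
        | (int(flags.extra_headers_present) << 3)
        | (int(flags.from_root) << 4)
        | (flags.ttl << 5)
    )


def unpack_unicast_flags_and_ttl(value: int) -> UnicastFlags:
    """Split the integer value back into its flags and TTL."""
    if value & 1:
        raise ValueError("bit[0] must be 0 for Samp-Unicast-Data-Packet")
    return UnicastFlags(
        bool(value & 0x02),
        bool(value & 0x04),
        bool(value & 0x08),
        bool(value & 0x10),
        value >> 5,
    )
```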

If NEXT-HOP field doesn’t match ID of the receiving Device - the packet is ignored.

If a packet is addressed to the Root (DIRECTION-FLAG is not set), it MUST NOT contain VIA fields within; in addition, if the packet is addressed to the Root (DIRECTION-FLAG is not set), the packet MUST NOT have BACKWARD-GUARANTEED-DELIVERY flag set.

If IS-PROBE flag is set, then PAYLOAD is treated differently. When destination receives Samp-Unicast-Data-Packet with IS-PROBE flag set, destination doesn’t pass PAYLOAD to upper-layer protocol. Instead, destination parses PAYLOAD as follows: | PROBE-TYPE | OPTIONAL-PROBE-EXTRA-HEADERS | PROBE-PAYLOAD | where PROBE-TYPE is 1-byte bitfield substrate, with bits [0..2] being either PROBE_UNICAST or PROBE_TO_SANTA, bit[3] being PROBE-EXTRA-HEADERS-PRESENT, and bits [4..7] reserved (MUST be zeros); OPTIONAL-PROBE-EXTRA-HEADERS are similar to OPTIONAL-EXTRA-HEADERS, and PROBE-PAYLOAD takes the rest of the PAYLOAD. If PROBE-TYPE==PROBE_UNICAST, then destination Device sends Samp-Unicast-Data-Packet back to Root, with PAYLOAD copied from PROBE-PAYLOAD, and extra headers formed from PROBE-EXTRA-HEADERS, “as if” this packet is sent in reply to IS-PROBE packet by upper layer, but adding IS-PROBE flag (as a part of GENERIC-EXTRA-HEADER-FLAGS extra header). If PROBE-TYPE==PROBE_TO_SANTA, destination Device sends a Samp-To-Santa-Data-Or-Error-Packet, with PAYLOAD copied from PROBE-PAYLOAD, “as if” the packet is sent in reply to IS-PROBE packet by upper layer, but adding IS-PROBE flag (as a part of GENERIC-EXTRA-HEADER-FLAGS extra header).

Samp-Unicast-Data-Packet is processed as specified in Uni-Cast Processing section above; if GUARANTEED-DELIVERY flag is set, packet is sent in ‘Guaranteed Uni-Cast’ mode. In any case, LAST-HOP field is updated every time the packet is re-sent. Processing at the target node (regardless of node type) consists of passing PAYLOAD to the upper-layer protocol.

When target Device receives the packet, and sends reply back, it MUST set GUARANTEED-DELIVERY flag in reply to BACKWARD-GUARANTEED-DELIVERY flag in original packet; this logic applies to all the packets, including ‘first’ packets in SAGDP “packet chain” (as they’re still sent in reply to some SAMP packet coming from the Root).

If Retransmitting Device receives a “partially correct” Samp-Unicast-Data-Packet, addressed to itself, and it has NACK-PREV-HOP flag set for the source link within Routing Table, it MUST send a Samp-Ack-Nack-Packet (with IS-NACK flag set) back to the source of the packet.

Samp-From-Santa-Data-Packet: | SAMP-FROM-SANTA-DATA-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | LAST-HOP | REQUEST-ID | OPTIONAL-DELAY-UNIT | MULTIPLE-RETRANSMITTING-ADDRESSES | BROADCAST-BUS-TYPE-LIST | Target-Address | OPTIONAL-TARGET-REPLY-DELAY | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where SAMP-FROM-SANTA-DATA-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant SAMP_FROM_SANTA_DATA_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits[5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above, LAST-HOP is an Encoded-Unsigned-Int<max=2> representing node id of the last sender, REQUEST-ID is an Encoded-Unsigned-Int<max=2> field, OPTIONAL-DELAY-UNIT is present only if EXPLICIT-TIME-SCHEDULING flag is present, and is an Encoded-Signed-Int<max=2> field, which specifies units for subsequent DELAY fields (as described below), MULTIPLE-RETRANSMITTING-ADDRESSES is a Multiple-Target-Addresses-With-Extra-Data field described above (with Extra-Data being either empty if EXPLICIT-TIME-SCHEDULING flag is not present, or otherwise Encoded-Unsigned-Int<max=2> DELAY field, using OPTIONAL-DELAY-UNIT field for delay calculations), BROADCAST-BUS-TYPE-LIST is a zero-terminated list of BUS-TYPE+1 values (enum values for BUS-TYPE TBD), Target-Address is described above, OPTIONAL-TARGET-REPLY-DELAY has the same type as DELAY fields (and is absent if EXPLICIT-TIME-SCHEDULING flag is not present), and represents delay for the target Device (also using OPTIONAL-DELAY-UNIT field for delay calculations); OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details), PAYLOAD is a payload to be passed to the upper-layer protocol, and FULL-CHECKSUM is a SAMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

Samp-From-Santa-Data-Packet is a packet sent by Root, which is intended to find a destination which is ‘somewhere around’, but whose exact location is unknown. When Root needs to pass data to a Node for which it has no valid route, Root sends Samp-From-Santa-Data-Packet (or multiple packets) to each of the Retransmitting Devices, in the hope of finding the target Device and passing the packet to it.

Samp-From-Santa-Data-Packet is processed as specified in Multi-Cast Processing section above, up to the point where all the buses for all the next hops are found; note that if Multi-Cast processing generates a Routing-Error, it is not transmitted immediately (see below). Starting from that point, Retransmitting Device processes Samp-From-Santa-Data-Packet as follows:

  • replaces LAST-HOP field with its own node id
  • creates a broadcast-bus-list of its own buses which match BROADCAST-BUS-TYPE-LIST
  • for each bus which is on a next-hop-bus list but not on the broadcast-bus-list - continue processing as specified in Multi-Cast Processing section above
    • transmission MUST NOT be made until time specified in DELAY field for current node, passes. If the time in DELAY field (after subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) using their respective DELAY-UNITs) has already passed - node MUST introduce a random delay uniformly distributed from 0 to NODE-MAX-RANDOM-DELAY parameter (using NODE-MAX-RANDOM-DELAY-UNIT for calculations).
    • right before sending each modified packet - further modify all DELAY fields within MULTIPLE-RETRANSMITTING-ADDRESSES by subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) (using their respective DELAY-UNITs). If resulting value is <0, it is made equal to 0.
  • for each bus which is on the broadcast-bus-list - broadcast modified packet over this bus
    • transmission MUST NOT be made until time specified in DELAY field for current node, passes. If the time in DELAY field (after subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) using their respective DELAY-UNITs) has already passed - node MUST introduce a random delay uniformly distributed from 0 to NODE-MAX-RANDOM-DELAY parameter (using NODE-MAX-RANDOM-DELAY-UNIT for calculations).
    • right before broadcasting each modified packet - further modify all DELAY (including TARGET-REPLY-DELAY) fields within MULTIPLE-RETRANSMITTING-ADDRESSES by subtracting (INCOMING-LINK-DELAY+OUTGOING-LINK-DELAY) (using their respective DELAY-UNITs). If resulting value is <0, it is made equal to 0.
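
The delay handling in the bullets above can be sketched as two small helpers. All values are assumed to be already converted to milliseconds using their respective DELAY-UNITs, and “has already passed” is interpreted here as a remaining time of zero or less; both the interpretation and the function names are assumptions.

```python
import random


def adjusted_delay(delay_ms: float, incoming_link_ms: float, outgoing_link_ms: float) -> float:
    """New DELAY value to write into the retransmitted packet (never below 0)."""
    return max(0.0, delay_ms - (incoming_link_ms + outgoing_link_ms))


def wait_before_transmit(
    delay_ms: float,
    incoming_link_ms: float,
    outgoing_link_ms: float,
    node_max_random_delay_ms: float,
    rng=random,
) -> float:
    """How long this node waits before sending or broadcasting the packet."""
    remaining = delay_ms - (incoming_link_ms + outgoing_link_ms)
    if remaining > 0:
        return remaining  # the explicit slot has not passed yet
    # Slot already passed: fall back to a uniformly distributed random delay.
    return rng.uniform(0.0, node_max_random_delay_ms)
```

Passing a seeded `random.Random` instance as `rng` makes the fallback delay reproducible in tests.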

If Retransmitting Device generates Routing-Error, then it MUST be delayed until time of TARGET-REPLY-DELAY + FORWARD-TO-SANTA-DELAY (using corresponding DELAY-UNITs for calculations). If this time has already passed - Routing-Error is transferred with a random delay (from 0 to NODE-MAX-RANDOM-DELAY, using NODE-MAX-RANDOM-DELAY-UNIT for calculations) from now.

On target Device, Samp-From-Santa-Data-Packet waits until reply payload is ready (which is almost immediately if IS-PROBE is set, including ‘discovery’ packets, see below), then it is processed as follows:

  • if TARGET-REPLY-DELAY (expressed in DELAY-UNITs) has not passed yet, Device waits until it passes
    • if the incoming packet has TARGET-COLLECT-LAST-HOPS flag set (which is normally set for all the packets which have IS-PROBE flag), then target Device traces all the incoming packets addressed to it and having the same REQUEST-ID and makes a list of extra-last-hops consisting of LAST-HOP headers from all of them
    • when sending Samp-To-Santa-Data-Or-Error-Packet reply back, target Device adds LAST-INCOMING-HOP extra header for LAST-HOP within incoming packet, plus LAST-INCOMING-HOP headers for extra-last-hops (if such list exists, see above)

If IS-PROBE flag is set, then PAYLOAD is treated differently. When destination receives Samp-From-Santa-Data-Packet with IS-PROBE flag set, destination doesn’t pass PAYLOAD to upper-layer protocol. Instead, destination processes the packet in the same way as described for the processing of Samp-Unicast-Data-Packet with IS-PROBE flag set. A special case of Samp-From-Santa-Data-Packet with IS-PROBE set is when Target-Address is Root (=0). Such packets (a.k.a. ‘discovery’ packets) are ignored by Root, but are replied to only by Devices which are not paired yet (i.e. have no node id). All such ‘discovery’ packets with Target-Address=0 MUST have IS-PROBE flag set.

Samp-To-Santa-Data-Or-Error-Packet: | SAMP-TO-SANTA-DATA-OR-ERROR-PACKET-NO-TTL | OPTIONAL-EXTRA-HEADERS | SOURCE-ID | REQUEST-ID | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where SAMP-TO-SANTA-DATA-OR-ERROR-PACKET-NO-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant SAMP_TO_SANTA_DATA_OR_ERROR_PACKET, bit[4] being EXTRA-HEADERS-PRESENT, and bits [5..] reserved (MUST be zero); OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above. Note that Samp-To-Santa-Data-Or-Error-Packet doesn’t contain TTL (as it is never retransmitted ‘as is’); SOURCE-ID is an Encoded-Unsigned-Int<max=2> ID of the sender; REQUEST-ID is an Encoded-Unsigned-Int<max=2> field taken from a Samp-From-Santa-Data-Packet being answered, or 0, if current packet is initiated by device itself; OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details); PAYLOAD is either data or error data depending on IS_ERROR flag; if IS_ERROR flag is set - PAYLOAD format is the same as the body (after OPTIONAL-EXTRA-HEADERS) of Samp-Routing-Error-Packet; and FULL-CHECKSUM is a SAMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

If IS-LOCAL-ECHO flag is set, the packet is ignored, except for Retransmitting Devices sending Samp-Ack-Nack-Packet back to LAST-HOP. To avoid “packet storms”, these ACKs MUST be sent using FORWARD-TO-SANTA-DELAY (using FORWARD-TO-SANTA-DELAY-UNIT for calculations). In addition, these ACKs SHOULD contain DELAY-UNIT, DELAY-PASSED, and DELAY-LEFT fields, with DELAY-UNIT being FORWARD-TO-SANTA-DELAY-UNIT, DELAY-PASSED being FORWARD-TO-SANTA-DELAY, and DELAY-LEFT calculated as MAX-FORWARD-TO-SANTA-DELAY - FORWARD-TO-SANTA-DELAY. TODO: add RETRANSMITTING-DEVICE-QUALITY?

Samp-To-Santa-Data-Or-Error-Packet is a packet sent by a Device (either Retransmitting or non-Retransmitting) and intended for Root. It is broadcast by the Device in several cases:

  • when the message is marked as Urgent by upper-layer protocol
  • when Device needs to report Routing-Error to Root when it has found that Root is not directly accessible.
  • when requested to do so via a packet with IS-PROBE flag and PROBE-TYPE==PROBE_TO_SANTA

In any case, if Samp-To-Santa-Data-Or-Error-Packet is sent in response to a Samp-From-Santa-Data-Packet (regardless of the packet being first or not from SAGDP point of view), Device MUST provide TOSANTA-EXTRA-HEADER-LAST-INCOMING-HOP extra header, filling it from the LAST-HOP field of the Samp-From-Santa-Data-Packet.

On receiving Samp-To-Santa-Data-Or-Error-Packet, Retransmitting Device sends a Samp-Forward-To-Santa-Data-Or-Error-Packet towards Root, in ‘Guaranteed Uni-Cast’ mode. To avoid congestion at this point, each Retransmitting Device delays this forwarding by FORWARD-TO-SANTA-DELAY (using FORWARD-TO-SANTA-DELAY-UNIT for calculations), where FORWARD-TO-SANTA-DELAY and FORWARD-TO-SANTA-DELAY-UNIT are values stored locally on the Retransmitting Device.

Samp-Forward-To-Santa-Data-Or-Error-Packet: | SAMP-FORWARD-TO-SANTA-DATA-OR-ERROR-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | NEXT-HOP | FORWARDED-SOURCE-ID | REQUEST-ID | OPTIONAL-PAYLOAD-SIZE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where SAMP-FORWARD-TO-SANTA-DATA-OR-ERROR-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant SAMP_FORWARD_TO_SANTA_DATA_OR_ERROR_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above; NEXT-HOP is an Encoded-Unsigned-Int<max=2> field containing node ID of the next-hop node (based on info from Routing Table), FORWARDED-SOURCE-ID is an Encoded-Unsigned-Int<max=2> ID of the sender; REQUEST-ID is an Encoded-Unsigned-Int<max=2> field; OPTIONAL-PAYLOAD-SIZE is present only if MORE-PACKETS-FOLLOW flag is set, and is an Encoded-Unsigned-Int<max=2> field; HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details); PAYLOAD is data being forwarded (copied from PAYLOAD of Samp-To-Santa-Data-Or-Error-Packet); and FULL-CHECKSUM is a SAMP-CHECKSUM of concatenation of the header (without header checksum) and PAYLOAD.

If NEXT-HOP field doesn’t match ID of the receiving Device - the packet is ignored.

Samp-Forward-To-Santa-Data-Or-Error-Packet is sent by Retransmitting Device when it receives Samp-To-Santa-Data-Or-Error-Packet (with TTL=MAX_TTL-1 to account for original Samp-To-Santa-Data-Or-Error-Packet). On receiving Samp-Forward-To-Santa-Data-Or-Error-Packet by a Retransmitting Device, it is processed as described in Uni-Cast processing section above (with implicit Target-Address being Root), and is always sent in ‘Guaranteed Uni-Cast’ mode.

Samp-Routing-Error-Packet: | SAMP-ROUTING-ERROR-PACKET-AND-TTL | OPTIONAL-EXTRA-HEADERS | ERROR-CODE | HEADER-CHECKSUM | PAYLOAD | FULL-CHECKSUM |

where SAMP-ROUTING-ERROR-PACKET-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant SAMP_ROUTING_ERROR_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT is set, and is described above, ERROR-CODE is an Encoded-Unsigned-Int<max=1> field, HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details), PAYLOAD is TODO, and FULL-CHECKSUM is a full-packet SAMP-CHECKSUM.

On receiving Samp-Routing-Error-Packet, it is processed as described in Uni-Cast processing section above (with implicit Target-Address being Root), and is always sent in ‘Guaranteed Uni-Cast’ mode.

Samp-Ack-Nack-Packet: | SAMP-ACK-NACK-AND-TTL | OPTIONAL-EXTRA-HEADERS | LAST-HOP | Target-Address | NUMBER-OF-ERRORS | ACK-CHECKSUM | HEADER-CHECKSUM | OPTIONAL-DELAY-UNIT | OPTIONAL-DELAY-PASSED | OPTIONAL-DELAY-LEFT |

where SAMP-ACK-NACK-AND-TTL is an Encoded-Unsigned-Int<max=2> bitfield substrate, with bit[0]=1, bits[1..3] equal to a 3-bit constant SAMP_ACK_NACK_PACKET, bit [4] being EXTRA-HEADERS-PRESENT, and bits [5..] being TTL; OPTIONAL-EXTRA-HEADERS is present only if EXTRA-HEADERS-PRESENT flag is set, LAST-HOP is an id of the transmitting node, Target-Address is described above, NUMBER-OF-ERRORS is an Encoded-Unsigned-Int<max=2> field, which contains number of bit-errors observed at PHY level for the packet being acknowledged, ACK-CHECKSUM is copied from FULL-CHECKSUM of the packet being acknowledged (with an exception for NACK generated due to “partially correct” packet, see below), and HEADER-CHECKSUM is a header SAMP-CHECKSUM (see SAMP-CHECKSUM section for details); OPTIONAL-DELAY-UNIT, OPTIONAL-DELAY-PASSED, and OPTIONAL-DELAY-LEFT fields are all Encoded-Unsigned-Int<max=2> fields, all present only if DELAYS-PRESENT flag is set (which is set only in response to packets with IS-LOCAL-ECHO flag set, see above).

NUMBER-OF-ERRORS field lets the receiver provide feedback about connection quality to the sender; it is a normalized number of bit errors which were error-corrected when the packet being acknowledged was received. If error correction is not employed, this field SHOULD be zero. This information SHOULD be used by sending-side PHY level to optimize power consumption.

Samp-Ack-Nack-Packet with IS-LOOP-ACK flag is generated either by destination, or by the node which has found that the next hop already has NEXT-HOP-ACKS flag (see details in ‘Guaranteed Uni-Cast’ section above); generating node always specifies itself as a target. Samp-Ack-Nack-Packet with IS-LOOP-ACK flag MUST NOT have IS-NACK flag.

If Samp-Ack-Nack-Packet has IS-LOOP-ACK flag, it is processed as specified in ‘Uni-cast processing’ section above; Samp-Loop-Ack packet is never sent using ‘Guaranteed uni-cast’ delivery. Processing at the target node (regardless of node type) consists of passing PAYLOAD to the upper-layer protocol.

Samp-Ack-Nack-Packet without IS-LOOP-ACK flag and without IS-NACK flag, is generated as a response to an incoming Samp-Unicast-Data-Packet with GUARANTEED-DELIVERY flag, or in response to a packet with IS-LOCAL-ECHO flag (TODO: anything else?). It is not retransmitted, but taken as an acknowledgement that the packet has been received.

In addition, Samp-Ack-Nack-Packet without IS-LOOP-ACK flag and without IS-NACK flag MAY be generated by receiver in an “unsolicited” manner, i.e. even if ACK has not been requested, to indicate that the received packet has a number of errors which is considered to be “too high” for the underlying PHY level. Such an ACK packet (as well as any other ACK packet with high NUMBER-OF-ERRORS) SHOULD lead to adjustments on the sending side (for example, it MAY lead to an increase in transmission power). Another case for “unsolicited” ACK is for Retransmitting Device, when NUMBER-OF-ERRORS becomes “too low” after being substantially higher, to indicate that the other side is allowed to lower transmission power. In any case, whenever Retransmitting Device sends an “unsolicited” ACK to a non-Retransmitting Device, it SHOULD make sure (from upper-layer protocols) that the receiving non-Retransmitting Device is expected to have its transceiver on.

Samp-Ack-Nack-Packet without IS-LOOP-ACK flag and with IS-NACK flag is generated as a response to a “partially correct” packet (regardless of type and GUARANTEED-DELIVERY flag); in this case, its ACK-CHECKSUM represents only HEADER-CHECKSUM of the original packet. Such Samp-Ack-Nack-Packet is not retransmitted itself, but is taken as an indication to perform quick retransmit of the last packet sent.

Type of Samp packet

As described above, type of Samp packet is always defined by bits [0..3] of the first field (which is always Encoded-Unsigned-Int<max=2> bitfield substrate):

| bit [0] | bits [1..3] | SAMP packet type |
| 0 | ANY (used for other purposes) | Samp-Unicast-Data-Packet |
| 1 | SAMP_FROM_SANTA_DATA_PACKET | Samp-From-Santa-Data-Packet |
| 1 | SAMP_TO_SANTA_DATA_OR_ERROR_PACKET | Samp-To-Santa-Data-Or-Error-Packet |
| 1 | SAMP_FORWARD_TO_SANTA_DATA_OR_ERROR_PACKET | Samp-Forward-To-Santa-Data-Or-Error-Packet |
| 1 | SAMP_ROUTING_ERROR_PACKET | Samp-Routing-Error-Packet |
| 1 | SAMP_ACK_NACK_PACKET | Samp-Ack-Nack-Packet |
| 1 | 3 more values | RESERVED |
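
As an illustration of the table above, the following minimal sketch splits an already-decoded first field into its sub-fields. It assumes that bit [0] is the least significant bit of the decoded Encoded-Unsigned-Int value, and the function name split_first_field is hypothetical. Numeric values of the 3-bit SAMP_* constants are not given in this section, so the sketch returns the raw 3-bit code rather than a packet name.

```python
def split_first_field(value):
    """Split the decoded first field of a SAMP packet into sub-fields.

    Assumption: bit [0] is the least significant bit of the decoded value.
    For bit [0] == 0 (Samp-Unicast-Data-Packet) the remaining bits are used
    for other purposes, so only "bit0" is meaningful in that case.
    """
    return {
        "bit0": value & 0x1,                                 # 0 = unicast, 1 = other types
        "type_code": (value >> 1) & 0x7,                     # bits [1..3], one of SAMP_* constants
        "extra_headers_present": bool((value >> 4) & 0x1),   # bit [4]
        "upper_bits": value >> 5,                            # bits [5..]: TTL or reserved
    }


if __name__ == "__main__":
    # bit0=1, type code 3, EXTRA-HEADERS-PRESENT set, upper bits = 5
    print(split_first_field(1 | (3 << 1) | (1 << 4) | (5 << 5)))
```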

Packet Urgency

From SAMP point of view, all upper-layer-protocol packets can have one of three urgency levels. If the packet has urgency URGENCY_LAZY, it is first sent as a Samp-Unicast-Data-Packet without GUARANTEED-DELIVERY flag (as described above, in case of retries it will be resent with GUARANTEED-DELIVERY). If the packet has urgency URGENCY_QUITE_URGENT, it is first sent as a Samp-Unicast-Data-Packet with GUARANTEED-DELIVERY flag (as described above, in case of retries it will be resent as a Samp-*-Santa-* packet). If the packet has urgency URGENCY_TRIPLE_GALOP, then it is first sent as a Samp-From-Santa-Data-Packet or Samp-To-Santa-Data-Or-Error-Packet (depending on the source being Root or Device, respectively).
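
The mapping from urgency level to the first transmission attempt can be sketched as a small lookup. This is only an illustration: the function name first_transmission and the tuple return format are hypothetical, and retry escalation (described above) is not modelled.

```python
def first_transmission(urgency, source_is_root):
    """Return (packet type, GUARANTEED-DELIVERY flag) for the first send.

    The flag is None where it does not apply (Santa packets).
    """
    if urgency == "URGENCY_LAZY":
        return ("Samp-Unicast-Data-Packet", False)
    if urgency == "URGENCY_QUITE_URGENT":
        return ("Samp-Unicast-Data-Packet", True)
    if urgency == "URGENCY_TRIPLE_GALOP":
        if source_is_root:
            return ("Samp-From-Santa-Data-Packet", None)
        return ("Samp-To-Santa-Data-Or-Error-Packet", None)
    raise ValueError("unknown urgency level: %r" % (urgency,))


if __name__ == "__main__":
    print(first_transmission("URGENCY_LAZY", source_is_root=False))
    print(first_transmission("URGENCY_TRIPLE_GALOP", source_is_root=True))
```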

PHY quality measurement over SAMP

Certain SADLP-* protocols need to measure connection quality. This can be done using the following procedure:

  • Device sends Samp-To-Santa packet with IS-LOCAL-ECHO flag
  • Device waits for any Samp-Ack-Nack packet, validly acknowledging receipt of IS-LOCAL-ECHO packet, OR for 100 milliseconds, whichever comes first
  • If a valid Samp-Ack-Nack packet is received - Device waits only for DELAY-LEFT specified in the packet from the moment of receiving the packet (more strictly: if multiple packets are received, it is maximum of the DELAY-LEFT-received-since-receiving-each-packet + 10ms (safety margin)).
  • While waiting, all the valid Samp-Ack-Nack packets are accounted for (to be used as described in respective SADLP-* document)
  • when wait expires, Device repeats the whole process above; 5 repetitions are usually made to gather required statistics.

This “quality measurement” procedure MAY be performed ONLY if respective SADLP-* document specifies using it, and ONLY under circumstances specified there.
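
The wait rule of a single measurement round can be sketched as follows. This is a minimal model under stated assumptions: all times are in milliseconds, DELAY-LEFT is assumed to be already converted from DELAY-UNIT into milliseconds, and an ACK arriving after the current deadline is assumed to be ignored. The function name wait_deadline_ms and its parameters are hypothetical.

```python
def wait_deadline_ms(sent_at_ms, acks, initial_timeout_ms=100, safety_margin_ms=10):
    """Return the time at which one measurement round stops waiting.

    acks is a list of (receive_time_ms, delay_left_ms) pairs for valid
    Samp-Ack-Nack packets acknowledging the IS-LOCAL-ECHO packet.
    """
    deadline = sent_at_ms + initial_timeout_ms
    got_ack = False
    for received_at, delay_left in sorted(acks):
        if received_at > deadline:
            break  # arrived after we stopped waiting
        candidate = received_at + delay_left + safety_margin_ms
        # The first valid ACK replaces the 100 ms timeout; later ACKs can only extend the wait.
        deadline = candidate if not got_ack else max(deadline, candidate)
        got_ack = True
    return deadline


if __name__ == "__main__":
    print(wait_deadline_ms(0, []))                    # no ACK: plain timeout
    print(wait_deadline_ms(0, [(20, 30), (40, 50)]))  # two ACKs: the later deadline wins
```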

Device Discovery and Pairing over SAMP

Whenever Device is in PRE-PAIRING state (see SmartAnthill Pairing for details on the PRE-PAIRING state), it scans all available channels; if channel is “eligible” (as defined in an appropriate SADLP-* document), the following basic exchange occurs:

  • Device (after, maybe, performing certain preliminary actions on the channel, as defined in an appropriate SADLP-* document) sends Pairing-Ready-Pseudo-Response (described in SmartAnthill Pairing document), as a SACCP packet, addressed to Root. When SACCP packet reaches SAMP level (still on Device side), SAMP doesn’t have a route to Root, so it sends it as a SAMP To-Santa packet.
  • In response, Root will send a Pairing-Pre-Request (as it has no route to Device, it will be sent as a From-Santa SAMP packet)
  • Device will reply with Pairing-Pre-Response (which will be sent as a To-Santa SAMP packet, containing DEVICE-INTRABUS-ID)
  • Up to this point in exchange, all the packets from Root to Device at SAMP level, including optional and not mentioned above Entropy Gathering packets, are always sent as From-Santa packets with Target-Address being ROOT, i.e. broadcast packets. Packets from Device to Root are sent as To-Santa packets.
  • From this point onwards, all the packets from Root to Device at SAMP level are always addressed to specific Device, using non-paired addressing. Packets from Device to Root are still sent as To-Santa packets.
  • Root will proceed with Pairing procedure as described in SmartAnthill Pairing document, still using SAMP From-Santa/To-Santa packets, but from now on From-Santa packets are addressed to specific Device using “non-paired addressing”
  • As soon as Device pairing is completed (and Root sets NODE-ID for the Device), Root SHOULD:
    • calculate optimal route to the Device
    • change Routing Tables for all the Retransmitting Devices alongside the optimal route (for example, using SACCP_PHY_AND_ROUTING_DATA packets as described above)
    • as soon as confirmations from all the Retransmitting Devices about route updates are obtained, Root SHOULD start using Device’s “paired addressing” for all the communications onwards with the Device.
    • change Routing Table on the Device, indicating optimal route to the Root. From this point on, Device will start using usual Unicast packets when communicating with Root (unless there are reasons to use other SAMP packets, for example, on multiple retransmits or for packets marked URGENT).

TODO:

  • merge of To-Santa into Unicast (with NEXT-HOP being -1)?
  • Samp-Retransmit (to next-hop Retransmitting Device on RETRANSMIT-ON-NO-RETRANSMIT)
  • define handling for all “partially correct” packets
  • what exactly is “header” for the purposes of “partially correct” packets? Is “sub-header” worth the trouble?
  • NACK-PREV-HOP into Routing Table Links; RETRANSMIT-ON-NO-RETRANSMIT into RT Routes
  • ?move FORWARD-TO-SANTA-* to links (target ones) too (and specify that it is per-link wherever it is used)
  • procedure for calibration of LINK-DELAYs?
  • optional explicit loop begin (alongside VIA?)

SmartAnthill DLP for RF (SADLP-RF)

Version:v0.4.10

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SADLP-RF provides L2 datalink over simple Radio-Frequency channels which have only an ability to send/receive packets over RF without any addressing. For more complicated RF communications (such as IEEE 802.15.4), different SADLP-* protocols (such as SADLP-802.15.4 described in SmartAnthill DLP for IEEE 802.15.4 (SADLP-802.15.4)) need to be used.

SADLP-RF Design

Assumptions:

  • We’re assuming to operate in a noisy environment. Hence, we need to use forward error correction.
  • error correction level is to be specified by an upper protocol layer for each packet separately (for example, retransmits may use higher error correction levels)
  • We don’t have enough resources to run sophisticated error-correction mechanisms such as Reed-Solomon, Viterbi, etc.
  • Transmissions are rare, hence beacons and frequency hopping are not used
  • upper protocol layer may have some use for packets where only a header is correct; hence, packets with only first portion of the packet being correctable, SHOULD still be passed to upper protocol layer (non-correctable “tail” of the packet MAY be truncated)

SADLP-RF PHY Level

Modulation: 2FSK (a.k.a. FSK without further specialization, and BFSK), or GFSK (2FSK and GFSK are generally compatible), with frequency deviation specified in the table below.

Supported modulation types:

| Name | From (for base frequency) | To (for base frequency) | Tau (*) | SA-Deviation | Receiver filter bandwidth (non-normative) |
| Generic 433 | 433.125 MHz | 434.715 MHz | 1/38400 sec | +-38400 Hz (**) | 4*38400 = 153600 Hz |
| nRF24 | 2401 MHz | 2499 MHz | 1/1000000 sec | +-160 kHz | 1 MHz |

(*) Tau is minimum period with the same frequency during FSK modulation. NB: tau of 1/38400 sec usually, but not necessarily, corresponds to 38400 baud transfer rate as used in RF Module APIs. (TODO: rate negotiation?)

(**) For this modulation type, SA-Deviation uses deviation which is four times larger than theoretically necessary for MSK, to account for not-so-perfect hardware.

Line code always consists of preamble, followed by a sync word, followed by “raw” SADLP-RF Packet as described below. The following parameters are used for the line code depending on the modulation type:

| Name | Preamble | Sync Word | Byte encoding |
| Generic 433 | At least six symbols 0xAA (eight recommended) | 0x0F, 0x33 | MSB-first |
| nRF24 | At least two symbols 0xAA | 0x0F, 0x33 | TBD |

CSMA/CA: enabled (if available)

Typical transceiver chip settings

  • Modulation: GFSK (if available), otherwise 2FSK. GFSK is preferred over 2FSK due to better spectral characteristics.
  • Frequency: see table above
  • Baud rate: see table above
  • FSK Frequency Deviation: see table above
  • Receiver filter bandwidth: see table above

In addition, transceiver chips often provide additional features, which MUST be disabled for SADLP-RF to work properly:

  • CRC: off. Rationale: SADLP-RF uses forward error correction, which significantly improves reliability and reduces power consumption; chip-level CRC would defeat this ability.
  • Manchester encoding: off. Rationale: SADLP-RF uses its own line code, which allows for higher bit rates than Manchester.
  • Whitening: off. Rationale: SADLP-RF provides its own whitening, which is compatible between chips by different manufacturers.
  • Encryption: off. Rationale: SADLP-RF relies on encryption being performed on higher levels; enabling AES at L2 would defeat such features as forward error correction.
  • Shockburst: off. Rationale: Shockburst(tm) enforces CRC, which should be turned off (see above).
  • Auto-ACK: off. Rationale: SAMP provides its own ACK where applicable.

In addition, the following settings SHOULD be used (if supported by transceiver chip):

  • Number of mismatched bits allowed for sync word: 1 Note that this SHOULD be emulated by SADLP-RF HAL if not supported by the chip.
  • CSMA/CA: enabled

SADLP-RF Packets, SCRAMBLING, and Line Codes

FSK modulation used by SADLP-RF does not require AC/DC balance. However, it requires at least one edge per N*tau (to keep synchronization). For example, for a popular RFM69 module, it is recommended to have at least one edge per 16*tau. SADLP-RF approach in this field is twofold:

  • first, there are strict guarantees on the absolute minimum number of edges; however, these absolute-minimum guarantees MAY be weaker than the recommended one edge per 16*tau.
  • second, SADLP-RF protocol uses “Salted-SCRAMBLING” procedure as defined in SmartAnthill SCRAMBLING procedure. If by any chance Salted-SCRAMBLING does violate the 16*tau requirement and the packet is lost (which is extremely unlikely to start with), “Salt” will be changed on the next retransmit and all the bits will be reshuffled, which leads to a very low probability of exceeding 16*tau for the retransmit.

Statistical data (TODO: double-check):

| Run length | Probability to occur in 2600-bit (325-byte) packet |
| 17 | 1.22% |
| 18 | 0.61% |
| 19 | 0.27% |
| 20 | 0.13% |
| 21 | 0.07% |
| 22 | 0.03% |
| 23 | 0.018% |
| 24 | 0.010% |
| 25 | 0.003% |
| 26 | 0.002% |
| 27 | 0.001% |
| 28+ | 0.0005% |

As a run-length of 17 is very unlikely to be fatal, and as the probability of longer run-lengths decreases exponentially, we hope that the described statistical approach will be acceptable in practice.
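
Since the statistical data above is marked for double-checking, one way to cross-check it is an exact calculation under an idealized model. The sketch below assumes perfectly uniform, independent bits (which scrambled data only approximates) and computes the probability that a packet contains at least one run of identical bits of a given length or longer. Its results are not guaranteed to match the table, which may use a different definition (for example, runs of exactly that length). The function name prob_run_at_least is hypothetical.

```python
def prob_run_at_least(n_bits, run_len):
    """Probability that n_bits uniform random bits contain a run of
    identical bits of length >= run_len (dynamic programming over the
    length of the current run)."""
    if run_len <= 1:
        return 1.0 if n_bits >= 1 else 0.0
    if n_bits < run_len:
        return 0.0
    # state[r] = probability that the current run has length r and no
    # run of length run_len has occurred so far (r = 1 .. run_len - 1)
    state = [0.0] * run_len
    state[1] = 1.0
    hit = 0.0
    for _ in range(n_bits - 1):
        nxt = [0.0] * run_len
        for r in range(1, run_len):
            p = state[r]
            if p == 0.0:
                continue
            nxt[1] += p / 2           # next bit differs: a new run starts
            if r + 1 == run_len:
                hit += p / 2          # next bit equal: the run reaches run_len
            else:
                nxt[r + 1] += p / 2   # next bit equal: the run grows
        state = nxt
    return hit


if __name__ == "__main__":
    for length in (17, 20, 24, 28):
        print(length, "%.5f%%" % (100 * prob_run_at_least(2600, length)))
```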

As a result, SADLP-RF does not need any additional line codes, and SADLP-RF Packets MUST be transmitted directly over FSK (after preamble and sync word, as described above).

SADLP-RF MTU Limits

For RF, overly long packets MAY increase the chance of the packet being received incorrectly; this applies (though to a lesser extent) to error-corrected packets.

NB: numbers below are EXTREMELY preliminary, and are subject to change based on real-world experiments

For ENCODING-TYPE=PLAIN16

Hard Limit: 128 bytes. Soft Limit: 64 bytes.

For ENCODING-TYPE=HAMMING-32-CORRECTION

Hard Limit: 256 bytes. Soft Limit: 128 bytes.

For ENCODING-TYPE=HAMMING-32-2D-CORRECTION

Hard Limit: 512 bytes. Soft Limit: 256 bytes.

Non-paired Addressing for RF Buses

Each RF frequency channel on a Device represents a “wireless bus” in terms of SAMP. For “intra-bus address” as a part of “non-paired addressing” (as defined in SmartAnthill Mesh Protocol (SAMP)), RF Devices MUST use a randomly generated 64-bit ID.

If Device uses hardware-assisted Fortuna PRNG (as described in SmartAnthill Random Number Generation and Key Generation document), Device MUST complete Phase 1 of “Entropy Gathering Procedure” (as described in SmartAnthill Pairing document) to initialize Fortuna PRNG before generating this 64-bit ID. Then, Device should proceed to Phase 2 (providing Device ID), and Phase 3 (entropy gathering for key generation purposes), as described in SmartAnthill Pairing document.

PHY-Data-Request and PHY-Data-Response

As described in SmartAnthill Mesh Protocol (SAMP) document, SACCP PHY-AND-ROUTING-DATA packets support PHY-Data-Request and PHY-Data-Response packets. For SADLP-RF, they’re used as described below.

ID-OF-SADLP for SADLP-RF

For SADLP-RF, ID-OF-SADLP is 0x0.

PHY-Data Packets for SADLP-RF

SADLP-RF uses the following PHY-Data Packets:

Fine-Tune-Best-Frequency, going over PHY-Data-Response (sic!) and having SADLP-DEPENDENT-PAYLOAD of: | FREQUENCY-SCHEMA | FREQUENCY | FREQUENCY-WEIGHT | FREQUENCY2 | FREQUENCY-WEIGHT | ... | where FREQUENCY-SCHEMA is an Encoded-Unsigned-Int<max=1> (currently only LINEAR schema is supported), FREQUENCY is an Encoded-Unsigned-Int<max=2> field, FREQUENCY-WEIGHT is an Encoded-Unsigned-Int<max=2>.

Fine-Tune-Best-Frequency-Reply, going over PHY-Data-Request (sic!) and having SADLP-DEPENDENT-PAYLOAD of: | FREQUENCY | where FREQUENCY is an Encoded-Unsigned-Int<max=2> field.

On receiving Fine-Tune-Best-Frequency, Central Controller calculates a “best fit” frequency for the reported graph of FREQUENCY-WEIGHT as a function of FREQUENCY. One example of such a calculation would be to look for the best fit between the obtained graph and a theoretical Gaussian graph; while such a calculation is “too heavy” for the MCU, it can be made on Central Controller easily.
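
The specification leaves the exact “best fit” calculation open. As a deliberately simpler stand-in for the Gaussian fit mentioned above (not the method prescribed by this document), the sketch below takes the weighted centroid of the reported (FREQUENCY, FREQUENCY-WEIGHT) pairs. It assumes frequencies have already been converted from the LINEAR schema into plain numbers; the function name is hypothetical.

```python
def weighted_centroid_frequency(samples):
    """Return the weight-averaged frequency of (frequency, weight) pairs.

    This is a simplified estimator; a Gaussian fit would be less
    sensitive to asymmetric sampling around the peak.
    """
    total_weight = sum(weight for _, weight in samples)
    if total_weight == 0:
        raise ValueError("no usable measurements (all weights are zero)")
    return sum(freq * weight for freq, weight in samples) / total_weight


if __name__ == "__main__":
    print(weighted_centroid_frequency([(100, 1), (200, 3)]))
```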

Device after-Zero-Pairing

For Devices with Zero Pairing, the following procedure is used:

  • From Zero Pairing, Device gets a pre-programmed list of frequencies for “reduced scan”, based on SmartAnthill known-frequency; these frequencies SHOULD be expressed in terms which are convenient for the Device to use; in particular, they SHOULD be recalculated into the form preferred by the Device, and SHOULD be expressed as (start,end,increment). These frequencies MUST be calculated to cover the range from SA-frequency - 2e-4 * SA-frequency to SA-frequency + 2e-4 * SA-frequency, with a step of SA-deviation / 2. Zero Pairing DOES NOT set field ‘preferred-frequency’ for the Device.
  • When Device is turned on for the first time after being programmed with Zero Pairing, it has no preferred-frequency in EEPROM, so it:
    • sets power to -6dB (TODO!: increase if there is no result/very-bad-results at all)
    • takes one of the frequencies from the list of frequencies obtained from Zero Pairing
    • performs SAMP PHY quality measurement (as described in SmartAnthill Mesh Protocol (SAMP) document), with the following clarifications:
      • frequency-quality variable is set to 0
      • measurement is performed over 5 packets sent
      • for each packet sent, there can be multiple packets received (as described in SmartAnthill Mesh Protocol (SAMP))
      • for each packet received, number-of-erroneous-bits (based on data from Hamming decoder) is calculated (if applicable).
      • for each packet received, weight = 2^24 >> number-of-erroneous-bits, is added to frequency-quality
    • repeats the process for another frequency from the list
    • the frequency with the largest frequency-quality becomes first preferred-frequency (up until the Frequency-Fine-Tuning described below).
    • from this point on, Device uses this preferred-frequency
    • Device sends a Fine-Tune-Best-Frequency packet to Central Controller, with all the data gathered from the measurements above
    • Device receives a Fine-Tune-Best-Frequency reply, double-checks it for sanity (TODO: what if insane?), writes received preferred-frequency to EEPROM, and starts to use preferred-frequency for all the subsequent communications
    • Device sends a PHY-Data-Ready-Response (sic!), and receives PHY-Data-Ready-Request (sic!). From this point on, Device is ready to work within the SmartAnthill PAN.
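
The frequency list and the frequency-quality weighting from the procedure above can be sketched as follows. This is a minimal illustration: frequencies are assumed to be integers in Hz, 433.92 MHz is used purely as an example SA-frequency, and all function names are hypothetical. The SAMP measurement itself is not modelled; the sketch only processes per-packet erroneous-bit counts.

```python
def reduced_scan_frequencies(sa_frequency_hz, sa_deviation_hz):
    """Frequencies from SA-frequency - 2e-4*SA-frequency up to
    SA-frequency + 2e-4*SA-frequency, with a step of SA-deviation / 2."""
    span = sa_frequency_hz * 2 // 10_000   # 2e-4 * SA-frequency, integer math
    step = sa_deviation_hz // 2
    low, high = sa_frequency_hz - span, sa_frequency_hz + span
    return list(range(low, high + 1, step))


def packet_weight(erroneous_bits):
    """weight = 2^24 >> number-of-erroneous-bits"""
    return (1 << 24) >> erroneous_bits


def pick_preferred_frequency(measurements):
    """measurements maps frequency -> list of erroneous-bit counts, one per
    received packet. Returns (best frequency, frequency-quality per frequency)."""
    quality = {
        freq: sum(packet_weight(errors) for errors in error_counts)
        for freq, error_counts in measurements.items()
    }
    best = max(quality, key=quality.get)
    return best, quality


if __name__ == "__main__":
    freqs = reduced_scan_frequencies(433_920_000, 38_400)
    print(len(freqs), freqs[0], freqs[-1])
    print(pick_preferred_frequency({freqs[0]: [5], freqs[4]: [0, 1]}))
```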

Device OtA Discovery and Pairing

For Devices with OtA Pairing (as described in SmartAnthill Pairing), “Device Discovery” procedure described in SmartAnthill Mesh Protocol (SAMP) document is used, with the following clarifications:

  • SAMP “channel scan” for SADLP-RF is performed as follows:
    • Device sets power to -6dB (TODO!: increase if there is no result/very-bad-results at all)
    • “candidate channel” list consists of all the frequencies in the range allowed in target area, with a step of SA-deviation / 2.
    • for each of candidate channels:
      • Device performs SAMP PHY quality measurement procedure (with SADLP-RF refinements described in after-Zero-Pairing section), using the range from SA-frequency - 2e-4 * SA-frequency to SA-frequency + 2e-4 * SA-frequency with a step of SA-Deviation / 2. During this measurement, Device SHOULD use data from measurements which have already been performed within this channel scan (effectively using cached measurement data for known frequencies). NB: if this specification is followed as described (taking care with potential rounding errors during calculations), only one frequency scan with a step of SA-Deviation / 2 is performed; i.e. for each new “candidate channel” only one new measurement is performed, and all the other data is taken from the cache.
        • if preferred-frequency can be found (with at least 2^20 - TODO - weight), then:
          • the first packet as described in SAMP “Device Discovery” procedure is sent by Device
          • if a reply is received indicating that Root is ready to proceed with “pairing” - “pairing” is continued over this channel; after pairing is completed - Device performs Fine-Tune-Best-Frequency process and PHY-Data-Ready acknowledgement as described in after-Zero-Pairing section above.
          • if “pairing” fails, then the next available “candidate channel” is processed.
          • to handle the situation when “pairing” succeeds, but Device is connected to wrong Central Controller - Device MUST (a) provide a visual indication that it is “paired”, (b) provide a way (such as jumper or button) allowing to drop current “pairing” and continue processing “candidate channels”. In the latter case, Device MUST process remaining candidate channels before re-scanning.
          • if a reply is received with ERROR-CODE = ERROR_NOT_AWAITING_PAIRING, or if there is no reply within 500 msec, the Device proceeds to the next candidate channel
    • if the list of “candidate channels” is exhausted without “pairing”, the whole “channel scan” is repeated (indefinitely, or with a 5-or-more-minute limit - if the latter, then “not scanning anymore” state MUST be indicated on the Device itself - TODO acceptable ways of doing it, and the scanning MUST be resumed if user initiates “re-pairing” on the Device), starting from an “active scan” as described above

SADLP-RF Packet

SADLP-RF packet has the following format:

| ENCODING-TYPE | SADLP-RF-DATA |

where ENCODING-TYPE is a 1-byte field (see below).

ENCODING-TYPE is an error-correctable field, described by the following table:

| ENCODING-TYPE | Meaning | Value after Hamming Decoding |
| 0x00 | RESERVED (NOT RECOMMENDED) | 0 |
| 0x69 | RESERVED (MANCHESTER-COMPATIBLE) | 1 |
| 0xAA | RESERVED (MANCHESTER-COMPATIBLE) | 2 |
| 0xC3 | PLAIN16-NO-CORRECTION | 3 |
| 0xCC | HAMMING-32-CORRECTION | 4 |
| 0xA5 | RESERVED (MANCHESTER-COMPATIBLE) | 5 |
| 0x66 | RESERVED (MANCHESTER-COMPATIBLE) | 6 |
| 0x0F | RESERVED | 7 |
| 0xF0 | RESERVED | 8 |
| 0x99 | RESERVED (MANCHESTER-COMPATIBLE) | 9 |
| 0x5A | RESERVED (MANCHESTER-COMPATIBLE) | 10 |
| 0x33 | HAMMING-32-2D-CORRECTION | 11 |
| 0x3C | RESERVED | 12 |
| 0x55 | RESERVED (MANCHESTER-COMPATIBLE) | 13 |
| 0x96 | RESERVED (MANCHESTER-COMPATIBLE) | 14 |
| 0xFF | RESERVED (NOT RECOMMENDED) | 15 |

All listed ENCODING-TYPEs have “Hamming Distance” of at least 4 between them. It means that error correction can be applied to ENCODING-TYPE, based on “Hamming Distance”, as described below (for error correction to work, “Hamming Distance” must be at least 3).

ENCODING-TYPE can be considered as a Hamming (7,4) code as described in https://en.wikipedia.org/wiki/Hamming_code, with a prepended parity bit to make it SECDED. Note: implementation is not strictly required to perform Hamming decoding; instead, the following procedure MAY be used for error correction of ENCODING-TYPE:

  • calculate “Hamming Distance” of received ENCODING-TYPE with one of supported values (PLAIN16-NO-CORRECTION, HAMMING-32-CORRECTION, and HAMMING-32-2D-CORRECTION)
  • if “Hamming Distance” is 0 or 1, then we’ve found the error-corrected ENCODING-TYPE
  • otherwise - repeat the process with another supported value
  • if we’re out of supported values - ENCODING-TYPE is beyond repair, and we SHOULD drop the whole packet

To check that “Hamming Distance” of bytes a and b is <=1:

  • calculate d = a XOR b
  • calculate number of 1’s in d
    • if MCU supports this as an asm operation - it is better to use it
    • otherwise, either shift-and-add-if
    • or compare with each of (0,1,2,4,8,16,32,64,128) - if doesn’t match any, “Hamming Distance” is > 1
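
The distance-based correction of ENCODING-TYPE described above can be sketched as follows. The check for “Hamming Distance” <= 1 uses the fact that a byte with at most one bit set is either zero or a power of two, which is equivalent to comparing against (0,1,2,4,8,16,32,64,128). Function and variable names are hypothetical.

```python
# Supported ENCODING-TYPE values, taken from the table above.
SUPPORTED_ENCODING_TYPES = {
    0xC3: "PLAIN16-NO-CORRECTION",
    0xCC: "HAMMING-32-CORRECTION",
    0x33: "HAMMING-32-2D-CORRECTION",
}


def hamming_distance_le1(a, b):
    """True if bytes a and b differ in at most one bit."""
    d = (a ^ b) & 0xFF
    return d & (d - 1) == 0   # zero or a single bit set


def correct_encoding_type(received):
    """Return the error-corrected ENCODING-TYPE, or None if the byte is
    beyond repair (in which case the whole packet SHOULD be dropped)."""
    for value in SUPPORTED_ENCODING_TYPES:
        if hamming_distance_le1(received, value):
            return value
    return None


if __name__ == "__main__":
    for byte in (0xC3, 0xC2, 0xCD, 0x00):
        print(hex(byte), correct_encoding_type(byte))
```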

PLAIN16 Block

PLAIN16 block is always a 16-bit (2-byte) block. It consists of 15 data bits d1..d15, followed by a 16th bit p, where p = ~d15 (inverted d15). p is necessary to provide a strict guarantee that there is at least one bit change in every 16 bits of the data stream. On the receiving side, p is ignored (though if bit-error counter is enabled and p is not equal to ~d15, it SHOULD be counted as a bit-error).

Converting Data Block into a Sequence of PLAIN16 Blocks

To produce PLAIN16-BLOCK-SEQUENCE from DATA-BLOCK, the following procedure is used:

  • PADDED-DATA-BLOCK is formed as | DATA-BLOCK | padding |, where padding is random data (using non-key random stream as specified in SmartAnthill Random Number Generation and Key Generation) with the size necessary to make the bitsize of PADDED-DATA-BLOCK a multiple of 15. NB: Within implementation, PADDED-DATA-BLOCK is usually implemented virtually
  • resulting bit sequence (which has bitsize which is a multiple of 15) is split into 15-bit chunks, and each 15-bit chunk is converted into a 16-bit PLAIN16 block
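
A minimal sketch of this conversion is shown below. It assumes MSB-first bit order within bytes and within each block, and uses Python's secrets module only as a stand-in for the non-key random stream required by the specification. The function name and the pad_bit parameter are hypothetical.

```python
import secrets


def to_plain16_blocks(data, pad_bit=lambda: secrets.randbits(1)):
    """Convert a bytes object into a list of 16-bit PLAIN16 blocks.

    pad_bit is a callable returning one padding bit; the default is a
    stand-in for the non-key random stream named in the specification.
    """
    # Assumption: bits are taken MSB-first from each byte.
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    while len(bits) % 15:
        bits.append(pad_bit())
    blocks = []
    for start in range(0, len(bits), 15):
        chunk = bits[start:start + 15]        # d1..d15
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        p = chunk[-1] ^ 1                     # p = ~d15
        blocks.append((value << 1) | p)
    return blocks


if __name__ == "__main__":
    print([hex(b) for b in to_plain16_blocks(b"\xff\xff", pad_bit=lambda: 0)])
```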

PLAIN16-NO-CORRECTION Packets

For PLAIN16-NO-CORRECTION packets, SADLP-RF-DATA has the following format:

| SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD-PLAIN16 |

where SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD-PLAIN16 is a conversion of SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD into a sequence of PLAIN16 blocks, with SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD obtained by applying Salted-SCRAMBLING procedure (as described in SmartAnthill SCRAMBLING procedure document) to payload from upper layer, and conversion is performed as described above.

In the absolutely worst case for PLAIN16-NO-CORRECTION packets, maximum distance between edges is always <= 15.

HAMM32 block

HAMM32 block is always a 32-bit (4-byte) block. It is built from a Hamming (31,26)-encoded block, where d1..d26 are data bits and p1, p2, p4, p8, p16 are parity bits as described in https://en.wikipedia.org/wiki/Hamming_code, as follows:

| p0 | ~p1 | ~p2 | d1 | ~p4 | d2 | d3 | d4 | ~p8 | d5 | d6 | d7 | d8 | d9 | d10 | d11 | ~p16 | d12 | d13 | d14 | d15 | d16 | d17 | d18 | d19 | d20 | d21 | d22 | d23 | d24 | d25 | d26 |

where ‘~’ denotes bit inversion, and p0 is calculated to make the whole 32-bit HAMM32 parity even (making HAMM32 a SECDED block).

Parity bit inversion is needed to make sure that HAMM32 block can never be all-zeros or all-ones (and simple inversion doesn’t change Hamming Distances, so error correction on the receiving side is essentially the same as for non-inverted parity bits). HAMM32 blocks guarantee that there is at least one change-from-zero-to-one-or-vice-versa at least every 32 bits.
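
The HAMM32 layout above can be sketched as an encoder. The sketch assumes even parity for p1..p16 (as in the referenced Hamming code description), treats the leftmost field of the layout as the most significant bit of the resulting 32-bit value, and takes d1 as the most significant bit of the 26-bit input; these orderings and the function name are assumptions, not part of this specification.

```python
def hamm32_encode(data26):
    """Encode a 26-bit integer into a 32-bit HAMM32 block."""
    if not 0 <= data26 < (1 << 26):
        raise ValueError("data must fit into 26 bits")
    code = [0] * 32                  # index = position; position 0 holds p0
    # Data bits occupy positions 1..31 which are not powers of two.
    data_positions = [pos for pos in range(1, 32) if pos & (pos - 1)]
    for i, pos in enumerate(data_positions):
        code[pos] = (data26 >> (25 - i)) & 1       # d1 is the MSB (assumption)
    for k in (1, 2, 4, 8, 16):
        parity = 0
        for pos in range(1, 32):
            if pos & k and pos != k:
                parity ^= code[pos]
        code[k] = parity ^ 1                       # parity bits are stored inverted
    code[0] = sum(code[1:]) & 1                    # p0 makes overall parity even
    value = 0
    for bit in code:
        value = (value << 1) | bit
    return value


if __name__ == "__main__":
    print(hex(hamm32_encode(0)))
    print(hex(hamm32_encode((1 << 26) - 1)))
```

With these assumptions, neither the all-zeros nor the all-ones data word produces an all-zeros or all-ones block, which is the property the parity inversion is meant to provide.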

Converting Data Block into a Sequence of HAMM32 Blocks

To produce HAMM32-BLOCK-SEQUENCE from DATA-BLOCK, the following procedure is used:

  • PADDED-DATA-BLOCK is formed as | DATA-BLOCK | padding |, where padding is random data (using non-key random stream as specified in SmartAnthill Random Number Generation and Key Generation) with the size necessary to make the bitsize of PADDED-DATA-BLOCK a multiple of 26. NB: Within implementation, PADDED-DATA-BLOCK is usually implemented virtually
  • resulting bit sequence (which has bitsize which is a multiple of 26) is split into 26-bit chunks, and each 26-bit chunk is converted into a 32-bit HAMM32 block
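
The padding-and-splitting step above might look like this in outline. The helper name `to_26bit_chunks`, the MSB-first bit order, and the use of Python's `secrets.randbits` as a stand-in for the non-key random stream are assumptions made for illustration only.

```python
import secrets


def to_26bit_chunks(data, randbits=secrets.randbits):
    """Split data into 26-bit integers, padding the tail with random bits."""
    nbits = len(data) * 8
    pad = (-nbits) % 26  # bits needed to reach a multiple of 26
    value = int.from_bytes(data, "big")
    if pad:
        # Stand-in for the non-key random stream (assumption).
        value = (value << pad) | randbits(pad)
    total = nbits + pad
    mask = (1 << 26) - 1
    return [(value >> (total - 26 * (i + 1))) & mask for i in range(total // 26)]
```

For a 50-byte DATA-BLOCK this yields 16 chunks, the last of which carries 10 data bits followed by 16 padding bits.
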
HAMMING-32-CORRECTION Packets

For HAMMING-32-CORRECTION packets, SADLP-RF-DATA is | SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD-HAMM32 |

where SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD-HAMM32 is a conversion of SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD into a sequence of HAMM32 blocks, with SALTED-SCRAMBLED-UPPER-LAYER-PAYLOAD obtained by applying Salted-SCRAMBLED procedure (as described in SmartAnthill SCRAMBLING procedure document) to payload from upper layer, and conversion is performed as described above.

In the absolutely worst case for HAMMING-32-CORRECTION packets, maximum distance between edges is always <= 39. However, given Salted-SCRAMBLING, it is statistically MUCH better than that.

HAMMING-32-2D-CORRECTION Packets

HAMMING-32-2D-CORRECTION is similar to HAMMING-32-CORRECTION, with an additional field of 2D-HAMM32 being added.

2D-HAMM32 consists of 26 additional Hamming checksums; each Hamming checksum #i consists of N parity bits of Hamming code, calculated over all bits #i of the 26-bit data portions of the HAMM32 blocks forming UPPER-LAYER-PAYLOAD-HAMM32. N is the number of Hamming parity bits necessary to provide error correction for NN data bits, where NN=NUMBER-OF-HAMM32-BLOCKS. Hamming checksums are encoded as a bitstream, without intermediate padding, but padded at the end to a byte boundary with random (non-key-stream) data.

For example, if original block is 50 bytes long, then it will be split into 16 26-bit blocks, which will be encoded as 16 HAMM32 blocks (to form UPPER-LAYER-PAYLOAD-HAMM32); then, for HAMMING-32-2D-CORRECTION, additional 26 Hamming checksums (5 bits each, as for NN=16 N=5) will be added, occupying 26*5=130 bits, or 17 bytes after padding. Therefore, original 50 bytes will be encoded as 4*16+17=81 bytes (62% overhead).
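
The arithmetic of this example can be checked with a short sketch. It assumes that N is the smallest value satisfying the usual single-error-correcting Hamming bound 2^N >= NN + N + 1, which reproduces N=5 for NN=16; the function names are illustrative.

```python
def hamming_parity_bits(data_bits):
    """Smallest N with 2**N >= data_bits + N + 1 (assumed definition of N)."""
    n = 0
    while (1 << n) < data_bits + n + 1:
        n += 1
    return n


def hamm32_2d_encoded_size(payload_bytes):
    """Encoded size in bytes for HAMMING-32-2D-CORRECTION."""
    blocks = (payload_bytes * 8 + 25) // 26    # number of HAMM32 blocks (NN)
    n = hamming_parity_bits(blocks)            # parity bits per checksum
    checksum_bytes = (26 * n + 7) // 8         # 26 checksums, byte-padded
    return 4 * blocks + checksum_bytes
```

Here `hamm32_2d_encoded_size(50)` evaluates to 4*16+17, matching the figure above.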

SmartAnthill DLP for IEEE 802.15.4 (SADLP-802.15.4)

Version:v0.1.1

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SADLP-802.15.4 provides L2 datalink over IEEE 802.15.4, suitable for use by SmartAnthill protocol stack in general, and by SAMP (see SmartAnthill Mesh Protocol (SAMP) for details) in particular.

All references to IEEE 802.15.4 standard in this document, marked as [802.15.4], refer to IEEE Std 802.15.4(tm)‐2011, “IEEE Standard for Local and metropolitan area networks—Part 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs)” (which can be obtained at http://standards.ieee.org/getieee802/download/802.15.4-2011.pdf ).

*NB: to be taken with a grain of salt. If there are any issues with implementing it - let’s discuss.*

NB2: current version of SADLP-802-15-4 is very basic, and doesn’t utilize some potentially useful features of IEEE 802.15.4; this is intended to be corrected in the future versions of SADLP-802-15-4. Such features include: uni-cast packets where applicable, security and ACK at 802.15.4 level.

IEEE 802.15.4 Topology, PANs, and Beacons

Current version of SADLP-802.15.4 uses peer-to-peer topology as defined in [802.15.4]. The node with the shortest SAMP distance from Root Node (TODO: define better) becomes “PAN coordinator” as defined in [802.15.4]. Normally, all the 802.15.4 Devices within one SmartAnthill network represent one single 802.15.4 PAN; however, in some cases (different channel frequencies or different collision domains) 802.15.4 Devices within the same SmartAnthill network MAY belong to different 802.15.4 PANs (therefore using different 802.15.4 PAN IDs, and having different PAN coordinators). PAN ID for this PAN (these PANs where applicable) SHOULD be chosen according to [802.15.4].

Current version of SADLP-802.15.4 doesn’t use beacons. I.e. the 802.15.4 PAN is nonbeacon-enabled, as defined in [802.15.4].

Non-paired Addressing for IEEE 802.15.4 Buses

Each IEEE 802.15.4 frequency channel on a Device represents a “wireless bus” in terms of SAMP. For “intra-bus address” as a part of “non-paired addressing” (as defined in SmartAnthill Mesh Protocol (SAMP)), IEEE 802.15.4 Devices MUST use 64-bit IEEE 802.15.4 extended address as defined in [802.15.4].

Device Discovery and Pairing

For Devices with OtA Pairing (as described in SmartAnthill Pairing), “Device Discovery” procedure described in SmartAnthill Mesh Protocol (SAMP) document is used, with the following clarifications:

  • SAMP “channel scan” for SADLP-IEEE-802-15-4 is performed as follows:
    • an “active scan” as described in [802.15.4] is performed, and a list of candidate channels is obtained
    • for each of candidate channels:
      • “Association” is performed as described in [802.15.4]
      • the first packet as described in SAMP “Device Discovery” procedure is sent by Device
      • if a reply is received indicating that Root is ready to proceed with “pairing” - “pairing” is continued over this channel
        • if “pairing” fails, then the next available “candidate channel” is processed.
        • to handle the situation when “pairing” succeeds, but Device is connected to the wrong Central Controller - Device MUST (a) provide a visual indication that it is “paired”, and (b) provide a way (such as a jumper or button) to drop the current “pairing” and continue processing “candidate channels”. In the latter case, Device MUST process remaining candidate channels before re-scanning.
      • if a reply is received with ERROR-CODE = ERROR_NOT_AWAITING_PAIRING, or if there is no reply within 500 msec, “Disassociation” is performed as described in [802.15.4], and the procedure is repeated for the next candidate channel
    • if the list of “candidate channels” is exhausted without “pairing”, the whole “channel scan” is repeated (indefinitely, or with a 5-or-more-minute limit - if the latter, then “not scanning anymore” state MUST be indicated on the Device itself - TODO acceptable ways of doing it, and the scanning MUST be resumed if user initiates “re-pairing” on the Device), starting from an “active scan” as described above
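
One pass of the “channel scan” above could be organised roughly as follows. The `radio` object and its methods (`active_scan`, `associate`, `send_discovery`, `pair`, `disassociate`) are hypothetical placeholders for whatever the 802.15.4 stack provides; the sketch does not cover the user-initiated drop of a wrong “pairing”, and it assumes that “Disassociation” is also performed when “pairing” itself fails.

```python
from enum import Enum


class Reply(Enum):
    READY_TO_PAIR = 1
    NOT_AWAITING_PAIRING = 2   # ERROR_NOT_AWAITING_PAIRING
    NO_REPLY = 3               # nothing received within the timeout


REPLY_TIMEOUT_MS = 500


def channel_scan_pass(radio):
    """Run one pass over the candidate channels.

    Returns the channel on which pairing succeeded, or None if the
    candidate list was exhausted (the caller then repeats the scan).
    """
    for channel in radio.active_scan():
        radio.associate(channel)
        reply = radio.send_discovery(channel, timeout_ms=REPLY_TIMEOUT_MS)
        if reply is Reply.READY_TO_PAIR and radio.pair(channel):
            return channel
        # Assumption: disassociate on pairing failure too, not only on
        # ERROR_NOT_AWAITING_PAIRING or timeout.
        radio.disassociate(channel)
    return None
```
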
IEEE 802.15.4 MAC Frames

Each of SADLP-802-15-4 upper-layer-protocol packets (usually SAMP packets) is transmitted as 802.15.4 MAC frame (as described in [802.15.4] 5.2.1) with the following field values:

  • Frame Control Field:
    • Frame Type = ‘Data Frame’
    • Security Enabled = false
    • Frame Pending = <depending on more data available; when required so by [802.15.4] 5.2.1.1.3, shall be set to zero>
    • Acknowledgment Request = false
    • PAN ID Compression = false
    • Destination Addressing Mode = 0b10 (‘16-bit’)
    • Frame Version = 0x01 (‘post-2003 802.15.4’)
    • Source Addressing Mode = 0b00 (‘not present’)
  • Sequence Number Field: current value of macDSN, as specified in [802.15.4] 5.2.2.2.1
  • Destination PAN Identifier: 0xffff
  • Destination Address: 0xffff
  • Source PAN Identifier/Source Address: not present
  • Auxiliary Security Header: not present
  • Frame Payload: SADLP payload (TODO: exactly or some massaging is needed?)
  • FCS: 16-bit ITU-T CRC, as specified in [802.15.4]

That is, current version of the SADLP-802-15-4 sends all the data as unacknowledged (acknowledgement is handled by SAMP), insecure (security is provided by SASP), broadcast IEEE 802.15.4 data frames.
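
As an illustration of the field values listed above, the following sketch assembles the MAC header and payload of such a frame. The Frame Control bit positions follow the general MAC frame format of [802.15.4]; the function names are illustrative, Frame Pending is exposed as a parameter, and the FCS is not computed here on the assumption that it is appended by the transceiver or a lower layer.

```python
import struct

FRAME_TYPE_DATA = 0b001
ADDR_MODE_NOT_PRESENT = 0b00
ADDR_MODE_SHORT = 0b10
FRAME_VERSION = 0x01
BROADCAST = 0xFFFF


def frame_control(frame_pending=False):
    """Build the 16-bit Frame Control field for a SADLP-802.15.4 data frame."""
    fcf = FRAME_TYPE_DATA                 # bits 0-2: Frame Type
    # bit 3: Security Enabled = 0
    fcf |= int(frame_pending) << 4        # bit 4: Frame Pending
    # bit 5: Acknowledgment Request = 0; bit 6: PAN ID Compression = 0
    fcf |= ADDR_MODE_SHORT << 10          # bits 10-11: Destination Addressing Mode
    fcf |= FRAME_VERSION << 12            # bits 12-13: Frame Version
    fcf |= ADDR_MODE_NOT_PRESENT << 14    # bits 14-15: Source Addressing Mode
    return fcf


def build_frame(seq, payload, frame_pending=False):
    """Return header + payload (without FCS); multi-byte fields are little-endian."""
    header = struct.pack(
        "<HBHH",
        frame_control(frame_pending),
        seq & 0xFF,       # Sequence Number (macDSN)
        BROADCAST,        # Destination PAN Identifier
        BROADCAST,        # Destination Address
    )
    return header + payload
```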

Mapping to specific APIs

Mapping to specific APIs implementing IEEE 802.15.4 is beyond the scope of this document. Any implementation which produces MAC frames with the fields above should be acceptable.

SmartAnthill SCRAMBLING procedure

Version:v0.5.5

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

SmartAnthill SCRAMBLING procedure aims to provide some extra protection when data is transmitted in the open (in particular, over wireless or over the Internet). SCRAMBLING procedure does not provide security guarantees in a strict sense, but might hide certain details (such as source/destination addresses) and does help against certain classes of DoS attacks. Because of the lack of security guarantees, SCRAMBLING procedure SHOULD NOT be used as a sole encryption protocol (using it over SASP is fine).

SCRAMBLING procedure requires both sides to share one AES-128 key. SCRAMBLING key MUST be separate and independent from any other key in the system, in particular, from SASP keys.

Within SmartAnthill protocol stack, to allow Devices to be moved arbitrarily, SCRAMBLING key is the same for the whole SmartAnthill PAN. This is unlike SASP keys, which are unique per-device.

SCRAMBLING procedure is intended to be used as the outermost packet wrapper which is possible for an underlying protocol. Within SmartAnthill Protocol Stack, SCRAMBLING procedure is OPTIONALLY used by SAoIP protocol as described in SmartAnthill-over-IP Protocol (SAoIP) and SmartAnthill Router document. In addition, SADLP-* protocols, especially those working over wireless L1 protocols, SHOULD use SCRAMBLING procedure to hide as much information as possible.

Environment

SCRAMBLING procedure is a procedure of taking an input packet of arbitrary size, and producing a “scrambled” packet. It is used both by SAoIP and some of SADLP-*.

SCRAMBLING procedure requires both sides to share one secret AES-128 key. SCRAMBLING key MUST be independent from any other key in the system, in particular, from SASP key.

For SCRAMBLING procedure to be effective (in the security sense), caller SHOULD guarantee that there is a 12-byte block within input packet which is at least statistically unique and is statistically indistinguishable from white noise. Offset of such a block within the packet is the input unique-block-offset parameter for SCRAMBLING procedure. In practice, SASP tag can be used for this purpose.

SCRAMBLING procedure
OPTIONAL Salted-SCRAMBLING

For some of the SADLP-* protocols, desirable properties MAY include such things as AC/DC balance, or a (weaker) requirement to have at least one switch between different bits per N bits in bitstream. These requirements MAY be satisfied statistically (with the idea behind being similar to 64b/66b encoding).

SADLP-* protocols MAY specify Salted-SCRAMBLING procedure, which enables such statistical properties. Given that SCRAMBLING procedure uses industrial-grade encryption, normally, one-byte “salt” is sufficient to ensure that at least one of the packets will have desirable properties.

When Salted-SCRAMBLING is used, “Salt” field MUST be populated by random data:

  • this random data MUST come from non-keystream RNG, see SmartAnthill SmartAnthill Random Number Generation and Key Generation for details
  • if non-keystream RNG is not completely initialized yet at the moment when Salted-SCRAMBLING needs to be used, then:
    • No data from not-initialized (or partially-initialized) RNG can ever be used
    • Instead, Salted-SCRAMBLING procedure MUST use incremented counter; this counter SHOULD be pre-initialized with 0x33 on each system start.
    • as soon as non-keystream RNG is completely initialized, it MUST be used instead of the incremented counter.

If Salted-SCRAMBLING is used, then for ANY retransmit (this MUST include SAMP retransmits and SAGDP retransmits), new value of “Salt” MUST be generated as described above, and the packet MUST be re-SCRAMBLED. This is necessary to ensure that if violation of desirable properties has occurred, it will be statistically healed on the next run.
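
The “Salt” selection rules above can be sketched as a small helper that is asked for a fresh value on every transmit and retransmit. The class name `SaltSource` and the two callables it receives are hypothetical; the sketch also assumes that the fallback counter returns 0x33 as its first value and wraps around at one byte.

```python
class SaltSource:
    """Produces one-byte salts, falling back to a counter until the RNG is ready."""

    def __init__(self, rng_ready, rng_byte):
        self._rng_ready = rng_ready    # callable: is the non-keystream RNG initialized?
        self._rng_byte = rng_byte      # callable: one byte from the non-keystream RNG
        self._counter = 0x33           # pre-initialized on each system start

    def next_salt(self):
        if self._rng_ready():
            return self._rng_byte() & 0xFF
        # No data from a not-yet-initialized RNG is used; take the counter instead.
        salt = self._counter
        self._counter = (self._counter + 1) & 0xFF
        return salt
```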

Input

Input of SCRAMBLING procedure is a pre-SCRAMBLING packet and a unique-block-offset parameter. The pre-SCRAMBLING packet can be considered as follows:

| unencrypted-pre-SCRAMBLING-Data | encrypted-pre-unique-pre-SCRAMBLING-Data | encrypted-unique-block | encrypted-post-unique-pre-SCRAMBLING-Data |

where encrypted-unique-block is always 12 bytes in size, its offset from the beginning of the packet is specified by unique-block-offset input parameter, and each of encrypted-pre-unique-pre-SCRAMBLING-Data and encrypted-post-unique-pre-SCRAMBLING-Data can have 0 size.

unique-block-offset+12 MUST be within pre-SCRAMBLING-Data.

Procedure

SCRAMBLING procedure works as follows:

  1. Form SCRAMBLING-Header according to formatting schema (Default schema is described below, but SADLP-* implementations are allowed to define their own schemas if necessary).

SCRAMBLING-Header, regardless of formatting schema, MUST specify Scrambled-Size and Forced-Padding-Size parameters. Scrambled-Size is a number of 16-byte blocks which were scrambled; 16*Scrambled-Size MUST be >= size of SCRAMBLING-Header. For security purposes, sender MAY scramble more bytes (and respectively specify Scrambled-Size) than strictly necessary. However, sender MUST NOT specify Scrambled-Size so that 16*Scrambled-Size is more than sizeof(SCRAMBLING-Header)+sizeof(encrypted-pre-unique-pre-SCRAMBLING-Data)+sizeof(encrypted-post-unique-pre-SCRAMBLING-Data)+15; otherwise, receiver MUST treat it as a malformed packet.

  2. Form pre-SCRAMBLED packet which has the following format:

| encrypted-unique-block | OPTIONAL-SALT | SCRAMBLING-Header | encrypted-pre-unique-pre-SCRAMBLING-Data | encrypted-post-unique-pre-SCRAMBLING-Data | Optional-Forced-Padding |

where OPTIONAL-SALT is a 1-byte field, present only if Salted-SCRAMBLING is specified by the protocol which uses SCRAMBLING procedure (if non-Salted-SCRAMBLING is specified, SALT is presumed to be equal to 0), Optional-Forced-Padding is optional forced padding, which has size of Forced-Padding-Size parameter from SCRAMBLING-Header. Forced-Padding, if present, MUST be generated using SmartAnthill Non-Key Random Stream (which is described in SmartAnthill SmartAnthill Random Number Generation and Key Generation).

  3. Encrypt a portion of pre-SCRAMBLED packet, starting from SCRAMBLING-Header, and with length of Scrambled-Size*16 (as specified in SCRAMBLING-Header), using AES-128 in CTR mode, using SCRAMBLING key, and using ( encrypted-unique-block << 32 ) | ( SALT << 24 ) as initial counter for CTR. CTR mode, combined with statistical-uniqueness requirement for unique-block, ensures that SCRAMBLED data is indistinguishable from white noise for a potential attacker. NB: size of ( encrypted-unique-block << 32 ) | ( SALT << 24 ) is 128 bits, or one AES-128 block. NB2: technically, this construct restricts the size of data being SCRAMBLED to 16*2^24~=256 Mbytes; it is many orders of magnitude larger than any practical packets may reasonably contain.

If 16*Scrambled-Size goes beyond encrypted-post-unique-pre-SCRAMBLING-DATA, remaining SCRAMBLING bytes are ignored; due to requirement on Scrambled-Size stated above, number of such ignored bytes cannot exceed 15.
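
Two pieces of the procedure above lend themselves to a short sketch: construction of the initial CTR counter block, and the bounds that Scrambled-Size has to satisfy. AES itself is not shown. The function names and the big-endian interpretation of the 12-byte unique block are assumptions of this sketch.

```python
def initial_ctr_block(unique_block, salt=0):
    """( encrypted-unique-block << 32 ) | ( SALT << 24 ) as a 16-byte block."""
    if len(unique_block) != 12:
        raise ValueError("encrypted-unique-block must be 12 bytes")
    if not 0 <= salt <= 0xFF:
        raise ValueError("SALT is a 1-byte field")
    # Assumption: the unique block is read as a big-endian integer.
    value = (int.from_bytes(unique_block, "big") << 32) | (salt << 24)
    return value.to_bytes(16, "big")


def scrambled_size_is_valid(scrambled_size, header_len, pre_len, post_len):
    """Check the lower and upper limits on 16*Scrambled-Size."""
    scrambled_bytes = 16 * scrambled_size
    return header_len <= scrambled_bytes <= header_len + pre_len + post_len + 15
```

The low 24 bits of the counter block start at zero, which is what leaves room for 2^24 16-byte blocks before the counter would run into the SALT byte.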

Default SCRAMBLING-Header Schema

Default SCRAMBLING-Header Schema assumes that the size of encrypted-post-unique-pre-SCRAMBLING-Data is always zero (and that therefore unique-block-offset parameter is always equal to pre_SCRAMBLING_packet_size-12). This occurs when (a) SASP tag is located at the very end of the SASP packet (which is always the case for SASP as described in SmartAnthill Security Protocol (SASP) document), and (b) all protocols below SASP and above the protocol which uses SCRAMBLING procedure add only headers, and not trailers.

If the size of encrypted-post-unique-pre-SCRAMBLING-Data is always zero, it means that there is no need to send unique-block-offset over the wire, as it can always be calculated on receiving side. Therefore, Default SCRAMBLING-Header Schema is defined as follows:

| Forced-Padding-Flag-And-Scrambled-Size | Optional-Forced-Padding-Size | unencrypted-pre-SCRAMBLING-Data |

where Forced-Padding-Flag-And-Scrambled-Size is an Encoded-Unsigned-Int<max=2> field, which acts as a substrate for bitfields Forced-Padding-Flag (takes bit [0]), and Scrambled-Size (takes bits [1..]), and Optional-Forced-Padding-Size is an Encoded-Unsigned-Int<max=2> field which is present only if Forced-Padding-Flag is equal to 1.
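
The bitfield layout of Forced-Padding-Flag-And-Scrambled-Size can be illustrated as below. The sketch only packs and unpacks the integer value, taking bit [0] to be the least significant bit (an assumption); serialization of that value as Encoded-Unsigned-Int<max=2> is not shown, and the function names are illustrative.

```python
def pack_flag_and_size(forced_padding, scrambled_size):
    """Combine Forced-Padding-Flag (bit 0) and Scrambled-Size (bits 1 and up)."""
    if scrambled_size < 0:
        raise ValueError("Scrambled-Size must be non-negative")
    return (scrambled_size << 1) | int(bool(forced_padding))


def unpack_flag_and_size(value):
    """Return (Forced-Padding-Flag, Scrambled-Size) from the substrate value."""
    return bool(value & 1), value >> 1
```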

DESCRAMBLING

Processing of a SCRAMBLED packet (“DESCRAMBLING”) is performed in reverse order compared to SCRAMBLING procedure.

“Streamed” SCRAMBLING (ON HOLD)

NB: “Streamed” SCRAMBLING is not currently used; MAY be reinstated when/if SAoTCP is reinstated

There are cases where SCRAMBLED data is intended to be sent over a stream (such as TCP stream), rather than in individual datagrams. In such cases, “Streamed” SCRAMBLING may be used. “Streamed” SCRAMBLING differs from SCRAMBLING procedure above in the following details:

  • when SCRAMBLING-Header is formed, it includes Whole-Packet-Size (as the very first field), followed by all the fields specified in SCRAMBLING procedure above.

where Whole-Packet-Size is an Encoded-Unsigned-Int<max=2> field, representing the whole packet size (excluding forced-padding if any).

As even Whole-Packet-Size is scrambled, the whole stream looks like white noise (NB: some information can still be extracted by an attacker from timing and division of the stream into packets).

To ensure proper error recovery, receiving side of “Streamed”-SCRAMBLED stream MUST forcibly break an underlying stream (such as TCP connection) as soon as any of the de-SCRAMBLING operations for packets received over this underlying connection fail (this includes size field exceeding its “max=” size).

TODO: forced-padding (incl. random-size padding)

SmartAnthill SmartAnthill Random Number Generation and Key Generation

Version:v0.1.5c

IMPORTANT: This document is obsolete. Please DO NOT modify it. Please refer to SimpleIoT Random Number Generation and Key Generation for an up to date version.

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

Random Number Generation is vital for ensuring security. This document describes requirements for Random Number Generation for SmartAnthill Devices.

Poor-Man’s PRNG

Each device with Poor-Man’s PRNG has its own AES-128 secret key (this key MUST NOT be stored outside of the device), and additionally keeps a counter. This counter MUST be kept in a way which guarantees that the same value of the counter is never reused; this includes both having counter of sufficient size, and proper commits to persistent storage to avoid re-use of the counter in case of accidental device reboot. As for commits to persistent storage - two such implementations are discussed in SmartAnthill Security Protocol (SASP) document, in ‘Implementation Details’ section, with respect to storing nonces.

Then, Poor-Man’s PRNG simply encrypts current value of the counter with AES-128, increments counter (see note above about guarantees of no-reuse), and returns encrypted value of the counter as next 16 bytes of the random output.
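
The structure of Poor-Man’s PRNG can be sketched as follows. Since the Python standard library has no AES, the block cipher is passed in as a hypothetical `encrypt_block` callable (expected to perform AES-128 encryption of one 16-byte block under the device key); `persist_counter` is likewise a hypothetical hook for committing the counter to persistent storage. Committing the incremented counter before the output is released, and the big-endian counter encoding, are choices of this sketch.

```python
class PoorMansPrng:
    """Counter-based generator: output block = E_key(counter)."""

    def __init__(self, encrypt_block, counter, persist_counter):
        self._encrypt = encrypt_block      # hypothetical AES-128 single-block encryption
        self._counter = counter            # must never repeat across reboots
        self._persist = persist_counter    # hypothetical persistent-storage commit

    def next_block(self):
        block = self._counter.to_bytes(16, "big")
        self._counter += 1
        # Commit first, so that a reboot cannot lead to reuse of this counter value.
        self._persist(self._counter)
        return self._encrypt(block)
```

A real implementation would typically commit less often (for example by reserving counter ranges), as discussed for nonces in the SASP document.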

Devices with uniquely-pre-initialized Poor-Man’s PRNG

Resource-constrained SmartAnthill Devices which don’t have their own crypto-safe RNG, MUST use Poor-Man’s PRNG. On such Devices, Poor-Man’s PRNG MUST be pre-populated during Device manufacturing, with a random key and random initial counter, generated outside of Device. Both key and counter MUST be crypto-safe random numbers, and MUST be statistically unique for each Device.

Devices with hardware-assisted Fortuna

If Device doesn’t have a uniquely-pre-initialized Poor-Man’s PRNG, the following approach based on hardware-assisted Fortuna PRNG, MAY be used (ONLY for certain types of Devices, see ‘Restrictions for Secure and non-Secure Devices’ section below). For such Devices with hardware-assisted Fortuna, the following conditions MUST be met:

  • Device MUST implement Fortuna PRNG, with multiple entropy sources to feed to Fortuna as described below
    • details and implementation options are specified below
    • Device MUST comply to seed file requirements as specified below
  • Device MUST implement hardware entropy gathering, and RNG additional seeding procedure, as described below
Fortuna Implementation in SmartAnthill

There are two approaches to implement Fortuna in SmartAnthill: ‘Radical’ and ‘Conservative’. ‘Radical’ is not strictly compliant with Fortuna description from [Fortuna], but we feel it should perform significantly better for our special circumstances. ‘Conservative’ is fully compliant with the description in [Fortuna], with really minor tweaks (within the spirit of Fortuna) to reduce resource requirements. Currently, and until it is shown otherwise, both implementations are acceptable for SmartAnthill.

In any case, pool size for SmartAnthill Fortuna implementations is 128*3 bytes; effectively it means that we’re making a guesstimate that each event (encoded as 3 bytes per ADDRANDOMEVENT() description) carries one bit of entropy.

Currently, Fortuna implementation is estimated to require 32 (state of first SHA256 in SHAd256)+32 (state of second SHA256 in SHAd256)+64 (512-bit chunk buffer) = 128 bytes per pool, plus 32 bytes regardless of pools (generator state).

‘Radical’ SmartAnthill Fortuna

‘Radical’ SmartAnthill Fortuna has the following changes from [Fortuna] description:

  • only one pool is used. Rationale: for a generic PC-based RNG this may be seen as a major deficiency, but we feel that for SmartAnthill purposes, the most important random generation (the one for pairing purposes) is ‘imminent’, and long-term recovery is of significantly less interest than making key material really random. Under these circumstances, spreading entropy across multiple pools, where it won’t be used for imminent security-critical key generation, is considered a waste.
‘Conservative’ SmartAnthill Fortuna

‘Conservative’ SmartAnthill Fortuna has the following changes from [Fortuna] description:

  • for non-Secure SmartAnthill Devices, number of pools MAY be reduced to 16 (from 32 in original Fortuna); for Secure SmartAnthill Devices number of pools MUST be at least 24
  • minimum time between reseeds MUST be increased to 1 minute (from 100ms in original Fortuna). Rationale: given our limited entropy sources and rare events, we’re not likely to get 128 bits of entropy more frequently anyway

These changes bring time-needed-for-attacker-to-exhaust-pools from 13 years as in original Fortuna, down to 1.5 months; we feel that this number is prudent enough for non-Secure devices. For Secure Devices 24 pools with 1 minute minimum reseeds, provide 31 years.
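
The figures above can be reproduced under the assumption that the estimate is simply 2^(number of pools) multiplied by the minimum time between reseeds; the function name is illustrative.

```python
SECONDS_PER_DAY = 86400
DAYS_PER_YEAR = 365.25


def pool_exhaustion_days(pools, min_reseed_seconds):
    """Days needed to cycle through all pools at the maximum reseed rate."""
    return (2 ** pools) * min_reseed_seconds / SECONDS_PER_DAY


original_years = pool_exhaustion_days(32, 0.1) / DAYS_PER_YEAR   # original Fortuna
non_secure_days = pool_exhaustion_days(16, 60)                   # 16 pools, 1 minute
secure_years = pool_exhaustion_days(24, 60) / DAYS_PER_YEAR      # 24 pools, 1 minute
```

This gives roughly 13.6 years, 45.5 days (about 1.5 months), and 31.9 years respectively.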

Fortuna Seed File

[Fortuna] specifies a 64-byte ‘seed file’ to keep Fortuna state between reboots. SmartAnthill Fortuna implementations MUST implement a ‘seed file’ (normally in EEPROM), with all atomicity requirements specified in [Fortuna]. If ‘seed file’ cannot be read on Device start, then Device MUST perform the following (depending on Device ‘pairing state’ as described in SmartAnthill Pairing document):

  • if Device is in PRE-PAIRING state, necessary entropy will be gathered during normal “pairing” procedure, so Fortuna may start without seed file.
  • if Device is in PAIRING-MITM-CHECK state, Device MUST switch to PRE-PAIRING state and require “pairing” to be repeated (TODO: provide appropriate Client-side errors and user messages)
  • if Device is in PAIRING-COMPLETED state, Device MUST use “SACCP Entropy Recovery” procedure as described in SmartAnthill Command&Control Protocol (SACCP) document (this procedure is different from “Entropy Gathering” procedure used as a part of “pairing”); in practice, it MAY be sufficient to get a single entropy recovery packet to re-initialize Fortuna (as it is after-pairing, packet is transferred encrypted, so there is no risk for it to be known to adversary; also, if key material will be needed, Fortuna will be fed with additional entropy which is sufficient for such generation, according to SmartAnthill Pairing).

Fortuna ‘seed file’ MUST be written before any MCUSLEEP operation (TODO: what if MCUSLEEP is memory-preserving?), and MUST be written at least every 10 minutes of Device operation.

Fortuna uniquely-pre-initialized seed file

To improve security, Devices MAY pre-populate Device with Fortuna seed file during manufacturing; if implemented, this seed file MUST be a file consisting of 64 random crypto-safe bytes. Presence of uniquely-pre-initialized “seed file” does NOT ease any of the other requirements to Fortuna and/or random number generation.

Device Operation for Devices with hardware-assisted Fortuna

NB: when “feeding entropy to Fortuna”, exact bit representation doesn’t matter, as long as all the data bits are fed to ADDRANDOMEVENT() Fortuna function

  • Device MUST have at least one MCU ADC channel which is either connected to an entropy source (such as Zener diode, details TBD), or is left unconnected. This ADC is named “noise ADC”
    • it is acceptable to disconnect ADC channel only temporarily (for example, using an analogue switch); in this case, ADC channel MUST be disconnected for the whole duration of RNG additional seeding (i.e. it is not acceptable to disconnect it only for one measurement and to connect it back right afterwards).
  • During each “pairing” (IMPORTANT: it applies to any “pairing”, not just first “pairing”), the following procedure of RNG additional seeding MUST be performed:
    • When pairing procedure starts, Device MUST initialize two internal variables (Network-Time-Change-Count and ADC-Change-Count) as zeros
    • Device MUST implement “Entropy Gathering” procedure as defined in SmartAnthill Pairing document
    • On receiving each packet with entropy, Device MUST:
      • feed received ENTROPY to the Fortuna (NB: this ENTROPY is not really required, but it costs pretty much nothing to add it, and in case if attacker missed at least a part of the exchange, it certainly improves security, even if all the hardware entropy data turns out to be 100% deterministic, which shouldn’t really happen, but...)
      • feed entropy which is based on pseudo-measured time since the request has been sent, with at least 1 µs precision; for the purposes of pseudo-measurement of time, exact time isn’t important, what is important is that two different times with 1 µs difference produce two different results with a probability of at least 50%.
        • in particular, time MAY be pseudo-measured using “tight loops” (increment-pseudo-time-check-packet-arrival-repeat-until-packet-arrives), provided that the 1 µs requirement is satisfied (i.e. that “tight loop” time is less than 1 µs, i.e. tight-loop-clock-count / MCU-frequency < 1 µs). Device MAY perform some non-time-measured operations (for example, some measurements and/or calculations) after sending a packet and before going into time-pseudo-measuring “tight loop”, as long as maximum-possible-time-before-tight-loop < minimum-possible-packet-round-trip-time.
        • if pseudo-measured time is different from last pseudo-measured time, increment Network-Time-Change-Count. NB: even if Network-Time-Change-Count is not incremented, time data SHOULD still be fed to Fortuna PRNG
        • additionally, if another independent timer (such as WDT on AVR) is available, it SHOULD be read on packet arrival, and the data from the timer SHOULD be fed to Fortuna PRNG
    • in addition, if bare-metal implementation is used, whenever an interrupt happens (this includes interrupt on receiving packets, and/or any other interrupts), Device SHOULD feed “program-counter-before-interrupt has been called” (which is usually readily available as [SP-some_constant], and usually has 1 or more bits of entropy if the MCU is actively running at the moment) to Fortuna PRNG.
      • regardless of handling interrupts in such a manner, Device still MUST pseudo-measure time in a tight loop as described above
      • in addition, if another independent timer (such as WDT on AVR) is available, it SHOULD be read on all the interrupts, and the data from the timer SHOULD be fed to Fortuna PRNG. If independent timer is read-and-fed-to-Fortuna on interrupt, and all packet arrivals are handled via interrupts, then independent timer SHOULD NOT be read-and-fed-to-Fortuna outside of interrupt (tight-loop pseudo-measure of time outside of interrupt is still necessary)
      • to pass entropy from interrupt handler to Fortuna, entropy MAY be combined within different calls to interrupt handlers; in particular, the entropy MAY be accumulated via XOR-ing (with or without rotations, or using some other mixing function which doesn’t affect bit balance; good examples of mixing functions include addition/subtraction modulo 2^n, XOR, rotations, CRC functions, and crypto hash functions; bad examples include AND, OR, and shifts without rotations, which may lose information from some bits completely) incoming entropy in a fixed-size buffer until it is atomically-read-and-removed-from-fixed-size-buffer (TODO: is atomicity strictly required here?) outside of the interrupt handler and is fed to Fortuna PRNG. Regardless of mixing function, implementations MUST provide DEBUG compile-time flag which will ensure that each entropy component is passed separately without any mixing, and is never overwritten until it is read-and-removed; this is necessary to validate that the implementation returns what is expected (PC and/or timer) and to evaluate the amount of entropy they produce.
    • Device MUST continue “Entropy Gathering” procedure at least until Network-Time-Change-Count reaches 250 * number-of-Fortuna-pools.
    • in addition, Device MUST perform measurements of “noise ADC” and feed the results to the Fortuna PRNG
      • on every such measurement, if measurement result is neither maximum nor minimum possible value for the ADC in question (usually, but not necessarily, minimum is all-zeros, and maximum is all-ones), and measurement result doesn’t match previous measurement from “noise ADC”, ADC-Change-Count is incremented. NB: even if ADC-Change-Count is not incremented, entropy still SHOULD be fed to Fortuna PRNG. NB2: “neither maximum nor minimum” requirement effectively rules out using 1-bit ADCs as “noise ADCs”.
      • these measurements MUST be performed in parallel with “Entropy Gathering” network exchange; at least one ADC measurement per “Entropy Gathering” packet MUST be performed; more than one is fine.
    • in addition, Device SHOULD perform measurements of all the other ADCs in the system (e.g. one measurement for each other ADC for one measurement of “noise ADC”) and feed the results to Fortuna PRNG
    • Device MUST continue measurements of “noise ADC” at least until ADC-Change-Count reaches 250 * number-of-Fortuna-pools.
    • if hardware RNG (for example, accessible via a special MCU instruction) is available, Device SHOULD feed its output to Fortuna
    • after both ADC-Change-Count and Network-Time-Change-Count reach 250 * number-of-Fortuna-pools, Device MAY decide to complete RNG additional seeding
    • to complete RNG additional seeding, Device MUST explicitly call Fortuna’s RESEED() (see [Fortuna] for details), and then MUST skip at least TODO bits of Fortuna output
  • Until RNG additional seeding is completed, RNG output MUST NOT be used in any manner
  • after RNG additional seeding is completed, Devices still SHOULD feed all the available entropy (as described above) to the Fortuna PRNG
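
The bookkeeping part of RNG additional seeding can be sketched as below, using the 250 * number-of-Fortuna-pools thresholds for both counters. Feeding the samples to Fortuna is left out. The names `SeedingProgress` and `mix_into` are illustrative, the first sample of each kind is assumed not to count as a “change”, and `mix_into` shows one possible rotate-and-XOR accumulator for entropy collected in interrupt handlers.

```python
class SeedingProgress:
    """Tracks Network-Time-Change-Count and ADC-Change-Count during pairing."""

    CHANGES_PER_POOL = 250

    def __init__(self, pools):
        self.target = self.CHANGES_PER_POOL * pools
        self.network_time_changes = 0
        self.adc_changes = 0
        self._last_time = None
        self._last_adc = None

    def note_network_time(self, pseudo_time):
        # Assumption: the very first sample has nothing to differ from.
        if self._last_time is not None and pseudo_time != self._last_time:
            self.network_time_changes += 1
        self._last_time = pseudo_time

    def note_adc(self, sample, adc_min, adc_max):
        usable = adc_min < sample < adc_max   # neither minimum nor maximum
        if usable and self._last_adc is not None and sample != self._last_adc:
            self.adc_changes += 1
        self._last_adc = sample

    def may_complete(self):
        return (self.network_time_changes >= self.target
                and self.adc_changes >= self.target)


def mix_into(accumulator, sample, width=32):
    """Rotate the accumulator left by one bit, then XOR in the new sample."""
    mask = (1 << width) - 1
    rotated = ((accumulator << 1) | (accumulator >> (width - 1))) & mask
    return rotated ^ (sample & mask)
```
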
Fortuna State and re-pairing

When Device is to be re-paired (i.e. Device pairing state is changed to PRE-PAIRING, see SmartAnthill Pairing document for details), Fortuna PRNG state (both seed file and in-memory state) MUST NOT be affected. The only process which MAY rewrite Fortuna persistent state while ignoring the existing Fortuna state, is Device re-programming (but not OtA re-programming).

Devices with hardware RNG

To qualify as a ‘Device with hardware RNG’, Device MUST comply with all the following requirements:

  • Device MUST have a hardware entropy source, which provides a hardware-generated bit stream
  • Device MUST implement on-line testing of hardware-generated bit stream (monobit test, poker test, runs test, and long runs test, as they were specified in FIPS140-2 after Change Notice 1 and before Change Notice 2; testing should be performed on each 20000-bit block before this block is fed to Fortuna). TODO: adaptation to streaming?
  • on-line testing MUST be performed on a bit stream before any cryptographic primitives are applied (but SHOULD be performed after von Neumann bias removal)
  • Device MUST implement Fortuna PRNG (as specified above).
    • this includes implementing Fortuna seed file as described above
  • on the first launch of the Device (i.e. if Fortuna seed file is not present, and Device is in PRE-PAIRING state), at least 3 of hardware-generated bit stream blocks, with on-line test above being successful, MUST be fed to a Fortuna PRNG during Fortuna initialization:
    • until such an initialization is completed, Device MUST NOT be operational
    • bit stream blocks which fail the on-line test SHOULD still be fed to Fortuna PRNG
    • RNG MUST skip at least first TODO bits of the Fortuna output bit stream (before starting to output Fortuna output as RNG output)
  • Device MUST continue feeding output from hardware entropy source to Fortuna PRNG, without applying the on-line tests, at a rate of at least 1 bit per second (as long as Device is running during at least some portion of that second and is not in a hardware sleep mode)
  • Device SHOULD feed additional available entropy (timings, ADC etc. as described above) to Fortuna PRNG
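
The pre-processing and on-line testing steps above can be illustrated with a minimal sketch. It shows von Neumann bias removal and only the monobit test out of the four listed tests. The function names are hypothetical, and the monobit bounds (9725 and 10275) are quoted from memory of FIPS 140-2 and should be checked against the standard before use.

```python
def von_neumann_debias(bits):
    """Von Neumann bias removal: read bits in non-overlapping pairs,
    emit the first bit of each unequal pair, and drop equal pairs."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out


def monobit_ok(block, low=9725, high=10275):
    """Monobit test on one 20000-bit block: the number of ones must lie
    strictly between low and high (bounds are an assumption, see above)."""
    if len(block) != 20000:
        raise ValueError("monobit test is defined on 20000-bit blocks")
    ones = sum(block)
    return low < ones < high
```

In line with the requirements above, a block would be debiased first, tested next, and only then passed on to any cryptographic primitive.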
Restrictions for Secure and non-Secure Devices

non-Secure SmartAnthill Devices MAY use one of the following RNGs (as long as all requirements for respective RNG, as specified above, are complied with):

  • uniquely-pre-initialized Poor-Man’s PRNGs
  • hardware-assisted Fortuna
  • hardware-assisted Fortuna with uniquely-pre-initialized seed file
  • hardware RNG
  • hardware RNG with Fortuna having uniquely-pre-initialized seed file

Secure SmartAnthill Devices MAY use one of the following RNGs (as long as all requirements for respective RNG, as specified above, are complied with):

  • uniquely-pre-initialized Poor-Man’s PRNGs
  • hardware-assisted Fortuna
  • hardware-assisted Fortuna with uniquely-pre-initialized seed file
  • hardware RNG
  • hardware RNG with Fortuna having uniquely-pre-initialized seed file (RECOMMENDED)
SmartAnthill Client (and Devices with Crypto-Safe RNG)

Even if the system where the SmartAnthill stack is running, has a supposedly crypto-safe RNG (such as built-in crypto-safe /dev/urandom), SmartAnthill implementations still MUST employ Poor-Man’s PRNG (as described above) in addition to system-provided crypto-safe PRNG. In such cases, each byte of SmartAnthill RNG (which is provided to the rest of SmartAnthill) SHOULD be a XOR of 1 byte of system-provided crypto-safe PRNG, and 1 byte of Poor-Man’s PRNG.

Rationale. This approach reduces the impact of catastrophic failures of the system-provided crypto-safe PRNG (for example, it would very significantly mitigate the effects of the Debian RNG disaster).

To initialize Poor-Man’s RNG on Client side, SmartAnthill implementation MUST NOT use the same crypto-safe RNG whose output will be used for XOR-ing with Poor-Man’s RNG (as specified above); instead, Poor-Man’s RNG on Client side MUST be initialized independently. Valid examples of such independent initialization include XOR-ing of at least two sources, such as an independent Fortuna PRNG with user input (timing of typing or mouse movements), or online generators such as ‘raw bytes’ from random.org or from smartanthill.org (TODO). IMPORTANT: all exchanges with online generators MUST be over https, and with server certificate validation.

The same procedure SHOULD also be used for generating random data which is used for SmartAnthill key generation.
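
As an illustration of the byte-wise XOR combination described in this section, here is a minimal sketch. `StandInPoorMansPrng` is a hypothetical placeholder (a hash in counter mode) used only to make the example runnable; it is not the Poor-Man’s PRNG defined earlier in this document, and `smartanthill_rng_bytes` is likewise an illustrative name.

```python
import hashlib
import os


class StandInPoorMansPrng:
    """Hypothetical placeholder, NOT the Poor-Man's PRNG from the spec."""

    def __init__(self, seed: bytes):
        self._seed = seed
        self._counter = 0

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            data = self._seed + self._counter.to_bytes(8, "big")
            out += hashlib.sha256(data).digest()
            self._counter += 1
        return out[:n]


def smartanthill_rng_bytes(n, system_rng, poor_mans_prng):
    """Each output byte is one system-RNG byte XOR one Poor-Man's PRNG byte."""
    a = system_rng(n)
    b = poor_mans_prng.random_bytes(n)
    return bytes(x ^ y for x, y in zip(a, b))
```

A call might look like `smartanthill_rng_bytes(16, os.urandom, prng)`, where `prng` has been seeded independently of the system RNG, as required above.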

Key Generation

This section describes rules for generating keys (and other key material, such as DH random numbers).

For Devices which support OtA Pairing (see SmartAnthill Pairing document for details), key material needs to be generated. For such Devices the following requirements MUST be met:

  • if Device doesn’t have a hardware-assisted Fortuna PRNG:
    • Device MUST implement at least two uniquely-pre-initialized Poor-Man’s PRNGs: one of them (named ‘POORMAN4KEYS’) MUST NOT be used for any purposes except for key generation as described below. Another one (named ‘NONKEYPOORMAN’) is used to produce ‘non-key Random Stream’.
    • in addition, Device MUST have an additional uniquely-pre-initialized key (KEY4KEYS), which MUST NOT be used except for key generation as described below
    • to generate 128 bits of key material, the following procedure applies:
      • calculate output=AES(key=KEY4KEYS,data=POORMAN4KEYS.Random16bytes())
  • if Device does have a hardware-assisted Fortuna PRNG:
    • Fortuna output (after mandatory RNG additional seeding as described above) is used as a key material
  • if Device (or Client) has a crypto-safe RNG:
    • Device MUST implement at least two uniquely-pre-initialized Poor-Man’s PRNGs: one of them (named ‘POORMAN4KEYS’) MUST NOT be used for any purposes except for key generation as described below. Another one (named ‘NONKEYPOORMAN’) is used to produce ‘non-key Random Stream’.
      • Initialization of both Poor-Man’s PRNGs (as well as initialization of KEY4KEYS and POORMAN4KEYS, see below) MUST be done independently, as specified in “SmartAnthill Client (and Devices with Crypto-Safe RNG)” section above.
    • in addition, Device MUST have an additional uniquely-pre-initialized key (KEY4KEYS), which MUST NOT be used except for key generation as described below
    • to generate 128 bits of key, the following procedure applies:
      • calculate output=CryptoSafeRNG.Random16bytes() XOR AES(key=KEY4KEYS,data=POORMAN4KEYS.Random16bytes())
Non-Key Random Stream

SmartAnthill RNG provides a ‘non-key Random Stream’ for various purposes such as padding, ENTROPY data for the pairing (sic!), etc. Generation of 128 bits of non-key Random Stream is similar to key generation described above, with the following differences:

  • instead of POORMAN4KEYS Poor-Man’s PRNG, NONKEYPOORMAN Poor-Man’s PRNG is used
  • instead of AES(key=KEY4KEYS,data=DATA), DATA is used directly
References

[Fortuna] Niels Ferguson, Bruce Schneier. “Practical Cryptography”. Wiley Publishing, 2003. Sections 10.3 (‘Fortuna’) - 10.7 (‘So What Should I Do?’)

SmartAnthill Pairing

Version:v0.1.7

IMPORTANT: This document is obsolete. Please DO NOT modify it. Please refer to SimpleIoT Pairing for an up to date version.

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

“Pairing” SmartAnthill Device to SmartAnthill Client (which is normally located on SmartAnthill Central Controller) is necessary to ensure secure key exchange between SmartAnthill Device and SmartAnthill Client. As soon as “pairing” is completed, both parties have a 128-bit symmetric key shared between them, and can use it for SASP purposes.

SmartAnthill Pairing comes in several flavours. SmartAnthill Device MUST implement at least one of these flavours. SmartAnthill Client MUST implement all these flavours.

SmartAnthill Pairing flavours are divided into two categories: Zero Pairing (which doesn’t involve communication over SmartAnthill communication channel), and Over-the-Air (OtA) pairing.

SmartAnthill Zero Pairing

Zero pairing doesn’t involve communication over SmartAnthill communication channel.

SmartAnthill Zero Programming Pairing

SmartAnthill Zero Programming Pairing applies only to those devices which can be completely reprogrammed by SmartAnthill Central Controller (usually it applies to SmartAnthill Hobbyist Devices). It is a RECOMMENDED way of pairing for SmartAnthill Hobbyist Devices.

SmartAnthill Zero Programming Pairing consists of:

  • Client generating secret key (see TODO for details)
  • Client preparing (e.g. compiling or linking) a program which includes generated secret key as static data
  • Client storing generated secret key in SA DB
  • Client programming Device using prepared program (which contains generated secret key)

TODO: restrictions on SmartAnthill Device programming-socket key access.

SmartAnthill Zero Paper Pairing

SmartAnthill Zero Paper Pairing MAY be used by those SmartAnthill Mass-Market Devices, for which implementing other pairing methods is not feasible. Zero Paper Pairing SHOULD NOT be used if other pairing methods are feasible. Zero Paper Pairing MUST NOT be used by Security Devices unless it is demonstrated that other pairing methods are not feasible for the Device.

Zero Paper Pairing requires each Device to:

  • have a unique 128-bit crypto-random key programmed in as its SASP AES key
  • have this 128-bit key printed in the following user-friendly form:
    • 128-bit key is converted to a large unsigned integer (using SmartAnthill Endianness) from 0 to 2^128-1
    • this large unsigned integer is written as an integer using base 36 (i.e. using 36 digits in each position); to write digits 0-9 in this representation, symbols ‘0’-‘9’ are used; to write digits 10-35 in this representation, symbols ‘A’-‘Z’ (upper case) are used. This representation will have at most 25 symbols (as 36^25 > 2^128); if there are fewer than 25 symbols, they’re left-padded with zeros to 25
    • these 25 symbols are written in dash-separated groups of five
    • checksum symbol is calculated as a modulo-36 sum of all the symbols
    • checksum is appended (via dash) to dash-separated groups of five, forming XXXXX-XXXXX-XXXXX-XXXXX-XXXXX-X pattern
    • for example, all-zero key will be written as 00000-00000-00000-00000-00000-0
  • this user-friendly form of 128-bit crypto-random key MUST be provided in a printed form with the device.
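
The printed form described above can be sketched as follows. The function name `format_paper_key` is hypothetical, and the default byte order is an assumption: SmartAnthill Endianness is defined elsewhere in the documentation, so `byteorder` is left as a parameter.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def format_paper_key(key: bytes, byteorder: str = "little") -> str:
    """Render a 128-bit key as XXXXX-XXXXX-XXXXX-XXXXX-XXXXX-X."""
    if len(key) != 16:
        raise ValueError("expected a 128-bit (16-byte) key")
    # byteorder default is an assumption, see the note above
    value = int.from_bytes(key, byteorder)
    symbols = []
    for _ in range(25):  # 36**25 > 2**128, so 25 digits always suffice
        value, digit = divmod(value, 36)
        symbols.append(digit)
    symbols.reverse()  # most significant digit first, zero-padded on the left
    checksum = sum(symbols) % 36  # modulo-36 sum of all the symbols
    text = "".join(DIGITS[d] for d in symbols)
    groups = [text[i:i + 5] for i in range(0, 25, 5)]
    return "-".join(groups) + "-" + DIGITS[checksum]
```

For the all-zero key this yields 00000-00000-00000-00000-00000-0, matching the example in the list above.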

In addition, to ensure that Device is still usable if the printed key is lost, Devices using Zero Paper Pairing MUST comply with the following Reprogramming Requirements:

  • Device MUST provide an option to re-program key, using either UART or USB. Wireless programming methods are expressly forbidden; in addition, any way (whether wired or wireless) to read the key from the device, is expressly forbidden.
    • this can be made by one of the following methods:
      • full Device reprogramming; to be compliant, it MUST be done as follows:
        • Device MUST be re-programmable using Platform.IO
        • Manufacturer MUST provide source code for the Device programming in a form which is used by SmartAnthill for programming of Hobbyist Devices
          • this source code MUST be available free of charge BOTH from manufacturer’s web site, AND from an allowed third-party repository. List of allowed third-party source code repositories TBD (github, sourceforge, something else?)
      • key reprogramming. Protocols for key reprogramming over UART and over USB are TBD.
SmartAnthill OtA Pairing

SmartAnthill OtA pairing provides security (including MITM protection) with minimal complexity involved.

From OtA Pairing perspective, SmartAnthill Device can be in one of the following OtA pairing states:

  • PRE-PAIRING
  • PAIRING-MITM-CHECK
  • PAIRING-COMPLETED

IMPORTANT: Change from any of the states into PRE-PAIRING state MUST be implemented ONLY via physical manipulations of end-user with SmartAnthill Device (and MUST NOT be allowed remotely). Examples of valid user interfaces to perform such a change include on-Device button or buttons (for example, if two buttons are simultaneously kept pressed for over N seconds) and on-Device PCB jumper. When changing Device state to PRE-PAIRING state, state of Device RNG (i.e. data used for random number generation) MUST NOT be affected.

In PRE-PAIRING state, SASP MUST use ‘zero’ AES-128 key (with AES key consisting of all zeros).

In PRE-PAIRING state, no programs are allowed to be sent to SACCP; only TODO SACCP packets are allowed. In PAIRING-MITM-CHECK state, SACCP programs are allowed; however, in this state, SACCP restricts EXEC command of Zepto VM to the only Built-In bodypart (id=BUILTIN_BODYPART_PAIRING).
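
The state rules above can be summarized in a small sketch. The class and method names are hypothetical, and the bodypart id is represented by a placeholder string because the actual value of BUILTIN_BODYPART_PAIRING is not given in this section.

```python
from enum import Enum


class PairingState(Enum):
    PRE_PAIRING = "PRE-PAIRING"
    PAIRING_MITM_CHECK = "PAIRING-MITM-CHECK"
    PAIRING_COMPLETED = "PAIRING-COMPLETED"


ZERO_KEY = bytes(16)  # all-zero AES-128 key used in PRE-PAIRING state
PAIRING_BODYPART = "BUILTIN_BODYPART_PAIRING"  # placeholder for the real id


class DevicePairing:
    def __init__(self):
        self.state = PairingState.PRE_PAIRING
        self.sasp_key = ZERO_KEY

    def reset_to_pre_pairing(self, physical: bool):
        """Only a physical user action may return the Device to PRE-PAIRING.
        RNG state is deliberately not touched here."""
        if not physical:
            raise PermissionError("PRE-PAIRING requires physical manipulation")
        self.state = PairingState.PRE_PAIRING
        self.sasp_key = ZERO_KEY

    def exec_allowed(self, bodypart_id: str) -> bool:
        """Whether an EXEC on the given bodypart is acceptable in this state."""
        if self.state is PairingState.PRE_PAIRING:
            return False  # no programs at all
        if self.state is PairingState.PAIRING_MITM_CHECK:
            return bodypart_id == PAIRING_BODYPART
        return True
```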

From security perspective, SmartAnthill OtA pairing works as follows:

  • BOTH parties generate DH randoms (a and b - 1024- or 2048-bit ones).
  • parties perform anonymous Diffie-Hellman key exchange, obtaining a 1024- or 2048-bit shared secret Z.
  • parties derive 128-bit key K and 128-bit verification value X out of Z.
  • from this point on, on both sides SASP starts to use key K, as SASP AES key
  • parties use verification value X (which is essentially a MITM check key) to perform MITM protection check depending on the OtA pairing flavour. During this exchange, Device is kept in PAIRING-MITM-CHECK Device OtA pairing state.
  • if MITM protection check indicates that everything is fine - Device OtA pairing state is changed to PAIRING-COMPLETED, and normal work can be started.
Pre-Programmed Keys and RNGs

It should be understood that to ensure security, Devices MUST comply with at least one of the following two requirements:

  • each device MUST have unique pre-programmed SASP key:
    • this applies to Zero Pairing Devices

or

SmartAnthill OtA Pairing Protocol

All the messages within one pairing procedure form a single “packet chain”. That is, “packet chain” for a normal OtA Pairing exchange works as follows:

Pairing-Ready-Pseudo-Response - Pairing-Pre-Request - Pairing-Pre-Response - Pairing-DH-Data-Request - Pairing-DH-Data-Response - ... - Pairing-DH-Data-Request - Pairing-DH-Data-Response

When both sides receive the last of Pairing-DH-DATA-* packets (the ones which provide the whole DH data, with size defined according to KEY-EXCHANGE-TYPE field in Pairing-DH-Data-Request), they proceed with calculation of SASP key.

“Awaiting pairing” mode

To avoid Device connecting to wrong SmartAnthill Client, SmartAnthill Client MUST NOT proceed with “pairing” in response to Pairing-Ready packets unless SmartAnthill Client is in “awaiting pairing” mode. “Awaiting pairing” mode for Central Controller MUST be user-initiated, and MUST NOT be kept for longer than 1 hour, unless user requests another “awaiting pairing”. This is necessary to reduce “paired to wrong Central Controller” encounters (which MUST have a way to be handled separately; one example of such handling is described in SmartAnthill DLP for IEEE 802.15.4 (SADLP-802.15.4) document).
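
A minimal sketch of the “awaiting pairing” time limit described above. The class name and the injectable clock are illustrative choices, not part of the specification.

```python
import time

MAX_AWAITING_SECONDS = 3600  # "awaiting pairing" MUST NOT last longer than 1 hour


class AwaitingPairing:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._deadline = None  # not awaiting until the user asks for it

    def user_request(self, seconds=MAX_AWAITING_SECONDS):
        """User-initiated (re)start; the window is capped at one hour."""
        self._deadline = self._clock() + min(seconds, MAX_AWAITING_SECONDS)

    def active(self) -> bool:
        return self._deadline is not None and self._clock() < self._deadline
```

A Client built along these lines would check `active()` before reacting to a Pairing-Ready packet, and would refuse to proceed with “pairing” otherwise.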

TODO: errors (Z=1 per NIST SP 800-56B, and derived-key=0 to avoid being caught by attacks on misimplementations)!

OtA Pairing Protocol Packets

Pairing-Ready-Pseudo-Response: |ENTROPY-NEEDED-SIZE |

where ENTROPY-NEEDED-SIZE is an Encoded-Unsigned-Int<max=2> field specifying amount of needed entropy in bytes.

Pairing-Ready-Pseudo-Response is not really a response, but a request from Device side which initiates pairing sequence. It is sent as a payload for a SACCP-OTA-PAIRING-RESPONSE message (initiating a new “packet chain” in terms of SAGDP), with 2 “additional bits” being 0x0. If ENTROPY-NEEDED-SIZE is not zero, it indicates that Phase 1 of ‘Entropy Gathering Procedure’ (see below) is necessary before issuing a Pairing-Pre-Request from Client side.

If Client is not in “awaiting pairing” mode, it MUST respond with Pairing-Error-Request with ERROR-CODE = ERROR_NOT_AWAITING_PAIRING.

Pairing-Pre-Request: | OTA-PROTOCOL-VERSION-NUMBER-MAJOR | OTA-PROTOCOL-VERSION-NUMBER-MINOR | CLIENT-RANDOM | PROJECTED-NODE-ID | CLIENT-OTA-AND-SASP-CAPABILITIES |

where OTA-PROTOCOL-VERSION-NUMBER-* are Encoded-Unsigned-Int<max=2> fields, CLIENT-RANDOM is a 16-byte field with crypto-random data, PROJECTED-NODE-ID is an Encoded-Unsigned-Int<max=2> field, containing NODE-ID which Client intends to assign to the Device if pairing is successful, and CLIENT-OTA-AND-SASP-CAPABILITIES TBD.

Pairing-Pre-Request is sent as a payload for a SACCP SACCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for SACCP-OTA-PAIRING-REQUEST message being 0x0.

Pairing-Pre-Response: | ENTROPY-NEEDED-SIZE | OPTIONAL-DEVICE-RANDOM | OPTIONAL-DEVICE-BUS-TYPE | OPTIONAL-DEVICE-INTRABUS-ID-SIZE | OPTIONAL-DEVICE-INTRABUS-ID | OPTIONAL-DEVICE-OTA-AND-SASP-CAPABILITIES |

where ENTROPY-NEEDED-SIZE is an Encoded-Unsigned-Int<max=2> field, OPTIONAL-DEVICE-RANDOM is an optional 32-byte field, OPTIONAL-DEVICE-BUS-TYPE is an Encoded-Unsigned-Int<max=1> field representing an enum of bus types (TBD), OPTIONAL-DEVICE-INTRABUS-ID-SIZE is an Encoded-Unsigned-Int<max=1> field, representing size of OPTIONAL-DEVICE-INTRABUS-ID field in bytes, OPTIONAL-DEVICE-INTRABUS-ID depends on the bus type, and OPTIONAL-DEVICE-OTA-AND-SASP-CAPABILITIES (format TBD); all the OPTIONAL-* fields are present only if this Pairing-Pre-Response packet is the first such packet in current “pairing” exchange.

Pairing-Pre-Response is sent as a payload for a SACCP SACCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for SACCP-OTA-PAIRING-RESPONSE message being 0x1.

NB: to comply with key generation requirements as specified in SmartAnthill Random Number Generation and Key Generation document, Device MUST request at least an amount of entropy equal to the b parameter size for DH key exchange; however, Device MAY request more entropy (up to 256 extra bytes per pairing attempt, which requests MAY be split into packets as small as 1 byte), for example, to initialize its own Fortuna generator.

If ENTROPY-NEEDED-SIZE is not zero, it means that “Entropy Gathering” Phase 3 is necessary (see below), and that Client MUST reply with a Pairing-Entropy-Provided-Request.

Pairing-Entropy-Provided-Request: | ERROR-CODE | ENTROPY |

where ERROR-CODE is an Encoded-Unsigned-Int<max=2> field, equal to zero, and ENTROPY is an arbitrary-length field with cryptographically safe random data.

Pairing-Entropy-Provided-Request is sent as a payload for a SACCP SACCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for SACCP-OTA-PAIRING-REQUEST message being 0x1. Note that “additional bits” for Pairing-Entropy-Provided-Request are the same as for Pairing-Error-Request, and they’re distinguished by the value of ERROR-CODE field.

Client MAY supply less entropy than was requested (and SHOULD do so if the requested data potentially exceeds MTU); in such a case, Device SHOULD request more entropy by replying with an appropriate message with a non-zero ENTROPY-NEEDED-SIZE.

In response to Pairing-Entropy-Provided-Request, Device MUST send another Pairing-Ready-Pseudo-Response or Pairing-Pre-Response packet (depending on the Phase of Entropy Gathering procedure currently in progress), specifying non-zero ENTROPY-NEEDED-SIZE if it still has not enough entropy.
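
The request/partial-supply loop described above can be sketched as a simple simulation. Function names and the `mtu_budget` parameter are hypothetical; packet framing, SAGDP and SASP are left out entirely.

```python
import os


def client_supply(requested: int, mtu_budget: int) -> bytes:
    """Client side: may return fewer bytes than requested (e.g. to fit MTU)."""
    return os.urandom(min(requested, mtu_budget))


def device_gather(needed: int, mtu_budget: int):
    """Device side: keep reporting a non-zero ENTROPY-NEEDED-SIZE until
    enough entropy has arrived. Returns the entropy and the round count."""
    pool = bytearray()
    rounds = 0
    while len(pool) < needed:
        entropy_needed_size = needed - len(pool)
        chunk = client_supply(entropy_needed_size, mtu_budget)
        if not chunk:
            raise RuntimeError("Client supplied no entropy")
        pool += chunk
        rounds += 1
    return bytes(pool), rounds
```

With `needed=128` and `mtu_budget=50`, the loop takes three rounds (50, 50 and 28 bytes).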

Pairing-Error-Request: | ERROR-CODE |

where ERROR-CODE is an Encoded-Unsigned-Int<max=2> field, never equal to zero.

Pairing-Error-Request is sent as a payload for a SACCP SACCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for SACCP-OTA-PAIRING-REQUEST message being 0x1. Note that “additional bits” for Pairing-Error-Request are the same as for Pairing-Entropy-Provided-Request, and they’re distinguished by the value of ERROR-CODE field.

Pairing-DH-Data-Request: | OPTIONAL-KEY-EXCHANGE-TYPE | DH-REQUEST-PART |

where OPTIONAL-KEY-EXCHANGE-TYPE is sent only for the very first Pairing-DH-Data-Request within the “pairing”, and is Encoded-Unsigned-Int<max=2> field with values defined below, and DH-REQUEST-PART is a field taking the rest of the packet, and representing first remaining (SmartAnthill-Endianness-wise) bytes of A = g^a mod p from DH key exchange (using SmartAnthill Endianness).

Supported OPTIONAL-KEY-EXCHANGE-TYPEs:

  • value 0:
    • Key Exchange: DH with 1024-bit MODP group with 160-bit Prime Order Subgroup as defined in RFC 5114. This OPTIONAL-KEY-EXCHANGE-TYPE MUST NOT be used for Security SmartAnthill Devices. NB: MODP groups from RFC 5114 are preferred to earlier-defined ones (for example, those from RFC 3526), as they explicitly comply with NIST-suggested restrictions, in particular, restrictions on q.
    • Key Derivation: SHA256-based
  • value 1:
    • Key Exchange: DH with 2048-bit MODP group with 256-bit Prime Order Subgroup as defined in RFC 5114.
    • Key Derivation: SHA256-based
  • others: MAY be added as necessary

TODO: double-check presence of any typical patterns in Z, and decide on split (first-half/second-half or even-bits/odd-bits)

Pairing-DH-Data-Request is sent as a payload for a SACCP SACCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for SACCP-OTA-PAIRING-REQUEST message being 0x2.

Pairing-DH-Data-Response: | DH-RESPONSE-PART |

where DH-RESPONSE-PART is a field taking the whole packet; length of DH-RESPONSE-PART MUST be exactly the same as DH-REQUEST-PART in the incoming Pairing-DH-Data-Request message. DH-RESPONSE-PART represents first remaining (SmartAnthill-Endianness-wise) bytes of B = g^b mod p from DH key exchange (using SmartAnthill Endianness).

Pairing-DH-Data-Response is sent as a payload for a SACCP SACCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for SACCP-OTA-PAIRING-RESPONSE message being 0x2.
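
A minimal sketch of how the DH public values might be split across Pairing-DH-Data-* packets. It assumes the values are already serialized to bytes in SmartAnthill Endianness, and `max_part` is a hypothetical per-packet budget; both function names are illustrative.

```python
def split_dh_value(value: bytes, max_part: int):
    """Cut a serialized DH value into consecutive DH-REQUEST-PART payloads,
    each carrying the first remaining bytes."""
    if max_part <= 0:
        raise ValueError("max_part must be positive")
    return [value[i:i + max_part] for i in range(0, len(value), max_part)]


def matching_response_parts(value: bytes, request_parts):
    """Cut the Device's value so that each DH-RESPONSE-PART has exactly
    the same length as the corresponding DH-REQUEST-PART."""
    if len(value) != sum(len(p) for p in request_parts):
        raise ValueError("A and B must have the same serialized length")
    parts, offset = [], 0
    for req in request_parts:
        parts.append(value[offset:offset + len(req)])
        offset += len(req)
    return parts
```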

Pairing-Ok-Request: | OK-A-ENTROPY-CHECKSUM | NODE-ID |

where OK-A-ENTROPY-CHECKSUM is a 16-byte field containing result of SASP-tag(nonce=(varying-part=1,direction=from-client-to-device),authenticated-data=All-Sent-ENTROPY-Combined,key=derived-SASP-key), where nonce is constructed in the same way it is constructed in SASP, and NODE-ID is an Encoded-Unsigned-Int<max=2> field containing SAMP node ID to be assigned to the Device. NODE-ID is conditional on OK-A-ENTROPY-CHECKSUM check described below, otherwise NODE-ID MUST be ignored.

Pairing-Ok-Request is sent by Client when the last Pairing-DH-Data-Response is received; it is sent as a payload for a SACCP SACCP-OTA-PAIRING-REQUEST message, with 2 “additional bits” for SACCP-OTA-PAIRING-REQUEST message being 0x3.

On receiving Pairing-Ok-Request, Device calculates its own DEVICE-OK-A-ENTROPY-CHECKSUM with derived-SASP-key, and compares it to received OK-A-ENTROPY-CHECKSUM. If the check is Ok, then Device calculates OK-B-ENTROPY-CHECKSUM (the same way as OK-A-ENTROPY-CHECKSUM is calculated, but with direction=from-device-to-client), and sends it back as a part of Pairing-Ok-Response; then Device changes pairing state into PAIRING-MITM-CHECK, sets SASP key to derived-SASP-key for all future communications with Client, and sets next SASP nonce varying-part (including the one stored in persistent storage) to 2.

If DEVICE-OK-A-ENTROPY-CHECKSUM and received OK-A-ENTROPY-CHECKSUM don’t match - Device MUST switch back to PRE-PAIRING state and report TODO error to the Client.

Pairing-Ok-Response: | OK-B-ENTROPY-CHECKSUM |

where OK-B-ENTROPY-CHECKSUM is a 16-byte field.

Pairing-Ok-Response is sent as a payload for a SACCP SACCP-OTA-PAIRING-RESPONSE message, with 2 “additional bits” for SACCP-OTA-PAIRING-RESPONSE message being 0x3.

On receiving Pairing-Ok-Response, Client calculates its own CLIENT-OK-B-ENTROPY-CHECKSUM, and compares it with received OK-B-ENTROPY-CHECKSUM. If everything is fine, “pairing” can be considered completed, and Client sets SASP key (to be used by SASP) to derived-SASP-key for all future communications with this Device, and sets next SASP nonce varying-part (including the one stored in persistent storage) to 2. After that, Client starts to perform MITM check (using MITM-Check-Program as described below).

If CLIENT-OK-B-ENTROPY-CHECKSUM and received OK-B-ENTROPY-CHECKSUM don’t match, Client reports a potential attack on pairing to the end-user (without such an attack, chances of ENTROPY-CHECKSUM mismatching are on the order of 2^-120), and asks the end-user to re-start pairing by manually switching Device to PRE-PAIRING state (using appropriate UI as described above).
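
The “additional bits” values given for the packets above can be collected into a small classifier. This is only a sketch: the function name is hypothetical, ERROR-CODE is assumed to be already decoded from its Encoded-Unsigned-Int form, and the 0x3 value for Pairing-Ok-Request is read as applying to the SACCP-OTA-PAIRING-REQUEST message which carries it.

```python
REQUESTS = {
    0x0: "Pairing-Pre-Request",
    0x2: "Pairing-DH-Data-Request",
    0x3: "Pairing-Ok-Request",
}
RESPONSES = {
    0x0: "Pairing-Ready-Pseudo-Response",
    0x1: "Pairing-Pre-Response",
    0x2: "Pairing-DH-Data-Response",
    0x3: "Pairing-Ok-Response",
}


def classify(is_request: bool, additional_bits: int, error_code=None) -> str:
    """Map (message direction, additional bits, ERROR-CODE) to a packet name."""
    if not 0 <= additional_bits <= 3:
        raise ValueError("additional bits is a 2-bit value")
    if not is_request:
        return RESPONSES[additional_bits]
    if additional_bits == 0x1:
        # Entropy-Provided and Error share 0x1; ERROR-CODE tells them apart
        if error_code is None:
            raise ValueError("ERROR-CODE is needed to classify 0x1 requests")
        if error_code == 0:
            return "Pairing-Entropy-Provided-Request"
        return "Pairing-Error-Request"
    return REQUESTS[additional_bits]
```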

Entropy Gathering

In some cases, as a prerequisite for Device to be able to perform pairing, RNG needs to be supplied with entropy (exact conditions are described in SmartAnthill Random Number Generation and Key Generation document); NB: as described in SmartAnthill Random Number Generation and Key Generation, entropy usually needs to be supplied not only to the first pairing of the Device, but also to any subsequent pairing.

The procedure of Entropy Gathering is performed as follows:

Phase 1 (OPTIONAL, used only if Device ID needs to be generated, hardware-assisted Fortuna PRNG is used, and Fortuna doesn’t have enough entropy):

  • Device sends Pairing-Ready-Pseudo-Response with non-zero ENTROPY-NEEDED-SIZE
  • Client replies with Pairing-Entropy-Provided request, sent as a broadcast (SHOULD be restricted to those Retransmitting Nodes which may reach the Device)
  • this Pairing-Ready-Pseudo-Response - Pairing-Entropy-Provided sequence is repeated until Device has sufficient entropy to generate Device ID (this is the same as for regular “pairing”, as described in SmartAnthill Random Number Generation and Key Generation document)
  • NB: during Phase 1, Pairing-Entropy-Provided packets from Client to Device are sent as SAMP From-Santa packets (see SmartAnthill Mesh Protocol (SAMP)) which do not distinguish between target Devices, so there is a chance that more than one Device obtains the same packet. However, these same packets will (with an overwhelming probability) lead to different states within Fortuna PRNGs on different Devices, which makes it possible to distinguish these (originally potentially indistinguishable) Devices.

Phase 2:

  • Device sends Pairing-Pre-Response with non-zero ENTROPY-NEEDED-SIZE, DEVICE-ID-FLAG set, and all Device ID-related fields.
  • Client replies with Pairing-Entropy-Provided request
  • NB: starting from Phase 2, all the packets from Client to Device are sent as SAMP Unicast packets (see SmartAnthill Mesh Protocol (SAMP)) and are addressed to specific Device (using Device ID from Phase 2).

Phase 3:

It should be noted that number of packets sent and received is IMPORTANT for security purposes, so combining packets contrary to requirements in SmartAnthill Guaranteed Delivery Protocol (SAGDP) is strictly prohibited.

DH Random Generation

For both Client side and Device side, DH random numbers (a and b respectively) MUST be generated as described in Key Generation section in SmartAnthill Random Number Generation and Key Generation document.

SASP Key Derivation

When both sides have all the information they need (that is, Client has full B = g^b mod p and Device has full A = g^a mod p), they need to calculate shared secret Z (Z = A^b mod p for Device, and Z = B^a mod p for Client), and generate SASP Key K (128 bit), as well as verification value X (also 128 bit), from Z.
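
The relation Z = A^b mod p = B^a mod p can be checked with a toy example. The group below (p = 23, g = 5) is a deliberately tiny illustration and is not one of the RFC 5114 groups required by this document; the function names are hypothetical.

```python
import secrets

P = 23  # toy prime for illustration only, NOT an RFC 5114 MODP group
G = 5   # generator for the toy group


def dh_keypair(p: int, g: int):
    """Pick a private exponent in 1..p-2 and compute the public value."""
    private = secrets.randbelow(p - 2) + 1
    return private, pow(g, private, p)


def dh_shared(peer_public: int, private: int, p: int) -> int:
    """Shared secret Z, computed from the peer's public value."""
    return pow(peer_public, private, p)


a, A = dh_keypair(P, G)  # Client: a and A = g^a mod p
b, B = dh_keypair(P, G)  # Device: b and B = g^b mod p
z_client = dh_shared(B, a, P)  # Z = B^a mod p
z_device = dh_shared(A, b, P)  # Z = A^b mod p
```

Both sides arrive at the same Z without ever sending a or b over the channel.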

SASP Key K and verification value X are calculated as follows:

  • for SHA256-based derivation: K = SHAd256(Z||Info||first-half-of-CLIENT-RANDOM||first-half-of-DEVICE-RANDOM), X = SHAd256(Z||Info||second-half-of-CLIENT-RANDOM||second-half-of-DEVICE-RANDOM), where Info = “SASP”||KEY-EXCHANGE-TYPE||‘K’-or-‘X’||ROOT-NODE-ID||PROJECTED-NODE-ID (where ROOT-NODE-ID is always 0, and ‘K’-or-‘X’ is equal to the ‘K’ ASCII byte when calculating K, and to the ‘X’ ASCII byte when calculating X). SHAd256(m) is SHA256(SHA256(m)), same as in [Fortuna]. NB: this method differs from the one recommended by NIST, in that we’re deriving both K and X from the same DH keys; as some function of X is exposed (via LED blinking), in theory it might leak some information about K; however, in practice we don’t see any specific attack vectors (especially as obtaining key material from X requires inverting SHAd256, AND as blinking is not just X, but X-encrypted-with-a-random-key-which-is-transferred-over-encrypted-channel, so X itself is not easily accessible). We could use a method of obtaining X which is similar to Simple Secure Pairing, but at this point we do not see it as necessary.
  • other methods MAY be added in the future
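
A minimal sketch of the SHA256-based derivation above. Two things are left open on purpose, because this section does not specify them: the byte-level encoding of the Info fields (so the caller passes ready-made `info_k` and `info_x` byte strings), and how 128 bits are taken from the 256-bit SHAd256 output (so the full digests are returned). The function names are hypothetical.

```python
import hashlib


def shad256(m: bytes) -> bytes:
    """SHAd256(m) = SHA256(SHA256(m))."""
    return hashlib.sha256(hashlib.sha256(m).digest()).digest()


def derive_k_and_x(z: bytes, info_k: bytes, info_x: bytes,
                   client_random: bytes, device_random: bytes):
    """Return the SHAd256 digests from which K and X are taken."""
    c_half = len(client_random) // 2
    d_half = len(device_random) // 2
    # K uses the first halves of both randoms, X uses the second halves
    k_digest = shad256(z + info_k + client_random[:c_half] + device_random[:d_half])
    x_digest = shad256(z + info_x + client_random[c_half:] + device_random[d_half:])
    return k_digest, x_digest
```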
OtA Pairing MITM-Check Program

After initial “packet chain” consisting of Pairing Request and Pairing Response, Device goes into PAIRING-MITM-CHECK state; MITM check is performed via “MITM-Check Program”.

MITM-Check Program is pretty much a regular Zepto VM program which goes over SACCP (over SAGDP over SASP). There is a difference from a regular program though: MITM-Check Program MUST come only in PAIRING-MITM-CHECK Device pairing state. In this state, SACCP (and/or Zepto VM) prohibits the program from accessing any bodyparts, except for a Built-In bodypart with id=BUILTIN_BODYPART_PAIRING. This also ensures that even though there can be two bodyparts accessing the same LED (one is ‘pairing’ bodypart, another is regular bodypart), there is no possible conflict between the two.

OtA Pairing Flavours

All OtA Pairing Flavours run on top of SmartAnthill OtA Pairing Protocol, and differ only in their MITM-Check Programs.

SmartAnthill OtA Single-LED Pairing

SmartAnthill OtA Single-LED Pairing is a pairing mechanism which is semi-automated (i.e. user is not required to enter any data, but will be required to position devices in a certain way), and which requires an absolute minimum of resources on the Device side. Namely, all the Device needs to have (in addition to MCU) is one single LED. This LED MAY be any of the existing LEDs on the Device.

MITM-Check for Single-LED Pairing is performed as follows:

  • User is asked to bring Device close to the webcam which is located on SmartAnthill Central Controller
  • Client sends a MITM-Check program which requests LED to blink, using Blinking-Function(random-nonce-sent-by-Client)=AES(key=verification-value-X,data=random-nonce-sent-by-Client) as a blinking pattern. TODO: Built-in Plugin to produce AES(...) reply.
  • Accordingly, Device starts blinking the LED
  • Client, using webcam, recognizes blinking pattern and makes sure that it matches expectations.
  • If expectations don’t match, program may be repeated with a different random-nonce-sent-by-Client
  • If expectations do match, another program (also technically a MITM-Check program) is sent to change OtA Pairing State of the Device to PAIRING-COMPLETED.

NB: SmartAnthill Client SHOULD support using a smartphone camera in place of the webcam for “pairing” purposes (provided that TODO requirements for securing communication between SmartAnthill Controller and smartphone’s app, are met).

MITM-Check for Single-LED Pairing being User-OPTIONAL

All SmartAnthill Devices using Single-LED Pairing, MUST implement proper MITM Check procedures as described above. However, devices which are not designated as Security Devices, MAY set PAIRING-USER-OPTIONAL flag in their Device Capabilities (TODO). If Client receives PAIRING-USER-OPTIONAL flag from a Device which also has SECURE-DEVICE flag - it MUST NOT allow using such a Device, with an appropriate report to the end-user.

If Client “pairs” with a Device which has PAIRING-USER-OPTIONAL set, it MAY ask the user whether they want to perform “pairing”. If PAIRING-USER-OPTIONAL flag is not set, Client MUST NOT allow the Device to be used (i.e. MUST NOT issue a program which resets MITM-CHECK-IN-PROGRESS Device flag, and MUST NOT send any non-pairing programs to the Device) until “pairing” is actually performed.

To re-iterate: being User-OPTIONAL means that Device implementors still MUST implement MITM protection; however, under certain circumstances end-user MAY be allowed to skip it.

SINGLE-LED-PAIRING Built-In Plugin

TODO

References

[Fortuna] Niels Ferguson, Bruce Schneier. “Practical Cryptography”. Wiley Publishing, 2003. Sections 6.4 (‘Fixing the Weaknesses’)

SmartAnthill Programming, Bootloaders and OtA Programming

Version:v0.1.4a

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill 2.0 Protocol Stack documents, please make sure to read them before proceeding.

In general, SmartAnthill supports several different deployment scenarios:

  • SmartAnthill (no bootloader)
  • SmartAnthill over 3rd-party-bootloader
  • SmartAnthill-with-OtA-programming (no bootloader)
  • SmartAnthill-with-OtA-programming over 3rd-party-bootloader
SmartAnthill and 3rd-party Bootloaders

In general, SmartAnthill implementation MAY run on top of a 3rd-party bootloader (for example, Zepto OS MAY be loaded over UART by 3rd-party bootloader), provided that 3rd-party bootloader complies with the following requirements:

  • 3rd-party bootloader MUST require physical access to the SmartAnthill Device for the Device to be programmed
    • It means that all wireless/OtA 3rd-party bootloaders are prohibited
    • It also means that wired 3rd-party bootloaders are generally ok, as long as they comply with other requirements mentioned here
  • 3rd-party bootloader MUST NOT allow extracting existing program from the SmartAnthill Device

SmartAnthill OtA Programming

SmartAnthill Devices which support SmartAnthill OtA Programming MUST run SmartAnthill OtA Bootloader. Note that SmartAnthill OtA Bootloader MAY run either as a primary bootloader, or on top of 3rd-party bootloader (as long as 3rd-party bootloader satisfies requirements above).

SmartAnthill Devices MAY support OtA reprogramming of the bootloader itself; however, in this case SmartAnthill Devices MUST comply with robustness requirements, and MUST ensure that if programming process is aborted for any reason, an old bootloader will continue to work. In particular, it means that when reprogramming a bootloader (i.e. OTA-TYPE=OTA_ROBUST_BOOTLOADER) and for “robust” OS programming (OTA-TYPE=OTA_ROBUST_OS):

  • new bootloader MUST be loaded into a Flash/EEPROM area which is different from current bootloader
  • atomicity requirements specified in OTA_COMMIT message below, MUST be complied with

SmartAnthill OtA Bootloader and SmartAnthill OS

SmartAnthill OtA Bootloader needs to implement and run most of SmartAnthill Protocol Stack - from SADLP-* to most of SACCP. Moreover, to enable re-programming over the same channel, SmartAnthill OtA Bootloader needs to run these parts of the stack even when OS is running on top of it. That’s why, when a SmartAnthill OS (such as Zepto OS) is running on top of SmartAnthill OtA Bootloader, most incoming packet processing is done within SmartAnthill OtA Bootloader, with only SACCP packets (provided that they’re neither SACCP Pairing packets nor SACCP OtA Programming packets) fed to SmartAnthill OS (via “SACCP-OS Handler”).

From the point of view of Zepto OS, when it needs to support OtA, it is simply split into two parts. The first part is SmartAnthill OtA Bootloader, which handles protocol layers starting from PHY and up to and including SAGDP; it also implements SACCP messages for Pairing and SAOtAPP protocols. All other SACCP messages are passed to the rest of Zepto OS, via “SACCP-OS Handler” implemented as zepto_saccp_os_handler() (TODO) function.

In addition, SmartAnthill OS part MAY be allowed to call certain functions (such as AES) of SmartAnthill OtA Bootloader (via “Dynamic Linking”, described below); this MAY be used to reduce code footprint of the Zepto OS.

Pairing with SmartAnthill OtA Programming

Devices which support SmartAnthill OtA Programming, as any other SmartAnthill Device, perform “Pairing” (see SmartAnthill Pairing document for details). As a result of pairing, SmartAnthill Device obtains a secret key which is shared with SmartAnthill Central Controller.

For non-OtA-Programmable SmartAnthill Devices, this secret-key-obtained-from-pairing is used directly as a SASP key for SmartAnthill OS communications. However, for OtA-Programmable SmartAnthill Devices, this secret-key-obtained-from-pairing is used as an OtA Programming Key (and SASP key for SmartAnthill OS communications MUST be an independent key, which is passed as a part of SmartAnthill-OtA-Programming-Protocol exchange, which uses OtA Programming Key for authentication/encryption).

NB: if necessary, OtA programming MAY be used to change SmartAnthill OS SASP key (via reloading the whole SmartAnthill OS); this MAY be used for several reasons, including refreshing keys and dealing with running-out-of-SASP-nonces scenarios. In addition, OTA_BOOTLOADER mode of SAOtAPP MAY be used to refresh OTA Programming Key itself.

SmartAnthill OtA Programming Protocol

All OtA Programming of SmartAnthill Devices MUST be performed ONLY via the following SmartAnthill OtA Programming Protocol (SAOtAPP). SAOtAPP MUST be supported ONLY over SACCP over SAGDP over SASP, where SASP key MUST be OtA Programming Key (see above, obtained from “pairing”); OtA Programming Key MUST NOT be an all-zeros key (as defined in SmartAnthill Pairing).

SAOtAPP consists of the following messages:

OtA Capabilities Request: | (empty body). “Additional SACCP Bits”: 0x0

OtA Capabilities Response: | PLATFORM-ID | FLAGS | OS-ADDRESS | BOOTLOADER-ADDRESS | MAX-OS-SIZE | MAX-BOOTLOADER-SIZE | MAX-ROBUST-OS-SIZE | MAX-RAM | PLATFORM-SPECIFIC | “Additional SACCP Bits”: 0x0

where PLATFORM-ID is an Encoded-Unsigned-Int<max=2> field representing TODO enum; FLAGS is a 1-byte substrate for a bitfield, with bit [0] being OTA_OS_SUPPORT, bit [1] being OTA_ROBUST_BOOTLOADER_SUPPORT, bit [2] being OTA_ROBUST_OS_SUPPORT, bit [3] being OTA_DYNAMIC_LINKING_SUPPORT, and bits [4..7] reserved (MUST be zeros); OS-ADDRESS and BOOTLOADER-ADDRESS are Encoded-Unsigned-Int<max=2> fields representing addresses for which program and bootloader respectively should be compiled; MAX-OS-SIZE and MAX-BOOTLOADER-SIZE are Encoded-Unsigned-Int<max=2> fields specifying maximum possible sizes for OS and bootloader; MAX-ROBUST-OS-SIZE is an Encoded-Unsigned-Int<max=2> field described below; MAX-RAM is an Encoded-Unsigned-Int<max=2> field specifying maximum amount of RAM available on the Device; and PLATFORM-SPECIFIC fields depend on PLATFORM-ID (TODO).
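
As an illustration of the FLAGS bitfield layout, the following sketch turns the 1-byte FLAGS value into named booleans and rejects values with reserved bits set. The function name and the choice to raise `ValueError` are assumptions made for this example; the specification only says that the reserved bits MUST be zero.

```python
# Bit positions as listed for the FLAGS field of OtA Capabilities Response.
OTA_FLAG_NAMES = (
    "OTA_OS_SUPPORT",                 # bit [0]
    "OTA_ROBUST_BOOTLOADER_SUPPORT",  # bit [1]
    "OTA_ROBUST_OS_SUPPORT",          # bit [2]
    "OTA_DYNAMIC_LINKING_SUPPORT",    # bit [3]
)


def decode_ota_flags(flags: int) -> dict:
    """Map the 1-byte FLAGS value to {flag name: bool}."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("FLAGS is a 1-byte field")
    if flags & 0xF0:
        # Rejecting is this sketch's choice; the text only requires zeros.
        raise ValueError("reserved bits [4..7] MUST be zero")
    return {name: bool(flags & (1 << bit))
            for bit, name in enumerate(OTA_FLAG_NAMES)}
```

For instance, `decode_ota_flags(0b1001)` reports OTA_OS_SUPPORT and OTA_DYNAMIC_LINKING_SUPPORT as set and the two "robust" flags as clear.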

MAX-ROBUST-OS-SIZE field specifies maximum size of the OS for which Device guarantees robust rewriting. If MAX-ROBUST-OS-SIZE is zero, Device does not guarantee robust programming for OTA-TYPE=OTA_OS. Regardless of MAX-ROBUST-OS-SIZE, Device MUST guarantee robustness for OTA-TYPE=OTA_ROBUST_BOOTLOADER (if it is supported).

NB: to comply with “robustness” requirements, Device MAY need to return different addresses for OS-ADDRESS and/or BOOTLOADER-ADDRESS at different times; therefore, SmartAnthill Client MUST re-issue OtA Capabilities Request before every programming, and MUST NOT cache BOOTLOADER-ADDRESS and/or OS-ADDRESS.

OtA Dynamic Linking Request: | (empty body). “Additional SACCP Bits”: 0x1

OtA Dynamic Linking Response: | BOOTLOADER-VENDOR | BOOTLOADER-VERSION | BOOTLOADER-COMPILER-ID | BOOTLOADER-COMPILER-VERSION | BOOTLOADER-COMPILER-CALLING-CONVENTION-ID | DYNAMIC-LINK-ID1 | DYNAMIC-LINK-ADDRESS1 | DYNAMIC-LINK-ID2 | DYNAMIC-LINK-ADDRESS2 | ... | “Additional SACCP Bits”: 0x1

where BOOTLOADER-VENDOR (TODO: list and way to apply for one), BOOTLOADER-VERSION (up to vendor, but SHOULD be monotonic), and BOOTLOADER-COMPILER-ID (TODO: list) are Encoded-Unsigned-Int<max=2> fields, BOOTLOADER-COMPILER-VERSION is a null-terminated string such as “4.8.1a”, BOOTLOADER-COMPILER-CALLING-CONVENTION-ID (TODO: list) is an Encoded-Unsigned-Int<max=2> field, DYNAMIC-LINK-ID* is an Encoded-Unsigned-Int<max=2> field specifying dynamic-id, and DYNAMIC-LINK-ADDRESS* field is an absolute address of the corresponding function (residing within current Bootloader). Only implemented functions are listed in OtA Dynamic Linking Response.

Dynamic linking is a mechanism which allows SmartAnthill OS to call certain functions from SmartAnthill Bootloader to save on size; it is described in more detail below.

OtA Start Request: | OTA-TYPE | PROGRAM-ADDRESS | PROGRAM-ENTRY-POINT | OPTIONAL-KEY | PROGRAM-SIZE | DATA-SIZE | DATA | “Additional SACCP Bits”: 0x2

where OTA-TYPE is a 1-byte enum, which can be one of OTA_OS, OTA_ROBUST_OS, or OTA_ROBUST_BOOTLOADER, PROGRAM-ADDRESS is an Encoded-Unsigned-Int<max=2> field, which represents address for which the program (os or bootloader) has been compiled, PROGRAM-ENTRY-POINT is a point where control should be passed within PROGRAM (for OTA-TYPE=OTA_*OS, it is an address of “SACCP-OS Handler” as described above, for OTA-TYPE=OTA_ROBUST_BOOTLOADER it is usually the same as PROGRAM-ADDRESS), OPTIONAL-KEY is a 16-byte field, which is present only if OTA-TYPE=OTA_*OS, and represents SmartAnthill OS SASP key, PROGRAM-SIZE is a size of the whole program, DATA-SIZE is an Encoded-Unsigned-Int<max=2> field, and DATA has size of DATA-SIZE.

OtA Start Request message instructs Device to start programming. If OTA-TYPE = OTA_OS, then previous OS MAY be discarded right away. However, if OTA-TYPE = OTA_ROBUST_*, existing OS/bootloader MUST be preserved intact until OtA Commit Request message is received (and further processed as described in OtA Commit Request message). If OTA_ROBUST_OS is requested but PROGRAM-SIZE is greater than MAX-ROBUST-OS-SIZE returned in OtA Capabilities Response, Device MAY return OTA_ERROR_TOOLARGE error.

OtA Start Request message starts a new OtA Programming Session. While OtA Programming Session is in progress, SACCP MUST block all the other messages and return TODO errors, until the session ends (either via OtA Abort Request or via OtA Commit Request). Programming Session being in progress is specified by having OTA_PROGRAMMING_INPROGRESS in-RAM state.

If Device receives any OtA message other than OtA Capabilities Request, OtA Dynamic Linking Request, or OtA Start Request while it is in OTA_PROGRAMMING_IDLE state, it is an OTA_ERROR_NOPROGRAMMING error.

OtA Continue Request: | CURRENT-OFFSET | DATA-SIZE | DATA | “Additional SACCP Bits”: 0x3

where CURRENT-OFFSET is an offset within the program (CURRENT-OFFSET is redundant, and MUST be equal to previous_OtA_message_offset + previous_OtA_message_data_size; otherwise it is a TODO error), and DATA-SIZE and DATA are similar to those in OtA Start Request message.

OtA Abort Request: | (empty body). “Additional SACCP Bits”: 0x4

OtA Abort Request instructs Device to abort current programming session. The only valid reply to OtA Abort Request is OtA Error Response with an error code OTA_ERROR_ABORTED.

OtA Commit Request: | CURRENT-OFFSET | DATA-SIZE | DATA | PROGRAM-SIZE | SACCP-CHECKSUM | “Additional SACCP Bits”: 0x5

where CURRENT-OFFSET, DATA-SIZE and DATA are similar to those in OtA Continue Request message, PROGRAM-SIZE is overall program size (PROGRAM-SIZE is redundant, and MUST match PROGRAM-SIZE in OtA Start Request message, otherwise it is a TODO error), and SACCP-CHECKSUM is a 16-byte SACCP checksum (as defined in SmartAnthill Command&Control Protocol (SACCP) document) of the whole program.

OtA Commit Request message instructs the Device to check integrity of the program (using SACCP-CHECKSUM), and to “commit” current changes. In particular, for OTA-TYPE=OTA_ROBUST_BOOTLOADER and for OTA-TYPE=OTA_ROBUST_OS, Device MUST ensure atomic switch from existing bootloader to new (loaded) one. For example, it MAY be implemented as rewriting one single address within one single JMP instruction in the very beginning of the bootloader; it MUST NOT be implemented as copying of new bootloader to the old location (as it is not possible to ensure atomicity in this case, and bootloader might be lost).
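
The session rules for Start, Continue, Abort and Commit can be sketched as a small Device-side state tracker. This is an illustration under stated assumptions, not the reference implementation: the class and method names are hypothetical, `"OTA_ERROR_TODO"` is a placeholder for the error codes this document still marks as TODO, checksum verification and the actual flash writes are omitted, and the behaviour of a second OtA Start Request during a session (here: it simply starts over) is not specified by the text.

```python
class OtaSession:
    """Tracks one OtA Programming Session using in-RAM state only."""

    IDLE = "OTA_PROGRAMMING_IDLE"
    INPROGRESS = "OTA_PROGRAMMING_INPROGRESS"

    def __init__(self):
        self.state = self.IDLE
        self.program_size = 0
        self.next_offset = 0  # expected CURRENT-OFFSET of the next message

    def start(self, program_size, data):
        # OtA Start Request opens a session and carries the first chunk.
        self.state = self.INPROGRESS
        self.program_size = program_size
        self.next_offset = len(data)
        return "OK"

    def cont(self, current_offset, data):
        if self.state != self.INPROGRESS:
            return "OTA_ERROR_NOPROGRAMMING"
        if current_offset != self.next_offset:
            return "OTA_ERROR_TODO"  # placeholder: error code is TODO
        self.next_offset += len(data)
        return "OK"

    def abort(self):
        if self.state != self.INPROGRESS:
            return "OTA_ERROR_NOPROGRAMMING"
        self.state = self.IDLE
        return "OTA_ERROR_ABORTED"  # the only valid reply to Abort

    def commit(self, current_offset, data, program_size):
        if self.state != self.INPROGRESS:
            return "OTA_ERROR_NOPROGRAMMING"
        if current_offset != self.next_offset:
            return "OTA_ERROR_TODO"  # placeholder: error code is TODO
        if program_size != self.program_size:
            return "OTA_ERROR_TODO"  # MUST match the Start Request value
        # Integrity check and the atomic switch would happen here.
        self.state = self.IDLE
        return "OK"
```

A 6-byte program sent in three 2-byte chunks would be driven as `start(6, b"ab")`, `cont(2, b"cd")`, `commit(4, b"ef", 6)`, after which the tracker is back in the idle state.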

OtA Ok Response: | (empty body). “Additional SACCP Bits”: 0x2

OtA Ok Response can be sent in response to any of the following: OtA Start Request, OtA Continue Request, or OtA Commit Request.

OtA Error Response: | ERROR-CODE | “Additional SACCP Bits”: 0x3

where ERROR-CODE is an Encoded-Unsigned-Int<max=2> field. OtA Error Response MAY be sent in response to any of the OtA * Request messages. Error codes: OTA_ERROR_ABORTED, OTA_ERROR_TOOLARGE, OTA_ERROR_NOPROGRAMMING, the rest TODO.

All OtA * Request messages above are sent as a payload for SACCP OTA-REQUEST messages (with “Additional SACCP Bits” passed alongside), and all OtA * Response messages above are sent as a payload for SACCP OTA-RESPONSE messages (with “Additional SACCP bits” passed alongside).

NB: Current implementation of SAOtAPP doesn’t allow using SAGDP Streaming (TODO). It means that it is slower than it might be; however, this decision simplifies and reduces the portion of SmartAnthill Stack which needs to be implemented as a part of SmartAnthill OtA Bootloader. TODO: study if adding streaming support makes sense

Dynamic Linking

To allow saving on size of large (by MCU standards) functions such as AES and EAX, SmartAnthill Device MAY support a “Dynamic Linking” mechanism. In this case, Device SHOULD return OTA_DYNAMIC_LINKING_SUPPORT flag in OtA Capabilities Response, and SHOULD return a list of implemented functions in OtA Dynamic Linking Response. Each supported function has its own well-known ID; each ID specifies not only a function name, but an exact C prototype, so whenever a prototype changes, a new ID has to be introduced.

Currently supported functions include:

ID                                        Prototype
DYNAMIC_LINK_AES128_ENCRYPT               void aes128_encrypt(void* block, const void* key); TODO
DYNAMIC_LINK_AES128_CTRENCRYPTDECRYPT     TODO
DYNAMIC_LINK_OMAC_AES128                  TODO
DYNAMIC_LINK_EAX_AES128_ENCRYPTAUTH       TODO
DYNAMIC_LINK_EAX_AES128_DECRYPTAUTHCHECK  TODO

TODO: more if applicable
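
To illustrate how a Client might use the OtA Dynamic Linking Response when preparing an OS image, the sketch below splits the functions an OS build needs into those the Bootloader already provides (with their absolute addresses) and those the OS must carry itself. The function name is hypothetical, the input is assumed to be already-decoded (DYNAMIC-LINK-ID, DYNAMIC-LINK-ADDRESS) pairs, and the numeric IDs in the usage note are made up, since the real ID values are not defined in this document.

```python
def split_dynamic_links(response_links, needed_ids):
    """Return (resolved, missing) for the IDs an OS build wants to link.

    response_links: iterable of (dynamic_link_id, absolute_address) pairs.
    needed_ids: IDs of the functions the OS would like to call.
    """
    available = dict(response_links)
    # Functions implemented by the current Bootloader, with addresses.
    resolved = {i: available[i] for i in needed_ids if i in available}
    # Functions not listed in the response must be linked into the OS.
    missing = [i for i in needed_ids if i not in available]
    return resolved, missing
```

With made-up IDs, `split_dynamic_links([(1, 0x1F00), (3, 0x2100)], [1, 2, 3])` resolves IDs 1 and 3 and reports ID 2 as missing.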

Zepto VM

Version:v0.2.12

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture and SmartAnthill Command&Control Protocol (SACCP) documents, please make sure to read them before proceeding.

Zepto VM is a minimalistic virtual machine used by SmartAnthill Devices. It implements SACCP (SmartAnthill Command&Control Protocol) on the side of the SmartAnthill Device (and SACCP corresponds to Layer 7 of OSI/ISO network model). By design, Zepto VM is intended to run on devices with extremely limited resources (as little as 512 bytes of RAM).

Sales pitch (not to be taken seriously!)

Zepto VM is the only VM which allows you to process fully-fledged Turing-complete byte-code, enables you to program your MCU the way professionals do, with all the bells and whistles such as flow control (including both conditions and loops), postfix expressions, subroutine calls, C routine calls, MCU sleep mode (where not prohibited by law of physics), and even a reasonable facsimile of “green threads” - all at a miserable price of 1 to 50 bytes of RAM (some restrictions apply, batteries not included). Yes, today you can get many of these features at the price of 1 (one) byte of RAM (offer is valid while supplies last, stores open late).

We’re so confident in our product that we offer a unique memory-back guarantee for 30 days or 30 seconds, whichever comes first. Yes, if you are not satisfied with Zepto VM and remove it from your MCU, you’ll immediately get all your hard earned bytes back, no questions asked, no strings attached. TODO: proof of being Turing-complete via being able to implement brainfuck

Zepto VM Philosophy

“Zepto” is a prefix in the metric system (SI system) which denotes a factor of 10^-21. This is 10^12 times less than “nano”, a billion times less than “pico”, and a million times less than “femto”. As of now, ‘zepto’ is the second smallest prefix in SI system (we didn’t take the smallest one, because there is always room for improvement).

Zepto VM is the smallest VM we were able to think of, with an emphasis on using as little RAM as possible. While in theory it might be possible to implement something smaller, in practice it is difficult to go below 1 byte of RAM (which is the minimum overhead of Zepto VM-One).

Note on memory overhead

While Zepto VM itself indeed uses a ridiculously low amount of RAM, a developer needs to understand that using some capabilities of Zepto VM will implicitly require more RAM. For example, stacking several replies in one packet will implicitly require more RAM for “reply buffer”. And using the “green pseudo-threads” feature will require storing certain portions of the intermediate state of all the plugins running simultaneously (while without “green pseudo-threads” this RAM can be reused, so the intermediate state of only one plugin needs to be stored at a time).

Zepto VM Restrictions

As Zepto VM implements an “Execution Layer” of SACCP, it needs to implement all “Execution Layer Restrictions” set in SmartAnthill Command&Control Protocol (SACCP) document. While present document doesn’t duplicate these restrictions, it aims to specify them in appropriate places (for example, when specific instructions are described).

“Program Errors” as specified in Execution Layer Restrictions are implemented as ZEPTOVM_PROGRAMERROR_* Zepto VM exceptions as described below.

Bodyparts and Plugins

According to a more general SmartAnthill architecture, each SmartAnthill Device (a.k.a. ‘Ant’) has one or more sensors and/or actuators, with each sensor or actuator known as an ‘ant body part’. Each ‘body part’ is assigned its own id, which is stored in ‘SmartAnthill Database’ within SmartAnthill Client (which in turn is usually implemented by SmartAnthill Central Controller). For each body part type, there is a ‘plugin’ (so if there are body parts of the same type in the device, number of plugins can be smaller than number of body parts). Plugins are pieces of code which are written in C language and programmed into MCU of SmartAnthill device.

Bodyparts are identified by BODYPART-ID. BODYPART-IDs MAY be negative. Non-negative values for BODYPART-IDs are used for device-specific body parts. Negative values for BODYPART-IDs are reserved for well-known built-in body parts. Currently, such body parts include:

  • BUILTIN_BODYPART_PAIRING (see SmartAnthill Pairing document for details)
  • BUILTIN_BODYPART_AES (TODO; MUST NOT allow using any of Device keys, key MUST be provided as a plugin parameter)

Reply Buffer and Reply Frames

To handle plugins and replies, Zepto VM uses “reply buffer”, which consists of “reply frames”. Whenever plugin is called, it is asked to fill its own “reply frame”. These “reply frames” are appended to each other in a “reply buffer”, so that if there is more than one EXEC instruction, “reply buffer” consists of “reply frames” in the order of EXEC instructions. As “reply buffer” would be needed regardless of Zepto VM (even simple call to a plugin would need to implement some kind of “reply frame”), it is not considered a part of Zepto VM and its size is not counted as “memory overhead” of Zepto VM.

Structure of Plugin Data

Data to be passed to and from plugins is generally described in Plugin Manifest, as described in SmartAnthill Plugins document.

Reply Frame Structure

Reply Frames have the following structure:

| OPTIONAL-HEADERS | FLAGS-AND-SIZE | REPLY-BODY |

where OPTIONAL-HEADERS is described below, FLAGS-AND-SIZE is an Encoded-Unsigned-Int<max=2> field, and REPLY-BODY is data as returned from plugin (possibly truncated, see below), with the size determined by FLAGS-AND-SIZE field as described below.

FLAGS-AND-SIZE field is an Encoded-Unsigned-Int<max=2> bitfield substrate, which is further considered as follows:

  • bit [0] is a flag which specifies that there are no more optional headers (always equals 1 when REPLY-BODY immediately follows).
  • bit [1] is a flag which specifies if REPLY-BODY has been truncated
  • bits [2..] specify size of the REPLY-BODY
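
As a quick illustration of this layout, the sketch below splits an already-decoded FLAGS-AND-SIZE value into its three parts. The function name is hypothetical, and decoding the Encoded-Unsigned-Int<max=2> bytes into an integer (defined in SmartAnthill 2.0 Protocol Stack document) is assumed to have happened beforehand.

```python
def decode_flags_and_size(value: int):
    """Split a decoded FLAGS-AND-SIZE value into its bitfields.

    Returns (no_more_headers, truncated, body_size).
    """
    no_more_headers = bool(value & 0x1)  # bit [0]
    truncated = bool(value & 0x2)        # bit [1]
    body_size = value >> 2               # bits [2..]
    return no_more_headers, truncated, body_size
```

For example, the value 0b10101 (21) means "no more optional headers, not truncated, 5 bytes of reply body".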

OPTIONAL-HEADERS is one or more of optional headers. Each of optional headers has the following structure:

| HEADER-FLAGS-AND-SIZE | HEADER-DATA |

where HEADER-FLAGS-AND-SIZE field is an Encoded-Unsigned-Int<max=2> bitfield substrate, which is further considered as follows:

  • bit [0] is zero
  • bits [1..3] is the type of optional header
  • bits [4..] is the size of HEADER-DATA
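
A minimal sketch of packing and unpacking HEADER-FLAGS-AND-SIZE follows, again working on the integer value rather than on its Encoded-Unsigned-Int<max=2> byte encoding. Both function names are hypothetical, and raising `ValueError` on malformed input is a choice made for this example only.

```python
def pack_header_flags_and_size(header_type: int, data_size: int) -> int:
    """Build the HEADER-FLAGS-AND-SIZE value; bit [0] is left as zero."""
    if not 0 <= header_type <= 7:
        raise ValueError("type of optional header occupies bits [1..3]")
    if data_size < 0:
        raise ValueError("size of HEADER-DATA cannot be negative")
    return (data_size << 4) | (header_type << 1)


def unpack_header_flags_and_size(value: int):
    """Return (header_type, data_size) for an optional header."""
    if value & 0x1:
        # bit [0] == 1 marks FLAGS-AND-SIZE of the reply body instead.
        raise ValueError("bit [0] must be zero for an optional header")
    return (value >> 1) & 0x7, value >> 4
```

Because bit [0] distinguishes the two cases, a parser can read one value at a time and keep treating it as an optional header until it meets a value with bit [0] set.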

Optional Headers

Currently, two optional headers are supported: Plugin-Exception optional header, and Plugin-Exception-Call-Trace optional header.

For Plugin-Exception optional header, type of optional header is 0x0, and HEADER-DATA has the following structure: | EXCEPTION-CODE | EXCEPTION-FILE-HASH | EXCEPTION-LINE |, where EXCEPTION-CODE and EXCEPTION-LINE are Encoded-Unsigned-Int<max=2> fields, and EXCEPTION-FILE-HASH is 2-byte file hash (encoded using “SmartAnthill Endianness”).

Plugin-Exception-Call-Trace optional header is optionally added if Plugin-Exception optional header is present, and call trace information is available (in particular, it requires ZEPTO_UNWIND() exception mechanism to be employed).

For Plugin-Exception-Call-Trace optional header, type of optional header is 0x2, and HEADER-DATA is a sequence of | TRACE-FILE-HASH | TRACE-LINE | frames (starting from most deep function calls), where TRACE-LINE is an Encoded-Unsigned-Int<max=2> field, and TRACE-FILE-HASH is 2-byte file hash (encoded using “SmartAnthill Endianness”).

Plugin-Exception optional header is added if an exception (ZEPTO_THROW, see SmartAnthill Plugins document for details) has been thrown while the plugin was executed.

NB: due to very limited resources and lack of memory separation support on most MCUs (i.e. all the plugins are effectively running in the same protection ring as the Zepto OS itself), it is very easy to break Zepto OS by injecting an ill-behaved plugin. Zepto OS and Zepto VM are aiming to provide as much debug information as possible, but there are still scenarios when Zepto OS is not able to recover from bugs in plugin, and will not be able to report anything back.

Packet Chains

In SACCP (and in Zepto VM as an implementation of SACCP), all interactions between SmartAnthill Client and SmartAnthill Device are considered as “packet chains”, when one of the parties initiates communication by sending a packet P1, another party responds with a packet P2, then first party may respond to P2 with P3 and so on. Whenever Zepto VM issues a packet to an underlying protocol, it needs to specify whether a packet is a first, intermediate, or last within a “packet chain” (using ‘is-first’ and ‘is-last’ flags; note that due to “rules of engagement” described below, ‘is-first’ and ‘is-last’ flags are inherently incompatible, which MAY be relied on by implementation). This information allows underlying protocol to arrange for proper retransmission if some packets are lost during communication. See SmartAnthill 2.0 Protocol Stack document for more details on “packet chains”.

Zepto VM Instructions

Notation

  • Throughout this document, ‘|’ denotes field boundaries. All fields (including bitfield substrates, as described in SmartAnthill 2.0 Protocol Stack document) take a whole number of bytes.
  • All Zepto VM instructions have the same basic format: | OP-CODE | OP-PARAMS |, where OP-CODE is a 1-byte operation code, and length and content of OP-PARAMS are implicitly defined by OP-CODE.

Zepto VM Opcodes

  • ZEPTOVM_OP_DEVICECAPS
  • ZEPTOVM_OP_EXEC
  • ZEPTOVM_OP_PUSHREPLY
  • ZEPTOVM_OP_SLEEP
  • ZEPTOVM_OP_TRANSMITTER
  • ZEPTOVM_OP_MCUSLEEP
  • ZEPTOVM_OP_POPREPLIES /* limited support in Zepto VM-One, full support from Zepto VM-Tiny */
  • ZEPTOVM_OP_EXIT
  • ZEPTOVM_OP_APPENDTOREPLY /* limited support in Zepto VM-One, full support from Zepto VM-Tiny */
  • /* starting from the next opcode, instructions are not supported by Zepto VM-One */
  • ZEPTOVM_OP_JMP
  • ZEPTOVM_OP_JMPIFREPLYFIELD_LT
  • ZEPTOVM_OP_JMPIFREPLYFIELD_GT
  • ZEPTOVM_OP_JMPIFREPLYFIELD_EQ
  • ZEPTOVM_OP_JMPIFREPLYFIELD_NE
  • ZEPTOVM_OP_MOVEREPLYTOFRONT
  • /* starting from the next opcode, instructions are not supported by Zepto VM-Tiny and below */
  • ZEPTOVM_OP_PUSHEXPR_CONSTANT
  • ZEPTOVM_OP_PUSHEXPR_REPLYFIELD
  • ZEPTOVM_OP_EXPRUNOP
  • ZEPTOVM_OP_EXPRUNOP_EX
  • ZEPTOVM_OP_EXPRUNOP_EX2
  • ZEPTOVM_OP_EXPRBINOP
  • ZEPTOVM_OP_EXPRBINOP_EX
  • ZEPTOVM_OP_EXPRBINOP_EX2
  • ZEPTOVM_OP_JMPIFEXPR_LT
  • ZEPTOVM_OP_JMPIFEXPR_GT
  • ZEPTOVM_OP_JMPIFEXPR_EQ
  • ZEPTOVM_OP_JMPIFEXPR_NE
  • ZEPTOVM_OP_JMPIFEXPR_EX_LT
  • ZEPTOVM_OP_JMPIFEXPR_EX_GT
  • ZEPTOVM_OP_JMPIFEXPR_EX_EQ
  • ZEPTOVM_OP_JMPIFEXPR_EX_NE
  • ZEPTOVM_OP_CALL
  • ZEPTOVM_OP_RET
  • ZEPTOVM_OP_SWITCH
  • ZEPTOVM_OP_SWITCH_EX
  • ZEPTOVM_OP_INCANDJMPIF
  • ZEPTOVM_OP_DECANDJMPIF
  • /* starting from the next opcode, instructions are not supported by Zepto VM-Small and below */
  • ZEPTOVM_OP_PARALLEL
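
The level comments in the opcode list can be turned into a simple lookup that tells whether a given level accepts a given opcode. This is only a sketch: the table and function names are hypothetical, opcode names are written without the ZEPTOVM_OP_ prefix, the ordering One < Tiny < Small is inferred from the "and below" wording, and the level above Small is given the placeholder name "Above-Small". The "limited support" notes for POPREPLIES and APPENDTOREPLY in Zepto VM-One are not modelled.

```python
# Lowest level first; "Above-Small" is a placeholder name for this sketch.
LEVEL_ORDER = ("One", "Tiny", "Small", "Above-Small")

# Opcodes grouped by the level that introduces them (ZEPTOVM_OP_ omitted).
INTRODUCED_AT = {
    "One": ["DEVICECAPS", "EXEC", "PUSHREPLY", "SLEEP", "TRANSMITTER",
            "MCUSLEEP", "POPREPLIES", "EXIT", "APPENDTOREPLY"],
    "Tiny": ["JMP", "JMPIFREPLYFIELD_LT", "JMPIFREPLYFIELD_GT",
             "JMPIFREPLYFIELD_EQ", "JMPIFREPLYFIELD_NE", "MOVEREPLYTOFRONT"],
    "Small": ["PUSHEXPR_CONSTANT", "PUSHEXPR_REPLYFIELD", "EXPRUNOP",
              "EXPRUNOP_EX", "EXPRUNOP_EX2", "EXPRBINOP", "EXPRBINOP_EX",
              "EXPRBINOP_EX2", "JMPIFEXPR_LT", "JMPIFEXPR_GT",
              "JMPIFEXPR_EQ", "JMPIFEXPR_NE", "JMPIFEXPR_EX_LT",
              "JMPIFEXPR_EX_GT", "JMPIFEXPR_EX_EQ", "JMPIFEXPR_EX_NE",
              "CALL", "RET", "SWITCH", "SWITCH_EX", "INCANDJMPIF",
              "DECANDJMPIF"],
    "Above-Small": ["PARALLEL"],
}


def is_supported(opcode: str, vm_level: str) -> bool:
    """True if `opcode` is introduced at `vm_level` or at a lower level."""
    rank = LEVEL_ORDER.index(vm_level)
    return any(opcode in INTRODUCED_AT[level]
               for level in LEVEL_ORDER[:rank + 1])
```

A Client could run such a check before sending a program, so that it avoids triggering an invalid-instruction exception on a Device with a lower Zepto VM level.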

Zepto VM Exceptions

If Zepto VM encounters a problem, it reports it as a “VM exception” (not to be confused with Plugin-Exception, which is different; normally, on plugin exception Zepto VM records it in respective “reply frame”, and continues program execution). Whenever Zepto VM exception characterized by EXCEPTION-CODE occurs, it is processed as follows:

  • “reply buffer” is converted into the following format: |EXCEPTION-CODE|FLAGS-AND-INSTRUCTION-POSITION|EXISTING-REPLY-BUFFER-DATA| , where all fields except for EXISTING-REPLY-BUFFER-DATA are Encoded-Unsigned-Int<max=2>, and EXISTING-REPLY-BUFFER-DATA fills the rest of the message. In some cases (for example, if there is insufficient RAM), EXISTING-REPLY-BUFFER-DATA MAY be truncated (which is indicated in FLAGS-AND-INSTRUCTION-POSITION field). Rationale: in certain scenarios, this EXISTING-REPLY-BUFFER-DATA, while incomplete, may allow SmartAnthill Client to extract useful information about the partially successful command. FLAGS-AND-INSTRUCTION-POSITION field is an Encoded-Unsigned-Int<max=2> bitfield substrate, which is further considered as follows:
    • bit [0] - specifies if EXISTING-REPLY-BUFFER-DATA has been truncated
    • bits [1..] - specify instruction position where VM exception has occurred
  • This reply is passed to the underlying protocol as an ‘exception’.

Currently, Zepto VM may issue the following exceptions:

  • ZEPTOVM_INVALIDINSTRUCTION /* Note that this exception may also be issued when an instruction is encountered which is legal in general, but is not supported by current level of Zepto VM. */
  • ZEPTOVM_INVALIDENCODEDSIZE /* Issued whenever Encoded-*-Int<max=...> is an invalid encoding, as defined in SmartAnthill 2.0 Protocol Stack document */
  • ZEPTOVM_PLUGINERROR
  • ZEPTOVM_INVALIDPARAMETER
  • ZEPTOVM_INVALIDREPLYNUMBER
  • ZEPTOVM_EXPRSTACKUNDERFLOW
  • ZEPTOVM_EXPRSTACKINVALIDOFFSET
  • ZEPTOVM_EXPRSTACKFROZENVIOLATION
  • ZEPTOVM_EXPRSTACKOVERFLOW
  • ZEPTOVM_PROGRAMERROR_INVALIDREPLYFLAG
  • ZEPTOVM_PROGRAMERROR_INVALIDREPLYSEQUENCE

Zepto VM End of Execution

Zepto VM program exits when the sequence of instructions has ended. At this point, an equivalent of |EXIT|<ISLAST>,<0>| is implicitly executed (see description of ‘EXIT’ instruction below); this causes “reply buffer” to be sent back to the SmartAnthill Client, with ‘is-last’ flag set. Alternatively, an “EXIT” instruction (see below) may end program execution explicitly; in this case, parameters to “EXIT” instruction may specify additional properties as described in “EXIT” instruction description.

Zepto VM Overriding Command

If a new command arrives from SmartAnthill Client while Zepto VM is executing a current program, Zepto VM will (at the very first opportunity) automatically abort execution of the current program, and start executing the new one. This behaviour is consistent with the concept of “SmartAnthill Client always knows better” which is used throughout the SmartAnthill protocol stack. Such a command may be used, for example, by SmartAnthill Client to abort execution of a long-running request and ask SmartAnthill Device to do something else.

Zepto VM Levels

To accommodate SmartAnthill devices with different capabilities and different amount of RAM, Zepto VM implementations are divided into several levels. Minimal level, which is mandatory for all implementations of Zepto VM, is Level One. Each subsequent Zepto VM level adds support for some new instructions while still supporting all the capabilities of underlying levels.

TODO: timeouts

Level One

ZeptoVM-One is the absolute minimum implementation of Zepto-VM, which allows executing only a linear sequence of commands, at the cost of 1 byte of additional RAM. ZeptoVM-One supports the following instructions:

| ZEPTOVM_OP_DEVICECAPS | REQUESTED-FIELDS |

where ZEPTOVM_OP_DEVICECAPS is 1-byte opcode, and REQUESTED-FIELDS is described below.

DEVICECAPS instruction pushes Device-Capabilities-Reply to “reply buffer” as a “reply frame”. Usually, DEVICECAPS instruction is the only instruction in the program. If there are too many requested-fields (for example, they don’t fit into RAM, or don’t fit into MTU) - as any other “reply frame”, it MAY be truncated.

REQUESTED-FIELDS is a sequence of indicators which configuration parameters are requested:

Indicator                                       Return Type
SACCP_GUARANTEED_PAYLOAD                        DEVICE-CAPS-UINT2, as described below
ZEPTOVM_LEVEL                                   1 byte (enum)
ZEPTOVM_REPLY_BUFFER_AND_EXPR_STACK_BYTE_SIZES  See below
ZEPTOVM_REPLY_STACK_SIZE                        DEVICE-CAPS-UINT2
ZEPTOVM_EXPR_FLOAT_TYPE                         1 byte (enum)
ZEPTOVM_MAX_PSEUDOTHREADS                       DEVICE-CAPS-UINT2
DEVICECAPS_END_OF_LIST                          n/a

DEVICECAPS_END_OF_LIST specifies end of REQUESTED-FIELDS list. Reply to DEVICECAPS instruction contains data which correspond to indicators (and come in the same order as indicators within the request).

If Device doesn’t support specific indicator, it MUST return single byte 0xFF in the appropriate place of the reply buffer. All the valid replies are constructed in the way that cannot possibly have first byte as 0xFF.

DEVICE-CAPS-UINT2 is an Encoded-Unsigned-Int<max=2> which is used as a bitfield substrate, with bit[0] being always 0 (this guarantees that “0xFF is impossible” requirement is met, see about even byte encodings and “indicate an error in an unknown-length field” scenarios in SmartAnthill 2.0 Protocol Stack), and bits[1..] representing value.
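
The DEVICE-CAPS-UINT2 substrate can be illustrated as follows. The sketch works on the integer substrate only; the byte-level Encoded-Unsigned-Int<max=2> encoding, which is what actually makes a leading 0xFF byte impossible, is defined in SmartAnthill 2.0 Protocol Stack and is not modelled here. Function names are hypothetical.

```python
def to_device_caps_uint2(value: int) -> int:
    """Place `value` in bits [1..] of the substrate; bit [0] stays 0."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return value << 1


def from_device_caps_uint2(substrate: int) -> int:
    """Recover the value from a DEVICE-CAPS-UINT2 substrate."""
    if substrate & 0x1:
        raise ValueError("bit [0] of DEVICE-CAPS-UINT2 is always 0")
    return substrate >> 1
```

A value of 100 is therefore carried as substrate 200, and any substrate with an odd value can be rejected as malformed.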

For ZEPTOVM_REPLY_BUFFER_AND_EXPR_STACK_BYTE_SIZES indicator, DEVICECAPS instruction returns 3 fields, first having type DEVICE-CAPS-UINT2, the other two having type Encoded-Unsigned-Int<max=2>. First field is MAX_REPLY_BUFFER_SIZE (in bytes), second field is MAX_EXPR_STACK_BYTE_SIZE (in bytes, so to calculate number of stack entries one needs to use ZEPTOVM_EXPR_FLOAT_TYPE), and third field is MAX_COMBINED_REPLY_BUFFER_SIZE_AND_EXPR_STACK_BYTE_SIZE (in bytes). If device implements “reply buffer” and “expression stack” separately, then device reports MAX_COMBINED_REPLY_BUFFER_SIZE_AND_EXPR_STACK_BYTE_SIZE as equal to MAX_REPLY_BUFFER_SIZE+MAX_EXPR_STACK_BYTE_SIZE; if they use a shared portion of memory and one can grow at the expense of the other, then device may report, for example, MAX_REPLY_BUFFER_SIZE = MAX_EXPR_STACK_BYTE_SIZE = MAX_COMBINED_REPLY_BUFFER_SIZE_AND_EXPR_STACK_BYTE_SIZE.

IMPORTANT: for all the reported sizes, device MUST report them as if implementation does not use any of them for temporary purposes. For example, if implementation has an expression stack with size=8, but when processing an EXPRBINOP_EX2 instruction with PUSH-FLAG=0, implementation first temporarily pushes the result to the top of the stack, before modifying the appropriate value of the stack - such a device MUST report expression stack size reduced by 1 entry (the one used for temporary purposes), i.e. 7*stack_entry_size.
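
The size-reporting rules amount to simple arithmetic, sketched below. The function names are hypothetical, and the 4-byte stack entry and 64-byte reply buffer used in the usage note are made-up example values (the real entry size follows from the expression float type).

```python
def reported_expr_stack_bytes(total_entries: int, temp_entries: int,
                              entry_size: int) -> int:
    """Expression stack size to report, excluding entries the
    implementation keeps for its own temporary use."""
    return (total_entries - temp_entries) * entry_size


def reported_combined_bytes(reply_buffer_bytes: int, expr_stack_bytes: int,
                            shared_memory: bool) -> int:
    """Combined size for separate areas, or one shared pool.

    For a shared pool this sketch assumes both arguments already hold
    the size of the whole pool, as in the example given in the text.
    """
    if shared_memory:
        return max(reply_buffer_bytes, expr_stack_bytes)
    return reply_buffer_bytes + expr_stack_bytes
```

With an 8-entry stack, one temporary entry and a 4-byte entry size, the device reports 7 * 4 = 28 bytes; with a separate 64-byte reply buffer the combined figure is 92 bytes.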

| ZEPTOVM_OP_EXEC | BODYPART-ID | DATA-SIZE | DATA |

where ZEPTOVM_OP_EXEC is a 1-byte opcode, BODYPART-ID is an Encoded-Signed-Int<max=2> id of the bodypart to be used, DATA-SIZE is an Encoded-Unsigned-Int<max=2> (as defined in SmartAnthill 2.0 Protocol Stack document) length of DATA field, and DATA is opaque data to be passed to the plugin associated with the body part identified by BODYPART-ID; DATA field has size DATA-SIZE. EXEC instruction invokes the plugin which corresponds to BODYPART-ID, and passes DATA of DATA-SIZE size to this plugin. Plugin always adds a reply to the reply buffer; reply size may vary, but MUST be at least 1 byte in length; otherwise it is a ZEPTOVM_PLUGINERROR exception.

| ZEPTOVM_OP_PUSHREPLY | REPLY-BODY-SIZE | REPLY-BODY |

where ZEPTOVM_OP_PUSHREPLY is a 1-byte opcode, REPLY-BODY-SIZE is an Encoded-Unsigned-Int<max=2> (as defined in SmartAnthill 2.0 Protocol Stack document) size of REPLY-BODY field, and REPLY-BODY is opaque data to be pushed to reply buffer. PUSHREPLY instruction pushes an additional reply frame with REPLY-BODY in it to reply buffer.

| ZEPTOVM_OP_TRANSMITTER | ONOFF |

where ZEPTOVM_OP_TRANSMITTER is a 1-byte opcode, and ONOFF is a 1-byte field, taking values {0,1}

TRANSMITTER instruction turns transmitter on or off, according to the value of ONOFF field.

| ZEPTOVM_OP_SLEEP | MSEC-DELAY |

where ZEPTOVM_OP_SLEEP is a 1-byte opcode, and MSEC-DELAY is an Encoded-Unsigned-Int<max=4> field (as defined in SmartAnthill 2.0 Protocol Stack document). Pauses execution for approximately MSEC-DELAY milliseconds. Exact delay times are not guaranteed; specifically, SLEEP instruction MAY take significantly longer than requested.

| ZEPTOVM_OP_MCUSLEEP | SEC-DELAY | TRANSMITTERONWHENBACK-AND-MAYDROPEARLIERINSTRUCTIONS |

where ZEPTOVM_OP_MCUSLEEP is a 1-byte opcode, SEC-DELAY is an Encoded-Unsigned-Int<max=4> field (as defined in SmartAnthill 2.0 Protocol Stack document), and TRANSMITTERONWHENBACK-AND-MAYDROPEARLIERINSTRUCTIONS is a 1-byte bitfield substrate, with bit [0] being TRANSMITTERONWHENBACK, bit [1] being MAYDROPEARLIERINSTRUCTIONS, and bits [2..7] being reserved (MUST be zero).

MCUSLEEP instruction puts MCU into sleep-with-timer mode for approximately SEC-DELAY seconds. If sleep-with-timer mode is not available with current MCU, then such an instruction still may be sent to such a device, as a means of long delay, and SmartAnthill device MUST process it just by waiting for specified time. TRANSMITTERONWHENBACK specifies if device transmitter should be turned on after MCUSLEEP, and MAYDROPEARLIERINSTRUCTIONS is an optimization flag which specifies if MCUSLEEP is allowed to drop the portion of the ZeptoVM program which is located before MCUSLEEP, when going to sleep (this may allow to provide certain savings, see below).
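The 1-byte bitfield of MCUSLEEP can be packed and validated as in the following minimal sketch. It assumes that bit [0] is the least significant bit; the helper names and the use of ValueError for a malformed byte are illustrative only (the text above does not name an exception for non-zero reserved bits).

```python
TRANSMITTERONWHENBACK = 0x01       # bit [0] (assumed to be the least significant bit)
MAYDROPEARLIERINSTRUCTIONS = 0x02  # bit [1]
RESERVED_MASK = 0xFC               # bits [2..7], MUST be zero


def pack_mcusleep_flags(transmitter_on_when_back, may_drop_earlier):
    """Build the 1-byte bitfield that follows SEC-DELAY (hypothetical helper)."""
    flags = 0
    if transmitter_on_when_back:
        flags |= TRANSMITTERONWHENBACK
    if may_drop_earlier:
        flags |= MAYDROPEARLIERINSTRUCTIONS
    return flags


def unpack_mcusleep_flags(flags):
    """Split the bitfield; reject bytes that have reserved bits set."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one byte")
    if flags & RESERVED_MASK:
        raise ValueError("reserved bits [2..7] MUST be zero")
    return (bool(flags & TRANSMITTERONWHENBACK),
            bool(flags & MAYDROPEARLIERINSTRUCTIONS))
```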

As MCUSLEEP may disable device receiver, Zepto VM enforces relevant “Execution Layer Restrictions” when MCUSLEEP is invoked; to ensure consistent behavior between MCUs, these restrictions MUST be enforced regardless of whether MCUSLEEP really disables device receiver. Therefore (NB: these checks SHOULD be implemented for ZeptoVM-One; they MUST be implemented for all Zepto-VM levels other than ZeptoVM-One):

  • If original command has not had an ISLAST flag, and MCUSLEEP is invoked, it causes a ZEPTOVM_PROGRAMERROR_INVALIDREPLYSEQUENCE exception.
  • Zepto VM keeps track if MCUSLEEP was invoked; this ‘mcusleep-invoked’ flag is used by some other instructions.
  • NB: calling MCUSLEEP twice within the same program is allowed, so if ‘mcusleep-invoked’ flag is already set and MCUSLEEP is invoked, this is not a problem

It should be noted that implementing MCUSLEEP instruction implicitly requires either storing current program, current PC and current “reply buffer” in EEPROM, or requesting the MCU to preserve RAM while waiting. This will be done automagically by Zepto VM, but it is not without its cost. It might be useful to know that in some cases this cost is lower when the amount of data to be preserved is small (for example, when “reply buffer” is empty, and/or when MAYDROPEARLIERINSTRUCTIONS is used and the remaining program is small).

| ZEPTOVM_OP_POPREPLIES | N-REPLIES |

where ZEPTOVM_OP_POPREPLIES is a 1-byte opcode (NB: it is the same as ZEPTOVM_OP_POPREPLIES in Level Tiny), and N-REPLIES is an Encoded-Unsigned-Int<max=2> field, which MUST be 0 for Zepto VM-One (other values are allowed for Zepto VM-Tiny and above, as described below). If N-REPLIES is not 0 for Zepto VM-One POPREPLIES instruction, Zepto VM will issue a ZEPTOVM_INVALIDPARAMETER exception. |POPREPLIES|0| means “remove all replies currently in reply buffer”.

NB: Zepto VM-One implements POPREPLIES instruction only partially (for N-REPLIES=0); Zepto VM-Tiny and above support other values as described below, and behavior for N-REPLIES=0 which is supported by all Zepto VM levels, is exactly the same regardless of level.

| ZEPTOVM_OP_EXIT | REPLY-FLAGS-AND-FORCED-PADDING-FLAG | (opt) FORCED-PADDING-TO |

where ZEPTOVM_OP_EXIT is a 1-byte opcode (NB: it is the same as ZEPTOVM_OP_EXIT in Level Tiny), REPLY-FLAGS-AND-FORCED-PADDING-FLAG is a 1-byte bitfield substrate, with bits[0..1] being REPLY-FLAGS bitfield, taking one of the following values: {NONE,ISFIRST,ISLAST}, bit [2] being FORCED-PADDING-FLAG bitfield which stores {0,1}, bits [3..7] being reserved (MUST be zero), and FORCED-PADDING-TO is an Encoded-Unsigned-Int<max=2> (as defined in SmartAnthill 2.0 Protocol Stack document) field, which is present only if <FORCED-PADDING-FLAG> is equal to 1.

EXIT instruction posts all the replies which are currently in the “reply buffer”, back to SmartAnthill Central Controller, and terminates the program. Device receiver is kept turned on after the program exits (so the device is able to accept new commands).

To enforce “Execution Layer Requirements”, the following SHOULD be enforced for Zepto VM-One and MUST be enforced for other Zepto VM layers:

  • if ‘mcusleep-invoked’ flag is not set, and original command has had ISLAST flag, then “reply buffer” MUST be non-empty, and EXIT instruction MUST have REPLY-FLAGS != ISFIRST (this is a usual command-reply pattern)
  • if ‘mcusleep-invoked’ flag is not set, and original command has not had ISLAST flag, then “reply buffer” MUST be non-empty, and EXIT instruction MUST have REPLY-FLAGS == ISFIRST (this is a ‘long command-reply’ pattern)
  • if ‘mcusleep-invoked’ flag is set, then original command will have ISLAST flag (because of other restrictions; this means violating ‘ISLAST’ requirement while processing EXIT instruction is not an exception, but an internal assertion which MUST NOT happen); “reply buffer” MUST be non-empty, and EXIT instruction MUST have REPLY-FLAGS == ISFIRST (this is a ‘mcusleep-then-wake’ pattern)

If any of the restrictions above is not complied with, Zepto VM generates a ZEPTOVM_PROGRAMERROR_INVALIDREPLYSEQUENCE exception.
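The three EXIT rules above can be checked as in the following sketch. REPLY-FLAGS values are modelled as symbolic strings because their numeric encoding is not given here, and the exception class is only a stand-in for ZEPTOVM_PROGRAMERROR_INVALIDREPLYSEQUENCE; both are assumptions made for illustration.

```python
NONE, ISFIRST, ISLAST = "NONE", "ISFIRST", "ISLAST"  # symbolic REPLY-FLAGS values


class InvalidReplySequence(Exception):
    """Stand-in for ZEPTOVM_PROGRAMERROR_INVALIDREPLYSEQUENCE."""


def check_exit(mcusleep_invoked, command_islast, reply_buffer, reply_flags):
    """Raise InvalidReplySequence if EXIT violates the rules listed above."""
    if mcusleep_invoked:
        # Guaranteed by the MCUSLEEP restrictions, so this is an internal
        # assertion rather than a VM exception.
        assert command_islast
    if not reply_buffer:
        raise InvalidReplySequence("reply buffer MUST be non-empty")
    if not mcusleep_invoked and command_islast:
        allowed = reply_flags != ISFIRST   # usual command-reply pattern
    else:
        allowed = reply_flags == ISFIRST   # long command-reply, or mcusleep-then-wake
    if not allowed:
        raise InvalidReplySequence("REPLY-FLAGS do not match the reply pattern")
```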

FORCED-PADDING-TO field (if present) specifies ‘enforced padding’ as described in SmartAnthill SCRAMBLING procedure document. Essentially:

  • if present, FORCED-PADDING-TO MUST specify length which is equal to or greater than the size of current “reply buffer”
  • if developer wants to avoid information leak from the fact that encrypted messages may have different lengths, she may specify the same FORCED-PADDING-TO for all the replies which should be indistinguishable.

| ZEPTOVM_OP_APPENDTOREPLY | REPLY-NUMBER | DATA-TYPE | DATA |

where REPLY-NUMBER is an Encoded-Signed-Int<max=2> field, which MUST be equal to ‘-1’ for Zepto VM-One, DATA-TYPE is a 1-byte field taking one of the valid values for FIELD-SEQUENCE field as described in JMPIFREPLYFIELD instruction, and DATA has size determined by DATA-TYPE field.

NB: Zepto VM-One implements APPENDTOREPLY instruction only partially (for REPLY-NUMBER=-1); Zepto VM-Tiny and above support other values as described below, and behavior for REPLY-NUMBER=-1 which is supported by all Zepto VM levels, is exactly the same regardless of level.

Implementation notes

If strict checks of “Execution Layer Restrictions” are disabled (which is allowed only for Zepto VM-One and not for any other level), then only PC (Program Counter) needs to be maintained for operating Level One.

To keep track of “Execution Layer Restrictions”, a one-byte flag bitmask is used with the following flags:

  • mcusleep-invoked
  • currently there are no other flags

Memory overhead

Memory overhead of ZeptoVM-One is 1 byte; if “Execution Layer Restrictions” are strictly enforced (which is a MUST for all levels except for Zepto VM-One), this requires an additional 1 byte.

Level Tiny

Zepto VM-Tiny allows for more complicated programs, including basic conditions, at the cost of additional memory on the order of 5-10 bytes. In addition to instructions supported by Zepto VM-One, Zepto VM-Tiny supports the following instructions:

| ZEPTOVM_OP_JMP | DELTA |

where ZEPTOVM_OP_JMP is a 1-byte opcode, and DELTA is an Encoded-Signed-Int<max=2> signed integer which denotes how PC (program counter) should be changed (DELTA is always considered in relation to the end of current instruction, so JMP 0 is effectively a no-op).

| ZEPTOVM_OP_JMPIFREPLYFIELD_<SUBCODE> | REPLY-NUMBER | FIELD-SEQUENCE | THRESHOLD | DELTA |

where <SUBCODE> is one of {LT,GT,EQ,NE}; ZEPTOVM_OP_JMPIFREPLYFIELD_LT, ZEPTOVM_OP_JMPIFREPLYFIELD_GT, ZEPTOVM_OP_JMPIFREPLYFIELD_EQ, and ZEPTOVM_OP_JMPIFREPLYFIELD_NE are 1-byte opcodes, REPLY-NUMBER is an Encoded-Signed-Int<max=2>, FIELD-SEQUENCE is described below, THRESHOLD is an Encoded-Signed-Int<max=2> field, and DELTA is interpreted in the same way as in JMP instruction.

REPLY-NUMBER is a number of reply frame in “reply buffer”. Negative values mean ‘from the end of buffer’, so that REPLY-NUMBER=-1 means ‘last reply in reply buffer’. If REPLY-NUMBER points to a non-existing item in “reply buffer” (that is, it is positive and is >= number-of-replies, or it is negative and is <= -number-of-replies TODO:check), it is a ZEPTOVM_INVALIDREPLYNUMBER exception.

FIELD-SEQUENCE field describes a sequence of fields to be read from plugin reply body (that is, after all the optional headers, flags etc. are processed); normally, for SmartAnthill systems, it is derived from SmartAnthill Plugin Manifest during program preparation. Last field in FIELD-SEQUENCE always represents a field to be read; all previous fields are skipped. FIELD-SEQUENCE is encoded as a byte sequence with the following byte values supported:

ZEPTOVM_OP_JMPIFREPLYFIELD_* instruction takes the reply frame specified by REPLY-NUMBER, reads the field specified by FIELD-SEQUENCE, and compares it to THRESHOLD. If the condition for <SUBCODE> is met (for example, if the field is < THRESHOLD for <SUBCODE>=LT), PC is incremented by the value of DELTA (as with JMP, DELTA is added to a PC positioned right after current instruction).

<SUBCODE>  Jump if
LT         Field < THRESHOLD
GT         Field > THRESHOLD
EQ         Field == THRESHOLD
NE         Field != THRESHOLD

| ZEPTOVM_OP_POPREPLIES | N-REPLIES |

where ZEPTOVM_OP_POPREPLIES is a 1-byte opcode and N-REPLIES is an Encoded-Unsigned-Int<max=2> field representing number of replies to be popped.

POPREPLIES instruction removes the last N-REPLIES plugin replies from the reply buffer. If N-REPLIES is equal to zero, it means that all replies are removed. If N-REPLIES is more than the number of replies in the buffer, it is a TODO exception. Usually, either |POPREPLIES|0| (removing all the replies) or |POPREPLIES|1| (removing only one reply) is used, but other values are also possible.

NB: POPREPLIES instruction is partially implemented by Zepto VM-One (for N-REPLIES=0).

| ZEPTOVM_OP_MOVEREPLYTOFRONT | REPLY-NUMBER |

where ZEPTOVM_OP_MOVEREPLYTOFRONT is a 1-byte opcode and REPLY-NUMBER is an Encoded-Signed-Int<max=2> field, which is interpreted as described in JMPIFREPLYFIELD instruction.

MOVEREPLYTOFRONT instruction is used to reorder reply frames within reply buffer. It takes reply frame which has REPLY-NUMBER, and makes it the first one in the buffer, moving the rest of the replies back. Implementation note: need also to recalculate and update positions in offset stack.
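The offset recalculation mentioned in the implementation note can be sketched as follows. The sketch assumes that each offset marks the start of a reply frame within the reply buffer and that REPLY-NUMBER has already been resolved to a zero-based index; a real device would more likely rotate bytes in place rather than rebuild the buffer.

```python
def move_reply_to_front(buffer, offsets, index):
    """Move frame `index` to the front of `buffer` and refresh `offsets` in place.

    buffer  -- bytearray holding all reply frames back to back
    offsets -- offsets[i] is the (assumed) start position of frame i
    index   -- zero-based frame index, already resolved from REPLY-NUMBER
    """
    if not 0 <= index < len(offsets):
        raise IndexError("no such reply frame")
    ends = offsets[1:] + [len(buffer)]
    frames = [bytes(buffer[start:end]) for start, end in zip(offsets, ends)]
    frames.insert(0, frames.pop(index))   # chosen frame first, the rest move back
    buffer[:] = b"".join(frames)
    position = 0
    for i, frame in enumerate(frames):    # recalculate positions of all frames
        offsets[i] = position
        position += len(frame)
```

For example, with frames "AA", "B", "CCC" stored at offsets [0, 2, 3], moving frame 2 to the front yields "CCC", "AA", "B" at offsets [0, 3, 5].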

| ZEPTOVM_OP_APPENDTOREPLY | REPLY-NUMBER | DATA-TYPE | DATA |

where REPLY-NUMBER is interpreted as described in JMPIFREPLYFIELD instruction, DATA-TYPE is 1-byte taking one of the valid values for FIELD-SEQUENCE field as described in JMPIFREPLYFIELD instruction, and DATA has size determined by DATA-TYPE field.

APPENDTOREPLY instruction appends DATA with DATA-TYPE to the end of reply specified by REPLY-NUMBER.

NB: APPENDTOREPLY instruction is partially implemented by Zepto VM-One (for REPLY-NUMBER=-1).

Implementation notes

To implement Zepto VM-Tiny, in addition to PC required by Zepto VM-One, a stack of offsets which signify positions of reply frames in “reply buffer”, needs to be maintained. Such stack should consist of an array of bytes for offsets, and additional byte to store number of entries on the stack. Size of this stack is a ZEPTOVM_REPLY_STACK_SIZE parameter of Zepto VM-Tiny (which is stored in SmartAnthill DB on SmartAnthill Client and reported via DEVICECAPS instruction).

Memory overhead

Memory overhead of ZeptoVM-Tiny (in addition to overhead of ZeptoVM-One) is 1+ZEPTOVM_REPLY_STACK_SIZE (or 1+2*ZEPTOVM_REPLY_STACK_SIZE if size of reply buffer can be over 256 bytes).

Level Small

Zepto VM-Small allows for even more complicated programs, including expressions and loops, at the cost of additional memory (on top of Zepto VM-Tiny) on the order of 9-65 bytes, depending on complexity of programs to be supported. In addition to instructions supported by Zepto VM-Tiny, Zepto VM-Small supports the following instructions:

| ZEPTOVM_OP_PUSHEXPR_CONSTANT | CONST |

where ZEPTOVM_OP_PUSHEXPR_CONSTANT is 1-byte opcode, and CONST is a 2-byte half-float constant (encoded as described in SmartAnthill 2.0 Protocol Stack) to be pushed to expression stack.

PUSHEXPR_CONSTANT instruction pushes CONST to an expression stack (if expression stack is exceeded, it will cause ZEPTOVM_EXPRSTACKOVERFLOW VM exception).

| ZEPTOVM_OP_PUSHEXPR_REPLYFIELD | REPLY-NUMBER | FIELD-SEQUENCE |

where ZEPTOVM_OP_PUSHEXPR_REPLYFIELD is a 1-byte opcode, and REPLY-NUMBER and FIELD-SEQUENCE are similar to those in JMPIFREPLYFIELD instruction.

PUSHEXPR_REPLYFIELD takes a field (specified by FIELD-SEQUENCE) from reply frame (specified by REPLY-NUMBER), and pushes it to the expression stack (if expression stack is exceeded, it will cause ZEPTOVM_EXPRSTACKOVERFLOW VM exception). If data in the field doesn’t fit into stack type (see below), it is a ZEPTOVM_INVALIDEXPRDATA exception.

| ZEPTOVM_OP_EXPRUNOP | UNOP |

where ZEPTOVM_OP_EXPRUNOP is a 1-byte opcode, and UNOP is 1-byte taking one of the following values:

UNOP         Corresponding unary C operation
UNOP_POP     N/A
UNOP_COPY    =
UNOP_MINUS   -
UNOP_BITNEG  ~
UNOP_NOT     !
UNOP_INC     +1
UNOP_DEC     -1

EXPRUNOP instruction pops topmost value from the expression stack, modifies it according to the table above, and pushes modified value back to expression stack. All operations are performed as specified in the table above; ‘-’, ‘+1’ and ‘-1’ operations are performed as floating-point operations (see details below); for ‘~’ and ‘!’ operations the operand is first converted into integer with zero exponent (and then only fraction is involved in these operations). If expression stack is empty, it will cause a ZEPTOVM_EXPRSTACKUNDERFLOW VM exception. Overflows are handled in a normal manner for floats (NB: as it is floating-point arithmetic, ‘+1’ and ‘-1’ operations MAY leave the operand unchanged even if no ‘infinity’ has occurred; it means that if half-floats are used as expression stack values, 2048+1 results in 2048, causing potential for infinite loops TODO: check if it is 2048 or 2050).
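The ‘+1’ saturation effect with half-floats can be reproduced with the following sketch. It relies on the half-precision ‘e’ format of Python's struct module, which rounds to nearest with ties to even; a device using a different rounding mode may behave differently, so this illustrates the effect rather than specifying device behavior. The helper names are hypothetical.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def unop_inc(value):
    """UNOP_INC on a half-float stack entry: add 1, then round back to half."""
    return to_half(value + 1.0)


counter = to_half(2046.0)
for _ in range(5):
    counter = unop_inc(counter)   # 2047, 2048, then stuck at 2048
```

A loop that waits for such a counter to exceed 2048 would therefore never terminate.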

If UNOP is UNOP_POP, then no value is pushed back to the expression stack (i.e. UNOP_POP causes one value to be removed from the expression stack).

| ZEPTOVM_OP_EXPRUNOP_EX | UNOP | POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OPERAND |

where ZEPTOVM_OP_EXPRUNOP_EX is a 1-byte opcode, UNOP is similar to that in EXPRUNOP instruction, POP-FLAG-AND-EXPR-OFFSET is an Encoded-Signed-Int<max=2> field, which acts as a substrate for POP-FLAG bitfield (occupies bit [0]), and EXPR-OFFSET bitfield (occupies bits [1..]), and OPTIONAL-IMMEDIATE-OPERAND is a half-float field, present only if EXPR-OFFSET is zero (see also below). EXPR-OFFSET specifies expression index which is used by EXPRUNOP_EX instruction, as follows: zero value means that the operand is an immediate operand; positive values of EXPR-OFFSET mean ‘values from top of the expression stack’, so ‘1’ means ‘topmost value on the stack’; negative values mean ‘values from beginning of the stack’, so that ‘-1’ means expr_stack[0], ‘-2’ means expr_stack[1] and so on (negative values of EXPR-OFFSET can be used, for example, to simulate global variables). POP-FLAG specifies whether the slot specified by EXPR-OFFSET is removed from the expression stack after the calculation is performed (if the popped value is not the topmost one, the stack is collapsed as necessary). Accordingly, | EXPRUNOP_EX | UNOP | POP-FLAG=1, EXPR-OFFSET=1 | is equivalent to | EXPRUNOP | UNOP |. If EXPR-OFFSET points beyond the current size of expression stack, this will cause a ZEPTOVM_EXPRSTACKINVALIDOFFSET exception. If POP-FLAG is 1 and popping would lead to the modification of “frozen” part of the expression stack (see description of “frozen” stack in the context of PARALLEL instruction), it will cause a ZEPTOVM_EXPRSTACKFROZENVIOLATION exception. If POP-FLAG is 1 and EXPR-OFFSET is zero (which would mean ‘pop immediate operand’), it is a ZEPTOVM_INVALIDPARAMETER exception.

EXPRUNOP_EX instruction is similar to EXPRUNOP instruction, but allows a wider range of operands (with popping from the stack being optional).
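Splitting POP-FLAG-AND-EXPR-OFFSET and mapping EXPR-OFFSET to a stack slot can be sketched as follows. The sketch assumes that the decoded signed integer is split two's-complement style (POP-FLAG is the lowest bit, EXPR-OFFSET is the arithmetic shift of the rest), which is not stated explicitly above; the exception class is a stand-in for ZEPTOVM_EXPRSTACKINVALIDOFFSET.

```python
class ExprStackInvalidOffset(Exception):
    """Stand-in for ZEPTOVM_EXPRSTACKINVALIDOFFSET."""


def split_pop_flag_and_offset(value):
    """Return (POP-FLAG, EXPR-OFFSET) from the decoded signed integer.

    Assumption: bit [0] is POP-FLAG and bits [1..] are a two's-complement
    EXPR-OFFSET, so an arithmetic right shift recovers the offset.
    """
    return bool(value & 1), value >> 1


def resolve_expr_offset(stack_size, expr_offset):
    """Map EXPR-OFFSET to a zero-based index (0 = bottom of the stack).

    Returns None when EXPR-OFFSET is zero, i.e. the operand is immediate.
    """
    if expr_offset == 0:
        return None
    if expr_offset > 0:
        index = stack_size - expr_offset   # 1 is the topmost value
    else:
        index = -expr_offset - 1           # -1 is expr_stack[0]
    if not 0 <= index < stack_size:
        raise ExprStackInvalidOffset(expr_offset)
    return index
```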

| ZEPTOVM_OP_EXPRUNOP_EX2 | UNOP | POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OPERAND | PUSH-FLAG-AND-PUSH-EXPR-OFFSET |

where ZEPTOVM_OP_EXPRUNOP_EX2 is a 1-byte opcode, UNOP, POP-FLAG-AND-EXPR-OFFSET, and OPTIONAL-IMMEDIATE-OPERAND are similar to those in EXPRUNOP_EX instruction, PUSH-FLAG-AND-PUSH-EXPR-OFFSET is an Encoded-Signed-Int<max=2> field, which acts as a substrate for PUSH-FLAG bitfield (occupies bit [0]), and PUSH-EXPR-OFFSET bitfield (occupies bits [1..]). PUSH-EXPR-OFFSET specifies target expression index which is used by EXPRUNOP_EX2 instruction, as follows:

  • PUSH-EXPR-OFFSET=0 means that result should be pushed on top of expression stack; PUSH-FLAG MUST be =1 in this case (otherwise it is ZEPTOVM_INVALIDPARAMETER exception).
  • PUSH-FLAG=0 and PUSH-EXPR-OFFSET != 0 means that a value on expression stack (specified by PUSH-EXPR-OFFSET, which is treated similar to EXPR-OFFSET; in particular, both negative and positive values of PUSH-EXPR-OFFSET are valid) needs to be replaced with a result of calculation
  • PUSH-FLAG=1 and PUSH-EXPR-OFFSET != 0 means that a calculated value needs to be inserted within expression stack; the stack entry before which the insertion is made is specified by PUSH-EXPR-OFFSET (in a manner similar to EXPR-OFFSET; in particular, both negative and positive values of PUSH-EXPR-OFFSET are valid; for example, PUSH-EXPR-OFFSET=-1 means that new value needs to be inserted into the very beginning of the stack, and PUSH-EXPR-OFFSET=1 means that new value should be inserted right before the topmost value - so it will become second-topmost after insertion).

EXPRUNOP_EX2 instruction is similar to EXPRUNOP_EX instruction, but allows specifying where the result of calculation needs to be placed. | EXPRUNOP_EX2 | UNOP | POP-FLAG=1,EXPR-OFFSET=1 | PUSH-FLAG=1, PUSH-EXPR-OFFSET=0 | is equivalent to | EXPRUNOP | UNOP |.
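The three placement modes selected by PUSH-FLAG and PUSH-EXPR-OFFSET can be sketched as follows, with the expression stack modelled as a Python list whose last element is the topmost value. The exception classes are stand-ins, and raising an invalid-offset error for an out-of-range PUSH-EXPR-OFFSET is an assumption (the text above specifies it only for EXPR-OFFSET).

```python
class InvalidParameter(Exception):
    """Stand-in for ZEPTOVM_INVALIDPARAMETER."""


class ExprStackInvalidOffset(Exception):
    """Assumed to apply to PUSH-EXPR-OFFSET as it does to EXPR-OFFSET."""


def place_result(stack, push_flag, push_offset, result):
    """Store `result` in `stack` (list, top at the end) per the PUSH fields."""
    if push_offset == 0:
        if not push_flag:
            raise InvalidParameter("PUSH-FLAG MUST be 1 when PUSH-EXPR-OFFSET=0")
        stack.append(result)               # push on top of the stack
        return
    if push_offset > 0:
        index = len(stack) - push_offset   # 1 is the topmost value
    else:
        index = -push_offset - 1           # -1 is expr_stack[0]
    if not 0 <= index < len(stack):
        raise ExprStackInvalidOffset(push_offset)
    if push_flag:
        stack.insert(index, result)        # insert before the addressed entry
    else:
        stack[index] = result              # replace the addressed entry
```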

| ZEPTOVM_OP_EXPRBINOP | BINOP |

where ZEPTOVM_OP_EXPRBINOP is a 1-byte opcode, and BINOP is 1-byte taking one of the following values:

BINOP         Corresponding binary C operation
BINOP_PLUS    +
BINOP_MINUS   -
BINOP_SHL     <<
BINOP_SHR     >>
BINOP_USHR    Java-like >>>
BINOP_BITAND  &
BINOP_BITOR   |
BINOP_AND     &&
BINOP_OR      ||

EXPRBINOP instruction pops two topmost values from the expression stack, calculates result out of them according to the table above (as ‘second topmost’ op ‘topmost’), and pushes calculated value back to the expression stack. All operations are performed as specified in the table above; ‘+’ and ‘-’ are performed as floating-point operations (see details below); for ‘<<’, ‘>>’, ‘&’, ‘|’, ‘&&’, and ‘||’ both operands are first converted into integers with zero exponent (and then only significands of operands are involved in these operations). If expression stack has fewer than two items, it will cause a ZEPTOVM_EXPRSTACKUNDERFLOW VM exception. Overflows are handled in a standard manner for floats (causing ‘infinity’ result when necessary). NB: there are no multiplication/division operations for Zepto VM-Small; they’re introduced in higher Zepto-VM levels.
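The operand order (‘second topmost’ op ‘topmost’) can be illustrated with the sketch below. It uses ordinary Python floats instead of half-floats, models the integer conversion with int() truncation (an assumption), and leaves out BINOP_USHR; the exception class is a stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW.

```python
import operator


class ExprStackUnderflow(Exception):
    """Stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW."""


FLOAT_OPS = {"BINOP_PLUS": operator.add, "BINOP_MINUS": operator.sub}
INT_OPS = {
    "BINOP_SHL": operator.lshift,
    "BINOP_SHR": operator.rshift,
    "BINOP_BITAND": operator.and_,
    "BINOP_BITOR": operator.or_,
    "BINOP_AND": lambda a, b: int(bool(a) and bool(b)),
    "BINOP_OR": lambda a, b: int(bool(a) or bool(b)),
}


def exprbinop(stack, binop):
    """Pop two values, push ('second topmost' op 'topmost')."""
    if len(stack) < 2:
        raise ExprStackUnderflow(binop)
    topmost = stack.pop()
    second = stack.pop()
    if binop in FLOAT_OPS:
        result = FLOAT_OPS[binop](second, topmost)
    else:
        # Assumption: conversion to integer is modelled as truncation.
        result = float(INT_OPS[binop](int(second), int(topmost)))
    stack.append(result)
```

With 10 pushed first and 3 pushed second, BINOP_MINUS leaves 7 on the stack, not -7.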

| ZEPTOVM_OP_EXPRBINOP_EX | BINOP | OP1-POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OP1 | OP2-POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OP2 |

where ZEPTOVM_OP_EXPRBINOP_EX is a 1-byte opcode, BINOP is 1-byte taking the same values as for ZEPTOVM_OP_EXPRBINOP, OP1-POP-FLAG-AND-EXPR-OFFSET and OP2-POP-FLAG-AND-EXPR-OFFSET are similar to POP-FLAG-AND-EXPR-OFFSET in EXPRUNOP_EX instruction, OPTIONAL-IMMEDIATE-OP1 is a half-float field present only if EXPR-OFFSET within OP1-POP-FLAG-AND-EXPR-OFFSET is zero, and OPTIONAL-IMMEDIATE-OP2 is a half-float field present only if EXPR-OFFSET within OP2-POP-FLAG-AND-EXPR-OFFSET is zero.

EXPRBINOP_EX instruction is similar to EXPRBINOP instruction, but allows a wider range of operands (with popping from the stack being optional). | EXPRBINOP_EX | BINOP | POP-FLAG=1,EXPR-OFFSET=2 | POP-FLAG=1,EXPR-OFFSET=1 | is equivalent to | EXPRBINOP | BINOP |.

| ZEPTOVM_OP_EXPRBINOP_EX2 | BINOP | OP1-POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OP1 | OP2-POP-FLAG-AND-EXPR-OFFSET | OPTIONAL-IMMEDIATE-OP2 | PUSH-FLAG-AND-PUSH-EXPR-OFFSET |

where ZEPTOVM_OP_EXPRBINOP_EX2 is a 1-byte opcode, BINOP, OP*-POP-FLAG-AND-EXPR-OFFSET and OPTIONAL-IMMEDIATE-OP* fields are similar to those in EXPRBINOP_EX instruction, and PUSH-FLAG-AND-PUSH-EXPR-OFFSET is similar to that in EXPRUNOP_EX2 instruction.

EXPRBINOP_EX2 instruction is similar to EXPRBINOP_EX instruction, but allows specifying where the result of calculation needs to be placed. | EXPRBINOP_EX2 | BINOP | POP-FLAG=1,EXPR-OFFSET=2 | POP-FLAG=1,EXPR-OFFSET=1 | PUSH-FLAG=1, PUSH-EXPR-OFFSET=0 | is equivalent to | EXPRBINOP | BINOP |.

| ZEPTOVM_OP_JMPIFEXPR <SUBCODE> | THRESHOLD | DELTA |

where <SUBCODE> is one of {LT,GT,EQ,NE}; ZEPTOVM_OP_JMPIFEXPR_LT, ZEPTOVM_OP_JMPIFEXPR_GT, ZEPTOVM_OP_JMPIFEXPR_EQ, and ZEPTOVM_OP_JMPIFEXPR_NE are 1-byte opcodes, THRESHOLD is a 2-byte half-float constant (encoded as described in SmartAnthill 2.0 Protocol Stack), and DELTA is interpreted in the same way as in JMP instruction.

<SUBCODE>  Jump if
LT         Topmost value on the expression stack < THRESHOLD
GT         Topmost value on the expression stack > THRESHOLD
EQ         Topmost value on the expression stack == THRESHOLD
NE         Topmost value on the expression stack != THRESHOLD

JMPIFEXPR <SUBCODE> instruction pops the topmost value from the expression stack, compares it with THRESHOLD according to <SUBCODE>, and updates Program Counter by DELTA if condition specified by comparison is met (as with JMP, DELTA is added to a PC positioned right after current instruction). If expression stack is empty, it will cause a ZEPTOVM_EXPRSTACKUNDERFLOW VM exception.
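The pop-compare-jump behavior can be sketched as follows. The function name and the way the program counter is passed in are illustrative; the exception class is a stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW.

```python
import operator


class ExprStackUnderflow(Exception):
    """Stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW."""


COMPARE = {"LT": operator.lt, "GT": operator.gt,
           "EQ": operator.eq, "NE": operator.ne}


def jmpifexpr(stack, subcode, threshold, pc_after_instruction, delta):
    """Pop the topmost value and return the new PC.

    pc_after_instruction is the PC positioned right after JMPIFEXPR;
    DELTA is applied relative to it, so DELTA=0 never changes control flow.
    """
    if not stack:
        raise ExprStackUnderflow(subcode)
    value = stack.pop()
    if COMPARE[subcode](value, threshold):
        return pc_after_instruction + delta
    return pc_after_instruction
```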

TODO: can equivalents for LE/GE be strictly derived in case of floats?

| ZEPTOVM_OP_JMPIFEXPR_EX <SUBCODE> | POP-FLAG-AND-EXPR-OFFSET | THRESHOLD | DELTA |

where <SUBCODE> is one of {LT,GT,EQ,NE}; ZEPTOVM_OP_JMPIFEXPR_EX_LT, ZEPTOVM_OP_JMPIFEXPR_EX_GT, ZEPTOVM_OP_JMPIFEXPR_EX_EQ, and ZEPTOVM_OP_JMPIFEXPR_EX_NE are 1-byte opcodes, POP-FLAG-AND-EXPR-OFFSET is treated similar to that in EXPRUNOP_EX instruction, THRESHOLD is a 2-byte half-float constant (encoded as described in SmartAnthill 2.0 Protocol Stack), and DELTA is interpreted in the same way as in JMP instruction.

JMPIFEXPR_EX <SUBCODE> instruction works similar to JMPIFEXPR instruction, but allows for a wider range of the operand (with popping from the stack being optional).

| ZEPTOVM_OP_CALL | PROC-ADDR |

where ZEPTOVM_OP_CALL is a 1-byte opcode, and PROC-ADDR is an Encoded-Unsigned-Int<max=2> which specifies an absolute address of the procedure/function being called.

CALL instruction pushes current value of the Program Counter (PC) to the top of expression stack, and sets Program Counter to PROC-ADDR.

Passing parameters to a procedure/function is a matter of convention between caller and callee. For example, caller may push values to the top of the expression stack, and callee may read them then; cleaning parameters from the stack may be performed by caller after callee returns.

IMPORTANT: the representation of the address pushed to expression stack is not specified, and therefore it is prohibited to use RET instruction to implement calculated jumps. Debug versions of Zepto VM SHOULD implement an additional bit array specifying which expression stack entries are CALL entries, and raise a TBD exception whenever a CALL stack entry is used for calculations, and whenever a non-CALL stack entry is used for RET.

| ZEPTOVM_OP_RET |

where ZEPTOVM_OP_RET is a 1-byte opcode.

RET instruction pops value of the Program Counter (PC) from the expression stack (the one which has been previously pushed by CALL instruction), effectively performing return from the procedure/function.
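A debug-style model of CALL and RET, including the "CALL entry" marker array suggested above, might look like the following sketch. It assumes that the value pushed by CALL is the address of the instruction following CALL; the class, its method names, and the NotACallEntry exception (the text leaves that exception TBD) are hypothetical.

```python
class ExprStackUnderflow(Exception):
    """Stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW."""


class NotACallEntry(Exception):
    """Hypothetical name; the exception is marked TBD in the text above."""


class DebugVm:
    def __init__(self):
        self.pc = 0
        self.expr_stack = []
        self.is_call_entry = []   # one marker per expression stack entry

    def push_value(self, value):
        self.expr_stack.append(value)
        self.is_call_entry.append(False)

    def call(self, proc_addr, return_pc):
        """return_pc: address right after the CALL instruction (assumption)."""
        self.expr_stack.append(return_pc)
        self.is_call_entry.append(True)
        self.pc = proc_addr

    def ret(self):
        if not self.expr_stack:
            raise ExprStackUnderflow("RET on empty expression stack")
        if not self.is_call_entry[-1]:
            raise NotACallEntry("topmost entry was not pushed by CALL")
        self.is_call_entry.pop()
        self.pc = self.expr_stack.pop()
```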

| ZEPTOVM_OP_SWITCH | NUMBER-OF-ENTRIES | SWITCH-ENTRY | ... | SWITCH-ENTRY |

where ZEPTOVM_OP_SWITCH is a 1-byte opcode, NUMBER-OF-ENTRIES is an Encoded-Unsigned-Int<max=2> field, which specifies number of SWITCH-ENTRY’s.

Each of SWITCH-ENTRY’s has a format of | CASE-VALUE | DELTA |, where CASE-VALUE is an Encoded-Signed-Int<max=depends-on-stack-expr-type> field, and DELTA is treated similar to that in JMP instruction (as usual, DELTA is applied to the end of current instruction, i.e. to the end of SWITCH instruction).

SWITCH instruction pops a topmost value from the expression stack, converts it to the integer, and then compares it to CASE-VALUE’s of each SWITCH-ENTRY, and performs jump if the (converted) value from expression stack matches an appropriate CASE-VALUE. If none of CASE-VALUE’s matches, then execution continues after SWITCH instruction.
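SWITCH dispatch can be sketched as follows. The sketch assumes that conversion to integer truncates, that the first matching entry wins, and that an empty expression stack raises the same underflow exception as other expression instructions (the text above does not say so explicitly).

```python
class ExprStackUnderflow(Exception):
    """Stand-in for ZEPTOVM_EXPRSTACKUNDERFLOW (assumed to apply to SWITCH)."""


def switch(stack, entries, pc_after_instruction):
    """Pop the topmost value and return the new PC.

    entries -- list of (CASE-VALUE, DELTA) pairs in instruction order
    pc_after_instruction -- PC positioned at the end of the SWITCH instruction
    """
    if not stack:
        raise ExprStackUnderflow("SWITCH on empty expression stack")
    value = int(stack.pop())   # assumption: conversion truncates
    for case_value, delta in entries:
        if value == case_value:
            return pc_after_instruction + delta
    return pc_after_instruction   # no match: continue after SWITCH
```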

| ZEPTOVM_OP_SWITCH_EX | POP-FLAG-AND-EXPR-OFFSET | NUMBER-OF-ENTRIES | SWITCH-ENTRY | ... | SWITCH-ENTRY |

where ZEPTOVM_OP_SWITCH_EX is a 1-byte opcode, POP-FLAG-AND-EXPR-OFFSET is treated similar to that of EXPRUNOP_EX instruction, and NUMBER-OF-ENTRIES and SWITCH-ENTRY’s are similar to those of SWITCH instruction.

SWITCH_EX instruction works similar to SWITCH instruction, but allows for a wider range of the operand (with popping from the stack being optional).

| ZEPTOVM_OP_INCANDJMPIF | EXPR-OFFSET | THRESHOLD | DELTA |

where ZEPTOVM_OP_INCANDJMPIF is a 1-byte opcode, EXPR-OFFSET is an Encoded-Signed-Int<max=2> (treated similar to EXPR-OFFSET bitfield in EXPRUNOP_EX instruction), THRESHOLD is similar to that in JMPIFEXPR instruction, and DELTA is similar to that in JMP instruction.

| INCANDJMPIF | EXPR-OFFSET | THRESHOLD | DELTA | instruction is a shortcut for | EXPRUNOP_EX | UNOP_INC | POP-FLAG=0,EXPR-OFFSET=EXPR-OFFSET | JMPIFEXPR_EX LT | POP-FLAG=0,EXPR-OFFSET=EXPR-OFFSET | THRESHOLD | DELTA |. It can be used, for example, at the end of the for(int i=0; i < 5; i++) {...} loop (use within while and do-while loops is similar).
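The use of INCANDJMPIF at the end of a counting loop can be simulated with the sketch below. It assumes that the addressed stack slot is updated in place (as the for-loop use suggests), uses plain Python floats, and the addresses LOOP_START and PC_AFTER are made-up values for illustration.

```python
def incandjmpif(stack, expr_offset, threshold, pc_after, delta):
    """Increment the addressed slot in place; jump while it is < THRESHOLD."""
    if expr_offset == 0:
        raise ValueError("INCANDJMPIF needs a stack slot, not an immediate")
    index = len(stack) - expr_offset if expr_offset > 0 else -expr_offset - 1
    if not 0 <= index < len(stack):
        raise IndexError("EXPR-OFFSET points beyond the expression stack")
    stack[index] += 1.0
    return pc_after + delta if stack[index] < threshold else pc_after


# Equivalent of: for (int i = 0; i < 5; i++) { body }
stack = [0.0]                  # i kept at expr_stack[0], i.e. EXPR-OFFSET=-1
LOOP_START, PC_AFTER = 10, 20  # hypothetical program addresses
iterations = 0
pc = LOOP_START
while pc == LOOP_START:
    iterations += 1            # loop body
    pc = incandjmpif(stack, -1, 5.0, PC_AFTER, LOOP_START - PC_AFTER)
```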

| ZEPTOVM_OP_DECANDJMPIF | EXPR-OFFSET | THRESHOLD | DELTA |

where ZEPTOVM_OP_DECANDJMPIF is a 1-byte opcode, EXPR-OFFSET is an Encoded-Signed-Int<max=2> (treated similar to EXPR-OFFSET bitfield in EXPRUNOP_EX instruction), THRESHOLD is similar to that in JMPIFEXPR instruction, and DELTA is similar to that in JMP instruction.

| DECANDJMPIF | EXPR-OFFSET | THRESHOLD | DELTA | instruction is a shortcut for | EXPRUNOP_EX | UNOP_DEC | POP-FLAG=0,EXPR-OFFSET=EXPR-OFFSET | JMPIFEXPR_EX GT | POP-FLAG=0,EXPR-OFFSET=EXPR-OFFSET | THRESHOLD | DELTA |. It can be used, for example, at the end of the for(int i=5; i > 0; i--) {...} loop (use within while and do-while loops is similar).

Implementation notes

To implement Zepto VM-Small, in addition to PC and reply-offset-stack required by Zepto VM-Tiny, an expression stack of floating-point values needs to be maintained. Such stack should consist of an array of floating-point values, and an additional byte to store number of entries on the stack. Size of this stack is a ZEPTOVM_EXPR_STACK_SIZE parameter of Zepto VM-Small (which is stored in SmartAnthill DB on SmartAnthill Client and reported via DEVICECAPS instruction).

Type of the values on expression stack always has floating point semantics, and is one of the following: ROUGH_HALF_FLOAT (2 bytes; same as HALF_FLOAT, but with reduced calculation precision - TBD), HALF_FLOAT (2-byte float, see http://en.wikipedia.org/wiki/Half-precision_floating-point_format), FLOAT (4-byte float), DOUBLE (8-byte float); one of these constants is returned in DEVICECAPS instruction reply to indicate kind of floating point arithmetics supported by specific device; each subsequent floating point format is an extension over previous one.

Memory overhead

Memory overhead of ZeptoVM-Small (in addition to overhead of ZeptoVM-Tiny) is 1+stack_entry_size*ZEPTOVM_EXPR_STACK_SIZE.

Turing Completeness

Starting from Zepto VM-Small, Zepto VM implementations are technically Turing complete. TODO: check

Zepto VM-Medium

Zepto VM-Medium adds support for parallel execution, and TODO: remote debugging.

| ZEPTOVM_OP_PARALLEL | N-PSEUDO-THREADS | PSEUDO-THREAD-1-INSTRUCTIONS-SIZE | PSEUDO-THREAD-1-INSTRUCTIONS | ... | PSEUDO-THREAD-N-INSTRUCTIONS-SIZE | PSEUDO-THREAD-N-INSTRUCTIONS |

where ZEPTOVM_OP_PARALLEL is 1-byte opcode, N-PSEUDO-THREADS is a number of “pseudo-threads” requested, ‘PSEUDO-THREAD-X-INSTRUCTIONS-SIZE’ is Encoded-Unsigned-Int<max=2> (as defined in SmartAnthill 2.0 Protocol Stack document) size of PSEUDO-THREAD-X-INSTRUCTIONS, and PSEUDO-THREAD-X-INSTRUCTIONS is a sequence of Zepto VM commands which belong to the pseudo-thread #X. Within PSEUDO-THREAD-X-INSTRUCTIONS, all commands of Zepto VM are allowed, with an exception of PARALLEL, EXIT and any jump instruction which leads outside of the current pseudo-thread.

PARALLEL instruction starts processing of several pseudo-threads. PARALLEL instruction is considered completed when all the pseudo-threads reach the end of their respective instructions. Normally, it is implemented via state machines (see SmartAnthill Zepto OS document for details), so it is functionally equivalent to “green threads” (and not to “native threads”).

When PARALLEL instruction execution is started, original “reply buffer” is “frozen” and cannot be modified by any of the pseudo-threads; each pseudo-thread has its own “reply buffer” which is empty at the beginning of the pseudo-thread execution. After PARALLEL instruction is completed (i.e. all pseudo-threads have been terminated), the original “reply buffer” which existed before PARALLEL instruction has started, is restored, and all the pseudo-thread “reply buffers”, as they exist right after respective pseudo-threads are terminated, are added to the end of the original “reply buffer”; this allows instructions such as EXEC and PUSHREPLY to be used within the pseudo-threads. This adding of pseudo-thread “reply buffers” to the end of original “reply buffer” always happens in order of pseudo-thread descriptions within the PARALLEL instruction (and therefore does not depend on race conditions between different pseudo-threads).

When PARALLEL instruction execution is started, original expression stack is “frozen” and cannot be modified by any of the pseudo-threads (though it may be read using EXPR*_EX instructions as described below); each pseudo-thread has its own expression stack which is empty at the beginning of the pseudo-thread execution. After PARALLEL instruction is completed (i.e. all pseudo-threads have been terminated), the original expression stack which existed before PARALLEL instruction has started, is restored, and all the pseudo-thread expression stacks remaining after respective pseudo-threads are terminated, are added to the top of this original stack; this makes it easy to pass information from pseudo-threads to the main program. This adding of pseudo-thread expression stacks on top of original expression stack always happens in order of pseudo-thread descriptions within the PARALLEL instruction (and therefore does not depend on race conditions between different pseudo-threads).
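The deterministic merge after PARALLEL applies equally to reply buffers and expression stacks, and can be sketched as follows. Pseudo-thread results are keyed by their position in the PARALLEL instruction, so the order in which pseudo-threads happen to finish has no effect; the function name and the dictionary representation are illustrative only.

```python
def finish_parallel(original, results_by_thread):
    """Return the restored original followed by per-thread results.

    original          -- frozen reply buffer (or expression stack) as a list
    results_by_thread -- dict mapping pseudo-thread position within the
                         PARALLEL instruction to that thread's final list
    """
    merged = list(original)
    for thread_index in sorted(results_by_thread):   # description order
        merged.extend(results_by_thread[thread_index])
    return merged


# Pseudo-threads may terminate in any order...
results = {}
results[2] = ["c1"]
results[0] = ["a1", "a2"]
results[1] = []
# ...but the merged result always follows the order of description.
merged = finish_parallel(["r0"], results)
```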

Caution: in addition to any memory overhead listed for Zepto VM-Medium, there is an additional implicit memory overhead associated with PARALLEL instruction: namely, all the states of all the plugin state machines which are run in parallel, need to be kept in RAM simultaneously. Normally, it is not much, but for really constrained environments it might become a problem.

Note on | ZEPTOVM_OP_EXPR*_EX | within PARALLEL pseudo-thread

EXPRUNOP_EX, EXPRBINOP_EX, and JMPIFEXPR_EX instructions, when applied within PARALLEL pseudo-thread, allow to access original (pre-PARALLEL) expression stack. That is:

  • for positive EXPR-OFFSET values, first EXPR-OFFSET values identify expression stack items within the pseudo-thread, but when pseudo-thread values are exhausted, increasing EXPR-OFFSET starts to go into pre-PARALLEL expression stack.
  • negative EXPR-OFFSET values address pre-PARALLEL expression stack (as usual, starting from the bottom of the stack); if pre-PARALLEL expression stack is exhausted, negative EXPR-OFFSET values start to address pseudo-thread’s own expression stack
  • for all EXPR-OFFSET values, if POP-FLAG is specified and it would affect pre-PARALLEL expression stack, it causes a ZEPTOVM_EXPRSTACKFROZENVIOLATION exception.

TODO: (Medium Level) ZEPTOVM_NETINTERRUPT

TODO: (orthogonal to VM level, starting from Small?) multiplication/division, multiplication/log/exp/sin(?), support for piecewise table maths (with piecewise table supplied as a part of command)

Implementation notes

To implement Zepto VM-Medium, in addition to PC, reply-offset-stack, and expression stack as required by Zepto VM-Small, the following changes need to be made:

  • PC for each pseudo-thread needs to be maintained; maximum number of pseudo-threads is a ZEPTOVM_MAX_PSEUDOTHREADS parameter of Zepto VM-Medium (which is stored in SmartAnthill DB on SmartAnthill Client and reported via DEVICECAPS instruction).
  • expression stack needs to be replaced with an array of expression stacks (to accommodate PARALLEL instruction); in practice, it is normally implemented by extending expression stack (say, doubling it) and keeping track of sub-expression stacks via array of offsets (with size of ZEPTOVM_MAX_PSEUDOTHREADS) within the expression stack. See SmartAnthill Zepto OS document for details.
  • to support replies being pushed to “reply buffer” in parallel, an additional array of 2-byte offsets of current replies needs to be maintained, with a size of ZEPTOVM_MAX_PSEUDOTHREADS.

Memory overhead

Memory overhead of ZeptoVM-Medium (in addition to overhead of ZeptoVM-Small) is 1+4*ZEPTOVM_MAX_PSEUDOTHREADS, though increasing the ZEPTOVM_EXPR_STACK_SIZE parameter of ZeptoVM-Small is also advised.
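As a quick check of the formula above, the following sketch computes the additional overhead for a given pseudo-thread limit. The function name is hypothetical, and the result is in whatever unit the formula itself uses (the text does not state it explicitly).

```c
#include <stddef.h>

/* Additional overhead of Zepto VM-Medium on top of Zepto VM-Small,
 * per the formula 1 + 4 * ZEPTOVM_MAX_PSEUDOTHREADS.
 * It does not include any increase of the expression stack size. */
static size_t toy_vm_medium_extra_overhead(size_t max_pseudothreads)
{
    return 1u + 4u * max_pseudothreads;
}
```

For the typical range of 4 to 8 pseudo-threads this gives 17 to 33.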

Appendix

Statistics for different Zepto-VM levels:

Level | Opcodes Supported | Typical Parameter Values | Amount of RAM used (with typical parameter values)
Zepto VM-One | TODO | - | 1 to 2
Zepto VM-Tiny | TODO | ZEPTOVM_REPLY_STACK_SIZE=4 to 8 | (1 to 2)+(5 to 9) = 6 to 11
Zepto VM-Small | TODO | ZEPTOVM_EXPR_STACK_SIZE=4 to 32, ZEPTOVM_EXPR_FLOAT_TYPE=HALF-FLOAT | (6 to 11)+(9 to 65) = 15 to 76
Zepto VM-Medium | TODO | ZEPTOVM_EXPR_STACK_SIZE=32 to 128, ZEPTOVM_MAX_PSEUDOTHREADS=4 to 8 | TBD

Reference Implementation

SmartAnthill Zepto OS

Version:v0.2.9

NB: this document relies on certain terms and concepts introduced in SmartAnthill 2.0 Overall Architecture document; please make sure to read it before proceeding.

SmartAnthill project is intended to operate on MCUs which are extremely resource-constrained. In particular, currently we’re aiming to run SmartAnthill on MCUs with as little as 512 bytes of RAM. This makes the task of implementing SmartAnthill on such MCUs rather non-trivial. Present document aims to describe our approach to implementing SmartAnthill on MCU side, named ‘Zepto OS’. It should be noted that what is described here is merely our approach to a reference SmartAnthill implementation; it is not the only possible approach, and any other implementation which is compliant with SmartAnthill protocol specification, is welcome (as long as it can run on target MCUs and meet energy consumption and other requirements).

“Zepto OS” is a network-oriented secure operating system, intended to run on extremely resource-constrained devices. It implements necessary parts of SmartAnthill 2.0 Protocol Stack, including Zepto VM.

“Zepto” is a prefix in the metric system (SI system) which denotes a factor of 10^-21. This is 10^12 times less than “nano”, a billion times less than “pico”, and a million times less than “femto”. As of now, ‘zepto’ is the second smallest prefix in SI system (we didn’t take the smallest one, because there is always room for improvement).

Zepto OS is the smallest OS we were able to think about, with an emphasis on using as little RAM as possible. Nevertheless, it has network support (more specifically, it supports necessary parts of SmartAnthill 2.0 Protocol Stack), strong encryption support (both AES-128 and Speck), and “green threads”. All of this in as little as 512 bytes of RAM. To achieve it, Zepto OS has quite a specific design.

Assumptions (=mother of all screw-ups)

  1. RAM is the most valuable resource we need to deal with. Limits on RAM are on the order of 512-8192 bytes, while limits on code size are on the order of 8192-32768 bytes.
  2. EEPROM (or equivalent) is present on all the supported chips
  3. There is a limit on number of EEPROM operations (such as 10000 to 100000 during MCU lifetime, depending on MCU)
  4. This limit is usually per-writing-location, and EEPROM writes are done with some granularity which is less than whole EEPROM size. One expected granularity size is 32 bits-per-write; if EEPROM on MCU has such a granularity, it means that even if we’re writing one byte, we’re actually writing 4 bytes (and reducing the number of available writes for all 4 bytes).
  5. There are MCUs out there which allow switching to “sleep” mode
  6. During such “MCU sleep”, RAM may or may not be preserved (NB: if RAM is preserved, it usually means higher energy consumption)
  7. During such “MCU sleep”, receiver may or may not be turned off (NB: this issue is addressed in detail in SmartAnthill 2.0 Protocol Stack and SmartAnthill Guaranteed Delivery Protocol (SAGDP) documents).
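The write-granularity assumption (item 4) has a practical consequence that a short sketch can make concrete: the wear caused by a write depends on how many granules it touches, not on how many bytes it carries. The sketch below assumes the 32-bit (4-byte) granularity mentioned above; EEPROM_GRANULE and eeprom_granules_touched() are hypothetical names.

```c
#include <stdint.h>

#define EEPROM_GRANULE 4u  /* assumed granularity: 32 bits per write */

/* Number of granules physically rewritten (and therefore worn) by a
 * write of `size` bytes starting at byte address `addr`. */
static uint16_t eeprom_granules_touched(uint16_t addr, uint16_t size)
{
    if (size == 0)
        return 0;
    uint32_t first = addr / EEPROM_GRANULE;
    uint32_t last  = ((uint32_t)addr + size - 1u) / EEPROM_GRANULE;
    return (uint16_t)(last - first + 1u);
}
```

Under this assumption a one-byte write wears a whole granule, and a two-byte write that straddles a granule boundary wears two.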

Layers and Libraries

Zepto OS is divided into three parts:

  • Zepto OS kernel (the same for all MCUs)
  • MCU- and device-dependent libraries (Hardware Abstraction Layer, HAL)
  • SmartAnthill Plugins (see SmartAnthill Plugins document; from the point of view of Zepto OS plugins are similar to device drivers).

Memory Architecture

As RAM is considered the most valuable resource, it is suggested to pay special attention to RAM usage.

SmartAnthill memory architecture is designed as follows:

  • Two large blocks of RAM are pre-allocated: a) stack (TODO: size), b) “Zepto Heap”
  • In addition, there are fixed-size states of various state machines which implement certain portions of SmartAnthill protocol stack (see details below). These fixed-size states may either reside globally, or on stack of “main loop” (see below)

Zepto Heap

Zepto Heap provides access to memory allocation. However, to enable work within extremely tight memory constraints, “Zepto Heap” is unusual in a sense that all memory blocks are movable. Therefore, pointers to these memory blocks may easily change, when calling various functions (but not between such calls). As a rule of thumb, any potentially “blocking” or “allocating” function MAY change all the pointers; to avoid problems, all the pointers MUST be re-read from respective handles after each such function call. Outside of Zepto OS the API is built in a way that this requirement is usually not a problem, as long as handles are treated as completely opaque. TODO: describe exceptions if any

Each memory block within Zepto Heap is represented by REQUEST_REPLY_HANDLE. REQUEST_REPLY_HANDLE provides parsing and appending functionality which is described in SmartAnthill Plugins document. Other functionality provided by REQUEST_REPLY_HANDLE:

void zepto_convert_reply_to_request(REQUEST_REPLY_HANDLE);

zepto_convert_reply_to_request() function discards request within REQUEST_REPLY_HANDLE, and converts reply contained within it, into request. This function is used in “main loop” as described below.

Data which corresponds to REQUEST_REPLY_HANDLEs is stored in a global “heap control” structure; maximum number of simultaneously supported REQUEST_REPLY_HANDLEs is limited to a compile-time-constant ZEPTO_MAX_HEAP_BLOCKS.

Whenever ‘append’ doesn’t fit into free memory right after the block being appended to, Zepto Heap moves blocks (the block being appended to, other blocks, or any combination of these) to allow appending. This in turn causes pointers to these blocks to be invalidated (as noted above).
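The pointer-invalidation rule is easiest to see on a toy model. The sketch below is not the Zepto Heap implementation; it is a deliberately simplified stand-in in which blocks are stored back to back, so appending to one block shifts every later block. All names (TOY_HANDLE, toy_ptr(), toy_append() and so on) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define TOY_HEAP_SIZE  64u
#define TOY_MAX_BLOCKS 4u

typedef uint8_t TOY_HANDLE;   /* opaque block index, valid range 0..TOY_MAX_BLOCKS-1 */

static uint8_t  toy_heap[TOY_HEAP_SIZE];
/* Block h occupies [toy_offset[h], toy_offset[h + 1]); all blocks start empty. */
static uint16_t toy_offset[TOY_MAX_BLOCKS + 1];

/* Current address of a block; valid only until the next allocating call. */
static uint8_t *toy_ptr(TOY_HANDLE h)  { return toy_heap + toy_offset[h]; }
static uint16_t toy_size(TOY_HANDLE h) { return (uint16_t)(toy_offset[h + 1] - toy_offset[h]); }

/* Appends n bytes to block h, moving all later blocks to make room.
 * Returns 0 on success, -1 if the heap is full. */
static int toy_append(TOY_HANDLE h, const void *data, uint16_t n)
{
    uint16_t used = toy_offset[TOY_MAX_BLOCKS];
    if (n > TOY_HEAP_SIZE - used)
        return -1;

    uint16_t end = toy_offset[h + 1];
    memmove(toy_heap + end + n, toy_heap + end, (size_t)(used - end));
    memcpy(toy_heap + end, data, n);
    for (unsigned i = h + 1u; i <= TOY_MAX_BLOCKS; i++)
        toy_offset[i] = (uint16_t)(toy_offset[i] + n);
    return 0;
}
```

In this model, a pointer obtained from toy_ptr(1) becomes stale as soon as toy_append(0, ...) succeeds, which is why callers are expected to re-read pointers from handles after every potentially allocating call.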

REPLY_HANDLE

Plugin APIs are using REPLY_HANDLEs; REPLY_HANDLEs are implemented as REQUEST_REPLY_HANDLEs with limited functionality (i.e. no parser can be created from REPLY_HANDLE, but zepto_append_*() functions do work for REPLY_HANDLEs).

ZEPTO_PARSER

ZEPTO_PARSER is an opaque structure which is used for parsing packets. It is also used by SmartAnthill Plugins (as described in SmartAnthill Plugins document). In addition to functions described in SmartAnthill Plugins document, ZEPTO_PARSER supports the following functionality:

void zepto_create_parser(ZEPTO_PARSER* parser, REQUEST_REPLY_HANDLE request_reply);

zepto_create_parser() function initializes ZEPTO_PARSER structure and prepares it for subsequent use (similar to OO constructor).

void zepto_create_parser_from_parsed_block(ZEPTO_PARSER* target_parser, ZEPTO_PARSER* source_parser, size_t sz);

zepto_create_parser_from_parsed_block() initializes a new ZEPTO_PARSER from a block of size sz within existing parser (similar to another OO constructor). This is used to support nested parsing (which in turn enables plugin processing as described below).

Error Handling and Zepto Exceptions

In Zepto OS, errors are normally handled via “Zepto Exceptions”. Zepto exceptions are implemented as a series of macros, which are described in SmartAnthill Plugins document.

Zepto exceptions are implemented either via setjmp/longjmp (if the call is supported on target MCU), or without them. ZEPTO_UNWIND(x) macro expands to a no-op if setjmp/longjmp is used, and into “if(exception_pending)return x;” otherwise. ZEPTO_THROW(exception_code) macro records (a) exception code, (b) __LINE__ where the exception has occurred, (c) hash of __FILE__ where exception has occurred. If Zepto Exception occurred within the plugin, all these parameters are then passed back to SmartAnthill Client as a part of “reply frame” (see Zepto VM document for details).
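A minimal sketch of the non-setjmp/longjmp variant is given below. It uses TOY_* names on purpose, so that it is not mistaken for the actual Zepto OS macros, and it keeps a pointer to __FILE__ rather than a 2-byte hash, because the hash algorithm is not yet defined. How the throwing function itself returns after TOY_THROW() is an assumption of this sketch.

```c
#include <stdint.h>

/* Pending-exception record (hypothetical layout). */
static uint8_t     toy_exception_pending;
static uint8_t     toy_exception_code;
static uint16_t    toy_exception_line;
static const char *toy_exception_file;  /* real implementation: 2-byte hash */

#define TOY_THROW(code) do {                        \
        toy_exception_pending = 1;                  \
        toy_exception_code    = (code);             \
        toy_exception_line    = (uint16_t)__LINE__; \
        toy_exception_file    = __FILE__;           \
    } while (0)

/* Placed after every call that may throw: propagates the exception upwards. */
#define TOY_UNWIND(x) do { if (toy_exception_pending) return x; } while (0)

static int toy_read_sensor(int raw)
{
    if (raw < 0) {
        TOY_THROW(7);   /* 7 is an arbitrary example exception code */
        return 0;
    }
    return raw * 2;
}

static int toy_plugin(int raw)
{
    int v = toy_read_sensor(raw);
    TOY_UNWIND(-1);     /* leave immediately if toy_read_sensor() threw */
    return v + 1;
}
```

The cost of this approach is one flag check after each call that may throw, in exchange for not depending on setjmp/longjmp support on the target MCU.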

2-byte hash of __FILE__ (a.k.a. zepto_fname_hash(__FILE__)) is calculated as follows:

  • removing dir name
  • calculating hash in TODO way (TODO: get hash algorithm from std:: or boost::)
  • taking 2 TODO bytes out of it

If ZEPTO_UNWIND() unwinding mechanism (and not setjmp/longjmp unwinding) is used, Zepto OS MAY be able to collect call trace during unwinding. If available, call trace information is represented as a sequence of ‘frames’, where each frame is a pair consisting of __LINE__ and zepto_fname_hash(__FILE__). If Zepto Exception occurs within plugin, call trace information MAY be passed back to SmartAnthill Client as a part of “reply frame” (see Zepto VM document for details).

zeptoerr

zeptoerr is a pseudo-stream which is somewhat similar to traditional stderr. However, implementation is very different.

zeptoerr is used as described in SmartAnthill Plugins document. ZEPTOERR() macro compiles (conditionally) to a call to an underlying zepto_error() function. Behaviour of zepto_error() depends on compile-time defines, and can be either “full” mode, or “short” mode.

In both modes, zepto_error() pushes result to a predefined REPLY_HANDLE.

In “full” mode, zepto_error() pushes the whole string there, and format of the records in REPLY_HANDLE is as follows:

| RECORD-LENGTH | BODYPART-ID | FORMAT-STRING | PARAM-LIST |

where RECORD-LENGTH is Encoded-Unsigned-Int<max=2>, specifying length of zeptoerr record, BODYPART-ID is an Encoded-Signed-Int<max=2> field, FORMAT-STRING is a null-terminated string, and PARAM-LIST is a list of pairs | PARAM-TYPE | PARAM-DATA |. Supported values of PARAM-TYPE are the following:

  • FOUR_BYTE_FIELD (assumes ‘SmartAnthill endianness’ as described in SmartAnthill 2.0 Protocol Stack document); FOUR_BYTE_FIELD is used for %i on platforms where int is 32-bit
  • TWO_BYTE_FIELD (assumes ‘SmartAnthill endianness’ as described in SmartAnthill 2.0 Protocol Stack document); TWO_BYTE_FIELD is used for %i on platforms where int is 16-bit
  • FLOAT_FIELD (using encoding as described in SmartAnthill 2.0 Protocol Stack document for floats); %f is passed as FLOAT, unless platform uses half-float library to simulate floats
  • HALF_FLOAT_FIELD (using encoding as described in SmartAnthill 2.0 Protocol Stack document for half-floats); %f is passed as HALF_FLOAT, if platform uses half-float library to simulate floats

In “short” mode, FORMAT-STRING is replaced with 2-byte hash (the same hash which is used for hashing filenames for error handling purposes, as described above).

“Main Loop” a.k.a. “Main Pump”

Zepto OS is implemented as a “main loop”, which calls different functions and performs other tasks as follows:

  • first, “main loop” calls a function zepto_hal_incoming_packet(REQUEST_REPLY_HANDLE data), which waits for an incoming packet and fills data with an incoming packet. This function is a part of device-specific library. If incoming packets can arrive while the “main loop” is running, i.e. asynchronously, they need to be stored in a separate buffer and handled separately.
  • then, “main loop” calls one of “receiving” protocol handlers (such as “receiving” portion of SADLP-CAN), with the following prototype: byte protocol_handler(REQUEST_REPLY_HANDLE);
  • NB: all calls of protocol handlers (both “receiving” and “sending”) are made right from the program “main loop” (and not one protocol handler calling another one), to reduce stack usage.
  • after protocol handler has processed the data, it returns to “main loop”. Now previous request within REQUEST_REPLY_HANDLE is not needed anymore, so “main loop” calls zepto_convert_reply_to_request() to discard previous request and to convert reply of the previous protocol layer into request of the next protocol layer.
  • after such zepto_convert_reply_to_request() call, we can repeat the process of calling the “receiving” “protocol handler” (such as SAGDP, and then SACCP and Zepto VM).
  • when Zepto VM is called (it has prototype zepto_vm(REQUEST_REPLY_HANDLE, WaitingFor* waiting_for); WaitingFor structure is described in detail in ‘Asynchronous Returns’ subsection below), it starts parsing the request and executing commands. Whenever Zepto VM encounters an EXEC command (see Zepto VM document for details), Zepto VM creates a nested ZEPTO_PARSER (to parse plugin data), and calls an appropriate plugin handler (passing this nested parser as a parameter); prototype of plugin handler is specified in SmartAnthill Plugins document. After plugin handler returns, Zepto VM merges plugin reply into its own reply. This ensures proper and easy forming of “reply buffer” as required by Zepto VM specification.
  • after the Zepto VM has processed the data, “main loop” doesn’t need the command anymore, so it can again call zepto_convert_reply_to_request() and call SAGDP “sending” protocol handler.
  • after “sending” protocol handler returns, “main loop” calls zepto_convert_reply_to_request() and continues calling the “sending” protocol handlers (and zepto_convert_reply_to_request() after each protocol handler call) until the last protocol handler is called; at this point, data is prepared for feeding to the physical channel.
  • at this point, “main loop” calls [TODO] function (which belongs to device-specific library) to pass data back to the physical layer.

In a sense, “main loop” is always “pumping” the data from one “protocol handler” to another one, always keeping “data to be processed” within the same REQUEST_REPLY_HANDLE, calling zepto_convert_reply_to_request() (which effectively discards ‘old’ request data and converts reply data into ‘new’ request data) as soon as ‘old’ request data becomes unnecessary. This “pumping” zepto_convert_reply_to_request()-based approach avoids storing multiple copies of data (only two copies are stored at any given moment), and therefore saves on the amount of RAM required for SmartAnthill stack operation.
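The “pumping” pattern can be sketched in a few lines of C. Everything below is a simplified stand-in: TOY_RRH uses two fixed-size string buffers instead of a real REQUEST_REPLY_HANDLE, the three toy handlers only imitate a “receiving” layer, the VM and a “sending” layer, and all toy_* names and return codes are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct { char request[32]; char reply[32]; } TOY_RRH;
typedef uint8_t (*toy_handler)(TOY_RRH *);

enum { TOY_OK = 0, TOY_IGNORE_PACKET = 1 };

/* Discards the old request; the reply becomes the next layer's request. */
static void toy_convert_reply_to_request(TOY_RRH *h)
{
    memcpy(h->request, h->reply, sizeof h->request);
    h->reply[0] = '\0';
}

static uint8_t toy_rx_layer(TOY_RRH *h)   /* strips a 1-byte 'H' header */
{
    if (h->request[0] != 'H') return TOY_IGNORE_PACKET;
    strcpy(h->reply, h->request + 1);
    return TOY_OK;
}

static uint8_t toy_vm(TOY_RRH *h)         /* "executes": wraps data in [] */
{
    size_t n = strlen(h->request);
    if (n + 3 > sizeof h->reply) return TOY_IGNORE_PACKET;
    h->reply[0] = '[';
    memcpy(h->reply + 1, h->request, n);
    h->reply[n + 1] = ']';
    h->reply[n + 2] = '\0';
    return TOY_OK;
}

static uint8_t toy_tx_layer(TOY_RRH *h)   /* prepends the 'H' header */
{
    size_t n = strlen(h->request);
    if (n + 2 > sizeof h->reply) return TOY_IGNORE_PACKET;
    h->reply[0] = 'H';
    memcpy(h->reply + 1, h->request, n + 1);
    return TOY_OK;
}

/* All handlers are called from one flat loop, never from each other. */
static uint8_t toy_pump(TOY_RRH *h)
{
    static const toy_handler chain[] = { toy_rx_layer, toy_vm, toy_tx_layer };
    for (size_t i = 0; i < sizeof chain / sizeof chain[0]; i++) {
        uint8_t rc = chain[i](h);
        if (rc != TOY_OK) return rc;       /* e.g. stop processing this packet */
        toy_convert_reply_to_request(h);
    }
    return TOY_OK;  /* h->request now holds data ready for the physical layer */
}
```

Calling every handler from the same loop keeps the call stack shallow, which is the stack-usage point made in the NB above.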

Return Codes

Each protocol handler returns error code. Error codes are protocol-handler specific and may include such things as IGNORE_PACKET (causing “main loop” to stop processing of current packet and start waiting for another one), FATAL_ERROR_REINIT (causing “main loop” to perform complete re-initialization of the whole protocol stack), WAITING_FOR (described below in ‘Asynchronous Returns’ subsection) and so on.

Asynchronous Returns from Zepto VM

In addition to parameters which are usual for protocol handlers, Zepto VM also receives a pointer to a struct WaitingFor { uint16_t sec; uint16_t msec; byte pins_to_wait[(NPINS+7)/8]; byte pin_values_to_wait[(NPINS+7)/8]; }; When Zepto VM execution is paused to wait for some event, it SHOULD return to “main loop” with an error code = WAITING_FOR, filling in this parameter with time which it wants to wait, and filling in any pins (with associated pin values) for which it wants to wait. These wait conditions are always treated as waiting for any one of them to happen, i.e. “wait for time OR for pin#2==1 OR for pin#4==0”.
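The “any of the conditions” semantics can be sketched as follows. NPINS=16, the bit layout (pin p stored in byte p/8, bit p%8) and the helper names toy_wait_for_pin() and toy_wait_is_over() are assumptions made for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define NPINS 16            /* assumed number of pins */
typedef uint8_t byte;

typedef struct {
    uint16_t sec;
    uint16_t msec;
    byte pins_to_wait[(NPINS + 7) / 8];
    byte pin_values_to_wait[(NPINS + 7) / 8];
} WaitingFor;

/* Requests a wait for `pin` to reach `value`. */
static void toy_wait_for_pin(WaitingFor *wf, unsigned pin, bool value)
{
    byte mask = (byte)(1u << (pin % 8));
    wf->pins_to_wait[pin / 8] |= mask;
    if (value) wf->pin_values_to_wait[pin / 8] |= mask;
    else       wf->pin_values_to_wait[pin / 8] &= (byte)~mask;
}

/* True if ANY condition holds: the timeout has elapsed, OR at least one
 * watched pin currently has the value it is being waited for. */
static bool toy_wait_is_over(const WaitingFor *wf, uint32_t elapsed_ms,
                             const byte *pin_state)
{
    uint32_t timeout_ms = (uint32_t)wf->sec * 1000u + wf->msec;
    if (elapsed_ms >= timeout_ms)
        return true;
    for (unsigned i = 0; i < (NPINS + 7) / 8; i++) {
        byte matches = (byte)~(pin_state[i] ^ wf->pin_values_to_wait[i]);
        if (matches & wf->pins_to_wait[i])
            return true;
    }
    return false;
}
```

A “main loop” would evaluate something like toy_wait_is_over() while also watching for incoming packets, and call Zepto VM back once it returns true.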

It is responsibility of the “main loop” to perform waiting as requested by Zepto VM and call it back when the condition is met (passing NULL for src).

During such a wait, “main loop” is supposed to wait for incoming packets too; if an incoming packet comes in during such a wait, “main loop” should handle incoming packet first (before reporting to ‘Zepto VM’ that its requested wait is over).

Zepto VM may issue WAITING_FOR either as a result of SLEEP instruction, or as a result of plugin handler returning WAITING_FOR (see example below).

TODO: MCUSLEEP?

State Machines

Model which is described above in “Main Loop” section, implies that all SmartAnthill protocol handlers (including Zepto VM) are implemented as “state machines”; state of these “state machines” should be fixed-size and belongs to “fixed-size states” memory area mentioned in “Memory Architecture” section above.

Plugins

Zepto OS plugins MUST be compliant with SmartAnthill Plugin specification, as outlined in SmartAnthill Plugins document.

Programming Guidelines

The following guidelines are considered important to ensure that only absolutely minimum amount of RAM is used:

  • Dynamic allocation is heavily discouraged. When used, it MUST be based on REQUEST_REPLY_HANDLEs as described above (yes, it means no malloc())
  • No third-party libraries (except for those specially designed for MCUs) are allowed
  • Every on-stack array MUST be justified as necessary, with the rationale presented in comments.

Support for PARALLEL Zepto VM instruction

PARALLEL instruction is supported by Zepto VM, starting from ZeptoVM-Medium. It allows for pseudo-parallel execution (i.e. when plugin A is waiting, plugin B may continue to work).

Implementing PARALLEL instruction is tricky, in particular, because we don’t know how much space to allocate for each pseudo-thread to use from “reply buffer”. To get around this problem, we’ve encapsulated reply buffer as an opaque REQUEST_REPLY_HANDLE.

In addition, to accommodate per-pseudo-thread expression stacks, at the moment of PARALLEL instruction we perform a ‘virtual split’ of the remaining space in “expression stack” into “per-pseudo-thread expression stacks”; to implement this ‘virtual split’, we keep an array of offsets of these “per-pseudo-thread expression stacks” within main “expression stack”, and move them as necessary to accommodate expression stack requests (in a manner similar to the handling of “reply sub-buffers” described above).

Running on top of another OS

Zepto OS is written in generic C code, and can be compiled and run as an application on top of another OS, as long as Zepto OS HAL is implemented. As of now, Zepto OS can run on top of Windows; we also plan to add support for Linux and Mac OS X.

Hardware Abstraction Layer (HAL)

HAL is intended to enable Zepto OS to run on different architectures. Below is the list of functions which HAL needs to provide:

TODO: error codes

int zepto_hal_incoming_packet(REQUEST_REPLY_HANDLE data);

where bufSize is an inout parameter, taking original buffer size and returning packet size back. get_incoming_packet() returns error code (TODO: codes)

TODO: more and more and more

HAL interface

for preliminary discussion

Version:v0.0.0

HAL interface consists of waiting and time, communication, and EEPROM access functions (TODO: any other?)

Data structures

  • WAITING_FOR

    describes various objects/events that can be waited for; accordingly, it must be able to address all such objects (in particular, SPI, I2C, etc.; details are TBD)

  • SA_TIME_VAL

    expresses time in a way understood by HAL; a set of converting functions (or macros) must be provided to go back and forth between SA_TIME_VAL and usual time units (such as seconds), and to perform necessary operations. Internal structure of SA_TIME_VAL must be of no interest to OS.

Initializing functions

void hal_init();

performs whatever initialization is needed after system reboot (communication, EEPROM, etc.) from hardware point of view (for instance, may include functionality of a presently existing communication_initialize(), hal_init_eeprom_access(), etc.).

Waiting functions

uint8_t hal_wait_for( WAITING_FOR* wf );

waits for objects described in WAITING_FOR struct. Returns when one or more events happen.

void hal_mcu_sleep( uint16_t sec, uint8_t transmitter_state_on_exit );

causes MCU to go to sleep mode and returns at the end of this period; no other processing is done until this function returns (for obvious reasons). TBD: way to supply time interval (seconds vs. SA_TIME_VAL)

void hal_gravely_power_inefficient_micro_sleep( SA_TIME_VAL* timeval );

presently named just_sleep( SA_TIME_VAL* timeval ), a blocking call; the time interval must be kept short (say, on the order of 1 ms at most)

Time functions

void hal_get_time( SA_TIME_VAL* t );

fills SA_TIME_VAL struct; the result may then be passed as-is to various time- and wait-related functions

  • TODO: add time conversion and other related functions/macros

Communication functions

uint8_t hal_send_packet( MEMORY_HANDLE mem_h, uint8_t bus_id, uint8_t intrabus_id );

TODO: think about ‘NOT_USED’ value for intrabus_id

bool hal_get_packet_bytes( MEMORY_HANDLE mem_h );

used for actually getting the packet bytes (for instance, when hal_wait_for indicates that packet bytes are available). Called repeatedly together with hal_wait_for; returns true when the whole packet is received. Note: by definition, packet ends when this call returns true; whether packet is integral will be checked beyond HAL.
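The repeated hal_wait_for() / hal_get_packet_bytes() usage described above might look roughly like the sketch below. Since MEMORY_HANDLE and WAITING_FOR are not yet specified, placeholder definitions are used, and the two HAL functions are replaced by fakes that simulate a packet arriving in three chunks; toy_receive_packet() and fake_chunks_left are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t MEMORY_HANDLE;                        /* placeholder type */
typedef struct { uint8_t wait_packet; } WAITING_FOR;  /* placeholder; real layout is TBD */

/* Fake HAL for illustration: the packet arrives in three chunks. */
static uint8_t fake_chunks_left = 3;

static uint8_t hal_wait_for(WAITING_FOR *wf)
{
    (void)wf;
    return 0;   /* return codes are not yet specified */
}

static bool hal_get_packet_bytes(MEMORY_HANDLE mem_h)
{
    (void)mem_h;
    return --fake_chunks_left == 0;   /* true once the whole packet is in */
}

/* Waits and reads until the HAL reports a complete packet.
 * Returns the number of wait/read rounds that were needed. */
static unsigned toy_receive_packet(MEMORY_HANDLE mem_h)
{
    WAITING_FOR wf = { 1 };
    unsigned rounds = 0;
    do {
        hal_wait_for(&wf);   /* blocks until packet bytes are available */
        rounds++;
    } while (!hal_get_packet_bytes(mem_h));
    return rounds;
}
```

A real caller would also inspect the result of hal_wait_for() to tell packet bytes apart from other events; that part is omitted because the return codes are still TODO.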

void hal_turn_receiver_on_off( bool turn_on );

does nothing if receiver is already in a requested state

EEPROM access functions

bool hal_eeprom_write( const uint8_t* data, uint16_t size, uint16_t address );

self-explanatory.

bool hal_eeprom_read( uint8_t* data, uint16_t size, uint16_t address);

self-explanatory.

void hal_eeprom_flush();

when this function returns, results of previous ‘write’ operations are guaranteed to be actually stored in EEPROM. Note: depending on a particular architecture this may be an empty call.
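A typical write, flush, read-back sequence built on these three calls could look as follows. The HAL functions here are fakes backed by a RAM array so that the sketch is self-contained; treating a true return value as success is an assumption, as the return values are not described above, and toy_store_verified() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Fake EEPROM backed by RAM, for illustration only. */
static uint8_t fake_eeprom[64];

static bool hal_eeprom_write(const uint8_t *data, uint16_t size, uint16_t address)
{
    if ((uint32_t)address + size > sizeof fake_eeprom) return false;
    memcpy(fake_eeprom + address, data, size);
    return true;   /* assumption: true means success */
}

static bool hal_eeprom_read(uint8_t *data, uint16_t size, uint16_t address)
{
    if ((uint32_t)address + size > sizeof fake_eeprom) return false;
    memcpy(data, fake_eeprom + address, size);
    return true;
}

static void hal_eeprom_flush(void) { /* may be an empty call on some architectures */ }

/* Writes up to 16 bytes, flushes, then reads back to verify the stored data. */
static bool toy_store_verified(const uint8_t *data, uint16_t size, uint16_t address)
{
    uint8_t check[16];   /* on-stack array: bounded read-back buffer */
    if (size > sizeof check) return false;
    if (!hal_eeprom_write(data, size, address)) return false;
    hal_eeprom_flush();  /* data is guaranteed to be stored only after this returns */
    if (!hal_eeprom_read(check, size, address)) return false;
    return memcmp(check, data, size) == 0;
}
```

Given the limited number of EEPROM writes assumed earlier in this document, such a helper should be called sparingly.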

Digital pin operation functions

bool hal_read_digital_pin( uint16_t pin_num );
void hal_write_digital_pin( uint16_t pin_num, bool value );

SPI and I2C operation functions

In the following calls each size is in bits. TODO: discuss the order of bits within an unsigned int representing command/data

void hal_start_sending_spi_command_16( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz);
void hal_start_sending_spi_command_32( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint32_t command, uint8_t command_sz);
void hal_start_sending_i2c_command_16( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t command, uint8_t command_sz);
void hal_start_sending_i2c_command_32( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint32_t command, uint8_t command_sz);

Each of the above hal_start_sending_*() calls starts an operation and returns immediately; to know (if at all possible) that the request has been performed, caller should wait for a respective spi_id / i2c_id by calling hal_wait_for(), and hal_wait_for() should return as soon as HAL finds the operation is over.

uint8_t hal_start_receiving_spi_data_16( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint16_t* data);
uint8_t hal_start_receiving_spi_data_32( uint8_t spi_id, uint16_t addr, uint8_t addr_sz, uint32_t* data);
uint8_t hal_start_receiving_i2c_data_16( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint16_t* data);
uint8_t hal_start_receiving_i2c_data_32( uint8_t i2c_id, uint16_t addr, uint8_t addr_sz, uint32_t* data);

Each of the above hal_start_receiving_*() calls starts an operation and returns immediately; to know that the data has been received, caller should wait for a respective spi_id / i2c_id by calling hal_wait_for(), and hal_wait_for() should return as soon as HAL finds the operation is over.

uint8_t hal_cancel_spi_operation( uint8_t spi_id );
uint8_t hal_cancel_i2c_operation( uint8_t i2c_id );

Each of the above hal_cancel_*() calls returns immediately. TODO: do we need to supply as parameters addr and addr_sz as well?

Special TX/RX functions

bool hal_set_frequency();

parameters? ret?

void hal_adjust_transmitting_power( bool increase );

Increases or decreases transmitting power; decrease is done when possible; increase is done until max possible power is reached.

void hal_set_max_transmitting_power();
int8_t hal_get_min_max_transmitting_power();

If we need a range, think about returning a pair, or about splitting this call into two (for min and max).

bool hal_is_frequency_adjustable();
void hal_adjust_frequency();

Input parameters?