Reliable Remote Control via Radio/LoRa
with pfodRadio

by Matthew Ford 20th July 2018 (original 20th March 2018)
© Forward Computing and Control Pty. Ltd. NSW Australia
All rights reserved.

How to control remote devices reliably via Radio/LoRa
using pfodApp

Introduction

This page describes the design of a reliable message delivery system for Point-to-Point control using Radio/LoRa and pfod (Protocol for Operation Discovery).

Radio/LoRa at its base level is characterized by low data rates, half-duplex connections and lost messages (due to interference or multiple transmitters).

In order to implement a reliable, secure communication protocol of connected messages, such as is needed for a remote control system, a session management layer is required on top of the basic radio. This page describes that layer for use with pfod (Protocol for Operation Discovery).

This session management layer needs low level radio drivers, but is separated from them by an interface class. The example low level drivers used here are the RadioHead drivers (http://www.airspayce.com/mikem/arduino/RadioHead/) but you can replace these with another library of your choice.

The RadioHead library provides a reliable session management class, RHReliableDatagram (and an encryption driver extension). Why not just use one of those? There are three main problems with RadioHead's library:-

1) The encryption RadioHead provides does not protect against replay attacks. If an attacker records you opening your garage door, they can replay the same message to open it later. It does not matter to the attacker that the message is encrypted; they only need to know that this message opens the door.

pfodRadio avoids this problem by using the previously developed pfodSecurity, which uses challenge and response to prevent replay attacks. Also, the SipHash used by pfodSecurity has a small memory footprint.

2) From the RadioHead documentation, “There is no message queuing or threading in RHReliableDatagram. sendtoWait() waits until an acknowledgement is received, retransmitting up to (by default) 3 retries time with a default 200ms timeout. … Central server-type sketches should be very cautious about their retransmit strategy and configuration lest they hang for a long time trying to reply to clients that are unreachable.”

pfodRadio uses a polling structure that does not block when sending or receiving data.

3) RadioHead has a race condition when replying to a message. The Client sends a request, the Server sends an ack, and then the Server sends its response to the Client's request. If the Client does not receive the ack, it resends the request, which interferes with the Server's response. Now the Client and the Server are both waiting for an ack and both keep resending until the 'ALOHA' back-off algorithm takes effect. This slows down the link.

pfodRadio overcomes this problem by sending the ack for the last received msg with the first packet of the next transmitted message. The Client is in receive mode waiting for the ack to its last msg and so will not interfere with the Server's response. (Compare with TCP acks.) Note: pfodRadio is a point-to-point half-duplex connection and does not support broadcast mode.

Design Assumptions

Radio delivery is not reliable. Messages may be partially received, not arrive at all, or be repeated.
Radio messages are small, i.e. 20 to 250 chars.
Full encryption is not required, but protection against hackers re-issuing commands is required.
Point-to-Point communication for control. i.e. no multi hops or mesh connections. i.e. Each radio transmission is only either received directly from the other end of the link or not received at all. Messages from other radios are ignored. Links consist of a designated server node number and a client node number. Multiple clients may connect to the server, but only one at a time.
pfod specification is half-duplex for commands and responses. The pfodDevice (the server) only ever transmits in response to a client's command.

Requirements

A communication connection based on Radio/LoRa for pfod needs to have the following:

a) Detect and re-request missing packets or parts thereof. Every command sent must be responded to by the pfodDevice (the server) or the connection is considered “lost”.

b) Pass UTF-8 data (null terminated). Arbitrary binary data containing embedded nulls is NOT supported.

c) Send data packets up to 1024 bytes from the pfodDevice (server) and receive data packets up to 256 bytes at the pfodDevice from the client.

d) 128bit Security to prevent unauthorized access and replay of commands.

e) 'Raw Data' (which is sent by the pfodDevice, the Server) should only be sent while the Client is waiting for a response. This, together with f) and the pfod command/response protocol, makes the link true half-duplex. As in other pfod links, sending raw data is not reliable and may be dropped or truncated. Only pfod commands and responses are sent reliably.

f) Limit pfodDevice (i.e. the Server) connections to one at a time.

Implementation

a) Detect and re-request missing packets or parts thereof. Every message sent from the pfodApp must be responded to by the pfodDevice or the connection is considered “lost”.

The pfodDevice, the Server, must respond to each command and must not send unsolicited pfod messages. Although the pfod specification states pfodDevice can send “raw data” at any time, this implementation restricts that to only sending raw data when the Client is listening for a response.

To allow for re-requests and detection of missing and repeated messages, each command sent to the pfodDevice (server) will contain a connection number and message sequence number, starting from 0 for a new connection. The connection number is stored in the client and incremented each connection, starting from 0. When the pfodDevice replies it includes the client connection number and the pfodDevice's own seqNo starting from 1 for each new connection.

If the client does not receive a complete reply from the pfodDevice within the user configured timeout, the client will resend the command using the same MsgNo up to five (5) times before reporting the connection 'lost'. Client resends happen at a random time between 0 and ½ timeout after the timeout expires. That is, resends occur between timeout and 1.5*timeout after the previous send. The default timeout is 200mS.
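The resend timing described above can be sketched as follows. This is only an illustration: the helper name resendDelayMs and the use of std::mt19937 as the random source are assumptions, not taken from the pfod library.

```cpp
#include <cstdint>
#include <random>

// Maximum number of resends before the connection is reported 'lost'.
const int MAX_RESENDS = 5;

// Hypothetical helper, not part of the pfod library.
// Returns the delay (in mS) between one send and its resend:
// the timeout plus a random extra of 0 to timeout/2.
uint32_t resendDelayMs(uint32_t timeoutMs, std::mt19937 &rng) {
    std::uniform_int_distribution<uint32_t> extra(0, timeoutMs / 2);
    return timeoutMs + extra(rng);
}
```

With the default 200mS timeout this gives resend delays between 200mS and 300mS. The random component reduces the chance that two transmitters keep colliding on every resend.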

MsgNos are 0 to 255. (In this example, RadioHead's ID header is used for the MsgNo.) MsgNo 0 indicates a new connection is being established and is only used when establishing a new connection. After the connection is established, the MsgNo wraps from 255 to 1 (skipping 0).
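The wrap-around rule can be written in a few lines. The function name nextMsgNo is hypothetical and is used here only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: returns the MsgNo that follows 'current'.
// 0 is reserved for opening a new connection, so 255 wraps to 1.
uint8_t nextMsgNo(uint8_t current) {
    return (current == 255) ? 1 : static_cast<uint8_t>(current + 1);
}
```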

If the client re-sends a message due to message loss, the pfodDevice can determine from the message sequence number whether it has already received and processed that message. If it has, the pfodDevice just resends the previous response (including the previous pfodDevice message sequence number) and does not process the command again.

As the client receives the server's replies to its commands, the client re-orders them via their msgNo so that if some are missed and the client needs to resend the command, the client only needs to receive the missing radio msgNos in order to complete the response.

Note these msgNos are per radio message and are independent from the message numbers used for security. The security msg numbers refer to a complete pfod message not the smaller radio messages that make it up.

b) Pass UTF-8 data

The pfod specification requires that the transmission system can pass UTF-8 data. Radio/LoRa supports 8bit bytes and so can transmit UTF-8 data. The transmission of embedded NULLs ('\0') is NOT supported.

c) Send data packets up to 1024 bytes from pfodDevice to pfodApp and send data packets up to 256 bytes from pfodApp to pfodDevice

Radio/LoRa message lengths are limited: some radios can handle up to 255 octets and some as few as 28 (each less 4 octets for headers). This necessitates sending multiple radio messages to make up one complete data packet. The radio message number is used to keep track of the message order. When the client sends a command to the pfodDevice it needs at most 11 radio messages (max 256 bytes + 8 byte hash). Almost all pfod client commands will fit in a single message. Only very long multi-selection screens and long text input screens will require more than one radio message.

For the current connection, the pfodDevice (server) only accepts a radio message if it has the next expected msgNo. It also keeps track of the starting msgNo of the previous cmd. If a radio msg is received from the client with the starting msgNo of the previous command, then the pfodDevice just retransmits the previous response and ignores the duplicate command.
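The accept / resend / ignore decision described above might look like the sketch below. The names ServerSeq, Action, onRadioMsg and commandDone are hypothetical, and the handling of MsgNo 0 (a new connection) is deliberately left out.

```cpp
#include <cstdint>

enum class Action { Process, ResendPrevious, Ignore };

// Hypothetical server-side bookkeeping for one established connection.
// Not the library's actual code.
struct ServerSeq {
    uint8_t expected = 1;      // next msgNo the server will accept
    bool havePrev = false;     // true once one command has been processed
    uint8_t prevCmdStart = 0;  // starting msgNo of the previous command

    // Decide what to do with an incoming radio message.
    Action onRadioMsg(uint8_t msgNo) const {
        if (msgNo == expected) return Action::Process;
        if (havePrev && msgNo == prevCmdStart) return Action::ResendPrevious;
        return Action::Ignore;
    }

    // Call after a complete command spanning startNo..endNo was processed.
    void commandDone(uint8_t startNo, uint8_t endNo) {
        havePrev = true;
        prevCmdStart = startNo;
        expected = (endNo == 255) ? 1 : static_cast<uint8_t>(endNo + 1);
    }
};
```

Only the starting msgNo of the previous command triggers a resend of the previous response. Any other unexpected msgNo is dropped and left to the client's timeout to recover.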

The pfodDevice needs at most 43 radio messages to send back its response (max 1024 bytes + 8 bytes hash). The size of the menus and their text determines how many radio messages will be required to send them back to the client. In many cases, simple menus easily fit into one radio message.
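The figures of 11 and 43 radio messages follow from a ceiling division, assuming the smallest radio (28 octets) and that the 4 header octets come out of that length, leaving 24 octets of payload. The function and constant names below are illustrative only.

```cpp
#include <cstddef>

const std::size_t HEADER_OCTETS = 4;  // TO, FROM, MsgNo, Ack MsgNo
const std::size_t HASH_OCTETS = 8;    // security hash appended to the pfod message

// Number of radio messages needed for a pfod message of 'pfodBytes' bytes
// on a radio whose maximum message length is 'radioMaxOctets'.
std::size_t radioMessagesNeeded(std::size_t pfodBytes, std::size_t radioMaxOctets) {
    std::size_t payload = radioMaxOctets - HEADER_OCTETS;
    std::size_t total = pfodBytes + HASH_OCTETS;
    return (total + payload - 1) / payload;  // ceiling division
}
```

For a 28 octet radio, a 256 byte command needs (256 + 8) / 24 = 11 messages and a 1024 byte response needs (1024 + 8) / 24 = 43 messages. On a 255 octet radio the same calculation gives 2 and 5 messages respectively.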

The pfodDevice just assigns the next msgNo to each new radio message. If the client resends a command and the pfodDevice has already processed it, then the pfodDevice just sends back the previous response as multiple radio messages using the previous message numbers. Note: this implies the pfodDevice library needs to keep a copy of the previous response (up to 1024 bytes), so Uno style micros with only 2048 bytes of RAM are not usable with this implementation.

pfod messages are transmitted as null terminated UTF-8 strings. The radio receiver keeps looking for radio messages until a null terminated one is received, indicating the end of the multiple radio messages making up this pfod message. NOTE: This implies you cannot send arbitrary binary data (which may include embedded nulls) across this link. If you need to send binary data, it needs to be encoded into non-null ASCII (or UTF-8) characters before sending and then decoded at the other side after receiving.
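One possible way to encode binary data as non-null ASCII is plain hexadecimal text. The design does not prescribe any particular encoding, so hex is an assumption chosen here for simplicity, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Encode arbitrary bytes as upper-case hex text (two characters per byte).
std::string toHex(const std::vector<uint8_t> &data) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (uint8_t b : data) {
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decode hex text back to bytes. Returns false on malformed input.
bool fromHex(const std::string &text, std::vector<uint8_t> &out) {
    out.clear();
    if (text.size() % 2 != 0) return false;
    for (std::size_t i = 0; i < text.size(); i += 2) {
        int hi = hexValue(text[i]);
        int lo = hexValue(text[i + 1]);
        if (hi < 0 || lo < 0) return false;
        out.push_back(static_cast<uint8_t>(hi * 16 + lo));
    }
    return true;
}
```

Hex doubles the number of bytes sent, which matters on a slow link, but the output contains only the characters 0-9 and A-F, so it never contains a null or the { and } characters.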

d) 128bit Security to prevent unauthorized access

The pfodRadio library is built on the existing pfodSecurity class to provide 128bit security against unauthorized access. Using security is optional and can be disabled by using an empty password. If security is used, it will require two additional pfod messages (each way) to complete the challenge and response before the link is operational.

If the pfodDevice has a 128bit security key defined, then the client must send {_} with msgNo 0 as the first command of a new connection. If the pfodDevice does not already have an active, not timed-out, connection from some other client, this will start the challenge response to open a connection with 128bit security. If there is already an existing, not timed-out, connection from another client, the pfodDevice just ignores this connection request. Sending {_} during an active connection just returns {} as _ is not a valid pfod command designator. This is also true for any command the pfodDevice, the server, does not handle.

The security hash code is appended after each command and response, i.e. after the closing }. When the pfod client sends a command to the pfodDevice, the client ensures that the terminating } and the hash code are always sent in the same radio message, so the pfodDevice only needs to check for the } to determine that the last message is the end of the command and so increment the next expected message number.

e) 'Raw Data', sent by the pfodDevice, should only be sent while the Client is waiting for a response

Raw Data is never reliably sent, as noted in the pfod Specification. Raw Data is only ever sent by the pfodDevice, the server. Any data outside { } from the client is filtered out by the library and not sent.

The pfod Specification allows Raw Data to be sent from the pfodDevice, the server, at any time. However, in order to prevent radio tx-tx clashes, this design assumes Raw Data is only ever sent after the pfodDevice has received a command and BEFORE it sends the response. This ensures the client is in receive mode. Nothing in the library prevents you from sending Raw Data at any time, but you may overlap with the sending of a client command, which will result in the loss of the raw data and a resend from the client.

The client uses keep-alive commands, { }, typically every 5sec after the last response, to keep the link from timing out and so allow for regular, safe transmission of raw data. However, because Radio/LoRa is a slow, low bandwidth medium, the amount of raw data that can be transferred is limited.

As noted above, raw data is not transmitted in a reliable manner. If you want to reliably transfer some data then send it as text in a menu or other pfod message.

f) Limit pfodDevice connections to one at a time. i.e. only one pfodApp can communicate with the pfodDevice at a time.

Note: The pfod client may be turned off between connections.

A pfodDevice, the server, only communicates with one pfod client at a time. So only one client node number can connect at a time and the client node number defines the current connection. When the pfod client exits it sends the 'close connection' command, {!}. The pfod client does not expect a response to this command.

If the pfodDevice receives 'close connection', the pfodDevice will mark the current connection as closed and will then accept a new connection from another (or the same) node number. Since the delivery of this 'close connection' command is not guaranteed (because a response is not expected), the pfodDevice will also accept a new connection from a different client node number after an inactivity timeout (i.e. no new commands from the pfod client during that time).

The common case is multiple connections from just one client node number, so the pfodDevice will always accept a new connection with message number == 0 from the current node number even if the last connection has not timed out.

Starting a new connection closes any current connection. The parser is re-initialized and its input buffer cleared to remove any partial commands from the old connection.
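The connection acceptance rules in this section can be summarised in one function. This is a sketch under stated assumptions: the ConnState structure and its field names are hypothetical, and the 10 second idle timeout is the default quoted for setIdleTimeout later on this page.

```cpp
#include <cstdint>

// Hypothetical server-side connection state (illustration only).
struct ConnState {
    bool open = false;               // false after {!} or an idle timeout
    uint8_t clientNode = 0;          // node number of the current client
    uint32_t lastCmdMs = 0;          // time the last command was received
    uint32_t idleTimeoutMs = 10000;  // inactivity timeout
};

// Decide whether a radio message with msgNo == 0 from 'fromNode' may
// start a new connection at time 'nowMs'.
bool acceptNewConnection(const ConnState &c, uint8_t fromNode, uint32_t nowMs) {
    if (!c.open) return true;                    // no current connection
    if (fromNode == c.clientNode) return true;   // same client may always reconnect
    // A different client is accepted only after the idle timeout.
    // Unsigned subtraction keeps this correct when the mS counter wraps.
    return (nowMs - c.lastCmdMs) >= c.idleTimeoutMs;
}
```

When this returns true, the caller would then close any current connection, re-initialize the parser and clear its input buffer, as described above.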

pfodApp Radio Message Number processing

The pfodRadio class of the pfod client also keeps track of the expected pfodDevice radio message number. For each new connection, the pfodDevice resets its outgoing radio message number to 1. So after the pfod client opens a new connection (by sending a command with a message number of zero), the pfod client expects the first radio message of the response from the pfodDevice to have message number 1. Subsequent messages from the pfodDevice have message numbers incrementing by 1 each time. The pfod client ignores any message number it has already seen and accumulates up to 1024 bytes (+8 bytes hash) of messages while it is building up a complete response for parsing. A radio message terminated with null(s) indicates the end of the response, and higher message numbers are ignored. If one or more lower message numbers are missing, the pfodSecurity class, after a timeout (connectionTimeout), resends the command to force the pfodDevice to resend the response and fill in the missing radio messages.
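The client-side accumulation described above can be illustrated with a small reassembly class. The class name ResponseAssembler and its methods are hypothetical. The sketch assumes a response does not span a message number wrap-around and does not enforce the 1024 byte limit, both of which a real implementation must handle.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical client-side reassembly buffer (illustration only).
class ResponseAssembler {
public:
    // Store the payload of one radio message. 'last' is true when the
    // radio message was null terminated, i.e. it ends the response.
    void add(uint8_t msgNo, const std::string &payload, bool last) {
        if (haveLast_ && msgNo > lastNo_) return;  // beyond end of response
        if (parts_.count(msgNo)) return;           // already seen, ignore
        parts_[msgNo] = payload;
        if (last) {
            haveLast_ = true;
            lastNo_ = msgNo;
        }
    }

    // True once every msgNo from firstNo to the terminating one is present.
    bool complete(uint8_t firstNo) const {
        if (!haveLast_) return false;
        for (int n = firstNo; n <= lastNo_; ++n) {
            if (!parts_.count(static_cast<uint8_t>(n))) return false;
        }
        return true;
    }

    // Concatenate the stored parts in msgNo order.
    std::string text() const {
        std::string out;
        for (const auto &p : parts_) out += p.second;
        return out;
    }

private:
    std::map<uint8_t, std::string> parts_;
    bool haveLast_ = false;
    uint8_t lastNo_ = 0;
};
```

Because the parts already received are kept, a resend of the command only has to deliver the radio messages that were missed the first time.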

Connection and Message Time outs

There are two levels of timeouts, one in the low level pfodRadio class and another in the upper level pfodSecurity and pfodSecurityClient classes.

In the pfodRadio class the method setAckTimeout(mS) sets the timeout for an ack to the last radio message sent (default 200mS, limited to be between 1mS and 32.5secs). If an ack is not received in this time, the pfodRadio class retransmits the last message after a random delay in the range timeout to 2*timeout. If, after maxRetries (setNoOfRetries(retries), default 5), an ack has still not been received, then the link has failed and is closed, and EOF (0xFF) is injected into the receive buffer to indicate this to the upper level (pfodSecurity / pfodSecurityClient).
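A non-blocking version of this retry logic could be structured as below. This is not the library's code: the AckWaiter class is hypothetical, the caller supplies the random jitter, and it is assumed that maxRetries counts resends, so the link fails when the wait after the last resend expires.

```cpp
#include <cstdint>

// Hypothetical non-blocking ack/retry tracker for one outstanding message.
class AckWaiter {
public:
    enum class Poll { Waiting, Resend, LinkFailed };

    AckWaiter(uint32_t ackTimeoutMs, int maxRetries)
        : timeoutMs_(clampTimeout(ackTimeoutMs)), maxRetries_(maxRetries) {}

    // Limit the ack timeout to between 1mS and 32.5secs.
    static uint32_t clampTimeout(uint32_t ms) {
        if (ms < 1) return 1;
        if (ms > 32500) return 32500;
        return ms;
    }

    // Call when the message is first sent. jitterMs is a random number.
    void start(uint32_t nowMs, uint32_t jitterMs) {
        retries_ = 0;
        arm(nowMs, jitterMs);
    }

    // Call repeatedly from the main loop while no ack has arrived.
    // Returns immediately; never blocks.
    Poll poll(uint32_t nowMs, uint32_t jitterMs) {
        if (nowMs - sentMs_ < waitMs_) return Poll::Waiting;
        if (retries_ >= maxRetries_) return Poll::LinkFailed;
        ++retries_;
        arm(nowMs, jitterMs);
        return Poll::Resend;  // caller retransmits the last message now
    }

private:
    // Wait between timeout and 2*timeout before the next resend.
    void arm(uint32_t nowMs, uint32_t jitterMs) {
        sentMs_ = nowMs;
        waitMs_ = timeoutMs_ + jitterMs % (timeoutMs_ + 1);
    }

    uint32_t timeoutMs_;
    int maxRetries_;
    int retries_ = 0;
    uint32_t sentMs_ = 0;
    uint32_t waitMs_ = 0;
};
```

On LinkFailed the caller would close the link and inject EOF (0xFF) into the receive buffer. When an ack arrives, the caller simply stops polling the tracker.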

In pfodSecurity / pfodSecurityClient there are idle and response timeout settings: setIdleTimeout(timeInSecs), default 10sec, and setResponseTimeout(time_in_mS). Setting 0 means never time out, which is NOT recommended.

On the server side, the pfodDevice, the pfodSecurity class will mark the connection as closed and close the pfodRadio connection if no commands are received from the client for this length of time. At that point either a new connection can be established or the previous connection resumed. To keep the connection open between commands, the pfodSecurityClient has a keepAlive setting.

On the client side, the pfodSecurityClient class will mark the connection as failed and close it if it does not receive a complete response from the server, to the last command the client sent, within the idle timeout. Since the pfod Specification requires ALL commands to be replied to, a missing response indicates that either the link or the end pfodDevice has failed. This connection timeout is also set by setIdleTimeout(timeInSecs), default 10sec, in pfodSecurityClient. The receive buffer will be cleared and EOF (0xFF) inserted to indicate a broken connection.

Other Considerations

Radio Message Header

Each radio message has a header containing 4 octets (8bit bytes):

-TO The node address that the message is being sent to (broadcast RH_BROADCAST_ADDRESS (255) is not permitted) (uses RadioHead TO field)
-FROM The node address of the sending node (uses RadioHead FROM field)
-MsgNo A message number, distinct (over short time scales) for each message sent by a particular node (uses RadioHead ID field)
-Ack MsgNo The message number of the message being acked. 0 if nothing to be acked, i.e. first connect msg or start of next cmd from client. (uses RadioHead FLAGS field)
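The four header octets can be represented as a small structure. In a RadioHead based sketch the driver carries these fields itself, so the struct, the function names and the octet order on the wire shown here are assumptions for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

const uint8_t BROADCAST_ADDRESS = 255;  // value of RadioHead's RH_BROADCAST_ADDRESS

// Hypothetical representation of the 4 octet radio message header.
struct RadioHeader {
    uint8_t to;     // destination node (RadioHead TO field)
    uint8_t from;   // sending node (RadioHead FROM field)
    uint8_t msgNo;  // message number (RadioHead ID field)
    uint8_t ackNo;  // message number being acked, 0 if none (RadioHead FLAGS field)
};

// Pack the header into 4 octets (order assumed: TO, FROM, MsgNo, Ack MsgNo).
std::array<uint8_t, 4> packHeader(const RadioHeader &h) {
    return {{h.to, h.from, h.msgNo, h.ackNo}};
}

// Unpack a header. Returns false if the buffer is too short or the
// message is addressed to the broadcast address, which is not permitted.
bool unpackHeader(const uint8_t *buf, std::size_t len, RadioHeader &h) {
    if (len < 4) return false;
    if (buf[0] == BROADCAST_ADDRESS) return false;
    h.to = buf[0];
    h.from = buf[1];
    h.msgNo = buf[2];
    h.ackNo = buf[3];
    return true;
}
```

Carrying the ack in the header of the next outgoing message is what lets the server's response double as the acknowledgement of the client's command.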

CloseConnection commands

The client closes the connection by sending the CloseConnection command, {!}. No response is expected to this command and the client closes the connection as soon as it is sent. If this command does not reach the pfodDevice, the server, then the same client can still open a new connection immediately and a different client can open a new connection once the server times out the previous connection.

If the pfodDevice (server) sends a CloseConnection response, then the client closes the connection after reading the entire response. In this case, the same client can still open a new connection immediately and a different client can open a new connection once the server times out the previous connection.

Future Enhancements

Reliable Raw Data

Add a new pfod command, e.g. {- … }, to request data to be sent as a pfod message, i.e. in a reliable way, inside { }, and allow for a 'more' flag to prompt the pfod client to request multiple data messages.

Conclusion

The design above is implemented in the pfod library for use with Radio/LoRa equipped boards. This design allows reliable remote control using the pfod protocol via unreliable radio messages, with the option of 128bit security to guard against unauthorized access.

Android™ is a trademark of Google Inc. For use of the Arduino name see http://arduino.cc/en/Main/FAQ

The General Purpose Android/Arduino Control App.
pfodDevice™ and pfodApp™ are trade marks of Forward Computing and Control Pty. Ltd.
