Building a LwM2M Client with a Focus on Energy Conservation
By Patrick Porlan, IoTerop Senior Software Engineer
Designing a radio-connected device operating on battery power means balancing several aspects that often conflict. Frequent data reporting, real-time event notifications, and a rich feature set are all convenient to offer, but they can be difficult to reconcile with years of operation on a small battery.
A central design decision involves the protocol used to report data and remotely control the device. One good option is Lightweight Machine-to-Machine (LwM2M), built on CoAP. For an embedded system designer building a low-cost LwM2M client, the complete system, including the battery, processor, modem, communication stacks, and application, needs to be independently efficient and thoughtfully integrated for optimal battery life.
Threading
One of the first architecture choices to settle is whether to use an RTOS and multiple execution threads. Threads are useful for lowering the complexity of large applications and for distributing CPU resources among concurrent tasks, but they may not be worth including in a closed-off system controlling a small set of sensors. Using threads increases memory consumption (a 1 KB processor stack per thread is typical) and introduces concurrency issues.
Nevertheless, the alternative finite state automaton model also has drawbacks, as it can be fragile and difficult to maintain. In all cases, the overall code execution schedule should be clear from the design stage. LwM2M itself does not mandate the use of a rich TCP/IP stack and can usually rely on a single thread and a single modem-provided socket for all its needs (data upload, device management, firmware update, etc.), so it does not require an RTOS.
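As a rough sketch, such a single-threaded client boils down to one loop that services the stack and then waits until the next deadline or incoming packet. The function names below are hypothetical placeholders, not the API of any particular SDK:

    /* Minimal sketch of a single-threaded client loop. All function names are
     * hypothetical placeholders for the SDK and platform actually in use. */
    #include <stdint.h>

    extern void     lwm2m_client_step(void);        /* process I/O, timers, retransmissions */
    extern uint32_t lwm2m_client_next_event(void);  /* seconds until the next scheduled action */
    extern void     socket_wait_readable(uint32_t timeout_s); /* block on the single UDP socket,
                                                                  ideally in a low-power state */

    int main(void)
    {
        for (;;)
        {
            lwm2m_client_step();                             /* run the stack once */
            socket_wait_readable(lwm2m_client_next_event()); /* sleep until data or deadline */
        }
    }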
Spotty communication
The key to reducing power usage lies in reducing and condensing the periods when the device is actively communicating. It is best to put the device to sleep whenever it does not need to be awake, and to talk to the management server only a few times a day.
While it is possible to build such a system using TCP, this streaming protocol has several characteristics that weigh on a device’s power budget:
- Establishing a TCP connection, especially if secured using TLS, is relatively costly
- Maintaining the connection requires sending keep-alive messages periodically, as well as being able to reply to server-initiated keep-alive probes
- It can incur significant memory usage due to its dynamic buffer and state management
UDP is usually more desirable since:
- It requires no connection
- It does not require periodic transmissions (keep-alive probes)
UDP datagrams can reach their destination in a different order than they were sent, or be dropped altogether. Another downside is that the network paths to the device (NAT bindings in particular) expire more quickly for UDP than for TCP. Once this happens, the server is unable to reach the device until the device itself sends data again.
These downsides are well known and addressed by CoAP, which associates tokens with requests and supports confirmable messages with retransmission, and by LwM2M, which handles unreachable devices through its queue mode mechanism. The UDP routing problem is also well aligned with NB-IoT power-saving mechanisms, which turn the radio off quickly after transmission and render the device offline anyway.
PSM
NB-IoT supports a Power Saving Mode (PSM) that is essentially a sleep mode during which the device is still attached to the cell tower but is offline: data sent through the radio network to the device won’t reach it. As soon as the device wakes up, either periodically because of the timers negotiated while registering with the tower or because it needs to communicate, a new “connected” period starts. While connected, the device can receive incoming traffic. The duration of this window is negotiated with the cell tower and is best kept to a few seconds.
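In practice, the requested timers are usually set through the standard AT+CPSMS command (3GPP TS 27.007), with the periodic TAU (T3412) and active time (T3324) encoded as 8-bit strings defined in 3GPP TS 24.008. A minimal sketch, assuming a hypothetical modem_send_at() helper and example timer values that should be checked against the modem's documentation:

    /* Sketch: requesting PSM timers from the network. modem_send_at() is a
     * hypothetical helper; the network may grant different values, which
     * should be read back from the +CEREG notifications (see further down). */
    #include <stdbool.h>

    extern bool modem_send_at(const char *cmd);   /* hypothetical: send command, wait for OK */

    static bool request_psm(void)
    {
        /* "00101100": unit = 1 hour, value = 12 -> requested periodic TAU of 12 h.
         * "00000101": unit = 2 seconds, value = 5 -> requested active time of 10 s.
         * Encodings per 3GPP TS 24.008; double-check against the modem manual. */
        return modem_send_at("AT+CPSMS=1,,,\"00101100\",\"00000101\"");
    }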
Power consumed in idle mode is thousands of times lower than while transmitting, and a small fraction of the current needed while connected, even when no data is being sent or received. The connected window itself can be partitioned so the device is only reachable during small intervals, to save power: this is the DRX (Discontinuous Reception) mechanism. Data sent to the device is delivered with a predictable delay, but is not discarded by the network.
Such a cycle starts immediately after the device stops transmitting. A well-designed application aligns its transmissions with the programmed radio timers, or restricts its radio activity to the boundaries of the connected periods, which the modem reports to the application.
Latency
NB-IoT latencies can easily reach ~10 s, and throughput is on the order of kilobytes per second even under ideal conditions. This needs to be considered when selecting CoAP timeout and PSM parameter values, and accounted for in the applications built on top of LwM2M / NB-IoT.
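As an illustration, the CoAP transmission parameters from RFC 7252 can be relaxed so that retransmissions do not fire while the network is still delivering the first copy of a message. The values below are examples only, assuming the CoAP stack in use allows overriding its defaults:

    /* Illustrative CoAP transmission parameters, loosened for NB-IoT latency.
     * RFC 7252 defaults: ACK_TIMEOUT = 2 s, ACK_RANDOM_FACTOR = 1.5,
     * MAX_RETRANSMIT = 4. Suitable values depend on measured network behaviour
     * and on the negotiated PSM/DRX timers. */
    #define COAP_ACK_TIMEOUT_S     10    /* first retransmission no sooner than ~10 s */
    #define COAP_ACK_RANDOM_FACTOR 1.5
    #define COAP_MAX_RETRANSMIT    4     /* keep retries bounded to protect the power budget */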
These latency and throughput constraints also mean that the power budget imposes a cap on the total amount of data that can be exchanged between device and server!
LwM2M offers several mechanisms to reduce the number of frames exchanged.
Bulk transfer
Coalescing several updates and sending them as one large packet rather than one by one reduces total latency and improves throughput. This can be done using the LwM2M 1.1 data push (Send) operation, which is composite in nature, or by temporarily holding back notifications if the application permits. Note that heavy usage of the modem, with several large frames transmitted over a short time span, must be supportable by the underlying battery.
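A sketch of what such coalescing might look like on the client, assuming a hypothetical lwm2m_send_composite() function standing in for the SDK's data push API:

    /* Sketch: coalescing sensor samples and pushing them in a single LwM2M 1.1
     * Send operation. The sample_t type and lwm2m_send_composite() call are
     * hypothetical stand-ins for the SDK actually in use. */
    #include <stddef.h>

    #define BATCH_SIZE 16

    typedef struct { double value; long timestamp; } sample_t;

    extern int lwm2m_send_composite(const sample_t *samples, size_t count);

    static sample_t batch[BATCH_SIZE];
    static size_t   batch_count;

    void on_sensor_sample(sample_t s)
    {
        batch[batch_count++] = s;          /* accumulate locally, radio stays off */

        if (batch_count == BATCH_SIZE)     /* one large frame instead of sixteen small ones */
        {
            lwm2m_send_composite(batch, batch_count);
            batch_count = 0;
        }
    }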
Peak power consumption
Batteries with low self-discharge are limited in the amount of power they can deliver at once. They cannot sustain on their own the prolonged radio usage of, say, a firmware update session: they either need to be coupled with a capacitor, or their characteristics must be taken into account by every element of the firmware stack so as to limit peak power usage. Testing should cover good as well as bad network conditions: NB-IoT power usage can increase markedly under degraded conditions (long distance, interference), even before packet loss and retransmissions are taken into account.
The “turn on modem, transmit, then turn off” method
Instead of dealing with all these questions, it is tempting to simply turn the modem off when it is not needed. While conceptually simple, this technique usually yields higher average power consumption than PSM, because of the cost of reattaching to the network each time. The typical power draw in PSM idle is a few μA (at 5 μA, a whole year of idle amounts to roughly 44 mAh), which allows for years of operation on a small battery.
RX path
Particular attention should be devoted to the handling of the radio reception window. Cat-NB1 and its newer Cat-NB2 iteration (3GPP Release 14, higher bandwidth) are half-duplex. The modem can be seen as alternating between transmission, reception, paging (checking whether the tower holds incoming data), and idle mode.
Several parameters are negotiated when the modem attaches to the tower. The Tracking Area Update timer (T3412) controls the period of the connected/idle cycle. The active phase contains several paging windows, whose duration can be configured. The active time timer (T3324) controls the number of seconds during which the RX path remains open after a transmission. The tower has the final say on the values that get used; the modem communicates them to the application through the +CEREG notification.
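The granted values can be obtained by enabling the extended +CEREG unsolicited result code (3GPP TS 27.007). A sketch, again with hypothetical helper names:

    /* Sketch: reading back the network-granted timers. With AT+CEREG=4, the
     * modem's +CEREG notifications include the Active-Time and Periodic-TAU
     * fields, encoded as 8-bit strings (3GPP TS 24.008). Helper names are
     * illustrative only. */
    #include <stdbool.h>

    extern bool modem_send_at(const char *cmd);   /* hypothetical AT helper */

    void enable_psm_reporting(void)
    {
        modem_send_at("AT+CEREG=4");   /* ask for PSM timers in registration notifications */
    }

    /* Called by the (hypothetical) URC parser when a +CEREG notification
     * carrying the optional timer fields is received. */
    void on_cereg_urc(const char *active_time_bits, const char *periodic_tau_bits)
    {
        /* Decode the GPRS Timer 2/3 bit strings and schedule application
         * transmissions against the granted values, not the requested ones. */
        (void)active_time_bits;
        (void)periodic_tau_bits;
    }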
Having the shortest possible RX windows is a good way to save power. It works well with LwM2M provided that a) the device requests queue mode when registering with the LwM2M server, b) the application itself tolerates this scheme, and c) NB-IoT latencies, which easily reach 10 seconds under poor radio coverage, are accommodated.
Queue mode
LwM2M servers will buffer any outgoing traffic for clients registering with the Q mode indication, and send everything that has been stored once the client performs a registration update. The delay between registration updates is therefore an important consideration for the application designer. It’s also possible for the client to buffer outgoing notifications until the next outgoing traffic time slot, as an optimization.
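One way to implement this optimization is to hold notifications in RAM and flush them just before the periodic registration update, so that everything travels in the same connected window. A sketch with hypothetical SDK calls:

    /* Sketch: flushing buffered notifications right before a registration
     * update, so both travel in the same connected window. All function names
     * below are hypothetical placeholders for the SDK in use. */
    #include <stdbool.h>

    extern bool lwm2m_has_buffered_notifications(void);
    extern void lwm2m_flush_notifications(void);
    extern void lwm2m_registration_update(void);

    void on_update_timer_expired(void)
    {
        if (lwm2m_has_buffered_notifications())
        {
            lwm2m_flush_notifications();   /* piggy-back pending reports */
        }
        lwm2m_registration_update();       /* server replies with its queued commands */
    }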
RAI
NB-IoT also supports unilateral shutdown of the RX path through the Release Assistance Indication (RAI) mechanism, which may help save additional milliwatts: after a non-confirmable data push, the application can tell the modem that it does not expect any incoming traffic for the foreseeable future, so no paging is needed. This should be used with care, though, as it may interfere with the reception of ACKs for confirmable CoAP messages, of queued LwM2M commands following a registration update, of a DTLS handshake, or of a firmware update session, for example.
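How RAI is exposed varies between modems: some provide vendor-specific AT commands or socket options, and the standard +CSODCP command for control-plane data carries an RAI parameter (3GPP TS 27.007). The sketch below only illustrates the application-side decision, with a hypothetical modem API:

    /* Sketch: hinting the modem that no reply is expected after the last
     * non-confirmable send. The modem_send_udp() wrapper is hypothetical; on a
     * real modem the hint maps to a vendor-specific AT command or socket
     * option, or to the RAI parameter of +CSODCP for control-plane data. */
    #include <stdbool.h>
    #include <stddef.h>

    typedef enum {
        RAI_NONE = 0,          /* no information */
        RAI_NO_MORE_DATA = 1,  /* no further uplink or downlink data expected */
        RAI_ONE_RESPONSE = 2   /* exactly one downlink transmission expected */
    } rai_hint_t;

    extern int modem_send_udp(const void *buf, size_t len, rai_hint_t hint);

    int send_final_report(const void *coap_non_msg, size_t len, bool expects_reply)
    {
        /* Only request an early release when nothing is pending: no CoAP ACK,
         * no queued LwM2M commands, no DTLS handshake, no firmware block. */
        rai_hint_t hint = expects_reply ? RAI_ONE_RESPONSE : RAI_NO_MORE_DATA;
        return modem_send_udp(coap_non_msg, len, hint);
    }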
Encryption options
Using DTLS, it is possible to cryptographically secure the link between the device and the server, but it may not be the best choice energy-wise. If DTLS is used, the Connection ID extension ought to be enabled to guarantee that the link remains usable long after the initial (relatively costly) handshake completes, even across a change of IP address. Likewise, the Pre-Shared Key mechanism is less energy-intensive than certificates. CoAP-level authentication and encryption (OSCORE) is a better choice, as it leads to smaller frames on average.
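Where DTLS is retained, the Connection ID and PSK configuration usually takes only a few calls. As an example with Mbed TLS (requires a build with MBEDTLS_SSL_DTLS_CONNECTION_ID), the sketch below shows just that part and assumes the rest of the session setup (RNG, timers, BIO callbacks, error handling) happens elsewhere:

    /* Sketch: pre-shared key plus the DTLS Connection ID extension with
     * Mbed TLS. Requires MBEDTLS_SSL_DTLS_CONNECTION_ID in the build. */
    #include <stddef.h>
    #include "mbedtls/ssl.h"

    /* Call on the config object, before mbedtls_ssl_setup(). */
    int configure_psk_and_cid(mbedtls_ssl_config *conf,
                              const unsigned char *psk, size_t psk_len,
                              const unsigned char *identity, size_t identity_len)
    {
        int ret;

        /* Pre-shared key: no certificate chains on the wire, shorter handshake. */
        ret = mbedtls_ssl_conf_psk(conf, psk, psk_len, identity, identity_len);
        if (ret != 0) return ret;

        /* Use zero-length CIDs on our side; the server may still assign its own. */
        return mbedtls_ssl_conf_cid(conf, 0, MBEDTLS_SSL_UNEXPECTED_CID_IGNORE);
    }

    /* Call on the context, after mbedtls_ssl_setup() and before the handshake. */
    int enable_cid(mbedtls_ssl_context *ssl)
    {
        /* Negotiate the Connection ID extension so the DTLS session survives
         * NAT rebinding or an IP address change without a new handshake. */
        return mbedtls_ssl_set_cid(ssl, MBEDTLS_SSL_CID_ENABLED, NULL, 0);
    }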
Error management
Transient problems (such as a power outage affecting the radio network) will happen. It’s important to react to them thoughtfully, in a way that does not deplete the battery. This needs to be considered in the design phase of the product, even if it is difficult to test. It’s worth having a fallback mechanism, such as reverting to the LwM2M bootstrap phase or to factory settings after a while.
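A common pattern, sketched below with arbitrary constants and hypothetical helpers, is a capped exponential backoff with jitter between re-registration attempts, falling back to bootstrap after repeated failures:

    /* Sketch: capped exponential backoff with jitter for re-registration
     * attempts, falling back to bootstrap after repeated failures. Constants
     * and helper names are illustrative only. */
    #include <stdint.h>
    #include <stdlib.h>

    #define BACKOFF_BASE_S  60          /* first retry after roughly a minute */
    #define BACKOFF_MAX_S   (24 * 3600) /* never wait more than a day */
    #define MAX_FAILURES    10          /* then fall back to bootstrap */

    extern void schedule_retry(uint32_t delay_s);
    extern void start_bootstrap(void);

    void on_registration_failure(unsigned failures)
    {
        if (failures >= MAX_FAILURES)
        {
            start_bootstrap();          /* last resort: re-provision the device */
            return;
        }

        uint32_t delay = (uint32_t)BACKOFF_BASE_S << failures;
        if (delay > BACKOFF_MAX_S) delay = BACKOFF_MAX_S;

        /* Add up to 10% jitter so a whole fleet does not retry in lockstep. */
        delay += (uint32_t)(rand() % (delay / 10 + 1));
        schedule_retry(delay);
    }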
Conclusion
Beyond a careful selection of the technologies on board, getting an embedded device to run for years on a battery requires a holistic understanding of the entire hardware and software stack, as well as painstaking attention to detail. Understanding precisely what the system does, and when, is key.
Join us on GitHub and let's discuss: https://github.com/ioterop