Security of embedded systems
The word “security” immediately invokes related terms such as “confidentiality” and “privacy”, while conjuring up mental images of malicious hackers! It also brings to mind familiar security parlance such as passwords, OTPs and encryption. In reality, however, “security” encompasses much more, including diverse aspects such as integrity, authentication, non-repudiation and availability of the system.
Security of an embedded system has traditionally been equated with secure boot, and for good reason. Secure boot ensures that only authorized executable code is permitted to run on the silicon. As the very first piece of code to execute on the CPU, it activates the security mechanisms and verifies the authenticity and integrity of the application firmware that will run on the hardware platform after the boot-up stage. Secure boot is thus the foundation of device security, serving as the device’s Root of Trust (RoT). The RoT is further extended by secure firmware update, sometimes called secure FOTA update, a key capability that allows the device’s application firmware to be updated securely.
Secure boot uses cryptography to verify (authentication + integrity) both the application firmware code and its metadata (containing version information, hardware configuration, boot conditions and so on). It grants execution permission only if verification of both the firmware code and the metadata succeeds. Being immutable code launched at every system reset, secure boot code is typically protected by the write-protection mechanism of the flash memory chip from which it boots. To minimize the risk of vulnerabilities within itself, secure boot code is kept concise, simple and verifiable; complex steps are usually deferred to later stages, such as a secondary bootloader. Secure boot is sometimes confused with high-availability boot; although the two share common objectives, the mechanisms and interfaces provided by the silicon vendor differ considerably.
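To make the verification flow concrete, here is a minimal C sketch of the checks a secure boot stage performs on the metadata and the firmware image before granting execution permission. All names (`fw_metadata_t`, `FW_MAGIC`, the header fields) are illustrative assumptions, not any vendor’s format, and the toy checksum merely stands in for the real cryptographic digest and signature verification (e.g. SHA-256 plus an ECDSA check against a key in ROM or fuses) that production secure boot uses.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical firmware metadata header; field names are illustrative. */
typedef struct {
    uint32_t magic;      /* identifies a valid image header     */
    uint32_t version;    /* anti-rollback version number        */
    uint32_t image_size; /* length of the firmware payload      */
    uint8_t  digest[32]; /* expected digest of the payload      */
} fw_metadata_t;

#define FW_MAGIC       0x46574D31u /* made-up "FWM1" marker value */
#define MIN_FW_VERSION 2u          /* lowest version we accept    */

/* Placeholder digest: a trivial 32-byte rolling checksum. A real secure
 * boot stage would use SHA-256 plus an asymmetric signature check. */
static void toy_digest(const uint8_t *data, size_t len, uint8_t out[32])
{
    memset(out, 0, 32);
    for (size_t i = 0; i < len; i++)
        out[i % 32] = (uint8_t)(out[i % 32] * 31u + data[i]);
}

/* Returns 0 if the image may be executed, a negative code otherwise. */
int verify_firmware(const fw_metadata_t *meta, const uint8_t *image, size_t len)
{
    uint8_t computed[32];

    if (meta->magic != FW_MAGIC)        return -1; /* not a valid header */
    if (meta->version < MIN_FW_VERSION) return -2; /* rollback attempt   */
    if (meta->image_size != len)        return -3; /* size mismatch      */

    toy_digest(image, len, computed);
    if (memcmp(computed, meta->digest, 32) != 0)
        return -4; /* integrity failure: refuse to execute */

    return 0; /* verification passed: safe to jump to the image */
}
```

Note that the metadata is verified alongside the code: a valid payload under a tampered header (for example, a rolled-back version number) is rejected just as firmly as a corrupted payload.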
Security in mission critical applications
However, authentication and integrity checking of executable code are not the only aspects of security in an embedded system. Consider the scenario where a battery-powered device fails to boot because its normal boot device has failed or its bootable image has become corrupted. Does this incident compromise system security? It sounds more like a typical field failure than a security incident, right?
Now, what if that device serves a mission-critical function, such as a military field device operated by an infantry soldier on an active front, or a piece of medical equipment used by a paramedic on critically ill patients in a rural area? We can now see that even in the absence of any external threat agent, and even with authentication and integrity checks in place to guard against accidental execution of rogue code, the failure to boot compromises the device’s availability to perform its designated mission-critical function, and thereby its security. For a mission-critical embedded system, therefore, higher system reliability translates directly into improved security.
System design for improved reliability
Reliability in system design starts early, with hardware component selection, which is MIL-grade by default for defence applications. Reliability engineering extends to the software level through extensive error-handling and recovery logic designed into the embedded firmware. Despite these best practices, component failures and firmware corruption cannot be ruled out over the operational lifetime of the product. In fact, we need to plan and design for such failures, especially for devices intended for outdoor operation in harsh climatic and operating conditions. As the sharp contrast between the Chandrayaan-2 and Chandrayaan-3 missions recently demonstrated to designers of mission-critical systems, we must adopt from the start the philosophy that “if something can go wrong, it will go wrong!”, design accordingly for all kinds of potential failures, and build in appropriate mitigation mechanisms.
Reliability mechanisms in embedded systems
Silicon vendors usually provide a built-in redundant boot feature in their microcontrollers. In the event that the normal boot device fails or its bootable image is corrupted, the redundant boot mechanism permits booting from an alternative boot device, preserving the ability to boot despite a boot-path failure in the field. From a reliability-metrics point of view, redundant boot improves “availability” and helps extend overall “MTTF” (Mean Time To Failure), while drastically reducing, if not eliminating, the contribution of boot failures to “MTTR” (Mean Time To Repair). And as noted earlier, higher availability of a mission-critical system translates into improved security.
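The fallback decision itself can be sketched as a simple slot-selection routine. The `boot_slot_t` abstraction below is hypothetical, not any vendor’s API; on real silicon the two slots would map to, say, two flash banks or two SPI-NOR devices, and the validity flag would be the outcome of the full secure-boot verification described earlier.

```c
#include <stdbool.h>

/* Hypothetical boot-path abstraction (illustrative names). */
typedef struct {
    bool present; /* did the boot device respond at all?           */
    bool valid;   /* did its image pass authentication/integrity?  */
} boot_slot_t;

enum { SLOT_PRIMARY = 0, SLOT_REDUNDANT = 1, SLOT_NONE = -1 };

/* Returns the index of the first bootable slot, preferring the normal
 * (primary) path, or SLOT_NONE if neither path is usable. */
int select_boot_slot(const boot_slot_t slots[2])
{
    for (int i = 0; i < 2; i++) {
        if (slots[i].present && slots[i].valid)
            return i;
    }
    return SLOT_NONE; /* unrecoverable in the field: flag for repair */
}
```

The ordering encodes the policy: the redundant device is consulted only when the primary path is absent or fails verification, so normal boots are unaffected by the presence of the fallback.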
Implementing redundant boot requires more than experience with the microcontroller’s development environment and compiler toolchain. Critical design decisions are involved: the scope of the redundant boot function, the system hooks for detecting normal boot failure, and protection of the redundant bootable image against unauthorized access. It also requires knowing how to configure the microcontroller for redundant boot, create a bootable redundant image file with the appropriate memory configuration, flash the boot image into the redundant boot device, and devise a test plan for validating the redundant boot functionality and establishing reliability metrics.
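One common hook for detecting normal boot failure is a persistent boot-attempt counter, typically paired with a watchdog reset: if the application never runs far enough to clear the counter, repeated resets eventually push the bootloader onto the redundant path. The sketch below illustrates the idea with made-up names and a plain struct; a real implementation would keep the counter in battery-backed registers or non-volatile memory so it survives resets.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u /* illustrative threshold */

/* Hypothetical persistent boot state (survives resets in real hardware). */
typedef struct {
    uint32_t attempts; /* incremented before each try, cleared on success */
} boot_state_t;

/* Called at the top of the boot code: decide whether to keep trying the
 * normal path or to switch to the redundant boot device. */
bool use_redundant_path(boot_state_t *st)
{
    if (st->attempts >= MAX_BOOT_ATTEMPTS)
        return true;  /* normal path deemed failed */
    st->attempts++;   /* persist before jumping to the image */
    return false;
}

/* Called by the application once it is up and running correctly. */
void mark_boot_success(boot_state_t *st)
{
    st->attempts = 0;
}
```

A test plan for this mechanism would exercise all three transitions: a clean boot clearing the counter, a transient failure recovering within the threshold, and a persistent failure crossing the threshold and landing on the redundant device.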