Started 23Apr2020. Updated 15Apr2021. In work. The title is more of a pun than reflecting real content. This note is in group Technology (plus My XMOS pages). Important: standard disclaimer, narrative style disclaimer.
I cannot hide that this is far from anything I feel comfortable writing about. Even though I have several decades of experience with safety-critical embedded systems, we did not at that time give embedded security any consideration. I have next to no experience in cryptography, zero with trying to break it, and zero experience with making it hard to extract my secret codes. But I found it so interesting that I wanted to try discussing it here.
If you know this field of technology and think that the below has too little essence, and have some spare time – I’d be happy for a comment (below) or just mail me. My goal is to learn some and ask some questions.
Another reason for not feeling comfortable writing about this is that I don’t know how murky this is. If you are looking for a recipe on how to break into a gadget, read on. At the end I hope you have been discouraged. Anyhow, please find a more sustainable way to earn your living.
The main theme here is quoted from  (Rambus):
..there is a category of attacks that simply ignore the mathematic properties of a cryptographic system – instead focusing on its physical implementation in hardware.
..effective side-channel countermeasures should be implemented at the design stage to ensure protection of sensitive keys and data.
Open security library
NewAE Technology Inc won with its ChipArmour library in the security & safety category. This is an open source library that will help customers develop embedded software solutions that are tolerant to fault injection. Fault injection is one of the most powerful types of attack embedded systems are exposed to today, as it allows the attacker to take shortcuts past secure boot and other security mechanisms. There are many examples of such attacks in recent years.
Prevents forced injection
What ChipArmour does is to build software-based countermeasures against fault injections into user-friendly sample code. In addition, ChipArmour can be integrated with existing products from NewAE Technology Inc, such as ChipWhisperer and ChipSHOUTER, to enable advanced verification of the final code. This means that ChipArmour is not only theoretically secure code, but code that has been tested on a variety of hardware platforms.
Fig.2. I take the chance of placing the figure prematurely here. The figure is rather naïve. (For a real board’s block diagram, see e.g. ). First, you won’t find Rser on your board. (Rser is a shunt for the meter Mc but a series resistor in the power path, Wiki-refs.) If you did find an Rser, it would not have the correct value, and cutting a track to insert one is equally difficult. In order to measure the current drawn by a Device with Mc, I assume the capacitors would have to be removed, which would also be difficult. They are probably ceramic and soldered onto the board, they may be what keeps the power alive, and I would not expect Device to work very well without them. So Rser certainly can’t be placed just anywhere. Often a Device also needs several voltages, each with lots of pads/pins and wires to them. I assume you would either have to design a board yourself, or buy one made for this purpose (below). Also, an antenna is not just some wire, and a radio receiver is not just an A.M. radio. I have used a receiver for a specific purpose (note 164), but wouldn’t even know where to start looking for a general wide band analogue (?) radio receiver. I could start with a wire connected to my 1 GSa/s scope (note 185), then buy an antenna amplifier. Finally I would have to change field of work and go for it. Therefore, this would only have academic interest for me. I guess many out there might reason like this, and become “surprised” next time around when somebody has succeeded in fault-injecting their board.
Some points from  and  by NewAE Technology Inc and  and  by Rambus, and several Wikipedia articles. The fact that these give such good overviews and I did make the figure above is my excuse for making this look more like a brainstorming list:
- Side-channel attacks (SCA) may monitor power consumption or electromagnetic (radio) emissions while a device is performing cryptographic operations. Simple power analysis (SPA) and differential power analysis (DPA) are the two classic ways of analysing what has been recorded
- Simple power analysis (SPA) means to read the current consumption of a device and interpret the traces more or less directly, since different operations draw visibly different amounts of current. The goal is to reach some data set that would mirror, e.g. an internally stored cryptographic key
- Differential power analysis (DPA) means collecting many such traces, of power consumption or of the radio signals emitted during calculations, and doing statistics on the data sets. Here correlation between data sets and extracting meaningful differences is the main point
- I think one of the purposes here is to be able to bypass an assumed safe boot mechanism
- According to  it may be possible to listen to the power consumption of a device like a processor (or a smart card!), to find out what it does, based on inserting known test vectors, such as different encryption keys. This way an encryption key hidden in the code (assumed not available) may be broken, by finding a mirror of the program flow and thus detecting “conditional jumps based on key bits and computational intermediates” during the processing of the incoming keys
- There probably are lots of units out there that one can buy to learn, test and design in order to avoid side-channel attacks. What they have in common, I think, is that the hardware is quite expensive for an amateur (or retired) to fiddle with at home
- From NewAE Technology Inc. The ChipWhisperer-Lite is a tool from , a set of two boards, where one board basically contains a Xilinx SPARTAN-6 FPGA and the other is a necessary “target board” . This may help with testing these claims. (I have some FPGA notes here). With this same tool it is also possible to analyse, by observing from the outside, how the internal data bus of the target works while the processor handles known input data, like different encryption keys. They have an open sourced library to help with all this
- From Rambus. I have not found such readily available tools as those above. But here is from a menu: Interface IP (Memory PHYs. SerDes PHYs. Northwest Logic Controllers). Security IP (Root of Trust Solutions. Provisioning and Key Management. Protocol Engines. Crypto Accelerator (DPA Resistant Cores). DPA Countermeasures (DPA Workstation Analysis Platform). Anti-Counterfeiting). Memory Interface Chips (Server DIMM Chipsets)
- The best way to break an encryption processing is perhaps to fool the target processor into believing that the calculation concluded with a pass, even if it was supplied with a wrong key. The target processor must be considerably disturbed, but with many attempts this may succeed. The examples NewAE Technology Inc give are quite hair-raising to me. Is this really possible?
- “Even a simple A.M. radio can detect strong signals from many cryptographic devices” 
- When all this has been understood, and in fact tested with the boards mentioned above, how to take SPA countermeasures is the final, and I guess most important, step. The Preventing SPA chapter in  is only half a page, but rather interesting. However, also from : “.. implementing effective SPA and DPA countermeasures can be challenging” and they continue “We have even used the technique to reverse-engineer unknown algorithms and protocols by using DPA data to test hypotheses about a device’s computational processes. It may even be possible to automate this reverse-engineering process”.
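To make the statistical idea behind the points above a little more concrete, here is a minimal simulation sketch in Python. It is not taken from any of the references and needs no hardware. The assumptions are all mine: the "device" leaks the Hamming weight of a 4-bit S-box output plus Gaussian noise, the S-box is the one from the PRESENT cipher (picked only because it is small), and all function names are hypothetical. The attack shown is the correlation variant of DPA (often called CPA): for every key guess, predict the leakage and see which guess correlates best with the "measured" values.

```python
import random

# 4-bit S-box of the PRESENT cipher, used here only as a small nonlinear function.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    """Number of bits set in value."""
    return bin(value).count("1")


def simulate_traces(secret_key, count, noise_sigma, rng):
    """Return (plaintext, measurement) pairs for a toy leakage model.

    Assumption: the device leaks HW(SBOX[plaintext ^ key]) plus Gaussian noise.
    """
    traces = []
    for _ in range(count):
        plaintext = rng.randrange(16)
        leak = hamming_weight(SBOX[plaintext ^ secret_key])
        traces.append((plaintext, leak + rng.gauss(0.0, noise_sigma)))
    return traces


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def recover_key(traces):
    """Try all 16 key guesses and return the one that correlates best."""
    measured = [m for _, m in traces]
    best_guess, best_score = 0, -2.0
    for guess in range(16):
        predicted = [hamming_weight(SBOX[p ^ guess]) for p, _ in traces]
        score = pearson(predicted, measured)
        if score > best_score:
            best_guess, best_score = guess, score
    return best_guess


if __name__ == "__main__":
    rng = random.Random(1)
    traces = simulate_traces(secret_key=0xB, count=2000, noise_sigma=1.0, rng=rng)
    print("recovered key guess:", hex(recover_key(traces)))
```

Note that the attacker code never looks inside the device. It only needs the known inputs, the noisy measurements and a guess at what kind of value leaks. A real device is of course much less cooperative than this toy model.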
According to  there are basically three ways to try to prevent DPA. All are discussed beyond my understanding, but here are some points I picked up:
- Reduce “signal sizes” (signals, I assume, as picked up by the listening electronics) by e.g. using “constant execution path code”
- Introduce noise in the calculations. However, use “temporal obfuscation with great caution”
- DPA resistant cores: Have “realistic assumptions about the underlying hardware”. Like “aggressive use of exponent and modulus modification processes”, calculating values from scratch when needed instead of making temporary values that will be used several times
- DPA countermeasures might include burying the signal in more noise, introducing uncorrelated noise, changing how secret values are stored and calculated upon etc. To me it looks like countermeasures made to jam DPA also would make life harder for SPA, and vice versa
- DPA resistant libraries: Both NewAE Technology Inc and Rambus (and others?) would deliver such. Maybe it’s possible to make one at home as well? Excuse me, at work. However, Rambus seems to own and license several patents, so I assume it’s a wilderness to start using this
- DPA resistant cores would be “DPA Resistant Core in Verilog RTL source code; SDC constraints synthesis input for FPGA or ASIC synthesis” 
- Different cryptography algorithms, as well as how they are coded, may leak more or less, i.e. they are more or less easy to extract data from
- Also “An attacker does not need to know specific implementation details of the cryptographic device to perform these attacks and extract keys” . I assume that a repetition of this question just indicates my level of competence: is this really true?
- Finally, side-channel attacks are dependent on having the device locally available, either by acquiring it in some way or by being very close to it. I guess this may become orthogonal to the whole matter, that there is no need to think about side-channel attacks because the attacker can never become local to the device? Or should one always be worried?
- Assumed safe booting, is that to upgrade the firmware, or to boot-to-run after power-up (assuming the code runs from some volatile storage like RAM)?
- Breaking some encryption during run-time (in order to reboot with a new program of any sort) – how would multitasking / multicore processors behave when exercised with the above procedures? I am thinking about the xCORE machines from XMOS. I cannot find any such in the Category:Hardware of  or in the Rambus page . (I have some XMOS notes here). Basically, how much noise would such architectures introduce in the calculations (for free?)
- How would pipelines, caching and speculative execution influence this detection scheme?
- Much DES processing is done by << shifting or rotating >>. Does this scheme depend on whether this is done in a single clock cycle or several?
- Same for the * multiplier. At least for this I read that “The leakage functions depend on the multiplier design but are often strongly correlated to operand values and Hamming weights”
- And for xz+y exponentiators. Also from : “Modular exponential functions that operate on two or more exponent bits at a time may have more complex leakage functions”
- By inspecting the code in the target or victim in  I get the impression that the code they extract the running pattern from runs as an interrupt, triggered by the input vector coming in that way, over a serial channel or USB. To me this looks rather repeatable. What if the input vector were sent into another task, like in a (preemptive) operating system? The code could trigger other tasks to do something badly random just to introduce noise, while the encryption is being done?
- There are some answers to the above questions 1-7 at the NewAE forum, question 1 (below)
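The “constant execution path code” countermeasure in the list above can be illustrated with a small sketch. It is my own illustration and not from the Rambus material. Both functions compare a secret with a guess and also return the number of loop steps taken, purely so the difference can be seen. The function names are hypothetical, and the length of the secret is assumed to be public. Python itself gives no timing or power guarantees, so this only shows the shape of the idea, not a hardened implementation.

```python
def leaky_equal(secret, guess):
    """Early-exit compare: the number of loop steps depends on the data."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            # Leaving early tells an observer how many leading bytes matched.
            return False, steps
    return True, steps


def constant_path_equal(secret, guess):
    """Compare that always walks all bytes and has no data-dependent branch."""
    steps = 0
    if len(secret) != len(guess):  # assumption: the length is not a secret
        return False, steps
    diff = 0
    for s, g in zip(secret, guess):
        steps += 1
        diff |= s ^ g  # accumulate differences instead of branching on them
    return diff == 0, steps


if __name__ == "__main__":
    secret = b"KEY1"
    for guess in (b"XXXX", b"KXXX", b"KEXX", b"KEY1"):
        print(guess, leaky_equal(secret, guess), constant_path_equal(secret, guess))
```

With the early-exit version the step count grows with every correct leading byte, which is exactly the kind of program-flow mirror a side channel can pick up. The second version takes the same path whatever the data. For real Python code the standard library already offers hmac.compare_digest for this purpose.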
In our research, 45% of electronics engineers cite DATA SECURITY as the biggest barrier to success – and it’s easy to see why. As technology becomes smarter, consumers and enterprises are increasingly protective of their data, fearing a loss of control, potential misuse and cyber-crime.
In the AIoT, data needs to be shared across devices – think controlled ‘hive learning’ within set parameters. Edge-AI makes this compelling, because data processing, inferencing and decisioning can take place on-device rather than in the cloud, which reduces the (perceived) risk of data leakage and vulnerability.
They continue with their solution:
xcore.ai features include secure boot, one-time-programmable key storage, random number generation and custom instructions. The on-device data processing, inferencing and decisioning capability helps (sic) reduce the (perceived) risk of data leakage, allay end user privacy concerns and improve the overall experience.
In , where the upcoming (in 2020) xcore.ai (an “X3”?) processor is described, the author describes its thread scheduler’s buffer. I certainly think this is different from the X1 and X2 range of xCOREs:
The CPU switches threads every cycle in round-robin fashion. Ideally, each thread advances every eight cycles, achieving an effective speed of 100MHz. If a thread is inactive, the CPU skips to the next one, but because the five-stage pipeline has no forwarding or interlocks, the minimum number of active threads is five. The thread scheduler buffers 256 bits (at least eight instructions) per thread and issues them to the decoder as needed. When the buffer is empty, it fetches another 256 bits from the SRAM; since the memory is single ported, this access prevents an instruction from executing on that cycle.
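The arithmetic in the quote can be worked through in a few lines. This is only my reading of it: I assume a core clock of 800 MHz (inferred from eight threads at 100 MHz each), and I assume that with fewer than five active threads a thread still only advances every fifth cycle. Stalls caused by instruction fetches from the single-ported SRAM are ignored. The function name and its parameters are hypothetical.

```python
def per_thread_mhz(active_threads, core_mhz=800.0, pipeline_depth=5, max_threads=8):
    """Ideal instruction rate per thread under round-robin issue.

    Assumptions (mine): 800 MHz core clock, and a thread can issue at most
    once per pipeline_depth cycles since the pipeline has no forwarding.
    """
    if not 1 <= active_threads <= max_threads:
        raise ValueError("active_threads must be between 1 and max_threads")
    # Fewer than pipeline_depth threads cannot fill the pipeline, so the
    # issue interval never drops below pipeline_depth cycles.
    cycles_between_issues = max(active_threads, pipeline_depth)
    return core_mhz / cycles_between_issues


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, "active threads:", per_thread_mhz(n), "MHz per thread")
```

Under these assumptions the rate of one thread changes only when the number of active threads is between five and eight. Whether that variation is of any help as noise against a listener is one of the things I do not know.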
The code (download)
I have some xC code where I have experimented with some ideas. To tell the truth, I think that I am at some dead end with that code, or at a dead end with the basic idea. But then, maybe not? What I try to do in the code is to run it in such a way that what happens on one of the logical cores could not be spied on over radio or by monitoring the power consumption. But then, I don’t know if it’s random enough, or whether it leaves so little trace that its pattern could not be found.
You can download it from My xC code downloads page, part #206.
- Task/processes to insert random noise – 29Apr2020
- Avoiding all zeroes entering a random sequence without any conditional testing – 9May2020
- xCORE and Side-Channel Attack – 30Apr2020
- For some code I tested I raised an internal XMOS Ticket called 32552 “Possible race by timerafter case over interface input case in selects in equal code on 7 cores” – 4May2020 (xTIMEcomposer 14.4.1)
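The second point in the list, keeping an all-zero value out of a random sequence without any conditional test, can be sketched as below. This is not the xC code from the download, just one well-known way to do it, written in Python with 32-bit masking. The generator is Marsaglia's xorshift32, where zero is a fixed point and must therefore never be used as a state. The helper name force_nonzero is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def force_nonzero(x):
    """Map 0 to 1 and leave every other 32-bit value unchanged, branch-free.

    For any nonzero x, either x or its two's complement negation has the
    top bit set, so (x | -x) >> 31 is 1. For x == 0 it is 0.
    """
    x &= MASK32
    negated = (-x) & MASK32
    is_zero = ((x | negated) >> 31) ^ 1
    return x | is_zero


def xorshift32(state):
    """One step of Marsaglia's xorshift32 (shifts 13, 17, 5)."""
    state &= MASK32
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


if __name__ == "__main__":
    print("stuck at zero:", xorshift32(0))
    print("after fix-up:", xorshift32(force_nonzero(0)))
```

The point of avoiding an if-test is the same as for constant execution path code in general: the path taken does not depend on the value. Whether the shift and OR operations themselves leak is a separate question that this sketch does not answer.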
Denial-of-service (DoS) and side-channel attacks, while good future targets for Morpheus, are outside the scope of this work… Looking ahead, we see great potential for EMTD technologies. Beyond control-flow attacks, we envision that a similar approach could be adopted to protect against side-channel attacks,… (Ensembles of moving target defences = EMTD)
New to me was Address space layout randomisation – ASLR (Wiki-refs) and the fact that  “Today, ASLR is widely deployed in Windows, Linux, iOS, and Android.” and “attackers have overcome all variants of ASLR”. (By side-channel attacks (Wiki-refs)).  continue “In this work, we seek to elevate moving target defenses by combining them with a runtime re-randomization technology called churn.”
Then Todd Austin says in  that “Yeah. It’s interesting how a lot of people say that Morpheus is unhackable. You see that in the press a lot. I don’t generally say that myself, because I think it is hackable. But it’s super hard to hack.”
There are so many interesting things in this world.. I need a restart myself.
- English: Address space layout randomization (ASLR), Data Encryption Standard (DES). Fault injection. Hamming weight. Power Analysis (Simple Power Analysis SPA, Differential Power Analysis DPA). Rambus. Shunt vs. series resistor. Side-channel attack (SCA). Simon (cipher) – used by Morpheus. Speculative execution
- Embedded Security Isn’t Easy by NewAE Technology Inc, see https://www.newae.com/embedded-security-101 (23Apr2020)- Also see refs below
- Differential power analysis by Paul Kocher, Joshua Jaffe and Benjamin Jun (1999). Lecture Notes in Computer Science, 388-397, read at https://link.springer.com/chapter/10.1007/3-540-48405-1_25
- Noe redusert møteplass (“Somewhat reduced meeting space”), about Embedded World in Nürnberg (NO), by Einar Karlsen (ed) in the magazine Elektronikk 04.2020. Read at http://viewer.zmags.com/publication/695f03a2. At page 14
- NewAE Technology wiki, see https://wiki.newae.com/Main_Page. “Welcome to ChipWhisperer – the complete open-source toolchain for side-channel power analysis and glitching attacks. This is the main landing page for ChipWhisperer”. ChipWhisperer is based on a Kickstarter project called ChipWhisperer-Lite, started by Colin O’Flynn (here). 2020.4.27 from LinkedIn: now CEO at NewAE Technology Inc. & Assistant Professor at Dalhousie University.
I believe the capture boards ChipWhisperer-Lite, -Pro and -Nano are based on Xilinx SPARTAN-6 FPGA and an Atmel SAM Arm processor (Atmel is Microchip these days. Atmel has a design centre here in Trondheim!)
I believe the target boards are based on Atmel XMEGA, MEGA32, Intel 87C51, ATS Cortex M0, Xilinx FPGA Spartan 6 and Artix-7, smart card etc.
- Embedded World award 2020: AND THE WINNER IS … NewAE Technology Inc wins with ChipArmour in the category Safety & Security, see https://www.embedded-world.de/en/news/press-releases/winner-embedded-awards-0dhccwe30m_pireport
- Security Licensed Countermeasures by Rambus, see https://www.rambus.com/security/dpa-countermeasures/licensed-countermeasures/
From that page: “Our Cryptography Research division discovered SPA and DPA and has developed and patented the fundamental countermeasures needed to protect against these attacks.”
- Introduction to Side-Channel Attacks by Rambus, see Security Licensed Countermeasures (Download after filling in name and mail address)
- Protecting Electronic Systems from Side-Channel Attacks by Rambus. Same scheme as 
- NewAE Technology wiki, page CW305 Artix FPGA Target, Hardware details, see https://wiki.newae.com/CW305_Artix_FPGA_Target
- GitHub/newaetech/chipwhisperer GitHub code of both capture and target (“victim”) boards. To me it looks like all code is in C (like the victim file XMEGA_AES_driver.c), Verilog (.v) or Python (.py). Also Autoconf files (.in), plus makefile etc. The layout seems to be in .sch files, which I believe are for Eagle or KiCad (CAD-tool, graphical circuit diagram editor, see note 195). See https://github.com/newaetech/chipwhisperer
- NewAE Forum, see https://forum.newae.com. Admin is Colin O’Flynn. I registered there as Aclassifier
- THE EDGE OF TOMORROW, marketing material from XMOS, see https://www.xcore.ai/wp-content/uploads/2020/04/Edge-of-Tomorrow.pdf (Apr2020)
- XMOS xCORE.AI ADDS VECTOR UNIT by Linley Gwennap, Linley Group (May 4, 2020), see https://www.xmos.com/wp-content/uploads/2020/06/XMOS-Xcore.ai-Adds-Vector-Unit.pdf
- Morpheus Turns a CPU Into a Rubik’s Cube to Defeat Hackers. University of Michigan’s Todd Austin explains how his team’s processor defeated every attack in DARPA’s hardware hacking challenge. By Samuel K. Moore, in IEEE Spectrum 13 Apr 2021 See https://spectrum.ieee.org/tech-talk/semiconductors/processors/morpheus-turns-a-cpu-into-a-rubiks-cube-to-defeat-hackers. Also see:
- Morpheus: A Vulnerability-Tolerant Secure Architecture Based on Ensembles of Moving Target Defenses with Churn by Todd Austin et al. In ASPLOS’19, April 13–17, 2019, Providence, RI, USA. See https://dl.acm.org/doi/pdf/10.1145/3297858.3304037. Also see