Configuring Programs in Memory

In the early stages of embedded software development, it is essential to understand how information is going to flow through the entire embedded system at multiple levels, perhaps starting with a block diagram of the system’s overall functionality and working down to its instantiation in electrical-computational components. An integral step in this process is understanding the types of memory available to you as a programmer and how whatever program(s) you end up writing are going to be executed at a physical level. This might seem rather tedious, but I actually find it to be one of the most intellectually exciting parts of embedded development. We’re dancing right on the boundary of software abstraction from the physical substrate by imposing an organizational scheme on our computing resources to create program robustness and flexibility.

The details of how this process plays out will of course be different for each combination of hardware platform and application, but it’s still helpful to illuminate the process with a worked example instead of just listing some general principles in bullet-point format.

Consider a chip with:
1) 4 MB of program memory
2) 512 KB of SRAM with ECC
3) 128 KB of data flash for emulated EEPROM with ECC

[EEPROM stands for ‘electrically erasable programmable read-only memory’. It is a form of non-volatile memory (NVM), which means that it can hold its stored information even without power. EEPROMs can be erased and programmed in-circuit by applying particular electrical signals, and it’s possible to erase a single cell at a time. Oversimplifying, data flash is a cheaper version of EEPROM with faster read times. Note that it’s not possible to erase a single cell of data flash; erasure happens a whole sector at a time.]

Nominally, the purposes of the above memory areas are, respectively:
1) Store all program code
2) Run the program code
3) Store data that the programmer wants to survive power cycles, such as device settings or telemetry from the last power-on state

Suppose that one of the highest-level requirements of the embedded software to run on this chip is that it is modifiable in situ. Suppose also that it is either impossible or extremely expensive to recall a device from the field, so reprogramming must be possible without manually applying a debugger to the circuit board running the software.

What more should we know about our memory before making any configuration decisions?
Firstly, we should know how the 4 MB program memory is organized: e.g. whether it is divided into sectors, its addressing scheme, where it starts reading and executing from on chip power-up/reset, etc.

As we look into these factors, we should keep in the back of our minds some basic guiding principles of embedded systems, such as:

1) A device should do nothing but wait for a command from the user when powered on or reset; in particular, it must not do anything surprising, extreme, or otherwise difficult to control.
–> This means that, if we are going to have a bootloader program and an application program, the bootloader must start running first every single time. The safest way to do this is either to store the bootloader starting at the first-to-execute address or to jump, as the very first operation on power-up or reset, to where the bootloader is stored.
2) We must prioritize preserving the ability to communicate with the device at all costs.
–> This means that the on-chip code that handles receiving information from and sending information to the user must be pristine and stay pristine at all times; we never want to modify this code in situ.
–> This suggests that a good software design will consist of at least two separate programs that can at least partially reference each other: a bootloader, which, at minimum, handles communications and the start of application program execution, and the application program itself.

Now, looking into the above factors, let’s say that the chip starts reading from program memory address 0x0000_0000 on power on/reset. So, at 0x0000_0000 we want to store either the first bytes of the bootloader or an instruction to jump to the address that contains those first bytes. Let’s explore those options in turn, starting with storing the bootloader starting at 0x0000_0000. Then the bootloader code will extend to some other address within those 4 MB of space. For spice, assume that there are two separate banks of program memory, each of size 2 MB, and that a program cannot span the two banks. Then we have 2 MB of space for the bootloader, which is way more than we should actually use. We should even be able to fit both a bootloader and an application in that bank.
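The rule that a program cannot span the two banks is easy to check mechanically. Here is a minimal sketch in C, assuming (the example chip description does not say) that the two 2 MB banks sit back to back starting at 0x0000_0000. The names BANK_SIZE, FLASH_SIZE, and image_fits_in_one_bank are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: two contiguous 2 MB banks starting at address 0.
   Only the bank count and bank size come from the example; the
   addresses and names are hypothetical. */
#define BANK_SIZE   (2u * 1024u * 1024u)
#define BANK_COUNT  2u
#define FLASH_SIZE  (BANK_SIZE * BANK_COUNT)

static_assert(FLASH_SIZE == 4u * 1024u * 1024u,
              "two banks must cover the 4 MB of program memory");

/* True if [start, start + size) lies inside program flash and
   inside a single bank. */
bool image_fits_in_one_bank(uint32_t start, uint32_t size)
{
    /* Reject empty images and anything that runs off the end of flash.
       Written this way so that start + size cannot overflow below. */
    if (size == 0u || start >= FLASH_SIZE || size > FLASH_SIZE - start) {
        return false;
    }
    uint32_t first_bank = start / BANK_SIZE;
    uint32_t last_bank  = (start + size - 1u) / BANK_SIZE;
    return first_bank == last_bank;
}
```

Under these assumptions, a 64 KB image at address 0 passes the check, while an 8 KB image starting at 0x001F_F000 fails because its last bytes land in the second bank.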

One question is, do we really want to do that? What if we did? Would that prevent us from being able to erase and rewrite application code if we wanted to modify it?
We need to look at the sector size for that memory area. If it’s equal to the size of the bank itself, we can’t modify the application without erasing the bootloader, and if we erased the bootloader, then all communication with the device would be lost: game over. On the other hand, if a sector is smaller, e.g. 16 KB, then we could easily use the bootloader to erase sectors and rewrite application code in them.
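The sector question reduces to integer arithmetic: two programs can be erased independently only if no flash sector holds bytes from both. The following is a rough sketch, assuming uniform 16 KB sectors (the size floated above) and a 2 MB bank; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE (16u * 1024u)          /* assumed uniform sector size */
#define BANK_SIZE   (2u * 1024u * 1024u)   /* from the example chip */

static_assert(BANK_SIZE % SECTOR_SIZE == 0u,
              "a bank is assumed to hold a whole number of sectors");

/* Index of the sector that contains addr. */
static uint32_t sector_of(uint32_t addr)
{
    return addr / SECTOR_SIZE;
}

/* True if the two regions have bytes in at least one common sector,
   which means erasing one region would also erase part of the other.
   The caller must make sure start + size does not wrap around. */
bool regions_share_sector(uint32_t a_start, uint32_t a_size,
                          uint32_t b_start, uint32_t b_size)
{
    if (a_size == 0u || b_size == 0u) {
        return false;   /* an empty region occupies no sector */
    }
    uint32_t a_first = sector_of(a_start);
    uint32_t a_last  = sector_of(a_start + a_size - 1u);
    uint32_t b_first = sector_of(b_start);
    uint32_t b_last  = sector_of(b_start + b_size - 1u);
    return a_first <= b_last && b_first <= a_last;
}
```

For instance, with a 40 KB bootloader at address 0 (a made-up size), an application starting right behind it at 0xA000 would share sector 2 with the bootloader, while one starting at 0xC000 would not.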

Next, consider the other arrangement. As stated before, we could store a jump instruction at 0x0000_0000 that takes us to the start of the bootloader on chip power on/reset. If we pack the bootloader efficiently, i.e. start it at an address that leaves just enough space for it in its flash bank, then we leave as much room as possible in that bank for the application. The same is true if we store the bootloader starting at 0x0000_0000.

So, how to choose? A concern with storing the bootloader at an offset is that, if we’re not careful with the size of our application, we could unintentionally erase and rewrite the start of the bootloader when writing a new copy of the application. No good. We’d also have to include a jump to the application inside the bootloader, but this is the case either way, so net neutral. So, for now, let’s go with the bootloader starting at 0x0000_0000 and the application “higher up”, starting in a sector that doesn’t contain any bootloader code. This arrangement keeps the bootloader and application independent in memory while giving the bootloader the power to erase and rewrite the application.
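This arrangement can be written down as a handful of constants plus a guard that the bootloader would run before accepting a new application image. Treat it as a sketch under assumptions: the 16 KB sector and 2 MB bank come from the example, while the bootloader size BOOT_SIZE and every name below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE (16u * 1024u)          /* assumed sector size */
#define BANK_SIZE   (2u * 1024u * 1024u)   /* from the example chip */
#define BOOT_START  0x00000000u            /* first-to-execute address */
#define BOOT_SIZE   (40u * 1024u)          /* hypothetical bootloader size */

/* Round x up to the next sector boundary. */
#define ALIGN_UP(x) ((((x) + SECTOR_SIZE - 1u) / SECTOR_SIZE) * SECTOR_SIZE)

/* The application starts in the first sector with no bootloader bytes
   and may grow to the end of the bank. */
#define APP_START    ALIGN_UP(BOOT_START + BOOT_SIZE)
#define APP_MAX_SIZE (BANK_SIZE - APP_START)

static_assert(APP_START % SECTOR_SIZE == 0u,
              "application must start on a sector boundary");
static_assert(APP_START >= BOOT_START + BOOT_SIZE,
              "application must not overlap the bootloader");

/* Number of whole sectors that must be erased to hold image_size bytes. */
uint32_t app_sectors_needed(uint32_t image_size)
{
    return image_size / SECTOR_SIZE + ((image_size % SECTOR_SIZE) != 0u ? 1u : 0u);
}

/* Guard to run before erasing anything: refuse empty images and images
   that would run past the end of the bank. */
bool app_image_accepted(uint32_t image_size)
{
    return image_size > 0u && image_size <= APP_MAX_SIZE;
}
```

With these made-up numbers the application starts at 0xC000. Because the application sits above the bootloader, an oversized image can only run off the end of the bank, never into bootloader sectors, and the guard rejects it before any erase happens.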

What if we wanted to be able to modify the bootloader itself? Typically, we really don’t want to do this, because small memory sizes in embedded applications often dictate that you can only store one copy of the bootloader. If you try to rewrite the bootloader using the bootloader, there are unrecoverable failure modes in which the system might find itself. For example, bear in mind that the bootloader is getting the data to rewrite from somewhere else, perhaps a PC or another embedded processor. The bootloader is dutifully taking in new bytes and placing them in its own memory space. It could overwrite part of the very code that tells it how to rewrite code or how to communicate with the external device. Or the data link could break, leaving you with a mix of old and new code that might not be mutually consistent.

All of this is terrible. By keeping the bootloader brutally simple, we can exhaustively test it, put it in the field with full confidence, and avoid thinking about those failure modes.

But let’s say we really want to have the ability to modify the bootloader, and we have the space to store multiple copies of the program. If we did our initial design and development correctly, we should never need to modify the bootloader after deployment. But perhaps we’re concerned about corruption of memory due to environmental factors. Then we need to decide how many copies we’re going to have, where to put them, and how to be able to select which one runs after a power on/reset.

Again, we should always have a pristinely functioning bootloader on our device (modulo disastrous hardware events) and be able to revert to that version of the program. Ideally, that area of memory stays locked forever to prevent other program copies from tampering with it. Call this copy the Eternal Bootloader.
Let’s say we have space to store one other copy at a time; call it the Failsafe Bootloader. Then we could use the Eternal Bootloader to write the new copy into distinct sectors of program flash. We could also write a function/user command into the Eternal Bootloader that jumps to the starting address of the Failsafe Bootloader. Likewise, we can write our Failsafe Bootloader to include a function/user command to jump to 0x0000_0000 and so retain the ability to ping-pong between copies, whether or not that’s actually a wise idea.

From a design perspective, it would be tempting to include this backup-boot-write capability in order to perform bootloader bug fixes or updates in the field. While this does happen in certain applications, there are compelling reasons not to do this.

If you have not found all bugs in your bootloader during its development and that of your primary application program, your bootloader is too complicated.
If you want to include more functionality in your bootloader later, you are making it too complicated.
If you are relying on users to update the bootloader in situ, you are possibly making things too complicated and risky for those users.

An alternate purpose of having a second copy is to include it from the factory as an identical copy of the Eternal Bootloader: pure redundancy in case of hardware corruption of the Eternal Bootloader program memory. In that case we try the Hail Mary of jumping to the backup copy, which might or might not work depending on exactly how the Eternal Bootloader memory is borked.
In this regime, we never intend to modify the backup copy, for the three reasons above. The same redundancy reasoning applies to the application code, with the difference that we very much do want to be able to modify the application in situ.
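One way to decide when to attempt that jump is to store a checksum alongside each copy and verify it at startup. Nothing above prescribes a detection method, so the sketch below simply assumes a CRC-32 stored with each image, and every name in it is hypothetical. It also has exactly the limitation just described: the check is itself code that has to live somewhere, and if that code is what got corrupted, it cannot help.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320): slow but tiny,
   which suits a bootloader that should stay simple. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical descriptor for one stored bootloader copy. */
typedef struct {
    const uint8_t *image;        /* where the copy lives in flash */
    size_t         length;       /* number of bytes covered by the CRC */
    uint32_t       expected_crc; /* CRC recorded when the copy was written */
} boot_copy_t;

enum { BOOT_COPY_NONE = -1 };

/* Return the index of the first copy whose CRC matches its stored
   value, or BOOT_COPY_NONE if every copy fails the check. */
int select_boot_copy(const boot_copy_t *copies, int count)
{
    assert(copies != NULL);
    for (int i = 0; i < count; i++) {
        if (crc32_compute(copies[i].image, copies[i].length) == copies[i].expected_crc) {
            return i;
        }
    }
    return BOOT_COPY_NONE;
}
```

Listing the primary copy first means the backup is only chosen when the primary fails its check, which matches the pure-redundancy role described above.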

With the purpose of redundancy in case of hardware failure in mind, it makes sense to lay out these copies in the following (abstracted) way:
Program flash bank 1:
Bootloader Copy 1
[at least one empty flash sector]
Application Copy 1

Program flash bank 2:
Bootloader Copy 2
[at least one empty flash sector]
Application Copy 2
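
The abstract layout above can be pinned down in a small table and checked, for example in a host-side unit test. In this sketch the bank base addresses, the 16 KB sector size, the bootloader size, and all names are assumptions for illustration; only the ordering (bootloader, gap, application) and the 2 MB bank size come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE (16u * 1024u)          /* assumed sector size */
#define BANK_SIZE   (2u * 1024u * 1024u)   /* from the example chip */

static_assert(BANK_SIZE % SECTOR_SIZE == 0u,
              "a bank is assumed to hold a whole number of sectors");

/* One flash bank: the bootloader copy sits at the bank base, the
   application copy starts at app_start. */
typedef struct {
    uint32_t bank_base;
    uint32_t boot_size;
    uint32_t app_start;
} bank_layout_t;

/* Hypothetical numbers: 40 KB bootloader, application 64 KB into each bank. */
const bank_layout_t bank_layouts[2] = {
    { 0x00000000u, 40u * 1024u, 0x00010000u },   /* program flash bank 1 */
    { 0x00200000u, 40u * 1024u, 0x00210000u },   /* program flash bank 2 */
};

/* True if the application starts on a sector boundary inside the bank
   and at least one completely empty sector separates it from the
   bootloader. */
bool bank_layout_ok(const bank_layout_t *b)
{
    uint32_t boot_end   = b->bank_base + b->boot_size;   /* one past last byte */
    uint32_t first_free = ((boot_end + SECTOR_SIZE - 1u) / SECTOR_SIZE) * SECTOR_SIZE;

    if (b->app_start % SECTOR_SIZE != 0u) {
        return false;
    }
    if (b->app_start >= b->bank_base + BANK_SIZE) {
        return false;
    }
    return b->app_start >= first_free + SECTOR_SIZE;
}
```

With these numbers the bootloader occupies the first three sectors of each bank, the fourth sector stays empty, and the application begins in the fifth.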

When we think about using the program memory available to us in an optimal way, we want to make sure that we’re laying the foundation for a software design and overall system design that at least have adequate:

modes of recovery
modes of security
testability
modularity (for future changes as well as testability)

By storing redundant program copies when space allows, limiting the responsibility of the bootloader, and giving programs sensible separation and arrangement across the available space, we are advancing all of those goals of our embedded system development.