ARM Cortex processors and microcontrollers are ubiquitous because the architecture has been so successful, so it’s useful to understand how they work and how to use them as tools. When getting started with embedded development on ARM Cortex, the platform can seem complex and difficult, because much of the work is usually done for you: you download tools that set up projects, configure and initialize the device, and pull in dependencies that provide an API for the device peripherals. It’s not obvious what is going on “under the hood” during those steps or what some of the resulting code is doing. Here I’ll document the steps to get an ARM Cortex microcontroller up and running with only simple tools and eventually make it perform more advanced tasks.
Source code is available at my GitHub. In this post an LED is toggled. The repository also contains code that does more advanced tasks, such as configuring the microcontroller as a USB PC mouse, sending I2C data or running code asynchronously using interrupt service routines.
Article summary: I program an STM32 microcontroller, which is built around an ARM Cortex-M3, using only basic tools such as the GCC-ARM toolchain.
The tools needed are the GCC-ARM toolchain, a text editor and a way to program the microcontroller. On Ubuntu the toolchain is available as the gcc-arm-none-eabi package and can be installed with, for example, sudo apt-get install gcc-arm-none-eabi.
To get a basic hello world program that blinks an LED running, the steps are the following.
- Define the microcontroller memory layout for the compiler using a linker-script.
- Create the startup-routine. It’s a sort of bootloader.
- Create the main application which is hello world.
- Compile the programs and flash them to the microcontroller.
The linker script defines the memory layout of the output binary, that is, the program that is flashed onto the microcontroller and executed on boot.
The image below is an excerpt from the datasheet and shows the addresses at which flash memory and SRAM are located. Because the two memories are accessed at different addresses, the linker script is needed: data must be placed at addresses where there is actual hardware memory, and certain types of data belong in certain types of memory. For example, data in SRAM is lost when power is removed, but SRAM is faster to access than flash memory. Flash memory keeps its contents without power, which makes it well suited to read-only data such as the actual program instructions.
Page 34 of the datasheet shows the memory mapping. The information needed is the start addresses of the flash memory and SRAM, which are 0x0800 0000 and 0x2000 0000. It’s also possible to use 0x0000 0000 as the flash start address, since that address can be aliased to the flash memory, depending on the boot configuration.
Data will be segmented depending on its type. Here data is segmented using “text”, “rodata”, “data” and “bss”. This Wikipedia article summarizes the difference between some of these. A custom segment called “vectors” will also be used. It is an architecture-specific data region that contains the address of the stack top and the addresses of the exception handlers. Read more about it here.
Here is the actual linker script:
The memory regions are defined in the MEMORY block. The starting addresses and lengths of the flash and SRAM regions are taken from the datasheet. The flash memory is 64k and the SRAM 20k. Next the data segments are defined in the SECTIONS block. Some useful symbols for memory addresses are defined as well, such as the top of the stack, the part of flash memory that follows the text and rodata segments (flash data start) and some similar symbols for SRAM. Those symbols will be used by the startup program to initialize and load data.
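The arithmetic behind those regions is easy to check on a PC. Below is a minimal sketch in plain C, not the linker script itself. It uses the origins and sizes quoted above and assumes that the stack-top symbol sits at the first address past the end of SRAM, which is the usual choice because the Cortex-M stack grows downwards. The names FLASH_ORIGIN, STACK_TOP, in_region and so on are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Region origins and sizes as quoted in the text (64k flash, 20k SRAM). */
#define FLASH_ORIGIN 0x08000000u
#define FLASH_LENGTH (64u * 1024u)
#define SRAM_ORIGIN  0x20000000u
#define SRAM_LENGTH  (20u * 1024u)

/* Assumption: the initial stack top is the first address past the end of SRAM. */
#define STACK_TOP (SRAM_ORIGIN + SRAM_LENGTH)

static_assert(STACK_TOP == 0x20005000u, "20k of SRAM ends at 0x20005000");

/* Hypothetical helper: true if addr lies inside [origin, origin + length). */
static bool in_region(uint32_t addr, uint32_t origin, uint32_t length)
{
    return addr >= origin && (addr - origin) < length;
}

/* True if addr is backed by flash memory. */
bool is_flash_address(uint32_t addr)
{
    return in_region(addr, FLASH_ORIGIN, FLASH_LENGTH);
}

/* True if addr is backed by SRAM. */
bool is_sram_address(uint32_t addr)
{
    return in_region(addr, SRAM_ORIGIN, SRAM_LENGTH);
}
```

With these numbers the last usable SRAM address is 0x20004FFF, and anything the linker places outside the two regions would end up at an address with no memory behind it.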
By using the “AT” keyword a load address can be specified, that is, data can be stored in one part of the binary and used from another address at run time. This way all data is flashed onto the flash memory, but some parts of it are relocated to SRAM after boot. This relocation is something that the startup routine needs to do.
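To make the difference between the load address and the run-time address concrete, here is a small sketch in plain C. It only models the relationship between the two addresses. The section_t type and load_address_of function are hypothetical names for this illustration, and the example addresses in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one relocated section. */
typedef struct {
    uint32_t load_addr; /* where the bytes are stored in flash (the "AT" address) */
    uint32_t run_addr;  /* where the program expects them at run time (SRAM) */
    uint32_t size;      /* section size in bytes */
} section_t;

/*
 * Map a run-time address inside the section back to the flash address
 * that holds its initial value. Returns 0 if the address is outside
 * the section.
 */
uint32_t load_address_of(const section_t *s, uint32_t run_time_addr)
{
    assert(s != NULL);
    if (run_time_addr < s->run_addr || run_time_addr - s->run_addr >= s->size) {
        return 0u;
    }
    return s->load_addr + (run_time_addr - s->run_addr);
}
```

For a 16-byte section stored at 0x08000400 and run from 0x20000000, a variable at 0x20000004 has its initial value stored at 0x08000404. The two addresses always differ by the same offset, which is why a simple copy loop is enough to relocate the section.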
The startup routine is needed for a few tasks. It defines the custom memory region with the vector table mentioned in the previous section, and it defines the symbols whose addresses fill the vector table’s entries so that they point to the exception handler functions. It also copies data from the flash memory to the faster SRAM and initializes some variables to zero (for example, global variables in C without an explicit initial value must start out as zero).
The image below shows the format of the vector table. The beginning of the startup routine defines this data segment.
The format of the vector table, taken from the Cortex-M3 Devices Generic User Guide. The vector table begins with the reset value of the stack pointer, followed by the addresses of the exception handlers.
The startup routine begins with some assembler directives that specify the syntax, the architecture and the use of the Thumb instruction set. Then the vector data segment is defined by allocating long data types contiguously. I’ve only included the generic vectors so far. More device-specific vectors need to be added later to implement useful hardware interrupts, such as an interrupt handler that executes code when a timer reaches a certain value.
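The same layout can be written down as a C struct, which makes the offsets easy to verify. This is only a sketch of the generic part of the table (the first 16 words), not the assembler source of the startup routine. The type and field names are my own for this illustration, and each entry is stored as a plain 32-bit word so that the code also compiles on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic part of the Cortex-M3 vector table, one 32-bit word per entry. */
typedef struct {
    uint32_t initial_sp;    /* 0x00: reset value of the stack pointer */
    uint32_t reset;         /* 0x04 */
    uint32_t nmi;           /* 0x08 */
    uint32_t hard_fault;    /* 0x0C */
    uint32_t mem_manage;    /* 0x10 */
    uint32_t bus_fault;     /* 0x14 */
    uint32_t usage_fault;   /* 0x18 */
    uint32_t reserved0[4];  /* 0x1C to 0x28 */
    uint32_t sv_call;       /* 0x2C */
    uint32_t debug_monitor; /* 0x30: debug */
    uint32_t reserved1;     /* 0x34 */
    uint32_t pend_sv;       /* 0x38 */
    uint32_t sys_tick;      /* 0x3C */
} vector_table_t;

static_assert(offsetof(vector_table_t, sv_call) == 0x2C, "SVCall offset");
static_assert(offsetof(vector_table_t, sys_tick) == 0x3C, "SysTick offset");
static_assert(sizeof(vector_table_t) == 0x40, "device interrupts start at 0x40");

/*
 * Hypothetical helper: build a table where every exception except reset
 * points at one default handler. Bit 0 of each handler address is set
 * because the handlers are Thumb code.
 */
vector_table_t make_vector_table(uint32_t stack_top, uint32_t reset_handler,
                                 uint32_t default_handler)
{
    uint32_t fallback = default_handler | 1u;
    vector_table_t t = {0};

    t.initial_sp = stack_top;
    t.reset = reset_handler | 1u;
    t.nmi = fallback;
    t.hard_fault = fallback;
    t.mem_manage = fallback;
    t.bus_fault = fallback;
    t.usage_fault = fallback;
    t.sv_call = fallback;
    t.debug_monitor = fallback;
    t.pend_sv = fallback;
    t.sys_tick = fallback;
    return t;
}
```

The device-specific interrupt vectors would follow directly after this struct, starting at offset 0x40.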
Next follows the actual startup program logic. I won’t go into the details of the assembler code. The program copies data to SRAM using the symbols from the linker script, initializes the variables in the bss segment to zero and then calls the main function. The main function is not defined yet and will be defined in a C program. The “_reset_handler” symbol referenced by the vector table is defined here, which means that whenever the reset handler is called the startup routine runs again. This is useful because the device will be in a predictable state when it resets.
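For readers who prefer C to assembler, the logic can be sketched roughly as follows. This is not the startup routine from the repository. It is a simplified model that takes the section boundaries as pointer arguments instead of reading linker symbols, so it can also be run on a PC, and the function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of the data segment from flash to SRAM. */
void copy_data(const uint32_t *load_start, uint32_t *run_start,
               const uint32_t *run_end)
{
    assert(run_start <= run_end);
    while (run_start < run_end) {
        *run_start++ = *load_start++;
    }
}

/* Set every word of the bss segment to zero. */
void zero_bss(uint32_t *bss_start, const uint32_t *bss_end)
{
    assert(bss_start <= bss_end);
    while (bss_start < bss_end) {
        *bss_start++ = 0u;
    }
}

/*
 * The two steps the reset handler performs before it calls main.
 * On the real device the arguments would come from the symbols
 * defined in the linker script.
 */
void startup_init(const uint32_t *flash_data_start,
                  uint32_t *data_start, uint32_t *data_end,
                  uint32_t *bss_start, uint32_t *bss_end)
{
    copy_data(flash_data_start, data_start, data_end);
    zero_bss(bss_start, bss_end);
}
```

Both loops work on whole 32-bit words, which assumes that the linker script aligns the segment boundaries to 4 bytes.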
The hello world program is written in C and will simply cycle an output pin between low and high states. I’ll be using a small LED connected to that pin, and it will be possible to see it blink when the program is running. Turning an LED on and off is the embedded development equivalent of hello world.
I’m including the library header file called “stm32f10x.h” since it contains preprocessor macros for register memory addresses and defines some helpful structs for working with registers. This makes the code far more readable, and no tedious copying of register addresses from the datasheet is required.
The code is simple. The general purpose input/output port C is first enabled and then pin 13 of the port is configured to be an output. In an infinite while loop pin 13 is repeatedly set high and then reset back to low between delays. The delay function is a simple loop that counts to a specified number.
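The register accesses behind those steps look roughly like this. It is a sketch and not the code from the repository: instead of including “stm32f10x.h” it defines a local stand-in struct (gpio_regs_t, a made-up name) that mirrors the order of the STM32F1 GPIO registers, and it takes the register addresses as arguments so that it also compiles on a PC. The output mode chosen here (push-pull, 2 MHz) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in that mirrors the STM32F1 GPIO register order. */
typedef struct {
    volatile uint32_t CRL;  /* 0x00: configuration for pins 0 to 7 */
    volatile uint32_t CRH;  /* 0x04: configuration for pins 8 to 15 */
    volatile uint32_t IDR;  /* 0x08: input data */
    volatile uint32_t ODR;  /* 0x0C: output data */
    volatile uint32_t BSRR; /* 0x10: bit set/reset */
    volatile uint32_t BRR;  /* 0x14: bit reset */
    volatile uint32_t LCKR; /* 0x18: configuration lock */
} gpio_regs_t;

static_assert(offsetof(gpio_regs_t, BSRR) == 0x10, "BSRR offset");

#define LED_PIN        13u
#define APB2ENR_IOPCEN (1u << 4) /* port C clock enable bit in RCC APB2ENR */

/* Enable the clock of GPIO port C. */
void enable_port_c_clock(volatile uint32_t *apb2enr)
{
    *apb2enr |= APB2ENR_IOPCEN;
}

/* Configure pin 13 as a push-pull output (assumed mode: 2 MHz). */
void configure_led_pin(gpio_regs_t *gpio)
{
    uint32_t shift = (LED_PIN - 8u) * 4u; /* four configuration bits per pin */
    uint32_t crh = gpio->CRH;

    crh &= ~(0xFu << shift); /* clear MODE and CNF for the pin */
    crh |= (0x2u << shift);  /* MODE = 10 (output, 2 MHz), CNF = 00 (push-pull) */
    gpio->CRH = crh;
}

/* Drive pin 13 high through the bit set register. */
void led_pin_high(gpio_regs_t *gpio)
{
    gpio->BSRR = 1u << LED_PIN;
}

/* Drive pin 13 low through the bit reset register. */
void led_pin_low(gpio_regs_t *gpio)
{
    gpio->BRR = 1u << LED_PIN;
}

/* Crude busy-wait; the volatile counter keeps the loop from being optimized away. */
void delay(uint32_t count)
{
    for (volatile uint32_t i = 0; i < count; i++) {
    }
}
```

On the microcontroller, the GPIOC and RCC definitions from the header take the place of the stand-ins, and the main function calls led_pin_high, delay, led_pin_low and delay in an infinite loop.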
Compiling and flashing the program
To compile the project the ARM-GCC toolchain is used. I’ve written a makefile that compiles all source files separately and then links them into an ELF file, which is then translated to a binary format that is ready to be flashed to the microcontroller. I’m using the “ST-Link” programmer, which has a command-line program called “st-flash” that is used to flash the microcontroller.
- make builds the project and produces the output .hex file that is ready to be flashed.
- make flash tries to flash the program to the microcontroller using the ST-Link utilities.
- make symbols outputs the symbols in the .elf file, which is useful for debugging.
- make debug tries to run the GNU debugger with the program.
The next step is creating a useful program. Take a look at my GitHub repository for code and libraries. I have continued this work and written:
- USART implementation
- I2C implementation
- USB HID implementation
- Use of hardware timers
- Asynchronous programs using interrupt handlers
- Setup with a 72 MHz system clock using PLL with a crystal oscillator
A more advanced project using all of these components and libraries is my USB inertial headtracker.