Saturday, September 12, 2009

Using a STM32 based board for Arduino Development

Exploring ways to enhance Arduino while still keeping the ease and experience.

This is work in progress, more details, updates and pictures coming soon. But the basic stuff works as described today.

Using Arduino boards and the Arduino GUI is a simple and fast way to develop embedded applications. There are predefined functions for input and output, serial communications and timing, many example sketches and the user does not have to worry about low level initialisation, interrupt vectors and timer configurations. On the other hand, ATMega 168 chips have limited resources, it is an 8 bit chip and not very fast.
Modern Cortex-M3 chips like the STM32 are fast, have much more RAM and flash and they have powerful and well documented debugging subsystems using JTAG tools, but starting to program these chips can be a huge step.

So here is my project to use a slightly modified, but from the users point standard Arduino IDE, standard Arduino sketches and run them on a powerful 32 bit Cortex-M3 (STM32) chip. The process of writing sketches, uploading them and then running them is exactly the same as for ordinary Arduino development. It is just that the processor board we use is not an Arduino. And you can use it all for your old Arduino boards also.

More complicated sketches that uses lowlevel access to the ATMega chip must be rewritten for this new platform.

The components we need:
  • Hardware.
  • Modified Arduino IDE.
  • Library code for this chip and board configuration.
  • Toolchain to compile and build code for a Cortex-M3 processor
  • An uplader that is used to program the chip.
The software setup has been developed and tested under Linux.

Hardware

The board I use is an ET-STAMP-STM32, a chip carrier module that brings out all chip i/o lines but not much more. I bought it for $24.90 from Futurlec (ET STM32 Stamp). It is mounted on a breadboard together with a 3.3V power supply, a serial USB adapter, a LED and some extra stuff for experimentation lika a potentiometer connected to an analog input and a push button.


The FTDI USB is only used to supply 5V to the small 3.3V regulator board. It can also be connected to UART2 or UART3.
The board is at the moment running a sketch that read the analog voltage from the potentiometer and adjusts the LED blink frequency:
volatile unsigned int count=-1;
int ledPin = 44;  // STM32_P103 Board - PC12
int dly;
int analogChn = 10;  // Analog channel 10 is PC0 = pin 32

void setup()
{
  pinMode(ledPin, OUTPUT);      // sets the digital pin as output }
  Serial.begin(115200);         // opens serial port, 31250bps (MIDI speed)
  Serial.write("\n\n   ***   Hello from Arduino 32   ***\n");
}

void loop()
{
  int k;
  count++;
  dly = analogRead(10)/2+20;
  digitalWrite(ledPin, HIGH);   // sets the LED off
  delay(dly);                  // waits for a second
  digitalWrite(ledPin, LOW);    // sets the LED on
  dly = analogRead(10)/2+20;
  delay(dly);                  // waits for a second  
  if (count%100 ==0)
  {
    Serial.print(count);    
    Serial.write("\n");
  }   
} 
So you can see that the sketch is a totally standard Arduino sketch.

Modified Arduino IDE.

The Arduino IDE is modified in order to be able to build code with the ARM toolchain. The modified files and some new files that are added can be found at http://github.com/mlu/arduino-stm32/tree/master . These files must be placed in the source tree for Arduino 0017, and then the IDE must be rebuilt.

Library code for this chip and board configuration.

The special code for this chip and board are found under hardware/stm32 and to use this you just have to select the board "STM32 Arduino32" in the Tools menu in the IDE.

Toolchain to compile and build code for a Cortex-M3 processor

I use the Codesourcery G++ Lite Edition for ARM, EABI version, that can be downloaded from the Codesourcery website (Codesourcery G++ Lite ).
Install this and make sure that the top bin directory containing "arm-none-eabi-gcc" and the rest of the cross compilation binaries are in the path.

Uploader

I have written a small uploader, using similar command line arguments as avrdude, that uploads the compiled and linked sketches to the processor.
The code and also a compiled binary that runs under Fedora 10 are included in the files at github.com/mlu/arduino-stm32.

Almost ready to rock

The first thing to notice is that the pin numberings are different
Arduino sketch              Board
digital pin number 0..15    pin PA0 .. PA15
analog pin number 0..7      pin PA0  .. PA7
analog pin number 8,9       pin PB0,PB1
analog pin number 10..15    pin PC0  .. PC5
Pins 9 and 10 (PA9 and PA10) are used by USART1 and are connected to the RS232 level shifter for the serial port and should not be used.
 
We must also manually switch the board between bootloader mode and normal run mode with the blue switch, and also do all resets manually with the reset switch.

Testing Blink

Blink uses the LED on pin number 13, This corresponds to pin PA13 so we add a LED and a 220 ohm serial resistor to PA13. Connect a serial adapter to the board and select the corresponding serial port in the Arduino IDE.
Load the Example/Digital/Blink sketch and select board type "STM32 Arduino 32". Set the board in bootloader mode by depressing the blue button, the green bootloader LED lights up, and reset the board. Now it should be possible to simply upload the sketch :), go back to normal run mode with the blue button and reboot with the reset button.

Now this is new and quite untested code so many things could go wrong when trying to do this on a computer with a different setup.

More to follow, especially with reader feedback

Saturday, September 5, 2009

Debugging on the Cortex-A8, System Components

Cortex-A8 for dummies, part 1

I have been working on the Cortex-A8 subsystem for OpenOCD for some time this summer. This is great fun, when there is enough time, but also takes a lot of work. It is a complicated system and the documentation is large, spread over several big TRM's and sometimes hard to grasp. So here comes the "Cortex-A8 Debugging for Dummies" version 0.0. This first post looks at the main components involved and the access methods.

You can also take a look at OpenOCD for the BeagleBoard at
http://elinux.org/BeagleBoardOpenOCD

The new members of the ARM family of processor cores, the Cortex-M3 and the Cortex-A8, share some features but in many fundamental ways they are not similar. They both support the Thumb2 instruction set and they both use the ARM Debug Interface v5 for debugging and direct access to the core debug units and the AHP and APB buses. Cortex-M3 processors also use the NVIC interrupt controller for handling peripheral interrupts. This makes debugging easier and interrupt handling more consistent when using processors from different suppliers. Cortex-A8 processor have implementation defined handling of external interrupts.

Architecturally there are big differences. The Cortex-M3 uses the ARMv7M profile with a very simplified set of processor modes and a reduced set of shadow registers. Cortex-M3 can only run Thumb2 code. This is in contrast to the Cortex-A8 that can run ARM, Thumb and ThumbEE instructions, and where the architecture is a variant of the standard ARM architecture seen in ARM7, ARM9 and ARM11 cores.

System Components

Here is a simplified picture of a Cortex-A8 system, based on the Texas Instrument OMAP3530 Applications Processor, used in the BeagleBoard. There are four main components involved in our picture of the system:
  • Debug Access Port, connects an external debugger through JTAG or SW, serial wire, to our system. The DAP can have one or several Access Ports, AP, that connects to different parts of the system.
  • MPU, Microprocessor Unit Subsystem. Here we find the processor core, core registers, system coprocessor CP15, Memory Management Unit, L1 and L2 Cache.
  • A Core Debug Unit that can pass data and instructions directly to the processor core, and also halt and resume the processor. This is connected to the AP through a local memory bus, an Advanced Peripheral Bus, APB.
  • External high speed bus, this is the L3 bus and it is an implementation of the Advanced High Speed Bus, AHB,  specified by ARM. Peripheral components and memory are connected to this bus.

Main Components for Core Debug System with one APB and one AHB MEMAP


In this example we have two access ports, both of them access system resources by a local memory address space, so called memory mapped access ports or MEMAP. The access ports are marked with the type of bus it is connected to. The APB-AP is connected to the debug resources without going through the system AHB bus. Each MEMAP have a access port number to identify it in the DAP. For the OMAP3530 processor the AHB-AP is number 0 and the APB-AP is number 1. 

A debug program like OpenOCD connects through JTAG to the DAP, and all communications are passed through an access port into the system. When using OpenOCD we can select which access port to use with the command:
>dap apsel n
Here n is 0 or 1 for a system with two AP's. After selecting a MEMAP we can access the memory space of the AP with the memory display word, mdw, and memory write word, mww, commands:
>dap apsel 1
>mdw 0x80000000
>dap apsel 0
>mdw 0x80000000
>mww 0x80000000 0x12ab34cd
>mdw 0x80000000
If there is no memory at the specified address or if the access is prohibited by the security manager a "Sticky Error" is generated by the AP and reported by OpenOCD. We can get more information about the type of resources a MEMAP is connected to by using the command
>dap info n
This information is encoded by the AP in a so called ROM Table. The details of how to identify system resources from  a ROM Table will be explained in a later post.

The APB AP, with MEMAP access port number 1, is reported as "MEMTYPE system memory not present. Dedicated debug bus."This indicates that only CoreSight components and other debug resources identified in the ROMTABLE for this AP are accessible. Memory and memory mapped peripheral control registers are not available.

The ADP v5 Debug Port is described in ARM IHI 0031A (ARM Debug Interface v5).

AHB and APB buses are described in ARM IHI 0011A (AMBA Specification Rev. 2.0).

The Debug Unit, the debug registers, MPU and the cache systems are described in ARM DDI 0344H (Cortex-A8 TRM).

The system overview and relation between the components can also bee sen in OMAP35x Technical Reference Manual (spruf98b.pdf) Figure 1-1. Interconnect Overview.

Core and Memory Access

The core is accessed through the APB MEMAP using the communications registers in the debug unit DTRRX, DTRTX and ITR.

Access to the full memory address range of the system can be done through the AHB AP to the L3 bus and is controlled by the security and access restrictions that are active in the system. This access always uses physical memory addresses (PA). There is risk for cache coherency problems when accessing memory this way, for volatile resources like I/O registers the problem should be less. This access can be done while the MPU Core is running.

When MMU is active all memory addresses in the core like PC, LR and data pointers are virtual addresses and must be translated to physical addresses before accessing them through the AHB AP.

Another way to access memory and memory mapped resources from the debug port is through the MPU using LDR and STR instructions written to the ITR and data in the DCC registers, DTRRX and DTRTX. [Cortex_A8 TRM, sec 12.11.6]. This access will go through the Cache and Memory Management Units of the Cortex_A8 and thus use virtual addressing when this is activated.

Strategies

For configuring systems at start up, such as setting memory controller parameters, clocks and PLL registers, memory access through the AHB access port is good. This should also work well for writing to flash memories.

For debugging code running on the MPU, the APB and access through the MPU core probably should be used, since this method avoids problems with virtual to physical address translations and also helps avoid cache coherency problems.

A debug system should implement both access methods and some method to choose which one to use.


Acknowledgements

Many thanks to Dirk Behme for proof reading and helpful comments.