Friday, August 28, 2015

Getting started with the STM32F7 Discovery board

The other week I got a brand new STM32F7 Discovery board. The fabled Cortex M7 had arrived and performance bliss was at hand ...

... except there were a few minor stumbling stones on the road to paradise ...

This is not an ad for ST; I did pay the full Mouser price.

  • ST programming tools for the ST-Link interface are not available for Mac OSX, and the board does not expose the JTAG/SWD of the Cortex M7; it is only available through the onboard ST-Link adapter chip. The GCC ARM Embedded toolchain https://launchpad.net/gcc-arm-embedded/+download was happy to generate code after importing some system defines and startup code from the STM32Cube_FW_F7_V1.1.0 library and a bit of tweaking to Makefiles and linker scripts. It was possible to flash the binaries to the board using the USB file system interface, but then there is no debug connection, and it refused to load code that was linked to use fast ITCM flash access.
  • With some work it was possible to add the chip and flash configuration details to the free stlink utility https://github.com/texane/stlink and also tweak the stlink RAM based flash writing code to handle the 64 bit flash access so flash is correctly written, all running on Mac OSX.
  • The example code in the STM32Cube_FW_F7_V1.1.0 library does not by default use the external 25MHz crystal clock, only the on-chip internal 16MHz oscillator. The PLL is also not configured by default, meaning the chip runs at 16MHz, and not the 196 or 216 MHz that should be possible.
  • For optimal performance the linker settings should be configured to make the code use the ART accelerated ITCM flash access at 0x00200000 instead of the normal, slow, flash access at 0x08000000. When the linker was configured this way the ST onboard USB flash loader refused to load code into flash, but the stlink utility worked.
  • The external SDRAM for the Discovery board has a different configuration than the one in the "system_stm32f7xx.c" supplied from ST. It seems ST has used the settings for another STM32F7 board and not been careful to make the changes for the F7 Discovery board.
The virtual USB COM port was quite easy; it turns out that the USB subsystem is almost identical to the one on STM32F4 chips.
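The clock figures above can be checked with a little arithmetic. The sketch below assumes the usual STM32 main PLL structure (input divider M, VCO multiplier N, output divider P for the system clock); the function name and the divider values are illustrative assumptions and are not taken from the Cube example code.

```c
#include <stdint.h>

/* Output frequency of a PLL with input divider m, multiplier n and
 * output divider p:  f_out = f_in * n / m / p.
 * Hypothetical helper for checking divider choices on paper; it does
 * not touch any hardware registers. */
uint32_t pll_output_hz(uint32_t f_in_hz, uint32_t m, uint32_t n, uint32_t p)
{
    uint64_t vco = (uint64_t)f_in_hz * n / m;   /* VCO frequency */
    return (uint32_t)(vco / p);
}
```

With the 25 MHz crystal, M = 25, N = 432 and P = 2 give 216 MHz, and dividing the same VCO frequency by 9 gives the 48 MHz needed for USB.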

After these tweaks I was able to run some simple tests where the chip executed between 200 and 300M instructions/sec. I think this can be a great chip for audio synthesis.

Next will be the big LCD display, but that must wait for when I have lots of free time again.

Monday, August 10, 2015

More USB and MIDI

Time goes fast, new and more powerful microprocessors are introduced, and the favourites of a year ago are fading and starting to collect dust in a drawer filled with useful but not quite exciting boards. These days the Cortex M3 of the Maple board feels a bit old. The Teensy 3.1 is a great board and the Arduino type libraries and support are superb, but it has somewhat limited memory and no hardware floating point. The boards I use mostly for audio generation at the moment are the Netduino plus 2 (without the Netduino bootloader and .Net libraries), the STM32F4 Discovery and the XMC4500 Relax Kit Lite. All of these have Cortex-M4F processors, more than 128KB RAM and 1024KB of flash, and they are all cheap and good value for money.

I need stable USB MIDI, and after wrangling with the device libraries and development platforms, with their endless hierarchies of folders and files that are usually targeted at Windows and often lack USB MIDI, I decided to write yet another lightweight USB implementation. I know this might sound stupid and a waste of effort, but having some free time due to vacation and a rather rainy summer, I went ahead.

The code is written from scratch and uses only some MCU specific header files borrowed unchanged from the manufacturer supplied code libraries. When time allows and I have done some more testing, a few minimal example programs will be added.

I have uploaded the code to my Cortal Dendrites repository on Github:

https://github.com/mlu/cortal_dendrites/tree/master/cortex_m/usblib

Happy Coding :)

Friday, May 22, 2015

Computing fractional multipliers and divisors using continued fractions.

This blog post will discuss how to find the best rational approximation with bounded numerator or denominator to a preferred frequency ratio. This will be a bit mathematical, but nothing more than standard addition, subtraction, multiplication and division will be used. Some example code is given at the end.

Fractional baudrate generation

Some microprocessors use fractional multipliers and divisors to generate accurate baud rates from system clocks that are not an integer multiple of the bitslice clock frequency. The bitslice frequency is the baudrate times an oversampling count.

\[ f_{bitslice} = oversample \cdot baudrate = ratio \cdot f_{sysclock} \tag{1} \]

An example is the USIC in the Infineon XMC1100 and XMC4500 family microprocessors. Here the clock for the bitslices is derived from the system clock, slightly simplified, as 

\[ f_{bitslice} = (DCQT+1) baudrate = \frac 1 {(PDIV+1)} \cdot \frac {STEP} {1024} \cdot f_{sysclock} \]

DCQT gives the oversampling count, typically set to 15. PDIV and STEP are register values controlling the frequency division. On the XMC1100 PDIV and STEP are 10 bit wide so they have values from 0 to 1023. Rearranging the above we see that we must find a good approximation

\[ \frac {STEP} {(PDIV+1)} \approx \frac {(DCQT+1) baudrate \cdot 1024} { f_{sysclock} } \tag{2} \]

with values of PDIV and STEP between 0 and 1023.

Example: 38400 baud from a 32MHz clock and 16 times oversampling

We enter the values in (2) to find

\( \frac {STEP} {(PDIV+1)} = \frac {16 \cdot 38400 \cdot 1024} { 32000000 } = \frac {629145600} { 32000000 }  = 19.6608 \) 

Working by hand we would now factor out common factors from numerator and denominator, but for computer implementations that is unnecessary extra work. As we shall see the algorithm works as well without that step.
Now we calculate the signed integer division with remainder using the rounded value 20 as quotient.

\( \frac {16 \cdot 38400 \cdot 1024} { 32000000 } = 20 - \frac {10854400} { 32000000 }  = 20 - \frac {1} {  \frac { 32000000 } {10854400}  } \tag{3} \) 

Rewrite the denominator in the last fraction by carrying out the division with remainder

\(  \frac { 32000000 } {10854400}  = 3 - \frac {563200}{ 10854400 } \approx  2.948 \approx 3  \) 

We enter this value 3 in the previous formula (3) to get the approximation

\[ \frac {16 \cdot 38400 \cdot 1024} { 32000000 } \approx 20 - \frac {1} { 3 } = \frac {59} {3} = 19.6667 \] 

This is good, after one iteration the error is less than 0.008 or about 300ppm.
We can improve this by using a more precise value for the denominator:

\(\frac { 32000000 } {10854400} = 3 - \frac {563200}{ 10854400} = 3 - \frac 1 { \frac { 10854400 }{563200}}  = 3 - \frac 1 {19 + \frac {153600 }{563200}}  \approx 3 - \frac 1 { 19 } = \frac {56}{19}\)
Inserted into (3) this gives

\( \frac {16 \cdot 38400 \cdot 1024} { 32000000 } \approx 20 - \frac {1} {  \frac {56}{19} } = 20 - \frac {19} {56} =  \frac {1101} {56} = 19.6607  \)  

This approximation is better, the error is about 4ppm, but the STEP value 1101 is too large for the 10 bit range, so we keep the previous approximation of 59/3.
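The error figures for a candidate pair can be checked directly from the simplified USIC formula. The following is a small sketch with illustrative function names, where a fraction STEP/(PDIV+1) such as 59/3 corresponds to STEP = 59 and PDIV = 2.

```c
#include <stdint.h>

/* Baud rate produced by the simplified USIC formula
 *   baud = f_sys * STEP / (1024 * (PDIV + 1) * (DCQT + 1))
 * Function and parameter names are illustrative. */
double usic_actual_baud(uint32_t f_sys_hz, uint32_t step, uint32_t pdiv,
                        uint32_t dcqt)
{
    return (double)f_sys_hz * step / (1024.0 * (pdiv + 1) * (dcqt + 1));
}

/* Relative error in parts per million against the requested baud rate. */
double baud_error_ppm(double actual, double wanted)
{
    return (actual - wanted) / wanted * 1e6;
}
```

For the 32MHz example, STEP = 59 and PDIV = 2 give roughly 38411 baud, about 298ppm too fast. The out-of-range pair STEP = 1101, PDIV = 55 would be only a few ppm slow.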

Best rational approximation with continued fractions

In order to find a general method to calculate good rational approximations as in the previous example we use the theory of convergents (truncated continued fraction expansions) and best rational approximation theory. This is a well developed general theory for finding the best rational approximations to a number with a given size of the denominator. There are recursive algorithms for calculating successively better approximations.

So we are given a rational number p/q and we want to find successive approximations \( a_n/b_n \) to p/q with smaller values for \( a_n \) and \(b_n \) than p and q. For baud rate generation we want the best approximation where \( a_n \) and \(b_n \) fit in the respective fractional multiplier and divisor registers.

The approximations come from truncating the continued fraction expansion of p/q:

\[ \frac p q = c_0 + \frac {1}{c_1 + \frac {1}{c_2 + \ddots \frac {1}{c_n + \frac {1} {r_{n+1}}}}} \approx  c_0 + \frac {1}{c_1 + \frac {1}{c_2 + \ddots \frac {1}{c_n }}}  = \frac {a_{n+1}}{b_{n+1}} \]

For the first few values of n we get:

\[ a_{-1} =0,  b_{-1}=1 \]
\[ a_0 =1,  b_0=0 \]
\[  \frac {a_{1}}{b_{1}}  =  c_0 = \frac {0 + 1 \cdot c_0}{1 + 0 \cdot c_0}  = \frac {a_{-1} + a_0 \cdot c_0}{b_{-1} + b_0 \cdot c_0}   \]
\[ a_1 =c_0,  b_1=1 \]
\[  \frac {a_{2}}{b_{2}}  =  c_0 +  \frac {1}{c_1} =  \frac {c_0 \cdot c_1 + 1}{c_1} = \frac {a_{0} + a_1 \cdot c_1}{b_{0} + b_1 \cdot c_1}   \]
\[  \frac {a_{3}}{b_{3}}  =   \frac {a_{0} + a_1  ( c_1+ \frac 1 {c_2})}{b_{0} + b_1 ( c_1+ \frac 1 {c_2})} = \frac {(a_{0} + a_1  c_1 ) c_2+ a_1}{(b_{0} + b_1  c_1 ) {c_2}+  b_1} = \frac {a_1 + a_{2}{c_2}}{b_1+b_{2}{c_2}} \]
We now can see the recursion formula develop
\[  \frac {a_{4}}{b_{4}}  =   \frac {a_{1} + a_2  ( c_2+ \frac 1 {c_3})}{b_{1} + b_2 ( c_2+ \frac 1 {c_3})} = \frac {(a_{1} + a_2  c_2 ) c_3+ a_2}{(b_{1} + b_2  c_2 ) {c_3}+  b_2} = \frac {a_2 + a_{3}{c_3}}{b_2+b_{3}{c_3}} \] 

\[  \frac {a_{n+1}}{b_{n+1}}  =   \frac {a_{n-2} + a_{n-1}  ( c_{n-1}+ \frac 1 {c_n})}{b_{n-2} + b_{n-1} ( c_{n-1}+ \frac 1 {c_{n}})} = \frac {(a_{n-2} + a_{n-1}  c_{n-1} ) c_n+ a_{n-1}}{(b_{n-2} + b_{n-1}  c_{n-1} ) {c_n}+  b_{n-1}} = \frac {a_{n-1} + a_{n}{c_n}}{b_{n-1}+b_{n}{c_n}} \] 
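The recursion can be tried out on a given list of coefficients. The sketch below uses a hypothetical helper name and has no overflow checks. The worked example earlier, written as 20 + 1/(-3 + 1/19), corresponds to the coefficients 20, -3, 19.

```c
#include <stddef.h>
#include <stdint.h>

/* Apply a_{n+1} = a_{n-1} + a_n * c_n (and the same for b) over the
 * coefficients c[0..count-1]. On return *a / *b is the last convergent.
 * Hypothetical helper, no overflow checks. */
void convergent(const int32_t *c, size_t count, int32_t *a, int32_t *b)
{
    int32_t a_prev = 0, b_prev = 1;   /* a_{-1}, b_{-1} */
    int32_t a_cur = 1, b_cur = 0;     /* a_0, b_0 */
    for (size_t n = 0; n < count; n++) {
        int32_t a_next = a_prev + a_cur * c[n];
        int32_t b_next = b_prev + b_cur * c[n];
        a_prev = a_cur;  b_prev = b_cur;
        a_cur = a_next;  b_cur = b_next;
    }
    *a = a_cur;
    *b = b_cur;
}
```

With the coefficients 20, -3, 19 this steps through 20/1, -59/-3 and -1101/-56, which are the fractions 20, 59/3 and 1101/56 from the example with both signs flipped.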


The algorithm

We start with a rational number r = p/q.  We will generate a sequence of five values \(a_n, b_n, c_n, p_n\) and \(q_n\). The quotients \(a_n/b_n\) will be our successive approximations to r. The values \(r_n\) are defined as \(p_n/q_n\) but they need not be explicitly calculated; the formula for \(r_n\) is used as the template for how \(p_n\) and \(q_n\) are updated.

Startup:
\( r_0 = r  \)
\( p_0 = p,  q_0 = q \)
\( a_{-1} =0,  b_{-1}=1 \)
\( a_0 =1,  b_0=0 \)
 
Loop until \(q_{n+1}\) is 0 or \(a_n\) or \(b_n\) are too large
\( c_{n} = round(r_{n}) = round ( p_{n}/q_{n} ) \)
\( r_{n+1} = 1/(r_n-c_n) \)    This is calculated in terms of \(p_n\) and \(q_n\)
\( p_{n+1} = q_n \)
\( q_{n+1}=p_{n}-c_n \cdot q_n \)
\( a_{n+1}=a_{n-1}+a_{n} \cdot c_{n} \)
\( b_{n+1}=b_{n-1}+b_{n} \cdot c_{n} \)

The rounding operations can be done downwards, keeping all values positive, or towards the nearest integer, improving precision but also introducing negative numbers and extra complexity. Note that the \(c_n\) value is calculated before updating \(p_n\) and \(q_n\) and is not remembered and used in the next iteration but recalculated.
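To illustrate the downward rounding option, here is a minimal sketch that keeps everything unsigned. It assumes p >= 0 and q > 0, checks only the numerator limit and has no overflow checks; the function name is made up for this example.

```c
#include <stdint.h>

/* Continued fraction approximation of p/q with downward rounding.
 * Stops when the remainder is zero or the next numerator would exceed
 * alim; the last accepted convergent is returned in *ares / *bres.
 * *bres is 0 if not even the first convergent fits. */
void cfrac_floor(uint32_t p, uint32_t q, uint32_t alim,
                 uint32_t *ares, uint32_t *bres)
{
    uint32_t a_prev = 0, b_prev = 1;  /* a_{-1}, b_{-1} */
    uint32_t a_cur = 1, b_cur = 0;    /* a_0, b_0 */
    while (q != 0) {
        uint32_t c = p / q;           /* c_n = floor(p_n / q_n) */
        uint32_t a_next = a_prev + a_cur * c;
        uint32_t b_next = b_prev + b_cur * c;
        if (a_next > alim)
            break;                    /* keep the previous convergent */
        uint32_t rem = p - c * q;     /* q_{n+1} */
        p = q;                        /* p_{n+1} */
        q = rem;
        a_prev = a_cur;  b_prev = b_cur;
        a_cur = a_next;  b_cur = b_next;
    }
    *ares = a_cur;
    *bres = b_cur;
}
```

For 629145600/32000000 and a numerator limit of 1023 this steps through 19/1, 20/1, 39/2 and 59/3 before rejecting 1101/56, so it takes four accepted steps where rounding to the nearest integer reaches 59/3 in two.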

A skeleton C implementation

The following code lacks error handling and only checks the limit for the numerator, but it illustrates the algorithm. It does lack a few comments, but follows the algorithm structure closely.


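#include <stdint.h>

/* Approximate p/q by a fraction ares/bres whose numerator stays within +-alim */
void cfractr(int32_t p, int32_t q, int32_t alim, uint32_t *ares, uint32_t *bres) {
    int32_t ap = 0;    /* a_{n-1}, starts as a_{-1} */
    int32_t a1 = 1;    /* a_n, starts as a_0 */
    int32_t bp = 1;    /* b_{n-1}, starts as b_{-1} */
    int32_t b1 = 0;    /* b_n, starts as b_0 */
    int32_t cn, anext, bnext, pnext;
    while (1) {
        /* Signed rounded rational division */
        if (((q>0)&&(p>0))||((q<0)&&(p<0)))
            cn = (p+q/2)/q;
        else
            cn = (p-q/2)/q;
        /* Next convergent a_{n+1}/b_{n+1} and remainder */
        anext = ap + cn*a1;
        bnext = bp + cn*b1;
        pnext = p-cn*q;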
        /* Exact value, remainder is 0, break */
        if (pnext == 0) {
            a1 = anext;
            b1 = bnext;
            break;
        }
        /* Numerator too large, break */
        if ((anext<-alim)||(anext>alim)) {
            break;
        }
        /* Shift one step before next iteration */
        ap = a1;
        bp = b1;
        a1 = anext;
        b1 = bnext;
        p = q;
        q = pnext;
    }
    if (a1<0){a1=-a1;b1=-b1;}
    *ares = a1;
    *bres = b1;
}
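As a sketch of how this could be applied to the USIC example, the variant below builds p and q from equation (2), limits both numerator and denominator, and maps the result to register values. The function name, the return convention and the use of 64 bit intermediates are my assumptions for illustration.

```c
#include <stdint.h>

/* Find STEP and PDIV so that STEP/(PDIV+1) approximates
 * (DCQT+1) * baud * 1024 / f_sys, following equation (2).
 * Returns 0 on success and -1 if no convergent fits the registers. */
int usic_find_divider(uint32_t f_sys_hz, uint32_t baud, uint32_t dcqt,
                      uint32_t *step, uint32_t *pdiv)
{
    const int64_t step_max = 1023;  /* 10 bit STEP */
    const int64_t div_max = 1024;   /* PDIV + 1 with a 10 bit PDIV */
    int64_t p = (int64_t)(dcqt + 1) * baud * 1024;
    int64_t q = f_sys_hz;
    int64_t ap = 0, a1 = 1, bp = 1, b1 = 0;

    if (q == 0)
        return -1;
    for (;;) {
        /* Round p/q to the nearest integer, for either sign */
        int64_t cn = ((p < 0) == (q < 0)) ? (p + q / 2) / q
                                          : (p - q / 2) / q;
        int64_t anext = ap + cn * a1;
        int64_t bnext = bp + cn * b1;
        int64_t rem = p - cn * q;
        int64_t amag = anext < 0 ? -anext : anext;
        int64_t bmag = bnext < 0 ? -bnext : bnext;
        if (amag > step_max || bmag > div_max)
            break;                  /* keep the previous convergent */
        ap = a1;  bp = b1;
        a1 = anext;  b1 = bnext;
        if (rem == 0)
            break;                  /* exact value reached */
        p = q;
        q = rem;
    }
    if (a1 < 0) { a1 = -a1; b1 = -b1; }
    if (b1 <= 0)
        return -1;                  /* nothing usable was found */
    *step = (uint32_t)a1;
    *pdiv = (uint32_t)(b1 - 1);
    return 0;
}
```

For 38400 baud from a 32MHz clock with DCQT = 15 this returns STEP = 59 and PDIV = 2, the 59/3 approximation from the worked example.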



   

Monday, July 8, 2013

Breath control revisited

I have been using my breath control and openpipe breakout for two months now and it makes for a really enjoyable instrument. There are some things to develop. First, the hardware pipe and pressure sensor compartment should be redesigned; the current one is a quick hack. So here come some notes on designing a new mouthpiece. In some weeks' time I hope to be able to add some sound examples and also describe some of the programming.

Designing a mouthpiece

The mouthpiece should allow some air to pass through while playing to make breathing more natural, but also stop the airflow enough for a clearly measurable pressure to build up. To avoid moisture on the sensor board I place the sensor in a compartment after the exhaust hole so that the air stream does not pass directly over the sensor board. Closing the exhaust hole and just using the pressure makes the end of notes sound bad since the pressure doesn't drop cleanly when you stop blowing. The best option is probably to make the size of the exhaust hole adjustable and to let the player decide.

The air pressure in a recorder mouthpiece varies between 200 and 1000Pa depending on the note played, with high notes having more pressure. The difference in pressure between pp and ff (quiet and loud) is about 200Pa. These numbers can be found in Modeling of Gesture-Sound Relationships in Recorder Playing: A Study of Blowing Pressure, a master's thesis by Leny Vinceslas.
An exhaust hole with 3-4mm diameter gives this kind of pressure on the sensor and feels quite nice to play. I will test more with different sized exhaust holes, how hard to blow and how the pressure varies on the sensor. 

Here is my design sketch for the next version of the breath sensor mouthpiece.  I have found very cheap nylon tubing used for electrical installation work that fits snugly around the Open Pipe. I am fairly confident this can be built at home with simple tools; the only remaining part is the silicone rubber film. It can be bought 0.3 mm thick 50x50 cm from Germany for 90 euros, a bit much money but it's probably enough for more than 600 such mouthpieces  ( I might find some use for a lot of silicone rubber film :) ).

The sensors

BMP085
Reading both temperature and pressure and calculating the calibrated values takes around 11ms; this time is mostly spent waiting for the chip to complete a conversion.  With careful programming other calculations and sampling of the touch sensors can be done during this wait time. A breakout board can be found for around $15.

MPL3115A2
This sensor seems to have as good or better performance than the BMP085, with a faster sampling rate. The calibration and temperature compensation are done in the sensor ASIC, so the convoluted calculations needed for the BMP085 are not needed. I have ordered a breakout board for testing.

A further enhancement would be to use a very open mouthpiece and sense both pressure in the middle of the airstream and total flow; this would correspond more closely to playing a flute. I am not sure what sensors to use for this and how to mount them.

Relation between pressure, tone height and volume

Using the data in L. Vinceslas' work I set up a table of the normal pressure used to play the different notes at medium volume.  This value is used as baseline for the note, corresponding to midi volume 64. This means that like in a real flute or recorder, in order to keep a constant volume, the pressure must increase as we play higher notes.

    int volume;
    int midpressure = note_pressure[note-60];


    volume = 64 + ((pressure - midpressure)*psensitivity)/128;
    if (volume < 0) volume = 0;
    if (volume > 127) volume = 127;

This code fragment shows the midi volume calculation; psensitivity gives the sensitivity to pressure variation around the standard note_pressure from the table. A value of around 15-20 seems to work quite well. In my test sketch I have assigned this value to a CC controller so it can be changed dynamically while playing.

This has been tested and it is easy to dynamically control the expression of the sound.

Using the pressure to control the octave of the note played

If the pressure is more than 2/3 of the way up to the pressure of the note one octave higher than the one fingered, then the scale is shifted one octave up. If it later drops below 2/3 of the difference back down toward the original note, the scale is shifted back.  This code is still in planning.
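Since the code is not written yet, the following is only a sketch of one possible reading of this rule, with hysteresis so the octave does not flicker near a threshold. The function name, the parameters and the exact thresholds are assumptions.

```c
#include <stdbool.h>

/* Decide whether the scale should be shifted one octave up.
 * base_pressure:  table pressure for the fingered note
 * upper_pressure: table pressure for the note one octave higher
 * shifted:        current state (true = already shifted up)
 * Assumed thresholds: shift up above base + 2/3 of the difference,
 * shift back below upper - 2/3 of the difference. */
bool octave_shift_update(int pressure, int base_pressure, int upper_pressure,
                         bool shifted)
{
    int diff = upper_pressure - base_pressure;
    int up_threshold = base_pressure + (2 * diff) / 3;
    int down_threshold = upper_pressure - (2 * diff) / 3;

    if (!shifted && pressure > up_threshold)
        return true;
    if (shifted && pressure < down_threshold)
        return false;
    return shifted;   /* inside the hysteresis band: keep the state */
}
```

With table pressures of 300 and 600 Pa, for instance, the scale would shift up above 500 Pa and shift back only below 400 Pa.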

Detecting the start of a note

The program recognises the start of a note when the pressure has been more than 50 Pa above ambient for three sample periods (30ms). This is the number of samples needed for the pressure to reach its peak value so that the midi note volume can be calculated. If aftertouch, channel pressure or the expression continuous controller is active then this may be decreased at the risk of losing the initial attack.
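The start detection described here can be sketched as a small state machine fed once per sample period. The type and function names are hypothetical, and the ambient pressure is assumed to be measured separately.

```c
#include <stdbool.h>
#include <stdint.h>

#define ONSET_THRESHOLD_PA 50   /* pressure above ambient, from the text */
#define ONSET_SAMPLES      3    /* consecutive samples, about 30 ms */

typedef struct {
    int above_count;   /* consecutive samples above the threshold */
    int32_t peak;      /* highest pressure seen in the current run */
} onset_state;

/* Feed one pressure sample (Pa). Returns true exactly once per note, on
 * the sample that completes the run; *peak_out then holds the peak
 * pressure that can be used to calculate the note volume. */
bool onset_update(onset_state *s, int32_t pressure, int32_t ambient,
                  int32_t *peak_out)
{
    if (pressure - ambient <= ONSET_THRESHOLD_PA) {
        s->above_count = 0;         /* back to silence */
        s->peak = 0;
        return false;
    }
    if (s->above_count >= ONSET_SAMPLES)
        return false;               /* note already running */
    if (pressure > s->peak)
        s->peak = pressure;
    if (++s->above_count < ONSET_SAMPLES)
        return false;
    *peak_out = s->peak;
    return true;
}
```

Lowering ONSET_SAMPLES gives a faster response, at the risk mentioned above of missing the peak of the initial attack.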

Thursday, July 4, 2013

Adventures with the Terasic DE0 Nano

I have for a long time been fascinated by the idea of programmable logic as a complement to standard MCUs. Ideas like running 32 pwm channels and as many quadrature detectors on one chip for servo control are definitely beyond today's MCUs, powerful as they are.

I have previously played a bit with the Terasic Trac C1 and the Dallas Logic Quickgate EP2C8 Cyclone II boards, trying to learn VHDL and how to build things like an audio synthesizer with them.  So when I saw the Terasic DE0 Nano I simply couldn't resist the urge to buy one. At €74 from Mouser it is not dirt cheap, but for an FPGA board of this kind it is very good value.

Designing FPGA logic is quite different from ordinary C/C++ microprocessor programming. The best book I have found to help me is "RTL Hardware Design Using VHDL" by Pong P. Chu.

So after reviving some old VHDL projects I started to install the Quartus software on my Fedora 18 system. Quartus 13 refused to run without frequent crashes even after I changed and added several system libraries to conform to the ones coded into the Quartus 13 executables. After this I tried installing Quartus Free Web Edition 11, and it seems to run perfectly,  this might be because of the changes done to make Q 13 run, or not, but at the moment it works. Older Quartus versions can be found at  ftp://ftp.altera.com/outgoing/release/.

Most of the getting started manuals for complex systems like this tell you to install some precoded development package and just click menu boxes in a specified sequence without giving the logic behind it. For me this is not really learning a new tool. So I try to build small things from scratch to see what happens before using the heavyweight preprogrammed IP in the component libraries.



Right now I have a Serial Port echo running on the DE0 Nano that displays incoming serial bytes on the 8 LEDs and then echoes them back; the small chip is a Teensy 3 that acts as a Serial-USB bridge. It's not very advanced yet, but writing the logic from scratch is fun and rewarding.  Next step is SPI and some PWM.


Thursday, May 9, 2013

OpenPipe and breath control


I have been playing around with the OpenPipe Breakout, the electronic pipe/flute control, for a few weeks now, trying to revive some old and mostly forgotten skills on how to play a flute or Irish tin whistle.

The pipe is connected through an I2C interface to a Maple clone, the Olimexino STM32, and then with MIDI to Garageband on my iMac. It's a fun instrument but I find it a bit hard to balance, holding it and playing some fast fingering at the same time, and using a thumb for note on/off is also a bit unusual.

So I decided to try and make a breath control so that the pipe can be played almost like a real flute.



The breath control sensor is a BMP085 breakout board; this atmospheric pressure sensor connects to the Maple board over I2C. The mouthpiece is made from two pieces of nylon tubing. A cork from a bottle of good Italian wine holds things in place. The sensor is placed inside the tube and the end is sealed with the cork; a small ventilation hole lets some air pass through the mouthpiece.






The sketch reads the BMP085 and the touch sensor in the OpenPipe Breakout and starts a note if the pressure is more than 50Pa above ambient. Some early tests show that the basic setup works, but there's a lot more to do before the sound can be controlled by breath like in a real flute.

Selecting a pressure sensor

BMP085 is an absolute pressure sensor accessed using the I2C protocol. No extra components are needed. The drawbacks are that the breath only represents a small fraction of the sensor's range and that the baseline pressure, the ambient pressure, must be calibrated for. Price is ___

The other major type of pressure sensor is a MEMS bridge giving a small voltage representing the difference between measured pressure and ambient. The problem here is that the small sensor output must be amplified before the signal is input to an AD converter. No calibration for changing ambient temperature is needed.

Saturday, April 6, 2013

MIDI USB Class for the Maple board



I got myself an OpenPipe breakout board and want to use a Maple board to connect it to a soft synth on my computer or a hardware synth. For this I want the Maple to implement a MIDI USB class device.

The Maple has as standard a USB serial device that gets set up and loaded as part of building a sketch, and it's then available as the SerialUSB object. The MIDI USB will replace the Serial USB and register the device as a MIDI class compliant device. The Maple bootloader is not affected, but the remote reset into the bootloader is not implemented, so a manual reset is needed to get into the bootloader. I can live with that.

The MIDI USB needs a few things to be set up:

  • USB Setup and handling of Control Requests
  • A MIDI USB device descriptor to present itself to a host computer as a MIDI USB device
  • Bulk IN and OUT endpoints for MIDI USB packets, 32 bit/4 byte blocks of data
  • Code that interprets the MIDI USB packets as standard MIDI events.

When the MIDI USB class is built as a variant of the existing USB serial code, the first and third parts are almost identical for MIDI and Serial. They are actually easier for MIDI, since no modem control line handling is necessary and no management endpoint is needed.
The device descriptor is a bit harder, but it's a static data structure, and just following the MIDI USB documentation carefully will get you through this.
The USB MIDI packet handling is standard MIDI code and does not depend on the details of the USB transport layer.
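To make the 4 byte packet format concrete, here is a minimal sketch of packing a channel voice message into a USB-MIDI event packet. The function name is illustrative and is not part of the libmaple code.

```c
#include <stdint.h>

/* Pack a MIDI channel voice message (status 0x80..0xEF) into a 4 byte
 * USB-MIDI event packet. Byte 0 holds the cable number in the high
 * nibble and the Code Index Number in the low nibble; for channel voice
 * messages the Code Index Number equals the high nibble of the status
 * byte. Bytes 1..3 are the ordinary MIDI bytes. */
void usb_midi_pack(uint8_t cable, uint8_t status, uint8_t data1, uint8_t data2,
                   uint8_t packet[4])
{
    packet[0] = (uint8_t)(((cable & 0x0F) << 4) | (status >> 4));
    packet[1] = status;
    packet[2] = data1;
    packet[3] = data2;   /* pass 0 for two byte messages such as program change */
}
```

A note on for middle C with velocity 100 on channel 1 and cable 0 becomes the bytes 0x09 0x90 0x3C 0x64.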

The code has been tested and registers as a MIDI device both under OSX and Android, and seems to be working.

A git repository can be found at    https://github.com/mlu/maple-ide

The MIDI USB is built from the following files:
High level device object, Wirish style, replaces usb_serial.cpp
  • usb_midi.cpp
  • include/wirish/usb_midi.h
Low level USB driver, replaces usb_cdcacm.c
  • stm32f1/usb_midi_device.c
  • include/libmaple/usb_midi_device.h
The process of setting up a sketch to use MIDI instead of Serial is still clumsy and needs some manual editing of the boards.h file.

The development is done on a modified Maple-IDE that uses a current arm toolchain and a libmaple layout that is closer to the present libmaple layout, so the files are placed in different locations than in the standard Maple-IDE file layout.

UPDATE 2013/04/12

The descriptor definitions have been factored out of usb_midi_device and placed into usb_midi_descr.c/h . A working copy of the libmaple git repository with the midi usb files placed in their proper place in the hierarchy can be found at https://github.com/mlu/libmaple .