Flash (♫ Ah-ah ♫)

A project log for Careless WSPR

A desultorily executed Weak Signal Propagation Reporter beacon.

ziggurat29 08/13/2019 at 19:59 (3 Comments)


When we left off, the flash consumption was 63588 bytes, leaving us perilously close to the 64 KiB (65536 byte) mark.  By ditching some large standard library code and manually implementing workalikes, we reclaim a lot of space (about 16 KiB) and get the project back on track.


A while back when implementing some parts of the GPS task (parsing the NMEA GPRMC sentence) and parts of the monitor (printing and reading the persistent settings, and printing the GPS data), we used sscanf() and sprintf() to make that task easier.  Moreover, because we needed floating point support we introduced some linker flags to enable that capability.  Helpful as these functions are, they are notoriously large in their implementation (indeed this is why the default 'no float support' exists in the first place).  Time to drop a few pounds.

I spent an unnecessarily long period of time with this research, but I wanted to prove the point to myself.  First I took out all the major modules, then added them back in incrementally to see their local impact.  It proved what I already suspected about the scanf() and printf().  But a simpler test was really all that was needed, and I'll summarize those results here for the curious.

First, as a baseline, here is the flash usage with full support that we require:
63588; baseline

Then, I incrementally removed the '-u _printf_float' and '-u _scanf_float' linker options to see their respective impacts, and then effectively removed sscanf() and sprintf() altogether using '#define sscanf (void)' and '#define sprintf (void)' hacks to see the effect of their removal.  I built afresh each time and collected the sizes:
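For the curious, here is a small sketch of why that '#define' hack works.  The parse_speed() function is made up for illustration and is not from the project.  Once the macro is in effect, each call site turns into a void-cast comma expression, so nothing references the library symbol any more and the linker can drop the implementation.  Expect a compiler warning or two about unused values.

```c
#include <stdio.h>

/* Measurement hack: every later use of these names expands to a cast,
   so no reference to the real library functions remains. */
#define sscanf  (void)
#define sprintf (void)

/* Hypothetical call site, for illustration only. */
int parse_speed(const char *field)
{
    int v = -1;
    /* Expands to: (void)(field, "%d", &v);  which parses nothing. */
    sscanf(field, "%d", &v);
    return v;   /* still -1, because the call was compiled away */
}
```

The code obviously no longer does its job with this in place, which is fine for a build whose only purpose is to read off the flash size.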

with scanf() no float, and printf() with float:
57076; cost of scanf float support = 6512

with scanf() no float, and printf() no float:
50552; cost of printf float support = 6524

removing scanf(), but leaving in printf():
48388; cost of scanf = 2164

removing both scanf() and printf():
45644; cost of printf = 2744

So, total scanf = 6512 + 2164 = 8676 and total printf = 6524 + 2744 = 9268, and the total cost of scanf and printf together, float support included, = 17944.
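The tally above can be double-checked mechanically.  This is just the arithmetic from the measured build sizes; the constant and function names are made up for the purpose.

```c
/* Build sizes in bytes, copied from the measurements above. */
enum {
    SIZE_BASELINE        = 63588, /* scanf + printf, both with float */
    SIZE_NO_SCANF_FLOAT  = 57076, /* scanf without float             */
    SIZE_NO_FLOAT        = 50552, /* neither with float              */
    SIZE_NO_SCANF        = 48388, /* scanf removed                   */
    SIZE_NO_SCANF_PRINTF = 45644  /* both removed                    */
};

/* Float support plus the base implementation, for each function. */
int scanf_total(void)
{
    return (SIZE_BASELINE - SIZE_NO_SCANF_FLOAT)
         + (SIZE_NO_FLOAT - SIZE_NO_SCANF);
}

int printf_total(void)
{
    return (SIZE_NO_SCANF_FLOAT - SIZE_NO_FLOAT)
         + (SIZE_NO_SCANF - SIZE_NO_SCANF_PRINTF);
}
```

The two totals add up to the difference between the first and the last build, as they should.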

So if I replace those things with an alternative implementation, I probably will save a big hunk of flash that I can use for further code development.  Hopefully the replacements will not be nearly as large.

First, I removed the dependency on scanf by implementing an atof() style function of my own concoction.  Internally this needed an atoi() which I also implemented and exposed.  This reduced the flash size considerably, to 55144.  So, from 63588 to 55144 is a saving of 8444 bytes.  If we assume the scanf() number above of 8676, that means my manual implementation costs only 232 bytes, so that is already quite nice.
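To give an idea of how little code this takes, here is a minimal sketch of an atoi()/atof() pair of this kind.  It is not the project's actual code: the names my_atoi() and my_atof(), the end-pointer parameter, and the lack of exponent and overflow handling are all assumptions made for illustration.

```c
#include <stddef.h>

/* All names here are hypothetical; this is not the project's code. */

static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Parse an optionally signed decimal integer. If end is non-NULL, it
   receives a pointer to the first character that was not consumed. */
long my_atoi(const char *s, const char **end)
{
    long val = 0;
    int neg = (*s == '-');

    if (*s == '-' || *s == '+')
        ++s;
    while (is_digit(*s))
        val = val * 10 + (*s++ - '0');
    if (end != NULL)
        *end = s;
    return neg ? -val : val;
}

/* Parse "[sign]digits[.digits]". No exponent, no overflow checking. */
float my_atof(const char *s, const char **end)
{
    float val = 0.0f;
    int neg = (*s == '-');

    if (*s == '-' || *s == '+')
        ++s;
    if (is_digit(*s))
        val = (float)my_atoi(s, &s);      /* whole part */
    if (*s == '.' && is_digit(s[1])) {
        const char *frac = s + 1;
        long f = my_atoi(frac, &s);       /* fraction digits as an integer */
        float scale = 1.0f;
        for (; frac < s; ++frac)          /* one factor of 10 per digit */
            scale *= 10.0f;
        val += (float)f / scale;
    }
    if (end != NULL)
        *end = s;
    return neg ? -val : val;
}
```

Returning the end pointer suits comma-separated input such as an NMEA sentence: parse a field, check that the next character is the expected ',', and carry on from there.  The fraction is held in a long, so a very long run of decimals would overflow; the handful of digits in a GPS field is nowhere near that.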

Next, I removed the dependency on printf by implementing an ftoa() style function as well.  This implied I needed an itoa().  The stdlib's itoa() is not that bad (about 500 or so bytes), but I went ahead and made my own because it was helpful to alter the API slightly to return the end pointer for parsing.  Additionally, this stdlib has no strrev(), so I implemented one of those, too.  (I found it easier to shift the digits out into the text buffer in reverse order, and then reverse them in place once the final length was known.)
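The reverse-then-flip approach might look something like the sketch below.  Again, this is an illustration and not the project's code: the names my_strrev(), my_itoa() and my_ftoa(), their signatures, and the fixed-decimals rounding scheme are assumptions.

```c
#include <stddef.h>

/* All names here are hypothetical; this is not the project's code. */

/* Reverse len characters in place. */
void my_strrev(char *buf, size_t len)
{
    if (len < 2)
        return;
    char *lo = buf;
    char *hi = buf + len - 1;
    while (lo < hi) {
        char t = *lo;
        *lo++ = *hi;
        *hi-- = t;
    }
}

/* Write val as decimal text and return a pointer to the terminating NUL. */
char *my_itoa(long val, char *buf)
{
    char *p = buf;
    unsigned long u = (unsigned long)val;

    if (val < 0) {
        *p++ = '-';
        u = 0UL - u;                 /* magnitude, safe even for LONG_MIN */
    }
    char *digits = p;
    do {                             /* emit least significant digit first */
        *p++ = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    my_strrev(digits, (size_t)(p - digits));
    *p = '\0';
    return p;
}

/* Write val with a fixed number of decimals; returns the end pointer.
   Assumes val scaled by 10^decimals fits in a long. */
char *my_ftoa(float val, char *buf, int decimals)
{
    char *p = buf;
    long scale = 1;

    if (val < 0.0f) {
        *p++ = '-';
        val = -val;
    }
    for (int i = 0; i < decimals; ++i)
        scale *= 10;
    long scaled = (long)(val * (float)scale + 0.5f);   /* round to nearest */
    p = my_itoa(scaled / scale, p);
    if (decimals > 0) {
        long frac = scaled % scale;
        *p++ = '.';
        for (long d = scale / 10; d > 1 && frac < d; d /= 10)
            *p++ = '0';              /* leading zeros of the fraction */
        p = my_itoa(frac, p);
    }
    return p;
}
```

Because each function returns a pointer to the terminating NUL, output can be chained by passing that pointer in as the next destination, with no strlen() or strcat() needed.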

That resulted in a build size of 46948, which is a further reduction of 8196 bytes (55144 - 46948).  If we assume the printf() number above of 9268, then that means that my manual implementation incurs 1072 bytes.

Thus the overall savings with the manual implementations is 63588 - 46948 = 16640 bytes.  That is likely enough to keep the project in business for the upcoming implementations, which include the WSPR task, the WSPR encoder, and the Si5351 'driver'.

There was an unexpected bonus in this operation.  Apparently the scanf() and printf() functions use a lot of stack, as well.  I exercised my GPS and Command processor code to a great extent, and I can now safely revert to a 512 byte stack for both of those tasks.

So, even with the greatly reduced stacks, there is still plenty of stack headroom.  I don't have a RAM crisis at the moment, and don't really ever expect one in this project, but still, getting 2 KiB of RAM back is welcome.


Next: Implementing the WSPR task.


Alan Green wrote 08/13/2019 at 20:27

My solution to a flash crisis was to keep buying Blue Pills until I got one of the 128kB ones :)

In the unlikely event that you do end up needing a printf or scanf, there are a number of implementations out there - search for "tiny printf".

Thank you for sharing!


ziggurat29 wrote 6 days ago

Thanks for the comment! Actually, you've anticipated my next project log; I did get the 128 KiB working, so I thought I'd post how to tweak the project to make that available.
Are there any non-128 KiB units? It was my understanding that this was a booboo that STM did, and that /all/ 'F103C8s are the same die as the 'F103CB, and that they merely burn a fuse to personalize them one way or the other, but that burning the fuse does not actually preclude using the extra flash, anyway.


Alan Green wrote 6 days ago

I have 11 Blue Pills. The first I bought from a local distributor. It was 64K. I then bought 10 in a single lot from a Chinese seller - of the ones I've used, they are all 128K. 
