• Encoding WSPRs

    ziggurat29 · 2 days ago · 0 comments

    Summary

    A utility module that encodes the data into the WSPR format for transmission was produced.

    Deets

    The WSPR system is for reporting weak signals (hence the name), which implies a low signal-to-noise ratio.  That also implies a lot of distortion that can (will) render a signal hopelessly lost in that noise.  The scheme Joe Taylor (K1JT) and Steve Franke (K9AN) used here in WSPR is 'Forward Error Correction' (FEC).  And if there's such a thing as 'forward' error correction, there ought to also be a 'backwards' error correction -- and there is.  Backwards (or 'reverse') is very easy to understand:  you detect errors and ask the sender to send things again.  Forward is much more involved, and avoids that back communications channel.  There are many applications where you simply can't have a reverse channel, and that's one place that FEC shines.  It was particularly popular in deep space probes, but now it's used all over the place.  The gist is that you add redundancy in a careful way such that you can not only detect errors, but also deduce, with a high degree of certainty, which bits are erroneous and change them to what they should have been.  That's the 'forward' part:  the extra stuff is sent forward along with the message and the receiver can figure it out for itself without having to ask for a retransmission.

    There are many different FEC schemes, and although 'convolutional coding' is used in WSPR, many of the other modes in Joe Taylor's family of protocols use other ones.  Joe Taylor himself has mentioned on several occasions that the evolution of these protocols was partially motivated by his fascination with communications technology and the desire to familiarize himself with the various states of the art.  It's not clear to me whether the choice to use convolutional coding here was motivated by its technical merits relative to other choices, or whether this was more the way the wind was blowing at the time of inception.  (He had previously used Reed-Solomon in JT65, and later used LDPC in FT4.)

    While researching, I came across a document that described the mechanics of the WSPR encoding process in plainer English than the original source:

    G4JNT, “Non-normative specification of WSPR protocol”,
    http://www.g4jnt.com/Coding/WSPR_Coding_Process.pdf

    I won't repeat it except as a high-level summary with my own commentary.  The steps are:

    1. condition the data
        The conditioning step is to do sanity checking and some cleanup of the data prior to encoding.  The callsign has to be placed correctly in its buffer (the third character must be a digit, padding as needed to ensure this), the maidenhead locator has some restrictions (e.g. the first two characters must be upper case and 'A' - 'R', the others digits), and the power level has to end in the digits 0, 3, or 7.  These endings correspond to 1x, 2x, and 5x power levels.
    2. pack the data
        Some simple arithmetic encoding is done to use as few bits as possible for each datum.  This can be viewed as a manual form of data compression.  The conditioned callsign gets encoded into 28 bits, the locator gets encoded into 15 bits, and the power is encoded into 7.  (Power is a bit special in that the high bit is always set, and I don't know why this is -- otherwise it could have been encoded into 6 bits.)  This results in 50 bits of message data.  Then 31 zero bits are appended to pad this out to 81 bits.  (I don't really know why the zero-padding is needed; my guess is it is to pick up the tail end of the system's convolution.  There are implicitly 31 zeros in the front as well, but it doesn't require padding to realize their effect.)
    3. transform the data using a convolutional code
        The data stream is fed into a convolutional encoder.  This uses well-known polynomials discussed in:
        W. Layland, James & A. Lushbaugh, Warren. (1971). A Flexible...
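    The packing arithmetic of step 2 can be sketched as follows.  This is only my reading of the G4JNT note, not the module described in this post; the function names are made up for illustration, and the inputs are assumed to be already conditioned (6-character callsign with a digit in the third position, 4-character locator, valid power level).

```c
#include <stdint.h>

/* Map one conditioned callsign character to its code:
   '0'-'9' -> 0-9, 'A'-'Z' -> 10-35, space -> 36.  Returns -1 if invalid.
   (Assumes an ASCII-like character set.) */
static int demoCharCode(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
    if (c == ' ') return 36;
    return -1;
}

/* Pack a conditioned 6-character callsign into a 28-bit number.
   No validation is done here; that is the job of the conditioning step. */
uint32_t demoPackCall(const char* call)
{
    uint32_t n = (uint32_t)demoCharCode(call[0]);          /* 37 choices */
    n = n * 36 + (uint32_t)demoCharCode(call[1]);          /* 36: no space */
    n = n * 10 + (uint32_t)demoCharCode(call[2]);          /* 10: a digit */
    n = n * 27 + (uint32_t)(demoCharCode(call[3]) - 10);   /* 27: letter or space */
    n = n * 27 + (uint32_t)(demoCharCode(call[4]) - 10);
    n = n * 27 + (uint32_t)(demoCharCode(call[5]) - 10);
    return n;
}

/* Pack a 4-character locator (15 bits) and the power in dBm (7 bits)
   into a 22-bit number.  The '+ 64' is what sets the high bit of the
   power field. */
uint32_t demoPackLocPower(const char* loc, int dbm)
{
    uint32_t m = (uint32_t)(179 - 10 * (loc[0] - 'A') - (loc[2] - '0')) * 180
               + (uint32_t)(10 * (loc[1] - 'A') + (loc[3] - '0'));
    return m * 128 + (uint32_t)dbm + 64;
}
```

    For example, the padded callsign " K1ABC" packs to 259047992, which fits in 28 bits, and "FN42" at 37 dBm packs to 2896997.  The two numbers together make up the 50 message bits.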

  • Implementing the WSPR Task (skeleton)

    ziggurat29 · 4 days ago · 0 comments

    Summary

    The WSPR task skeleton is implemented.  This goes through all the motions of scheduling transmissions, shifting out bits at the correct rate, and some other things like re-encoding the message when GPS is locked and syncing the on-chip RTC with the GPS time.

    Deets

    For the WSPR activity, I define another FreeRTOS task.  This will be the last one!  The task will run a state machine, cranking out the bits of the WSPR message on a scheduled basis.  It will be driven by two on-chip resources:  the Real Time Clock (RTC), and a timer (TIM4).  The RTC will be used for its 'alarm' feature to schedule the start of a transmission at the start of an even-numbered minute, as required by the WSPR protocol.  The TIM4 will be used to pace the bits we send.  The WSPR protocol requires the bits to be sent at about 1.4648 Hz, i.e. one about every 0.6827 seconds.  I assigned some other duties to the WSPR task, such as keeping the RTC synced when GPS lock comes in.

    CubeMX Changes for RTC and TIM4

    The RTC is used to kick off a WSPR transmission when an even numbered minute begins.  The on-chip RTC has an 'alarm' capability that can be used to generate an interrupt at the designated time.  You will need to open CubeMX, and ensure that under the NVIC settings for the Timers, RTC, that the following is enabled:

        RTC alarm interrupt through EXTI line 17

    The RTC interrupt works by way of a weak symbol callback function.  You simply define your own 'HAL_RTC_AlarmAEventCallback()' and that is sufficient for you to hook into the interrupt handler.  I put mine in main.c, since CubeMX likes to put things like this there, but that is not required.  The implementation is to simply forward the call into the WSPR task implementation by calling WSPR_RTC_Alarm(), which is exposed in the header 'task_wspr.h'.
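    Choosing the alarm time itself is simple arithmetic.  A minimal sketch, assuming the alarm is always set for second 0 of the next even-numbered minute strictly after the current one; the type and function names are hypothetical and not taken from the project:

```c
/* Hypothetical container for the alarm's hour and minute (second is 0). */
typedef struct
{
    int hour;
    int minute;
} DemoAlarmTime;

/* Given the current hour and minute, return the next even-numbered minute
   that lies strictly in the future, wrapping the hour and day as needed. */
DemoAlarmTime demoNextEvenMinute(int hour, int minute)
{
    DemoAlarmTime t;
    /* odd minute: the next minute is even; even minute: skip ahead two */
    int m = minute + ((minute % 2) ? 1 : 2);
    t.hour = (hour + m / 60) % 24;
    t.minute = m % 60;
    return t;
}
```

    So 12:59 schedules 13:00, 10:04 schedules 10:06, and 23:58 wraps around to 00:00.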

    While that kicks off the start of the transmission, the subsequent bits are shifted out under the control of a general purpose timer which also can generate an interrupt when the configured period expires.  In this case I am using TIM4, and under its NVIC settings in CubeMX you need to ensure that the following is enabled:

        TIM4 global interrupt

    While in CubeMX, we need to also set some values for TIM4 so that the interrupts come at the correct rate.  The timers are driven by the CPU clock.  We have configured it to run at the maximum speed of 72 MHz, so we need to divide that down to one interrupt every 0.6827 seconds (about 1.4648 Hz).  That means dividing down by 72,000,000 x 0.6827 = 49,154,400.  The timers have a 16-bit prescaler and then also a 16-bit period, so we need to find a product of these two that will work out to be pretty close to that number.  I chose a prescaler of 4096, and a period of 12000, which works out to be 49,152,000 and that is close enough.

    You set those values in CubeMX by subtracting 1 from each of them.  This is because 0 counts as dividing by 1.  So the prescaler will be 4095 and the period will be 11999.
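    The arithmetic above can be checked with a few lines of C.  This is just a sketch using the numbers from the text (72 MHz clock, prescaler 4096, period 12000); the macro and function names are invented for the example.

```c
#include <stdint.h>

#define DEMO_TIMER_CLOCK_HZ 72000000u   /* timer input clock, from the text */
#define DEMO_PRESCALER      4096u       /* desired divide-by value */
#define DEMO_PERIOD         12000u      /* desired count per interrupt */

/* The values entered in CubeMX are one less than the divide-by values,
   because a register value of 0 means 'divide by 1'. */
uint16_t demoPrescalerReg(void) { return (uint16_t)(DEMO_PRESCALER - 1u); }
uint16_t demoPeriodReg(void)    { return (uint16_t)(DEMO_PERIOD - 1u); }

/* Total input clock ticks between interrupts. */
uint32_t demoTicksPerSymbol(void)
{
    return (uint32_t)DEMO_PRESCALER * DEMO_PERIOD;
}

/* Resulting interrupt period in microseconds (truncated). */
uint32_t demoSymbolPeriodUs(void)
{
    return (uint32_t)(((uint64_t)demoTicksPerSymbol() * 1000000u)
                      / DEMO_TIMER_CLOCK_HZ);
}
```

    This gives register values of 4095 and 11999, 49,152,000 ticks per interrupt, and a period of 682666 microseconds, i.e. about 0.6827 seconds.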

    Set those values and regenerate the project.  This will cause the init code to have changed appropriately.  Do not forget to re-apply the fixups that are in the #fixup directory!

    The Timer interrupts work somewhat the same, but there is one for all the timers, and it already exists down in main.c because we configured TIM2 to be the system tick timer.  We just add to that implementation in the user block:

    void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
    {
      /* USER CODE BEGIN Callback 0 */
    
      /* USER CODE END Callback 0 */
      if (htim->Instance == TIM2) {
        HAL_IncTick();
      }
      /* USER CODE BEGIN Callback 1 */
    
    	else if (htim->Instance == TIM4)
    	{
    		//we use TIM4 for WSPR bit pacing
    		WSPR_Timer_Timeout();
    	}
    
      /* USER CODE END Callback 1 */
    }

    ISR Code

    So, we have two methods WSPR_RTC_Alarm() and WSPR_Timer_Timeout() that are exposed by 'task_wspr.h' and implemented by the task.  I generally avoid doing significant work in an Interrupt Service...


  • Haque in a Flash

    ziggurat29 · 6 days ago · 1 comment

    Summary

    The STM32F103C8 on the BluePill (and really I guess every board using that part) is rumoured to have 128KiB flash, despite the datasheet's claims otherwise.  I lightly hacked the project to test this theory out.  It seems to work!

    Deets

    The datasheets indicate that the STM32F103x8 has 64KiB flash and the STM32F103xB has 128KiB flash.  Otherwise, they have the same capability.  The chips do identify themselves over the SWD interface, and mine definitely reports itself as the 'C8, and not the 'CB (plus, the part markings).  However, I read somewhere (alas, can't cite, but apparently this is well-known) that STMicroelectronics opted for some economies and these parts are actually the same die, and in fact both devices have 128KiB flash!  If you think about how much it costs to set up a production line, it starts to make sense how this saves them money -- there's no extra cost for the silicon, but there's a lot of extra cost for the tooling.  They probably do a final programming step during qualification that fixes the identity.  Fortunately, it seems that this doesn't actually inhibit access to that extra 64 KiB.  Here is how I hacked my build system to enable its use.

    Hacking OpenOCD

    The System Workbench toolchain embeds OpenOCD to operate the debugger/programmer.  This tool uses a bunch of config files that describe the chip's capabilities.  There are two locations which have configs for this chip:

    C:\Ac6\SystemWorkbench\plugins\fr.ac6.mcu.debug_2.5.0.201904120827\resources\openocd\scripts\target
    C:\Ac6\SystemWorkbench\plugins\fr.ac6.mcu.debug_2.5.0.201904120827\resources\openocd\st_scripts\target

    The 'fr.ac6.mcu.debug_2.5.0.201904120827' is quite possibly different on your system because it involves the version number that you deployed.  But you can figure that part out for yourself.

    In each of those locations is a file:

    stm32f1x.cfg

    I renamed this to 'stm32f1x.cfg.orig' and made a new copy named 'stm32f1x.cfg' which I hacked.  If you scroll down a ways, there is a part '# flash size will be probed' that is where we make a change:

    # flash size will be probed
    set _FLASHNAME $_CHIPNAME.flash
    #HHH I hacked the size to 128k
    flash bank $_FLASHNAME stm32f1x 0x08000000 0x20000 0 0 $_TARGETNAME

    So we are simply not probing, but rather explicitly telling the tool that we have 0x20000 (128 Ki) of flash.

    I went ahead and modded both copies of 'stm32f1x.cfg', however it appears that the operative one is the one that is in the 'openocd\st_scripts\target' location.

    Fair warning:  if you later download an update of the system, you will probably need to re-apply this hack, because new copies of files will be deployed.  You'll know if this happens though, because things will break (if you're using the high memory).

    Hacking the Build Config

    Next, we need to tell the linker about the flash.  This is done in the root of the project, in 'STM32F103C8Tx_FLASH.ld'.  Near the start of that file is explicitly stated the size of the RAM and Flash.  I changed it to this:

    /* Specify the memory areas */
    MEMORY
    {
    RAM (xrw)      : ORIGIN = 0x20000000, LENGTH = 20K
    /* HHH I hacked the size */
    FLASH (rx)      : ORIGIN = 0x8000000, LENGTH = 128K
    }

    My project build now is well below the 64 KiB barrier, but I modded the persistent settings file to specify the last page of the 128 KiB instead of the same for the 64 KiB.  This made it easy to prove that stuff can be stored there.

    //HHH I modded this to get 128K #define FLASH_SETTINGS_END_ADDR 0x08010000
    #define FLASH_SETTINGS_END_ADDR 0x08020000

    Then I did a clean and rebuilt.  I started the debugger, and you will see along the way a message:

    Info : device id = 0x20036410
    Info : ignoring flash probed value, using configured bank size
    Info : flash size = 128kbytes

    So it's on.

    I connected to the monitor, and persisted settings.  Then I connected with 'STM32 ST-LINK Utility'.  This will think there is 64...


  • Flash (♫ Ah-ah ♫)

    ziggurat29 · 08/13/2019 at 19:59 · 3 comments

    Summary

    Where we left off, the flash consumption was 63588 bytes, leaving us perilously close to the 64 KiB mark.  By ditching some large standard library code and manually implementing workalikes, we reclaim a lot (about 16 K) of space and get the project back on-track.

    Deets

    A while back when implementing some parts of the GPS task (parsing the NMEA GPRMC sentence) and parts of the monitor (printing and reading the persistent settings, and printing the GPS data), we used sscanf() and sprintf() to make that task easier.  Moreover, because we needed floating point support we introduced some linker flags to enable that capability.  Helpful as these functions are, they are notoriously large in their implementation (indeed this is why the default 'no float support' exists in the first place).  Time to drop a few pounds.

    I spent an unnecessarily long period of time with this research, but I wanted to prove the point to myself.  First I took out all the major modules, then added them back in incrementally to see their local impact.  It proved what I already suspected about the scanf() and printf().  But a simpler test was really all that was needed, and I'll summarize those results here for the curious.

    First, as a baseline, here is the flash usage with full support that we require:
    63588; baseline

    Then, I incrementally removed the '-u _printf_float' and '-u _scanf_float' options to see their respective impacts, and then effectively removed sscanf() and sprintf() altogether using '#define sscanf (void)' and '#define sprintf (void)' hacks to see the effect of their removal.  I built afresh each time and collected the sizes:

    with scanf() no float, and printf() with float:
    57076; cost of scanf float support = 6512

    with scanf() no float, and printf() no float:
    50552; cost of printf float support = 6524

    removing scanf(), but leaving in printf():
    48388; cost of scanf = 2164

    removing both scanf() and printf()
    45644; cost of printf = 2744

    So, total scanf = 8676 and total printf = 9268, and the total scanf/printf cost (including float support) = 17944.

    So if I replace those things with an alternative implementation, I probably will save a big hunk of flash that I can use for further code development.  Hopefully the replacements will not be nearly as large.

    First, I removed the dependency on scanf by implementing an atof() style function of my own concoction.  Internally this needed an atoi() which I also implemented and exposed.  This reduced the flash size considerably, to 55144.  So, from 63588 to 55144 is 8444 bytes.  If we assume the scanf() number above of 8676, that means my manual implementation incurs 232, so that is already quite nice.
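    To give a feel for how little code such a replacement needs, here is a minimal sketch of an atof()-style parser built on an atoi()-style helper.  It is not the code from this project: the names and the end-pointer convention are assumptions, and it handles only plain decimal notation with no exponent and no error checking.

```c
#include <stddef.h>

/* Parse an optional sign and decimal digits.  If ppszEnd is not NULL it
   receives a pointer to the first character that was not consumed. */
long demoAtoi(const char* psz, const char** ppszEnd)
{
    long val = 0;
    int neg = 0;
    if (*psz == '+' || *psz == '-')
    {
        neg = (*psz == '-');
        ++psz;
    }
    while (*psz >= '0' && *psz <= '9')
    {
        val = val * 10 + (*psz - '0');
        ++psz;
    }
    if (ppszEnd != NULL)
        *ppszEnd = psz;
    return neg ? -val : val;
}

/* Parse [sign]digits[.digits] into a float, reusing demoAtoi() for the
   integer part.  The sign is handled here so that "-0.5" stays negative. */
float demoAtof(const char* psz, const char** ppszEnd)
{
    const char* p = psz;
    float val;
    float scale = 0.1f;
    int neg = 0;
    if (*p == '+' || *p == '-')
    {
        neg = (*p == '-');
        ++p;
    }
    val = (float)demoAtoi(p, &p);    /* integer part */
    if (*p == '.')
    {
        ++p;
        while (*p >= '0' && *p <= '9')    /* fractional digits */
        {
            val += (float)(*p - '0') * scale;
            scale *= 0.1f;
            ++p;
        }
    }
    if (ppszEnd != NULL)
        *ppszEnd = p;
    return neg ? -val : val;
}
```

    Returning the end pointer makes it easy to continue parsing right after the number, e.g. at the comma following a field in a GPS sentence.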

    Next, I removed the dependency on printf by implementing an ftoa() style function as well.  This implied I needed an itoa().  The stdlib's itoa() is not that bad -- about 500 or so bytes, but I went ahead and made my own because it was helpful to alter the API slightly to return the end pointer for parsing.  Additionally, this stdlib has no strrev(), so I implemented one of those, too.  (I found it easier to shift out digits into the text buffer in reverse, and then reverse them when finished when the required length was known).
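    The 'emit digits in reverse, then flip them' approach can be sketched like this.  Again, this is an illustration under my own assumptions (names, signature, and returning a pointer to the terminating NUL), not the project's actual implementation.

```c
#include <stddef.h>

/* Reverse the characters in the inclusive range [first, last] in place. */
static void demoReverse(char* first, char* last)
{
    while (first < last)
    {
        char t = *first;
        *first++ = *last;
        *last-- = t;
    }
}

/* Convert val to decimal text in buf and return a pointer to the
   terminating NUL, so the caller can keep appending from there.
   buf must have room for the sign, the digits, and the NUL. */
char* demoItoa(int val, char* buf)
{
    char* p = buf;
    /* work in unsigned so that the most negative int is handled too */
    unsigned int u = (val < 0) ? 0u - (unsigned int)val : (unsigned int)val;

    /* shift out digits least-significant first */
    do
    {
        *p++ = (char)('0' + (u % 10u));
        u /= 10u;
    } while (u != 0u);

    if (val < 0)
        *p++ = '-';
    *p = '\0';

    /* the text is backwards at this point; flip it */
    demoReverse(buf, p - 1);
    return p;
}
```

    With a 16-byte buffer, demoItoa(-1234, buf) leaves "-1234" in buf and returns buf + 5.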

    That resulted in a build size of 46948, which is a further reduction of 8196 bytes.  If we assume the printf() number above of 9268, then that means that my manual implementation incurs 1072 bytes.

    Thus the overall savings with the manual implementations is 63588 - 46948 = 16640 bytes.  That is likely enough to keep the project in business for the upcoming implementations, which include the WSPR task, the WSPR encoder, and the Si5351 'driver'.

    There was an unexpected bonus in this operation.  Apparently the scanf() and printf() functions use a lot of stack, as well.  I exercised my GPS and Command processor code to a great extent, and I can now safely revert to a 512 byte stack...


  • Persistent Settings

    ziggurat29 · 08/11/2019 at 21:12 · 0 comments

    Summary

    A simple means of persisting settings across boots is realized.  A Flash resource crisis has manifested.

    Deets

    Many projects need to have settings that are persistent across boots.  In this case, at a minimum there is the setting that contains the operator call sign, since that can't be gleaned from the environment.  Practically, several other persistent settings will exist, such as the Transmit Power level, the frequency on which to operate, the bit rate of the GPS serial port, etc.

    The STM32F103 processor does not have any EEPROM resources, but this is emulated by using the last flash page as the persistent store.  Essentially, the settings are defined in a struct, and this struct is persisted to that flash page.  There are some defaults that are defined if the page is found to be empty.

    The settings presently defined are:

    typedef struct
    {
        uint32_t    _version;    //should always be first, should be PERSET_VERSION
    
        //'dial' frequency for the WSPR channel.  WSPR works in a USB narrow
        //(200 Hz) band within a conventional USB channel.  The center of that
        //200 Hz band is 1.5 KHz above the dial frequency.
        uint32_t    _dialFreqHz;
        //the WSPR signal is extremely narrow-band (6 Hz).  The 200 Hz WSPR band
        //can accommodate 33 1/3 of these 6 Hz sub-bands.  We can be configured to
        //use a specific one, or a negative number means randomly pick one at
        //transmit time (the usual case).
        int32_t     _nSubBand;        //0-32; or < 0 to randomize
        //duty cycle (i.e. how often to try to transmit, randomized)
        uint32_t    _nDutyPtc;        //percent
    
        //call sign
        char        _achCallSign[8];  //6 chars max
        //explicit grid locator
        char        _achMaidenhead[4];    //4 chars always
        //transmit power level
        int32_t     _nTxPowerDbm;    //0-60, though only 0, 3, 7 endings
    
        //use GPS (i.e. auto time sync auto grid locator, and wait-for-lock)
        uint32_t    _bUseGPS;        //boolean
        //GPS bit rate
        int32_t     _nGPSbitRate;    //9600 default, but can be other
    
    } PersistentSettings;

    The defaults are:

    const PersistentSettings g_defaultSettings = 
    {
        ._version = PERSET_VERSION,    //must be this
        ._dialFreqHz = 14095600,       //the 20-meter conventional WSPR channel
        ._nSubBand = -1,
        ._nDutyPtc = 20,
        ._achCallSign = "",            //you must set this
        ._achMaidenhead = "",          //you must set this
        ._nTxPowerDbm = 20,            //100 mW
        ._bUseGPS = 1,
        ._nGPSbitRate = 9600,          //default for the ublox NEO-6M
    };

    The gist is that there is a RAM copy of the settings that the program operates off of.  Early in the execution of the program (in main()), this RAM copy is initialized from the persistent copy.  If there is no persistent copy, then it is initialized from the baked-in defaults.

    The settings may be persisted by writing the struct to the last page (1 KiB on this device) of flash.  An elementary form of wear-leveling is done to reduce the likelihood of wearing out the flash.  This works by sequentially writing updates into the flash.  Since the erased state of the flash is to cause all values to be 0xff, this is easy to detect.  Initial depersistence involves walking through the memory forward to find the /last/ structure that has a valid version number.  This is effectively the value of the flash settings.  Similarly, persistence means walking through the memory to find the first structure that has the 0xffffffff version number.  That will be where the new copy is written.  If the page is full when trying to write then it will be erased first.  As it stands, this will reduce flash erasures by 25 x.  If the structure grows, this will become less effective, but it is also straightforward to add more pages, if needed.
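    The slot-walking logic can be sketched against a RAM buffer that stands in for the flash page.  Everything here is an assumption for illustration: the names, the 40-byte stand-in struct, and the use of memset() and memcpy() where real code would call the flash erase and program routines.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 1024u
#define DEMO_VERSION   1u
#define DEMO_ERASED    0xffffffffu    /* erased flash reads as all ones */

/* Stand-in for the settings struct: version first, 40 bytes in total. */
typedef struct
{
    uint32_t _version;
    uint32_t _payload[9];
} DemoSettings;

#define DEMO_SLOTS ((int)(DEMO_PAGE_SIZE / sizeof(DemoSettings)))    /* 25 */

static uint32_t demoSlotVersion(const uint8_t* page, int slot)
{
    uint32_t ver;
    memcpy(&ver, page + (size_t)slot * sizeof(DemoSettings), sizeof ver);
    return ver;
}

/* Depersist: index of the last slot holding a valid version, or -1. */
int demoFindLastValid(const uint8_t* page)
{
    int slot, last = -1;
    for (slot = 0; slot < DEMO_SLOTS; ++slot)
        if (demoSlotVersion(page, slot) == DEMO_VERSION)
            last = slot;
    return last;
}

/* Persist: write into the first erased slot; when the page is full,
   'erase' it first and start over at slot 0.  Returns the slot used. */
int demoPersist(uint8_t* page, const DemoSettings* ps)
{
    int slot = 0;
    while (slot < DEMO_SLOTS && demoSlotVersion(page, slot) != DEMO_ERASED)
        ++slot;
    if (slot == DEMO_SLOTS)
    {
        memset(page, 0xff, DEMO_PAGE_SIZE);    /* stands in for a page erase */
        slot = 0;
    }
    memcpy(page + (size_t)slot * sizeof(DemoSettings), ps, sizeof *ps);
    return slot;
}
```

    With a 40-byte struct there are 25 slots in a 1 KiB page, so the page only needs erasing on every 26th write.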

    Some test code was put in main.c to repeatedly write structures to verify the functionality.  You can use the 'STM32 ST-LINK Utility' to directly view the flash page.

    The command processor was updated to include a 'set' command that if used by itself will dump the present settings values, and can be used to alter...


  • Maidenhead

    ziggurat29 · 08/10/2019 at 16:22 · 0 comments

    Summary

    A Lat/Lon-to-Maidenhead support routine is produced.

    Deets

    WSPR (and other amateur (ham) radio things, and even some other folks, too) like to express location in terms of a 'Maidenhead Grid Square Locator'.  To wit it's named after Maidenhead, UK.  If you were to take the globe and do what is called an 'equirectangular projection':

    [Figure: equirectangular projection of the globe.  Source: By Strebe - Own work, CC BY-SA 3.0; Wikipedia: Equirectangular Projection]

    then the maidenhead is simply a scheme for encoding the latitude and longitude into an alternative form.  This is useful to hams because it is more compact to transmit than to spell out all the digits of the numeric representation.  Also, for many purposes, the extra precision is not needed, so a few characters suffice.

    The encoding scheme is straightforward:

    1. start with lat/long and progressively shift out the most significant chunks of resolution.  The first chunk is special -- the remaining ones are regular and based on 10.
    2. for each chunk, two symbols will be emitted.  The first is for the encoding of the longitude portion, and the second is for the latitude portion.  Alternate chunks use a different encoding:  alphabetic or numeric.  The first chunk uses alphabetic, the second uses numeric, the third uses alphabetic again, and so forth.  By convention, the first alphabetic chunk uses uppercase, and the remainder uses lower case, but strictly the system is case-insensitive.
    3. repeat to any desired resolution.

    The reverse decoding is similarly straightforward but I have not implemented that here.
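    For the curious, decoding a 4-character locator might look something like the sketch below.  It is not part of this project; the function name and return convention are made up, and it yields the south-west corner of the grid square rather than its center.

```c
#include <ctype.h>
#include <stddef.h>

/* Decode a 4-character locator such as "FN42" into the latitude and
   longitude (in degrees) of the south-west corner of that square.
   Returns 0 on success, or -1 if the locator is malformed. */
int demoFromMaidenhead(const char* loc, float* lat, float* lon)
{
    int lonField, latField;

    if (loc == NULL || lat == NULL || lon == NULL)
        return -1;

    /* field: two letters 'A'-'R', longitude first, then latitude */
    lonField = toupper((unsigned char)loc[0]) - 'A';
    if (lonField < 0 || lonField > 17)
        return -1;
    latField = toupper((unsigned char)loc[1]) - 'A';
    if (latField < 0 || latField > 17)
        return -1;

    /* square: two digits, longitude first, then latitude */
    if (!isdigit((unsigned char)loc[2]) || !isdigit((unsigned char)loc[3]))
        return -1;

    /* fields span 20 x 10 degrees, squares span 2 x 1 degrees; undo the
       +180 and +90 offsets that the encoding applies */
    *lon = (float)(lonField * 20 + (loc[2] - '0') * 2 - 180);
    *lat = (float)(latField * 10 + (loc[3] - '0') - 90);
    return 0;
}
```

    For example, "FN42" decodes to latitude 42 and longitude -72.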

    But the Metric

    While the US may be the butt of many jokes about not having adopted the metric system like the rest of the world, there are aspects of the metric system that veritably no one has adopted.  In this case, nearly everyone still does angular measurement in the system the Sumerians devised based on 60 (https://en.wikipedia.org/wiki/Sexagesimal), rather than the metric system.  (To wit, some civil engineering aspects such as surveying do use 'grads' -- the metric equivalent to 'degrees'.)  I do find it amusing that a system from about 7,000 years ago that was ostensibly created to make things easier on working with your fingers (counting to a high number on one hand) is, in modern technological times, still being used to make things easier on working with your fingers (communicating your location via Morse code).  At any rate, this is why the first part of the maidenhead is treated specially.  After that it's handled more uniformly.

    Code

    The first part is to precondition the data.  To the longitude is added 180 to make it go from 0 to 360, and similarly to the latitude is added 90 to make it go from 0 to 180.  The second part is to then do a 'base-18' encoding of those data by dividing the longitude into 18 zones of 20 degrees, and the latitude into 18 zones of 10 degrees.  These are encoded as the symbols 'A' through 'R', upper-case by convention, and are emitted with the longitude first and the latitude second.  This first pair is called a 'field' and gets us off of the sexagesimal.  The remaining are encoded more consistently.

    The remaining bits of resolution are done in much of the same way, but alternate between using the digits '0' - '9', or the letters 'a' - 'x'.  So, when working on a digital portion, the encoding is base-10, and when working on an alphabetic portion the encoding is base-24.  By convention, the lower-case letters are used.  These are called 'squares' and 'subsquares'.  You can repeat this process to arbitrary precision.  In this project we only use 4 symbols because that is what the WSPR protocol requires, but the code does not have that limitation.  Speaking of code, this is it:

    //north latitude is positive, south is negative
    //east longitude is positive, west is negative
    int toMaidenhead ( float lat, float lon, 
    ...

  • GyPSy

    ziggurat29 · 08/09/2019 at 17:52 · 0 comments

    Summary

    GPS modules have arrived.  They're a bit sketchy.

    Deets

    The Neo-6M modules have arrived.  These came with a small bar-shaped patch antenna about 1/3 the size of the square ones I am more accustomed to seeing.  I wonder if this will affect sensitivity...

    As a quicky, I connected it via a handy FTDI adapter.  This module by default runs at 9600 bps.  Data was immediately sent from the module.  However, the first lock took about 1/2 hr to be made!  Then again, I am inside, and the limited length of the cables keeps the unit close to the computer.  The tiny antenna possibly does not help, either.  I'll order some USB extension cables (which you practically need, anyway, for that ST-Link) and an external GPS antenna, though those will take some time to arrive.  I'm sure it would fare better outside, however that doesn't really work for my development activities.

    Receiving Data

    This project's needs are very specific, and in fact the only message I need to parse is the standard 'Recommended Minimum C'; e.g.:

    $GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A
    $GPRMC,225446,A,4916.45,N,12311.12,W,000.5,054.7,191194,020.3,E*68

    Because of this simplicity I chose not to bother with using an existing library.  Instead, I used a simple state machine to capture the text lines, and a trivial parser to tokenize the results on the comma and extract the fields.  The code is actually a simplified version of what already exists for the command processor, which does a similar thing over the CDC serial port.

    //sentence buffer
    char g_achNMEA0183Sentence[82];    //abs max len is 82
    
    static char _gpsGetChar ( const IOStreamIF* pio )
    {
        char ret;
        pio->_receiveCompletely ( pio, &ret, 1, TO_INFINITY );
        return ret;
    }
    
    //this gets characters from the input stream until line termination occurs.
    static void _getSentence ( const IOStreamIF* pio )
    {
        int nIdxSentence;
    
        int bCont = 1;
    
        //pull characters into sentence buffer until full or line terminated
        nIdxSentence = 0;
        while ( bCont && nIdxSentence < COUNTOF(g_achNMEA0183Sentence) )
        {
            char chNow = _gpsGetChar ( pio );
            switch ( chNow )
            {
            case '\r':    //CR is a line terminator
            case '\n':    //LF is a line terminator
                memset ( &g_achNMEA0183Sentence[nIdxSentence], '\0',
                                     COUNTOF(g_achNMEA0183Sentence) - 
                                     nIdxSentence );    //clear rest of buffer
                ++nIdxSentence;
                bCont = 0;
            break;
    
            default:
                //everything else simply accumulates the character
                g_achNMEA0183Sentence[nIdxSentence] = chNow;
                ++nIdxSentence;
            break;
            }
        }
    }

    so, that fills the statically allocated 'sentence buffer' with incoming characters until either the CR or LF is received, which terminates it.
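    Once a sentence is sitting in the buffer, splitting it on commas can be done in place.  A minimal sketch, assuming a caller-supplied array of field pointers (the function name and that array are made up for this example and are not necessarily how the project's parser works):

```c
#include <stddef.h>

/* Split a NUL-terminated sentence in place by overwriting each comma with
   a NUL, recording where each field starts.  Returns the number of field
   pointers stored, which is at most maxFields. */
size_t demoSplitFields(char* sentence, char** fields, size_t maxFields)
{
    size_t n = 0;
    char* p = sentence;

    if (maxFields == 0)
        return 0;

    fields[n++] = p;    /* the first field starts at the beginning */
    for (; *p != '\0'; ++p)
    {
        if (*p == ',')
        {
            *p = '\0';                  /* terminate the previous field */
            if (n < maxFields)
                fields[n++] = p + 1;    /* next field starts after the comma */
        }
    }
    return n;
}
```

    Applied to the first example sentence above, this yields 12 fields, with "$GPRMC" as the first, "4807.038" as the fourth, and "W*6A" (still carrying the checksum) as the last.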

    A new task module was created, 'task_gps.h, .c', and it works much like the one we created for the monitor -- a loop calling the line reception and handling function 'GPS_process()'.  Wiring it in was similarly trivial -- just adding yet-another task creation in __startWorkerTasks():

    	//kick off the GPS thread, which handles incoming NMEA data
    	{
    	osThreadStaticDef(taskGPS, thrdfxnGPSTask, osPriorityNormal, 0, COUNTOF(g_tbGPS), g_tbGPS, &g_tcbGPS);
    	g_thGPS = osThreadCreate(osThread(taskGPS), NULL);
    	}
    

    For the moment I will just be setting some global variables that can be inspected, but later I will add functionality to set the RTC clock to the satellite time, and to update the maidenhead to the current location. 

    Parsing

    The trivial parser then tokenizes the incoming data by converting the comma to a nul.  This effectively makes the sentences into a sequence...


  • The Monitor Task and the Command Processor

    ziggurat29 · 08/08/2019 at 19:20 · 0 comments

    Summary

    A skeletal implementation of the Monitor task is produced.

    Deets

    The Monitor task is a simple command line interface over a stream interface.  In this project, that stream will be the USB CDC interface.

    The design is fairly simple:  incoming data is built into a fixed-size command line buffer, and when a CR or LF is received, that is interpreted as the end of line, and it is subsequently parsed and processed accordingly.  The command line buffer is simply a statically allocated character array of 128 chars.  This is expected to be plenty (maybe even too much, but we'll see what evolves).

    The FreeRTOS task is straightforward:

    1. define the FreeRTOS structures needed (the thread handle, the stack, the task control block, and the thread function)
    2. the task exposes a pointer to a stream interface.  This allows binding of the command processor to an arbitrary stream.
    3. the thread function is a loop invoking the command processor function

    The Command Processor

    The command processor is realized with a generic component I use in several projects.  This generic component defines a structure:

    struct CmdProcEntry
    {
    	const char*	_pszCommand;
    	CmdProcRetval (*_pfxnHandler) ( const IOStreamIF* pio, 
                    const char* pszszTokens );
    	const char* _pszHelp;
    };

    The intention is that your application will define an array of these structures somewhere.  The entries in that array consist of:

    1. the text that is the command
    2. a function that handles the command along with any additional parameters
    3. a short text that is used for the 'help' command

    There is a function exposed:

    CmdProcRetval CMDPROC_process ( const IOStreamIF* pio, 
            const CmdProcEntry* acpe, size_t nAcpe );

    This takes the stream on which the command processor is operating and the application-specific array of command entries.  This function will build the command line buffer and support things like backspace, etc.  When an end-of-line character (CR or LF) is encountered, it will parse the first whitespace delimited token and search in the array of command entries for the handler for that command.  It will then invoke the handler function.  This lets me easily reuse this common capability amongst several projects.  The project-specific part is to define the commands you want and to perform some action when they are received.
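    The lookup step can be sketched as below.  This is a simplified stand-in, not the generic component itself: the struct omits the handler pointer so that the example is self-contained, and the names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a command table entry (no handler pointer). */
typedef struct
{
    const char* _pszCommand;
    const char* _pszHelp;
} DemoCmdEntry;

/* Find the entry whose command text equals the first whitespace-delimited
   token of the line.  Returns the index into the table, or -1 if none. */
int demoFindEntry(const char* pszLine, const DemoCmdEntry* entries,
                  size_t nEntries)
{
    size_t nTok, i;

    /* skip leading whitespace, then measure the first token */
    while (*pszLine == ' ' || *pszLine == '\t')
        ++pszLine;
    nTok = strcspn(pszLine, " \t");
    if (nTok == 0)
        return -1;

    for (i = 0; i < nEntries; ++i)
    {
        /* require an exact match so that 'he' does not match 'help' */
        if (strlen(entries[i]._pszCommand) == nTok &&
            strncmp(pszLine, entries[i]._pszCommand, nTok) == 0)
        {
            return (int)i;
        }
    }
    return -1;
}
```

    Given a table holding 'help' and 'set', the line "help set" resolves to the 'help' entry, and the remainder of the line is what would be handed to its handler as parameters.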

    The most basic command is 'help', which works two ways:

    1. invoked by itself, it will list all the commands in the repertoire
    2. invoked with a token, it will search the command list and emit the help text for that specific command

    The handler for 'help' is straightforward and shows how such is constructed:

    static CmdProcRetval cmdhdlHelp ( const IOStreamIF* pio, const char* pszszTokens )
    {
    	//get next token; we will get help on that
    	int nIdx;
    	if ( NULL != pszszTokens && '\0' != *pszszTokens &&
    		-1 != ( nIdx = CMDPROC_findProcEntry ( pszszTokens, 
                            g_aceCommands, g_nAceCommands ) ) )
    	{
    		//emit help information for this one command
    		_cmdPutString ( pio, g_aceCommands[nIdx]._pszHelp );
    		_cmdPutString ( pio, "\r\n" );
    	}
    	else
    	{
    		//if unrecognised command
    		if ( NULL != pszszTokens && '\0' != *pszszTokens )
    		{
    			_cmdPutString ( pio, "The command '" );
    			_cmdPutString ( pio, pszszTokens );
    			_cmdPutString ( pio, "' is not recognized.\r\n" );
    		}
    
    		//list what we've got
    		_cmdPutString ( pio, "help is available for:\r\n" );
    		for ( nIdx = 0; nIdx < g_nAceCommands; ++nIdx )
    		{
    			_cmdPutString ( pio, g_aceCommands[nIdx]._pszCommand );
    			_cmdPutString ( pio, "\r\n" );
    		}
    	}
    
    	return CMDPROC_SUCCESS;
    }

    Additionally, in debug builds, I provide the 'diag' command:

    static CmdProcRetval cmdhdlDiag ( const IOStreamIF* pio, const char*...

  • USB CDC Streams and Serial and HAL Fixups, 002

    ziggurat29 08/07/2019 at 14:25 2 comments

    Summary

    The streamification of serial ports continues with the USB CDC peripheral.  This one requires more HAL hacking than the UART.

    Deets

    The Blue Pill has a USB device peripheral that, up to this point, we have been using only for power, but I do want to make it a serial port that can be used for making settings.  As before, I want to abstract that serial port behind the stream interface that was built up in the prior post.

    With the UART, the HAL interface had some awkwardness that was worked around in user code.  In this case, though, the USB CDC driver has greater deficiencies, and we have to make modifications in the library code itself.  This has consequences:  modifications to any code outside of the 'USER CODE BEGIN ...' and 'USER CODE END ...' markers will be overwritten each time we re-run CubeMX.  The project is already exposed to this with the alternative heap implementation, so I created a batch file that restores these various fixups after running CubeMX.

    The major sticking point in this case with the USB is that there is no way of knowing when a transmission has completed.  We need that so that we can continue to feed the transmission with data from our circular buffers until completed.  We had some callbacks in the case of the UART, but nothing of the sort in the case of USB.  So we create some of our own.

    The first surgery is to:
    Middlewares/ST/STM32_USB_Device_Library/Class/CDC/Inc/usbd_cdc.h

    In this case, we add a new method, TxComplete, that we will use to receive notification that a transmission has completed.  This addition is put in the structure that is defined around line 100:

    typedef struct _USBD_CDC_Itf
    {
      int8_t (* Init)(void);
      int8_t (* DeInit)(void);
      int8_t (* Control)(uint8_t cmd, uint8_t *pbuf, uint16_t length);
      int8_t (* Receive)(uint8_t *Buf, uint32_t *Len);
    /* USER CODE BEGIN MyCDCExt */
      void (* TxComplete)       (uint8_t *, uint32_t );
    /* USER CODE END MyCDCExt */
    
    } USBD_CDC_ItfTypeDef;

    Note I made up my own 'USER CODE BEGIN MyCDCExt'.  This is purely for my eyeballs, as these are /not/ honored by CubeMX.  It seems CubeMX has an internal, hard-coded, set of tags and it disregards all others.
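
    As a rough illustration of why a completion callback matters, here is a desktop-runnable sketch of a callback with the same shape as TxComplete that queues the next packet each time the previous one finishes.  Everything other than the callback signature is a hypothetical stand-in: TxFeed replaces the circular buffer, startPacket replaces the real low-level send call, and a packet size of 64 bytes is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_MAX 64   /* assumed full-speed packet size */

/* Hypothetical pending-data record; stands in for the circular buffer. */
typedef struct
{
	const uint8_t* pbyData;   /* bytes still to be sent */
	size_t nRemaining;
	int bBusy;                /* nonzero while a packet is in flight */
} TxFeed;

/* Hypothetical stand-in for the low-level "start sending" call; it only
   records the request so the sketch can run without hardware. */
static uint8_t g_abyLastPacket[PACKET_MAX];
static size_t g_nLastPacket = 0;
static size_t g_nPacketsStarted = 0;
static void startPacket(const uint8_t* pby, size_t nLen)
{
	memcpy(g_abyLastPacket, pby, nLen);
	g_nLastPacket = nLen;
	++g_nPacketsStarted;
}

/* Send the next chunk if there is one; otherwise mark the sender idle. */
static void feedNext(TxFeed* pf)
{
	assert(pf != NULL);
	size_t n = pf->nRemaining < PACKET_MAX ? pf->nRemaining : PACKET_MAX;
	if (0 == n)
	{
		pf->bBusy = 0;
		return;
	}
	pf->bBusy = 1;
	startPacket(pf->pbyData, n);
	pf->pbyData += n;
	pf->nRemaining -= n;
}

static TxFeed g_feed;

/* Same signature as the added TxComplete member: invoked when the
   previous packet has gone out, so the next one can be queued. */
static void onTxComplete(uint8_t* pbyBuf, uint32_t nLen)
{
	(void)pbyBuf;
	(void)nLen;
	feedNext(&g_feed);
}
```

    The first call to feedNext() starts the chain; after that, each completion notification pulls the next chunk until nothing remains and the sender goes idle.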

    While I was in this code, I also made a non-critical change a few lines down:

    typedef struct
    {
    /* USER CODE BEGIN MyCDCExt */
    //hack; this chip is FS only, so why do I want to waste 448 bytes?
    //  uint32_t data[CDC_DATA_HS_MAX_PACKET_SIZE/4];      /* Force 32bits alignment */
      uint32_t data[CDC_DATA_FS_MAX_PACKET_SIZE/4];      /* Force 32bits alignment */
    /* USER CODE END MyCDCExt */
      uint8_t  CmdOpCode;
      uint8_t  CmdLength;
      uint8_t  *RxBuffer;
      uint8_t  *TxBuffer;
      uint32_t RxLength;
      uint32_t TxLength;
    
      __IO uint32_t TxState;
      __IO uint32_t RxState;
    }
    USBD_CDC_HandleTypeDef;

    So, as noted, the out-of-box CDC always reserves some internal buffer as if for HS even if you are only supporting FS.  So that gained me another 448 bytes of RAM!
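
    The 448 figure is simply the difference between the two buffer sizes.  A quick check, assuming the HS and FS maximum packet sizes are 512 and 64 bytes as in ST's stock usbd_cdc.h (verify against your own copy); the struct names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values of CDC_DATA_HS_MAX_PACKET_SIZE and
   CDC_DATA_FS_MAX_PACKET_SIZE; check them in your usbd_cdc.h. */
#define HS_MAX_PACKET 512U
#define FS_MAX_PACKET 64U

/* Only the 'data' member differs between the two layouts. */
typedef struct { uint32_t data[HS_MAX_PACKET / 4]; } HsSized;
typedef struct { uint32_t data[FS_MAX_PACKET / 4]; } FsSized;

/* RAM recovered by sizing the buffer for FS instead of HS. */
static size_t bytesSaved(void)
{
	assert(sizeof(uint32_t) == 4);
	return sizeof(HsSized) - sizeof(FsSized);
}
```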

    For the last hack in this file, I add a function declaration:

    /* USER CODE BEGIN MyCDCExt */
    //hack to help remember to re-apply the hacks when code is regenerated.
    void XXX_USBCDC_PresenceHack ( void );
    /* USER CODE END MyCDCExt */

    This function does absolutely nothing, but I call it early in main().  The whole purpose is to cause the build to fail if I forget to apply these hacks again.  Failing to apply these hacks would otherwise build successfully, but simply not work, and I didn't want to be endlessly debugging a non-problem just because I forgot to run the script to apply the hacks.

    The implementation side of these hacks goes in two places.  One file is at:
    Middlewares/ST/STM32_USB_Device_Library/Class/CDC/Src/usbd_cdc.c

    Down at around line 677 is the function 'USBD_CDC_DataIn':

    static uint8_t  USBD_CDC_DataIn(USBD_HandleTypeDef *pdev, uint8_t epnum)
    {
      USBD_CDC_HandleTypeDef *hcdc = (USBD_CDC_HandleTypeDef *)pdev->pClassData;
      PCD_HandleTypeDef *hpcd = pdev->pData;
    
      if (pdev->pClassData...

  • Streams and Serial and HAL Workarounds, 001

    ziggurat29 08/06/2019 at 14:57 0 comments

    Summary

    A stream IO abstraction is produced and mated to the serial ports of the system.  Some peculiarities of the STM HAL implementation are worked around.

    Deets

    I generally like to abstract serial ports and other sequence-of-bytes-in-and-out into a stream IO interface, rather than call the underlying APIs directly.  Doing so decouples the component that is producing/consuming the data from the implementation of its source, and so it is easy to redirect the processing implementation to any pipe that implements the conformant interface.

    Abstraction

    The abstraction I define here is:

    #include <stddef.h>
    #include <stdint.h>
    
    #define TO_INFINITY 0xffffffff
    
    //These interface objects will typically be in read-only memory
    
    //IO stream abstraction; typically for serial ports
    //forward declaration, so the methods can refer to the type itself
    typedef struct IOStreamIF IOStreamIF;
    struct IOStreamIF
    {
    	//transmit methods; non-blocking
    	void (* _flushTransmit) ( const IOStreamIF* pthis );
    	size_t (* _transmitFree) ( const IOStreamIF* pthis );
    	size_t (* _transmit) ( const IOStreamIF* pthis, const void* pv, size_t nLen );
    
    	//receive methods; non-blocking
    	void (* _flushReceive) ( const IOStreamIF* pthis );
    	size_t (* _receiveAvailable) ( const IOStreamIF* pthis );
    	size_t (* _receive) ( const IOStreamIF* pthis, void* pv, const size_t nLen );
    
    	//transmit/receive methods; blocking
    	//0 on success, nRemaining on timeout (i.e. nLen - nProcessed)
    	int (* _transmitCompletely) ( const IOStreamIF* pthis, const void* pv, size_t nLen, uint32_t to );
    	int (* _receiveCompletely) ( const IOStreamIF* pthis, void* pv, const size_t nLen, uint32_t to );
    };
    

    This is in the style of C-as-a-better-C++, wherein I manhandle virtual functions and the 'this' pointer.  This project is principally C, but you could obviously redefine this in the C++ way for some added convenience, at the cost of limiting yourself to C++ usage.

    The non-blocking functions are intended to attempt to push in or pull out as much data as possible, but immediately return indicating how much actually was pushed or pulled.  The blocking functions are intended to spin in a loop until all the data provided or requested has been processed, subject to a timeout.  A special timeout of TO_INFINITY is defined that means wait forever for it to happen.
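
    The relationship between the two flavors can be sketched as follows: a blocking send that loops over a non-blocking one until everything is taken or the timeout expires.  This is only an illustration under assumptions.  The functions are free-standing rather than interface members, sinkTransmit is a made-up sink that accepts at most 4 bytes per call, and tickNow is a fake millisecond tick that advances on every read (on the target it might wrap HAL_GetTick()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TO_INFINITY 0xffffffff

/* Hypothetical non-blocking sink: takes at most 4 bytes per call, and
   nothing at all once its storage is full. */
static uint8_t g_abySink[64];
static size_t g_nSink = 0;
static size_t sinkTransmit(const void* pv, size_t nLen)
{
	const uint8_t* pby = (const uint8_t*)pv;
	size_t n = nLen < 4 ? nLen : 4;
	if (n > sizeof(g_abySink) - g_nSink)
		n = sizeof(g_abySink) - g_nSink;
	for (size_t i = 0; i < n; ++i)
		g_abySink[g_nSink++] = pby[i];
	return n;
}

/* Fake millisecond tick that advances on each read, so the sketch
   terminates when run on a desktop machine. */
static uint32_t g_nTick = 0;
static uint32_t tickNow(void) { return g_nTick++; }

/* Blocking send built on the non-blocking one.
   Returns 0 on success, or the number of bytes left unsent on timeout. */
static int transmitCompletely(const void* pv, size_t nLen, uint32_t to)
{
	assert(pv != NULL || 0 == nLen);
	const uint8_t* pby = (const uint8_t*)pv;
	uint32_t tStart = tickNow();
	while (nLen > 0)
	{
		size_t nDone = sinkTransmit(pby, nLen);
		pby += nDone;
		nLen -= nDone;
		/* unsigned subtraction copes with the tick counter wrapping */
		if (nLen > 0 && TO_INFINITY != to && (tickNow() - tStart) >= to)
			break;
	}
	return (int)nLen;
}
```

    Note that with TO_INFINITY and a sink that never drains, this loop really does wait forever, which is what the special value is defined to mean.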

    The _transmitFree() and _receiveAvailable() functions allow one to 'peek' to see if there is any room for sending or if there is anything to receive.

    Once a hardware resource is adapted to this interface, then anything that presumes this interface can be mixed-and-matched to any of those hardware resources.  In particular, the upcoming Monitor and GPS tasks will be stream oriented and bound to the USB CDC device and the USART1 device.  This can be extended to other concepts, like a network TCP/IP socket, and I have used it before for custom stuff like an Infrared serial link that demodulates the data stream in software.
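
    To show what 'adapting a resource' and 'presuming the interface' look like in practice, here is a small sketch using a cut-down, two-method version of the interface bound to an in-memory loopback.  MiniStreamIF, the loopback functions, and sendText are all hypothetical names for illustration; a real binding would wrap the UART or USB CDC instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down, hypothetical version of the interface with two methods. */
typedef struct MiniStreamIF MiniStreamIF;
struct MiniStreamIF
{
	size_t (* _transmit) ( const MiniStreamIF* pthis, const void* pv, size_t nLen );
	size_t (* _receive) ( const MiniStreamIF* pthis, void* pv, size_t nLen );
};

/* Loopback storage; there is one instance, so state is kept in statics. */
static char g_achLoop[32];
static size_t g_nLoop = 0;

/* Non-blocking: store what fits and report how much was taken. */
static size_t loopTransmit(const MiniStreamIF* pthis, const void* pv, size_t nLen)
{
	(void)pthis;
	size_t nFree = sizeof(g_achLoop) - g_nLoop;
	size_t n = nLen < nFree ? nLen : nFree;
	memcpy(&g_achLoop[g_nLoop], pv, n);
	g_nLoop += n;
	return n;
}

/* Non-blocking: hand back what is available, oldest bytes first. */
static size_t loopReceive(const MiniStreamIF* pthis, void* pv, size_t nLen)
{
	(void)pthis;
	size_t n = nLen < g_nLoop ? nLen : g_nLoop;
	memcpy(pv, g_achLoop, n);
	memmove(g_achLoop, &g_achLoop[n], g_nLoop - n);
	g_nLoop -= n;
	return n;
}

/* The interface object is const, so it can live in read-only memory. */
static const MiniStreamIF g_loopback = { loopTransmit, loopReceive };

/* Code written against the interface does not know what is behind it. */
static size_t sendText(const MiniStreamIF* pio, const char* psz)
{
	assert(pio != NULL && psz != NULL);
	return pio->_transmit(pio, psz, strlen(psz));
}
```

    Calling sendText ( &g_loopback, "hello" ) goes through the function pointer with the interface passed explicitly as 'this'; swapping in a different interface object redirects the same code to a different device.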

    Circular Buffers

    It's not part of the interface definition, and it's not required, but it is a reasonable assumption that there is some sort of buffer behind the concrete implementations.  I typically use circular buffers for the transmit and receive sides.  I have a few such implementations which have various trade-offs, but the one I use here looks like this:

    //the base type consists of indices, size, and optional debug members
    typedef struct circbuff_t circbuff_t;
    struct circbuff_t
    {
    	volatile unsigned int _nIdxRead;
    	volatile unsigned int _nLength;
    	const unsigned int _nSize;
    	const unsigned int _nTypeSize;
    #ifdef DEBUG
    	volatile unsigned int _nMaxLength;
    #endif
    };
    
    //the derived type consists of the base type, with the buffer following
    #define CIRCBUFTYPE(instance,type,size)	\
    typedef struct instance##_circbuff_t instance##_circbuff_t;	\
    struct instance##_circbuff_t	\
    {	\
    volatile circbuff_t _base;	\
    volatile uint8_t _abyBuffer[(size)*sizeof(type)];	\
    };
    
    //the instance data is initialized with some critical size params
    #define CIRCBUFINST(instance,type,size)	\
    instance##_circbuff_t instance =	\
    ...
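
    The listing above is cut off, so as a stand-alone illustration of the read-index-plus-length bookkeeping that the base type suggests, here is a minimal byte-oriented ring.  It is a sketch only, not the implementation used in this project: Ring, ringPush, and ringPop are hypothetical names, the capacity is fixed at 8, and interrupt safety is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8   /* arbitrary capacity for the sketch */

typedef struct
{
	unsigned int nIdxRead;   /* index of the oldest stored byte */
	unsigned int nLength;    /* number of bytes currently stored */
	uint8_t abyBuffer[RING_SIZE];
} Ring;

/* Append up to nLen bytes; returns how many actually fit. */
static size_t ringPush(Ring* pr, const uint8_t* pby, size_t nLen)
{
	assert(pr != NULL);
	size_t nDone = 0;
	while (nDone < nLen && pr->nLength < RING_SIZE)
	{
		/* the write position is derived from read index + length */
		unsigned int nIdxWrite = (pr->nIdxRead + pr->nLength) % RING_SIZE;
		pr->abyBuffer[nIdxWrite] = pby[nDone++];
		++pr->nLength;
	}
	return nDone;
}

/* Remove up to nLen bytes, oldest first; returns how many were there. */
static size_t ringPop(Ring* pr, uint8_t* pby, size_t nLen)
{
	assert(pr != NULL);
	size_t nDone = 0;
	while (nDone < nLen && pr->nLength > 0)
	{
		pby[nDone++] = pr->abyBuffer[pr->nIdxRead];
		pr->nIdxRead = (pr->nIdxRead + 1) % RING_SIZE;
		--pr->nLength;
	}
	return nDone;
}
```

    Keeping a length rather than a separate write index means 'full' and 'empty' are never ambiguous, at the cost of both producer and consumer touching the same length field.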