01/13/2017 at 22:37 •
This project won grand prize in the contest : http://hackaday.com/2017/01/13/1-kb-challenge-and-the-winners-are/
Thanks to Hackaday team for this contest and all the other participants for making tough competition.
01/04/2017 at 08:47 •
Here I want to document some of assembly language tricks I used in this project. People doing it every day for years will scoff, but the perhaps somebody may be interested. I'm going to target it for PIC18, but it should be applicable for other MCU platforms too.
Just quick vocabulary:
; denotes comment, ignored line
call is jump to subroutine with program counter (PC) saved on stack
jump is jump to subroutine with no PC saving
return is restitution of PC from stack
Prepare your variables
Sometimes you can gain some bytes of code memory if you properly arrange your variables. Some MCU platforms do have faster access for something like first 256B of RAM, or some access bank
Exploit reset values
Sometimes you don't need to explicitly define the register value, if it is switched into defined state by hardware after reset and you know it applies to your conditions.
My code needs to wipe out random values from buffers from RAM and enter them into defined state (0x00). Because my working variables and display/input buffers are located at the beginning of RAM, I can wipe it out by setting data pointer at zero and write zero/increment pointer 256 times in single loop.
Setting the data pointer takes two instruction words when using instruction LFSR, which properly loads all 12 bits of data pointer. But datasheet states the upper nibble of the data pointer is zeroed after reset
Unfortunately, reset value of lower byte is undefined, so I have to force it zero. Still, CLRF instruction (clear single register) takes one byte versus two bytes of LFSR. In subsequent code I employ LFSR to properly initialize data pointers, though.
Let's have subroutine that prints ASCII character on your display
print_ascii: ;assume it's called with value in W register ;do this and that return
Now you may want to have subroutine that prints unpacked BCD (binary 0..9) on the display. You can do it like this
print_binary: addlw '0' ;adjust the W value into ASCII 0..9 call print_ascii return print_ascii: ;assume it's called with value in W register ;do this and that return
Which would definitely work, but you can omit all the cal/return thing by having subroutine with two entry points, or having fall-through to function.
print_binary: addlw '0' ;adjust the W value into ASCII 0..9 ;and now fall through the ASCII print print_ascii: ;assume it's called with value in W register ;do this and that return
You call the print_binary subroutine, it does its job and returns via return instruction, which "belongs" to print_ascii subroutine.
Let's have subroutine which prints 3 consecutive characters from buffer, utilizing print_ascii subroutine as before. The simplest approach goes like this
print_three: ;prepare number 1 call print_ascii ;prepare number 2 call print_ascii ;prepare number 3 call print_ascii return
This would certainly work, but the last subroutine call is somehow pointless. First you call subroutine print_three (consumed one stack level), then you do this and that up to last print_ascii call, where you call it (consume one stack level), the function does what is expected, then it returns (decrease stack usage) just to return again (return stack to previous value). We can save one return by doing this:
print_three: ;prepare number 1 call print_ascii ;prepare number 2 call print_ascii ;prepare number 3 jump print_ascii
It works as before, but the last call (with stack increase) is replaced by plain jump. Now the function print_three will not return via "its own" return, but via return "belonging to" function print_ascii. One return spared. Also stack level spared (only at the single subroutine "call", not in the previous calls), what could be sometimes beneficial.
I used this one for hexadecimal number printing - in fact it is a bit convoluted use of fall-through. I could do my print_hex function like this
print_hex: ;extract low nibble of number call print_ascii ;extract high nibble of number call print_ascii return print_ascii: ;assume it's called with value in W register ;do this and that return
Instead, I did it like this
print_hex: ;extract low nibble of number call print_ascii ;extract high nibble of number ;fall-through print_ascii: ;assume it's called with value in W register ;do this and that return
It takes low nibble, calls subroutine print_ascii and returns back to extract high nibble of the number, with the number in working register it falls-through to print_ascii again and now its return instruction will set program counter back from where print_hex was called. One call and return saved.
Use relative branching if possible
Many MCU platforms do have option of relative branching, as opposed to absolute jumps. Relative branches usually take less program space, but have limited range. One can take advantage of this for local, related functions, like the display functions I mentioned before. For example, PIC18 has relative branch range +-1023, which is perfect fit for 1kB contest.
01/04/2017 at 06:39 •
Now I'll check selected points from contest rules
- Contest entry deadline is Thursday, January 5, 2017 09:00 pm PST (+8 UTC)
January 4th now, so probably OK
- Projects must use 1 kB or less of code, including any initialized data tables, bootloaders, and executable code.
My code left 10B unused out of 1024B quote (here is complete program image)
though it touches 5B of configuration bytes. I believe those are to be added to program FLASH consumption, as it is part of MCU initialization. Microchip MPLABX counts config bytes to FLASH consumption, displaying expected 1019B overall consumed.
Listing from gpasm apparently takes all config bytes as 16-bit values, indicating 1023 consumed, what is still OK.
Program Memory Bytes Used: 1023 Program Memory Bytes Free: 64513
There is some ambiguity whether and how to count the config bytes/fuses whatever you call it, but I tried to look at it from more points of view and it looks I'm under 1024B.
Notice there is single define which can include sample BF and input string into buffers. This code exceeds 1024B limit by a few dozens of bytes and isn't needed for Brainf*cktor to run - just for users convenience. Intended contest setting is with demo program off, so you can start with empty buffers, just like in the video.
- Mask Rom: If you are using a chip with masked rom code, you must include the size of any ROM routines you call in your final executable size calculation. If you wanted to write a program in SPIN for the Parallax Propeller, you’d need to include the SPIN interpreter ROM, which would put you over the 1 kB limit.
- Projects must be open source. Source code, schematics, and board layouts for the project software must be either posted in the files section, or include a link to a public repository such as Github.
Project is hosted on github as well as in files section. The github is probably more complete and up-to-date.
It contains source files for the PIC, as well as binaries. PCB files are in Eagle format, main board and keyboard being in limit of free Eagle license. PDF file with schematics is included too.
- Projects must use publicly accessible toolchains. Closed source toolchains (example - IAR or Keil) can be used as long as the demo versions will compile/assemble the contest code.
For sake of completeness, the source code is distributed as MPLABX project, being available for free with no registration/activation/whatever (just download, install and run) on Windows, Mac and Linux.
If you don't like this option, you can use open-source tools. Just copy out the main.asm source and let the gpasm (part of gputils) do the job
gpasm -p p18f26k22 main.asmIt should compile the same binary as the one generated by MPLABX. Then you can use another open-source tool to flash the binary into PIC MCU, PIC18F26K22 is among the supported types.
01/04/2017 at 00:58 •
The user interface of Brainf*cktor may not be obvious for totally unconcerned person, so here is short help, partially for my back reference too.
Display is divided into three parts; first row showing the actual program. The whole buffer is divided into pages of 16 bytes, and only one page is shown. The actual page number is shown in hexadecimal format next to the P symbol after the program buffer. When the interpreter finds non-BF symbol, it is ignored.
Second row is similar to the first row, but displaying the input buffer. It is the buffer of characters the interpreter will use whenever asked for data input, byte after byte. When the interpreter can't take any more input, it will finish its execution. When the BF program doesn't use the , symbol, input buffer is not touched at all and can be left empty.
Third line, output buffer shows the last 16 characters outputted from BF interpreter. It is FIFO buffer.
The program buffer or input buffer is editable. User can edit character under cursor, shown as inverted character. Here the cursor is on program buffer, ready to edit first character.
Alternating between editing the program or input buffer is done via dedicated key.
Keyboard consists of 8 keys. The left red key allows to switch between editing of program or input buffer.
Right red key allows running the program; while running the program, red LED under the display is on. The remaining 6 keys do the actual editing action. Green keys allow do decrement or increment the cursor (pointer) position, blue keys change the actual value by one position in ASCII table, while the yellow ones jump by 8 values. By default the values are at 0, so to enter A (=0x41 hex) it is required to press 8x (8x8=64=0x40) top yellow (V+8) and once (1+64=65=0x41) top blue (V+1) key. To enter dot on following buffer location, it is needed to press top green (move to next character), then 6 x top yellow and twice bottom blue (6x8 - 2 = 46 = 0x2E).
01/03/2017 at 23:59 •
I spent nice evening with soldering iron to convert the PCBs from previous log into something that is not piece of protoboard.
The main board is heart of the Brainf*cktor and carries all the semiconductor devices.
The keyboard board is there to define positions of keys and contains keys series resistors too.
Both of them plus battery are fitted into enclosure made of doublesided PCB parts.
The PCBs are held in place by a few M3 spacers and bolts
The enclosure consists of milled PCB parts and lots of leaded solder
Some more details
and power switch
And when all the parts are together
I took a short video of me, entering simple program to echo single byte entered by user into output buffer
As you can see, the user IO of this device is rather spartan, but hey, this is BF machine.
01/03/2017 at 11:53 •
If had more time for PCB production, I'd send it to itead or oshpark, but unfortunately this wasn't the case. On the other hand, friend of mine has CNC milling machine, which can be used to mill out the PCBs. This process isn't that fine as th usual 0,15/0,15mm rules, so I had to rework my PCBs to fit his requirements. What more, single-sided PCBs are somehow better for CNC manufacturing than double-sided, so I had to take this aspect into account and design single sided PCBs. After a good hour of optimizing things, I got this
Blue lines are copper traces on bottom side, red ones are projected to be wire links on top side. Keyboard PCB was un-double-sided too
And I also designed FR4 enclosure for the thing
Great thing about home-made PCBs is the development cycle speed. Next day I had my PCBs prepared - the enclosure
as well as the "functional" PCBs
The PCBs look a bit dull on the photo, as I brushed it and covered with solution of colophony with acetone, to stop it from oxidizing and help with soldering. In reality it looks better.Now comes the soldering job, wait for next project log.
12/27/2016 at 07:43 •
I finally implemented the "methods and apparatus" from my previous log into reality. It was a bit more "code expensive" than I anticipated due to fact the LCD is divided into two halves with two separate enable pins, what is a bit of PITA if you are trying to squeeze it into as small space as possible.
Nonetheless it works and now my BF interpreter uses really only 1kB of ROM (FLASH, EEPROM, whatever) along MCU and all its peripherals.
I'm working on PCB for this one, planning some demonstration video and some how-to, as folks who have no idea how this could work... probably have no idea how this could work.
12/07/2016 at 13:37 •
Adam Fabio pointed out there could be a problem accepting 1kB challenge entry with HD44780 LCD, containing fair bit of character generator ROM. Though I have no problem with the HD44780, I'll try to rework my project to not use HD44780 or similar LCD driver with built-in character set - so, I have to use graphical LCD and construct character bitmaps from resources on my 1kB constrained FLASH.
I removed demo BF programs from FLASH and its copy routines, so I was able to obtain approximately 170 bytes of FLASH. But that seems too little for ASCII table character bitmaps - considering the usual font 5x7
I'm kinda stuck, as every single character needs 5 bytes of FLASH and there is at least 96 of them, and 480 Bytes just for character bitmaps is way too much for this project. But I'm too stubborn to give up the display output.
I could strip down the ASCII subset, but still, 170 Bytes is just 34 characters, not enough for alphabet and numbers, not to mention special symbols, so much used in BF. 64 or at least 56 characters would be better. That's still too much of FLASH consumed.
I noticed that having full 8 bits for one character btmap column is too much luxury. That is 256 combinations and many (most?) of them are simply unused. I decided to create the 16 most versatile combinations of pixels (vectors), so I can encode two columns into single byte. If I'd decrease symbol width to 4 columns (4x8 pixels characters) I can encode whole character into two bytes + 16 bytes of column vectors. For 64 characters it is 128 Bytes for character generator and 16 Bytes for vectors, consuming 144 Bytes. Much better!
I started by drawing vectors I would expect to appear most frequently.
Yes, it's really calc from LibreOffice. At first I thought I'm going to write script to generate the symbols from given input tables, but this was actually more straightforward and simpler.
The first eight rows are just combinations of black dots at lines 1,4 and 7, the rest of vectors are the ones I expected to be common for alphanumeric symbols. I did my best to construct character table from it, though looking a bit weird.
Some characters are rather hard to read, and I noticed vectors 4 and 14 are unused - wooohoo, two bytes for optimizations.
I replaced vector 14 with something more usable and character set looks a bit better now
Yet, still something to improve. I was able to free another vector (13) by optimizations of vector 14, along with vector 4 it's another two vectors to change and implement into character maps
Notice how numeric symbols changed, from the first iteration, along with letters like Z or G. I also introduced more ASCII symbols.
In the meantime I ordered 122x32 pixel LCD with PT6520 driver and I'm building new PIC board, to test out the character set.
12/01/2016 at 18:57 •
I spent some more time with this project, designing PCBs this time. I decided to use two boards, one for keys and another one for the display and MCU, in order to allow proper positioning of PCBs against front panel of enclosure.
I tried to keep the PCBs in BF language esprit and avoided too much straight lines. Unlike most of my designs, all components will the through-hole, for more fun.
12/01/2016 at 09:35 •
Yesterday I sat down to this project and wrote the firmware. At first I wanted to write just the interpreter alone and leave the display, keyboard and UI tasks for later, but I got so immersed into it I've done it during one long night. Coding in assembly language is great fun.
At first I programmed the interpreter alone - the interpreter was debugged just in MPLABX simulator, not running on real hardware at all. Along with a short demonstration program it fit exactly into 256B of code, so I felt confident to continue. LCD support took another 100B, most of the code lies in editor and UI. It's all about 600 lines of code.
You can enter both the code and data input (editor has two modes - you can work with full ASCII charset or just 8 BF symbols) into BF interpreter - in "Hello world" the input obviously isn't used and interpreter doesn't fetch a single byte from it.
Including two sample BF programs - the code size is few bytes short of 1024B limit, so I have still plenty of room for improvements ;-)
More details soon.