Last time I implemented the pew library for this TFT display using CircuitPython's displayio, and it it was fast enough, only had very noticeable tearing. So I thought I will need to either blast the data directly over SPI to it, or even write that part in C to be fast. But today I thought: wait, what if I just used the Stage library that I already use for those games? Would it be fast enough? Turns out that it totally is!
I basically did the simplest thing that could possibly work: loaded a set of tiles from the Jumper Wire game, defined a grid for it, and in the show function I'm updating that grid and then rendering the whole screen. And it works very well, no noticeable tearing! And you know what is cool about this approach? You can have a 16x256 image that defines the tiles, drawn specifically for the given game. So for Tetris you would have just boxes (maybe with some 3D effect and/or shading), for snake you would have the head, body parts and apples, for Sokoban walls, crates and a worker, etc. — so the game would still use the simplified 8x8 display, but now the different "colors" would be actual images.
I am not yet entirely sure how I want to do it, and I might still go for a custom C function for drawing this, for example to make the blocks 20x16 instead of 16x16 to fill the whole screen, but we will see.