I noticed that the CPU usage felt a little high on my Lovett framework for a fairly simple operation, so I took a new rabbit hole plunge. One of the best tools for getting detailed performance data on a Linux system is the perf tool. It turns out, however, that the driver for my graphics fb requires a newer kernel than the base Raspbian system provides. Installing the driver automatically installs the latest kernel version. That would be OK, except that the kernel install doesn't come with a paired version of perf. So if you want perf, it looks like the only way is to compile it from scratch.
So I downloaded the kernel source for the same version, hopped over to tools/perf, and looked at what it would take to get it up and running. The make utility is fairly helpful here, letting you know which packages you probably need to install. Unfortunately, I ran into a couple of gotchas.
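Since perf has to be built from the source of the kernel you are actually running, it helps to check the running version first. Here is a minimal sketch of that check in Python. The helper name parse_kernel_release is my own invention for illustration, and the version string in the comment is only an example of the format, not the version I was running.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple:
    """Pull (major, minor, patch) out of a kernel release string.

    Hypothetical helper for illustration. A release string looks
    something like "5.4.51-v7+"; a missing patch number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: " + release)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    release = platform.release()
    print(release, parse_kernel_release(release))
```

The numeric part is what you match against the kernel source you download. The suffix after it varies by distribution and board.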
1) It looks like some code expects a different parameter type than would normally occur in an arm32 compile. A bit of searching turned up a helpful patch: https://lkml.org/lkml/2020/5/11/1474 ... with that in place all the sources were able to compile.
2) Even though it compiled, the util was crashing on me before it could do much. That led down another rabbit hole, but I eventually found an option that worked after taking a general glance at this article: http://web.eece.maine.edu/~vweaver/projects/perf_events/rasp-pi/paradis_ece599.pdf
3) Equipped with
from that article I was able to generate the first round of data. Next, though, I could not generate a report. I was getting a number of crashes. The first one involved libssl. So I went through and did a bunch of Raspbian updates, including installing a different libssl version. That triggered a whole lot more updates. At some point one of those updates appeared to have borked my boot options. So I reinstalled the driver, which caused a new kernel version to get installed. That then required me to go back and repeat steps 1 and 2 above. This time the report crashed on libunwind. So I rebuilt the perf util with
This finally succeeded and the report was now readable.
The result: I found a couple of places to tweak. The first was opting for some integer math over floating point. The second was finally noticing the area of the code that was causing some object to get reinstantiated, with a fairly large CPU penalty. Fingers crossed, I'll be able to make the font rendering and updating code much more efficient over the next week.
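To give a rough idea of the two kinds of tweaks, here is a minimal sketch in Python. None of this is the actual Lovett code: scale_px, Glyph, and get_glyph are hypothetical names made up for illustration, and the cache size is an arbitrary assumption.

```python
from functools import lru_cache


def scale_px(value: int, numerator: int, denominator: int) -> int:
    """Scale a non-negative pixel value by numerator/denominator.

    Uses only integer operations and rounds to the nearest integer,
    instead of multiplying by a float and calling round().
    """
    return (value * numerator + denominator // 2) // denominator


class Glyph:
    """Stand-in for an object that is expensive to construct."""

    instances_built = 0  # counts how many times the constructor ran

    def __init__(self, char: str, size: int):
        Glyph.instances_built += 1
        self.char = char
        self.size = size


@lru_cache(maxsize=256)  # arbitrary size, chosen only for this sketch
def get_glyph(char: str, size: int) -> Glyph:
    """Return a cached Glyph rather than building a new one per call."""
    return Glyph(char, size)
```

With this in place, calling get_glyph("a", 12) repeatedly hands back the same object, so the constructor cost is paid once per character and size instead of on every update.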