Close

High-resolution timestamps

A project log for n00n - Real Time Music Sensor Streaming Protocol

MIDI is so outdated, welcome to the 20s and the 16-bit world !

yann-guidon-ygdesYann Guidon / YGDES 06/20/2023 at 15:260 Comments

One quite unusual aspect of the protocol is the use of a power-of-two resolution timestamp.

Usually one would use a 10MHz-derived signal, and end up with powers of ten, but that doesn't fit well with a 16-bit field that is so characteristic of the format. Using 65536Hz looks like a natural choice and the trick is that the timestamp can be simply set to null (MSB set, others cleared) in case the device does not implement a timestamp (in that case, its stream could be mixed with other timestamped streams which will fill the empty field). The unusually high resolution of the timestamp helps with dealing with multiple data sources sampled at irregular intervals or with wildly different rates, so this relaxes the requirement of a tightly coupled system, making it much more resilient. You could sample something at 37Hz and something else at 1337Hz, who cares.

A 65536Hz clock or timebase generator is ... rare. However : stable, cheap, available 32768Hz sources are made for watch and timekeeping purposes. Many Dallas Semiconductor chips, marketed as clocks and calendar with integrated temperature compensated oscillators (such as DS3231) provide a digital 32768Hz output signal for auxiliary use. Thus there is no need of a PLL : if the duty cycle is 50% then every transition creates a rather good 65536Hz event : a FPGA can detect a level change and update an internal counter for example.

The log Integrated 32KHz clock source shows the use of the DS32KHZ integrated oscillator and how easy it is (more details at #ScoPower ). Thus an autonomous device can generate a very precise timestamp signal to synchronise the whole setup, at clock stability and resolution.

Of course, not all devices need to follow such a high precision, because accuracy, precision and resolution are not the same things. The format simply provides headroom and more or less bits can be exploited. Just be careful of the jitter. There is also significant overhead and lag all over the system : data processing, buffers, transmissions... Small packets get emitted faster and more often so they suffer less jitter. It also depends on when you count a packet as emitted or received : when the header is detected, or when the whole packet is received ?

The main timestamp is emitted at at least 1Hz. It can be increased to 2, 4, 8Hz or whatever but this increases the overhead over the links so it's always a matter of compromises. The timestamp generator is considered lost if 3 consecutive packets are not received as expected (then the local generator takes over the chain). A downtime of 1 second is the target for maximal recovery time in case of failure so 4Hz or 8Hz is a reasonable compromise. Each device in a ring can take over as a master clock in case it doesn't receive it but they also add their own timestamp down the chain, increasing the resolution for downstream devices. For a star network, the global timestamp must be broadcast : easy for Ethernet/UDP/IP, less so for USB but doable.

So what matters most is that each device has a pretty accurate local clock with a high-resolution timer (multi-MHz ?) to create a sort of digital PLL in phase with the received global clock reference. A first-order filter is necessary for basic operation, maybe a 2nd order error filter will help in case of a loss of the main signal during more than 10 seconds. When a device takes over, the PLL parameters are slowly adjusted to reach a fixed model-dependent value that creates a "known-good 1Hz" period.

Discussions