The chContextTime program is provided to measure the context switch time. Using Arduino, I compiled, loaded, and ran chContextTime, then measured how long the LED pin stayed high. One pulse is the LED being turned on and off without a context switch; the other includes a context switch. (Look at the code for this program - sketch if you will - and its operation will be obvious.) Using the cursors on my Rigol DS1074Z, I measured 124 ns without a context switch and 816 ns with one, so the context switch took 816 - 124 = 692 ns. I estimate the confidence band to be +/- 10 ns. That is much quicker than the 15 to 16 us on an AVR ATmega328 (Arduino UNO). And of course it should be, since the Teensy 3.6 uses an ARM Cortex-M4 running at 180 MHz.
I also tested chDataSharing, which ran as expected. It deserves more discussion since it reports stack usage per task - a useful debugging tool.
EDIT 1/5/17: Screenshots from my DS1074Z follow showing the cursor measurements. Interestingly, these were made after the revisions I made to properly handle floating point. As I mentioned in the edited version of my first log, the push and pop of the floating point registers had to be added. You will note that the pulse with the context switch is now 1.000 us, up from 816 ns. (The pulse without the context switch measures 120 ns now - within the +/- 10 ns error band of the earlier 124 ns.) So the context switch itself measures 1000 - 120 = 880 ns; up from 692 ns, but still toasty!
The above is for the narrow pulse (no context switch). Look at BX - AX for the pulse width. Below is the wide pulse measurement (with the context switch). Read the same BX - AX value.