ThunderScope

I started this project sometime in 2018 and have been working on it ever since. From the very beginning, I've planned to release this project as open source, but fell prey to perhaps the most classic excuse open source has to offer: "I'll release it when I'm done". And so, the project moved forward, past various milestones of done-ness. And my fear of showing not just my work, but the (sometimes flawed, and always janky) process behind it kept me making the same excuse. In doing so, I've missed out on the input of the open-source community that I've spent so long lurking in, spent nights banging my head against problems that could have been spotted earlier, and slowed down the project as a whole.

"The best time to open source your project was when you started it, the second best time is now"

The project is now in a near-completed state and is released as open-source on GitHub under an MIT license. I will be making a series of project posts here detailing all the failures, fixes, and lessons learned in chronological order. I look back to when I was first learning about hardware through following open source projects and although I could learn a bit from finished layouts and schematics, the most I've learned is from blog posts and project logs that describe the problems faced and how they were solved. I wish to do the same for those just starting out in this amazing field, and hopefully also release an excellent oscilloscope for them to use in their electronics journey! If you're interested, sign up at Crowd Supply to be notified when the campaign starts!

Project Logs

Collapse

FPGA Module: Extreme Artix Optimization
Aleksa • 04/25/2022 at 02:33 • 1 comment

It's been a while since I posted one of these! I've got a few days before another board comes in so I figured I'd post a log before I disappear into my lab once again. Hardware-wise, we left off after the main board was finished. This board required a third-party FPGA module, which had a beefy 100k logic element Artix-7 part as the star of the show, costarring two x16 DDR3 memory chips.

But wouldn't it look better if it was all purple? The next step was to build my own FPGA module, tailored specifically to this project.
Read more »
Demo Video!
Aleksa • 11/07/2021 at 00:01 • 0 comments

Github Repo:
https://github.com/EEVengers/ThunderS...

Discord Server:
https://discord.gg/pds7k3WrpK
Crowd Supply:
https://www.crowdsupply.com/eevengers/thunderscope
Software Part 2: Electron, Redux and React
Aleksa • 10/30/2021 at 19:12 • 1 comment

Despite the name of this project log, we aren't talking about chemistry! Instead, I welcome back my friend Andrew, who I now owe a couple pounds of chicken wings for recounting the war stories behind the software of this project!

We’re coming off the tail end of a lot of hardware, and some software sprinkled in as of the earlier post. Well my friends, this is it, we’re walking down from the top of Mount Doom, hopefully to the sound of cheering crowds as we wrap up this tale. Let ye who care not for the struggles of the software part of “software defined oscilloscope” exit the room now. No, seriously, this is your easy out. I’m not watching. Go on. Still here? Okay.

Let’s get right to the most unceremonious point, since I’m sure this alone is plenty to cause us to be branded Servants of Sauron. The desktop application, and the GUI, is an Electron app. I know, I know, but hear me out. For context, Electron is the framework and set of tools that runs some of the most commonly used apps on your computer. Things like Spotify and Slack run on Electron. It is very commonly used and often gets a bad rep because of various things like performance, security, and the apps just not feeling like good citizens of their respective platforms.

All of these things can be true. Electron is effectively a Chrome window, running a website, with some native OS integrations for Windows, macOS and Linux in general. It also provides a way for a web app to have deeper integrations with some core OS functions we alluded to earlier, such as the unix sockets/windows named pipes system. Chrome is famously known for not being light on memory, this much is true, but it has gotten significantly better over the last few years and continues to be so. Much the same can be said for security, between Chrome improvements that get upstreamed to Chromium and Electron specific hardening, poor security in an electron app is now often just developer oversight. The most pertinent point is the good citizenry of the app on its platform. Famously, people on Mac expect such an app to behave a certain way. Windows is much the same, though the visual design language is not as clearly policed, many of the behaviours are. Linux is actually the easiest since clear definitions don’t really exist, this has led to, funny enough, the Linux community being some of the largest acceptors of Electron apps. After all, they get apps they may otherwise not get at all.

As much as I would love to write a book containing my thoughts on Electron, I am afraid that’s not what this blog calls for. So, in quick summary, why Electron for us, a high speed, performance sensitive application? I will note this, none on the team were web developers prior to starting. It is very often the case that when web developers or designers switch over to application development, they will use Electron in order to leverage their already existing skills. This is good, mind you, but this was not the case for us. We needed an easy way to create a cross platform application that could meet our requirements. In trying to find the best solution, I discovered two facts. Fact the first, many other high speed applications are beginning to leverage Electron. Fact the second, finding out that integration with native code on the Electron side is not nearly as prohibitive as I initially thought. So, twas on a faithful Noon when I suggested to our usual writer, Aleksa, that we should give Electron a whirl. I got laughed at. Then the comically necessary “Oh wait no you’re serious”. I got to work, making us a template to start from and proving the concept. That’s how we ended up here.
Read more »
Software Part 1: HDL, Drivers and Processing
Aleksa • 10/21/2021 at 01:07 • 0 comments

We've gone through a lot of hardware over these last 14 project logs! Before we leave the hardware hobbit hole to venture to software mount doom, let's take a look at the map of middle earth that is the block diagram of the whole system.

The first block we will tackle is the FPGA. The general structure is quite similar to the last design, there is ADC data coming in which gets de-serialized by the SERDES and placed into a FIFO, as well as scope control commands which are sent from the user's PC to be converted to SPI and I2C traffic. Since we don't have external USB ICs doing the work of connecting to the user's PC, this next part of the FPGA design is a little different.

There is still a low speed and a high speed path, but instead of coming from two separate ICs, both are handled by the PCIe IP. The low speed path uses the AXI Lite interface, which goes to the AXI_LITE_IO block to either fill a FIFO which supplies the serial interface block or to control GPIO which read from the other FPGA blocks or write values to the rest of the board. On the high speed path, the datamover takes sample data out of the ADC FIFO and writes it to the DDR3 memory through an AXI4 interface and the PCIe IP uses another AXI4 interface to read the sample data from the DDR3 memory. The reads and writes to the DDR3 memory from the AXI4 interfaces are manged by the memory interface generator. The memory here serves as a circular buffer, with the datamover always writing to it, and the PCIe IP always reading from it. Collision prevention is done in software on the PC, using GPIO data from the low speed path to determine if it is safe to initiate a read.
Read more »
ThunderScope 1000E Rev.1
Aleksa • 10/12/2021 at 00:27 • 4 comments

The time had come to make a new prototype, one with all the hardware needed to accomplish the goals of this project! The front end was well proven at this point, and just needed a slight shrink to fit under an off the shelf RF shield. The ADC had always behaved well during my tests, but it needed a new (and untested) clock generator since the one I had prototyped with wasn't suited for it. Most disturbing of all, I needed to design with an Artix-7 FPGA and DDR3 RAM in BGA packages for the first time.

Tackling that last point first, I saw way too much risk in putting these BGA parts down on one board that I hand stencil and reflow solder on a hot plate. Not just that, but I only had three months until I had to submit this project to graduate my electrical engineering program and had no experience working with DDR3 nor even large BGA packages. I committed to learning these skills for the next revision, but had to find something to tide me over in a hurry.

Enter, the TE0712-02 FPGA module. This bad boy had two DDR3 ICs, the second largest Artix-7 part, and only needed a 3.3V rail to operate. As my favorite circuits professor put it, "Simplicity itself".

Read more »
Designing and Testing a 1 GHz PLL
Aleksa • 10/06/2021 at 23:04 • 0 comments

Now that I knew that the throughput to the PC could match the ADC’s rated sample rate of 1 GS/s, I had to make a circuit that clocked the ADC at that rate as well. This circuit needed to output at 1 GHz with very low jitter, as any jitter on the ADC sample clock will turn into noise during the conversion process.

The heart of the clock generation circuit is the phase locked loop (PLL). Without getting into too much detail, the PLL compares the phase of a low frequency reference (generally from a crystal oscillator) with a divided down copy of a high frequency that is generated by a voltage controlled oscillator (VCO), which it tunes until the two match. By changing the division settings any frequency can be synthesized, with the accuracy and jitter characteristics of the reference conferred onto the output.

Looking at the other scopes that use the same ADC, I found that many also used the ADF4360-7 in their clock generation circuit. I did some research on the part and it seemed to be the cheapest solution that would give me the 1 GHz output I needed. This chip had an integrated VCO, so the only other parts I needed were the reference oscillator and some passives. Saving me loads of digging into the datasheet, Analog Devices had a tool for calculating all the values of the passives as well as the register values to program for a given output frequency.

That sticky note yellow colour... The navy blue connections... That's not KiCad! It's true, it was at this point that I was offered an Altium license through my school. And with the size and scope of the next board already in mind, and the year of internships working with it, I decided to switch over. As for the design, I chose to use two 50Ω resistors (R5, R6) to bias the output as opposed to a more complicated matched network. The reference oscillator (Y1) was a 16 MHz crystal oscillator, which came temperature compensated for added frequency stability, and the LDO (U2) was a low noise part to avoid noise on the power rails affecting the performance of the circuit. Decoupling cap values were copied from the part's evaluation board and the rest of the passive values were taken from the design tool.

Pictured here, a 1 GHz postage stamp! I didn't have any decent way to test it on its own, so I hooked the SPI bus up to the rest of the oscilloscope prototype and updated the software to set all the registers on the chip at boot.

First I connected the RF output to a balun on a scrap ADC board to generate a single ended output that I could test on my spectrum analyzer. I then verified that it output at 1 GHz and used KE5FX's excellent GPIB toolkit to measure its phase noise performance against the simulation values from the tool as well as calculate total RMS jitter.

Here it is against my RF signal generator (in pink). The 100 Hz range was off, but the other ranges matched the simulations pretty well. The RMS jitter from 1.00kHz to 1.00MHz (didn't have a screenshot of this range, so the numbers are different here) was 760 fs vs. a simulated value of 580 fs. All of this looked promising, so I moved on to functional testing.

I hooked up the RF output into the ADC board through the two UFL connectors I included for differential inputs and updated the FPGA code to reflect the new clock rate. I then ran a quick capture to a CSV file, and the script hanged! That was odd, so I started debugging. Eventually, I found that the ADC wasn't outputting a clock at all! I looked through the clocking section of the ADC datasheet and this line jumped out at me:

"For differential sine wave clock input the amplitude must be at least ± 0.8 Vpp."

A quick trip to the dBm conversion table later, I found that I needed at least 2 dBm of output power. I had about -5 dBm! The matched output network I mentioned earlier would net me an output of -2 dBm according to the datasheet, which is still not up to spec.

My conclusion is that the circuit would probably work,...
Read more »
Mach 1 GB/s: Breaking the Throughput Barrier
Aleksa • 09/25/2021 at 20:31 • 0 comments

Now that the front end was in a satisfactory state, it was time to revisit the architecture of the digital interface. At this point it had been over a year since I designed that board. I chose a USB 3 Gen 1 interface capable of 400 MB/s (which proved to be 370 MB/s in practice) as a stopgap to develop on until a USB 3 Gen 2 chip was released that could match the 1 GB/s throughput of the raw ADC data. Unfortunately, the FX3G2 on Cypress's USB product roadmap failed to materialize, leaving me with few options.

I considered using the Cyclone 10 GX (which is the cheapest FPGA with the needed 10 Gb/s transceivers) with USB 3 Gen 2 IP, but even this couldn't reach 1 GB/s, topping out at 905 MB/s according to the vendor's product sheet. I considered PCIe, which is super common on FPGAs, with free IP and loads of vendor support! However, that would seem to limit this to desktops, since most people don't have PCIe slots on their laptops.

They did have the next best thing though! Thunderbolt 3 (and now USB 4 and Thunderbolt 4) supports up to four lanes of PCIe Gen 3 at a maximum throughput of 40 Gb/s. Perfect! Unfortunately, though the chips themselves are freely available on Mouser, the datasheets are not. I didn't worry about that yet, as I could prototype the system as if it was just a PCIe card by using an external GPU enclosure. This review and teardown really showcased how simple the extra Thunderbolt 3 circuitry was, so I didn't feel like it was a big stretch to incorporate it once the PCIe design was tried and true. I bought the enclosure and got to work finding a new FPGA to do all the PCIe magic.

I used this list of FPGA development boards to find the most affordable way to start prototyping with PCIe. This turned out to be the Litefury, an Artix-7 development board which appears to be a rebadged SQRL Acorn CLE-215+ (an FPGA cryptomining board). Although this board had the four lanes of PCIe I needed, it came in an M.2 form factor so it needed an adaptor. It didn't have a built in programmer either, so I used this one, which was the cheapest one that worked directly with Vivado (Xilinx's IDE for their FPGAs).

Shown above is the Vivado block diagram of the Litefury example design, this design allows DMA access from the PC to the onboard DDR3 memory and vice verse. I would use this to verify the transfer speeds when connected directly to a desktop PC compared to those through Thunderbolt when it was installed in the enclosure. I installed the XDMA drivers (which I had to enable test mode in Windows for, since the driver is unsigned) and ran a basic transfer with the maximum transfer size of 8 MB.

It took 7.072 milliseconds to receive 8 MB, which is just over 1.1 GB/s! Best of all, this number didn't budge when I tested it over Thunderbolt!

This inspired me to finally gave this project it's name: ThunderScope!

Follow this project to catch my next post on designing a 1 GHz PLL to take advantage of this blazing fast transfer rate, and then promptly learning my lesson about cribbing off the other oscilloscope manufacturers!

Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
Testing The New Front End Architecture
Aleksa • 09/16/2021 at 00:04 • 0 comments

It was time to see if the third time really was the charm and test the newest revision of the front end! The first task was to test the front of the front end (FFE) - the coupling circuit, attenuators and input buffer.

Look ma no probes! I started off by verifying the DC bias voltage at the output, which was just about the 2.5V I expected. The exact value of the bias voltage isn't important as it will be matched by the trimmer DAC once the channel is calibrated. I tested the AC coupling by adding a DC component to the signal, which caused no change to the DC voltage at the output. Next, I enabled DC coupling and confirmed that this DC component was now added to the bias voltage at the output. I then measured the DC gain, which was just under unity. After the coupling tests, I switched on the attenuator and was greeted with a flat output - no oscillations this time! I cranked my function generator to the highest voltage it could do, and lo and behold I could see the signal again, now attenuated by a factor of 100.

I then connected the FFE to the PGA and used the front end tester board to test the frequency response of the whole front end. I did this to avoid loading down the FFE’s buffer circuit with the high input capacitance (13 pF) of an oscilloscope input.

The frequency response certainly looked more promising than the previous attempts! The bandwidth was about 230 MHz, out of the 350 MHz promised by the simulations. This alone wouldn’t be too much of an issue if I scaled back the bandwidth requirement to 200 MHz. The real issue here is the flatness of the response, which is over +/- 0.5 dB when it should ideally be +/- 0.1dB. That means that on a scope with this front end, a 100 MHz clock would look 10% larger than a 32 MHz clock!

These peaks and valleys in the frequency response could have been caused by parasitics (unwanted inductance and capacitance) in the layouts of the two boards and in the connection between them. To reduce these parasitics and improve the bandwidth and flatness of the frequency response, I combined both FFE and PGA into one front end board, moving all the parts closer together to shrink the layout.

This new board improved the bandwidth to 260 MHz and the flatness to 0.25 dB. This was clearly a step in the right direction, but also showed that the likely culprits were the components on the board. I resolved to tweak the component values to improve the response later, but was satisfied enough to keep this design and continue on to a very exciting new development in this project - breaking the 1 GB/s barrier!

Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
A New Front End Architecture
Aleksa • 09/01/2021 at 23:59 • 0 comments

At this point, there was one big issue with the front end. The attenuators could not be switched in without causing the whole circuit to oscillate! This issue was compounded by the maximum 0.7 V output of the PGA as well as the massive cost of the design (three relays and an unobtainium opamp don't come cheap). Since I already had to use digital gain to boost the output of the PGA, I decided to remove the opamp gain stage present in the current front of front end (FFE) board and replace it with a unity gain (x1) buffer. Using a unity gain buffer would allow me to remove one of the attenuators, as it would not need to scale the input voltage just to gain it up anyway. I would also need to use an active level shifting circuit instead of the resistive divider to avoid losing half the signal shifting it up to a DC level of 2.5V. Below is the spreadsheet I used to plan out the attenuation and gain needed for all the voltage division settings.

Let's take a look at the schematic, starting from the input coupling and attenuation block. I chose to remove the 50Ω termination relay to lower cost per channel since this wasn't a feature often used or provided on entry level scopes like this one. The move to one attenuator also saved another relay's worth of materials cost, and I replaced the mechanical relay used for the coupling cap with a solid state relay (U2) to further reduce cost. The input coupling cap and its relay were moved from behind the attenuator to in front of it. This maintains consistent input impedance behavior in AC-coupled mode regardless of the attenuator state, as before it would go from infinite resistance at DC to the 1 MΩ impedance of the attenuator when the attenuator was switched on.

Taking inspiration from the example oscilloscope circuit on page 34 of the LMH6518 datasheet, I used a JFET (Q1) as an AC-coupled input buffer alongside a opamp (U1) to handle the DC portion of the signal while adding the 2.5V offset needed for the PGA input. A JFET was a great choice for a front end buffer since they have very high input impedance and contribute very little noise to the signal. I used a clever circuit from page 34 of Jim Williams' AN47 application note to automatically bias the JFET at IDSS. This point is defined as the current at which the voltage between the gate and source is zero, resulting in a gain of exactly one - great news for our buffer! The circuit works by having the opamp (U3) adjust the current through the JFET using the BJT (Q2) until the filtered DC voltage at the output is equal to the DC component of the input (generated by U1) which by the definition above results in IDSS!
Hopefully this mashup of two interesting circuits makes for a working front end! Join me in the next project log where I go through the testing and results for this board and talk about the next steps I took to perfect this design.

Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
How Are The First Few Bytes?: Full System Testing
Aleksa • 08/22/2021 at 19:53 • 0 comments

Now that the FPGA code was done, I could finally assemble and test the whole system. There were many untested blocks at this point, so each block was tested incrementally to pinpoint any issues. Once these incremental tests were done, the final test would be hooking up a signal to the front end and getting the sampled signal data back to the host PC.

The first of the incremental tests I did on the system was to turn a relay on in the front end. This would confirm that the FT2232 chip as well as the FT2 Read interface, FIFO and I2C FPGA blocks were working correctly. I figured out which bytes to send based off of the IO expander IC's datasheet and made a quick python script using pyserial to send the data (this interface on the FT2232 looks like a serial port to the PC). I executed the script and heard the clack of the relay on the front end board, it worked!

Next up, I would send a SPI command to the ADC to come out of power down mode. The ADC clock starts running when it goes into active mode, so I programmed the FPGA to blink the LEDs if it gets a clock from the ADC. This would confirm that the SPI FPGA block and ADC board worked. Some more datasheet searching and a new line of python later, I was greeted with a well-deserved light show from the (too-bright) LEDs on the digital interface board.

I tested the maximum transfer rate next. To do this, I lowered the clock generator's frequency from 400 MHz (theoretical maximum throughput of the FT601) down until the FIFO full flag (which I tied to an LED for this test) was not set while running transfers using FTDI's Data Streamer Application. This resulted in a consistent data throughput of 370 MB/s. This also verified that the FT6 Write block was initiating transfers correctly when the requests came in from the host PC.

Up to this point, I didn't check the actual data coming in, only that the transfers were happening. I enlisted the help of a more software-savvy classmate (this scope would become our capstone project in a later term) to modify the data streamer code to dump a csv file from the data received. I then set the ADC to output a ramp test pattern. Since this pattern was generated inside the ADC, it would test only the FPGA blocks and not the front end. I captured the data and got what i expected: a count up from 0 to 255 and back to 0, over and over again. I did a basic check through the file and found no missing counts, this meant the transfers were completing smoothly with no interruptions in the FIFO or in the USB interface.

Finally, I hooked up my function generator to the front end, got together the set of commands needed to start sampling and sent them to the ADC. This would be the final test, a real signal in and sampled data out.

WE HAVE A PULSE! IT LIVESSSS! I was very happy to see the whole system working, but it had a long ways to go to meet the goal of this project. First of all, the front end still only supported a select few voltage ranges since the attenuators didn’t work. Secondly, the ADC’s sample rate was limited to 370 MS/s (of the 1 GS/s it was capable of) by the FT601’s maximum sustained transfer rate of 370 MB/s. And of course, software needed to be made to stream, process and display the data in real time. In my next blog post, I’ll recount how I fixed the front end issues and lowered the system’s materials cost with a new architecture!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!