
FPGA Spectrum Engine

10,240 independent oscillators · 1-sample latency · 0.001 Hz resolution

10,240 independent oscillators on FPGA. 1-sample latency, 0.001 Hz resolution. A universal spectrum engine where FM, additive, and fractal synthesis are all special cases of bin parameter assignment.

What is this? / これは何か?

A real-time additive synthesis engine implemented on FPGA, running 10,240 independent sinusoidal oscillators simultaneously. The project ports a 2020 implementation on the Terasic C5G to the Terasic DE10-nano (Cyclone V SoC). The engine provides:

  • 1-sample output latency (~20 µs at 48 kHz)
  • 0.001 Hz frequency resolution per bin
  • Constant compute load — silence and a full orchestral scene require identical processing (~10 billion ops/sec)

FPGA(2020年にTerasic C5Gへの実装からTerasic DE10-nano, Cyclone V SoCへの移植プロジェクト)上に実装されたリアルタイム加算合成エンジン。10,240本の独立正弦波オシレータを同時駆動し、1サンプル遅延・ビンあたり0.001Hz分解能・常時一定の演算負荷(秒間約100億演算)を実現する。
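The 0.001 Hz and 1-sample figures can be related to simple arithmetic. This is a minimal sketch, assuming each bin is a conventional NCO whose phase accumulator advances once per 48 kHz sample. The accumulator width is not stated on this page, so `min_accumulator_bits` and `tuning_word` are hypothetical helpers rather than the project's register layout.

```python
import math

SAMPLE_RATE_HZ = 48_000        # from the hardware table
TARGET_RESOLUTION_HZ = 0.001   # from the headline figures

def min_accumulator_bits(fs_hz: float, resolution_hz: float) -> int:
    """Smallest N such that the NCO frequency step fs / 2**N is <= resolution."""
    return math.ceil(math.log2(fs_hz / resolution_hz))

def tuning_word(freq_hz: float, fs_hz: float, bits: int) -> int:
    """Phase increment per sample, in units of 1 / 2**bits of a cycle."""
    return round(freq_hz / fs_hz * (1 << bits))

bits = min_accumulator_bits(SAMPLE_RATE_HZ, TARGET_RESOLUTION_HZ)
step_hz = SAMPLE_RATE_HZ / (1 << bits)   # actual frequency step at that width
latency_us = 1e6 / SAMPLE_RATE_HZ        # one sample period in microseconds

print(f"accumulator bits needed: {bits}")
print(f"frequency step: {step_hz:.6f} Hz")
print(f"one-sample latency: {latency_us:.2f} us")
```

Under these assumptions a 26-bit accumulator is the narrowest that resolves 0.001 Hz, and one sample period is about 20.8 µs.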

The Core Insight / 核心的な洞察

Controlling only the bin parameters (frequency, amplitude, phase) of an iDFT engine is enough to express every known synthesis paradigm: FM, additive, polygonal, and spectral-fractal synthesis all become special cases on the same hardware.

iDFTエンジンのビンパラメータ(周波数・振幅・位相)の与え方だけで、あらゆる既知の合成パラダイムを包含できる——FM・加算・ポリゴナル・スペクトルフラクタルがすべて同一ハードウェアの特殊ケースとなる。

Synthesis Method | Bin Frequency | Bin Amplitude
FM Synthesis | ωc ± nωm | Jₙ(β)
Polygonal + Bessel | (kN+1)ω₀ − nωm | cₖ(N)·Jₙ(kNβ)
Geometric / Fractal | f₀·rᵏ | r^(−αk)
Cantor Spectrum | Cantor set positions | Cantor measure
1/fᵅ Noise | Log-spaced | fₖ^(−α/2)
Shepard Tone | f₀·2ᵏ | Bell envelope
Physical Model | Mode frequencies | Modal amplitudes
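
Two rows of the table can be written as plain bin-assignment functions. This is a minimal sketch, assuming a bin is just a (frequency, amplitude) pair summed by a software stand-in for the engine. The function names, the parameter values, and the Gaussian shape used for the "bell envelope" are illustrative assumptions, not taken from the project.

```python
import math

def geometric_bins(f0, r, alpha, count):
    """Geometric / Fractal row: frequency f0 * r**k, amplitude r**(-alpha * k)."""
    return [(f0 * r**k, r ** (-alpha * k)) for k in range(count)]

def shepard_bins(f0, octaves, center_octave, width):
    """Shepard Tone row: frequency f0 * 2**k under a bell-shaped envelope.

    The Gaussian over the octave index is an assumption; the table only
    says "Bell envelope".
    """
    return [
        (f0 * 2**k, math.exp(-0.5 * ((k - center_octave) / width) ** 2))
        for k in range(octaves)
    ]

def render_sample(bins, t):
    """Software stand-in for the engine: sum every bin at time t (phase 0)."""
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in bins)

fractal = geometric_bins(100.0, 2.0, 1.0, 3)
shepard = shepard_bins(27.5, 9, 4, 1.5)
print(fractal)
print(render_sample(shepard, 0.001))
```

Both recipes feed the same `render_sample`, which is the point of the table: only the parameter assignment changes between methods.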

The FM Connection / FMとの接続

The Chowning FM equation:

sin(ωct + β·sin ωmt) = Σ Jₙ(β)·sin[(ωc + nωm)t]

means FM sideband amplitudes are Bessel function values Jₙ(β). In this engine, those amplitudes are set directly — FM becomes one special case of a far more general spectral control system.

ChowningのFM方程式は、FM側波帯振幅がベッセル関数値Jₙ(β)であることを意味する。本エンジンではその振幅を直接設定できるため、FMはより汎用的なスペクトル制御系の特殊ケースとなる。
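
The identity can be checked numerically by building the sideband bins directly and comparing against the time-domain FM expression. This is a sketch using only the standard library; `bessel_j` evaluates Jₙ by its power series, and the carrier, modulator, and index values are arbitrary illustrative choices.

```python
import math

def bessel_j(n: int, x: float, terms: int = 30) -> float:
    """J_n(x) for integer n via its power series; J_{-n} = (-1)**n * J_n."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x, terms)
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + n)) * (x / 2) ** (2 * m + n)
        for m in range(terms)
    )

def fm_direct(t, fc, fm, beta):
    """Left-hand side: a frequency-modulated carrier."""
    return math.sin(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

def fm_bins(fc, fm, beta, max_order):
    """Right-hand side as bins: frequency fc + n*fm, amplitude J_n(beta)."""
    return [(fc + n * fm, bessel_j(n, beta)) for n in range(-max_order, max_order + 1)]

def fm_from_bins(t, bins):
    """Sum the sideband bins at time t."""
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in bins)

bins = fm_bins(440.0, 110.0, 2.0, 16)
for t in (0.0, 0.00123, 0.0071, 0.0137):
    print(t, fm_direct(t, 440.0, 110.0, 2.0), fm_from_bins(t, bins))
```

The sum runs over negative n as well, so some bins land at negative frequencies; they are kept as-is here because the identity holds term by term.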


Hardware / ハードウェア

Item | Spec
Board | Terasic DE10-nano (Cyclone V SoC 5CSEBA6U23I7)
FPGA Fabric | 110k LEs, 112 DSP blocks
ARM HPS | Dual Cortex-A9 @ 800 MHz
Sample Rate | 48 kHz
Bin Count | 10,240 (2,048 bins × 5 parallel modules)
Clock | 100 MHz per module
Output | 24-bit DAC via I²S
Control I/F | Gigabit Ethernet (UDP), AXI bridge ARM↔FPGA

The physical layer was first verified on a Terasic C5G board in autumn 2020 as an 80-voice polyphonic additive synthesizer (128 bins, MIDI-CC controlled from MAX8). That prototype validated the core compute architecture. A detailed account is in Build Log #2.

物理層は2020年秋、...


Launch C5G 2.png

However, the output was nothing more than a 1 kHz sine wave. I had braced myself for the worst as I released the reset, and the plain result left me feeling deflated. On reflection, though, it proved that the design was mathematically correct: I had been deflated by a success.

image/png - 3.17 MB - 04/25/2026 at 12:38


Launch C5G 1.png

November 2020: The FPGA implementation on the Terasic C5G board had been completed, and this was the moment just before the system was powered up to produce its first sound. I was filled with tension and dread, because the test configuration summed all 10,240 sine wave oscillators at maximum level.

Portable Network Graphics (PNG) - 3.08 MB - 04/25/2026 at 12:36


  • 1 × Terasic DE10-nano (Future models 2026.5)
  • 1 × Terasic C5G (Previous models 2020.11)

  • Build Log: A Spin-Off Is Born — Announcing the PTSG Project

    Tsuneo.Ohnaka · 16 hours ago · 0 comments

    Build Log:暖簾分けの誕生 — PTSG プロジェクトの発表

    TL;DR

    In the course of drafting Chapter 3 of the WPMS Synthesizer Layer 1 specification, an unexpected discovery surfaced: the control architecture being designed for WPMS turns out to be a general-purpose, ultra-lightweight programmable timing sequence generator with applications far beyond audio synthesis. We are spinning it off as an independent Open Prompt project: PTSG (Programmable Timing Sequence Generator). This Build Log explains what happened, why we made this decision, and what comes next.

    WPMS シンセサイザー第 1 層仕様の第 3 章起草の過程で、予期しない発見がありました:WPMS のために設計していた制御アーキテクチャが、音声合成を遥かに超えた応用範囲を持つ汎用かつ超軽量のプログラマブルタイミングシーケンス生成器であることが判明しました。これを独立した Open Prompt プロジェクトとして暖簾分けします:PTSG(Programmable Timing Sequence Generator)。本 Build Log では、何が起きたか、なぜこの決定をしたか、次に何が来るかを説明します。

    What happened during Chapter 3 dialogue / 第 3 章対話で何が起きたか

    Chapter 3 of the WPMS Layer 1 specification is titled "Sequence-Modulation Pipeline Processor Specification." Its job is to define the component that computes per-bin parameters (frequency, amplitude, phase) on the fly using the difference-engine structure described in earlier chapters.

    WPMS 第 1 層仕様の第 3 章のタイトルは「数列変調パイプラインプロセッサ仕様」です。その役割は、それ以前の章で記述された差分エンジン構造を用いて、ビン別のパラメータ(周波数、振幅、位相)をオンザフライで計算する構成要素を定義することです。

    When the dialogue turned to how this processor should be controlled, the architect proposed a design that had been gestating for some time: an extremely lightweight programmable controller with a four-opcode instruction set (Reset/Stay/Branch/Jump-class operations), a dual-axis separation of time (Stay command) from state (memory address), a "shadow execution" mechanism that uses idle wait cycles for parameter setup, and 16 timing signals routable from the instruction word.

    対話がこのプロセッサがどのように制御されるべきかへ移ったとき、アーキテクトは温めてきた設計を提案しました:4 オペコード命令セット(リセット/ステイ/分岐/ジャンプ系)、時間(ステイコマンド)と状態(メモリアドレス)の二軸分離、待機サイクルをパラメータセットアップに用いる「裏実行」機構、命令語からルーティング可能な...


  • WPMS Synthesizer — Layer 1 Specification [ 4 ]

    Tsuneo.Ohnaka · 4 days ago · 0 comments

    Chapter 2: Maclaurin Pipeline Specification [ Part 2 of 2 ]

    WPMS シンセサイザー — 第1層仕様書

    第2章:マクローリンパイプライン仕様 【後編】

    License: CC0 1.0 Universal (Public Domain). This chapter specifies the Maclaurin polynomial pipeline that computes sin(x) for each bin of the WPMS Synthesizer. It is the foundational signal-generation component of the FPGA Spectrum Engine physical layer.
    ライセンス:CC0 1.0 Universal(パブリックドメイン) 本章は、WPMS シンセサイザーの各ビンの sin(x) を計算するマクローリン多項式パイプラインを仕様する。これは FPGA Spectrum Engine 物理層の信号生成基盤部品である。

    2.7 Output Contract / 出力契約

    2.7.1 Interface to the amplitude multiplier / 振幅乗算器へのインターフェース

    The Maclaurin pipeline produces one output per clock to the amplitude multiplier (Chapter 4 territory):

    マクローリンパイプラインは振幅乗算器(第 4 章領域)へ 1 クロックあたり 1 出力を生成する:

    Signal | Width | Format | Direction
    sin_out | 41 | Q0.40 signed | Maclaurin → amplitude multiplier
    sin_valid | 1 | active-high | Maclaurin → amplitude multiplier
    bin_index_out | 11 | unsigned | Maclaurin → amplitude multiplier (delayed by pipeline depth)

    The 41-bit signed format is Q0.40: 1 sign bit + 40 fractional bits, representing the range [−1, +1) with precision 2⁻⁴⁰ ≈ 9 × 10⁻¹³. The maximum representable positive value is 1 − 2⁻⁴⁰; the value +1 exactly is not representable (and not reachable from the polynomial truncation in any case).

    41 ビット符号付きフォーマットは Q0.40:1 符号ビット + 40 小数ビットで、範囲 [−1, +1) を精度 2⁻⁴⁰ ≈ 9 × 10⁻¹³ で表現する。最大表現可能正値は 1 − 2⁻⁴⁰;値 +1 ちょうどは表現不能(およびいかなる場合も多項式打ち切りから到達不能)。
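
The Q0.40 range and step size described above can be made concrete with integer arithmetic. This is a minimal sketch of the number format only; the saturating behaviour and round-to-nearest in `to_q0_40` are assumptions, since the chapter excerpt does not state how out-of-range or in-between values are handled.

```python
FRAC_BITS = 40
SCALE = 1 << FRAC_BITS   # 2**40 raw codes per unit
Q_MIN = -SCALE           # raw code for -1.0
Q_MAX = SCALE - 1        # raw code for 1 - 2**-40, the largest positive value

def to_q0_40(x: float) -> int:
    """Quantize to a raw 41-bit signed code (rounding and saturation are assumed)."""
    raw = round(x * SCALE)
    return max(Q_MIN, min(Q_MAX, raw))

def from_q0_40(raw: int) -> float:
    """Convert a raw Q0.40 code back to a real value."""
    return raw / SCALE

print(from_q0_40(1))          # one LSB, 2**-40
print(from_q0_40(Q_MAX))      # largest representable value
print(to_q0_40(1.0) == Q_MAX) # +1 is not representable, so it saturates here
```

Every raw code fits in 41 bits: 1 sign bit plus 40 fractional bits, matching the `sin_out` width in the table.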

    2.7.2 Why Q0.40 rather than Q0.27 / なぜ Q0.27 ではなく Q0.40 か

    A simpler approach would be to truncate the Maclaurin core's output to 27 bits (Q0.26) so that the downstream amplitude multiplier (sin × A_k) can also fit in a single 27×27 DSP block. The WPMS Synthesizer rejects this simplification for the reasons recorded in Chapter 1's "spend richly outside the core" principle:

    より単純なアプローチは、マクローリンコアの出力を 27 ビット(Q0.26)に切り詰めて、下流の振幅乗算器(sin × A_k)も単一の 27×27 DSP ブロックに収まるようにすることだろう。WPMS シンセサイザーはこの簡略化を、第 1 章の「コア外では贅沢に」原則に記録された理由により拒否する...


  • WPMS Synthesizer — Layer 1 Specification [ 3 ]

    Tsuneo.Ohnaka · 5 days ago · 0 comments

    Chapter 2: Maclaurin Pipeline Specification [Part 1 of 2]

    WPMS シンセサイザー — 第1層仕様書

    第2章:マクローリンパイプライン仕様 【前編】

    License: CC0 1.0 Universal (Public Domain). This chapter specifies the Maclaurin polynomial pipeline that computes sin(x) for each bin of the WPMS Synthesizer. It is the foundational signal-generation component of the FPGA Spectrum Engine physical layer.
    ライセンス:CC0 1.0 Universal(パブリックドメイン) 本章は、WPMS シンセサイザーの各ビンの sin(x) を計算するマクローリン多項式パイプラインを仕様する。これは FPGA Spectrum Engine 物理層の信号生成基盤部品である。

    2.1 Role and Boundary of this Chapter / 本章の役割と境界

    What this chapter specifies / 本章が仕様するもの

    • The mathematical formulation of the polynomial sine evaluation / 多項式正弦評価の数学的定式化
    • The argument-range reduction strategy (four-quadrant decomposition) / 引数範囲縮小戦略(4 象限分割)
    • The internal fixed-point formats at every pipeline stage / 各パイプライン段の内部固定小数点フォーマット
    • The DSP block usage constraints / DSP ブロック使用制約
    • The pipeline depth budget and structural skeleton / パイプライン深度予算と構造骨格
    • The input contract from the sequence-modulation processor / 数列変調プロセッサからの入力契約
    • The output contract to the amplitude multiplier and summation tree / 振幅乗算器および総和ツリーへの出力契約
    • The error budget and its allocation across stages / 誤差予算とその段間配分

    What this chapter does NOT specify / 本章が仕様しないもの

    • Verilog or VHDL source code (this is Layer 1, not Layer 3) / Verilog または VHDL ソースコード(これは第 1 層であり第 3 層ではない)
    • Specific Cyclone V DSP block instantiation patterns (left to implementer) / 特定の Cyclone V DSP ブロックインスタンス化パターン(実装者に委ねる)
    • Manual placement or floorplanning constraints (recorded in Layer 2 traces during implementation) / 手動配置またはフロアプランニング制約(実装中の第 2 層軌跡に記録される)
    • Synthesis tool settings, timing constraints .sdc files (Layer 3 territory) / 合成ツール設定、タイミング制約 .sdc ファイル(第 3 層領域)
    • The sequence-modulation processor that produces the phase accumulator value (Chapter 3) / 位相累算器値を生成する数列変調プロセッサ(第...

  • WPMS Synthesizer — Layer 1 Specification [ 2 ]

    Tsuneo.Ohnaka · 7 days ago · 0 comments

    Chapter 1: Scope and Boundary Conditions [Part 2 of 2]

    WPMS シンセサイザー — 第1層仕様書

    第1章:スコープと境界条件 【後編】

    License: CC0 1.0 Universal (Public Domain). This is the architectural specification for a Wave Packet Modulation Synthesis (WPMS) synthesizer implementing the FPGA physical layer of the FPGA Spectrum Engine in standalone form. Read it, redistribute it, build on it, regenerate from it.
    ライセンス:CC0 1.0 Universal(パブリックドメイン) これは波束変調合成 (WPMS) シンセサイザーのアーキテクチャ仕様書であり、FPGA Spectrum Engine の FPGA 物理層を単独形態で実装するものである。読み、再配布し、その上に構築し、再生成してよい。

    1.7 Sequence-Modulation Pipeline Processor / 数列変調パイプラインプロセッサ

    Role: The sequence-modulation pipeline processor is the WPMS-specific component that computes per-bin parameters (f_k, A_k, φ_k) on the fly, using the difference-engine structure rather than per-bin memory storage.

    役割: 数列変調パイプラインプロセッサは WPMS 固有の構成要素であり、ビンごとのパラメータ (f_k, A_k, φ_k) を、ビン別メモリ記憶ではなく差分エンジン構造を用いてオンザフライで計算する。

    Recurrences (one accumulator update per bin transition):

    漸化式(ビン遷移ごとに 1 回の累算器更新):

    f_{k+1} = f_k + (Δf + α) + 2α · k
    φ_{k+1} = φ_k + (δφ + ψ) + 2ψ · k
    A_{k+1} = A_k · exp(−β) · exp(−γ · (2k − N + 1))
    

    The frequency and phase recurrences require only addition; no multiplier is consumed beyond what the Maclaurin core already uses for its trigonometric computation. The amplitude recurrence multiplies the previous amplitude by a small dynamic factor; this can be implemented either with a single dedicated DSP block per module or with log-domain accumulation.

    周波数と位相の漸化式は加算のみを要求する。マクローリンコアがすでにその三角関数計算に用いている以上の乗算器は消費されない。振幅の漸化式は前段の振幅を小さな動的因子で乗じる。これはモジュールあたり 1 個の専用 DSP ブロックで実装するか、対数領域累算で実装するかのいずれかが可能である。
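
The frequency recurrence can be run as a second-order difference engine, which is why no multiplier is needed per bin. This is a minimal sketch, not the Chapter 3 processor: the two-accumulator arrangement is one assumed way to realize the 2α·k term by addition, and the closed form f_k = f₀ + k·Δf + α·k² is obtained here by summing the recurrence, not quoted from the specification. Integer inputs keep the comparison exact.

```python
def frequency_sequence(f0, delta_f, alpha, count):
    """Generate f_0 .. f_{count-1} using only additions inside the loop."""
    out = []
    f = f0
    step = delta_f + alpha   # first difference at k = 0: (Δf + α) + 2α·0
    second = alpha + alpha   # constant second difference, 2α
    for _ in range(count):
        out.append(f)
        f += step            # f_{k+1} = f_k + (Δf + α) + 2α·k
        step += second       # advance the 2α·k term by one addition
    return out

def frequency_closed_form(f0, delta_f, alpha, k):
    """Closed form obtained by summing the recurrence from 0 to k - 1."""
    return f0 + delta_f * k + alpha * k * k

seq = frequency_sequence(1000, 3, 2, 5)
print(seq)
print([frequency_closed_form(1000, 3, 2, k) for k in range(5)])
```

The phase recurrence has the same shape with (δφ, ψ) in place of (Δf, α), so the same loop applies to it.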

    Detailed processor architecture: Deferred to Chapter 3 (Sequence-Modulation Pipeline Processor Specification), to be drafted in subsequent dialogue. Parameter bit-widths for f₀, A₀, φ₀, Δf, α, β, γ, δφ, ψ, and N are specified there.

    詳細プロセッサアーキテクチャ: 第 3...


  • WPMS Synthesizer — Layer 1 Specification [ 1 ]

    Tsuneo.Ohnaka · 05/03/2026 at 11:31 · 0 comments

    Chapter 1: Scope and Boundary Conditions [Part 1 of 2]

    WPMS シンセサイザー — 第1層仕様書

    第1章:スコープと境界条件

    License: CC0 1.0 Universal (Public Domain). This is the architectural specification for a Wave Packet Modulation Synthesis (WPMS) synthesizer implementing the FPGA physical layer of the FPGA Spectrum Engine in standalone form. Read it, redistribute it, build on it, regenerate from it.
    ライセンス:CC0 1.0 Universal(パブリックドメイン) これは波束変調合成 (WPMS) シンセサイザーのアーキテクチャ仕様書であり、FPGA Spectrum Engine の FPGA 物理層を単独形態で実装するものである。読み、再配布し、その上に構築し、再生成してよい。

    1.1 Purpose of this Synthesizer / 本シンセサイザーの目的

    The WPMS Synthesizer is a standalone FPGA implementation that produces audible output from a single FPGA device, using only the physical layer of the FPGA Spectrum Engine three-layer architecture (FPGA physical layer / ARM intermediate layer / PC abstraction layer). Neither the ARM intermediate layer (running on the Cyclone V SoC's HPS) nor the PC abstraction layer (Max/MSP, OSC servers, Ableton Live integration) is present or required.

    WPMS シンセサイザーは、FPGA Spectrum Engine の3層アーキテクチャ(FPGA 物理層/ARM 中間層/PC 抽象層)のうち、物理層のみを用いて、単一の FPGA デバイスから可聴出力を得る単独実装である。ARM 中間層(Cyclone V SoC の HPS 上で動作する)も、PC 抽象層(Max/MSP、OSC サーバ、Ableton Live 連携)も存在せず、要求されない。

    Three concrete intentions drive this scope:

    このスコープは三つの具体的な意図によって駆動されている:

    Intention 1 — Earliest possible audible output. The WPMS Synthesizer is the first deliverable in the FPGA Spectrum Engine roadmap because it is the shortest path from "repository contents" to "sound coming out of a speaker." Readers of the Hackaday.io project should be able to write the provided .rbf to a DE10-nano, connect HDMI to a monitor, and hear sound — without any additional software, build environment, or ARM-side configuration.

    意図1 — 可能な限り早期の可聴出力。 WPMS シンセサイザーは FPGA Spectrum Engine ロードマップにおける最初の成果物である。「リポジトリの内容」から「スピーカーから音が出る」までの最短経路だからである。Hackaday.io プロジェクトの読者は、提供される .rbf を DE10-nano に書き込み、HDMI をモニタに接続し、追加のソフトウェア、ビルド環境、ARM 側構成なしに音を聴けるようにすべきである。...


  • Fifty years of synthesis: from FM sidebands to bin-direct addressing

    Tsuneo.Ohnaka · 04/30/2026 at 05:52 · 0 comments

    Synthesis Paradigm Lineage — from Chowning's FM to Razor, and where this engine sits

    Build Logs #1, #2, and #4 covered, respectively, the polynomial evaluation architecture, the 2020 prototype that proved the architecture works, and the Open Prompt paradigm under which this project is released. This log places the project on the timeline of digital sound synthesis itself.

    The argument is short and, I think, defensible: every major synthesis paradigm of the last fifty years is a special case of "decide what frequency, amplitude, and phase to assign to each spectral bin, and sum the results." What changed across decades was not the underlying mathematics — it was which subset of bin patterns the available hardware could actually compute in real time, and which abstractions made those subsets composable for human composers.

    The FPGA Spectrum Engine is what becomes possible when the hardware no longer has to choose a subset.

    A note on this log

    This log is the most opinionated of the four. It situates the project in a lineage; lineages are arguments. Where the technical Build Logs (#1, #2) were careful, and the philosophical one (#4) was deliberate, this one is interpretive. Reasonable engineers and historians will disagree with parts of it. That is appropriate.

    The strong claim — that all these paradigms reduce to the same problem — is mine, not the field's consensus. I make it because I think it is true, and because making it explicit lets us see what the next paradigm looks like.

    1973 — Chowning's FM and the side-band epiphany

    In 1973, John Chowning at Stanford published "The Synthesis of Complex Audio Spectra by Means of Frequency Modulation." The mathematical heart of the paper is the trigonometric identity that converts a frequency-modulated carrier into a sum of side-bands:

    sin(ωc·t + β·sin(ωm·t)) = Σ Jₙ(β) · sin((ωc + n·ωm)·t)
    

    The right-hand side is a spectral description. It says: to make this FM tone, place sinusoidal bins at frequencies ωc + n·ωm with amplitudes Jₙ(β), the n-th Bessel function of the first kind evaluated at the modulation index β.

    If you had hardware capable of placing arbitrary bins with arbitrary amplitudes in real time, FM synthesis would be one specific bin-placement recipe. But Chowning did not have that hardware. He had a single oscillator capable of sinusoidal output and a means of frequency-modulating it. The genius of FM was that a recursive trick on the time-domain side produced the spectral richness on the frequency-domain side, with hardware barely capable of one sine wave.

    The Yamaha DX7 (1983) industrialized this trick. With six "operators" — six FM-capable oscillators arranged in 32 routing topologies ("algorithms") — the DX7 could produce a vast tone palette while consuming only modest hardware per voice. The reason the DX7 could not do everything an additive synthesizer could is not theoretical; it is that FM constrained which bin patterns it could reach. Bessel-function side-bands, scaled to whatever β you chose, with carrier-frequency ratios that produced predictable spectra. Beautiful, but constrained.

    FM was always a special case of bin-direct addressing. It just happened to be the special case that fit on 1980s silicon.

    Late 1970s — Synclavier and Kawai K5: additive synthesis for real

    While FM was conquering the consumer keyboard market, a parallel lineage at the high end was implementing additive synthesis directly: place dozens, then hundreds of partials (bins, in our terminology) at integer multiples of a fundamental, give each its own envelope, sum the result.

    The New England Digital Synclavier II (1980) and especially the Kawai K5 (1987) showed that additive synthesis was possible. The Kawai K5 offered 63 harmonics per source, with each harmonic assignable to one of four six-stage amplitude envelopes; in full mode...


  • Open Prompt — a knowledge-sharing paradigm for the LLM era

    Tsuneo.Ohnaka · 04/28/2026 at 14:19 · 0 comments

    Open Prompt — a knowledge-sharing paradigm for the LLM era

    This is the log I have been pointing toward in every previous one.

    The FPGA Spectrum Engine is being released under a knowledge-sharing scheme I call Open Prompt. This log is the formal declaration: what Open Prompt is, what it is not, why it exists, what it shares, and how others can adopt it for their own projects.

    It is also, by necessity, a log about how engineering knowledge propagates in a world where capable engineers and capable language models collaborate as a matter of course. That world is already here. The question is whether our knowledge-sharing conventions have caught up with it.

    Why a new paradigm is needed

    For four decades, open source has been the dominant model for sharing engineering knowledge. It works by distributing source code under a license that permits — and structures — its reuse, modification, and redistribution. The code is the artifact. The license is the social contract. Forking, attribution, and copyleft are the operational mechanisms.

    Open source is one of the most successful intellectual movements in modern history. Nothing here is meant to diminish it.

    But open source was designed in a world where the source code was the bottleneck. Producing source code took human time measured in days, weeks, and years. Sharing it meant sharing the scarcest resource. The license structure exists because the artifact was hard-won.

    That world is changing. In a world where a capable engineer with a capable language model can regenerate functional source code from an architectural description in hours, the source code is no longer the bottleneck. What was scarce becomes abundant. What was abundant — clarity of architecture, precision of reasoning, transparency of design intent — becomes scarce.

    Open source distributes the abundant resource and licenses around its scarcity. We need a paradigm that distributes the scarce resource and lets the abundant one regenerate freely.

    That paradigm is what I am calling Open Prompt.

    What Open Prompt is

    Open Prompt distributes engineering knowledge in three layers:

    Layer 1 — Architectural Specification

    The mathematics, the constraints, the structural decisions, the invariants. Written for a competent engineer to read directly. Sufficient — in combination with current language model capabilities — for that engineer to regenerate a working implementation.

    For the FPGA Spectrum Engine, Layer 1 includes:

    • The 11th-order Maclaurin truncation and its error bound
    • The 3-layer hardware architecture (PC / ARM HPS / FPGA)
    • The bin-direct addressing principle
    • The iDFT/SDFT duality
    • The synthesis paradigm unification (FM, additive, fractal, polygonal as bin patterns)

    This layer is the commons. It is in the public domain. Anyone may read it, redistribute it, build on it, write derivative explanations of it, teach it.

    Layer 2 — Reasoning Trace

    The actual design conversations and decision logs that led from constraints to implementation. This layer is what is genuinely new in the LLM era.

    Engineering decisions are rarely fully reconstructible from the specification alone. "Why this and not that" lives in the process of arriving at the specification. Historically, that process was lost — preserved only in scattered notebooks, mailing list threads, and the memories of the engineers involved.

    In the LLM era, however, a substantial portion of engineering reasoning takes place as dialogue with language models. Those dialogues are literally exportable as text. They can be saved, versioned, shared, and replayed. A reader who replays such a dialogue with their own language model collaborator does not just read the reasoning — they can resume it.

    For this project, Layer 2 includes:

    • Architectural design dialogues (with myself, with collaborators, with LLM partners)
    • Recorded decision rationale at branching points
    • Build...

  • First sound, then proof: how the 2020 C5G prototype validated the architecture

    Tsuneo.Ohnaka · 04/26/2026 at 13:12 · 0 comments

    2020 Prototype — from first sound to Dirichlet kernel, and the wrong turn that taught me everything

    Before describing the current architecture in detail, I want to walk through what was actually built and running five years ago — in four stages, in the order they happened.

    This is the proof-of-concept record. It is also the record of how the design philosophy of the current engine was forged through a sequence of experiments, two of which succeeded for the wrong reasons and one of which succeeded for reasons I did not appreciate until much later.

    Stage 1 — First sound (November 2020)

    [Tektronix display with 1 kHz sine, C5G board on the workbench]

    This photograph captures the moment the C5G prototype produced its first sound. The Tektronix oscilloscope shows a clean 1 kHz sine wave; the small Terasic Cyclone V GX board on the cutting mat is the source. Behind the soldering equipment, the parts bins, and the everyday clutter of an electronics bench, something unprecedented was happening: 10,240 independent oscillators, all set to the same frequency, were summing into a single tone.

    The test setup, in full, was the following:

    • All 10,240 bins set to 1 kHz
    • All bins set to phase = 0
    • All bins set to maximum amplitude
    • Reset held high; output muted

    I will admit that I held my breath when I released reset. Mathematically, the output should have been a single 1 kHz sine at full scale — 10,240 unit-amplitude sinusoids, all coherent, all at the same frequency, summing to a single coherent tone scaled by the count. But "mathematically should" and "actually does, on real silicon, on the first try" are two different statements. There was a real possibility that I would hear a click, a burst of broadband noise, or nothing at all — any of which would have meant a fundamental error somewhere in the pipeline that would have taken weeks to find.

    What I heard instead was a clean, sustained "piiiiiiiii——" at 1 kHz. The oscilloscope confirmed it visually. No transient. No artifact. Just the tone the mathematics predicted.

    It took me a moment to register what had just happened. Then it took me longer to register what it meant. If 10,240 phase-coherent oscillators all sum cleanly into one tone, then the entire compute pipeline — every Maclaurin evaluation, every NCO accumulator, every adder in the 10,240-input adder tree — is working correctly, simultaneously, in real time. The single coherent output was, paradoxically, the strongest possible test signal: any error anywhere in any of the 10,240 bins would have shown up immediately as decoherence.

    The boring tone on the oscilloscope was a complete proof of life.

    Stage 2 — Dirichlet kernel (November 18, 2020)

    Once the basic compute pipeline was confirmed, the obvious next test was to break the coherence on purpose, and see whether the bins behaved as 10,240 independent oscillators rather than as one big oscillator pretending to be many.

    I picked a single 2,048-bin pipeline module (one of the five that make up the full engine) and detuned its 2,048 oscillators across a narrow frequency range:

    • 2,048 sinusoidal oscillators, all phase-coherent, all equal amplitude
    • Frequencies spaced from 996 Hz to 1004 Hz, in steps of 1/256 Hz
    • All bins at maximum amplitude that did not clip the sum (97% of full scale)

    What you hear is not a synthesizer "patch." It is a direct physical realization of a Dirichlet kernel — the closed-form sum of 2,048 equal-amplitude phase-coherent sinusoids spaced uniformly in frequency. The mathematics predicts a sharp main lobe at the geometric center, followed by gradually decaying side lobes whose mutual beating creates a slow envelope at frequencies determined by the detune step.
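
The closed form mentioned above can be compared against a brute-force sum of the 2,048 oscillators. This is a sketch under stated assumptions: unit amplitudes instead of the 97% scaling, all phases zero, and frequencies starting at 996 Hz in 1/256 Hz steps. The identity used is the standard sum of sines in arithmetic progression; `envelope` and `closed_form` are illustrative names.

```python
import math

N = 2048
F_START_HZ = 996.0
STEP_HZ = 1.0 / 256.0

def direct_sum(t):
    """Brute force: add all N equal-amplitude, phase-coherent sinusoids."""
    return sum(math.sin(2 * math.pi * (F_START_HZ + k * STEP_HZ) * t) for k in range(N))

def envelope(t):
    """Dirichlet-kernel envelope sin(pi*N*step*t) / sin(pi*step*t)."""
    den = math.sin(math.pi * STEP_HZ * t)
    if abs(den) < 1e-12:
        # Limit where the denominator vanishes (t a multiple of 1 / STEP_HZ).
        return N * math.cos(math.pi * N * STEP_HZ * t) / math.cos(math.pi * STEP_HZ * t)
    return math.sin(math.pi * N * STEP_HZ * t) / den

def closed_form(t):
    """Carrier at the mean frequency multiplied by the envelope."""
    f_mid = F_START_HZ + (N - 1) * STEP_HZ / 2
    return math.sin(2 * math.pi * f_mid * t) * envelope(t)

for t in (0.01, 0.05, 0.3, 1.7):
    print(t, direct_sum(t), closed_form(t))
```

With these numbers the envelope has its first null at t = 1/(N·step) = 0.125 s and repeats every 1/step = 256 s, which is where the slow beating comes from.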

    What it sounds like

    I had expected a click. A burst. Possibly something noisy.

    It is none of those.

    It begins with a clear "piiiiinnnn——————" — a sustained tone at the centroid frequency, with no transient onset whatsoever. Then, over several seconds, it...


  • Why 11th-order Maclaurin? Not CORDIC, not LUT — the four reasons

    Tsuneo.Ohnaka · 04/25/2026 at 13:35 · 0 comments

    Why 11th-order Maclaurin? Not CORDIC, not LUT — and why the implementation is an open arena

    When you tell someone "I generate 10,240 sinusoids on FPGA," the first response is almost always one of two questions: "CORDIC?" or "big lookup table?"

    Neither. Every bin in this engine generates its sine value through direct evaluation of an 11th-order Maclaurin series, computed on a fixed-latency DSP pipeline. One result per clock, per pipeline. Five pipelines in parallel. 10,240 sines per sample period at 48 kHz.

    This log is about why that choice — direct polynomial evaluation — was forced by the architecture. It is also about something more interesting: once that choice is made, the actual evaluation strategy on silicon is wide open. The mathematics is fixed. The implementation is an arena.

    The mathematics

    The Maclaurin series for sine is one of the first things you meet in calculus:

    sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + x⁹/9! − x¹¹/11! + ···
    

    Truncating at the x¹¹ term, the remainder bound is:

    |R₁₁(x)| ≤ |x|¹³ / 13!
    

    For input range |x| ≤ π/2, the bound evaluates to (π/2)¹³/13! ≈ 5.7 × 10⁻⁸, which corresponds to roughly 24 bits of effective precision, comparable to the resolution of the 24-bit DAC at the output.

    Range reduction is trivial. A standard quadrant-fold reduces the input to [0, π/2], which costs one comparator and at most one subtraction at the head of the pipeline. Range reduction is not the interesting part of this design.
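
The series, the quadrant fold, and the remainder bound can be exercised together in floating point. This is a minimal sketch, not the fixed-point pipeline: expressing phase in turns, the Horner ordering, and the 4,096-point sweep are illustrative assumptions.

```python
import math

# Exact Maclaurin coefficients +/- 1/n! for n = 1, 3, 5, 7, 9, 11.
COEFFS = [(-1) ** i / math.factorial(2 * i + 1) for i in range(6)]

def maclaurin_sin_first_quadrant(x: float) -> float:
    """Evaluate the 11th-order polynomial on [0, pi/2] by Horner's rule in x**2."""
    x2 = x * x
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * x2 + c
    return acc * x

def sin_turns(phase: float) -> float:
    """sin(2*pi*phase) for phase in [0, 1), folded into the first quadrant."""
    scaled = phase * 4
    quadrant = int(scaled) & 3
    frac = scaled - int(scaled)   # position inside the quadrant, in [0, 1)
    if quadrant & 1:              # quadrants 1 and 3 mirror the argument
        frac = 1.0 - frac
    value = maclaurin_sin_first_quadrant(frac * math.pi / 2)
    return -value if quadrant & 2 else value  # quadrants 2 and 3 negate

REMAINDER_BOUND = (math.pi / 2) ** 13 / math.factorial(13)
max_error = max(
    abs(sin_turns(i / 4096) - math.sin(2 * math.pi * i / 4096)) for i in range(4096)
)
print(f"remainder bound: {REMAINDER_BOUND:.3e}")
print(f"measured max error: {max_error:.3e}")
```

The worst case sits at x = π/2, where the measured error stays just under the |x|¹³/13! bound.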

    Reason 1 — Verifiability

    The coefficients in the polynomial are 1/n! — exactly. They are not the result of a Remez optimization. They are not table-fitted. They are not vendor-supplied IP block constants. They are mathematical necessities, computable from a textbook in two minutes.

    This matters because the entire engine is being distributed as Open Prompt (more on this in Build Log #4). Anyone who wants to regenerate this implementation, with or without LLM assistance, can derive the coefficients themselves. There is no opaque parameter that has to be transferred along with the architecture description. The mathematics is the specification.

    A CORDIC implementation has the same property in principle (the rotation angles are arctan(2⁻ᵏ)), but in practice CORDIC implementations on FPGAs come with a thicket of magic numbers — gain compensation factors, micro-rotation orderings, scaling adjustments — that do not survive translation across architectures cleanly. A polynomial sine survives translation perfectly, because the polynomial is the answer.

    Reason 2 — Pipeline-natural

    Direct polynomial evaluation maps onto a fixed-latency DSP pipeline with one new sine result emerging from the end of the pipeline every clock cycle. The exact structure of that pipeline is an implementation choice (more on this below), but every reasonable structure shares the same property: the polynomial degree determines the pipeline depth, not the throughput. Throughput is one result per clock, period.

    This is the property that makes 2,048 bins per pipeline viable. At 100 MHz, a single pipeline produces 100M sines per second. At a 48 kHz output rate, that is 2,083 sine evaluations per output sample, which I round down to a clean 2,048 bins per module, leaving headroom for control overhead. Five modules give 10,240.
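
The budget in this paragraph is plain arithmetic and can be written out as follows. This is a small sketch using only the figures quoted in the text; the variable names are illustrative.

```python
CLOCK_HZ = 100_000_000     # per-module pipeline clock
SAMPLE_RATE_HZ = 48_000    # audio output rate
BINS_PER_MODULE = 2_048
MODULES = 5

cycles_per_sample = CLOCK_HZ / SAMPLE_RATE_HZ              # clocks available per output sample
spare_cycles = int(cycles_per_sample) - BINS_PER_MODULE    # headroom left for control
total_bins = BINS_PER_MODULE * MODULES
sines_per_second = total_bins * SAMPLE_RATE_HZ             # sine evaluations across all modules

print(f"cycles per sample: {cycles_per_sample:.2f}")
print(f"spare cycles per sample: {spare_cycles}")
print(f"total bins: {total_bins}")
print(f"sines per second: {sines_per_second:,}")
```

At one result per clock, each module has about 35 spare clocks per sample period after its 2,048 bins are evaluated.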

    CORDIC also pipelines well, but each CORDIC iteration produces only a partial-precision result; you need 16–20 iterations for 24-bit precision, each with its own pipeline stage, each consuming a DSP block or its equivalent in logic. The DSP-block budget on a Cyclone V — 112 blocks total — is the hard limit on how many bins fit, and direct polynomial evaluation uses that budget more efficiently than CORDIC for the precision target I needed.

    Reason 3 — No memory...


