FPGA Boot Camp #4: State Machines

Most FPGA projects will have at least one state machine. Learn how to create these powerful little dedicated CPUs in your designs.

Similar projects worth following
On a CPU, you have some kind of software to tell the machine what to do. For FPGAs, all but the simplest circuits use a state machine. The idea is simple: some state variable tracks what you are doing right now and various stimulus like inputs or time cause the state variable to change. It sounds simple, but there's a lot of details to get right. And depending on your design goals, there's plenty of ways to produce a state machine.

As usual, we will focus on the practical and build state machines using Verilog. And, once again, we'll rely on the EDA Playground online simulator, although you can use another Verilog simulator like Icarus if you prefer.

A state machine is a bit like a Rube Goldberg contraption. Something happens, which causes something else to happen, and so on. One of the first state machines I ever saw was made with timing motors and poker chips. A ringing phone would start the motors and the poker chips would push microswitches to start a tape player, then a tape recorder, and it made a crude answering machine. Our FPGA-based state machines will be more flexible and not quite as  kludgy, but the principles will be the same.

Let's take a really simple example. Suppose you have a machine that puts air in car tires at a gas station. There's a compressor, a timer, and a coin slot. When a quarter comes in the coin slot, you start the compressor and the timer. When the timer expires, you turn off the compressor. That's pretty simple, and you could probably do that with just some ad hoc logic. However, it could be a state machine. You'd only have two states: Off and On. When you are in the off state, you watch for a coin. When you see one, you go to the On state, turning on the compressor and the timer. Then you watch for the timer and when it fires, you turn everything off and go back to the off state.

A state diagram -- one we will look at in the Bootcamp -- might look like this:

You might think that example isn't very useful, but consider this: What if next month, the price of air goes up to 75 cents and the coin slot will tell you if you got a quarter, a dime, or a nickle? Now it is harder to change an ad hoc design. What if while in the on state, the customer can add more coins to get more time in proportion to the coin they deposit? These would be easy things to handle in a state machine-based design.

You might think this is very different than using a CPU, but if you think about it, a CPU is just a very large state machine itself. Each instruction causes it to do different things depending on inputs and the previous state. Very few of our state machines will be that complex, though, unless you are building a CPU.

For this bootcamp, we'll assume you understand Verilog and the fundamentals of FPGAs. If not, you are in luck, because this bootcamp is part of a series:

Bootcamp 0Covers basic digital logic concepts with simulations
Bootcamp 1Introduction to FPGA coding and simulation with combinatorial logic 
Bootcamp 2More FPGA coding and simulation with flip flops (sequential logic)
Bootcamp 3Working with actual FPGA hardware
Bootcamp 4State Machines (this bootcamp)

If you aren't comfortable with the basics, you can check out those bootcamps first and then return to this one when you are ready.

To start, look in the step-by-step instructions to learn about state machines with some hands-on exercises. You'll also find background articles in the project logs. You might want to browse them first and refer back to them as you work through the steps. The logs also have a glossary you can check for any unfamiliar terms.


One pulse per second module (use with traffic1.v; already inside traffic.v)

text/x-verilog - 694.00 bytes - 10/12/2018 at 14:09



Traffic light example (one of 3 files or use traffic.v by itself)

x-verilog - 4.93 kB - 10/12/2018 at 14:09



Simple RS fipflop (use with traffic1.v; already inside traffic.v)

x-verilog - 364.00 bytes - 10/12/2018 at 14:09



Traffic light example (one file with multiple modules; use traffic1.v instead if you want one module per file).

text/x-verilog - 5.86 kB - 10/12/2018 at 14:06



Test bench for traffic light

x-verilog - 743.00 bytes - 10/12/2018 at 14:06


View all 9 files

  • VCD File Sizes

    Al Williams10/12/2018 at 04:41 1 comment

    If you start doing serious work, you probably won't want to use EDA Playground for everything. You can use commercial tools, which makes sense if you are targeting a specific FPGA. Some vendors have their own simulation tools and others use ModelSim or another commercial product. However, you can always use Icarus, which is open source and does a great job of simulation. In fact, that's the tool we usually pick when using EDA Playground, so even if you aren't using Icarus directly, you are probably using it through EDA Playground.

    In the instructions, I offer some tips on how to use gtkwave to view the results of your simulation and give state values meaningful names as well as save your setup for future use. However, another issue with using Icarus and GTK is the size of the VCD files from meaningful simulations.

    The problem is the VCD file is ASCII and not very terse. However, if your waveform viewer supports better formats (gtkwave does) you can dump your waves in LXT2 format or fst. LXT2 and fst are fast binary formats. 

    To use these you probably want to change the $dumpfile directive in your test bench to use a .lxt or .fst extension. Then you can pass -lxt2 or -fst to the vpp command when you run your simulation. Then you can open that much smaller file much faster using gtkwave or any other tool that supports those formats.

  • About `timescale

    Al Williams10/12/2018 at 01:00 0 comments

    When you are running simulations in Verilog, you might notice that many of the files will have a directive like this:

    `timescale 1s/1ms

    This tells the Verilog simulator how to mark time. What the line above means is that you want to express delays in seconds and the simulator should keep track of time down to 1 millisecond. You have to use a "decade" number for the first number. So 1, 10, or 100. You can't use 20 or 1000 or 0.5. But you can use different units. For example, instead of saying .1 seconds (which is illegal) I could say 100ms.

    Suppose you have this code in y our simulation:

    #10 reset=1'b1;

    If the timescale above was in force, the simulator will multiply 10 by 1 second so that delay is a 10 second delay. It will also track the time down to 1 millisecond, although that's not important in that case. However, I can also specify delays that are fractional:

    `timescale 1ms/10us
    #0.333 reset=1'b1;

    In this case, the computed delay is 333uS but since the precision is only 10uS, the simulator will treat it as 330uS.

    The truth of it is, you don't have to have a timescale if you don't mind doing your own math. But it does make it easier to have a meaningful timescale, set your clocks to the right frequency, and then have proper time readouts in your waveform viewer.

    You can see an example of this in the final project for this bootcamp. 

  • Glossary

    Al Williams10/05/2018 at 04:25 0 comments

    • Combinatorial Logic - Logic that does not rely on the previous state of the system to set the current output state.
    • FSM - Finite State Machine; the formal name for the state machines we are discussing.
    • Mealy Machine - A state machine where the outputs can rely to some extent on the inputs.
    • Moore Machine - A state machine where the outputs rely only on the current state.
    • Sequential Logic - Logic that typically uses flip flops and the current output state influences future output states.
    • State - The current operating mode of a state machine. For example, a state machine may go from IDLE to ARMED to TRIGGERED. These three things (IDLE, ARMED, and TRIGGERED) are the states.
    • Transition - The change of a state machine from one state to another state. It is possible to have a machine state that transitions to itself.

  • State Machines Everywhere

    Al Williams10/05/2018 at 04:21 0 comments

    You probably use a variety of state machines every day. Some of them may be in FPGAs, some in software, and some might even use old-fashioned regular hardware. Here's some common examples:

    • Traffic lights
    • Cryptography machines
    • Vending machines
    • Gas pumps
    • Video recorders

    Can you think of other examples?

View all 4 project logs

  • 1
    ▇▇▇▇ A Simple State Machine ▇▇▇▇

    This diagram shows what could be one of the simplest state machines possible.

    It operates a hypothetical air machine at a gas station. You insert a quarter and the air compressor turns on for a fixed amount of time. When the timer runs out, the compressor goes off and they system waits for more money. 

    This is very simple, of course, but it illustrates a few things. The blue circles are states and the double circle indicates the state the system starts in. In some diagrams, you'll see an arrow coming from nowhere labeled reset or some other way of identifying the initial state.

    The lines are transitions. The labels on them should tell you why you'd move from one state to another and what outputs occur on those transitions. There's some variations possible on this and we'll talk about them later, but this is basically it.

    Coding this in Verilog would be pretty simple:

    `timescale 1s/100ms
    module airpressure(input clk, input reset, input coin, output compressor);
       reg state; // 1 = on, 0 = off
       reg gotcoin;
       reg [6:0] timer;
       localparam timerinit=60;  // must fit in timer!
       wire nextstate;
    // logic for output
       assign compressor=state;
    // logic for next state
       assign nextstate=state?timer!=0:gotcoin;
      // Manage state and timer (for simplicity,
      // assuming clock is every second and that's
      // sufficient for the coin slot
        always @(posedge clk)
          if (reset)  // initial conditions
              begin state<=0;
                   if (nextstate) timer<=timerinit;
                   if (timer!=0) timer<=timer-1;

     Of course, you could argue that an SR flip flop could do the same job and you'd be right.  But this framework will allow us to build more complicated machines. For example, what if you wanted to change the price and allow more types of coins? Or let the user add coins to extend the time?

    Here's part of the simulation:

    Note that I made the simplifying assumption that the clock was one second and the coin acceptor would hold its output that long. In real life, you'd need to implement the timer differently, I'm sure.

    Don't forget you can run all of this yourself in EDA Playground. Consider changing the code to require two quarters instead of one. What would have to change?

  • 2
    ▇▇▇▇ Adding Complexity ▇▇▇▇

    Instead of two quarters, suppose we just want to let the machine take nickles (5 cents), dimes (10 cents), and quarters (25 cents). The price is still 25 cents, but you don't have to have a quarter. If you put too much money in, we'll just keep it and dispense the same amount of air.

    Take a minute and try to sketch the state diagram for that before you proceed. Although in this case, you could have a very simple state machine that keeps track of the total, try thinking about how to not store the sum but remember the current value as part of the machine's state.

    Too Simple

    You could, of course, have a state for every possible sequence of coins:

    • 25 cents
    • 10 cents, 10 cents, 5 cents
    • 5 cents, 5 cents, 5 cents, 5 cents, 5 cents
    • ... etc.

    This gets you a lot of states fast. Keep in mind, that the second line (10/10/5) could also be 5/10/10 or 10/5/10 so the number of states will grow fast.


    The trick is to gather up equivalent states -- something we will talk a little bit about in a future section on state machine optimization. If you imagine the entire sea of states that would make up the above list, you should realize that every state that represents 20 cents is exactly the same as far as outputs and transistions go. So it makes sense you can merge them into one single state. Since this machine is pretty simple we can do that just by thinking about it and you should wind up with something like this:

    To simplify things, I am not showing the outputs. They are the same as before. The On state turns on the compressor and leaving the On state turns the compressor back off. The Inxx (like In5) means the input of a coin with that value. In10/25 means either a 10 cent or 25 cent coin detection.


    As I mentioned, you could do better than this by applying some domain knowledge. We already did this a bit by supposing we have a timer. That is, we didn't have a state for 59 seconds left, 58 seconds left, etc. You could apply this same logic to keeping count of the coins in a separate register. Your state machine would then look like this:

    As a practical matter, this may be the best implementation, but it doesn't show off the state machine very well and real projects will have many more states. But like most design problems, there are many ways to solve this and there are probably many more, too.

  • 3
    ▇▇▇▇ State Implementation ▇▇▇▇

    When creating a state machine, there are two things to consider: How to represent the state you are currently in is the most obvious one. This may seem like a trivial concern, but as you'll see, there is more to it than meets the eye.

    The second decision is how you translate your logic into a state machine. There are many possible ways, of course, but most people prefer to use the following methodology:

    1. Use a clocked always block to store the current state.
    2. Create one or more pieces of combinatorial logic to compute the output for the current state.
    3. Create one or more pieces of combinatorial logic to compute the output for the next state.


    The machines we have looked at so far have a pretty small number of states. So there's no real problem representing the states as normal binary numbers. For example, a parity checker might have an idle state (0), an odd state (1), and an even state (2). You use two flip flops and you are "wasting" one state (since 3 is an illegal state). But that's not a big deal.

    However, any time you want to decode the state, you have to examine both state variables. Again, for two that's not a big deal, but what if you had, say 100 states in 8 bits? Or 1000 states in 10? Finding state 821 will take a few inverters and an AND gate. Of course, storing 1000 states in 10 bits is relatively compact, too.

    But what if you don't care about compactness. You care about speed. Probably the fastest method to represent state is the so-called "one hot" method. In the case of the parity machine we would have three binary bits for the state. Idle might be 001, odd would be 010, and even would be 100. At all times, there is only one flip flop set -- which makes sense if you think about the name one-hot.

    Now it is easy to detect a particular state by examining a single bit. So it is fast and easy, what's the downside? You'll need to pay attention to reset because 000 isn't a legitimate state. You sometimes see people use a "modified one-hot" where 000 is the initial state, but if you ever need to decode that state, you lose the advantage.

    Space is another problem. If you really had 1000 states (and, yes, that's probably bigger than you should have) then you'd need 1000 bits of state. Another more subtle problem is reliability. If you want to detect an illegal state, you'll need some sort of gating arrangement that asserts only if more than one state bit asserts. That gets complicated fast with too many bits.

    Still, one-hot is often a great choice for state encoding, and you see it used frequently. With negative logic you sometimes see "one cold" encoding which is the same as one hot, but the active state is zero.

    Gray Codes

    Speaking of reliability, gray code encoding is also possible in some cases. This is similar to the numeric representation except the bit patterns are selected so that only one bit changes per state transition. For example, this wasn't true with the parity machine we mentioned before. In binary, the states were 00, 01 and 10. If you can go from 01 to 10 or vice versa, that's two bit flips. We say the hamming distance between those states is two.

    If we made the parity machine gray, it would turn into a one hot. But consider a different example. Let's say we have a state machine that will count the number of button pushes from 1 to 3. A simple encoding would be two binary bits: 00, 01, 10, and 11 where 00 is the idle state. But, again, the transition between 01 and 10, as well as 11 and 00 requires two bit flips. With a gray code, the states might be: 00 01 11 10. Now each transition only takes one bit flip. 

    There are other encodings like Johnson and O'Brien that seek to minimize bit changes between adjacent states. These schemes can be useful when you want to use combinatorial logic to detect state changes or you are worried about power consumption. 

    It isn't always possible to build a gray or similar sequence without some work. For example,  consider the 00, 01, 11, 10 sequence mentioned above. That's fine if the states go from one to another in sequence. But what if state 00 could go to state 11? You'd need to rearrange the code or use more bits. This is a big problem if you add or change transitions in the middle of the design. 

    Coding Style

    Let's consider a state machine to figure out if the sum of positive numbers is even or odd. It doesn't matter how many bits the numbers are, because all we need to do is look at the last bit of any number to know if it is even or odd. An even number plus an even number is even. An odd number plus an odd number is even. And an even number plus an odd number stays odd.

    We will need three states: idle, odd, and even and let's use one hot encoding, just for practice. It is handy to use localparam to name the states so:

    module odd_even_sum_detect(input clk, input reset, input bit0, 
            output oddled, output evenled);
       reg [2:0] state; 
       reg inbit;
       localparam idle=3'b001;
       localparam odd=3'b010;
       localparam even=3'b100;

    Of course, this is a simple example, so you might have even had:

       reg idle, odd, even;

    This gets clumsy, though, when you need to reset things or examine several states at once.

    Remember, we said it is customary to put the state machine into at least three parts. The first part is state management. We'll also use this as an excuse to save the input bit so if it changes externally, we won't care.

       reg [2:0] nextstate;
        always @(posedge clk)
            if (reset) 

    Obviously, the trick now is to compute nextstate. This could be done with an assign or with a combinatorial always block. I'll use the always block:

        always @(*)
              if (state==odd) nextstate=inbit?even:odd;
              else nextstate=inbit?odd:even;

    There are a few interesting things to notice here. First, the idle state is zero so it is intrinsically even. The else branch catches both cases. That's the second point, though. You must assign nextstate in all cases so you don't infer a latch. If you fail to do so, the synthesizer will assume you want to keep the next state, but not with a flip flop and you will get a latch. This is usually undesirable.

    Because this is combinatorial, you can use = instead of <= and the last assignment is going to "win." So you could, for example, assign something at the top of block as a default or, if using a case statement, be sure to include a default case. In this example, even an illegal state (if one were possible) or an idle state would still get an assignment because we only check for odd and then do something for everything else.

    You might also notice that if state is odd, nextstate is the inverse of inbit otherwise, it is inbit. So you could write this as:


    However, this is less intuitive and -- as we will talk about later -- the tools may even figure that out on our behalf. 

    That leaves one more piece to write. The output section which is, again, combinatorial. This time I'll use assign, but I could do another always @(*) if I wanted to.

       assign oddled=state==odd;
       assign evenled=state==even;

    You hope your synthesizer would figure out that checking for all the bits is not necessary, but if it really checks for 010 and 100, that is kind of a waste. If you want to be more explicit, you could say:

       assign oddled=state[1];
       assign evenled=state[2];

    Having a single clocked piece of code along with a combinatorial part to compute the next state and another to compute the output is known as the "3 process style" of state machine. However, some argue that this is more error prone. You can easily put everything in one place (the "1 process style"):

    always @(posedge clk)
            if (reset) 

    For something this simple, it is fine. Note that I now don't need to latch inbit because that's done inherently when I store state. Of course, now oddled and evenled are reg types. I could have stayed with the assign and had a "2 process" state machine.

    Which one is best? That's a hotly debated topic. In general, you should pick the style you are most comfortable with for the task at hand.

View all 11 instructions

Enjoy this project?



Similar Projects

Does this project spark your interest?

Become a member to follow this project and never miss any updates