Cheat sheet
Process structure
One-shot coroutine
Runs once top-to-bottom. Must explicitly idle at the end, with forever @(posedge clk) or similar.
Cyclic coroutine
Loops back to the top after the last @. Use for repeating protocols and continuous signals.
Idle tail
Suspend forever. Common at the end of an initial begin to hold outputs, or to idle after an initialization FSM.
Time and waiting
One-cycle wait
Advance one clock. Every @ becomes a transition between two states.
Guarded wait
Hold the current state until cond is high on a clock edge. The primary way to wait on inputs.
N-cycle delay
Wait exactly N clocks before continuing. N can be a parameter.
Control flow
Per-state branch
Selects outputs within the current state. Both branches drive the same signals; neither contains its own @.
Diverging branch
When one or both branches contain @, control flow splits into separate sub-state chains and rejoins after the if/else. Each side can have its own length and outputs.
Bounded loop
Iterate N times. If i is a module-level local, the source name is preserved on the counter, and you can read it as an output or use it to index arrays; otherwise an auto-counter is generated.
Conditional loop
Run the body each cycle while cond holds, exit when it clears. No iteration counter is generated; bound it explicitly if needed.
Reuse
Task definition
A reusable sequence with input formals only. Tasks may contain @s. The call graph must be acyclic; tasks may call other tasks but not themselves.
Task call
Inlined with arguments substituted in. Repeated sequential calls of the same task can collapse into a single state controlled by a program-counter mux.
State and locals
Cross-edge local
A module-level local written in one state and read in another is inferred as a flop in the generated module. No manual always_ff needed.
Combinational local
A local only touched between two @s stays as a wire — a temporary in the state's output expression.
Output drive
Drives out combinationally for the current state, and holds into subsequent states until reassigned. No latch is inferred.
Semantics
States and clock waits
The coroutine body is split at every @. Each chunk between two waits becomes the output expression for one FSM state; the wait itself becomes the transition out of that state. States are numbered in source order — S0 is the first chunk, S1 the second, and so on.
```systemverilog
initial forever begin
  a = 1'b1;          // S0 outputs
  @(posedge clk);    // S0 -> S1
  a = 1'b0;          // S1 outputs
  @(posedge clk);    // S1 -> S0
end
```
For an initial forever body the last transition wraps back to S0. For a plain initial begin the last state self-loops, holding its outputs.
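For contrast with the cyclic example, here is a minimal sketch of the one-shot form with an explicit idle tail. The module name and the start_pulse output are hypothetical, chosen only for illustration; the state labels in the comments follow the numbering rule described above.

```systemverilog
// One-shot coroutine sketch. Names are illustrative assumptions.
module one_shot_sketch (
  input  logic clk,
  output logic start_pulse   // hypothetical output
);
  initial begin
    start_pulse = 1'b1;      // S0 outputs
    @(posedge clk);          // S0 -> S1
    start_pulse = 1'b0;      // S1 outputs
    forever @(posedge clk);  // idle tail: stay in the last state, holding outputs
  end
endmodule
```

The body runs once from top to bottom and never returns to S0, so start_pulse is high for exactly one state and low from then on.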
Outputs and held values
Outputs are driven combinationally per state by a single always_comb block with a case on the state register. Critically, an output is held across states: assigning a = 1 in S0 and not touching a in S1 means the generated always_comb drives a = 1 in both. There is no latch — the held value is re-driven combinationally.
Where the transpiler can't prove an output is set on every path, it defaults the output to its zero value at the top of the always_comb before the case.
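The held-output rule can be pictured with a hand-written approximation of the lowered logic. This is a sketch under stated assumptions, not actual transpiler output: the module name, the state_q register name, the enum encoding, and the power-on initialiser are all invented for illustration.

```systemverilog
// Hand-written sketch of the described lowering, not real transpiler output.
// Source being modelled:
//   initial forever begin
//     a = 1'b1; b = 1'b0;  @(posedge clk);   // S0
//     b = 1'b1;            @(posedge clk);   // S1: a is not reassigned
//     a = 1'b0;            @(posedge clk);   // S2: b is not reassigned
//   end
module held_outputs_sketch (
  input  logic clk,
  output logic a,
  output logic b
);
  typedef enum logic [1:0] {S0, S1, S2} state_t;
  state_t state_q = S0;  // assumed power-on state; reset handling is not covered here

  // One transition per @ in the source; the last one wraps back to S0.
  always_ff @(posedge clk) begin
    case (state_q)
      S0:      state_q <= S1;
      S1:      state_q <= S2;
      default: state_q <= S0;
    endcase
  end

  // Single always_comb with a case on the state register.
  always_comb begin
    a = 1'b0;  // zero defaults ahead of the case
    b = 1'b0;
    case (state_q)
      S0: begin a = 1'b1; b = 1'b0; end
      S1: begin a = 1'b1; b = 1'b1; end  // a = 1 re-driven: held from S0
      S2: begin a = 1'b0; b = 1'b1; end  // b = 1 re-driven: held from S1
      default: ;                          // unused encoding keeps the defaults
    endcase
  end
endmodule
```

Because every case arm assigns both outputs and the defaults cover the unused encoding, no path through the block leaves an output unassigned, which is why no latch appears.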
Locals: wires vs. flops
A module-level local is one of two things:
- A combinational temporary if every read and write happens within the same state, between two adjacent @s. Lowered to a wire inside always_comb.
- A flop if any read crosses an @ after a write. Lowered to logic ... x_q with a small always_ff block that captures the writes.
Counter variables in for loops follow the same rule. Because they're written in one iteration and read in the next, they're always flopped — and the source name is preserved when you declare the counter at module level.
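A small source-level sketch of the two cases, assuming hypothetical signals din and dout and hypothetical locals tmp and saved:

```systemverilog
// Wire vs. flop inference sketch. All names are illustrative assumptions.
module locals_sketch (
  input  logic       clk,
  input  logic [7:0] din,   // hypothetical input
  output logic [7:0] dout   // hypothetical output
);
  logic [7:0] tmp;    // written and read inside one state: expected to stay a wire
  logic [7:0] saved;  // written in S0, read in S1: expected to become a flop

  initial forever begin
    tmp   = din + 8'd1;  // S0: tmp written ...
    dout  = tmp;         // ... and read before the next @
    saved = din;         // S0: saved written
    @(posedge clk);      // S0 -> S1
    dout  = saved;       // S1: this read crosses the @ after the write
    @(posedge clk);      // S1 -> S0
  end
endmodule
```

Under the rule above, only saved needs storage, because it is the only local whose value must survive a clock edge.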
Guarded waits
A guarded wait @(posedge clk iff cond) only fires when cond is high on a clock edge. The state holds its outputs and self-loops while the guard is false, then advances on the next edge where it's true. This is the primary way coroutines wait on inputs — req, ready, valid_in, and so on.
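As an illustration, a minimal request/acknowledge sketch built on a guarded wait. The req and ack signal names and the one-cycle acknowledge are assumptions made for the example, not part of any protocol defined in this document.

```systemverilog
// Guarded-wait sketch. Signal names are illustrative assumptions.
module handshake_sketch (
  input  logic clk,
  input  logic req,  // hypothetical request input
  output logic ack   // hypothetical acknowledge output
);
  initial forever begin
    ack = 1'b0;              // S0: idle, ack low
    @(posedge clk iff req);  // S0 self-loops until req is high on a clock edge
    ack = 1'b1;              // S1: acknowledge for one cycle
    @(posedge clk);          // S1 -> S0
  end
endmodule
```

While req is low, S0 keeps driving ack = 0; the guard changes only when the state advances, not what it drives.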
Branches
A condition can shape the FSM in three different ways, depending on where you put it:
- Per-state output mux. An if/else with no @ inside. The state's output expression becomes a ternary on the condition; no new state is added. Use this when both branches drive the same signals for the same duration.
- Diverging sub-states. An if/else where one or both branches contain @. Control flow splits at the branch and rejoins after the if/else. Each side becomes its own chain of states; the parent state gets a conditional transition out, and a join state is implicit at the rejoin point. Use this when the branches have different lengths or drive different signals.
- Guarded wait. A condition on the wait itself: @(posedge clk iff cond). Affects only when the state advances, not what it drives.
The three forms answer different questions — what does this state drive?, which states come next?, and when does this state advance? Pick whichever matches your intent.
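The three forms can be placed side by side in one coroutine. This is a sketch only: sel, go, and y are hypothetical signals, and the exact state numbering the transpiler assigns to the diverging section is not shown here.

```systemverilog
// Three ways a condition shapes the FSM. Names are illustrative assumptions.
module branch_sketch (
  input  logic       clk,
  input  logic       sel,  // hypothetical select input
  input  logic       go,   // hypothetical go input
  output logic [1:0] y     // hypothetical output
);
  initial forever begin
    // 1. Per-state output mux: no @ inside the if/else, so no state is added.
    if (sel) y = 2'd1;
    else     y = 2'd2;
    @(posedge clk);

    // 2. Diverging sub-states: the branches contain @ and differ in length.
    if (sel) begin
      y = 2'd3;
      @(posedge clk);
      y = 2'd0;
      @(posedge clk);
    end else begin
      y = 2'd0;
      @(posedge clk);
    end

    // 3. Guarded wait: the condition affects only when the state advances.
    y = 2'd0;
    @(posedge clk iff go);
  end
endmodule
```

The first if/else answers what the state drives, the second decides which states come next, and the final wait decides when the state advances.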
Loops
Loops lower to compact state-machine shapes:
- repeat (N) @(posedge clk) is one state with an auto-counter cyc_q. The state self-loops while cyc_q < N - 1 and exits on the bound. The counter width is $clog2(N); N may be a parameter.
- while (cond) begin … @ … end lowers to body states with a conditional exit transition: while cond holds the body runs each cycle, and when it clears the loop falls through. No iteration counter is generated; to deadline-bound it, AND in a counter check explicitly. The body may be empty (a bare @(posedge clk)), in which case it's a busy wait.
- for (i = A; i < B; i = i + 1) begin … @ … end lowers to body states + a counter step + a loop-back transition. The body must contain at least one @ on every path so the loop makes forward progress; non-progressing loops are rejected at compile time.
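The three loop forms above might appear together as follows. The parameter N, the busy, strobe, and idx signals, and the loop bound of 4 are assumptions made for this sketch.

```systemverilog
// Loop-form sketch. All names and bounds are illustrative assumptions.
module loops_sketch #(
  parameter int N = 4          // hypothetical delay length
) (
  input  logic       clk,
  input  logic       busy,     // hypothetical input
  output logic       strobe,   // hypothetical output
  output logic [1:0] idx       // hypothetical output
);
  logic [2:0] i;  // module-level counter, so the source name is preserved

  initial forever begin
    strobe = 1'b0;
    idx    = 2'd0;
    repeat (N) @(posedge clk);           // N-cycle delay: one state plus a counter

    while (busy) begin                   // conditional loop with a bare wait as its body
      @(posedge clk);
    end

    for (i = 0; i < 4; i = i + 1) begin  // bounded loop: body contains an @ on every path
      strobe = 1'b1;
      idx    = i[1:0];                   // the counter is readable inside the body
      @(posedge clk);
    end
  end
endmodule
```

Every iteration of each loop passes through at least one @, which is the forward-progress condition the for-loop rule requires.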
Tasks and inlining
A task automatic is inlined at every call site. Each call produces its own block of states using the task body, with arguments substituted in as the values driven into the task's formals. Inlining means N call sites cost N copies of the task's states; tasks themselves don't add a state-machine layer.
When the same task is called many times in sequence, the transpiler can collapse the call sites into a single state controlled by a small program-counter register. The state's output expressions become case muxes over the PC. This is what makes the UART transmitter's ten send_bit calls compile to just two states.
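A minimal sketch of a task called several times in sequence. The put_bit task, the two-bit data payload, and the framing are hypothetical and are not the UART transmitter's actual source; they only show the call pattern that the collapsing described above applies to.

```systemverilog
// Sequential task-call sketch. Task and signal names are illustrative assumptions.
module task_sketch (
  input  logic       clk,
  input  logic [1:0] data,  // hypothetical payload
  output logic       tx     // hypothetical serial output
);
  // Input formals only; the body contains one @.
  task automatic put_bit(input logic b);
    tx = b;
    @(posedge clk);
  endtask

  initial forever begin
    put_bit(1'b0);     // start bit
    put_bit(data[0]);
    put_bit(data[1]);
    put_bit(1'b1);     // stop bit
  end
endmodule
```

Inlined naively, the four calls would give four copies of the task's single state; since they follow one another directly, they are the kind of sequence the program-counter form can fold together.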