4. SystemC Tutorial - SC_METHOD vs SC_THREAD
Introduction
"My simulation hangs forever" or "my simulation aborts with an error about wait() — I don't understand why."
These are the two most common problems new SystemC engineers hit, and they both come from the same source: using the wrong process type. In SystemC, every process (every piece of concurrent hardware behavior) in this post is either an SC_METHOD or an SC_THREAD; the third kind, SC_CTHREAD, is a clocked variant of SC_THREAD and is not covered here. Choosing correctly is not a style preference. It is a hard requirement with concrete, observable consequences when you get it wrong.
The distinction maps directly to hardware. Some parts of a CPU are combinational — they produce outputs purely as a function of their current inputs, with no memory of previous states and no dependency on time passing. The ALU is the canonical example. Feed it two operands and an opcode, and it immediately produces a result. It does not wait for a clock. It does not remember what it did last cycle. It just computes. This behavior maps to SC_METHOD.
Other parts of a CPU are sequential — they depend on things happening in order, over time. The fetch unit retrieves an instruction from memory, then waits for the memory response, then increments the program counter, then requests the next instruction. It sequences operations. It waits. It loops. This behavior maps to SC_THREAD.
Getting this assignment wrong has immediate consequences. A wait() call inside an SC_METHOD aborts the simulation with a runtime error on the first call. An SC_THREAD testbench that never calls sc_stop() leaves a bare sc_start() running for as long as anything (typically a free-running clock) keeps generating events, which in a clocked design means forever. These are not subtle bugs; they are loud failures that halt your work. Understanding why the distinction exists, not just the rule, is what lets you apply it correctly to modules you haven't seen before.
In a RISC-V CPU, the implications are pervasive. The ALU is SC_METHOD — every signal that feeds it (two operands, an opcode) is on its sensitivity list, and the moment any of those change, the ALU recomputes and updates its output. No clock needed. No time passing. Just pure combinational function evaluation. The fetch unit, the pipeline stall controller, and the instruction sequencer are all SC_THREAD — they issue a request, then call wait() for a response, then take action, then loop back to issue the next request. They are sequential processes that are meaningless without time.
This post builds a concrete AND gate as the DUT using SC_METHOD, drives it with an SC_THREAD testbench, and shows both bugs explicitly, so you can recognize and fix them when they appear in your own work.
Prerequisites
- Completed Post 1 (Modules, Ports & Signals): SC_MODULE, sc_in, sc_out, sc_signal, port binding
- Completed Post 2 (Simulation Time & Clocks): sc_start(), sc_time_stamp(), simulation control
- Completed Post 3 (Delta Cycles & Event-Driven Semantics): sensitivity lists, delta cycles, evaluate-update model
- Code for this post: GitHub — section1/post04
Translation Table
Before diving in, here is how SC_METHOD and SC_THREAD map to what you already know:
| Concept | SystemVerilog | C++ (raw threads) |
|---|---|---|
| SC_METHOD | always_comb / always @(sensitivity_list) | A regular function that returns immediately |
| SC_THREAD | initial / always with #delay or @(posedge clk) | std::thread with std::this_thread::sleep_for() |
| wait(10, SC_NS) | #10 (delay in time units) | std::this_thread::sleep_for(10ns) |
| wait(event) | @(event_name) | cv.wait(lock) |
| sensitive << sig | Implicit in always_comb or explicit sensitivity list | No direct equivalent |
| Calling wait() in SC_METHOD | Calling #delay inside a function: compile error | Blocking inside a regular function: logic error |
The key insight for SystemVerilog engineers: SC_METHOD is always_comb. SC_THREAD is initial or an always block that uses #delays. If you would never put #10 inside a Verilog function, you should never put wait() inside an SC_METHOD. The analogy is exact.
The key insight for C++ engineers: SC_METHOD is a regular callback function that must return immediately, just like a signal handler or a Qt slot. SC_THREAD is a thread that can block, just like std::thread. SystemC manages the thread stack — you never call std::thread yourself — but the semantics of "this code can block" vs. "this code must return immediately" are identical.
Concept Explanation
SC_METHOD — Combinational Logic in Software
An SC_METHOD is a process that:
- Runs from start to finish without stopping — it must return every time it is called
- Is triggered by its sensitivity list: after one initial run at the start of simulation (unless dont_initialize() is called), it runs only when one of its sensitive signals changes
- Cannot call wait(): attempting to do so terminates the simulation immediately with an error
- Is re-triggered every time any sensitive signal changes; it does not "resume", it restarts
This is exactly the semantics of always_comb. A combinational block re-evaluates every time any input changes. It does not "wait" — it computes and produces an output in zero time (one delta cycle). SystemC's SC_METHOD models this faithfully.
flowchart TD
A[Signal A changes] --> B[SC_METHOD\nwakes up]
C[Signal B changes] --> B
B --> D[Run compute function\nto completion]
D --> E[Outputs updated\nin next delta]
E --> F[SC_METHOD sleeps\nuntil next trigger]
style B fill:#06b6d4,color:#fff
style D fill:#334155,color:#fff
style E fill:#10b981,color:#fff
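To make the trigger-and-settle behavior above concrete, here is a minimal sketch in plain C++. It is a toy model, not SystemC code and not how the SystemC kernel is implemented: ToySignal, ToyAndGate and settle() are hypothetical names invented for this illustration. It assumes that a write only becomes visible at the next update step and that compute() runs only when an input actually changed; the initial run at the start of simulation is left out.

```cpp
// Toy model only: a signal whose writes become visible at the next update step.
struct ToySignal {
    bool current = false;   // value that read() returns
    bool pending = false;   // value requested by the last write()

    bool read() const { return current; }
    void write(bool v) { pending = v; }

    // Commit the pending value and report whether the visible value changed.
    bool update() {
        bool changed = (pending != current);
        current = pending;
        return changed;
    }
};

// Toy AND gate: compute() plays the role of a run-to-completion method.
struct ToyAndGate {
    ToySignal a, b, y;
    int runs = 0;           // how many times compute() was triggered

    void compute() {
        ++runs;
        y.write(a.read() && b.read());
    }

    // Repeat update steps until no signal changes; returns the number of
    // steps in which at least one signal changed.
    int settle() {
        int deltas = 0;
        for (;;) {
            bool a_changed = a.update();
            bool b_changed = b.update();
            bool y_changed = y.update();
            if (!(a_changed || b_changed || y_changed)) break;
            ++deltas;
            if (a_changed || b_changed) compute();  // sensitive to a and b only
        }
        return deltas;
    }
};
```

In this model, writing a=1 and b=1 and then calling settle() takes two update steps: one to make the inputs visible and run compute(), and one to make the new y visible. Writing a value that a signal already holds takes zero steps and does not run compute() at all.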
SC_THREAD — Sequential Behavior Over Time
An SC_THREAD is a process that:
- Has its own call stack — the SystemC kernel allocates a stack for it, just like an OS thread
- Can call wait() to suspend itself: suspending yields control back to the kernel, which then runs other processes
- Resumes from exactly where it suspended: when the wait condition is satisfied, the thread picks up on the line after wait()
- Typically runs an infinite loop: once started, it loops forever (or until sc_stop() is called)
This is exactly the semantics of an initial block or an always @(posedge clk) block in SystemVerilog. The process starts, executes, waits for a clock or a delay, resumes, and repeats. The difference from an SC_METHOD is fundamental: the process persists between triggers. Its local variables retain their values. It can count, sequence, and remember.
flowchart TD
A[SC_THREAD starts at t=0] --> B[Execute code]
B --> C["call wait(10, SC_NS)"]
C --> D[Kernel suspends this thread\nruns other processes\nadvances time by 10ns]
D --> E[Resume after wait]
E --> F[Execute more code]
F --> G{Loop condition?}
G -->|yes| B
G -->|no| H[Thread exits\nor calls sc_stop]
style C fill:#f59e0b,color:#fff
style D fill:#334155,color:#fff
style E fill:#10b981,color:#fff
The Complete Comparison
| | SC_METHOD | SC_THREAD |
|---|---|---|
| Can call wait() | No: simulation aborts | Yes |
| Runs to completion | Always, every trigger | Can suspend mid-execution |
| Sensitivity list | Required (or next_trigger()) | Optional (can wait on any event) |
| Runs at t=0? | Yes, once during initialization, unless dont_initialize() is called | Yes, unless dont_initialize() is called |
| Has own call stack | No | Yes |
| Local variables persist between triggers | No: function restarts each time | Yes: stack is preserved across wait() |
| Use for | Combinational logic, decoders, MUXes, forwarding | Stimulus generators, FSMs, protocol handlers, monitors |
| SV equivalent | always_comb / always @(sel) | initial / always with #delays |
| CPU example (this series) | ALU (Post 5), Decoder (Post 9), Forwarding unit (Post 20) | Fetch sequencer (Post 8), Stall controller (Post 16), Testbench driver (Post 23) |
next_trigger() — Dynamic Sensitivity for SC_METHOD
By default, an SC_METHOD's sensitivity list is fixed at construction time in the SC_CTOR. But there is a mechanism to change when the method will next fire: next_trigger().
void my_method() {
    // Do some work...
    output.write(input.read() + 1);
    // Schedule next invocation 10ns from now, regardless of sensitivity list
    next_trigger(10, SC_NS);
}
next_trigger() overrides the default sensitivity list for one invocation only. On the next trigger, the method runs again and can either call next_trigger() again or fall through to use the default sensitivity list.
This is useful for modeling:
- Combinational logic with a minimum hold time (re-evaluate after N ns, not immediately)
- Rate-limited sampling (sample input every 100ns instead of on every change)
- One-shot behaviors that re-arm themselves
The critical difference from SC_THREAD: next_trigger() does not suspend the method. The method still runs to completion and returns before the next trigger fires. It merely schedules when the method will be called again. This is different from wait() in an SC_THREAD, which actually suspends execution and preserves local state.
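The "schedule, then return" behavior can be pictured with a small plain C++ sketch. This is a toy model under stated assumptions, not SystemC: ToySampler and run_sampler() are hypothetical names, times are plain integers in nanoseconds, and the loop in run_sampler() stands in for the kernel. The point it illustrates is that the sampling function never pauses; it only records when it wants to be called next.

```cpp
#include <cstdint>
#include <vector>

// Toy rate-limited sampler: each call runs to completion and merely
// records when it should be called again.
struct ToySampler {
    std::uint64_t next_time_ns = 0;            // requested time of the next call
    std::vector<std::uint64_t> sample_times;   // when each sample was taken

    void sample(std::uint64_t now_ns) {
        sample_times.push_back(now_ns);
        // Comparable in spirit to next_trigger(100, SC_NS): ask to be
        // called again 100 ns from now, then return immediately.
        next_time_ns = now_ns + 100;
    }
};

// Stand-in for the kernel: jump straight to each requested time and
// call the sampler again, until end_ns is passed.
std::vector<std::uint64_t> run_sampler(std::uint64_t end_ns) {
    ToySampler sampler;
    for (std::uint64_t now = 0; now <= end_ns; now = sampler.next_time_ns) {
        sampler.sample(now);
    }
    return sampler.sample_times;
}
```

Calling run_sampler(250) yields samples at 0, 100 and 200 ns. Nothing inside sample() survives between calls except the members, which is the difference from a suspended SC_THREAD.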
Implementation
Example: AND Gate DUT + Testbench
This example is deliberately simple. The goal is to make SC_METHOD and SC_THREAD the focus — not the logic. An AND gate is familiar enough that the hardware is transparent and you can concentrate on the SystemC process mechanics.
The DUT is a 2-input AND gate modeled as SC_METHOD. The testbench is an SC_THREAD that exercises all four input combinations with 10ns between each, checks the output, and reports pass/fail.
// File: and_gate_tb.cpp
#include <systemc.h>
// ─── DUT ────────────────────────────────────────────────────────────────────
// AND gate: pure combinational logic, always SC_METHOD.
// Maps to: always_comb in SystemVerilog, or a simple function in C++.
// Never needs to wait; it computes output from inputs immediately.
SC_MODULE(and_gate) {
    sc_in<bool> a, b;
    sc_out<bool> y;
    void compute() {
        y.write(a.read() & b.read());
    }
    SC_CTOR(and_gate) {
        SC_METHOD(compute);
        sensitive << a << b;  // re-evaluate whenever a or b changes
    }
};
// ─── Testbench ───────────────────────────────────────────────────────────────
// SC_THREAD stimulus generator: sequences through all 4 input combinations.
// Uses wait() to advance simulation time between stimulus changes.
// This is the correct use of SC_THREAD: sequential stimulus over time.
SC_MODULE(tb_and) {
    sc_out<bool> a, b;
    sc_in<bool> y;
    void run() {
        // Truth table for AND gate, all 4 input combinations
        struct { bool a_val, b_val, expected; } tests[] = {
            {false, false, false},
            {true,  false, false},
            {false, true,  false},
            {true,  true,  true }
        };
        int pass = 0, fail = 0;
        for (auto& t : tests) {
            a.write(t.a_val);
            b.write(t.b_val);
            wait(10, SC_NS);  // allow DUT to settle, advance simulation time
            bool got = y.read();
            bool ok = (got == t.expected);
            std::cout << "[" << sc_time_stamp() << "] "
                      << "a=" << t.a_val << " b=" << t.b_val
                      << " → y=" << got
                      << " (expected " << t.expected << ")"
                      << (ok ? " PASS" : " FAIL") << std::endl;
            if (ok) pass++; else fail++;
        }
        std::cout << "\nResults: " << pass << " PASS, "
                  << fail << " FAIL" << std::endl;
        // End the test explicitly. Without this, sc_start() returns only when
        // no events remain, which never happens once a clock is running.
        sc_stop();
    }
    SC_CTOR(tb_and) {
        SC_THREAD(run);
        // Note: no sensitivity list; this SC_THREAD uses timed wait() calls
    }
};
// ─── sc_main ─────────────────────────────────────────────────────────────────
int sc_main(int argc, char* argv[]) {
    // Signals connecting DUT to testbench
    sc_signal<bool> sig_a, sig_b, sig_y;
    // Instantiate and bind
    and_gate dut("dut");
    dut.a(sig_a);
    dut.b(sig_b);
    dut.y(sig_y);
    tb_and tb("tb");
    tb.a(sig_a);
    tb.b(sig_b);
    tb.y(sig_y);
    sc_start();  // run until the testbench calls sc_stop()
    return 0;
}
Expected output:
[10 ns] a=0 b=0 → y=0 (expected 0) PASS
[20 ns] a=1 b=0 → y=0 (expected 0) PASS
[30 ns] a=0 b=1 → y=0 (expected 0) PASS
[40 ns] a=1 b=1 → y=1 (expected 1) PASS
Results: 4 PASS, 0 FAIL
What is happening internally at each step:
At t=0: sc_start() is called. During the initialization phase the kernel runs every process once (neither process here calls dont_initialize()), in no guaranteed order: compute() reads a=0, b=0 and writes y=0, and run() starts executing.
run() writes a=0, b=0. Both signals already hold 0 from sc_signal initialization, so nothing changes and compute() is not triggered again. run() then calls wait(10, SC_NS).
At t=10ns: simulation time advances. run() resumes. y.read() returns 0. The PASS line prints. run() writes a=1, b=0.
When a.write(true) executes, signal a changes. In the next evaluate-update cycle, compute() fires — it reads a=1, b=0, writes y=0 (same value, so no second trigger). run() calls wait(10, SC_NS).
This pattern repeats for all four input combinations.
Build & Run
# CMakeLists.txt in section1/post04/
cmake_minimum_required(VERSION 3.16)
project(post04_method_vs_thread)
set(SYSTEMC_HOME $ENV{SYSTEMC_HOME})
include_directories(${SYSTEMC_HOME}/include)
link_directories(${SYSTEMC_HOME}/lib-linux64)
add_executable(and_gate_tb and_gate_tb.cpp)
target_link_libraries(and_gate_tb systemc)
mkdir build && cd build
cmake .. && make
./and_gate_tb
The Two Bugs — Seen Explicitly
Bug 1 and Bug 2 are not hypothetical. Both will appear in your work. Seeing them here — with the exact error message — means you will recognize them immediately when they appear.
Bug 1: Calling wait() in an SC_METHOD
// WRONG: this will abort the simulation
SC_MODULE(broken_method) {
    sc_in<bool> clk;
    sc_in<bool> input;    // data ports used by bad_compute()
    sc_out<bool> output;
    void bad_compute() {
        // Engineer thinks: "I'll wait for the clock edge before computing"
        wait();  // ← RUNTIME ABORT on first call
        output.write(input.read());
    }
    SC_CTOR(broken_method) {
        SC_METHOD(bad_compute);
        sensitive << clk.pos();  // rising edge
    }
};
Error you will see (abridged; the trailing file and process lines vary by SystemC version):
Error: (E519) wait() is only allowed in SC_THREADs and SC_CTHREADs:
        in SC_METHODs use next_trigger() instead
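The simulation terminates as soon as wait() is called; there is no recovery. The fix is one of two: (a) remove the wait() and let the sensitivity list do the work (bad_compute is already sensitive to clk.pos(), so it runs on every rising edge anyway), or (b) change SC_METHOD to SC_THREAD, in which case the body needs a loop around wait() so it handles every edge instead of only the first, and the testbench must call sc_stop() to end the run.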
Why this rule exists: SC_METHOD processes do not have their own call stack. The kernel calls them as regular C++ functions. A function cannot suspend itself mid-execution without a stack to preserve its state. wait() requires a stack — which only SC_THREAD has. Calling wait() in an SC_METHOD is asking the kernel to do something architecturally impossible.
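One way to see what the missing stack costs is to write a sequencing behavior as a function that must return on every call. The sketch below is plain C++, not SystemC, and StepSequencer is a hypothetical fetch-like sequencer invented for illustration. Because step() cannot pause in the middle, "where was I?" has to be stored by hand in a member variable, which is the bookkeeping an SC_THREAD's preserved stack does for you.

```cpp
// Hypothetical fetch-like sequencer written as a run-to-completion function.
// Every call to step() must return, so the position in the sequence is kept
// in an explicit state member instead of on a suspended call stack.
struct StepSequencer {
    enum class State { IssueRequest, WaitResponse, UpdatePc };

    State state = State::IssueRequest;
    unsigned pc = 0;        // program counter, advanced by 4 per instruction
    bool request = false;   // asserted while a memory request is outstanding

    // Called once per trigger (for example, once per clock edge).
    void step(bool response_valid) {
        switch (state) {
        case State::IssueRequest:
            request = true;
            state = State::WaitResponse;
            break;
        case State::WaitResponse:
            // Cannot block here: return and check again on the next call.
            if (response_valid) {
                request = false;
                state = State::UpdatePc;
            }
            break;
        case State::UpdatePc:
            pc += 4;
            state = State::IssueRequest;
            break;
        }
    }
};
```

With a real stack, the same behavior reads top to bottom as "issue request, wait, update PC, loop". Without one, each wait point becomes a state and each resume becomes a switch case.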
Bug 2: Forgetting sc_stop() in an SC_THREAD
// WRONG: nothing ever tells the kernel that the test is over
SC_MODULE(broken_thread) {
    sc_out<bool> a, b;
    sc_in<bool> y;
    void run() {
        a.write(true); b.write(false);
        wait(10, SC_NS);
        // ... checks output ...
        // FORGOT sc_stop()
        // The thread returns here, but sc_start() keeps running for as
        // long as a clock or any other process still schedules events
    }
    SC_CTOR(broken_thread) { SC_THREAD(run); }
};
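When run() returns without calling sc_stop(), the SC_THREAD terminates, but nothing tells the kernel that the test is over. sc_start() with no arguments keeps running for as long as any process or channel still schedules events. If nothing else is active, the event queue empties and sc_start() returns on its own. Add a free-running sc_clock or any process that loops on wait(), as every clocked design does, and the events never run out: simulation time keeps advancing, the testbench prints nothing further, and the process sits at 100% CPU until you kill it.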
Fix: Always call sc_stop() from the thread that drives simulation end. In a testbench with one driver, this is the end of the run() function, after all tests complete. In a multi-threaded testbench, it is the "scoreboard" or "controller" thread that detects completion.
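The conditions under which a run loop ends can be sketched with a toy event queue in plain C++. This is not SystemC: ToyKernel, schedule(), stop() and start_clock() are hypothetical names. The model assumes what SystemC's no-argument sc_start() also does, namely that the loop returns once no events remain, and it adds a max_time bound as a safety net so the sketch itself can never hang.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Toy discrete-event loop: run() ends on starvation, on stop(), or at max_time.
struct ToyKernel {
    std::multimap<std::uint64_t, std::function<void()>> queue;  // time -> event
    std::uint64_t now = 0;
    bool stop_requested = false;

    void schedule(std::uint64_t delay, std::function<void()> fn) {
        queue.emplace(now + delay, std::move(fn));
    }
    void stop() { stop_requested = true; }

    // Process events in time order; returns how many were processed.
    std::size_t run(std::uint64_t max_time) {
        std::size_t processed = 0;
        while (!queue.empty() && !stop_requested) {
            auto it = queue.begin();
            if (it->first > max_time) break;   // safety bound reached
            now = it->first;
            auto fn = std::move(it->second);
            queue.erase(it);
            fn();                              // may schedule further events
            ++processed;
        }
        return processed;
    }
};

// A free-running "clock": every event schedules the next one, forever.
void start_clock(ToyKernel& k, std::uint64_t half_period) {
    k.schedule(half_period, [&k, half_period] { start_clock(k, half_period); });
}
```

A kernel holding a single one-shot event drains its queue and returns by itself. Once start_clock() is in play, the queue is never empty, so only stop() or the time bound ends the run, which mirrors why a clocked testbench needs an explicit sc_stop().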
DV Insight
The SC_METHOD / SC_THREAD distinction is not a SystemC-specific quirk; it maps to a fundamental distinction in hardware itself. In real silicon, combinational logic and sequential logic follow different timing rules. Combinational logic resolves within a combinational delay budget (the "critical path"). Sequential logic waits for clock edges. SystemC encodes this distinction in the process kind: SC_METHOD cannot wait, SC_THREAD can. If you are ever unsure which to use, ask: "Does this hardware block need to wait for something before producing its output?" If yes, SC_THREAD. If no, SC_METHOD.
The second insight is about testbench discipline. In the code above, sc_stop() is the testbench's responsibility, not the DUT's. A DUT never calls sc_stop(). It has no concept of "test complete." Only the verification environment knows when enough stimulus has been applied and results checked. This separation (the DUT knows nothing about the test, the testbench controls the test lifecycle) is a principle that scales all the way up to full UVM environments. The UVM run_phase ends once all objections have been dropped and any configured drain time has elapsed. sc_stop() here is the hand-rolled counterpart of dropping the last objection with phase.drop_objection().
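For a testbench with several stimulus threads, the "who ends the test?" question can be handled with a small counter in the spirit of UVM objections. The sketch below is plain C++ and purely illustrative: ToyObjection is a hypothetical helper, not a SystemC or UVM class, and the callback stands in for whatever ends the run (in a SystemC testbench, a call to sc_stop()).

```cpp
#include <functional>
#include <utility>

// Toy objection counter: each active stimulus source raises an objection
// when it starts and drops it when it finishes. The callback fires once,
// when the last outstanding objection is dropped.
struct ToyObjection {
    int outstanding = 0;
    std::function<void()> on_all_dropped;  // stand-in for calling sc_stop()

    explicit ToyObjection(std::function<void()> callback)
        : on_all_dropped(std::move(callback)) {}

    void raise() { ++outstanding; }

    void drop() {
        // Ignore unmatched drops; fire only on the transition to zero.
        if (outstanding > 0 && --outstanding == 0 && on_all_dropped) {
            on_all_dropped();
        }
    }
};
```

The design choice here is that no single stimulus thread needs to know whether it is the last one to finish; the counter decides, and the DUT stays unaware of any of it.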
Integration
SC_METHOD and SC_THREAD are not theoretical concepts — they are the implementation choice for every module in our RISC-V CPU build. Here is how the choice plays out across the series:
SC_METHOD modules (combinational — no wait, pure function of inputs):
- ALU (Post 5) — computes result from operands + opcode, sensitive to all three inputs
- Decoder (Post 9) — decodes 32-bit instruction word into control signals
- Forwarding unit (Post 20) — a combinational MUX network that routes the correct operand values
- Branch comparator (Post 14) — evaluates branch condition from two register values
SC_THREAD modules (sequential — sequence operations, use wait()):
- Fetch unit (Post 8) — issues memory request, waits for response, updates PC, loops
- Pipeline stage registers (Posts 18–21) — each stage waits for posedge clk, latches values, propagates
- Stall controller (Post 16) — detects hazard, asserts stall, waits N cycles, de-asserts
- Testbench driver (Posts 23–27) — generates instruction sequences with timing, calls sc_stop()
- Monitor (Post 6 preview) — passively observes outputs, records transactions in a log
When you see a module in this series and wonder "why SC_METHOD and not SC_THREAD?" — the answer is always: "Does it need to wait for something, or does it purely compute from its current inputs?" If it computes, it's SC_METHOD. If it sequences, it's SC_THREAD.
Series progress:
- Post 1 — Modules, Ports & Signals ✓
- Post 2 — Simulation Time & Clocks ✓
- Post 3 — Delta Cycles & Event-Driven Semantics ✓
- Post 4 — SC_METHOD vs SC_THREAD ✓
- Post 5 — Building the RV32I ALU (next)
What's Next
Post 5: Building the RV32I ALU
With the SC_METHOD pattern established, Post 5 applies it to a real piece of hardware: the full RV32I Arithmetic-Logic Unit. Ten operations, 32-bit operands, signed and unsigned arithmetic, shift operations, and a zero flag that drives conditional branches.
The ALU is the first module in our CPU that does something you could recognize from an architecture diagram. It is also the first module where the choice of sc_uint<32> vs sc_int<32> matters for correctness — and where getting it wrong produces subtly wrong results that your testbench might not catch unless you design the test cases carefully.