4. SystemC Tutorial - SC_METHOD vs SC_THREAD

Introduction

"My simulation hangs forever" or "my simulation aborts with an error about wait() — I don't understand why."

These are the two most common problems new SystemC engineers hit, and they both come from the same source: using the wrong process type. In SystemC, every process — every piece of concurrent hardware behavior — is either an SC_METHOD or an SC_THREAD. Choosing correctly is not a style preference. It is a hard requirement with concrete, observable consequences when you get it wrong.

The distinction maps directly to hardware. Some parts of a CPU are combinational — they produce outputs purely as a function of their current inputs, with no memory of previous states and no dependency on time passing. The ALU is the canonical example. Feed it two operands and an opcode, and it immediately produces a result. It does not wait for a clock. It does not remember what it did last cycle. It just computes. This behavior maps to SC_METHOD.

Other parts of a CPU are sequential — they depend on things happening in order, over time. The fetch unit retrieves an instruction from memory, then waits for the memory response, then increments the program counter, then requests the next instruction. It sequences operations. It waits. It loops. This behavior maps to SC_THREAD.

Getting this assignment wrong has immediate consequences. A wait() call inside an SC_METHOD aborts the simulation with a runtime error on the first call. A missing sc_stop() inside an SC_THREAD lets the simulation run forever, consuming CPU, never terminating. These are not subtle bugs — they are loud failures that halt your work. Understanding why the distinction exists, not just the rule, is what makes you able to apply it correctly to new modules you haven't seen before.

In a RISC-V CPU, the implications are pervasive. The ALU is SC_METHOD — every signal that feeds it (two operands, an opcode) is on its sensitivity list, and the moment any of those change, the ALU recomputes and updates its output. No clock needed. No time passing. Just pure combinational function evaluation. The fetch unit, the pipeline stall controller, and the instruction sequencer are all SC_THREAD — they issue a request, then call wait() for a response, then take action, then loop back to issue the next request. They are sequential processes that are meaningless without time.

This post builds a concrete AND gate as the DUT using SC_METHOD, drives it with an SC_THREAD testbench, and shows both bugs explicitly — so you can recognize and fix them when they appear in your own work.


Prerequisites


SystemC Language Reference

Quick-reference for the constructs introduced in this post:

Construct | Syntax | SV Equivalent | Key Difference
Declare SC_METHOD | SC_METHOD(proc_name); in SC_CTOR | always_comb or always @(list) | Must register in constructor; cannot be added at runtime
Declare SC_THREAD | SC_THREAD(proc_name); in SC_CTOR | initial / always with delays | Kernel allocates a private stack; survives across wait()
Declare SC_CTHREAD | SC_CTHREAD(proc_name, clk.pos()); | always_ff @(posedge clk) | Always clock-triggered; wait() with no args = next clock edge
Static sensitivity | sensitive << sig; | always @(sig) | Multiple: sensitive << a << b << c
Posedge sensitivity | sensitive << clk.pos(); | always @(posedge clk) | clk.pos() resolves to the posedge event (via an sc_event_finder)
Negedge sensitivity | sensitive << clk.neg(); | always @(negedge clk) | Separate events — posedge ≠ negedge
Wait for time | wait(10, SC_NS); | #10 | Only inside SC_THREAD or SC_CTHREAD
Wait for event | wait(event); | @(event_name) | Only inside SC_THREAD or SC_CTHREAD
Wait for clock edge | wait(); (in SC_CTHREAD) | @(posedge clk) inside always_ff | In SC_CTHREAD, bare wait() is always the registered clock
Override next trigger | next_trigger(event); | No direct equivalent | SC_METHOD only; overrides sensitivity list for one invocation
Timed next_trigger | next_trigger(10, SC_NS); | No direct equivalent | Makes SC_METHOD time-aware without a stack
Async reset | async_reset_signal_is(rst, true); | always_ff @(posedge clk or posedge rst) | Must be called in constructor, not in process body
End simulation | sc_stop(); | $finish | Call from testbench thread; the DUT never calls this

Translation Table

Before diving in, here is how SC_METHOD and SC_THREAD map across Verilog, SystemVerilog, and C++:

Concept | Classic Verilog | SystemVerilog | SystemC
Combinational always | always @(a or b or sel) | always_comb | SC_METHOD + sensitive << a << b << sel
Sequential clocked | always @(posedge clk) | always_ff @(posedge clk) | SC_CTHREAD(proc, clk.pos())
Async reset FF | always @(posedge clk or posedge rst) | always_ff @(posedge clk or posedge rst) | SC_CTHREAD + async_reset_signal_is(rst, true)
Initial block | initial begin ... end | initial begin ... end | SC_THREAD (no sensitivity list)
Time delay | #10 | #10 / ##10 (clocking block) | wait(10, SC_NS)
Event wait | @(signal) | @(signal) | wait(signal.value_changed_event())
Posedge wait | @(posedge clk) | @(posedge clk) | wait(clk.posedge_event())
Task with delays | task with #delay | task automatic with @/## | SC_THREAD subroutine (can call wait)
Function (no delay) | function | function / function automatic | SC_METHOD process or plain C++ function

The key insight for SystemVerilog engineers: SC_METHOD is always_comb. SC_THREAD is initial or an always block that uses #delays. If you would never put #10 inside a Verilog function, you should never put wait() inside an SC_METHOD. The analogy is exact.

The key insight for C++ engineers: SC_METHOD is a regular callback function that must return immediately, just like a signal handler or a Qt slot. SC_THREAD is a thread that can block, just like std::thread. SystemC manages the thread stack — you never create a std::thread yourself — but the semantics of "this code can block" vs. "this code must return immediately" are identical.


Concept Explanation

SC_METHOD — The Execution Model in Detail

An SC_METHOD is a process that:

  1. Runs from start to finish without stopping — it must return every time it is called
  2. Is triggered by its sensitivity list — it runs exactly when one of its sensitive signals changes
  3. Cannot call wait() — attempting to do so terminates the simulation immediately with an error
  4. Is re-triggered every time any sensitive signal changes — it does not "resume", it restarts

What "no stack between invocations" means precisely: Every time the kernel calls your SC_METHOD process, it is a fresh C++ function call. The call stack unwinds completely when the function returns. Any local variables you declared inside the function are destroyed. The next time the process fires, new local variables are allocated on the stack. There is no continuity of local state between invocations.

This is exactly the semantics of always_comb. A combinational block re-evaluates every time any input changes. It does not "wait" — it computes and produces an output in zero time (one delta cycle). SystemC's SC_METHOD models this faithfully.

Memory efficiency of SC_METHOD: Because there is no persistent stack, SC_METHOD processes are extremely lightweight. A design with 10,000 SC_METHOD processes uses essentially no extra memory for process state — only the module's member variables persist between calls (heap-allocated as part of the module object). By contrast, every SC_THREAD has its own stack (default 64 KB). 10,000 SC_THREAD processes would consume 640 MB of stack space alone.

How the kernel executes SC_METHOD — the internal call sequence:

[Kernel evaluates runnable process list]
    │
    ▼
process_handle->execute()
    │
    ▼
your C++ function: void compute() { ... }
    │
    ▼
[Function returns to kernel]
    │
    ▼
[Kernel updates signals written by compute()]
    │
    ▼
[Evaluate-update cycle continues]

The kernel literally calls compute() as a C++ function pointer. The function executes, returns, and the kernel moves on. There is no coroutine machinery, no fiber switching, no context save/restore. This is why wait() is illegal: there is no context to save and restore when the method returns.

flowchart TD
    A[Signal A changes] --> B[SC_METHOD\nwakes up]
    C[Signal B changes] --> B
    B --> D[Run compute function\nto completion]
    D --> E[Outputs updated\nin next delta]
    E --> F[SC_METHOD sleeps\nuntil next trigger]
    style B fill:#06b6d4,color:#fff
    style D fill:#334155,color:#fff
    style E fill:#10b981,color:#fff

SC_THREAD — The Execution Model in Detail

An SC_THREAD is a process that:

  1. Has its own call stack — the SystemC kernel allocates a stack for it, typically 64 KB by default
  2. Can call wait() to suspend itself — suspending yields control back to the kernel, which then runs other processes
  3. Resumes from exactly where it suspended — when the wait condition is satisfied, the thread picks up on the line after wait()
  4. Typically runs an infinite loop — once started, it loops forever (or until sc_stop() is called)

How SC_THREAD implements coroutine semantics: The SystemC kernel uses fibers (also called green threads or coroutines) to implement SC_THREAD. When wait() is called inside an SC_THREAD:

  1. The current stack frame (all local variables, return addresses, register state) is saved in the fiber's private stack area.
  2. Control transfers back to the kernel's scheduler.
  3. The kernel runs other ready processes.
  4. When the wait condition fires (the event occurs, the time expires, or the signal changes), the kernel marks this fiber as runnable.
  5. At the next delta cycle when this fiber is scheduled, the kernel restores the saved stack and resumes execution from the instruction immediately after wait().

This is identical in mechanism to how an OS kernel suspends and resumes threads — but implemented entirely in user space by the SystemC runtime, with no OS thread per process.

Stack size and memory cost: Each SC_THREAD has a default stack of 64 KB. For large designs:

100   SC_THREADs × 64 KB  =   6.4 MB
1,000 SC_THREADs × 64 KB  =  64   MB
10,000 SC_THREADs × 64 KB = 640   MB

If you need many SC_THREADs with simple behavior, consider whether SC_METHOD with next_trigger() can model the same behavior more efficiently.

flowchart TD
    A[SC_THREAD starts at t=0] --> B[Execute code]
    B --> C["call wait(10, SC_NS)"]
    C --> D[Kernel suspends this thread\nruns other processes\nadvances time by 10ns]
    D --> E[Resume after wait]
    E --> F[Execute more code]
    F --> G{Loop condition?}
    G -->|yes| B
    G -->|no| H[Thread exits\nor calls sc_stop]
    style C fill:#f59e0b,color:#fff
    style D fill:#334155,color:#fff
    style E fill:#10b981,color:#fff

SC_CTHREAD — Clocked Thread

SC_CTHREAD is a specialized SC_THREAD with one additional constraint: it is always triggered by a specific clock edge. You register it as:

SC_CTHREAD(proc_name, clk.pos());   // trigger on posedge clk
SC_CTHREAD(proc_name, clk.neg());   // trigger on negedge clk

What makes SC_CTHREAD different from SC_THREAD with sensitive << clk.pos():

  • In SC_CTHREAD, wait() with no arguments always means "wait for the next edge of the registered clock." There is no ambiguity.
  • In a plain SC_THREAD with sensitive << clk.pos(), a bare wait() waits for the default sensitivity — which is the same clock edge. But SC_CTHREAD semantics are more explicit and less error-prone.
  • SC_CTHREAD enables async_reset_signal_is() — a mechanism that allows an asynchronous reset signal to interrupt the thread even while it is suspended in wait().

Async reset with SC_CTHREAD:

SC_CTHREAD(proc_name, clk.pos());
async_reset_signal_is(rst, true);  // true = active-high reset

When rst goes high — even while the process is suspended inside wait() — the process receives a reset interrupt. The fiber is restarted from the beginning of the process body. This maps directly to:

always_ff @(posedge clk or posedge rst) begin
    if (rst) begin
        // reset logic
    end else begin
        // normal clock logic
    end
end

The first wait() pattern in SC_CTHREAD: A common convention is to put wait() as the first line of an SC_CTHREAD body:

void clocked_proc() {
    wait();        // skip time 0 — synchronize to first clock edge
    while (true) {
        // normal logic
        wait();    // wait for next clock edge
    }
}

Without the initial wait(), the process body runs at time 0 before the first clock edge has ever occurred. For most RTL models this is incorrect — you want to process on clock edges, not at time zero.

The Complete Comparison

Property | SC_METHOD | SC_THREAD | SC_CTHREAD
Can call wait() | No — simulation aborts | Yes | Yes
Runs to completion | Always, every trigger | Can suspend mid-execution | Can suspend mid-execution
Sensitivity list | Required (or next_trigger()) | Optional | Implicit: clock edge only
Runs at t=0? | Yes, during initialization (unless dont_initialize()) | Yes | Yes (careful: runs before first edge)
Has own call stack | No | Yes (64 KB default) | Yes (64 KB default)
Local variables persist between triggers | No — function restarts each time | Yes — stack is preserved across wait() | Yes — stack is preserved across wait()
Async reset support | No | No (must poll manually) | Yes — async_reset_signal_is()
Use for | Combinational logic, decoders, MUXes, forwarding | Stimulus generators, FSMs, protocol handlers, monitors | Clocked RTL registers, pipeline stages, counters
SV equivalent | always_comb / always @(sel) | initial / always with #delays | always_ff @(posedge clk or posedge rst)
CPU example (this series) | ALU (Post 5), Decoder (Post 9) | Fetch sequencer (Post 8), Testbench driver | Pipeline registers (Posts 18–21)

next_trigger() — Dynamic Sensitivity for SC_METHOD

By default, an SC_METHOD's sensitivity list is fixed at construction time in the SC_CTOR. But there is a mechanism to change when the method will next fire: next_trigger().

next_trigger(event) — wait for a specific event next time:

void my_method() {
    output.write(input.read() + 1);

    // Next invocation: wait for this specific event, not the static list
    next_trigger(done_event);
}

next_trigger(time, unit) — re-run after a time delay:

void my_method() {
    // Sample input every 100ns, regardless of when it changes
    sampled_output.write(input.read());
    next_trigger(100, SC_NS);
}

This is the key capability that next_trigger() gives SC_METHOD: time-awareness without becoming a thread. The method still returns immediately on every invocation — there is no stack, no coroutine overhead. But it controls exactly when it will next be called.

There is no direct SystemVerilog equivalent. The closest analog is an always block that switches between two sensitivity lists based on an enable condition, or a clocking block with a programmable delay. SystemC's next_trigger() is more flexible.

next_trigger() overrides the default sensitivity list for one invocation only. On the next trigger, the method runs again and can either call next_trigger() again or fall through to use the default sensitivity list.

The critical difference from SC_THREAD: next_trigger() does not suspend the method. The method still runs to completion and returns before the next trigger fires. It merely schedules when the method will be called again.


Simulation Semantics

How the Kernel Executes SC_METHOD vs SC_THREAD — Step by Step

Understanding the kernel's execution model prevents most process-type bugs. Here is the exact sequence for a design with one SC_METHOD and one SC_THREAD, with two signals a and b and an output y:

Elaboration phase (before sc_start()):

1. All SC_MODULE constructors run
2. SC_METHOD processes are registered with their sensitivity lists
3. SC_THREAD processes are registered (no sensitivity list needed)
4. sc_signal objects initialized to default values (false/0)

At sc_start(), delta cycle 0:

5. All processes with static sensitivity to initialized signals are scheduled
6. SC_THREAD processes are also scheduled immediately (they start at t=0)
7. Kernel runs the "evaluate" phase:
   - SC_METHOD: kernel calls compute(), it runs and returns
   - SC_THREAD: kernel resumes the fiber from its start
8. SC_THREAD calls wait(10, SC_NS):
   - Fiber stack is saved
   - Process is removed from runnable set
   - A time-10ns wakeup event is posted
9. Kernel runs "update" phase: all pending signal writes become visible
10. If any signal changed, processes sensitive to it are re-scheduled
11. Kernel runs delta cycle 1 if any processes are scheduled

At t=10ns:

12. Time advances to 10ns
13. SC_THREAD wakeup event fires
14. Fiber stack is restored, execution resumes after wait()
15. SC_THREAD writes a.write(true)
16. Signal 'a' is scheduled to update
17. SC_THREAD calls wait(10, SC_NS) again
18. Kernel update phase: sig_a changes to true
19. SC_METHOD (sensitive to a) is scheduled
20. Kernel evaluate phase: SC_METHOD runs, computes y, writes y
21. y's new value is written in the next update phase

ASCII timing diagram for the AND gate example:

Time:    0ns      10ns     20ns     30ns     40ns
         |        |        |        |        |
a:       0........1........0........1........(stop)
b:       0........0........1........1
         |        |        |        |
SC_METH: fires    fires    fires    fires
(compute) at Δ    at Δ     at Δ     at Δ
         |        |        |        |
y:       0........0........0........1
         |        |        |        |
SC_THRD: runs─wait─runs─wait─runs─wait─runs─sc_stop
(tb)     write    read     write   read
         a,b      y        a,b     y

The Δ marks are delta cycle boundaries — zero simulated time passes, but a full evaluate-update cycle occurs. The SC_METHOD fires within the same 10ns time step that the SC_THREAD writes the inputs, because signal writes propagate in the next delta cycle.

wait() Inside SC_THREAD — What Actually Happens

SC_THREAD fiber executing:
    ...
    a.write(true);      ← writes to signal's "new value" buffer
    wait(10, SC_NS);    ← 1. posts a wakeup event at t=current+10ns
                           2. calls setjmp() or equivalent to save fiber state
                           3. transfers control to kernel scheduler
                           4. kernel scheduler picks next runnable process
                           5. ... time passes, other processes run ...
                           6. at t=current+10ns, wakeup event fires
                           7. kernel marks this fiber as runnable
                           8. at next scheduling point, longjmp() restores fiber
    y = a.read();       ← execution continues HERE after wait() returns

This is why local variables survive across wait(): they are on the fiber's stack, and the stack is preserved exactly. Variables declared before wait() are available after wait() with their previous values intact.


Implementation

Example: AND Gate DUT + Testbench

This example is deliberately simple. The goal is to make SC_METHOD and SC_THREAD the focus — not the logic. An AND gate is familiar enough that the hardware is transparent and you can concentrate on the SystemC process mechanics.

The DUT is a 2-input AND gate modeled as SC_METHOD. The testbench is an SC_THREAD that exercises all four input combinations with 10ns between each, checks the output, and reports pass/fail.

// File: and_gate_tb.cpp
#include <systemc.h>

// ─── DUT ────────────────────────────────────────────────────────────────────
// AND gate — pure combinational logic, always SC_METHOD.
// Maps to: always_comb in SystemVerilog, or a simple function in C++.
// Never needs to wait — it computes output from inputs immediately.

SC_MODULE(and_gate) {
    sc_in<bool>  a, b;
    sc_out<bool> y;

    void compute() {
        y.write(a.read() & b.read());
    }

    SC_CTOR(and_gate) {
        SC_METHOD(compute);
        sensitive << a << b;   // re-evaluate whenever a or b changes
    }
};

// ─── Testbench ───────────────────────────────────────────────────────────────
// SC_THREAD stimulus generator — sequences through all 4 input combinations.
// Uses wait() to advance simulation time between stimulus changes.
// This is the correct use of SC_THREAD: sequential stimulus over time.

SC_MODULE(tb_and) {
    sc_out<bool> a, b;
    sc_in<bool>  y;

    void run() {
        // Truth table for AND gate — all 4 input combinations
        struct { bool a_val, b_val, expected; } tests[] = {
            {false, false, false},
            {true,  false, false},
            {false, true,  false},
            {true,  true,  true }
        };

        int pass = 0, fail = 0;

        for (auto& t : tests) {
            a.write(t.a_val);
            b.write(t.b_val);
            wait(10, SC_NS);   // allow DUT to settle, advance simulation time

            bool got = y.read();
            bool ok  = (got == t.expected);

            std::cout << "[" << sc_time_stamp() << "]  "
                      << "a=" << t.a_val << "  b=" << t.b_val
                      << "  →  y=" << got
                      << "  (expected " << t.expected << ")"
                      << (ok ? "  PASS" : "  FAIL") << std::endl;

            if (ok) pass++; else fail++;
        }

        std::cout << "\nResults: " << pass << " PASS, "
                  << fail << " FAIL" << std::endl;

        sc_stop();  // ← CRITICAL: without this, simulation runs forever
    }

    SC_CTOR(tb_and) {
        SC_THREAD(run);
        // Note: no sensitive list — SC_THREAD uses wait() for timing control
    }
};

// ─── sc_main ─────────────────────────────────────────────────────────────────

int sc_main(int argc, char* argv[]) {
    // Signals connecting DUT to testbench
    sc_signal<bool> sig_a, sig_b, sig_y;

    // Instantiate and bind
    and_gate dut("dut");
    dut.a(sig_a);
    dut.b(sig_b);
    dut.y(sig_y);

    tb_and tb("tb");
    tb.a(sig_a);
    tb.b(sig_b);
    tb.y(sig_y);

    sc_start();   // run until sc_stop() — called by tb after all tests complete
    return 0;
}

Expected output:

[10 ns]  a=0  b=0  →  y=0  (expected 0)  PASS
[20 ns]  a=1  b=0  →  y=0  (expected 0)  PASS
[30 ns]  a=0  b=1  →  y=0  (expected 0)  PASS
[40 ns]  a=1  b=1  →  y=1  (expected 1)  PASS

Results: 4 PASS, 0 FAIL

What is happening internally at each step:

At t=0: sc_start() is called. During initialization the kernel makes every process runnable, so both the and_gate's compute() SC_METHOD and the tb's run() SC_THREAD execute once at time zero (an SC_METHOD can opt out of this initial run with dont_initialize()). After that, compute() fires only when one of its sensitive signals changes.

run() writes a=0, b=0 (they were already 0 from sc_signal initialization — no change, no SC_METHOD trigger), then calls wait(10, SC_NS).

At t=10ns: simulation time advances. run() resumes. y.read() returns 0. The PASS line prints. run() writes a=1, b=0.

When a.write(true) executes, signal a changes. In the next evaluate-update cycle, compute() fires — it reads a=1, b=0, writes y=0 (same value, so no second trigger). run() calls wait(10, SC_NS).

This pattern repeats for all four input combinations.


Build & Run

# CMakeLists.txt in section1/post04/
cmake_minimum_required(VERSION 3.16)
project(post04_method_vs_thread)

set(SYSTEMC_HOME $ENV{SYSTEMC_HOME})
include_directories(${SYSTEMC_HOME}/include)
link_directories(${SYSTEMC_HOME}/lib-linux64)

add_executable(and_gate_tb and_gate_tb.cpp)
target_link_libraries(and_gate_tb systemc)

# Build and run, from section1/post04/:
mkdir build && cd build
cmake .. && make
./and_gate_tb

The Two Bugs — Seen Explicitly

Bug 1 and Bug 2 are not hypothetical. Both will appear in your work. Seeing them here — with the exact error message — means you will recognize them immediately when they appear.

Bug 1: Calling wait() in an SC_METHOD

// WRONG — this will abort the simulation
SC_MODULE(broken_method) {
    sc_in<bool>  clk;
    sc_in<bool>  input;    // ports used by the body below
    sc_out<bool> output;

    void bad_compute() {
        // Engineer thinks: "I'll wait for the clock edge before computing"
        wait();        // ← RUNTIME ABORT on first call
        output.write(input.read());
    }

    SC_CTOR(broken_method) {
        SC_METHOD(bad_compute);
        sensitive << clk.pos();  // rising edge
    }
};

Error you will see:

Error: (E519) wait() is only allowed in SC_THREADs and SC_CTHREADs:
        in SC_METHOD, use next_trigger() instead
In process: broken_method.bad_compute @ 0 s

The simulation terminates immediately when wait() is called. There is no recovery. The fix is either: (a) remove the wait() and model the behavior using the sensitivity list, or (b) change SC_METHOD to SC_THREAD — but then you must add the infinite loop and sc_stop() discipline.

Why this rule exists: SC_METHOD processes do not have their own call stack. The kernel calls them as regular C++ functions. A function cannot suspend itself mid-execution without a stack to preserve its state. wait() requires a stack — which only SC_THREAD has. Calling wait() in an SC_METHOD is asking the kernel to do something architecturally impossible.

Bug 2: Forgetting sc_stop() in an SC_THREAD

// WRONG — simulation runs forever, burning CPU
SC_MODULE(broken_thread) {
    sc_out<bool> a, b;
    sc_in<bool>  y;

    void run() {
        a.write(true); b.write(false);
        wait(10, SC_NS);
        // ... checks output ...
        // FORGOT sc_stop()
        // SC_THREAD returns here — but sc_start() never stops
    }

    SC_CTOR(broken_thread) { SC_THREAD(run); }
};

When run() returns without calling sc_stop(), the SC_THREAD terminates — but sc_start() in sc_main keeps going. If the design contains a free-running sc_clock or any other perpetual event source, simulation time advances forever with nothing left to check: the run appears "stuck," CPU usage goes to 100%, and the process never terminates. (With no remaining event source at all, sc_start() instead returns by event starvation — the simulation ends silently without reporting results, which is its own trap.)

Fix: Always call sc_stop() from the thread that drives simulation end. In a testbench with one driver, this is the end of the run() function, after all tests complete. In a multi-threaded testbench, it is the "scoreboard" or "controller" thread that detects completion.

Bug 3: Infinite Loop in SC_THREAD Without wait()

// WRONG — simulation hangs at t=0, never advances
void run() {
    while (true) {
        // process data
        result = compute(input);
        // FORGOT wait() — loop never yields to kernel
    }
}

An SC_THREAD with an infinite loop and no wait() inside the loop will execute forever at the same simulation timestamp. The kernel never gets control back to advance time. Simulation appears frozen at t=0. This is distinct from Bug 2: here the process is running but time never advances; in Bug 2, the process has exited but sc_start() blocks.

Symptom: sc_time_stamp() always shows the same time. The process consumes 100% CPU. The simulation never terminates.

Fix: Every infinite loop in SC_THREAD must contain at least one wait() call that suspends the process and allows time to advance.


Common Pitfalls for SV Engineers

Pitfall 1: wait() Inside SC_METHOD — The Silent Migration Trap

When converting an always_ff block to SystemC, the instinct is to use SC_METHOD. But always_ff blocks conceptually wait for a clock — and that "wait" doesn't translate. The correct SystemC target is SC_CTHREAD.

// SV: always_ff with posedge clock
always_ff @(posedge clk or posedge rst) begin
    if (rst) q <= 0;
    else     q <= d;
end
// WRONG: SC_METHOD cannot wait
SC_METHOD(ff_proc);
sensitive << clk.pos() << rst.pos();
// Inside: cannot call wait() — but also doesn't need to for simple FFs

// CORRECT for simple FF: SC_METHOD works if no wait() is needed
// (registered as: SC_METHOD(ff_proc); sensitive << clk << rst;)
void ff_proc() {
    if (rst.read()) q.write(0);
    else if (clk.read()) q.write(d.read()); // check clock level manually
}

// BETTER for clocked processes: SC_CTHREAD
SC_CTHREAD(ff_proc, clk.pos());
async_reset_signal_is(rst, true);
void ff_proc() {
    q.write(0);       // reset action
    wait();           // wait for first clock after reset clears
    while (true) {
        q.write(d.read());
        wait();       // wait for next posedge
    }
}

Pitfall 2: SC_CTHREAD wait() Semantics — Not Always the Clock

In SC_CTHREAD, a bare wait() waits for the registered clock edge. But if you call wait(some_event) or wait(10, SC_NS), you are waiting for that specific condition — not the clock. The clock sensitivity is only the default for bare wait().

SC_CTHREAD(proc, clk.pos());

void proc() {
    wait();                    // wait for next posedge clk — correct
    wait(done_event);          // wait for done_event — NOT the clock edge
    wait(10, SC_NS);           // wait 10ns — NOT a clock edge
    while (true) {
        wait();                // each iteration: wait for posedge clk
    }
}

SV engineers expect @(posedge clk) to be the only trigger in always_ff. SystemC's SC_CTHREAD is more flexible but also more dangerous: mixing wait() variants inside one SC_CTHREAD is legal but can create hard-to-debug timing mismatches.

Pitfall 3: async_reset_signal_is() Must Be in the Constructor

SV engineers are accustomed to adding sensitivity conditions in the always block sensitivity list, which can be modified freely. In SystemC, async_reset_signal_is() must be called in the constructor — specifically, it must be called as part of the SC_CTHREAD registration sequence. Calling it from inside the process body at runtime does nothing.

SC_CTOR(my_module) {
    SC_CTHREAD(proc, clk.pos());
    async_reset_signal_is(rst, true);   // CORRECT: in constructor
}

void proc() {
    async_reset_signal_is(rst, true);   // WRONG: no effect at runtime
    wait();
    // ...
}

Pitfall 4: SC_METHOD Sensitivity and sensitive << clk.pos() vs. sensitive << clk

sensitive << clk;       // fires on ANY change to clk (0→1 AND 1→0)
sensitive << clk.pos(); // fires only on 0→1 (posedge)
sensitive << clk.neg(); // fires only on 1→0 (negedge)

In SV, always @(clk) fires on both edges, while always @(posedge clk) fires on the posedge only. The SystemC analogs are exact. The common mistake: an SV engineer who means always @(posedge clk) reaches for sensitive << clk, which fires on both edges.

Pitfall 5: next_trigger() Is Not wait() — Local State Does Not Persist

next_trigger() schedules when the SC_METHOD will next fire, but the method still returns immediately after next_trigger() is called. Any local variable values are lost when the function returns. This confuses SV engineers who think of always blocks where state between #delays persists.

void my_method() {
    int counter = 0;        // declared locally
    counter++;              // incremented
    next_trigger(10, SC_NS); // next call in 10ns
    // counter is DESTROYED here — local variables do not persist
}
// On next invocation: counter starts at 0 again, not 1

If you need state to persist between next_trigger() invocations, use member variables (declared in the module, not in the function):

SC_MODULE(counter_module) {
    int count;   // member variable — persists between method invocations

    void my_method() {
        count++;              // member variable — survives
        next_trigger(10, SC_NS);
    }

    SC_CTOR(counter_module) : count(0) {
        SC_METHOD(my_method);
        // no sensitive list — next_trigger drives scheduling
    }
};

DV Insight

The SC_METHOD / SC_THREAD distinction is not a SystemC-specific quirk — it maps to a fundamental distinction in hardware itself. In real silicon, combinational logic and sequential logic follow different timing rules. Combinational logic resolves within a combinational delay budget (the "critical path"). Sequential logic waits for clock edges. SystemC encodes this distinction in the type system: SC_METHOD cannot wait, SC_THREAD can. If you are ever unsure which to use, ask: "Does this hardware block need to wait for something before producing its output?" If yes, SC_THREAD. If no, SC_METHOD.

The second insight is about testbench discipline. In the code above, sc_stop() is the testbench's responsibility — not the DUT's. A DUT never calls sc_stop(). It has no concept of "test complete." Only the verification environment knows when enough stimulus has been applied and results checked. This separation — DUT knows nothing about the test, testbench controls test lifecycle — is a principle that scales all the way up to full UVM environments. The UVM run_phase ends when the testbench's drain_time expires or when all objections are dropped. sc_stop() here is the hand-rolled version of UVM phase.drop_objection().

A third practical note on stack size: the default 64 KB per SC_THREAD is usually sufficient, but deeply recursive algorithms or processes that allocate large local arrays can overflow the stack silently (corrupting adjacent process stacks). If you see bizarre signal corruption that seems impossible given the logic, suspect stack overflow in an SC_THREAD. The SystemC kernel provides a way to increase stack size: SC_THREAD(proc); set_stack_size(256*1024); in the constructor. Budget extra when processes call deep C++ call chains.


Integration

SC_METHOD and SC_THREAD are not theoretical concepts — they are the implementation choice for every module in our RISC-V CPU build. Here is how the choice plays out across the series:

SC_METHOD modules (combinational — no wait, pure function of inputs):
- ALU (Post 5) — computes result from operands + opcode, sensitive to all three inputs
- Decoder (Post 9) — decodes 32-bit instruction word into control signals
- Forwarding unit (Post 20) — a combinational MUX network that routes the correct operand values
- Branch comparator (Post 14) — evaluates branch condition from two register values

SC_THREAD modules (sequential — sequence operations, use wait()):
- Fetch unit (Post 8) — issues memory request, waits for response, updates PC, loops
- Pipeline stage registers (Posts 18–21) — each stage waits for posedge clk, latches values, propagates
- Stall controller (Post 16) — detects hazard, asserts stall, waits N cycles, de-asserts
- Testbench driver (Posts 23–27) — generates instruction sequences with timing, calls sc_stop()
- Monitor (Post 6 preview) — passively observes outputs, records transactions in a log

When you see a module in this series and wonder "why SC_METHOD and not SC_THREAD?" — the answer is always: "Does it need to wait for something, or does it purely compute from its current inputs?" If it computes, it's SC_METHOD. If it sequences, it's SC_THREAD.

Series progress:
- Post 1 — Modules, Ports & Signals ✓
- Post 2 — Simulation Time & Clocks ✓
- Post 3 — Delta Cycles & Event-Driven Semantics ✓
- Post 4 — SC_METHOD vs SC_THREAD ✓
- Post 5 — Building the RV32I ALU (next)


What's Next

Post 5: Building the RV32I ALU

With the SC_METHOD pattern established, Post 5 applies it to a real piece of hardware: the full RV32I Arithmetic-Logic Unit. Ten operations, 32-bit operands, signed and unsigned arithmetic, shift operations, and a zero flag that drives conditional branches.

The ALU is the first module in our CPU that does something you could recognize from an architecture diagram. It is also the first module where the choice of sc_uint<32> vs sc_int<32> matters for correctness — and where getting it wrong produces subtly wrong results that your testbench might not catch unless you design the test cases carefully.

Post 5 → Building the RV32I ALU

Author
Mayur Kubavat
VLSI Design and Verification Engineer sharing knowledge about SystemVerilog, UVM, and hardware verification methodologies.
