3. SystemC Tutorial - Delta Cycles & Event-Driven Semantics
Introduction
"Why doesn't my output update immediately after I write a signal?"
This is the single most common question from engineers who are new to SystemC. You call sig_in.write(true), you immediately read sig_out.read(), and you get the old value back. You add a sc_start(SC_ZERO_TIME) call. Now the first stage updates — but the second stage is still wrong. You add another sc_start(SC_ZERO_TIME). Now everything looks correct, but you're not sure why, and you're worried you're masking a bug rather than understanding the model.
The answer is delta cycles — and understanding them is not optional. It is the foundation of every multi-process simulation you will ever write in SystemC.
Delta cycles exist because SystemC uses the same evaluate-update model as real hardware. In silicon, a signal driven by one flip-flop does not arrive at the next flip-flop instantaneously — it propagates through wires and combinational logic over a finite (if tiny) time. In a simulator, zero-time propagation through multiple stages still requires an ordering mechanism. If stage 1 writes a signal and stage 2 reads that signal in the same instant, which process runs first? The evaluate-update model answers this by separating the act of computing a new value (evaluate phase) from the act of making that value visible to other processes (update phase). A delta cycle is one pass through that pair of phases.
This is not a SystemC quirk. Every hardware simulator you have used — ModelSim, VCS, Xcelium, Questa, Verilator — uses exactly this model. The IEEE 1364 Verilog standard defines it. The IEEE 1800 SystemVerilog standard defines it. IEEE 1666 (SystemC) defines it. They all converge on the same evaluate-update loop because they are all modeling the same physical reality.
Consider the forwarding logic in a real RISC-V pipeline. When the EX stage computes a result and the MEM stage needs that value in the same clock cycle, the forwarding path must propagate through several levels of combinational logic — a multiplexer, a comparator, routing wires — before settling at the MEM stage input. In silicon this takes maybe 200–400ps. In a simulator it takes delta cycles: each level of logic that re-evaluates is one delta. The simulator produces the correct stable value before it advances the clock, just as silicon produces the correct stable value before the next clock edge samples it.
Post 3 builds two concrete models. First, a two-stage combinational chain that makes delta cycle propagation visible in your terminal output — you will see stage 1 and stage 2 both fire within a single sc_start() call. Second, a producer-consumer handshake using sc_event, which is the explicit event mechanism that underlies transaction-level communication. The producer-consumer pattern we build here is structurally identical to the fetch→decode handshake in our RISC-V CPU at Post 10.
Prerequisites
- Completed Post 1 — the `pass_through` module must be working. `SC_METHOD`, `sc_signal`, and port binding are used throughout this post. (Post 1 — Modules, Ports & Signals)
- Completed Post 2 — the `dff` module must be working. Familiarity with `sc_start()`, `SC_ZERO_TIME`, and `sc_time_stamp()` is assumed. (Post 2 — Simulation Time & Clocks)
- SystemC 2.3.x installed — see the install guides linked in Post 1
- Code for this post: GitHub — section1/post03
Translation Table
Before the concept explanations, here is the mapping from what you already know:
| Concept | C++ Engineer | SystemVerilog Engineer |
|---|---|---|
| Delta cycle | Sub-timestep evaluation round | Delta delay (same concept, same name) |
| `sc_event` | `std::condition_variable` | `event` keyword |
| `event.notify()` | `cv.notify_one()` | `-> event_name` |
| `wait(event)` | `cv.wait()` (blocking in sim, not in OS) | `@(event_name)` |
| Evaluate phase | "run all ready functions" | `always` block execution |
| Update phase | "apply buffered writes" | Non-blocking assignment update (#0 semantics) |
| `sc_signal.write()` | Buffered write — not visible until update phase | Non-blocking assignment `<=` |
The key insight for C++ engineers: sc_signal.write() is not like a normal variable assignment. It does not change the visible value immediately. It enqueues the new value for the update phase. If you want the value you just wrote to be visible to another process, a delta cycle must complete first.
The key insight for SystemVerilog engineers: delta cycles in SystemC work exactly as they do in SystemVerilog. If you have used non-blocking assignments and wondered why a chain of always blocks triggered by the same event all see the old value until the #0 update, this is the same mechanism. SystemC makes it more explicit — you can see it in the sc_start() call boundaries — but the semantics are identical.
Concept Explanation
SystemC Language Reference
Every delta-cycle and event construct used in this post:
| Construct | Syntax | SV Equivalent | Key Difference |
|---|---|---|---|
| `sc_event` | `sc_event my_ev;` | `event my_ev;` | No associated value — pure occurrence signal. Unlike `sc_signal`, it does not hold state between firings |
| `notify()` (immediate) | `my_ev.notify()` | `-> my_ev` | Fires in the CURRENT evaluate phase; triggered processes run in the same evaluation cycle as the notifier |
| `notify(SC_ZERO_TIME)` | `my_ev.notify(SC_ZERO_TIME)` | `#0 -> my_ev` (approximately) | Fires in the NEXT delta — one evaluate-update cycle later; avoids immediate notification races |
| `notify(T, unit)` | `my_ev.notify(10, SC_NS)` | `fork #10 -> my_ev; join_none` | Fires at a future simulation timestamp; no direct single-statement SV equivalent |
| `wait(event)` | `wait(my_ev)` | `@(my_ev)` | Suspends SC_THREAD until event fires; no timeout — waits indefinitely |
| `value_changed_event()` | `sig.value_changed_event()` | `@(sig)` (any change) | Returns the event that fires whenever `sig`'s committed value changes |
| `posedge_event()` | `sig.posedge_event()` | `@(posedge sig)` | Returns the event for 0→1 transition of a boolean signal |
| `negedge_event()` | `sig.negedge_event()` | `@(negedge sig)` | Returns the event for 1→0 transition |
| `sc_delta_count()` | `sc_delta_count()` | No equivalent | Returns the current delta counter (implementation-specific; for debugging only) |
| Evaluate phase | Kernel internal | `always` block execution round | Phase where all runnable processes execute and call `write()` |
| Update phase | Kernel internal | NBA update region | Phase where all queued `write()` values are committed to signals |
| Delta cycle | One evaluate+update pair | "delta delay" | Occurs at constant simulation time T; simulation time does not advance during delta cycles |
1. The Evaluate-Update Cycle — Fully Explained
The SystemC simulation kernel operates as a loop. At each simulation time step, it does not simply "run all processes." It runs two distinct phases, potentially many times, before advancing the clock:
Evaluate phase: All processes (SC_METHOD or SC_THREAD) whose sensitivity lists have triggered run to completion. During this phase, a process may call signal.write(new_value). That write is buffered — the signal's visible value does not change yet. Other processes reading the signal in the same evaluate phase see the old value.
Update phase: All buffered writes are committed atomically. Signals now reflect their new values. The kernel checks: did any signal change? If yes, some process may be sensitive to that change and needs to run — so the kernel triggers another evaluate phase. This is one delta cycle.
Stability: When an evaluate-update pair completes with no new signal changes, the system has reached a stable state at this time step. The kernel advances to the next scheduled time event.
The kernel's inner loop as pseudocode:
while (simulation_not_finished):
// ── EVALUATE PHASE ──────────────────────────────────────────────────────
while (runnable_processes_exist):
process = ready_queue.pop()
process.execute()
// Inside execute():
// signal.read() → returns signal.m_cur_val (current committed value)
// signal.write() → stores into signal.m_new_val ONLY (buffered)
// event.notify() → adds triggered processes to ready_queue
// ── UPDATE PHASE ────────────────────────────────────────────────────────
changed_signals = []
for each signal in written_signals_this_phase:
if signal.m_new_val != signal.m_cur_val:
signal.m_cur_val = signal.m_new_val // commit the write
changed_signals.append(signal)
// fire signal.value_changed_event()
// → adds processes sensitive to this signal to ready_queue
// ── DELTA DECISION ──────────────────────────────────────────────────────
if ready_queue is not empty:
delta_count++
// Loop back to EVALUATE (still at timestamp T)
else:
// System is stable at timestamp T
// Advance simulation time to next_scheduled_event
T = event_queue.pop_earliest_time()
This is not a SystemC invention. IEEE 1666 (SystemC), IEEE 1364 (Verilog), and IEEE 1800 (SystemVerilog) all specify essentially the same loop. The evaluate-update model exists because hardware physics requires it: gates propagate signals in finite time, and a simulator must provide an ordering mechanism that respects that causality even when modeling zero-delay ideal gates.
Relationship to Verilog's scheduling regions:
Verilog defines multiple scheduling regions within a single timestep: Active, NBA (non-blocking assignment), Observed, Reactive, Postponed. SystemC collapses this to evaluate + update, which maps directly to Verilog's Active + NBA regions. The correspondence is:
| Verilog Region | SystemC Phase | What happens |
|---|---|---|
| Active | Evaluate | Blocking assignments, always @(*) execution, function calls |
| NBA | Update | Non-blocking assignment (<=) commits |
| Observed | — | SV assertions (no direct SystemC equivalent) |
| Reactive | — | SV program blocks (no direct SystemC equivalent) |
SystemC also has a concept of "delta notification" vs "immediate notification" for sc_event, which maps roughly to the distinction between scheduling events in the Active vs. NBA region.
2. Delta Cycles — Concrete Example with Step-by-Step Numbering
Consider a three-process chain: A drives B, B drives C, C drives the output.
Input → [Process A] → sig_b → [Process B] → sig_c → [Process C] → Output
In hardware, this represents three levels of combinational logic. Each level adds propagation delay. In the simulator, each level takes one delta cycle.
Why delta cycles are necessary:
Without delta cycles, the simulator would need to run processes in strict topological order — A before B before C — based on the signal dependency graph. For combinational logic this is computable (Verilator does it). But for feedback loops, mutual dependencies, and dynamically-created process graphs, strict topological ordering is impossible. The evaluate-update model solves this by never requiring topological ordering: all processes can run in any order during evaluate, and the update phase makes writes visible atomically.
Step-by-step execution of the three-stage chain:
Initial state at timestamp T:
sig_b.m_cur_val = old_b
sig_c.m_cur_val = old_c
output.m_cur_val = old_out
Input signal changes to new_input (from testbench write):
input.m_cur_val = new_input (testbench writes take effect after sc_start)
Process A is in ready_queue (sensitive to input)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EVALUATE Δ0 at timestamp T
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Process A runs:
reads input.m_cur_val = new_input
calls sig_b.write(transform(new_input))
sig_b.m_new_val = transform(new_input) ← BUFFERED, not yet visible
Process A returns.
Process B does NOT run — sig_b.m_cur_val hasn't changed yet.
UPDATE Δ0:
sig_b.m_cur_val ← sig_b.m_new_val = transform(new_input)
sig_b changed → value_changed_event fires
Process B added to ready_queue (sensitive to sig_b)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EVALUATE Δ1 at timestamp T
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Process B runs:
reads sig_b.m_cur_val = transform(new_input) ← sees NEW value
calls sig_c.write(transform2(sig_b))
sig_c.m_new_val = transform2(sig_b) ← BUFFERED
UPDATE Δ1:
sig_c.m_cur_val ← sig_c.m_new_val
sig_c changed → value_changed_event fires
Process C added to ready_queue
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EVALUATE Δ2 at timestamp T
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Process C runs:
reads sig_c.m_cur_val = transform2(sig_b) ← sees NEW value
calls output.write(final_value)
UPDATE Δ2:
output.m_cur_val ← final_value
output changed → value_changed_event fires
No processes sensitive to output → ready_queue is empty
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
STABLE at timestamp T
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
output.m_cur_val = final_value (correct, fully propagated)
Simulation time advances to next scheduled event.
The key observation: All three delta cycles happen inside a single sc_start(SC_ZERO_TIME) call. From the caller's perspective, one function call → system is stable → all values are correct. You do not call sc_start once per delta cycle.
ASCII timeline — three-stage chain:
Timestamp T (sc_time_stamp() returns T throughout all deltas):
Δ0 Δ1 Δ2 Stable
├──────────────┼──────────────┼──────────────┤
EVALUATE A runs B runs C runs
write(sig_b) write(sig_c) write(output)
UPDATE sig_b commits sig_c commits output commits
B→ready C→ready no one→ready
sig_b: old_b──────────new_b──────────────────────────
sig_c: old_c──────────────────new_c──────────────────
output: old_out─────────────────────────final_value───
Time: ────────────────────────────── T ─────────────►
(sc_time_stamp() == T for ALL of these delta cycles)
Equivalence to Verilog's always-chain:
// Three-stage chain in SystemVerilog
always @(input) sig_b <= transform(input);
always @(sig_b) sig_c <= transform2(sig_b);
always @(sig_c) output <= transform3(sig_c);
When input changes in Verilog, the first always block fires (Active region), computes sig_b's new value via NBA. After the NBA region, sig_b is committed — the second always fires, computes sig_c, and so on. This is delta-cycle propagation in Verilog's terminology. Each always firing is one delta. The Verilog simulator handles N levels of always blocks in N delta cycles — exactly the SystemC behavior.
flowchart TD
A[Simulation time T] --> B[Evaluate Phase\nRun all ready processes]
B --> C{Any signals\nchanged?}
C -->|Yes — delta cycle| B
C -->|No — stable| D[Advance to next\ntime event]
D --> A
style B fill:#06b6d4,color:#fff
style C fill:#f59e0b,color:#fff
style D fill:#10b981,color:#fff
Notice that the evaluate phase loops back to itself, not forward to a new time step. That loop is the delta cycle. Each iteration is "delta N" at the same simulation timestamp T. From the perspective of sc_time_stamp(), the time does not change during delta cycles — they all occur at time T. The delta counter is an internal kernel bookkeeping number that is not directly exposed to user code in standard SystemC (though some implementations provide sc_delta_count() as a debugging aid).
3. Two-Stage Chain: Where Delta Cycles Become Visible
Consider a combinational chain: A → stage1 → B → stage2 → C.
Stage 1 is an SC_METHOD sensitive to signal A. Stage 2 is an SC_METHOD sensitive to signal B.
When A changes at time T:
| Delta | What happens |
|---|---|
| Evaluate Δ0 | Stage 1 runs. It reads A (new value) and calls B.write(new_value). Write is buffered. Stage 2 does NOT run yet — B has not officially changed. |
| Update Δ0 | B's new value is committed. The kernel sees B changed. Stage 2 is sensitive to B. |
| Evaluate Δ1 | Stage 2 runs. It reads B (new value) and calls C.write(new_value). Write is buffered. |
| Update Δ1 | C's new value is committed. No more sensitive processes. |
| Stable | System is stable at time T. Both B and C hold the correct propagated value. |
From your perspective as a user calling sc_start(SC_ZERO_TIME), both delta cycles happen inside that one call. When sc_start returns, the system is stable — C already has the correct value. You do not need to call sc_start once per delta cycle. The kernel handles all internal delta iterations automatically.
This is the most important thing to understand before reading Example 1 below.
flowchart LR
A([A\nchanges at T]) --> |Δ0 evaluate| S1[Stage 1\nreads A\nwrites B ← buffered]
S1 --> |Δ0 update| B([B\ncommitted])
B --> |Δ1 evaluate| S2[Stage 2\nreads B\nwrites C ← buffered]
S2 --> |Δ1 update| C([C\ncommitted\nstable])
style A fill:#f59e0b,color:#fff
style B fill:#06b6d4,color:#fff
style C fill:#10b981,color:#fff
style S1 fill:#334155,color:#fff
style S2 fill:#334155,color:#fff
4. sc_event Theory — Explicit Event Signaling
sc_signal<T> carries an implicit event: the signal fires its value_changed_event automatically whenever its committed value changes. This is what makes sensitive << my_signal work.
sc_event is a separate, explicit event object with no associated value. It is a pure occurrence — a moment in time that something happened. It has no history, no current state, no "has this event fired recently" query. Use sc_event when you need to signal occurrence, not value.
sc_signal vs. sc_event — key behavioral differences:
| Property | `sc_signal<T>` | `sc_event` |
|---|---|---|
| Carries a value | Yes — `read()` returns current value | No — pure occurrence |
| Persists between firings | Yes — value holds until next write | No — event is instantaneous |
| Automatic notification | Yes — fires on value change only | No — must call `notify()` explicitly |
| Can notify unconditionally | No — write same value = no event | Yes — `notify()` always fires |
| Multiple readers | Yes — any process can `read()` | Via `wait(event)` — process suspends until event |
| Hardware analogy | Wire / net | Interrupt pulse / strobe |
The three forms of notify() — semantics matter:
sc_event e;
// FORM 1: Immediate notification
e.notify();
// Fires NOW, in the CURRENT evaluate phase.
// Any process calling wait(e) that is ALREADY suspended on this event
// is added to the ready queue and runs in THIS delta cycle's evaluate phase.
// Processes that call wait(e) AFTER this notify() in a later delta will
// NOT see this notification — it has already passed.
// FORM 2: Delta notification
e.notify(SC_ZERO_TIME);
// Fires in the NEXT delta cycle.
// This is equivalent to: schedule a value_changed_event for e
// at the next evaluate phase. Processes waiting on e will wake up
// in the next evaluate phase after the current one.
// FORM 3: Timed notification
e.notify(10, SC_NS);
// Fires 10 simulation nanoseconds from now.
// The kernel places this notification in the future event queue.
// No direct single-statement equivalent in SV.
Mapping to SystemVerilog event semantics:
| Notification form | SystemC | SystemVerilog |
|---|---|---|
| Immediate | `e.notify()` | `-> e` |
| Next delta | `e.notify(SC_ZERO_TIME)` | `#0 -> e` (approximately) |
| Timed | `e.notify(10, SC_NS)` | `fork #10 -> e; join_none` |
| Wait for event | `wait(e)` in SC_THREAD | `@(e)` |
| Multiple events (OR) | `wait(e1 \| e2)` | `@(e1 or e2)` |
| Multiple events (AND) | `wait(e1 & e2)` (event AND-list) | `@(e1) @(e2)` sequentially |
When immediate notify() can cause an infinite loop:
SC_MODULE(bad_example) {
sc_event my_event;
void process_a() {
// This process is triggered by my_event
// ...some work...
my_event.notify(); // BUG: immediate notification re-triggers this same process!
// next delta: process_a runs again → notifies again → infinite delta loop
// Simulator detects this and aborts with "delta cycle limit exceeded"
}
SC_CTOR(bad_example) {
SC_THREAD(process_a);
// process_a waits on my_event somewhere inside
}
};
If a process notifies the same event it is sensitive to using immediate notify(), the kernel will keep re-triggering the process indefinitely. Use notify(SC_ZERO_TIME) to break the cycle — the process completes first, then the event fires in the next delta, giving the kernel a chance to check stability:
void process_a() {
// ...some work...
my_event.notify(SC_ZERO_TIME); // CORRECT: fires in next delta, not current
}
sc_event vs sc_signal
// Use sc_signal<bool> when the value matters:
sc_signal<bool> ready;
ready.write(true); // other modules can read ready.read() at any time
// Use sc_event when the occurrence matters, not the value:
sc_event data_valid;
data_valid.notify(); // signals "data is ready NOW" — no persistent state
The practical rule: if any module needs to query "is the signal currently high?" at an arbitrary time (polling), use sc_signal<bool>. If modules only need to react to the moment of occurrence (event-driven), use sc_event. In our RISC-V CPU, instruction-ready handshakes between fetch and decode use sc_event (react to occurrence), while the pipeline flush flag uses sc_signal<bool> (decode stage polls whether flush is currently active).
Simulation Semantics — The Scheduler from First Principles
The SystemC scheduler is the most important piece of infrastructure in this entire tutorial series. Every behavioral nuance — why values appear when they do, why processes run in a certain order, why sc_start(SC_ZERO_TIME) settles everything — flows from understanding the scheduler.
Here is the scheduler written as complete pseudocode, including both the evaluate-update loop and the event queue management:
// ── GLOBAL SCHEDULER STATE ───────────────────────────────────────────────
event_queue: sorted list of (time, sc_event) pairs — future events
ready_queue: list of processes to run in current evaluate phase
delta_count: integer counter, resets at each new timestamp
current_time: sc_time — current simulation timestamp
// ── INITIALIZATION ────────────────────────────────────────────────────────
// (runs before first sc_start() call)
for each SC_METHOD registered:
add to ready_queue // all methods run once at t=0 for initialization
// ── MAIN SIMULATION LOOP ─────────────────────────────────────────────────
while (simulation_not_stopped):
// EVALUATE PHASE
while ready_queue is not empty:
process = ready_queue.dequeue()
process.execute()
// execute() calls:
// signal.read() → reads signal.m_cur_val
// signal.write(v) → stores v in signal.m_new_val; marks signal as dirty
// event.notify() → immediate: adds waiting processes to ready_queue
// SC_ZERO_TIME: adds event to delta_event_queue
// timed: adds event to future event_queue
// UPDATE PHASE
delta_events_fired = []
for each dirty_signal:
old_val = dirty_signal.m_cur_val
dirty_signal.m_cur_val = dirty_signal.m_new_val
if dirty_signal.m_cur_val != old_val:
// Signal value actually changed — notify waiters
for each process in dirty_signal.waiters:
ready_queue.enqueue(process)
delta_events_fired.append(dirty_signal.value_changed_event)
// DELTA DECISION
if ready_queue is not empty or delta_event_queue is not empty:
// Process delta events
for each event in delta_event_queue:
for each process waiting on event:
ready_queue.enqueue(process)
delta_event_queue.clear()
delta_count++
// Loop back to EVALUATE at same current_time
else:
// System is stable at current_time
// Advance to next future event
if event_queue is empty:
break // simulation ends — no more events
next_time, next_event = event_queue.pop_minimum()
current_time = next_time
delta_count = 0
// Wake all processes waiting on next_event
for each process waiting on next_event:
ready_queue.enqueue(process)
This pseudocode is faithful to IEEE 1666. Real implementations add thread scheduling, process priorities, and optimization (Verilator-style static scheduling), but the observable behavior matches this model.
Why this model is correct for hardware:
In real CMOS gates, when an input changes, the output changes after the gate's propagation delay. Gates that are inputs to the changed gate then change after their propagation delays. This creates a wave of changes propagating forward through the combinational logic cloud. In ideal (zero-delay) simulation, all those propagation delays collapse to zero real time but must still be ordered. Delta cycles provide that ordering: each "wave front" of changes is one delta cycle. The simulator runs waves until no new changes occur — exactly as silicon settles to a new stable state after an input transition.
Expanded Translation: Verilog, SystemVerilog, and SystemC Side by Side
| Concept | Classic Verilog | Modern SystemVerilog | SystemC |
|---|---|---|---|
| Deferred signal update | `b <= a + 1` (non-blocking) | `b <= a + 1` (same) | `sig_b.write(sig_a.read() + 1)` — always deferred |
| Immediate update | `b = a + 1` (blocking, within one always) | `b = a + 1` (same, procedural) | Local C++ variable: `int b = sig_a.read() + 1` |
| Sensitivity list | `always @(a or b)` | `always_comb` (auto-sensitivity) | `sensitive << a << b` (manual, always) |
| Auto-sensitivity | Not in Verilog-95; `@*` in Verilog-2001 | `always_comb` infers automatically | Not supported — must explicitly list |
| Named events | `event done; -> done; @(done)` | Same, plus `triggered` property | `sc_event done; done.notify(); wait(done)` |
| Timed event | `#10 -> done` | `fork #10 -> done; join_none` | `done.notify(10, SC_NS)` |
| Delta cycle count | Not user-accessible | `$get_time_precision` (time only) | `sc_delta_count()` (impl-specific) |
| Combinational chain depth | N always blocks = N delta cycles | Same | N SC_METHODs = N delta cycles |
| Circular comb dependency | Oscillates until delta limit | Same (simulation error) | Same — kernel detects and aborts |
| Wait for condition | `wait (sig == 1)` (Verilog statement) | `wait (sig == 1)` (native) | `while (sig.read() != 1) wait(sig.value_changed_event());` |
| Concurrent threads | `fork ... join_none` | `fork ... join_none` + class + task | Multiple SC_THREAD registrations |
Implementation
Example 1: Two-Stage Combinational Chain
This example makes the delta cycle propagation visible in the terminal. Read the output carefully — both stages fire within the single sc_start(SC_ZERO_TIME) call.
// File: two_stage_chain.cpp
#include <systemc.h>
SC_MODULE(two_stage_chain) {
sc_in<bool> in_sig;
sc_out<bool> mid_sig;
sc_out<bool> out_sig;
sc_signal<bool> internal; // intermediate signal between the two stages
// Stage 1: combinational — propagates in_sig → internal
void stage1() {
internal.write(in_sig.read());
std::cout << " stage1 fired: internal will be " << in_sig.read()
<< " (write buffered, not yet visible)" << std::endl;
}
// Stage 2: combinational — propagates internal → out_sig
void stage2() {
out_sig.write(internal.read());
mid_sig.write(internal.read());
std::cout << " stage2 fired: out_sig will be " << internal.read()
<< " (write buffered, not yet visible)" << std::endl;
}
SC_CTOR(two_stage_chain) {
SC_METHOD(stage1); sensitive << in_sig;
SC_METHOD(stage2); sensitive << internal;
}
};
int sc_main(int argc, char* argv[]) {
sc_signal<bool> sig_in, sig_mid, sig_out;
two_stage_chain dut("dut");
dut.in_sig(sig_in);
dut.mid_sig(sig_mid);
dut.out_sig(sig_out);
// Initialize to known state
sig_in.write(false);
sc_start(SC_ZERO_TIME); // settle — both stages fire once during this call
std::cout << "[t=0 settled] in=" << sig_in.read()
<< " mid=" << sig_mid.read()
<< " out=" << sig_out.read() << std::endl;
// Drive input high — BOTH stages will fire inside this single sc_start call
std::cout << "\n--- Driving in=true ---\n";
sig_in.write(true);
sc_start(SC_ZERO_TIME); // stage1 fires in delta 1, stage2 fires in delta 2 — all inside here
std::cout << "[after sc_start] in=" << sig_in.read()
<< " mid=" << sig_mid.read()
<< " out=" << sig_out.read() << std::endl;
return 0;
}
Expected output:
stage1 fired: internal will be 0 (write buffered, not yet visible)
stage2 fired: out_sig will be 0 (write buffered, not yet visible)
[t=0 settled] in=0 mid=0 out=0
--- Driving in=true ---
stage1 fired: internal will be 1 (write buffered, not yet visible)
stage2 fired: out_sig will be 1 (write buffered, not yet visible)
[after sc_start] in=1 mid=1 out=1
PEDAGOGICAL NOTE — this is the most important thing to take away from this example:
Look at the output for the in=true drive. Both stage1 fired and stage2 fired print before the final state line. This means both stages ran within the single sc_start(SC_ZERO_TIME) call. Stage 1 fired in one internal delta, stage 2 fired in the next internal delta, and by the time sc_start returned, the system had settled through both deltas automatically.
You might expect that you need to call sc_start(SC_ZERO_TIME) once to propagate through stage 1, and again to propagate through stage 2. This is incorrect. One sc_start(SC_ZERO_TIME) call drives the SystemC kernel until stability — which means it runs as many internal evaluate-update cycles (delta cycles) as necessary. For a two-stage chain, it runs two internal deltas. For a ten-stage chain, it would run ten. You never need to call sc_start once per delta cycle.
The labels "delta 1" and "delta 2" that you might write in comments or diagrams are pedagogical labels — conceptual markers that help you reason about propagation order. They are not values you can read from a counter in standard user code. The SystemC kernel tracks them internally. What you observe as a user is: one sc_start call → system converges → correct values everywhere.
Example 2: sc_event Producer-Consumer Handshake
This example shows sc_event as an explicit synchronization mechanism between two modules. The producer writes data to a signal and fires an event. The consumer blocks on that event and reads the data when it fires. No polling. No shared state beyond the event object and the signal.
These RISC-V instruction encodings are real — 0x00500113 is ADDI x2, x0, 5 and 0x00A00193 is ADDI x3, x0, 10. We will parse these exact encodings in our decode stage at Post 11.
// File: producer_consumer.cpp
#include <systemc.h>
SC_MODULE(producer) {
sc_out<sc_uint<32>> data;
sc_event data_ready; // explicit event — fires when data is valid
void produce() {
// Transaction 1: ADDI x2, x0, 5 (RV32I encoding)
data.write(0x00500113);
data_ready.notify(SC_ZERO_TIME); // fire in the NEXT delta: by then the data
                                 // write has committed and the consumer's
                                 // wait(data_ready) is already armed — an
                                 // immediate notify() here could be lost or
                                 // deliver the stale (uncommitted) bus value
wait(20, SC_NS);
// Transaction 2: ADDI x3, x0, 10 (RV32I encoding)
data.write(0x00A00193);
data_ready.notify(SC_ZERO_TIME);
wait(20, SC_NS);
sc_stop();
}
SC_CTOR(producer) {
SC_THREAD(produce);
}
};
SC_MODULE(consumer) {
sc_in<sc_uint<32>> data;
sc_event& data_ready; // reference to producer's event — no copy
void consume() {
while (true) {
wait(data_ready); // suspend here until producer calls data_ready.notify()
std::cout << "[" << sc_time_stamp() << "] Received instruction: 0x"
<< std::hex << data.read() << std::dec << std::endl;
}
}
SC_HAS_PROCESS(consumer);
consumer(sc_module_name name, sc_event& ev)
: sc_module(name), data_ready(ev) {
SC_THREAD(consume);
}
};
int sc_main(int argc, char* argv[]) {
sc_signal<sc_uint<32>> instr_bus;
producer prod("prod");
prod.data(instr_bus);
// Pass producer's event to consumer at construction time
consumer cons("cons", prod.data_ready);
cons.data(instr_bus);
sc_start(); // run until sc_stop() is called
return 0;
}
Expected output:
[0 ns] Received instruction: 0x500113
[20 ns] Received instruction: 0xa00193
The 20ns gap between the two transactions is exactly the wait(20, SC_NS) in the producer. The consumer fires immediately when the event arrives — there is no polling delay, no wasted cycles. This is the efficiency advantage of event-driven simulation over polling loops.
Note: This producer-consumer handshake is a preview of the instruction fetch mechanism in Post 10. The data_ready event is structurally identical to the handshake signal between the fetch unit and the decode stage in our RISC-V CPU. When the fetch unit retrieves an instruction from memory, it will fire an event to wake the decode stage — exactly the pattern shown here.
Build & Run
Both examples use the same CMake structure established in Posts 1 and 2. Create a CMakeLists.txt in section1/post03/:
cmake_minimum_required(VERSION 3.16)
project(post03_delta_cycles)
set(SYSTEMC_HOME $ENV{SYSTEMC_HOME})
include_directories(${SYSTEMC_HOME}/include)
link_directories(${SYSTEMC_HOME}/lib-linux64)
add_executable(two_stage_chain two_stage_chain.cpp)
target_link_libraries(two_stage_chain systemc)
add_executable(producer_consumer producer_consumer.cpp)
target_link_libraries(producer_consumer systemc)
Then build and run:

mkdir build && cd build
cmake .. && make
./two_stage_chain
./producer_consumer
If stage 2 does not fire in your two_stage_chain output, check that internal is declared as sc_signal<bool>, not a plain bool. A plain bool has no event mechanism — stage 2's sensitivity list will never trigger.
Verification
The producer-consumer example is its own timing verification. The 20ns gap between the two output lines confirms that:
- The consumer blocked correctly on wait(data_ready) and did not spin
- The producer's wait(20, SC_NS) advanced simulation time exactly as expected
- The event fired at the correct time (immediately after data.write() + notify())
For a more rigorous checker, you can attach a monitor SC_THREAD that also waits on data_ready and asserts that the received instruction matches an expected sequence:
SC_MODULE(checker) {
  sc_in<sc_uint<32>> data;
  sc_event& data_ready;
  sc_uint<32> expected[2] = {0x00500113, 0x00A00193};
  int count = 0;

  void check() {
    while (count < 2) {
      wait(data_ready);
      sc_assert(data.read() == expected[count]);
      std::cout << "[PASS] Transaction " << count
                << ": 0x" << std::hex << data.read() << std::dec << std::endl;
      count++;
    }
  }

  SC_HAS_PROCESS(checker);
  checker(sc_module_name name, sc_event& ev)
    : sc_module(name), data_ready(ev) {
    SC_THREAD(check);
  }
};
sc_assert terminates the simulation with an error message if the condition is false. This is the lightweight equivalent of a UVM uvm_fatal — no framework overhead, just a condition check that halts on failure.
sc_event is the foundation of transaction-level modeling (TLM). When we reach Posts 14–17, TLM ports and sockets are built on this same event mechanism, extended with standardized payload types and timing protocols. Understanding sc_event now makes TLM-2.0 straightforward later.
DV Insight
Signal vs. Event — Choosing the Right Tool
sc_signal and sc_event are both synchronization mechanisms, but they serve different purposes:
Use sc_signal<T> when:
- You are modeling a physical wire or bus that carries a value
- Multiple modules need to read the current state at any time
- The value persists between events (it holds its last-written value)
- Example: data buses, address lines, control flags
Use sc_event when:
- You are signaling that something happened, not communicating a value
- The event is transient — "pulse" semantics, not "level" semantics
- You need a handshake or acknowledgment between two specific modules
- Example: "transaction complete", "pipeline flush requested", "interrupt asserted"
In our RISC-V CPU, most inter-stage communication uses sc_signal for data (the instruction word, the PC value, the ALU result) and sc_event for control (pipeline stall requests, flush signals, instruction-ready handshakes). This mirrors how real CPU designs separate data paths from control paths — a clean architectural boundary that makes verification much easier.
The Three Forms of notify() — Getting It Right
The timing of notify() matters:
event.notify();             // Immediate: fires in the current evaluate phase
event.notify(SC_ZERO_TIME); // Delta: fires in the evaluate phase of the next delta cycle
event.notify(10, SC_NS);    // Timed: fires 10 ns from now
For the producer-consumer pattern shown in Example 2, delta notification with notify(SC_ZERO_TIME) after the data write is the safe choice. The sequence is:
1. data.write(0x00500113) — buffered write; the signal still holds its old value
2. data_ready.notify(SC_ZERO_TIME) — event scheduled for the next delta cycle
3. Update phase runs — the data signal is committed
4. Next delta's evaluate phase — the consumer's wait(data_ready) wakes up and reads the committed value

Where it becomes dangerous: if you call immediate notify() instead, the consumer is made runnable in the current evaluate phase, before the update phase has committed the write, so data.read() returns the stale value. Immediate notification can also be missed entirely by a process that has not yet reached its wait() call. This is a classic race condition in SystemC. When in doubt: write the data, notify with SC_ZERO_TIME, and let the evaluate-update cycle handle the ordering.
Common Pitfalls for SV Engineers
Delta cycles and sc_event are the #1 source of hard-to-find bugs for engineers coming from SystemVerilog. These five pitfalls cover the most common failure modes.
Pitfall 1: Immediate notify() inside a process triggered by that same event — infinite delta loop.
void process_a() {
  // ...
  my_event.notify(); // immediate: this process becomes runnable again in the
                     // SAME evaluate phase → re-runs → notifies again → ...
                     // The evaluate phase never empties: the simulation hangs
                     // at the current time, or aborts in kernels that enforce
                     // an iteration/delta limit (e.g. "delta cycle limit of
                     // 10000 exceeded").
}

SC_CTOR(bad) {
  SC_METHOD(process_a);
  sensitive << my_event; // BUG: same event triggers and is notified here
}
Fix: use notify(SC_ZERO_TIME) to fire the event in the next delta, and guard the notification with a condition that eventually becomes false. The process completes, the update phase runs, then the event fires in the next evaluate phase — giving the kernel a clean point between deltas. Note that unconditional delta notification still re-triggers the process every delta; it is the guard condition that lets the loop settle.
Pitfall 2: sc_signal notifies only when the value actually changes.
sc_signal<bool> flag;
flag.write(true); // fires value_changed_event (false → true)
flag.write(true); // DOES NOT fire value_changed_event (value unchanged)
flag.write(false); // fires value_changed_event (true → false)
This surprises engineers used to SV's always @(posedge clk), where every posedge fires regardless of data values. With sc_signal<bool>, if you write true to an already-true signal, no event fires and no sensitive process re-runs. If you need unconditional notification, use sc_event::notify() instead — an event carries no value to compare, so every notification fires.
This distinction matters for reset signals that stay asserted: if reset is true and you write true again (e.g., during testbench initialization), processes sensitive to the reset signal will not re-run.
Pitfall 3: Delta cycle depth with deeply cascaded combinational logic.
A chain of N SC_METHOD processes connected by N-1 intermediate sc_signal objects takes N delta cycles to propagate a change from input to output. For a 10-stage pipeline hazard detection unit with 10 levels of combinational checks, this is 10 delta cycles — all happening inside one sc_start(SC_ZERO_TIME) call, and all correct.
The danger: circular combinational dependencies. If stage A writes to sig_b and stage B writes to sig_a, and both are sensitive to each other's outputs:
A → writes sig_b → B wakes up → writes sig_a → A wakes up → writes sig_b → ...
The kernel will run delta cycles until it hits its delta cycle limit (commonly on the order of 10,000, depending on the simulator) and aborts. In hardware, circular combinational logic is a design error (an inferred latch or an oscillation). In SystemC, the kernel catches it and reports the delta limit exceeded. This is a design bug, not a SystemC bug.
Pitfall 4: Don't confuse posedge_event() with value_changed_event() for boolean signals.
sc_signal<bool> sig;
// value_changed_event fires on ANY transition (0→1 OR 1→0):
wait(sig.value_changed_event()); // wakes on both rising and falling
// posedge_event fires ONLY on 0→1:
wait(sig.posedge_event()); // wakes only on rising edge
// negedge_event fires ONLY on 1→0:
wait(sig.negedge_event()); // wakes only on falling edge
SV engineers translating @(posedge rst_n) (wait for reset deassert) must use wait(rst_n.posedge_event()), not wait(rst_n.value_changed_event()) — the latter would wake on both assertion and deassertion.
Pitfall 5: Reading a signal inside the update phase — it doesn't happen and you can't do it.
User code (process bodies) always runs in the evaluate phase. The update phase is kernel-internal — you never write code that executes during the update phase. The kernel calls your process functions during evaluate, and during evaluate, signal.read() always returns the current committed value (m_cur_val).
The implication: if two processes A and B both run in the same evaluate phase (Δ0), and A calls sig.write(v), B's call to sig.read() returns the value BEFORE A's write, even if A ran first. There is no process ordering within an evaluate phase that makes A's write visible to B in the same delta. B will only see A's write after the update phase — in delta Δ1, when B re-runs due to the value_changed_event.
This is correct and intentional. It prevents process-ordering races: all processes see the same consistent snapshot of signal values within an evaluate phase, regardless of the order they execute.
Integration
Delta cycles are what make the RISC-V forwarding unit in Post 19 correct. The EX stage computes an ALU result at simulation time T (within one clock cycle). Before the clock edge that advances to time T + 10ns, that result must propagate through the forwarding multiplexers to the MEM stage's operand inputs. In silicon, this happens in 200–400ps of combinational propagation delay. In SystemC, it happens in delta cycles — each forwarding path level that re-evaluates is one delta, all within the same sc_time timestamp.
When you model the forwarding unit as a set of SC_METHOD processes sensitive to the EX stage's output signals, the SystemC kernel automatically runs the evaluate-update loop until the forwarded values settle at the MEM stage inputs — before the clock edge SC_METHODs fire. You do not need to manually sequence these. The evaluate-update model handles it, just as propagation delay handles it in silicon.
The same applies to pipeline flush signals. When a branch misprediction is detected in the MEM stage (Post 21), a flush event fires. The fetch and decode stage SC_METHODs that are sensitive to the flush signal re-evaluate within the same clock cycle's delta iterations, clearing the pipeline registers before the next clock edge latches the new state. Delta cycles make this zero-time correction possible.
Series progress:
- Post 1 — Modules, Ports & Signals ✓
- Post 2 — Simulation Time & Clocks ✓
- Post 3 — Delta Cycles & Event-Driven Semantics ✓
- Post 4 — SC_METHOD vs SC_THREAD (next)
Section 1 foundation is taking shape. The two_stage_chain module will be reused directly in Post 7 as the basis for the combinational ALU interconnect, and the producer-consumer event pattern reappears in Post 10 as the fetch-unit handshake. The concepts in this post are not introductory scaffolding — they are the mechanisms we will rely on through the entire CPU build.
What's Next
Post 4: SC_METHOD vs SC_THREAD
Now that we understand when processes fire (sensitivity lists, delta cycles, event notifications), Post 4 covers which process type to use and why getting this choice wrong causes either deadlocks or missed events.
The rules are not arbitrary. SC_METHOD processes must run to completion without blocking — they model combinational logic and clocked registers. SC_THREAD processes can call wait() and suspend — they model sequential behavior, testbench drivers, and monitors. Using SC_METHOD where you need blocking behavior causes a runtime error. Using SC_THREAD where you need strict sensitivity-list semantics causes subtle missed-event bugs that are very hard to find.
Post 4 builds a concrete example of both mistakes and shows the correct version of each, with the rules stated explicitly enough to apply to any new module you write.