3. SystemC Tutorial - Delta Cycles & Event-Driven Semantics

Introduction

"Why doesn't my output update immediately after I write a signal?"

This is the single most common question from engineers who are new to SystemC. You call sig_in.write(true), you immediately read sig_out.read(), and you get the old value back. So you start sprinkling sc_start(SC_ZERO_TIME) calls between writes and reads until the output looks correct, but you're not sure why it works, and you're worried you're masking a bug rather than understanding the model.

The answer is delta cycles — and understanding them is not optional. It is the foundation of every multi-process simulation you will ever write in SystemC.

Delta cycles exist because SystemC uses the same evaluate-update model as real hardware. In silicon, a signal driven by one flip-flop does not arrive at the next flip-flop instantaneously — it propagates through wires and combinational logic over a finite (if tiny) time. In a simulator, zero-time propagation through multiple stages still requires an ordering mechanism. If stage 1 writes a signal and stage 2 reads that signal in the same instant, which process runs first? The evaluate-update model answers this by separating the act of computing a new value (evaluate phase) from the act of making that value visible to other processes (update phase). A delta cycle is one pass through that pair of phases.

This is not a SystemC quirk. Every hardware simulator you have used — ModelSim, VCS, Xcelium, Questa, Verilator — uses exactly this model. The IEEE 1364 Verilog standard defines it. The IEEE 1800 SystemVerilog standard defines it. IEEE 1666 (SystemC) defines it. They all converge on the same evaluate-update loop because they are all modeling the same physical reality.

Consider the forwarding logic in a real RISC-V pipeline. When the EX stage computes a result and the MEM stage needs that value in the same clock cycle, the forwarding path must propagate through several levels of combinational logic — a multiplexer, a comparator, routing wires — before settling at the MEM stage input. In silicon this takes maybe 200–400ps. In a simulator it takes delta cycles: each level of logic that re-evaluates is one delta. The simulator produces the correct stable value before it advances the clock, just as silicon produces the correct stable value before the next clock edge samples it.

Post 3 builds two concrete models. First, a two-stage combinational chain that makes delta cycle propagation visible in your terminal output — you will see stage 1 and stage 2 both fire within a single sc_start() call. Second, a producer-consumer handshake using sc_event, which is the explicit event mechanism that underlies transaction-level communication. The producer-consumer pattern we build here is structurally identical to the fetch→decode handshake in our RISC-V CPU at Post 10.


Prerequisites


Translation Table

Before the concept explanations, here is the mapping from what you already know:

| Concept | C++ Engineer | SystemVerilog Engineer |
|---|---|---|
| Delta cycle | Sub-timestep evaluation round | Delta delay (same concept, same name) |
| sc_event | std::condition_variable | event keyword |
| event.notify() | cv.notify_one() | -> event_name |
| wait(event) | cv.wait() (blocks the simulated process, not an OS thread) | @(event_name) |
| Evaluate phase | "run all ready functions" | always block execution |
| Update phase | "apply buffered writes" | Non-blocking assignment update (NBA region) |
| sc_signal.write() | Buffered write, not visible until the update phase | Non-blocking assignment <= |

The key insight for C++ engineers: sc_signal.write() is not like a normal variable assignment. It does not change the visible value immediately. It enqueues the new value for the update phase. If you want the value you just wrote to be visible to another process, a delta cycle must complete first.
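This buffered-write behavior can be sketched in about twenty lines of plain C++. The following is an illustration of the semantics only, not SystemC code; the class name BufferedSignal and its update() method are invented for this sketch (SystemC's kernel performs the equivalent commit internally during the update phase).

```cpp
#include <cassert>

// Plain-C++ sketch of sc_signal's buffered-write semantics (NOT SystemC).
// write() stages a new value; read() keeps returning the committed value
// until update() -- the analogue of the kernel's update phase -- runs.
template <typename T>
class BufferedSignal {
public:
    explicit BufferedSignal(T init = T{}) : current_(init), next_(init) {}

    void write(const T& v) { next_ = v; }        // buffered, not yet visible
    const T& read() const  { return current_; }  // always the committed value

    // Commits the staged value. Returns true if the visible value actually
    // changed, i.e. whether the kernel would schedule another delta cycle.
    bool update() {
        bool changed = !(current_ == next_);
        current_ = next_;
        return changed;
    }

private:
    T current_; // value visible to readers (committed)
    T next_;    // value staged by the most recent write()
};
```

Reading the signal immediately after write() still yields the old value; only after update() does the new value become visible, which is exactly the behavior described above.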

The key insight for SystemVerilog engineers: delta cycles in SystemC work exactly as they do in SystemVerilog. If you have used non-blocking assignments and wondered why a chain of always blocks triggered by the same event all see the old value until the NBA update region, this is the same mechanism. SystemC makes it more explicit — you can see it in the sc_start() call boundaries — but the semantics are identical.


Concept Explanation

1. The Evaluate-Update Cycle

The SystemC simulation kernel operates as a loop. At each simulation time step, it does not simply "run all processes." It runs two distinct phases, potentially many times, before advancing the clock:

Evaluate phase: All processes (SC_METHOD or SC_THREAD) whose sensitivity lists have triggered are run — methods run to completion, threads run until their next wait(). During this phase, a process may call signal.write(new_value). That write is buffered — the signal's visible value does not change yet. Other processes reading the signal in the same evaluate phase see the old value.

Update phase: All buffered writes are committed atomically. Signals now reflect their new values. The kernel checks: did any signal change? If yes, some process may be sensitive to that change and needs to run — so the kernel triggers another evaluate phase. This is one delta cycle.

Stability: When an evaluate-update pair completes with no new signal changes, the system has reached a stable state at this time step. The kernel advances to the next scheduled time event.

flowchart TD
    A[Simulation time T] --> B[Evaluate Phase\nRun all ready processes]
    B --> C{Any signals\nchanged?}
    C -->|Yes — delta cycle| B
    C -->|No — stable| D[Advance to next\ntime event]
    D --> A
    style B fill:#06b6d4,color:#fff
    style C fill:#f59e0b,color:#fff
    style D fill:#10b981,color:#fff

Notice that the evaluate phase loops back to itself, not forward to a new time step. That loop is the delta cycle. Each iteration is "delta N" at the same simulation timestamp T. From the perspective of sc_time_stamp(), the time does not change during delta cycles — they all occur at time T. The delta counter is an internal kernel bookkeeping number that is not directly exposed to user code in standard SystemC (though some implementations provide sc_delta_count() as a debugging aid).

2. Two-Stage Chain: Where Delta Cycles Become Visible

Consider a combinational chain: A → stage1 → B → stage2 → C.

Stage 1 is an SC_METHOD sensitive to signal A. Stage 2 is an SC_METHOD sensitive to signal B.

When A changes at time T:

| Delta | What happens |
|---|---|
| Evaluate Δ0 | Stage 1 runs. It reads A (new value) and calls B.write(new_value). The write is buffered. Stage 2 does NOT run yet: B has not officially changed. |
| Update Δ0 | B's new value is committed. The kernel sees that B changed and that stage 2 is sensitive to B. |
| Evaluate Δ1 | Stage 2 runs. It reads B (new value) and calls C.write(new_value). The write is buffered. |
| Update Δ1 | C's new value is committed. No more sensitive processes. |
| Stable | The system is stable at time T. Both B and C hold the correct propagated value. |

From your perspective as a user calling sc_start(SC_ZERO_TIME), both delta cycles happen inside that one call. When sc_start returns, the system is stable — C already has the correct value. You do not need to call sc_start once per delta cycle. The kernel handles all internal delta iterations automatically.

This is the most important thing to understand before reading Example 1 below.

flowchart LR
    A([A\nchanges at T]) --> |Δ0 evaluate| S1[Stage 1\nreads A\nwrites B ← buffered]
    S1 --> |Δ0 update| B([B\ncommitted])
    B --> |Δ1 evaluate| S2[Stage 2\nreads B\nwrites C ← buffered]
    S2 --> |Δ1 update| C([C\ncommitted\nstable])
    style A fill:#f59e0b,color:#fff
    style B fill:#06b6d4,color:#fff
    style C fill:#10b981,color:#fff
    style S1 fill:#334155,color:#fff
    style S2 fill:#334155,color:#fff
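The delta table above can be reproduced with a small plain-C++ mock of the evaluate-update loop. This is an illustration, not SystemC: the names ToySignal and settle are invented here, and for simplicity the sketch re-runs every process each iteration instead of tracking sensitivity lists (for pure combinational functions this converges to the same stable state, with the same delta count).

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy evaluate-update kernel (NOT SystemC) for the A -> stage1 -> B ->
// stage2 -> C chain described above. Writes are staged in `next` and
// committed atomically in update(), just like sc_signal.
struct ToySignal {
    bool current = false, next = false;
    void write(bool v) { next = v; }
    bool read() const  { return current; }
    bool update() { bool ch = (current != next); current = next; return ch; }
};

// Runs evaluate-update pairs until no signal changes.
// Returns the number of delta cycles that produced a change.
int settle(std::vector<ToySignal*> signals,
           std::vector<std::function<void()>> processes) {
    int deltas = 0;
    bool changed = true;
    while (changed) {
        for (auto& p : processes) p();   // evaluate: all writes are buffered
        changed = false;
        for (auto* s : signals)          // update: commit atomically
            changed |= s->update();
        if (changed) ++deltas;           // each change is one more delta
    }
    return deltas;
}
```

Driving A high and settling the two-stage chain takes exactly two deltas, matching the Δ0/Δ1 rows of the table: one for B to commit, one for C.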

3. sc_event — Explicit Event Signaling

sc_signal<T> carries an implicit event: the signal fires its event automatically whenever its value changes. This is what makes sensitive << my_signal work.

sc_event is a separate, explicit event object with no associated value. You fire it manually with notify(), and you wait for it with wait(event). Use sc_event when:

  • You need a handshake or acknowledgment that is not a data value (e.g., "transaction complete", "pipeline flush requested")
  • Two modules need to synchronize without sharing a signal
  • You want to signal something that happens, not a value that changes

The API is minimal:

sc_event done_ev;          // declare an event

done_ev.notify();          // immediate — waiters wake in the current evaluate phase
done_ev.notify(SC_ZERO_TIME);  // delta-delayed — fires in the next delta cycle
done_ev.notify(10, SC_NS); // fire 10ns from now

wait(done_ev);             // suspend until done_ev fires (SC_THREAD only)

The three forms of notify() have meaningfully different semantics. Immediate notify() triggers any process already waiting on the event in the current evaluate phase, before the pending signal writes commit; it is also not persistent, so a notification fired while no process is waiting is simply lost. notify(SC_ZERO_TIME) schedules the event for the next delta cycle, after the current update phase has committed all buffered writes. notify(10, SC_NS) fires at a future time step. For the producer-consumer handshakes we build in this series, notify(SC_ZERO_TIME) after a data write is the safe choice: the data commits at the end of the current delta, the event fires in the next delta, and the consumer wakes up and reads the stable value.

Choosing the wrong form causes races that are very difficult to debug; with immediate notify() after a write, a woken consumer can read the signal before the write commits and see stale data. This is covered in detail in Post 20 (Synchronization and Races).
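One pitfall worth internalizing now: an immediate notification triggers only processes that are already waiting on the event, so a notify() fired before the consumer reaches its wait() is lost entirely. The sketch below models that one-shot, non-persistent behavior in plain C++; ToyEvent is an invented name for illustration, not part of the SystemC API.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Plain-C++ sketch (NOT SystemC) of why immediate notification is not
// persistent: notify() wakes only the callbacks registered at that moment.
// A notify() with no waiter is simply lost.
class ToyEvent {
public:
    void wait(std::function<void()> on_fire) {    // register a waiter
        waiters_.push_back(std::move(on_fire));
    }
    int notify() {                                // fire: wake current waiters
        auto woken = waiters_;                    // snapshot, then clear --
        waiters_.clear();                         // later waiters need a new fire
        for (auto& w : woken) w();
        return static_cast<int>(woken.size());    // how many waiters woke
    }
private:
    std::vector<std::function<void()>> waiters_;
};
```

A notification fired before anyone waits wakes nobody, and a waiter that has been woken once must re-register before the next fire — the same reason SC_THREAD consumers loop back to wait(event).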


Implementation

Example 1: Two-Stage Combinational Chain

This example makes the delta cycle propagation visible in the terminal. Read the output carefully — both stages fire within the single sc_start(SC_ZERO_TIME) call.

// File: two_stage_chain.cpp
#include <systemc.h>

SC_MODULE(two_stage_chain) {
  sc_in<bool>  in_sig;
  sc_out<bool> mid_sig;
  sc_out<bool> out_sig;

  sc_signal<bool> internal; // intermediate signal between the two stages

  // Stage 1: combinational — propagates in_sig → internal
  void stage1() {
    internal.write(in_sig.read());
    std::cout << "  stage1 fired: internal will be " << in_sig.read()
              << " (write buffered, not yet visible)" << std::endl;
  }

  // Stage 2: combinational — propagates internal → out_sig
  void stage2() {
    out_sig.write(internal.read());
    mid_sig.write(internal.read());
    std::cout << "  stage2 fired: out_sig will be " << internal.read()
              << " (write buffered, not yet visible)" << std::endl;
  }

  SC_CTOR(two_stage_chain) {
    SC_METHOD(stage1); sensitive << in_sig;
    SC_METHOD(stage2); sensitive << internal;
  }
};

int sc_main(int argc, char* argv[]) {
  sc_signal<bool> sig_in, sig_mid, sig_out;

  two_stage_chain dut("dut");
  dut.in_sig(sig_in);
  dut.mid_sig(sig_mid);
  dut.out_sig(sig_out);

  // Initialize to known state
  sig_in.write(false);
  sc_start(SC_ZERO_TIME); // settle — both stages fire once during this call
  std::cout << "[t=0 settled] in=" << sig_in.read()
            << " mid=" << sig_mid.read()
            << " out=" << sig_out.read() << std::endl;

  // Drive input high — BOTH stages will fire inside this single sc_start call
  std::cout << "\n--- Driving in=true ---\n";
  sig_in.write(true);
  sc_start(SC_ZERO_TIME); // stage1 fires in delta 1, stage2 fires in delta 2 — all inside here
  std::cout << "[after sc_start] in=" << sig_in.read()
            << " mid=" << sig_mid.read()
            << " out=" << sig_out.read() << std::endl;

  return 0;
}

Expected output:

  stage1 fired: internal will be 0 (write buffered, not yet visible)
  stage2 fired: out_sig will be 0 (write buffered, not yet visible)
[t=0 settled] in=0 mid=0 out=0

--- Driving in=true ---
  stage1 fired: internal will be 1 (write buffered, not yet visible)
  stage2 fired: out_sig will be 1 (write buffered, not yet visible)
[after sc_start] in=1 mid=1 out=1

PEDAGOGICAL NOTE — this is the most important thing to take away from this example:

Look at the output for the in=true drive. Both stage1 fired and stage2 fired print before the final state line. This means both stages ran within the single sc_start(SC_ZERO_TIME) call. Stage 1 fired in one internal delta, stage 2 fired in the next internal delta, and by the time sc_start returned, the system had settled through both deltas automatically.

You might expect that you need to call sc_start(SC_ZERO_TIME) once to propagate through stage 1, and again to propagate through stage 2. This is incorrect. One sc_start(SC_ZERO_TIME) call drives the SystemC kernel until stability — which means it runs as many internal evaluate-update cycles (delta cycles) as necessary. For a two-stage chain, it runs two internal deltas. For a ten-stage chain, it would run ten. You never need to call sc_start once per delta cycle.

The labels "delta 1" and "delta 2" that you might write in comments or diagrams are pedagogical labels — conceptual markers that help you reason about propagation order. They are not values you can read from a counter in standard user code. The SystemC kernel tracks them internally. What you observe as a user is: one sc_start call → system converges → correct values everywhere.


Example 2: sc_event Producer-Consumer Handshake

This example shows sc_event as an explicit synchronization mechanism between two modules. The producer writes data to a signal and fires an event. The consumer blocks on that event and reads the data when it fires. No polling. No shared state beyond the event object and the signal.

These RISC-V instruction encodings are real — 0x00500113 is ADDI x2, x0, 5 and 0x00A00193 is ADDI x3, x0, 10. We will parse these exact encodings in our decode stage at Post 11.

// File: producer_consumer.cpp
#include <systemc.h>

SC_MODULE(producer) {
  sc_out<sc_uint<32>> data;
  sc_event            data_ready; // explicit event — fires when data is valid

  void produce() {
    // Transaction 1: ADDI x2, x0, 5  (RV32I encoding)
    data.write(0x00500113);
    data_ready.notify(SC_ZERO_TIME); // fire next delta — after the write commits
    wait(20, SC_NS);

    // Transaction 2: ADDI x3, x0, 10  (RV32I encoding)
    data.write(0x00A00193);
    data_ready.notify(SC_ZERO_TIME);
    wait(20, SC_NS);

    sc_stop();
  }

  SC_CTOR(producer) {
    SC_THREAD(produce);
  }
};

SC_MODULE(consumer) {
  sc_in<sc_uint<32>> data;
  sc_event&          data_ready; // reference to producer's event — no copy

  void consume() {
    while (true) {
      wait(data_ready); // suspend here until producer calls data_ready.notify()
      std::cout << "[" << sc_time_stamp() << "] Received instruction: 0x"
                << std::hex << data.read() << std::dec << std::endl;
    }
  }

  SC_HAS_PROCESS(consumer);
  consumer(sc_module_name name, sc_event& ev)
    : sc_module(name), data_ready(ev) {
    SC_THREAD(consume);
  }
};

int sc_main(int argc, char* argv[]) {
  sc_signal<sc_uint<32>> instr_bus;

  producer prod("prod");
  prod.data(instr_bus);

  // Pass producer's event to consumer at construction time
  consumer cons("cons", prod.data_ready);
  cons.data(instr_bus);

  sc_start(); // run until sc_stop() is called
  return 0;
}

Expected output:

[0 ns] Received instruction: 0x500113
[20 ns] Received instruction: 0xa00193

The 20ns gap between the two transactions is exactly the wait(20, SC_NS) in the producer. The consumer fires immediately when the event arrives — there is no polling delay, no wasted cycles. This is the efficiency advantage of event-driven simulation over polling loops.

Note: This producer-consumer handshake is a preview of the instruction fetch mechanism in Post 10. The data_ready event is structurally identical to the handshake signal between the fetch unit and the decode stage in our RISC-V CPU. When the fetch unit retrieves an instruction from memory, it will fire an event to wake the decode stage — exactly the pattern shown here.


Build & Run

Both examples use the same CMake structure established in Posts 1 and 2. Create a CMakeLists.txt in section1/post03/:

cmake_minimum_required(VERSION 3.16)
project(post03_delta_cycles)

set(SYSTEMC_HOME $ENV{SYSTEMC_HOME})
include_directories(${SYSTEMC_HOME}/include)
link_directories(${SYSTEMC_HOME}/lib-linux64)

add_executable(two_stage_chain two_stage_chain.cpp)
target_link_libraries(two_stage_chain systemc)

add_executable(producer_consumer producer_consumer.cpp)
target_link_libraries(producer_consumer systemc)

Then build and run:

mkdir build && cd build
cmake .. && make

./two_stage_chain
./producer_consumer

If stage 2 does not fire in your two_stage_chain output, check that internal is declared as sc_signal<bool>, not a plain bool. A plain bool has no event mechanism — stage 2's sensitivity list will never trigger.


Verification

The producer-consumer example is its own timing verification. The 20ns gap between the two output lines confirms that:

  1. The consumer blocked correctly on wait(data_ready) and did not spin
  2. The producer's wait(20, SC_NS) advanced simulation time exactly as expected
  3. The event fired at the correct simulation time, once the data write had committed

For a more rigorous checker, you can attach a monitor SC_THREAD that also waits on data_ready and asserts that the received instruction matches an expected sequence:

SC_MODULE(checker) {
  sc_in<sc_uint<32>> data;
  sc_event& data_ready;

  sc_uint<32> expected[2] = {0x00500113, 0x00A00193};
  int count = 0;

  void check() {
    while (count < 2) {
      wait(data_ready);
      sc_assert(data.read() == expected[count]);
      std::cout << "[PASS] Transaction " << count
                << ": 0x" << std::hex << data.read() << std::dec << std::endl;
      count++;
    }
  }

  SC_HAS_PROCESS(checker);
  checker(sc_module_name name, sc_event& ev)
    : sc_module(name), data_ready(ev) {
    SC_THREAD(check);
  }
};

sc_assert terminates the simulation with an error message if the condition is false. This is the lightweight equivalent of a UVM uvm_error — no framework overhead, just a condition check that halts on failure.

sc_event is the foundation of transaction-level modeling (TLM). When we reach Posts 14–17, TLM ports and sockets are built on this same event mechanism, extended with standardized payload types and timing protocols. Understanding sc_event now makes TLM-2.0 straightforward later.


DV Insight

Signal vs. Event — Choosing the Right Tool

sc_signal and sc_event are both synchronization mechanisms, but they serve different purposes:

Use sc_signal<T> when:
- You are modeling a physical wire or bus that carries a value
- Multiple modules need to read the current state at any time
- The value persists between events (it holds its last-written value)
- Example: data buses, address lines, control flags

Use sc_event when:
- You are signaling that something happened, not communicating a value
- The event is transient — "pulse" semantics, not "level" semantics
- You need a handshake or acknowledgment between two specific modules
- Example: "transaction complete", "pipeline flush requested", "interrupt asserted"

In our RISC-V CPU, most inter-stage communication uses sc_signal for data (the instruction word, the PC value, the ALU result) and sc_event for control (pipeline stall requests, flush signals, instruction-ready handshakes). This mirrors how real CPU designs separate data paths from control paths — a clean architectural boundary that makes verification much easier.

The Three Forms of notify() — Getting It Right

The timing of notify() matters:

event.notify();              // Immediate: wakes current waiters in this evaluate phase
event.notify(SC_ZERO_TIME);  // Next delta: fires after the current update phase
event.notify(10, SC_NS);     // Timed: fires 10ns from now

For the producer-consumer pattern shown in Example 2, notify(SC_ZERO_TIME) after a data write is the safe choice. The sequence is:

  1. data.write(0x00500113): buffered write to the signal
  2. data_ready.notify(SC_ZERO_TIME): event scheduled for the next delta
  3. Update phase runs: the data signal is committed
  4. The consumer's wait(data_ready) wakes in the next delta and reads the committed data value

Immediate notify() is the dangerous form here, for two reasons. First, it wakes the consumer in the current evaluate phase, before the update phase commits the write, so the consumer can read stale data. Second, immediate notifications are not persistent: if the consumer has not yet reached its wait(data_ready) call when the producer fires, the notification is lost entirely. This is a classic race condition in SystemC. When in doubt: write the data, notify with SC_ZERO_TIME, and let the evaluate-update cycle commit the value before the consumer wakes.
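The stale-read hazard can be demonstrated without SystemC. The sketch below (all names invented for illustration) models the only thing that matters: whether the consumer's read happens before or after the buffered write commits, which is exactly the difference between an immediate wake-up and a one-delta-delayed one.

```cpp
#include <cassert>

// Plain-C++ sketch (NOT SystemC) of the stale-read hazard.
// With an immediate wake-up the consumer reads in the SAME evaluate phase,
// before the buffered write commits; with a one-delta delay it reads after.
struct RaceSignal {
    unsigned current = 0, next = 0;
    void write(unsigned v) { next = v; }
    unsigned read() const  { return current; }
    void update()          { current = next; }   // the kernel's update phase
};

// Producer writes then notifies; the two functions differ only in WHEN the
// consumer's read happens relative to the commit.
unsigned consume_with_immediate_notify(RaceSignal& s) {
    s.write(0x00500113);       // buffered
    unsigned seen = s.read();  // consumer woken in the SAME evaluate phase
    s.update();                // commit happens only afterwards
    return seen;               // stale value
}

unsigned consume_with_delta_notify(RaceSignal& s) {
    s.write(0x00500113);       // buffered
    s.update();                // current delta ends: write commits
    return s.read();           // consumer woken in the NEXT delta
}
```

The immediate-notify path returns the signal's old value (zero here), while the delta-delayed path returns the freshly committed instruction word.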


Integration

Delta cycles are what make the RISC-V forwarding unit in Post 19 correct. The EX stage computes an ALU result at simulation time T (within one clock cycle). Before the clock edge that advances to time T + 10ns, that result must propagate through the forwarding multiplexers to the MEM stage's operand inputs. In silicon, this happens in 200–400ps of combinational propagation delay. In SystemC, it happens in delta cycles — each forwarding path level that re-evaluates is one delta, all within the same sc_time timestamp.

When you model the forwarding unit as a set of SC_METHOD processes sensitive to the EX stage's output signals, the SystemC kernel automatically runs the evaluate-update loop until the forwarded values settle at the MEM stage inputs — before the clock edge SC_METHODs fire. You do not need to manually sequence these. The evaluate-update model handles it, just as propagation delay handles it in silicon.

The same applies to pipeline flush signals. When a branch misprediction is detected in the MEM stage (Post 21), a flush event fires. The fetch and decode stage SC_METHODs that are sensitive to the flush signal re-evaluate within the same clock cycle's delta iterations, clearing the pipeline registers before the next clock edge latches the new state. Delta cycles make this zero-time correction possible.

Series progress:
- Post 1 — Modules, Ports & Signals ✓
- Post 2 — Simulation Time & Clocks ✓
- Post 3 — Delta Cycles & Event-Driven Semantics ✓
- Post 4 — SC_METHOD vs SC_THREAD (next)

Section 1 foundation is taking shape. The two_stage_chain module will be reused directly in Post 7 as the basis for the combinational ALU interconnect, and the producer-consumer event pattern reappears in Post 10 as the fetch-unit handshake. The concepts in this post are not introductory scaffolding — they are the mechanisms we will rely on through the entire CPU build.


What's Next

Post 4: SC_METHOD vs SC_THREAD

Now that we understand when processes fire (sensitivity lists, delta cycles, event notifications), Post 4 covers which process type to use and why getting this choice wrong causes either deadlocks or missed events.

The rules are not arbitrary. SC_METHOD processes must run to completion without blocking — they model combinational logic and clocked registers. SC_THREAD processes can call wait() and suspend — they model sequential behavior, testbench drivers, and monitors. Using SC_METHOD where you need blocking behavior causes a runtime error. Using SC_THREAD where you need strict sensitivity-list semantics causes subtle missed-event bugs that are very hard to find.

Post 4 builds a concrete example of both mistakes and shows the correct version of each, with the rules stated explicitly enough to apply to any new module you write.

Post 4 → SC_METHOD vs SC_THREAD

Author
Mayur Kubavat
VLSI Design and Verification Engineer sharing knowledge about SystemVerilog, UVM, and hardware verification methodologies.
