Architecture & Design Verification: 1. SystemC Tutorial

Introduction

Before the first gate switches on a new chip, engineers at ARM, Intel, and SiFive are already running software on it. Not on silicon — on a SystemC virtual platform. ARM built the virtual platforms for the Cortex-A series using SystemC, allowing firmware teams to boot Linux and validate drivers months before tape-out. Intel's pre-silicon validation flow for modern SoCs relies on Transaction-Level Modeling (TLM) in SystemC to catch architectural bugs that RTL simulation would find only much later — and at far greater cost. Western Digital's SweRV RISC-V core and SiFive's FU540 were both validated using SystemC models before a single gate was committed to silicon.

SystemC is how the industry thinks in hardware, but works in software.

This series will build a complete RV32I RISC-V CPU from scratch, one component at a time across 34 posts. By the end you will have a working processor model, a UVM-SystemC verification environment, and the skills to read and write the kind of SystemC you will encounter in a real pre-silicon validation team. This is not a toy project — RISC-V is running in billions of devices right now: T-HEAD's XuanTie C906 cores power Android tablets and set-top boxes, RISC-V MCUs from GigaDevice and WCH are displacing ARM Cortex-M0 in embedded products, and major hyperscalers are evaluating RISC-V for custom silicon.

This first post builds the skeleton. Every module in our CPU — the ALU, the Register File, the Instruction Decoder, the Load/Store Unit — will be structured exactly like the pass_through module we write here. Get this pattern locked in and the rest of the series flows naturally.

Prerequisites

Before diving in, you will need:

- SystemC 2.3.x installed and configured:
  - P1: Installing SystemC 2.3 on Windows
  - P2: Configuring SystemC 2.3 Libraries
- C++17 compiler: GCC 7+ or Clang 5+. SystemC 2.3.3 itself builds with any standard from C++98 to C++17; this series uses C++17, and the library must be built with the same C++ standard as your code.
- CMake 3.16+ for the build system used throughout this series
- Code for this post: GitHub, section1/post01

If you can run g++ --version and cmake --version without errors, and SYSTEMC_HOME points to your SystemC install, you are ready to go.

Concept Explanation

SystemC Language Reference

Every construct used in this post at a glance. Bookmark this table — it is the Rosetta Stone between what you know and what you are learning:

Construct	Syntax	SV Equivalent	Key Difference
`SC_MODULE`	`SC_MODULE(name) { ... };`	`module name(...);`	Expands to a C++ `struct` inheriting `sc_module`; hierarchy is a runtime C++ object tree, not a static elaboration
`sc_in<T>`	`sc_in<sc_uint<8>> in_data;`	`input logic [7:0] in_data`	A C++ object read through `.read()` (an input port has no `.write()`); type-checked at compile time
`sc_out<T>`	`sc_out<sc_uint<8>> out_data;`	`output logic [7:0] out_data`	Same object model; calling `.write()` queues the value — not immediately visible
`sc_inout<T>`	`sc_inout<sc_uint<8>> bidir;`	`inout logic [7:0] bidir`	Bidirectional; rarely used in RTL-level models
`sc_signal<T>`	`sc_signal<sc_uint<8>> sig;`	`logic [7:0] sig` (as wire)	Has TWO internal values: current (what readers see) and next (what `.write()` stores). Updates happen in the kernel's update phase, not immediately
`SC_CTOR`	`SC_CTOR(name) { ... }`	N/A — implicit initialization	Constructor macro; process registration happens here before `sc_start()`
`SC_METHOD`	`SC_METHOD(func_name);`	`always @(sensitivity_list)`	No `wait()` allowed; must run to completion; re-triggered by sensitivity list events
`sensitive <<`	`sensitive << port_name;`	`always @(port_name)`	Port/signal change event subscription; without this, the method fires only once at init
`sc_start()`	`sc_start(10, SC_NS);`	`#10;` in initial block	Runs the event-driven scheduler for the given duration; phases: elaboration → init → simulation
Port binding	`dut.port(signal);`	`dut(.port(signal))`	Runtime binding via `operator()`; unbound ports cause fatal error at `sc_start()`, not at compile time
`sc_uint<N>`	`sc_uint<8> val;`	`logic [7:0] val`	Fixed-width unsigned integer (N up to 64); arithmetic, bit-select, part-select all defined; converts implicitly to a 64-bit integer, so intermediate results are wider than N bits until stored back

The Translation Table

If you are coming from C++ or SystemVerilog, you already know most of this — the concepts just have different names:

Concept	If you know C++	If you know SystemVerilog
`SC_MODULE`	class with simulation awareness	`module`
`sc_in<T>`	read-only member for external data	`input` port
`sc_out<T>`	output member for external data, written with `.write()`	`output` port
`sc_inout<T>`	bidirectional member	`inout` port
`sc_signal<T>`	double-buffered shared variable	`wire` / `logic`
`SC_CTOR`	constructor with simulator registration	`initial`/`always` setup block
`sc_start()`	"run the simulation engine"	start of simulation time
`sensitive << port`	"re-run this method when port changes"	`always @(port)`

The big mental shift for C++ engineers: you are not writing a program that executes top-to-bottom. You are describing concurrent hardware processes that respond to signal changes. The simulator controls execution order, not you.

The big mental shift for SystemVerilog engineers: ports are typed C++ objects with member functions (read(), write()), not bare wires. The type parameter <T> can be bool, int, sc_uint<8>, sc_bv<32>, or any user-defined type.
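To make "any user-defined type" concrete: a type carried by sc_signal<T> generally needs value semantics the channel can rely on, in particular operator== (to detect whether a write changed anything) and operator<< (to print the value). The sketch below is plain C++ with no SystemC dependency. The DecodedInstr struct, its fields, and the to_text helper are hypothetical names chosen for illustration, not part of the series code.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical payload type: a decoded instruction bundle that could
// travel over a single signal.
struct DecodedInstr {
  std::uint32_t raw   = 0;      // 32-bit instruction word
  std::uint8_t  rd    = 0;      // destination register index
  bool          valid = false;  // qualifies the other fields

  // Lets a channel decide whether a write actually changed the value.
  bool operator==(const DecodedInstr& other) const {
    return raw == other.raw && rd == other.rd && valid == other.valid;
  }
};

// Lets the value be printed in logs.
inline std::ostream& operator<<(std::ostream& os, const DecodedInstr& d) {
  return os << "{raw=0x" << std::hex << d.raw << std::dec
            << " rd=" << static_cast<int>(d.rd)
            << " valid=" << d.valid << "}";
}

// Small helper for experiments: render a value as text.
inline std::string to_text(const DecodedInstr& d) {
  std::ostringstream os;
  os << d;
  return os.str();
}
```

With these two operators in place, a declaration such as sc_signal<DecodedInstr> should be usable in the same way as the sc_uint<8> signals in this post; waveform tracing would need additional support that is out of scope here.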

SC_MODULE — Deep Dive: What the Macro Actually Does

SC_MODULE is a preprocessor macro. Writing:

SC_MODULE(pass_through) {
  // ...
};

...expands, in SystemC 2.3.x, to:

struct pass_through : ::sc_core::sc_module {
  // ...
};

The constructor boilerplate comes from SC_CTOR(pass_through), which adds roughly:

  typedef pass_through SC_CURRENT_USER_MODULE;
  pass_through(::sc_core::sc_module_name)

Together, these two macros have four concrete consequences that matter for everything you build in this series:

1. Kernel registration — the module enters the object hierarchy tree.

Inheriting from sc_module is what registers the module with the simulation kernel. The moment you write pass_through dut("dut"), the kernel records this object in its internal hierarchy tree. Every module, port, and signal declared inside it gets a fully-qualified dotted path: dut.in_data, dut.out_data. When something goes wrong at runtime, error messages use these paths to tell you exactly which instance in which module hierarchy is misbehaving.

2. Hierarchical naming — the "dut" string becomes the instance path.

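The string passed to the module's constructor (through SC_CTOR or a hand-written constructor) is the instance name. The kernel joins it with the names of the enclosing modules to form the hierarchical path. If you nest a module inside another: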

SC_MODULE(cpu) {
  pass_through alu;
  SC_CTOR(cpu) : alu("alu") { ... }
};
cpu top("top");

Then the inner module's ports have paths like top.alu.in_data. This is identical to the instance hierarchy in a SystemVerilog module instantiation, but it is a runtime C++ object tree, not a static netlisted structure. That distinction matters for the next point.
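How a dotted path falls out of a runtime object tree can be sketched in a few lines of plain C++. NamedObject below is a hypothetical stand-in, not sc_object; it only mimics the idea that each object knows its parent and builds its full name by walking up the chain.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-in for a named simulation object (NOT sc_object).
struct NamedObject {
  std::string        leaf;              // local instance name, e.g. "alu"
  const NamedObject* parent = nullptr;  // enclosing object, or null at the top

  explicit NamedObject(std::string leaf_name,
                       const NamedObject* parent_obj = nullptr)
      : leaf(std::move(leaf_name)), parent(parent_obj) {}

  // Walk up the parent chain and join the names with dots.
  std::string full_name() const {
    return parent ? parent->full_name() + "." + leaf : leaf;
  }
};
```

Building top, then alu with top as parent, then in_data with alu as parent yields the path top.alu.in_data from in_data.full_name(). In the real library, the corresponding accessors are sc_object::name() for the full path and sc_object::basename() for the leaf.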

3. Process binding — SC_METHOD and SC_THREAD must be registered with the kernel.

The kernel does not automatically discover your member functions. You must explicitly register each process inside SC_CTOR:

SC_CTOR(pass_through) {
  SC_METHOD(pass);       // tells kernel: "pass() is a simulation process"
  sensitive << in_data;  // tells kernel: "re-run pass() when in_data changes"
}

If you write a void pass() method but forget SC_METHOD(pass), the function exists as a plain C++ function — the kernel never calls it. No warning. The module appears dead.

4. Port ownership — ports belong to modules, not to the namespace.

sc_in, sc_out, and sc_signal declared inside a module are owned by that module's instance. They are not global. When you instantiate pass_through dut("dut") and pass_through dut2("dut2"), each instance has completely independent port objects. This is exactly like SV module instances having independent signal storage.

SC_MODULE vs. plain class — when to use which:

SC_MODULE expands to a struct, which means all members are public by default. This is intentional — hardware ports are meant to be accessible from the outside for binding. If you need private implementation details or want to use templates or multiple inheritance, you can inherit from sc_module directly:

template<int WIDTH>
struct alu : sc_module {
  sc_in<sc_uint<WIDTH>>  operand_a;
  sc_out<sc_uint<WIDTH>> result;
  SC_HAS_PROCESS(alu);
  alu(sc_module_name name) : sc_module(name) {
    SC_METHOD(execute);
    sensitive << operand_a;
  }
  void execute() { result.write(operand_a.read()); }
};

SC_HAS_PROCESS is required when you are not using SC_CTOR — it provides the SC_CURRENT_USER_MODULE typedef that SC_METHOD and SC_THREAD macros depend on.

Dynamic elaboration — what SystemC can do that SV cannot without generate:

Because the hierarchy is a runtime C++ object tree, you can create modules dynamically:

SC_MODULE(cpu_cluster) {
  std::vector<pass_through*> cores;

  SC_CTOR(cpu_cluster) {
    int num_cores = read_config_file();   // determined at runtime
    for (int i = 0; i < num_cores; i++) {
      std::string name = "core_" + std::to_string(i);
      cores.push_back(new pass_through(name.c_str()));
    }
  }
};

In SystemVerilog, replicating modules requires a generate-for loop whose count is an elaboration-time constant. In SystemC, the count can come from a config file, a command-line argument, or any C++ computation, as long as every module is constructed and its ports are bound before sc_start() is first called; the hierarchy is frozen once elaboration ends. ARM uses this heavily in their Corstone and Neoverse SystemC platforms: the number of CPU cores in the virtual platform is a constructor parameter, not a compile-time constant.
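The runtime-count pattern itself is ordinary C++ and can be tried without SystemC. The sketch below uses a hypothetical ToyCore type in place of a module and owns each child through std::unique_ptr, which avoids the manual cleanup that raw new would require.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module instance; only the instance name matters here.
struct ToyCore {
  std::string name;
  explicit ToyCore(std::string instance_name)
      : name(std::move(instance_name)) {}
};

// Builds `num_cores` children named core_0, core_1, ...
// `num_cores` is an ordinary runtime value (config file, argv, ...).
// unique_ptr owns each child, so nothing leaks when the vector goes away.
inline std::vector<std::unique_ptr<ToyCore>> build_cluster(int num_cores) {
  std::vector<std::unique_ptr<ToyCore>> cores;
  cores.reserve(static_cast<std::size_t>(num_cores > 0 ? num_cores : 0));
  for (int i = 0; i < num_cores; ++i) {
    cores.push_back(std::make_unique<ToyCore>("core_" + std::to_string(i)));
  }
  return cores;
}
```

Calling build_cluster(3) gives three children named core_0 through core_2. SystemC 2.3 also ships sc_vector, a container intended for arrays of modules and ports, which is worth considering before hand-rolling a vector of pointers.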

sc_signal — Theory: The Evaluate-Update Model

sc_signal<T> is not a simple variable. Understanding this is the single most important insight in SystemC for engineers coming from any background.

Every sc_signal<T> internally maintains two values:

Current value (m_cur_val) — what all readers see when they call .read()
Next value (m_new_val) — what .write() stores temporarily

When you call sig.write(42), only m_new_val is updated to 42. The current value is still whatever it was before. The kernel's update phase — which runs after all processes in the current evaluate phase complete — copies m_new_val to m_cur_val for every signal that was written. After the update phase, if m_cur_val changed, the kernel fires the signal's value_changed_event, which re-triggers any processes sensitive to that signal.

This is the evaluate-update model. It is not unique to SystemC: VHDL signal assignments and Verilog/SystemVerilog non-blocking assignments defer their updates in the same way.
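The two-value behavior is small enough to model in plain C++. ToySignal below is a hypothetical sketch, not the sc_signal source: it keeps a committed value and a pending value, and an explicit update() plays the role of the kernel's update phase.

```cpp
#include <cassert>

// Toy double-buffered signal. A model of the current/next description
// above, not the library implementation.
template <typename T>
class ToySignal {
 public:
  explicit ToySignal(T initial = T()) : cur_(initial), next_(initial) {}

  // Readers always see the committed value.
  const T& read() const { return cur_; }

  // Writers only touch the pending value.
  void write(const T& value) { next_ = value; }

  // Called by the "kernel" after every process has run.
  // Returns true when the committed value actually changed, which is
  // the condition for notifying sensitive processes.
  bool update() {
    if (next_ == cur_) return false;
    cur_ = next_;
    return true;
  }

 private:
  T cur_;   // what read() returns
  T next_;  // what write() stores
};
```

After sig.write(42) on a signal that started at 0, sig.read() still returns 0. Only once update() has run does read() return 42, and a second write of the same value makes update() report no change.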

Side-by-side comparison with SystemVerilog:

Scenario	Verilog/SV (blocking `=`)	Verilog/SV (non-blocking `<=`)	SystemC (`sc_signal`)
Assignment visibility	Immediate: next statement sees new value	Deferred: committed in the NBA region, after all active processes at the current time have run	Deferred: committed in kernel update phase
Models	Combinational logic and intermediate values inside one always block	Hardware register updates (cross-always-block synchronization)	All inter-module communication, always
Analogy	Local variable	Hardware flip-flop output	Hardware wire/net
Risk	Race conditions in concurrent always blocks	Correct for RTL flip-flops	Correct for concurrent hardware processes

sc_signal is behaviorally equivalent to using <= (non-blocking assignment) for all inter-module communication, enforced automatically. You do not choose — it is always deferred.

The write-then-read trap:

void bad_process() {
  out_data.write(in_data.read() + 1);  // queues new value
  sc_uint<8> check = out_data.read();  // READS THE OLD VALUE, not the +1 value
  // check is wrong — out_data.read() returns m_cur_val, not m_new_val
}

This surprises every SV engineer on first encounter. In SV:

always @(*) begin
  out_data = in_data + 1;  // blocking: out_data immediately has new value
  check = out_data;         // check sees the new value
end

In SystemC, write() is always non-blocking. If you need to compute a chain of values within one process, use local C++ variables:

void correct_process() {
  sc_uint<8> temp = in_data.read() + 1;  // local C++ variable, updated immediately
  out_data.write(temp);                   // queues to signal
  sc_uint<8> check = temp;               // correct — reads the local variable
}

ASCII timing diagram — what happens across one delta cycle:

EVALUATE PHASE:
  pass() runs:
    in_data.read()   → returns m_cur_val = 0x2A (42)
    out_data.write(0x2A) → stores 0x2A in m_new_val only
    pass() returns

UPDATE PHASE:
  out_data: m_cur_val = m_new_val = 0x2A
  value_changed_event fires (0x00 → 0x2A was a change)

NEXT EVALUATE PHASE:
  Any process sensitive to out_data runs here
  out_data.read() now returns 0x2A

  Time: ────────────────────────────────────────────►
               t=0              t=0+Δ1           t=0+Δ2
         [write queued]   [update committed]  [readers see 0x2A]

Port Types: sc_in, sc_out, sc_inout

Ports in SystemC are typed objects, not raw connections:

sc_in<sc_uint<8>>   in_data;   // 8-bit input
sc_out<sc_uint<8>>  out_data;  // 8-bit output
sc_inout<sc_uint<8>> bidir;    // bidirectional (rare in RTL models)

sc_uint<8> is a SystemC unsigned integer with exactly 8 bits — part-select, bit-select, and arithmetic all behave like hardware. For our RISC-V CPU we will use:
- sc_uint<32> for data buses (the RV32I data width)
- sc_uint<32> for instruction words
- sc_bv<32> when we care about individual bits and do not need arithmetic
- bool for control signals (reset, enable, valid)

Ports are not values: they are interfaces. You read a value through them with .read() and write through them with .write(). This distinction matters when we get to pipelined stages in Posts 18–22, where a module reads one port and writes another in a way that is sensitive to pipeline stage boundaries.

Port Binding — How Modules Connect

Port binding is the SystemC equivalent of the port connection list in a SystemVerilog instantiation. The mechanics are different, and the differences matter.

SystemVerilog — named port connection:

logic [7:0] wire_a, wire_b;

pass_through dut (
  .in_data  (wire_a),   // port name must match module declaration exactly
  .out_data (wire_b)
);

SystemC — binding via operator():

sc_signal<sc_uint<8>> sig_a, sig_b;
pass_through dut("dut");

// Call-style binding (most common; reads like a port map):
dut.in_data(sig_a);    // sc_port::operator() calls sc_port::bind(sig_a)
dut.out_data(sig_b);

// Equivalent alternative: call bind() explicitly.
// Use one form or the other; each port is bound only once.
dut.in_data.bind(sig_a);
dut.out_data.bind(sig_b);

Both forms call sc_port::bind() internally. Three checks apply to every binding:

- Type compatibility. sc_in<sc_uint<8>> can only bind to sc_signal<sc_uint<8>> or to another channel or port with a compatible interface. Binding mismatched types is a compile-time error.
- Direction. sc_in exposes only read(), so binding it to a signal makes the port a reader. sc_out is derived from sc_inout, so it can be written and also read back; by convention it is used as the driver. With the default writer policy (SC_ONE_WRITER), a signal written by more than one process is reported as an error.
- Completeness. The kernel records which channel each port is bound to. When elaboration completes at the first sc_start(), it verifies that every port has been bound.

What happens if you forget a binding:

Error: (E109) complete binding failed: port not bound:
port 'dut.in_data' (sc_in)
In file: .../src/sysc/communication/sc_port.cpp:230

This is a runtime fatal error at sc_start(), not a compile-time error. Unlike an undriven wire in SV (which produces 'z' or 'x' but does not abort), an unbound port in SystemC halts the simulation before it starts. The error message tells you exactly which port on which instance is unbound.
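The "checked at start, not at compile time" behavior can be modeled in plain C++. ToyPort and check_all_bound below are hypothetical names, and the error text is illustrative rather than the library's exact wording. The point is only that a port is an object holding a reference to its channel, so an unbound port can be detected by walking the port list once before simulation begins.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy port: holds a pointer to the "signal" (here just an int) it is bound to.
struct ToyPort {
  std::string name;
  int*        target = nullptr;  // null until bind() is called

  explicit ToyPort(std::string port_name) : name(std::move(port_name)) {}
  void bind(int& signal) { target = &signal; }
  void operator()(int& signal) { bind(signal); }  // call-style shorthand
  int  read() const { return *target; }
};

// Toy "start of simulation" check: refuse to run if any port is unbound.
inline void check_all_bound(const std::vector<const ToyPort*>& ports) {
  for (const ToyPort* p : ports) {
    if (p->target == nullptr) {
      throw std::runtime_error("port not bound: " + p->name);
    }
  }
}
```

Binding only in_data and then calling check_all_bound on both ports throws with the name of the missing one, which mirrors how the real error message identifies the offending instance path.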

Binding to another module's port (hierarchical connection):

In hierarchical designs, parent modules bind child ports to their own ports:

SC_MODULE(cpu_top) {
  sc_in<sc_uint<8>>  top_in;
  sc_out<sc_uint<8>> top_out;

  pass_through stage1;
  pass_through stage2;
  sc_signal<sc_uint<8>> mid;

  SC_CTOR(cpu_top) : stage1("s1"), stage2("s2") {
    stage1.in_data(top_in);    // parent port → child port (direct pass-through)
    stage1.out_data(mid);      // child port → internal signal
    stage2.in_data(mid);       // internal signal → next child port
    stage2.out_data(top_out);  // child port → parent port
  }
};

This is the wiring pattern for every stage boundary in the RISC-V pipeline.

sc_signal — The Wire Between Modules

sc_signal<T> is the SystemC equivalent of a net. In the testbench (or in a top-level wrapper) you declare signals and use them to bind ports together:

sc_signal<sc_uint<8>> sig_in, sig_out;
dut.in_data(sig_in);    // bind port to signal
dut.out_data(sig_out);  // bind port to signal

One important behavioral detail: signal writes are not immediately visible. When you call sig_in.write(42), the new value is queued. It becomes visible to all readers only after the simulation kernel processes the delta cycle — a zero-time scheduling step that propagates values and re-runs sensitive processes. This mirrors how combinational logic works in hardware: the output of a gate does not instantly change the input of the next gate in the same simulation timestep.

For our RISC-V CPU, signals will connect the outputs of the Register File to the inputs of the ALU, the ALU output to the writeback logic, and the PC register output to the instruction memory address port. Getting comfortable with the signal/port binding pattern now pays off every time we wire up a new stage.

SC_METHOD and the Sensitivity List

SC_METHOD registers a C++ member function as a run-to-completion process. With a sensitivity list of plain signals, as here, it models combinational logic:

SC_CTOR(pass_through) {
  SC_METHOD(pass);       // register pass() as a process
  sensitive << in_data;  // fire when in_data changes
}

This is the SystemC equivalent of always @(in_data) in Verilog. The method runs once at t=0 during initialization (unless dont_initialize() is called right after the SC_METHOD registration) and then re-runs every time any signal in its sensitivity list changes.

Rules for SC_METHOD:
- No wait() calls allowed. If you need to wait for a clock edge, you want SC_THREAD or SC_CTHREAD (covered in Post 3).
- Must return promptly. Scheduling is cooperative: no other process runs, and simulation time cannot advance, until the method returns.
- Can read any signal, but each signal should be written by only one process.

For the RISC-V ALU (Post 5), the execute method will be an SC_METHOD sensitive to the operands and operation code — pure combinational logic, no clock needed.

What Happens at sc_start() — The Three Phases

sc_start() is more than "run the simulation." It orchestrates three distinct phases, and knowing which phase is active explains many behaviors that otherwise seem mysterious.

Phase 1 — Elaboration (before sc_start()):

Elaboration happens when you write code in sc_main before calling sc_start(). During elaboration:
- Module constructors run (SC_CTOR bodies execute)
- Processes are registered with the kernel (SC_METHOD, SC_THREAD calls inside SC_CTOR)
- Port bindings are established (dut.in_data(sig_in) calls)
- Signal initial values are set (default: zero/false for numeric types)

No simulation processes run during elaboration. No sensitivity lists fire. You can safely write to signals and set initial conditions here.

Phase 2 — Initialization (the first thing sc_start() does):

When you call sc_start() for the first time, the kernel first commits any signal writes made during elaboration, then runs every process once, regardless of its sensitivity list (processes marked with dont_initialize() are skipped). This establishes initial output values from initial input values. It is the equivalent of an RTL simulator evaluating every always_comb block at time 0.

Important: this initialization fires even if none of the inputs have changed. If nothing has written sig_in before sc_start(), pass() reads in_data as 0 and writes 0 to out_data during initialization.

Phase 3 — Event-driven simulation:

After initialization, the kernel enters the event-driven loop:

Step 1: Take the earliest pending event from the event queue
Step 2: EVALUATE — run all processes triggered by this event
        Each process reads current signal values, queues writes
Step 3: UPDATE — commit all queued writes to signals
        For each committed signal that changed value:
          fire value_changed_event
Step 4: If any value_changed_events fired, go to Step 2 (delta cycle)
Step 5: If no more events at this timestamp, advance simulation time
        to the next event in the queue
Step 6: If no more events and sc_start(duration) has elapsed, return

Numbered walkthrough for our pass_through module:

sc_main calls sig_in.write(42) before sc_start: stored in sig_in.m_new_val, update requested

sc_start(10, SC_NS) begins:
  INITIALIZATION (first call only):
    UPDATE: writes requested during elaboration commit first
      sig_in.m_cur_val = 42
    Every process is made runnable, pass() included

  EVALUATE Δ0 at t=0:
    pass() runs:
      reads in_data.read() → 42
      calls out_data.write(42) → queued in out_data.m_new_val
    UPDATE: out_data.m_cur_val = 42
    out_data.value_changed_event fires (0 → 42 changed)

  EVALUATE Δ1 at t=0:
    No process is sensitive to out_data, so nothing runs
    No more pending events at t=0

  Time advances to t=10ns (end of sc_start(10, SC_NS))

sc_start returns. sig_in.read() = 42, sig_out.read() = 42.

This sequence explains why in=0 and out=0 appear if you print the signals before calling sc_start: the write to sig_in is still pending and pass() has not run. And it explains why out=42 appears after sc_start returns: the initialization update and one evaluate-update cycle of pass() both ran inside that one call.

The Module in Context

Here is where pass_through sits in our overall CPU build and how data flows through it:

graph LR
    SIG_IN["sig_in\nsc_signal<sc_uint<8>>"]
    PT["pass_through\nSC_MODULE\nSC_METHOD: pass()"]
    SIG_OUT["sig_out\nsc_signal<sc_uint<8>>"]

    SIG_IN -->|"in_data (sc_in)"| PT
    PT -->|"out_data (sc_out)"| SIG_OUT

    style PT fill:#06b6d4,color:#fff
    style SIG_IN fill:#1e293b,color:#94a3b8
    style SIG_OUT fill:#1e293b,color:#94a3b8

The signal on the left is driven by the testbench stimulus. The module reads it through in_data, passes it through, and writes to out_data. The signal on the right is read back by the testbench to check the output. This exact topology — testbench drives signals, module reads and writes through ports — is the pattern we will use for every component in the CPU.

Simulation Semantics — The Evaluate-Update Model in Full

The SystemC scheduler is not a mystery box. Its inner loop can be written as pseudocode:

while (simulation_not_finished):

  // EVALUATE PHASE
  // Run every process whose sensitivity event has fired
  for each runnable_process in ready_queue:
    runnable_process.execute()
    // Inside execute(): signal.write() stores into m_new_val only
    // Inside execute(): signal.read() returns m_cur_val (OLD value)

  // UPDATE PHASE
  // Commit all queued writes atomically
  for each signal in written_signals:
    if signal.m_new_val != signal.m_cur_val:
      signal.m_cur_val = signal.m_new_val
      schedule value_changed_event for signal

  // DELTA DECISION
  if any value_changed_events were scheduled:
    delta_count++
    // Move value_changed_events to ready_queue
    // Loop back to EVALUATE (same timestamp T)
  else:
    // No more changes at timestamp T
    advance simulation time to next_scheduled_event_time

This deferred-update discipline is not peculiar to SystemC. IEEE 1666 (SystemC) specifies the evaluate-update loop directly, and the non-blocking assignment region of IEEE 1364 (Verilog) and IEEE 1800 (SystemVerilog) achieves the same effect for <= assignments. All of them solve the same problem: concurrent hardware must be simulated by a sequential program without the result depending on the order in which processes happen to run.

Verilog/SV comparison — side by side:

// Verilog non-blocking assignment — deferred update
always @(a) begin
  b <= a + 1;   // b's new value is queued
  // b still holds old value here in this always block
end

always @(b) begin
  c <= b + 1;   // reads b's old value if this block runs before the NBA update commits b
end
// After the NBA update region: b and c both have new values

// SystemC sc_signal — same semantics, explicit API
void stage1() {           // SC_METHOD sensitive to a
  b.write(a.read() + 1); // queues b's new value
  // b.read() still returns old value here
}

void stage2() {           // SC_METHOD sensitive to b
  c.write(b.read() + 1); // reads b's old value in same evaluate phase
}
// After update phase: b and c both have new values (two delta cycles)

The behavior is identical. The difference is that SystemVerilog makes this implicit through the NBA scheduling region, while SystemC makes it explicit through the .read() / .write() API.

Visual timeline — two processes at the same timestamp:

Timestamp T=0:

  EVALUATE Δ0:
  ┌─────────────────────────────────────────────────────┐
  │  stage1(): reads a=5, calls b.write(6)              │
  │            b.m_cur_val = 5 (unchanged)              │
  │            b.m_new_val = 6 (queued)                 │
  │  stage2(): reads b.m_cur_val = 5 (OLD value!)       │
  │            calls c.write(6) (based on stale b)      │
  └─────────────────────────────────────────────────────┘

  UPDATE Δ0:
  ┌─────────────────────────────────────────────────────┐
  │  b.m_cur_val ← 6   (b changed: fire value_changed) │
  │  c.m_cur_val ← 6   (c changed: fire value_changed) │
  └─────────────────────────────────────────────────────┘

  EVALUATE Δ1:
  ┌─────────────────────────────────────────────────────┐
  │  stage2() re-runs (triggered by b's value_changed)  │
  │  reads b.m_cur_val = 6 (NEW value)                  │
  │  calls c.write(7) — correct result                  │
  └─────────────────────────────────────────────────────┘

  UPDATE Δ1:
  ┌─────────────────────────────────────────────────────┐
  │  c.m_cur_val ← 7   (c changed: fire value_changed) │
  └─────────────────────────────────────────────────────┘

  EVALUATE Δ2:
  ┌─────────────────────────────────────────────────────┐
  │  No process sensitive to c → nothing runs           │
  │  No new events at T=0 → system is stable            │
  └─────────────────────────────────────────────────────┘

Time advances to next scheduled event.

This diagram assumes both processes are runnable in Δ0, as they are during initialization, when every process runs once. stage2 ran with the stale b value in Δ0, re-ran with the correct value in Δ1, and the system converged. This is correct behavior: it is how two levels of combinational logic in hardware settle after an input changes.
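The two-stage timeline can be reproduced with a toy kernel in plain C++. ToyKernel, TwoStageResult, and run_two_stage are hypothetical names for this sketch. It models the evaluate, update, and notification steps over integer signals and is not the SystemC scheduler; in particular it has no simulated time and no guard against designs that oscillate forever.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy evaluate-update kernel over integer signals.
class ToyKernel {
 public:
  // Returns the id of a new signal whose committed value is `initial`.
  int add_signal(int initial) {
    signals_.push_back(Signal{initial, initial});
    return static_cast<int>(signals_.size()) - 1;
  }
  int  read(int id) const       { return signals_[id].cur; }    // committed value
  void write(int id, int value) { signals_[id].next = value; }  // pending value

  // Registers a process and the signal ids it is sensitive to.
  void add_process(std::function<void()> body, std::vector<int> sensitive_to) {
    processes_.push_back(Process{std::move(body), std::move(sensitive_to)});
  }

  // Runs delta cycles until nothing is runnable. Returns how many
  // evaluate phases executed at least one process.
  int run() {
    std::vector<bool> runnable(processes_.size(), true);  // init: all run once
    int deltas = 0;
    for (;;) {
      // EVALUATE: run every runnable process; writes stay pending.
      bool ran_any = false;
      for (std::size_t i = 0; i < processes_.size(); ++i) {
        if (runnable[i]) { processes_[i].body(); ran_any = true; }
      }
      if (!ran_any) return deltas;
      ++deltas;

      // UPDATE: commit pending values and remember which signals changed.
      std::vector<bool> changed(signals_.size(), false);
      for (std::size_t s = 0; s < signals_.size(); ++s) {
        if (signals_[s].next != signals_[s].cur) {
          signals_[s].cur = signals_[s].next;
          changed[s] = true;
        }
      }

      // NOTIFY: a process is runnable next if a signal it listens to changed.
      for (std::size_t i = 0; i < processes_.size(); ++i) {
        runnable[i] = false;
        for (int id : processes_[i].sensitive_to) {
          if (changed[id]) runnable[i] = true;
        }
      }
    }
  }

 private:
  struct Signal  { int cur; int next; };
  struct Process { std::function<void()> body; std::vector<int> sensitive_to; };
  std::vector<Signal>  signals_;
  std::vector<Process> processes_;
};

struct TwoStageResult { int b; int c; int deltas; };

// The two-stage example from the timeline: b = a + 1, c = b + 1.
inline TwoStageResult run_two_stage(int a_value) {
  ToyKernel k;
  const int a = k.add_signal(a_value);
  const int b = k.add_signal(a_value);  // stale start value, as in the timeline
  const int c = k.add_signal(0);
  k.add_process([&k, a, b] { k.write(b, k.read(a) + 1); }, {a});  // stage1
  k.add_process([&k, b, c] { k.write(c, k.read(b) + 1); }, {b});  // stage2
  const int deltas = k.run();
  return TwoStageResult{k.read(b), k.read(c), deltas};
}
```

Calling run_two_stage(5) settles at b = 6 and c = 7 after two evaluate phases, matching the timeline: stage2 first computes from the stale b, then is re-triggered by b's change and corrects c. Because reads only ever see committed values, swapping the registration order of the two processes does not change the result.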

Common Pitfalls for SV Engineers

Coming from SystemVerilog, these five issues will bite you within the first week of SystemC coding. Most of them are silent: no compiler error, often no runtime error, just wrong output.

Pitfall 1: signal.read() returns the OLD value in the same method invocation.

void wrong() {
  out_data.write(in_data.read() + 1);
  // BUG: out_data.read() still returns the OLD value here
  if (out_data.read() > 10) { ... }  // always false if old value was <= 10
}

Fix: use a local C++ variable for any value you need to both write and read in the same method.

void correct() {
  sc_uint<8> new_val = in_data.read() + 1;
  out_data.write(new_val);
  if (new_val > 10) { ... }  // correct — reads the local variable
}

This is the most common first-week bug for SV engineers. In SV, out_data = in_data + 1 with blocking assignment makes out_data immediately readable. In SystemC, .write() is always non-blocking.

Pitfall 2: Forgetting sensitive << port — the module silently freezes.

SC_CTOR(pass_through) {
  SC_METHOD(pass);
  // MISSING: sensitive << in_data;
}

Result: pass() fires once during initialization (at t=0) and never again. Output stays at its initial value (0) regardless of input changes. No warning. The closest Verilog analogy is an always @(...) block whose sensitivity list omits a signal the block reads: simulation runs, but the output goes stale.
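The "runs once, then freezes" symptom is easy to reproduce in a plain C++ model. ToyMethod and count_runs are hypothetical names for this sketch, and input id 0 stands in for in_data. The only thing being modeled is the rule that initialization runs every method once, while later runs require a matching entry in the sensitivity list.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy method process: a body plus the list of input ids it is sensitive to.
struct ToyMethod {
  std::function<void()> body;
  std::vector<int>      sensitive_to;
};

// Counts how often a pass-through style method runs when its single input
// (id 0) changes `changes` times. `with_sensitivity` models whether the
// `sensitive << in_data` line was written.
inline int count_runs(bool with_sensitivity, int changes) {
  int runs = 0;
  ToyMethod pass{[&runs] { ++runs; }, {}};
  if (with_sensitivity) pass.sensitive_to.push_back(0);

  pass.body();  // initialization: every method runs once, sensitive or not

  for (int i = 0; i < changes; ++i) {
    // Input 0 changed: only methods that listed it are re-triggered.
    for (int id : pass.sensitive_to) {
      if (id == 0) pass.body();
    }
  }
  return runs;
}
```

With three input changes, count_runs(true, 3) reports four runs (initialization plus one per change), while count_runs(false, 3) reports exactly one. A run count stuck at one is a quick diagnostic to look for when a module's output never moves.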

Pitfall 3: All ports must be bound before sc_start() — no exceptions.

In SV, an undriven input defaults to z. In SystemC, an unbound port causes a fatal abort:

Error: (E109) complete binding failed: port not bound: port 'dut.in_data' (sc_in)

Every sc_in, sc_out, and sc_inout must be bound, with operator() or .bind(), before sc_start(). In large designs, use a helper function or assertion to verify all bindings in the testbench before sc_start().

Pitfall 4: SC_MODULE is a struct — everything is public.

SC_MODULE expands to struct, not class. All members are public by default. This means you can accidentally access internal signals from outside the module:

dut.internal_signal.write(42);  // compiles! but violates encapsulation

If encapsulation matters (it does in large designs), use private: explicitly inside the module, or inherit from sc_module directly using class instead of struct.

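Pitfall 5: sc_uint converts implicitly to a 64-bit integer, and its arithmetic is not 8 bits wide.

sc_uint<8> val = in_data.read();
int raw1 = val;                    // compiles: implicit conversion to a 64-bit unsigned, then narrowed to int
int raw2 = val.to_int();           // explicit and self-documenting
int raw3 = static_cast<int>(val);  // also works
sc_uint<8> next = val + 1;         // val + 1 is computed in 64 bits; truncated to 8 bits on assignment

This surprises engineers who expect sc_uint<8> to behave like an 8-bit register in every expression. sc_uint<8> is not an int: it is a hardware type with a defined bit width, but that width is enforced only when a value is stored. Operations like val % 256 and val >> 3 work, and mixing with plain int compiles, yet an intermediate such as val + 1 can exceed 255 until it is written back into an sc_uint<8>. Note that sc_bv and sc_lv have no implicit integer conversion at all; use to_uint() or to_int() there.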
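What "defined bit width" means in practice can be sketched in plain C++. ToyUint below is a hypothetical stand-in, not sc_uint. It assumes, as sc_uint does, that intermediate arithmetic happens in a 64-bit integer and that truncation to N bits happens only when a value is stored.

```cpp
#include <cassert>
#include <cstdint>

// Toy N-bit unsigned value (1 <= N <= 64). Implements only what the
// example needs: truncating storage, bit-select, and part-select.
template <int N>
class ToyUint {
  static_assert(N >= 1 && N <= 64, "width must be 1..64");

 public:
  ToyUint(std::uint64_t v = 0) : bits_(v & mask(N)) {}  // truncate on store
  std::uint64_t value() const { return bits_; }

  // Bit-select, with bit 0 as the least significant bit.
  bool bit(int i) const { return (bits_ >> i) & 1u; }

  // Part-select of bits hi..lo inclusive.
  std::uint64_t range(int hi, int lo) const {
    return (bits_ >> lo) & mask(hi - lo + 1);
  }

 private:
  // All-ones mask of the given width (1..64).
  static std::uint64_t mask(int width) { return ~0ull >> (64 - width); }
  std::uint64_t bits_;
};
```

Starting from ToyUint<8> val(255), the expression val.value() + 1 is 256 as a 64-bit integer, but storing it into another ToyUint<8> yields 0. A comparison made on the intermediate and a comparison made on the stored result can therefore disagree, which is worth keeping in mind when writing ALU overflow checks.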

Implementation

Below is the complete, compilable implementation for this post. Every CPU component in this series will follow this same structural pattern.

// post01_pass_through.cpp
// SystemC Tutorial Series — Post 1: Modules, Ports & Signals
// Builds the structural skeleton for every RISC-V CPU component in this series.
//
// Compile:
//   g++ -std=c++17 -I$SYSTEMC_HOME/include post01_pass_through.cpp \
//       -L$SYSTEMC_HOME/lib-linux64 -lsystemc -o post01
//   (the library comes after the source file so the linker can resolve it)
//
// Or use the provided CMakeLists.txt (see Build & Run section).

#include <systemc.h>

// ---------------------------------------------------------------------------
// pass_through: our first SC_MODULE — the skeleton every CPU component uses.
//
// In our RISC-V CPU, every unit (ALU, Register File, Decoder, Load/Store Unit)
// will follow this same pattern:
//   1. Declare input and output ports with typed sc_in / sc_out
//   2. Write a process function (SC_METHOD for combinational, SC_THREAD for clocked)
//   3. Register the process in SC_CTOR with its sensitivity list
// ---------------------------------------------------------------------------
SC_MODULE(pass_through) {

  // Ports — typed and simulation-aware
  sc_in<sc_uint<8>>  in_data;   // 8-bit input  — read with in_data.read()
  sc_out<sc_uint<8>> out_data;  // 8-bit output — write with out_data.write()

  // The combinational process: re-runs whenever in_data changes.
  // No clock needed — this is pure combinational logic.
  // In the RISC-V ALU (Post 5), the execute() process will work the same way.
  void pass() {
    out_data.write(in_data.read());
  }

  // Constructor: register processes and their sensitivity lists.
  // SC_CTOR(pass_through) expands (in SystemC 2.3.x) to roughly:
  //   typedef pass_through SC_CURRENT_USER_MODULE;
  //   pass_through(::sc_core::sc_module_name)
  // Everything inside the braces runs once at construction time,
  // before simulation starts.
  SC_CTOR(pass_through) {
    SC_METHOD(pass);       // register pass() as a METHOD process
    sensitive << in_data;  // re-run when in_data changes (like always @(in_data))
  }
};

// ---------------------------------------------------------------------------
// sc_main — the SystemC entry point, replaces main().
// The simulator calls this after initialising the kernel.
// Everything declared here is part of the "elaboration" phase:
// modules are instantiated, ports are bound, before sc_start() is called.
// ---------------------------------------------------------------------------
int sc_main(int argc, char* argv[]) {

  // Signals are the wires connecting modules together.
  // They live outside any module — in the testbench or top-level wrapper.
  sc_signal<sc_uint<8>> sig_in;   // drives the DUT input
  sc_signal<sc_uint<8>> sig_out;  // captures the DUT output

  // Instantiate the design under test.
  // The string "dut" is the hierarchical instance name — appears in logs.
  pass_through dut("dut");

  // Port binding — connect ports to signals.
  // Conceptually: sig_in drives dut.in_data, and dut.out_data drives sig_out.
  // The simulator will error if any port is left unbound when sc_start() runs.
  dut.in_data(sig_in);
  dut.out_data(sig_out);

  // --------------------------------------------------------------------------
  // Stimulus and observation
  // --------------------------------------------------------------------------

  // Apply first stimulus value
  sig_in.write(42);
  sc_start(10, SC_NS);  // advance simulation 10 nanoseconds
  std::cout << "t=" << sc_time_stamp()
            << "  in=" << sig_in.read()
            << "  out=" << sig_out.read()
            << std::endl;

  // Apply second stimulus value — 0xAB = 171 decimal
  sig_in.write(0xAB);
  sc_start(10, SC_NS);
  std::cout << "t=" << sc_time_stamp()
            << "  in=" << sig_in.read()
            << "  out=" << sig_out.read()
            << std::endl;

  return 0;
}

A CMakeLists.txt for this post is provided in the GitHub repo. It locates your SystemC install via SYSTEMC_HOME and wires up the include paths and library links correctly across Linux, macOS, and Windows.

Build & Run

# Clone the repo and enter the post01 directory
git clone https://github.com/vlsidesignverification/risc-v-systemc.git
cd risc-v-systemc/section1-foundations/post01-modules-ports-signals

# Configure and build
mkdir build && cd build
cmake .. -DSYSTEMC_HOME=/path/to/your/systemc-install
make

# Run the simulation
./post01

Expected output:

        SystemC 2.3.x --- <date and time>
        Copyright (c) 1996-2017 by all Contributors,
        ALL RIGHTS RESERVED

t=10 ns  in=42  out=42
t=20 ns  in=171  out=171

If you see the two output lines with matching in and out values, the module is working. The SystemC 2.3.x banner is printed by the kernel at startup. It can be silenced by setting the environment variable SC_COPYRIGHT_MESSAGE=DISABLE, but there is little reason to: the banner confirms which version you are running.

Troubleshooting:

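fatal error: systemc.h: No such file or directory, or error: 'sc_uint' was not declared: the compiler is not seeing the SystemC headers. Your SYSTEMC_HOME is wrong or the include path is not being passed to the compiler. Check cmake -DSYSTEMC_HOME=... and confirm $SYSTEMC_HOME/include/systemc.h exists. If the header is found but sc_uint is still undeclared, check that the source includes <systemc.h> rather than <systemc>, which leaves the types inside the sc_core and sc_dt namespaces.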
error: unbound port at runtime — you forgot a port binding line. Every sc_in and sc_out must be bound to a signal before sc_start().
The output shows the same out value on both lines, so out is not following in: the sensitivity list is missing (see DV Insight below).

Verification

There is no formal testbench in this post — and that is intentional.

We are learning the language first. Printing stimulus and response values to stdout is enough to confirm the module works as expected at this stage. Starting in Post 5, when we build the ALU, we will introduce a proper SystemC testbench with a separate SC_MODULE driving stimulus, golden reference checking, and pass/fail reporting. By Post 8 we will have a complete self-checking testbench infrastructure that all subsequent modules reuse.

Verification methodology comes after language fluency. Do not skip ahead — the testbench patterns in Post 5 will make much more sense once you have used SC_METHOD, SC_THREAD, and clocked processes in Posts 2–4.

DV Insight

The most common beginner mistake: forgetting the sensitivity list.

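Try removing the sensitive << in_data; line and rebuilding. The output will be:

t=10 ns  in=42  out=42
t=20 ns  in=171  out=42

Without a sensitivity list, SC_METHOD fires exactly once, during initialization at t=0, and never again. That single run picks up the 42 written before the first sc_start(), because pending signal updates are applied before the initial process runs, so the first line looks correct. The second line exposes the bug: in has moved to 171 and out is still 42. The module appears dead: the input changes and the output never follows. This bug is silent. No compiler warning, no runtime error, just wrong output.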

In a complex design this can waste hours. Whenever a module output is not tracking its input, check the sensitivity list first. This is the SystemC equivalent of leaving a signal out of the sensitivity list of a Verilog always block.
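The two scheduling rules behind this bug can be sketched in plain C++. This is a toy model, not the SystemC kernel: ToySignal, ToyProcess, ToyKernel, and run_toy_experiment are hypothetical names invented for illustration, and delta cycles and simulation time are left out. It only imitates "every method runs once at initialization" and "afterwards, only sensitive processes re-run". Unlike the sc_main above, the stimulus here is applied only after initialization, so the unsensitized copy keeps its initial value 0.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Toy model only: NOT the SystemC kernel.
struct ToySignal {
  std::uint8_t value = 0;
};

struct ToyProcess {
  std::function<void()> body;
  std::vector<const ToySignal*> sensitivity;  // empty = never re-triggered
  int run_count = 0;
};

class ToyKernel {
 public:
  void add(ToyProcess* p) { processes_.push_back(p); }

  // Rule 1: every registered process runs once at initialization.
  void initialize() {
    for (ToyProcess* p : processes_) run(p);
  }

  // Rule 2: on a value change, re-run only processes sensitive to sig.
  void write(ToySignal& sig, std::uint8_t v) {
    if (sig.value == v) return;  // no change, no event
    sig.value = v;
    for (ToyProcess* p : processes_) {
      for (const ToySignal* s : p->sensitivity) {
        if (s == &sig) { run(p); break; }
      }
    }
  }

 private:
  static void run(ToyProcess* p) { p->body(); ++p->run_count; }
  std::vector<ToyProcess*> processes_;
};

struct ToyResult {
  int good_runs, bad_runs;
  std::uint8_t good_out, bad_out;
};

// Two copies of pass(): one with a sensitivity list, one without.
inline ToyResult run_toy_experiment() {
  ToySignal in, out_good, out_bad;
  ToyProcess good{[&] { out_good.value = in.value; }, {&in}};
  ToyProcess bad{[&] { out_bad.value = in.value; }, {}};
  ToyKernel k;
  k.add(&good);
  k.add(&bad);
  k.initialize();     // both run once and copy the initial 0
  k.write(in, 42);    // only `good` is re-triggered
  k.write(in, 0xAB);  // only `good` is re-triggered
  return {good.run_count, bad.run_count, out_good.value, out_bad.value};
}
```

Calling run_toy_experiment() reports three runs for the sensitized copy and one for the other. A run count that never rises above one is the signature to look for when an output stops tracking its input.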

Why port binding matters — and what happens when you skip it.

The line dut.in_data(sig_in) wires the port to the signal. If you omit it, SystemC will not silently proceed — the simulator will abort at sc_start() with a message like:

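Error: (E109) complete binding failed: port not bound: port 'dut.port_0' (sc_in)

This is actually helpful behavior. In a large design with many ports, unbound port detection catches wiring mistakes before simulation starts. The message identifies the instance and the port that is not connected, which is much friendlier than hunting for a floating wire in a waveform viewer. One caveat: ports declared without an explicit name, as in this post, are reported under auto-generated names such as port_0 (numbered in declaration order). Passing a name string to the port's constructor, for example in_data("in_data") in the module's constructor initializer list, makes the message show the member name instead.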

Port binding syntax: instance.port(signal) — the port name is a member of the instance, and you call it like a function passing the signal as the argument. This is sc_port's operator() overload. In the GitHub repo's CMakeLists.txt there is a helper macro to bind ports in bulk for larger modules; we will use it starting in Post 5.
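To make the operator() idiom concrete, here is a minimal sketch in plain C++ of what "calling the port like a function" amounts to, together with an unbound-port check. These are not the SystemC classes: ToySignal, ToyPortBase, ToyIn, and toy_check_bindings are hypothetical stand-ins, and the real sc_port does considerably more (interfaces, binding policies, hierarchical binding).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these are NOT the SystemC classes.
template <typename T>
struct ToySignal {
  T value{};
};

struct ToyPortBase {
  explicit ToyPortBase(std::string n) : name(std::move(n)) {}
  virtual ~ToyPortBase() = default;
  virtual bool bound() const = 0;
  std::string name;  // hierarchical name used in error messages
};

template <typename T>
struct ToyIn : ToyPortBase {
  using ToyPortBase::ToyPortBase;

  // The binding call: port(signal) just remembers which signal to read.
  void operator()(ToySignal<T>& s) { sig = &s; }

  bool bound() const override { return sig != nullptr; }
  const T& read() const { return sig->value; }

  ToySignal<T>* sig = nullptr;
};

// Hypothetical stand-in for the check performed when simulation starts:
// reject the design if any port was never bound.
inline void toy_check_bindings(const std::vector<const ToyPortBase*>& ports) {
  for (const ToyPortBase* p : ports) {
    if (!p->bound()) {
      throw std::runtime_error("port not bound: " + p->name);
    }
  }
}
```

With two ports a("dut.a") and b("dut.b") and only a(sig) called, toy_check_bindings({&a, &b}) throws and names dut.b. Failing before any process runs is the point: an unbound port would otherwise dereference a null signal on its first read().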

Integration

The pass_through module is deliberately trivial — an 8-bit wire with a name. Its value is structural, not functional. Here is what the same pattern looks like scaled up to a real CPU component (preview for Post 5):

SC_MODULE(alu_32) {
  sc_in<sc_uint<32>>  operand_a;
  sc_in<sc_uint<32>>  operand_b;
  sc_in<sc_uint<4>>   alu_op;
  sc_out<sc_uint<32>> result;
  sc_out<bool>        zero_flag;

  void execute() {
    // ... ALU logic ...
  }

  SC_CTOR(alu_32) {
    SC_METHOD(execute);
    sensitive << operand_a << operand_b << alu_op;
  }
};

Same structure. More ports. Richer process body. The SC_METHOD / sensitive << / port binding pattern is identical.
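As a rough idea of what could sit behind the "ALU logic" placeholder, the sketch below models the computation as a pure function in plain C++, with std::uint32_t standing in for sc_uint<32>. The AluOp encoding, the AluResult struct, and alu_compute are all assumptions made for illustration; the actual alu_op encoding and operation set are defined in Post 5 and may well differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operation codes: the real alu_op encoding may differ.
enum class AluOp : std::uint8_t { Add = 0, Sub = 1, And = 2, Or = 3, Xor = 4 };

struct AluResult {
  std::uint32_t result;
  bool zero_flag;  // true when result is 0, mirroring the zero_flag port
};

// Pure function standing in for the body of execute(): outputs depend
// only on the current inputs, which is what makes the block combinational.
inline AluResult alu_compute(std::uint32_t a, std::uint32_t b, AluOp op) {
  std::uint32_t r = 0;
  switch (op) {
    case AluOp::Add: r = a + b; break;  // wraps modulo 2^32
    case AluOp::Sub: r = a - b; break;  // wraps modulo 2^32
    case AluOp::And: r = a & b; break;
    case AluOp::Or:  r = a | b; break;
    case AluOp::Xor: r = a ^ b; break;
  }
  return {r, r == 0};
}
```

Inside an SC_METHOD, execute() would read the three input ports, call a function of this shape, and write result and zero_flag. Keeping the arithmetic in a free function also makes it easy to unit-test without starting a simulation.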

Where we are in the full CPU build:

[Part 1: Skeleton ✓] → [Part 5: ALU] → [Parts 7-12: Single-Cycle CPU] → [Parts 18-22: Pipeline] → [Parts 23-34: UVM-SystemC VIP]

Every post from here to Part 34 builds on this post. When something breaks in Post 18 (pipeline hazard logic), the first thing you will reach for is this module skeleton to isolate the failing stage — a clean, minimal SC_MODULE with the minimum ports needed to reproduce the problem.

The pass_through module also appears directly in the RISC-V CPU as a placeholder during the incremental build process. In Posts 7–9 (single-cycle CPU integration), we will stub out unimplemented stages with pass_through variants while wiring the datapath, so the simulation compiles and runs even before every unit is complete. This is a standard technique in pre-silicon model development at companies like ARM and SiFive — build the interconnect first, fill in the logic unit by unit.
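The stub-first approach can be illustrated outside SystemC with a small plain C++ sketch: wire a chain of stages where every stage starts as an identity function, then replace stubs one at a time. Stage, pass_through_stage, and run_datapath are hypothetical names for this illustration only, and the real CPU stages exchange many signals rather than a single 32-bit value.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// One datapath stage: takes a 32-bit value, produces a 32-bit value.
using Stage = std::function<std::uint32_t(std::uint32_t)>;

// Identity stage: the software analogue of a pass_through stub.
inline std::uint32_t pass_through_stage(std::uint32_t x) { return x; }

// Push a value through every stage in order, like a wired datapath.
inline std::uint32_t run_datapath(const std::vector<Stage>& stages,
                                  std::uint32_t x) {
  for (const Stage& s : stages) x = s(x);
  return x;
}
```

A datapath built as std::vector<Stage> dp(3, Stage(pass_through_stage)) returns its input unchanged, which proves the wiring. Replacing dp[1] with a real function changes the result while the surrounding stubs stay untouched, so each unit can be checked in isolation as it lands.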

What's Next

Post 2: Simulation Time & Clocks

The pass_through module has no clock — it is purely combinational. Real CPU components are clocked: registers latch on a rising edge, pipelines advance one stage per cycle, and memory reads take multiple cycles.

In Post 2 we will:
- Learn how SystemC models hardware time with sc_time and sc_clock
- Build the clock generator module that all our sequential CPU components will use
- Introduce SC_THREAD — the process type that can call wait(), letting us write clocked behavior in a natural way
- Build a D flip-flop in SystemC and verify it holds its value across clock edges

After Post 2, our modules will have a heartbeat. After Post 3 (resets and initialization), they will be safe to wire together into multi-module hierarchies. The single-cycle CPU starts taking shape in Post 7.

Code for Post 2: GitHub — section1/post02
