Architecture Overview

THE
ARCHITECTURE

A hub-and-spoke hierarchical reasoning system engineered from the ground up for edge-native inference — no cloud, no GPS dependency, no comms requirement.

System Topology

HUB-AND-SPOKE
HIERARCHICAL REASONING

The central Hub orchestrates all reasoning and coordination. Spoke agents handle specialized, domain-specific inference. No Spoke communicates with another — all intelligence flows through the Hub, ensuring security isolation and coordinated autonomy.

HUB · HRM
27M PARAM · DUAL-PROCESS · EDGE-NATIVE

Five Spoke agents radiate from the Hub:

Vision: Optical / IR perception
Navigation: Passive PNT, inertial
Threat: Hazard classification
Comms: Multi-mode radio management
Mission: Task & waypoint logic

Inside the Hub

DUAL-PROCESS
COGNITIVE ARCHITECTURE

L-MODULE — FAST PATH

~15M parameter causal Transformer decoder. Handles rapid, reactive processing — continuous sensory input, immediate hazard response, real-time state updates. Converges to local equilibrium within milliseconds.

H-MODULE — SLOW PATH

~12M parameter encoder-decoder. Performs abstract, deliberative planning — mission logic, route optimization, multi-step reasoning. Updates only after the L-module completes a full cycle, providing strategic context.

Inspired by dual-process cognitive theory. The same architecture your brain uses to drive a familiar road while planning tomorrow's mission — reactive and deliberate, simultaneously.
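The update schedule described above can be sketched in a few lines. This is a minimal illustration of the nesting (the fast L-module iterates to a local equilibrium before each slow H-module update); the function names, state dimensions, and toy dynamics are assumptions for the sketch, not the actual Novawerke implementation.

```python
import numpy as np

def l_step(z_l, z_h, x):
    """Fast path: one reactive update, conditioned on slow-path context."""
    return np.tanh(0.9 * z_l + 0.3 * z_h + 0.2 * x)

def h_step(z_h, z_l):
    """Slow path: deliberative update, run only after the fast loop settles."""
    return np.tanh(0.8 * z_h + 0.5 * z_l)

def hub_cycle(x, n_fast=8, n_slow=3, dim=4):
    z_l = np.zeros(dim)          # L-module state (reactive)
    z_h = np.zeros(dim)          # H-module state (strategic context)
    for _ in range(n_slow):      # each slow step waits for a full fast cycle
        for _ in range(n_fast):  # L-module iterates toward local equilibrium
            z_l = l_step(z_l, z_h, x)
        z_h = h_step(z_h, z_l)   # H-module updates from the settled fast state
    return z_l, z_h
```

The key property the sketch preserves is the timescale separation: the H-module never sees a half-converged L-module state.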

Training Methodology

TWO-PHASE
LEARNING PIPELINE

01

KNOWLEDGE DISTILLATION

The Hub is bootstrapped from a large teacher model using a unified Knowledge Distillation and Reinforcement Learning (KDRL) framework. Temperature-scaled KL divergence provides a rich training signal — teaching not just correct answers but the reasoning relationships between possible outputs.
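Temperature-scaled KL distillation can be written out directly. The sketch below is a standard formulation (softened teacher and student distributions, loss scaled by T² to keep gradient magnitudes stable as T grows); the function names and the choice of T are illustrative assumptions, not details from the KDRL framework itself.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened distribution over outputs."""
    z = (logits - logits.max()) / T
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Higher T flattens the teacher's distribution, exposing the
    relationships between possible outputs, not just the argmax.
    """
    p = softmax(teacher_logits, T)   # soft targets from the teacher
    q = softmax(student_logits, T)   # student predictions
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)))
```

When student and teacher agree exactly, the loss is zero; the richer signal comes from the nonzero mass the softened teacher places on plausible but non-argmax outputs.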

A bidirectional distillation cycle follows: as the Hub discovers novel high-reward reasoning paths through operational experience, those discoveries refine the original teacher — creating a co-evolving system where both models improve iteratively.

02

MA-GFLOWNET REINFORCEMENT LEARNING

Multi-Agent Generative Flow Networks (MA-GFlowNets) enable the Hub and Spokes to learn collaboratively using the Centralized Training with Decentralized Execution (CTDE) paradigm. Unlike traditional RL that seeks a single optimal policy, GFlowNets sample diverse high-reward solutions — critical for contested environments where multiple valid approaches must be available.

Monte Carlo Tree Search (MCTS) with PUCT-based selection guides exploration, dramatically improving sample efficiency. Dynamic reward shaping via Bayesian Active Learning by Disagreement (BALD) ensures the system continuously focuses training on the most uncertain, highest-value regions of the solution space.
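For readers unfamiliar with PUCT-based selection, here is a minimal sketch of the scoring rule in the AlphaZero-style form. The node fields, the dict representation, and the constant c_puct are assumptions for illustration only.

```python
import math

def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value plus an exploration bonus that favors
    high-prior, rarely visited children."""
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + u

def select_child(children, parent_visits, c_puct=1.5):
    """MCTS selection step: descend to the child maximizing PUCT."""
    return max(
        children,
        key=lambda c: puct_score(c["q"], c["prior"],
                                 parent_visits, c["visits"], c_puct),
    )
```

The visit-count denominator is what buys sample efficiency: untried children with strong priors get explored early, while well-estimated branches are scored almost purely on value.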

Integration & Compliance

PLATFORM-AGNOSTIC
BY DESIGN

Novawerke's architecture integrates into existing platform systems without requiring hardware redesign. The open-source core is structurally aligned with DoD Modular Open Systems Approach (MOSA) policy.

MOSA COMPLIANT

Hub-and-spoke open architecture meets DoD MOSA requirements. Integration does not create vendor lock-in — it helps partners meet their own contractual obligations.

LOW-SWAP NATIVE

Designed for SoC-class edge hardware. Full inference capability on size-, weight-, and power-constrained platforms — maritime vessels, UGVs, UAVs, logistics systems.

AIRLOCKED SECURITY

Star topology enforces strict capability isolation. No inter-Spoke communication. All coordination mediated by the Hub. Architecturally prevents lateral movement attacks.
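The enforcement property is simple to state in code: every message crosses one mediated chokepoint. This is a hypothetical sketch of hub-mediated routing; the class and method names are illustrative, not the actual interface.

```python
class Hub:
    """Star-topology router: Spokes hold no references to each other,
    so all coordination passes through one auditable chokepoint."""

    def __init__(self):
        self.spokes = {}

    def register(self, name, handler):
        self.spokes[name] = handler

    def route(self, sender, recipient, message):
        """Deliver a message only between registered endpoints.

        A compromised Spoke cannot address a peer directly, which is
        what architecturally blocks lateral movement.
        """
        if sender not in self.spokes or recipient not in self.spokes:
            raise PermissionError(f"unknown endpoint: {sender} -> {recipient}")
        # policy / isolation checks live here, in one place
        return self.spokes[recipient](sender, message)
```

Because Spokes receive only a handle to the Hub at registration time, there is no code path for Spoke-to-Spoke traffic to exist, rather than a rule forbidding it.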
