A hub-and-spoke hierarchical reasoning system engineered from the ground up for edge-native inference — no cloud, no GPS dependency, no comms requirement.
The central Hub orchestrates all reasoning and coordination. Spoke agents handle specialized, domain-specific inference. No Spoke communicates with another — all intelligence flows through the Hub, ensuring security isolation and coordinated autonomy.
Optical / IR perception
Passive PNT, inertial
Hazard classification
Multi-mode radio management
Task & waypoint logic
~15M-parameter causal Transformer decoder. Handles rapid, reactive processing: continuous sensory input, immediate hazard response, real-time state updates. Converges to a local equilibrium within milliseconds.
~12M-parameter encoder-decoder. Performs abstract, deliberative planning: mission logic, route optimization, multi-step reasoning. Updates only after the fast reactive module (the L-module) completes a full cycle, then supplies strategic context for that module's next cycle.
Inspired by dual-process cognitive theory: the same division of labor that lets you drive a familiar road while planning tomorrow's mission, reactive and deliberate at the same time.
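The two-timescale update rule described above can be sketched as a toy scheduler. This is only an illustration of the timing relationship, not the actual modules: the class name, the `fast_steps_per_cycle` value of 4, and the "P" label for the planning module are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TwoTimescaleLoop:
    """Toy scheduler (hypothetical): the fast module steps on every tick,
    the slow planning module steps once per completed fast cycle."""

    fast_steps_per_cycle: int = 4  # assumed cycle length, not from the source
    fast_updates: int = 0
    slow_updates: int = 0
    log: List[str] = field(default_factory=list)

    def tick(self) -> None:
        # The reactive L-module updates on every tick.
        self.fast_updates += 1
        self.log.append("L")
        # The planning module updates only when a full fast cycle has finished.
        if self.fast_updates % self.fast_steps_per_cycle == 0:
            self.slow_updates += 1
            self.log.append("P")


loop = TwoTimescaleLoop(fast_steps_per_cycle=4)
for _ in range(8):
    loop.tick()
print("".join(loop.log))  # prints "LLLLPLLLLP"
```

Eight ticks give eight fast updates and two planning updates, so the planner always acts on a completed low-level cycle rather than a half-finished one.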
The Hub is bootstrapped from a large teacher model using a unified Knowledge Distillation and Reinforcement Learning (KDRL) framework. Temperature-scaled KL divergence provides a rich training signal: the Hub learns the teacher's relative preferences among possible outputs, not just which answer is correct.
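As a rough illustration of the temperature-scaled KL term, here is a minimal standard-library sketch. The function names, the example logits, and the temperature of 2.0 are assumptions for the example. Multiplying by the squared temperature is a common convention in the distillation literature, not something the text above specifies.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,  # assumed value for illustration
) -> float:
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    # Common convention (an assumption here): rescale by T^2 so gradient
    # magnitudes stay comparable across temperatures.
    return temperature ** 2 * kl


teacher = [4.0, 1.0, 0.5]  # made-up logits
student = [2.0, 1.5, 0.5]
print(distillation_kl(teacher, student))
```

A higher temperature flattens both distributions, so the loss weighs how the teacher ranks the less likely outputs rather than only its top choice.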
A bidirectional distillation cycle follows: as the Hub discovers novel high-reward reasoning paths through operational experience, those discoveries refine the original teacher — creating a co-evolving system where both models improve iteratively.
Multi-Agent Generative Flow Networks (MA-GFlowNets) enable the Hub and Spokes to learn collaboratively using the Centralized Training with Decentralized Execution (CTDE) paradigm. Traditional RL seeks a single reward-maximizing policy. GFlowNets instead learn to sample solutions with probability proportional to their reward, producing a diverse set of high-reward options. That diversity is critical in contested environments, where multiple valid approaches must be available.
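One widely used GFlowNet training objective is trajectory balance, sketched below for a single agent. Whether this system uses that objective is not stated above, so treat it as an illustrative stand-in. The route names and rewards are invented, and the multi-agent and CTDE aspects are left out.

```python
import math
from typing import Dict, Sequence


def trajectory_balance_loss(
    log_z: float,
    log_pf: Sequence[float],
    log_pb: Sequence[float],
    reward: float,
) -> float:
    """Squared trajectory-balance residual for one complete trajectory:
    (log Z + sum log P_F - log R - sum log P_B)^2."""
    residual = log_z + sum(log_pf) - math.log(reward) - sum(log_pb)
    return residual ** 2


# Hypothetical one-step problem: three candidate routes with made-up rewards.
rewards: Dict[str, float] = {"route_a": 1.0, "route_b": 2.0, "route_c": 1.0}
z = sum(rewards.values())
log_z = math.log(z)

# A sampler that picks each route with probability reward / Z has zero loss.
# Each route has one parent (the root), so log P_B = 0.
proportional_losses = {
    name: trajectory_balance_loss(log_z, [math.log(r / z)], [0.0], r)
    for name, r in rewards.items()
}

# A greedy sampler that always picks the best route (P_F = 1) does not
# satisfy the balance condition for the same Z.
greedy_loss = trajectory_balance_loss(log_z, [0.0], [0.0], rewards["route_b"])

print(proportional_losses, greedy_loss)
```

The loss vanishes only when routes are sampled in proportion to reward, which is what keeps lower-reward but still valid options in play.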
Monte Carlo Tree Search (MCTS) with PUCT-based selection guides exploration toward promising branches, improving sample efficiency. Dynamic reward shaping via Bayesian Active Learning by Disagreement (BALD) focuses training on the regions of the solution space where the model is most uncertain and the potential value is highest.
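The two scoring rules named above can be written down compactly. The sketch below uses the PUCT form popularized by AlphaZero-style search and the standard BALD mutual-information estimate over ensemble members. The constant `c_puct=1.5` and all example numbers are assumptions, and how the BALD score feeds into the reward is not specified above, so it is not modeled.

```python
import math
from typing import List, Sequence


def puct_score(
    q_value: float,
    prior: float,
    parent_visits: int,
    child_visits: int,
    c_puct: float = 1.5,  # assumed exploration constant
) -> float:
    """PUCT: exploit the value estimate, explore high-prior rarely visited children."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(member_probs: Sequence[Sequence[float]]) -> float:
    """BALD: entropy of the mean prediction minus the mean member entropy.
    High when ensemble members are individually confident but disagree."""
    n = len(member_probs)
    mean: List[float] = [sum(col) / n for col in zip(*member_probs)]
    return entropy(mean) - sum(entropy(p) for p in member_probs) / n


# An unvisited child with a strong prior outranks a visited, mediocre one.
visited = puct_score(q_value=0.5, prior=0.2, parent_visits=16, child_visits=3)
unvisited = puct_score(q_value=0.0, prior=0.6, parent_visits=16, child_visits=0)

# Members that agree carry no disagreement signal; members that conflict do.
agree = bald_score([[0.9, 0.1], [0.9, 0.1]])
disagree = bald_score([[1.0, 0.0], [0.0, 1.0]])
print(visited, unvisited, agree, disagree)
```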
Novawerke's architecture integrates into existing platform systems without requiring hardware redesign. The open-source core is structurally aligned with DoD Modular Open Systems Approach (MOSA) policy.
Hub-and-spoke open architecture meets DoD MOSA requirements. Integration does not create vendor lock-in — it helps partners meet their own contractual obligations.
Designed for SoC-class edge hardware. Full inference capability on size-, weight-, and power-constrained (SWaP) platforms: maritime vessels, UGVs, UAVs, logistics systems.
Star topology enforces strict capability isolation. No inter-Spoke communication. All coordination mediated by the Hub. Architecturally prevents lateral movement attacks.
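The isolation rule can be illustrated with a toy router in which Spokes hold no references to each other and only Hub-originated requests are delivered. The `Hub` class, the Spoke names, and the string-based sender check are hypothetical simplifications. A real system would authenticate senders rather than trust a name.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class Hub:
    """Hypothetical hub: the only component allowed to address a spoke."""

    NAME = "hub"

    def __init__(self) -> None:
        self._spokes: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._spokes[name] = handler

    def route(self, sender: str, target: str, payload: str) -> str:
        # Star topology: reject anything that does not originate at the hub.
        if sender != self.NAME:
            raise PermissionError(f"{sender!r} may not address {target!r} directly")
        return self._spokes[target](payload)

    def handle_report(self, sender: str, payload: str) -> str:
        # Spokes report upward; the hub decides which spoke to task next.
        if sender == "perception" and "obstacle" in payload:
            return self.route(self.NAME, "navigation", payload)
        return "no action"


hub = Hub()
hub.register("perception", lambda msg: f"perception saw: {msg}")
hub.register("navigation", lambda msg: f"navigation replans around: {msg}")

# Allowed path: perception -> hub -> navigation.
result = hub.handle_report("perception", "obstacle ahead")

# Blocked path: perception -> navigation.
try:
    hub.route("perception", "navigation", "obstacle ahead")
    blocked = False
except PermissionError:
    blocked = True
print(result, blocked)
```

Because every message passes through one checkpoint, a compromised Spoke has no direct channel to another Spoke.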