BERT/GPT with Inner-Thinking Cycles: Iterative Refinement via Dynamic Head Routing
A novel approach introducing inner-thinking cycles to standard transformers — achieving deeper reasoning without increasing parameters.
Where mathematical elegance meets infinite possibility. Advancing the boundaries of efficient AI architectures.
PoT (Pointer-over-Heads Transformer) is built around a simple idea: instead of producing output in one forward pass, the model thinks through its representations over several refinement steps.
At every step, the model looks at its current hidden states and asks: "Given what I know now, how should I use my attention heads to refine this understanding?"
This process is not about memorizing — it's about progressive self-correction. PoT doesn't just compute token embeddings — it thinks within them.
Apply the transformer stack R times for multi-step reasoning. By the final iteration, embeddings encode a richer, more internally consistent view.
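The loop above can be sketched in a few lines of pure Python. This is a toy illustration, not PoT's actual code: `transformer_stack` is a hypothetical stand-in for the shared layers, and the mixing rule inside it is invented purely so the example runs.

```python
R = 4  # number of inner-thinking refinement steps (illustrative)

def transformer_stack(hidden):
    # Placeholder for the shared transformer layers: a toy "refinement"
    # that nudges each hidden value toward the running mean.
    mean = sum(hidden) / len(hidden)
    return [0.5 * h + 0.5 * mean for h in hidden]

def refine(hidden, steps=R):
    # Apply the SAME stack repeatedly. Because the weights are shared
    # across steps, reasoning depth grows without adding parameters.
    for _ in range(steps):
        hidden = transformer_stack(hidden)
    return hidden

states = refine([1.0, 3.0, 5.0])
```

Each pass reads the previous pass's hidden states, so the representation becomes more internally consistent step by step; here the toy states contract toward their shared mean.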
A fast component adapts every step. A slow component maintains broader contextual plans, forming hierarchical reasoning.
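The two-timescale idea can be sketched as two exponential moving averages with different rates. The function name and the specific rates (0.9 fast, 0.1 slow) are assumptions chosen only to make the behavior visible, not values from PoT:

```python
def two_timescale_step(fast, slow, signal, fast_lr=0.9, slow_lr=0.1):
    # Fast component: adapts aggressively to the current step's signal.
    # Slow component: drifts gradually, holding a broader contextual plan.
    fast = (1 - fast_lr) * fast + fast_lr * signal
    slow = (1 - slow_lr) * slow + slow_lr * signal
    return fast, slow

fast, slow = 0.0, 0.0
for signal in [1.0, 1.0, 1.0, 0.0]:  # steady signal, then a sudden drop
    fast, slow = two_timescale_step(fast, slow, signal)
```

After the drop, the fast state collapses almost immediately while the slow state barely moves, which is the hierarchy the text describes: per-step adaptation layered under a slowly evolving plan.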
Dynamic-Routing Transformer with Iterative Refinement
Dynamically select or weight attention heads per token via a differentiable softmax.
Two-timescale recurrent modules for fast adaptation and strategic planning.
Controller → α weights → Weighted Multi-Head Attention → SwiGLU FFN.
16 total reasoning steps with dynamic head routing at each cycle.
Supports Transformer, Mamba, and Diffusion depth controllers.
6 modes including broadcast, film, depth_token, and alpha_gated.
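The routing pipeline (Controller → α weights → Weighted Multi-Head Attention) can be sketched as follows. This is a minimal sketch under stated assumptions: `route_heads` is a hypothetical helper, the controller logits are given directly, and the head outputs are toy vectors rather than real attention outputs.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_heads(controller_logits, head_outputs):
    # The controller scores each head; softmax turns the scores into
    # weights alpha; the heads' outputs are mixed by those weights
    # instead of being combined with fixed, equal contributions.
    alphas = softmax(controller_logits)
    dim = len(head_outputs[0])
    return [sum(a * h[d] for a, h in zip(alphas, head_outputs))
            for d in range(dim)]

# Two heads with 2-dim outputs; the controller strongly prefers head 0.
mixed = route_heads([4.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because the weighting is a softmax rather than a hard argmax, the selection stays differentiable, so the controller can be trained end to end with the rest of the stack; in the full model the mixed output would then feed the SwiGLU FFN.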
Experience inner-thinking cycles with our interactive Sudoku solver
9x9 Sudoku Solver
Watch the model think through multiple refinement cycles, adjusting its reasoning in real-time. All reasoning happens in the embedding space — no chain-of-thought tokens required.
Launch Demo

The missing part in ChatGPT is real. The gap is real, and it can be quantified.
As I showed in the demo: a 20-million-parameter model running on a CPU beats any ChatGPT that exists today on B200 GPUs at inference time.
That gap by itself constitutes a new field, Symbol, that can extend across all markets: from Symbolic Robot tasks to Symbolic Routers.
We discovered a new "Equivalence Class" of symbolic systems so unfamiliar they read as scientific aliens, with real potential to impact the market.


Founder & Researcher
Pioneering the next generation of efficient AI architectures. Focused on bridging the gap between massive language models and lightweight symbolic reasoning systems that outperform at a fraction of the cost.
Ready to explore the second order of intelligence?