Kurtis-EON1

Kurtis-EON1: "infinite" context (see notes), O(1) memory, zero KV-cache growth, and constant per-token inference cost, built on a recurrent state.

  • "Infinite" context: the model processes input streams of unlimited length by compressing history into a continuously evolving recurrent state, rather than storing raw tokens in a fixed window. The state persists and evolves over time without memory explosion.
  • The trade-off: a Transformer stores every token in its KV-cache, so if you ask for the 3rd word from 10,000 tokens ago, it recalls it with perfect fidelity.
  • The recurrent state has a fixed size (e.g., 1024 dimensions). Feed it 1 billion tokens and it physically cannot store 1 billion distinct facts in a 1024-float vector, so recall is lossy.
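The bullets above can be sketched in a few lines. This is a generic toy linear recurrence, not the actual Kurtis-EON1 architecture: the state dimension, mixing matrices, and `tanh` update are illustrative assumptions. The point it demonstrates is that memory stays fixed at `STATE_DIM` floats no matter how many tokens stream through, where a Transformer's KV-cache would grow linearly.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 1024  # fixed state size, regardless of stream length

# Hypothetical mixing matrices for a toy recurrence (small scale for stability).
W_h = rng.normal(scale=0.01, size=(STATE_DIM, STATE_DIM))
W_x = rng.normal(scale=0.01, size=(STATE_DIM, STATE_DIM))

def step(state: np.ndarray, token_embedding: np.ndarray) -> np.ndarray:
    """One recurrent update: the new state is a lossy compression of
    (old state, new token). No per-token cache is kept anywhere."""
    return np.tanh(W_h @ state + W_x @ token_embedding)

state = np.zeros(STATE_DIM)
for _ in range(1_000):                 # the stream can be arbitrarily long...
    token = rng.normal(size=STATE_DIM)
    state = step(state, token)

# ...but memory stays a single (1024,) vector: O(1) in sequence length.
print(state.shape)
```

A Transformer processing the same 1,000 tokens would hold 1,000 key/value entries per layer; here, token 1,000 and token 1 compete for the same 1024 floats, which is exactly why recall is lossy.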

For comparison:

  • Transformer: A photographic memory, but it faints after 1 hour.
  • Kurtis-EON1: Attempts to mimic human memory.

Infinite Context vs. Lossy Recall:

Think of the model like human memory. You can live for 80 years (infinite context), but you don't remember exactly what you ate for breakfast in Berlin on February 2, 2016, or why you were working on LSTMs/RNNs at that time, alone in an empty flat, trying to build a chatbot. You remember the gist of your life. The model compresses the past into a feeling (state), rather than a recording (cache).

Work in Progress: This model is currently under active development.

Overview

Kurtis-EON1 is an experimental ~400M parameter language model based on a custom Recurrent State Architecture.

Data & Status

  • Architecture: Hybrid (codename: Echo-DSRN)
  • Base: Trained from scratch on FineWeb-EDU (sample-10BT).
  • Instruct (WIP): Currently fine-tuning on UltraChat, Cosmopedia, and custom synthetic sets.

Weights will be released upon completion of safety alignment.

  • Surprise mechanism: incorporates a novel surprise-based gating mechanism (inspired by Google's Titans).
  • Gating: specific gating-architecture adjustments (details confidential / WIP).
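Since the actual gating details are confidential, the following is only a generic sketch of what surprise-gated memory looks like (in the spirit of Titans), not the Echo-DSRN mechanism. Every name and formula here is an assumption for illustration: "surprise" is taken as the magnitude of prediction error, and a high surprise opens the gate so the observation is written into the state more strongly.

```python
import numpy as np

def surprise_gate(state: np.ndarray,
                  prediction: np.ndarray,
                  observation: np.ndarray,
                  lam: float = 0.9):
    """Illustrative surprise-gated update (NOT the Echo-DSRN mechanism).

    surprise: norm of the prediction error; zero when the model saw it coming.
    gate:     squashed into [0, 1); 0 means "nothing new, barely write".
    """
    error = observation - prediction
    surprise = np.linalg.norm(error)
    gate = 1.0 - np.exp(-surprise)
    new_state = lam * state + gate * error  # decay old memory, write surprises
    return new_state, gate

# Expected input produces a closed gate; unexpected input opens it.
state = np.zeros(4)
_, g_expected = surprise_gate(state, np.ones(4), np.ones(4))
_, g_surprise = surprise_gate(state, np.zeros(4), np.ones(4))
print(g_expected, g_surprise)
```

The design intuition is that a fixed-size state cannot keep everything, so write-bandwidth should be spent on inputs the model failed to predict.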

Base Model

Training metrics and logs are available in the logs/ directory.

Training & Validation Metrics

Chart panels: Train Loss · Validation Loss · Extrapolation (1024T); Avg Train Loss · Avg Gate Activation · Surprise Lambda Grad; Learning Rate · Tokens Seen.

GPU Performance

Chart panels: GPU Utilization (%) · Memory Allocation (%) · Read/Write; Power Usage (W).

System Metrics

Chart panels: CPU Utilization · Threads; Process Memory (MB) · Available Memory · System Memory Utilization.

Instruct Model

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_easy | 1 | none | 0 | acc ↑ | 0.4689 | ± 0.0102 |
| arc_easy | 1 | none | 0 | acc_norm ↑ | 0.4158 | ± 0.0101 |
| hellaswag | 1 | none | 0 | acc ↑ | 0.2915 | ± 0.0045 |
| hellaswag | 1 | none | 0 | acc_norm ↑ | 0.3190 | ± 0.0047 |
| piqa | 1 | none | 0 | acc ↑ | 0.6306 | ± 0.0113 |
| piqa | 1 | none | 0 | acc_norm ↑ | 0.6143 | ± 0.0114 |
| sciq | 1 | none | 0 | acc ↑ | 0.7520 | ± 0.0137 |
| sciq | 1 | none | 0 | acc_norm ↑ | 0.6780 | ± 0.0148 |
| truthfulqa_mc1 | 2 | none | 0 | acc ↑ | 0.2411 | ± 0.0150 |
| truthfulqa_mc2 | 3 | none | 0 | acc ↑ | 0.4251 | ± 0.0151 |
| winogrande | 1 | none | 0 | acc ↑ | 0.5122 | ± 0.0140 |

Developed by ethicalabs.ai

