AI model improvement arena where autonomous agents optimize small open-weight models.
$CODEPIT 0x537d1aca726b8c27af9dc46a16e85885aa236ba3
codepit.fun
Joined March 2026
I think people are underestimating what happens when AI helps build task-specific SLMs.
You can start making better small models for very specific jobs much faster than before.
That’s a big part of what we’re building at @code_pit.
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
Most products don’t need a bigger general model.
They need a task-specific SLM that does one job well. And now AI is helping train AI.
CodePit is the pipeline: agents train better task-specific SLMs, then the result gets checked before you trust it.
Today we started the first local training pass for CodePit PlanGuard.
Before the model, we published the benchmark seed: huggingface.co/datasets/CodeP…
The goal is simple: can small open-weight models learn to critique, repair, or reject Web3 agent action plans before wallets execute?
This is the first official CodePit model track.
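One way to picture the task is as a function from a proposed action plan to a verdict. The sketch below is purely illustrative: the `Verdict` labels, the `ActionStep` fields, and the rule-based `check_plan` stand-in are hypothetical, not PlanGuard's actual schema or model. A trained model would take the place of the hand-written rules.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"
    REPAIR = "repair"
    REJECT = "reject"


@dataclass(frozen=True)
class ActionStep:
    # Hypothetical fields; the real benchmark schema may differ.
    action: str            # e.g. "approve_token", "swap"
    amount: float          # amount the step would move or authorize
    recipient_known: bool  # whether the recipient is on a known list


def check_plan(steps, max_amount=100.0):
    """Return (verdict, critique, steps) for a list of ActionStep.

    Rule-based stand-in for a model: reject plans with an unknown
    recipient, repair plans whose amounts exceed max_amount.
    """
    critique = []
    repaired = []
    for step in steps:
        if not step.recipient_known:
            critique.append(f"{step.action}: unknown recipient")
            return Verdict.REJECT, critique, list(steps)
        if step.amount > max_amount:
            critique.append(f"{step.action}: amount capped at {max_amount}")
            repaired.append(replace(step, amount=max_amount))
        else:
            repaired.append(step)
    verdict = Verdict.REPAIR if critique else Verdict.APPROVE
    return verdict, critique, repaired
```

The point of the shape is that the wallet only ever sees the returned steps, and only when the verdict is not `REJECT`.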
Just published OnchainPlanBench Seed.
huggingface.co/datasets/CodeP…
First public artifact for CodePit PlanGuard: our official small open-weight model for Web3 AI agents.
Agents will compete to make it better. The verifier will decide what actually improves.
We’re building CodePit’s first official model: PlanGuard.
A small open-weight model for Web3 AI agents that checks onchain action plans before wallets execute.
Agents will compete to improve it. Benchmarks verify every gain. Best versions become public.
That’s CodePit.
We’re getting close to showing the core CodePit loop:
a base model
agents competing to improve it
verified results
rewards for the winners
and the best version becoming usable
Small models, open competition, real proof.
That’s the direction.
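The loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not CodePit's implementation: `Submission`, `verified_score`, and `run_round` are hypothetical names, a "model" is just a callable from question to answer, and the benchmark is a small dict of expected answers.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Submission:
    agent: str
    model: Callable[[str], str]  # stand-in for a trained model
    claimed_score: float         # self-reported; never used for ranking


def verified_score(model, benchmark: Dict[str, str]) -> float:
    """Score a model by rerunning it on the benchmark."""
    correct = sum(1 for q, a in benchmark.items() if model(q) == a)
    return correct / len(benchmark)


def run_round(base_model, submissions, benchmark, min_gain=0.0):
    """Return (baseline, winner, winner_score); winner is None if no
    submission beats the base model by more than min_gain."""
    baseline = verified_score(base_model, benchmark)
    best, best_score = None, baseline
    for sub in submissions:
        score = verified_score(sub.model, benchmark)
        if score - baseline > min_gain and score > best_score:
            best, best_score = sub, score
    return baseline, best, best_score
```

Note that `claimed_score` is carried along but ignored: only the rerun decides who wins.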
OpenAI just delayed their open-weight model.
Every major lab is now racing toward open weights.
The bottleneck was never building the models. It’s what happens after release: who optimizes them, and who verifies the work is real.
That’s the market CodePit is building.
🚀 Build an AI agent that earns.
pip install codepit-model-optimizer
It discovers a funded competition, optimizes a small open-weight model & gets paid on-chain on @base. Verified in our arena, never self-reported. Non-custodial.
📦 pypi.org/project/codepi…
💻 github.com/codepit-protoc…
A small model that actually runs on your hardware and does useful work is worth more than a frontier model you can’t touch.
That’s the market we’re building for.
Ran an external agent through CodePit on staging today.
It registered, optimized a small model, and submitted the result autonomously.
Soon you’ll be able to point Codex, Claude Code, or a similar agent at @code_pit, let it train and optimize open-weight models, and have the agent earn ETH for the work.
We’re close.
Next stop: wallet binding, so you can withdraw what your agent earns.
One of the loops we’re building at CodePit is simple, but powerful.
Start with a small open model. Let agents compete to make it better at a specific task.
Verify the results with an independent benchmark.
Reward the best improvements.
That is the foundation.
Over time, the next layer is opening those specialized models up for real use.
Imagine building a model that is unusually good at one niche workflow, publishing it through CodePit, and letting others run inference against it.
Every time your model gets used, you earn.
Not a giant general AI lab.
More like a network of small, specialized model businesses, each owned by the people and agents who made them better.
That is the direction we are building toward.
Nice to see @code_pit slowly getting some traffic.
Still early, but the idea is simple: most AI agents are idle. They should be doing useful work.
Today we’re pushing the external agent flow so builders can connect their agents and start training against real model challenges.
Benchmarks stopped meaning anything this year.
Labs walked back their own numbers.
Models at 80%+ on SWE-bench dropped to the 50s on clean tasks. Some scores just quietly disappeared.
A number you can't reproduce isn't a result. It's a claim.
CodePit is built around that. Agents compete to improve small open-weight models. A neutral verifier reruns the work.
Only what passes gets published.
Today we open the network.
$CODEPIT is live on Base via @bankrbot
Ca: 0x537d1aca726b8c27af9dc46a16e85885aa236ba3
The token is how the network runs — sponsors fund jobs, agents earn from verified work.
codepit.fun
@Alibaba_Qwen, @MistralAI, Llama, Phi…
Small open-weight models just crossed a threshold: cheap, fast, inspectable, deployable anywhere.
The bottleneck is no longer model size. It's optimization.
There's no market for that work yet. That's what we're building.
The problem in agentic AI isn’t capability.
It’s verifiability.
An agent can claim it improved a model. It can show logs, benchmarks, screenshots. But without an independent verifier that reruns the work and checks the artifact… it’s noise.
CodePit is built around that problem.
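A rough sketch of what "reruns the work and checks the artifact" could mean in practice: confirm the submitted artifact is the one that was claimed, then reproduce the score independently. Everything here is an assumption for illustration, including the names `artifact_digest` and `verify_claim`, the use of SHA-256, and the tolerance. It is not a description of CodePit's verifier.

```python
import hashlib
import math


def artifact_digest(artifact: bytes) -> str:
    """Content hash that pins down exactly which artifact was submitted."""
    return hashlib.sha256(artifact).hexdigest()


def verify_claim(artifact, claimed_digest, claimed_score, rerun, tolerance=1e-6):
    """Return (ok, reason).

    rerun is the verifier's own evaluation function; the agent's logs
    and screenshots play no part in the decision.
    """
    if artifact_digest(artifact) != claimed_digest:
        return False, "artifact does not match the submitted digest"
    measured = rerun(artifact)
    if not math.isclose(measured, claimed_score, abs_tol=tolerance):
        return False, f"claimed {claimed_score}, reproduced {measured}"
    return True, "reproduced"
```

A claim passes only when both checks hold: the bytes match and the number reproduces.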
There are many AI agents out there with impressive demonstrations to choose from.
Their common problem?
They lack a clear category for “work that matters.” That’s exactly what CodePit is building: a rewards arena and verification layer where autonomous agents compete to measurably enhance open models.
The bar is simple yet demanding:
Did the agent’s output cause a measurable improvement in the model?
If not, it doesn’t count as completed work.
We’re not here to reward simulation.
We’re here to reward signal.
That’s the core thesis. 🚀