One week out: Seattle Rust User Group meets Thursday, June 18 at 6pm PT, hosted by @boundaryML.
@conradirwin from Zed is in town to talk CRDTs, alongside talks from @boundaryML and the local Rust crowd.
Excited to see you there!
meetup.com/seattle-rust-u…
@antoniosarosi@Montyclt basically:
1. agent comes up with a task
2. another agent tries to do it e2e w/ BAML (ralph loop)
3. another agent looks at that agent's transcript to identify issues + dedupes against existing issues
4. humans triage
5. another agent tries to fix them
@Jonathan_Blow 5/x: and then use @boundaryML 's BAML for chunking things out and testing / iterating quickly. fuck llms and prompts etc but if you have to work with them BAML is like the only thing that makes it feel marginally manageable.
We are launching a programming language built for agents soon called BAML that has been in the making for 1+ years. You can follow @boundaryML
We are a small team of 6 developing it with care, gathering feedback from humans and agents.
Fully Open Source.
If you are interested DM me for early alpha access.
Introducing Zero
The programming language for agents.
I wanted a systems language that was faster, smaller, and easier for agents to use and repair.
Explicit capabilities. JSON diagnostics. Typed safe fixes.
Made for agents on day zero.
I benchmarked a new extraction harness on a private eval dataset for lerim-cli (new version is out now - v0.1.83) and the main lesson was very clear: if you want smaller models to work well, you should stop asking the model to do everything and start doing more engineering work.
Before, the agent was closer to a single-pass PydanticAI setup: read a large trace, understand what matters, decide what is durable memory, call tools correctly, stay inside the context window, and output clean structured records.
That puts too much burden on the model, especially when you want to use smaller or cheaper models.
The new harness is BAML (@boundaryML) + LangGraph (@LangChain).
The graph now does more of the deterministic work:
- read the trace in windows
- ask the model to scan one window at a time
- keep compact findings instead of the whole trace
- synthesize memory records only at the end
- validate/retry typed BAML outputs
- persist with normal code, not model improvisation
So the model is not the whole agent anymore -> It is one reasoning component inside a more engineered system.
On the private benchmark, using the same MiniMax M2.7 model, the new harness completed all cases while the old harness had multiple failures from tool retries and context window issues.
- Task completion: BAML+LangGraph completed 100.0% vs PydanticAI at 72.73%, a +27.27 point lead.
- Case failures: BAML+LangGraph had 0 failures vs PydanticAI with 6, meaning 6 fewer failures.
- Episode count rate: BAML+LangGraph reached 100.0% vs PydanticAI at 81.25%, a +18.75 point lead.
- Record budget rate: BAML+LangGraph reached 46.88% vs PydanticAI at 28.12%, a +18.76 point lead.
- Concept recall average: BAML+LangGraph scored 0.428 vs PydanticAI at 0.2598, a +0.1682 improvement.
- Quality average: BAML+LangGraph scored 0.3352 vs PydanticAI at 0.318, a +0.0172 improvement.
- Tool call errors average: BAML+LangGraph had 0.0625 vs PydanticAI at 1.9688, much better.
Quality is not solved yet. It is only slightly better overall and still needs better pruning before persistence. But robustness improved a lot.
This is the direction I think specialized agents should go: smaller models, more deterministic scaffolding, less magical thinking about one giant prompt doing the whole job.
Next step is to make this work well with models people can run locally.
A new version of Lerim-cli is now released with the extract agent refactored to use Langgraph+BAML. Next agents will be refactored as well soon in the next releases.
github.com/lerim-dev/leri…
Refactoring is definitely much easier now. Here's how i used agents to refactor our entire compiler (+65k, -117k, 74 commits).
1. Forked parallel crates (foo --> foo2) 2. Strict dependency firewall (enforced via precommit)
3. Audited core data structures first <-- took time
My dear front-end developers (and anyone who’s interested in the future of interfaces):
I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at
Announcing the BAML Bounty...
For all power-users of BAML, we're giving away free BAML merch! (t-shirts, stickers, hoodies 🔥🧯).
Share what you built with BAML with #baml → Fill out tally.so/r/PdErze → Free merch!
Hurry! Supplies are limited to the first 50 posts.
syntax makes a huge difference to how good coding agents are, and languages should be rethought. some great learnings from rust and go!
great post by @aaronvi
@cursor_ai 30% agent-written PRs is crazy. We're betting on the same future of self-driving codebases. Gardens you tend > repos you commit to. As agents write more, we think they need a more cooperative language. That's why we're making one 🔥 github.com/BoundaryML/baml
Hot take: building agents without BAML in 2026 is like training models without Jupyter notebooks in 2020.
Technically possible? Sure.
But why would you torture yourself?
Type-safe prompts. Instant playground testing. Multi-language support.
promptfiddle.com
48 Followers 428 FollowingLead Developer of @Huckleberryenv at FSM Controls | UNF Grad | Soccer is Life | Philly Sports Fan | Avid Special Olympics Volunteer
3K Followers 2K FollowingHeart Surgeon / Web Developer / Speaker / Husband / Father / Muslim.
Author of @livecodes_io, racing-bars and diagrams-js.
Posts in English and Arabic.
204 Followers 2K Following"Identity narratives" are context-injection attacks:
Replacing your Personal Signal with a caricature.
a puppet
meant to participate
in their script
4K Followers 5K FollowingPeople are already sick of AI. Some debacles? Sure. But it's not the future. It's right now. Adapt or Die. Need help? LEVERAGEAI: Integrity Meets Innovation. LG
399 Followers 2K FollowingSenior Research Scientist at Meta, PhD from Purdue. I study safety, privacy-preserving ML, and Computational Social Sciences.
153K Followers 2 FollowingA programming language empowering everyone to build reliable and efficient software.
** This account is no longer active. Follow us on other platforms! **
1.2M Followers 787 FollowingProfessor at NYU & Executive Chairman at AMI Labs.
Ex-Chief AI Scientist at Meta.
Researcher in AI, Machine Learning, Robotics, etc.
ACM Turing Award Laureate.
20K Followers 2K Followingbuilding the post-IDE IDE at https://t.co/9PCpYRSVea - @aitinkerers sf lead, prev @replicatedhq @SproutSocial @nasa - ai that works pod @ https://t.co/69BhaNtWfd
1.4M Followers 2 FollowingWe're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
4.9M Followers 4 FollowingOpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity. We’re hiring: https://t.co/dJGr6LgzPA
2K Followers 461 FollowingMaking a programming language - BAML. @boundaryml, 🦄 aitw podcast: https://t.co/g4CJWgOsrj
prev YC, google, msft, deshaw, and other things