“red team” sounds cool. “blue team” sounds cool. so you’d think “purple team” would be very cool. alas, it is a Slack thread with 19 unresolved comments.
Every release is a high‑wire act. Instead of praying for calm winds, build a net.
EvalOps ties your policies, metrics and audits into a mesh that lets you scale without falling.
EvalOps is where evaluations meet operations — and security is no exception.
“keep” shows how device posture, SSO, and OPA policies can be continuously tested and traced like any other system.
Run it, break it, measure it.
github.com/evalops/keep
Agents are already writing your code. The question isn't "should we use them?" It's "how do we ship them without surprises?"
Provenance gives you a ledger. Every line. Every agent. Every risk. Measurable.
github.com/evalops/proven…
We’re open-sourcing Smith — the Firecracker-based CI runner that powers EvalOps.
Why rebuild Blacksmith?
Because eval gating needs specialized infra — and we’re not forcing you onto our cloud.
Run evals on EvalOps Cloud or your own. github.com/evalops/smith
🔥 Just dropped an evaluation‑driven LoRA loop built on Tinker from @thinkymachines! It trains, benchmarks & iterates until your model meets the mark. It auto‑spots weaknesses, spawns targeted LoRA jobs & tracks improvements.
Proof‑of‑concept repo:
github.com/evalops/tinker…
Sick of yak-shaving to get a clean Transformers setup?
We built a stack that just works:
PyTorch + HF Transformers
Hydra configs
FastAPI serving + Prometheus
vLLM, LoRA, flash-attn, bitsandbytes
Reproducible. Dockerized. CI/CD baked in.
github.com/evalops/stack
Developer resumes are frozen in time. GitHub tells the real story.
7k commits, +1.4M lines → now that’s a holographic trading card worth flexing. 🚀
cards.evalops.dev
LLM vendor: “Just quantization.”
Reality: reward-hacked code, broken workflows, lost week.
Companies: “nbd.”
Users: 🙃🔥
Making this a thing of the past.
All of us have been dazzled by large language models’ ability to spit out code, fix bugs, or draft boilerplate. But when you put that code into production, every hidden bug is a potential outage, compliance fine, or security hole. And today’s AI tools leave you guessing.