PRd a small fix to Prime-RL: github.com/PrimeIntellect…
Training is unaffected, but if anyone is manually using solve_all/all or reward/all/mean as a stopping signal, or to make decisions about hyperparameters mid-run, those decisions could be wrong.
@samsja19 @jannik_stra @johannes_hage
@ypatil125 How quickly do you think this problem compounds wrt context window length? DS r1 was only 128k, but 1M+ is becoming table stakes (DS v4, for example). cc @lindensli
@msfeldstein @MountainsGuy1 “Likely telemarketer” ≠ “flagged as spam”. You want the “Silence Unknown Callers” option checked. That’s what actually hides them. Still shows the missed call notification but 🤷
@alexgshaw @cl571128 Not “Cheating” (implies malicious intent).
+1 to requiring auditable trajectories, but are trajectories being audited, including when submitted by staff?
@togethercompute’s DSGym has read-only volumes. Reduced attack surface + faster rollouts linkedin.com/pulse/mythos-e… @Ameen_ml your team?
My second fav example is @cursor_ai’s Cloud Agents setting the stage for RL for Composer 2 rollout infra
“When a pod fork is requested, we attempt to first schedule the fork onto the same node” 👌
@vmg @ellev3n11 @srush_nlp @EvanHub @aditjain1980
Whoever at @AnthropicAI had the idea for forked subagents: what a nice convergence of UX, technical efficiency (no compression loss), and a likely stealth RL hack. Every forked agent is a potential super-rich env/task pair, way deep in the weeds of something that has to be figured out
Anthropic just introduced forked subagents in their latest update.
Unlike regular subagents, forked subagents can inherit the same context as the main agent. This looks convenient for cases where richer context matters more.
This is just what I needed!
@stefanopopoulos @aditjain1980 @vmg @ellev3n11 Can you say whether the training environment was production-with-limits or simulated-well-enough? I’m about to start building simulated MCPs into procedural env gen at @InvTechInc and am curious whether anyone else is going the mock-MCP route, or how far that’ll get me
@stefanopopoulos @aditjain1980 Or similar enough to it.
That was the thing in the Composer technical paper that blew me away. It was pitched as a “simulated” backend, but it sounds like maybe it’s easier to limit the agent’s auth and have it hit production itself than to build a faithful simulation of it
"Hardening" that we think of as the realm of cyber security is actually one way to prevent whole classes of reward hacking, and to run faster and more cost-efficient rollouts linkedin.com/pulse/mythos-e…