Jordan Feldstein @jfeldstein

linkedin.com/in/jsfeldstein · Montpelier, VT · Joined May 2008

Tweets: 2K
Followers: 726
Following: 692
Likes: 114

Jordan Feldstein @jfeldstein

a month ago

@johannes_hage @omouamoua Anything you can share about the harness design for making this happen?

0 0 0 30 0

Jordan Feldstein @jfeldstein

2 months ago

PR "feat: add Harbor ATIF trajectory format support" to @OpenAI / euphony cc @Jay4w @alexgshaw github.com/openai/euphony…

0 0 0 26 0

@theanakin87 But watch out: "Overtraining SFT or training on shorter/simpler examples boosts SFT scores fastest but often produces the worst RL learners" arxiv.org/abs/2510.01624 HT @maximelabonne from LI linkedin.com/posts/maxime-l…

0 0 2 29 0

Jordan Feldstein @jfeldstein

2 months ago

PRd a small fix to Prime-RL: github.com/PrimeIntellect… Training unaffected, but if anyone is manually using solve_all/all or reward/all/mean as stopping signal or to make decisions about hyperparameters mid-run those decisions could be wrong @samsja19 @jannik_stra @johannes_hage

0 0 1 66 0

Jordan Feldstein @jfeldstein

2 months ago

@ypatil125 How quickly do you think this problem compounds wrt context window length? DS r1 was only 128k, but 1M+ becoming table stakes (DS v4 for example) cc @lindensli

0 0 0 26 0

Jordan Feldstein @jfeldstein

2 months ago

@dorkitude Reminds me of an old joke “10 kinds of people in the world…”

0 0 1 53 0

Jordan Feldstein @jfeldstein

2 months ago

@msfeldstein @MountainsGuy1 “Likely telemarketer” ≠ “flagged as spam”. You want the silence unknown callers option checked. That’s what actually hides them. Still shows the missed call notification but 🤷

1 0 0 18 0

Jordan Feldstein @jfeldstein

2 months ago

@alexgshaw @cl571128 Not “Cheating” (implies malicious intent). +1 requiring auditable trajectories, but are trajectories being audited, inc when submitted by staff? @togethercompute’s DSGym has read-only volumes. Reduced attack surface + faster rollouts linkedin.com/pulse/mythos-e… @Ameen_ml your team?

0 0 0 18 0

Jordan Feldstein @jfeldstein

2 months ago

My second fav example is @cursor_ai’s Cloud Agents setting the stage for RL for Composer 2 rollout infra: “When a pod fork is requested, we attempt to first schedule the fork onto the same node” 👌 @vmg @ellev3n11 @srush_nlp @EvanHub @aditjain1980

0 0 0 53 0

Jordan Feldstein @jfeldstein

2 months ago

Whoever at @AnthropicAI‘s idea for forked subagents was, what a nice convergence of UX, technical efficiency (no compression loss), and likely Stealth RL Hack: every forked agent is a potential super rich env/task pair way deep in the weeds in something that has to be figured out

Aran Komatsuzaki @arankomatsuzaki

2 months ago

Anthropic just introduced forked subagents in their latest update. Unlike regular subagents, forked subagents can inherit the same context as the main agent. This looks convenient for cases where richer context matters more. This is just what I needed!

40 69 929 91K 564

1 0 1 66 0

Jordan Feldstein @jfeldstein

2 months ago

@aditjain1980 @stefanopopoulos Everything is bias. In my mind biasing @ prod behavior can’t be too bad

0 0 0 25 0

Jordan Feldstein @jfeldstein

2 months ago

@stefanopopoulos @aditjain1980 @vmg @ellev3n11 can you say whether the training environment was production-with-limits or simulated-well-enough? I’m about to start building simulated mcps into procedural env gen at @InvTechInc and am curious if anyone else is going the mock-mcp route, or how far that’ll get me

0 0 0 38 0

Jordan Feldstein @jfeldstein

2 months ago

@stefanopopoulos @aditjain1980 Or similar enough to it. That was the thing in the Composer technical paper that blew me away. It was pitched as a “simulated” backend but sounds like maybe it’s easier to limit the agent’s auth and have it hit production itself than build a faithful simulation of it

2 0 0 65 0

Jordan Feldstein @jfeldstein

2 months ago

"Hardening" that we think of as the realm of cyber security is actually one way to prevent whole classes of reward hacking, and to run faster and more cost-efficient rollouts linkedin.com/pulse/mythos-e…

0 0 1 15 0

Jordan Feldstein @jfeldstein

2 years ago

@recursiverealms @msfeldstein Yes yes yes put me in 3d space based at 1:20th scale asterweb.jpl.nasa.gov/gdem.asp

0 0 1 20 0

Jordan Feldstein @jfeldstein

2 years ago

@beehiiv why are you not responding when I submit the “interested in purchasing ads” form?

0 0 0 16 0

Jordan Feldstein @jfeldstein

2 years ago

.@whereskap , tell me about this VR game idea you have?

0 0 0 92 0

Jordan Feldstein @jfeldstein

3 years ago

email.offerhunt.app filters noisy job-search emails using ai so you see interviews, but nothing else. medium.com/@jordan_offerh…

0 0 0 104 0

Jordan Feldstein @jfeldstein

3 years ago

@heatherandlace_ No longer true. email.offerhunt.app filters noisy job-search emails using ai so you see interviews, and nothing else. medium.com/@jordan_offerh…