david yan @dzyan01

phd @PrincetonVL david-yan1.github.io Joined March 2025

Tweets: 61
Followers: 163
Following: 323
Likes: 303

david yan @dzyan01

4 days ago

@baaadas didn't know innates get topdecked if you have more than 10. saving this for my next clone run

0 0 0 136 0

Sergey Zakharov @ZakharovSergeyN

a month ago

Releasing RecGen: a collaboration between @ToyotaResearch, @toyota_europe, and @UvA_Amsterdam tackling a core 3D vision challenge: reconstructing complete multi-object scenes (parts, poses, textures, even occluded geometry) from just 1 to a few RGB-D views. Trained purely on synthetic data, RecGen achieves SOTA on real-world robotics and 6D pose benchmarks, handling occlusions, symmetry, and complex interactions. A step toward scalable, high-fidelity digital twins for robotics, and better evaluation and training of generalist policies. reconstruction-by-generation.github.io

2 35 220 27K 170

david yan @dzyan01

a month ago

I’d previously thought that single-view reconstruction would be tough with only synthetic data, but it turns out it’s not! Check out this very cool work applying procedural 3D data to *full* reconstruction.

Sergey Zakharov @ZakharovSergeyN

a month ago

0 1 14 2K 11

david yan @dzyan01

a month ago

@cindy_x_wu @orussakovsky congrats!

1 0 1 137 0

david yan @dzyan01

2 months ago

@holoday_ The baselines we use are wider than that (>4 cm), but you can always change the code to generate your own. You should definitely check out @_ilya_c's great work on this (though they consider the unsupervised setting). arxiv.org/abs/2212.12324

2 0 3 367 1

david yan @dzyan01

2 months ago

Stereo depth is important in robotics, and relies heavily on synthetic data. But what actually makes for good synthetic data? In WMGStereo, we study dataset design and discover a powerful data recipe - just 500 samples of our data can match 40k Sceneflow samples! 🧵[1/7]

4 41 249 15K 181

david yan @dzyan01

2 months ago

Our work is open-source and you can also check it out in-person at our #CVPR2026 Highlight this summer! Dataset: huggingface.co/datasets/princ… Code: github.com/princeton-vl/I… [7/7]

0 0 13 838 5

david yan @dzyan01

2 months ago

By collecting the best design choices from our study, we create a full-scale dataset, WMGStereo-150k. Our data is super sample efficient and scales well! [6/7]

1 0 5 871 3

Guanyu Zhou @TMartyr4951

2 months ago

It's time to systematically teach VLMs to see with synthetic images! We built VisionFoundry, a simple but intuitive framework that generates synthetic image datasets from only a task name. 10k synthetic data → over +10% improvement on visual perception benchmarks 👀

6 38 235 24K 151

david yan @dzyan01

2 months ago

@a1zhang huge

0 0 2 1K 0

Kaleb Newman @kalebnewman8

2 months ago

Video models surprisingly can solve mazes, but inconsistently. We understand little about how they reason, making it hard to use such abilities. We investigate the denoising process and find models commit to a plan early, letting us screen far more candidates for better perf. 🧵

1 17 94 14K 59

david yan @dzyan01

2 months ago

@i_ikhatri "poweruserslop," as they say

0 0 0 21 0

Juno KIM @junokim_ai

2 months ago

Excited to share our new paper on sharp capacity scaling of the Muon optimizer! Joint work with @EshaanNichani Denny Wu @albertobietti @jasondeanlee: arxiv.org/abs/2603.26554 (1/7)

4 31 125 21K 71

Ethan @torchcompiled

2 months ago

ML interview question: You’re training a 72B MoE MNIST classifier. Layer 53 MLP expert 7 destabilizes when the ones in the dataset are turned upside down. What happened?

25 18 333 74K 132

Princeton Vision & Learning Lab @PrincetonVL

2 months ago

Stereo depth is highly useful for robots. Meet WAFT-Stereo: #1 on ETH3D (BP-0.5), Middlebury (RMSE), and KITTI (all metrics); 61% less zero-shot ETH3D BP-0.5 error; 1.8-6.7x faster than prior SOTA. Key idea: classify disparity into bins, then iterative high-res warping.🧵1/2

3 24 114 8K 56

Jack Zhang @jcz42

2 months ago

We made Muon run up to 2x faster for free! Introducing Gram Newton-Schulz: a mathematically equivalent but computationally faster Newton-Schulz algorithm for polar decomposition. Gram Newton-Schulz rewrites Newton-Schulz such that instead of iterating on the expensive rectangular X matrix, we iterate on the small, square, symmetric XX^T Gram matrix to reduce FLOPs. This allows us to make more use of fast symmetric GEMM kernels on Hopper and Blackwell, halving the FLOPs of each of those GEMMs. Gram Newton-Schulz is a drop-in replacement of Newton-Schulz for your Muon use case: we see validation perplexity preserved within 0.01, and share our (long!) journey stabilizing this algorithm and ensuring that training quality is preserved above all else. This was a super fun project with @noahamsel, @berlinchen, and @tri_dao that spanned theory, numerical analysis, and ML systems! Blog and codebase linked below 🧵
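
A rough sketch of the idea in the post above, for readers who want to see the algebra: iterate on the small Gram matrix XX^T and touch the rectangular X only once at the end. Everything here is an assumption made for illustration, not the authors' code: the helper names are hypothetical, the textbook cubic Newton-Schulz step p(G) = 1.5 I - 0.5 G stands in for whatever polynomial they use, and the matrices are plain Python lists with no fast symmetric GEMM kernels or stabilization tricks.

```python
# Illustrative only: hypothetical helpers, textbook cubic Newton-Schulz step.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def step_factor(G):
    # p(G) = 1.5 I - 0.5 G (assumed cubic step, symmetric whenever G is).
    n = len(G)
    return [[(1.5 if i == j else 0.0) - 0.5 * G[i][j] for j in range(n)]
            for i in range(n)]

def frobenius_normalize(X):
    norm = sum(x * x for row in X for x in row) ** 0.5
    return [[x / norm for x in row] for row in X]

def newton_schulz(X, steps):
    # Baseline: every step multiplies the rectangular matrix X.
    X = frobenius_normalize(X)
    for _ in range(steps):
        G = matmul(X, transpose(X))
        X = matmul(step_factor(G), X)
    return X

def gram_newton_schulz(X, steps):
    # Gram form: update only the square matrix G = X X^T and accumulate
    # the product of step factors in Q.
    X = frobenius_normalize(X)
    G = matmul(X, transpose(X))
    Q = identity(len(G))
    for _ in range(steps):
        P = step_factor(G)
        Q = matmul(P, Q)
        G = matmul(matmul(P, G), P)  # equals (P X)(P X)^T because P is symmetric
    return matmul(Q, X)  # the only multiply against the rectangular X

X0 = [[3.0, 0.0, 4.0], [1.0, 2.0, 0.0]]  # made-up 2x3 example
A = newton_schulz(X0, 5)
B = gram_newton_schulz(X0, 5)
max_diff = max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

In exact arithmetic the two functions return the same matrix, since each baseline step is X ← p(XX^T) X and the Gram form just regroups those products. In floating point they can drift apart, which is presumably where the stabilization work mentioned in the post comes in.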