Fan-Yun Sun 
            I am a final-year CS PhD candidate at the Stanford AI Lab,
               affiliated with the Autonomous Agents Lab and the Stanford Vision and Learning Lab.
                During my PhD, I have also worked extensively with NVIDIA Research, including the Learning and Perception Research Group, Metropolis Deep Learning (Omniverse), and the Autonomous Vehicle Research Group.
            
            I'm interested in generating embodied (3D) environments and data to train robotics/RL policies, particularly to advance embodied, multi-modal foundation models and their reasoning abilities.
            
            
            Previously, I was fortunate to work with Jure Leskovec, Stefano Ermon, Jian Tang at MILA, and Shou-De Lin at NTU.
            Outside of research, I occasionally write on X, build AI applications to experiment with new ways of creating (for example), and have enjoyed hosting conferences / bootcamps.
            
         
        
            
                LinkedIn  /
                Twitter  /
                GitHub  /
                Scholar 
            
         
   
      
        
          
            News 
            
                I'm honored to have received the Google Graduate Fellowship in Computer Science.
                I gave a talk at BuzzRobot.
                My research was featured in several press outlets (Tech Times, TechXplore, and Interesting Engineering).
                I received my master's degree in CS as part of the PhD program.
             
         
            Selected Publications 
            
                
                
                  
                      
                          
                        
                   
                        
                            ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
                            Tianyu Hua,
                            Harper Hua,
                            Violet Xiang,
                            Benjamin Klieger,
                            Sang T. Truong,
                            Weixin Liang,
                            Fan-Yun Sun,
                            Nick Haber
                            NeurIPS, 2026 (spotlight)
                            project page (TBA)
                            /
                            arXiv
                         
                     
                
                  
                      
                          
                        
                   
                        
                            3D-Generalist: Self-Improving Vision-Language-Action Models for Crafting 3D Worlds
                            Fan-Yun Sun,
                            Shengguang Wu,
                            Christian Jacobsen,
                            Thomas Yim,
                            Haoming Zou,
                            Alex Zook,
                            Shangru Li,
                            Ethem Can,
                            Xunlei Wu,
                            Clemens Eppner,
                            Valts Blukis,
                            Jonathan Tremblay,
                            Jiajun Wu,
                            Stan Birchfield †,
                            Nick Haber †
                            To appear
                            project page
                            /
                            arXiv (TBA)
                         
                     
                    
                        
                             
                        
                            Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
                            Shengguang Wu,
                            Fan-Yun Sun,
                            Kaiyue Wen,
                            Nick Haber
                            ACL main conference, 2025
                            project page / arXiv
                         
                     
                    
                        
                             
                        
                            LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models
                            Fan-Yun Sun,
                            Weiyu Liu *,
                            Siyi Gu,
                            Dylan Lim,
                            Goutam Bhat,
                            Federico Tombari,
                            Manling Li,
                            Nick Haber,
                            Jiajun Wu
                            * Equal Contribution
                            CVPR, 2025
                            project page
                            /
                            paper
                         
                     
                    
                        
                            
                                
                              
                         
                        
                            GRS: Generating Robotic Simulation Tasks from Real-World Images
                            Alex Zook,
                            Fan-Yun Sun,
                            Josef Spjut,
                            Valts Blukis,
                            Stan Birchfield,
                            Jonathan Tremblay
                            CVPR Workshop, 2025
                            paper
                         
                     
                    
                        
                             
                        
                            Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition
                            Hsuan Su,
                            Hua Farn,
                            Fan-Yun Sun,
                            Shang-Tse Chen,
                            Hung-yi Lee
                            EMNLP, 2024
                            project page
                            /
                            code
                         
                     
                    
                        
                            
                                
                              
                         
                        
                            FactorSim: Generative Simulation via Factorized Representation
                            Fan-Yun Sun,
                            S. I. Harini,
                            Angela Yi,
                            Yihan Zhou,
                            Alex Zook,
                            Jonathan Tremblay,
                            Logan Cross,
                            Jiajun Wu,
                            Nick Haber
                            NeurIPS, 2024
                            project page
                            /
                            code
                         
                     
                    
                        
                            
                                
                              
                         
                        
                            Holodeck: Language-Guided Generation of 3D Embodied Environments
                            Yue Yang *,
                            Fan-Yun Sun *,
                            Luca Weihs *,
                            Eli Vanderbilt,
                            Alvaro Herrasti,
                            Winson Han,
                            Jiajun Wu,
                            Nick Haber,
                            Ranjay Krishna,
                            Lingjie Liu,
                            Chris Callison-Burch,
                            Mark Yatskar,
                            Aniruddha Kembhavi,
                            Christopher Clark
                            * Equal Technical Contribution
                            CVPR, 2024
                            project page
                            /
                            code
                            
                            
                         
                     
                    
                             
                        
                            Partial-View Object View Synthesis via Filtering Inversion
                            Fan-Yun Sun,
                            Jonathan Tremblay,
                            Valts Blukis,
                            Kevin Lin,
                            Danfei Xu,
                            Boris Ivanovic,
                            Peter Karkus,
                            Stan Birchfield,
                            Dieter Fox,
                            Ruohan Zhang,
                            Yunzhu Li,
                            Jiajun Wu,
                            Marco Pavone,
                            Nick Haber
                            Workshop XRNeRF, CVPR, 2023
                            3DV, 2024 (Spotlight)
                            project page
                            /
                            paper
                            /
                            code
                         
                    
                    
                        
                             
                        
                            Interaction Modeling with Multiplex Attention
                            NeurIPS, 2022
                            project page 
                            /
                            code 
                         
                     
                    
                        
                            
                                
                              
                         
                        
                            Physion: Evaluating Physical Prediction from Vision in Humans and Machines
                            Daniel M Bear,
                            Elias Wang,
                            Damian Mrowca,
                            Felix J Binder,
                            Hsiau-Yu Fish Tung,
                            RT Pramod,
                            Cameron Holdaway,
                            Sirui Tao,
                            Kevin Smith,
                            Fan-Yun Sun,
                            Fei-Fei Li,
                            Nancy Kanwisher,
                            Joshua B Tenenbaum,
                            Daniel LK Yamins,
                            Judith E Fan
                            NeurIPS, Datasets and Benchmarks Track, 2021
                            project page
                            /
                            code
                         
                     
                    
                        
                             
                        
                            Equivariant Neural Network for Factor Graphs
                             paper 
                         
                     
                    
                        
                             
                        
                            InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization
                            ICLR, 2020 (spotlight)
                            project page
                 
             
            
                        
                             
                        
                            vGraph: A Generative Model for Joint Community Detection and Node Representation Learning
                            NeurIPS, 2019
                            arXiv 
                            /
                            slides 
                            /
                            poster 
                         
                     
                    
                        
                             
                        
                            Organ At Risk Segmentation with Multiple Modality
                            Kuan-Lun Tseng,
                            Winston Hsu,
                            Chun Ting Wu,
                            Ya-Fang Shih,
                            Fan-Yun Sun
                            paper
                         
                     
                    
                        
                             
                        
                            Designing Non-Greedy Agents through Reward Shaping¹ and Regulation Enforcement in Multi-Agent Reinforcement Learning²
                            ¹ AAAI/ACM Conference on AI, Ethics, and Society (Oral)
                            ² AAMAS
                            paper 1
                            /
                            paper 2
                         
                     
                 
        
         
            Academic Services & Awards