About me
This is a page not in the main menu.
Published:
A test of uploading images.
Published:
I successfully inserted this HTML file in the post with
Short description of portfolio item number 1
Short description of portfolio item number 2 
Published in Advances in Neural Information Processing Systems (NeurIPS 2022), 2022
This paper presents a systematic study on gap-dependent sample complexity in offline reinforcement learning.
Recommended citation: Xinqi Wang, Qiwen Cui, Simon S. Du. (2022). "On Gap-dependent Bounds for Offline Reinforcement Learning." Advances in Neural Information Processing Systems. https://proceedings.neurips.cc/paper_files/paper/2022/hash/5f5f7b6080dcadced61cf5d96f7c6dde-Abstract-Conference.html
Published in arXiv preprint, 2024
This paper investigates preference-based multi-agent reinforcement learning, focusing on identifying Nash equilibria from offline datasets with sparse human feedback, and introduces temporal MSE regularization and pessimism mechanisms for improved reward modeling.
Recommended citation: Xinqi Wang, Natalia Zhang, Qiwen Cui, Runlong Zhou, Sham M. Kakade, Simon S. Du. (2024). "Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques." arXiv:2409.00717. https://arxiv.org/abs/2409.00717
Published in Advances in Neural Information Processing Systems (NeurIPS 2024), 2024
This paper introduces DiSPOs, a novel approach that learns distributions of successor features from offline datasets to enable zero-shot policy optimization across different reward functions, avoiding compounding errors in model-based RL.
Recommended citation: Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta. (2024). "Distributional Successor Features Enable Zero-Shot Policy Optimization." Advances in Neural Information Processing Systems. https://proceedings.neurips.cc/paper_files/paper/2024/hash/e15ef893e137cd40e6c7313a04307437-Abstract-Conference.html
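The DiSPOs summary rests on the standard successor-feature identity: if rewards decompose as r(s) = φ(s)·w, then V^π(s) = ψ^π(s)·w, so a single set of successor features ψ^π supports evaluation under any new reward weights w without re-solving the MDP. A minimal tabular sketch of that identity (an illustrative toy with names of my own choosing, not the paper's DiSPOs method):

```python
GAMMA = 0.5
# Deterministic transitions under a fixed policy: P[s] = next state.
P = {0: 1, 1: 1}
# One-hot state features, so psi is the discounted state-visitation vector.
PHI = {0: [1.0, 0.0], 1: [0.0, 1.0]}

def successor_features(P, PHI, gamma, iters=200):
    """Iterate the fixpoint psi(s) = phi(s) + gamma * psi(P[s])."""
    psi = {s: [0.0] * len(PHI[s]) for s in PHI}
    for _ in range(iters):
        # Dict comprehension reads the old psi, so this is a synchronous update.
        psi = {s: [PHI[s][i] + gamma * psi[P[s]][i]
                   for i in range(len(PHI[s]))]
               for s in PHI}
    return psi

def value(psi_s, w):
    """Zero-shot evaluation: V(s) = psi(s) . w for any reward weights w."""
    return sum(p * wi for p, wi in zip(psi_s, w))
```

With γ = 0.5 and both states flowing into state 1, ψ(1) = [0, 2] and ψ(0) = [1, 1]; plugging in a new reward vector w gives values immediately, which is the transfer property the DiSPOs summary refers to.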
Published in arXiv preprint, 2025
This paper introduces trajectory clustering for offline RL datasets where cluster centers represent generating policies, proposing Policy-Guided K-means (PG-Kmeans) and Centroid-Attracted Autoencoder (CAAE) with finite-step convergence guarantees.
Recommended citation: Xinqi Wang, Hao Hu, Simon S. Du. (2025). "Policy-Based Trajectory Clustering in Offline Reinforcement Learning." arXiv:2506.09202. https://arxiv.org/abs/2506.09202
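The PG-Kmeans summary describes a K-means-style alternation in which cluster centers are policies: assign each trajectory to the policy most likely to have generated it, then refit each policy on its assigned trajectories. A minimal tabular sketch of that alternation (an illustrative toy with maximum-likelihood assignment and Laplace-smoothed behavior cloning; names and details are my own, not the paper's actual algorithm):

```python
import math
from collections import defaultdict

def fit_policy(trajs, n_actions, smoothing=1.0):
    """Behavior-cloned per-state action distribution with Laplace smoothing."""
    counts = defaultdict(lambda: [smoothing] * n_actions)
    for traj in trajs:
        for s, a in traj:
            counts[s][a] += 1.0
    policy = {}
    for s, cs in counts.items():
        total = sum(cs)
        policy[s] = [c / total for c in cs]
    return policy

def log_likelihood(traj, policy, n_actions):
    """Log-probability of a trajectory's actions; uniform for unseen states."""
    ll = 0.0
    for s, a in traj:
        probs = policy.get(s, [1.0 / n_actions] * n_actions)
        ll += math.log(probs[a])
    return ll

def pg_kmeans(trajs, k, n_actions, iters=20):
    """Alternate ML assignment and policy refitting until assignments stabilize."""
    # Seed each cluster's policy from a single trajectory.
    policies = [fit_policy([trajs[i]], n_actions) for i in range(k)]
    assign = None
    for _ in range(iters):
        new_assign = [max(range(k),
                          key=lambda c: log_likelihood(t, policies[c], n_actions))
                      for t in trajs]
        if new_assign == assign:  # fixed point reached
            break
        assign = new_assign
        policies = [fit_policy([t for t, g in zip(trajs, assign) if g == c],
                               n_actions)
                    for c in range(k)]
    return assign, policies
```

On a mixture of trajectories from two distinct deterministic policies, the alternation separates them in one pass; each step either reassigns trajectories or terminates, which is the kind of finite-step convergence behavior the summary mentions.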
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.