Publications

Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
Zhonghan Zhao*, Kewei Chen*, Dongxu Guo*, Wenhao Chai, Tian Ye, Yanting Zhang✉, Gaoang Wang✉
arXiv Preprint.
[Paper]
HAS is a hierarchical auto-organizing multi-agent navigation system for MLM embodied agents.

See and Think: Embodied Agent in Virtual Environment
Zhonghan Zhao*, Wenhao Chai*, Xuan Wang*, Li Boyi, Shengyu Hao, Shidong Cao, Tian Ye, Jenq-Neng Hwang✉, Gaoang Wang✉
arXiv Preprint.
[Paper]
STEVE is a comprehensive and visionary embodied agent in the Minecraft virtual environment. STEVE comprises three key components: vision perception, language instruction, and code action.

Devil in the Number: Towards Robust Multi-modality Data Filter
Yichen Xu, Zihan Xu, Wenhao Chai, Zhonghan Zhao, Enxin Song, Gaoang Wang✉
arXiv Preprint.
[Paper]
Devil in the Number filters web-scale multi-modality datasets by re-evaluating CLIP scores after removing the influence of redundant elements, such as numbers, in the text.

UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning
Meiqi Sun*, Zhonghan Zhao*, Wenhao Chai, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang✉
arXiv Preprint.
[Paper]
UniAP is a novel Universal Animal Perception model that leverages few-shot learning to enable cross-species perception across various visual tasks.

A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision
Zhonghan Zhao*, Wenhao Chai*, Shengyu Hao, Wenhao Hu, Guanhong Wang, Shidong Cao, Gaoang Wang✉, Mingli Song, Jenq-Neng Hwang
arXiv Preprint.
[Paper]
This survey serves as a reference for researchers interested in deep learning applications in sports and highlights the potential of deep learning for sports data analysis.