2024


MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception

Yiran Qin*, Enshen Zhou*, Qichang Liu*, Zhenfei Yin, Lu Sheng†, Ruimao Zhang†, Yu Qiao, Jing Shao‡

CVPR (The IEEE/CVF Conference on Computer Vision and Pattern Recognition), 2024

Simulation, MineCraft, Planning, MLLM

Octavius: Mitigating Task Interference in MLLMs via MoE

Zeren Chen*, Ziqin Wang*, Zhen Wang*, Huayang Liu, Zhenfei Yin‡, Si Liu, Lu Sheng†, Wanli Ouyang, Yu Qiao, Jing Shao†

ICLR (The International Conference on Learning Representations), 2024

MLLM, Training, 3D Vision

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao†, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao†, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin‡, Zhipin Wang

Technical Report, 2024

MLLM, Evaluation, Human Value, Causal Reasoning

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents

Zeren Chen*, Zhelun Shi*, Xiaoya Lu*, Lehan He*, Sucheng Qian, Hao-Shu Fang, Zhenfei Yin‡, Wanli Ouyang, Jing Shao†, Yu Qiao, Cewu Lu†, Lu Sheng†

arXiv, 2024

Robot Arm, Manipulation, MLLM, Planning, Executing

Assessment of Multimodal Large Language Models in Alignment with Human Values

Zhelun Shi*, Zhipin Wang*, Hongxing Fan*, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhenfei Yin, Lu Sheng†, Yu Qiao, Jing Shao†

arXiv, 2024

MLLM, Evaluation, Human Value

MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control

Enshen Zhou*, Yiran Qin*, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang†, Lu Sheng†, Yu Qiao, Jing Shao‡

arXiv, 2024

Simulation, MineCraft, MLLM, World Model, Executing