An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
A comprehensive benchmark for evaluating multi-modal large language models as vision-driven embodied agents across diverse environments and task complexities.