Cutting-edge AI and ML Research Laboratory

Pioneering the Future of Artificial Intelligence

We push the boundaries of machine learning, computer vision, and natural language processing to solve real-world challenges.

16+ Researchers
9+ Papers
2+ Research Projects
1+ News

Featured Research

Discover our latest breakthroughs in AI research

LMFlow
NAACL 2024 Best Demo Award

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

LLM · MLSys
Explore Project
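For context, here is a minimal sketch of the finetune-then-infer workflow that a toolkit of this kind automates. It is written against plain Hugging Face Transformers rather than LMFlow's own API; the base model, toy dataset, and hyperparameters below are placeholder choices, not the project's recommended settings.

# Illustrative finetune-then-infer sketch using Hugging Face Transformers.
# This is NOT LMFlow's API; model name, data, and hyperparameters are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import Dataset

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tiny toy corpus standing in for a real instruction-tuning dataset.
texts = ["Question: What is RLHF? Answer: Learning from human preferences."]

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True,
                    padding="max_length", max_length=64)
    out["labels"] = out["input_ids"].copy()  # causal LM: predict the input itself
    return out

dataset = Dataset.from_dict({"text": texts}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=1, report_to=[]),
    train_dataset=dataset,
)
trainer.train()

# Inference with the finetuned model.
inputs = tokenizer("Question: What is RLHF?", return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0],
                       skip_special_tokens=True))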

Selected Publications

Recent highlights from our research community

Featured

RLHF Workflow: From Reward Modeling to Online RLHF

Hanze Dong, Wei Xiong et al.

Transactions on Machine Learning Research (TMLR), 2024

This technical report presents the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF), which is widely reported in the recent large language model (LLM) literature to outperform its offline counterpart by a large margin. However, existing open-source RLHF projects remain largely confined to the offline learning setting. We aim to fill this gap with a detailed, easy-to-reproduce recipe for online iterative RLHF. In particular, since online human feedback is usually infeasible for open-source communities with limited resources, we start by constructing preference models from a diverse set of open-source datasets and use the resulting proxy preference model to approximate human feedback. We then discuss the theoretical insights and algorithmic principles behind online iterative RLHF, followed by a detailed practical implementation. Our trained LLM achieves impressive performance on chatbot benchmarks, including AlpacaEval-2, Arena-Hard, and MT-Bench, as well as academic benchmarks such as HumanEval and TruthfulQA. We show that supervised fine-tuning (SFT) and iterative RLHF can reach state-of-the-art performance with fully open-source datasets, and we have made our models, curated datasets, and comprehensive step-by-step code guidebooks publicly available.

PDF
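The loop described in the abstract can be summarized in a short sketch: train a proxy preference model offline, then repeatedly sample from the current policy, rank the samples with the proxy model, and update the policy on the induced preference pairs. Every helper and name below (proxy_preference_score, generate_responses, update_policy) is a hypothetical placeholder for illustration, not the authors' released implementation.

# Sketch of an online iterative RLHF loop with a proxy preference model.
# All helpers are illustrative stubs, not the paper's code.
import random
from typing import Callable, List, Tuple

def proxy_preference_score(prompt: str, response: str) -> float:
    # Stands in for a preference model trained on open-source preference data;
    # a real version would be a learned reward model, not a random number.
    return random.random()

def generate_responses(policy: Callable[..., str], prompt: str, n: int = 4) -> List[str]:
    # Sample n candidate responses from the *current* policy.
    return [policy(prompt, seed=i) for i in range(n)]

def update_policy(policy, pairs: List[Tuple[str, str, str]]):
    # Stands in for a direct-preference-style update on (prompt, chosen, rejected)
    # pairs; a real implementation would take gradient steps on the policy model.
    return policy

def online_iterative_rlhf(policy, prompts: List[str], num_iterations: int = 3):
    # Each round: sample from the current policy, rank candidates with the proxy
    # preference model, and update the policy on the induced preference pairs.
    for _ in range(num_iterations):
        pairs = []
        for prompt in prompts:
            candidates = generate_responses(policy, prompt)
            ranked = sorted(candidates, key=lambda r: proxy_preference_score(prompt, r))
            pairs.append((prompt, ranked[-1], ranked[0]))  # best vs. worst candidate
        policy = update_policy(policy, pairs)
    return policy

if __name__ == "__main__":
    toy_policy = lambda prompt, seed=0: f"response {seed} to: {prompt}"
    online_iterative_rlhf(toy_policy, ["Explain RLHF in one sentence."])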

Latest News

Stay updated with our recent achievements and announcements

Our Paper Accepted at NeurIPS 2025
October 15, 2025

Breakthrough research on large language models accepted at the leading AI conference

Read more