
Tuning Large Language Models with Reinforcement Learning on a Single GPU
March 30, 2023 | 7 min read

A quick guide for RLHF using trlX, OPT-1.5B, and LoRA.