Get the latest Science News and Discoveries
Release of “Fugaku-LLM” – a large language model trained on the supercomputer “Fugaku” - EurekAlert
A team of researchers in Japan released Fugaku-LLM, a large language model with enhanced Japanese language capability, using the RIKEN supercomputer Fugaku. <p>To train large language models on Fugaku, the researchers developed distributed training methods, including porting the deep learning framework Megatron-DeepSpeed to Fugaku in order to optimize the performance of Transformers on Fugaku. They accelerated the dense matrix multiplication library for Transformers, and optimized communication performance for Fugaku by combining three types of parallelization techniques and accelerated the collective communication library on the Tofu interconnect D.</p>
None
Or read this on Eureka Alert