GShard

Have you ever been frustrated by slow or inefficient neural network computations? If so, you may be interested in GShard, a method for improving the performance of deep learning models.

What is GShard?

GShard is an intra-layer parallel distributed method developed by researchers at Google. Simply put, it allows the computations within a single layer of a neural network to be parallelized across multiple devices, which can drastically improve the speed and efficiency of model training and inference.
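The core idea of intra-layer parallelism can be sketched in plain NumPy (this is an illustration of the general technique, not GShard's actual API): a layer's weight matrix is split column-wise across several hypothetical devices, each device computes its partial product independently, and the results are concatenated.

```python
import numpy as np

def sharded_matmul(x, w, num_shards):
    """Simulate intra-layer model parallelism: split the weight matrix
    column-wise across `num_shards` hypothetical devices, compute each
    partial product independently, then concatenate the outputs."""
    shards = np.array_split(w, num_shards, axis=1)  # one slice per device
    partials = [x @ s for s in shards]              # would run in parallel in practice
    return np.concatenate(partials, axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # activations
w = rng.standard_normal((8, 16))   # layer weights

# The sharded result matches the ordinary, unsharded matmul.
assert np.allclose(sharded_matmul(x, w, 4), x @ w)
```

In a real system each shard lives on a different accelerator, so the layer's memory and compute are divided by the shard count while the mathematical result is unchanged.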

Mesh-TensorFlow

Overview of Mesh-TensorFlow

Mesh-TensorFlow is a language for distributing tensor computations. Where data parallelism splits tensors and operations only along the "batch" dimension, Mesh-TensorFlow can split any tensor dimension across any dimension of a multi-dimensional mesh of processors, letting users specify exactly how each computation is laid out.

What is Tensor Computation?

Tensor computation is the manipulation of matrices and higher-dimensional arrays (tensors).
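The layout idea can be sketched with a NumPy simulation rather than Mesh-TensorFlow itself (the `layout` mapping and function below are illustrative assumptions, not the library's API): a layout maps tensor dimensions to mesh dimensions, and each processor coordinate in the mesh receives the corresponding slice of the tensor.

```python
import numpy as np

def split_over_mesh(tensor, mesh_shape, layout):
    """Simulate a Mesh-TensorFlow-style layout over a 2D processor mesh.
    `layout` maps tensor dimensions to mesh dimensions, e.g. {0: 0, 2: 1}
    places tensor dim 0 across mesh rows and tensor dim 2 across mesh
    columns. Returns a dict from mesh coordinate to that processor's slice;
    unmapped tensor dimensions are replicated on every processor."""
    shards = {}
    rows, cols = mesh_shape
    for r in range(rows):
        for c in range(cols):
            idx = [slice(None)] * tensor.ndim
            for tdim, mdim in layout.items():
                coord, parts = (r, rows) if mdim == 0 else (c, cols)
                size = tensor.shape[tdim] // parts
                idx[tdim] = slice(coord * size, (coord + 1) * size)
            shards[(r, c)] = tensor[tuple(idx)]
    return shards

t = np.arange(2 * 4 * 6).reshape(2, 4, 6)  # ("batch", "rows", "hidden")
shards = split_over_mesh(t, mesh_shape=(2, 3), layout={0: 0, 2: 1})
print(shards[(0, 0)].shape)  # → (1, 4, 2): each processor holds a slice
```

Splitting the "batch" dimension alone recovers ordinary data parallelism; splitting a hidden dimension instead gives model parallelism, and both can be combined on the same mesh.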

Tofu

Overview of Tofu

Tofu is a system that partitions large deep neural network (DNN) models across multiple GPU devices, reducing the memory footprint on each GPU. It is specifically designed to partition the dataflow graphs used by platforms such as TensorFlow and MXNet, frameworks for building and training DNN models. Tofu uses a recursive search algorithm to choose a partition for each operator in the dataflow graph so that the total communication cost is minimized.
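The recursive search can be illustrated on a toy cost model (all numbers and the two-choice setup are hypothetical, far simpler than Tofu's actual algorithm): each operator in a chain may be split along one of two dimensions, re-partitioning between consecutive operators incurs a fixed communication cost, and a memoized recursion picks the cheapest overall assignment.

```python
from functools import lru_cache

# Hypothetical cost model for a chain of four operators. Each operator can
# be partitioned along its row or column dimension; the tuple gives the
# per-operator cost under (row, col), and changing the partition between
# consecutive operators incurs a fixed re-partitioning communication cost.
OP_COSTS = [(4, 1), (1, 4), (1, 4), (4, 1)]
SWITCH_COST = 2
DIMS = ("row", "col")

@lru_cache(maxsize=None)
def best_cost(i, prev_dim):
    """Minimum total cost for operators i.. given the partition dimension
    chosen for operator i-1 (None before the first operator)."""
    if i == len(OP_COSTS):
        return 0
    options = []
    for dim, op_cost in zip(DIMS, OP_COSTS[i]):
        comm = SWITCH_COST if prev_dim is not None and dim != prev_dim else 0
        options.append(op_cost + comm + best_cost(i + 1, dim))
    return min(options)

print(best_cost(0, None))  # → 8: split col, row, row, col, paying two switches
```

Here the optimum pays two re-partitioning costs to let every operator use its cheaper split, beating any switch-free assignment (cost 10); Tofu's search weighs the same kind of trade-off over a full dataflow graph.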
