parameter-sharing

Weight Tying
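
In concrete terms, tying means the output layer scores each vocabulary item with the same matrix that embeds the input tokens. The following is a minimal sketch in Python with NumPy, not a reference implementation. The class name TinyTiedLM, the vocabulary and vector sizes, and the absence of any hidden layers between lookup and projection are all illustrative assumptions.

```python
import numpy as np


class TinyTiedLM:
    """Toy next-token scorer (hypothetical) whose output projection can reuse the embedding."""

    def __init__(self, vocab_size, dim, tie_weights=True, seed=0):
        rng = np.random.default_rng(seed)
        # Input embedding: one row of length `dim` per vocabulary item.
        self.embedding = rng.normal(scale=0.1, size=(vocab_size, dim))
        if tie_weights:
            # Tied: the output projection is the same array object, not a copy.
            self.output_proj = self.embedding
        else:
            # Untied: a second, independent vocab_size x dim matrix.
            self.output_proj = rng.normal(scale=0.1, size=(vocab_size, dim))

    def num_parameters(self):
        # Count the shared matrix only once when the two layers are tied.
        if self.output_proj is self.embedding:
            return self.embedding.size
        return self.embedding.size + self.output_proj.size

    def next_token_probs(self, token_id):
        # Look up the token vector. A real model would pass it through
        # recurrent or attention layers before the output projection.
        hidden = self.embedding[token_id]
        logits = self.output_proj @ hidden       # shape: (vocab_size,)
        exp = np.exp(logits - logits.max())      # numerically stable softmax
        return exp / exp.sum()


tied = TinyTiedLM(vocab_size=1000, dim=64, tie_weights=True)
untied = TinyTiedLM(vocab_size=1000, dim=64, tie_weights=False)
print(tied.num_parameters(), untied.num_parameters())  # 64000 128000
```

With these assumed sizes, tying halves the number of vocabulary-sized parameters. In a full model the saving is smaller in relative terms, because the hidden layers are not shared.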

Weight tying is a technique for improving language models by sharing a single weight matrix between the input embedding layer and the output projection that feeds the softmax. It was proposed independently by several researchers and has been widely adopted in neural machine translation models. Its main advantage is a lower total parameter count: the vocabulary-sized matrix is stored and updated once instead of twice, which can also speed up training.

What are Language Models?

Language models are computational models that are trained