This repository hosts the result of an experimental weight merge of two language models, performed with the add difference technique at an alpha of 0.6. Weight merging integrates knowledge from multiple models directly at the parameter level, yielding a single, more capable language model without full retraining.

Proto-Synthia was produced in a mere 10 minutes of merging, which in many cases obviates the conventional, time-intensive training process. Model A is GPT2-Large and Model B is GPT2 trained solely on the wikitext dataset. More to come in the following days.
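As a rough illustration of the merge described above, a common formulation of add difference merging is `merged = A + alpha * (B - C)`, where `C` is typically the base checkpoint that `B` was fine-tuned from, so that `B - C` isolates the fine-tuning delta. The sketch below assumes this formulation and operates on plain dicts of NumPy arrays standing in for checkpoint state dicts; the function name, dict layout, and the use of a third base checkpoint are assumptions for illustration, not the repository's exact procedure.

```python
import numpy as np

ALPHA = 0.6  # merge strength, as stated above


def add_difference(model_a, model_b, base, alpha=ALPHA):
    """Add-difference merge: A + alpha * (B - base), per parameter.

    Each argument is a dict mapping parameter names to arrays of
    matching shapes (a stand-in for a model state dict).
    """
    merged = {}
    for name, weight_a in model_a.items():
        delta = model_b[name] - base[name]  # fine-tuning delta of B
        merged[name] = weight_a + alpha * delta
    return merged


# Toy usage on a single fake parameter tensor:
model_a = {"w": np.zeros(3)}
model_b = {"w": np.ones(3)}
base = {"w": np.full(3, 0.5)}
merged = add_difference(model_a, model_b, base)
# 0.0 + 0.6 * (1.0 - 0.5) = 0.3 for every element
```

In practice the parameter names and shapes of the two models must line up for this element-wise arithmetic to apply, which is a real constraint when the models differ in size (e.g. GPT2-Large vs. GPT2).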