Techrecipe

An open-source language model aiming for GPT-3-level performance…

GPT-3, a language model capable of generating remarkably accurate text, was developed by OpenAI, but it is not open source and cannot be used freely because of an exclusive license agreement with Microsoft. GPT-Neo is an effort to create an open-source alternative to GPT-3 in response to this situation.

The research group developing GPT-Neo is EleutherAI. Before the group formally formed, its members attempted to replicate GPT-2 using the TensorFlow Research Cloud (TFRC), and that code became the basis of GPT-Neo.

However, replicating GPT-3 posed a problem: the TPUs provided through TFRC were insufficient. Stepping in to help is CoreWeave, a cryptocurrency mining company that also provides cloud services for CGI rendering and machine learning. CoreWeave is only supplying hardware resources, however, and GPT-Neo itself is said to remain open source.

Because a language model can amplify biases present in its training data, the group established a strict editorial policy to exclude datasets containing unacceptable negative biases. The completed corpus, The Pile, is 835 GB in size and achieves broad generalization capability by combining 22 smaller datasets.
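To illustrate the idea of building one training corpus out of many component datasets, here is a minimal sketch of weighted sampling across components. This is a toy illustration, not EleutherAI's actual pipeline; the component names, documents, and weights below are hypothetical.

```python
import random

# Hypothetical component datasets (The Pile combines 22 real ones).
components = {
    "web_text": ["doc a", "doc b", "doc c"],
    "academic": ["paper x", "paper y"],
    "code":     ["snippet q"],
}
# Hypothetical mixing weights: up- or down-weighting a component changes
# how often the model sees it during training.
weights = {"web_text": 1.0, "academic": 2.0, "code": 1.5}

def sample_documents(components, weights, n, seed=0):
    """Draw n documents: pick a component by weight, then a doc uniformly."""
    rng = random.Random(seed)
    names = list(components)
    w = [weights[name] for name in names]
    out = []
    for _ in range(n):
        name = rng.choices(names, weights=w, k=1)[0]
        out.append(rng.choice(components[name]))
    return out

docs = sample_documents(components, weights, 5)
```

A curated mixture like this also makes the editorial policy enforceable: a component judged to carry unacceptable biases can simply be dropped or down-weighted before training.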

EleutherAI says it expects GPT-Neo to perform similarly to GPT-3 at the same parameter count. It also plans to eventually shrink the final model by roughly an order of magnitude in parameter count to make it lighter.

In addition, there are no plans to provide a commercial API for GPT-Neo, but general users are expected to be able to use it through a service offered by CoreWeave or a third party. Related information can be found here.