In deep learning, large models (GPT-3, T5, Megatron-LM) are growing in popularity. As a result, however, the concentration of wealth in AI is intensifying.
Take GPT-3, a recent hot topic, as a telling example. GPT-2 has 1.5B parameters and takes about 6GB on disk; GPT-3 has 175B parameters, so its weights alone are expected to occupy roughly 700GB.
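For reference, the 700GB figure follows from a simple back-of-the-envelope calculation, assuming 32-bit floating-point weights (4 bytes per parameter):

```python
# Rough estimate of GPT-3's weight size, assuming fp32 weights (4 bytes each).
num_params = 175e9        # 175B parameters
bytes_per_param = 4       # float32
print(num_params * bytes_per_param / 1e9, "GB")  # -> 700.0 GB
```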
To train or run inference with existing frameworks, all of a model's weights have to be loaded into memory. For GPT-3, however, fitting 700GB of weights into memory is not feasible on an ordinary PC.
matorage can solve this problem. The philosophy of matorage's model storage is not to store a model as a single file, but to store it layer-wise. matorage can therefore fetch only the sub-model weights that fit on the PC, load them into memory, run the computation, and write the calculated values back to file storage. It has a similar philosophy to pydata/numexpr.
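As a rough illustration of the idea, here is a minimal layer-wise forward pass in PyTorch. The per-layer file layout and the `load_layer_weights` / `make_layer` helpers are hypothetical, not the matorage API; the point is only that a single layer's weights ever need to be resident in memory at a time:

```python
import torch
import torch.nn as nn

def load_layer_weights(storage_dir, layer_idx):
    # Hypothetical helper: each layer's state_dict is saved as its own file,
    # so only one layer's weights ever have to fit in RAM at once.
    return torch.load(f"{storage_dir}/layer_{layer_idx}.pt", map_location="cpu")

@torch.no_grad()
def forward_layerwise(hidden, storage_dir, num_layers, make_layer):
    # make_layer() builds an empty nn.Module with the right architecture;
    # its parameters are then filled from storage, one layer at a time.
    for i in range(num_layers):
        layer = make_layer()
        layer.load_state_dict(load_layer_weights(storage_dir, i))
        hidden = layer(hidden)   # forward only: no backward, no gradients kept
        del layer                # drop this layer's weights before loading the next
    return hidden

# Example usage with a toy stack of Linear layers (stands in for transformer blocks):
# out = forward_layerwise(torch.randn(1, 1024), "weights", num_layers=96,
#                         make_layer=lambda: nn.Linear(1024, 1024))
```

In the same spirit, the intermediate values computed between layers could also be written to file storage rather than kept in memory, as described above.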
The implementation of this feature will be reflected in 0.3.0. In addition, we will implement forward (inference) operations first rather than backward, and release the PyTorch version first.
Once again, I hope that the future of AI will not be centralized by wealth, but decentralized by collective intelligence.
If you want to know more, please refer to these issues:
#openai/gpt-3/issues/1
#huggingface/transformers/issues/4658
Note
This issue does not use the official GPT-3 weights. The test is run with a randomly initialized model under the same conditions, as shown in the picture below.