Falcon 180B: Can It Run on Your Laptop?

There is also a chat version. The models are available on the Hugging Face hub.
Falcon 180B is completely free and state-of-the-art. But it’s also a huge model.
Can it run on your computer?
Unless your computer is built for very intensive computing, it can’t run Falcon 180B out of the box. You will need to upgrade your computer and use a quantized version of the model.
In this article, I explain how you can run Falcon 180B on consumer hardware. We’ll see that it can be reasonably affordable to run a 180 billion parameter model on a modern computer. I also discuss several techniques that help reduce the hardware requirements.
The first thing you need to know is that Falcon 180B has 180 billion parameters stored as bfloat16. A (b)float16 parameter takes 2 bytes in memory.
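To make this arithmetic concrete, here is a minimal sketch that estimates the size of the weights alone for several storage formats. The helper name `weights_size_gb` and the table of bytes per parameter are illustrative assumptions, not part of any library. The quantized entries ignore the small overhead that real quantization formats add, and 1 GB is taken as 10^9 bytes.

```python
# Approximate bytes needed to store one parameter in each format.
# The int8/int4 entries are idealized: real quantized checkpoints
# carry some extra metadata (scales, zero points) not counted here.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

FALCON_180B_PARAMS = 180e9  # 180 billion parameters


def weights_size_gb(num_params: float, dtype: str) -> float:
    """Return the size of the weights in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        size = weights_size_gb(FALCON_180B_PARAMS, dtype)
        print(f"{dtype:>8}: {size:.0f} GB")
```

This counts only the weights. Activations and any temporary copies made during loading come on top of it.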
When you load a model, the standard PyTorch pipeline works like this:
- An empty model is created: 180B parameters * 2 bytes = 360 GB