-
Wheeeee !!! Linode and TensorDock FTW !!!
-
8B model with 5GB memory sounds like a pretty heavy quantisation. Do check the perplexity numbers to see how much you're losing in model...
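For a rough sense of why 5 GB for an 8B model implies heavy quantisation, here is a back-of-the-envelope sketch in Python. It assumes decimal gigabytes and that the whole footprint is weights, which overstates things since the KV cache and runtime overhead also take memory, so treat the result as an upper bound. The function names are made up for illustration. The second helper shows the usual perplexity definition (the exponential of the mean negative log-likelihood per token), so numbers from a quantised and an unquantised run can be compared on the same text.

```python
import math


def bits_per_weight(memory_gb: float, params_billion: float) -> float:
    """Upper bound on average bits per weight.

    Assumes decimal GB and that all of memory_gb is used for weights.
    """
    total_bits = memory_gb * 1e9 * 8
    return total_bits / (params_billion * 1e9)


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(-mean natural-log probability per token)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


if __name__ == "__main__":
    # 8B parameters in 5 GB: at most about 5 bits per weight, versus 16 for fp16.
    print(bits_per_weight(5, 8))
```

Lower perplexity is better, so the gap between the quantised and full-precision perplexity on the same evaluation text is a reasonable first measure of what the quantisation costs.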
-
Btw, I just tried the 8B model (5 GB memory requirement) on my Zen 4 laptop and was able to run it in both CPU and GPU modes... the laptop iGPU...
-
No reason, just wanted to experience it for kicks aka shits and giggles lol.
Also to compare the speed difference on various hardware...
-
Check out SQLCoder by Defog. It's available on Hugging Face. They fine-tune Llama for text-to-SQL. We benchmarked the CodeLlama fine-tune...
-
Any particular reason you're not using the smaller variants? They are surprisingly good. Alternatively, you might look into quantizing...