I'm using Ollama on my server with the WebUI. It has no GPU, so it's not quick to reply, but not too slow either.
I'm thinking about removing the VM as I just don't use it. Are there any good uses or integrations into other apps that might convince me to keep it?
I use the Continue VS Code plugin with Ollama, running a couple of different models (deepseek-coder-v2 & starcoder2), to recreate a local-only GitHub Copilot type experience for coding (rough config sketch below). This is on an M1 Mac (Apple Silicon), though. For autocomplete the generation needs to be pretty brisk, and I'm not sure how that would go in a VM without a GPU.
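For anyone curious, this is roughly what that setup looks like in Continue's `config.json` (newer Continue versions have moved to a YAML config, so field names may differ, and the `apiBase` value here is just a placeholder for wherever your Ollama instance is listening):

```json
{
  "models": [
    {
      "title": "DeepSeek Coder V2 (chat)",
      "provider": "ollama",
      "model": "deepseek-coder-v2",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "StarCoder2 (autocomplete)",
    "provider": "ollama",
    "model": "starcoder2",
    "apiBase": "http://localhost:11434"
  }
}
```

The `models` entry handles chat in the sidebar, while `tabAutocompleteModel` is what fires as you type, so that's the one that really needs a small, fast model.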
How well does the M1 chip keep up? What size models are you running on it? I'm interested in getting an M1 laptop, so I'm curious.