Playing around with a local #llm and wanted to evaluate my #gpu's performance, so I wrote this little #ollama benchmarking tool: https://github.com/stefanthoss/ollama-server-benchmark Impressive: my small #nvidia Tesla T4 generates prompt responses roughly 4 times as fast as my 64-core server CPU.
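
For anyone curious how a benchmark like this works in principle, here's a minimal sketch (not the repo's actual code) that times token generation via Ollama's standard /api/generate endpoint; the model name and prompt are just placeholders:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

def benchmark(model: str, prompt: str) -> float:
    """Run one non-streaming generation and return tokens/second."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    data = resp.json()
    # Ollama's response includes eval_count (tokens generated) and
    # eval_duration (nanoseconds spent generating them).
    return data["eval_count"] / data["eval_duration"] * 1e9

if __name__ == "__main__":
    # Hypothetical model/prompt; swap in whatever you have pulled locally.
    rate = benchmark("llama3", "Explain GPU inference in one paragraph.")
    print(f"{rate:.1f} tokens/s")
```

Run the same script against a GPU-backed and a CPU-only Ollama instance and compare the tokens/s numbers to get a rough speedup figure.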
