This is great stuff. Is there a way to integrate this into Open WebUI so there is a standard interface? I currently run Ollama and Lemonade through that interface (with no NPU support). If I could compile some models and run them through it with NPU support, I would love to see the performance improvements. I am running an ASUS AMD Ryzen AI 9 365 w/ Radeon 880M and 32 GB.