Second-hand Optane memory runs trillion-parameter model on one GPU
A workstation built with second-hand Intel Optane persistent memory modules has run Kimi K2.5 locally on a single GPU. The setup highlights renewed interest in a memory tier that sits between DRAM and SSDs for large language model inference.
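
To give a rough sense of why capacity matters here, the sketch below estimates the size of the weights alone for a model with one trillion parameters. The parameter count is the round figure from the headline, and the bit widths are illustrative assumptions; the precision actually used in this setup is not stated. The helper name `weight_footprint_gib` is hypothetical.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: float) -> float:
    """Return the size of the model weights alone, in GiB.

    Ignores activations, KV cache, and runtime overhead.
    """
    return num_params * bits_per_weight / 8 / 2**30


# Round figure taken from the headline ("trillion-parameter").
NUM_PARAMS = 1e12

# Illustrative precisions only; not the configuration of the setup described.
for bits in (16, 8, 4):
    gib = weight_footprint_gib(NUM_PARAMS, bits)
    print(f"{bits:>2}-bit weights: {gib:,.0f} GiB")
```

Even at the lowest of these assumed precisions, the weights come to several hundred GiB, well beyond the memory of a single GPU, which is where a large-capacity tier between DRAM and SSDs comes in.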
