64k token race
iGPU+NPU LOAD CONCURRENCY TEST
This test measures how much a fully loaded iGPU running Qwen3.6 35B slows down when a second model, Qwen3.5 9B, runs concurrently under full load on either the NPU or the same iGPU.
Processed-Token Race
Each lane runs the same full 64k main-model workload to a 65,664 processed-token finish line. The baseline is Qwen3.6 35B iGPU Alone, the NPU lane is Qwen3.6 35B + Qwen3.5 9B NPU, and the iGPU contention lane is Qwen3.6 35B + Qwen3.5 9B iGPU.
Qwen3.6 35B iGPU Alone
195.31 tok/s
336s finish | 65,664 tokens | baseline
Qwen3.6 35B + Qwen3.5 9B NPU
189.09 tok/s
347s finish | +3.29% latency | concurrent NPU load
Qwen3.6 35B + Qwen3.5 9B iGPU
115.60 tok/s
568s finish | +68.96% latency | concurrent iGPU load
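The finish times and latency percentages above can be approximately reproduced from the reported throughput alone. The sketch below assumes that finish time is the 65,664-token finish line divided by throughput, and that the latency figure is the relative increase in finish time over the baseline lane. The page does not state its formula, so both are assumptions, and the function names are illustrative. Because the published tok/s values are rounded, a recomputed percentage may differ from the reported one in the last digit.

```python
# Recompute finish time and latency overhead from the reported throughput.
# Assumption: finish = tokens / throughput, and latency overhead is the
# relative increase in finish time versus the baseline lane.

TOTAL_TOKENS = 65_664  # processed-token finish line from the race description

# Reported throughput per lane, in processed tokens per second.
lanes = {
    "Qwen3.6 35B iGPU Alone": 195.31,
    "Qwen3.6 35B + Qwen3.5 9B NPU": 189.09,
    "Qwen3.6 35B + Qwen3.5 9B iGPU": 115.60,
}


def finish_seconds(tok_per_s: float) -> float:
    """Seconds needed to process TOTAL_TOKENS at a constant throughput."""
    return TOTAL_TOKENS / tok_per_s


def latency_overhead_pct(tok_per_s: float, baseline_tok_per_s: float) -> float:
    """Percent increase in finish time relative to the baseline lane."""
    return (baseline_tok_per_s / tok_per_s - 1.0) * 100.0


baseline = lanes["Qwen3.6 35B iGPU Alone"]

for name, tps in lanes.items():
    print(
        f"{name}: {finish_seconds(tps):.0f}s finish, "
        f"{latency_overhead_pct(tps, baseline):+.2f}% latency"
    )
```

Expressing the overhead as a ratio of throughputs keeps the calculation independent of the token count, so the same helper applies to other context lengths as long as every lane runs the same workload.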