64k token race

iGPU+NPU LOAD CONCURRENCY TEST

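We run Qwen3.6 35B on a fully loaded iGPU, first alone and then alongside a concurrent Qwen3.5 9B under full load on either the NPU or the same iGPU, to measure how much the second model slows the main model.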

Processed-Token Race

Each lane runs the same full 64k main-model workload to a 65,664 processed-token finish line. The baseline is Qwen3.6 35B iGPU Alone, the NPU lane is Qwen3.6 35B + Qwen3.5 9B NPU, and the iGPU contention lane is Qwen3.6 35B + Qwen3.5 9B iGPU.

Qwen3.6 35B iGPU Alone

195.31 tok/s

336s finish | 65,664 tokens | baseline

Qwen3.6 35B + Qwen3.5 9B NPU

189.09 tok/s

347s finish | +3.29% latency | concurrent NPU load

Qwen3.6 35B + Qwen3.5 9B iGPU

115.60 tok/s

568s finish | +68.96% latency | concurrent iGPU load
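
The finish times and latency figures on the cards can be approximately rechecked from the throughput numbers alone. The sketch below assumes that "finish" is the finish-line token count divided by tok/s, and that "latency" is the increase in finish time relative to the baseline lane; the page does not state either formula, and the function names are illustrative only. Because the displayed throughputs are rounded, the last digit can differ slightly (the iGPU contention lane recomputes to about +68.95%).

```python
# Figures copied from the lane cards above; lane names are the page's own labels.
FINISH_LINE_TOKENS = 65_664

LANES = {
    "Qwen3.6 35B iGPU Alone": 195.31,
    "Qwen3.6 35B + Qwen3.5 9B NPU": 189.09,
    "Qwen3.6 35B + Qwen3.5 9B iGPU": 115.60,
}
BASELINE = "Qwen3.6 35B iGPU Alone"


def finish_seconds(tok_per_s, tokens=FINISH_LINE_TOKENS):
    """Assumed formula: time to reach the finish line at a constant rate."""
    return tokens / tok_per_s


def added_latency_pct(tok_per_s, baseline_tok_per_s):
    """Assumed formula: extra finish time versus the baseline, in percent.

    Every lane processes the same number of tokens, so the finish-time
    ratio equals the inverse of the throughput ratio.
    """
    return (baseline_tok_per_s / tok_per_s - 1.0) * 100.0


def summarize(lanes=LANES, baseline=BASELINE):
    """Return {lane name: (finish seconds, added latency percent)}."""
    base = lanes[baseline]
    return {
        name: (finish_seconds(tps), added_latency_pct(tps, base))
        for name, tps in lanes.items()
    }


if __name__ == "__main__":
    for name, (secs, pct) in summarize().items():
        print(f"{name}: {secs:.0f}s finish, {pct:+.2f}% latency")
```

Read this way, moving the 9B model to the NPU costs the main model roughly 11 seconds over the full run, while sharing the iGPU costs roughly 232 seconds.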