Prompt processing at 12.3 t/s, inference at 10.7-11.1 t/s.
Is that still on CPU or did you get it working on GPU?
I have seen a few people recommending GLM 4.5 at lower quants, primarily for more intricate writing; it might be worth the lower speed and smaller context size for shorter texts.
Thanks for testing!
That was on GPU; CPU was 5 t/s.
I’ve also tested image processing more: a 512x512 image takes about a minute, a 1400x900 takes about 7-10 minutes, and image-to-image takes about 10 minutes.
Most of the time is spent in the encoder/decoder layers for image-to-image, and decoding is what scales the worst with image size.