Michael Goin

Michael Goin's contributions

Article

How we optimized vLLM for DeepSeek-R1

Michael Goin +4

Explore inference performance improvements that help vLLM serve DeepSeek AI models more efficiently in this technical deep dive.

Article

Distributed inference with vLLM

Michael Goin

Explore how distributed inference works within vLLM in this recap of Neural Magic's vLLM Office Hours with Michael Goin and Murali Andoorveedu, a vLLM committer from CentML.

Article

SparseGPT: Remove 100 billion parameters for free

Robert Shaw +1

Compress large language models (LLMs) with SparseGPT to make your machine learning inference fast and efficient. Prune in one shot with minimal accuracy loss.