A vLLM Docker Compose recipe for running Qwen 3.6 27B on dual RTX 3090s (+OpenCode configuration)
I spent some time yesterday getting the new Qwen 3.6 27B model running locally on a (solar-powered) machine with dual RTX 3090 GPUs. With this setup I get around 50 tokens/second and can use the model's full 256k context window.
This deployment uses …