
How to Run Reliable Local LLM Agents on an RTX 3090: A Benchmark (5 Models, Priced in Watts)

I gave GLM-4.5-Air (106B, open weights) 12 coding tasks through opencode on my RTX 3090. It scored 0%: it never edited a single file. Same model, same GPU, same tasks, but driven by a ~150-line LangGraph agent instead: 93%. The model was never the problem. The orchestrator was. Here's the benchmark, including the part nobody else measures: the electricity cost per correct task.

Setup

- RTX 3090 (24 GB) + 128 GB RAM, models via ollama, Q4 quants, temperature 0.2
- 5 recent open models × 2 orchestrators (opencode vs. custom LangGraph ReAct with ollama-native tool-calling)
- 17 graded tasks (12 coding in Python/JS/C++ + 5 general-agent) with hidden unit tests
- Every run priced in GPU watts via my open-source homelab-monitor

Results

| Model | tok/s | opencode adh. | LangGraph adh. | LangGraph coding | LangGraph general |
|---|---|---|---|---|---|
| Qwen3-Coder 30B-A3B | 130 | 92% | 100% | 100% | 100% |
| GLM-4.5-Air 106B | 5.7 | 0% | 100% | 89% | 100% |
| Devstral Small 24B | 49 | 8% | 53% | 8% | 40% |
| Seed-OSS 36B | 9.5 | 0% | 7% | 0% | 20% |
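To make "electricity cost per correct task" concrete, here is a minimal sketch of how such a number could be computed from per-run power readings. It is not the homelab-monitor implementation. The `RunSample` fields, the function names, the wattage, the durations, and the price per kWh are all hypothetical values chosen for illustration. The sketch assumes that failed runs still count toward the energy bill, so a model that burns power without passing tests gets a worse score.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RunSample:
    """One graded task run (hypothetical structure, not from the post)."""
    avg_gpu_watts: float  # mean GPU power draw during the run, in watts
    duration_s: float     # wall-clock duration of the run, in seconds
    passed: bool          # True if the hidden unit tests passed


def energy_kwh(avg_watts: float, duration_s: float) -> float:
    """Convert average power (W) held for duration_s seconds into kWh."""
    return avg_watts * duration_s / 3_600_000  # 1 kWh = 3.6e6 watt-seconds


def cost_per_correct_task(runs: Iterable[RunSample],
                          price_per_kwh: float) -> Optional[float]:
    """Electricity cost of ALL runs divided by the number of passing runs.

    Returns None when no run passed, since the cost per correct task is
    undefined in that case.
    """
    runs = list(runs)
    total_kwh = sum(energy_kwh(r.avg_gpu_watts, r.duration_s) for r in runs)
    correct = sum(1 for r in runs if r.passed)
    if correct == 0:
        return None
    return total_kwh * price_per_kwh / correct


# Illustrative numbers only: three 2-minute runs at 300 W, two of them passing,
# with an assumed electricity price of 0.30 per kWh.
example_runs = [
    RunSample(avg_gpu_watts=300, duration_s=120, passed=True),
    RunSample(avg_gpu_watts=300, duration_s=120, passed=False),
    RunSample(avg_gpu_watts=300, duration_s=120, passed=True),
]
example_cost = cost_per_correct_task(example_runs, price_per_kwh=0.30)
```

Dividing by correct tasks rather than by all tasks is what lets a slow but reliable model be compared fairly against a fast model that rarely finishes the job.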

DEV Community

3h ago




