6 uncensored AI models + tools: one codes 2,593 lines, another runs 70B on a 4GB GPU, all local, free
Local AI (runs on your computer, not someone's server) with the "I can't help with that" stripped out. Free, offline, yours.
These are abliterated models. That just means the part that makes an AI refuse stuff has been surgically removed. They run locally through Ollama (a free app that runs AI models on your own PC: one install, copy-paste a command, done) or Hugging Face (the site where people share AI models for free).
No account · No cloud · Nothing logged. Here's the drop.
The Models
🧠 Huihui Qwen3.5 35B: the no-refusal workhorse
From Chinese devs huihui-ai. Built on Qwen 3.5, most refusals stripped. Handles controversial/sensitive topics other chatbots dodge.
Runs in one Ollama command:
ollama run huihui_ai/qwen3.5-abliterated:35b
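If you'd rather script it than chat in the terminal, here's a rough sketch that talks to Ollama's local HTTP API from Python. It assumes Ollama is running on its default port (11434) and that you've already pulled the model with the command above. The helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes a stock install).
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag copied from the `ollama run` command in the post.
MODEL = "huihui_ai/qwen3.5-abliterated:35b"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model=MODEL, timeout=600):
    """Send the prompt to the local Ollama server and return the reply text."""
    req = build_request(prompt, model)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the whole reply arrives in the "response" field.
    return body["response"]
```

Calling `ask("Summarize this in one line: ...")` returns the reply as a string. Nothing leaves your machine, since the request only goes to localhost.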
Built for experienced users: with the guardrails gone, outputs can get highly explicit, provocative, or unpredictable.
Gemini Heretic 40B: for coding + long writing
Minimal refusals and a 128K context window, so it can keep a huge document, a long conversation, or a complex project in memory without losing track.
Shows its own reasoning. Built for coding, long-form writing, brainstorming, research. Few-clicks local setup.
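Want a quick gut check on whether a file fits in a 128K window? Here's a back-of-the-envelope sketch. It assumes roughly 4 characters per token (a common rule of thumb for English, not this model's real tokenizer) and reads "128K" as 128,000 tokens. Both numbers and the function names are assumptions for illustration.

```python
CONTEXT_TOKENS = 128_000  # assumption: "128K" read as 128,000 tokens
CHARS_PER_TOKEN = 4       # assumption: rough average for English text


def estimate_tokens(text):
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text, reserve_for_reply=2_000):
    """True if the text plus room for a reply stays under the window."""
    return estimate_tokens(text) + reserve_for_reply <= CONTEXT_TOKENS
```

Code and non-English text usually tokenize less efficiently than plain English, so treat a "yes" near the limit with suspicion.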
⚡ Gemma 4 12B Obliterated: zero refusal, zero quality drop
The first one to hit 0 refusals with no benchmark loss, meaning they killed the "no" without making it dumber.
Lightweight 12B, runs on modest hardware.
Qwen3.5 21B Deckard: the coding monster
Arguably the strongest here. Cranked out 2,593 lines of code in one go. ChatGPT usually chokes around 1,200–1,500.
Holds structure and logic across a big codebase, not just snippets.
The Tools That Run Them
AirLLM: run giant models on a potato PC
The catch with big AI is it needs a monster GPU. AirLLM (a free code library, 20k stars / 240k downloads) reworks how the model loads so a 70B model runs on a 4GB GPU. They even run 405B Llama 3.1 on 8GB VRAM.
Works on basically any setup, from a low-end GPU down to CPU-only. Hooks straight into Hugging Face models. Beyond chat it handles OCR (reads text from images), image generators, assistants, and more.
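Why can a 70B model squeeze onto 4GB at all? The post doesn't spell out the trick, so this sketch assumes it's layer-by-layer loading: only one transformer layer's weights sit on the GPU at a time. The numbers below (16-bit weights, 80 equal-sized layers for a 70B Llama-style model) are assumptions for the arithmetic, and it ignores embeddings, activations, and the KV cache.

```python
def weights_gb(params_billion, bits_per_param=16):
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9


def per_layer_gb(params_billion, n_layers, bits_per_param=16):
    """Weight size of one layer, assuming all layers are the same size."""
    return weights_gb(params_billion, bits_per_param) / n_layers


# Assumed shape of a 70B Llama-style model: 80 layers, 16-bit weights.
full_model = weights_gb(70)        # whole model at once
one_layer = per_layer_gb(70, 80)   # what must fit on the GPU at a time

print(f"whole model: {full_model:.0f} GB, one layer: {one_layer:.2f} GB")
```

Under those assumptions the whole model is about 140 GB but a single layer is under 2 GB, which is why it can fit in 4GB. The trade-off is speed: layers have to be streamed in from disk for every token, so expect it to be slow.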
🧩 AgentMemory: give your AI a permanent memory
AI forgets everything between chats. AgentMemory is a memory layer: it stores past interactions, compresses them into structured memories, and pulls the relevant bits back when needed, so your AI remembers your project across sessions with no re-explaining.
#1 trending repo on GitHub. Plugs into Claude Code, Cursor, Codex, and any MCP tool.
Bonus: way less re-sending context = way lower token cost on long projects.
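To make the store / compress / retrieve idea concrete, here's a toy sketch. This is NOT AgentMemory's API or its actual method. `ToyMemoryStore` is a made-up class, "compression" is plain truncation, and retrieval is simple keyword overlap. A real memory layer would use something much smarter.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Memory:
    summary: str          # shortened version of the interaction
    keywords: frozenset   # words used to match later queries


@dataclass
class ToyMemoryStore:
    """Hypothetical illustration of a memory layer, not a real library."""
    memories: list = field(default_factory=list)

    @staticmethod
    def _keywords(text):
        # Lowercased words longer than 3 characters.
        words = re.findall(r"[a-z0-9]+", text.lower())
        return frozenset(w for w in words if len(w) > 3)

    def store(self, interaction, max_chars=80):
        # "Compress" by truncating; keywords come from the full text.
        summary = interaction.strip()[:max_chars]
        self.memories.append(Memory(summary, self._keywords(interaction)))

    def recall(self, query, top_k=1):
        # Rank memories by how many keywords they share with the query.
        q = self._keywords(query)
        scored = [(len(q & m.keywords), m) for m in self.memories]
        scored = [(s, m) for s, m in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m.summary for _, m in scored[:top_k]]


store = ToyMemoryStore()
store.store("The project uses PostgreSQL with a FastAPI backend.")
store.store("User prefers tabs over spaces in Python files.")
print(store.recall("which database does the project use?"))
```

Only the matching memory gets pulled back into the prompt instead of the whole chat history, which is where the token savings come from.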
Quick Picks
Do
Start with Gemma 4 12B if your PC's modest, it's the lightest one here
Use AirLLM if a model's too big for your GPU
Run everything through Ollama for the easiest setup
Don't
Don't grab the 40B models on a weak machine, they'll crawl
Don't expect a guardrail to catch you. There isn't one, that's the point
Got a rig running one of these? Drop your specs + which model below, it helps everyone pick.