[FREE] Annihilation LLM — Feed Any HuggingFace Model In, Get A Local Uncensored Version Back, No Cloud, No Account
AI’s “I can’t help with that” isn’t a filter bolted on top. It’s wired into the model’s actual math. Annihilation is a free tool that finds that wiring and rips it out. Not a jailbreak, not a prompt trick: the refusal reflex gets surgically removed.
Any open-source model (Qwen, Llama, Mistral, SmolLM…) · runs on CPU, no GPU needed · one command · author was hesitant to drop it, did anyway.
github.com/tjcrims0nx/annihilation-llm
🧠 the actual trick — why this is different from jailbreaking
Jailbreaking = sweet-talking the bouncer with clever prompts. Sometimes works, breaks on updates, annoying.
This = removing the bouncer from the AI’s DNA so it never existed in the first place.
Here’s how AI refusal actually works: models are giant blobs of numbers (weights = the AI’s brain). Researchers found a specific direction in the model’s internal activations (the intermediate numbers those weights produce) that acts like a hidden dial running from “harmless” to “harmful.” When a prompt nudges that dial past a threshold, the model refuses.
Directional ablation (the technique here, also called abliteration) = estimate that direction by comparing activations on harmful vs harmless prompts → project it out of the model’s weights → the dial mostly stops firing. The result: the model quits throwing explicit “I can’t help with that” refusals.
Honest note: refusal is diffuse — smeared across the model, not one clean switch. So this kills the flat-out refusals, but it’s not a magic 100% “answers literally anything” button. A well-abliterated model won’t explicitly refuse; edge cases can still slip through. That’s just how the technique works, no tool dodges it.
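To make the "find the direction, then subtract it" idea concrete, here is a toy sketch in plain Python. It is not the tool's code: the activations are random synthetic vectors, the helper names (`refusal_direction`, `ablate`) are made up for illustration, and using a simple difference of means to pick the direction is an assumption about the general technique, not a claim about what this repo does internally.

```python
import math
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    """Scale v to length 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def refusal_direction(acts_a, acts_b):
    """Hypothetical helper: unit vector along the difference of the two group means."""
    mu_a, mu_b = mean_vector(acts_a), mean_vector(acts_b)
    return unit([a - b for a, b in zip(mu_a, mu_b)])

def ablate(weight, r):
    """Hypothetical helper: return (I - r r^T) W, removing the r-component from W's output."""
    d_out, d_in = len(weight), len(weight[0])
    # Coefficient of r in each column of W, i.e. the row vector r^T W.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)] for i in range(d_out)]

def matvec(weight, x):
    return [dot(row, x) for row in weight]

random.seed(0)
d = 6
# Synthetic stand-ins for hidden activations on two prompt sets.
# Group A is shifted along axis 0, so that axis plays the role of the "dial".
acts_a = [[random.gauss(0, 1) + (3.0 if k == 0 else 0.0) for k in range(d)] for _ in range(200)]
acts_b = [[random.gauss(0, 1) for _ in range(d)] for _ in range(200)]
r = refusal_direction(acts_a, acts_b)

W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # toy weight matrix
W_ablated = ablate(W, r)

x = [random.gauss(0, 1) for _ in range(d)]
before = dot(r, matvec(W, x))
after = dot(r, matvec(W_ablated, x))
print(f"component along r before: {before:.4f}, after: {after:.2e}")
```

After ablation, the output of this one matrix has essentially zero component along `r`, whatever the input. A real model has many matrices and no guarantee that one direction captures everything, which is the "refusal is diffuse" caveat in action.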
It does this with LoRA adapters (lightweight clip-on layers — like prescription lenses you clip onto the model that change how it “sees” requests) targeting the specific attention layers responsible for refusal. Then it optimizes those adapters automatically using real harmful vs harmless prompts to measure how well the refusal direction got killed.
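Why LoRA fits this job: removing one direction from a weight matrix is a rank-1 change, and a LoRA adapter is exactly a low-rank add-on of the form W + scale · B·A. The toy below shows that equivalence with tiny hand-picked numbers. The function names and the `scale` knob are illustrative assumptions, not the tool's API, and the repo's real adapters may be parameterized differently.

```python
def lora_factors_for_ablation(weight, r):
    """Hypothetical helper: build B (d_out x 1) and A (1 x d_in) with W + B A == (I - r r^T) W."""
    d_out, d_in = len(weight), len(weight[0])
    A = [[sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]]  # r^T W
    B = [[-r[i]] for i in range(d_out)]                                          # -r
    return B, A

def apply_lora(weight, B, A, scale=1.0):
    """Merge a low-rank update into the base weights: W + scale * (B @ A)."""
    rank = len(A)
    return [
        [weight[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(len(weight[0]))]
        for i in range(len(weight))
    ]

r = [0.6, 0.8, 0.0]                       # unit-length direction (0.36 + 0.64 = 1)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy 3x2 weight matrix

B, A = lora_factors_for_ablation(W, r)
merged = apply_lora(W, B, A, scale=1.0)   # full ablation
half = apply_lora(W, B, A, scale=0.5)     # partial ablation

def along_r(weight):
    """Component of each output column along r, i.e. r^T W."""
    return [sum(r[i] * weight[i][j] for i in range(len(r))) for j in range(len(weight[0]))]

print("base  :", along_r(W))
print("half  :", along_r(half))
print("merged:", along_r(merged))
```

With `scale=1.0` the component along `r` drops to zero, and with `scale=0.5` it is halved. A strength knob like this is the kind of thing an automatic optimizer could tune per layer, though which parameters the tool actually searches over is not stated here.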
Not a hack. Not a bypass. A mathematical surgery on the model’s brain. The author built this himself.
📦 install + quick start (2 min)
# Option 1 — global install (simple)
pip install -U annihilate-llm
# Option 2 — isolated environment (recommended so it doesn't mess with your Python)
python -m venv annihilation-env
# Activate it:
.\annihilation-env\Scripts\activate # Windows
source annihilation-env/bin/activate # Mac/Linux
pip install annihilate-llm
Then just run it on any model:
# Grab a model name from HuggingFace (the free AI model library) and feed it in
annihilate Qwen/Qwen3-4B-Instruct-2507
No GPU? It’ll say “no accelerator detected, operations will be slow” and still run on your CPU. Smaller models (1-3B) are totally fine on a regular laptop.
⚙️ tuning options (verified repo)
See every option:
annihilate --help
Or tune via a config file — rename config.default.toml to config.toml and edit these keys:
| Key | Default | What it does |
|---|---|---|
| n_trials | 200 | How many optimization rounds: more = better tuned, slower |
| quantization | none | Set to bnb_4bit to shrink RAM use for big models |
| row_normalization | full | Weight normalization strategy |
| orthogonalize_direction | true | Direction adjustment method |
Low on RAM? quantization = bnb_4bit is the one that lets bigger models fit. On a small machine, stick to 1–4B models.
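For a rough feel of why 4-bit matters, here is some back-of-envelope arithmetic: weights-only memory is roughly parameter count × bytes per parameter. This is a sketch, not a measurement. It ignores activations, optimizer state, and the small overhead that real 4-bit formats add, and the function name is made up for illustration.

```python
# Approximate storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "4bit": 0.5}

def weight_memory_gib(n_params_billion, precision):
    """Weights-only estimate in GiB; real usage will be higher."""
    return n_params_billion * 1e9 * BYTES_PER_PARAM[precision] / 2**30

for size in (1, 4, 8, 14):
    row = ", ".join(
        f"{p}: {weight_memory_gib(size, p):5.1f} GiB" for p in ("fp16", "4bit")
    )
    print(f"{size:>2}B params -> {row}")
```

By this estimate a 4B model needs about 7.5 GiB for weights at 16-bit and under 2 GiB at 4-bit, which is why the small-machine advice tops out around 4B.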
GPU not detected? The repo has a dedicated “GPU Setup” section (Windows + Ubuntu) with the exact torch reinstall command for your CUDA version — grab it straight from there so you don’t install the wrong build.
✅ what you actually get + who's this for
The tool spits out a modified version of the model — the rewired brain — that you keep locally. Run it through Ollama, LM Studio, or any local AI interface. It’s your AI, with the refusal reflex stripped out, running on your hardware, offline.
Works with any model you can download from HuggingFace (the free library where you get open-source AI models). Popular targets:
- Qwen3 (4B, 8B, 14B)
- Llama 3 / 3.1
- Mistral
- SmolLM
- Phi-3
Who this slaps for: researchers stress-testing AI safety, devs who want an unrestricted local model for their apps, privacy people who want their own AI that never phones home, and anyone who’s sick of their locally-running model refusing to answer things that are none of its business.
Free, open-source, GNU AGPL v3. Educational and research use. You’re responsible for what you do with it — the disclaimer in the repo covers this in full.
simple-pimple: pip install annihilate-llm → annihilate <model-name> → refusal reflex surgically stripped → locally yours forever.
Author: tjcrims0nx — was hesitant to drop this. Respect. Go leave a star.


