Experiments With Abliterating Ai Models
Abliteration is a technique to remove censorship and safety guardrails from Large Language Models (LLMs). I wanted to abliterate an existing consumer model in order to reduce the refusal rate of my requests for it to perform offensive security tasks.
These are the commands I used to prepare the docker container environment and run Heretic which is a software package that performs abliteration.
# Start the docker container we'll work in
sudo docker run --gpus all -it --rm -v $(pwd)/models:/models -v ollama_data:/ollama_data pytorch/pytorch:2.11.0-cuda12.8-cudnn9-devel /bin/bash
# Check the GPU is accessible from within the container - should return True
python -c "import torch; print(torch.cuda.is_available())"
# Download and install Heretic
apt update -y && apt install git
git clone https://github.com/p-e-w/heretic
cd heretic/
python3 -m pip install --break-system-packages .
cd /models/
# Run Heretic against the LLM you wish to abliterate (Qwen3.6-27B in this case)
heretic Qwen/Qwen3.6-27B
Using Heretic to try abliterating a fairly performant consumer model (Qwen3.6-27B)
I let it run for a week and noted during this time GPU VRAM utilisation would sit at about 75%. I discontinued the abliteration process after a week so I could use my GPU for inference. However it was a fun experiment to try abliterating my own model!
© 2026-05-13 Kuso Technology