Experiments With Abliterating Ai Models

Abliteration is a technique to remove censorship and safety guardrails from Large Language Models (LLMs). I wanted to abliterate an existing consumer model in order to reduce the refusal rate of my requests for it to perform offensive security tasks.

These are the commands I used to prepare the docker container environment and run Heretic which is a software package that performs abliteration.

# Start the docker container we'll work in
sudo docker run --gpus all -it --rm   -v $(pwd)/models:/models -v ollama_data:/ollama_data   pytorch/pytorch:2.11.0-cuda12.8-cudnn9-devel /bin/bash
# Check the GPU is accessible from within the container - should return True
python -c "import torch; print(torch.cuda.is_available())"
# Download and install Heretic
apt update -y && apt install git
git clone https://github.com/p-e-w/heretic
cd heretic/
python3 -m pip install --break-system-packages .
cd /models/
# Run Heretic against the LLM you wish to abliterate (Qwen3.6-27B in this case)
heretic Qwen/Qwen3.6-27B

Screenshot of Heretic running Using Heretic to try abliterating a fairly performant consumer model (Qwen3.6-27B)

I let it run for a week and noted during this time GPU VRAM utilisation would sit at about 75%. I discontinued the abliteration process after a week so I could use my GPU for inference. However it was a fun experiment to try abliterating my own model!