How to change the data type of primary keys in Django
Problem: development started on a fresh project and, although there aren’t “real” users yet, some data is starting to be written to the DB by other developers. One day you realize primary keys in your models are still using integers instead of strings. You want to change that.
Obvious solution: change:
id = models.BigIntegerField(primary_key=True, default=utils.generate_unique_id, editable=False)
To:
id = models.CharField(primary_key=True, default=utils.generate_unique_id, editable=False, max_length=100)
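The post never shows utils.generate_unique_id, but once the field is a CharField its default has to return a string of at most max_length characters. A minimal sketch of what such a helper could look like, assuming a random UUID is an acceptable ID (the body below is hypothetical, not the original implementation):

```python
import uuid


def generate_unique_id() -> str:
    """Hypothetical stand-in for utils.generate_unique_id.

    Returns a random hex string that can be used as a CharField primary key.
    """
    # uuid4().hex is always 32 characters, well within max_length=100.
    return uuid.uuid4().hex
```

Pass the function itself (no parentheses) as default=, as in the field definition above, so that Django calls it once per new row.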
Build and run migrations.
django.db.utils.ProgrammingError: foreign key constraint "tasks_task_project_id_a2815f0c_fk_projects_project_id" cannot be implemented
DETAIL: Key columns "project_id" and "id" are of incompatible types: bigint and character varying.
Damn it…
Another obvious solution: destroy and rebuild current schema (a simple PostgreSQL backup/restore will not work).
Yeah…No. So how about we use…
Fixtures!
./manage.py dumpdata --natural-foreign --natural-primary -e contenttypes --indent=4 > dump.json
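--natural-foreign and --natural-primary tell Django to serialize natural keys instead of raw primary/foreign key values for the models that define them (content types and auth permissions, for example), so those keys are regenerated when the data is loaded back. This avoids problems such as: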
django.db.utils.IntegrityError: Problem installing fixture '/app/dump.json': Could not load contenttypes.ContentType(pk=23): duplicate key value violates unique constraint "django_content_type_app_label_model_76bd3d3b_uniq"
DETAIL: Key (app_label, model)=(tasks, historicaltask) already exists.
Now we can wipe out the DB using a PostgreSQL client (e.g. psql):
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO public;
GRANT ALL ON SCHEMA public TO postgres;
Wipe out the old migrations (make sure you don’t delete any hand-written or data migrations!):
find . -maxdepth 6 -path "./*/migrations/*.py" -not -name "__init__.py" -delete
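Since that command deletes without asking, it can be worth listing the candidates first. The sketch below is one possible dry run: it collects roughly the same files and sets aside any that mention RunPython or RunSQL, which usually means a hand-written data migration. The function names and the marker list are assumptions made for illustration, and unlike the find command it applies no depth limit.

```python
from pathlib import Path

# Operations that usually indicate a hand-written data migration (assumed list).
DATA_MIGRATION_MARKERS = ("RunPython", "RunSQL")


def find_migration_files(root="."):
    """Return every .py file sitting directly in a migrations/ directory."""
    return sorted(
        path
        for path in Path(root).rglob("*.py")
        if path.parent.name == "migrations" and path.name != "__init__.py"
    )


def split_migrations(root="."):
    """Split migration files into (auto_generated, hand_written) lists."""
    auto_generated, hand_written = [], []
    for path in find_migration_files(root):
        text = path.read_text(encoding="utf-8")
        if any(marker in text for marker in DATA_MIGRATION_MARKERS):
            hand_written.append(path)
        else:
            auto_generated.append(path)
    return auto_generated, hand_written
```

Calling split_migrations(".") from the project root and printing the second list shows which files deserve a look before anything is deleted. Nothing here removes files.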
Rebuild and run new migrations:
./manage.py makemigrations && ./manage.py migrate
And finally load the fixtures back in:
./manage.py loaddata dump.json
(Praying should not be necessary but it’s highly suggested)
A few notes:
- Any reference to a previously stored primary key will be lost (e.g. a client’s cache).
- Make sure the data can be loaded into the DB before running the migrations.
- This is obviously not suggested for mature projects.
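One cheap sanity check before wiping anything is to inspect dump.json itself. The sketch below assumes the standard JSON fixture layout produced by dumpdata (a list of objects with "model", an optional "pk", and "fields"). It counts records per model and flags primary keys that would not fit in the new max_length=100 column. The function name is made up for this example, and the check runs over every model in the dump, not only the ones whose key type changed.

```python
import json
from collections import Counter


def summarize_fixture(path, max_length=100):
    """Return (records per model, primary keys too long for the new field)."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)

    counts = Counter(record["model"] for record in records)
    too_long = [
        (record["model"], record["pk"])
        for record in records
        # "pk" is absent for records serialized with a natural primary key.
        if "pk" in record and len(str(record["pk"])) > max_length
    ]
    return counts, too_long
```

Running summarize_fixture("dump.json") and comparing the counts with what you expect from the live database is a quick way to notice a truncated or partial dump. It does not replace actually loading the fixture into a scratch database.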