Queryable Encryption
Added in version 6.0.1.
Queryable Encryption is a MongoDB feature that lets you encrypt sensitive fields in your database while still being able to query them.
This document explains how to configure Queryable Encryption in your Django project.
MongoDB requirements
Queryable Encryption can be used with MongoDB replica sets or sharded clusters running version 7.0 or later. Standalone instances are not supported. The Queryable Encryption Compatibility table summarizes which MongoDB server products support Queryable Encryption.
Installation
In addition to Django MongoDB Backend’s regular installation and configuration steps, Queryable Encryption requires PyMongo 4.16.0 or later, as well as some additional Python dependencies that can be installed like so:
$ pip install 'django-mongodb-backend[encryption]'
You’ll also have to download the Automatic Encryption Shared Library. You can choose the latest version,
even if it doesn’t match your MongoDB server version. Extract the shared
library from the archive. You’ll use the path to it in the next step (see
crypt_shared_lib_path).
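The shared library's file name differs by platform. As a minimal sketch, assuming you extracted the archive to a directory of your choosing, the path could be built as follows. The helper name crypt_shared_lib_file and its lib_dir argument are hypothetical and not part of Django MongoDB Backend:

```python
import sys
from pathlib import Path

# File names of the Automatic Encryption Shared Library on each platform.
_CRYPT_SHARED_NAMES = {
    "linux": "mongo_crypt_v1.so",
    "darwin": "mongo_crypt_v1.dylib",
    "win32": "mongo_crypt_v1.dll",
}


def crypt_shared_lib_file(lib_dir, platform=sys.platform):
    """Return the expected library path inside lib_dir (hypothetical helper)."""
    for prefix, name in _CRYPT_SHARED_NAMES.items():
        if platform.startswith(prefix):
            return Path(lib_dir) / name
    raise RuntimeError(f"Unsupported platform: {platform}")
```

The result can be converted with str() and passed as crypt_shared_lib_path.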
Configuring the DATABASES setting
In addition to the database settings
required to use Django MongoDB Backend, Queryable Encryption typically requires
configuring a separate database connection that uses PyMongo’s
AutoEncryptionOpts.
Here’s a sample configuration using a local KMS provider:
from pymongo.encryption_options import AutoEncryptionOpts

DATABASES = {
    "default": {
        "ENGINE": "django_mongodb_backend",
        "HOST": "mongodb+srv://cluster0.example.mongodb.net",
        "NAME": "my_db",
        # ...
    },
    "encrypted": {
        "ENGINE": "django_mongodb_backend",
        "HOST": "mongodb+srv://cluster0.example.mongodb.net",
        "NAME": "encrypted_db",
        # ...
        "OPTIONS": {
            "auto_encryption_opts": AutoEncryptionOpts(
                key_vault_namespace="encrypted_db.__keyVault",
                kms_providers={
                    "local": {
                        # Generated by os.urandom(96)
                        "key": (
                            b'-\xc3\x0c\xe3\x93\xc3\x8b\xc0\xf8\x12\xc5#b'
                            b'\x19\xf3\xbc\xccR\xc8\xedI\xda\\ \xfb\x9cB'
                            b'\x7f\xab5\xe7\xb5\xc9x\xb8\xd4d\xba\xdc\x9c'
                            b'\x9a\xdb9J]\xe6\xce\x104p\x079q.=\xeb\x9dK*'
                            b'\x97\xea\xf8\x1e\xc3\xd49K\x18\x81\xc3\x1a"'
                            b'\xdc\x00U\xc4u"X\xe7xy\xa5\xb2\x0e\xbc\xd6+-'
                            b'\x80\x03\xef\xc2\xc4\x9bU'
                        )
                    },
                },
                # Exact file name depends on your operating system.
                crypt_shared_lib_path="/path/to/mongo_crypt_v1.so",
                crypt_shared_lib_required=True,
            )
        },
    },
}
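The hardcoded key in the sample configuration is only a placeholder. The following sketch shows one way to generate a 96-byte local master key once and then load it from the environment instead of committing it to source control. The function names and the LOCAL_MASTER_KEY variable name are assumptions made for illustration:

```python
import base64
import os


def generate_local_master_key():
    """Return a new random 96-byte key, the size the local KMS provider expects."""
    return os.urandom(96)


def encode_key(key):
    """Encode the key as ASCII text suitable for an environment variable."""
    return base64.b64encode(key).decode("ascii")


def load_local_master_key(environ=os.environ, name="LOCAL_MASTER_KEY"):
    """Decode the key from the environment; fail early if it is the wrong size."""
    key = base64.b64decode(environ[name])
    if len(key) != 96:
        raise ValueError(f"{name} must decode to 96 bytes, got {len(key)}")
    return key
```

Generate the key a single time and store it safely: data encryption keys wrapped with one master key cannot be unwrapped with a different one.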
key_vault_namespace specifies where to store the data encryption keys.
The database name of the key vault must be the same as in "NAME". The
vault’s collection name can be whatever you wish, but by convention, it’s often
__keyVault.
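Because the key vault's database name has to match "NAME", a small check in your settings can catch a mismatch early. This is a sketch only; check_key_vault_namespace is a hypothetical helper, not something the backend provides:

```python
def check_key_vault_namespace(key_vault_namespace, database_name):
    """Return (database, collection) or raise ValueError on a mismatch."""
    vault_db, _, vault_collection = key_vault_namespace.partition(".")
    if not vault_db or not vault_collection:
        raise ValueError(
            "key_vault_namespace must look like 'database.collection', "
            f"got {key_vault_namespace!r}"
        )
    if vault_db != database_name:
        raise ValueError(
            f"key vault database {vault_db!r} does not match NAME {database_name!r}"
        )
    return vault_db, vault_collection
```

With the sample configuration, check_key_vault_namespace("encrypted_db.__keyVault", "encrypted_db") passes, while a namespace in another database raises ValueError.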
Why is a second connection recommended?
Connections that use AutoEncryptionOpts have some query
limitations, regardless of whether or
not a model is encrypted. Creating two separate entries in
DATABASES avoids imposing these query limitations on your
non-encrypted models.
Configuring the DATABASE_ROUTERS setting
Similar to configuring the DATABASE_ROUTERS setting for
embedded models, Queryable Encryption requires
a DATABASE_ROUTERS setting to route database operations to the
encrypted database.
The following example shows how to configure a router for the "myapp"
application that routes database operations to the encrypted database for all
models in that application:
# myapp/routers.py
class EncryptedRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Create myapp's models only in the encrypted database.
        if app_label == "myapp":
            return db == "encrypted"
        # Don't create collections for other apps in the encrypted db.
        if db == "encrypted":
            return False
        return None

    def db_for_read(self, model, **hints):
        # All reads and writes for myapp's models go to the encrypted db.
        if model._meta.app_label == "myapp":
            return "encrypted"
        return None

    db_for_write = db_for_read
Then in your Django settings, add the custom database router to the
DATABASE_ROUTERS setting:
# settings.py
DATABASE_ROUTERS = [
    "django_mongodb_backend.routers.MongoRouter",
    "myapp.routers.EncryptedRouter",
]
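The router's decisions depend only on the app label and the database alias, so they can be checked without a database. The sketch below repeats the router from this section and exercises it with a stand-in object; fake_model is a hypothetical test helper that exposes only the attribute the router reads:

```python
from types import SimpleNamespace


class EncryptedRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == "myapp":
            return db == "encrypted"
        if db == "encrypted":
            return False
        return None

    def db_for_read(self, model, **hints):
        if model._meta.app_label == "myapp":
            return "encrypted"
        return None

    db_for_write = db_for_read


def fake_model(app_label):
    """Stand-in for a model class, carrying only _meta.app_label."""
    return SimpleNamespace(_meta=SimpleNamespace(app_label=app_label))


router = EncryptedRouter()
# myapp's models are read from, written to, and migrated in "encrypted" only.
assert router.db_for_read(fake_model("myapp")) == "encrypted"
assert router.db_for_write(fake_model("myapp")) == "encrypted"
assert router.allow_migrate("encrypted", "myapp") is True
assert router.allow_migrate("default", "myapp") is False
# Other apps are kept out of "encrypted"; elsewhere the router has no opinion.
assert router.allow_migrate("encrypted", "auth") is False
assert router.allow_migrate("default", "auth") is None
assert router.db_for_read(fake_model("auth")) is None
```

Returning None lets Django consult the next router in DATABASE_ROUTERS or fall back to its default behavior.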
Encrypted fields
Now you can start using encrypted fields in your Django models.
Encrypted fields may be used to protect sensitive data like social security numbers, credit card information, or personal health information.
Import the encrypted fields from django_mongodb_backend.fields, and use
them to define your models as usual.
Here are models based on the Python Queryable Encryption Tutorial:
# myapp/models.py
from django.db import models
from django_mongodb_backend.models import EmbeddedModel
from django_mongodb_backend.fields import (
    EmbeddedModelField,
    EncryptedCharField,
    EncryptedEmbeddedModelField,
)


class Patient(models.Model):
    name = models.CharField(max_length=255)
    patient_id = models.BigIntegerField()
    patient_record = EmbeddedModelField("PatientRecord")

    def __str__(self):
        return f"{self.name} ({self.patient_id})"


class PatientRecord(EmbeddedModel):
    ssn = EncryptedCharField(max_length=11, queries={"queryType": "equality"})
    billing = EncryptedEmbeddedModelField("Billing")
    bill_amount = models.DecimalField(max_digits=10, decimal_places=2)


class Billing(EmbeddedModel):
    cc_type = models.CharField(max_length=50)
    cc_number = models.CharField(max_length=20)
Migrations
Once you have defined your models, create a migration as usual:
$ python manage.py makemigrations
Then run the migrations on the encrypted database:
$ python manage.py migrate --database encrypted
Warning
Be aware that MongoDB does not allow adding encrypted fields to existing collections, nor can you change the definition of an encrypted field, for example, to make it queryable. If you wish to add or change an encrypted field, you must create a new collection.
Creating encrypted data
You can now create and manipulate model instances just as you would with any other Django model. The data is automatically encrypted and decrypted, ensuring that sensitive data is stored securely in the database.
Here’s an example of creating a new Patient instance:
>>> from myapp.models import Patient, PatientRecord, Billing
>>> billing = Billing(cc_type="Visa", cc_number="4111111111111111")
>>> record = PatientRecord(ssn="123-45-6789", billing=billing, bill_amount=250.75)
>>> patient = Patient(name="John Doe", patient_id=1001, patient_record=record)
>>> patient.save()
Querying encrypted fields
To query an encrypted field, you must define the field with the queries argument. For example, notice PatientRecord's
ssn field:
class PatientRecord(EmbeddedModel):
    ssn = EncryptedCharField(max_length=11, queries={"queryType": "equality"})
You can perform an equality query just like you would on a non-encrypted field:
>>> patient = Patient.objects.get(patient_record__ssn="123-45-6789")
>>> patient.name
'John Doe'
See EncryptedField.queries for more details.
Configuring a Key Management Service (KMS)
To use Queryable Encryption, you must configure a Key Management Service (KMS) to store and manage the encryption keys used to encrypt and decrypt data.
A local KMS provider with a hardcoded key is suitable for development and testing, but in a production environment, you should securely store and manage encryption keys using a KMS Provider.
There are two primary configuration points:
- The kms_providers parameter of AutoEncryptionOpts. See the kms_providers parameter in AutoEncryptionOpts for the available providers (aws, azure, gcp, etc.) as well as the provider options.
- The KMS_CREDENTIALS inner option of DATABASES. The keys for each provider are documented under the master_key parameter of create_data_key().
Here’s an example of KMS configuration with aws:
from pymongo.encryption_options import AutoEncryptionOpts

DATABASES = {
    "encrypted": {
        # ...
        "OPTIONS": {
            "auto_encryption_opts": AutoEncryptionOpts(
                # ...
                kms_providers={
                    "aws": {
                        "accessKeyId": "your-access-key-id",
                        "secretAccessKey": "your-secret-access-key",
                    },
                },
            ),
        },
        "KMS_CREDENTIALS": {
            "aws": {
                "key": "...",  # Amazon Resource Name
                "region": "...",  # AWS region
            },
        },
    },
}
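In practice you would probably not hardcode the AWS credentials in your settings file. One possible approach, sketched here, reads them from environment variables. The helper name aws_kms_provider is hypothetical, and the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are an assumption based on the conventional AWS names:

```python
import os


def aws_kms_provider(environ=os.environ):
    """Build the "aws" entry for kms_providers from environment variables.

    Raises KeyError if either variable is unset, so a misconfigured
    deployment fails at startup rather than on the first encrypted query.
    """
    return {
        "aws": {
            "accessKeyId": environ["AWS_ACCESS_KEY_ID"],
            "secretAccessKey": environ["AWS_SECRET_ACCESS_KEY"],
        }
    }
```

The returned dictionary can be passed as kms_providers=aws_kms_provider() when constructing AutoEncryptionOpts.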
Configuring the encrypted_fields_map option
Encryption keys are created when you run migrations for models that have encrypted fields.
To see the encrypted fields map for your models (which includes the encryption
key IDs), run the showencryptedfieldsmap command:
$ python manage.py showencryptedfieldsmap --database encrypted
Didn’t work?
If you get the error Unknown command: 'showencryptedfieldsmap', ensure
"django_mongodb_backend" is in your INSTALLED_APPS setting.
It’s recommended to include this map in your production settings to protect against a malicious server advertising a false encrypted fields map:
from bson import json_util
from pymongo.encryption_options import AutoEncryptionOpts

DATABASES = {
    "encrypted": {
        # ...
        "OPTIONS": {
            "auto_encryption_opts": AutoEncryptionOpts(
                # ...
                encrypted_fields_map=json_util.loads(
                    """{
                        "encrypt_patient": {
                            "fields": [
                                {
                                    "bsonType": "string",
                                    "path": "patient_record.ssn",
                                    "keyId": {
                                        "$binary": {
                                            "base64": "2MA29LaARIOqymYHGmi2mQ==",
                                            "subType": "04"
                                        }
                                    },
                                    "queries": {
                                        "queryType": "equality"
                                    }
                                }
                            ]
                        }
                    }"""
                ),
            ),
        },
    },
}
Do not include this in your development and testing settings since the data encryption keys will be different from those in your production database.
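Since the map is pasted into settings as a JSON string, a malformed copy (for example, a stray trailing comma) only surfaces when the settings are loaded. The sketch below is a structural sanity check using only the standard library. It does not replace json_util.loads, which is still what converts the $binary key IDs. The helper name and the set of keys it treats as required are assumptions based on the sample output above:

```python
import json

# Keys this sketch expects on every field entry (assumed from the sample map).
REQUIRED_FIELD_KEYS = {"bsonType", "path", "keyId"}


def check_encrypted_fields_map(text):
    """Parse the map and return the encrypted field paths per collection."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    paths = {}
    for collection, spec in data.items():
        for field in spec["fields"]:
            missing = REQUIRED_FIELD_KEYS - field.keys()
            if missing:
                raise ValueError(
                    f"{collection}: field is missing {sorted(missing)}"
                )
        paths[collection] = [field["path"] for field in spec["fields"]]
    return paths
```

Running this against the command's output before deploying gives a quick list of which paths are encrypted in each collection.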
Typical deployment workflow
A typical development and deployment workflow for a project that uses Queryable Encryption could be something like this:
1. Develop and test your project locally.
2. Deploy your project and run migrate to create the encrypted collections and data encryption keys.
3. Run showencryptedfieldsmap in production and use the output to set encrypted_fields_map in your production settings.