Tagged JSON for Secure Data

The TaggedJSONSerializer system provides a compact, lossless way to represent Python types that are not natively supported by JSON. This is primarily used by Flask's session management to store complex data structures in signed cookies.

The Tagging Mechanism

The core of this system is the TaggedJSONSerializer class in src/flask/json/tag.py. It works by wrapping non-standard types in a single-key dictionary where the key identifies the type (the "tag") and the value is a JSON-serializable representation.

For example, a Python tuple is serialized as:

{" t": [1, 2, 3]}

When the serializer encounters a value, it walks its registered tags in order and applies the first one whose check method accepts the value. A value that no tag accepts is passed through unchanged.
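As a quick check of the format shown above, the following sketch round-trips a tuple through TaggedJSONSerializer outside of any application context. The variable names are illustrative only.

```python
from flask.json.tag import TaggedJSONSerializer

serializer = TaggedJSONSerializer()

# dumps() tags the value, then writes compact JSON (no spaces after separators).
wire = serializer.dumps((1, 2, 3))
print(wire)  # {" t":[1,2,3]}

# loads() parses the JSON and converts tagged objects back to Python types.
restored = serializer.loads(wire)
print(restored)  # (1, 2, 3)
```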

Base Tag Structure

All tags inherit from the JSONTag base class. A tag implementation must define:

  • key: The string used to identify the tag in JSON (e.g., " u" for UUIDs).
  • check(value): Returns True if the value should be handled by this tag.
  • to_json(value): Converts the Python object to a JSON-compatible type.
  • to_python(value): Converts the JSON-compatible type back to the original Python object.
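To make the four members concrete, here is a standard-library-only sketch of the same idea. It is not Flask's implementation: MiniTag, MiniTagBytes, mini_tag, and mini_untag are hypothetical names invented for this illustration, and the real classes handle nesting and more types.

```python
import json
from base64 import b64decode, b64encode


class MiniTag:
    """Hypothetical stand-in for JSONTag: one key plus three hooks."""

    key = ""

    def check(self, value):
        raise NotImplementedError

    def to_json(self, value):
        raise NotImplementedError

    def to_python(self, value):
        raise NotImplementedError


class MiniTagBytes(MiniTag):
    key = " b"

    def check(self, value):
        return isinstance(value, bytes)

    def to_json(self, value):
        # Base64 yields ASCII text, which JSON can carry as a string.
        return b64encode(value).decode("ascii")

    def to_python(self, value):
        return b64decode(value)


def mini_tag(value, tags):
    """Wrap value using the first tag whose check() accepts it."""
    for tag in tags:
        if tag.check(value):
            return {tag.key: tag.to_json(value)}
    return value  # no tag matched, so the value is passed through


def mini_untag(value, tags):
    """Reverse mini_tag for a single-key dict with a known tag key."""
    if isinstance(value, dict) and len(value) == 1:
        key = next(iter(value))
        for tag in tags:
            if tag.key == key:
                return tag.to_python(value[key])
    return value


tags = [MiniTagBytes()]
wire = json.dumps(mini_tag(b"hi", tags))
print(wire)  # {" b": "aGk="}
restored = mini_untag(json.loads(wire), tags)
print(restored)  # b'hi'
```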

Supported Types

The TaggedJSONSerializer includes several default tags in its default_tags list:

Type | Tag Key | Implementation | Description
dict | " di" | TagDict | Handles 1-item dicts that might conflict with tags.
tuple | " t" | TagTuple | Converts tuples to lists for JSON storage.
bytes | " b" | TagBytes | Base64 encodes byte strings.
Markup | " m" | TagMarkup | Serializes objects with an __html__ method.
UUID | " u" | TagUUID | Stores UUIDs as hex strings.
datetime | " d" | TagDateTime | Uses Werkzeug's http_date format.
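The sketch below round-trips one value of each tagged type. It assumes a recent Flask and Werkzeug release, where parsed dates come back as timezone-aware UTC datetimes. The sample values are arbitrary.

```python
from datetime import datetime, timezone
from uuid import UUID

from flask.json.tag import TaggedJSONSerializer
from markupsafe import Markup

serializer = TaggedJSONSerializer()

data = {
    "pair": (1, 2),
    "raw": b"\x00\x01",
    "html": Markup("<b>hi</b>"),
    "id": UUID(int=1),
    # Whole seconds in UTC: the HTTP date format carries no microseconds.
    "when": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
}

restored = serializer.loads(serializer.dumps(data))

# Each value should come back with its original type, not a JSON stand-in.
for name, value in restored.items():
    print(name, type(value).__name__)
```

A datetime with a non-zero microsecond component would not compare equal after the round trip, because the HTTP date format has one-second resolution.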

Structural Tags

Some tags do not add a key but instead facilitate recursive processing:

  • PassList: Recursively tags every item within a standard Python list.
  • PassDict: Recursively tags every value within a standard Python dict. Note that JSON keys must be strings, so PassDict does not attempt to tag dictionary keys.
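The recursion can be observed by calling the serializer's tag method directly, which returns the tagged structure before it is dumped to a string. This is a small sketch with arbitrary sample data.

```python
from flask.json.tag import TaggedJSONSerializer

serializer = TaggedJSONSerializer()

# The outer list and the inner dict get no tag key of their own,
# but the tuple and the bytes value nested inside them are tagged.
tagged = serializer.tag([(1, 2), {"k": b"x"}])
print(tagged)  # [{' t': [1, 2]}, {'k': {' b': 'eA=='}}]
```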

Handling Ambiguity with TagDict

A potential issue arises if a developer stores a dictionary that happens to look like a tagged object, such as {" t": [1, 2, 3]}. To prevent this from being incorrectly deserialized as a tuple, TagDict acts as an escape mechanism.

If a dictionary has exactly one key and that key matches a registered tag, TagDict appends __ to the key during serialization:

# Original data
data = {" t": [1, 2, 3]}

# Serialized representation
# {" di": {" t__": [1, 2, 3]}}

During deserialization, TagDict.to_python removes the __ suffix to restore the original key.
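The following sketch contrasts a real tuple with a look-alike dictionary to show that the two produce different wire formats and come back as different types.

```python
from flask.json.tag import TaggedJSONSerializer

serializer = TaggedJSONSerializer()

real_tuple = (1, 2, 3)
lookalike = {" t": [1, 2, 3]}  # a plain dict whose only key is a tag key

tuple_wire = serializer.dumps(real_tuple)
dict_wire = serializer.dumps(lookalike)
print(tuple_wire)  # {" t":[1,2,3]}
print(dict_wire)   # {" di":{" t__":[1,2,3]}}

# The escaped form is restored as a dict, not mistaken for a tuple.
print(serializer.loads(dict_wire))  # {' t': [1, 2, 3]}
```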

Customizing the Serializer

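You can extend the system by registering custom tags with TaggedJSONSerializer.register. Registration order matters because the serializer checks tags in the order they appear in its order list: by default a new tag is appended to the end, and the index argument inserts it at a specific position instead.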

Example: Registering an OrderedDict Tag

If you need to preserve the order of a dictionary, you can implement a custom tag and register it at the beginning of the list. OrderedDict is a dict subclass, so the tag must be checked before the standard PassDict, which would otherwise match it first.

from collections import OrderedDict

from flask.json.tag import JSONTag, TaggedJSONSerializer


class TagOrderedDict(JSONTag):
    key = " od"

    def check(self, value):
        return isinstance(value, OrderedDict)

    def to_json(self, value):
        # A list of [key, value] pairs keeps the order; values are tagged recursively.
        return [[k, self.serializer.tag(v)] for k, v in value.items()]

    def to_python(self, value):
        return OrderedDict(value)


# Usage
serializer = TaggedJSONSerializer()
serializer.register(TagOrderedDict, index=0)  # index=0: checked before PassDict
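To illustrate why the position matters, the sketch below registers the same tag in two serializers, once at index 0 and once at the default position at the end. The class is repeated from the example above so that the snippet runs on its own; the names first and last are illustrative.

```python
from collections import OrderedDict

from flask.json.tag import JSONTag, TaggedJSONSerializer


class TagOrderedDict(JSONTag):
    key = " od"

    def check(self, value):
        return isinstance(value, OrderedDict)

    def to_json(self, value):
        return [[k, self.serializer.tag(v)] for k, v in value.items()]

    def to_python(self, value):
        return OrderedDict(value)


data = OrderedDict([("b", 1), ("a", 2)])

first = TaggedJSONSerializer()
first.register(TagOrderedDict, index=0)  # checked before PassDict
print(first.dumps(data))  # {" od":[["b",1],["a",2]]}
print(type(first.loads(first.dumps(data))).__name__)  # OrderedDict

last = TaggedJSONSerializer()
last.register(TagOrderedDict)  # appended, so PassDict matches first
print(type(last.loads(last.dumps(data))).__name__)  # dict
```

In the second serializer the tag is never reached during serialization, so the value is written as an ordinary JSON object and loses its OrderedDict type.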

Integration with Flask Sessions

Flask uses a global instance of this serializer, session_json_serializer, defined in src/flask/sessions.py. This instance is the default serializer for the SecureCookieSessionInterface.

When SecureCookieSessionInterface.save_session is called, it uses an itsdangerous.URLSafeTimedSerializer configured with this tagged JSON serializer to produce the final signed cookie string:

# From src/flask/sessions.py
def get_signing_serializer(self, app: Flask) -> URLSafeTimedSerializer | None:
    # ...
    return URLSafeTimedSerializer(
        keys,
        salt=self.salt,
        serializer=self.serializer,  # This is the TaggedJSONSerializer
        signer_kwargs={
            "key_derivation": self.key_derivation,
            "digest_method": self.digest_method,
        },
    )

This integration ensures that developers can store complex types like datetime objects or Markup strings directly in the session object without manual conversion.
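As an end-to-end sketch, the following minimal application stores a tuple in the session in one request and reads it back in the next, using Flask's test client to carry the cookie. The route names and the secret key are placeholders chosen for this example.

```python
from flask import Flask, session

app = Flask(__name__)
app.secret_key = "dev-only-secret"  # placeholder; use a real secret in practice


@app.route("/set")
def set_value():
    session["point"] = (1, 2)
    return "stored"


@app.route("/get")
def get_value():
    # The cookie has been verified and untagged by the time the view runs.
    return type(session["point"]).__name__


client = app.test_client()  # keeps cookies between requests
client.get("/set")
kind = client.get("/get").get_data(as_text=True)
print(kind)  # tuple
```

With a plain JSON serializer the value would come back as a list. The tuple survives the round trip because of the tagged format.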