Rules-based authorization

June 15, 2024

Authorization is a simple problem with a very large and complex solution space.

I’m going to plant a flag and say Django’s documentation on authorization follows a bad pattern (for most applications). Django’s built-in permissions work well enough for the Django admin views, but they’re not ideal for a generic web application.

Django goes all-in on the “permissions-as-data” pattern. Every possible action is a permission object, and these permissions are linked to users or groups. In a sense, Django’s permission utilities are incomplete, as they only support model-level permissions, not record-level permissions. django-guardian is a popular package that fills this gap using the same permission API, but I still don’t recommend that approach.

The problem with permissions as data

One of the supposed benefits of “permissions-as-data” is that non-developers can change permissions without re-deploying the application. In practice, this approach is often more work for developers. Permission records need to be created in the first place, deleted once a record is destroyed, and of course modified when the desired behavior changes. Often, permissions-as-data introduces redundancy in the database.

Let’s say we’re building a project management app. We have projects, users and tasks. Projects have contributors and leaders. Project-leaders can modify tasks for their project, but contributors can only view them.

Using Django’s permission API (with guardian installed), our view logic would check permissions like this:

can_edit = user.has_perm("edit_task", task)

If we follow permissions-as-data as a paradigm, we’ll need to create many records. Here’s what our permissions table might look like for a single project, task, leader and contributor:

id	user_id	task_id	permission (verb)
1	boss1	proj1_task1	view
2	boss1	proj1_task1	edit
3	employee1	proj1_task1	view

Notice a couple things:

Every time we add a new task, we need to create two new permissions for each leader and one new permission for each contributor.
If we add new features, e.g. “leaders can delete tasks”, we need to create new delete permission records for each leader/task pair.
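To make that growth concrete, here is a rough back-of-the-envelope sketch in plain Python. The function name permission_rows_needed and the default verb lists are made up for illustration; the function simply counts one row per (user, task, verb) triple, as in the table above.

```python
def permission_rows_needed(n_tasks, n_leaders, n_contributors,
                           leader_verbs=("view", "edit"),
                           contributor_verbs=("view",)):
    """Count permission rows when every (user, task, verb) triple is a record."""
    rows_per_task = (n_leaders * len(leader_verbs)
                     + n_contributors * len(contributor_verbs))
    return n_tasks * rows_per_task


# The single-task example from the table: one leader, one contributor.
print(permission_rows_needed(1, 1, 1))  # 3

# A modest project: 200 tasks, 3 leaders, 10 contributors.
print(permission_rows_needed(200, 3, 10))  # 3200

# Adding a "delete" verb for leaders adds one row per leader/task pair.
print(permission_rows_needed(200, 3, 10,
                             leader_verbs=("view", "edit", "delete")))  # 3800
```

The numbers themselves matter less than the shape: the row count scales with tasks times users times verbs, while the role linkage table only scales with project memberships.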

This is a lot of complexity, and it’s all redundant. Our data-model should already describe relationships between users, projects and tasks. There’s no need to duplicate this information in a permissions table. The project, user and task tables for this data-model are relatively straightforward, and the roles can be stored in a project-user linkage table with a role=contributor|leader sort of enum field.

It’s pretty straightforward to write permission functions here. I like to put them all in a rules.py module:

# rules.py
from myapp.models import ProjectUserRole, Task  # assumed location of the models
def is_project_leader(user_id, project_id):
    return ProjectUserRole.objects.filter(user_id=user_id, project_id=project_id, role='leader').exists()

def is_project_contributor(user_id, project_id):
    return ProjectUserRole.objects.filter(user_id=user_id, project_id=project_id, role='contributor').exists()

def can_view_task(user_id, task_id):
    project_id = Task.objects.get(id=task_id).project_id
    return is_project_leader(user_id, project_id) or is_project_contributor(user_id, project_id)

def can_modify_task(user_id, task_id):
    project_id = Task.objects.get(id=task_id).project_id
    return is_project_leader(user_id, project_id)

No need to maintain any permission records, just describe the relationships between users, projects and tasks in the data-model and let the business rules determine what users can do.

You might be thinking “these roles are permission records”, but that’s not quite true. This project leader/contributor is something your data-model should have already described before you even started thinking about authorization logic. If not, it would be an anti-pattern to use permission records to describe relationships between users and projects.

Another benefit of using these rules is that the rules can change independently of the relationships in the data. The business might decide project contributors can modify tasks. With rules, this is a one-line change. With permissions-as-data, a script must be written to migrate data.
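Here is a small sketch of what that one-line change could look like. It uses in-memory stand-ins (ROLES, TASK_PROJECT, has_role) instead of the Django models so it runs on its own; those names, and the _v1/_v2 suffixes, are purely illustrative.

```python
# In-memory stand-ins for the ProjectUserRole and Task tables (illustrative only).
ROLES = {
    (1, 10): "leader",       # (user_id, project_id) -> role
    (2, 10): "contributor",
}
TASK_PROJECT = {100: 10}     # task_id -> project_id


def has_role(user_id, project_id, role):
    return ROLES.get((user_id, project_id)) == role


def can_modify_task_v1(user_id, task_id):
    """Original policy: only project leaders may modify tasks."""
    project_id = TASK_PROJECT[task_id]
    return has_role(user_id, project_id, "leader")


def can_modify_task_v2(user_id, task_id):
    """New policy: contributors may modify tasks too."""
    project_id = TASK_PROJECT[task_id]
    # The only difference from v1 is the extra `or` clause on this line.
    return (has_role(user_id, project_id, "leader")
            or has_role(user_id, project_id, "contributor"))


print(can_modify_task_v1(2, 100))  # False
print(can_modify_task_v2(2, 100))  # True
```

Notice that ROLES is never touched: the data describing who belongs to which project stays the same, and only the rule changes.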

Tips for rules-based authorization

I recommend treating authorization rules as their own independent “layer”. Views should defer all authorization logic to rule functions. Your templates can even call rules when you need to decide whether a button should say “view” or “edit”.

For testing purposes, and syntactic sugar tricks I won’t showcase here, I also use a package called django-rules, which I then wrap in my own rules_util.py module.

Using the rules framework forces me to call rules through a rules.test_rule(rule_name, user, obj) API, which is admittedly verbose and error-prone to type, but it lets me easily mock out any rule during tests. This means I can test my views and rules separately. If I want to test a project-leader view, I don’t need to create a project-leader user; I can just mock out the is_project_leader rule. I can assert that my view returns a 403 status code when the rule is mocked as false, and the proper response when the rule is mocked as true.

I’ll first show you what it looks like as a consumer (rules, views, tests), then I’ll share the implementation:

Defining rules:

# myapp/rules.py 
from project.rules_util import add_rule, auto_rule

@auto_rule
def is_project_leader(user_id, project_id):
    return ProjectUserRole.objects.filter(user_id=user_id, project_id=project_id, role=LEADER_ROLE).exists()

# or using add_rule syntax,
def is_project_leader(user_id, project_id):
    return ProjectUserRole.objects.filter(user_id=user_id, project_id=project_id, role=LEADER_ROLE).exists()

add_rule("is_project_leader", is_project_leader)

Using them in views (or templates):

# myapp/views.py
from django.http import HttpResponseForbidden
from django.views import View

from project.rules_util import test_rule

class ProjectLeaderPage(View):
    # this dispatch method makes a good authorization mixin
    def dispatch(self, request, *args, **kwargs):
        # call the wrapper (not rules.test_rule directly) so mock_rules can patch it
        if not test_rule("is_project_leader", request.user, kwargs['project_id']):
            return HttpResponseForbidden()
        return super().dispatch(request, *args, **kwargs)
    # ...

Testing views (pytest-django):

# tests/test_project_leader_feature.py
from project.rules_util import mock_rules

def test_my_project_leader_feature(client):
    with mock_rules(is_project_leader=True):
        response = client.get('/project/1/leader_feature')
        assert response.status_code == 200

    with mock_rules(is_project_leader=False):
        response = client.get('/project/1/leader_feature')
        assert response.status_code == 403

Testing rules:

# tests/test_rules.py
import pytest

# Imported under an alias: pytest collects any module-level callable named
# test_*, including imported ones, and would try to run test_rule as a test.
from project.rules_util import test_rule as check_rule

@pytest.mark.django_db  # the factories and the rule's query hit the database
def test_is_project_leader():
    project = ProjectFactory()
    project2 = ProjectFactory()
    leader = UserFactory()
    contributor = UserFactory()
    ProjectUserRole.objects.create(user=leader, project=project, role='leader')
    ProjectUserRole.objects.create(user=contributor, project=project, role='contributor')

    assert check_rule("is_project_leader", leader, project)
    assert not check_rule("is_project_leader", contributor, project)
    assert not check_rule("is_project_leader", leader, project2)

As you can see, using rules is a bit more verbose than plain functions, but it’s worth it for the testing benefits. There are probably easier ways to mock out plain functions, but the rules package also has some nice syntax for composing rules using boolean operators.
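To give a feel for that boolean-operator style without requiring the package, here is a toy stand-in. To be clear, the Predicate class below is my own minimal imitation, not the rules package’s implementation; it only mimics the idea of combining rules with |, & and ~. The ROLES dictionary and the user and project names are invented for the example.

```python
class Predicate:
    """Toy wrapper that lets rule functions be combined with |, & and ~."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, user, obj):
        return bool(self.fn(user, obj))

    def __or__(self, other):
        return Predicate(lambda u, o: self(u, o) or other(u, o))

    def __and__(self, other):
        return Predicate(lambda u, o: self(u, o) and other(u, o))

    def __invert__(self):
        return Predicate(lambda u, o: not self(u, o))


# (user, project) -> role; an in-memory stand-in for the role linkage table.
ROLES = {("alice", "proj1"): "leader", ("bob", "proj1"): "contributor"}

is_project_leader = Predicate(
    lambda user, project: ROLES.get((user, project)) == "leader")
is_project_contributor = Predicate(
    lambda user, project: ROLES.get((user, project)) == "contributor")

# Composite rules read almost like the business rule itself.
can_view_project = is_project_leader | is_project_contributor
can_only_view = can_view_project & ~is_project_leader

print(can_view_project("bob", "proj1"))  # True
print(can_only_view("alice", "proj1"))   # False
```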

It is on my TODO list to publish this as a package, but until then, you can find rules_util.py, a thin wrapper around the django-rules package, below:

from unittest.mock import patch

import rules
from rules import add_rule, predicate


class NonExistentRuleException(Exception):
    pass


# this is the "private" version, for mocking purposes
def _test_rule(name, user=None, obj=None):
    if not rules.rule_exists(name):
        raise NonExistentRuleException(f"rule {name} does not exist")

    return rules.test_rule(name, user, obj)


def test_rule(*args, **kwargs):
    return _test_rule(*args, **kwargs)


def auto_rule(fn):
    """
    usage: as decorator:

    @auto_rule
    def can_edit_foo(user, obj):
        return user.has_foo(obj)
    """
    pred = predicate(fn)
    add_rule(fn.__name__, pred)
    return pred


class mock_rules:
    """
    usage: with mock_rules(can_access_foo=False):
      assert test_client.get(some_view_that_uses_patched_rules).status_code == 403
    """

    def rule_mocker(self, **rule_stubs):
        def exec_rule(rule_name, user=None, obj=None):
            if rule_name in rule_stubs:
                return rule_stubs[rule_name]

            return self.actual_rule_func(rule_name, user, obj)

        return exec_rule

    def __init__(self, **rule_stubs):
        self.actual_rule_func = _test_rule
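        # patch by this module's real import path (e.g. "project.rules_util"),
        # so the same module object that views import from is the one patched
        self._patch = patch(
            f"{__name__}._test_rule", self.rule_mocker(**rule_stubs)
        )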

    def __enter__(self):
        return self._patch.__enter__()

    def __exit__(self, *excp):
        return self._patch.__exit__(*excp)

You don’t have to use the rules package, but I do recommend consuming rules against a central API that can easily be mocked for testing. One disadvantage of this particular implementation is that these rule strings are checked at runtime, so if you misspell a rule name, you won’t know until you run the code.
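One way to soften the misspelling problem is to make the runtime error as helpful as possible. The sketch below is a standalone toy registry, not the wrapper above and not the rules package: RULES, add_rule, check_rule and NonExistentRuleError are hypothetical names, and the “did you mean” hint comes from the standard library’s difflib.

```python
import difflib

RULES = {}  # rule name -> callable(user, obj)


class NonExistentRuleError(Exception):
    pass


def add_rule(name, fn):
    RULES[name] = fn


def check_rule(name, user=None, obj=None):
    """Look up a rule by name, failing loudly (with a hint) on unknown names."""
    if name not in RULES:
        close = difflib.get_close_matches(name, list(RULES), n=1)
        hint = f"; did you mean {close[0]!r}?" if close else ""
        raise NonExistentRuleError(f"rule {name!r} does not exist{hint}")
    return RULES[name](user, obj)


add_rule("is_project_leader", lambda user, project: user == "alice")

try:
    check_rule("is_project_leadr", "alice", "proj1")  # note the typo
except NonExistentRuleError as exc:
    print(exc)
```

This does not move the check to import time, but combined with view tests that exercise every rule name at least once, a typo tends to surface quickly.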

Often, rules will be composed of other rules. My rules rarely consume the central test_rule internally, but call other rule functions directly, and I never mock rules when testing rules. Ideally, rule tests should test behavior deeply, not shallowly test implementation details.

Performance tips

By setting up an independent layer of rules, we’ve made it difficult for a view to efficiently check permissions for many objects at once. For instance, if you have a view that lists all projects, and you need to know whether to display a “view” or “edit” button for each project, you’ll need to call is_project_leader() once per project, even though it’s trivially easy to write a single query that fetches all of a user’s roles across those projects!

We could write a few rules that take a list of objects and return a list or dictionary of booleans. This API is awkward to consume, but it works in a pinch.
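A batch rule of that kind might look roughly like this. The sketch uses an in-memory list of role rows so it runs without a database; ROLE_ROWS and leader_flags are illustrative names, and the comment about the ORM equivalent is an assumption about how you might write it in your own project.

```python
# (user_id, project_id, role) rows; a stand-in for the ProjectUserRole table.
ROLE_ROWS = [
    (1, 10, "leader"),
    (1, 11, "contributor"),
    (2, 10, "contributor"),
]


def leader_flags(user_id, project_ids):
    """Return {project_id: is_leader} using a single pass over the role rows.

    With the Django ORM this would presumably be one query, along the lines of
    filter(user_id=..., project_id__in=..., role="leader") followed by
    values_list("project_id", flat=True).
    """
    wanted = set(project_ids)
    led = {
        project_id
        for row_user_id, project_id, role in ROLE_ROWS
        if row_user_id == user_id and project_id in wanted and role == "leader"
    }
    return {project_id: project_id in led for project_id in project_ids}


print(leader_flags(1, [10, 11, 12]))  # {10: True, 11: False, 12: False}
```

The awkward part is on the consumer side: every view and template now has to carry this dictionary around instead of asking a simple yes/no question per object.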

Instead, I recommend using something like django-data-fetcher which will allow views to pre-populate a cache of records used by your rules (disclaimer: I authored this package).

Wrap-up

As a broader concept, permissions-as-data is not always bad. It has worked really well for Unix file systems and relational databases. Sometimes you’ll find it’s necessary: if your authorization logic isn’t implied by your existing data model, then you’ll need to store permission information as data somehow. Even in this case, I would still suggest layering a rules-based approach on top of it. This way, you can still treat authorization as its own layer, and therefore test your views and rules independently. It’s also easier to modify logic later, e.g. if you’re adding administrators who can do anything, it’s much simpler to modify rule functions than it is to create new permission records for every single object in the database.
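As a sketch of that layering, here is a rule function that wraps a permissions table and adds the administrator override in one place. Everything here is an in-memory stand-in: PERMISSION_ROWS reuses the rows from the earlier table, while ADMINS, has_permission_record and the "root" user are hypothetical.

```python
# (user_id, task_id, verb) rows; a stand-in for a stored permissions table.
PERMISSION_ROWS = {
    ("boss1", "proj1_task1", "view"),
    ("boss1", "proj1_task1", "edit"),
    ("employee1", "proj1_task1", "view"),
}
ADMINS = {"root"}  # hypothetical set of administrator user ids


def has_permission_record(user_id, task_id, verb):
    """The permissions-as-data lookup: is there a matching row?"""
    return (user_id, task_id, verb) in PERMISSION_ROWS


def can_edit_task(user_id, task_id):
    """Rule layered on top of the data: admins bypass the table entirely."""
    if user_id in ADMINS:
        return True
    return has_permission_record(user_id, task_id, "edit")


print(can_edit_task("root", "proj1_task1"))       # True, with no row stored
print(can_edit_task("employee1", "proj1_task1"))  # False
```

Views keep calling can_edit_task either way, so the override costs one branch in the rule rather than a new row for every task.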

Rules-based authorization can easily wrap around permissions-as-data, but permissions-as-data can never wrap around a rules-based authorization.