Context and tramp-data
A common problem I see developers struggle with is passing extraneous data down through functions where it doesn’t feel like it belongs as an argument.
This often happens when someone is eliminating global variables during a refactor, or passing redundant data down for performance reasons.
Not much is written about this (anti-?) pattern. According to Stack Overflow, this is called “tramp data”:
The data itself is called “tramp data”. It is a “code smell”, indicating that one piece of code is communicating with another piece of code at a distance, through intermediaries.
- Increases rigidity of code, especially in the call chain. You are much more constrained in how you refactor any method in the call chain.
- Distributes knowledge about data/methods/architecture to places that don’t care in the least about it. If you need to declare the data that is just passing through, and the declaration requires a new import, you have polluted the name space.
Refactoring to remove global variables is difficult, and tramp data is one method of doing so, and often the cheapest way. It does have its costs.
I’m not a fan of this name, but it’s the first time I’ve seen anyone describe this problem so we’ll have to stick with it ¯\_(ツ)_/¯
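To make the smell concrete, here is a minimal sketch of tramp data in Python. All names (Order, render_invoice, and so on) are made up for illustration; the point is only that currency is needed at the bottom of the call chain, yet every layer above has to accept and forward it.

```python
from dataclasses import dataclass


@dataclass
class Order:
    total: float


def format_price(amount: float, currency: str) -> str:
    # The only function that actually needs `currency`.
    return f"{amount:.2f} {currency}"


def render_order_line(order: Order, currency: str) -> str:
    # `currency` is tramp data here: accepted only to be forwarded.
    return "Total: " + format_price(order.total, currency)


def render_invoice(orders: list[Order], currency: str) -> str:
    # Tramp data again: this layer has no use for `currency` either.
    return "\n".join(render_order_line(o, currency) for o in orders)
```

Adding a second piece of data that format_price needs (say, a locale) would mean changing the signature of every function in the chain, which is the rigidity the quote describes.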
Solutions to this problem are basically global variables and singletons. Sometimes we’ll just have to deal with tramp data and there’s no clean solution. Often, though, there is a cleaner pattern: context.
Context is a vague concept that can take different forms:
- Context in React components is a way for components (or hooks) to share data without passing it down through props
- Context in GraphQL resolvers is a way to pass dependencies down the resolver chain
- contextvars in Python are variables that are scoped to a single thread/async task (also see: thread-locals), basically thread-scoped singletons.
In all of these cases, context shares similarities with singletons and global variables.
As far as I know, there are two distinct context APIs. Let’s call them “big context” and “atomic context”.
Big context
This method of accessing context is usually part of a larger framework. Context is treated more as a namespace than an object with a particular purpose. Usually the framework will pass or attach a context object to the consumer, but sometimes there are functions available to access context directly too. Sometimes the context is extensible.
- When building Android applications, there are context objects available at the application-level and at the activity-level. These objects are rich with references to APIs to start services, subscribe/publish to events, etc.
- In GraphQL, all resolvers are passed a context argument. It’s up to you as a developer to decide what you want to put in that context (usually, the current user and data-fetching dependencies).
It is usually discouraged to access “big context” outside a framework’s designated areas. It’s a big dependency that can be hard to mock in tests.
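As a rough sketch of what big context looks like from the consumer’s side, here is a GraphQL-style resolver chain in plain Python. It is not tied to any real GraphQL library; the Context class, its fields, and the resolver names are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Context:
    # Hypothetical "big context": a namespace for whatever resolvers might need.
    current_user: str
    loaders: dict[str, Any] = field(default_factory=dict)
    settings: dict[str, Any] = field(default_factory=dict)


def resolve_viewer(parent: Any, context: Context) -> dict[str, str]:
    # Uses only `current_user`, yet depends on the whole Context type.
    return {"name": context.current_user}


def resolve_greeting(parent: Any, context: Context) -> str:
    # The framework hands the same context object to every resolver.
    viewer = resolve_viewer(parent, context)
    prefix = context.settings.get("greeting", "Hello")
    return f"{prefix}, {viewer['name']}!"
```

Every resolver receives the same object, so nothing has to be threaded through by hand, but each one now depends on everything the context happens to contain.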
Big context is arguably an anti-pattern because it violates the interface segregation principle:
A client should never be forced to implement an interface that it doesn’t use, or clients shouldn’t be forced to depend on methods they do not use.
If you have a large context object, you are forced to depend on all the methods and data in that object, even if you only need a small part of it. In practice, this is often mitigated by only being able to access context in certain areas of the codebase.
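One way to picture the interface segregation concern in typed Python is to compare a consumer that depends on the full context with one that declares only the slice it needs. This is an illustration under assumptions, not a recommendation from any framework: HasCurrentUser, BigContext, and FakeUserContext are hypothetical names, and typing.Protocol is just one possible way to express a narrow dependency.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class HasCurrentUser(Protocol):
    # The only part of the context this consumer cares about.
    current_user: str


@dataclass
class BigContext:
    # Hypothetical large context with many unrelated members.
    current_user: str
    loaders: dict[str, Any] = field(default_factory=dict)
    settings: dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeUserContext:
    # Tiny test double: satisfies HasCurrentUser and nothing more.
    current_user: str


def viewer_label(context: HasCurrentUser) -> str:
    # Works with the real context or with any object exposing `current_user`.
    return f"Signed in as {context.current_user}"
```

Because viewer_label only asks for current_user, a test can hand it FakeUserContext instead of building the whole BigContext.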
Atomic context
Another approach is to have (possibly many) small context objects. These objects don’t extend a common context interface or anything; they can be anything, even a primitive (like a string or integer).
Because there may be many small context objects, they are usually accessed via a function that takes a key and returns the value, e.g. user = getContext('user').
This is the approach that Python’s contextvars and React’s useContext take.
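Here is a minimal sketch of atomic context using Python’s standard contextvars module. The variable names and the handle_request entry point are assumptions for illustration; ContextVar and copy_context are the real standard-library APIs.

```python
from contextvars import ContextVar, copy_context

# Each piece of context is its own small, independent variable.
current_user: ContextVar[str] = ContextVar("current_user")
request_id: ContextVar[int] = ContextVar("request_id", default=0)


def greet() -> str:
    # No arguments threaded through: values are looked up from context.
    return f"Hello, {current_user.get()} (request {request_id.get()})"


def handle_request(user: str, rid: int) -> str:
    # Hypothetical entry point that sets context for everything it calls.
    current_user.set(user)
    request_id.set(rid)
    return greet()
```

Calling copy_context().run(handle_request, "ada", 1) runs the handler in a copy of the current context, so the values it sets are visible to greet() but do not leak back into the caller.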
React’s useContext is only supposed to work within components (and hooks). It throws an error when called outside of a render, and React’s rules of hooks discourage calling it from a plain function even when that function is called from a component.
Python’s contextvars, on the other hand, can be used anywhere in the application, though developers usually restrict their use to certain areas. For instance, I’ve written a Django utility get_request() that uses contextvars to expose the HTTP request object from anywhere in the app, but it may log an error if it’s called where the request object is not available.
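A utility like that could be sketched roughly as follows. This is not the actual implementation mentioned above, only an assumption about one way to build it: the middleware follows the shape of a Django function-based middleware factory but does not import Django, and the names _current_request and request_context_middleware are hypothetical.

```python
import logging
from contextvars import ContextVar
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)

# Holds the request for the current thread or async task, if any.
_current_request: ContextVar[Optional[Any]] = ContextVar(
    "current_request", default=None
)


def get_request() -> Optional[Any]:
    request = _current_request.get()
    if request is None:
        # Called outside a request (e.g. from a management command).
        logger.error("get_request() called outside of a request")
    return request


def request_context_middleware(
    get_response: Callable[[Any], Any]
) -> Callable[[Any], Any]:
    # Shaped like a Django function-based middleware factory.
    def middleware(request: Any) -> Any:
        token = _current_request.set(request)
        try:
            return get_response(request)
        finally:
            # Restore the previous value so nothing leaks between requests.
            _current_request.reset(token)

    return middleware
```

Resetting the variable in a finally block matters: without it, a worker that is reused for the next request could still see the previous request object.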
Testing code dependent on contexts
On one hand, big contexts are going to be harder to mock. This is one of the reasons big contexts usually have tighter restrictions on where they can be accessed.
Small (atomic) contexts encourage less dependency proliferation, so they’re easier to test, but they still require mocking or setup. Code that simply receives tramp data as arguments is still the easiest to test.
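The difference in test setup can be sketched like this, again using contextvars. The two greeting functions are hypothetical, and the tests are plain functions with assert statements so they run under pytest or on their own.

```python
from contextvars import ContextVar, copy_context

current_user: ContextVar[str] = ContextVar("current_user")


def greeting_from_context() -> str:
    return f"Hello, {current_user.get()}"


def greeting_from_argument(user: str) -> str:
    return f"Hello, {user}"


def test_greeting_from_argument() -> None:
    # Plain arguments: nothing to set up or tear down.
    assert greeting_from_argument("ada") == "Hello, ada"


def test_greeting_from_context() -> None:
    # Context must be populated first; running in a copied context
    # keeps the value from leaking into other tests.
    def scenario() -> str:
        current_user.set("ada")
        return greeting_from_context()

    assert copy_context().run(scenario) == "Hello, ada"
    assert current_user.get(None) is None
```

The context-based test is not hard to write, but it has an extra moving part: forget the isolation step and one test’s context can silently affect another.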
When is context a good idea?
Whether context is a good idea or not depends on the, umm, context.
Memoization is a great use-case for context. If your code is running in a multi-threaded or async environment, e.g. an application server, you can’t just memoize data in top-level variables (this will cause memory leaks, or worse, leakage of data between requests). Context is a great way to memoize data that is scoped to a single request. Optimization is a common cause of tramp data. Rather than passing some cached data down many layers, consider exposing memoized functions via context. This is why I built django-data-fetcher.
Singletons are another good use-case for context. Again, in an app server, you can’t just use a top-level singleton; you need to scope it to the current request. Often, this is also an optimization technique (e.g. maybe the singletons cache data internally).
Whatever you do, just keep in mind that context tends to sneak in dependencies. If you write utilities that depend on context, you’re limiting where those utilities can be used and making it harder to test those utilities.