I've had a rule of thumb for data modeling that unfortunately I forget on a regular basis, so I thought I'd write it down to maybe help myself actually use it more.
The rule of thumb is "always separate user intent from derived behavior/business logic".
This rule is particularly applicable to things like state or status fields, which have a way of becoming "half-user controlled" and "half-system controlled".
As the most recent example where I've run across this, consider a stereotypical "tasks have predecessors/successors in a project plan" system:

In terms of the data model, we'll focus on just a few things:
- There is a task status field that has three potential values: NotStarted, InProgress, and Complete
- The requirements state that the system handles all NotStarted <-> InProgress transitions, i.e. it "auto-starts" tasks once preceding tasks are Complete.
- However, only the user can say "this task is actually done (or not done)".
An Okay Way
An initial attempt at modeling this is a single Task.status field that is an enum of NotStarted, InProgress, and Complete.
Then we use business logic to do "not rocket science but still somewhat nuanced" things like:
- Anytime a predecessor task changes, maybe change the successor Task's status, but only if it's not Complete
- In the UI, treat status = Complete as "you checked complete" but status = NotStarted | InProgress as "you didn't check complete"
This is all fine and not that bad, but we end up with a "sometimes the field is written by X and sometimes it is written by Y" situation:

Which is not terrible, but is generally more of a "business logic is hidden in susceptible-to-being-spaghetti 'push' code" approach.
I.e. it's pretty common in this setup, when the user unchecks "task is complete", to forget to re-run the "ah right, set it back to the 'based on predecessors' value" logic.
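To make the failure mode concrete, here is a minimal Python sketch of the single-field approach. The names Task, Status, on_predecessor_changed, and user_set_complete are all made up for illustration, and a real system would sit on top of an ORM rather than plain dataclasses.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NOT_STARTED = "NotStarted"
    IN_PROGRESS = "InProgress"
    COMPLETE = "Complete"


@dataclass
class Task:
    # Hypothetical in-memory model, not a real ORM entity.
    name: str
    predecessors: list["Task"] = field(default_factory=list)
    # Single field that is written by both the user and the system.
    status: Status = Status.NOT_STARTED


def on_predecessor_changed(task: Task) -> None:
    """System-side push: auto-start (or un-start) unless the user said Complete."""
    if task.status is Status.COMPLETE:
        return
    ready = all(p.status is Status.COMPLETE for p in task.predecessors)
    task.status = Status.IN_PROGRESS if ready else Status.NOT_STARTED


def user_set_complete(task: Task, checked: bool) -> None:
    """User-side write to the very same field."""
    if checked:
        task.status = Status.COMPLETE
    else:
        # Placeholder value only. The caller has to remember to re-run
        # on_predecessor_changed(task), which is the easy step to forget.
        task.status = Status.NOT_STARTED
```

With task b depending on a completed task a, checking and then unchecking b leaves it at NotStarted until something remembers to call on_predecessor_changed(b) again.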
A Better Way
Generally a cleaner way of modeling things is to strictly delineate user intent from derived behavior, i.e.:
- The user intent of "this is complete yes/no" is its own "thing" (database field)
- The calculated "potential status based on predecessors" logic is its own thing (derived field)
- The calculated combination of "status based on user intent or potential status based on predecessors" is its own thing (another derived field)
I.e. our data model would move from having a single status field to:
- Task.is_complete is a boolean that is directly/always controlled by the user intent to mark "yes, this is/is not done"
- Task.status_based_on_predecessors (probably not stored/persisted, so not a real column-in-the-db) does the calc of "this task should be InProgress if all predecessors are Complete, otherwise NotStarted"
- Task.status still exists, but is now derived (although likely still persisted for simplicity of reads) by the calculation "if is_complete then Complete else status_based_on_predecessors", i.e. InProgress or NotStarted
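The three fields above can be sketched in Python roughly as follows. This is only an illustration: the Task and Status classes are hypothetical, both derived values are computed on read here (whereas status would likely be persisted in practice), and a task with no predecessors is assumed to count as "all predecessors are Complete".

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NOT_STARTED = "NotStarted"
    IN_PROGRESS = "InProgress"
    COMPLETE = "Complete"


@dataclass
class Task:
    # Hypothetical in-memory model, not a real ORM entity.
    name: str
    predecessors: list["Task"] = field(default_factory=list)
    # User intent: the only field anything ever writes to.
    is_complete: bool = False

    @property
    def status_based_on_predecessors(self) -> Status:
        """Derived: InProgress once every predecessor is Complete."""
        ready = all(p.status is Status.COMPLETE for p in self.predecessors)
        return Status.IN_PROGRESS if ready else Status.NOT_STARTED

    @property
    def status(self) -> Status:
        """Derived: user intent wins, otherwise fall back to the predecessors."""
        if self.is_complete:
            return Status.COMPLETE
        return self.status_based_on_predecessors
```

Because nothing writes to status directly, unchecking is_complete falls back to the predecessor-based value on its own, with no extra step to remember.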
This moves the model to be more like a DAG of inputs with nodes of calculated values:

Which makes the application logic more functional and more reactive, rather than if statements sprinkled in various places.
Granted, a separate but tempting tangent is that reactive / data flow paradigms have not generally taken hold on the server-side yet, especially at a "more than just lifecycle hooks within a single micro-service/monolith/ORM codebase" scale, so you still generally have to nudge/wire these derived values together. But I think the end result is still cleaner than the original "fuzzy ownership of a single field" approach.
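As a rough sketch of what that manual wiring might look like when status is persisted, the Python below recalculates the stored value and pushes changes to successors. The names recalc_status and set_is_complete are hypothetical, the successors list is an assumed convenience, and a real codebase would more likely hang this off ORM lifecycle hooks.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NOT_STARTED = "NotStarted"
    IN_PROGRESS = "InProgress"
    COMPLETE = "Complete"


@dataclass(eq=False)
class Task:
    # Hypothetical in-memory model; repr=False avoids cyclic reprs.
    name: str
    predecessors: list["Task"] = field(default_factory=list, repr=False)
    successors: list["Task"] = field(default_factory=list, repr=False)
    is_complete: bool = False            # user intent
    status: Status = Status.NOT_STARTED  # derived, but stored for cheap reads


def recalc_status(task: Task) -> None:
    """Recompute the stored status and propagate any change downstream."""
    if task.is_complete:
        new_status = Status.COMPLETE
    elif all(p.status is Status.COMPLETE for p in task.predecessors):
        new_status = Status.IN_PROGRESS
    else:
        new_status = Status.NOT_STARTED
    if new_status is not task.status:
        task.status = new_status
        for successor in task.successors:
            recalc_status(successor)


def set_is_complete(task: Task, value: bool) -> None:
    """Single entry point for the user-written field."""
    task.is_complete = value
    recalc_status(task)
```

The wiring still exists, but it lives in one recalculation function that only ever reads user intent, rather than in several places that each write to a shared field.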