Lately I’ve been musing about the Flatpack entity/JSON serialization framework.
Flatpack’s goal is to provide an entity-based JSON framework/format that can be used across multiple front-end clients, e.g. web/iOS/Android.
As a very large, up-front disclaimer: I’ve not actually used Flatpack, either on a real project or even in a prototype, but the wire format in particular has piqued my interest in a few ways, such that I’ve been thinking about how I would approach/extend it if I were doing something similar.
(One last disclaimer: I had this post ~75% written for quite a while and then stopped; I’m going to quickly wrap it up and push it out, but be kind.)
Flatpack Quick/Naive Overview
Providing a cross-platform API is a fairly common goal these days, but there are a few things that make Flatpack’s approach interesting:
- It is entity-based.
The usual suspects of Thrift/Protobuf/Avro/JSON/etc., while all built to be cross-platform as well, are lower-level than Flatpack; they are about defining raw messages and message types, and letting each application define its own protocol on top.

This is quite fine, but a whole lot of applications do CRUD on entities (save employee, read employee, save employer, read employer, etc.), and so I think it’s natural to think of Flatpack as sitting on top of the underlying message layer itself, which in this case is JSON, to provide common/reusable entity-based semantics.

E.g. the same protocol (messages/message contents) could be used on another transport, e.g. Thrift/Protobuf, if applicable.
- It uses a flat encoding.
So, instead of an employer/employee being sent like:
```
{ employer: {
    name: "er1",
    employees: [ { name: "ee1" }, { name: "ee2" } ]
} }
```
The employees are flattened out:
```
{ employer: [ { id: <uuid>, name: "er1" } ] },
{ employee: [
    { name: "ee1", employerId: <uuid> },
    { name: "ee2", employerId: <uuid> } ] }
```
(Note that I’m using pseudo-JSON here; their actual JSON has more structure.)
This does have a few wrinkles:
- Because parent/child relationships cannot be expressed implicitly by nesting, objects must now refer to their parent by the parent’s ID.
- This implies (along with the use of UUIDs) that clients assign IDs for new objects, which is fairly standard today (e.g. Asana’s task ids are UUIDs).
However, a weakness, IMO, of the nested approach is modeling it with objects in a client library, e.g.:
```scala
// update an employee, makes sense...
s.updateEmployee(Employee(
  id = ...,
  name = ...,
  salary = ...))

// now update an employer:
s.updateEmployer(Employer(
  id = ...,
  name = ...))

// update employees nested in employer
s.updateEmployer(Employer(
  id = ...,
  employees = Seq(
    // can we really set salary here? it's in the object...
    Employee(name = ..., salary = ...),
    Employee(name = ..., salary = ...))))
```
Notice how `Employee` is used both as a top-level entity, e.g. for `POST /employees`, and also nested within a `POST /employer` call. This is nice, in that you get to reuse the `Employee` class.

However, the set of fields that `POST /employees` supports for employees and that `POST /employer` supports for nested employees is almost always different. E.g. if you use the “submit the Employee through the Employer” approach, you can only set a subset of the fields, or on read only get a subset of the fields back.

Which means that to really model this correctly, you’d need something like an “ErEmployee” type, and something ugly like:
```scala
s.updateEmployer(Employer(
  id = ...,
  employees = Seq(
    // no salary field is available
    ErEmployee(name = ...),
    ErEmployee(name = ...))))
```
The idea being that `ErEmployee` would not have the `salary` method defined, as it’s not supported for reading/updating through the `/employer` endpoint.

This is the right thing to do, but ugly, and I believe in practice rarely done. Instead you just reuse the `Employee` type, because, hey, it’s pretty close, and hope that clients don’t try and set unsupported fields.
- It uses very normal-looking REST/Jersey endpoints:
```java
/**
 * Return the list of products.
 */
@GET
@Path("products")
@FlatPackResponse({ List.class, Product.class })
public List<Product> productsGet() {
  return db.get(Product.class);
}
```
That said, I’m a little curious how you specify the depth of the returned object graph.

Notice how they’re returning JPA entities (`Product` directly instead of `ProductDto`). This is great in terms of avoiding the usual nightmare of domain object/DTO translation, but it’s curious then how to customize what does/does not get returned from each endpoint.

Ah, there is a `TraversalMode`, which I guess allows something like:
```java
@GET
@Path("products")
@FlatPackResponse(
  value = { List.class, Product.class },
  traversalMode = TraversalMode.DEEP)
public List<Product> productsGet() {
  return db.get(Product.class);
}
```
You still need the ability to control which fields to return (even just statically), and more fun things like access control, etc., all of which Flatpack handles, AFAICT; I’m going to defer to their docs and avoid that tangent.
My Musings
So, that is Flatpack. It seems very reasonable. But it hints at a few things that make me wonder about trying a few variations:
- Use just one `/entities` endpoint.

While REST is de facto endpoint-per-resource, I would like to try building a system that has one endpoint for all entities, as I think a few nice things fall out:
  - You get cross-object transactions for free.

With REST APIs, it’s usually impossible to get atomic “create both of these objects as one unit of work” behavior (see the sketch after this list).
  - You get an API that is inherently consistent.

With large (or even small) REST APIs, where each entity is a different endpoint, it’s very easy for conventions to unintentionally vary from endpoint to endpoint.

Especially when the endpoints are being coded by hand, which is pretty much always the case (cough codegen cough), it’s very easy for a programmer to slip in a slightly different input format, or output format, or error encoding, or “last modified” parameter name, etc.

This may not seem like a big deal, but having written client-side wrappers for several REST APIs, it makes building abstractions a lot easier if each endpoint is as consistent as possible with the others.
All that said, my musing is that if there is only one endpoint, `/entities`, then all entities automatically/for-free use the same idioms/conventions as the others.
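To make the transaction point concrete, here is a minimal client-side sketch; the `Entity` trait, the case classes, and `client.post` are all hypothetical names of mine, not anything from Flatpack:

```scala
import java.util.UUID

// Hypothetical entity model; nothing here comes from Flatpack itself.
sealed trait Entity
case class Employer(id: String, name: String) extends Entity
case class Employee(id: String, name: String, employerId: String) extends Entity

object client {
  // Stub: a real implementation would serialize the entities to JSON and
  // POST them to the single /entities endpoint in one HTTP request.
  def post(path: String, entities: Seq[Entity]): Unit =
    println(s"POST $path with ${entities.size} entities")
}

// Both new objects travel in the same request, so the server can commit
// them as a single unit of work: either both are created or neither is.
val er = Employer(id = UUID.randomUUID().toString, name = "er1")
val ee = Employee(id = UUID.randomUUID().toString, name = "ee1", employerId = er.id)
client.post("/entities", Seq(er, ee))
```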
- Standardize the entity/field to error-message mapping.
Perhaps implementations of Flatpack already do this, as the errors section is already a map:
```
errors : { "Some key" : "Some message" },
```
But I would go further and suggest/document explicit keys, like:

- `${entityId}.${fieldName}` for field-level errors
- `${entityId}` for entity-level errors
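Under that convention, an errors map might look like this (the ids and messages here are made up for illustration):

```
errors : {
  "employee#-1.name" : "Name is required", // field-level
  "employer#10" : "Access is denied" // entity-level
}
```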
- Use negative ids for new objects (yes, this sounds heretical, but hear me out).
So far, I’ve yet to work on a system that needed/used UUIDs for primary keys. Note that I don’t mean no systems should ever use UUIDs; I think they make a lot of sense if you need them. I’m just saying I personally haven’t needed them yet.
And, admittedly perhaps just due to habit, I like/prefer integer primary keys, because pasting around “1” or “10456” when debugging tests/bugs/etc. is nicer than UUIDs.
However, it does mean the server has to monotonically assign ids.
For most applications that currently use client-assigned IDs/UUIDs, I assert their rationale is not actually a true distributed/client-to-client use case (like git), but most likely an offline/temporarily disconnected use case.
If that is the case, it’s not really that `client1` and `client2` both need to create unique/separate ids for new objects, because they will never sync directly to each other, and so can never collide on their client-generated IDs. Instead, `client1` and `client2` will only ever interact through the server, so the server eventually acts as the mediator anyway, and we don’t actually require globally-unique IDs.

More specifically, if you have server-side storage for your entities (which most systems do), the offline problem can also be solved by partitioning your id space into client-controlled ids (which are per-client, so collisions across clients are fine) and server-controlled ids (which are cross-client, so no collisions are allowed).
Splitting ids into negative/positive does this:

- For new objects, each client makes a new negative id: start at -1, then -2, etc.
- On upload, the server recognizes any negative-id object as new, saves it, and sends back the negative-to-positive id mapping (e.g. `Employee#-1` is now `Employee#1456`).
- On response, the client updates its entities that had negative ids to use the new, server-assigned ids (see the sketch below).
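Here is a minimal sketch of the client side of that scheme; the `Entity` class, `Ids`, and `applyIdMappings` are hypothetical names of mine, not Flatpack’s:

```scala
// Hypothetical client-side model; ids are Long so they can be negative.
class Entity(var id: Long, val typ: String)

object Ids {
  private var next = -1L
  // For new objects, each client hands out -1, -2, -3, ...
  def nextTempId(): Long = { val id = next; next -= 1; id }
}

// On response, rewrite any entity that still has a negative (temporary) id
// using the server's negative-to-positive mapping, keyed as "type#tempId".
def applyIdMappings(entities: Seq[Entity], mappings: Map[String, Long]): Unit =
  for (e <- entities if e.id < 0)
    mappings.get(s"${e.typ}#${e.id}").foreach(serverId => e.id = serverId)

// E.g. given the server response { "employee#-1": 1456 }:
val bob = new Entity(Ids.nextTempId(), "employee") // bob.id == -1
applyIdMappings(Seq(bob), Map("employee#-1" -> 1456L))
// bob.id is now 1456
```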
Cutting to the Chase
Anyway, wrapping all of those musings up, here is what my protocol might look like.
A request, e.g. `POST /entities`:
```
{
  entities: [ {
    id: -1, // new employee
    type: "employee",
    name: "Bob",
    employerId: 10 // fk to parent
  }, {
    id: 10, // existing employer
    type: "employer",
    ...
  } ]
}
```
And a potential response:
```
{
  // would only be returned if the transaction committed
  idMappings: { "employee#-1": 5, "employer#-2": 6 },
  // object/field errors
  entityErrors: {
    "employee#-1.name": [ "Name is required" ],
    "employer#10": [ "Access is denied" ]
  },
  // misc errors
  otherErrors: [ "Something bad happened" ]
}
```
Admittedly, having `type: "employee"` is a little odd, so maybe separate collections per entity would be better; and the string-based `"employee#-1.name"` keys are a little odd, so maybe having each error be a dictionary of `type`, `id`, and `field` would be better, e.g.:
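A hypothetical structured-error shape along those lines (my invention, not Flatpack’s) could be:

```
entityErrors: [
  { type: "employee", id: -1, field: "name", message: "Name is required" },
  { type: "employer", id: 10, message: "Access is denied" }
]
```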
So, anyway, I won’t claim this is new or amazing or perfect, but it’s been floating in my head for a while, and now I can finally finish this post and move on to something else.
(As a final musing, I forgot to mention that a lot of this might be re-inventing what other “sync entities across clients” frameworks like Firebase do, and so it would be worth examining/stealing from/just reusing their protocols, or even platforms, but I’ve not had a chance to do any investigation there.)