Different kinds of storage

Publish date: 2016-04-26

Tags:

I’ve been spending most of my time so far on Project Tofino thinking about how a user agent stores data.

A user agent is software that mediates your interaction with the world. A web browser is one particular kind of user agent: one that fetches parts of the web and shows them to you.

(As a sidenote: browsers are incredibly complicated, not just for the obvious reasons of document rendering and navigation, but also because parts of the web need to run code on your machine and parts of it are actively trying to attack and track you. One of a browser’s responsibilities is to keep you safe from the web.)

Chewing on Redux, separation of concerns, and Electron’s process model led to us drawing a distinction between a kind of ‘profile service’ and the front-end browser itself, with ‘profile’ defined as the data stored and used by a traditional browser window. You can see the guts of this distinction in some of our development docs.

The profile service stores full persistent history and data like it. The front-end, by contrast, has a pure Redux data model that’s much closer to what it needs to show UI — e.g., rather than all of the user’s starred pages, just a list of the user’s five most recent.

The front-end is responsible for fetching pages and showing the UI around them. The back-end service is responsible for storing data and answering questions about it from the front-end.

To build that persistent storage we opted for a mostly event-based model: simple, declarative statements about the user’s activity, stored in SQLite. SQLite gives us durability and known performance characteristics in an embedded database.

On top of this we can layer various views (materialized or not). The profile service takes commands as input and pushes out diffs, and the storage itself handles writes by logging events and answering queries through views. This is the CQRS concept applied to an embedded store: we use different representations for readers and writers, so we can think more clearly about the transformations between them.

Where next?

One of the reasons we have a separate service is to acknowledge that it might stick around when there are no browser windows open, and that it might be doing work other than serving the immediate needs of a browser window. Perhaps the service is pre-fetching pages, or synchronizing your data in the background, or trying to figure out what you want to read next. Perhaps you can interact with the service from something other than a browser window!

Some of those things need different kinds of storage. Ad hoc integrations might be best served by a document store; recommendations might warrant some kind of graph database.

When we look through that lens we no longer have just a profile service wrapping profile storage. We have a more general user agent service, and one of the data sources it manages is your profile data.