We are developing better ways of working with data across the entire business, not just within data teams. We value:
- Convention over configuration
- Producers owning their data rather than central data teams
- Consumers interacting with data owners rather than with central data teams
- Standard ingest patterns over manually written data pipelines
That is, while there is value in the items on the right, we value the items on the left more.
Fundamentally, as a data team we don’t actually want to work with data. The data is not ours to own, nor do we understand it as well as the producers of the data.
As a data team, we exist to build good-quality data platforms, tooling, and processes that enable the wider business to get value from their data. We are not here to manage your data, but we can provide consultancy and support for doing so.
This manifesto brings together the ideas of data meshes, data hubs, NoOps, DataOps, and the latest generation of data platforms that we and our peers are working on. We see the same rough evolution happening in many different companies, in both Europe and the USA:
- Data was queried directly from operational data stores
- Data was extracted into data warehouses to satisfy reporting
- Data was extracted into Big Data platforms to make it available to all
- We made the ingest into Big Data platforms closer to real time
- We are now building the tools for the wider business to engage with data more directly and more democratically
What these more accessible and democratic data platforms look like is something we are still figuring out. It almost certainly involves the cloud, serverless data platforms, and their automation. It probably involves shared data dictionaries. It might involve some method of joining together disparate data sources, or it might involve requiring standard data stores.