CKAN: Some Ideas for the Road Ahead


5 Years Ago

  • Last 5y have been relatively slow due to resource constraints
  • Flask and vdm now dealt with (legacy is done)
  • New features are starting to come in … 🥳

Lots has Happened in the Last 5y

  • Data has just got bigger
  • Rise of (micro)services - Docker, k8s, etc.
  • Rise of JavaScript for webapps - frontend & backend
  • CKAN has continued to establish itself 👍

CKAN: the Future

  • CKAN is the leading data portal
  • Data portals are rich and complex (and feature-rich)
  • CKAN has a "monolithic" architecture

Data Portals are rich and complex

Here is an overview of the different feature clusters of a full portal:


A Challenge (and Opportunity)

=> A challenge: as CKAN gets bigger and more complex, it gets brittle and harder to maintain and/or extend with new features

Could we Refactor CKAN for More Flexibility?

Example: (Read) Frontend

Definition: Frontend has two parts: the Admin UI and the Read Frontend

  • No. 1 "customization" of CKAN is theming
  • At the moment, that requires a frontend dev / designer to learn Docker + Python just to do theming
  • Painful to make a unified frontend

"Headless" DMS

Background: Data Management Systems (DMSs) today (and CKAN in particular)

  • People are creating more and more data portals, both public-facing (e.g. government open data portals) and internal (for sharing data within an organization)
  • Data portals are growing more complex, and people often want a portal that integrates data and content into a unified experience
  • A DMS like CKAN was originally focused around the core aspect of a data portal: the data catalog with its dataset pages and its search
    • This has since expanded to include many other complementary features (organizations, collections of datasets, etc.)
  • In addition, there has often been a need for the data portal to include some content: guides, documentation, news and blogs
  • Finally, theming and frontend customization have always been central to data portals

What's the Problem with the Current Setup?

There are two main issues:

  1. There is no standard, satisfactory way to create data portals that integrate data and content.
  2. Theming and frontend work is slow and painful because it requires installing and interacting with the full (complex) DMS.

Headless CKAN and Decoupled (Read) Frontend
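
To make the idea concrete, here is a minimal sketch of a read frontend that talks to CKAN only over its HTTP Action API (v3, `package_show`), so theming work would not need a local CKAN install. It is an illustration, not a proposed implementation: the helper names (`action_url`, `unwrap`, `fetch_dataset`, `render_dataset`) and the base URL used in the usage note are hypothetical, and Python is used only for brevity. A real read frontend might just as well be written in JavaScript.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def action_url(base_url: str, action: str, **params: str) -> str:
    """Build a CKAN Action API (v3) URL for the given action and query params."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def unwrap(payload: dict) -> dict:
    """Return the 'result' of an Action API response envelope, or raise."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API error: {payload.get('error')}")
    return payload["result"]


def fetch_dataset(base_url: str, dataset_id: str) -> dict:
    """Fetch one dataset's metadata over HTTP from a (headless) CKAN."""
    with urlopen(action_url(base_url, "package_show", id=dataset_id)) as resp:
        return unwrap(json.load(resp))


def render_dataset(dataset: dict) -> str:
    """Hypothetical stand-in for the themed page a read frontend would render."""
    title = dataset.get("title") or dataset["name"]
    lines = [title, "=" * len(title)]
    for res in dataset.get("resources", []):
        name = res.get("name") or "unnamed"
        fmt = res.get("format") or "unknown"
        lines.append(f"- {name} ({fmt})")
    return "\n".join(lines)
```

Usage would look like `print(render_dataset(fetch_dataset("https://ckan.example.org", "my-dataset")))`, where both the URL and the dataset id are placeholders. The point of the split is that only `render_dataset` is theming work, and it depends on nothing but the JSON returned by the API.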

Running in Production