Subscribe to Datasets: New CKAN Feature Explained

June 16, 2020 by Annabel van Daalen and Irio Musskopf, with graphics by Monika Popova

Last month, we announced the launch of a new CKAN feature developed by Datopian that allows users to subscribe to datasets. This is an opt-in feature that sends users an email notification when a dataset to which they are subscribed is changed or updated. Let’s take a look at the feature in more detail.

Photo by Dele Oke on Unsplash

# Why subscribe to datasets with CKAN?

The subscribe-to-datasets feature designed by Datopian was born out of the needs of our enterprise customers. To provide clients with a robust messaging system, we needed to build the feature as a service that runs outside of the main CKAN application process.

Before Datopian developed the subscribe-to-datasets feature, data portal users had no good way of finding out about changes to datasets. Existing approaches to notifying users of changes include RSS feeds and CKAN’s built-in email integration. However, these approaches were not suitable for our client’s context because:

  • Some datasets and resources can change rapidly, and many different types of stakeholders can subscribe to change notifications. This means that anywhere from 50,000 to 200,000 notifications may be broadcast in a given month.
  • Our client wants to extend the notification feature to support additional notification channels in addition to email. The next iteration will add SMS notifications, giving users the choice to receive notifications by SMS, email, or both.
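To make the multi-channel idea concrete, here is a minimal sketch of how a notification could be fanned out to the channels a subscriber has opted into. It is not the service's actual code: `Subscriber`, `dispatch`, and the `senders` mapping are hypothetical names chosen for illustration, and the sender functions stand in for whatever email or SMS provider a deployment would use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical type: a sender takes (address, message) and delivers it.
Sender = Callable[[str, str], None]


@dataclass
class Subscriber:
    """Hypothetical subscriber record; not part of CKAN's data model."""
    name: str
    addresses: Dict[str, str]  # channel name -> address for that channel
    channels: Set[str] = field(default_factory=lambda: {"email"})


def dispatch(subscriber: Subscriber, message: str,
             senders: Dict[str, Sender]) -> List[str]:
    """Send `message` over every channel the subscriber opted into.

    Returns the sorted names of the channels that were actually used.
    """
    used = []
    for channel in sorted(subscriber.channels):
        sender = senders.get(channel)
        address = subscriber.addresses.get(channel)
        if sender is None or address is None:
            continue  # channel not configured on this deployment; skip it
        sender(address, message)
        used.append(channel)
    return used
```

With this shape, adding SMS later would mean registering one more entry in `senders` rather than changing the code that detects dataset changes.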

Another advantage of the feature is its fine granularity. Users can currently receive the following information via email notifications:

  • The names of the datasets in which a change has taken place.
  • Whether the change was applied to a whole dataset, or a single resource.
  • Whether there were changes to the metadata.
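The three pieces of information above map naturally onto a small record per change. The following is only an illustrative sketch: `ChangeEvent` and `describe` are hypothetical names, and the wording of the generated line is an assumption, not the text of the real email.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical record of one change; field names are illustrative."""
    dataset_name: str
    resource_name: Optional[str] = None  # None means the whole dataset changed
    metadata_changed: bool = False


def describe(event: ChangeEvent) -> str:
    """Render one human-readable line for a notification email."""
    if event.resource_name is None:
        target = f'Dataset "{event.dataset_name}"'
    else:
        target = f'Resource "{event.resource_name}" in dataset "{event.dataset_name}"'
    suffix = " (metadata changed)" if event.metadata_changed else ""
    return f"{target} was updated{suffix}."
```

For example, `describe(ChangeEvent("air-quality", "2020.csv", True))` yields a line naming both the resource and its dataset and flagging the metadata change.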

Here’s an example notification:

Screenshot section of an example email notification

# Overview

Fig 1.1. Diagram demonstrates that data curators edit the metadata and data of a dataset or resource to which a user is subscribed.

Fig 1.2. Diagram shows, at a high level, the technical design of the data subscription service, including how it interacts with CKAN.

# Current features

  1. Configure notification frequency - system administrators can determine the frequency with which users receive email notifications. This is particularly helpful for users subscribed to very large datasets that are updated multiple times per minute or hour.
  2. Disable notifications for certain datasets - system administrators may opt to disable notifications for certain datasets for a number of reasons. In particular, companies using CKAN data portals may choose to disable notifications for datasets that are updated frequently, should the cost of mass emailing become too high.
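One way to picture how these two settings interact is a digest step that collapses repeated changes and skips disabled datasets before anything is sent. The sketch below is an assumption about how such logic could look, not the service's implementation: `NOTIFICATION_INTERVAL`, `DISABLED_DATASETS`, `build_digests`, and `is_due` are all hypothetical names, and the real configuration keys are not described in this post.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical settings an administrator might control.
NOTIFICATION_INTERVAL = timedelta(hours=1)
DISABLED_DATASETS: Set[str] = {"live-sensor-feed"}

Change = Tuple[str, str]  # (subscriber email, dataset name)


def build_digests(changes: Iterable[Change],
                  disabled: Set[str] = DISABLED_DATASETS) -> Dict[str, List[str]]:
    """Group changes per subscriber, dropping disabled datasets and duplicates."""
    digests: Dict[str, Set[str]] = defaultdict(set)
    for email, dataset in changes:
        if dataset in disabled:
            continue  # notifications are switched off for this dataset
        digests[email].add(dataset)
    return {email: sorted(names) for email, names in digests.items()}


def is_due(last_sent: datetime, now: datetime,
           interval: timedelta = NOTIFICATION_INTERVAL) -> bool:
    """True once at least one full interval has passed since the last email."""
    return now - last_sent >= interval
```

Under these assumptions, a dataset that changes many times within one interval produces a single entry in the subscriber's next email, which is what keeps mailing volume and cost bounded.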

# Upcoming features

  1. Subscribe to new datasets - soon, CKAN users will be able to receive emails notifying them when new datasets are added to the portal. This is particularly helpful for users monitoring all portal activity.

# How can I get the new feature?

The data subscription service is currently available for use. If you are interested in deploying it against your existing CKAN installation, please reach out to us by visiting the project on GitHub and creating an issue. Alternatively, contact Datopian to discuss how we can deploy a data subscription integration for your platform.

# Call to Action!

CKAN is open-source software that relies on collaboration to develop functionality. If you extend this new feature, we would be very interested in using your code to improve CKAN and thereby encourage others to opt for open-source solutions.