Data Engineering, Data Integrity, Data Migration, Data Strategy & Strategic Consulting, Custom Data Portal
October 2020 – June 2022
Brief summary of the project
In a strategic collaboration, Datopian and the Global Initiative for Fiscal Transparency (GIFT) revamped the OpenSpending platform, delivering an end-to-end solution that modernized it to serve as an effective Open Government Data platform. The partnership advanced global fiscal transparency, accountability, and citizen engagement by improving data integrity, accessibility, and usability.
The OpenSpending platform had grown obsolete and could no longer meet the modern demands of fiscal transparency, open governance, and public data access. It lacked up-to-date capabilities for data integrity, schema standardization, and user management, which posed a challenge for GIFT and its complex stakeholder environment.
An updated, sustainable platform was urgently needed to serve governments, civil society, and financial institutions globally. The solution had to accommodate historical data, offer flexible user management, support effortless data sharing, and allow dynamic schema modifications, all while ensuring a user-friendly interface.
Drawing on modern, well-established technologies, Datopian re-engineered OpenSpending into a streamlined, user-friendly platform. The portal combines smooth data migration, robust error reporting, easy data schema modification, and automated dataset merging. It sets a high standard for how governments and public entities share, manage, and derive value from data, thereby fostering greater transparency and accountability.
Main technologies & tools used
The Global Initiative for Fiscal Transparency (GIFT) serves as a cornerstone in the global governance landscape, providing a dynamic platform for fostering communication among key actors such as governments, civil society organizations, international financial institutions, and a variety of other stakeholders. Functioning under the stewardship of the International Budget Partnership (IBP), a nonprofit entity established in accordance with the laws of the District of Columbia, GIFT operates under rigorous financial and administrative protocols set by IBP. As a central hub dedicated to unearthing and disseminating innovative solutions to the multi-faceted challenges of fiscal transparency and public participation, GIFT aims to pave the way for enhanced accountability and responsible governance on a global scale.
In today's digitally-driven world, Open Government Data (OGD) stands as a pivotal force in amplifying government transparency and fortifying public accountability. But its impact doesn't end there; it also catalyzes economic innovation and creates value for business owners and citizens by making large amounts of data publicly accessible. Born over a decade ago, OpenSpending has been instrumental in this transformation, offering a comprehensive platform to "search, visualize and analyze fiscal data in the public sphere". It even served as a direct inspiration when conceiving the powerful CKAN open-source data management software!
However, as with all technology, OpenSpending is not immune to the march of time. Over the years, it has matured and fulfilled its purpose. Since its stewardship transitioned to Datopian in 2020, a modern incarnation of the platform has been required to keep the project alive and maintainable for the foreseeable future. Building on its technical strengths, a suitable solution needs to tackle data integrity, schema standardization, processing and storage of flat files such as CSVs, and metadata management, to name a few key data issues. But it is not just about storing flat files; it is about transforming the way these datasets interact, operate, and deliver value.
This is where GIFT, the Global Initiative for Fiscal Transparency, plays an important role in sustaining the future of OGD. By becoming the official host of OpenSpending's successor platform, GIFT emerges as a vital stakeholder in the OGD ecosystem. What makes GIFT a fitting custodian? Its close collaborative history with the primary users of OpenSpending, including governments around the world, civil society organizations, and international financial institutions. GIFT's mission, to facilitate dialogue between its stewards, partners from the aforementioned entities, and other stakeholders to "find and share solutions to challenges in fiscal transparency and participation", resonates powerfully with the ever-increasing demands of a globally connected, data-savvy community.
At a high level, the main objective is to ensure that a sustainable platform for sharing fiscal data remains available to existing users. On the one hand, historical data needs to be backed up across different cloud providers. On the other hand, functionality has to be ported over and the whole data flow adapted to a more modern approach. To make the transition as smooth as possible, there are some pain points and desired solutions to keep in mind:
- Flexible User Management & Authentication: To eliminate the inefficiency of sharing a single account among multiple users within an organization;
- Intuitive UI with Robust Error Reporting: A user-friendly interface that delivers precise, actionable error messages during the data schema validation process;
- Effortless Data Sharing: The capability to instantly share datasets via a secure link, maximizing both accessibility and convenience;
- Dynamic Schema Modification: Empower GIFT administrators with the tools to easily modify data schemas as needs evolve;
- User-Centric Data Publication: Simplify the dataset publishing process to be seamless and straightforward, requiring zero programming expertise;
- Automated Dataset Merging: Introduce mechanisms that automatically consolidate multiple datasets, eliminating manual errors and streamlining the process;
- Versatile File Encoding Support: Default to UTF-8 encoding, while offering compatibility with other prevalent file encodings.
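Two of the requirements above, encoding support and actionable validation errors, can be illustrated with a minimal Python sketch. It is not the portal's actual implementation: the fallback order in `FALLBACK_ENCODINGS`, the example `SCHEMA`, and the function names `decode_bytes` and `validate_csv` are all assumptions made for illustration. The source only states that UTF-8 is the default and that error messages should be precise.

```python
import csv
import io
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical fallback order. "utf-8-sig" decodes UTF-8 with or without a
# byte-order mark, so it stands in for the UTF-8 default.
FALLBACK_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")

# Hypothetical schema: column name -> function that casts the raw string.
SCHEMA: Dict[str, Callable] = {"year": int, "amount": float, "agency": str}


def decode_bytes(raw: bytes, encodings: Sequence[str] = FALLBACK_ENCODINGS) -> Tuple[str, str]:
    """Return (text, encoding_used), trying each encoding in order."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("Could not decode file with any of: " + ", ".join(encodings))


def validate_csv(text: str, schema: Dict[str, Callable]) -> List[str]:
    """Return human-readable error messages; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [name for name in schema if name not in (reader.fieldnames or [])]
    if missing:
        return ["Missing column(s): " + ", ".join(missing)]

    errors = []
    # Data rows start at 2 because row 1 of the file is the header.
    for row_number, row in enumerate(reader, start=2):
        for column, cast in schema.items():
            value = row[column]
            try:
                cast(value)
            except (TypeError, ValueError):
                errors.append(
                    f"Row {row_number}, column '{column}': "
                    f"expected {cast.__name__}, got {value!r}"
                )
    return errors
```

A publisher tool could call `decode_bytes` on the uploaded file and pass the resulting text to `validate_csv`, showing each returned message next to the offending row. Reporting the row number, column, expected type, and actual value is one way to make errors actionable for users without programming experience.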
Having contributed heavily to the healthy development of OpenSpending since its inception—enriching its codebase with open-source contributions—Datopian is well-situated to bring forth the evolution of the superseding data portal.
The journey towards creating an efficient, reliable, and user-friendly data portal can be broken down into five key stages:
First Stage: Exhaustive Data Audit. As a starting point, conduct a comprehensive review and assessment of existing datasets to lay the foundation for the migration strategy;
Second Stage: Seamless Data and Services Migration. Ensure the secure transition of data and associated services to the new platform;
Third Stage: Development of the GIFT Portal. Create an innovative data portal platform, complete with a public-facing website for dataset display and a specialized publisher tool for managing datasets via a dashboard;
Fourth Stage: Data Restoration. Back up and reintegrate all existing data into the newly established platform;
Fifth Stage: Beta Rollout. Launch the Beta version of the GIFT Portal, strategically phasing out the older system.
Drawing on lessons learned from a long usage history, we pinpointed common sources of failure and frustration in the system and addressed most of these real-world issues in successive iterations; incremental improvement is, after all, the only viable way to ship working software in an ever-changing technological landscape. First, to simplify the complicated process of updating datasets and publishing new ones, an intuitive interface was designed from scratch with fewer moving parts behind the scenes. The approach relies on established software providers so that we can focus on building a set of customized features instead of crafting every one of them from the ground up, which would be too costly and time-consuming. The result is a custom-made data portal that is simpler to use, which we affectionately call GIFT Portal.
GitHub is a major part of the back-end infrastructure and allows the new portal to solve many issues. Its extensive support for user permissions and team management is, on its own, a compelling reason to use it. In addition, keeping data under version control provides numerous benefits, GitHub Actions can handle continuous integration and continuous delivery (CI/CD), and Git with the Large File Storage (LFS) extension lifts the standard file size limitations. Furthermore, Datopian has been working on tooling related to Git, most notably Giftless, a Git LFS server that enables the connection of the portal to a cloud storage provider.
Some OpenSpending services were to be migrated, and since those were hosted on the Google Cloud Platform, Google Storage is a natural fit to store all the GIFT datasets. The idea is then to store metadata on GitHub, rely on a Git LFS server to upload resources to the cloud, and let users of the GIFT Portal retrieve data directly from Google Storage: a simple, effective, and reliable flow.
- GitHub's Robust Infrastructure: Leveraged for its exceptional user permission capabilities, team management, version control, and CI/CD pipelines through GitHub Actions.
- Git Large File Storage (LFS): Addresses standard file size limitations, enhancing data storage efficiency.
- Giftless by Datopian: Serves as the Git LFS server, enabling seamless integration with cloud storage solutions.
- Google Storage: Chosen for natural compatibility, as previous OpenSpending services were hosted on Google Cloud Platform.
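To make the Git LFS part of this flow more concrete, the sketch below shows the two artifacts a Git LFS client produces for a large file: the small pointer file that is committed to the Git repository in place of the data, and the JSON body of an upload request to the LFS server's Batch API. The pointer and request formats follow the public Git LFS specification. The function names are hypothetical, and how Giftless and the GIFT Portal are actually configured is not described in the source.

```python
import hashlib
import json

LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"

# Media type that the Git LFS Batch API expects in the Accept and
# Content-Type headers.
LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def lfs_pointer(content: bytes) -> str:
    """Build the pointer file stored in Git instead of the real file."""
    oid = hashlib.sha256(content).hexdigest()
    return f"version {LFS_SPEC_URL}\noid sha256:{oid}\nsize {len(content)}\n"


def batch_upload_request(content: bytes) -> str:
    """Build the JSON body sent to the LFS server's objects/batch endpoint."""
    body = {
        "operation": "upload",
        "transfers": ["basic"],
        "objects": [
            {"oid": hashlib.sha256(content).hexdigest(), "size": len(content)}
        ],
    }
    return json.dumps(body)
```

In this flow, the LFS server answers the batch request with upload instructions, typically a URL for each object. That is how a server such as Giftless can direct the file contents to cloud storage while only the pointer and metadata live in the GitHub repository.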
Having covered some of the biggest pieces of the puzzle, it is time to introduce the rest of the technology stack that powers the platform. On the back-end side, data flow management is handled principally by Apollo. GitHub Actions and Vercel work hand in hand to manage automatic deployments of the application, and Next.js, another tool offered by Vercel, deals with server-side rendering, smart bundling, route prefetching, and more.
On the front end, the application will be tested with Cypress and Jest, which offer parallelization and load balancing to run quickly on a CI/CD pipeline. React will be our web framework of choice to build reusable, stateful components. On top of React, Material UI will be the main user interface library to display tables and the Tailwind CSS framework will help to shape the portal into a colorful and expressive reincarnation of OpenSpending. Helping with the data publishing workflows, Datapub, another Datopian offspring, is going to assist us with common components to speed up the development phase.
- Apollo: Manages back-end data flow
- Vercel & GitHub Actions: Work in synergy for automated deployments
- Next.js: Takes charge of server-side rendering, smart bundling, and route prefetching
User Interface & Testing
- Cypress and Jest: Ensure robust testing capabilities, with CI/CD pipeline optimization.
- React: Our chosen web framework for constructing reusable, stateful components.
- Material UI: Serves as the main UI library, complemented by Tailwind CSS for aesthetic finesse.
- Datapub: A Datopian innovation that accelerates the development phase by providing common components for data publishing workflows.
After years of loyal service, OpenSpending is ready to be sunset now that GIFT Portal is its worthy replacement. Schema validation features were migrated over, since data integrity is a crucial aspect of publishing open fiscal data, and the ability to modify the data schema was also taken into consideration, giving the platform freedom to keep evolving. No longer at the mercy of custom-built microservices that require a team of software engineers to keep the source code up to date with constantly changing technologies, GIFT Portal can count on stable products from well-established companies. Its simplified infrastructure can expect much lower maintenance costs and ongoing technical updates from those providers. Necessary data transformations, such as merging resources in a dataset, are written as well-tested functional code, minimizing dependencies between different parts of the system for a more robust experience.
Soon, the OpenSpending API will be migrated as is and synchronized with the new platform to restore programmatic access for the governments that will continue to rely on it in the coming months. In the meantime, users can be confident that their open data will remain publicly available, searchable, and conveniently editable thanks to GIFT Portal, a notable and long-overdue software upgrade.
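As an illustration of what merging resources in a functional style might look like, here is a minimal Python sketch. It assumes that resources are CSV texts sharing an identical header and that merging means concatenating their rows; the source does not specify the portal's actual merge rules, and `merge_resources` is a hypothetical name.

```python
import csv
import io
from typing import Sequence


def merge_resources(resources: Sequence[str]) -> str:
    """Concatenate CSV resources that share the same header.

    Pure function: the inputs are not modified and the result depends
    only on the arguments, which keeps it easy to test in isolation.
    """
    if not resources:
        return ""

    parsed = [list(csv.reader(io.StringIO(text))) for text in resources]
    header = parsed[0][0] if parsed[0] else []

    # Refuse to merge when any resource has a different (or missing) header.
    for index, rows in enumerate(parsed):
        if rows[:1] != [header]:
            raise ValueError(
                f"Resource {index} has header {rows[:1]}, expected {[header]}"
            )

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for rows in parsed:
        writer.writerows(rows[1:])  # skip each resource's own header row
    return out.getvalue()
```

Because the function has no side effects and no dependency on storage or network services, it can be covered by plain unit tests, which is the kind of decoupling the paragraph above describes.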