
Making our Municipalities more Transparent using Python

Adam Kariv

DataCity is a project aimed at creating a single repository of all municipal data in Israel. In my presentation at this year’s PyCon Israel, I’ll talk about the project and the Python toolset we’ve built to create and manage this large ETL operation.

Municipalities are probably the branch of government that affects us the most (think education, garbage collection, building permits, etc.).
They are also notoriously non-transparent, which makes it difficult for us citizens to make sure that the people in charge are making good use of our taxes and that our city is performing well in comparison to others.

Less than half of Israel’s municipalities publish essential information on their websites, such as phone numbers or the opening hours of city hall. Moreover, about 7% of municipalities don’t even have a website.

And that’s only the most basic sign of their lack of transparency and openness.

Very few publish good-quality data methodically. And even “good-quality data” is not enough, because there is no shared standard.

At the beginning of 2019 we (at the Public Knowledge Workshop, also called “Hasadna”) embarked on DataCity, a project to make municipalities more transparent. In this project we aim to create a single API endpoint for all municipalities’ data (normalized, standardized, verified and regularly updated).
There are a few problems along the way, though. First, municipalities don’t really want to be transparent. Second, the data is of low quality and very non-uniform.

To address the second problem, we’re building a versatile framework for extracting data from various sources and formats, cleaning it, mapping it to a predefined schema, validating it with domain-specific rules, enriching it and finally publishing it in our data warehouse.
We’re doing all that in a reusable way, based on open source tools.
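
To give a feel for what such a pipeline looks like, here is a minimal sketch in plain Python. It is not the DataCity code and it does not use the dataflows API. The sample rows, the column mapping, the validation rule and the `currency` enrichment are all made up for illustration. Each stage is a small generator, so stages can be chained and reused across data sources.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]

# Hypothetical raw rows, as two municipalities might publish similar data.
RAW_ROWS: List[Row] = [
    {"City": " Tel Aviv ", "Budget Item": "Education", "Amount (NIS)": "1,200,000"},
    {"City": "Haifa", "Budget Item": "Sanitation", "Amount (NIS)": "n/a"},
]

# Hypothetical mapping from source column names to a predefined schema.
COLUMN_MAP = {"City": "municipality", "Budget Item": "item", "Amount (NIS)": "amount"}


def clean(rows: Iterable[Row]) -> Iterator[Row]:
    """Strip stray whitespace from column names and string values."""
    for row in rows:
        yield {k.strip(): v.strip() if isinstance(v, str) else v for k, v in row.items()}


def map_to_schema(rows: Iterable[Row]) -> Iterator[Row]:
    """Rename known columns to the schema names and drop unknown ones."""
    for row in rows:
        yield {COLUMN_MAP[k]: v for k, v in row.items() if k in COLUMN_MAP}


def validate(rows: Iterable[Row], rejected: List[Row]) -> Iterator[Row]:
    """Assumed domain rule: the amount must parse as a non-negative integer."""
    for row in rows:
        try:
            amount = int(str(row["amount"]).replace(",", ""))
        except ValueError:
            rejected.append(row)
            continue
        if amount < 0:
            rejected.append(row)
            continue
        yield {**row, "amount": amount}


def enrich(rows: Iterable[Row]) -> Iterator[Row]:
    """Add a derived field (an assumed constant currency code)."""
    for row in rows:
        yield {**row, "currency": "ILS"}


def run_pipeline(raw_rows: Iterable[Row]) -> Tuple[List[Row], List[Row]]:
    """Chain the stages; 'publishing' here just means collecting the rows."""
    rejected: List[Row] = []
    rows = enrich(validate(map_to_schema(clean(raw_rows)), rejected))
    return list(rows), rejected


published, rejected = run_pipeline(RAW_ROWS)
```

Keeping rejected rows instead of silently dropping them is a deliberate choice in this sketch: with low-quality, non-uniform sources, knowing what failed validation matters as much as what passed.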

In my presentation I’ll talk in more detail about the software tools we used (e.g. the dataflows ETL library) as well as the reasons for the lack of transparency.

I’m a Senior Data Engineer at Datopian, an open data consultant and activist, and a founder of the Public Knowledge Workshop (הסדנא לידע ציבורי).

A few of my projects: The Budget Key, for opening up the Israeli budget and spending; OpenSpending, a global database for fiscal data; Dataflows, a lightweight and versatile ETL library.

