Tech Onboarding Guide

This guide walks through the basics of tech work at Datopian.


First, please actively take notes on your experience so you can provide constructive and critical feedback 📣 both on this guide and especially on the tools you use, so that we can improve them. 👍

Create an issue in the Onboarding issue tracker based on this template to track your progress and record the results of your work. When completing items that require outputs, make sure you record the outputs in the issue (either inline in the task or at the bottom for larger items).

When doing something substantive

## Part I: Processes and tools

* [ ] Read *all* of the style guides
  * [ ] Installed any linters
* [ ] Read about job stories and watched the video at the bottom (you do *not* need to read all the links)
  * [ ] Created >= 2 job stories (try to make these as real and relevant as possible; pick something you are working on)
  * [ ] Written a short paragraph summarizing the difference between job stories and user stories
* [ ] Created >= 2 issues following the structure (e.g. in onboarding issue tracker)
* [ ] Command line git installed (and a GUI if you like that)
* [ ] Python 2 and 3 installed
* [ ] Recent version of NodeJS installed

## Part II: Data Packages

* [ ] Read the documentation
  * [ ] Summarize in your own words what a data package is
  * [ ] Create a minimal datapackage.json by hand (and validate it)
* [ ] Curate a new dataset
  * [ ] Select a dataset from
  * [ ] Turn it into a data package with a script to automate collecting the data
  * [ ] Added a graph (bonus)
  * [ ] Published to
* [ ] Provided feedback

## Part IIB: DataHub

* [ ] Account on
* [ ] Have published a sample dataset to your account
* [ ] Provided feedback on the experience

## Part III: CKAN Classic

* [ ] Read the full tutorial
* [ ] Install and launch CKAN with docker-compose
* [ ] Play with CKAN
* [ ] Created a working extension and published to GitHub/GitLab (post screenshots of results)

## Part IV: CKAN Next Gen

* [ ] Read the materials
* [ ] Frontend running
* [ ] Frontend customized
* [ ] Frontend deployed

## Part I: Processes and tools

Let's get familiar with our work environment and install relevant tooling. 🛠


## Part II: Data Packages

Intention: you are familiar with Data Packages and are able to curate a new dataset as a Data Package (and publish it to DataHub in the next step).


  • Frictionless Data and Data Packages.
    • Read this and the tutorial linked at the bottom.
  • Our best practice process for curation and publishing of datasets: Data Package + DataFlows + GitHub (+ Actions).
    • Data Package (plus Table Schema and CSV) is the container format plus the data schema.
    • GitHub (or GitLab) is our default location for storing (smallish) datasets.
    • We use GitHub Actions to automate running the pipeline, publishing to DataHub and doing continuous data integration.
  • Practice task: Curate a new dataset.
    • Select a dataset from – the easiest way is to look at the board and focus on "Ready to Package" (verify your choice with your mentor or coach).
    • Turn it into a data package.
      • With a script to automate collecting (and packaging) the data.
    • Validate it.
    • [Bonus] Add a graph to it.
    • Push to GitHub (or GitLab).
      • [Bonus] With collection automated by GitHub Actions.
    • Publish to
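
To make the practice task more concrete, here is a minimal sketch of a collect-and-package script using only the Python standard library. It is an illustration, not our actual pipeline: `collect_rows`, `infer_type`, `build_package`, `check_package` and the `example-dataset` name are all hypothetical, the type inference is deliberately crude, and the check is only a loose structural one. For real work you would normally use DataFlows and a proper Data Package validator.

```python
import csv
import json
from pathlib import Path


def collect_rows():
    """Hypothetical stand-in for the real collection step (e.g. a download)."""
    return [
        {"year": 2019, "value": 1.5},
        {"year": 2020, "value": 2.0},
    ]


def infer_type(values):
    """Very rough mapping from Python values to a Table Schema field type."""
    def is_int(v):
        return isinstance(v, int) and not isinstance(v, bool)

    if all(is_int(v) for v in values):
        return "integer"
    if all(is_int(v) or isinstance(v, float) for v in values):
        return "number"
    return "string"


def build_package(rows, out_dir, name="example-dataset"):
    """Write data/<name>.csv and a minimal datapackage.json into out_dir."""
    out_dir = Path(out_dir)
    (out_dir / "data").mkdir(parents=True, exist_ok=True)
    fields = list(rows[0].keys())

    csv_path = out_dir / "data" / f"{name}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)

    descriptor = {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": f"data/{name}.csv",
                "format": "csv",
                "schema": {
                    "fields": [
                        {"name": field, "type": infer_type([r[field] for r in rows])}
                        for field in fields
                    ]
                },
            }
        ],
    }
    (out_dir / "datapackage.json").write_text(
        json.dumps(descriptor, indent=2), encoding="utf-8"
    )
    return descriptor


def check_package(out_dir):
    """Loose structural check only (an assumption, not full spec validation)."""
    out_dir = Path(out_dir)
    descriptor = json.loads((out_dir / "datapackage.json").read_text(encoding="utf-8"))
    resources = descriptor.get("resources")
    if not isinstance(resources, list) or not resources:
        return ["descriptor needs a non-empty 'resources' list"]

    problems = []
    for res in resources:
        if "name" not in res:
            problems.append("resource is missing 'name'")
        path = res.get("path")
        if path is None and "data" not in res:
            problems.append("resource needs 'path' or 'data'")
        elif path is not None and not (out_dir / path).is_file():
            problems.append(f"missing file: {path}")
    return problems
```

Calling `build_package(collect_rows(), "my-dataset")` and then `check_package("my-dataset")` should return an empty list of problems when the CSV and descriptor are both in place. Keeping collection, packaging and checking in separate functions makes it easier to swap the collection step for a real source later.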

## Part IIB: DataHub

Intention: you can publish data to the DataHub and have published a dataset to your account.


## Part III: CKAN Classic

Intention: you are familiar with CKAN, you have it set up for development work and you have created a hello world extension.


## Part IV: CKAN Next Gen (CKAN 3)

Intention: you are familiar with CKAN Next Gen, and you have the frontend running, customized and deployed.

  • Read the overview.
  • Install and try the Next Gen Frontend.
    • Tweak the front page in some fun way to add content! Here are a few ideas to get started:
      • Change the layout with different margins and padding;
      • Add dynamic elements, such as a gallery slider;
      • Create a dark version of the website;
      • Customize colors, scrollbar, fonts, hovering effects, favicon, etc.
    • Add a new route to show a new page /dash (which can be empty other than a title).
    • Deploy this somewhere, e.g. Heroku or any host that supports NodeJS.