Over the last few months, Open Data Services developers and researchers have been beavering away developing Standards Lab. Standards Lab is a new tool that builds on CoVE to provide an experimental environment for developing new or existing open data standards.
The project is funded by the Open Data Institute as part of their Tools Development for Data Institutions and Data Access Initiatives project.
Why build Standards Lab?
Open Data Services work with many open data organisations. This work involves helping data publishers to produce high-quality open data, as well as assisting with the development of the data standards themselves.
One of the tools we use for both of these purposes is CoVE. This is a production service for data publishers to validate and explore their data prior to publishing. CoVE is carefully customised to an open data organisation’s needs, by providing additional data checks, visual theming, and copy specific to that organisation and standard. This makes it an excellent tool for a finalised standard, because the user experience is designed specifically for that standard.
However, there are many steps before coming up with the final data standard. A tool that is designed to be the final product isn’t sufficient during the development process.
Standards Lab builds on CoVE to create an environment where developers can easily test potential changes to a data standard, and understand how these changes would interact with any existing data.
What can Standards Lab do?
Projects
Standards Lab provides a project based workflow that allows standards developers to start a JSON schema based data standard from scratch or to continue developing an existing one.
To facilitate development Standards Lab allows projects to be forked, creating a new independent version of the current project which can be used for versioning the development of the standard.
As a web-based tool, Standards Lab is a natural fit for collaborative development. Project synchronisation allows a single development session to be synchronised across browsers and users.
Standards Lab also allows you to disable all editing of projects globally to provide a simple data testing service. Editing can also be controlled at a per-project level, where each project can be set to be editable by more than just its owner.
Developing a standard
In the project area you can upload, create and edit schema files, and upload, create and edit data to test against them.
The Application
Standards Lab has been designed so that it can either be deployed to a remote server (for example via Dokku) or installed as a local application using Docker.
The software architecture of Standards Lab is highly modular and easily extensible, with clear separation of responsibilities in the code. It follows a client-server model, in which standardised web-based APIs connect user interactions with backend processes. These APIs can also be used to access and extend functionality.
Standards Lab is an Open Source project hosted on GitHub. We have a live demo server, and the documentation is available on Read the Docs.
Data processing features of Standards Lab are provided by CoVE. These include schema-aware data format conversion (e.g. XLSX to JSON), JSON Schema validation and support for custom validation extensions. By integrating CoVE as a service we are able to process multiple data files in parallel and then share the validation and quality test results.
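As a rough illustration of processing several files in parallel and collecting the results, here is a small Python sketch using the standard library's `concurrent.futures`. The `check_file` function is a stand-in invented for this example: it does not call CoVE, and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def check_file(name):
    """Stand-in for a validation call; it only reports the detected format."""
    fmt = "spreadsheet" if name.endswith(".xlsx") else "json"
    return {"file": name, "format": fmt}


# Hypothetical file names, used only for illustration.
files = ["test-data.xlsx", "publisher-data.json"]

# map() runs the checks concurrently and returns results in input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(check_file, files))

for result in results:
    print(result["file"], result["format"])
```

Because the results come back in the same order as the input files, they can be shown together on a single summary page.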
An example use case for the 360Giving data standard
In this example the goal is to test and develop extensions to the 360Giving data standard.
First we create a new project on our Standards Lab instance, “ThreeSixtyGivingTest”. For this project we also set the top-level key to ‘grants’ to allow spreadsheet data upload (this tells Standards Lab the name of the key under which the spreadsheet rows of grant data should be nested).
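To illustrate what the top-level key does, here is a minimal Python sketch of nesting spreadsheet rows under a key. The function name `nest_rows` and the sample column names are hypothetical and are not part of Standards Lab or CoVE; the real conversion is schema-aware and handles far more than this.

```python
import csv
import io
import json


def nest_rows(csv_text, top_level_key):
    """Wrap each spreadsheet row (read here from CSV text) under one top-level key."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {top_level_key: rows}


# Hypothetical sample data; the column names are illustrative only.
sample = "id,title\ngrant-001,Community garden\ngrant-002,Youth club\n"
package = nest_rows(sample, "grants")
print(json.dumps(package, indent=2))
```

Each row becomes one object in the list stored under ‘grants’, which is the shape a JSON schema for the whole data package can then be tested against.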
Next we grab a copy of the JSON schemas for ThreeSixtyGiving and upload them. We can now make and save amendments to the schema by clicking on the files and editing them in the editor area.
Now we want to see what effect our changes to the standard might have on existing data. It is very important not to break any data that has already been published.
We upload some test data and also some live data from data publishers (not shown here).
We hit the “Start Test” button and, after a few seconds of processing, the two data files (in .xlsx spreadsheet and JSON formats) have been tested. We get a summary of the validation errors and can view the full details on a dedicated details page.
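The idea of testing existing data against an amended schema can be sketched in a few lines of Python. This is a deliberately simplified illustration that only looks at the `required` keyword of a schema; Standards Lab relies on CoVE for full JSON Schema validation. The helper `missing_required` and the sample schemas and records are assumptions made for this example.

```python
def missing_required(schema, record):
    """Return the required property names that are absent from a record."""
    return [name for name in schema.get("required", []) if name not in record]


# Hypothetical schemas: the amended version makes 'currency' required.
current_schema = {"type": "object", "required": ["id", "title"]}
amended_schema = {"type": "object", "required": ["id", "title", "currency"]}

# Hypothetical records standing in for data that has already been published.
published = [
    {"id": "grant-001", "title": "Community garden", "currency": "GBP"},
    {"id": "grant-002", "title": "Youth club"},
]

# Records that pass the current schema but would fail the amended one.
newly_broken = [
    record["id"]
    for record in published
    if not missing_required(current_schema, record)
    and missing_required(amended_schema, record)
]
print(newly_broken)
```

A record that appears in `newly_broken` is exactly the kind of already-published data that a proposed change would invalidate.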
Once we’re happy with the changes and results we can share this project with others and let people test different data. This example shows how we have used just one tool to develop, test and collaborate on a data standard, something that wasn’t previously possible.
Future plans
Standards Lab development is at an early stage and we have many ideas for future development.
They include:
- Connect Standards Lab with our guide to developing a standard, ‘An open data standard toolkit’, and other resources for people developing standards
- Enhance the test extensions / additional tests mechanism
- Investigate spreadsheet to schema conversion
We’re already collecting issues, new features and enhancements on the GitHub project. We welcome any contributions and feedback.
Now you’ve read the details, why not have a look at our demo site!