Recently we have seen several projects attempting to map community data to help people in these coronavirus times. We can understand why. People need good information more than ever, and they need information on a range of services and topics they weren’t looking up before.
For example, several projects have sprung up to list local businesses that deliver in Edinburgh — Lockdown Economy, Local Shopper and The Reporter are all examples. The Scottish Council for Voluntary Organisations (SCVO) have created a Scottish directory with various ways for people to get help. (EDIT: All these projects have now gone off-line.)
Unfortunately we also know that these kinds of projects are very difficult. Getting good data and then keeping it up-to-date is a challenge. If these projects are to stay relevant in the long term, this needs to be considered.
This post will ask nine questions that people should consider when working on one of these projects.
Question 1 — Will you ask people to send you data?
The challenge is that people running useful places already get a lot of requests to “just add their data” to many different websites. These people are always busy — but even more so at the moment. Why should they spend their time on your project? That’s a hard question to answer, and we will return to that at the end.
If they do decide to contribute, respect their time and make it as easy as possible for them. Make sure the process they have to follow is simple and quick.
Think about how this creates workload. You could just ask people to email you. This is easy for them but creates more work for your poor admins. Is it possible to guide people to submit more helpful data, but also in a form that can be used immediately?
Another issue is that in some cases, people may feel they aren’t meant to add data even when the project encourages them to — we have seen people unwilling to add public data about something without “permission”. Think about ways to encourage people.
Question 2 — Can you reuse existing data?
Just asking people to add data takes time, and puts a work burden on people.
Are there existing data sources out there that you can reuse? These could be other lists of information, or it might be possible to automatically get information directly from something you want to list (for example, by parsing information from a company website). If you can make this process automatic by regularly consuming Open Data, it’s even better. This cuts down on the amount of work people have to do and also helps with the next question …
Question 3 — How can you keep the data up to date?
Data like this can very quickly become out of date and this is a key challenge. How are you going to keep it up to date?
We can think of some automatic things that could help here. As we just discussed, regularly consume good Open Data. You could have a bot that checks any websites in your data and raises an alert if there are errors such as a 404 status. You could track when an entry was last edited and, if it is old, prompt someone to check.
But also make it easy for people to keep the data up to date. For instance, make it easy for people to report any problems they find when browsing your website via a simple form on each item. Or if you have an email address of someone who gave you some data in the first place, can you set up an automatic process to email them regularly (with permission of course) and give them easy ways to tell you the data is still correct or tell you of any changes?
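The automatic checks mentioned in this section could be sketched roughly as follows. This is only an illustration under assumptions: the entry fields (name, website, last_checked), the 90-day threshold and the function names are all made up for the example, and a real bot would also need rate limiting and retries.

```python
from datetime import date, timedelta
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it could not be reached."""
    request = Request(url, method="HEAD", headers={"User-Agent": "directory-link-checker"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout or a malformed URL.
        return None


def find_problems(entries, today, max_age_days=90, get_status=fetch_status):
    """List (entry name, reason) pairs for entries that a person should look at."""
    cutoff = today - timedelta(days=max_age_days)
    problems = []
    for entry in entries:
        if entry["last_checked"] < cutoff:
            problems.append((entry["name"], "not checked since " + entry["last_checked"].isoformat()))
        status = get_status(entry["website"])
        if status is None or status >= 400:
            problems.append((entry["name"], f"website returned {status}"))
    return problems


# Hypothetical entries, with a fake status lookup so the example runs offline.
example_entries = [
    {"name": "Corner Bakery", "website": "https://example.org/bakery", "last_checked": date(2020, 1, 10)},
    {"name": "Veg Box", "website": "https://example.org/vegbox", "last_checked": date(2020, 4, 20)},
]
fake_statuses = {"https://example.org/bakery": 200, "https://example.org/vegbox": 404}
example_problems = find_problems(example_entries, date(2020, 5, 1), get_status=fake_statuses.get)
```

Leaving out the get_status argument makes find_problems use fetch_status and check the live websites. The resulting list could feed an alert email to your admins or a "needs checking" queue.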
Question 4 — How can you encourage others to use your data?
You will be presenting this data on some kind of website or app. The first thing to do is to really think about your users’ needs. Why are they searching? What kind of criteria will they want to search by? How will they want to see the results? Presenting your data in the most accessible way will help people use it directly.
For instance, a lot of people display results on maps — it’s fun and interactive! But actually this may not be the best way for a user to go through the results. If you are listing businesses that deliver, what does it matter where they are currently — they come to you! The real question the user cares about is “will this business deliver to my area?” and that is much harder to answer.
Question 5 — How can you encourage others to REuse your data?
You want your project to have the biggest impact it can.
You should — and it’s no surprise we recommend this — make your data into Open Data. If other people take your data and reuse it in other places then your data has been even more useful. This will also help others working in the same space and reduce duplication of effort.
Think about putting in place an Open Data licence, and think of a good way for people to automatically get the data like an API or bulk download.
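As one possible shape for a bulk download, here is a minimal sketch that bundles the entries with the licence information in a single JSON file. The field names are assumptions for illustration, and Creative Commons Attribution 4.0 is used only as an example of an open licence; choose the one that suits your project.

```python
import json
from datetime import datetime, timezone


def build_bulk_download(entries, generated_at):
    """Return a JSON document holding every entry plus the licence it is released under."""
    package = {
        "licence": {
            # Example licence only; swap in whichever open licence you choose.
            "name": "Creative Commons Attribution 4.0 International",
            "url": "https://creativecommons.org/licenses/by/4.0/",
        },
        "generated_at": generated_at.astimezone(timezone.utc).isoformat(),
        "entries": entries,
    }
    return json.dumps(package, indent=2, ensure_ascii=False)


# Hypothetical data to show the output shape.
example_download = build_bulk_download(
    [{"name": "Corner Bakery", "website": "https://example.org/bakery"}],
    datetime(2020, 5, 1, 9, 0, tzinfo=timezone.utc),
)
```

Regenerating this file on a schedule and serving it from a stable URL gives reusers something they can fetch automatically, and the embedded licence travels with every copy.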
Question 6 — Which existing data standards should you use to publish your data?
Your Open Data has now been published — but is it useful to others?
If you can publish your data in a data standard which is common and is used by other people, that’s even better. It means others have to do less work to reuse your data.
For example, if you are listing events then publishing data in an iCal/ICS format is essential. People can even import these data feeds directly into their personal calendars — how’s that for an impact for your project?
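To give a feel for how little is needed, here is a minimal sketch that writes an iCal/ICS feed by hand using only the Python standard library. The event fields (uid, title, start, end) and the PRODID value are assumptions for the example, start and end are expected to be timezone-aware datetimes, and the folding of long lines that the iCalendar specification asks for is left out. A maintained iCalendar library is likely a safer choice for a real project.

```python
from datetime import datetime, timezone


def escape_text(value):
    """Escape the characters that have special meaning in iCalendar text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def format_utc(moment):
    """Format a timezone-aware datetime as an iCalendar UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def events_to_ics(events, generated_at):
    """Build a calendar feed with one VEVENT per event."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//Example Community Project//Events//EN"]
    for event in events:
        lines += [
            "BEGIN:VEVENT",
            "UID:" + event["uid"],
            "DTSTAMP:" + format_utc(generated_at),
            "DTSTART:" + format_utc(event["start"]),
            "DTEND:" + format_utc(event["end"]),
            "SUMMARY:" + escape_text(event["title"]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar files use CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical event to show the output shape.
example_ics = events_to_ics(
    [
        {
            "uid": "event-1@example.org",
            "title": "Online meetup, week 2",
            "start": datetime(2020, 5, 14, 18, 0, tzinfo=timezone.utc),
            "end": datetime(2020, 5, 14, 19, 30, tzinfo=timezone.utc),
        }
    ],
    generated_at=datetime(2020, 5, 1, 9, 0, tzinfo=timezone.utc),
)
```

Keeping each UID stable between runs matters: calendar apps use it to recognise an updated event rather than adding a duplicate.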
Also look at schema.org: you will probably want to mark up your web pages with this standard anyway, and there may be some existing models you can reuse.
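As a rough illustration, a listing for a business that delivers could be described with the schema.org LocalBusiness type and embedded in its web page as JSON-LD. The entry fields used here (name, website, delivers_to) are hypothetical, and which schema.org types and properties fit best will depend on your data.

```python
import json


def business_to_jsonld(entry):
    """Describe one directory entry using schema.org vocabulary."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": entry["name"],
        "url": entry["website"],
        # areaServed records where the business delivers, not where it is based.
        "areaServed": entry["delivers_to"],
    }


def jsonld_script_tag(data):
    """Wrap the JSON-LD in a script tag that can be placed in the page's HTML."""
    # Escape "</" so the data cannot close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical entry to show the output shape.
example_jsonld = business_to_jsonld(
    {"name": "Corner Bakery", "website": "https://example.org/bakery", "delivers_to": ["Leith", "Portobello"]}
)
example_tag = jsonld_script_tag(example_jsonld)
```

The areaServed property is a natural place to record the answer to “will this business deliver to my area?”.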
Research more specialised data standards too. For instance, does Open Referral fit?
Question 7 — How can you spread the load of editing?
Many of these projects end up with one or a few people bearing the workload of editing the data. This is a problem as it is very easy for these people to feel burnt out eventually and then the whole project stops. Also people just want to take holidays sometimes!
Instead, can you set your project up such that many people can share the work of editing data? Think of a wiki model like Wikipedia that has thousands of editors. Is there a way you can safely allow this in your project?
For example, Open Tech Calendar managed to get hundreds of people to contribute; this old blog post shows the “long tail” of contributions.
Question 8 — How can you test and demonstrate the quality of your data?
When someone looks at your project they will have questions about how trustworthy the data is. In some cases, it can take people years to trust that a project will provide the correct information and that it’s worth putting effort into.
Are there ways you can regularly test how good your data is? Then once you know your data is good, are there ways you can communicate that to your users? For example, could you show the source or when the data was last checked?
Question 9 — How can you tie this all together?
Many of these points end up reinforcing each other, so it’s important to think about how they can all work together.
For instance, why should people take the time to add data to your project? If you can show that the data will be well looked after, will be well used and will be reused in other places then it’s easier to answer that question.
Or why should you spend time publishing your data as open data? It becomes a virtuous circle — if you can show that projects like this can be successful and other people start doing it then you can share data sources and help each other.
But some of these points can oppose each other. It’s up to you to find the right balance for your project.
For instance, one way to make it easy to add data is to put more workload on your admins — but that can lead to burnout. Another way to make it easy is to keep the number of fields of information you request low — but then there might not be enough information there and people won’t find your data useful.
Useful, usable and in use
At Open Data Services Coop our tagline is making data Useful, Usable and In Use. We want community data projects to be all those things and more! We recognise the impact a good data project can have as people search for information in these difficult times and we wish your project success.
EDIT: Thanks to Ian for pointing out that I’d missed something: see if your data goals and openness goals match with an existing data platform, and if they do, just use that. For example, have a look at Wikidata or OpenStreetMap.