In the first article of this three-part series on open source strategy, we saw how Open Source Software (OSS) has become part of corporate strategy and how building an ecosystem pays off. By sharing your code and vision openly, you enable others to expand your solution and adapt it to their specific environments. It is not rare for unexpected innovation to emerge from this process. In this article, we will look at the steps to create such an innovation-friendly environment, taking the OpenRefine community as an example.
No developer is an island.
Very few developers work in isolation from the rest of the world, even in today’s corporate environment. R&D departments are opening up, thanks in part to open source projects, which enable developers to be part of something bigger than their current project or local implementation.
Opening a project can happen in many different ways: through an API, a Software Development Kit (SDK), or by releasing the entire code base under an open license. When a company’s technology opens up, it becomes easier to build a community of developers, partners, collaborators, and users. However, communities don’t create themselves spontaneously: you will have to actively reach out to potential collaborators and nurture those relationships over time. As the community leader, you set the direction of the project while listening and responding to feedback from the community. While working on the project roadmap or the next feature, you need to strike a balance between the interests of the community and your own. If you don’t, you run the risk of cutting yourself off from your users and contributors. Even worse, some might fork the open code and start a community of their own, weakening your effort.
The investment can be very lucrative. The new community will help expand your brand and provide valuable insights. Managed properly, the feedback, bug reports, and code contributions will together drive innovation in your organization and open up new markets.
Community-Based Innovation with OpenRefine/Refine
The OpenRefine community is a great example of unexpected innovation within an open source project. Refine enables non-technical people with business expertise to explore, enrich, and prepare data for daily decisions by making powerful data cleaning and mining functions available in a point-and-click interface.
Refine began its existence as Freebase Gridworks, a companion tool to clean and load data into Freebase, a “community-curated database of well-known people, places, and things”. It evolved to support broader data hygiene work across more and more platforms. In 2010, Google purchased Metaweb, the publisher of Freebase, and renamed the tool “Google Refine”. In 2012, Google stopped supporting Refine, by which time it was a mature data cleaning tool with contributors outside its initial community. This community rebranded the project as OpenRefine, and it kept drawing interest as the first self-service data normalization and preparation platform, with over a thousand weekly downloads.
In addition to the core data refining software, Refine’s extension-friendly architecture allows third parties to extend the functionality and connect it to their services. Extensions can be broken down into four types:
- Importing data from a system
- Exporting data to a system
- Querying a remote data processing service via its API
- Reconciling against a remote master data source: a reconciliation service extends your data by performing a fuzzy join with that source, which helps align taxonomies and import new information into your project
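To make the idea of a fuzzy join more concrete, here is a minimal sketch in Python. It is not OpenRefine's code, its reconciliation protocol, or its matching algorithm: the `MASTER` dictionary, the `reconcile` function, the identifiers, and the 0.6 threshold are all hypothetical, and a small in-memory dictionary stands in for the remote master data source. String similarity comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical master data: an in-memory stand-in for a remote
# master data source. The identifiers are made up for illustration.
MASTER = {
    "m1": "International Business Machines",
    "m2": "Google",
    "m3": "Metaweb Technologies",
}


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def reconcile(value, master=MASTER, threshold=0.6):
    """Return (identifier, score) for the closest master entry.

    The identifier is None when no entry reaches the threshold
    (an assumed cut-off, chosen only for this sketch).
    """
    best_id, best_score = None, 0.0
    for entry_id, name in master.items():
        score = similarity(value, name)
        if score > best_score:
            best_id, best_score = entry_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


if __name__ == "__main__":
    # Messy values as they might appear in a project column.
    for raw in ["google ", "Metaweb Technologies Inc", "zzzz"]:
        print(repr(raw), "->", reconcile(raw))
```

Once a messy value is linked to an identifier in the master source, any attribute stored under that identifier can be pulled into the project, which is the "extend your data" part of reconciliation. A real service would typically also return several ranked candidates so that a person can review uncertain matches.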
Refine integrates with over 16 reconciliation services and has 10 community-contributed plugins that extend its capabilities. Through the extensions and reconciliation services, each industry can customize Refine for its needs while sharing the core functionality for data cleaning, exploration, and normalization (e.g. removing duplicate data and types, splitting and merging different fields). The following map lists the different services and plugins working with OpenRefine, as well as projects that have done heavy customization to add OpenRefine to their data manipulation processes. Details about each extension can be found on the OpenRefine blog.
Building an open ecosystem isn’t easy; however, when done properly, it provides valuable competitive advantages, including expanded brand recognition and new sources of innovation. To ensure the continuity and sustainability of the project, the community now needs to scale, reach critical size, and drive broad adoption.
Continue Reading – Part 3: Scale your Open Source Communities For Success.