The recent proliferation of data has put it at the center of analytics, system migration, and reconciliation processes. However, data cleaning and preparation (which includes data normalization, duplicate removal, pivoting, joining, and splitting data) is still a major hurdle in the process. Tools available for end users haven’t fully caught up with the demand. Spreadsheets offer an entry-level interface to the data but are time-consuming and don’t scale, while programming languages offer flexibility but have a steep learning curve for non-technical users.
OpenRefine addresses the growing data literacy gap by lowering the technical skills needed to normalize and prepare data. It gives those who understand the context in which the data are generated or used the best of both worlds: an iterative interface for data discovery and preparation, and an easy-to-learn scripting language.
Thanks to OpenRefine, subject matter experts with an in-depth knowledge of a specific issue or domain can:
- explore data related to the topic, drilling down to get a sense of the information available and to find nuggets of information, inconsistencies, and quality gaps;
- clean and export the data to a format useful for their needs by doing data normalization, removing duplicates and typos, pivoting, joining and splitting columns; and
- enrich the project by joining data sets together, processing data via an API, or working with a reconciliation service.
Extensive Input & Output Support
Cluster & Deduplication
Filter & Sort
Join, Merge & Reconcile
Join and merge different data sets to build new views and insights. Create lookups and reconcile against master data sources. Easily concatenate multiple fields together.
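As a rough illustration of what a lookup against a master data source involves, here is a minimal Python sketch. It is not OpenRefine's implementation; the function name `lookup_join` and the sample records are hypothetical and exist only for this example.

```python
def lookup_join(rows, master, key, fields):
    """Copy `fields` from the master record whose `key` matches each row.

    Hypothetical helper for illustration only. Rows with no match in the
    master data keep their own values and get None for the looked-up fields.
    """
    # Index the master records by key so each lookup is a dictionary access.
    index = {record[key]: record for record in master}
    joined = []
    for row in rows:
        match = index.get(row[key], {})
        joined.append({**row, **{f: match.get(f) for f in fields}})
    return joined


# Made-up sample data: one order matches the master list, one does not.
orders = [
    {"country_code": "FR", "amount": 10},
    {"country_code": "XX", "amount": 5},
]
countries = [{"country_code": "FR", "country": "France"}]

result = lookup_join(orders, countries, "country_code", ["country"])
```

Unmatched rows are kept rather than dropped, so gaps in the master data stay visible for review.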
Split a field into multiple columns based on any character(s). Create new rows from multi-value cells.
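The two operations described here can be sketched in a few lines of Python. This is only an illustration of the idea, not how OpenRefine works internally: the function names are hypothetical, and repeating the other columns on each new row is a simplification that may differ from how OpenRefine lays out the resulting rows.

```python
def split_column(row, column, separator, new_names):
    """Split one field into several named columns on a separator.

    Hypothetical helper: the original column is removed, and any missing
    parts are filled with empty strings.
    """
    parts = row[column].split(separator, len(new_names) - 1)
    parts += [""] * (len(new_names) - len(parts))
    out = {k: v for k, v in row.items() if k != column}
    out.update(zip(new_names, (p.strip() for p in parts)))
    return out


def split_multivalue_rows(rows, column, separator):
    """Create one output row per value found in a multi-value cell.

    Simplifying assumption: the other columns are copied onto every new row.
    """
    out = []
    for row in rows:
        for value in row[column].split(separator):
            out.append({**row, column: value.strip()})
    return out


# Made-up sample data for illustration.
person = split_column({"name": "Doe, Jane"}, "name", ",", ["last", "first"])
tagged = split_multivalue_rows([{"id": 1, "tags": "a; b"}], "tags", ";")
```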
Undo / Redo
Do your data cleaning in a safe environment where you can undo any changes. Review and audit your transformation history. Once your project is completed you can save your steps to reapply them next time.
Pivot columns into rows or transpose rows into columns with just a few clicks.
Custom Query Language
The General Refine Expression Language (GREL) is flexible and powerful, yet simple enough for creating custom filters and transformation expressions. Preview your changes in real time before committing them.
Fetch Web Pages & work with APIs
Call web services and fetch web pages from within OpenRefine. Extend your dataset with your favorite API or machine intelligence service, even with limited coding knowledge.