Posts tagged with 'data quality'

  • 10 questions to ask before using new data

    Martin Magdinier  |  25 May 2020

    Data extraction projects are complex and often require considerable time and effort. To make sure your organization is creating value and that your money and time are well spent, the first logical step is to choose your sources carefully. To help you do just that, we created a list of 10 questions to ask before you set your sights on a dataset. The goal is to collect and analyze all the existing information about the data in order to clarify its ownership, publication, structure, content, quality, relationships, and so on. Only by going through this process can you guarantee the suitability of your sources and identify potential problems and particularities.

  • PDF extraction - Everything you need to know

    Martin Magdinier  |  18 May 2020

    Our team here at RefinePro has deep experience in research and development for data processing and automation, and PDF extraction is one of the many services we offer.

  • How to divide and conquer your data project for success

    Martin Magdinier  |  17 May 2020

    Data extraction is now one of the most efficient ways for companies not only to stay up to date with current events and trends, but also to position themselves in their field. For many small entrepreneurs and even larger companies, however, implementing data extraction projects presents new challenges: How should these processes be implemented, and by whom?

  • 14 rules to succeed with your ETL project

    Martin Magdinier  |  15 May 2020

    Extracting, transforming, and loading (ETL) data is a complex process at the center of most organizations’ data extraction projects. As we saw in our article on web scraping and ETL, the implementation of an ETL workflow is a process that requires a lot of in-depth knowledge in several subfields of statistics and programming.

  • How to Maintain Data Quality at Every Step of Your Pipeline

    Martin Magdinier  |  05 April 2020

    Maintaining the quality of your data is paramount to any web scraping or data integration project. Think about it: there’s absolutely no point in collecting a massive amount of data if you can’t rely on it to make sound decisions! And the only way to maintain high quality is by implementing quality checks and validation at every step of your data pipeline. As the saying goes: garbage in, garbage out!



