Posts tagged with 'strategy'
-
The secret for long-term growth - or why data is the new oil
Martin Magdinier | 09 July 2020
Read more... -
10 questions to ask before using new data
Martin Magdinier | 25 May 2020
Data extraction projects are complex and often require considerable time and effort. To make sure your organization is creating value and that your money and time are well spent, the first logical step is to choose your sources carefully. To help you do just that, we created a list of 10 questions you need to ask before you set your sights on a dataset. The goal is to collect and analyze all the existing information about the data in order to clarify its ownership, publication, structure, content, quality, relationships, and so on. Only by going through this process can you guarantee the suitability of your sources and identify potential problems and particularities.
Read more... -
14 rules to succeed with your ETL project
Martin Magdinier | 15 May 2020
Extracting, transforming, and loading (ETL) data is a complex process at the center of most organizations’ data extraction projects. As we saw in our article on web scraping and ETL, the implementation of an ETL workflow is a process that requires a lot of in-depth knowledge in several subfields of statistics and programming.
Read more... -
How to Maintain Data Quality at Every Step of Your Pipeline
Martin Magdinier | 05 April 2020
Maintaining the quality of your data is paramount to any web scraping or data integration project. Think about it: there’s absolutely no point in collecting a massive amount of data if you can’t rely on it to make sound decisions! And the only way to maintain high quality is by implementing quality checks and validation at every step of your data pipeline. As the saying goes: garbage in, garbage out!
-
Agile Data Process
Martin Magdinier | 24 June 2015
Stefan Urbanek, when laying the foundation for the School of Data program at Open Knowledge, presented the following Data Processing Pipeline, going from:
Read more...