blog
-
The secret to long-term growth - or why data is the new oil
Martin Magdinier | 09 July 2020
-
10 questions to ask before using new data
Martin Magdinier | 25 May 2020
Data extraction projects are complex and often require a lot of time and effort. To make sure your organization is creating value and that your money and time are well spent, the first logical step is to choose your sources carefully. To help you do just that, we created a list of 10 questions you need to ask before you set your sights on a dataset. The goal is to collect and analyze all the existing information about the data in order to clarify its ownership, publication, structure, content, quality, relationships, etc. Only by going through this process can you confirm the suitability of your sources and identify potential problems and particularities.
-
PDF extraction - Everything you need to know
Martin Magdinier | 18 May 2020
Our team here at RefinePro has deep experience in research and development for data processing and automation, and PDF extraction is one of the many services we offer.
-
How to divide and conquer your data project for success
Martin Magdinier | 17 May 2020
Data extraction is now one of the most efficient ways for companies to stay up to date with current events and trends, and to position themselves in their field. But for a lot of small entrepreneurs and even larger companies, implementing data extraction projects presents new challenges: How should these processes be implemented, and by whom?
-
14 rules to succeed with your ETL project
Martin Magdinier | 15 May 2020
Extracting, transforming, and loading (ETL) data is a complex process at the center of most organizations’ data extraction projects. As we saw in our article on web scraping and ETL, implementing an ETL workflow requires in-depth knowledge of several subfields of statistics and programming.
-
How to Maintain Data Quality at Every Step of Your Pipeline
Martin Magdinier | 05 April 2020
Maintaining the quality of your data is paramount to any web scraping or data integration project. Think about it: there’s absolutely no point in collecting a massive amount of data if you can’t rely on it to make sound decisions! And the only way to maintain high quality is by implementing quality checks and validation at every step of your data pipeline. As the saying goes: garbage in, garbage out!
-
The Who, the What, and the “With What” of Web Scraping
Martin Magdinier | 12 March 2020
Data is the new differentiator. It’s what you, a product owner, a marketing strategist, your local journalist, and a multimillionaire who already owns twelve successful companies all need. And web scraping is one way to get that data.
-
You’re looking for a Web Scraping tool? Look for a Service Instead.
Martin Magdinier | 01 March 2020
Is web scraping easy? No. Is it profitable? It can be, with a great scheduling, monitoring, and maintenance plan. Learn how!
Your Python skills are not too bad, and it’s not your first time dealing with tech. In fact, technology is part of your daily life. You see the world go round, and you’ve heard how web scraping could help automate your business. So, you start looking for the best scrapers and web scraping tools, hoping maybe to boost your e-commerce sales.
-
RefinePro's Joel Test
| 08 October 2019
Twenty years after its publication, the Joel Test remains a reference for the software industry when assessing the maturity of a development team.
-
Clarifications regarding RefinePro's participation in the OpenRefine community.
Martin Magdinier | 13 December 2017
Following recent discussions with several OpenRefine community members, I realized that there was some confusion and misunderstanding about RefinePro's intentions for the OpenRefine project.