Posts by Martin Magdinier
Blog Archive
-
Webinar Data Operations for CRM and Marketing
20 September 2020
In case you missed our webinar with Macro on September 10, 2020, you can find the recording and the slides here.
-
The secret to long-term growth - or why data is the new oil
09 July 2020
-
Download The Data Innovation Canvas
09 July 2020
Learn more about the Data Innovation Canvas from Communitech.
Watch Chris Willsher, Director of Data Platforms at Communitech, present the canvas during the May 8th Communitech® Data Hub Sessions. The video starts at 41:25.
-
Download PDF Form Web Scraping Software Comparison Table
15 June 2020
-
Download PDF Form 10 Questions Before Using New Data
15 June 2020
-
10 questions to ask before using new data
25 May 2020
Data extraction projects are complex and often require quite a lot of time and effort. To make sure your organization is creating value and that your money and your time are well spent, the first logical step is to choose your sources carefully. To help you achieve just that, we created a list of 10 questions you need to ask before you set your sights on a dataset. The goal here is to collect and analyze all the existing information about the data in order to clarify its ownership, publication, structure, content, quality, relationships, and so on. Only by going through this process can you guarantee the suitability of your sources and identify potential problems and particularities.
-
PDF extraction - Everything you need to know
18 May 2020
Our team here at RefinePro has deep experience in research and development for data processing and automation, and PDF extraction is one of the many services we offer.
-
How to divide and conquer your data project for success
17 May 2020
Data extraction is now one of the most efficient ways for companies not only to stay up to date with current events and trends, but also to position themselves in their field. For a lot of small entrepreneurs and even larger companies, however, the implementation of data extraction projects presents new challenges: How should these processes be implemented, and by whom?
-
14 rules to succeed with your ETL project
15 May 2020
Extracting, transforming, and loading (ETL) data is a complex process at the center of most organizations’ data extraction projects. As we saw in our article on web scraping and ETL, the implementation of an ETL workflow is a process that requires a lot of in-depth knowledge in several subfields of statistics and programming.
-
How to Maintain Data Quality at Every Step of Your Pipeline
05 April 2020
Maintaining the quality of your data is paramount to any web scraping or data integration project. Think about it: there’s absolutely no point in collecting a massive amount of data if you can’t rely on it to make sound decisions! And the only way to maintain high quality is by implementing quality checks and validation at every step of your data pipeline. As the saying goes: garbage in, garbage out!
-
The Who, the What, and the “With What” of Web Scraping
12 March 2020
Data is the new differentiator. It’s what you, a product owner, a marketing strategist, your local journalist, and a multimillionaire who already owns twelve successful companies all need. And web scraping is one way to get that data.
-
You’re looking for a Web Scraping tool? Look for a Service Instead.
01 March 2020
Is web scraping easy? No. Is it profitable? It can be, with a great scheduling, monitoring, and maintenance plan. Learn how!
Your Python skills are not too bad, and it’s not your first time dealing with tech. In fact, technology is part of your daily life. You see how the world goes around, and you’ve heard how web scraping could help automate your business. So, you start looking for the best scrapers and web scraping tools, hoping maybe to boost your e-commerce sales.
-
RefinePro Platform
29 April 2019
Data Sheet for RefinePro Platform, a customizable administration center to help data science and engineering teams schedule, configure, and monitor data processing jobs written in any language.
-
Data Migration Services
01 April 2019
Data Sheet for RefinePro Data Migration Services for consulting firms.
-
Using External Data
24 May 2018
Download our cheat sheet of what to keep in mind when using external or third-party data.
-
Clarifications regarding RefinePro’s participation in the OpenRefine community.
13 December 2017
Following recent discussions with several OpenRefine community members, I realized that there was some confusion and misunderstanding about RefinePro’s intentions regarding the OpenRefine project.
-
Sunset for RefinePro’s Public OpenRefine Hosting
24 May 2017
Starting today, we are sunsetting RefinePro’s OpenRefine hosting. We are no longer accepting new registrations or new subscriptions. Existing users can continue to access the service until the end of their subscription or trial phase.
-
Improve OpenRefine’s extensibility with data repository and processing services.
26 January 2016
In September 2015, we submitted the following application to the Knight News Challenge: How might we make data work for individuals and communities? We are cross-posting it here for archival purposes. You can also consult it directly on the Knight Challenge website.
-
Agile Data Process
24 June 2015
When laying the foundation for the School of Data program at Open Knowledge, Stefan Urbanek presented the following Data Processing Pipeline, going from:
-
Thoughts on the importance of clear governance rules for open source projects.
14 April 2015
The main motivation behind RefinePro is to be able to commit developer time to OpenRefine (since I am not a developer myself) and help the project leave its stagnant stage. RefinePro’s motivation isn’t to make millions or gain prestige by stealing other people’s work – there are faster and less risky ways to do that than creating a start-up in a niche market.
-
Scale your Open Source Communities For Success.
28 March 2015
In the first two articles of this series of three, we have seen how corporations leverage Open Source Software (OSS) to build new solutions and how an open source strategy fosters in-house and community-based innovation. It is a new world for software development and related business-building strategies, a world where even enterprise software projects leverage open source components for their foundation. The freedom to extend or even fork the project to explore new options can lead to unexpected innovation and usage, when done right. Open licenses mean that contributors – including yourself – have no restrictions on tweaking code and tailoring it to match requirements or customizing it to specific needs within an organization or industry. However, to ensure the continuity and sustainability of the project, the community now needs to scale, reach critical size, and drive broad adoption.
-
Unexpected innovation thanks to open source
27 March 2015
In the first article of this series of three on open source strategy, we saw how Open Source Software (OSS) is now part of corporations’ strategy and the benefits of building an ecosystem. By sharing your code and vision openly, you enable others to expand your solution and adapt it to their specific environment. It is not rare to see unexpected innovation sparked in this process. In this article, we will see the steps to create such an innovation-friendly environment, taking the OpenRefine community as an example.
-
Sharing is at the core of Open Source Strategies
26 March 2015
In its 2014 annual survey on the Future of Open Source, Black Duck indicated that 56 percent of corporations expected to contribute to open source software solutions, more than ever before. Most of those companies were already using open source software internally, but wanted to go a step further and contribute back through comments, bug reports, or by subsidizing developer time.
-
RefinePro Roadmap
29 November 2014
The last five weeks have been full of learning and surprises (good and bad). We have a better understanding of how Refine is used and how it behaves with different browsers and data set sizes. Some things worked as expected, others completely broke. This post highlights the upcoming technical points we plan to address to improve RefinePro.
-
Opening RefinePro Knowledge Base
17 November 2014
It all started in June 2011 when we opened the OpenRefine Tips and Recipes blog. Over the last three years, we documented Google Refine and then OpenRefine functionality with concrete examples and screenshots and listed the best materials written by others. It started as a personal notebook and grew into a knowledge base of over 110 articles covering more than 80 topics.
-
First Partnership with DST4L
04 November 2014
We are extremely happy to announce that RefinePro has signed a partnership with the Data Scientist Training for Librarians (DST4L) program, allowing all librarians who take the course to have access to the RefinePro platform for the next eight months (until June 2015). Through their RefinePro accounts, they will have access to their projects from multiple computers and be able to work on the latest version of OpenRefine.
-
A Vision for OpenRefine
06 October 2014
Over the last five years, OpenRefine has built a robust platform, to which many developers have contributed plugins and extensions useful for their own audiences. That list of plugins and reconciliation services grows month after month, demonstrating that the community is active and thriving, with a healthy and expanding user base.
-
When manual line by line cleaning is not enough
04 October 2014
One of the big news items in the industry this month was CrowdFlower raising $12.5 million in funding to support its growth. CrowdFlower is like a souped-up Amazon Mechanical Turk with a very nice API and a well-thought-out back end for job editors. I couldn’t agree more when Mark Sullivan says:
-
Announcing RefinePro
15 September 2014
Less than two weeks ago, we announced the launch of RefinePro as a new participant in the OpenRefine ecosystem. This post provides background on where we come from and how we see the position of RefinePro within the OpenRefine community.