Bart Maertens

I am a principal consultant with over 15 years of experience in business intelligence and analytics. I am the founder and proud leader of the pack.

Recent Posts

3 Reasons to take a look at WebSpoon for web or cloud ETL

Nov 29, 2017 8:30:00 AM / by Bart Maertens posted in pentaho, pentaho data integration, webspoon, cloud, web etl, cloud etl


3 reasons to move your ETL to the web or cloud

ETL development relies heavily on the desktop, with file, database and network connections that require the developer to have access to resources located in the company network.
Apart from these access restrictions, most of the established ETL platforms have a history of over a decade and were originally developed in an era when web-based applications were basic at best.
Times have changed, however, and web applications have come a long way. We'll look at a number of reasons to move your ETL to the web and/or cloud. 

1. Data can't leave the organization

There are plenty of cases where data is considered to be too sensitive to leave the organization's premises or (virtual) private cloud. 
With a centralized ETL infrastructure, ETL developers and data engineers can work from anywhere in the world. All of the data is managed over secure connections without the need for a single byte of data to leave the organization's systems. 

2. Data is too big to copy or changes frequently

ETL developers and data engineers often need to work in geographically separate locations, while the data remains in one location. 
Developing ETL or working with frequently changing data over VPN connections and remote desktop protocols is painful, if possible at all.
Life can be a lot easier if the ETL and data management work can be done over standard HTTP(S) from anywhere in the world.

3. Simplified installation, configuration and project management

Last but not least, ETL configuration management and overall DevOps for a large number of desktop installations can be a burden. 
Instead of maintaining an installation on every ETL developer's or data engineer's machine, a centralized approach can significantly simplify the process.

With a centralized installation, developers are guaranteed to work on the same standardized software version, configuration and set of plugins. 
Additionally, ETL working practices and conventions are a lot easier to enforce from a centralized environment. 



Try it out for yourself

If you're using or considering Pentaho (now part of Hitachi Vantara), all of this is within reach: with the WebSpoon project, your existing ETL can simply be moved to the web and cloud. No changes to your existing code base are required, and you can gradually (or partially) make the switch to web- or cloud-based ETL.

We've set up a demo environment for WebSpoon; feel free to give it a try.

WebSpoon is available as open source and is not (yet) part of the Pentaho Enterprise Edition. Let us know if you'd like to find out how we can help you bridge the gap. 


Disclaimer: the use cases and images in this post were taken from WebSpoon author Hiromu Hota's presentation




Read More

PCM17 - Business Use Cases Room

Nov 18, 2017 10:24:11 AM / by Bart Maertens posted in pentaho, pdi, ctools, pcm17, pcm, community



Read our overview of the Keynotes 

Read our overview of the talks in the Technical room

Using a BI tool to improve the management of health data in Mozambique - Devan Manharlal

Devan kicked off the talks in the business room by sharing his experiences in building a BI solution for managing health data in Mozambique.
Part of the project's scope was the geographical allocation of staff such as nurses across the country, which poses some specific challenges in a developing country like Mozambique.
The reasons Pentaho was chosen are mainly because the need for 

Read More

PCM17 - Technical Room

Nov 18, 2017 9:37:40 AM / by Bart Maertens posted in pentaho, pentaho data integration, pcm17, pcm, community



Read our overview of the Keynotes 

Read our overview of the talks in the Business room

Data Pipelines - Running PDI on AWS Lambda - Dan Keeley

Dan explained how serverless PDI lets you spend time on the solution rather than on getting the PDI server and infrastructure up and running.
Although virtualization already takes some of the infrastructure management pain away, there's still quite a bit of overhead involved, whereas no infrastructure management is needed when running in the cloud.

Read More

PCM17 - Keynotes

Nov 14, 2017 10:50:12 PM / by Bart Maertens posted in pentaho, kettle, pentaho data integration, ctools, pcm17, pcm, community, prd


PCM17 - the tenth edition

After 9 years of absence, the annual Pentaho Community Meeting was back where it started in 2008: Mainz, Germany. 
Same city, same venue; the only difference was the crowd: instead of the 30-something enthusiasts in 2008, there were now close to 300 registrations, of which over 200 showed up.

Read More

Pentaho World 2017 recap

Nov 8, 2017 10:00:00 AM / by Bart Maertens posted in pentaho, data, pentaho world, data engineering, data processing, data science, open source, oss, pdi, kettle


From October 25th to 27th, the huge Rosen Shingle Creek hotel in Orlando, Florida was home to Pentaho World 2017, probably the last edition as a standalone Pentaho event.

Read More