know.bi blog

Graph Databases - Analytical Use Cases

Mar 9, 2018 10:00:00 AM / by Shila Casteels posted in graph databases, neo4j, graph analytics, fraud detection, social network analysis, recommendation engine

What is a graph database?

Although graph theory has been around for centuries, graph databases started to appear relatively recently.

‘Traditional’ relational databases store data in tables. These tables have a fixed format (fixed number of columns, each with a fixed data type). Tables are linked through the primary key in one table and the corresponding foreign keys in other tables. When a query is executed, the database engine fetches the primary keys from one table and links them to the corresponding foreign keys in other tables through SQL joins.

Although this works well to insert, select, update and delete individual records over one or a limited number of tables (CRUD operations), performing joins during query execution over large schemas and data volumes is expensive and slow.
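To make the join cost concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `person` and `friendship` tables and their contents are hypothetical, made up purely for illustration. The query looks up the friends of Ann's friends, and every extra hop in the relationship chain needs another join.

```python
import sqlite3

# Hypothetical schema: a person table and a friendship link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friendship (
        person_id INTEGER NOT NULL REFERENCES person(id),
        friend_id INTEGER NOT NULL REFERENCES person(id)
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid'), (4, 'Dee');
    INSERT INTO friendship VALUES (1, 2), (2, 3), (2, 4);
""")

# Friends of Ann's friends: the relationships are rebuilt at query time
# by matching primary keys to foreign keys, one join per hop.
friends_of_friends = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT p2.name
        FROM person p0
        JOIN friendship f1 ON f1.person_id = p0.id
        JOIN friendship f2 ON f2.person_id = f1.friend_id
        JOIN person p2 ON p2.id = f2.friend_id
        WHERE p0.name = 'Ann'
        ORDER BY p2.name
    """)
]
print(friends_of_friends)
```

Going one hop further would mean adding yet another `friendship` join, which is where large schemas and data volumes start to hurt.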

Property graph databases (like the one offered by market leader Neo4J) use nodes with labels and properties to store data instead of the relational database tables and columns. Instead of fitting all data into a fixed, predefined table structure, each node is a new instance with a structure that can be different from other nodes.

This schema-less structure offers more flexibility than the relational database table. On top of the additional flexibility, graph databases treat relationships as first-class citizens. Instead of being built during query execution, relationships are persisted in a graph database. Having all relationships stored with the data not only lets graph databases vastly outperform relational databases on relationship-heavy queries, it also enables use cases that simply aren’t possible in relational databases.
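The idea can be sketched as a toy in-memory model in Python. This is an illustration only, not how Neo4j or any other product stores data; the `Node` and `PropertyGraph` classes and the sample data are hypothetical. Each node carries its own labels and properties, and relationships are stored with the node, so a traversal follows them directly instead of joining tables.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    properties: dict
    # Outgoing relationships are stored with the node: (type, target node id).
    out: list = field(default_factory=list)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}

    def add_node(self, node_id, labels, **properties):
        # No fixed schema: every node can have different labels and properties.
        self.nodes[node_id] = Node(set(labels), properties)

    def relate(self, start, rel_type, end):
        self.nodes[start].out.append((rel_type, end))

    def reachable(self, start, rel_type, max_hops):
        """Breadth-first traversal that follows stored relationships."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node_id, hops = queue.popleft()
            if hops == max_hops:
                continue
            for r_type, target in self.nodes[node_id].out:
                if r_type == rel_type and target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, hops + 1))
        return found


graph = PropertyGraph()
graph.add_node("ann", ["Person"], name="Ann", age=34)
graph.add_node("bob", ["Person", "Customer"], name="Bob")
graph.add_node("cid", ["Person"], name="Cid")
graph.add_node("acme", ["Company"], name="Acme", country="BE")
graph.relate("ann", "KNOWS", "bob")
graph.relate("bob", "KNOWS", "cid")
graph.relate("bob", "WORKS_AT", "acme")

within_two_hops = graph.reachable("ann", "KNOWS", max_hops=2)
print(within_two_hops)
```

Going deeper only means raising `max_hops`; the shape of the query does not change with the number of hops.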

In this post, we’ll have a look at a couple of analytical use cases with graph databases.

Build a dashboard with AWS Athena and Quicksight in less than an hour

Feb 28, 2018 10:00:00 AM / by Hans Van Akelyen posted in amazon, analytics, cloud, aws

Quickly move from data to insight

Basic Machine Learning - Choose a learning type and algorithm

Feb 20, 2018 10:30:00 AM / by Yannick Mols posted in data science, r, machine learning, artificial intelligence, python, matlab

So you want to get started with Machine Learning?

5 Key Components For Your Cloud Analytics Project

Feb 9, 2018 10:00:00 AM / by Bart Maertens posted in aws, amazon, business intelligence, analytics, analytical database, column store, cloud etl, cloud, data engineering

Why move your BI to the cloud? 

As discussed in a previous post, there are many reasons to move your BI to the cloud. Security, the ability to work from anywhere, and faster delivery with more resource flexibility at a lower cost are just a few.

Predictive Analytics with Vertica

Jan 30, 2018 10:00:00 AM / by Yannick Mols posted in vertica, olap, analytical database, column store, data warehouse

In-SQL machine learning

Vertica is a clustered analytical database that handles large, fast-growing volumes of data with ease and provides lightning-fast query performance. It also offers in-database machine learning, which we’ll take a look at in this blog post.

Easily load data to Neo4J with Pentaho Data Integration

Jan 25, 2018 10:00:00 AM / by Bart Maertens posted in data integration, pentaho, pdi, neo4j

Load data to Neo4J

Scalable PDI architecture using Docker

Jan 16, 2018 10:00:00 AM / by Yannick Mols posted in pentaho, docker

Shipping PDI in a Docker container

5 Reasons to Move Your BI to the Cloud

Jan 9, 2018 9:30:00 AM / by Hans Van Akelyen posted in pentaho, amazon, cloud

Cloud computing is the way to the future, and the way to bring your company to the next level. With the ability to use enterprise-grade services and technologies at a significantly lower price, your company can focus on creating more value while your IT department spends less time maintaining infrastructure.

These are our top five reasons to move your BI infrastructure to the cloud:

Getting Started with AWS DMS

Dec 27, 2017 9:30:00 AM / by Willem Dullaart posted in amazon, aws, database migration service, dms, cloud

What is Amazon DMS?

Every day, more and more companies are moving towards cloud computing, with Amazon Web Services (AWS) undoubtedly being the biggest player. Having all the possible AWS services available at your fingertips is great, but you still need to migrate your existing infrastructure and data into the (AWS) cloud. At re:Invent 2015, Amazon announced “AWS Database Migration Service”, aiming to make the process of moving data into databases on AWS a lot easier.

AWS DMS supports most open-source and commercial databases such as PostgreSQL, MySQL, MariaDB, Oracle and Microsoft SQL Server, and of course Amazon's own Aurora, Redshift, DynamoDB and S3 services. Both homogeneous (e.g. Postgres to Postgres) and heterogeneous migrations (e.g. Oracle to MySQL) are supported. At least one of the two endpoints, source or target, must be in the AWS cloud. DMS is regularly updated with new features and supported engines.

DMS Overview

At the highest level, you have three components to take care of when starting a migration using DMS:

Business Intelligence, simplified.

Dec 18, 2017 9:30:00 AM / by Shila Casteels posted in pentaho, business intelligence, analytics, data integration

My views on BI after one year in the trenches

After having worked for about a year as a Business Intelligence consultant, I’d like to explain my views on the subject. At know.bi, we mainly work with the commercial open source BI platform Pentaho, so I’ll use that as a reference, but this post should apply to BI in general and is not meant to be limited to any given platform.
