OLAP vs In-Memory

Question:

I am working with Big Data, and all my backend logic is written in PHP. For faster output, which of the following technologies would be the more efficient choice for my product?

  1. OLAP.
  2. In-Memory Database.

Answer:

Well, when we talk about Big Data, I would choose an OLAP database. But let's take a closer look at the technologies:

OLAP (= Online Analytical Processing)

... has the basic technological idea of pre-aggregating data at dimension levels.

Suppose you want to query a sales order table with thousands of orders per day, month, and year. You define dimensions such as order date, sales channel, and ship-to country, and measures such as turnover, number of orders, and shipping time.

Usually, you would answer the following questions with an OLAP database:

  • How many sales orders did we have in June 2016?
  • What was the turnover (aggregated amount of sales orders) in 2016 for sales channel SHOP, shipped to the USA?
  • How long did it take on average to ship a sales order per week/month?

... or, more technically:

You can answer any question that has an aggregation in the SELECT clause and a dimension in the WHERE clause:

SELECT
    SUM(amount) AS Turnover,
    AVG(shipping_time) AS avg_shipping_time
FROM sales_orders
WHERE DATEPART(year,order_date) = 2016 AND sales_channel = 'SHOP'

The more the OLAP system can aggregate, the better the performance. That is why using the sales order number or postal addresses as dimensions would be a bad approach. The OLAP idea is to eliminate data (or rows), and that requires standardized data.
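
To make the point about dimension choice concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a real OLAP engine. The table layout, the generated sample data, and the helper name count_groups are assumptions made up for illustration. It aggregates the same orders once by coarse dimensions and once by order number, and compares how many rows are left.

```python
import sqlite3

# In-memory SQLite stands in for a real database; schema and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sales_orders (
           order_no INTEGER PRIMARY KEY,
           order_date TEXT,
           sales_channel TEXT,
           ship_to_country TEXT,
           amount REAL
       )"""
)

channels = ["SHOP", "PHONE", "PARTNER"]
countries = ["US", "DE", "FR", "JP"]
rows = [
    (
        n,
        f"2016-{(n % 12) + 1:02d}-15",   # one date per month, cycling through 2016
        channels[n % len(channels)],
        countries[n % len(countries)],
        10.0 + n,
    )
    for n in range(1, 1201)
]
conn.executemany("INSERT INTO sales_orders VALUES (?, ?, ?, ?, ?)", rows)


def count_groups(group_by: str) -> int:
    """Return how many rows remain after aggregating by the given columns."""
    # group_by is a trusted, hard-coded column list in this sketch,
    # so plain string formatting is acceptable here.
    sql = (
        "SELECT COUNT(*) FROM "
        f"(SELECT SUM(amount) FROM sales_orders GROUP BY {group_by})"
    )
    return conn.execute(sql).fetchone()[0]


coarse = count_groups("strftime('%Y-%m', order_date), sales_channel, ship_to_country")
per_order = count_groups("order_no")
print(f"month x channel x country: {coarse} rows")
print(f"order number:              {per_order} rows")
```

Grouping by order number keeps one row per order, so nothing is eliminated, while the coarse dimensions collapse the table to a handful of rows. That reduction is what an OLAP system relies on.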

The following questions are better answered in a relational database (data warehouse):

  • Which were the top 50 sales orders of September 2016?
  • What are the customer addresses of the sales orders from January 2017? etc.

So what is In-Memory?

The idea of In-Memory is that querying data in RAM is faster than querying it on disk. But RAM is also expensive.

In-Memory features in relational databases are actually built more for OLTP (Online Transaction Processing) systems, where users enter transactions and do their daily work, than for analysis.

In fact, today's enterprise OLAP systems such as SQL Server Analysis Services also use In-Memory technology after aggregating the data (OLAP technology). You just don't see it.

--

So OLAP is the right thing, or...?

Let's also think about something else: an OLAP database is something different from a relational database, and sometimes it is overkill (e.g. when you just have this one huge table). An OLAP database needs to be processed (aggregated and prepared for use). Most of the time that is done at night when no one is working (OK, you can do it every second if you want :-) ).

If you are new to Big Data, just want to fix this one thing in your application, and don't have a clue about OLAP, I recommend fixing it in your application code, unless you want to dig into a new world with new terms and languages like MDX instead of SQL.

The complexity depends on the OLAP database you choose. In fact, you can easily develop your own "OLAP" aggregation level in your application... it just might not be as flexible as an OLAP database.
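
As a rough idea of what such a home-grown aggregation level could look like, here is a minimal sketch in Python with the standard-library sqlite3 module. The sales_summary table, its columns, and the rebuild_summary function are hypothetical names chosen for this example, and SQLite is only a stand-in for whatever database you actually use. The function recomputes a small summary table from the raw orders, and reports then read from the summary instead of scanning the big table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales_orders (
        order_no INTEGER PRIMARY KEY,
        order_date TEXT,        -- ISO date, e.g. 2016-06-03
        sales_channel TEXT,
        amount REAL
    );
    -- Hypothetical summary table: one row per month and sales channel.
    CREATE TABLE sales_summary (
        order_month TEXT,
        sales_channel TEXT,
        order_count INTEGER,
        turnover REAL,
        PRIMARY KEY (order_month, sales_channel)
    );
    """
)
conn.executemany(
    "INSERT INTO sales_orders VALUES (?, ?, ?, ?)",
    [
        (1, "2016-06-03", "SHOP", 100.0),
        (2, "2016-06-17", "SHOP", 50.0),
        (3, "2016-06-20", "PHONE", 30.0),
        (4, "2016-07-01", "SHOP", 70.0),
    ],
)


def rebuild_summary(conn: sqlite3.Connection) -> None:
    """Recompute the whole summary table from the raw orders."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM sales_summary")
        conn.execute(
            """INSERT INTO sales_summary
               SELECT strftime('%Y-%m', order_date), sales_channel,
                      COUNT(*), SUM(amount)
               FROM sales_orders
               GROUP BY strftime('%Y-%m', order_date), sales_channel"""
        )


rebuild_summary(conn)

# Reports query the small summary table, not the raw orders.
june_shop = conn.execute(
    "SELECT order_count, turnover FROM sales_summary "
    "WHERE order_month = ? AND sales_channel = ?",
    ("2016-06", "SHOP"),
).fetchone()
summary_rows = conn.execute("SELECT COUNT(*) FROM sales_summary").fetchone()[0]
print(june_shop, summary_rows)
```

A full rebuild is the simplest variant. It is less flexible than a real OLAP database because every new dimension means changing the summary table and the rebuild query.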

Possible solutions in your application might be:

  • use SQL Server indexed views, or similar features in other databases
  • use SQL table triggers
  • use a cron job to aggregate data and write it into a table
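
The trigger option from the list above could look roughly like the following sketch. It uses Python's standard-library sqlite3 module and SQLite trigger syntax purely as a stand-in, so the syntax will differ on SQL Server or MySQL. The table, column, and trigger names are hypothetical. Every insert into the orders table updates a running total in a summary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales_orders (
        order_no INTEGER PRIMARY KEY,
        order_date TEXT,
        sales_channel TEXT,
        amount REAL
    );
    CREATE TABLE sales_summary (
        order_month TEXT,
        sales_channel TEXT,
        order_count INTEGER NOT NULL DEFAULT 0,
        turnover REAL NOT NULL DEFAULT 0,
        PRIMARY KEY (order_month, sales_channel)
    );
    -- Keep the summary current on every new order (inserts only).
    CREATE TRIGGER sales_orders_after_insert
    AFTER INSERT ON sales_orders
    BEGIN
        -- Create the summary row if it does not exist yet.
        INSERT OR IGNORE INTO sales_summary (order_month, sales_channel)
        VALUES (strftime('%Y-%m', NEW.order_date), NEW.sales_channel);
        -- Add the new order to the running totals.
        UPDATE sales_summary
        SET order_count = order_count + 1,
            turnover = turnover + NEW.amount
        WHERE order_month = strftime('%Y-%m', NEW.order_date)
          AND sales_channel = NEW.sales_channel;
    END;
    """
)

conn.executemany(
    "INSERT INTO sales_orders VALUES (?, ?, ?, ?)",
    [
        (1, "2016-06-03", "SHOP", 100.0),
        (2, "2016-06-17", "SHOP", 50.0),
        (3, "2016-07-01", "SHOP", 70.0),
    ],
)
conn.commit()

summary = conn.execute(
    "SELECT order_month, order_count, turnover FROM sales_summary "
    "ORDER BY order_month"
).fetchall()
print(summary)
```

This sketch only handles inserts. Updated or deleted orders would need their own triggers, and the extra write on every insert is the price for a summary that is always up to date. A cron job that rebuilds the summary periodically trades freshness for cheaper writes.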