In-Memory Database: SPARC T7-1 Faster Than x86 E5 v3


Fast analytics on large databases are critical to transforming key business processes. Oracle's SPARC M7 processors are specifically designed to accelerate in-memory analytics using Oracle Database 12c Enterprise Edition with the In-Memory option. The SPARC M7 processor outperforms an x86 E5 v3 chip by up to 10.8x on analytics queries. To test real-world deep analysis on the SPARC M7 processor, a scenario with over 2,300 analytical queries was run against a real cardinality database (RCDB) star schema.

The SPARC M7 processor does this by using its Data Accelerator co-processor (DAX). DAX is not a SIMD instruction but an actual co-processor that offloads in-memory queries, which frees the cores for other processing. The DAX has direct access to the memory bus and can execute scans at near full memory bandwidth. Oracle makes the DAX API available to other applications, so this kind of acceleration is open and not limited to Oracle Database.

The SPARC M7 processor delivers up to a 10.8x Queries Per Minute speedup per chip over the Intel Xeon Processor E5-2699 v3 when executing analytical queries using the In-Memory option of Oracle Database 12c.

Oracle's SPARC T7-1 server delivers up to a 5.4x Queries Per Minute speedup over the 2-chip x86 E5 v3 server when executing analytical queries using the In-Memory option of Oracle Database 12c.

The SPARC T7-1 server delivers over 143 GB/sec of memory bandwidth, which is up to 7x more than the 2-chip x86 E5 v3 server, when Oracle Database 12c is executing the same analytical queries against the RCDB.

The SPARC T7-1 server scanned over 48 billion rows per second through the database.

The SPARC T7-1 server compresses the on-disk RCDB star schema by around 6x when using the Memcompress For Query High setting (described below) and by nearly 10x compared to a standard data warehouse row format version of the same database.

Performance Landscape

The table below compares the SPARC T7-1 server and the 2-chip x86 E5 v3 server. The single-chip x86 E5 v3 results are actual measurements taken on a single-chip configuration.

RCDB Performance Chart (2,304 Queries)

| System | Elapsed Seconds | Queries Per Minute | System Adv | Chip Adv | DB Memory Bandwidth |
|---|---|---|---|---|---|
| SPARC T7-1, 1 x SPARC M7 (32 cores) | 381 | 363 | 5.4x | 10.8x | 143 GB/sec |
| x86 E5 v3 server, 2 x Intel E5-2699 v3 (18 cores) | 2059 | 67 | 1.0x | 2.0x | 20 GB/sec |
| x86 E5 v3 server, 1 x Intel E5-2699 v3 (18 cores) | 4096 | 34 | 0.5x | 1.0x | 10 GB/sec |

The number of cores is per chip; multiply by the number of chips to get the system total.
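The derived columns in the table can be cross-checked from the elapsed times. The sketch below is only a sanity check, assuming that queries per minute is 2,304 queries divided by elapsed minutes and that each advantage figure is a ratio of elapsed times; the source does not spell out these formulas, and the function and variable names are illustrative.

```python
# Recompute the derived columns of the RCDB table from the published figures.
# Assumption: QPM = queries / elapsed minutes; advantages are elapsed-time ratios.

QUERIES = 2304

# system label -> (elapsed seconds, DB memory bandwidth in GB/sec), from the table
RESULTS = {
    "SPARC T7-1 (1 x M7)": (381, 143),
    "x86 (2 x E5-2699 v3)": (2059, 20),
    "x86 (1 x E5-2699 v3)": (4096, 10),
}


def queries_per_minute(elapsed_seconds):
    """Throughput for the full 2,304-query run."""
    return QUERIES * 60 / elapsed_seconds


def advantage(baseline_seconds, elapsed_seconds):
    """How many times faster a run is than the baseline run."""
    return baseline_seconds / elapsed_seconds


two_chip_seconds, two_chip_bandwidth = RESULTS["x86 (2 x E5-2699 v3)"]
one_chip_seconds, _ = RESULTS["x86 (1 x E5-2699 v3)"]

for name, (seconds, bandwidth) in RESULTS.items():
    print(
        f"{name:22s} QPM={queries_per_minute(seconds):6.1f} "
        f"system_adv={advantage(two_chip_seconds, seconds):4.1f}x "
        f"chip_adv={advantage(one_chip_seconds, seconds):4.1f}x "
        f"bandwidth_vs_2chip={bandwidth / two_chip_bandwidth:4.2f}x"
    )
```

Under these assumptions the recomputed values round to the figures published in the table, which suggests the "System Adv" column is relative to the 2-chip x86 server and the "Chip Adv" column is relative to the single-chip x86 run.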

Fused Decompress + Scan

The In-Memory feature of Oracle Database 12c puts tables in columnar format. Different levels of compression can be applied. One of these is Oracle Zip (OZIP), which is used with the "MEMCOMPRESS FOR QUERY HIGH" setting. Typically, when compression is applied to data, operating on it requires that the data be:

(1) Decompressed

(2) Written back to memory in uncompressed form

(3) Scanned, with the results returned.

When OZIP is applied to the data inside an In-Memory Columnar Unit (IMCU, a chunk of N rows), the DAX is able to take this data in its compressed format and operate (scan) directly upon it, returning results in a single step. This saves compute power, because the CPU does not perform the decompression step, and it also saves memory bandwidth, because the uncompressed data is not written back to memory. Only the results are returned. To illustrate this, a microbenchmark was used which measured the number of rows that could be scanned per second.
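The difference between the three-step path and a fused decompress-and-scan can be illustrated in software. The sketch below is only an analogy: zlib stands in for OZIP, a Python loop stands in for the DAX hardware, and the one-byte-per-row column is hypothetical. It is not the DAX API and says nothing about real DAX throughput.

```python
import zlib


def scan_three_step(compressed, target):
    """Conventional path: decompress, materialize everything, then scan."""
    raw = zlib.decompress(compressed)              # steps 1 and 2: full uncompressed copy
    return raw.count(bytes([target])), len(raw)    # step 3: scan the copy


def scan_fused(compressed, target, window=4096):
    """Streaming path: scan small decompressed windows, keep only the count."""
    inflater = zlib.decompressobj()
    needle = bytes([target])
    matches = 0
    peak = 0                                       # largest uncompressed piece held at once
    pending = compressed
    while pending:
        piece = inflater.decompress(pending, window)  # emits at most `window` bytes
        matches += piece.count(needle)
        peak = max(peak, len(piece))
        pending = inflater.unconsumed_tail         # input not yet consumed
    tail = inflater.flush()                        # any output still buffered
    matches += tail.count(needle)
    peak = max(peak, len(tail))
    return matches, peak


# Hypothetical low-cardinality column: one byte per row, 10 distinct values.
column = bytes(i % 10 for i in range(1_000_000))
compressed = zlib.compress(column)

full_matches, full_bytes = scan_three_step(compressed, 7)
fused_matches, fused_peak = scan_fused(compressed, 7)
print("three-step:", full_matches, "matches,", full_bytes, "uncompressed bytes materialized")
print("fused:     ", fused_matches, "matches,", fused_peak, "bytes held at peak")
```

Both functions return the same match count, but the streaming version never holds the whole uncompressed column, which mirrors the memory-bandwidth saving described above.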

Compression

This performance test was run on a Scale Factor 1750 database, which represents a 1.75 TB row format data warehouse. The database is then transformed into a star schema which ends up around 1.1 TB in size. The star schema is then loaded in memory with a setting of "MEMCOMPRESS FOR QUERY HIGH", which focuses on performance with somewhat more aggressive compression. This memory area is a separate part of the System Global Area (SGA) which is defined by the database initialization parameter "inmemory_size". See below for an example. Here is a breakdown of each table in memory with compression ratios.

| Table Name | Original Size (Bytes) | In-Memory Size (Bytes) | Compression Ratio |
|---|---|---|---|
| LINEORDER | 1,103,524,528,128 | 178,586,451,968 | 6.2x |
| DATE | 11,534,336 | 1,179,648 | 9.8x |
| PART | 11,534,336 | 1,179,648 | 9.8x |
| SUPPLIER | 11,534,336 | 1,179,648 | 9.8x |
| CUSTOMER | 11,534,336 | 1,179,648 | 9.8x |

Configuration Summary

SPARC Server:

1 X SPARC T7-1 server

1 X SPARC M7 processor

512 GB memory

Oracle Solaris 11.3

Oracle Database 12c Enterprise Edition Release 12.1.0.2.13

x86 Server:

1 X Oracle Server X5-2L

2 X Intel Xeon Processor E5-2699 v3

512 GB memory

Oracle Linux 6 Update 5 (3.8.13-16.2.1.el6uek.x86_64)

Oracle Database 12c Enterprise Edition Release 12.1.0.2.13

Benchmark Description

The real cardinality database (RCDB) benchmark was created to showcase the potential speedup one may see when moving from an on-disk, row format data warehouse or star schema to Oracle Database 12c's In-Memory feature for analytical queries.

The workload consists of 2,304 unique queries asking questions such as "In 2014, what was the total revenue of single item orders", or "In August 2013, how many orders exceeded a total price of $50". Questions like these can help a company see where to focus for further revenue growth or identify weaknesses in their offerings.

RCDB scale factor 1750 represents a 1.75 TB data warehouse. It is transformed into a star schema of 1.1 TB, and then becomes 179 GB in size when loaded in memory. It consists of 1 fact table, and 4 dimension tables with over 10.5 billion rows. There are 56 columns with most cardinalities varying between 5 and 2,000, a primary key being an example of something outside this range.
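The sizes and compression ratios quoted here can be reproduced from the per-table byte counts in the compression breakdown. The sketch below is a rough check that assumes "TB" and "GB" are decimal units (10^12 and 10^9 bytes), which the source does not state; the variable names are illustrative.

```python
# Table sizes in bytes as published in the compression breakdown:
# table name -> (original star schema size, in-memory size)
TABLES = {
    "LINEORDER": (1_103_524_528_128, 178_586_451_968),
    "DATE": (11_534_336, 1_179_648),
    "PART": (11_534_336, 1_179_648),
    "SUPPLIER": (11_534_336, 1_179_648),
    "CUSTOMER": (11_534_336, 1_179_648),
}

# Assumption: the 1.75 TB row format warehouse is 1.75 decimal terabytes.
ROW_FORMAT_BYTES = 1.75e12


def ratio(original, compressed):
    """Compression ratio, e.g. 6.2 means the data shrank 6.2x."""
    return original / compressed


star_total = sum(original for original, _ in TABLES.values())
memory_total = sum(in_memory for _, in_memory in TABLES.values())

for name, (original, in_memory) in TABLES.items():
    print(f"{name:10s} {ratio(original, in_memory):4.1f}x")

print(f"star schema total: {star_total / 1e12:.2f} TB")
print(f"in-memory total:   {memory_total / 1e9:.1f} GB")
print(f"star schema vs in-memory: {ratio(star_total, memory_total):.1f}x")
print(f"row format vs in-memory:  {ratio(ROW_FORMAT_BYTES, memory_total):.1f}x")
```

With decimal units the totals come out at roughly 1.1 TB and 179 GB, consistent with the "around 6x" and "nearly 10x" figures quoted earlier.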

One problem with many industry standard generated databases is that, as they have grown in size, the cardinalities of the generated columns have become exceedingly unrealistic. For instance, one industry standard benchmark uses a schema where, at a scale factor of 1 TB, the number of parts is SF * 800,000. A 1 TB database that calls for 800 million unique parts is not very realistic. RCDB therefore takes some of these unrealistic cardinalities and sizes them to be more representative of at least a section of customer data. Obviously, one cannot encompass every database in one schema; this is just an example.
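The 800 million figure follows directly from the multiplier. A minimal worked example, assuming that a 1 TB database corresponds to scale factor 1000 (1 GB per unit of scale factor), which the text implies but does not state:

```python
# Multiplier quoted in the text for the industry standard benchmark.
PARTS_PER_SF = 800_000

# Assumption: a 1 TB database corresponds to scale factor 1000.
SCALE_FACTOR_1TB = 1000

# Upper end of the range most RCDB column cardinalities fall in (from the description above).
RCDB_TYPICAL_MAX_CARDINALITY = 2_000


def generated_parts(scale_factor):
    """Number of unique parts the quoted formula generates."""
    return scale_factor * PARTS_PER_SF


parts_at_1tb = generated_parts(SCALE_FACTOR_1TB)
print(f"{parts_at_1tb:,} unique parts at 1 TB")
print(f"{parts_at_1tb // RCDB_TYPICAL_MAX_CARDINALITY:,}x the upper end of RCDB's typical range")
```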

We carefully scaled the number of users on each system under test to the optimum for that system, so that no artificial bottlenecks were created. Each user ran an equal number of queries, and the same queries were run on each system, allowing for a fair comparison of the results.

Key Points and Best Practices

This benchmark utilized the SPARC M7 processor's co-processor DAX for query acceleration.

All SPARC T7-1 server results were run with out of the box tuning for Oracle Solaris.

All Oracle Server X5-2L system results were run with out of the box tunings for Oracle Linux except for the setting in /etc/sysctl.conf to get large pages for the Oracle Database:

vm.nr_hugepages=64520

To create an in memory area, the following was added to the init.ora:

inmemory_size = 200g

An example of how to set a table to be in memory is below:

ALTER TABLE CUSTOMER INMEMORY MEMCOMPRESS FOR QUERY HIGH

See Also

SPARC T7-1 Server: oracle.com, OTN

Oracle Server X5-2L: oracle.com, OTN

Oracle Solaris: oracle.com, OTN

Oracle Database: oracle.com, OTN

Oracle Database – In-Memory: oracle.com, OTN

Disclosure Statement

Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Results as of 10/25/2015.


