SQL Server – OLTP vs OLAP

In this blog post I will explain the differences between OLTP & OLAP and how should we configure SQL Server and its workload in both the cases.

OLTP (Online Transaction Processing)

OLTP is designed to serve as a persistent data store for business or front-end applications. OLTP administers day to day transaction of an organization. They excel at quickly looking up specific information as well as processing a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second. In OLTP database there is detailed and current data, and schema used to store transactional databases is the entity model (usually 3NF).

Examples of some common requirements need  OLTP:

  • What are the active orders of the current users?
  • What is the name of the current user, when given an email address?
  • What was the last stage that a player was on a mobile game?
  • What is current billing addresses for a set of clients?

These type of requirements need a system that can look up and update one or more columns within one or many rows. The strength of OLTPs is that they support fast writes and fast data retrieval. A typical workload for OLTP is both frequent reads and writes, but the reads tend to be more of looking up a specific value rather than scanning all values to compute an aggregate.

Example of OLTP system

An example of OLTP system is ATM center. Assume that a couple has a joint account with a bank. One day both simultaneously reach different ATM centers at precisely the same time and want to withdraw total amount present in their bank account.

However, the person that completes authentication process first will be able to get money. In this case, OLTP system makes sure that withdrawn amount will be never more than the amount present in the bank. The key to note here is that OLTP systems are optimized for transactional superiority instead data analysis.

Other examples of OLTP system are:

  • Online banking
  • Online airline ticket booking
  • Sending a text message
  • Order entry
  • Add a book to shopping cart

Benefits of OLTP method

  • Provides ability to faster time to market
  • Serves daily transactions of an organization.
  • Widens the customer base of an organization by simplifying individual processes.

OLAP (Online Analytical Processing)

In contrast to an OLTP database, an OLAP database is designed to process large datasets to answer questions about data. OLAP is a combination of software tools which provide analysis of data for business decisions. OLAP systems allow users to analyze database information from multiple database systems at one time.

It is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations of large data sets. For OLAP systems a response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually star schema).

  • Star schema: In computing, the Star Schema is the simplest style of data mart schema. The star schema consists of one or more fact tables referencing any number of dimension tables. The star schema is an important special case of the snowflake schema, and is more effective for handling simpler queries.

Examples of some common requirements need  OLAP:

  • What is the most selling product of website?
  • What is the demand trend of a product?
  • Setting up the optimal prices as per demand?
  • What is the customer lifetime value of an e-commerce application?
  • What is the median duration of play time on a mobile game?
  • What is the conversion rate for various landing pages based on the referrer?

An OLAP database is optimized for scanning and performing computations across many rows of data for one or multiple columns. Since OLAP is optimized for analyzing data, basic transactional procedures like writes or updates tend to be done in infrequent batches, typically once a day or an hour. OLAP shines when it comes to reads and analytical calculations like aggregation.

Example of OLAP

Any Data warehouse system is an OLAP system. Uses of OLAP are as follows

  • A company might compare their mobile phone sales in September with sales in October, then compare those results with another location which may be stored in a separate database.
  • A retail company analyzes purchases by its customers to come up with a personalized homepage with products which likely interest to their customer.
  • An airline may analyse their busiest routes during summers and winters and schedule their flights accordingly
  • A restaurant may compare their sales during different seasons and can control their food supply as per demand. 

Difference between OLTP and OLAP

Migrationa

Parameters

OLTP System

OLAP System

Source of data

Operational data; OLTPs are the original source of the data.

Consolidation data; OLAP data comes from the various OLTP Databases

Purpose of data

To control and run fundamental business tasks

To help with planning, problem solving, and decision support

What the data

Reveals a snapshot of ongoing business processes

Multi-dimensional views of various kinds of business activities

Inserts and Updates

Short and fast inserts and updates initiated by end users

Periodic long running batch jobs refresh the data

Queries

Relatively standardized and simple queries Returning relatively few records

Often complex queries involving aggregations

Processing Speed

Typically very fast

Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes

Space Requirements

Can be relatively small if historical data is archived

Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP

Database Design

Highly normalized with many tables

Typically de-normalized with fewer tables; use of star and/or snowflake schemas

Backup and Recovery

Backup religiously; operational data is critical to run the business, data loss is likely to entail significant monetary loss and legal liability

Although regular backups are always a safer option, but some environments may consider simply reloading the OLTP data as a recovery method

OLTP is a data modeling approach typically used to facilitate and manage usual business applications. Most of applications you see and use are OLTP based.

OLAP is an approach to answer multi-dimensional queries. OLAP was conceived for Management Information Systems and Decision Support Systems.

With the constant growth of data analysis and business intelligence applications (now even in small business) understanding OLAP nuances and benefits is a must if you want provide valid and useful analytics to management.

The following table will summarize main technical differences between OLTP and OLAP:

Parameters

OLTP System

OLAP System

Application

Operational: ERP, CRM, legacy apps, …

Management Information System, Decision Support System

Typical users

Staff/Customers

Analyst/Business users

Horizon

Weeks, Months

Years

Refresh

Immediate

Periodic

Data model

Entity-relationship

Multi-dimensional

Schema

Normalized

Star

Emphasis

Update/retrieval

Retrieval

Workload distribution on OLTP and OLAP

SQL Server Data Warehouse (DW) workloads differ in significant ways from traditional Online-Transaction-Processing (OLTP) workloads in how they should be performance tuned because of the different query patterns inherent in both designs. Below is a table which summarizes some of the most important differences.

5-3-2013_3-31-36_pm

An OLTP workload is characterized by seek centric operations that are best measured by IOPS (In-and-Out-Operations-Per-Second). The typical OLTP query only requires a few rows from several tables (3rd normal form) to be returned ideally from memory and rarely from disk. An OLTP system’s performance is determined by how quickly it can seek those few rows and return them back to the application. However, a typical data warehouse query will return a lot of historical data from disk to be aggregated on the CPU. For example to get the Total Sales Amount for an entire year all of the sales transactions for that year will have to move from disk onto the CPU to be aggregated to the yearly level. This large data movement requires scan centric operations.

End users and applications update the OLTP system through DML  (Data Manipulation Language: updates, inserts, and deletes) operations continuously throughout the day so that data is described as being volatile while a data warehouse typical only preforms the bulk INSERTS (or potentially UPDATES and DELETES) operations during a daily ETL process.

In order for an OLTP system to quickly locate the rows that it requires many indexes will need to be created, however data warehouse tables do not require nearly as many indexes. In the example of yearly Total Sales Amount an index will help located all the transactions for the entire year but will not be able to improve the time that it takes to move all of those transaction from disk to the CPU. Therefore, only relatively few indexes are required.

OLTP systems are also characterized by high concurrency because many users or applications can be querying the system at the same time. Data warehouses, however, normally only support a few users who are typically characterized as analysts.

Fine tune your workload on OLTP and OLAP

Following are the best practices while configuring your SQL Server on an OLTP and OLAP:

 

OLTP System

OLAP System

SQL Server Indexes

Try to cover queries with supporting indexes to run faster.

Try to cover queries minimally as it will impact bulk loading of data significantly. In most of the cases analysis processes read full table.

Parallelism in SQL Server Plan

Try to restrict the parallel queries and ensure non-parallel SQL plan is being used

A parallel plan is preferred

Preferred Physical Join

Nested Loop

Hash Join

Maximum Degree of Parallelism

As small as it can be- as OLTP serves high concurrent transactions, we need to make sure CPU is available for other concurrent transactions so one process cannot take the whole CPU.

As high as it can be- as OLAP does not serve high concurrent workload so we should utilize as many CPU as we can during one bulk transaction.

SQL Transactions

Keep it small as it can be.

Keep it big as it can be.

Primary Key

It should be created on business unique column so that joins and data filter can extract data faster.

Surrogate keys are database keys used to relate the dimension tables to the fact tables. Surrogate keys (SK) have no meaning to the business and no intrinsic meaning.

Note: These are the best practices and can vary environment to environment so please configure your SQL Sever as per your requirements.

Please leave your feedback in comment section.

Cheers.

10 thoughts on “SQL Server – OLTP vs OLAP

Add yours

  1. There is still a lot of misconception about OLAP and OLTP workload so I have tried to explain both the type of workloads as per my professional experience. I am glad you found it worth reading.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Website Powered by WordPress.com.

Up ↑

%d bloggers like this: