100 Times Faster
Experiences Making SQL Server Fly
This is a true story. The events depicted occurred in Massachusetts
between 2005 and 2015. At the request of the survivors, the names have
been changed. Out of respect for the dead, the rest has been told
exactly as it occurred.
The Stories or Cases
• Different Algorithms
– Comparing Big Tables
– The Good, the Bad, and the Ugly
• Mistakes - OR - 100 Times Slower
– T-SQL Functions
– Indexes
– Cursors
– Triggers
– Wrong Technology
– Row-By-Row Inserts
• Comprehensive Approach
– Hekaton
– Fast ETL – A Service Broker Solution
Andy Novick
• SQL Server Consultant
• SQL Server MVP since 2010
• Author of 2 books on SQL Server
• [email protected]
• www.NovickSoftware.com
Focus of this Presentation
• Schema Changes
• Procedure & Function Code
• System Tuning
Case: Compare 2 Big Tables

Table_1
ColA_PK | ColB    | ColC     | ColD
ABCD    | 1.24456 | 3.21e-16 | Andy
EFGH    | 1.9123  | 10002    | Eric

Table_2
ColA_PK | ColB    | ColC     | ColD
ABCD    | 2.2456  | 2.11e-15 | Andy
EFGH    | 1.9123  | NULL     | Tom

The output: Differences
PK_Value | Col_name | Value_1 | Value_2 | Issue
ABCD     | ColB     | 1.24456 | 2.2456  | Not equal
EFGH     | ColC     | 10002   | NULL    | Only Table_1
EFGH     | ColD     | Eric    | Tom     | Not equal
The Old Way - UNPIVOT

Table_1 t1
ColA_PK | ColB    | ColC     | ColD
ABCD    | 1.24456 | 3.21e-16 | Andy
EFGH    | 1.9123  | 10002    | Eric

Table_2 t2
ColA_PK | ColB    | ColC     | ColD
ABCD    | 2.2456  | 2.11e-15 | Andy
EFGH    | 1.9123  | NULL     | Tom

#Unpivot_1
PK_value | Col_name | Value_str
ABCD     | ColB     | 1.24456
ABCD     | ColC     | 3.21e-16
ABCD     | ColD     | Andy
EFGH     | ColB     | 1.9123
EFGH     | ColC     | 10002
EFGH     | ColD     | Eric

#Unpivot_2
PK_value | Col_name | Value_str
ABCD     | ColB     | 2.2456
ABCD     | ColC     | 2.11e-15
ABCD     | ColD     | Andy
EFGH     | ColB     | 1.9123
EFGH     | ColC     | NULL
EFGH     | ColD     | Tom
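The UNPIVOT statement itself isn't on the slide; a minimal sketch of how #Unpivot_1 might have been built, assuming every column is cast to varchar so a single Value_str column can hold them all:

SELECT u.ColA_PK AS PK_value, u.col_name, u.value_str
INTO #Unpivot_1
FROM (SELECT ColA_PK
           , CONVERT(varchar(255), ColB) AS ColB
           , CONVERT(varchar(255), ColC) AS ColC
           , CONVERT(varchar(255), ColD) AS ColD
      FROM Table_1) src
UNPIVOT (value_str FOR col_name IN (ColB, ColC, ColD)) u
-- Note: UNPIVOT drops NULL values; the FULL OUTER JOIN in the next
-- step then surfaces those as rows missing from one side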
Compare UNPIVOTed Tables

SELECT COALESCE(t2.ColA_PK, t1.ColA_PK) AS ColA_PK
     , COALESCE(t2.col_name, t1.col_name) AS col_name
     , t1.value_str
     , t2.value_str
     , CASE WHEN t1.col_name IS NULL OR t1.value_str IS NULL
            THEN 'only Table_2'
            WHEN t2.col_name IS NULL OR t2.value_str IS NULL
            THEN 'only Table_1'
            WHEN t2.value_str IS NOT NULL AND t1.value_str IS NOT NULL
                 AND t2.value_str != t1.value_str
            THEN 'not equal'
            ELSE 'unknown issue' END AS issue
FROM #unpivot_1 t1
FULL OUTER JOIN #unpivot_2 t2 ON t1.ColA_PK = t2.ColA_PK
                             AND t1.col_name = t2.col_name
WHERE 1 = CASE WHEN t1.col_name IS NULL OR t1.value_str IS NULL THEN 1
               WHEN t2.col_name IS NULL OR t2.value_str IS NULL THEN 1
               WHEN t2.value_str IS NOT NULL AND t1.value_str IS NOT NULL
                    AND t2.value_str != t1.value_str THEN 1
               ELSE 0 END
Why the original solution?
• The solution existed
– But for different circumstances
• When written, performance didn’t matter
Why the need for a better solution?
• Frequency of use jumped
FROM: 10 per day – taking 25 minutes
TO: 100 per day – taking 4 hours
And then …
Light Dawns over Marblehead
New Way – Compare in Place

Table_1
ColA_PK | ColB    | ColC     | ColD
ABCD    | 1.24456 | 3.21e-16 | Andy
EFGH    | 1.9123  | 10002    | Eric

Table_2
ColA_PK | ColB    | ColC     | ColD
ABCD    | 2.2456  | 2.11e-15 | Andy
EFGH    | 1.9123  | NULL     | Tom
WITH issue_lines_cte AS(
SELECT COALESCE(t1.[ColA_PK], t2.[ColA_PK]) PK_Value
,CASE WHEN t1.[ColB] IS NOT NULL AND t2.[ColB] IS NULL
THEN '|ColB^only Table_1^C^'+ t1.[ColB]+'^'
WHEN t1.[ColB] IS NULL AND t2.[ColB] IS NOT NULL
THEN '|ColB^only Table_2^C^^'+ t2.[ColB]
WHEN t1.[ColB] != t2.[ColB]
THEN '|ColB^not equal^C^'+ t1.[ColB] + '^' + t2.[ColB]
ELSE '' END
+ CASE WHEN t1.[ColC] IS NOT NULL AND t2.[ColC] IS NULL
THEN '|ColC^only Table_1^C^'+ t1.[ColC]+'^'
WHEN t1.[ColC] IS NULL AND t2.[ColC] IS NOT NULL
THEN '|ColC^only Table_2^C^^'+ t2.[ColC]
WHEN t1.[ColC] != t2.[ColC]
THEN '|ColC^not equal^C^'+ t1.[ColC] + '^' + t2.[ColC] ELSE '' END
. . .
issue_line
FROM Table_1 t1 FULL OUTER JOIN Table_2 t2 ON t1.[ColA_PK] = t2.[ColA_PK]
)
INSERT INTO #issue_lines (PK_value, issue_line)
SELECT PK_Value, issue_line
FROM issue_lines_cte
WHERE issue_line != ''
The code for numeric columns
WITH issue_lines_cte AS(
+ CASE WHEN t1.[ColN] IS NOT NULL AND t2.[ColN] IS NULL
THEN '|ColN^only Table_1^N^'+ t1.[ColN]+'^'
WHEN t1.[ColN] IS NULL AND t2.[ColN] IS NOT NULL
THEN '|ColN^only Table_2^N^^'+ t2.[ColN]
WHEN 1=ISNUMERIC(t1.ColN) AND 1=ISNUMERIC(t2.ColN)
THEN CASE WHEN ABS(CONVERT(float,t1.ColN) - CONVERT(float,t2.ColN)) > 1e-13
THEN '|ColN^not equal^N^'+ t1.[ColN] + '^' + t2.[ColN]
ELSE '' END
WHEN t1.[ColN] != t2.[ColN]
THEN '|ColN^non-numeric not equal^N^'+ t1.[ColN] + '^' + t2.[ColN]
ELSE '' END
#issue_lines
PK_value | Issue_line
ABCD     | |ColB^Not equal^1.24456^2.2456
EFGH     | |ColC^Only Table_1^10002^^|ColD^Not equal^Eric^Tom

Differences
PK_Value | Col_name | Value_1 | Value_2 | Issue
ABCD     | ColB     | 1.24456 | 2.2456  | Not equal
EFGH     | ColC     | 10002   | NULL    | Only Table_1
EFGH     | ColD     | Eric    | Tom     | Not equal
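A minimal sketch of shredding issue_line back into per-column rows, assuming SQL Server 2016's STRING_SPLIT (earlier versions would use a Numbers-table splitter):

-- One row per '|'-delimited segment; each segment is still
-- '^'-separated: col_name^issue[^type]^value_1^value_2
SELECT il.PK_value, s.value AS issue_segment
FROM #issue_lines il
CROSS APPLY STRING_SPLIT(il.issue_line, '|') s
WHERE s.value != ''  -- the leading '|' yields an empty first segment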
Results
Start: 10 per day - taking 25 minutes
Grew to: 100 per day - taking 4 hours
Finally: 100 per day - taking 10 minutes
Lesson
Finding a better way to accomplish the task
has the biggest impact on the results
CASE:
THE GOOD,
THE BAD,
AND THE UGLY
Case: Find all dates in table
CREATE TABLE myData (
asof_date smalldatetime
, [entity_id] int
, attribute_id smallint
, value varchar(255)
, CONSTRAINT PK_myData PRIMARY KEY CLUSTERED
(asof_date, entity_id, attribute_id)
)
• 15 Billion Rows
The Bad – 120 Seconds
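The "Bad" code isn't shown; a sketch assuming it was the obvious one-liner, which scans the full 15-billion-row clustered index:

SELECT DISTINCT asof_date
INTO #dates
FROM dbo.myData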
The Ugly – 1.2 Seconds

declare @target_asof_date smalldatetime = '1900-01-01'
      , @next_asof_date SMALLDATETIME

CREATE TABLE #dates (asof_date SMALLDATETIME)

-- Seek to the first date on or after the target
SELECT TOP (1) @next_asof_date = asof_date
FROM dbo.myData WHERE asof_date >= @target_asof_date
ORDER BY asof_date

WHILE @next_asof_date IS NOT NULL
BEGIN
    INSERT INTO #dates (asof_date) VALUES (@next_asof_date)
    SET @target_asof_date = DATEADD(DAY, 1, @next_asof_date)
    SET @next_asof_date = NULL
    -- Each iteration is one more index seek past the last date found
    SELECT TOP (1) @next_asof_date = asof_date
    FROM dbo.myData
    WHERE asof_date >= @target_asof_date
    ORDER BY asof_date
END
The Good – 290 Milliseconds

declare @max_date smalldatetime = (SELECT MAX(asof_date) FROM myData)
      , @min_date smalldatetime = (SELECT MIN(asof_date) FROM myData)

SELECT found_dates.asof_date
INTO #dates
FROM (SELECT n, DATEADD(DAY, -n, @max_date) target_date
      FROM Numbers
      WHERE n < DATEDIFF(DAY, @min_date, @max_date)
     ) dates
CROSS APPLY (SELECT TOP (1) asof_date FROM myData
             WHERE dates.target_date = asof_date
            ) found_dates
The Good, The Bad and The Ugly
• Use information SQL Server Doesn’t know
• Table of numbers – used to construct data (sketch below)
• CROSS APPLY (Also OUTER APPLY)
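The Numbers table is assumed to already exist; a minimal sketch of one common way to build it (the name, size, and 0-based start here are assumptions):

-- Generate sequential integers from a cross join of system views
SELECT TOP (100000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
INTO dbo.Numbers
FROM sys.all_objects a CROSS JOIN sys.all_objects b

CREATE UNIQUE CLUSTERED INDEX ucx_numbers ON dbo.Numbers (n)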
Lesson
Query plans count.
You have to understand them.
MISTAKES
or
100 TIMES SLOWER
Case: Pick N values from Y
• Existing Solution:
T-SQL Multi-Statement Table-Valued Function
• 300 sec to pick 8,000 values from 30,000
Problem: Pick N values from Y
• Cause:
– Cursors (2 of them)
– Using @Tables as Arrays
• Solution: CLR Table-Valued Function
– Machine code (via IL)
– Real arrays
• Result: 2 seconds to select 8,000 values from 30,000 numbers
• How do you know it’s a random sample?
– Chi-Squared test for randomness
Lesson
Use the right technology for the job
It isn’t always T-SQL
Case: Slow DB on Big Hardware
• Existing Solution: Bespoke web app using ASP.Net & SQL Server
– 16 Cores, 16 GB RAM, Multiple Disks
– Slow serving 1000 users per day
• Cause: 306 Triggers on 80 tables
IF (SELECT count(*) from INSERTED) > 1 ROLLBACK
• Additional Cause: Cursors in every procedure
• Root Cause:
• Solution: Re-write required
Lesson
Ignorance isn’t an excuse
Case: Algorithm takes hundreds of days
• De-obfuscation Algorithm
• Government NAICS labor data down to the county level
• Obscured by giving ranges of possible values
• Row and column totals allow actual values to be estimated
• Existing Solution:
– Implemented in T-SQL with Cursors
• Cause:
– Cursor
– Using #Tables as Arrays
• Solution: Implement with Inline UDFs: 4 hours
Case: ETL 100,000 rows per day
• Existing Solution: SSIS and stored procs
– Taking 17 hours:
How could that be?
• Cause:
– 15 indexes on one table
– 10 indexes on the other
– Simultaneous query activity
• Solution:
– Read Committed Snapshot Isolation
– Reduce the number of indexes
READ COMMITTED SNAPSHOT ISOLATION
• ALTER DATABASE <mydbname>
SET READ_COMMITTED_SNAPSHOT ON
• Readers are not blocked by writers
• Readers get EXISTING row during update
• Row versions stored in tempdb
How do you get too many indexes?
• One index at a time
• Always create every missing index!
• The Consultant’s Friend:
_dta_index_…..
– Database Tuning Advisor
sys.dm_db_index_usage_stats
• Find indexes with more writes than reads (sketch below)
• Look for indexes with zero reads
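A sketch of the kind of query this suggests; note the DMV's counters reset at instance restart, so judge over a representative window:

-- Indexes whose maintenance cost (writes) exceeds their benefit (reads)
SELECT OBJECT_NAME(s.[object_id])                   AS table_name
     , i.name                                       AS index_name
     , s.user_updates                               AS writes
     , s.user_seeks + s.user_scans + s.user_lookups AS reads
FROM sys.dm_db_index_usage_stats s
JOIN sys.indexes i
  ON i.[object_id] = s.[object_id] AND i.index_id = s.index_id
WHERE s.database_id = DB_ID()
  AND s.user_updates > s.user_seeks + s.user_scans + s.user_lookups
ORDER BY s.user_updates DESC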
Ladies and Gentlemen: for our next case

Little Software Vendor: "The Hardware is Slow!!!"
VS
Big New York Bank: "The Software is Slow!!!"
Case: ETL 15 Million Rows a day
• Existing Solution:
– Application using stored procedures
– Big hardware + SQL Server 2000
• ETL started at 8 PM
– ran into the work day
– Big Problem
Environment
• Windows 2003 / SQL Server 2000
• SAN
  – 6 LUNs, RAID 5, 5 to 10 drives
  – I/O stalls 2.5 to 3.5 milliseconds
Diagnosis
• Server Trace
• 3 Gigabytes of trace in 3 minutes
• Cause: Each row INSERTed individually
• Solution: Bulk Insert (sketch below)
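A minimal sketch of the fix (file path, table, and format are hypothetical); one BULK INSERT replaces millions of single-row INSERTs:

BULK INSERT dbo.staging_rows      -- hypothetical staging table
FROM 'D:\etl\daily_feed.txt'      -- hypothetical file path
WITH (FIELDTERMINATOR = '\t'
    , ROWTERMINATOR = '\n'
    , TABLOCK)                    -- allows minimal logging where eligible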
Lesson:
Be careful about blaming the Hardware
Pride goeth before destruction,
And a haughty spirit before a fall.
The Bible, Proverbs 16:18
Case: Slow application
• Existing Solution: T-SQL Based database
• Cause: Excessive use of T-SQL Functions
Solution to Slow Scalar Functions
• Re-Write with:
  – Inline code
  – Inline functions

SELECT dbo.fnDoSomething(ColA) as A
     , dbo.fnDoSomething(ColC) as C
     , dbo.fnDoTheOtherThing(ColD) as D
     , dbo.fnDoMin(ColE, 5) + dbo.fnDoMin(ColF, 7) as EF
FROM myTable t1
INNER JOIN theOtherTable t2 on t1.PK = t2.PK
ORDER BY dbo.fnDoSomething(ColA), dbo.fnDoSomething(ColC)

SELECT (SELECT ds FROM dbo.fnDoSomething_inline(ColA)) as A
     , (SELECT ds FROM dbo.fnDoSomething_inline(ColC)) as C
     , (SELECT dtot FROM dbo.fnDoTheOtherThing_inline(ColD)) as D
     , CASE WHEN ColE < 5 THEN 5 ELSE ColE END
       + CASE WHEN ColF < 7 THEN 7 ELSE ColF END as EF
FROM myTable t1
INNER JOIN theOtherTable t2 on t1.PK = t2.PK
ORDER BY 1, 2
Functions
• A great way to encapsulate reusable logic
• Scalar Functions are slow
– Row-by-Row cursor-like processing
– They inhibit query parallelism
• Multi-statement Table-Valued Functions (TVFs) are slow
  – Row-by-Row cursor-like processing
  – Inhibit query parallelism
  – Use a @Table Variable to return data
• Inline Functions are fast
  – They're views with parameters (sketch below)
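A sketch of the inline shape, reusing the fnDoSomething_inline name from the rewrite slide (the body here is a placeholder):

-- An inline TVF is a single RETURN (SELECT ...); the optimizer expands
-- it into the calling query like a parameterized view
CREATE FUNCTION dbo.fnDoSomething_inline (@input varchar(255))
RETURNS TABLE
AS
RETURN (SELECT UPPER(@input) AS ds)  -- placeholder logic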
Which Functions should be Rewritten?
• Hard to find: not in sys.dm_exec_procedure_stats,
  but SQL Server 2016 adds sys.dm_exec_function_stats
• Traces can record functions
  – Event SP:Completed
  – Filter on ObjectType = 20038
  – Lots of overhead!
• Extended Events is the low-overhead way to
  measure function use (sketch below)
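A minimal sketch of such a session (session and file names are assumptions); module_end fires when a function or procedure finishes:

CREATE EVENT SESSION function_usage ON SERVER
ADD EVENT sqlserver.module_end   -- fires for functions as well as procedures
ADD TARGET package0.event_file (SET filename = N'function_usage')

ALTER EVENT SESSION function_usage ON SERVER STATE = START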
How about a SQLCLR function?
• Write Scalar or TVF in C# or VB.Net
• Great for complex algorithms that don’t
access much data
– Analytics or Statistics
– String manipulations
– When the code has loops/cursors
– When the code needs arrays
• Aggregates
Functions Resources
• Videos on the problem and solution
  http://novicksoftware.com/problem-user-defined-functions-solution/
• Connect item about getting it fixed - 273443
  https://connect.microsoft.com/SQLServer/feedback/details/273443/the-scalar-expression-function-would-speed-performance-while-keeping-the-benefits-of-functions
There is Hope
SQL Server 2016: Natively Compiled Functions
Another Case: Slow application
• Existing Solution: T-SQL Based database
• Cause: Excessive use of T-SQL Functions
• Solution: Re-Write with:
– Inline code
– Inline functions
And Another Case: Slow application
• Existing Solution: T-SQL Based database
• Cause: Excessive use of T-SQL Functions
• Solution: Re-Write with:
– Inline code
– Inline functions
Were These Mistakes?
• All represent working code
• All ran fast when initially coded
A mistake is not something to be determined after the fact,
but in the light of the information available until that point.
Nassim Taleb
Fooled by Randomness
How did we get there?
What could prevent these mistakes?
• Testing with sufficient quantity of data
• Refactoring as data quantity grows
• Assume the code is the problem!
– Don’t point fingers at the hardware until you’re sure
• More knowledge of alternatives
COMPREHENSIVE
SOLUTIONS
CASE: HEKATON
SQL SERVER’S
IN-MEMORY TABLES
What makes it faster?
• Data is always in memory. Only limited I/O required
• No locks, latches, spinlocks, etc
• Data is written sequentially, only!
– Checkpoints in particular
• No indexes are stored or logged. They're in memory only
• Multiple row versions in-memory
• Non-durable tables allowed. No I/O
• Compiled Native T-SQL stored procedures
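A minimal sketch of declaring a memory-optimized table (table name and bucket count are assumptions):

CREATE TABLE dbo.hot_data (
  id    int          NOT NULL
, value varchar(255) NOT NULL
, CONSTRAINT pk_hot_data PRIMARY KEY NONCLUSTERED
    HASH (id) WITH (BUCKET_COUNT = 1048576)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)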
Hekaton Results
• 3X with interop (interpreted T-SQL on in-memory tables)
• 25-35X with in-memory tables + natively compiled procedures
CASE: FAST ETL
A SERVICE BROKER SOLUTION
Existing ETL Solution
• Entity-Attribute-Value (EAV) Table
CREATE TABLE eav (
  entity_id        int not null
, attribute_id     smallint not null
, value            varchar(255) not null
, begin_datetime   datetime not null
, end_datetime     datetime not null
)
Why Entity-Attribute-Value (EAV) ?
• Flexibility
• New attributes don't require schema changes
• Can handle more than 1024 attributes
• Business users control the attributes through metadata tables
Data sources
• Multiple flat files – Several times a day
• Some with more than 300 attributes
Entity_id | Attribute_1 | Attribute_2 | Attribute_45 | Attribute_1856
1         | 1.0001      | 1.0002      | 1.0045       | 1.1856
2         | 2.0001      | 2.0002      | 2.0045       | 2.1856
3         | 3.001       | 2.0002      | 3.0045       | 3.1856
…
Why is the ETL Slow?
• 40,000,000 values per day
• Input files often have 7,000,000 values
• Breakdown
  BULK INSERT:   5%
  UNPIVOT:      15%
  MERGE/INSERT: 80%
What to do about it?
• Two Tables
  – EAV_current – partitioned by attribute_id; ETL loads into EAV_current
  – EAV_history – partitioned by end_datetime; weekly archive process
• Unified by a Partitioned View, EAV
What is a partitioned table?
• A collection of identical HOBTs
– Heap or B-Tree – usually a B-Tree
• All with the same
– Columns
– Indexes
– Constraints
• HOBTs distinguished by a partitioning column
• Looks Like a Single Table
Table partitioned on a date
CREATE TABLE eav_history (
  [entity_id]      int not null
, attribute_id     smallint not null
, value            varchar(255) not null
, begin_datetime   datetime not null
, end_datetime     datetime not null
, CONSTRAINT pk_eav_history PRIMARY KEY CLUSTERED
  (attribute_id, begin_datetime, [entity_id], end_datetime)
) ON ps_eav_history_on_history_filegroups (end_datetime)
Table partitioned on attribute_id
CREATE TABLE eav_current (
  [entity_id]      int not null
, attribute_id     smallint not null
, value            varchar(255) not null
, begin_datetime   datetime not null
, end_datetime     datetime not null
, CONSTRAINT pk_eav PRIMARY KEY CLUSTERED
  (end_datetime, [entity_id], attribute_id)
) ON ps_attribute_id_on_user_tables (attribute_id)
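The partition scheme names above imply definitions along these lines (boundary values and filegroup mapping are assumptions):

-- One partition per attribute_id range
CREATE PARTITION FUNCTION pf_attribute_id (smallint)
AS RANGE RIGHT FOR VALUES (1, 2, 3 /* ... up through the highest attribute_id */)

CREATE PARTITION SCHEME ps_attribute_id_on_user_tables
AS PARTITION pf_attribute_id ALL TO ([PRIMARY])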
EAV Table Partitioning

CREATE VIEW EAV AS
SELECT * FROM EAV_current UNION ALL SELECT * FROM EAV_history

EAV_current table – partitioned on attribute_id
  (partitions from Attribute_id=1 to Attribute_id=2000)
EAV_history table – partitioned on end_datetime
  (partitions from End_datetime=2010-03-01 to End_datetime=2015-10-08)

Eliminates blocking during load to EAV_current
Partitioned Table Details

[Diagram: EAV_current, partitioned on attribute_id – each partition
(e.g., partition 812, partition 1374) is its own B-Tree with a root,
intermediate levels, and leaf pages]

ALTER TABLE eav_current
SET (LOCK_ESCALATION = AUTO)
ETL Process Block Diagram

Flat file input
  → BULK INSERT  → #input
  → UNPIVOT      → #tran
  → MERGE/INSERT → EAV_current table (partitioned on attribute_id)

All steps run inside the T-SQL EAV_Load_procedure
But it's Still Slow!

And then…
Light Dawns Over Marblehead
Break it down to single partition operations
• SQLCLR proc breaks the file by attribute_id
• SEND each attribute_id's data to a
  Service Broker QUEUE (sketch below)
• Each task is working on ONE attribute_id
– That’s one HOBT / Partition
• Run 1-2 tasks per core
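A minimal sketch of queueing one attribute_id's batch (service, contract, message type, and payload names are invented for illustration):

DECLARE @dialog uniqueidentifier
      , @payload xml = N'<load attribute_id="1374"/>'  -- hypothetical payload

BEGIN DIALOG CONVERSATION @dialog
  FROM SERVICE [//EAV/LoadService]
  TO SERVICE   '//EAV/LoadService'
  ON CONTRACT  [//EAV/LoadContract]
  WITH ENCRYPTION = OFF

-- One message per attribute_id: each activated task dequeues one
-- message and loads exactly one partition's worth of rows
SEND ON CONVERSATION @dialog
  MESSAGE TYPE [//EAV/LoadMessage] (@payload)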
What does that look like?

Flat file input
  → SQLCLR file parser
  → SB Queue
  → Service Broker Tasks (e.g., 3, 9, 10, 18), each running the
    T-SQL EAV_SB_Load procedure
  → EAV_current table (partitioned on attribute_id)
Service Broker Activation
• Multiple Activation using Event Notifications
• See Pro SQL Server 2008
Service Broker
By Klaus Aschenbrenner
EAV Load Results
• Before: 26 minutes
• After: 4 to 5 minutes
• More Cores = Faster Loads
• Until we hit the next bottleneck
Questions to ask about a slow process
• What work is the existing solution doing?
– Is it necessary to the end result?
• What’s the least work for the task?
• Can you do work earlier and re-use it when it counts?
• What isn’t parallel?
– Could it be?
• Are you using the right tool/technology?
Conclusions
• It can be done!
• Testing, Testing, Testing – With lots of rows
• Figure out how to get SQL Server to use
more memory and more cores
• Humans, not technology, are responsible for
  most of our problems
Finding a Solution
• Look for mistakes
• Listen to your instincts about how long it
should take
• Look for a better way
• Pull out all the stops. Change everything!
[email protected]
http://www.NovickSoftware.com
Thank you for coming!