Windows Management and Scripting

A wealth of tutorials on Windows Operating Systems, SQL Server and Azure

Posts Tagged ‘Databases’

How to Increase SQL Server query performance

Posted by Alin D on December 21, 2010

Thanks to the natural language roots of the SQL language, writing queries has become extremely easy for just about anyone to pick up. But its simplicity also makes it easy to write poorly performing queries. Here are some simple changes you can make to improve not only query performance, but, in some cases, overall SQL Server system performance as well.

CREATE TABLE vs. SELECT INTO

Oftentimes, within stored procedures or other SQL scripts, temp tables must be created and loaded with data. When writing these queries, many SQL Server DBAs and developers like to use the SELECT INTO method, like this:

SELECT *
INTO #TempTable
FROM sysobjects

While this technique works fine for small tables, when dealing with large record sets or long-running queries it creates locks on the system objects within the tempdb database. As a result, other queries and procedures that need to create objects within tempdb have to wait for the long-running query to complete. This is because when an object is created, an exclusive lock is taken against the system tables: sysobjects, syscolumns, sysindexes and so on in SQL Server 2000, or sysallocunits, syscolpars, syshobtcolumns, sysschobjs, sysserefs and so on in SQL Server 2005. You can see this easily by opening two query windows and running the following:

(First window)

begin tran
create table #test1 (c1 int)

(Second window SQL 2005)

select object_name(rsc_objid), *
from sys.syslockinfo
where req_spid = 52 /*Where 52 = the SPID of the first window*/
order by 1

(Second window SQL Server 2000)

sp_lock 52 /*Where 52 = the SPID of the first window*/
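
Once you have looked at the lock information, release the locks by ending the open transaction in the first window:

-- First window: release the exclusive locks taken by CREATE TABLE
rollback tran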

When a long-running SELECT INTO query loads a temporary table, those same system table locks are held until the query completes and the data is loaded into the temp table. You can avoid this system table locking by creating the table manually with the CREATE TABLE command before loading the data into it.

For example, this code …

CREATE TABLE #TempTable
(object_id int)

INSERT INTO #TempTable
SELECT object_id
FROM sys.objects

… will require much less locking than this code:

SELECT object_id
INTO #TempTable
FROM sys.objects

While the total number of locks taken is the same, the locks taken by the first approach are held for a much shorter time, which allows other processes to create their own temp tables.

Typically, a development server has only a single user, or just a few. When working on SQL code, it’s important to know when that code will impact sessions other than the current one: an unexpected interaction can cause major performance issues.

Accessing data across linked servers

Linked servers are an excellent way to get data in real time from one server to another. However, incorrectly written linked server queries can quickly decrease system performance on one or both servers. While it’s easy to write these queries across linked servers, the query optimizer doesn’t always work as you would expect. I often see queries that join a local table to two remote tables and the queries take hours to run. That’s because the local optimizer doesn’t know which records to request from the remote table.

It therefore requests that the remote server transmit the entire table, and all of that data is loaded into a temporary table so the join can be done locally. Unfortunately, because the local copy is a temporary table rather than a physical table on the source system, the indexes that exist on the remote table are not created on the temporary table, and without them the query execution time skyrockets.

There are a couple of techniques you can use to improve query response time. The first is to create a stored procedure on the remote database that returns a record set, a subset of the remote tables, which is then loaded into a local temporary table and indexed as needed. The trick with this method is to give the remote procedure input parameters so that filter values can be passed to it and the number of returned records is reduced as much as possible. Fewer records means a shorter run time for the stored procedure as well as less network latency when transferring those records from the remote system to the local system.
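
A minimal sketch of this first approach, assuming a linked server named RemoteSrv, a remote database RemoteDB, a remote procedure dbo.GetOrdersByRegion that accepts a filter parameter, and a local table dbo.LocalOrderNotes (all hypothetical names):

-- The linked server must allow RPC Out for INSERT ... EXEC to work.
-- All object names below are placeholders.
CREATE TABLE #RemoteOrders
(
    OrderID   int,
    Region    varchar(10),
    OrderDate datetime
);

-- Let the remote server filter the rows before they cross the network.
INSERT INTO #RemoteOrders (OrderID, Region, OrderDate)
EXEC RemoteSrv.RemoteDB.dbo.GetOrdersByRegion @Region = 'EMEA';

-- Index the temp table so the local join can seek instead of scan.
CREATE INDEX IX_RemoteOrders_OrderID ON #RemoteOrders (OrderID);

-- Join locally against the indexed temp table instead of the remote table.
SELECT n.NoteText, o.OrderDate
FROM dbo.LocalOrderNotes n
JOIN #RemoteOrders o ON o.OrderID = n.OrderID;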

The second technique is a variation of the first. You create a local temporary table for each remote table and transfer over only the columns and records needed from each one. Next, index the tables as needed and join the temp tables locally.
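
A sketch of this second approach, again with hypothetical object names, pulling only the needed columns and rows from each remote table before joining locally:

-- Copy only the required columns and rows from each remote table.
-- RemoteSrv, RemoteDB and the table names are placeholders.
CREATE TABLE #Customers (CustomerID int, CustomerName varchar(100));
CREATE TABLE #Orders    (OrderID int, CustomerID int, OrderDate datetime);

INSERT INTO #Customers (CustomerID, CustomerName)
SELECT CustomerID, CustomerName
FROM RemoteSrv.RemoteDB.dbo.Customers
WHERE Region = 'EMEA';

INSERT INTO #Orders (OrderID, CustomerID, OrderDate)
SELECT OrderID, CustomerID, OrderDate
FROM RemoteSrv.RemoteDB.dbo.Orders
WHERE OrderDate >= '20100101';

-- Index the temp tables, then do the join locally.
CREATE INDEX IX_Customers_CustomerID ON #Customers (CustomerID);
CREATE INDEX IX_Orders_CustomerID ON #Orders (CustomerID);

SELECT c.CustomerName, o.OrderID, o.OrderDate
FROM #Customers c
JOIN #Orders o ON o.CustomerID = c.CustomerID;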

While the second technique is easier and faster to set up and implement, the first method gives you greater performance savings, as typically less data needs to be transferred between the servers.

Subqueries as join partners

When working with joins, you may want to manually control the order that data is selected. An easy (and usually safe) way to do this is to use subqueries as the join object instead of joining directly to a table.

In some instances, you can decrease query execution time by forcing SQL Server to prefilter the data in the table. This method is not foolproof; used incorrectly, it can increase the execution time of your query, so test it fully before moving it to your production environment.
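
For example, a hedged sketch using hypothetical dbo.Invoices and dbo.Vendors tables, joining to a pre-filtered and pre-aggregated derived table instead of joining the raw table directly:

-- Join to a subquery (derived table) so the Invoices rows are filtered and
-- aggregated before the join. All table and column names are hypothetical.
SELECT v.VendorID, v.VendorName, i.InvoiceCount
FROM dbo.Vendors v
JOIN (
    SELECT VendorID, COUNT(*) AS InvoiceCount
    FROM dbo.Invoices
    WHERE InvoiceDate >= '20100101'
    GROUP BY VendorID
) i ON i.VendorID = v.VendorID;
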
As we have seen, there are some quick and easy methods for improving query performance for some long-running processes. While these techniques will not apply to every issue you run across, they will help in some instances.



Identity Property Range Checking in SQL Server

Posted by Alin D on December 15, 2010

The IDENTITY property on a column of a numeric data type is a frequently used way to achieve system-generated “uniqueness” for each row in a table, and such a column is in turn a popular choice for the PRIMARY KEY constraint. Most of the time one would choose the int data type for the underlying column, but the IDENTITY property can be defined on any integer-like data type, and even on the decimal data type as long as the chosen scale is 0. By default, SQL Server uses only positive values unless you specify otherwise. Starting with a negative seed value is perfectly fine for SQL Server, and by doing so you essentially double the range of possible values for most of the available data types. It may hurt one’s aesthetic sense, but if you take negative values into account, this gives you the following ranges of possible values:

tinyint    0 to 255
smallint   -32,768 to 32,767
int        -2,147,483,648 to 2,147,483,647
bigint     -2^63 to 2^63-1
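
For example, a small sketch of an int IDENTITY column seeded at the type's minimum value, which makes the whole negative and positive range available:

-- Seeding at the minimum int value roughly doubles the number of usable values.
CREATE TABLE dbo.full_range_demo (
    col1 int IDENTITY(-2147483648, 1) PRIMARY KEY
);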

If you decide to use a decimal data type such as decimal(38, 0), this gives you a range of -(10^38-1) to 10^38-1 possible values, which should be more than enough for almost any practical purpose.

But what actually happens if you are about to exceed this range?

Let’s create a very simple test case:

CREATE TABLE dbo.id_overflow (
    col1 int IDENTITY(2147483647,1)
);
GO

The above script creates a new table, dbo.id_overflow, with only one column, col1. The column is of type int with the IDENTITY property defined on it, and the seed value is the maximum value for the int type, 2147483647. I arbitrarily picked the int data type; I could have chosen any other eligible data type and the result would be the same. When we now insert into this table, the very first insert statement will succeed, while any subsequent one will fail with an arithmetic overflow error.

--This insert will succeed
INSERT INTO dbo.id_overflow DEFAULT VALUES;
--This insert will fail
INSERT INTO dbo.id_overflow DEFAULT VALUES;

(1 row(s) affected)
Msg 8115, Level 16, State 1, Line 2
Arithmetic overflow error converting IDENTITY to data type int.
Arithmetic overflow occurred.

So far everything is as expected, and when we look at the content of the table we see only the one row from the first insert.

SELECT
    *
FROM
    dbo.id_overflow;

col1
2147483647

(1 row(s) affected)

But what do you do in such a case? You can’t insert any more rows into this table. Even if there are gaps in the sequence of the existing IDENTITY values, those gaps won’t be reused automatically: once a value has been allocated, SQL Server doesn’t care about it any more, and if an insert fails for whatever reason, the freshly allocated value is simply gone.

Essentially, the only feasible solution to this problem is to choose a “bigger” data type. So, a very simplified change script to change the data type in our example to bigint would look like this:

IF OBJECT_ID('dbo.id_overflow') IS NOT NULL
    DROP TABLE dbo.id_overflow;
GO
CREATE TABLE dbo.id_overflow (
    col1 int IDENTITY(2147483647,1)
)
GO

--This insert will succeed
INSERT INTO dbo.id_overflow DEFAULT VALUES;

--Now change the data type to a bigger one.
ALTER TABLE dbo.id_overflow ALTER COLUMN col1 bigint;

--This insert will now succeed as well
INSERT INTO dbo.id_overflow DEFAULT VALUES;

SELECT
    *
FROM
    dbo.id_overflow;

If you run this batch, it finishes without an error and yields the expected result set of 2 rows. But, as mentioned above, a change script in almost any real-world database would be much more complex: indexes would have to be changed, referencing tables would have to be changed, code that assigns the value of that column to a variable of type int would have to be changed, and so on.

It is not hard to predict that you’re in deep trouble when this table is one of the main tables in your database and is referenced by many other tables and/or in many places in your code.

I was bitten by a similar scenario not that long ago. Fortunately, it was “only” a lookup table with an IDENTITY column on a smallint column, and I could simply reseed the IDENTITY value because the last 7000+ inserts had failed due to a misunderstanding between the developers of the calling application and me about how a certain parameter to a procedure should be used. But it was still enough trouble for me to write a small check script that is now part of my weekly scripts. It returns all the tables that have such an IDENTITY column, along with the last value consumed and the buffer I have left before I run out of values again. Here it is:

;WITH TypeRange AS (
SELECT
    'bigint' AS [name],
    9223372036854775807 AS MaxValue,
    -9223372036854775808 AS MinValue
UNION ALL
SELECT
    'int',
    2147483647,
    -2147483648
UNION ALL
SELECT
    'smallint',
    32767,
    -32768
UNION ALL
SELECT
    'tinyint',
    255,
    0
),
IdentBuffer AS (
SELECT
    OBJECT_SCHEMA_NAME(IC.object_id) AS [schema_name],
    O.name AS table_name,
    IC.name AS column_name,
    T.name AS data_type,
    CAST(IC.seed_value AS decimal(38, 0)) AS seed_value,
    IC.increment_value,
    CAST(IC.last_value AS decimal(38, 0)) AS last_value,
    CAST(TR.MaxValue AS decimal(38, 0)) -
        CAST(ISNULL(IC.last_value, 0) AS decimal(38, 0)) AS [buffer],
    CAST(CASE
            WHEN seed_value < 0
            THEN TR.MaxValue - TR.MinValue
            ELSE TR.maxValue
        END AS decimal(38, 0)) AS full_type_range,
    TR.MaxValue AS max_type_value
FROM
    sys.identity_columns IC
    JOIN
    sys.types T ON IC.system_type_id = T.system_type_id
    JOIN
    sys.objects O ON IC.object_id = O.object_id
    JOIN
    TypeRange TR ON T.name = TR.name
WHERE
    O.is_ms_shipped = 0)

SELECT
    IdentBuffer.[schema_name],
    IdentBuffer.table_name,
    IdentBuffer.column_name,
    IdentBuffer.data_type,
    IdentBuffer.seed_value,
    IdentBuffer.increment_value,
    IdentBuffer.last_value,
    IdentBuffer.max_type_value,
    IdentBuffer.full_type_range,
    IdentBuffer.buffer,
    CASE
        WHEN IdentBuffer.seed_value < 0
        THEN (-1 * IdentBuffer.seed_value +
          IdentBuffer.last_value) / IdentBuffer.full_type_range
        ELSE (IdentBuffer.last_value * 1.0) / IdentBuffer.full_type_range
    END AS [identityvalue_consumption_in_percent]
FROM
    IdentBuffer
ORDER BY
    [identityvalue_consumption_in_percent] DESC;

Since SQL Server 2005 it has been really easy to get this information. As you can see from the script, I have omitted the decimal(38,0) alternative; for me, a bigint column with a negative seed value is more than I will ever possibly need. I got into the habit of running this check regularly to monitor how many values are left in the buffer before it blows up again, and to get a feeling for how urgent it is to tackle the inevitable changes to the database. A possible variation would be to send out an alert when a certain threshold is reached, but that I leave up to your imagination.
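
As one possible starting point, here is a minimal sketch of such an alert, assuming the query above has been wrapped in a view named dbo.v_identity_buffer (a hypothetical name) and that 80 percent consumption is the chosen threshold:

-- Raise an error (which a SQL Agent job or monitoring tool can turn into a
-- notification) when any IDENTITY column has consumed more than 80 percent
-- of its available range. dbo.v_identity_buffer is a hypothetical view
-- wrapping the query above.
IF EXISTS (
    SELECT 1
    FROM dbo.v_identity_buffer
    WHERE identityvalue_consumption_in_percent > 0.8
)
BEGIN
    RAISERROR('At least one IDENTITY column has consumed more than 80 percent of its range.', 16, 1);
END;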


Exchange Server Tools for Load Testing and Monitoring

Posted by Alin D on October 27, 2010

Good administrators know that their value is often measured by the tools they bring to the table. Knowing which tools to use for troubleshooting and diagnostics is also a valuable time saver when systems are down or performance has degraded.

Administrators face a variety of performance bottlenecks that they are constantly trying to eliminate. Sometimes it is I/O performance, other times it is network performance. And when end users complain about response times, the factors involved can include CPU speed, memory size, disk storage, and the network infrastructure.

Most enterprises have three or four environments: Production, Pre-Production or Staging, Testing, and Development. Each environment requires its own set of diagnostic tools to aid in the troubleshooting process. Exchange administrators can use tools to load test servers before they leave the Testing and Pre-Production environments. Load testing helps to ensure that the Exchange server has been sized appropriately and that the configuration settings are correct for your Production environment.

Some of the performance and load generating tools that can be used for Exchange server include:

  1. Exchange Server Jetstress Tool – The Jetstress tool allows an administrator to simulate the disk I/O load on a test machine running Exchange Server. Administrators can verify the performance and stability of the disk subsystem before placing their server into a production environment. Using the tool, administrators can simulate the Exchange server database and log file loads that are produced based on a specific number of users. Additional tools to use include: Performance Monitor, Event Viewer and ESEUTIL. If you wish to download Jetstress then go to the following link: http://www.microsoft.com/downloads/en/results.aspx?freetext=JetStress&displaylang=en&stype=s_basic
  2. Exchange Server User Monitor (ExMon) – Administrators can use the Microsoft Exchange Server User Monitor to collect real-time data about client usage patterns, which can then be used for planning purposes. Data that can be collected includes IP addresses used by clients, versions and modes of Microsoft Office Outlook, and resource usage such as CPU utilization. Important information such as server-side processor latency and total latency for network and processing can also be collected. Exchange Server User Monitor can be downloaded from the following site: http://www.microsoft.com/downloads/en/details.aspx?FamilyId=9A49C22E-E0C7-4B7C-ACEF-729D48AF7BC9&displaylang=en
  3. Event Viewer – Administrators can use the Event Viewer to review the data that is contained in the application and system log files. Both the application and the system log files contain errors, warnings, and informational events that are related to the operation of Exchange Server, security, and other applications. Most Exchange Server-related information such as authentication, connections and user actions are recorded in the Application log. Administrators should review the event logs daily for errors and warnings.
  4. ESEUTIL – The Exchange Server Database Utilities tool (Eseutil.exe) is a command-line tool that allows administrators to manipulate Exchange’s Extensible Storage Engine (ESE). An administrator can use it to verify, modify, and repair Exchange ESE database files. Eseutil can also be used with ESE databases on the Exchange Hub Transport and Edge Transport server roles. Corrupt or damaged databases can be repaired, or restored from backup, using the Eseutil tool. Eseutil also works with the log files associated with a Microsoft Exchange database. Administrators can run the Eseutil tool from the Exchange default install folder located in the <SystemDrive>:\Program Files\Microsoft\Exchange Server\Bin directory.

ESEUTIL uses switches such as “/k”, “/mh” and “/cc” to allow administrators to run various checks and operations against an Exchange server’s databases. For example, an administrator can display the headers of database files, transaction log files, and checkpoint files; verify the checksums of all pages in the database, log files, and checkpoint files; or display the restore log files and control hard recovery after restoration from legacy online backups.

Another important switch of the ESEUTIL tool is the “/d” switch which allows an administrator to defragment the database offline but leaves the new, defragmented database in the temporary location. This procedure can be performed with or without overwriting the original database. The net effect of the “/d” switch is to reduce the gross size of the database (.edb) by rebuilding the indexes and deleting most of the empty pages.

Using the “/r” switch an administrator can restore a database by replaying the transaction log files or roll them forward to a known restore point.
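
For reference, the switches mentioned above are run from a command prompt in the Exchange Bin directory; the database and log folder paths below are placeholders for your own environment:

Dump the database header:
eseutil /mh "D:\ExchangeDB\Mailbox01.edb"

Verify page checksums:
eseutil /k "D:\ExchangeDB\Mailbox01.edb"

Offline defragmentation:
eseutil /d "D:\ExchangeDB\Mailbox01.edb"

Soft recovery, replaying log files with the E00 base name:
eseutil /r E00 /l "D:\ExchangeLogs" /d "D:\ExchangeDB"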

Tools such as Jetstress, Performance Monitor, Event Viewer, and ESEUTIL can be used together to verify that your Exchange server and configuration settings are set appropriately for your environment and desired performance levels.
