Windows Management and Scripting

A wealth of tutorials on Windows operating systems, SQL Server and Azure


Posts Tagged ‘SQL Server Management Studio’

How to use multi-server management features in SQL Server 2008 R2

Posted by Alin D on June 28, 2011

One of SQL Server 2008 R2’s most anticipated features centers on its new multi-server management capabilities. As of the current Community Technology Preview (CTP), this feature is called the utility control point (UCP), and that is how it will be referred to in this article.

The UCP allows you to gather a large number of metrics about multiple SQL Server instances and display this data in an easy-to-use dashboard report. This report provides a quick overview of your entire enterprise from a single screen within SQL Server Management Studio. The utility control point does this by creating a database on the server that will function as your UCP. The monitored instances then upload their data to the UCP.

When designing your UCP servers for your enterprise environment, there are some licensing limits to keep in mind. For example, SQL Server 2008 R2 Enterprise Edition allows you to enroll up to 25 monitored instances. To enroll more servers, you’ll need either another Enterprise Edition UCP or a SQL Server 2008 R2 Datacenter Edition instance (there is no limit to the number of servers that can be enrolled on that UCP).

Setting up the utility control point

Before you can begin capturing data in the UCP, you have to set it up, which requires you to open the Utility Explorer. This is done in SQL Server Management Studio by clicking on the View drop-down menu and selecting Utility Explorer. This opens another tab in the pane where the object explorer is. You can then create a new UCP by clicking the Create Utility Control Point button on the right. This will launch the Create UCP wizard.

The first page of the wizard is your standard introduction page. The second page lets you select the server instance that will serve as the UCP. You can also name the UCP to distinguish it from other utility control points in your enterprise.

The third page of the wizard allows you to select a proxy account or use the SQL Agent account for data collection. If your SQL Agent service is running under a domain account, you can choose to use the service account. If it is not running under a domain account, you will want to enter credentials to use as a proxy account so that the SQL Agent job that handles the data collection can access and authenticate against the remote SQL Server instances. The fourth page of the wizard validates that all the rules have passed their validation check. The fifth page shows a basic summary of the information collected, while the sixth page of the wizard creates the objects that will serve as the UCP.

Using the utility control point

After you create the UCP, there won’t be any data available since data collection will have just started. To add additional instances, right-click on Managed Instances under the UCP you just created. This will launch a wizard that allows you to add additional SQL Servers to be monitored.

After the utility control point has collected some data, you’ll see a screen similar to the screenshot below that displays how many of your instances are overutilized, underutilized and, most importantly, well-utilized. You can also see which specific components are over- or underused, including CPU, memory and disk capacity.

If you use DACPACs to deploy applications to your databases, then you will see the utilization of the data-tier application in all its parts.

Modifying thresholds

Microsoft has set some respectable defaults when it comes to utility control points. When you click on the Utility Administration option under the selected UCP in the Utility Explorer, you can change the settings. By default, the thresholds are 70% for all overutilization settings and 0% for underutilization settings. The 70% overutilization setting probably works for most people, but you might want to adjust the 0% underutilization setting depending on the normal server load for a given system.

From this same section, you can adjust the number of evaluations per period and how many of them must be in violation before the CPU alarm is triggered.

For the high-utilization alarm, this is done by adjusting a slider from short to long, where short is one hour and long is one week. You can then set the percentage of evaluations that must be in violation before the alarm is triggered.

For the low-utilization alarm, the sliders are very similar, except that the time frame is from one day to one month with the same percentage-based scale.

If needed, you can specify instance-specific policies by editing the managed instance and clicking on the Policy Details tab of the instance properties. This allows you to apply specific settings for instances that don’t fall within the global policies of the other servers that are managed by the UCP.

Last step

Now that you have this information, the key is to find a way to use it to make intelligent decisions. From a single glance, you can see which servers are overutilized and which ones have room available on them. This should allow you to make better use of your environment. Simply put, if you know that a database application on Server1 has a high CPU load and that Server2 has a very low CPU load, then that database application might be a good candidate to move from one server to the other.



 

 


SQL Server Analysis Services partitions – Create and Manage

Posted by Alin D on May 12, 2011

Partitions are portions of a SQL Server Analysis Services measure group that hold some or all of the measure group’s data.

When a measure group is first created, it contains a single partition corresponding to all the data in your fact table or view. Additional partitions need to be created for any measure group with more than 20 million rows.

Since a majority of corporate databases have far more than 20 million rows in fact tables, you should know how to create partitions and also be aware of good partition design practices.

You can define partitions using the Business Intelligence Development Studio (BIDS). On the Partitions tab within your project, simply click the New Partition link near a measure group to open the Partition Wizard. (I won’t cover the steps of the Partition Wizard here because it is fairly simple to follow.)

An alternative method to creating new partitions is through XMLA scripts, which is what BIDS does behind the scenes.

You can script an existing partition in SQL Server Management Studio (SSMS) by right-clicking a partition and then choosing Script Partition as CREATE to open a new query window. You will need to edit certain properties such as the partition identifier, its name and the query used for populating the partition.

Here is a sample XMLA for a partition:

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<ParentObject>
<DatabaseID>Adventure Works DW 2008</DatabaseID>
<CubeID>Adventure Works</CubeID>
<MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
</ParentObject>
<ObjectDefinition>
<Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
<ID>Internet_Sales_2001</ID>
<Name>Internet_Sales_2001</Name>
<Source xsi:type="QueryBinding">
<DataSourceID>Adventure Works DW</DataSourceID>
<QueryDefinition>
SELECT
[dbo].[FactInternetSales].[ProductKey],
[dbo].[FactInternetSales].[OrderDateKey],
[dbo].[FactInternetSales].[DueDateKey],
[dbo].[FactInternetSales].[ShipDateKey],
[dbo].[FactInternetSales].[CustomerKey],
[dbo].[FactInternetSales].[PromotionKey],
[dbo].[FactInternetSales].[CurrencyKey],
[dbo].[FactInternetSales].[SalesTerritoryKey],
[dbo].[FactInternetSales].[SalesOrderNumber],
[dbo].[FactInternetSales].[SalesOrderLineNumber],
[dbo].[FactInternetSales].[RevisionNumber],
[dbo].[FactInternetSales].[OrderQuantity],
[dbo].[FactInternetSales].[UnitPrice],
[dbo].[FactInternetSales].[ExtendedAmount],
[dbo].[FactInternetSales].[UnitPriceDiscountPct],
[dbo].[FactInternetSales].[DiscountAmount],
[dbo].[FactInternetSales].[ProductStandardCost],
[dbo].[FactInternetSales].[TotalProductCost],
[dbo].[FactInternetSales].[SalesAmount],
[dbo].[FactInternetSales].[TaxAmt],
[dbo].[FactInternetSales].[Freight],
[dbo].[FactInternetSales].[CarrierTrackingNumber],
[dbo].[FactInternetSales].[CustomerPONumber]
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey &lt;= '20011231'
</QueryDefinition>
</Source>
<StorageMode>Molap</StorageMode>
<ProcessingMode>Regular</ProcessingMode>
<ProactiveCaching>
<SilenceInterval>-PT1S</SilenceInterval>
<Latency>-PT1S</Latency>
<SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
<ForceRebuildInterval>-PT1S</ForceRebuildInterval>
<AggregationStorage>MolapOnly</AggregationStorage>
<Source xsi:type="ProactiveCachingInheritedBinding">
<NotificationTechnique>Server</NotificationTechnique>
</Source>
</ProactiveCaching>
<EstimatedRows>1013</EstimatedRows>
<AggregationDesignID>Internet Sales 1</AggregationDesignID>
</Partition>
</ObjectDefinition>
</Create>

Note that when defining an effective partition, specifying the source of the data it will hold is perhaps the most important part.

As a rule of thumb, your partitions should contain between five and 20 million rows of fact data. Furthermore, you should avoid partition files greater than 500 MB in size. Partition files are in your Analysis Services installation folder under data\database_name\cube_name\measure_group_name.
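
To see whether a candidate partitioning scheme stays within these row-count guidelines, you can count fact rows per period before defining the partitions. The query below is a minimal sketch against the AdventureWorks fact table used in the example above; it assumes OrderDateKey uses the YYYYMMDD integer format.

-- Count fact rows per calendar year and compare the totals against the
-- 5-20 million rows per partition guideline (AdventureWorks sample schema).
SELECT OrderDateKey / 10000 AS CalendarYear,
       COUNT(*)             AS FactRows
FROM [dbo].[FactInternetSales]
GROUP BY OrderDateKey / 10000
ORDER BY CalendarYear;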

You can also bind a partition to a table, view or a SQL query. If a relational data warehouse has multiple tables holding fact data, you should bind partitions to such tables as long as each table size is constrained as advised above. If you have a single, large fact table, you could write SQL queries for each Analysis Services partition to retrieve only part of this data.

Views provide a nice alternative for partition binding, especially when testing cubes. For example, if a fact table has millions of rows, processing is going to take a long time. For testing the solution, you don’t necessarily need to load all the data. Instead, create a view that selects only a portion of rows from the large table(s).
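
As a sketch of that approach (assuming the AdventureWorks fact table and a hypothetical view name), you could bind development partitions to a view like this and swap the binding back to the full table, query, or view at deployment time.

-- Hypothetical development view returning only a slice of the fact data,
-- so cube processing during testing stays fast.
CREATE VIEW [dbo].[vFactInternetSales_Dev]
AS
SELECT *
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey BETWEEN 20010701 AND 20011231; -- six months of data only
GO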

Later, when you’re ready to deploy your solution into production, alter your partition definitions so that they are bound to the appropriate tables, queries or views.

How do you decide what data to include in each partition? SQL Server Analysis Services uses partitions to speed up MDX queries. Each partition contains an XML file that defines the range of dimension member identifiers in a given partition. When an MDX query is submitted, the Analysis Services engine decides what partition files to scan based on the values in each partition’s XML file.

The XML file is created when you process the partition and the file can be found in each partition folder (the file name is info.xml). Don’t try to edit this file – the dimension key references are internal values that cannot be retrieved from SQL Server Analysis Services.

If the data requested by an MDX query is spread across all partitions in your measure group then Analysis Services has no choice but to read every single partition. To see how every partition in the measure group is read, record a SQL Profiler trace when you run such a query. If the requested data is contained in a single, small partition, your query will only have to scan a single partition file.

Reading a single 500 MB file will invariably beat scanning through 200 files of the same size. However, if you have 200 partitions to read, Analysis Services could scan some of them in parallel, and the query won’t necessarily be 200 times slower without proper partitioning.

For best MDX query performance, you should tailor partition design to the pattern of common queries. Most SQL Server Analysis Services solutions start with measure groups partitioned using a date or periodicity dimension, each partition spanning one month’s or one day’s data.

This is a reasonable approach if your queries are typically focused on a given month or several months. But what if your queries examine data across all months and are specific to product categories? In that case, partitioning only by month won’t be optimal.

If you have 10 years’ worth of data partitioned by month — which is not unusual — each query would have to examine 120 partitions. In this case, query performance could improve if you further partition data by the product category dimension.

For example, dealership sales cube users may only be interested in comparing sales across time for a particular product category – cars, trucks or motorcycles, for example. For such cubes, you could create partitions for each month and each product category.

Like any other SQL Server Analysis Services object, partitions have a multitude of properties. Perhaps one of the most frequently discussed is partition slice. This property defines a portion of the measure group data that Analysis Services expects to be exposed by the partition.

Most Analysis Services literature suggests that this property does not have to be set for partitions that use Multidimensional OLAP (MOLAP) storage. While in most situations Analysis Services is smart enough to figure out what dimension members are included in each partition by examining data IDs in info.xml files, to be safe you should always set the partition slice, regardless of the storage mode of your partition.

Partition slices are defined through MDX. This is an example of what a slice definition would look like for a 2001 partition:

<Slice>[Date].[Calendar].[Calendar Year].&amp;[2001]</Slice>

To further partition data by product categories, a slice definition would look like this:

<Slice>([Date].[Calendar].[Calendar Year].&amp;[2001], [Product].[Product Categories].[Category].&amp;[1])</Slice>

If you don’t specify a slice for a given dimension, SQL Server Analysis Services assumes that any member of that dimension can be found in the partition.

For example, say you specify a month and product category in partition slice but do not specify the store. Queries that examine sales data by store, but do not include any filters for date or product, may have to search through every partition.

You can also customize the storage mode for each partition. MOLAP storage mode is optimal for data retrieval — but it copies your relational data. If you prefer to leave your data in the relational format without making its copy, then you can use Relational OLAP (ROLAP) mode for infrequently accessed partitions. For example, most recent partitions can utilize MOLAP storage while historical partitions can use ROLAP.

SQL Server Analysis Services has an upper limit on the number of partitions — 2^31-1 = 2,147,483,647 — but cubes that have this many partitions are rare. Don’t be afraid to create as many partitions as needed.

Occasionally, partitions may need to be merged. For example, if a majority of your queries focus on recent data and historical queries are infrequent, you may have a separate partition for each product line and for each day for 30 or 60 days.

Once the data is stale and seldom accessed, you could combine historical partitions into weekly or monthly partitions. To merge partitions using SSMS, right-click on a partition and choose the Merge Partitions option.

This is what the XMLA for merging 2001 and 2002 partitions looks like:

<MergePartitions xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<Sources>
<Source>
<DatabaseID>Adventure Works DW 2008</DatabaseID>
<CubeID>Adventure Works</CubeID>
<MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
<PartitionID>Internet_Sales_2002</PartitionID>
</Source>
</Sources>
<Target>
<DatabaseID>Adventure Works DW 2008</DatabaseID>
<CubeID>Adventure Works</CubeID>
<MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
<PartitionID>Internet_Sales_2001</PartitionID>
</Target>
</MergePartitions>

Be aware that you can copy aggregation design from one partition to another, which I will discuss in more detail in a future article. For now know that if you’re happy with your current aggregation design, you can assign it to a newly created or an existing partition.

If a partition has 500,000 or more estimated rows (you can set estimated numbers of rows in BIDS) and you haven’t defined any aggregations for this partition, then BIDS 2008 warns that your application performance can be improved by adding aggregations.

Partitions reduce the time it takes to process your measure group because each partition only loads a portion of the entire fact table or view. Remember that during processing, SQL Server Analysis Services modifies the SQL query defining the partition before sending it over to the relational data source.

For example, earlier I showed the definition for Internet Sales 2001 partition within the Internet Sales measure group of an Adventure Works 2008 database.

The query that Analysis Services sends to SQL Server while processing this partition is considerably different from the original query:

SELECT

[dbo_FactInternetSales].[dbo_FactInternetSalesSalesAmount0_0] AS[dbo_FactInternetSalesSalesAmount0_0],
[dbo_FactInternetSales].[dbo_FactInternetSalesOrderQuantity0_1] AS[dbo_FactInternetSalesOrderQuantity0_1],
[dbo_FactInternetSales].[dbo_FactInternetSalesExtendedAmount0_2] AS[dbo_FactInternetSalesExtendedAmount0_2],
[dbo_FactInternetSales].[dbo_FactInternetSalesTaxAmt0_3] AS[dbo_FactInternetSalesTaxAmt0_3],
[dbo_FactInternetSales].[dbo_FactInternetSalesFreight0_4] AS[dbo_FactInternetSalesFreight0_4],
[dbo_FactInternetSales].[dbo_FactInternetSalesUnitPrice0_5] AS[dbo_FactInternetSalesUnitPrice0_5],
[dbo_FactInternetSales].[dbo_FactInternetSalesTotalProductCost0_6] AS[dbo_FactInternetSalesTotalProductCost0_6],
[dbo_FactInternetSales].[dbo_FactInternetSalesProductStandardCost0_7] AS[dbo_FactInternetSalesProductStandardCost0_7],
[dbo_FactInternetSales].[dbo_FactInternetSales0_8] AS[dbo_FactInternetSales0_8],
[dbo_FactInternetSales].[dbo_FactInternetSalesPromotionKey0_9] AS[dbo_FactInternetSalesPromotionKey0_9],
[dbo_FactInternetSales].[dbo_FactInternetSalesSalesTerritoryKey0_10] AS[dbo_FactInternetSalesSalesTerritoryKey0_10],
[dbo_FactInternetSales].[dbo_FactInternetSalesProductKey0_11] AS[dbo_FactInternetSalesProductKey0_11],
[dbo_FactInternetSales].[dbo_FactInternetSalesCustomerKey0_12] AS[dbo_FactInternetSalesCustomerKey0_12],
[dbo_FactInternetSales].[dbo_FactInternetSalesCurrencyKey0_13] AS[dbo_FactInternetSalesCurrencyKey0_13],
[dbo_FactInternetSales].[dbo_FactInternetSalesOrderDateKey0_14] AS[dbo_FactInternetSalesOrderDateKey0_14],
[dbo_FactInternetSales].[dbo_FactInternetSalesShipDateKey0_15] AS[dbo_FactInternetSalesShipDateKey0_15],
[dbo_FactInternetSales].[dbo_FactInternetSalesDueDateKey0_16] AS[dbo_FactInternetSalesDueDateKey0_16]
FROM
(
SELECT
[SalesAmount] AS [dbo_FactInternetSalesSalesAmount0_0],
[OrderQuantity] AS [dbo_FactInternetSalesOrderQuantity0_1],
[ExtendedAmount] AS [dbo_FactInternetSalesExtendedAmount0_2],
[TaxAmt] AS [dbo_FactInternetSalesTaxAmt0_3],
[Freight] AS [dbo_FactInternetSalesFreight0_4],
[UnitPrice] AS [dbo_FactInternetSalesUnitPrice0_5],
[TotalProductCost] AS [dbo_FactInternetSalesTotalProductCost0_6],
[ProductStandardCost] AS[dbo_FactInternetSalesProductStandardCost0_7],
1     AS [dbo_FactInternetSales0_8],
[PromotionKey] AS [dbo_FactInternetSalesPromotionKey0_9],
[SalesTerritoryKey] AS [dbo_FactInternetSalesSalesTerritoryKey0_10],
[ProductKey] AS [dbo_FactInternetSalesProductKey0_11],
[CustomerKey] AS [dbo_FactInternetSalesCustomerKey0_12],
[CurrencyKey] AS [dbo_FactInternetSalesCurrencyKey0_13],
[OrderDateKey] AS [dbo_FactInternetSalesOrderDateKey0_14],
[ShipDateKey] AS [dbo_FactInternetSalesShipDateKey0_15],
[DueDateKey] AS [dbo_FactInternetSalesDueDateKey0_16]
FROM
(
SELECT
[dbo].[FactInternetSales].[ProductKey],
[dbo].[FactInternetSales].[OrderDateKey],
[dbo].[FactInternetSales].[DueDateKey],
[dbo].[FactInternetSales].[ShipDateKey],
[dbo].[FactInternetSales].[CustomerKey],
[dbo].[FactInternetSales].[PromotionKey],
[dbo].[FactInternetSales].[CurrencyKey],
[dbo].[FactInternetSales].[SalesTerritoryKey],
[dbo].[FactInternetSales].[SalesOrderNumber],
[dbo].[FactInternetSales].[SalesOrderLineNumber],
[dbo].[FactInternetSales].[RevisionNumber],
[dbo].[FactInternetSales].[OrderQuantity],
[dbo].[FactInternetSales].[UnitPrice],
[dbo].[FactInternetSales].[ExtendedAmount],
[dbo].[FactInternetSales].[UnitPriceDiscountPct],
[dbo].[FactInternetSales].[DiscountAmount],
[dbo].[FactInternetSales].[ProductStandardCost],
[dbo].[FactInternetSales].[TotalProductCost],
[dbo].[FactInternetSales].[SalesAmount],
[dbo].[FactInternetSales].[TaxAmt],
[dbo].[FactInternetSales].[Freight],
[dbo].[FactInternetSales].[CarrierTrackingNumber],
[dbo].[FactInternetSales].[CustomerPONumber]
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey <= '20011231'
) AS [FactInternetSales]
)
AS [dbo_FactInternetSales]

Why should you care what query is sent to SQL Server (or another RDBMS) during partition processing? Because query hints or SET options that are valid in a regular SQL statement might not be supported in a partition query definition.

For example, BIDS would allow us to append the SET NOCOUNT ON statement to the beginning of the partition query. If we add this option, however, SQL Server Analysis Services will report a syntax error and fail processing.

You can customize the partition’s processing mode, which defines whether aggregations are calculated during partition processing or by a lazy aggregation thread after processing is complete.

Lastly, you can use the storage location property to store data in the default or an alternate location. This property may come in handy if the disk where partition data is normally stored reaches its storage capacity.


How to Use Policy Based Management to Implement Database Settings

Posted by Alin D on March 16, 2011

Introduction

Database administrators have always had a tough time ensuring that all the SQL Servers they administer are configured according to the organization’s policies and standards. Using SQL Server’s Policy Based Management feature, DBAs can now manage one or more instances of SQL Server 2008 and check for policy compliance issues. In this article we will use the Policy Based Management feature of SQL Server (also known as the Declarative Management Framework, or DMF) to implement and verify database settings on all production databases.

It is a best practice to enforce the settings below on each production database. However, it can be tedious to go through every database and check whether these settings are in place. In this article I will explain how to use the Policy Based Management feature of SQL Server 2008 to create a policy that verifies these settings on all databases and, in cases of non-compliance, brings them back into compliance.

Database settings to enforce on each user database (a quick T-SQL cross-check follows the list):

  • Auto Close and Auto Shrink properties of the database set to False
  • Auto Create Statistics and Auto Update Statistics set to True
  • Compatibility Level of all user databases set to 100
  • Page Verify set to CHECKSUM
  • Recovery Model of all user databases set to Full
  • Restrict Access set to MULTI_USER
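
If you want a quick cross-check outside of Policy Based Management, the sketch below reads the same settings straight from sys.databases and lists user databases that break any of them. It is only a spot check against the conditions above, not a replacement for the policy.

-- List user databases that violate any of the settings listed above.
SELECT name,
       is_auto_close_on,
       is_auto_shrink_on,
       is_auto_create_stats_on,
       is_auto_update_stats_on,
       compatibility_level,
       page_verify_option_desc,
       recovery_model_desc,
       user_access_desc
FROM sys.databases
WHERE database_id > 4 -- skip master, tempdb, model and msdb
  AND (is_auto_close_on = 1
       OR is_auto_shrink_on = 1
       OR is_auto_create_stats_on = 0
       OR is_auto_update_stats_on = 0
       OR compatibility_level <> 100
       OR page_verify_option_desc <> 'CHECKSUM'
       OR recovery_model_desc <> 'FULL'
       OR user_access_desc <> 'MULTI_USER');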

Configure a Policy to Verify Database Settings

1. Connect to SQL Server 2008 Instance using SQL Server Management Studio

2. In the Object Explorer, Click on Management > Policy Management and you will be able to see Policies, Conditions & Facets as child nodes

3. Right-click Policies and then select New Policy… from the drop-down list, as shown in the snippet below, to open the Create New Policy popup window.

4. In the Create New Policy popup window you need to provide the name of the policy as “Implementing and Verify Database Settings for Production Databases” and then click the drop down list under Check Condition. As highlighted in the snippet below click on the New Condition… option to open up the Create New Condition window.

5. In the Create New Condition popup window you need to provide the name of the condition as “Verify and Change Database Settings”. In the Facet drop-down list, choose the Database Options facet as shown in the snippet below. Under Expression, select the Field value @AutoClose, then choose the Operator value ‘=’ and finally choose the Value False. Now that you have successfully added the first field, you can go ahead and add the rest of the fields as shown in the snippet below.

Once you have successfully added all the fields of the Database Options facet shown above, click OK to save the changes and return to the parent Create New Policy – Implementing and Verify Database Settings for Production Databases window, where you will see that the newly created condition “Verify and Change Database Settings” is selected by default.

6. The next option within the Create New Policy parent window is Against targets; by default, the Every Database option will be checked, as shown in the snippet below.

7. Since we want to perform this check only against online user databases, choose the Online User Database option under Against targets, as shown in the snippet below.


Some of the other targets that are available to you are mentioned below for reference:

a) Every Database – the policy will target all databases, including system databases
b) Online User Database – the policy will target only user databases that are online
c) User or Model – the policy will target only user databases or the model database
d) Read-only – the policy will target only read-only databases
e) Is Master – the policy will target only the master database
f) System Databases Not Including Master – the policy will target all system databases except master
g) Databases Not in SQL Server 2008 Compatibility Level – the policy will target only those databases that are not at the SQL Server 2008 compatibility level

8. Next, choose an appropriate execution mode for this policy. SQL Server 2008 supports three types of execution modes:

  • On demand – the policy runs only when you run it from the Evaluate dialog box.
  • On schedule – periodically evaluates the policy, records a log entry for policies that are out of compliance, and creates a report.
  • On change: log only – when changes are attempted, this mode does not prevent out-of-compliance changes, but it logs policy violations.

In this example we will choose the On Schedule evaluation mode and then click the Pick button to associate this policy with an existing schedule. Click New to open the New Job Schedule window.

9. In the New Job Schedule popup window choose the settings as shown in the snippet below and click OK to save the New Job Schedule and to return to parent Create New Policy Window.

10. Finally, click OK in the Create New Policy window to create the policy.

Evaluate the Policy

Once you have successfully created Implementing and Verify Database Settings for Production Databases Policy then the next step will be to evaluate the policy.

1. Connect to SQL Server 2008  using SQL Server Management Studio

2. In the Object Explorer, expand Management, expand Policy Management and then expand Policies. Right-click the newly created Implementing and Verify Database Settings for Production Databases policy and then select the Evaluate option from the context menu, as shown below.

 

 

3. In the below snippet you can see that the policy has failed for most of the databases in my local instance of SQL Server 2008.

4. You can check why the Policy has failed by clicking the View… link under details for a particular database. In the below snippet you can see that for the AdventureWorks database the policy has failed because of four issues.

  1. The expected value for @AutoShrink was False; however, the actual value turned out to be True
  2. The expected value for @AutoUpdateStatisticsEnabled was True; however, the actual value turned out to be False
  3. The expected value for @PageVerify was Checksum; however, the actual value turned out to be None
  4. The expected value for @RecoveryModel was Full; however, the actual value turned out to be Simple

Click Close to exit the Results Detailed View and to return to Evaluate Policies window.

5. In the Evaluate Policies window, select the check box against the target database and click the Apply button to bring it into compliance with the policy.

6. When you click the Apply button it will pop up a Policy Evaluation Warning as shown in the snippet below. Click Yes to apply the policy against the database.

7. Once the policy has modified all the selected targets which do not comply, you will be able to see the below screen indicating that all the targets are in compliance now. Click the Close button to exit.

SQL Server 2008 stores all the policies in the msdb database, so the DBA will need to make sure that the msdb database is backed up immediately whenever a policy or a condition is added or changed.
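
A minimal sketch of that msdb backup, assuming a backup folder that exists on your server:

-- Back up msdb right after adding or changing a policy or condition;
-- the path below is a placeholder.
BACKUP DATABASE msdb
TO DISK = N'D:\Backups\msdb_after_policy_change.bak'
WITH INIT, CHECKSUM;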

Conclusion

In this article you have seen how easily a database administrator can check for policy compliance and non-compliance issues across all production databases by leveraging the Policy Based Management feature of SQL Server 2008. This is an excellent feature that helps database administrators enforce organization-level policies across SQL Server 2008 database environments.


SSRS Reporting Services & Architecture

Posted by Alin D on December 15, 2010

SQL Server Reporting Services (SSRS) is a reporting system based on SQL Server. It provides a set of tools and services that enable us to create, manage, and deliver reports for the entire organization. It is a Microsoft product first released as an add-on for SQL Server 2000.

The SSRS architecture is intentionally not presented at the start of this article. Before looking at each component of the architecture, I prefer to take a practical approach; the architecture is explained at the end of the article.

Create SSRS Report

First of all, go to MS SQL Server 2005 => SQL Server Business Intelligence Development Studio.


Move to File=>New=>Project=>Report Server Project


After creating the new report project, we get two folders in the solution:

Shared Data Sources => here we set database credentials.

Reports => here we add report files.


SSRS Data Source

Right-click on the Shared Data Sources folder and add a data source. The following window will open, where we need to provide the database server details and the database name.

To confirm that the defined database connects successfully, click on ‘Test Connection’.


After successfully adding the data source, we can add a report to the Reports folder. Right-click the Reports folder and choose Add – New – Report. A window opens; select the Report item and provide the report name ‘Header_Report’.


SSRS Report Design

After adding the new report, we see that the report has three sections on different tabs:

  1. Data – here we put the SQL query or stored procedure that fetches the data we have to show on the report.
  2. Layout – this is the design section, where we format the report by dragging tables, rectangles, lines, etc. from the Toolbox, and drag data fields onto the report from the Datasets panel.
  3. Preview – this panel shows how the report will display to the end user.

All three sections are circled in the image below.


SSRS Toolbar

Before moving on to designing the report, we can briefly go through the report items available in the Toolbox.

  1. Textbox: adds any custom text to the report.
  2. Line: draws a line on the report.
  3. Table: creates a table with rows and columns, a header, and a footer. We can format the table according to our requirements.
  4. Image: adds an image to the report.
  5. Chart: lets us add different types of charts to the report.
  6. Subreport: lets us embed one report inside another, such as placing a header or footer report on a report.

Toolbox items shown in below image:-


Now I dragged three textboxes onto the report and set their text to ‘Dhania Hospital’, ‘Health is Wealth’, and ‘Bhiwani Haryana 127021’ respectively.


When previewed on the Preview tab, the report appears as shown below.


After finishing Header_Report, we now create a new report, AdmittedPatientList.


After adding the new report, move to the Data section of the report and select <New Dataset>. A new Dataset window opens. Here, choose the command type (stored procedure or text query) used to fetch data from the database; this report uses the stored procedure USP_GET_INPATIENT_REPORT.
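
The article does not show the procedure itself, so the following is a purely hypothetical sketch of what USP_GET_INPATIENT_REPORT might look like; the table and column names are assumptions based on the parameters and fields referenced later in the article.

-- Hypothetical sketch only: table and column names are assumed.
CREATE PROCEDURE [dbo].[USP_GET_INPATIENT_REPORT]
    @FROM_DATE    DATETIME,
    @TO_DATE      DATETIME,
    @IS_DISCHARGE INT = -1
AS
BEGIN
    SET NOCOUNT ON;
    SELECT PATIENT_ID, NAME, DOB, ADDRESS, ADMIT_DATE
    FROM dbo.TBL_PATIENT_REGISTER
    WHERE ADMIT_DATE BETWEEN @FROM_DATE AND @TO_DATE
      AND (@IS_DISCHARGE = -1 OR IS_DISCHARGE = @IS_DISCHARGE);
END
GO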


After adding the dataset, click the Run button to execute the command and retrieve data. The Define Query Parameters window opens, where we need to pass values for the procedure parameters (@FROM_DATE, @TO_DATE, etc.).


After the command executes, the data is shown in the lower panel and the Report Datasets pane is populated with the data fields returned by the USP_GET_INPATIENT_REPORT procedure.


After adding the dataset to the report, we move on to designing it. Go to the Layout section of the report and drag a table onto it. The table has three sections:

  1. Header: here we put the data that needs to appear in the report header. We can have more than one header row by clicking on the header row and adding a new row.
  2. Details: this is the part of the table where we drag data fields from the Datasets panel.
  3. Footer: here we add the items we need to show in the report footer.


SSRS Subreport

In this report I am adding a Subreport item to add a header to the report. The Header_Report subreport that we created previously is chosen in the Subreport property.


Now we add two more rows in the header section of the report by right-clicking on the leftmost header column.


The report designer provides an Expression window to help the developer use different formulas, functions, and operations. We can directly drag any data field, such as patient name or address, from the Datasets window, or just right-click on any cell and select the Expression option.

The Expression window is shown in the image below.


By default the table has only three columns, so we can add more columns as required by right-clicking a column header and choosing to add a column to the left or right of the selected column.

We can merge a number of columns to give a field more space. For example, in our current report we merge all the columns in the header section to hold the text ‘List of in patients from X Date to Y Date’.

A new row is added in the header section to hold the column names such as S.No, Patient Name, etc. To format the text of a cell, right-click the cell and select Properties. In the Properties window we can set the font size, font type, etc.


Row formatting: to format a row, open its properties window by selecting the row, right-clicking it, and choosing the Properties option. In the Properties window we can set the border, font, background color, text alignment, padding, etc.

Now we drag data fields such as patient name and address into the Details section of the table, just below their corresponding headers. This is shown in the image below.


Now the report is ready to use. Move to the Preview panel of the report and pass the required report parameters (FROM_DATE, TO_DATE, etc.).


Publish SSRS Report on Report Server

After creating the report, we need to publish it to the report server so it is available to the end user.

To publish the report to the server, we need to set the report server details. Go to the project properties in Solution Explorer, as shown in the image below.


In the Properties window, set the ‘TargetServerURL’ field with the report server (i.e. NSARORA) and the report virtual folder (i.e. ReportServer$NSARORA).

The ‘TargetReportFolder’ field contains the folder name on the report server where published reports are saved. This is shown in detail in the image below.


After configuring the report server settings, we can deploy the reports to the server.


As deployment of the report starts, the Output window shows its progress.


SSRS Report available to End User

After publishing the report to the server, it is available to users. There are two ways to expose the report to the user:

  1. Report Manager
  2. SSRS Report in ASP.NET Application

Report Manager

Report Manager is a web interface that allows access to reports published on the report server. Report Manager can be accessed in a browser by entering the report server path, i.e. HTTP://NSARORA/ReportServer$NSARORA?SSRS_Demo_Project


After accessing Report Manager, we can navigate to the AdmittedPatientList report shown in the list in the image above.

We need to provide the FROM_DATE and TO_DATE parameters to access the list of patients admitted to the hospital.

The other way to reach the published reports is through MS SQL Server Management Studio.

Connecting to Reporting Services is shown in the image below.

After connecting to the report server, we can see the folder where we published our reports.

We published our reports in the ‘SSRS_Demo_Project’ folder, which contains the AdmittedPatientList report, as shown below.

To open a report in Report Manager, right-click the report and select ‘View Report’.

To access the report in Report Manager, you must first authenticate:

After successfully authenticating, we can view the report, shown below:

SSRS Report in ASP.NET Application:

The first way to provide an SSRS report to the end user is Report Manager, covered above. However, Report Manager is usually used by system administrators, so we need to create a custom ASP.NET application that will be available to users. Here we create an ASP.NET application and integrate the SSRS report into an aspx web page.

ReportViewer Control

First of all, create an ASP.NET application in VS 2005 and add a ReportViewer control to the aspx page. We included two textboxes for passing the FROM_DATE/TO_DATE parameters to the report. The page design is shown in the image below.

The design code is shown below:

ASPX[Design page]

<form id="form1" runat="server">
<div>
<table width="100%" class="lab1" align="center">
<tr>
<td align="right">From</td>
<td align="left">
&nbsp;<asp:TextBox ID="txtFromDate" runat="server"></asp:TextBox></td>
</tr>
<tr>
<td align="right">To</td>
<td align="left">
&nbsp;<asp:TextBox ID="txtToDate" runat="server"></asp:TextBox></td>
</tr>
<tr>
<td></td>
<td align="left">
<asp:Button ID="btnSubmit" runat="server" Text="Submit" CssClass="Button" />
<asp:Button ID="btnReset" runat="server" Text="Reset" CssClass="Button" /></td>
</tr>
<tr>
<td colspan="3" bordercolor="green">&nbsp;<rsweb:ReportViewer ID="ReportViewer1" BorderWidth="10" BorderColor="AliceBlue" ProcessingMode="Remote" Visible="true" runat="server" Width="688px">
</rsweb:ReportViewer>
</td>
</tr>
</table>
</div>
</form>

Report Parameter in SSRS

On the Submit button click, we pass the server (i.e. NSARORA), the report server (i.e. Reportserver$nsarora), and the path of the report in its publishing folder (/SSRS_Demo_Project/AdmittedPatientList).

Here SSRS_Demo_Project is the folder where we publish our reports on the report server. We create ReportParameter objects to pass parameters to the ReportViewer control.

ASPX.VB[ Code behind Page]

Partial Class _Default
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    End Sub

    Protected Sub btnSubmit_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnSubmit.Click
        Try
            ReportViewer1.ShowDocumentMapButton = False
            ReportViewer1.DocumentMapCollapsed = True
            ReportViewer1.ShowParameterPrompts = False
            ReportViewer1.ShowBackButton = True
            Dim s As System.UriBuilder = Nothing
            s = New System.UriBuilder
            s.Host = "NSARORA"
            s.Path = "Reportserver$nsarora"
            ReportViewer1.ServerReport.ReportServerUrl = s.Uri
            ReportViewer1.ServerReport.ReportPath = "/SSRS_Demo_Project/AdmittedPatientList"
            Dim PARAM1 As New Microsoft.Reporting.WebForms.ReportParameter("IS_DISCHARGE", "-1")
            Dim PARAM2 As New Microsoft.Reporting.WebForms.ReportParameter("FROM_DATE", txtFromDate.Text.Trim())
            Dim PARAM3 As New Microsoft.Reporting.WebForms.ReportParameter("TO_DATE", txtToDate.Text.Trim())
            Dim P As Microsoft.Reporting.WebForms.ReportParameter() = {PARAM1, PARAM2, PARAM3}
            ReportViewer1.ServerReport.SetParameters(P)
        Catch ex As Exception
        End Try
    End Sub
End Class

SSRS Report on ASP.NET Application

Report Builder

Up to now we have studied how to develop an SSRS report, publish it, and make it available to the end user via Report Manager or the custom application we created in the previous section.

Microsoft also gives end users the ability to create reports of their own. This is where Report Builder comes in: it provides an environment in which end users can build reports themselves.

Why do we need to give end users the option to create their own reports? The reason is that users often need to analyze data according to their own requirements.

Report Builder does not give the end user full access to the SQL Server database. It exposes only those tables required by the user; other tables are not shown.

To provide this limited access to the database, Report Builder uses a report model.

When we create a Report Model Project, we get Data Sources, Data Source Views, and Report Models folders. Next we will use each of these folders:

  1. Data Source
  2. Data Source View
  3. Report Model

First of all, we need to give the details of the data source from which we will fetch data.

The database server and database are selected and the connection is tested here.

After configuring the data source in the Report Model project, we need to define a Data Source View. In the Data Source View we choose which tables we want to expose to the end user.

Right-click the Data Source Views folder and choose Add New Data Source View.

The data source (dsDataSource) that we created previously is chosen here.

After choosing the data source, we select the tables to be made available to the end user.

In this application we chose just one table (TBL_PATIENT_REGISTER).

After adding the Data Source View, we create a report model by right-clicking the Report Models folder.

In the report model we choose the Data Source View we just defined.

Once the report model is complete, all the tables and their corresponding columns are shown in the project.

Our Report Model project is now complete, so we need to deploy it to the server. Change the project properties as shown below.

Here nsarora is our server and ReportServer$nsarora is the report server.

Now go to Solution Explorer, right-click the project name, and choose Deploy:

We have now successfully deployed our report model.

Now it is time to hand the ability to create reports over to the end user. This is where Report Builder comes in.

Report Builder is a tool provided to end users so they can create their own reports.

To access Report Builder, we use the following general link: http://servername/reportserver/ReportBuilder/ReportBuilder.application

Type this link in a browser to access Report Builder.

Authentication details must be provided to access the report server from Report Builder.

After successfully connecting to the report server, Report Builder will open. It will contain the report model we created in the previous section.

The different sections of Report Builder are shown in the image below:

Now we create a report in Report Builder step by step. Drag the Patient_ID, Name, DOB, and Address fields onto the drag-and-drop columns section.

After selecting the fields for the report, we need to set the filter criteria that determine which records are shown. For the filter settings we use the Filter Data window.

We can also apply sorting criteria to the report by using the Sort window.

Now that we have created a report in Report Builder, we need to run it. Click the Run Report button in the Report Builder menu.

Report Server Database

Deployed reports, data sources, report models, and other SSRS objects are all stored in the report server database.

To connect to the report server database, go to MS SQL Server Management Studio and connect to Reporting Services.

Enter the authentication details to connect to the report database.

After connecting, we see the Data Sources folder, the folders containing the deployed reports, and the Models folder.

The image below shows the details of these report server folders.

Data Sources folder – contains the dsDataSource data source that we created in the Report Model project.

Models folder – contains the ‘SSRS Demo Report Model’ file from the Report Model project.

Deployed reports folder – the ‘SSRS_Demo_Project’ folder contains all the reports we created in this article and deployed to the server.

SSRS Architecture

The SSRS architecture includes the different tools and services involved in creating reports, deploying them, and making them available to the end user.

A block diagram of the architecture is shown below.

Following are the components of the SSRS architecture:

1. Reporting Services

This is the execution engine of SSRS, which runs all the services involved in reporting. These services include:

  1. Authentication Service – authenticates access to reports and to any tools that work with the report server.
  2. Deployment Service – deploys (publishes) reports to the server.
  3. Data Processing – processes the data that needs to be shown in the report.
  4. Rendering Service – when a report is requested, returns the response as an HTML stream.

2. Data Sources

SQL Server databases, Excel sheets, or XML files may be the source of data for reporting.

3. Report Designer

This is the tool a developer uses to create reports; we already used it to design the reports above.

4. Report Server Database

Deployed reports, data sources, report models, and other SSRS objects are stored in the report server database.

5. Report Builder

This is the tool provided to end users so they can develop reports themselves.

6. Report Manager

This is a web interface that allows access to reports deployed on the report server.

This has been a brief description of the SSRS architecture; we have already gone through each of its components in the sections above.


Restoring SQL Server 2008 R2 Backup file to SQL Server 2008

Posted by Alin D on November 26, 2010

Recently I had to restore a SQL Server 2008 R2 database to a database on another machine, and I ended up getting the message:

“The Database was backed up on a server running version 10.50.1600. That version is incompatible with this server, which is running 10.00.1600.”

Then, when exploring the cause, I found that the database I had taken the backup from was a SQL Server 2008 R2 database.

The same backup file was used to restore on another machine, and interestingly, that machine had SQL Server 2008.

Version 10.50 is SQL Server 2008 R2, whereas 10.00 is SQL Server 2008.

Also, the same SQL Server Management Studio 2008 was used to access both server instances (2008 and 2008 R2).

It was a bit confusing for me, since I was able to access SQL Server 2008 R2 Express from SQL Server Management Studio 2008 but was unable to restore the backup to SQL Server 2008 Express. I then realised that the on-disk format differs between the versions, so a newer backup cannot be restored to an older version; restoring a SQL Server 2008 database to SQL Server 2008 R2, however, is possible. :)

I also had the option to generate the CREATE SQL scripts and execute them in SQL Server 2008 and 2005, and that worked fine too. :)
To find the version of Microsoft SQL Server 2008, connect to SQL Server 2008 using SQL Server Management Studio and execute the query:

SELECT SERVERPROPERTY('productversion'), SERVERPROPERTY ('productlevel'), SERVERPROPERTY ('edition')
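
You can also check which engine version produced a backup file before attempting a restore. RESTORE HEADERONLY reads only the backup header; the file path below is a placeholder.

-- Read the backup header without restoring anything, then inspect the
-- SoftwareVersionMajor / SoftwareVersionMinor / SoftwareVersionBuild columns
-- (10.50.x = SQL Server 2008 R2, 10.0.x = SQL Server 2008).
RESTORE HEADERONLY
FROM DISK = N'D:\Backups\MyDatabase.bak';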

Here’s a nice reference of the list of SQL Server versions


Mistakes When Building and Maintaining a Database

Posted by Alin D on October 28, 2010

Building and maintaining a SQL Server database environment takes a lot of work. There are many things to consider when you are designing, supporting and troubleshooting your environment. This article identifies a top ten list of mistakes, or things that are sometimes overlooked, when supporting a database environment.

#1 Inappropriate Sizing of DB

If you are new to managing SQL Server databases, it is easy to fall into the trap of using SQL Server Management Studio to create your database. We all do it. When using SQL Server Management Studio you have many options you can set, or you can take the defaults. The mistake many DBAs make when sizing a database is to take the default sizing options. There are two sizing options you need to consider, initial size and autogrowth.

The initial size is how much space the database will take up when it is created. The default is 3 MB for the data file and 1 MB for the log file. These could be appropriate sizes for your database. However, if your database is expected to have 300,000 or millions of rows of data shortly after being created, then a larger initial size for data might be appropriate. You need to determine how much space you will need initially and the growth rate of your database to set the initial size and growth rate appropriately for your data file. Sizing the log file should also be considered and should be based on how much change occurs in your database. If you are performing lots of inserts and updates, then a larger log file will be needed, versus a database that changes very little.
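
As a hedged illustration, here is how you might create a database with explicit initial sizes and growth increments instead of taking the defaults; the database name, file paths and sizes are placeholders to adjust to your own estimates.

CREATE DATABASE SalesDB
ON PRIMARY
(
    NAME = SalesDB_Data,
    FILENAME = N'D:\SQLData\SalesDB_Data.mdf',
    SIZE = 10GB,       -- initial size based on expected row counts
    FILEGROWTH = 1GB   -- fixed growth increment instead of the default
)
LOG ON
(
    NAME = SalesDB_Log,
    FILENAME = N'E:\SQLLogs\SalesDB_Log.ldf',
    SIZE = 2GB,
    FILEGROWTH = 512MB
);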

#2 No Database Backup Plan

The worst thing you can do is not have any backups of your databases. The next worst thing you can do is not test whether you can restore your databases from backup. As a DBA, if you cannot recover your corporate data, then you had better get your resume out on the streets, because you will most likely need to be looking for a new job.

There are three different backup types: full, differential and transaction log. Which ones to use, and how often they should be run, depends on your backup requirements. Work with your customers to define their recovery point objective and then, at a minimum, make sure you have database backups to meet their requirements.

You should periodically test restoring databases from these backups just to make sure they work. Ideally, you would perform some disaster recovery testing in conjunction with your customers at a secondary location. However, at a minimum you might want to test the recovery process on some isolated test box.

Lastly, you might want to consider an alternative storage location for your backup files. Do not just leave them on the server where the databases exist. If that server melts down all your backups are gone. Make sure you copy your backups to an alternative storage location. That alternative location could be another machine, tape, or some external storage device. You might also want to consider offsite storage for these copies of your database backups.
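
A minimal sketch of a full and log backup pair written to a network share, using placeholder database and share names; how often you run each depends on the recovery point objective you agree with your customers.

-- Full backup to an alternative storage location (UNC path is a placeholder).
BACKUP DATABASE SalesDB
TO DISK = N'\\BackupServer\SQLBackups\SalesDB_Full.bak'
WITH INIT, CHECKSUM;

-- Transaction log backup (requires the full or bulk-logged recovery model).
BACKUP LOG SalesDB
TO DISK = N'\\BackupServer\SQLBackups\SalesDB_Log.trn'
WITH CHECKSUM;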

#3 Poor Security Model

Your best line of defense against security breaches is to have a good security model. Do not use the SA account for anything. When you install SQL Server set the SA account password to something complicated, write it down, store it in a secure location and never use it again, except in an emergency. SQL Server has two different security models you can use, SQL Server and Windows Authentication. SQL Server is the least secure. Use these different authentication methods appropriately in your environment.

When you provide access to a database and/or database objects, you should only give people the kind of access they need to do their job. Do not give developers DBO access in production unless it is their job to maintain a database. Only allow people to have update rights to tables if they need that kind of access to do their job.

You also want to develop a security model that is easy to maintain. Consider using Windows Groups and Database Roles to help organize your security rules. Using groups and roles allows you to provide similar access to people by simply dropping them into a Windows Group or Database role. Organizing access rights using group and roles can make security administration much easier.
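
As a sketch of that approach, assuming a hypothetical Windows group CORP\SalesAppUsers and the placeholder SalesDB database, you could grant access through a database role rather than to individual logins:

-- Map a Windows group to a login and database user, then grant rights
-- through a role (group and database names are assumptions).
CREATE LOGIN [CORP\SalesAppUsers] FROM WINDOWS;
GO
USE SalesDB;
GO
CREATE USER [CORP\SalesAppUsers] FOR LOGIN [CORP\SalesAppUsers];
CREATE ROLE SalesAppRole;
EXEC sp_addrolemember 'SalesAppRole', 'CORP\SalesAppUsers';
-- Grant only what the application needs, e.g. execute rights on its schema.
GRANT EXECUTE ON SCHEMA::dbo TO SalesAppRole;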

#4 Use of Adhoc Queries

Do not allow applications to submit ad hoc queries. When you have ad hoc queries, you have to provide permissions on tables and views in order for those queries to run. This means that users will need SELECT, INSERT, UPDATE, and DELETE permissions on your database tables in order to run these ad hoc queries. This kind of access allows individuals to write their own code against the database, circumventing the application that is supposed to access and update the database.

It is better to encapsulate your application code into stored procedures. This will allow you to only give users EXECUTE permissions. Doing this means users will not be able to access tables directly outside the applications using SELECT, INSERT, UPDATE, and DELETE statements. Additionally, stored procedures can provide better performance through compiling and caching the execution plans.

#5 No Data Integrity Rules

Do not rely on your application to enforce all data integrity rules. A SQL Server database can be designed to enforce some data integrity rules. Take advantage of these rules when you build your database.

If your application requires a given table to have unique records, then make sure you put a unique constraint on that table. In addition, if your business rules say that a parent record needs to exist prior to creating a child record, then make sure you create a foreign key relationship from the child table to the primary key of the parent table. You can also provide additional data integrity by providing default values and check constraints to make sure your data matches your application business rules.
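
A small sketch of letting the database enforce these rules, using hypothetical Customers and Orders tables with a unique constraint, a foreign key from the child table to the parent's primary key, a default value, and a check constraint:

CREATE TABLE dbo.Customers
(
    CustomerID   INT IDENTITY PRIMARY KEY,
    CustomerName VARCHAR(100) NOT NULL
);

CREATE TABLE dbo.Orders
(
    OrderID     INT IDENTITY PRIMARY KEY,
    OrderNumber VARCHAR(20) NOT NULL
        CONSTRAINT UQ_Orders_OrderNumber UNIQUE,                               -- unique record rule
    CustomerID  INT NOT NULL
        CONSTRAINT FK_Orders_Customers REFERENCES dbo.Customers (CustomerID),  -- parent must exist
    OrderDate   DATETIME NOT NULL
        CONSTRAINT DF_Orders_OrderDate DEFAULT (GETDATE()),                    -- default value
    Quantity    INT NOT NULL
        CONSTRAINT CK_Orders_Quantity CHECK (Quantity > 0)                     -- business rule
);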

#6 No Testing or Limited Testing

It is poor practice to put code into production without testing it. This goes for all changes, even a small logic change. When you do not test your code, you run the risk of it not working, or more importantly introducing additional problems.

When building your code it is common practice to run it against a small development database. This provides faster testing turnaround times and uses fewer resources. The downside is that your application might not scale well. Therefore, prior to moving a new application into production, you should test it against a production-sized database. This will allow you to find the performance issues you might have related to large tables.

#7 Lack of Monitoring

What happens when you fill a glass of water and you do not watch the level of the water while it is filling? The glass fills up and overflows if you do not turn off the water in time. If you do not monitor your database system, you might also fill up your capacity. By not monitoring your environment, you run the risk of having poorly performing applications that eat up all your hardware capacity (CPU, I/O, and disk space).

By monitoring your application for poorly performing queries, you can identify performance opportunities. These opportunities will allow you to re-write these poorly performing queries or add additional indexes to optimize them. By monitoring and tuning up your poorly performing queries, you reduce the CPU and I/O required by your application.

You do not want your databases to run out of disk space, just like you would not want your glass of water to overflow. To keep your disk drives from filling up you need to monitor them to ensure there is adequate disk space on your physical drives. Ideally, you should track the amount of disk space growth over time. By doing this you can predict when you will run out of disk space. By proactively managing the growth rate, you help minimize the possibility of running out of disk space.
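
A minimal sketch of the disk-space side of this monitoring: capture data and log file sizes on a schedule (for example into a history table) and compare the results over time to estimate the growth rate.

-- Current size of every data and log file on the instance, in MB.
SELECT GETDATE()             AS CapturedAt,
       DB_NAME(database_id)  AS DatabaseName,
       name                  AS LogicalFileName,
       type_desc             AS FileType,
       size * 8 / 1024       AS SizeMB
FROM sys.master_files
ORDER BY DatabaseName, FileType;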

#8 No Periodic Rebuild or Reorganization of Indexes

As databases grow, indexes are constantly being updated. This constant updating makes indexes grow and split, which leads to index fragmentation. In order to optimize your indexes you need to periodically rebuild or reorganize your indexes.

SQL Server exposes index fragmentation information. You should query this information to determine which indexes should be rebuilt or reorganized. By periodically rebuilding and reorganizing indexes, you can improve your application's performance.
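
A minimal sketch of that check using sys.dm_db_index_physical_stats follows; the 10 and 30 percent thresholds are common rules of thumb, not requirements, and the index and table names in the ALTER INDEX statements are illustrative:

-- List indexes in the current database with noticeable fragmentation.
SELECT OBJECT_NAME(ps.object_id)       AS TableName,
       i.name                          AS IndexName,
       ps.avg_fragmentation_in_percent AS FragmentationPct
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.avg_fragmentation_in_percent > 10
  AND i.name IS NOT NULL
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Reorganize lightly fragmented indexes; rebuild heavily fragmented ones.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REORGANIZE;  -- roughly 10-30% fragmented
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD;     -- above roughly 30% fragmented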

#9 No Indexes or Too Many Indexes

Performance of your application is important. Indexes help your application quickly find the data it needs, and fast access to the data gives your end users a well-performing application. However, if you have no indexes, or too many indexes, performance may suffer.

You need to monitor your database for missing indexes and for indexes that are not being used. SQL Server provides index usage and missing index information. If you find a missing index that could help a large number of statements, you should consider adding it. Monitoring can also identify indexes that are never used; these are the indexes you should consider removing. Be careful when removing indexes, because some indexes might be used only for monthly, quarterly, or yearly processing.
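
As a hedged sketch, the missing-index and index-usage information mentioned above can be queried through the following DMVs:

-- Potentially useful missing indexes, weighted by estimated impact.
SELECT d.statement AS TableName,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;

-- Nonclustered indexes with no reads since the last service restart
-- (verify against periodic processing before dropping anything).
SELECT OBJECT_NAME(i.object_id) AS TableName, i.name AS IndexName
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS u
    ON u.object_id = i.object_id AND u.index_id = i.index_id
   AND u.database_id = DB_ID()
WHERE i.type_desc = 'NONCLUSTERED'
  AND (u.index_id IS NULL OR u.user_seeks + u.user_scans + u.user_lookups = 0);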

#10 No Change Management Process

Maintaining a stable production environment is extremely important. Database changes need to be well thought out and planned. Having a change management process provides some structure around how changes are made.

You need to develop a change management process for your environment. Changes should be tested in an isolated non-production environment. Once you have thoroughly tested a change, you can plan the production implementation. When implementing changes in production, make sure you have a fallback plan in case the implementation does not go as expected. By having a change management process, you are able to document every change that goes into production and make sure it goes through the appropriate testing process. Doing this will help maintain a stable production environment.


5 Cool New Features in SQL Server 2008 R2

Posted by Alin D on October 27, 2010

It hardly seems possible, but SQL Server 2008 R2 is almost ready. Like most R2 releases, SQL Server 2008 R2 builds on the functionality of the base SQL Server 2008 release by improving scalability as well as BI features. Here are five of the most important features you should watch for in SQL Server 2008 R2:

5. Support for 256 logical processors
Organizations pushing the high end of scalability will want to take advantage of SQL Server 2008 R2 support for up to 256 logical processors (which requires Windows Server 2008 R2). This is a big improvement over the previous maximum of 64 processors.

4. Improvements in multi-server management
SQL Server Management Studio (SSMS) has always been great for managing SQL Server. However, SSMS was really for single server management and wasn’t strong in the area of multi-server management. New wizards, dashboards, and enhancements to Policy Based Management will boost SQL Server 2008 R2’s multi-server management capabilities.

3. Master Data Services
Almost all large organizations face the problem of multiple data definitions where the same data is defined and used differently in various locations. SQL Server 2008 R2’s new Master Data Services (MDS) feature provides a central portal from which administrators can create and update master data members and hierarchies, with the goal of ensuring data consistency across the enterprise.

2. Geospatial visualizations in Reporting Services
Building on the geospatial data types in the base SQL Server 2008 release, SQL Server 2008 R2’s Reporting Services will support reports with visual geographic mapping. New geospatial features include visualizations for mapping routine and custom shapes and support for Microsoft Virtual Earth tiles.

1. Self-service BI with Project Gemini
Project Gemini is the core feature of the SQL Server 2008 R2 release. Its goal is to enable organizations to more easily adopt and take advantage of SQL Server’s BI capabilities. It provides self-service BI through a powerful Excel add-in and integrates with SharePoint for centralized management. Project Gemini will require Excel 2010 and SharePoint 2010. For more about Gemini, check out Sheila Molnar’s interview of Microsoft’s Donald Farmer, InstantDoc ID 102613, in this issue of SQL Server Magazine.


Viewing and Modifying Data in SQL 2008

Posted by Alin D on October 15, 2010

A view is a database object that represents a saved SELECT statement and is also referred to as a virtual or logical table. Views can be queried in the same way as tables, and some types of views can be updated, too. Using views instead of tables can greatly simplify data access and decouple client applications from the underlying tables containing actual data. With the appropriate use of views, it is possible to completely change the schema of the database and redesign the tables without breaking any client applications. Think of views as an abstract interface between your physical database tables and the people or applications querying them.

Creating Views

SQL Server 2008 allows you to create views of the following types:

  • Standard view This view is based on one or more base tables. The view may include joins, filter restrictions (using the WHERE clause), and row count restrictions (using the TOP and ORDER BY clauses). You cannot use the ORDER BY clause in a view without specifying the TOP clause as well.
  • Updateable view A view that is based on a single underlying table can be updated directly. Executing INSERT, UPDATE, DELETE, and MERGE statements on this type of view will affect the data in the underlying table. You can also define an INSTEAD OF INSERT, INSTEAD OF UPDATE, and INSTEAD OF DELETE trigger on any view, which will perform a particular action when you attempt to insert, update, or delete data in the view.
  • Indexed view Sometimes it is valuable to create one or more indexes on a view in order to optimize the time it takes to query the view. Indexes can be created on views using standard CREATE INDEX syntax. Indexed views must be created with the SCHEMABINDING option (see the “Using the SCHEMABINDING Option to Lock in a View’s Underlying Schema” sidebar).
  • Partitioned view A partitioned view joins data that has been partitioned horizontally across multiple tables or partitions. For example, you might store orders from five years ago and earlier in one partition, orders created within the last five years in another partition, and orders created this year in yet another. A partitioned view joins all the partitions together into one Orders virtual table containing data from all three partitions.

To create a view, use the CREATE VIEW statement syntax shown in Example 1.

Example 1. CREATE VIEW Statement—Syntax
CREATE VIEW [schema_name].view_name [(column_names)]
[ WITH ENCRYPTION | SCHEMABINDING ]
AS select_statement
[ WITH CHECK OPTION ]


Specifying the column_names in a view definition allows you to assign names to computed columns or to rename columns produced by the SELECT statement. This is useful for calculated columns and columns that may have ambiguous names. If you don’t specify explicit column names, the view columns will inherit the same names as the columns in the SELECT statement.
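
For instance, a hedged sketch of a view that names a calculated column explicitly, based on the Stars table used in Example 2 below (the MassCategory logic is purely illustrative):

CREATE VIEW StarSummary (Name, MassCategory)
AS
SELECT StarName,
       CASE WHEN SolarMass >= 1 THEN 'At least one solar mass'
            ELSE 'Less than one solar mass' END
FROM dbo.Stars;
GO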

Specifying the WITH ENCRYPTION option encrypts the view definition. This also prevents the view from being used in replication.

Configuring & Implementing…: Using the SCHEMABINDING Option to Lock in a View’s Underlying Schema

Views are named SELECT statements and include one or more columns from one or more tables. What will happen if a column or table referenced by a view is dropped from the database? The view will become invalid and will return an error the next time it is queried. To lock the view into the schema objects on which it relies, add the WITH SCHEMABINDING option to your CREATE VIEW statement.

This option ensures that any table or column referenced by this view cannot be dropped or altered, until the view itself is dropped. This applies only to columns referenced by the view. You can freely add and remove columns from underlying tables, as long as they are not used in the view.

Only specify the SCHEMABINDING option when the view references tables from a single database. You must specify the SCHEMABINDING option if you wish to build indexes on the view you are creating.


Example 2 creates a view based on the Stars table using the SCHEMABINDING option. We then attempt to alter the underlying structure of the base table but receive an error. Figure 1 demonstrates how the same view can be created using the graphical view designer in SQL Server Management Studio.

Figure 1. Creating a View Using SQL Server Management Studio


Example 2. Working with Views
CREATE VIEW MyStarsView WITH SCHEMABINDING
AS SELECT StarName, StarType FROM dbo.Stars
WHERE SolarMass >=1;
GO
SELECT * FROM MyStarsView;
-- Results:
-- StarName   StarType
-- ---------- --------------------------------------------------
-- Deneb      White supergiant
-- Pollux     Orange Giant
-- Sun        Yellow dwarf
ALTER TABLE Stars
DROP COLUMN StarType;
GO
-- Results:
--Msg 5074, Level 16, State 1, Line 1
-- The object 'MyStarsView' is dependent on column 'StarType'.
-- Msg 5074, Level 16, State 1, Line 1
-- ALTER TABLE DROP COLUMN StarType failed because one or more objects
--  access this column.
-- This view is updateable, as it is based upon only one base table
UPDATE MyStarsView
SET StarType = 'White Supermassive Giant'
WHERE StarType = 'White supergiant'
GO
SELECT * FROM MyStarsView;
-- Results:
-- StarName   StarType
-- ---------- --------------------------------------------------
-- Deneb      White Supermassive Giant
-- Pollux     Orange Giant
-- Sun        Yellow dwarf



Creating Stored Procedures

Stored procedures are Transact-SQL statements that perform one or more actions and are saved in the database with a name. Stored procedures, widely used to encapsulate the logic of your database system, can accept parameters and return values. Unlike functions, stored procedures are allowed to modify data by executing DML statements. For example, you may write a stored procedure named AddCustomer that accepts CustomerName, EMailAddress, and PhoneNumber parameters. The logic within this stored procedure can check that the potential customer's details are valid, insert a new row into the Customers table using the parameter values supplied, and then return the CustomerID of the newly created customer.

To create a stored procedure, use the CREATE PROCEDURE statement syntax shown in Example 3. The CREATE PROCEDURE keywords can be shortened to CREATE PROC. To change the definition or options of a stored procedure, use the ALTER PROCEDURE or ALTER PROC statement.

Example 3. CREATE PROCEDURE Statement—Syntax
CREATE PROCEDURE [schema_name].stored_procedure_name[ ; procedure_number]
[@parameter1_name parameter1_data_type [=default_parameter_value]
[ OUT | OUTPUT] [READONLY]
[@parameter2_name parameter2_data_type...]
[ WITH ENCRYPTION | RECOMPILE | EXECUTE AS]
AS [BEGIN] transact_sql_statements [END]


Stored procedures can be grouped into logical named groups. Each procedure within a group will have a unique procedure_number, while the entire group can be referred to using the procedure_name. The entire procedure group can be dropped at once using the DROP PROCEDURE statement. To use a single procedure, you can omit the procedure_number. In this case procedure_name will always be used to refer to it.

Parameters are named variables passed into the procedure. Parameter names always start with an @, and a data type must be specified for each parameter. You can also use the default_parameter_value to assign a default value to a parameter if the procedure was called without this parameter being supplied. The most common use of procedure parameters is to pass values to the stored procedure, so that it can use these values within the Transact-SQL statements that comprise it. Sometimes you must return values to the caller of your stored procedure. To do so, mark each parameter you wish to return to the caller as OUTPUT or OUT (the two are equivalent). If the parameter is not to be updated within the stored procedure, you can specify it as READONLY.

Similar to defining views, specifying the WITH ENCRYPTION option encrypts the stored procedure definition. Specify the WITH RECOMPILE option to instruct the database engine never to cache the execution plan for this stored procedure. Instead, the optimal execution plan will be calculated every time the procedure is called. The EXECUTE AS option allows you to run the procedure as an alternative set of user credentials, different from those of the caller.

Example 4 creates and executes a stored procedure to add a new row into the Stars table. The procedure accepts parameters for the star name, star type, solar mass, and description; and returns the ID of the newly created star.

Example 4. Creating and Executing a Stored Procedure
CREATE PROC AddNewStar
@ID int OUT,
@StarName varchar(50),
@SolarMass decimal(10,2),
@StarType varchar(50),
@Description ntext = 'No description provided.'
AS
BEGIN
  DECLARE @NextStarID int
  SET @NextStarID = (SELECT MAX(StarID) FROM Stars)
  SET @NextStarID = @NextStarID + 1
  INSERT dbo.Stars(StarID, StarName, SolarMass, StarType, Description)
  VALUES(@NextStarID, @StarName, @SolarMass, @StarType, @Description)
  SET @ID = @NextStarID
END;
DECLARE @NewStarID int
EXECUTE AddNewStar @NewStarID OUT, 'Sigma Octantis', 5.6, 'Giant'
SELECT @NewStarID as NewStarID
SELECT * FROM Stars
-- Results:
-- (1 row(s) affected)
-- NewStarID
-- -----------
-- 4
-- (1 row(s) affected)
-- StarID   StarName      SolarMass   StarType           Description
-- ------- -------------- ----------- --------------     ------------
-- 3       Deneb          6.00        White Supermassive Deneb is the...
                                      Giant
-- 1       Pollux         1.86        Orange Giant       Pollux,also...
-- 4       Sigma Octantis 5.60        Giant              No description...
-- 2       Sun            1.00        Yellow dwarf       No description...
--(4 row(s) affected)



Creating Functions

Functions, like stored procedures, are saved Transact-SQL statements. Unlike stored procedures, functions cannot perform actions by executing DML statements. Functions always return a single value or a single table-valued expression. They are used by database developers to encapsulate and reuse calculations. For example, you may create a function to calculate the tax amount given a particular salary or to determine whether an e-mail address that has been provided is valid.

It is possible for a function to take no parameters, but often functions accept multiple input parameters and use the parameter values in the calculations which the particular function represents. Unlike stored procedures, functions do not support output parameters. The following types of functions are available within SQL Server 2008:

  • Scalar functions These functions return a single value of any data type.
  • Single statement table-valued functions These functions execute a single SELECT statement and return the result of this statement as a table-valued expression.
  • Multiple statement table-valued functions These functions return several table-valued expressions created by one or more SELECT statements.
  • Built-in functions SQL Server provides many built-in functions to perform common tasks. For example, the GETDATE() built-in function returns today’s date and time. The AVG() function returns the average value of the column.

You can use the CREATE FUNCTION statement to create new functions using the syntax shown in Example 5. You can use the ALTER FUNCTION statement to change the function’s definition.

Example 5. Create Function Statement—Syntax
CREATE FUNCTION [schema_name].function_name (
[@parameter1_name parameter1_data_type [=default_parameter_value],
[@parameter2_name parameter2_data_type...])
RETURNS data_type
AS
transact_sql_statements


Example 6 demonstrates how to create and use a scalar function; a brief table-valued sketch follows the example.

Example 6. Working with Functions
CREATE FUNCTION ConvertKilogramsToPounds
(@Kilograms decimal(18,2))
RETURNS decimal(18,2)
AS
BEGIN
DECLARE @Pounds decimal (18,2)
SET @Pounds = @Kilograms * 2.21
RETURN (@Pounds)
END
PRINT dbo.ConvertKilogramsToPounds(5)
-- Results:
-- 11.05
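
The listing above covers only the scalar case. A minimal sketch of a single-statement (inline) table-valued function, assuming the same Stars table used in the earlier view examples, might look like this:

CREATE FUNCTION dbo.StarsByType
(@StarType varchar(50))
RETURNS TABLE
AS
RETURN
(SELECT StarID, StarName, SolarMass
 FROM dbo.Stars
 WHERE StarType = @StarType);
GO
-- Query the function as if it were a table:
SELECT * FROM dbo.StarsByType('Orange Giant');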


Creating Triggers

Triggers are stored procedures that are bound to a table or view. They run when a DML statement is executed against that table or view. You can specify triggers as FOR UPDATE, FOR INSERT, and FOR DELETE; these triggers execute immediately after the corresponding INSERT, UPDATE, or DELETE operation. You can also create INSTEAD OF UPDATE, INSTEAD OF INSERT, and INSTEAD OF DELETE triggers, which execute in place of the triggering statement, without the data actually being inserted, updated, or deleted.

A trigger can query tables and views, execute DML statements, and include complex Transact-SQL logic. The trigger and DML statement that caused the trigger to fire occur within the context of a single transaction. It is possible to roll back INSERT, UPDATE, and DELETE statements from within a trigger. This is useful for complex data validation purposes. You can use triggers to manually cascade changes through related tables; to guard against malicious or incorrect INSERT, UPDATE, and DELETE operations; and to enforce other restrictions that are more complex than those defined by using CHECK constraints.

Exam Warning

Triggers should be used sparingly because they have severe performance implications. In addition, triggers can be difficult to maintain.


Unlike CHECK constraints, triggers can reference columns in other tables. For example, a trigger can use a SELECT statement from another table to compare to the inserted or updated data and to perform additional actions, such as modifying the data or displaying a user-defined error message. Triggers can evaluate the state of a table before and after a data modification and take actions based on that difference. Multiple triggers of the same type (INSERT, UPDATE, or DELETE) on a table allow multiple different actions to take place in response to the same modification statement. Triggers also allow the use of custom error messages.

Triggers can be specified as FOR, AFTER, or INSTEAD OF. The trigger action will fire during the DML statement, after the DML statement, or in place of the DML statement, respectively. Triggers can be specified for UPDATE, INSERT, DELETE, or any combination of these.

How do you know what data the user is attempting to insert, update, or delete within a trigger? The trigger can access special tables called INSERTED and DELETED. These virtual tables exist only while the trigger is executing. The INSERTED table contains the new values you are attempting to insert into the table, or new values of the row when you are attempting to update data. The DELETED table contains the row you are attempting to delete or old values of the row when you are attempting to update data. Make use of these tables by querying them to determine old and new values of the data being affected. To cancel the DML statement from within a trigger and roll it back, use the ROLLBACK TRANSACTION statement.
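
As a hedged illustration of the ROLLBACK technique (this trigger is not part of Example 7 below, and the rule it enforces is made up), a trigger could refuse all deletes from the Stars table like this:

CREATE TRIGGER PreventStarDelete
ON dbo.Stars
AFTER DELETE
AS
BEGIN
    -- Cancel the DELETE and undo it, together with anything else
    -- running in the same transaction.
    RAISERROR('Stars cannot be deleted.', 16, 1);
    ROLLBACK TRANSACTION;
END;
GO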

Example 7 demonstrates how to create triggers, and the effect they take after a DML statement is executed on the table to which the trigger is bound.

Example 7. Creating a Trigger on the Stars Table
CREATE TABLE StarHistory
(StarHistoryId int IDENTITY PRIMARY KEY, StarName varchar(50), OldType ntext,
NewType ntext, DateChanged DateTime);
GO
CREATE TRIGGER UpdateStarHistory
on dbo.Stars
AFTER INSERT, UPDATE
AS
BEGIN
  INSERT StarHistory (StarName, OldType, NewType, DateChanged)
  SELECT INSERTED.StarName, DELETED.StarType, INSERTED.StarType, GETDATE()
  FROM INSERTED LEFT JOIN DELETED on INSERTED.StarID = DELETED.StarID
END
GO

UPDATE Stars SET StarType = 'Burnt out' WHERE StarName = 'Sun';
GO
SELECT * FROM StarHistory
-- Results:
-- StarHistoryId  StarName     OldType       NewType     DateChanged
-- -------------  ----------   ----------    ----------- ---------------
-- 1              Sun          Yellow dwarf  Burnt out   2009-01-21
                                                         11:56:29.530




SQL Server Management Studio and sp_configure

Posted by Alin D on October 15, 2010

sp_configure provides far more configuration options than the Management Studio interface, and you will need to be familiar with it for the exam. Because of this, we’re going to focus on sp_configure and its options, highlighting where a particular option is also available in Management Studio.

To run sp_configure, simply type it into a query window and click Execute. In Figure 1 you can see the results of executing sp_configure on an instance with default values. To access the Server Properties in Management Studio, right-click on a registered server in the Object Explorer window and select Properties.

Figure 1. sp_configure Output With Default Values


Advanced Options

As you can see in Figure 1, when you run sp_configure you’ll get a list of 16 options. One of the first things you’ll typically do as a DBA is to expand that list to show all of the configuration options. You do this by enabling show advanced options using the following command:

sp_configure 'show advanced options', 1
go
reconfigure


This is the format for changing all of the options: you pass the option name in single quotes, followed by a comma and the value you want to set. When you change a configuration option, typically only the config_value changes, and you then need to run reconfigure to tell SQL Server to reload the values. Once reconfigure has executed, you will see the new value in both the config_value and run_value columns of the sp_configure output, indicating that SQL Server is using it. Some settings, however, require SQL Server to be restarted before they take effect.

When you run sp_configure again you’ll see an extended list of 68 options. This is a persisted instance-wide change, so all administrators will now see the advanced options.
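
If you just want to check a single option's configured and running values, a quick way (sketched here, assuming show advanced options is already on) is to pass only the option name to sp_configure, or to query the sys.configurations catalog view:

sp_configure 'max degree of parallelism'
go
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';
go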

We’ll now take a look at some of the most commonly changed options and requirements, highlighting where the change can also be made in Management Studio.

AWE

On a 32-bit system SQL Server is limited to using 2GB of memory. To use more than 2GB of memory, you need to enable the Address Windowing Extensions (AWE) option in SQL Server. AWE allows a 32-bit application that is built to support it to access as much memory as the underlying operating system supports; Windows Server 2008 Enterprise supports as much as 64GB of RAM. AWE has two valid values, 0 or 1.

To enable AWE to run:

sp_configure 'awe enabled', 1
go
reconfigure


This setting is only visible after enabling show advanced options. It requires a restart of SQL Server before it takes effect, and it can also be configured in SQL Server Management Studio. You should always set max server memory when using AWE, to prevent SQL Server from taking too much RAM away from Windows.

Setting the Maximum and Minimum Memory for SQL Server

By default, SQL Server will grow its memory usage to take advantage of as much memory on the server as possible. This can very often leave only 100 to 200 MB of memory free on a server, causing problems when Windows requires more memory for something and SQL Server is too slow to give some back.

Max Server Memory (MB)

Max server memory (MB) controls the maximum size of the buffer pool, which is the pool of memory that SQL Server uses for most of its requirements. Setting the maximum server memory is strongly recommended if you are using AWE or if you’re running a 64-bit version of SQL Server. To set max server memory (MB) to 6 GB run:

sp_configure 'max server memory (MB)', 6144
go
reconfigure


This setting is only visible with sp_configure after enabling show advanced options, but you can also change this setting in SQL Server Management Studio on the Memory page.

Min Server Memory (MB)

Min server memory works hand in hand with max server memory but isn't as important in as many scenarios. It provides a floor that SQL Server will attempt not to go below once its memory usage has reached that value.

For example, you set min server memory to 2GB. When SQL Server first starts it takes the minimum amount of memory required for it to start and then grows its memory usage as it needs to. Once it has grown to be more than 2GB, SQL Server will attempt to keep the buffer pool above that value. This setting is most useful when running multiple instances on a single server or if there are other memory-intensive applications running on the server. To set min server memory (MB) to 2GB run:

sp_configure 'min server memory (MB)', 2048
go
reconfigure


You can also change this setting in SQL Server Management Studio on the Memory page.

Head of the Class…: 32-bit, 64-bit, and Memory

32-bit (or x86 as it’s sometimes known) systems have a limitation of 4 GB of virtual address space, so if you have more than 4 GB of RAM in a server without AWE enabled, then SQL Server won’t be able use the extra memory. 64-bit (or x64 as the most common platform is known) has a limitation of 8TB of virtual address space, so you get access to more RAM without having to switch on AWE.

The Standard and Enterprise Editions of SQL Server both support as much RAM as the Windows version that they run on, and they come with licenses to run either the x86 or x64 version of SQL Server at no additional cost.


Maximum Degree of Parallelism

If SQL Server determines that a query that has been executed is expensive enough in terms of resource consumption, it may try to break down the work into several units and execute them on separate CPUs. This is called parallelism and is a very intensive operation, where SQL Server assumes that this query is important enough to run as quickly as possible at the expense of additional CPU usage.

By default SQL Server will use up to all of the available CPUs on a server for a parallel operation. This can cause problems with all the CPUs running at 100 percent for a period and slowing down other operations.

The max degree of parallelism option allows you to control how many CPUs can be used for parallel operations. A common best practice is to set this value to half the number of CPU cores on your server. For example, if you have four dual-core CPUs, you will see eight cores in Windows, so you should set max degree of parallelism to 4. You will not be tested on what the formula should be to calculate the optimal value, but you may be tested on what it does.

To set max degree of parallelism to 4 run:

sp_configure 'max degree of parallelism', 4
go
reconfigure


The max degree of parallelism is also known as MAXDOP and can be specified at the query level using the MAXDOP keyword, as well as the server level.

The default value of 0 means use all available processors. You can also change this setting in SQL Server Management Studio on the Advanced page.
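
At the query level the hint is supplied through the OPTION clause. A hedged sketch against a hypothetical Orders table:

SELECT CustomerID, SUM(OrderAmount) AS TotalSpent
FROM dbo.Orders
GROUP BY CustomerID
OPTION (MAXDOP 4);  -- limit this statement to four processors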

Security Certifications

SQL Server 2008 can be easily configured to support requirements to meet certain compliance standards. You should be aware of these and how to enable them for the exam.

C2 Auditing

This is a standard developed by the U.S. government that determines how system usage is audited. It is enabled by running:

sp_configure 'c2 audit mode', 1
go
reconfigure


This will start a SQL Trace that runs continually and stores the output in a trace file in the default data directory. Because of the stringent requirements set out by the C2 standard, if SQL Server can't write to that trace file (for example, because you run out of disk space), it will stop the SQL Server service; in other words, nothing happens unless it is audited.

You can also enable this setting in SQL Server Management Studio on the Security page.

Common Criteria Compliance

This is a security standard developed in Europe and adopted worldwide that supersedes the C2 standard. There are several Evaluation Assurance Levels (EAL) within the Common Criteria, and SQL Server 2008 is certified at EAL4+, which is the most widely adopted level.

It is enabled by running:

sp_configure 'common criteria compliance enabled', 1
go


Common Criteria Compliance requires a SQL Server Service restart to take effect, and it can also be enabled in SQL Server Management Studio on the Security page. To be fully EAL4+ compliant, you need to download and run a script from Microsoft.

New Features

You’ll notice a couple of new configuration options if you’re upgrading your skills from SQL Server 2005.

Backup Compression Default

SQL Server 2008 has a new feature that enables compressed backups, which saves time and storage. It is an Enterprise Edition-only feature that is switched off by default, so a backup will not be compressed unless you specifically ask for it to be. You can control this default behavior by configuring the backup compression default option with sp_configure. It is enabled by running:

sp_configure 'backup compression default', 1
go
reconfigure
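
Regardless of the server-wide default, an individual backup can request compression explicitly. A minimal sketch, with an illustrative database name and path:

BACKUP DATABASE AdventureWorks
TO DISK = N'D:\Backups\AdventureWorks.bak'
WITH COMPRESSION;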


The backup compression default can also be set in SQL Server Management Studio, on the Database Settings page of the Server Properties dialog.


Working with Tables, Constraints and Indexes in SQL Server 2008

Posted by Alin D on October 13, 2010

Tables store data. For each column in the table, you must select a built-in or a user-defined data type. Indexes are created on tables to maximize query performance. Constraints are associated with table columns and define the rules to which data in a particular column or columns must adhere. Constraints can also define relationships between tables, like the necessity to have a Customer entity associated with every Order entity. These are known as FOREIGN KEY constraints. In this section we’ll provide you with details about tables, constraints, and indexes.

Working with Tables and Views

Tables are the database objects that store data in a SQL Server database. Tables are structured as columns and rows, like a spreadsheet. The columns define the type and length of data they can store, and every table must have at least one column. Column names must be unique within a table; that is, you cannot have two columns named ProductName in the Product table. Tables store their underlying data within the .mdf and .ndf data files as pages and extents. Columns are sometimes associated with constraints, for example, PRIMARY KEY, UNIQUE, or DEFAULT. The types of constraints are explained later in this chapter. You can also mark columns with the following special attributes:

  • Identity columns Values for these columns are generated automatically in sequence by incrementing every time a row is added. Usually, values 1, 2, 3, n are used, but you can define your own seed (starting value) and increment value for the IDENTITY column.
  • Computed columns These columns do not store any data; instead, they define a formula that calculates the column value at query time.
  • Timestamp columns These columns are used as a mechanism for version-stamping table rows and tracking changes.
  • UniqueIdentifier columns These columns store globally unique identifiers (GUIDs). GUID values are used for replication and are guaranteed to be unique. They can be generated using the NEWID() built-in function, as shown in the sketch following this list.
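
A small sketch of the IDENTITY and UniqueIdentifier attributes described above; the AuditLog table and its seed and increment values are illustrative only:

CREATE TABLE dbo.AuditLog
(AuditLogID int IDENTITY(100, 5) PRIMARY KEY,  -- seed 100, increment 5
 RowGuid uniqueidentifier DEFAULT NEWID(),     -- GUID generated for each new row
 EventDescription varchar(200) NOT NULL);
GO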

When defining columns for a new or existing table, you can specify column nullability. A column is said to be nullable if it allows storing null (empty) values. You can choose to mark a column as not nullable; if anyone attempts to insert a NULL value into this column, an error will be raised and the INSERT operation will fail. To mark a column as nullable, use the NULL keyword when defining the column in the CREATE TABLE or ALTER TABLE statement. To mark a column as not nullable, use the NOT NULL keywords. By default, columns are nullable. Columns designated as primary keys cannot be nullable.

Creating Tables

Tables can be created and modified using the SQL Server Management Studio table designer or the CREATE TABLE and ALTER TABLE statements. To access the graphical table designer in SQL Server Management Studio, expand the database in which you wish to create the table in Object Explorer. Then, right-click Tables and click New Table. To modify an existing table, right-click it and then click Design. The table designer shows the columns that will be created for your table at the top, and the properties of the selected column in the Column Properties pane, usually located at the bottom of the screen. Figure 1 shows the use of SQL Server Management Studio to create a new table.

Figure 1. Using the SQL Server Management Studio Table Designer


To create a table using DDL, use the CREATE TABLE statement along with the syntax shown in Example 1.

Example 1. Create TABLE Statement—Syntax
CREATE TABLE [database_name].[schema_name].table_name
(column1_name data_type [NULL | NOT NULL] | [PRIMARY KEY] | [IDENTITY],
column2_name data_type [NULL | NOT NULL],
[<computed_column_definition>])


In this statement, the table_name is the name of the table you wish to create. When defining columns, you can specify whether or not they will allow NULL values. You can also state that a column will be designated as the PRIMARY KEY for the table and whether it will contain automatically incrementing values, known as IDENTITY columns. The computed_column_definition is the formula for a calculated column. When defining columns, you must designate a data type, like varchar or int, and in some cases a length.

Table 1 summarizes built-in data types that are available to you when you are defining columns.

Table 1. Built-In Data Types
Numeric      Character   Dates and Times   Other
Tinyint      Char        Datetime          Binary
Smallint     Nchar       Smalldatetime     Bit
Int          Varchar     Date              Cursor
Bigint       Nvarchar    Datetime2         Xml
Smallmoney   Text        Datetimeoffset    Varbinary
Money        Ntext       Time              Uniqueidentifier
Decimal                  Timestamp         Hierarchyid
Double                                     Rowversion
Float                                      Sql_variant
Real                                       Image


Some of the data types shown in the table also allow you to specify the length or precision of the data stored in the column you are creating. For example, a column of type char(1000) allows you to store up to 1,000 characters per row. A column of type decimal(10) allows you to store numbers of up to 10 digits in total with no digits after the decimal point (the default scale is 0), while decimal(10,5) allows you to store numbers of up to 10 digits with up to 5 digits to the right of the decimal point. Variable-length data types, like varchar, nvarchar, and varbinary, consume only the space that the data stored in the column takes up. Their fixed-length equivalents, char, nchar, and binary, consume a fixed amount of space regardless of the amount of actual data contained in the column. Data types prefixed with "n" (nvarchar and nchar) store Unicode text and can be used to store characters from multiple languages in one column.

Computed Columns

Computed columns are used when you need to store a calculated value in a table. As a best practice, you should never physically store a value in a table that can be derived from other data within the table. If you do so, you will be consuming disk space unnecessarily. You will also open yourself up to logic errors, when the base columns have been updated, but the column storing the derived value is not, and vice versa. SQL Server offers the computed column feature to overcome these challenges. Use computed columns when you don’t want to manually calculate a column’s value in every query and when you wish to simplify your query syntax.

A computed column is based on a formula that can reference the values of noncomputed columns in the same row of the same table. Scalar functions can also be used in the definition of a computed column. A computed column cannot reference data in other tables or contain a subquery.

By default, computed columns are virtual, and the values in them are not stored on disk. The values are calculated every time the computed column is queried. When it is more efficient to store the values of computed columns on disk, you can use the PERSISTED keyword to instruct SQL Server to physically store the column values. PERSISTED columns are updated when other columns they are based on are updated, or when new rows are inserted into the table. Only computed columns using deterministic functions can be marked as PERSISTED. A function is deterministic when it is guaranteed to return the same value, if you pass it the same parameters. For example, the AVG function is deterministic, while the GETDATE function is not. Whether or not the computed column is persisted, you can never write directly to it. Persisted computed columns can be indexed. This is useful if you need to search and sort by the values of this column.
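
A brief sketch of both the virtual and persisted forms, using a hypothetical OrderLines table (names and formula are illustrative only):

CREATE TABLE dbo.OrderLines
(OrderLineID int PRIMARY KEY,
 Quantity int NOT NULL,
 UnitPrice decimal(10,2) NOT NULL,
 -- Virtual computed column: calculated every time it is queried.
 LineTotal AS (Quantity * UnitPrice),
 -- Persisted computed column: stored on disk, kept up to date automatically,
 -- and therefore available for indexing.
 LineTotalPersisted AS (Quantity * UnitPrice) PERSISTED);
GO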

Creating User-Defined Data Types

Sometimes you need to create your own data types based on the built-in data types introduced earlier. Custom data types are also known as user-defined data types, and they are especially useful when you must store data with the same length or precision over and over again. For example, you can create a new user-defined data type to represent people's names. This user-defined data type can be based on nvarchar(50) and cannot contain nulls. It can then be bound to any column that is to contain people's names, keeping those columns consistent throughout the database. Create your user-defined data types in the Model system database so that they are automatically inherited by all new databases you create. User-defined data types are created using the CREATE TYPE statement. The syntax is shown in Example 2:

Example 2. Create TYPE Statement—Syntax
CREATE TYPE [schema_name. ]type_name
{FROM base_type([precision],[scale])
  [NULL | NOT NULL]
}


Example 3 shows the syntax used to create a user-defined data type named PersonName and to create a table that contains two columns of type PersonName.

Example 3. Using the Create TYPE Statement
CREATE TYPE PersonName
FROM varchar(50)
NOT NULL;
GO
CREATE TABLE TeamMembers
(MemberId int PRIMARY KEY,
MemberName PersonName,
ManagerName PersonName);
GO


Use the ALTER TYPE statement to change the definition of your user-defined types. The DROP TYPE statement should be used to remove the user-defined data types you no longer need in the database. You cannot remove user-defined types from the database while there are tables with columns based on these types. If you attempt to use the DROP TYPE statement to remove a data type that is in use, you will get an error message similar to: “Msg 3732, Level 16, State 1, Line 1. Cannot drop type ‘PersonName’ because it is being referenced by object ‘TeamMembers’. There may be other objects that reference this type.”

Working with Constraints

Constraints are data validation rules that are bound to a column or a set of columns in a table. Constraints can also be used to enforce a relationship between two entities represented as two tables. The available types of constraints are as follows:

  • CHECK constraints These constraints validate the integrity of data in a column by checking it against a valid comparison. For example, you can use a CHECK constraint to ensure that no one in your Employees table has a Birth Date earlier than 01/01/1880. You can also use a CHECK constraint to validate that an e-mail address is always at least seven characters long.
  • PRIMARY KEY constraints These constraints represent the unique identifier column that will enforce the uniqueness of each row. For example, you can designate the CustomerID column as the PRIMARY KEY of the Customers table. If you get two customers who have the same values in the Name column and other columns but represent different people, the PRIMARY KEY distinguishes between them. It is a best practice to always have a PRIMARY KEY in each table and to use surrogate PRIMARY KEYs that have no meaning to the application.
  • Unique constraints These constraints are similar to PRIMARY KEY constraints, except that you can have more than one unique constraint per table. For example, you can designate that the combination of FirstName, LastName, and TelephoneNumber is unique in the Customers table and that the EMailAddress column can only contain unique values.
  • FOREIGN KEY constraints These constraints enforce a relationship between two tables. For example, you can use a FOREIGN KEY constraint to specify that any row in the Orders table must have a corresponding row in the Customers table and that the tables are linked through the CustomerID column, which is included in both tables. Once this FOREIGN KEY constraint is enforced, you cannot delete a row from the Customers table that has related rows in the Orders table.
  • Default constraints Also known as “defaults,” the DEFAULT constraints specify a default value to be inserted into a column if no value is inserted. Defaults can be bound to a column that is defined as NULL or NOT NULL. An example of a default is to use the value “Not Applicable” for the ProductColor every time someone adds a product to the Products table without specifying a color.

When you attempt to insert, delete, or modify data in a table that will result in a constraint violation, the statement will roll back. DML statements, like INSERT, UPDATE, DELETE, or MERGE, always succeed or fail as a whole. For example, if you were inserting 1,000 records into a table, but one violated a PRIMARY KEY or UNIQUE constraint, all 1,000 rows would roll back and nothing would be inserted. If a DELETE statement violated a FOREIGN KEY constraint, even on one row, the entire DELETE statement would fail and nothing would be deleted. You will never receive a partial result set from a DML statement. Example 4 shows the syntax used for working with constraints.

Test Day Tip

Remember that DML statements commit as a whole or not at all. A constraint violation will cause the entire statement to fail and roll back.


Example 4. Working with Constraints
CREATE TABLE Stars
(StarID int PRIMARY KEY,
StarName varchar(50) Unique,
SolarMass decimal(10,2) CHECK(SolarMass > 0),
StarType varchar(50) DEFAULT 'Orange Giant');
GO
INSERT Stars (StarID, StarName, SolarMass)
VALUES (1, 'Pollux', 1.86);
INSERT Stars (StarID, StarName, SolarMass, StarType)
VALUES (2, 'Sun', 1, 'Yellow dwarf');
SELECT * FROM Stars
-- Results:
-- StarID      StarName   SolarMass  StarType
-- ----------- ---------- ---------- ----------
-- 1           Pollux     1.86       Orange Giant
-- 2           Sun        1.00       Yellow dwarf
INSERT Stars (StarID, StarName, SolarMass, StarType)
VALUES (2, 'Deneb', 6, 'White supergiant');
-- Results:
-- Msg 2627, Level 14, State 1, Line 1
-- Violation of PRIMARY KEY constraint 'PK__Stars__06ABC647542C7691'.
--  Cannot insert duplicate key in object 'dbo.Stars'.
-- The statement has been terminated.
INSERT Stars (StarID, StarName, SolarMass, StarType)
VALUES (3, 'Deneb', -6, 'White supergiant');
-- Results:
-- Msg 547, Level 16, State 0, Line 1
-- The INSERT statement conflicted with the CHECK constraint
--  "CK__Stars__SolarMass__58F12BAE". The conflict occurred in database
--  "AdventureWorks", table "dbo.Stars", column 'SolarMass'.
-- The statement has been terminated.

INSERT Stars (StarID, StarName, SolarMass, StarType)
VALUES (3, 'Deneb', 6, 'White supergiant');
SELECT * FROM Stars
-- Results:
-- StarID      StarName   SolarMass  StarType
-- ----------- ---------- ---------- ----------
-- 1           Pollux     1.86       Orange Giant
-- 2           Sun        1.00       Yellow dwarf
-- 3           Deneb      6.00       White supergiant
-- DROP TABLE Stars



Enforcing Referential Integrity through FOREIGN KEY Constraints

A FOREIGN KEY constraint creates a relationship between two tables, based on a value held in a column or multiple columns. One of the tables participating in the relationship contains a PRIMARY KEY used in the relationship. The value of the primary key column can only appear once in this table. You can also use a UNIQUE constraint instead of a PRIMARY KEY constraint to define a relationship. Sometimes the table holding the PRIMARY KEY is referred to as the parent table or the “one” in a one-to-many relationship. The second table also has a column that contains the same values as the PRIMARY KEY column in the parent table. In the second table these values can repeat many times. This table is referred to as the child table, or the “many” table in a one-to-many relationship.

Consider this simple example. The Customers table may contain columns CustomerID, CompanyName, and StreetAddress. The CustomerID column is the PRIMARY KEY column and contains unique values. The Orders table contains an OrderID column, a CustomerID column, and an OrderAmount column. The CustomerID column in the Orders table contains the unique identity of the customer who has placed the order. To look up which customer an order belongs to, look them up by their CustomerID. To find all orders for a particular customer, look them up by their CustomerID. The CustomerID column in the Orders table is known as a FOREIGN KEY because it is a key of a foreign entity, an entity that does not belong in the table.

By creating a FOREIGN KEY constraint between two tables, their relationship is formalized. The rules of the constraint are applied to the relationship. By default, you cannot delete a parent record, if there are related child records in the child table referenced by a FOREIGN KEY constraint. You can also explicitly specify an action to take when the parent record is deleted or the key value of the parent record is updated. To do this, use the ON UPDATE and ON DELETE optional clauses when creating a FOREIGN KEY constraint. The following actions are available:

  • NO ACTION This is the default action. No special action is taken, and if the FOREIGN KEY is violated, the statement rolls back.
  • CASCADE Propagate the update or delete action to child rows. If you delete a parent row that participates in a cascading relationship, all child rows will be deleted. If you change a key value of a parent row, the corresponding child rows will also change.
  • SET NULL Set the values of the FOREIGN KEY column to null for all related records.
  • SET DEFAULT Set the values of the FOREIGN KEY column to its default values for all related records.

Configuring & Implementing…: Null FOREIGN KEYs and Self-Referencing FOREIGN KEYs

Columns marked as FOREIGN KEYs can contain null values. However, this practice is not recommended because when a FOREIGN KEY consists of two or more columns and contains null values, the constraint cannot be verified, and the integrity of your data cannot be guaranteed.

It is also possible for a FOREIGN KEY constraint to reference columns in the same table. This is known as a self-reference, and when querying a table in this arrangement it is referred to as a self-join. An example of a self-reference is a Generations table containing names of people with the columns PersonID, PersonName, and MotherID. The mother is also a person stored in the Generations table, and therefore, you can create a FOREIGN KEY relationship from the MotherID (FOREIGN KEY column) referencing PersonID (PRIMARY KEY column).


FOREIGN KEY constraints are frequently used by queries to join the parent and child tables. For this reason, it is recommended that you create a nonclustered index on every FOREIGN KEY contained in a table.
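
As a quick sketch of that recommendation applied to the TeamMember table created in Example 5 below (the index name is illustrative), the index would be created as follows:

CREATE NONCLUSTERED INDEX IX_TeamMember_TeamID
ON dbo.TeamMember (TeamID);
GO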

Example 5 creates two tables and links them by a FOREIGN KEY constraint.

Example 5. Working with FOREIGN KEY Constraints
CREATE TABLE Team(
TeamID int PRIMARY KEY,
TeamName varchar(50));
GO
CREATE TABLE TeamMember(
TeamMemberID int PRIMARY KEY,
FullName varchar(100),
TeamID int CONSTRAINT FK_Team_TeamMember
FOREIGN KEY REFERENCES dbo.Team(TeamID));
GO
INSERT Team VALUES (1, 'Development'), (2, 'Testing'), (3, 'Management');
INSERT TeamMember VALUES (1, 'Valentine', 1), (2, 'Bryant', 1), (3, 'Shane', 1),
(4, 'Keith', 3)
SELECT Team.TeamID, TeamName, FullName FROM
Team LEFT JOIN TeamMember ON
Team.TeamID = TeamMember.TeamID ;
GO
-- Results:
-- TeamID      TeamName    FullName
-- ----------- ----------- -----------
-- 1           Development Valentine
-- 1           Development Bryant
-- 1           Development Shane
-- 2           Testing     NULL
-- 3           Management  Keith
DELETE FROM Team WHERE TeamID = 2;
GO
-- Results:
-- (1 row(s) affected)
DELETE FROM Team WHERE TeamID = 3;
GO
-- Results:
-- Msg 547, Level 16, State 0, Line 1

-- The DELETE statement conflicted with the REFERENCE constraint
--  "FK_Team_TeamMember". The conflict occurred in database "AdventureWorks",
--  table "dbo.TeamMember", column 'TeamID'.
-- The statement has been terminated.
ALTER TABLE TeamMember
DROP CONSTRAINT FK_Team_TeamMember;
GO
ALTER TABLE TeamMember
ADD CONSTRAINT FK_Team_TeamMember
FOREIGN KEY(TeamID) REFERENCES dbo.Team(TeamID)
ON DELETE CASCADE;
GO
DELETE FROM Team WHERE TeamID = 3;
GO
-- Results:
-- (1 row(s) affected)
DROP TABLE TeamMember;
GO
DROP TABLE Team;
GO



Head of the Class…: Using CHECK and NOCHECK options

When you add FOREIGN KEY or CHECK constraints to an existing table that already contains data, by default all data in the table is validated against the constraint. Depending on the amount of data in your table, this may take a long time. If you wish to instruct SQL Server not to validate data integrity, and just create the constraint, you can specify WITH NOCHECK. The constraint will apply to new and modified rows, but the existing rows will not be examined. Data integrity is not checked when reenabling a previously disabled constraint.

It is not recommended that you specify WITH NOCHECK as this may result in erratic behavior when updating existing data. For example, you may be updating a column that has nothing to do with the constraint, and you may receive constraint violation errors. Only use WITH NOCHECK in situations that explicitly require it.

You must understand that constraints defined WITH NOCHECK are not used by the query optimizer to create efficient query plans. To allow the query optimizer to use these constraints, you must revalidate them using the ALTER TABLE statement with the WITH CHECK CHECK CONSTRAINT ALL clause.
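
A minimal sketch of both steps, using an illustrative Orders table and constraint name:

-- Add the constraint without validating existing rows (fast, but untrusted).
ALTER TABLE dbo.Orders WITH NOCHECK
ADD CONSTRAINT CK_Orders_Amount CHECK (OrderAmount > 0);
GO
-- Later, revalidate the existing data so the optimizer can trust the constraint.
ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT CK_Orders_Amount;
GO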


Creating Indexes

An index is a lookup structure created on a table to optimize sort and query performance. Indexes are created on a particular column or columns and store the data values for those columns in order. When the raw underlying table data is stored in no particular order, the structure is referred to as a heap. The heap is composed of multiple pages, with each page containing multiple table rows. When the raw underlying data is stored in order, sorted by a column or columns, the structure is referred to as a clustered index. For example, if you have a table named Customer with a clustered index on the FullName column, the rows in this table will be stored in order, sorted by the full name. This means that when you are searching for a particular full name, the query optimizer component can execute the query more efficiently by performing an index lookup rather than a table scan. Only one clustered index is allowed per table; usually this is created on the column designated as the PRIMARY KEY.

You can also create additional nonclustered indexes on a table that is stored either as a heap or as a clustered index. A nonclustered index is a separate lookup structure that stores index values in order, and with each index value, it stores a pointer to the data page containing the row with this index value. Nonclustered indexes speed up data retrieval. It makes sense to create nonclustered indexes on all frequently searched fields in a table. The trade-off with indexes is write performance. Every time a new row is inserted, the index must also be updated. When writing data to a table with nonclustered indexes, sometimes the pages within the table have to be rearranged to make room for the new values. In addition, indexes are storage structures that take up disk space. Indexes are created using the CREATE INDEX statement. Example 6 shows the syntax for creating an index.

Example 6. CREATE INDEX Statement—Syntax
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
ON table_or_view ( column1 [ ASC | DESC ], column2, ...n)
[ INCLUDE (additional_column_name, ...n) ]
[ WHERE filter_clause]
[ WITH OPTIONS]


The CREATE INDEX statement creates a clustered or nonclustered index on a specified column or columns. You can choose to create the index as UNIQUE, which will enforce a UNIQUE constraint on the index columns. A filter_clause can be specified to create indexes only on a subset of data that meets specific criteria. This is useful for a very large table, where creating an index on all values of a particular column will be impractical. Table 2 summarizes index options that can be used with the CREATE INDEX statement.

Table 2. Index Options
Option Explanation
PAD_INDEX = ON | OFF When this option is ON, free space is allocated in each page of the index. Allows for new values to be inserted without rearranging a large amount of data. The amount of free space allocated is specified by the FILLFACTOR parameter. When this option is OFF, enough free space for one row is reserved in every page during index creation.
FILLFACTOR = fill factor percentage Specifies the percentage of each page that should be filled up with data. For example, a fill factor of 80 means 20% of each page will be left empty and available for new data. The fill factor is used only when you create or rebuild an index.
SORT_IN_TEMPDB = ON | OFF Specifies whether the data should be sorted in the tempdb database instead of the current database. This may give performance advantages if the tempdb database is stored on a different disk to the current database.
IGNORE_DUP_KEY = ON | OFF Specifies that duplication errors should be ignored when creating unique indexes.
STATISTICS_NORECOMPUTE = ON | OFF Specifies that optimization statistics should not be updated at this time.
DROP_EXISTING = ON | OFF Specifies that the existing index with the same name should be dropped and then be re-created. This equates to an index rebuild.
ONLINE = ON | OFF Specifies that the underlying table should remain online and accessible by users while the index is being built. This option is only available in SQL Server 2008 Enterprise or Developer edition.
ALLOW_ROW_LOCKS = ON | OFF Specifies whether locks should be held on each row, as necessary.
ALLOW_PAGE_LOCKS = ON | OFF Specifies whether locks should be held on each page, as necessary.
MAXDOP = max_degree_of_parallelism Specifies the maximum number of processors that are to be used during the rebuild operation.
DATA_COMPRESSION = NONE | ROW | PAGE Use data compression at row or page level of the index.
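
The WHERE filter_clause mentioned above produces a filtered index, new in SQL Server 2008. A hedged sketch, with illustrative table and column names:

-- Index only the rows that are still open, keeping the index small.
CREATE NONCLUSTERED INDEX IX_Orders_Open
ON dbo.Orders (OrderDate)
WHERE IsClosed = 0;
GO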


Example 7 re-creates the Stars table from the previous example with a nonclustered PRIMARY KEY and then creates a clustered index (by star name) and a nonclustered index (by star type) on it. Figure 2 shows how the Ix_Star_Name index can be created using the interface of SQL Server Management Studio.

Example 7. Working with Indexes
-- Create the table specifying that the PRIMARY KEY index is to be created
--  as nonclustered
CREATE TABLE Stars
(StarID int PRIMARY KEY NONCLUSTERED,
StarName varchar(50) Unique,
SolarMass decimal(10,2) CHECK(SolarMass > 0),
StarType varchar(50) DEFAULT 'Orange Giant');
GO
CREATE CLUSTERED INDEX Ix_Star_Name
ON Stars(StarName)
WITH (PAD_INDEX = ON,
FILLFACTOR = 70,
ONLINE = ON);
GO
CREATE NONCLUSTERED INDEX Ix_Star_Type
ON Stars (StarType)
WITH (PAD_INDEX = ON,
FILLFACTOR = 90);
GO


Figure 2. Creating an Index Using SQL Server Management Studio


When you create a PRIMARY KEY constraint, an index on the column(s) designated as the PRIMARY KEY is created automatically. This index is clustered by default, but that can be overridden by specifying the PRIMARY KEY NONCLUSTERED option. As a best practice, it is recommended that you accept the default of a clustered PRIMARY KEY column unless you have a specific reason to designate another column as the clustered index key. The automatically created index is usually named PK_TableName_<Unique Number> (for example, PK__Stars__06ABC6465F9E293D), but it can be renamed at any time. Similarly, the UNIQUE constraint on the StarName column of a newly created Stars table automatically produces an index with a name like UQ__Stars__A4B8A52A5CC1BC92.

Exam Warning

Remember that when you create a table, a unique index is automatically created on the column(s) designated as the PRIMARY KEY. If you wish to avoid the long rebuild time associated with a clustered index, or if you wish to create the clustered index on a column other than the PRIMARY KEY, you must explicitly specify the PRIMARY KEY NONCLUSTERED option. The PRIMARY KEY index will always be unique.


Working with Full-Text Indexes

Standard indexes work well with the simple WHERE clause of the SELECT statement. An index greatly reduces the time it takes to locate rows where the indexed column equals a certain value, or where it starts with a certain value. However, standard indexes are inadequate for more complex text-based queries. For example, creating an index on StarType will not help you find all rows where the StarType column contains the word “giant,” but not the word “supermassive”.
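
To illustrate the difference, here is a minimal sketch of the two kinds of predicate against the Stars table; only the first can seek on a standard index:

--A fixed prefix can use a standard index on StarType
SELECT StarName FROM Stars WHERE StarType LIKE 'Orange%';
--A leading wildcard forces a scan, and expressing "contains giant but not
--supermassive" with LIKE quickly becomes awkward and slow
SELECT StarName FROM Stars
WHERE StarType LIKE '%giant%' AND StarType NOT LIKE '%supermassive%';
GO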

To fulfill these types of queries, you must use full-text indexes. Full-text indexes are complex structures that consolidate the words used in a column and their relative weight and position, and link these words with the database page containing the actual data. Full-text indexes are built using a dedicated component of SQL Server 2008—the Full-Text Engine. In SQL Server 2005 and earlier, the Full-Text Engine was its own service, known as full-text search. In SQL Server 2008, the Full-Text Engine is part of the database engine (running as the SQL Server Service).

Full-text indexes can be stored on a separate filegroup. This can deliver performance improvements if that filegroup is hosted on a separate disk from the rest of the database. Only one full-text index can be created per table, and the table must have a single-column, unique, non-nullable index (the key index), which the full-text index uses to map search results back to rows. Full-text indexes can be created on columns of type char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary, and varbinary(max). You must specify a type column when creating a full-text index on an image, varbinary, or varbinary(max) column. The type column stores the file extension (.docx, .pdf, .xlsx) of the document stored in the indexed column.

Example 8 amends the Stars table to include a Description column and creates a full-text index on this column. The FREETEXT predicate allows us to search for any of the specified words using the full-text index. This yields a user experience similar to that of an Internet search engine.

Example 8 Creating and Using a Full-Text Index
ALTER TABLE Stars
ADD Description ntext DEFAULT 'No description specified' NOT NULL ;
GO
CREATE FULLTEXT CATALOG FullTextCatalog AS DEFAULT;
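--The KEY INDEX below is the auto-generated PRIMARY KEY index;
--its name will differ on your system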
CREATE FULLTEXT INDEX ON Stars (Description)
KEY INDEX PK__Stars__06ABC6465F9E293D;
GO
UPDATE Stars SET Description = 'Deneb is the brightest star in the
constellation Cygnus and one of the vertices of the Summer Triangle. It is
the 19th brightest star in the night sky, with an apparent magnitude of 1.25.
A white supergiant, Deneb is also one of the most luminous stars known. It
is, or has been, known by a number of other traditional names, including
Arided and Aridif, but today these are almost entirely forgotten. Courtesy
Wikipedia.'
WHERE StarName = 'Deneb';
UPDATE Stars SET Description = 'Pollux, also cataloged as Beta Geminorum,
is an orange giant star approximately 34 light-years away in the constellation
of Gemini (the Twins). Pollux is the brightest star in the constellation
(brighter than Castor (Alpha Geminorum). As of 2006, Pollux was confirmed to
have an extrasolar planet orbiting it. Courtesy Wikipedia.'
WHERE StarName = 'Pollux';
GO
SELECT StarName
FROM Stars
WHERE FREETEXT (Description, 'planet orbit, giant');
GO
-- Results:
-- StarName
-- --------------------------------------------------
-- Pollux
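
FREETEXT matches any of the supplied words. For the stricter “giant but not supermassive” style of query mentioned earlier, the CONTAINS predicate accepts Boolean operators; here is a minimal sketch against the full-text index created in Example 8:

SELECT StarName
FROM Stars
WHERE CONTAINS (Description, '"giant" AND NOT "supermassive"');
GO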



Partitioning Data

When working with large databases, query performance often becomes an issue even if your indexing strategy is spot-on. If you have decided that indexing alone cannot produce the desired result, your next step can be data partitioning. Data partitioning separates a database into multiple filegroups, each containing one or more files. These filegroups are placed on different disks, enabling parallel read and write operations and thus significantly improving performance. Approach a partitioning strategy by separating different tables and indexes into different filegroups and placing them on separate disks. As a guide, place large, frequently accessed tables that are joined through a FOREIGN KEY relationship in separate filegroups, so that they can be scanned in parallel when the join is performed.
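
As a minimal sketch of filegroup placement, assuming an illustrative StarCatalog database, a local file path that would need adjusting, and a hypothetical ObservationLog table:

--Add a filegroup backed by a file on a separate disk
ALTER DATABASE StarCatalog ADD FILEGROUP ArchiveData;
GO
ALTER DATABASE StarCatalog
ADD FILE (NAME = 'ArchiveData1', FILENAME = 'E:\SQLData\ArchiveData1.ndf', SIZE = 100MB)
TO FILEGROUP ArchiveData;
GO
--Create a large, frequently scanned table on the new filegroup
CREATE TABLE ObservationLog
(ObservationID int PRIMARY KEY,
StarID int,
ObservedOn datetime)
ON ArchiveData;
GO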

If filegroup partitioning alone does not achieve the desired performance, the cause is usually a single very large table. You can employ a horizontal or vertical partitioning technique to split such a table into multiple smaller tables. Queries that access the table will run more quickly, and maintenance tasks, such as backups and index rebuilds, will also perform better.

Horizontal Partitioning

Horizontal partitioning splits a table into several smaller tables by separating out clusters of rows, based on a partitioning function. The structure of the smaller tables will remain the same as the structure of the initial table, but the smaller tables will contain fewer rows. For example, if you have a very large table that has 100 million rows, you can partition it into 10 tables containing 10 million rows each. Date columns are often a good choice for horizontal partitioning. For example, a table could be partitioned historically by year—each year stored in a smaller table. Thus, if a query requires data for specific dates, only one smaller table needs to be scanned.

Analyze the data, and how your users access it, to derive the best horizontal partitioning strategy. Aim to partition the tables so that the majority of queries can be satisfied from as few of the smaller tables as possible. Recombining the smaller tables requires UNION queries, and these can degrade performance, as the sketch below shows.
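
Here is a minimal sketch of that approach, using illustrative per-year observation tables and a view that stitches them back together:

--Each year's rows live in a separate, identically structured table
CREATE TABLE Observations2009
(ObservationID int PRIMARY KEY, StarID int, ObservedOn datetime,
CHECK (ObservedOn >= '20090101' AND ObservedOn < '20100101'));
CREATE TABLE Observations2010
(ObservationID int PRIMARY KEY, StarID int, ObservedOn datetime,
CHECK (ObservedOn >= '20100101' AND ObservedOn < '20110101'));
GO
--Queries that need every year go through a UNION ALL view;
--queries for a single year hit only that year's smaller table
CREATE VIEW AllObservations AS
SELECT ObservationID, StarID, ObservedOn FROM Observations2009
UNION ALL
SELECT ObservationID, StarID, ObservedOn FROM Observations2010;
GO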

Vertical Partitioning

Unlike horizontal partitioning, vertical partitioning separates different columns of a single table into multiple tables. The resultant smaller tables have the same number of rows as the initial table, but the structure is different. Two types of vertical partitioning are available:

  • Normalization Normalization is the process of applying logical database design techniques to reduce data duplication. This is achieved mainly by identifying logical relationships within your data and implementing multiple tables related by FOREIGN KEY constraints.
  • Row splitting This technique separates some columns from a larger table into another table or tables. Essentially, each logical row in a table partitioned using row splitting is stored across two tables. To maintain integrity between the tables, use a FOREIGN KEY constraint in which both the referencing and referenced columns are unique; this creates a one-to-one relationship.

If implemented correctly, vertical partitioning reduces the time it takes to scan data. Use row splitting to separate frequently used and rarely accessed columns into separate tables and eliminate the overhead of reading wide rows, as the sketch below illustrates. The drawback of vertical partitioning is the processing time and resources required to perform the joins when the split columns are needed together.
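
A minimal sketch of row splitting, assuming the wide Description column were moved out of Stars into an illustrative StarDetails table linked one-to-one by StarID:

--Rarely accessed, wide columns move to a second table;
--making StarID both the PRIMARY KEY and a FOREIGN KEY keeps the relationship one-to-one
CREATE TABLE StarDetails
(StarID int PRIMARY KEY REFERENCES Stars(StarID),
Description ntext,
DiscoveryNotes ntext);
GO
--The full logical row is reassembled only when the wide columns are needed
SELECT s.StarName, d.Description
FROM Stars AS s
JOIN StarDetails AS d ON d.StarID = s.StarID;
GO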
