Windows Management and Scripting

A wealth of tutorials on Windows operating systems, SQL Server and Azure

Posts Tagged ‘RAID’

NTFS File System explained

Posted by Alin D on August 31, 2011

Introduction

NTFS is Microsoft’s file system for Windows server and desktop operating systems. This short Windows NTFS tutorial provides information and links breaking down how it works, with details on NTFS vs FAT32, NTFS recovery techniques and management best practices. You’ll also find resources dealing with NTFS compression, permissions and optimization.

NTFS explanations

Windows NTFS, or NT File System, is the standard file system of Microsoft operating systems. Before Windows NT, the file system on which Microsoft operating systems were installed was FAT (File Allocation Table).

FAT was designed to act as a map for all files stored on a hard disk. FAT went through several incarnations before the creation of NTFS, from FAT16 to FAT32. Because FAT32 was limited to volumes of 32 GB, many users today take advantage of NTFS for file system management.

NTFS has many advantages over FAT32, such as:

  • Access control lists (ACLs) that increase folder security and allow administrators to control who has access to a specific file or folder.
  • Information about a file's clusters and other data is stored with each cluster, not just in a governing table.
  • Data security on removable and fixed disks.

In simple terms, the difference between FAT and NTFS is that while the FAT file system has had a number of modifications made to allow it to work with larger hard drives, the NTFS file system was designed to support large hard disks from the beginning. In addition to its support for large hard drives, NTFS offers better protection of directories and files against unauthorized users, has better data protection and doesn't succumb to fragmentation as easily as FAT32.

Another benefit of NTFS over FAT32 involves NTFS permissions. Administrators can use NTFS utilities to track permissions and assign ownership of files and folders. This benefit led NTFS file and folder permissions to quickly become the most common form of authorization since Windows 2000.

Of course, NTFS permissions were not without some issues. For example, what if NTFS permissions were set to control user access to files, but administrators could still potentially grant themselves permissions to certain restricted documents? The classic "who watches the watcher" scenario is naturally all about trust, but it was sometimes dealt with proactively for NTFS through auditing tools designed to inform you whenever a change was made.
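
As a rough illustration of that auditing idea, the sketch below snapshots the ACLs of a few folders with the built-in icacls tool and flags any drift between runs. The folder paths and snapshot location are hypothetical; a real deployment would rely on dedicated auditing software or Windows' own audit policies.

```python
import subprocess
from pathlib import Path

# Hypothetical folders to audit and a local folder for the baseline snapshots.
WATCHED = [r"C:\Finance\Restricted", r"C:\HR\Payroll"]
SNAPSHOT_DIR = Path(r"C:\AclSnapshots")

def dump_acl(folder: str) -> str:
    """Return the current ACL listing for a folder, using the built-in icacls tool."""
    result = subprocess.run(["icacls", folder], capture_output=True, text=True, check=True)
    return result.stdout

def check_for_changes() -> None:
    SNAPSHOT_DIR.mkdir(exist_ok=True)
    for folder in WATCHED:
        current = dump_acl(folder)
        snapshot = SNAPSHOT_DIR / (Path(folder).name + ".acl.txt")
        if snapshot.exists() and snapshot.read_text() != current:
            print(f"ACL change detected on {folder}")   # alert the administrator
        snapshot.write_text(current)                    # refresh the baseline

if __name__ == "__main__":
    check_for_changes()
```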

Data Recovery in NTFS

As one would expect, recovery for NTFS works quite differently than FAT32 recovery. For starters, the NTFS file system is designed to perform file recovery on its own, without the need for third-party data recovery utilities or administrative actions. This is made possible by two technologies: cluster remapping and transaction logging.

Cluster remapping is a technique that prevents data loss by automatically moving data from clusters containing bad sectors on the hard disk to good clusters. The transaction logging feature of the NTFS file system is designed to prevent data corruption. Although the mechanics behind transaction logging in NTFS are complicated, the basic idea is that when a write operation occurs, the Windows NTFS file system records the operation to a log file. Once the write operation is logged, NTFS updates the volume cache and then makes a log entry indicating that the transaction is complete. For more information on how cluster remapping and transaction logging work, check out this article on NTFS data recovery.
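
The write-ahead pattern described above can be sketched in a few lines of Python. This is only a toy model of the general idea (log the intent, apply the change, log the completion), not NTFS's actual log format or mechanics.

```python
import json
from pathlib import Path

LOG = Path("txn.log")    # stand-in for the file system's log file
CACHE = {}               # stand-in for the cached volume structures

def logged_write(key: str, value: str) -> None:
    """Toy write-ahead transaction: record the intent, apply it, record completion."""
    with LOG.open("a") as log:
        # 1. The operation is written to the log before the volume is touched.
        log.write(json.dumps({"op": "write", "key": key, "value": value}) + "\n")
        # 2. The change is applied to the (cached) volume structures.
        CACHE[key] = value
        # 3. A completion record is logged; on recovery, entries without one are rolled back.
        log.write(json.dumps({"op": "commit", "key": key}) + "\n")

logged_write("fileA", "new contents")
```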

There are also some free NTFS recovery tools out there. One such tool is FreeUndelete 2.0, which is designed specifically for the recovery of files deleted from NTFS volumes. For example, say you accidentally deleted a file on an external NTFS-formatted hard drive, and you needed to get it back right away. FreeUndelete is a great free tool for fast NTFS data recovery in instances just like that.

What about NTFS recovery of encrypted files? The trick here is that you must have an authorized private key and a file encryption key that was encrypted using the corresponding public key. Without these keys, there is no way to recover NTFS encrypted files. For a detailed rundown of the process, check out this article on recovery of encrypted files on an NTFS partition.

While the NTFS file system was designed to be much less prone to corruption than FAT32, a corrupt boot sector can sometimes occur, requiring the recovery of NTFS data. The boot sector is critical to Windows NTFS, so if it's corrupt, the entire volume may be inaccessible. To repair a corrupt boot sector in NTFS, all you need to do is locate the backup copy, then use the information it contains to overwrite the primary boot sector. You can then begin the NTFS data recovery process.

Best practices

There are several expert-recommended best practices to be aware of when working with Windows NTFS. One such suggestion involves NTFS cluster sizes. Since larger cluster sizes speed up disk access, it might be tempting to ramp up the cluster size as far as possible (up to 64K) on a big partition with big files. This isn't always the smartest thing to do, however, as many third-party utilities aren't designed to recognize NTFS clusters larger than 4K. Many defragmentation programs, for instance, cannot work correctly when confronted with a non-standard cluster size. Therefore, if you plan on using third-party disk tools, creating RAID arrays or mirrored disks, experts advise that you don't edit the NTFS cluster size by hand.

There are many different ways to optimize NTFS performance. These techniques include disabling NTFS legacy (8.3) filenames or resizing the master file table. The latter involves making sure that there is enough space reserved for the master file table at all times. This is important because the NTFS master file table is essentially a directory of all of the files and folders found on the hard disk volume (similar to FAT), so it is critical to the volume's performance that the master file table remains as unfragmented as possible.
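
For a quick look at how much space Windows reserves for the MFT zone and how fragmented a volume currently is, the built-in fsutil and defrag tools can be called from a script. A minimal sketch, assuming an elevated prompt and drive C::

```python
import subprocess

# Query the MFT zone reservation and run a fragmentation analysis on drive C:.
# Both tools ship with Windows; run this from an elevated prompt.
mft_zone = subprocess.run(["fsutil", "behavior", "query", "mftzone"],
                          capture_output=True, text=True)
print(mft_zone.stdout.strip())

analysis = subprocess.run(["defrag", "C:", "/A"],
                          capture_output=True, text=True)
print(analysis.stdout)
```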

Other Windows NTFS optimization techniques include disabling the last-access-date update feature, minimizing the impact of antivirus utilities, and keeping NTFS compression to a minimum. The latter is especially important, because Windows NTFS compression doesn't compress files cluster by cluster. Instead, it uses compression units of 16 clusters and calculates file space on the basis of the number of compression units needed. This wastes an average of eight clusters of additional space per file. In a typical user's account with hundreds or thousands of files, that space adds up, and when it is charged against users' quotas, they can run out of quota space even though the file sizes shown on screen suggest they still have plenty of room left.
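
The arithmetic behind that waste is easy to see. The sketch below models the charge in the way described above, rounding each file up to whole 16-cluster compression units rather than whole clusters; it is a simplification, not NTFS's exact accounting.

```python
CLUSTER = 4 * 1024                   # 4K clusters, the common NTFS default
UNIT = 16 * CLUSTER                  # compression unit = 16 clusters = 64K

def clusters_charged(file_size: int) -> int:
    """Clusters charged when space is allocated in whole compression units."""
    units = -(-file_size // UNIT)    # ceiling division
    return units * (UNIT // CLUSTER)

def clusters_uncompressed(file_size: int) -> int:
    """Clusters the file would occupy if it were only rounded up to whole clusters."""
    return -(-file_size // CLUSTER)

size = 70 * 1024                     # a 70 KB file
print(clusters_charged(size))        # 32 clusters (two 64K units = 128 KB charged)
print(clusters_uncompressed(size))   # 18 clusters (72 KB)
```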

It’s also recommended that administrators beware of ACLs on NTFS volumes from old Windows installations. This is because NTFS Access Control Lists (ACLs) and security descriptors describe who can access which NTFS objects, and to what degree. If a given user or group has access to an object, the ACL for that object will contain a reference to that user or group not by name, but by their security identifier (SID). This means that if you have an object somewhere on an NTFS partition that belongs to a user on a specific machine, those permissions are unique. You can’t create a user with the same name on another machine and expect to have unrestricted access to that object; you have to take ownership of the object first. By following these steps, you can transfer Windows NTFS object ownership from the administrators group to the object’s creator.

Finally, administrators should also take action against NTFS disks that fill up too quickly. This can be the result of compromised systems and corrupted disks, as well as other factors, such as master file table expansion and invalid file names.



Improve SQL Server performance via disk arrays and disk partitioning

Posted by Alin D on December 21, 2010

As a DBA, much of your focus is on performance tuning SQL Server. But have you spent time to tune the hardware supporting your SQL Server system? Are you using the optimal disk array configuration? Are the disk partitions aligned? This tip discusses how to get your SQL Server hardware performance in top shape – whether the system is already in operation or it’s a new setup.

With the massive amount of raw horse power available in today’s server class hardware, it’s easy to skip over the hardware when it comes to performance tuning the SQL Server database. After all, with so much power available, who cares if something takes a few extra milliseconds to complete? Is anyone really going to notice that extra millisecond?

But what happens when you perform an operation that takes 10 extra milliseconds to complete; and it needs to perform 100 times an hour, for a year? All of a sudden, that 10 milliseconds turns into 2.4 hours. If you perform that operation 1,000 times an hour — which isn’t all that unheard of in a smaller OLTP database — you are now looking at more than 24 hours of wasted time.

In my particular environment, we run the same stored procedure at least 2,000 times per minute. If that stored procedure takes an extra 10 milliseconds to complete, we are looking at eight hours of lost time daily, or 121 days of lost time per year.
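
The figures above come from straightforward arithmetic, which is easy to reproduce:

```python
# The arithmetic behind the figures above.
extra_ms = 10                 # extra time per execution, in milliseconds

# 100 executions per hour for a year:
print(extra_ms * 100 * 24 * 365 / 1000 / 3600, "hours")        # ~2.4 hours

# 1,000 executions per hour for a year:
print(extra_ms * 1000 * 24 * 365 / 1000 / 3600, "hours")       # ~24.3 hours

# 2,000 executions per minute:
per_day_hours = extra_ms * 2000 * 60 * 24 / 1000 / 3600
print(per_day_hours, "hours per day")                          # 8.0 hours
print(per_day_hours * 365 / 24, "days per year")               # ~121.7 days
```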

Tune SQL Server by tuning disk arrays

There are a few places to check hardware components when tuning your SQL Server system. The easiest components to check are the disk arrays. Typically, the disk arrays are where the most time is spent waiting for something to happen. There are a couple of ways to tune the disks to improve SQL Server performance. The first is to make sure your disk array has enough spindles to handle the workload that will be placed on it. Second, make sure the disk arrays are at the correct RAID level to offer the best support for the database.

While it is true that RAID 10 offers better write performance, in most cases, RAID 10 isn’t required for the data files. That said, you should use RAID 10 for your transaction logs and tempdb database, as they are mostly write files. The reason I say not to use RAID 10 for all database files is that RAID 10 is very costly to implement in large disk sizes. This is because for each spindle used for data, a second spindle is used for redundancy.

Finding out if you need more spindles on an existing system is easy. Open Performance Monitor on the server and add the “Physical Disk” object and the “Current Disk Queue Length” counter. Some queuing is OK; however, there is a tipping point. To find where “OK queuing” ends and “too much queuing” begins, take the number of disks in the array and multiply it by two. If the maximum value recorded in Performance Monitor is greater than that result, then you have too much queuing. When we talk about the number of disks, we’re referring to the number of disks that are actively working with data. If you have a RAID 10 array, this is half the number of disks in the array.

“Number of Disks” x 2 = Maximum Allowable Queuing
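
Expressed as a quick check, the rule of thumb looks like this (the 12-disk RAID 10 example is hypothetical):

```python
def too_much_queuing(peak_queue_length: float, data_spindles: int) -> bool:
    """Rule of thumb from above: a sustained Current Disk Queue Length higher than
    twice the number of data-bearing disks means the array is queuing too much."""
    return peak_queue_length > 2 * data_spindles

# Hypothetical example: a 12-disk RAID 10 array has 6 data-bearing spindles,
# so anything above a queue length of 12 suggests more spindles are needed.
print(too_much_queuing(peak_queue_length=18, data_spindles=6))   # True
print(too_much_queuing(peak_queue_length=9,  data_spindles=6))   # False
```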

How to configure the disk array on your new SQL Server system

When working on a new system without any load on it, making sure you configure your disk array correctly is a little more challenging. If you have another system with the same amount of load on it, you can use that system as a guide. However, if this is the first large system in your environment, then getting it correct can be a bit harder.

You’ll need to look at your database system and the expected transactions per second, and make an educated guess at how many spindles you’ll need. When dealing with high-end drives, expect each drive to give you about 100 to 120 IOPS per disk in an OLTP environment. When dealing with SATA drives, expect each drive to give you about 60 to 80 IOPS per disk in an OLTP environment. Those numbers will go up when working in an OLAP environment, because OLAP databases put a different kind of load on the disk system. It’s a more sequential load, whereas OLTP databases put a random load on the disks.
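
A rough sizing calculation based on those per-disk figures might look like the following; the 1,500 IOPS workload is just an example, and the result ignores RAID write penalties and other overhead:

```python
import math

def spindles_needed(expected_iops: int, iops_per_disk: int) -> int:
    """Rough spindle count for a random (OLTP-style) workload, ignoring RAID overhead."""
    return math.ceil(expected_iops / iops_per_disk)

# Hypothetical workload expected to generate about 1,500 IOPS:
print(spindles_needed(1500, 110))   # ~14 high-end drives (100-120 IOPS each)
print(spindles_needed(1500, 70))    # ~22 SATA drives (60-80 IOPS each)
```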

Disk partition alignment improves SQL Server performance

Once you set up your disk array, you’ll want to make sure the partition you create is correctly aligned. By default, when Windows (or any other operating system, for that matter) creates a partition on a disk or array, the partition is not aligned for peak performance. Disk drives are made up of 1K blocks, and the physical disks like to do all their operations in 64-block chunks called clusters. Conveniently, SQL Server also likes to do its I/O in 64K chunks: there are eight 8K pages in each extent, and SQL Server does its reads one extent at a time. When the partition is created, the boot sector is created at the beginning of the partition. This boot sector is 32K in size, causing each 64K operation to be spread across two 64K clusters, which in turn causes each logical operation to take twice as many physical operations as needed.

You can see your current alignment offset by using the diskpart application. Open a command prompt and run diskpart.exe. At the DISKPART> prompt, type SELECT DISK n, where n is the disk number you want to look at — the command LIST DISK will give you a list of the disks in the machine. After selecting the disk, type LIST PARTITION to get the partition information, including the offset.

Figure: DISKPART application to view disk alignment

In order to create the partition, you’ll need to use the CREATE PARTITION command. The full command is CREATE PARTITION PRIMARY ALIGN=64 (the ALIGN value is specified in kilobytes). This creates the new partition with a 64K offset, aligning the partition into the optimum position for maximum performance.
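
The effect of the offset is simple modular arithmetic: if the partition’s starting offset isn’t a multiple of the 64K I/O size, every extent-sized read straddles two clusters. A small sketch:

```python
KB = 1024

def is_aligned(partition_offset_bytes: int, io_size_bytes: int = 64 * KB) -> bool:
    """True when every io_size-sized operation stays within a single 64K cluster."""
    return partition_offset_bytes % io_size_bytes == 0

print(is_aligned(32 * KB))   # False: each 64K extent read straddles two clusters
print(is_aligned(64 * KB))   # True: the ALIGN=64 partition starts on a 64K boundary
```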


Common Storage Configurations

Posted by Alin D on September 20, 2010

Introduction

In today’s world everything is on computers. More specifically, everything is stored on storage devices which are attached to computers in a number of configurations. There are many ways in which these devices can be accessed by users. Some are better than others and some are best for certain situations; in this article I will give an overview of some of these ways and describe some situations where one might want to implement them.

First, there is an architecture called Direct Attached Storage (DAS). This is what most people would think of when they think of storage devices. This type of architecture includes things like internal hard drives, external hard drives, and USB keys. Basically, DAS refers to anything that attaches directly to a computer (or a server) without any network component (like a network switch) between them.


Figure 1: Three configurations for Direct Attached Storage solutions (Courtesy of ZDNetasia.com)

A DAS device can even accommodate multiple users concurrently accessing data. All that is required is that the device have multiple connection ports and the ability to support concurrent users. DAS configurations can also be used in large networks when they are attached to a server which allows multiple users to access the DAS devices. The only thing that DAS excludes is the presence of a network device between the storage device and the computer.

Many home users and small businesses turn to Network Attached Storage (NAS). NAS devices offer the convenience of centrally locating your storage, though not necessarily alongside your computers. This is convenient for home users who may want to keep their storage devices in the basement while roaming about the house with a laptop. It is equally appealing to small businesses where it may not be appropriate to have large storage devices in areas where clients or customers are present. DAS configurations could also provide this, though not as easily or elegantly for smaller implementations.


Figure 2: Diagram of a Network Attached Storage system (Courtesy of windowsnas.com)

A NAS device is basically a stripped-down computer. Though they don’t have monitors or keyboards, they do have stripped-down operating systems which you can configure, usually by connecting to the device via a web browser from a networked computer. NAS operating systems are typically stripped-down versions of UNIX operating systems, such as the open source FreeNAS, which is based on FreeBSD. FreeNAS supports many protocols, such as CIFS, FTP, NFS, TFTP, AFP, RSYNC, and iSCSI. Since FreeNAS is open source, you’re also free to add your own implementation of any protocol you wish. In a future article I will provide more in-depth information on these protocols, so stay tuned.

Because NAS devices handle the file system functions themselves, they do not need a server to handle these functions for them. Networks that employ DAS devices attached to a server will require the server to handle the file system functions. This is another advantage of NAS over DAS. NAS “frees up” the server to do other important processing tasks because a NAS device is connected directly to the network and handles all of the file serving itself. This also means that a NAS device can be simpler to configure and maintain for smaller implementations because they won’t require a dedicated server.

NAS systems commonly employ RAID configurations to offer users a robust storage solution. In this respect NAS devices can be used in a similar manner as DAS devices (for robust data backup). The biggest, and most important, difference between NAS systems and DAS systems is that NAS systems contain at least one networking device between the end users and the NAS device(s).

NAS solutions are similar to another storage configuration called Storage Area Networks (SAN). The biggest difference between a NAS system and a SAN system is that a NAS device handles the file system functions of an operating system while a SAN system provides only block-based storage services and leaves the file system functions to be performed by the client computer.
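
The distinction shows up clearly in how a client program would touch each kind of storage. The snippet below is purely illustrative (the share name and disk number are hypothetical, and raw device access needs administrative rights): with NAS the client hands a path to a remote file system, while with SAN-style block storage the client addresses raw blocks and brings its own file system.

```python
import os

# File-level access (NAS-style): the storage device runs the file system and
# resolves the path; the client simply opens a file on a share.
with open(r"\\nas01\projects\report.docx", "rb") as f:   # hypothetical UNC path
    header = f.read(512)

# Block-level access (SAN-style): the client sees a raw block device (a LUN)
# and is responsible for the file system itself. Device name is hypothetical.
fd = os.open(r"\\.\PhysicalDrive1", os.O_RDONLY | os.O_BINARY)
first_block = os.read(fd, 512)       # read the first 512-byte block directly
os.close(fd)
```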

Of course, that’s not to say that NAS can’t be employed in conjunction with SAN. In fact, large networks often employ SAN with NAS and DAS to meet the diverse needs of their network users.

One advantage that SAN systems have over NAS systems is that NAS systems are not as readily scalable. SAN systems can quite easily add servers in a cluster to handle more users. NAS systems employed in networks where the networks are growing rapidly are often incapable of handling the increase in traffic, even if they can handle the storage capacity.

This doesn’t mean that NAS systems can’t be scaled at all. You can, in fact, cluster NAS devices in a similar manner to how one would cluster servers in a SAN system. Doing this still allows full file access from any node in the NAS cluster. But just because something can be done doesn’t mean it should be done; if you’re thinking of going down this path, tread carefully – I would recommend implementing a SAN solution instead.


Figure 3: Diagram of a Storage Area Network (Courtesy of anildesai.net)

However, NAS systems are typically less expensive than SAN systems and in recent years NAS manufacturers have concentrated on expanding their presence on home networks where many users have high storage demands for multimedia files. For most home users a less expensive NAS system which doesn’t require a server and rack space is a much more attractive solution when compared with implementing a SAN configuration.

SAN systems have many advantages over NAS systems. For instance, it is quite easy to replace a faulty server in a SAN system, whereas it is much more difficult to replace a NAS device which may or may not be clustered with other NAS devices. It is also much easier to geographically distribute storage arrays within a SAN system. This type of geographic distribution is often desirable for networks wanting a disaster-tolerant solution.

The biggest advantage of SAN systems is that they offer simplified management, scalability, flexibility, and improved data access and backup. For this reason SAN configurations are becoming quite common for large enterprises that take their data storage seriously.

Apart from large networks, SAN configurations are not very common. One exception is the video editing industry, which requires a high-capacity storage environment along with high bandwidth for data access. A SAN configuration using Fibre Channel is really the best solution for video editing networks and networks in similar industries.

While any of these three configurations (DAS, NAS, and SAN) can address the needs of most networks, putting a little thought into the network design can save a lot of future effort as the network grows or the need arises to upgrade various aspects of it. Choosing the right configuration is important: you need a configuration that meets your network's current needs and any predictable needs of the near-to-medium-term future.


Exchange Server 2007 High Availability Part 1

Posted by Alin D on August 27, 2010

HA Fundamentals

Microsoft Exchange Server 2007 has several different Mailbox Server high availability features included with the product.  Each of the features is similar to the others in some ways but also very different.

In this post I will explain each of the high availability features and which types of scenarios they are suitable for.

What is High Availability?

High availability is a term used to describe the avoidance of unplanned downtime for a computer system through the implementation of hardware and/or software solutions.  Generally speaking, a high availability solution will involve the elimination of any single points of failure in the system, often by duplicating or replicating components of the system so that if one fails the other is able to continue performing the role.

An example of downtime would be an email server that has suffered a hard disk crash and is unavailable to users, who are then unable to send or receive email.  An example of a high availability solution in this case would be the use of a RAID volume to protect against single disk failures.

Exchange Server 2007 contains several high availability features in the Mailbox Server role that can protect a system from multiple types of failure.  These features are a combination of database replication and server clustering technology.

What is Asynchronous Log Shipping?

Some of the features I am going to describe will include the term “asynchronous log shipping“.  “Asynchronous” means “not synchronised“, and “log shipping” refers to the copying of a transaction log file from one location to another where it is then replayed into a replica of the original database to keep it updated with the changes made to the source database.

Exchange Server 2007 writes transaction log files of 1 MB in size, meaning that as each log file reaches 1 MB it is closed off and the next transaction log file is created.  Asynchronous log shipping occurs after a transaction log file has been closed off and is no longer in use as the active log file.

In essence, asynchronous log shipping is how Exchange Server 2007 database replication occurs.
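
As a toy model of that process, the sketch below copies closed-off log generations from an active database's log folder to a replica location, skipping the active E00.log. The paths and polling interval are hypothetical; the real Exchange replication service reacts to file-close notifications rather than polling, and it also replays the logs into the passive database copy.

```python
import shutil
import time
from pathlib import Path

# Hypothetical source and replica log folders for one storage group.
SOURCE = Path(r"D:\ExchangeLogs\SG1")
REPLICA = Path(r"\\standby01\ExchangeLogs\SG1")

def ship_closed_logs() -> None:
    """Copy closed-off 1 MB log generations to the replica; skip the active E00.log."""
    for log in sorted(SOURCE.glob("E00*.log")):
        if log.name.lower() == "e00.log":
            continue                              # still the active log file
        target = REPLICA / log.name
        if not target.exists():
            shutil.copy2(log, target)             # shipped only after the log is closed off

while True:
    ship_closed_logs()
    time.sleep(30)   # simple polling stand-in for the real change notifications
```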

What is Clustering?

A server cluster is two or more servers working together to perform a particular role so that it appears to be performed by a single system.  There are several different types of clustering commonly used.

Compute Clusters – this refers to the combination of processing power to perform tasks at a higher speed than a single system is capable of.  A compute cluster usually involves a master node and several slave nodes.  The master node hands off computational tasks to the slaves and then receives the completed tasks back from them.  For example, many animated movies are created using computer graphics that are rendered by compute cluster farms, with individual frames of animation being processed by different slave nodes.

Load Balanced Clusters – this refers to the combination of several systems to act as a single system by distributing workload across all of the cluster nodes.  For example, a cluster of two web servers will load balance web page requests, so that approximately half of the requests are served by one web server and half by the other.  Very highly trafficked web sites that need to handle millions of visitors each month will operate on load balanced clusters.

High Availability Clusters – these clusters, also commonly known as Failover Clusters, provide high availability for servers by having redundant nodes that are able to take over serving requests if the active node should fail.  Exchange Server 2007 clustering makes use of failover clustering.
