Introduction
Windows Server 2008 supports several different types of storage. You can either connect to storage physically, or by using a virtual hard drive (VHD). When Hyper-V is installed on a host, it can access the many different storage options that are available to it, including direct attached storage (DAS, such as SATA or SAS) or SAN storage (FC, or iSCSI). Once you connect the storage solution to the parent partition, you can make it available to the child partition in a number of ways.
Hyper-V Storage Options
Windows Server 2008 with Hyper-V supports the use of direct attached storage, NAS, iSCSI, and Fibre Channel storage.
VHD or Pass-through Disk
A virtual hard drive (VHD) can be created on the parent partition’s volume with access granted to the child partition. The VHD operates as a set of blocks, stored as a regular file using the host OS file system (which is NTFS).
Within Hyper-V there are different types of VHDs, including fixed size, dynamically expanding, and “differencing” disks:
- Dynamically expanding This type of virtual hard drive starts small but automatically expands to react to need. It will expand up to the maximum size indicated when the virtual hard drive is created. “Dynamic,” in this context, is sort of a misnomer. Dynamic implies that the size of the virtual drive changes up and down based on need. But in actuality, the hard drive will keep expanding until it reaches the maximum limit. If you remove content from the virtual hard drive, it will not shrink to meet the new, smaller capacity.
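The grow-only behavior described above can be illustrated with a small model. This is only a sketch: the class and attribute names are hypothetical, it tracks sizes in gigabytes rather than blocks, and it does not use any Hyper-V API.

```python
class DynamicVhd:
    """Toy model of a dynamically expanding VHD (illustrative only)."""

    def __init__(self, max_size_gb):
        self.max_size_gb = max_size_gb  # limit chosen when the VHD is created
        self.used_gb = 0                # data the guest currently stores
        self.file_size_gb = 0           # space the VHD file takes on the host

    def write(self, gb):
        """Store more data; the file grows only if it has to."""
        if self.used_gb + gb > self.max_size_gb:
            raise ValueError("write would exceed the VHD's maximum size")
        self.used_gb += gb
        self.file_size_gb = max(self.file_size_gb, self.used_gb)

    def delete(self, gb):
        """Remove data; the file on the host does not shrink."""
        self.used_gb = max(0, self.used_gb - gb)


vhd = DynamicVhd(max_size_gb=100)
vhd.write(40)
vhd.delete(30)
print(vhd.used_gb, vhd.file_size_gb)
```

After the delete, the guest holds less data, but the file on the host is still as large as it was at its peak.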
In Hyper-V, you can expose a host disk to the guest without putting a volume on it by using a pass-through disk. Hyper-V allows you to bypass the host’s file system and access the disk directly. This disk is not limited to 2,040 GB and can be a physical hard drive on the host or a logical one on a SAN.
Hyper-V ensures that the host and guest are not trying to use the disk at the same time by setting the drive to be in the offline state for the host. Pass-through disks have their downsides. You lose some VHD-related features, like VHD snapshots and dynamically expanding VHDs.
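The trade-off between the two disk types can be summarized as a small decision helper. This is a sketch based only on the constraints mentioned above (the 2,040 GB VHD limit and the VHD-only features); the function name and its parameters are hypothetical.

```python
VHD_MAX_GB = 2040  # VHD size limit cited in the text


def suggest_disk_type(size_gb, needs_snapshots=False, needs_dynamic_expansion=False):
    """Return "vhd", "pass-through", or "either" for a planned guest disk."""
    needs_vhd_features = needs_snapshots or needs_dynamic_expansion
    if size_gb > VHD_MAX_GB:
        if needs_vhd_features:
            raise ValueError(
                "disk is larger than a VHD allows, but VHD-only features were requested"
            )
        return "pass-through"
    return "vhd" if needs_vhd_features else "either"


print(suggest_disk_type(500, needs_snapshots=True))
print(suggest_disk_type(4000))
```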
IDE or SCSI on the Guest
Configuring the child partition’s virtual machine settings requires you to choose how the disk will be shown to the guest (either as a VHD file or pass-through disk). The child partition can see the disk as either a virtual ATA device or as a virtual SCSI disk. But you do not have to expose the drive to the child partition the same way as you exposed it to the parent partition. For example, a VHD file on a physical IDE disk on the parent partition can be shown as a virtual SCSI on the guest.
What you need to decide is what capabilities you want on the guest. You can have up to four virtual IDE drives on the guest, and they are the only type the virtualized BIOS will boot from. You can have up to 256 virtual SCSI drives on the child partition, but you cannot boot from them.
You can also expose drives directly to the child partition by using iSCSI. This bypasses the parent partition completely. All you have to do is load an iSCSI initiator in the child partition and configure your partition accordingly.
Hyper-V does not support booting to iSCSI, so you still need another boot drive.
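The limits in this section (four virtual IDE drives, 256 virtual SCSI drives, and booting only from IDE) can be checked with a short planning helper. This is a minimal sketch; the GuestDisk type and the check_guest_disks function are hypothetical names, not part of any Hyper-V tooling.

```python
from dataclasses import dataclass
from typing import List

MAX_IDE_DISKS = 4     # limit cited in the text
MAX_SCSI_DISKS = 256  # limit cited in the text


@dataclass
class GuestDisk:
    name: str
    bus: str           # "ide", "scsi", or "iscsi" (initiator inside the guest)
    boot: bool = False


def check_guest_disks(disks: List[GuestDisk]) -> List[str]:
    """Return a list of problems with a planned guest disk layout."""
    problems = []
    ide_count = sum(1 for d in disks if d.bus == "ide")
    scsi_count = sum(1 for d in disks if d.bus == "scsi")
    if ide_count > MAX_IDE_DISKS:
        problems.append(f"{ide_count} IDE disks planned; the limit is {MAX_IDE_DISKS}")
    if scsi_count > MAX_SCSI_DISKS:
        problems.append(f"{scsi_count} SCSI disks planned; the limit is {MAX_SCSI_DISKS}")
    for d in disks:
        if d.boot and d.bus != "ide":
            problems.append(f"{d.name}: cannot boot from {d.bus}; use a virtual IDE disk")
    if not any(d.boot and d.bus == "ide" for d in disks):
        problems.append("no bootable virtual IDE disk defined")
    return problems


layout = [
    GuestDisk("os", "ide", boot=True),
    GuestDisk("data1", "scsi"),
    GuestDisk("data2", "iscsi"),
]
print(check_guest_disks(layout))
```

An empty list means the layout respects the constraints described here.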
Fibre Channel
A Fibre Channel Storage Area Network (SAN) is the most widely deployed storage solution in enterprises today. SANs came into popularity in the mid- to late 1990s and have grown rapidly along with the volume of data that customers keep and use every day. Ultimately, a SAN is really just a network that is made up of storage components, a method of transport, and an interface. In practice, that means a disk array, a switch, and a host bus adapter. Most hardware providers such as HP, Dell, and Sun have a SAN solution, and there are companies such as EMC, Hitachi, NetApp, and Compellent that focus primarily on SAN and storage technology.
Fibre Channel Protocol
The Fibre Channel Protocol is the mechanism used to transmit data across Fibre Channel networks. There are three main topologies that are used in Fibre Channel: point-to-point, arbitrated loop, and fabric.
Point-to-Point
Point-to-point topology is a direct connection between two ports, with at least one of the ports acting as a server. This topology needs no arbitration for the storage media because of separate links for transmission and reception. The downside is that it is limited to two nodes and is not scalable.
Arbitrated Loop
Arbitrated loop topology combines the advantages of the fabric topology (support for multiple devices) with the ease of operation of point-to-point topology. In the arbitrated loop topology, devices are connected to a central hub, like Ethernet LAN hubs. The Fibre Channel hub arbitrates (or shares) access to devices, but adds no additional functionality beyond acting as a centralized connection point.
Within the arbitrated loop category, there are two types of topologies:
- Arbitrated loop hub Devices must seize control of the loop and then establish a point-to-point connection with the receiving device. When the transmission has ended, devices connected to the hub begin to arbitrate again. In this topology, there can be 126 nodes connected to a single link.
- Arbitrated loop daisy-chain Devices are connected in series and the transmit port of one device is connected to the receive port on the next device in the daisy chain. This topology is ideal for small networks but is not very scalable, largely because all devices on the daisy chain must be on. And if one device fails, the entire network goes down.
Fabric
The fabric topology is composed of one or more Fibre Channel switches connected through one or more ports. Each switch typically contains 6, 16, 32, or 64 ports.
Disk Array
The SAN disk array is a grouping of hard disks that are in some form of RAID configuration and, just as with any RAID configuration, the more disks you have, the more I/O you will receive.
In most modern disk arrays, the disk controller (the component that controls RAID configuration and access to the disk array) will have some form of cache built into it. The cache on the disk controller is used to increase performance of the disk subsystem for both reads and writes. In the case of virtualization, the more cache available, the better performance you will see from your virtual machines.
Fibre Channel Switch
A Fibre Channel switch is a networking switch that uses the Fibre Channel protocol and is the backbone of the storage area network fabric. These switches can be implemented as one switch or many switches (for redundancy and scalability) to provide many-to-many communications between nodes on the SAN fabric. The Fibre Channel switch uses zoning to segregate traffic between storage devices and endpoint nodes. A zone can be used to allow or deny a system access to a storage device. In the past few years three companies have really cornered the market on Fibre Channel switches: Cisco Systems, QLogic, and Brocade. When purchasing a switch, pay close attention to its backplane bandwidth as well as how fast the ports are. You can get a Fibre Channel switch that supports port speeds of 2, 4, or 8 gigabits per second.
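Zoning can be pictured as a set-membership test: two endpoints can communicate only if some zone contains both of them. The sketch below is a simplified model under that assumption. The zone and member names are hypothetical, and real zoning is configured on the switch itself rather than in code like this.

```python
def can_access(zones, initiator, target):
    """Return True if at least one zone contains both endpoints."""
    return any(
        initiator in members and target in members
        for members in zones.values()
    )


# Hypothetical zone table: zone name -> set of member endpoints
zones = {
    "zone_hyperv": {"hyperv_host1", "hyperv_host2", "array_port_a"},
    "zone_backup": {"backup_server", "array_port_b"},
}

print(can_access(zones, "hyperv_host1", "array_port_a"))
print(can_access(zones, "hyperv_host1", "array_port_b"))
```

Here the Hyper-V hosts can reach the first array port but are denied access to the second, because no zone contains both.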
Tiered Storage
Using a technique called tiered storage, you assign different categories of your data to different types of storage media, in order to reduce total storage cost. Categories may be based on the levels of protection needed, performance issues, frequency of access, or whatever other considerations you have.
Because assigning data to a particular form of media is an ongoing and complex activity, some vendors provide software that automatically manages the process based on your organization’s policies.
For example, at tier 1, mission-critical or frequently accessed files are stored on high-capacity, fast-spinning hard drives. They might also have double-level RAIDs on them.
At tier 2, less important data is stored on less expensive, slower-spinning drives in a conventional SAN. As the tiers progress, the media gets slower and less expensive. As such, tier 3 of a three-tier system might contain rarely used or archived material. The lowest level of the tiered system might be simply putting the data onto DVD-ROMs.
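A tiering policy of this kind can be expressed as a simple rule. The sketch below assumes a three-tier system and uses made-up access-frequency thresholds; the function name and both thresholds are hypothetical and would come from your organization's own policies.

```python
# Hypothetical thresholds, in accesses per month
FREQUENT_ACCESS = 100
OCCASIONAL_ACCESS = 1


def assign_tier(mission_critical, accesses_per_month):
    """Pick a storage tier (1 = fastest and most expensive, 3 = archive)."""
    if mission_critical or accesses_per_month >= FREQUENT_ACCESS:
        return 1
    if accesses_per_month >= OCCASIONAL_ACCESS:
        return 2
    return 3


print(assign_tier(mission_critical=True, accesses_per_month=0))
print(assign_tier(mission_critical=False, accesses_per_month=10))
print(assign_tier(mission_critical=False, accesses_per_month=0))
```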
SAN Features
Today SANs come with features that can really help your enterprise manage your data and your virtual machines.
iSCSI
iSCSI is a type of SAN that uses industry-standard technologies such as Ethernet and Ethernet NICs for transport and an interface. So far iSCSI has proven to be a great lower-cost alternative to the Fibre Channel SAN solutions that are available. If, for instance, you are building your virtualization environment on servers that cost U.S. $4,000 each and you want to connect them to your Fibre Channel SAN, you will have to purchase Fibre Channel adapter cards that are compatible with your SAN. Each of those cards can cost from $1,000 to $2,000. That can really add up if you are building out one or more 16-node clusters.
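To put numbers on that, the adapter cost for a cluster is simply the number of cards multiplied by the price range. The sketch below uses the $1,000 to $2,000 per-card figures from the text; the function name and the cards_per_node parameter are assumptions added for illustration.

```python
def hba_cost_range(nodes, cards_per_node=1, low=1000, high=2000):
    """Return the (minimum, maximum) adapter cost in U.S. dollars."""
    cards = nodes * cards_per_node
    return cards * low, cards * high


print(hba_cost_range(16))                    # (16000, 32000)
print(hba_cost_range(16, cards_per_node=2))  # (32000, 64000)
```

With one card per node, a 16-node cluster adds $16,000 to $32,000. Assuming two cards per node for redundancy doubles that.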
Now we’re sure some of you reading this are saying, “Well yeah, but I’ve already implemented my Fibre Channel architecture. Now you want me to implement another infrastructure just for this?”
Well, no, that’s not the case. What we’re saying is that if you don’t already have an infrastructure in place, iSCSI would be a great technology for you to research. One other consideration to keep in mind when you are deciding whether to implement iSCSI is whether to run it on the same physical switches as your production network. You can do this, but we don’t recommend it, because an iSCSI network can become saturated extremely quickly and use all of the bandwidth on your backplane, creating network contention for your production services.
Direct Attached Storage
Direct attached storage (DAS) is a storage array that is directly connected, rather than connected via a storage network. The term generally refers to one or more enclosures that hold multiple disks in a RAID configuration.
DAS, as the name suggests, is directly connected to a machine, and is not directly accessible to other devices. For an individual computer user, the hard drive is the most common form of DAS. In an enterprise, providing storage that can be shared by multiple computers tends to be more efficient and easier to manage.
The main protocols used in DAS are ATA, SATA, SCSI, SAS, and Fibre Channel. A typical DAS system is made of one or more enclosures holding storage devices such as hard disk drives, and one or more controllers. The interface with the server or the workstation is made through a host bus adapter.
NAS
Network attached storage (NAS) is a hardware device that contains multiple disks in a RAID configuration that purely provides a file system for data storage and tools to manage that data and access. An NAS device is very similar to a traditional file server, but the operating system has been stripped down and optimized for file serving to a heterogeneous environment. To serve files to both UNIX/Linux and Microsoft Windows, most NAS devices support NFS and the SMB/CIFS protocols.
NAS is connected to the LAN and assigned its own IP address. By removing storage access and its management from the server, both application programs and files can be served faster, because they are not competing for the same processor resources. File requests are mapped by the main server to the NAS server.
NAS can be a step toward, and included as part of, a more sophisticated SAN. NAS software can usually handle a number of network protocols, including Microsoft’s NetBEUI and Novell’s NetWare Internetwork Packet Exchange (IPX).