This blog is meant to provide information, thoughts, and links that a computer programmer may find useful. There is no set strategy or limit on what topics will be discussed.

May 22, 2008

Virtualization (in a nutshell)

The other day a friend of mine and I were discussing how virtualization works; to help her out I broke it down into a simple analogy. In fact, I think it will help out most people who are not at all familiar with virtualization. Here is a snippet of the e-mail; some content has been modified to remove personal discussions and to help the conversation make sense to a third-party reader.

...You’d use virtualization if you wanted to run X number of instances of SQL Server and each instance (or many of the instances) needed its own operating system. Reasons for doing this vary: some believe that a dedicated OS will handle a single instance of SQL better than running 2 instances, while others choose this route because they have 15 server boxes running SQL and, to reduce the amount of reconfiguring, it’s easier to put each box into its own virtual hard drive. Basically, you can use virtualization software to share (and dedicate) memory among SQL Server instances, each on its own OS. Some people will purchase a single box for their SQL Server 'virtual worlds' (my terminology there) with the maximum RAM and CPU installed, and then set up each virtual hard drive to use enough RAM for that particular instance. It's also worth noting that a 'virtual world' is limited to whatever maximum amount of RAM its own OS will allow. Even if the server box is running Windows Server 2008 (64-bit) with the maximum allowed memory, you would still be limited to 4 GB if you installed Windows XP Home Edition as the OS for the 'virtual world'. You could always tell it to use more RAM, but the OS will only recognize the maximum it is allowed (or configured for). This is important to understand, because some new administrators will try to cut corners (or costs) by using a lesser OS, thinking that the Server OS will pick up the slack; this is NOT the case, as the underlying Server OS is simply a host that allows the 'virtual world' to run...it has no say in how the 'virtual world' actually runs, other than which resources are made available to it.
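To make that "the guest OS caps the RAM" point concrete, here's a toy Python sketch. Everything in it (the VirtualMachine class, the limits table, the numbers) is made up for illustration; it isn't any real virtualization API.

```python
# Toy model: the guest OS caps how much of the assigned RAM a
# virtual machine can actually use. These names and numbers are
# illustrative only, not a real virtualization API.

GUEST_OS_RAM_LIMITS_GB = {
    "Windows XP Home Edition": 4,        # 32-bit address-space ceiling
    "Windows Server 2008 (64-bit)": 32,  # capped here just for the example
}

class VirtualMachine:
    def __init__(self, name, guest_os, assigned_ram_gb):
        self.name = name
        self.guest_os = guest_os
        self.assigned_ram_gb = assigned_ram_gb

    def usable_ram_gb(self):
        # The guest only recognizes RAM up to its own limit,
        # no matter how much the host makes available.
        return min(self.assigned_ram_gb, GUEST_OS_RAM_LIMITS_GB[self.guest_os])

vm = VirtualMachine("sql-vm-1", "Windows XP Home Edition", assigned_ram_gb=8)
print(vm.usable_ram_gb())  # 4 -- the other 4 GB the host assigned is wasted
```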

An example would be if you had a 64-bit system with 4 dual-core processors running Windows Server 2003 with the maximum RAM (I think the max is 32 GB, but I could be wrong on that). Now, let’s say your business wants a SQL Server instance that holds valuable company information (such as Human Resources records, accounting records, employee personal information, etc.). In this case you’d probably want to make sure this instance of SQL Server is not on the same machine that your users (whether they be employees of the company or customers of the company) are using. So, instead of purchasing a separate box for it, you could use virtualization to accomplish this (though, as with anything else, there are security concerns and procedures that should be followed when doing so). You would simply create a new virtual hard drive (‘virtual world’) that contained this SQL Server instance. When creating this ‘virtual world’ you can specify how much RAM it can use, which would be determined by the amount of data flowing through the SQL instance and the number of users. What’s really ingenious about virtualization is that you can go into an existing virtual hard drive (I believe even while it is running) and tell it to use less RAM or fewer other resources, without ever shutting down the server. So, let’s say that on the same box your primary business SQL instance is running and using 28 GB of the 32 GB, with the other 4 GB reserved for the Server OS. You just go into the first virtual hard drive and tell it to use 24 GB; this frees up 4 GB, which you would then tell the new virtual hard drive for the company-sensitive SQL instance to use. Then just monitor both ‘virtual worlds’ for a while (maybe a couple of days or weeks) and adjust the RAM, and any other resources, as you see fit.
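Here's that same rebalancing arithmetic as a small Python sketch, just to double-check the numbers. Real virtualization software exposes this through its own management tools, so treat this purely as an illustration.

```python
# Sketch of the RAM rebalancing described above (illustrative only).

HOST_RAM_GB = 32
HOST_OS_RESERVED_GB = 4

vms = {"sql-primary": 28}  # the primary instance starts with all spare RAM

def free_ram_gb():
    return HOST_RAM_GB - HOST_OS_RESERVED_GB - sum(vms.values())

# Shrink the primary VM's allocation from 28 GB to 24 GB...
vms["sql-primary"] = 24
assert free_ram_gb() == 4

# ...then hand the freed 4 GB to a new VM for the sensitive SQL instance.
vms["sql-sensitive"] = free_ram_gb()
print(vms)  # {'sql-primary': 24, 'sql-sensitive': 4}
```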

As you can see, virtualization can be handy. But there are also drawbacks, which Windows Server 2008 addresses. I think Windows Server 2008 did a very good job of addressing the issues, but that is my opinion. The basic drawback is memory allocation/sharing; I’ll cover that in a second. If you are interested in this type of scenario (using virtual hard drives) and see a business requirement you can fill with it to improve your systems, then you’d want to read up on virtualization and look at applications like VMware or Microsoft’s Virtual Server. I’ve heard VMware is very good, especially in cases where you are consolidating hundreds of servers into a few. VMware, I believe, was the first true virtual environment...I could be wrong on this; I don’t know the whole history of virtualization. My personal experience has been with Microsoft’s Virtual Server; it’s fairly simple to set up and use. I chose MS over VMware because I personally feel that MS would design their Virtual Server to utilize every nook and cranny of the Windows OS; I’d honestly look into VMware if I were running an OS that isn’t MS based...

...Hyper-V is a term you will want to familiarize yourself with if you are getting into virtualization. Hyper-V is Microsoft’s hypervisor technology, introduced with Windows Server 2008, and among other things it addresses how memory is allocated in order to improve the performance of virtual hard drives and their operating systems. It cleans up a lot of the memory issues that occurred with Windows Server 2003 and earlier versions. People would see problems when running 15 instances of virtual hard drives; mainly, their expensive RAM wasn’t worth any more than common RAM picked up at a local computer shop! Microsoft’s solution was Hyper-V! If you ever get bored and want to geek out, read up on Hyper-V. The concepts are actually quite interesting, and even if you’re not familiar with virtualization you will still see how Hyper-V improves it. Amazing stuff!

The basic idea, in a nutshell, is that prior to Windows Server 2008 the virtual hard drives would compete with each other and with the OS for memory. Think of memory like a card catalog in a library. If you want to find a book, you open a drawer, find the location of the book, and then go get the book. If you want to find another book, you have to close the open catalog drawer, open another drawer, and find the location of that book. Now, let’s say there are only 5 catalog drawers and they hold the locations of all the books in the world; and let’s even say that everyone knows these drawers hold this information. So, you might be looking for a book on virtualization, and the person next in line wants a book about vacuums. They obviously can’t look in the catalog at the same time as you, because there isn’t enough room to move the catalog cards around so both people can view them. So, the next person MUST WAIT for you to finish; when you are finished and leave to get your book, they can look for theirs. Now, what if you realize you want to look up something about SQL? You must WAIT for that person to finish looking in the catalog, PLUS anyone else who got in line before you returned.

This is also how memory worked; access was more or less linear. The OS would access part of the RAM, then Virtual Hard Drive (VHD) #1 would access a part after the OS was done, then VHD #2 would access another part. So when switching between VHDs you’d run into WAITS while the OS or another VHD finished accessing the memory. The first attempts to resolve this problem divided memory into sections; but the OS alone uses up so much memory that the VHDs would have to compete just to get enough to run, let alone to actually perform any operations (such as simply logging in to the VHD).
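Here's the catalog line written out as a toy Python simulation, to show how the waits pile up. The "1 time unit per lookup" cost and the request list are invented for illustration; real memory access obviously doesn't work in neat little units like this.

```python
# Toy simulation of the "one person at the catalog at a time" behavior:
# the host OS and each VHD take turns, and every request waits for
# everything queued ahead of it. Purely illustrative.

ACCESS_COST = 1  # pretend every memory lookup takes 1 time unit

def serialized_lookups(requests):
    clock = 0
    for who, what in requests:
        wait = clock             # time spent standing in line
        clock += ACCESS_COST     # time spent doing the lookup itself
        print(f"{who} looks up {what!r} after waiting {wait} unit(s)")

serialized_lookups([
    ("Host OS", "page table"),
    ("VHD #1",  "SQL buffer pool"),
    ("VHD #2",  "login screen"),
    ("VHD #1",  "SQL buffer pool"),  # back in line, even for the same data
])
```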

To resolve this, MS introduced Hyper-V, which (roughly speaking) holds references or copies of memory in a temporary table. So when VHD #1 is done, the accessed memory isn’t actually released; it’s stored in a temporary table just in case VHD #1 needs to come back and access it again. The same goes for the OS and VHD #2. This doesn’t necessarily resolve every potential memory-sharing issue…but it sure does greatly reduce it! Going back to the catalog analogy: if the section of catalog cards you were looking at was copied and placed into a separate temporary box, and you realized you wanted a specific topic under virtualization, you could go to the temporary box of cards that held info about virtualization books. This obviously saves you time, because you don’t have to wait in that long line to access the original cards. It has its own limits, though (storing some cards in boxes): it takes additional time and effort to copy the information, you must find (or create) a temporary box, you have to keep others from using your box, and this only works if you are searching for a book relatively similar to your original one. If you want to search for a book about car engines, you obviously still have to get back in line to access the original cards. See how there are some drawbacks?
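Extending the earlier sketch with that 'temporary box' idea: if each client keeps copies of its recent lookups, repeat visits skip the line entirely, while brand-new topics still mean a trip back to the catalog. Again, this is a loose illustration of the concept, not of how Hyper-V is actually implemented.

```python
# The earlier simulation plus a per-client 'temporary box' (a cache):
# repeat lookups are free, new topics still wait in line. Illustrative only.

ACCESS_COST = 1

def cached_lookups(requests):
    clock = 0
    boxes = {}  # one 'temporary box' of copied cards per client
    for who, what in requests:
        box = boxes.setdefault(who, set())
        if what in box:
            print(f"{who} finds {what!r} in its own box -- no waiting")
        else:
            wait = clock
            clock += ACCESS_COST
            box.add(what)  # copy the card into the temporary box
            print(f"{who} waits {wait} unit(s), then copies {what!r}")

cached_lookups([
    ("Host OS", "page table"),
    ("VHD #1",  "SQL buffer pool"),
    ("VHD #2",  "login screen"),
    ("VHD #1",  "SQL buffer pool"),  # in the box now: no wait this time
    ("VHD #1",  "car engines"),      # a new topic still means a trip to the line
])
```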

That’s about the best I can do at explaining this subject without getting into great technical detail. Virtualization is quite an amazing concept and can help businesses that use lots of databases and servers. Hardware technology and operating system technology are quickly adapting to handle virtualization more elegantly. It will probably be a few more releases before we truly start seeing virtualization perform near its peak potential; but now would be a good time to familiarize yourself with the topic while it’s still being developed and refined. As with any other technology, the longer you wait to learn about it, the more complex it becomes, which means the more confusing it will be to learn. Imagine if you had learned SQL Server when it was simple, just a database with commands (before all of the DTS, SSIS, etc.); then you’d only have to learn the new options offered in SQL Server. Imagine walking into SQL Server today for the first time ever and being told to run a multi-national company’s data structure...that would be a daunting task, because there is just so much to learn with SQL Server alone!

I hope you get a lot of information out of this!

Until next time, Happy Coding!!
