Category Archives: Virtualization

Duplicate Machine SID’s are not an issue except when they are an issue.

I came across an article on InfoWorld about a blog post from a Microsoft tech regarding Windows Machine SID’s and the myths that surround them.  The InfoWorld article is mostly fluff, but the blog post is well worth the read.  Basically, machines that are imaged without being sysprepped usually end up with the same machine SID.  It’s long been believed that this is a security issue.  It turns out that’s not the case.  Machines can live on the same network with the same SID’s as long as they are not already joined to a domain, are not going to be promoted to Domain Controllers, and there isn’t an application that reacts badly to it.  (The example given is applications that use the Machine SID as their own ID.)  The bottom line is that machines SHOULD BE SYSPREPPED to prevent any known and unknown issues.  Also, Microsoft will not support machines that don’t have unique SID’s.  Sysprep is easy to run.  Don’t slack off just because it might not cause a problem.
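
If you’re curious what a machine SID looks like on your own box, the Sysinternals PsGetSid tool will print it; run it with no arguments for the local machine (the value it returns is an S-1-5-21-… identifier):

    psgetsid

Local accounts are just that machine SID with a relative ID tacked onto the end, which is why the topic comes up at all when machines are cloned.

A couple of excerpts from the post: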

The reason that I began considering NewSID for retirement is that, although people generally reported success with it on Windows Vista, I hadn’t fully tested it myself and I got occasional reports that some Windows component would fail after NewSID was used. When I set out to look into the reports I took a step back to understand how duplicate SIDs could cause problems, a belief that I had taken on faith like everyone else. The more I thought about it, the more I became convinced that machine SID duplication – having multiple computers with the same machine SID – doesn’t pose any problem, security or otherwise. I took my conclusion to the Windows security and deployment teams and no one could come up with a scenario where two systems with the same machine SID, whether in a Workgroup or a Domain, would cause an issue. At that point the decision to retire NewSID became obvious.

I realize that the news that it’s okay to have duplicate machine SIDs comes as a surprise to many, especially since changing SIDs on imaged systems has been a fundamental principle of image deployment since Windows NT’s inception. This blog post debunks the myth with facts by first describing the machine SID, explaining how Windows uses SIDs, and then showing that – with one exception – Windows never exposes a machine SID outside its computer, proving that it’s okay to have systems with the same machine SID. Note that Sysprep resets other machine-specific state that, if duplicated, can cause problems for certain applications like Windows Server Update Services (WSUS), so Microsoft’s support policy will still require cloned systems to be made unique with Sysprep.

You can read the full article here.

You should also read this follow-up post by Microsoft tech Aaron Margosis that explains the difference between Machine SID’s and Domain SID’s.  The key statement in his post: “So while it’s OK to clone a system before it joins a domain, doing so after it joins a domain (and is assigned a domain computer account and a corresponding domain SID) will cause problems.”
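
For reference, the generalize pass is a single command on Vista and Server 2008 (this assumes the default install path; XP and 2003 ship the older version of the tool in deploy.cab and it uses different switches):

    C:\Windows\System32\Sysprep\sysprep.exe /generalize /oobe /shutdown

Run it as the last step before capturing the image, and the clone generates a new machine SID on its next boot.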

I upgraded from ESX 3.0.2 to ESX 3.5 and it was a pain.

I upgraded our ESX servers over the Christmas break.  I had to install a new ESX server, so I took the opportunity to upgrade the rest of our environment.  It was a pain in the ass.  There were a few bugs that caused me problems.  Details below:

I decided to wipe the ESX servers and install 3.5 fresh from the CD.  I did the upgrade from 2.5.2 to 3.0.1 this way and it worked well.  I upgraded the Virtual Center server from 2.0 to 2.5.

VMotion caused me a lot of problems.  I was not able to ping the VMotion port after the upgrade.  This happened to varying degrees on all of the servers.  The last server was the worst.  It was driving me crazy.  I had enabled VMotion and named the port properly.  It just would not work.  Eventually I called support.  They ran vmkping to the IP address of the VMotion port on the server while I pinged the same IP address from my workstation.  This seemed to magically enable the VMotion port.  Running just vmkping or just ping didn’t work.  The combination of the two worked for some bizarre reason.
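
For anyone who hits the same wall, this is roughly what support had me run (the address is a placeholder for your VMotion port’s IP).  From the ESX service console:

    vmkping 10.0.0.51

and at the same time, from the workstation:

    ping 10.0.0.51

vmkping sends its traffic through the VMkernel network stack rather than the service console stack, which is why it’s the right tool for testing VMotion (and iSCSI) ports.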

“No Active Primaries” message when I tried to add a server to the Cluster.  This one perplexed me for a while.  It comes from the way clustering tracks its primary hosts, and clustering doesn’t work perfectly in mixed 3.0/3.5 environments.  The first server added to a cluster is considered the “primary.”  When I initially created the cluster, ESX1 (server name) was the first server in the cluster.  When I did the upgrade, I took ESX1 out of the cluster, and it didn’t pass the role of “primary” onto one of the other servers.  So when I tried to add ESX1 back into the cluster, it gave me the “No Active Primaries” error.  I fixed this by removing all of the servers from the cluster and adding them back in.  This thread pointed me towards a solution:  http://communities.vmware.com/message/701671;jsessionid=AA7526EEA3E0EE5EAFAFDB7A761815ED

“Unable to read partition information from this disk”: I got an error like this when I was installing ESX on a machine attached to a SAN with raw device mappings.  I disconnected the server from the SAN and started the installation over just to be safe.  A good piece of advice: always disconnect the server from the SAN when you are reinstalling ESX.  There is a decent possibility that you’ll accidentally overwrite your LUN’s.

 I had some other general problems, but nothing too serious.  Let me know if you have any questions or issues that I can help with.

Virtual Desktop Infrastructure, Client Consolidation, and Blade PC’s… Oh My!

I’ve begun researching VDI because I believe that the PC is no longer necessary in medium to large environments that can operate with less than workstation-class performance.  The potential advantages of replacing PC’s with thin clients that connect to full-fledged XP installations are compelling.  I’ve been researching all of this for a couple of weeks now, and I have to say that VDI/CCON/CCI is in a pre-1.0 state.  I’ll explain it all below.

There are three terms going around to describe Client Consolidation technology.  They are:

  • VDI: Virtual Desktop Infrastructure
  • CCON: Client Consolidation
  • CCI: Consolidated Client Infrastructure

They all essentially mean the same thing.  My definition of CCON is centralizing desktop/PC systems by hosting them in the data center.  All computing functions other than KVM (keyboard, video, mouse) are hosted and managed in a computer room away from the user.  The user uses a client device or application to access the centralized computer.  There are multiple terms battling to become the accepted name for this technology.  VDI was the first term that I saw used.  VDI is the trendy name in my view, and it has been co-opted by VMware and turned into a product.  CCON is the name used by an IBM employee named Massimo Re Ferre’, who is a heavy contributor to VDI technology research.  Client Consolidation happens to be the name of IBM’s implementation of VDI (what a coincidence).  CCI is a product name used by HP after they abandoned the use of VDI.  Another name that’s out there is “Centralized Computing,” but that term usually refers to the days of mainframes and dumb terminals.

My preference for the academic name of this technology is Client Consolidation (CCON).  I believe that CCON is the most descriptive, most open name of the three, and it is general enough to encompass all of the diverse technologies in this area.

There’s a lot of overlapping information and noise out there.  I want to explain the bottom line as I see it.

The technology “models” (Re Ferre’, 2007) for CCON are:

  • Shared Services (Citrix)
  • Virtual Machines (VMware, Xen, others)
  • Blade PC’s/Blade Workstations (HP, ClearCube)

You will ultimately have to select one (or more) of those models for a production rollout.

Client consolidation is all about using RDP to connect to Windows systems (some solutions prefer or also support ICA).   If you know how to use Remote Desktop, you’re most of the way to understanding what CCON is about.   Everything after this is about the services and features built around those RDP-accessed Windows systems (VM’s, Blade PC’s).
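
If you want to see the core of it for yourself, the plain Windows Remote Desktop client is all you need (the hostname here is made up; in a real deployment the broker hands your client an address just like it):

    mstsc /v:xp-vm-042

Everything a broker adds (pooling, policies, reconnection) sits on top of that one simple connection.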

The components of CCON are:

  • Client Access Devices (thin clients, repurposed PC’s)
  • Connection Broker (software)
  • Host Systems (VM’s, Blade PC’s)

[Diagram: VDI-CCON]

Client Access Devices are straightforward.  You need a device that understands how to connect to remote systems using RDP.  The client device can be a full-blown XP/Vista PC, or a thin client running the proper client software.  You’re going to hear a lot about Windows XPe in this space.  XPe (XP Embedded) is a stripped-down version of Windows XP built for embedded devices and loaded onto many thin clients.

Host systems are also straightforward.  You can run your XP/Vista/other hosts as VM’s or on Blade PC’s.

Connection Brokers are where all the fun is.  Brokers handle the setup and the advanced features of CCON.  They decide (based on policy) which VM/Blade should be assigned, which features are available to the user, and in some cases the robustness of the service.  I think of brokers as travel agents: a client shows up with a request, and the broker knows how to handle the request based on requirements and makes all of the arrangements, including the connection.  The broker is usually finished at that point, though in some solutions it stays in the middle as an intermediary.

That’s basically what CCON is all about.

CCON is barely at a 1.0 level.  There’s very little information out there (other than Citrix) and all of the solutions are patch-up jobs.  There’s no long-standing, widely accepted solution; most of the solutions that I have found have been assembled piecemeal.  The absolute best information that I have found comes from Massimo at http://it20.info/misc/brokers.htm.  He’s created a table with extensive descriptions of all the features he’s been able to confirm for brokers and clients.  It’s not a complete list of brokers and features (HP SAM and IBM TMP are missing, for example), so do your own research and testing.  Regardless, it is a must-read if you are going down the CCON road.

Two other items of interest are VMware’s VDI forum and HP’s CCI forum.  Notice that there is very little activity at those forums.  That’s because most people still aren’t working on this.  Also, VMware’s product is in Beta.  That’s right…VMware’s broker is vaporware, yet they’re calling it VDM 2.0.  Now that’s good marketing.

That’s it for now.  Please let me know if you have any questions or if you have something to add.  There is so much information out there that I’m positive there is more to come.

I completed our VMware ESX 3 upgrade this past weekend.

 I have been planning my company’s ESX upgrade for a while.  After many delays and other conflicts, I was able to schedule it for this past weekend.  I want to braindump everything I learned if possible. It’s a bit of a mish-mosh, but so is my brain.

  • Plan, Document, Plan, and Document: There are so many moving parts that you’re going to want to document EVERYTHING.  The upgrade is not difficult, but it is tricky.
  • Be prepared for your Virtual Center upgrade to go bad.  This is the only in place upgrade that you cannot avoid and it’s the least reliable.  Have a backup plan whether it’s restoring the database or wiping it and starting clean.  Make a decision in advance.
  • If you lose VC you lose stats, permissions, VM groups, and a few other things.  Document all of VC at minimum (if possible).
  • VMware says you need 1200 MB of free disk space on your LUN’s.  This is not enough.  I had at least 2 gigs and still ran into problems.
  • The VM Hardware upgrade moves VM configuration files from the ESX server to the SAN.  One of these files is the VM swap file.  The swap file is twice the size of the VM’s memory.  Reducing the assigned memory increases free space on the LUN.  This helps with insufficient disk errors at boot up.
  • You can’t suspend a VM if you don’t have enough disk space.
  • Rebooting the ESX servers seems to clear up “Object” errors.
  • VMotion: You have to license it, create a VMkernel port on the virtual switch, AND enable VMotion on that port (see the sketch after this list).
  • WinSCP is a great program.
  • You MUST upgrade Hardware on all VM’s before putting them in a cluster.  This makes sense, but isn’t obvious.
  • Test as much of your upgrade as possible in advance.  This helped me tremendously.
  • Make sure that your VMFS2 LUN’s are formatted with an 8 MB block size or smaller.  ESX cannot upgrade LUN’s that are formatted with anything larger than an 8 MB block size.  The two LUN’s I used as backup were both formatted with 16 MB block sizes.  I knew the limitation, but I didn’t think it affected me because I always used the default block size.  The only thing that’s strange about them is that they are both 1.7TB.  (There’s a quick way to check block sizes after this list.)
  • “unable to upgrade filesystem” + “function not implemented” errors come from the wrong block size on the VMFS2 partition.
  • Renaming datastores is not destructive in ESX 3, but I wouldn’t recommend doing this until all VM’s are functional.
  • The upgrade is a good chance to upgrade server firmware.
  • Make sure all VMDK files are connected before upgrading Virtual Hardware.  Otherwise you will get errors about disk version mismatches.  I used the recommended resolution.  I’m not confident that I did the right thing.
  • Invalid affinity entry errors will happen if you assign a processor or memory affinity to a VM and then move it to a server that cannot fulfill the entry.  This could happen if you set processor 4 as the affinity and then move the VM from a quad-processor server to a dual-processor server.  The best way to fix this is to remove the affinity.  Second best is to recreate the VM using the same disk files (remove it from inventory, then recreate it).
  • A “Network Copy Failed for File. [] /root/vmware/<servername>/nvram” error is most likely a DNS problem.  Make sure to register all possible DNS names in the hosts file of each server involved.  In my case, the registered name and the FQDN were different.  More info can be found here.
  • If there are yellow or red alarms on most VM’s after the Virtual Center 2 upgrade: the upgrade sometimes truncates records, including the alarm thresholds.  It will truncate 70% and 90% to 7% and 9%, so VC looks like a Christmas tree the first time you log in.  Your options are bad and worse in this case.  I chose to wipe the old DB and create a new one; the stats were not critical to us.  Doing this also affects rights, groups, and other things.
  • “The virtual machine is not supported on the target datastore.”  Rebooting solves lots of problems during the upgrade, especially this one.
  • VMware Tools for NetWare: I need to address this in a separate post, but the short answer is that the only instructions for this are old GSX 3.2 instructions.  They work.
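
As promised in the VMotion bullet above, here’s a rough sketch of the VMkernel port setup from the ESX 3.x service console (the port group name and addresses are placeholders; enabling VMotion on the port itself is still a checkbox in the VI Client):

    esxcfg-vswitch -A VMotion vSwitch0
    esxcfg-vmknic -a -i 10.0.0.50 -n 255.255.255.0 VMotion
    vmkping 10.0.0.51

The last line just confirms that the new VMkernel port can reach another host’s VMotion address.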
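
And for the block size bullet, a quick way to check what a datastore is formatted with (the path is a placeholder):

    vmkfstools -P /vmfs/volumes/backup-lun-1

The output includes the file system version and the file block size, so you can spot the 16 MB VMFS2 volumes before they bite you during the upgrade.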

Sorry about the disorganized info, but this is just a braindump.  Please let me know if you have any questions and I will get you more detailed info.

Let’s talk about Virtualization (OS).

I am a passionate advocate of operating system virtualization, especially in the server room.  OS virtualization is probably the most important thing to happen to the server room since Ethernet (in combination with TCP/IP).  How so?  Ethernet changed everything people understood about computers at the time.  It made computers more connected and began to close the chapter of standalone, human-driven computers.

OS virtualization has begun to do the same thing.  You no longer need as many physical computers to do the same amount of work.  Underutilized computers can now be pushed to their limits by maximizing processing power, centralizing storage, and offloading specific functions from the core system.  I truly believe that there is no greater game changer in computing than OS virtualization.

Benefits of virtualization (mostly based on VMware, and not limited to them):

  • Fewer physical computers
  • Less underutilized equipment
  • Less equipment (nics, processors, memory [in most cases], HBA’s, etc.)
  • More OS’s per machine (VMware runs Windows, Linux, Solaris and other Unixes, and NetWare.  Run Windows on a Mac, etc.)
  • Centralized Storage (this was possible before, but virtualization encourages and makes it cost effective.)
  • Improved redundancy, availability, reliability
  • Ability to dump “legacy” equipment or migrate that app that no one could rebuild.
  • Run a VM across the HW of your choice
  • No more HW upgrade headaches (just add compatible machines as needed and hot migrate everything)
  • Shared processing, networking, memory.  (That means that you only need 2-3 nics for 10 VM’s instead of 10-15 nic’s for 10 physical machines.)
  • Add resources on demand (Need more processing, memory, nic?  Just increase the priority and/or resource share for your VM).
  • More environmentally friendly

OK, I think you get the gist.  OS virtualization, and server consolidation specifically, has many benefits and is the #1 thing you can do right now to improve your computing environment.  Please share your thoughts, and feel free to lean on me for advice on your virtualization project.