Orphaned VMDK Files

(8/10/2015) Update: Unfortunately, some backend things have changed since I originally cobbled this script together and it doesn't work any more.  I'll fix it if I get the chance, but in the meantime, there's a free utility called RVTools that can identify orphaned VMDK files (which are entertainingly called Zombie VMDKs on the vHealth tab of the application).  That's a great application to be aware of anyway, as it makes it very easy to get access to a lot of important information that the vSphere client obscures.

(11/9/2015) Update: I went ahead and put together a post detailing how I used RVTools and a helper script to identify, rename and then delete orphaned VMDK files.

Every organization has to wrestle with orphaned VMDK files.  What is an orphaned VMDK file, you ask?  It's a VMDK file that's sitting on your SAN, consuming expensive storage, but isn't actually being used by any VM.  They're notoriously hard to find (especially in larger environments), mostly because the VMDK file itself doesn't tell you which VMs might be using it.

A simple way to find potentially orphaned VMDK files in an environment is to open an SSH session on a host and cd to /vmfs/volumes.  From there, use a command like:

find -iname "*flat.vmdk" -mtime +7

That will list every flat VMDK file, on every VMFS volume the host can access, that was last modified more than a week ago.  The assumption is that if a VMDK file hasn't been touched in that long, it's probably not being used... except for templates.  Or any machine that's powered off but is still needed.
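To see how the age filter behaves before pointing it at real datastores, here is a small sketch that can be run on an ordinary Linux machine. The scratch directory and file names are made up for the demonstration; only the find expression comes from the command above.

```shell
#!/bin/sh
# Build a throwaway directory tree that loosely mimics /vmfs/volumes.
# All names here are hypothetical.
scratch=$(mktemp -d)
mkdir -p "$scratch/datastore1/old-vm" "$scratch/datastore1/busy-vm"

# A flat file last modified in January 2015 (well over a week ago)...
touch -t 201501010000 "$scratch/datastore1/old-vm/old-vm-flat.vmdk"
# ...and one modified right now.
touch "$scratch/datastore1/busy-vm/busy-vm-flat.vmdk"

# Same test as above: case-insensitive name match, modified more than 7 days ago.
stale=$(cd "$scratch" && find . -iname "*flat.vmdk" -mtime +7)
echo "$stale"

# Clean up the scratch tree.
rm -rf "$scratch"
```

Only the old file is listed. A powered-off VM or a template would show up the same way, which is why the hit list needs pruning by someone who knows the environment.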

So, while that works (as long as you're sufficiently familiar with the environment to prune the hit list), it's not very sophisticated and could lead to some unfortunate false positives.  A better solution is required... and, as is so often the case, that solution can be accomplished in PowerCLI.

This isn't a new problem, not by any means.  For as long as I can remember, the default behavior when you remove a VMDK from a VM is to disassociate it but to leave the file in place.  This is great, because it protects you from accidentally deleting vital drives... but it is also the major source of orphaned VMDK files.  Rather than reinvent the wheel, I did some searching and found a post in the VMware communities by "HJA van Bokhoven" from 2010 (ancient and venerable!).  Apparently Luc Dekens (a PowerCLI guru) had made some improvements in that same timeframe.

Their script is really cool - instead of just looking at the VMDK files, it actually compares the list of all VMDK files on the storage against the configurations of all registered VMs.  If none of the registered VMs in the environment (including templates!) are using the file, it spits it out as an orphaned VMDK.
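The comparison idea can be illustrated outside of PowerCLI. The sketch below is not the script itself; it uses two made-up lists (every VMDK found on the datastores, and every VMDK referenced by a registered VM or template) and prints whatever appears only in the first. The datastore and VM names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data: every VMDK found on the datastores...
all_disks=$(mktemp)
cat > "$all_disks" <<'EOF'
[datastore1] vm01/vm01.vmdk
[datastore1] vm01/vm01_1.vmdk
[datastore1] template01/template01.vmdk
[datastore2] oldvm/oldvm.vmdk
EOF

# ...and every VMDK referenced by a registered VM or template.
used_disks=$(mktemp)
cat > "$used_disks" <<'EOF'
[datastore1] vm01/vm01.vmdk
[datastore1] template01/template01.vmdk
EOF

# Keep the lines of all_disks that match no whole line of used_disks.
# -F: fixed strings (the brackets are not a regex), -x: whole line, -v: invert.
orphans=$(grep -F -x -v -f "$used_disks" "$all_disks")
echo "$orphans"

# Clean up the temporary lists.
rm -f "$all_disks" "$used_disks"
```

Because the match is on exact strings, the two lists have to spell each path identically. Any difference in how a path is assembled would make every disk look orphaned.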

I've reposted the script here, with the minor addition of some logic to prevent false positives from Changed Block Tracking (which many VM backup utilities are using these days).  Beware of unintended line breaks and use at your own risk.  I didn't write most of this code, but I have used this script to great effect.
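For context on the Changed Block Tracking false positives: CBT keeps its change map in a companion file named like vm01-ctk.vmdk. No VM lists that file as one of its disks, so a plain comparison flags it as orphaned. The script's own check isn't reproduced here; the sketch below is just one way such a filter could look, using made-up file names.

```shell
#!/bin/sh
# Hypothetical search results, including one CBT tracking file.
results='[datastore1] vm01/vm01.vmdk
[datastore1] vm01/vm01-ctk.vmdk
[datastore1] vm01/vm01_1.vmdk'

# Drop anything ending in -ctk.vmdk before checking for orphans.
# "--" stops option parsing so the leading dash in the pattern is safe.
candidates=$(printf '%s\n' "$results" | grep -v -- '-ctk\.vmdk$')
echo "$candidates"
```

Only the two real disks remain as candidates for the orphan check.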


Comments

  1. Thanks for your post Jason. Great read!

    I have utilized this script and adapted it into a plugin for the vCheck script.

    Please feel free to check it out here: http://www.empiricvirtualization.com/2014/02/new-vcheck-plugin-orphaned-vmdk-file-v01.html

  2. Hello!,
    great script, but if you follow the bad practice of having datastores with the same name on different VMware hosts, then it doesn't work.

    I have modified the script to make it work:
    https://communities.vmware.com/blogs/VirtuallyAnITnoob/2014/03/05/report-unused-vmdk-files

    Best regards,
    Pablo

  3. Thanks for your awesome feedback, guys!

  4. For some reason the script is reporting valid registered VMs as well.

    Replies
    1. Is it possible that it's finding VMDKs that are named after valid VMs but that aren't actually in use by that VM?

  5. I thought that too, but I randomly checked a few registered VMs and they point to the correct datastore and path. I do still see a few correctly reported orphaned VMs, though.

  6. I've tried using this script, and similar versions I've seen around the place, using PowerCLI 5.5 against an ESXi 5.0 host.

    The issue for me is that all the entries of $arrUsedDisks contain a / character separating the folder name from the vmdk filename. For example, [DataStoreName] VMFolderName/VMVirtualDiskName.vmdk.

    This line of the script assembles the parts of the datastore search result into a full path:

    if (-not ($arrUsedDisks -contains ($folder.FolderPath + $fileResult.Path))){

    But I find that $folder.FolderPath does not end with a / character, and $fileResult.Path does not begin with a / character, so the concatenated string doesn't include one. So the -contains operator works on a string like '[DataStoreName] VMFolderNameVMVirtualDiskName.vmdk' instead of '[DataStoreName] VMFolderName/VMVirtualDiskName.vmdk'. This means the datastore search results never match anything in the $arrUsedDisks, so every single vmdk shows up as "Orphaned" in the results.

    I can do a quick-and-dirty fix for my environment by changing the affected line of code to say:

    if (-not ($arrUsedDisks -contains ($folder.FolderPath + '/' + $fileResult.Path))){

    Maybe something has changed in the recent versions of the PowerCLI regarding the FolderPath?

    Replies
    1. That sounds like a possibility; thanks for posting your fix!

  7. I found that if I use this kind of script I get timeouts on very large datastores... Here's what I did to solve it:

    http://blog.dhjensen.dk/2015/08/orphaned-vmdk-search-return-exception.html

  8. Hi Jason
    Thanks for this script. I have managed to get it working, but can you tell me how to export the results to a CSV file? I have too many entries on the screen.

    I tried to do this
    Export-Csv -Path "C:\Install\orphaned-vmdk-files.csv" -NoTypeInformation
    after the last bracket but it still outputs to the screen

    Your help is greatly appreciated
    Regards

    Replies
    1. I'd recommend using RVTools for this task, as people have reported that this script isn't working and I haven't had the chance to really dig into it. That said, if you want to export it to a CSV, just add a line at the end like this:

      $report | export-csv C:\Install\orphaned-vmdk-files.csv

  9. Hi Jason
    Thanks for getting back to me. I wanted to use our report along with RV Tools to rule out false positives

    Once the zombie file has been identified, what is the best way to check that the file is not in use by any VM? This is the part I am finding a bit tricky.

    I appreciate your assistance and am a fan of your blog. I started reading it in 2012.
    Regards
    Rushad

    Replies
    1. I'm always glad to hear when people appreciate my blog! That's a great question, about how to verify that the file isn't in use... I should probably write about that specifically. I fall back on the ESXi console tool vmkfstools to both verify and to remove zombie VMDK files. The reason I like to use vmkfstools for this task is that it's file lock aware but not VM aware. If you try to mess with a VMDK file that's in use, it will error out rather than break the VM. If you manipulate a VMDK file that's associated with a powered off VM, it will rename the VMDK file but it won't fix the VMX file to use the new name.

      I like to rename the VMDK files first by using vmkfstools -E oldname.vmdk newname.vmdk. I then leave that VMDK file renamed for a few weeks and, assuming that nothing was broken by the rename, I delete it with vmkfstools -U newname.vmdk.

  10. Hi Jason
    Top Blog, script running nicely. Is there any chance of getting the Size of the VMDK file included too? The screen output doesn't list it. I'd like to know how much space 200+ zombie vmdk files are consuming!

    Replies
    1. Hi Richard - thanks for the kind words! I've heard some reports of this script not working as desired and haven't really had a chance to troubleshoot it, so be sure to double-check its results. Anyway, I've posted a slightly modified version that just spits out the entire $report array, which has a .Size attribute.

      I haven't had a chance to test this, but you can probably do something like this to get a total file size:
      (.\find-orphans.ps1).Size | Measure-Object -Sum

  11. Jason, you're a gentleman and scholar. It works for me. You're a star!

  12. Hi Jason,
    Great script - excellent work! This is working well in our production environment, and we intend to use it to free up a lot of wasted space.

    After running your script and identifying orphaned VMDK files (and verifying that those machines are indeed no longer in use), is it generally a safe move to delete the entire containing sub-directory along with the VMSD, VMXF, VMX, and LOG files?

    Lastly, any advice as to cleaning up non-VMDK orphaned files? Just from a quick inspection of one of our datastores, I see some folders for machines that are no longer in use but that don't contain VMDK files. Most of the files contained in these folders are small, but I occasionally come across some larger vmware.log files.

    Replies
    1. I'd be hesitant to attack the whole directory, as an orphaned VMDK is a VMDK that isn't registered to any VMs in the inventory. It might just be an extra disk that was created on a valid machine, then removed from that machine without being deleted.

      It would take a bit of work, but you could probably change the search string to look for .vmx or .vmtx files. You'd also have to change $arrUsedDisks so that it generates a list of vmx files for all registered VMs. You could probably do that with something like this:

      $arrUsedDisks = get-vm | % {$_.extensiondata.config.files.vmpathname}
      $arrUsedDisks += get-template | % {$_.extensiondata.config.files.vmpathname}

      Once you've identified all of the vmx files (I think that would also match vmtx files, but I'm not sure how templates would look to that... test, test, test!) that exist on the datastores but are not in use by VMs, you can delete those files and the other VM descriptor files (the vmware.log, etc.). Make sure that any .vmdk files in the directory are also unused, as it's possible that some other machine might have mounted that unregistered VM's hard drive at some point.

    2. Hi Jason! Your script is awesome and I've made a lot of the modifications mentioned here in the comments plus some, including the one from Pocketsidewalk.

      * I've modified the script to work across multiple vCenters at the same time.
      * The first scan through all the datastores, I look for VMX/VMTX and register the VM under the "Discovered virtual machine" top-level folder.
      * It does take a long time to run. I thought about combining the VMX/VMTX scan with the VMDK scan, but realized that it would miss instances where a VMX/VMTX had a reference to a VMDK on another DS. There might be a better way to do this without scanning the DSes twice, but this was the quick result.

      Without further ado, here's the script: https://github.com/Hossy/vmware-scripts/blob/master/vsphere/powercli/VMDK-orphaned-v2.ps1

  13. John,

    Thanks for updating the script. Unfortunately I keep getting the following:

    Cannot find an overload for "op_Subtraction" and the argument count: "2".
    At C:\scripts\vcheckvc03\Plugins\40 Datastore\VMDK-Orphaned-v2.ps1:2 char:1
    + Script execution time: {0:c} -f ((Get-Date) - $begin) | Out-Host
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodException
    + FullyQualifiedErrorId : MethodCountCouldNotFindBest

    Any ideas?
    Cheers, Al.

  14. Great article. I'm looking to remove orphaned VMDKs from a VMware datastore. Please help me with a script, if there is one. Thanks in advance.
