Monitoring native Windows deduplication job progress with Power Shell

I am a big fan of Windows data deduplication since I first saw it in Windows Server 2012 RC. You can never have enough of the disk space, right? 馃檪 It has its downsides (mostly maintenance-related – I’ll describe it later) but the benefit is amazing – official Microsoft blog claims they were able to squeeze 6TB of data in 1.6 TB. Another cool thing about data deduplication is that you can even have it running on Windows 8 (see here for details) by importing a package and then enabling a new Windows feature.

The deduplication itself is controlled through a number of Power Shell cmdlets, but the thing is that you can’t really see the progress of it. To help with it, I came up with an idea of running Get-DedupJob in a loop and showing the output on a progress bar. Here’s the script:

for ($j = 0 ; $j -le 10000; $j++) 
{ 
  Get-DedupJob | %{
    Write-Progress 
      -Id $_.ProcessId 
      -Activity ($_.Volume + " - " + $_.Type) 
      -Status ($_.State.ToString() +  " " + $_.Progress.ToString() + "% percent complete") 
      -PercentComplete $_.Progress}; 

    Start-Sleep 2 | out-null;
}

You might be also tempted to check out what the disk savings are – you can use similar loop concept to do it:

for ($j = 0 ; $j -le 10000; $j++) 
{ 
  Get-DedupStatus;
  Start-Sleep 2 | out-null;
}

I am using deduplication to control the size of my VM library at home lab and it works like charm – allows me to keep 400+GB of VHDs in less than 150 GB. However you need to remember about certain things (read the docs for details):

  • There are three kind of jobs – optimization, scrubbing and garbage collection. You have to ensure that all of them are running – for example a simple file delete command does not instantly reclaim unused space. This is especially important in a lab that’s not running 24×7. You have to consider adjust deduplication schedule to suit your needs
  • Enabling background optimization might have an impact on longevity of SSD drives due to additional I/O operations.
  • You need to have Windows 8/Server 2012 or newer to suport deduplication-enabled volumes. Windows 7 or older OSes cannot access it.
  • Last but not least, you can’t enable deduplication on a system drive.

List all contents of path in PowerShell

And now for something completely different… This little PowerShell script might be useful when trying to locate the biggest files on your hard drive (I’m always low on disk space).

Get-ChildItem d:\ -recurse | where {$_.Length -ge 0} | sort -Property Length -desc | select Length, Directory, Name, LastAccessTime, Mode | ogv 

It takes all contents of D:\ (Get-ChildItem d:\ -recurse), looks for files of specific size ($_.Length -ge 0), orders them by size descending (sort -Property Length -desc) and outputs interesting information to the Out-GridView. If you want to see it in Excel instead, replace ogv with Export-Csv and then import it to Excel (or use method described here).

Enumerating SQL Server instances in a network with Power Shell

Ever wondered how many SQL Server instances are there in your network? Or maybe you wanted to check a settings or run a script on all of them? Of course you can use Registered Servers functionality and run a query on all servers in a group, but this can save you the pain of adding all servers one by one.

Since Power Shell builds on .NET framework, you are free to use all the mechanisms available in it. The one we are interested in here is SqlDataSourceEnumerator, which allows you to enumerate all available instances of SQL Server in a local network. The only prerequisite is that servers must have SQL Server Browser service running and the task is completed by the following one-liner:

[System.Data.Sql.SqlDataSourceEnumerator]::Instance.GetDataSources()

The output you get is a table consisting of four columns 鈥 server and instance name, version and information if the server is clustered.

ServerName                                 InstanceName                               IsClustered                                Version                                 
----------                                 ------------                               -----------                                -------                                   
PHOBOS-MOBILNY                                                                        No                                         9.00.4035.00                              

Now take a simple extension of it and you are able to connect to every single SQL Server in your network. It will give you the list in standard SERVER\INSTANCE format. Then you can expand it further using SMO or Power Shell snap-in and you will get all the information you need from all servers at the same time.

[System.Data.Sql.SqlDataSourceEnumerator]::Instance.GetDataSources()  | % {if ($_.InstanceName -eq "".ToString) { $_.ServerName + "\" + $_.InstanceName} else { $_.ServerName }}

I鈥檓 awaiting – as always 鈥 your feedback, what鈥檚 good and what still needs to be improved. Or maybe there鈥檚 just something you would like to share.

Hello, world! :) and off we go – with Power Shell for DBA

First, let me express I am really happy to post here. I was following BeyondRelational for quite some time and learned a lot (especially from T-SQL Challenges, I guess) and I have finally decided to contribute a bit. I hope I am able to deliver some quality, so any comments are welcome.

So, let鈥檚 start with something not really obvious. Everyone knows about Power Shell, but who uses it for database administration? Of course since SQL Server 2008 there is an option of running Power Shell from within SSMS, but there are situations when it comes really handy, for example, when running queries against multiple servers and/or databases.

The idea is taken from one of the blogs I’ve come across once. It states that Power Shell exposes useful methods for database developers, so let鈥檚 put it to good use for administrative work.
In this scenario you get a list of servers from somewhere (it may be a text file or Active Directory), connect to each of them using SMO and get the information you want. For testing purposes, let鈥檚 grab name, compatibility level, recovery model, size and some other options. For details or extensions, MSDN website is a good place to start.

Let鈥檚 break it to bits. First we get the servers we want to query, then we鈥檒l work on getting the information.

The simplest way to get the list of servers to be queried is to place them in a file, then read it. You can fetch with one line:

$serverList = Get-Content 'filename'

The downside is that, if your server list changes, you have to remember to update your configuration file and since you are probably in a domain, this information is already out there somewhere just waiting for you to reach it. If you have a domain group containing all the servers, this gets really easy. Just remember to either run Active Directory Module for Power Shell, or include Import-Module ActiveDirectory before this:

$serverList = Get-ADComputer 鈥揊ilter {MemberOf 鈥揺q "group_name"} | select Name

Normally, the Get-ADComputer returns more information than we really need, so first we get the data and then just take what鈥檚 necessary 鈥 in this case, Name. It鈥檚, however, possible that the situation differs and you don鈥檛 have a group, you have a separate OU instead. In this case, you have to prepare an LDAP query to get to proper path and provide a dummy filter (Get-ADComputer requires Filter parameter):

$serverList = Get-ADComputer 鈥揊ilter {Name 鈥搇ike "*"} 鈥揕DAPFilter 鈥渙u=Servers,ou=Computers,dc=contoso,dc=com鈥 | select Name

Since we have our list of servers, we can start querying them. We have to load Microsoft.SqlServer.SMO (Server Management Objects) namespace first. Note that SMO does not support SQL Server version below 2000, but they should be pretty rare at the moment.

After we have the namespace ready, we will loop through all the servers. For each of the servers, we鈥檒l create an instance of Microsoft.SqlServer.Management.Smo.Server class, and then we鈥檒l look into Databases collection. To keep the output nicely formatted, we鈥檒l store the results in temporary variable $out, which is displayed using Out-GridView at the end.

$out = @()
[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null
foreach ($server in $serverList)
{
	$s = New-Object ('Microsoft.SqlServer.Management.Smo.Server') $server
	$dbs=$s.Databases
	$dbs | select Name, CompatibilityLevel, RecoveryModel, AutoClose, AutoShrink, Owner, Size, LogFiles | Add-Member -Name "Server" -Value $server -MemberType NoteProperty -PassThru -Force -OutVariable +out | Out-Null
}
$out | ogv -Title "Database information"

There are more options starting from here, for instance you can list all permissions in all databases (again, refer to MSDN for detailed description):

$out =@()
foreach ($db in $dbs)
{
	$db.EnumObjectPermissions() | where {$_.ObjectSchema -ne "sys" -and $_.ObjectSchema -ne "INFORMATION_SCHEMA"} | select @{Name="dbName";Expression={$db.Name}}, * -OutVariable +out | out-null
}
$out | ogv -Title "Permissions on all objects in all databases"

I hope you find it interesting how you can make a bridge between development and administration 鈥 it鈥檚 a way to administrative stuff programatically. Anyway, I hope you enjoy my first post here and again 鈥 feel free to comment.