Wednesday, 15 April 2015

Handling Binary Content Using PowerShell

Yesterday I was testing a PDF IFilter on my development machine, and I needed a quick way to get a PDF file into a VARBINARY(MAX) column. There are probably ways to do this using SSIS and BCP but as I only needed to do it as a one-off I reached for my favourite tool to do miscellaneous tasks: PowerShell, the Swiss Army chainsaw of Windows Maintenance tasks.
The problem:
  1. I have a 35KB PDF that I want to import into a SQL table.
  2. I can append the content to the column using the SQL .WRITE statement, like this:

    UPDATE [Documents] SET [Document] .WRITE(0x1267A2B3, NULL, 0) WHERE NumId = 3
    

  3. In order to do that I need to get the bytes of the file into a 0xA63410..... format.
The solution:

PS C:\Users\Alex\Documents> (Get-Item .\BlankAnnualReturn.pdf).OpenRead() | %{$buffer = [System.Linq.Enumerable]::ToArray( [System.Linq.Enumerable]::Repeat( [Convert]::ToByte(0), 8040));$bytesRead = $_.Read($buffer, 0, $buffer.Length); while ($bytesRead -gt 0) {$output = (New-Object System.Text.StringBuilder).Append("0x"); for($i = 0; $i -lt $bytesRead; $i = $ i + 1){[void] $output.AppendFormat("{0:x2}",$buffer[$i])};$output.ToString();$bytesRead = $_.Read($buffer,0,$buffer.Length);}$_.Close();}
That gives me the content of .\BlankAnnualReturn.pdf in 8040 byte blocks (the optimum size for the SQL .WRITE statement) in 0x... format.
I'll just point out my favourite bits of this:
  1. The easiest way to create an empty byte array in PowerShell is using the Linq Repeat method.  In C# this would be:
    byte[] buffer = Enumerable.Repeat((byte)0, 8040).ToArray();
    

    But in PowerShell we don't have access to the compiler tricks that make extension methods work, so we have to call these methods as if they were normal static methods:
    $buffer = [System.Linq.Enumerable]::ToArray([System.Linq.Enumerable]::Repeat([Convert]::ToByte(0), 8040));

  2. Formatting a byte as two digit hex.
    Dead simple this bit, just:
    [void] $output.AppendFormat("{0:x2}",$buffer[$i]);

    The cast to [void] is needed to prevent the StringBuilder from being echoed to the console after each append.

Single line PowerShell rocks.

No comments:

Post a Comment