22 Aug

Splitting and Joining Files with PowerShell

Sometimes it is useful to split large files into smaller chunks, for example because a file is bigger than the file size limit of a particular communication or storage medium. Plenty of software will do just that: 7-Zip, WinZip and WinRAR, to name a few.

However, as I usually have my PowerShell profile synced to all my machines, it is an easy task to do in PowerShell. I wrote some PowerShell functions a while ago that split and join files. Here are a few examples of how they are used; the code follows at the bottom:

Split-File -filename .\fileToSplit.dat -outprefix splitFilePrefix -splitSize 2M
Join-File -filename .\splitFilePrefix.001 -outfile CopyOfFileToSplit.dat

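You can specify the split size using the suffixes K, M and G for kilobytes, megabytes and gigabytes respectively (multiples of 1024). A size with no suffix, or with the suffix B, is taken as a number of bytes. If you omit -splitSize, it defaults to 50M.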
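As a rough illustration of how such a size string maps to a byte count, here is a minimal sketch. The helper name ConvertTo-ByteCount is hypothetical and is not one of the functions in this post; it only mirrors the same pattern and the 1024-based multipliers.

```powershell
# Hypothetical helper: converts a size string such as "2M" into a byte count.
function ConvertTo-ByteCount
{
	param([string] $sizeText)

	# Digits followed by an optional unit letter (B, K, M or G, any case)
	$match = [regex]::Match($sizeText, '^(\d+)([BKMGbkmg]?)$')
	if (-not $match.Success)
	{
		throw ("Unrecognised size '{0}'" -f $sizeText)
	}

	[int64] $count = $match.Groups[1].Value
	switch ($match.Groups[2].Value.ToUpper())
	{
		'K' { $multiplier = 1KB }
		'M' { $multiplier = 1MB }
		'G' { $multiplier = 1GB }
		default { $multiplier = 1 }   # "B" or no suffix means plain bytes
	}

	return $count * $multiplier
}

ConvertTo-ByteCount '2M'    # 2097152
ConvertTo-ByteCount '512'   # 512
```

Throwing on an unrecognised string is a choice made for this sketch; it stops a typo such as "2MB" from being silently treated as a size of zero.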

Note that .NET file methods resolve relative paths against the process's current working directory and not PowerShell's current location. The functions below combine the paths you pass in with (Get-Location).Path, but to avoid confusion and strange behaviour, use absolute paths. If you want to understand more about the difference, I recommend this blog post, which came near the top when I googled for an insightful link: http://www.beefycode.com/post/The-Difference-between-your-Current-Directory-and-your-Current-Location.aspx
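To see the two values side by side, something like the following can be run in a session. It is only a sketch: the relative path is the example name used above, and the last call is one possible way (an assumption on my part, not what the functions in this post do) to build an absolute path from PowerShell's location.

```powershell
# PowerShell's current location (changes with Set-Location / cd)
$location = (Get-Location).Path

# The process's current working directory, which .NET uses for relative paths
$processDirectory = [Environment]::CurrentDirectory

Write-Host ("PowerShell location : {0}" -f $location)
Write-Host ("Process directory   : {0}" -f $processDirectory)

# One way to turn a relative path into an absolute one based on PowerShell's location.
# The file does not need to exist for this call to work.
$relativePath = '.\fileToSplit.dat'
$absolutePath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($relativePath)
Write-Host ("Absolute path       : {0}" -f $absolutePath)
```

If the first two lines print different directories, a relative path handed straight to a .NET method will not end up where you expect.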

Here are the functions:

function Split-File()
{
	param
	(
		[string] $filename = $(throw "file required"),
		[string] $outprefix = $(throw "outprefix required"),
		[string] $splitSize = "50M",
		[switch] $Quiet
	)
	
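	# Parse the split size: digits followed by an optional unit (B, K, M or G)
	$match = [System.Text.RegularExpressions.Regex]::Match($splitSize, "^(\d+)([BKMGbkmg]?)$")
	if ($match.Success -ne $true)
	{
		Write-Host ("Unrecognised split size {0}" -f $splitSize) -ForegroundColor Red
		return
	}
	[int64]$size = $match.Groups[1].Value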
	$sizeUnit = $match.Groups[2].Value.ToUpper()
	$sizeUnitValue = 0
	switch($sizeUnit)
	{
		"K" { $sizeUnitValue = 1024 }
		"M" { $sizeUnitValue = 1048576 }
		"G" { $sizeUnitValue = 1073741824 }
		default { $sizeUnitValue = 1 }
	}
	
	$size = $sizeUnitValue * $size
	
	Write-Host ("Size Split is {0}" -f $size) -ForegroundColor Magenta
	
	$outFilePrefix = [System.IO.Path]::Combine((Get-Location).Path, $outprefix)
	
	$inFileName = [IO.Path]::Combine((Get-Location).Path,$filename)
	
	Write-Host ("Input File full path is {0}" -f $inFileName)
	
	if ([IO.File]::Exists($inFileName) -ne $true)
	{
		Write-Host ("{0} does not exist" -f $inFileName) -ForegroundColor Red
		return
	}
	
	$bufferSize = 1048576
	
	$ifs = [IO.File]::OpenRead($inFileName)
	$ofs = $null
	$buffer = New-Object -typeName byte[] -ArgumentList $bufferSize
	$outFileCounter = 0
	$bytesReadTotal = 0
	
	$bytesRead = 1 #Non zero starting number to ensure loop entry
	while ($bytesRead -gt 0)
	{
		$bytesToRead = [Math]::Min($size-$bytesReadTotal, $bufferSize)
		$bytesRead = $ifs.Read($buffer, 0, $bytesToRead)
		
		if ($bytesRead -ne 0)
		{		
			if ($ofs -eq $null)
			{
				$outFileCounter++
				$ofsName = ("{0}.{1:D3}" -f $outFilePrefix,$outFileCounter)
				# Create (rather than OpenWrite) truncates any existing chunk file of the same name
				$ofs = [IO.File]::Create($ofsName)
				if ($Quiet -ne $true)
				{
					Write-Host ("Created file {0}" -f $ofsName) -ForegroundColor Yellow
				}
			}
			
			$ofs.Write($buffer, 0, $bytesRead)
			$bytesReadTotal += $bytesRead
			
			if ($bytesReadTotal -ge $size)
			{
				$ofs.Close()
				$ofs.Dispose()
				$ofs = $null
				$bytesReadTotal = 0
			}
		}
	}
	
	if ($ofs -ne $null)
	{
		$ofs.Close()
		$ofs.Dispose()
	}
	
	Write-Host "Finished"
	
	$ifs.Close()
	$ifs.Dispose()
}

function Join-File()
{
	param
	(
		[string] $filename = $(throw "filename required"),
		[string] $outfile	= $(throw "out filename required")
	)
	
	# Validate the chunk filename before creating the output file
	$match = [System.Text.RegularExpressions.Regex]::Match([IO.Path]::Combine((Get-Location).Path,$filename), "(.+)\.\d+$")
	if ($match.Success -ne $true)
	{
		Write-Host "Unrecognised filename format" -ForegroundColor Red
		return
	}
	
	# Create (rather than OpenWrite) truncates any existing output file
	$outfilename = [IO.Path]::Combine((Get-Location).Path, $outfile)
	$ofs = [IO.File]::Create($outfilename)
	
	$fileprefix = $match.Groups[1].Value
	$filecounter = 1
	$bufferSize = 1048576
	$buffer = New-Object -TypeName byte[] -ArgumentList $bufferSize
	
	while ([IO.File]::Exists(("{0}.{1:D3}" -f $fileprefix,$filecounter)))
	{
		$ifs = [IO.File]::OpenRead(("{0}.{1:D3}" -f $fileprefix,$filecounter))
		
		$bytesRead = $ifs.Read($buffer, 0, $bufferSize)
		while ($bytesRead -gt 0)
		{
			$ofs.Write($buffer,0,$bytesRead)
			$bytesRead = $ifs.Read($buffer, 0, $bufferSize)
		}		
		
		$ifs.Close()
		$ifs.Dispose()
	
		$filecounter++
	}
	
	$ofs.Close()
	$ofs.Dispose()

	Write-Host ("{0} created" -f $outfilename) -ForegroundColor Yellow
}
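
After joining, it is worth checking that the copy matches the original. Here is a minimal sketch of one way to do that by comparing file hashes. The helper name Test-FileMatch is hypothetical, and it assumes the Get-FileHash cmdlet is available (PowerShell 4 or later). The demonstration uses two temporary files so that it runs on its own.

```powershell
# Hypothetical helper: returns $true when two files have identical contents.
function Test-FileMatch
{
	param([string] $first, [string] $second)

	$firstHash = (Get-FileHash -Path $first -Algorithm SHA256).Hash
	$secondHash = (Get-FileHash -Path $second -Algorithm SHA256).Hash
	return $firstHash -eq $secondHash
}

# Demonstration on two temporary files holding the same bytes
$original = [IO.Path]::GetTempFileName()
$copy = [IO.Path]::GetTempFileName()
[byte[]] $bytes = 1..200
[IO.File]::WriteAllBytes($original, $bytes)
[IO.File]::WriteAllBytes($copy, $bytes)

Test-FileMatch $original $copy

# Tidy up the temporary files
Remove-Item $original, $copy
```

After a split and join like the examples at the top, calling Test-FileMatch with the absolute paths of fileToSplit.dat and CopyOfFileToSplit.dat should return True.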