Identifying Duplicate CSV Headers (Part 1)

CSV files are plain text files, so it is easy to read the first line and examine the headers. If you don’t have a CSV file at hand, this line creates one to play with:

 
PS C:\> Get-Process | Export-Csv -Path $env:temp\test.csv -NoTypeInformation -Encoding UTF8 -UseCulture

PS C:\>  
 

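Now you can analyze its headers. This simple approach tells you whether a CSV file contains duplicate headers, which it should not, because each column needs a unique name. It assumes that the delimiter of your CSV file is a comma. The sample file above was written with -UseCulture, so on systems where the list separator is a different character (a semicolon on German systems, for example), adjust the character you use to split: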

# read only the header line
$headers = Get-Content $env:temp\test.csv -TotalCount 1
# @() guarantees an array, so .Count below is the number of duplicate groups
$duplicates = @($headers.Split(',') | Group-Object -NoElement | Where-Object {$_.Count -ge 2})
if ($duplicates.Count -eq 0)
{
    Write-Host 'You are safe!'
}
else
{
    Write-Warning 'There are duplicate columns in your CSV file:'
    $duplicates
}

The result is (as expected):

 
You are safe!

PS C:\>
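
The simple split above breaks when a quoted header itself contains the delimiter, and it leaves the quotes in the header names. The following is a minimal sketch of a more tolerant check, not a full CSV parser. The function name Get-DuplicateCsvHeader, the sample file name, and the sample headers are made up for illustration. The delimiter defaults to the list separator of the current culture, which is the character -UseCulture writes.

```powershell
# Hypothetical helper; the name Get-DuplicateCsvHeader is made up for this sketch
function Get-DuplicateCsvHeader
{
    param
    (
        [Parameter(Mandatory=$true)]
        [string]$Path,

        # default: list separator of the current culture (what -UseCulture writes)
        [string]$Delimiter = (Get-Culture).TextInfo.ListSeparator
    )

    # read only the header line
    $headerLine = Get-Content -Path $Path -TotalCount 1

    # match only delimiters followed by an even number of quotes,
    # that is, delimiters outside of a quoted field
    $pattern = [regex]::Escape($Delimiter) + '(?=(?:[^"]*"[^"]*")*[^"]*$)'

    # split, drop surrounding whitespace and quotes, then group
    # (Group-Object compares strings case-insensitively by default)
    [regex]::Split($headerLine, $pattern) |
        ForEach-Object { $_.Trim().Trim('"') } |
        Group-Object -NoElement |
        Where-Object { $_.Count -ge 2 }
}

# made-up sample file: "Status" appears twice, and one header contains a comma
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'headertest.csv'
'"Name","Status","Status, detail","status"' | Set-Content -Path $sample -Encoding UTF8

# @() guarantees an array, so .Count works for zero, one, or many groups
$found = @(Get-DuplicateCsvHeader -Path $sample -Delimiter ',')
$found
```

In this sample, "Status" and "status" fall into the same group because the comparison ignores case, while "Status, detail" stays in one piece and is not reported.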
 

If you are wondering when on earth you’d stumble across duplicate headers, try this:

 
PS C:\> driverquery /V /FO CSV | Set-Content -Path $env:temp\test.csv -Encoding UTF8
 

If you run this on a German system and then rerun the duplicate check from above, the report will look like this:

 
WARNUNG: There are duplicate columns in your CSV file:

Count Name
----- ----
    2 "Status"
 

Apparently, while localizing, Microsoft translated “State” and “Status” both to the German word “Status”, producing duplicate column headers.
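
On a system with a different language, driverquery will probably not produce this collision. To see the warning anyway, here is a small sketch that writes a made-up CSV file with two "Status" columns and runs the same check against it. The file name and its content are assumptions for illustration only.

```powershell
# made-up sample file that mimics the duplicate "Status" headers
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'duplicates.csv'
'"Name","Status","Status"', '"disk","Running","OK"' | Set-Content -Path $path -Encoding UTF8

# same check as before, comma-delimited
$headers = Get-Content -Path $path -TotalCount 1
$duplicates = @($headers.Split(',') | Group-Object -NoElement | Where-Object { $_.Count -ge 2 })

if ($duplicates.Count -eq 0)
{
    Write-Host 'You are safe!'
}
else
{
    Write-Warning 'There are duplicate columns in your CSV file:'
    $duplicates
}
```

The reported name still includes the quotes, because Split() only cuts at the delimiter and does not strip them.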
