Identifying Duplicate CSV Headers (Part 1)

CSV files are plain text files, so it is easy to read the first line and examine its headers. If you don’t have a CSV file at hand, here is a one-liner that creates one to play with:

PS C:\> Get-Process | Export-Csv -Path $env:temp\test.csv -NoTypeInformation -Encoding UTF8 -UseCulture

PS C:\>  

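Now you can analyze its headers. This simple approach tells you whether a CSV file contains duplicate headers (which should not be the case). It assumes that the delimiter of your CSV file is a comma. Note that -UseCulture writes the list separator of your current culture, which is a comma on English systems but a semicolon on German systems, for example. If your file uses a different delimiter, adjust the character you split on: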

$headers = Get-Content $env:temp\test.csv | Select-Object -First 1
# @() makes sure .Count is the number of duplicate groups, even for zero or one result
$duplicates = @($headers.Split(',') | Group-Object -NoElement | Where-Object {$_.Count -ge 2})
if ($duplicates.Count -eq 0) {
    Write-Host 'You are safe!'
} else {
    Write-Warning 'There are duplicate columns in your CSV file:'
    $duplicates
}

The result is (as expected):

You are safe!

PS C:\>
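
If you need the same check for files with different delimiters, it can be wrapped into a small function that takes the delimiter as a parameter. The following is only a sketch and not part of the original tip: the function name Get-DuplicateCsvHeader and its parameters are made up for illustration, and it assumes that no header name contains the delimiter itself.

```powershell
# Hypothetical helper: returns the header names that occur more than once
# in the first line of a CSV file.
function Get-DuplicateCsvHeader
{
    param
    (
        [Parameter(Mandatory=$true)]
        [string]$Path,

        # Single character that separates the columns (comma by default).
        [char]$Delimiter = ','
    )

    # Read only the header line instead of the entire file.
    $headerLine = Get-Content -Path $Path -TotalCount 1

    # Split on the delimiter, then strip surrounding whitespace and quotes.
    # Assumption: no header name contains the delimiter itself.
    $names = $headerLine.Split($Delimiter) |
        ForEach-Object { $_.Trim().Trim('"') }

    # Keep only the names that appear at least twice.
    $names |
        Group-Object -NoElement |
        Where-Object { $_.Count -ge 2 } |
        Select-Object -ExpandProperty Name
}
```

To check a file that was written with -UseCulture, you could pass the list separator of the current culture:

Get-DuplicateCsvHeader -Path $env:temp\test.csv -Delimiter (Get-Culture).TextInfo.ListSeparator

The function returns nothing when all headers are unique, so an empty result means you are safe.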

If you are wondering when on earth you’d stumble across duplicate headers, try this:

PS C:\> driverquery /V /FO CSV | Set-Content -Path $env:temp\test.csv -Encoding UTF8

If you run this on a German system, the report will look like this:

WARNUNG: There are duplicate columns in your CSV file:
Count Name
----- ----
    2  "Status"

Apparently, while localizing, Microsoft translated “State” and “Status” both to the German word “Status”, producing duplicate column headers.
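
If you are not on a German system, you can still see the warning by writing a small test file with a repeated header yourself. This is only a sketch: the column names and values below are invented for illustration and are not the real driverquery headers, and the file name duplicate-demo.csv is arbitrary.

```powershell
# Hypothetical sample data with "Status" used twice as a column header.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'duplicate-demo.csv'
@(
    '"Name","Status","Path","Status"'
    '"Driver A","Running","C:\a.sys","OK"'
) | Set-Content -Path $path -Encoding UTF8

# Same check as before: read the header line and group the column names.
$headers = Get-Content -Path $path -TotalCount 1
$duplicates = @($headers.Split(',') |
    Group-Object -NoElement |
    Where-Object { $_.Count -ge 2 })

if ($duplicates.Count -eq 0) {
    Write-Host 'You are safe!'
} else {
    Write-Warning 'There are duplicate columns in your CSV file:'
    $duplicates
}
```

Because the header names are split without removing the quotes, the duplicate is reported with its surrounding quotes, just like in the report above.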
