Identifying Duplicate CSV Headers (Part 2)

When a CSV file contains duplicate column headers, it cannot be imported. In the previous tip we illustrated how you can detect duplicate column headers in a CSV file. Now here is an approach that auto-corrects duplicate entries.

First, you need a CSV file with duplicate column headers. On German systems, for example, you could create one like this:

 
PS C:\> driverquery /V /FO CSV | Set-Content -Path $env:temp\test.csv -Encoding UTF8
 

Simply open the file and check if it has indeed duplicate entries.

 
PS C:\> notepad $env:temp\test.csv
 

If not, rename some headers to produce duplicates, then save the file.

You are now unable to import the CSV file using Import-Csv:

 
PS C:\>  Import-Csv -Path $env:temp\test.csv -Delimiter ','
Import-Csv : Element  "Status" is present already.
In Zeile:1 Zeichen:1
+ Import-Csv -Path $env:temp\test.csv  -Delimiter ','
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    +  CategoryInfo          : NotSpecified: (:)  [Import-Csv], ExtendedTypeSystemException
    +  FullyQualifiedErrorId :  AlreadyPresentPSMemberInfoInternalCollectionAdd,Microsoft.PowerShell.Commands.ImportCsvCommand
 

Here is a new function called Import-CsvWithDuplicate that can deal with duplicate entries automatically:

function Import-CsvWithDuplicate($Path, $Delimiter=',', $Encoding='UTF8')
{
    # get the header line and all header items
    $headerLine = Get-Content $Path | Select-Object -First 1
    $headers = $headerLine.Split($Delimiter) 

    # check for duplicate header names, and if found, add an incremented
    # number to it
    $dupDict = @{}
    $newHeaders = @(foreach($header in $headers)
    {
        $incrementor = 1
        $header = $header.Trim('"')
        $newheader = $header

        # increment numbers until the new name is unique
        while ($dupDict.ContainsKey($newheader) -eq $true)
        {
            $newheader = "$header$incrementor"
            $incrementor++
        } 
        
        $dupDict.Add($newheader, $header)

        # return the new header, producing a string array
        $newheader
    })

    # read the CSV without its own headers..
    Get-Content -Path $Path -Encoding $Encoding | 
      Select-Object -Skip 1 |
      # ..and replace headers with newly created list
      ConvertFrom-CSV -Delimiter $Delimiter -Header $newHeaders
}

With it, you can safely import CSVs with duplicate header names:

 
PS C:\> Import-CsvWithDuplicate -Path $env:temp\test.csv -Delimiter ','


Modulname                  : 1394ohci
Anzeigename                : OHCI-konformer 1394-Hostcontroller
Beschreibung               : OHCI-konformer 1394-Hostcontroller
Treibertyp                 : Kernel 
Startmodus                 : Manual
Status                     : Stopped
Status1                    : OK
Beenden annehmen           : FALSE
Anhalten annehmen          : FALSE
Ausgelagerter Pool (Bytes) : 4.096
Code(Bytes)                : 204.800
BSS(Bytes)                 : 0
Linkdatum                  : 16.07.2016 04:21:36
Pfad                       : C:\WINDOWS\system32\drivers\1394ohci.sys
Init(Bytes)                : 4.096

Modulname                  : 3ware
Anzeigename                : 3ware
Beschreibung               : 3ware
Treibertyp                 : Kernel 
Startmodus                 : Manual
Status                     : Stopped
Status1                    : OK
Beenden annehmen           : FALSE
Anhalten annehmen          : FALSE
Ausgelagerter Pool (Bytes) : 0 
(...)
 

As you see, the function automatically renamed the second instance of “Status” to “Status1”.

Twitter This Tip! ReTweet this Tip!