need-to-do-a-multi-key-sort-on-a-legacy-text-file-that-has-no-delimiters

I had posted the below years ago and used the suggestions given until now.

https://community.idera.com/database-tools/powershell/ask_the_experts/f/powershell_for_windows-12/18039/need-to-do-a-multi-key-sort-on-a-legacy-text-file-that-has-no-delimiters

I understand that Sorting in Get_Content can be slow and that using a list could be much faster.

I saw this: https://stackoverflow.com/questions/32385611/sort-very-large-text-file-in-powershell and the answer by n0rd (accepted answer).

Example records of my data strings in input file:

CUSTDATA1                         1                         DROP    01+0000098400

CUSTDATA1                         1                 PARTIAL DROP

CUSTDATA2                         2                         DROP    01+0000345600

I would like to sort on columns 9-60 followed by columns 69-71.

The below has worked fine until now but runs real slow. Credit to @bobmccoy for getting me over the hump at the time.

Get-Content $IN_EXTRACT | sort {$_[9..60]}, {$_[69..71]} | ForEach-Object {$_.SubString(0)}| Out-File $SORTED_EXTRACT -Encoding ascii

I would like to read into a list and sort the list and finally write it out.

The below is my attempt but cannot get the sort to work as I am unsure of the syntax:

$IN_EXTRACT = $STRATA_DATA + "\file-unsorted.txt"
$SORTED_EXTRACT = $STRATA_DATA + "\file-sorted.txt"

$ls = new-object system.collections.generic.List[string]

$reader = [System.IO.File]::OpenText($IN_EXTRACT)

#READ THE FILE INTO A LIST

try {
while (($line = $reader.ReadLine()) -ne $null)
{
$t = $ls.Add($line)
}
}
finally {
$reader.Close()
}

#SORT THE LIST
#This works and runs real fast, but not sorted by columns. Should I parse the columns while reading into list first??? How??
#$ls.Sort();

#tried all of these but syntax is not correct - any suggestions?
#$ls = $ls | sort-object @{Expression={$_[9..60]}, {$_[69..71]};Ascending=$true}
#$ls.Sort({$_.substring(9,60)}, {$_.substring(69,71)});
#$ls.Sort($ls|%{$_.substring(9,60)}, {$_.substring(69,71)});

#WRITE THE SORTED FILE

try
{
$f = New-Object System.IO.StreamWriter $SORTED_EXTRACT;
foreach ($s in $ls)
{
$f.WriteLine($s);
}
}
finally
{
$f.Close();
}

Parents
No Data
Reply
  • howdy Dennis Soans1 ,

    have you tried converting the fixed-column text into csv-text & then using "ConvertFrom-CSV" to give you an array of standard objects? it would require that you recreate the lines as fixed-column text if you still need that horrible format. [*grin*]

    take care,

    lee

Children
No Data