Split Big Text File based on Specific Line Item Details

I would like to know whether the following is possible in PowerShell.

I have a large text file (sample listed below). I need to be able to read each line and output it to a new file. The key is to start a new output file every time I run into a specific string (in this case SSNO). I have a PowerShell script that can output to a new file based on a new line, but not based on an element of the line. As a bonus, it would be great if, when writing out the line details, it could add a {TAB} before each BOLDED item. Please let me know if something like this is possible for my unstructured data.

Sample Text File

SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
LAST NM SMITH HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0
FIRST NM JOHN HS STE MI MARITAL DC 3 / CRED ERN 0.0
MIDDLE NM ALLEN GRAD DT 06/88 CITIZEN Y DC 4 / HONOR PTS 0.00
MAIDEN NM HS CODE 63/020 COLL 1 CRED 0 CC 1 / *** GPA 0.000
PHONE 555-8251 HS RANK **** COLL 2 CRED 0 CC 2 / PREV SESS 891
LOCAL ADDRESS HS CLNO **** COLL 3 CRED 0 CC 3 / CRED REG 0.0
SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
LAST NM HITE HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0
FIRST NM WILLIAM HS STE MI MARITAL DC 3 / CRED ERN 0.0
MIDDLE NM ALLEN GRAD DT 06/88 CITIZEN Y DC 4 / HONOR PTS 0.00
MAIDEN NM HS CODE 63/020 COLL 1 CRED 0 CC 1 / *** GPA 0.000
PHONE 333-4444 HS RANK **** COLL 2 CRED 0 CC 2 / PREV SESS 891
LOCAL ADDRESS HS CLNO **** COLL 3 CRED 0 CC 3 / CRED REG 0.0
STREET HS GPA *** COLL 4 CRED 0 CA 1 / PREV EPS TRANS NO 0010
3213 THOMAS AVE HS TRAN 0 COLL TR 0 CA 2 / CURR EPS PREV DG
CITY BERKLEY BRTH DT 12/06/69 SES APP 891 CA 3 / ACT RECD 0 APP FEE 0
STATE MI RES CNY COG STY P CNS 0 ENGL PL 0 MATH PL 0
ZIP CODE 48072-1164 BTH CNY CUR CMP SE PREV GRAD ACT SCR - - - - -
COUNTY E-PHONE 588-5091 CURRIC EME FAC 0 SBR 2 GED SCR HIGH GRADE
RESIDENCY A NUM HST 01 ADM ST P VBR HEALTH FRM 0 FIRST TRAN
RECR COMM MACRAO FIN YR 0.00 VCN NLN ARITH PROF AT EMPL
DIR ADM ASSET 402015****01 OSA GSL/SDL ELIGIBLE ETC
POP CODES ESL STUDENT ESL CONVERSATION ESL GRAMMAR ESL WRITING MTELP
ENROLLMENT SURVEY DATA: QUES1 QUES2 QUES3 QUES4
QUES5 QUES7 , , , , , , , , QUES8 , , , , , , , ,
QUES9 , , , , , , , INTENT
FIN AID CRDS ATT 3.0 FIN AID GPA 0.000 PROG CRED ERN 0.0 PROG QUOTIENT 0
HS BAND OCC BAND
MATH ID: MATH SC:
TRIG ID: TRIG SC:


  • It's best to show your code, the errors or conditions you are hitting, the sample output you are getting, and the output you are expecting, in order to get the clearest answers. If you do not, you force folks to guess. There are a number of ways to get what you are after, and there are tons of examples of how to do it all over the web. I'll just show you one way.

    Yet, this is all basic object / string parsing, and you can study up on it to wrap your head around it all.

    Ideally, for your query, you should first read the text in as a CSV file, yet the issue with what you show is that whatever is outputting this file is spitting out both space-delimited and comma-delimited lines (I assume because there is no data in a column).

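    Furthermore, there is the issue of special characters in the data. The forward slashes are not special in PowerShell strings or in .NET regular expressions, but characters such as * and . are regex metacharacters, and escaping them may be needed if you are matching for strings that contain them.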
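
    As a small sketch of the escaping step: `[regex]::Escape()` backslash-escapes regex metacharacters in a literal string, so the result can be used safely as a pattern. The literal below is copied from the sample data; the variable names are only illustrative.

    ```powershell
    # Literal text from the sample data; '*' is a regex quantifier.
    $literal = '/ *** GPA'

    # Escape() backslash-escapes metacharacters (and white space) in the literal.
    $pattern = [regex]::Escape($literal)

    $line = 'MAIDEN NM HS CODE 63/020 COLL 1 CRED 0 CC 1 / *** GPA 0.000'

    # -match treats its right-hand side as a regular expression.
    $line -match $pattern                                            # True

    # Select-String -SimpleMatch skips regex interpretation altogether.
    [bool]($line | Select-String -SimpleMatch -Pattern $literal)     # True
    ```

    Used unescaped, the same literal would be parsed as a regular expression rather than matched character for character.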

    All in all, you should be using a RegEx string match to select the string that you are after and the line it is on.
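
    One way to get both the matching string and the line it is on is `Select-String`, which returns objects carrying a `LineNumber` and the full `Line`. This is only a sketch on a shortened copy of the sample data; the `$lines` and `$hits` names are illustrative.

    ```powershell
    # Four lines taken from the sample data, one array element per line.
    $lines = @(
        'SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88'
        'LAST NM SMITH HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0'
        'SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88'
        'LAST NM HITE HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0'
    )

    # '^SSNO ' anchors the match to the start of a line.
    $hits = $lines | Select-String -Pattern '^SSNO '

    # Each hit reports the 1-based position of the input line and its text.
    $hits | ForEach-Object { '{0}: {1}' -f $_.LineNumber, $_.Line }
    ```

    For a file on disk, `Select-String -Path <file> -Pattern '^SSNO '` reports line numbers in the same way without loading the text into a variable first.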

    PowerShell is object oriented, and you should always be using objects first. If you are coming from a *Nix background, you'll need to stop trying to treat PowerShell that way. You'll just cause yourself undue stress and strain.


    $TextData = @'
    SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    LAST NM SMITH HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0
    FIRST NM JOHN HS STE MI MARITAL DC 3 / CRED ERN 0.0
    MIDDLE NM ALLEN GRAD DT 06/88 CITIZEN Y DC 4 / HONOR PTS 0.00
    MAIDEN NM HS CODE 63/020 COLL 1 CRED 0 CC 1 / *** GPA 0.000
    PHONE 555-8251 HS RANK **** COLL 2 CRED 0 CC 2 / PREV SESS 891
    LOCAL ADDRESS HS CLNO **** COLL 3 CRED 0 CC 3 / CRED REG 0.0
    SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    LAST NM HITE HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0
    FIRST NM WILLIAM HS STE MI MARITAL DC 3 / CRED ERN 0.0
    MIDDLE NM ALLEN GRAD DT 06/88 CITIZEN Y DC 4 / HONOR PTS 0.00
    MAIDEN NM HS CODE 63/020 COLL 1 CRED 0 CC 1 / *** GPA 0.000
    PHONE 333-4444 HS RANK **** COLL 2 CRED 0 CC 2 / PREV SESS 891
    LOCAL ADDRESS HS CLNO **** COLL 3 CRED 0 CC 3 / CRED REG 0.0
    STREET HS GPA *** COLL 4 CRED 0 CA 1 / PREV EPS TRANS NO 0010
    3213 THOMAS AVE HS TRAN 0 COLL TR 0 CA 2 / CURR EPS PREV DG
    CITY BERKLEY BRTH DT 12/06/69 SES APP 891 CA 3 / ACT RECD 0 APP FEE 0
    STATE MI RES CNY COG STY P CNS 0 ENGL PL 0 MATH PL 0
    ZIP CODE 48072-1164 BTH CNY CUR CMP SE PREV GRAD ACT SCR - - - - -
    COUNTY E-PHONE 588-5091 CURRIC EME FAC 0 SBR 2 GED SCR HIGH GRADE
    RESIDENCY A NUM HST 01 ADM ST P VBR HEALTH FRM 0 FIRST TRAN
    RECR COMM MACRAO FIN YR 0.00 VCN NLN ARITH PROF AT EMPL
    DIR ADM ASSET 402015****01 OSA GSL/SDL ELIGIBLE ETC
    POP CODES ESL STUDENT ESL CONVERSATION ESL GRAMMAR ESL WRITING MTELP
    ENROLLMENT SURVEY DATA: QUES1 QUES2 QUES3 QUES4
    QUES5 QUES7 , , , , , , , , QUES8 , , , , , , , ,
    QUES9 , , , , , , , INTENT
    FIN AID CRDS ATT 3.0 FIN AID GPA 0.000 PROG CRED ERN 0.0 PROG QUOTIENT 0
    HS BAND OCC BAND
    MATH ID: MATH SC:
    TRIG ID: TRIG SC:
    '@

    [regex]::Matches($TextData,'SSNO.*')

    # Results
    <#
    Groups   : {0}
    Success  : True
    Name     : 0
    Captures : {0}
    Index    : 0
    Length   : 68
    Value    : SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88

    Groups   : {0}
    Success  : True
    Name     : 0
    Captures : {0}
    Index    : 429
    Length   : 68
    Value    : SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    #>

    [regex]::Matches($TextData,'SSNO.*').Value

    # Results
    <#
    SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    #>

    # Send to a text file

    <#
    Test for filename, create if not there
    Grab all matching strings
    Remove leading/ending spaces
    Add to the new file.
    #>

    # Only create and fill the file when it does not exist yet
    If (-not (Test-Path -Path "$pwd\SSNO.txt"))
    {
        'Creating new file for new dataset'
        New-Item -Path $pwd -ItemType File -Name 'SSNO.txt' -Force
        # Trim() strips the trailing carriage return that '.*' captures on CRLF text
        Add-Content -Path "$pwd\SSNO.txt" -Value ([regex]::Matches($TextData,'SSNO.*').Value.Trim())
    }

    'Read new file data'
    Get-Content -Path "$pwd\SSNO.txt"

    # Results
    <#
    Mode                LastWriteTime         Length Name
    ----                -------------         ------ ----
    -a----       10/28/2019  10:23 PM              0 SSNO.txt
    Read new file data
    SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88
    #>
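
    The example above only collects the SSNO lines into one file. To start a new file at every SSNO line, as the question asks, something along these lines could work. It is a minimal sketch, not a tested solution for your real data: the function name `Split-RecordFile`, the `record_0001.txt` naming scheme, the temp-folder paths, and the `-Labels` list are all assumptions of mine. The labels stand in for the "bolded" items, since the bold formatting did not survive in the post. It assumes PowerShell 5 or later for `::new()`.

    ```powershell
    function Split-RecordFile {
        param(
            [Parameter(Mandatory)] [string] $Path,
            [Parameter(Mandatory)] [string] $OutputDirectory,
            [string]   $StartPattern = '^SSNO ',   # a line matching this starts a new file
            [string[]] $Labels = @()               # hypothetical field labels to prefix with a tab
        )

        # .NET file APIs need absolute paths, so resolve both ends first.
        $fullPath = (Get-Item -LiteralPath $Path).FullName
        $outDir   = (New-Item -ItemType Directory -Path $OutputDirectory -Force).FullName

        # Build one alternation from the labels, escaping regex metacharacters.
        $labelPattern = $null
        if ($Labels.Count -gt 0) {
            $escaped = $Labels | ForEach-Object { [regex]::Escape($_) }
            $labelPattern = '\b(' + ($escaped -join '|') + ')\b'
        }

        $count  = 0
        $writer = $null
        try {
            # ReadLines streams the file lazily, so a large file is not loaded at once.
            foreach ($line in [System.IO.File]::ReadLines($fullPath)) {
                if ($line -cmatch $StartPattern) {
                    # New record: close the previous file and open the next one.
                    if ($writer) { $writer.Dispose() }
                    $count++
                    $target = Join-Path $outDir ('record_{0:D4}.txt' -f $count)
                    $writer = [System.IO.StreamWriter]::new($target)
                }
                if (-not $writer) { continue }   # skip anything before the first record
                if ($labelPattern) { $line = $line -creplace $labelPattern, "`t`$1" }
                $writer.WriteLine($line)
            }
        }
        finally {
            if ($writer) { $writer.Dispose() }
        }
        $count   # number of files written
    }

    # Demo on a shortened copy of the sample data, written to a temp folder.
    $demoDir  = Join-Path ([System.IO.Path]::GetTempPath()) 'ssno-split-demo'
    New-Item -ItemType Directory -Path $demoDir -Force | Out-Null
    $demoFile = Join-Path $demoDir 'input.txt'
    Set-Content -Path $demoFile -Value @(
        'SSNO 111-22-3333 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88'
        'LAST NM SMITH HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0'
        'SSNO 444-66-7777 HS NAME BERKLEY VETERAN N DC 1 / APP DATE 12/13/88'
        'LAST NM HITE HS CITY BERKLEY SEX M DC 2 / CRED ATT 3.0'
    )
    $outDir  = Join-Path $demoDir 'out'
    $written = Split-RecordFile -Path $demoFile -OutputDirectory $outDir -Labels 'HS NAME', 'VETERAN', 'APP DATE'
    ```

    Each output file then holds one SSNO line plus everything up to the next SSNO line, and any lines that appear before the first SSNO are skipped. Replace the label list with the field names that are actually bolded in your source, or leave `-Labels` off to copy the lines unchanged.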
