  1. #1
    Just Joined!
    Join Date
    Jun 2010
    Location
    Southern California
    Posts
    6

    Search w/offset and Replace Script

    I need a script that takes a large text file exported from a database, searches for an expression, and replaces a string at a fixed offset from that expression. I can do it in vi, and can even script it, but it is not efficient. I need to replace an attribute associated with a tag name for about 20 devices, and repeat the process about 100 times. That is 2,000 search-and-replace operations over 100,000 lines.

    Example:
    vi {file}

    {file}(only one line #83939)
    00083939|Global Register 04 Bit12DPU LS13|di_glbl_r04_b12 | | |undefine |000000000000| |00000999|000000000000|00|000000000000|05620|00061 |00000|255|255|255|000|FALSE |00000|011|001|TRUE |00000|002|002|TRAVEL |00000|005|003|INVALID |00000|001|00000|1|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|008|0|0|0|0|0|000 000000000|002|00000|004| 0|05|000000081844|0|0|0|0|0|0|0|0|0|001|TRUE |00000|002|0|0|0|0|000|FALSE |00000|011|0|

    :source! {script}

    {script}
    /di_glbl_r04_b12/s+274
    r3lr1

    {script needed}
    start at line 1
    find di_glbl_r04_b12
    move forward 277 characters
    replace 00000 with 00031
    repeat until end of file

    start at line 1
    find di_glbl_r05_b00
    move forward 277 characters
    replace 00000 with 00031
    repeat until end of file

    ...

    {result}
    00083939|Global Register 04 Bit12DPU LS13|di_glbl_r04_b12 | | |undefine |000000000000| |00000999|000000000000|00|000000000000|05620|00061 |00000|255|255|255|000|FALSE |00000|011|001|TRUE |00000|002|002|TRAVEL |00000|005|003|INVALID |00000|001|00031|1|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|008|0|0|0|0|0|000 000000000|002|00000|004| 0|05|000000081844|0|0|0|0|0|0|0|0|0|001|TRUE |00000|002|0|0|0|0|000|FALSE |00000|011|0|

    I am a day and a half into this and am new to Linux. Any suggestions?
    Thanks for any help I can get!

  2. #2
    Linux Guru Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    2,096
    Would it be possible to make that change in the database?
    Some UPDATE statements for those rows, with a WHERE clause matching di_glbl_r05_b00, di_glbl_r04_b12, etc.?

    Even if a sed/perl/whatever script
    a) can be developed (which I am sure of, I just can't think of a quick solution)
    b) would be faster
    then this script would still be hack-ish: if the data changes just a little bit, then the "move forward 277 bytes" won't work anymore.
    You must always face the curtain with a bow.

  3. #3
    Just Joined!
    Join Date
    Jun 2010
    Location
    Southern California
    Posts
    6
    I can make the changes in the database, but only through a form which is very slow.

    The data will not change because the column width of each cell is fixed. That is the space between each pipe (|). The system is proprietary, so I am kind of stuck with using export and import utilities.

    The problem I found with sed is that I cannot offset from the pattern.

    ====

    So, the database is a .dbd file, but not MySQL. The db processes are dbstart, dbnam, and dbproc when the project is brought online. The export creates .asc files which, I guess, are ASCII flat files. Maybe this will help?

  4. #4
    Linux Enthusiast Kloschüssel's Avatar
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    717
    Use sed with something like this:
    Code:
    sed -E 's/^([^|]*(\|[^|]*){n})\|00000\|/\1|00031|/'
    What follows is an explanation of the dark magic above.

    First of all, select the portion you want:
    ^ := beginning of line
    [^|]* := the first field (everything up to the first pipe)
    (\|[^|]*){n} := exactly n times a "|data" field, where data contains no pipe
    \|00000\| := the search string (the pipe is escaped, since a bare | means alternation with -E)

    Then replace it:
    \1 := backreference to the first part of the line (group 1, the outer parentheses)
    |00031| := the replacement

    PS: untested code snippet, shipped without warranty

  5. #5
    Linux Guru Irithori's Avatar
    Join Date
    May 2009
    Location
    Munich
    Posts
    2,096
    The problem, as I understand it, is that he doesn't want to change all lines.

    Only the ones with a special "id", like di_glbl_r04_b12, di_glbl_r05_b00, etc.
    And these seem to be random, i.e. following no pattern.

  6. #6
    Linux Enthusiast Kloschüssel's Avatar
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    717
    Then he may just need to adapt the part with:
    Code:
    (\|[^|]*){n}
    to match his needs. It's not the final solution, more of a good hint.
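
    One way to limit the substitution to particular tag names is a sed address in front of the s command. The following is only a sketch under assumptions: the helper name patch_tag is made up for illustration, the target is taken to be field 35 (so 33 repetitions follow the first field), and the tag is assumed to be followed by optional padding spaces and a pipe.

```shell
# Hypothetical helper: on lines containing "|<tag><spaces>|", change field 35
# from 00000 to 00031. Lines without the tag pass through untouched.
# Assumes '|' separators and that field 35 holds exactly 00000.
patch_tag() {
    # $1 = tag name; the data is read from stdin
    sed -E "/\|$1 *\|/ s/^([^|]*(\|[^|]*){33})\|00000\|/\1|00031|/"
}

# Possible usage (file names are placeholders):
#   patch_tag di_glbl_r04_b12 < file.asc > newfile.asc
```

    The address does not pin the tag to field 3, so a tag name that also appears as a complete field elsewhere on a line would match as well.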

  7. #7
    Just Joined!
    Join Date
    Jun 2010
    Location
    Southern California
    Posts
    6
    How about using awk? Each row is 86 cells wide. I could search for a match in $3 and replace $35 with 00031 or 00000 based on the $3 match. I just need to "or" the search strings and "print" $1"|"$2"|"..."|"variable"|"$36...

    I am trying it now.
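
    Printing all 86 fields by hand should not be necessary: if awk's output separator is set to the same pipe as the input separator, assigning to $35 rebuilds the line with every other field intact. A minimal sketch, assuming the two tag names from the first post; the function name set_field35 is hypothetical, and the padding is stripped from a copy of $3 so the comparison does not depend on the column width.

```shell
# Sketch: overwrite field 35 for selected tag names, leave other lines alone.
set_field35() {
    awk 'BEGIN { FS = OFS = "|" }          # same separator in and out
         {
             tag = $3                      # work on a copy so $3 keeps its padding
             sub(/ +$/, "", tag)           # drop trailing spaces from the copy
         }
         tag == "di_glbl_r04_b12" || tag == "di_glbl_r05_b00" { $35 = "00031" }
         { print }'
}

# Possible usage (file names are placeholders):
#   set_field35 < file.asc > newfile.asc
```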

  8. #8
    Linux Enthusiast Kloschüssel's Avatar
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    717
    awk can handle it too. In the end, awk is even more powerful than sed (sed is built around regular expressions, while awk ships with a scripting language that can match context-sensitive grammars).

  9. #9
    Just Joined!
    Join Date
    Jun 2010
    Location
    Southern California
    Posts
    6
    Okay, I got awk to work. I am using

    awk -F '|' -f script.sh file.asc > newfile.asc

    The script has an if/else statement.

    if ($3 == "di_globl_r01_b13 " || $3 == "..." || $3 == "..." )
    VAR = "00031"
    else
    VAR = $35

    Now the problem is that a few lines are getting changed that should not be. I only want to change those that have DPU in $2 (the description field), not TPU.

    Does anyone know how to include an && $2 == "*DPU*" test in the if statement? As far as I can tell, the wildcard does not work. It might look like:

    if ($2 == "*DPU*" && ($3 == "di_globl_r01_b13 " || $3 == "..." || $3 == "..."))

  10. #10
    Just Joined!
    Join Date
    Jun 2010
    Location
    Southern California
    Posts
    6
    Does anyone know how to use a wildcard in awk?
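
    For what it is worth, awk's == compares strings literally, so "*DPU*" is never treated as a pattern. The usual substitute is the ~ operator, which tests a field against a regular expression; $2 ~ /DPU/ is true whenever DPU appears anywhere in the description. A sketch under assumptions: the function name patch_dpu and the tag list are illustrative only, and field 35 is the target as described earlier in the thread.

```shell
# Sketch: change field 35 only when the description contains DPU and the
# tag name is one of the listed ones.
patch_dpu() {
    awk 'BEGIN { FS = OFS = "|" }
         {
             tag = $3                      # copy, so $3 keeps its padding
             sub(/ +$/, "", tag)           # strip trailing spaces from the copy
         }
         $2 ~ /DPU/ && (tag == "di_glbl_r04_b12" || tag == "di_glbl_r05_b00") {
             $35 = "00031"
         }
         { print }'
}

# Possible usage (file names are placeholders):
#   patch_dpu < file.asc > newfile.asc
```

    Inside an if statement the same test would read if ($2 ~ /DPU/ && (...)). The negated form is $2 !~ /TPU/.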

...