- 08-13-2009 #1Just Joined!
- Join Date
- Aug 2009
- Posts
- 3
Looking for a better faster bash method to search and replace text within a file
This works but is veeeerrrrry slow. What I'm trying to do is replace each instance of externalId with a different unique ID.
Code:
while read line; do echo $line | sed 's/externalId="[^"]*"/externalId="'USR.`cat /dev/urandom | tr -dc a-z0-9 | head -c32`'"/' >> another.txt; done < 2009-07-21-ruleExport2.xml
Anybody got a better, faster way to do this?
Edit: Let me explain in more detail of what I'm trying to accomplish.
I have an XML file with system generated externalIDs.
I need to replace each externalID with another unique ID.
Each time this code is called, it produces a new random ID (32 lowercase letters and digits):
Code:
cat /dev/urandom | tr -dc a-z0-9 | head -c32
This sed will find an externalID and replace it with a unique ID:
Code:
sed 's/externalId="[^"]*"/externalId="'USR.`cat /dev/urandom | tr -dc a-z0-9 | head -c32`'"/'
So all I need to do is:
- Read in a line of xml
- Interrogate the line using the sed
- if an externalID is found, a unique number is generated and replaces the ID
- Read in the next line of xml
- Interrogate the line using the sed
- if an externalID is found, a unique number is generated and replaces the ID. Since we're making another call to /dev/urandom a new unique ID is generated.
- Rinse and repeat.
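For comparison, the ID-generating pipeline in this post (`cat /dev/urandom | tr -dc a-z0-9 | head -c32`) can be approximated inside a single process, which avoids starting three programs per ID. This is only a sketch: the function name `new_external_id` and its parameters are made up for illustration, and it assumes that a random 32-character string is "unique enough" (nothing checks for collisions, though with 36 possible characters per position there are roughly 6e49 possible values).

```python
import secrets
import string

# Same alphabet as `tr -dc a-z0-9`: lowercase letters plus digits.
ALPHABET = string.ascii_lowercase + string.digits


def new_external_id(length=32, prefix="USR."):
    """Return prefix followed by `length` random characters from ALPHABET.

    Hypothetical helper; the name and defaults are assumptions.
    """
    return prefix + "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # Prints something shaped like USR.<32 random characters>
    print(new_external_id())
```

`secrets` draws from the operating system's random source, so it is the closest standard-library match to reading /dev/urandom.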
Example input text:
Code:
<CommandExecutionCaptureRule id="ID70" oid="-1y2p0ij0tfa3k:-1y2p0iizevuy4" class="rule" name="rpm -qa" externalId="USR.ppi0iut9s6iedixxd1440u62eknlpvx8">
  <Description></Description>
  <Severity>0</Severity>
  <RealTime>false</RealTime>
  <Actions />
  <ElementName>rpm -qa</ElementName>
  <CommandLine>rpm -qa | sort</CommandLine>
  <ExcludePattern></ExcludePattern>
  <ExcludeReplace></ExcludeReplace>
</CommandExecutionCaptureRule>
<RuleGroup id="ID58" oid="-1y2p0ij32e8cq:-1y2p0iiycq5qo" class="rulegroup" name="Crontab">
  <Description></Description>
  <Children>
    <Child refid="ID71" />
  </Children>
</RuleGroup>
<CommandExecutionCaptureRule id="ID71" oid="-1y2p0ij0tfa3k:-1y2p0iiyrpnez" class="rule" name="crontab " externalId="USR.ppi0iut9s6iedixxd1440u62eknlpvx8">
Example output:
Code:
<CommandExecutionCaptureRule id="ID70" oid="-1y2p0ij0tfa3k:-1y2p0iizevuy4" class="rule" name="rpm -qa" externalId="USR.g1a1jzuf8hemu0rdwxnu47hbr58xr3vl">
  <Description></Description>
  <Severity>0</Severity>
  <RealTime>false</RealTime>
  <Actions />
  <ElementName>rpm -qa</ElementName>
  <CommandLine>rpm -qa | sort</CommandLine>
  <ExcludePattern></ExcludePattern>
  <ExcludeReplace></ExcludeReplace>
</CommandExecutionCaptureRule>
<RuleGroup id="ID58" oid="-1y2p0ij32e8cq:-1y2p0iiycq5qo" class="rulegroup" name="Crontab">
  <Description></Description>
  <Children>
    <Child refid="ID71" />
  </Children>
</RuleGroup>
<CommandExecutionCaptureRule id="ID71" oid="-1y2p0ij0tfa3k:-1y2p0iiyrpnez" class="rule" name="crontab " externalId="USR.adc82uzvxapztrbzuifrvmq3a99bprv0">
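Since the whole point is that every externalId ends up different, it can help to check the output file afterwards. A rough sketch of such a check follows; the function name `duplicate_ids` and the shortened sample IDs are invented for illustration and are not part of the real export.

```python
import re
from collections import Counter

# Captures the value between the quotes of each externalId attribute.
ID_PATTERN = re.compile(r'externalId="([^"]*)"')


def duplicate_ids(text):
    """Return the externalId values that occur more than once in text."""
    counts = Counter(ID_PATTERN.findall(text))
    return sorted(value for value, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical, shortened sample: two rules share the same ID.
    sample = (
        '<Rule id="ID70" externalId="USR.aaa">\n'
        '<Rule id="ID71" externalId="USR.aaa">\n'
        '<Rule id="ID72" externalId="USR.bbb">\n'
    )
    print(duplicate_ids(sample))
```

An empty list means no externalId value is repeated.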
- 08-13-2009 #2Linux User
- Join Date
- May 2008
- Location
- NYC, moved from KS & MO
- Posts
- 251
Try running sed a bit differently:
Code:
sed 's/externalId="[^"]*"/externalId="'USR.`cat /dev/urandom | tr -dc a-z0-9 | head -c32`'"/' 2009-07-21-ruleExport2.xml >>another.txt
- 08-14-2009 #3Linux User
- Join Date
- May 2008
- Location
- NYC, moved from KS & MO
- Posts
- 251
In my previous reply I dropped the g at the end of the sed expression. Without it, sed replaces only the first externalId match on each line. Here's the corrected one:
Code:
sed 's/externalId="[^"]*"/externalId="'USR.`cat /dev/urandom | tr -dc a-z0-9 | head -c32`'"/g' 2009-07-21-ruleExport2.xml >>another.txt
- 08-14-2009 #4Just Joined!
- Join Date
- Aug 2009
- Posts
- 3
Thanks secondmouse, but your code will replace all externalIDs in the file with one new ID.
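The reason is that the shell expands the backtick command substitution once, before sed even starts, so sed receives one fixed replacement string and applies it to every match. The same pitfall, and one way around it, can be illustrated in Python. This is a sketch with made-up function names: passing a string to `re.sub` reuses one ID, while passing a callable generates a new ID per match.

```python
import re
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits
PATTERN = re.compile(r'externalId="[^"]*"')


def random_id(length=32):
    """Hypothetical helper: random lowercase letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def replace_with_one_id(text):
    # The replacement string is built once, so every match gets the same ID
    # (this mirrors what the single sed invocation does).
    fixed = 'externalId="USR.%s"' % random_id()
    return PATTERN.sub(fixed, text)


def replace_with_fresh_ids(text):
    # The callable runs once per match, so each match gets its own ID.
    return PATTERN.sub(lambda m: 'externalId="USR.%s"' % random_id(), text)


if __name__ == "__main__":
    text = '<a externalId="old1">\n<b externalId="old2">\n'
    print(replace_with_one_id(text))
    print(replace_with_fresh_ids(text))
```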
- 08-15-2009 #5Linux User
- Join Date
- May 2008
- Location
- NYC, moved from KS & MO
- Posts
- 251
Apologies for not reading the sample input carefully. Try this Python script instead:
Code:
#!/usr/bin/env python
# Replace every externalId value with a fresh random ID, line by line.
import fileinput
import re
import string
import sys
from random import choice

ALPHABET = string.ascii_lowercase + string.digits

def genrndstr(size):
    # Random string of lowercase letters and digits.
    return ''.join(choice(ALPHABET) for _ in range(size))

if __name__ == '__main__':
    # [^"]* stops at the closing quote, so attributes after externalId survive.
    p = re.compile(r'externalId="[^"]*"')
    for line in fileinput.input():
        # The callable is invoked per match, so each match gets its own ID.
        sys.stdout.write(p.sub(lambda m: 'externalId="USR.%s"' % genrndstr(32), line))
Name it sr.py. To run it:
Code:
python sr.py 2009-07-21-ruleExport2.xml >> another.txt
On my system it's almost 20 times as fast as the sed method:
Code:
user@linux ~/sed_r $ time (for i in `seq 1 50`; do while read line; do echo $line | sed 's/externalId="[^"]*"/externalId="'USR.`cat /dev/urandom | tr -dc a-z0-9 | head -c32`'"/' > another.txt; done < 2009-07-21-ruleExport2.xml; done)
real 0m43.030s
user 0m9.249s
sys 0m40.343s
user@linux ~/sed_r $ time (for i in `seq 1 50`; do python sr.py 2009-07-21-ruleExport2.xml > another1.txt; done)
real 0m2.448s
user 0m1.916s
sys 0m0.452s
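One detail to watch when adapting a script like this: how the regex matches the attribute value. In the sample file externalId is the last attribute on its line, so a greedy `.*` and a bounded `[^"]*` behave the same there, but on a line where other attributes follow, the greedy form swallows them. A small demonstration on a hypothetical line (not taken from the real export):

```python
import re

GREEDY = re.compile(r'externalId=".*"')      # runs to the LAST quote on the line
BOUNDED = re.compile(r'externalId="[^"]*"')  # stops at the value's closing quote

# Hypothetical line where externalId is followed by another attribute.
line = '<Rule externalId="USR.old" name="rpm -qa">'

greedy_result = GREEDY.sub('externalId="USR.new"', line)
bounded_result = BOUNDED.sub('externalId="USR.new"', line)

print(greedy_result)   # <Rule externalId="USR.new">
print(bounded_result)  # <Rule externalId="USR.new" name="rpm -qa">
```

The bounded pattern is the same one the original sed command uses.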
- 08-17-2009 #6Just Joined!
- Join Date
- Aug 2009
- Posts
- 3
secondmouse, no apology needed. Your script is extremely fast and it keeps the formatting of the xml.
Thanks
- 08-17-2009 #7Linux User
- Join Date
- May 2008
- Location
- NYC, moved from KS & MO
- Posts
- 251
No problem. Glad it helps

