Results 1 to 5 of 5
Hello,
I have a pipe delmited file with multiple empty fields. I would like to write a sed or awk script put the term NULL in each empty field. Is ...
- 02-23-2010 #1Just Joined!
- Join Date
- Feb 2010
- Posts
- 3
sed and/or awk question
Hello,
I have a pipe delmited file with multiple empty fields. I would like to write a sed or awk script put the term NULL in each empty field. Is there a way to do this without multiple -e commands for sed?
Example:
1|2|||5||7|
To:
1|2|NULL|NULL|5|NULL|7|"
sed -e "c/||/|NULL|/" -e "c/||/|NULL|/" -e "c/||/|NULL|" will get the job done, but is ugly. ...and I don't know how many columns I might have some day.
Is there an easier way to do this?
Thanks in advance.
- 02-23-2010 #2
You have the right idea. Unfortunately, it's a bit tough.
The simple fix is to try this instead:
See that 'g' on the end? That means that if a substitution was performed, try to substitute again starting at the point just after the previous match. That bold part is key: it means that this approach almost works, but not quite. Observe:Code:sed -e 's/||/|NULL|/g'
Note that it does not correctly work with the case where two empty fields were next to each other.Code:alex@alex-laptop:~$ echo "1|2|||5|7" | sed -e 's/||/|NULL|/g' 1|2|NULL||5|7
Sadly, instead, we need to use some of sed's more...confusing features. In particular, if we substitute for an empty field, we want to run the check again on the entire new line. Fortunately, sed does expect that people want to do this, and includes a very convenient command to do it:
This is kind of fancy (and it is my first time ever using these commands myself). Here's how it works:Code:sed -e ': repeat; s/||/|NULL|/; t repeat'
: repeat - this creates a label called repeat at the beginning of our matching code
s/||/|NULL|/ - this is fairly obvious: it's the match that we want to do
t repeat - here's the exciting part. What this does is that if a substitution has happened on the current line of input, go to the given label. Which means that if on the current line, we made a substitution, try and run the substitution again.
And does it work?
It does.Code:echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat' 1|2|NULL|NULL|5|NULL|7|
Does this help?DISTRO=Arch
Registered Linux User #388732
- 02-23-2010 #3Just Joined!
- Join Date
- Feb 2010
- Posts
- 3
Works perfectly. Fast too. I extended String in ruby to add the NULL to the output if empty, but it was significantly slower (around 5x or so) than the sed with many -e options... so I wanted to just edit the stream if at all possible.
I originally used the g (global), but with no avail, like you mentioned.
Now that I see it, the t option looks clear as can be, but I made no sense of it before I saw it working.
Thanks so much. I have used sed thousands of times and never knew about t.
- 02-23-2010 #4Linux User
- Join Date
- Nov 2009
- Location
- France
- Posts
- 292
It's indeed a tough one, your solution works fine, except if the first column is empty :
I think there should be a second pass for the first column explicitly :Code:echo "|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat' |2|NULL|NULL|5|NULL|7|
Code:echo "|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat' | sed 's/^|/NULL|/' NULL|2|NULL|NULL|5|NULL|7|
0 + 1 = 1 != 2 <> 3 != 4 ...
Until the camel can pass though the eye of the needle.
- 02-23-2010 #5
I tried a solution with awk
with testdata:Code:#! /usr/bin/awk -f BEGIN { FS = "|"; } { for (i = 1; i <= NF; ++i) { if (i != NF) { if ($i == "") { $i = "NULL"; printf("%s|", $i); } else { printf("%s|", $i); } } else { printf("\n"); } } }
1|2|||5||7|
1|2|||5||7|
1|2|||5||7|
it produced:
1|2|NULL|NULL|5|NULL|7|
1|2|NULL|NULL|5|NULL|7|
1|2|NULL|NULL|5|NULL|7|Make mine Arch Linux


Reply With Quote