Find the answer to your Linux question:
Results 1 to 5 of 5
Hello, I have a pipe delmited file with multiple empty fields. I would like to write a sed or awk script put the term NULL in each empty field. Is ...
  1. #1
    Just Joined!
    Join Date
    Feb 2010
    Posts
    3

    sed and/or awk question

    Hello,

    I have a pipe delmited file with multiple empty fields. I would like to write a sed or awk script put the term NULL in each empty field. Is there a way to do this without multiple -e commands for sed?

    Example:
    1|2|||5||7|

    To:
    1|2|NULL|NULL|5|NULL|7|"

    sed -e "c/||/|NULL|/" -e "c/||/|NULL|/" -e "c/||/|NULL|" will get the job done, but is ugly. ...and I don't know how many columns I might have some day.

    Is there an easier way to do this?

    Thanks in advance.

  2. #2
    Trusted Penguin Cabhan's Avatar
    Join Date
    Jan 2005
    Location
    Seattle, WA, USA
    Posts
    3,230
    You have the right idea. Unfortunately, it's a bit tough.

    The simple fix is to try this instead:
    Code:
    sed -e 's/||/|NULL|/g'
    See that 'g' on the end? That means that if a substitution was performed, try to substitute again starting at the point just after the previous match. That bold part is key: it means that this approach almost works, but not quite. Observe:
    Code:
    alex@alex-laptop:~$ echo "1|2|||5|7" | sed -e 's/||/|NULL|/g'
    1|2|NULL||5|7
    Note that it does not correctly work with the case where two empty fields were next to each other.

    Sadly, instead, we need to use some of sed's more...confusing features. In particular, if we substitute for an empty field, we want to run the check again on the entire new line. Fortunately, sed does expect that people want to do this, and includes a very convenient command to do it:
    Code:
    sed -e ': repeat; s/||/|NULL|/; t repeat'
    This is kind of fancy (and it is my first time ever using these commands myself). Here's how it works:

    : repeat - this creates a label called repeat at the beginning of our matching code
    s/||/|NULL|/ - this is fairly obvious: it's the match that we want to do
    t repeat - here's the exciting part. What this does is that if a substitution has happened on the current line of input, go to the given label. Which means that if on the current line, we made a substitution, try and run the substitution again.

    And does it work?
    Code:
    echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
    1|2|NULL|NULL|5|NULL|7|
    It does.

    Does this help?
    DISTRO=Arch
    Registered Linux User #388732

  3. #3
    Just Joined!
    Join Date
    Feb 2010
    Posts
    3
    Works perfectly. Fast too. I extended String in ruby to add the NULL to the output if empty, but it was significantly slower (around 5x or so) than the sed with many -e options... so I wanted to just edit the stream if at all possible.

    I originally used the g (global), but with no avail, like you mentioned.

    Now that I see it, the t option looks clear as can be, but I made no sense of it before I saw it working.

    Thanks so much. I have used sed thousands of times and never knew about t.

  4. #4
    Linux User
    Join Date
    Nov 2009
    Location
    France
    Posts
    292
    It's indeed a tough one, your solution works fine, except if the first column is empty :

    Code:
    echo "|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
    |2|NULL|NULL|5|NULL|7|
    I think there should be a second pass for the first column explicitly :

    Code:
    echo "|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat' | sed 's/^|/NULL|/'
    NULL|2|NULL|NULL|5|NULL|7|
    0 + 1 = 1 != 2 <> 3 != 4 ...
    Until the camel can pass though the eye of the needle.

  5. #5
    Linux Enthusiast gerard4143's Avatar
    Join Date
    Dec 2007
    Location
    Canada, Prince Edward Island
    Posts
    714
    I tried a solution with awk

    Code:
    #! /usr/bin/awk -f
    
    BEGIN	{
    		FS = "|";
    	}
    
    {
    	for (i = 1; i <= NF; ++i)
    	{
    		if (i != NF)
    		{
    			if ($i == "")
    			{
    				$i = "NULL";
    				printf("%s|", $i);
    			}
    			else
    			{
    				printf("%s|", $i);
    			}
    		}
    		else
    		{
    			printf("\n");
    		}
    	}
    }
    with testdata:
    1|2|||5||7|
    1|2|||5||7|
    1|2|||5||7|

    it produced:
    1|2|NULL|NULL|5|NULL|7|
    1|2|NULL|NULL|5|NULL|7|
    1|2|NULL|NULL|5|NULL|7|
    Make mine Arch Linux

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
...