- 07-01-2010 #1 Just Joined!
- Join Date
- Jul 2010
- Posts
- 2
[SOLVED] AWK: Join lines if NF is wrong
Hi,
I have a large CSV file, tab-delimited with CRLF at the end of each line.
Each line should contain 5 fields (i.e. NF == 5). However, there are rogue CRLF characters in the middle of some records, causing those records to be split across two lines.
I want to scan each line, check the field count, and if it's != 5, join that line to the following line.
Example input might be:
Code:
one two three four five
six seven eight nine ten
eleven tw<CRLF>
elve thirteen fourteen fifteen
sixteen seventeen eighteen nineteen twenty
In the example, I want to merge lines 3 and 4 to read:
Code:
eleven twelve thirteen fourteen fifteen
My attempt at the moment is:
Code:
awk 'BEGIN { FS = "\t" } ; { if (NF != 5) {saved=$0;next} {print saved,$0} }'
This doesn't work; I'm getting duplicate records inserted. Any help is appreciated (or suggestions on an easier way to do this using awk, sed, or perl).
- 07-01-2010 #2 Linux Newbie
- Join Date
- Apr 2007
- Posts
- 119
I would imagine that you need to also remove the truncated line after you save it.
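To make that concrete, here is a minimal sketch of the 'next' approach with the saved fragment cleared once it has been used. The sample data, the pending flag and the fixed shell variable are my own illustration and not from the original post. I've also assumed the stray CR at the end of the fragment should be dropped when the two halves are glued together.

```shell
# Illustrative input: tab-delimited, CRLF line endings, third record split in two.
fixed=$(printf 'one\ttwo\tthree\tfour\tfive\r\nsix\tseven\teight\tnine\tten\r\neleven\ttw\r\nelve\tthirteen\tfourteen\tfifteen\r\n' |
awk 'BEGIN { FS = "\t" }
pending {                       # previous line was a fragment: glue this one on
    print saved $0
    pending = 0                 # clear the flag so the fragment is used only once
    next
}
NF != 5 {                       # short line: hold it back instead of printing it
    saved = $0
    sub(/\r$/, "", saved)       # assumption: drop the rogue CR before joining
    pending = 1
    next
}
{ print }                       # well-formed line: pass through unchanged
')
printf '%s\n' "$fixed"
```

The difference from the original attempt is that the fragment is marked as used after it is printed, so it can't be prepended to every later line. Writing print saved $0 instead of print saved,$0 also avoids inserting the output field separator (a space by default) at the point where the record was broken.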
- 07-01-2010 #3 Just Joined!
- Join Date
- Jul 2010
- Posts
- 2
Thanks for the reply. Here's my solution. I ended up using getline instead of 'next'.
Code:
awk 'BEGIN { FS = "\t" } ; { if (NF != 5) {saved=$0;getline;print saved$0} else {print $0} }'
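One thing to watch with the getline version: it assumes a record is only ever split in two. If a record can contain more than one rogue CRLF, a possible variation is to keep appending lines to a buffer until it holds 5 fields. This is only a sketch under my own assumptions: the sample data and the names buf, parts and fixed are made up for illustration, the stray CR is dropped at each join, and a break inside the fifth field can't be detected by counting fields alone.

```shell
# Illustrative input: the second record is split across three lines.
fixed=$(printf 'one\ttwo\tthree\tfour\tfive\r\nsix\tsev\r\nen\teight\tni\r\nne\tten\r\n' |
awk 'BEGIN { FS = "\t" }
{
    buf = buf $0                    # append the current line to the buffer
    if (split(buf, parts, FS) >= 5) {
        print buf                   # buffer now holds a full record
        buf = ""
    } else {
        sub(/\r$/, "", buf)         # assumption: drop the rogue CR, keep reading
    }
}
END { if (buf != "") print buf }    # emit any incomplete trailing fragment as-is
')
printf '%s\n' "$fixed"
```

To run it on a real file, drop the printf and pass the file name to awk instead, redirecting the output to a new file.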


