Below is a sample from processing a large file of about 75 K lines with 4 records each (roughly 300,000 records); the final output had around 2 lakh (200,000) records. With awk, a file this large was processed in 2 to 3 minutes. awk rocks!

This is only a sample; you can alter the solution below as per your need.

I had a file like


The requirement was that the output should contain only the records whose first character is a letter of the alphabet. Each pipe-delimited (|) field is considered one record, so
the line in the above example is actually 4 records.
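
To make the "one line is 4 records" point concrete, here is a small sketch. The sample line is made up for illustration (it is not the original data); the only thing it is meant to show is that awk, given `|` as the field separator, sees four fields on such a line.

```shell
# Hypothetical sample line (not the original data): four pipe-delimited records.
line='ABC123|9XYZ|HELLO|42'

# -F'|' sets the field separator before the first line is read;
# NF is the number of fields awk found on the line.
count=$(printf '%s\n' "$line" | awk -F'|' '{ print NF }')
echo "$count"
```

For this made-up line the count printed is 4.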

So the solution I wrote was as follows:

awk -F'|' '{ for (i = 1; i <= 4; i++) if ($i ~ /^[A-Z]/) print $i }' InputFile.txt >> OutputFile.txt
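
Here is a minimal, self-contained sketch of the same filter on a tiny made-up input, so it can be tried before running on a big file. The input lines and the temporary file are assumptions for illustration only. Like the command above, it treats "alphabet" as uppercase A-Z. It sets the separator with `-F` so that it already applies to the first line, and it tests the first character with an anchored regex instead of `substr` with a start index of 0, which can behave differently across awk implementations.

```shell
# Hypothetical input, not the original data: two lines, four records each.
input=$(mktemp)
printf '%s\n' 'ABC123|9XYZ|HELLO|42' '77|WORLD|_tmp|Q1' > "$input"

# LC_ALL=C makes [A-Z] mean ASCII uppercase letters only.
# For each of the first four fields, print it if it starts with A-Z.
result=$(LC_ALL=C awk -F'|' '{
    for (i = 1; i <= 4; i++)
        if ($i ~ /^[A-Z]/)
            print $i
}' "$input")

echo "$result"
rm -f "$input"
```

With this made-up input the output is ABC123, HELLO, WORLD and Q1, one per line. To accept lowercase letters as well, the pattern could be changed to `/^[A-Za-z]/`.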