Reading and writing multicolumn text files with Perl
Hi folks,
I am still new to Perl and trying to migrate out of Fortran for simple tasks, so if this question sounds silly there is a 99.9% probability that it really is. I already checked out the Perl tutorials at www.perl.com but I could not find an answer for this. I have a multicolumn file like this:
number number number number
number number number number
number number number number
.............. ............... ............... ...............
number number number number
which is the output of a certain program. I want to read each line as four different variables, do some conditional sorting and calculations, and then write the results to a new file. That would be a very simple task in Fortran, like:
read (file) var1, var2, var3, var4
and then
write (file) varX, varY, varZ
Unfortunately I cannot figure out how to do that in Perl. As far as my poor brain can see, I can either put the whole line into a scalar ($variable) or the whole file into an array (@variable) of individual lines. Can you please shed some light on this?
Well, I'm thinking read each line, push each column onto a separate array. Repeat until you have 4 arrays where the first element in each is the top element of each column.
I'm also thinking regular expressions, something like:
(.*)\s+(.*)\s+(.*)\s+(.*)
Basically, the elements in each set of parentheses are now accessible outside of the regexp as $1, $2, $3, and $4. So you could simply put that regexp in a while loop ( while(<FILE>) ), and push each element onto a separate array (@col1, @col2, @col3, @col4).
Tell me what you think.
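The idea above can be sketched roughly as follows. This is only a minimal sketch: it uses \S+ (a run of non-whitespace) in place of .* so that each group captures exactly one field, and parse_columns is a hypothetical helper name, not something from this thread.

```perl
use strict;
use warnings;

# Split lines of four whitespace-separated fields into four column arrays.
# parse_columns is a hypothetical helper name used only for this sketch.
sub parse_columns {
    my @lines = @_;
    my (@col1, @col2, @col3, @col4);
    for my $line (@lines) {
        # \S+ matches a run of non-whitespace, so each group is one field.
        if ($line =~ /^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/) {
            push @col1, $1;
            push @col2, $2;
            push @col3, $3;
            push @col4, $4;
        }
    }
    # Return references so the four arrays stay separate.
    return (\@col1, \@col2, \@col3, \@col4);
}

my ($c1, $c2, $c3, $c4) = parse_columns("1 2 3 4\n", "5 6 7 8\n");
print "first column: @$c1\n";   # first column: 1 5
```

In a real script the lines would come from while (<FILE>) rather than from a literal list.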
__________________
DISTRO=Gentoo
Registered Linux User #388732
Gentoo Linux, 410 GB HD, 1.2 GB RAM, Fluxbox, These are a Few of my Favorite Things
Re: Reading and writing multicolumn text files with Perl
Quote:
Originally Posted by hernandeangelis
read (file) var1, var2, var3, var4
and then
write (file) varX, varY, varZ
Well, going for the simplest approach, the Perl equivalent of the read would be:
Code:
while (<>)
{
my ($var1, $var2, $var3, $var4) = split;
}
You'll load the 4 variables for each line (the <> reads a line, the while loop ensures you'll suck in the whole file), so you'll need to process and write results within the loop. The writing can be done by
Code:
print FH "$varX $varY $varZ\n";
where FH is a previously opened filehandle. Note the absence of a comma separator after the FH value.
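Put together, the read and the write might look something like the sketch below. It assumes whitespace-separated columns, and the values assigned to $varX, $varY and $varZ are placeholders only, since the thread does not say what they should be. copy_columns is a hypothetical helper name, and the in-memory filehandles are just there to keep the example self-contained.

```perl
use strict;
use warnings;

# Read four whitespace-separated values per line from $in and write three
# values per line to $out. The arithmetic is a placeholder for this sketch.
sub copy_columns {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        my ($var1, $var2, $var3, $var4) = split ' ', $line;
        next unless defined $var4;           # skip blank or short lines
        my ($varX, $varY, $varZ) = ($var1 + $var2, $var3, $var4);
        print {$out} "$varX $varY $varZ\n";  # no comma after the filehandle
    }
}

# Demo on in-memory filehandles; a real script would open files instead.
my $input  = "1 2 3 4\n5 6 7 8\n";
my $output = '';
open(my $in,  '<', \$input)  or die "cannot open input: $!";
open(my $out, '>', \$output) or die "cannot open output: $!";
copy_columns($in, $out);
close $in;
close $out;
print $output;
```

The split ' ' form splits on any run of whitespace and ignores leading whitespace, which is usually what you want for column output from another program.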
OK! Many thanks Cabhan and Steve!!! However, I had trouble implementing your suggestions. Either I am very stupid or Perl is not the best tool for what I want to do. I tried Cabhan's as well as Steve's solution. The programs and results/errors look like this:
1. Cabhan's solution
The program:
open (INFILE, "xxxx");
open (OUTFILE, ">yyyy");
while (<INFILE>)
{
# OBS: my xxxx file contains actually 9 columns
/((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)+((.*)\s)/;
$xf = $1 + $6;
$yf = $2 + $7;
print OUTFILE $1, $2, $xf, $yf, "\n";
}
close INFILE;
close OUTFILE;
The output is garbage more or less like this:
64 64 2.351 7.120 1 1.652 1.673 64 64 2.351 7.120 1 1.652 1.6736464
64 89 2.712 6.000 1 2.079 1.741 64 89 2.712 6.000 1 2.079 1.7416464
.................. (up to EOF)
where the first 7 numbers are those present in INFILE. The other 2 were apparently not loaded.
2. Steve's solution
The program is:
open (INFILE, "imcorr.out");
open (OUTFILE, ">imcorr.gmt");
while (<INFILE>)
{
# OBS: my xxxx file contains actually 9 columns
(v1, v2, v3, v4, v5, v6, v7, v8, v9) = $_;
$xf = $v1 + $v6;
$yf = $v2 + $v7;
print OUTFILE $v1, $v2, $xf, $yf, "\n";
}
close INFILE;
close OUTFILE;
And the output is the error message at the konsole:
Can't modify constant item in list assignment at ./im2gmt line 20, near "$_;"
Execution of ./im2gmt aborted due to compilation errors.
Well, I gave up for today, tomorrow will see. Thanks anyway guys!
And for mine, the only thing I can maybe think of is eliminating the last "\s". The way you have it, the line will only match if there is a space at the end.
OK guys! Thanks for your patience. I'll tell you what I did.
As santaslittlehelper said, awk was made for this kind of purpose. However, the problem is that my file involves both integers and floating point numbers. Here is a sample of my original file:
I want to read the 9 values, then if ($5 == 1) compute ($xf = $1+$6 and $xy = $2+$7), and finally (print $1, $2, $xf, $xy), which gives a four-column file that is used by another program.
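For comparison, the same steps could be sketched in Perl with split roughly like this. convert_line is a hypothetical helper name, and the last two values in the demo line are made-up placeholders, because only the first seven columns of the real data appear in this thread. Note that awk's $5 corresponds to $f[4] here, since Perl arrays start at 0.

```perl
use strict;
use warnings;

# Turn one nine-column line into the four-column output line, or return
# nothing when the line should be skipped.
sub convert_line {
    my ($line) = @_;
    my @f = split ' ', $line;
    return unless @f >= 9;       # assumption: ignore short lines
    return unless $f[4] == 1;    # awk's $5 is $f[4] in Perl
    my $xf = $f[0] + $f[5];      # $1 + $6
    my $xy = $f[1] + $f[6];      # $2 + $7
    return "$f[0] $f[1] $xf $xy";
}

# The trailing "0 0" are placeholder values for the two unknown columns.
my $result = convert_line("64 64 2.351 7.120 1 1.652 1.673 0 0");
print "$result\n";   # 64 64 65.652 65.673
```

In a full script this would be called once per line inside while (<INFILE>), printing each defined result to OUTFILE.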
I tried your advice today, and what I can say is that Steve's solution did not work in my case (I cannot rule out my inexperience). My awk solution worked, as did Cabhan's, but both produced the following file:
64 64 65 65
64 89 66 90
64 114 65 115
64 139 65 141
64 164 65 165
That is, NO FLOATING POINT values at xf and xy !!! This is something that I do not understand in the case of awk. Perl is still a big question mark for me.
Well, I am stuck for today guys. Thanks for your help, if you still want to waste your time on this.
I forgot one minor detail in the regexp, which is that .* will match ANYTHING. Also sadly, \d will not match a ., so decimals don't work. So I made my own character class, [0-9.], and used that. So my regexp line looks like:
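The line itself is not shown here. As a rough illustration only, a pattern built on that [0-9.] class for nine whitespace-separated columns could be sketched like this. The names $num, $pattern and nine_fields are hypothetical, and the sketch assumes the data contains no signs or exponents.

```perl
use strict;
use warnings;

# One field: digits and dots only, as in the [0-9.] class described above.
my $num = qr/[0-9.]+/;

# Nine captured fields separated by whitespace, built by repetition so the
# pattern does not have to be typed out nine times.
my $pattern = join '\s+', ("($num)") x 9;

# In list context, returns the nine captured fields, or an empty list if
# the line does not match.
sub nine_fields {
    my ($line) = @_;
    return $line =~ /^\s*$pattern\s*$/;
}

# The trailing "0 0" are placeholder values for the two unknown columns.
my @f = nine_fields("64 64 2.351 7.120 1 1.652 1.673 0 0\n");
print scalar(@f), " fields, sixth is $f[5]\n";   # 9 fields, sixth is 1.652
```

Because the class only allows digits and dots, a field cannot swallow the spaces between columns the way .* does.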
Thanks for the answers. If I execute from the konsole:
gawk 'BEGIN{print 1.1 + 2.2}'
I get
3,3
I have a weird, stupid problem with " , " appearing instead of " . " as the decimal point, and please do not tell me that I have not configured it in the Control Center: I did, but I still did not get it to work. It might have something to do with my Swedish keyboard, but I doubt that this is the cause of the problem.
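The 3,3 suggests that gawk is taking the decimal comma from the locale settings (LC_NUMERIC) rather than from the keyboard layout. If so, that would likely also explain the integer results earlier, since a value such as 1.652 would be read only up to the dot. Running gawk with LC_ALL=C in the environment is one thing worth trying. As far as I know, Perl is not affected in the same way unless the script says "use locale", so a sketch like the one below should read and print with a dot. add_fixed is a hypothetical helper name.

```perl
use strict;
use warnings;

# Add two numbers given as strings and format the sum with a fixed number
# of decimals. No "use locale" here, so "." is the decimal separator for
# both parsing and printing.
sub add_fixed {
    my ($x, $y, $decimals) = @_;
    return sprintf("%.${decimals}f", $x + $y);
}

print add_fixed("1.1", "2.2", 1), "\n";
print add_fixed("64", "1.652", 3), "\n";
```

Using sprintf also fixes the number of decimals in the output file, which can help if the program reading it expects a fixed format.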