- 12-27-2007 #1 Just Joined!
- Join Date
- Nov 2006
- Posts
- 16
Slow C program performance
Hello mate,
I'm learning C in Linux and have just run into an issue which I could not solve myself. Here is part of my code:
Code:
char *wordlist[500000];

int main()
{
    FILE *file;
    long i=0;
    char *tmp = malloc(100);
    char *convert = malloc(20);
    long count=0;

    wordlist[0]=calloc(500000,30*sizeof(char));
    for (i=1;i<500000;i++)
        wordlist[i]=wordlist[i-1]+30;

    strcpy(name,"/grid/grid21dec07/ProcessOutput/out");
    for (i=0;i<=1000;i++)
    {
        strcpy(tmp,name);
        sprintf(convert,"%d",i);
        strcat(tmp,convert);
        strcat(tmp,"/SuccessList.txt");
        file = fopen(tmp,"r");
        if (file == (FILE *)0)
        {
            printf("File %s cannot be opened. Program terminated.",tmp);
            return -1;
        }
        while (fgets(wordlist[count],100,file)!=(char *)NULL)
        {
            count++;
        }
        fclose(file);
    }
    free(wordlist[0]);
    free(line);
    free(tmp);
    free(convert);
    free(name);
}
Basically, my program just reads a lot of files (around 50000), each of which contains several lines. The program reads these lines into the string array wordlist[].
I'm currently using FC7, with gcc as the compiler. The problem is that when I first ran the program, it was so slow that I had to stop it in the middle and review the code to see if there was an infinite loop, then run it again. On the second run, I noticed that all the files it had already gone through the previous time were processed in no time, and then it became slow again.
After several runs, the program now finishes in almost no time. But if I wait a few hours, or restart the machine, the slow performance comes back. I watched top the whole time and saw that my program only occupies 0.5% of memory and 1% of CPU, and my Linux box is not running anything heavy at all.
Could anyone suggest the reason, and a possible solution?
Thank you for reading.
SG.
- 12-27-2007 #2
My guess is that it has to do with disk caching. Most of your code is I/O on the hard disk, so the contents of those files end up stored in the cache. That is why it's faster for a little while, then slow again after waiting or restarting: the cache has changed, so the files have to be read from disk again.
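One way to check the caching explanation is to time the same read twice in a row. The following is only a minimal sketch, assuming a POSIX system with clock_gettime(); the helper name timed_read is made up for illustration and is not part of the original program.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: read a whole file through stdio.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 * The elapsed wall-clock time in seconds is stored in *seconds. */
long timed_read(const char *path, double *seconds)
{
    struct timespec start, end;
    char buf[4096];
    size_t n;
    long total = 0;
    FILE *file;

    clock_gettime(CLOCK_MONOTONIC, &start);
    file = fopen(path, "r");
    if (file == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, file)) > 0)
        total += (long)n;
    fclose(file);
    clock_gettime(CLOCK_MONOTONIC, &end);

    *seconds = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return total;
}
```

If the first call on a file is much slower than an immediate second call, the difference is most likely the page cache. On Linux the cache can be emptied for a repeatable "cold" measurement by running sync and then writing 3 to /proc/sys/vm/drop_caches as root.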
- 12-27-2007 #3 Just Joined!
- Join Date
- Nov 2006
- Posts
- 16
Thank you coopstah13 for pointing it out.
Do you have any suggestions for me? For example: using a different library for reading files, a method for caching the files before reading them, using another language such as C++ or Java with its own file I/O, or, from another perspective, improving something in the hardware.
SG.
- 12-27-2007 #4
I don't know of any library to use personally, but I imagine you could implement something on your own if you can't find anything. I would google around for a while; I doubt that nobody has done something like that already. I don't think switching to another language will improve hardware performance, so I would stick with what you have unless you find something in another language that implements a caching mechanism.
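As one possible "caching before reading" approach that stays in C, the kernel can be asked to start loading a file ahead of time with posix_fadvise(). This is a rough sketch only, assuming Linux or another system that provides posix_fadvise(); the function name prefetch_file is invented here, and since the call is just a hint, it may make little or no difference on a given machine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for posix_fadvise() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: hint to the kernel that the whole file will be
 * needed soon, so it can begin reading it into the page cache.
 * Returns 0 on success, or a nonzero error number on failure. */
int prefetch_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return errno;
    /* offset 0 with length 0 means "from the start to the end of the file" */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    close(fd);
    return rc;
}
```

The idea would be to call this for every SuccessList.txt path in a first pass and only then start the fgets() loop, so that disk requests are queued early instead of one file at a time.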
- 12-27-2007 #5
The first thing to do when trying to nail a performance problem is to correct any outstanding bugs.
I see a bug.
wordlist[] is an array of pointers. Each one points to 30 bytes after the previous one, ja?
Ok. When you use those pointers to indicate where to do your fgets(), you ask fgets() to overwrite up to 100 bytes, not 30. Is this intentional?
--
Bill
Old age and treachery will overcome youth and skill.
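The mismatch described above could be avoided by passing fgets() the real slot size and by stopping once the slots run out. The following is a sketch under the assumption that the slots are laid out as in the original post (30 bytes each, back to back); the name read_lines and the SLOT_SIZE macro are illustrative, not from the original code.

```c
#include <stdio.h>

#define SLOT_SIZE 30   /* bytes per slot, matching the layout in the post */

/* Hypothetical helper: read lines from file into consecutive slots,
 * starting at index count. fgets() is never allowed to write more than
 * SLOT_SIZE bytes into a slot, and reading stops at max_words slots.
 * Returns the updated line count. */
long read_lines(FILE *file, char *wordlist[], long count, long max_words)
{
    while (count < max_words &&
           fgets(wordlist[count], SLOT_SIZE, file) != NULL)
        count++;
    return count;
}
```

With this limit, a line longer than 29 characters is split across two slots instead of silently running into its neighbour, which makes unexpected input visible rather than corrupting the array.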
- 12-27-2007 #6 Just Joined!
- Join Date
- Nov 2006
- Posts
- 16
Well, in fact, every line in all of my files contains only 24 characters, so I could even reduce 30 to 27. The reason is that fgets reads at most one line of the file: if the line ends before the specified number of characters (e.g. 100) is reached, it stops there. Therefore, even though I have specified 100, it actually reads only 24 characters, plus the 2 bytes "\r\n", and then stores the conventional terminating byte "\0". That is why there is no problem with it.
But it's good that you mentioned it, because I do sometimes make that kind of mistake.
Regards,
SG.
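The 27-byte figure above (24 characters + "\r\n" + "\0") can be checked with a small experiment. This is a minimal sketch, assuming Linux, where "\r\n" is not translated on input; the function name first_line_bytes is made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a file with a 100-byte
 * limit and return how many bytes fgets() stored, counting the
 * terminating '\0'. Returns 0 if the file cannot be opened or is empty. */
size_t first_line_bytes(const char *path)
{
    char buf[100];
    size_t bytes = 0;
    FILE *file = fopen(path, "r");

    if (file == NULL)
        return 0;
    if (fgets(buf, sizeof buf, file) != NULL)
        bytes = strlen(buf) + 1;   /* line content plus the '\0' */
    fclose(file);
    return bytes;
}
```

For a file whose first line is 24 characters followed by "\r\n", this should report 27. The figure only holds while every line really is that short; a longer line would be written past the end of its 30-byte slot.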


