  1. #1
    Just Joined!
    Join Date
    Nov 2006
    Posts
    16

    Slow C program performance

    Hello mate,

    I'm learning C on Linux and have run into an issue that I could not solve myself. Here is part of my code:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char *wordlist[500000];
    int main()
    {
      FILE *file;
      long i=0;
      char *name = malloc(100);    /* base path of the input directories */
      char *tmp = malloc(100);     /* full path of the file being read */
      char *convert = malloc(20);  /* directory index as text */

      long count=0;
      /* one block of 500000 slots, 30 bytes each */
      wordlist[0]=calloc(500000,30*sizeof(char));
      for (i=1;i<500000;i++)
          wordlist[i]=wordlist[i-1]+30;
      strcpy(name,"/grid/grid21dec07/ProcessOutput/out");
      for (i=0;i<=1000;i++)
      {
        /* build <name><i>/SuccessList.txt */
        strcpy(tmp,name);
        sprintf(convert,"%ld",i);
        strcat(tmp,convert);
        strcat(tmp,"/SuccessList.txt");
        file = fopen(tmp,"r");
        if (file == NULL)
        {
          printf("File %s cannot be opened. Program terminated.\n",tmp);
          return -1;
        }
        while (fgets(wordlist[count],100,file)!=NULL)
        {
          count++;
        }
        fclose(file);
      }
      free(wordlist[0]);
      free(tmp);
      free(convert);
      free(name);
    }
    Basically, my program just reads a lot of files (around 50000), each of which contains several lines. The program reads these lines into the string array wordlist[].
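The path-building step in the code above (strcpy, sprintf, strcat into a 100-byte buffer) could also be written as a single bounded call. This is only a sketch: `build_path` is a hypothetical helper name that does not appear in the original program, and the directory layout is taken from the posted code.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the original program):
   writes "<base><index>/SuccessList.txt" into buf.
   Returns 0 on success, -1 if the result would not fit in buflen bytes. */
int build_path(char *buf, size_t buflen, const char *base, long index)
{
    /* snprintf never writes more than buflen bytes, including the '\0' */
    int n = snprintf(buf, buflen, "%s%ld/SuccessList.txt", base, index);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

Inside the loop this would be called as `build_path(tmp, 100, name, i)`, and a return value of -1 would signal that the buffer is too small instead of silently overrunning it.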

    I'm currently using FC7 with gcc as the compiler. The problem is that when I first ran the program, it was so slow that I had to stop it partway through, review the code for an infinite loop, and then run it again. On the second run, I noticed that everything it had already processed the previous time finished in no time, and then it became slow again.

    After several runs, the program now finishes in almost no time. But if I wait a few hours or restart the machine, the slow performance comes back. I watch top the whole time and see that my program only occupies 0.5% of memory and 1% of CPU, and my Linux system is not running anything heavy at all.

    Could anyone suggest the reason and a possible solution?

    Thank you for reading.

    SG.

  2. #2
    Linux Guru coopstah13
    Join Date
    Nov 2007
    Location
    NH, USA
    Posts
    3,149
    My guess is that it has to do with disk caching. Most of your code is I/O on the hard disk, so once a file has been read, its contents stay in the cache. That is why the program is faster for a little while, and slow again after waiting or restarting: the cache contents have changed by then, so the files have to be read from disk again.
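One way to check the caching explanation is to time the same read twice. The sketch below is an illustration only: `timed_read` is a hypothetical helper, and it assumes a C11 library that provides `timespec_get`.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper, not from the thread: opens and reads the whole file.
   Returns the number of bytes read, or -1 if the file cannot be opened.
   If elapsed_sec is not NULL, it receives the wall-clock time in seconds. */
long timed_read(const char *path, double *elapsed_sec)
{
    struct timespec start, end;
    char buf[4096];
    long total = 0;
    size_t n;
    FILE *fp;

    timespec_get(&start, TIME_UTC);   /* C11 wall-clock timestamp */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)n;
    fclose(fp);
    timespec_get(&end, TIME_UTC);

    if (elapsed_sec != NULL)
        *elapsed_sec = (double)(end.tv_sec - start.tv_sec)
                     + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return total;
}
```

Calling it twice in a row on a file that has not been touched since boot should usually show a much shorter time for the second call, because the data then comes from memory instead of the disk. On Linux, root can empty the cache with `sync; echo 3 > /proc/sys/vm/drop_caches` to repeat the cold measurement.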

  3. #3
    Just Joined!
    Join Date
    Nov 2006
    Posts
    16
    Thank you coopstah13 for pointing it out.

    Do you have any suggestions for me? For example: using a different library for reading files, a method for caching the files before reading them, using another language such as C++ or Java with its own file I/O, or, from another perspective, improving something in the hardware.

    SG.

  4. #4
    Linux Guru coopstah13
    Join Date
    Nov 2007
    Location
    NH, USA
    Posts
    3,149
    I don't know of any library to use personally, but I imagine you could implement something on your own if you can't find anything. I would google around for a while; I doubt that no one has done something like that already. I don't think switching to another language will improve hardware performance, so I would stick with what you have unless you find something in another language that implements a caching mechanism.
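As one possible "caching before reading" mechanism that stays in C, POSIX offers `posix_fadvise`, which lets a program hint that a file will be needed soon. The sketch below is an assumption about how it could be applied here, not something proposed in the thread; `prefetch_file` is a hypothetical name.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes posix_fadvise in <fcntl.h> */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: asks the kernel to start loading the whole file
   into the page cache without waiting for the read to finish.
   Returns 0 on success, -1 if the file cannot be opened, or the positive
   error number reported by posix_fadvise. */
int prefetch_file(const char *path)
{
    int rc;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* offset 0 with length 0 means "from the start to the end of the file" */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    close(fd);
    return rc;
}
```

The idea would be to call `prefetch_file` on the next few paths while the current file is still being processed. With many small files, much of the cost may be in locating and opening each file rather than in reading its data, so the benefit is uncertain and would need to be measured.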

  5. #5
    Linux Engineer wje_lf
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    The first thing to do when trying to nail a performance problem is to correct any outstanding bugs.

    I see a bug.

    wordlist[] is an array of pointers. Each one points to 30 bytes after the previous one, ja?

    Ok. When you use those pointers to indicate where to do your fgets(), you ask fgets() to overwrite up to 100 bytes, not 30. Is this intentional?
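A bounded version of the read loop might look like the sketch below. The constants mirror the sizes in the posted code; `read_lines` is a hypothetical helper name, and the extra check on the slot count is an addition that the original loop does not have.

```c
#include <stdio.h>

#define MAX_LINES 500000   /* capacity of wordlist[] in the posted code */
#define LINE_SIZE 30       /* bytes reserved per slot, including the '\0' */

/* Hypothetical helper: reads lines from fp into slots[*count], slots[*count+1], ...
   Each fgets call is limited to LINE_SIZE bytes, so it cannot write past
   the end of a slot, and the loop stops when all slots are used.
   Returns the number of slots filled by this call. */
long read_lines(FILE *fp, char **slots, long capacity, long *count)
{
    long stored = 0;
    while (*count < capacity && fgets(slots[*count], LINE_SIZE, fp) != NULL) {
        (*count)++;
        stored++;
    }
    return stored;
}
```

In the posted program the inner while loop would become `read_lines(file, wordlist, MAX_LINES, &count)`. A line longer than 29 characters is then split across two or more slots instead of overwriting the neighbouring slot.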
    --
    Bill

    Old age and treachery will overcome youth and skill.

  6. #6
    Just Joined!
    Join Date
    Nov 2006
    Posts
    16
    Well, in fact, every line in all of my files contains only 24 characters, so I could even reduce 30 to as little as 27. The reason is that fgets reads only within a single line of a file: if the line ends before the specified number of characters (e.g. 100) is reached, it stops there. So in this case, even though I specified 100, it actually reads only 24 characters plus the 2 bytes "\r\n", and then appends the conventional terminating byte "\0". That is why it causes no problem here, as long as no line is longer than that.

    But it's good that you mentioned it, because I sometimes make that same mistake.
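The byte count described above (24 characters, "\r\n", and the terminating "\0", 27 bytes in total) can be checked with a short sketch, which also shows one way to drop the line ending after fgets returns. `chomp` is a hypothetical helper name, not something from the posted program.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\r' or '\n',
   which removes a trailing "\n" or "\r\n" left by fgets.
   Returns the length of the remaining text. */
size_t chomp(char *line)
{
    size_t len = strcspn(line, "\r\n");  /* index of first CR or LF, or strlen */
    line[len] = '\0';
    return len;
}
```

For a stored line such as `"ABCDEFGHIJKLMNOPQRSTUVWX\r\n"`, the buffer holds 27 bytes including the terminator, and `chomp` leaves the 24 visible characters.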

    Regards,

    SG.
