  1. #1
    Linux Enthusiast
    Join Date
    Jun 2005
    Location
    The Hot Humid South
    Posts
    602

    Reading from file using fscanf

    I'm no programmer, so I'm fairly certain I'm doing something wrong here!

    One of the routines inside my program needs to read a file in and insert each line into an array. The way the files are written makes my life pretty easy (.csv, no spaces), so I decided to use fscanf to read it in. My problem is that, for some reason, fscanf is reading the last line of the file twice and the program seg faults when I try to close the file.

    I pulled the routine out of the program to test it on its own and added some fprintfs to help understand what's going on. Here's a simplified version of it:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main (int argc, const char *argv[])
    {
      char *list[4];
      FILE *inFile;
      char tmp[10];
      size_t stringLength;
      int count = 0;
      int output; 
    
      inFile = fopen (argv[1], "r");
      if (!inFile)
        {
          fprintf (stderr, "Could not open file %s for read!\n", argv[1]);
          exit (EXIT_FAILURE);
        }
    
      while ( 1 )
        {
          char *tmpString;
    
          output = fscanf (inFile, "%s", tmp);
    
          stringLength = strlen (&tmp);
    
          tmpString = (char *) calloc (stringLength, sizeof (char));
    
          strcpy (tmpString, tmp);
          list[count] = tmpString;
    
          fprintf (stdout, "%d %s\n", count, list[count]);
    
          if (output == EOF)
            break;
    
          ++count;
        }
    
      count = 0;
      while ( count != 4)
        {
          printf (" %d %s\n", count, list[count]);
          ++count;
        }
    
      fprintf (stdout, "Closing file\n");
      fclose (inFile);
    
      return EXIT_SUCCESS;
    }
    And here's the output:
    Code:
    0 1234
    1 2345
    2 3456
    3 4567
    3 4567
     0 1234
     1 2345
     2 3456
     3 4567
    Closing file
    Segmentation Fault
    Can anyone shed some light on exactly where I'm going wrong?
    "Today you are freer than ever to do what you want, provided you can pay for it!" --Bad Religion

  2. #2
    Linux Engineer wje_lf
    Join Date
    Sep 2007
    Location
    Mariposa
    Posts
    1,192
    If you use fgets() followed by sscanf() (notice the two s's in the function name), you can then split up the file I/O from the reading into variables, thus helping you track down the problem. That's what I always do.
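
For readers unsure what this looks like in practice, here is a minimal sketch. It assumes each line holds one token with no spaces, at most 9 characters long, as in the sample file from the first post. The name read_tokens, its parameters, and the 256-byte line buffer are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define TOKEN_MAX 10   /* same size as tmp[] in the first post */

/* Hypothetical helper: reads up to max_tokens lines from fp.
   fgets() does the file I/O; sscanf() then parses the line in memory.
   Returns the number of tokens stored in out[]. */
size_t read_tokens (FILE *fp, char out[][TOKEN_MAX], size_t max_tokens)
{
  char line[256];
  size_t count = 0;

  while (count < max_tokens && fgets (line, sizeof line, fp) != NULL)
    {
      /* %9s leaves room for the terminating '\0' in a 10-byte buffer */
      if (sscanf (line, "%9s", out[count]) == 1)
        ++count;               /* blank lines are skipped */
    }
  return count;
}
```

Because the read and the parse are separate calls, a failure can be pinned on one or the other: fgets() returning NULL means the file ran out, while sscanf() returning something other than 1 means the line did not look as expected.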
    --
    Bill

    Old age and treachery will overcome youth and skill.

  3. #3
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    When you read the last line, fscanf doesn't return EOF (you have to try to read beyond the end of the input first), so you go round the loop an extra time before you exit. I can't explain why the fifth diagnostic line repeats the count at 3, because it should be 4(?). On that extra pass the code writes to list[4], one past the end of the four-element array, which can corrupt the FILE variable on the stack and give you the SEGV.

    Better would be to control the while loop as:
    Code:
    while ((output = fscanf (inFile, "%s", tmp)) != EOF)
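
Put together with the rest of the routine, the loop might look like the sketch below. Besides moving fscanf into the loop condition, it makes three changes that are assumptions on my part, not something stated in the post above: the format is "%9s" so a long token cannot overflow tmp, the loop stops once the array is full, and the allocation is one byte larger than strlen() to hold the terminating '\0'. The function name load_list is hypothetical. Testing for == 1 is a small variation on != EOF; with a lone %s conversion the two behave the same.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fills list[] with heap copies of each token in inFile.
   Returns the number of entries stored (never more than capacity). */
size_t load_list (FILE *inFile, char *list[], size_t capacity)
{
  char tmp[10];
  size_t count = 0;

  /* fscanf() returns 1 for one successful assignment and EOF at end of file */
  while (count < capacity && fscanf (inFile, "%9s", tmp) == 1)
    {
      /* +1 for the terminating '\0' that strcpy() writes */
      char *copy = malloc (strlen (tmp) + 1);
      if (copy == NULL)
        break;
      strcpy (copy, tmp);
      list[count++] = copy;
    }
  return count;
}
```

The caller owns the strings and should free() each entry when done.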

  4. #4
    Linux Enthusiast
    Join Date
    Jun 2005
    Location
    The Hot Humid South
    Posts
    602
    Quote Originally Posted by wje_lf
    If you use fgets() followed by sscanf() (notice the two s's in the function name), you can then split up the file I/O from the reading into variables, thus helping you track down the problem. That's what I always do.
    I gotta tell you the truth, I have no idea how to do what you just said there! This is the first time I've tried to use these file reading functions in C (MATLAB is just so much easier, but C is all I have available). If possible, I'd rather stick to the simplest solution possible! The files that I need to read in are fairly simple, so I can make quite a few assumptions.

    Quote Originally Posted by scm
    When you read the last line, fscanf doesn't return EOF (you have to try to read beyond the end of the input first), so you go round the loop an extra time before you exit. I can't explain why the fifth diagnostic line repeats the count at 3, because it should be 4(?). On that extra pass the code writes to list[4], one past the end of the four-element array, which can corrupt the FILE variable on the stack and give you the SEGV.

    Better would be to control the while loop as:
    Code:
    while ((output = fscanf (inFile, "%s", tmp)) != EOF)
    My bad on this one: I typed the output incorrectly. The fifth line actually says:
    Code:
    4 4567
    I originally had the "while" statement like that (without the "output = " part, just the bare fscanf call), but since I was getting errors I moved it out of the condition. Before posting here I had even more fprintf statements.

    EDIT:
    Never mind, putting the fscanf statement back up in the while statement solved it! I don't know what was happening before! Thanks for all the help! I'm pretty happy now! There's a chance what I'm trying to do here will actually work!
    "Today you are freer than ever to do what you want, provided you can pay for it!" --Bad Religion

  5. #5
    Just Joined!
    Join Date
    Jul 2008
    Posts
    9
    This doesn't look right:

    stringLength = strlen (&tmp);

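    ...should be strlen(tmp). tmp is a char array that decays to char * when passed to a function, whereas &tmp has type char (*)[10]. Both point at the same address, so it usually works, but the types don't match and the compiler should warn about it.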

  6. #6
    Trusted Penguin Cabhan
    Join Date
    Jan 2005
    Location
    Seattle, WA, USA
    Posts
    3,230
    If you still care what was happening before:

    fscanf() doesn't actually return whatever value it reads. Instead, it stores that into the variable that you tell it (in your example, "tmp"). It returns the number of successful assignments, or EOF if end-of-file is hit before anything could be read.

    When we read the last line of the file using the above code, the line gets stored into tmp, and 1 is returned, because we did one successful match. We go through and process everything. YAY!

    Now we go through the loop again. We execute fscanf(), and because we're at the end of the file, EOF is returned. However, tmp still has its old value! Which is the last line of the file.

    It is important to remember that EOF is only returned when you try to read AFTER the last byte of the file has already been consumed. Reading the last line successfully does not return EOF, so you need to be careful about the loop running again when you don't expect it to.
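
The sequence described above can be watched directly with a small sketch like this one. The function name trace_fscanf and the "%9s" width limit are my own additions for illustration; the rest mirrors the fscanf call from the first post.

```c
#include <stdio.h>
#include <string.h>

/* Calls fscanf() until it reports EOF and prints what each call returned.
   last[] receives whatever is in the buffer after the EOF call.
   Returns the total number of fscanf() calls made. */
int trace_fscanf (FILE *fp, char last[10])
{
  char tmp[10] = "";
  int calls = 0;
  int output;

  do
    {
      output = fscanf (fp, "%9s", tmp);
      ++calls;
      printf ("call %d: returned %d, tmp = \"%s\"\n", calls, output, tmp);
    }
  while (output != EOF);

  strcpy (last, tmp);   /* the failed call stored nothing, so this is the last token */
  return calls;
}
```

For a file with N lines this makes N + 1 calls: N that return 1, then one that returns EOF while tmp still holds the text of line N. That extra call is where the duplicated line in the first post comes from.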


    As a final note, fscanf() isn't really The Right Tool for The Job (TM): fgets() would be better (it just reads and returns an entire line without doing all sorts of fun input pattern matching). Just so you know.
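
A minimal sketch of the fgets() approach, for comparison. The helper name read_line is hypothetical, and stripping the line ending with strcspn() is my own choice, since fgets() keeps the newline in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and strips the line ending.
   Returns 1 if a line was read, 0 at end-of-file or on a read error. */
int read_line (FILE *fp, char *buf, size_t size)
{
  if (fgets (buf, (int) size, fp) == NULL)
    return 0;
  buf[strcspn (buf, "\r\n")] = '\0';   /* drop "\n" or "\r\n" */
  return 1;
}
```

A caller would loop with while (read_line (inFile, buf, sizeof buf)) and process buf inside the body, so the end-of-file test always happens before the data is used.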
    DISTRO=Arch
    Registered Linux User #388732

  7. #7
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    The only thing really wrong with the code in your first post is that the test for EOF and the break should be put immediately after the fscanf() line, not after you've processed the data you think you've read in. Does that help you make sense of it?

  8. #8
    Linux Enthusiast
    Join Date
    Jun 2005
    Location
    The Hot Humid South
    Posts
    602
    Does it show that I don't really know how to program? I was actually pretty happy this morning when I applied this to my code at work and it worked.

    Thanks for all the explanations, I see what I was doing wrong before! Now I understand exactly what was happening, and why.

    On a related note, for some reason the code can't seem to get past line 100 or so! The 2 files I need to read in have 4886 and 790 entries, so I do understand they are pretty big. However, whenever they get to iteration 120-150, calloc freezes and eventually crashes the program. I'm doing all this in Windows XP with MinGW, so I'm not sure if that has something to do with it. The weird part is that opening up Task Manager shows the program not putting any load on the processor, and the memory is steady at 2,300K while it's waiting for calloc to return (which it never does). It works perfectly if I only give it a file with < 100 entries, so I don't think it's a code issue.
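
One possible explanation, offered only as a guess since the full program isn't shown: the calloc (stringLength, sizeof (char)) call in the first post allocates no room for the terminating '\0', so every strcpy writes one byte past its block. Heap damage of that kind often goes unnoticed for small inputs and shows up later as a hang or crash inside an allocation call. A fixed-size list[] that is too small for the file would have a similar effect. The sketch below sizes each allocation with + 1 and grows the array as needed; the name load_all, the 64-byte token buffer, and the starting capacity of 128 are all made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads every token in fp into a growing array.
   On success stores the array in *out and returns the entry count;
   on allocation failure frees everything and returns (size_t) -1. */
size_t load_all (FILE *fp, char ***out)
{
  char tmp[64];
  char **list = NULL;
  size_t count = 0, capacity = 0;

  while (fscanf (fp, "%63s", tmp) == 1)
    {
      if (count == capacity)
        {
          /* double the array whenever it fills up */
          size_t newcap = capacity ? capacity * 2 : 128;
          char **grown = realloc (list, newcap * sizeof *grown);
          if (grown == NULL)
            goto fail;
          list = grown;
          capacity = newcap;
        }
      list[count] = malloc (strlen (tmp) + 1);   /* +1 for '\0' */
      if (list[count] == NULL)
        goto fail;
      strcpy (list[count], tmp);
      ++count;
    }
  *out = list;
  return count;

fail:
  while (count > 0)
    free (list[--count]);
  free (list);
  return (size_t) -1;
}
```

If the hang goes away with allocations sized this way, the missing byte was the likely culprit; if not, the problem lies elsewhere in the program.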
    "Today you are freer than ever to do what you want, provided you can pay for it!" --Bad Religion
