  1. #1
    Just Joined!
    Join Date
    Dec 2006
    Posts
    47

    dup2() and stdin redirection


    Alright, so up front, I'm obviously taking an operating systems class and writing my own shell. My question is simple. If I have an open file descriptor, say, fd, I know how to execute the dup2 command to send the contents of a file to stdin:

    dup2(fd, 0);

    However, once I do this, where does the data go? How do I access the input? I need to get the duplicated data that is sent to the stdin, but how? scanf?

    Thanks for the help!

  2. #2
    Linux Enthusiast gerard4143's Avatar
    Join Date
    Dec 2007
    Location
    Canada, Prince Edward Island
    Posts
    714
    Try info/man

    ssize_t read(int fd, void *buf, size_t count);

    ...Gerard4143
    Make mine Arch Linux

  3. #3
    Just Joined!
    Join Date
    Dec 2006
    Posts
    47
    Ok, that halfway helps. What if I don't know how many bytes I'm trying to access, as is the case for the argument size_t count? The file could be of any arbitrary size, and I don't know how much to read from the file.

    For background, I'm trying to implement input redirection for a shell: using input from a file as the input for a command, such as $command < file. So, I'm using dup2() to access the file via stdin, and I need to essentially get the data from stdin and assign it to a C string for parsing. Make sense?

  4. #4
    Linux Enthusiast gerard4143's Avatar
    Join Date
    Dec 2007
    Location
    Canada, Prince Edward Island
    Posts
    714
    It really doesn't matter, because read() tells you through its return value how many bytes it actually read (0 means end of file), so you can call it in a loop with a fixed-size buffer....you really should look up man/info read...G4143

    Or try this link:

    read(2): read from file descriptor - Linux man page
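
    To make the point about the return value concrete, here is a minimal sketch of reading a file of unknown size. The helper name read_all and the 4096-byte starting size are my own choices for illustration, not anything from the man page. It calls read() in a loop, grows the buffer as needed, and stops when read() returns 0.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read everything from fd until end of file.
 * Returns a malloc'd, NUL-terminated buffer (caller frees) and stores
 * the number of bytes read in *len_out. Returns NULL on error. */
char *read_all(int fd, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Always keep one spare byte for the terminating NUL. */
        if (len + 1 >= cap) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }

        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n > 0) {
            len += (size_t)n;      /* read() may return fewer bytes than asked */
        } else if (n == 0) {
            break;                 /* 0 means end of file */
        } else if (errno != EINTR) {
            free(buf);             /* real error; EINTR just retries */
            return NULL;
        }
    }

    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

    After dup2(fd, 0), calling read_all(0, &len) would hand back the whole file as a C string for parsing.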

  5. #5
    Linux Newbie tetsujin's Avatar
    Join Date
    Oct 2008
    Posts
    117
    If you're looking to do something like read in a line of text (that is, read in data until you encounter the newline character) the typical way of doing that would be to read in a bunch of data, look for the newline character - and repeat the process until you find it. Then any data beyond the newline character, you stick in a buffer somewhere, hold on to it until the next time you need to read a line.

    This is basically what the C FILE* object does, incidentally. If you're using dup2() to map the file to stdin, then I think you could just read data from stdin using the various stdio functions, like scanf() or fread().
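
    A small sketch of that idea, assuming a hypothetical helper name redirect_stdin: once descriptor 0 refers to the file, the ordinary stdio calls read from it without any further setup.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make stdin refer to the file at path, the way
 * `command < path` would. Returns 0 on success, -1 on error. */
int redirect_stdin(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* If open() happened to return 0 there is nothing to duplicate. */
    if (fd != STDIN_FILENO) {
        if (dup2(fd, STDIN_FILENO) < 0) {
            close(fd);
            return -1;
        }
        close(fd);  /* descriptor 0 now refers to the file */
    }
    return 0;
}
```

    After redirect_stdin("input.txt") succeeds, calls such as scanf("%d", &n) or fgets(line, sizeof line, stdin) read from input.txt. It is safest to redirect before the first stdio read, since stdin may already hold buffered data from the old source.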

    Another approach would be to read one byte at a time until you get the sentinel you're looking for - but that's probably less efficient.
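
    The first approach in this post (read a chunk, look for the newline, hold on to whatever came after it) might look roughly like this. The struct linebuf type and the linebuf_next function are names made up for this sketch, and the fixed 4096-byte buffer means a longer line would be handed back in pieces.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical line reader that keeps bytes read past the newline. */
struct linebuf {
    int fd;
    size_t len;          /* bytes currently held in data */
    char data[4096];
};

/* Copy the first linelen bytes to out (truncated to fit, NUL-terminated)
 * and drop `consumed` bytes from the front of the buffer. */
static ssize_t emit(struct linebuf *lb, size_t linelen, size_t consumed,
                    char *out, size_t outsz)
{
    size_t copy = linelen < outsz - 1 ? linelen : outsz - 1;
    memcpy(out, lb->data, copy);
    out[copy] = '\0';
    lb->len -= consumed;
    memmove(lb->data, lb->data + consumed, lb->len);
    return (ssize_t)copy;
}

/* Store the next line, without its '\n', in out (outsz must be >= 1).
 * Returns the number of bytes stored, or -1 on end of file or error. */
ssize_t linebuf_next(struct linebuf *lb, char *out, size_t outsz)
{
    for (;;) {
        char *nl = memchr(lb->data, '\n', lb->len);
        if (nl != NULL) {
            size_t linelen = (size_t)(nl - lb->data);
            /* Consume the line and its newline; keep the rest buffered. */
            return emit(lb, linelen, linelen + 1, out, outsz);
        }
        if (lb->len == sizeof lb->data)
            return emit(lb, lb->len, lb->len, out, outsz); /* overlong line */

        ssize_t n = read(lb->fd, lb->data + lb->len,
                         sizeof lb->data - lb->len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            if (lb->len == 0)
                return -1;                 /* end of file, nothing left */
            return emit(lb, lb->len, lb->len, out, outsz); /* last line */
        }
        lb->len += (size_t)n;
    }
}
```

    Start with struct linebuf lb = { .fd = 0 }; and call linebuf_next(&lb, line, sizeof line) until it returns -1. An empty line returns 0, so it can be told apart from end of file.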
    (EDIT): I guess I was probably wrong above - the C STDIN file object can't be reading an arbitrary number of characters beyond the end-of-line character when it's reading a line... if it were then programs run in succession, both taking input from the TTY, could result in the first program stealing some of the second program's input... The terminal itself does some line buffering (so that a program reading from standard input won't actually get anything until you hit newline - unless it's specifically requested that the TTY not do line-buffering) but the programs themselves read byte-by-byte, I guess... at least for stdin, they do.


    I don't know what the exact conditions are of this project you're doing - but if you look at plain old Unix programs running in a Unix shell, this is exactly what they do - the shell forks a new process, remaps some file descriptors with dup2(), and then invokes a program with exec() or similar - and then the program is written to just deal with stdin/stdout, usually without caring what sort of underlying file descriptor they correspond to. The program running in the shell doesn't do anything special to read its redirected standard input, it just uses the same input code that it would use anywhere else.
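
    As a rough sketch of that fork / dup2() / exec sequence for something like command < file: the function name run_with_stdin and the exit codes used for failures are assumptions made for this example, not part of any assignment or real shell.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdin taken from infile.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_with_stdin(const char *infile, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: remap descriptor 0, then become the command. */
        int fd = open(infile, O_RDONLY);
        if (fd < 0)
            _exit(1);                    /* arbitrary failure code */
        if (fd != STDIN_FILENO) {
            if (dup2(fd, STDIN_FILENO) < 0)
                _exit(1);
            close(fd);
        }
        execvp(argv[0], argv);           /* only returns on failure */
        _exit(127);                      /* conventional "not found" code */
    }

    /* Parent (the shell): wait for the command to finish. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    The shell itself never reads the file here. The program started by execvp() simply finds the file on its standard input and reads it with whatever input code it already has.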

  6. #6
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,508
    Most shell programs read a character at a time (unbuffered input) and act according to the termcap and stty settings. I.e., they implement a state machine to determine how to act on receipt of a character depending upon the current state. You can ignore all that and just deal with some special control input characters such as Ctrl-C (break), Ctrl-D (EOF), Ctrl-H (backspace), CR, LF, backslash, etc. Everything else is buffered until CR is detected, at which point the line is executed, or buffered further if the command is incomplete (commands can span multiple lines if the last character before CR was a backslash). It's still most reliable to implement a state machine, even if it is simple, to process incoming data appropriately.

    Once a command line is read, it is broken into tokens and further processed. If the first token is or starts with a '#', the rest of the line is ignored (comment). If not, then it is a command to execute. The shell determines if it is a "built-in" command and calls the appropriate function if so. If not built-in, it looks for path delimiters. If found, then that is the command to execute. If not, it tries each directory in the PATH environment variable in turn, returning "command not found" if not found, and executes it with the rest of the tokens as an argument array if it is.
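
    A minimal sketch of the tokenizing step, assuming whitespace-only splitting with no quoting or escapes. The function name tokenize is made up for this example. For the PATH search described above, execvp() already tries each PATH directory when the command name contains no slash, so a small shell can pass the resulting array straight to it.

```c
#include <string.h>

/* Hypothetical tokenizer: split line in place on whitespace.
 * Stores at most max-1 token pointers in argv, followed by NULL
 * (max must be >= 1). Returns the token count, which is 0 for a
 * blank line or a line whose first token starts with '#'. */
int tokenize(char *line, char *argv[], int max)
{
    const char *ws = " \t\r\n";
    int argc = 0;
    char *p = line + strspn(line, ws);   /* skip leading whitespace */

    if (*p == '#') {                     /* comment: ignore the line */
        argv[0] = NULL;
        return 0;
    }

    while (*p != '\0' && argc < max - 1) {
        argv[argc++] = p;                /* start of a token */
        p += strcspn(p, ws);             /* advance to the end of it */
        if (*p != '\0')
            *p++ = '\0';                 /* terminate the token in place */
        p += strspn(p, ws);              /* skip whitespace before the next */
    }

    argv[argc] = NULL;                   /* exec-style argument array */
    return argc;
}
```

    A shell loop would check argv[0] against its built-ins first and otherwise fork and call execvp(argv[0], argv).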

    FWIW, a minimal shell can be implemented in a few hundred lines of code.
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  7. #7
    Linux Newbie tetsujin's Avatar
    Join Date
    Oct 2008
    Posts
    117
    Originally posted by Rubberman:
    You can ignore all that and just deal with some special control input characters such as Ctrl-C (break), Ctrl-D (EOF), Ctrl-H (backspace), CR, LF, backslash, etc.
    Shells (and the programs running in them) actually don't do anything special for Ctrl-C and Ctrl-D as input - that's handled higher up - the TTY driver, I think? Below that, it's all signals.

    Quite right about the byte-by-byte reading of stdin, though. My mistake.

  8. #8
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,508
    They (Ctrl-C, Ctrl-D) can be handled by the tty driver, but not always. Besides, the tty driver behavior could have been modified with the stty command. Usually the shell will handle interrupt events and EOF on stdin appropriately, but it can also deal with the raw data if necessary. It depends upon the shell. In any case, robust shell programming is not trivial; after all, shells are interpreters of various depths of complexity and capability. A good programmer will not leave things to chance if they want their software to run as expected, even under adverse conditions.

  9. #9
    Linux Newbie tetsujin's Avatar
    Join Date
    Oct 2008
    Posts
    117
    Originally posted by Rubberman:
    They (Ctrl-C, Ctrl-D) can be handled by the tty driver, but not always. Besides, the tty driver behavior could have been modified with the stty command.
    In that case, neither the shell nor the program running in it is expected to handle Ctrl-C, right? If you remap SIGINT to some other keystroke, the shell (and running processes) still just watch for SIGINT. So far as I know there's no expectation for the shell to provide support for job-control keystrokes where the TTY lacks it, and with the way shells normally work (i.e. handing over direct control of the TTY to the foreground process, as opposed to acting as a middleman for the program's I/O) there's no way it could...

    Bash certainly doesn't appear to do anything special with Ctrl-C, Ctrl-D, etc. - either they're handled at the TTY level or Bash accepts them as ordinary input with no particular significance... Unless I've missed something. If I've got something wrong in my understanding of the relationship between the TTY, signals, and shell I'd really like to know. I need to know that sort of thing.

  10. #10
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, in Chicago, or in a galaxy far, far away.
    Posts
    11,508
    You are correct in that the tty handles the interrupt stuff, but there is plenty it doesn't. I don't think it handles the termcap stuff (esc sequences), or the special esc sequences that tell the shell to change the prompt, etc. You should take a close look at some of the shell source code to see how they deal with these special cases. However, as I said, a minimal shell can ignore a lot of that.
