  1. #1
    Just Joined!
    Join Date
    Dec 2008
    Posts
    19

    help on extracting the data from a file

    Hi, I am supposed to process some data, but I am stuck on reading it in.
    The data are in a file and look like this:
    0 1:1.1 2:1.2 3:1.3 ...
    1 1:2.1 2:2.2 3:2.3 ....
    2 1:3.1 2:3.2 3:3.3 ....
    On the first line, 0 is the label, 1 is an index and 1.1 is its value, then 2 is the next index and 1.2 is its value, and so on (i.e. label index:value index:value ...).
    What I want to do is put the labels (the first number on each line) in a vector container,
    and put each index (the number before the ":") and value (the number after the ":") in a vector of a struct with two members, index and value.
    Help, please. Any idea will be warmly embraced.
    Thanks

  2. #2
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    This isn't for a class exercise, is it?
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!

  3. #3
    Just Joined!
    Join Date
    Dec 2008
    Posts
    19
    Quote Originally Posted by Rubberman View Post
    This isn't for a class exercise, is it?
    No, it's not. It's for my research findings.

  4. #4
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    Ok. You have a set of records that fit a 2D array: each row starts with a label, followed by that label's values, where the value with index n sits at position n-1.

    int array[N][M][1]; /* Where N is the maximum number of labels, and M is the maximum number of data elements associated with each label. The 1 is the number of values associated with each data element. This can accommodate missing values if you choose suitable values for such, as perhaps -1 since these all seem to be positive integers */

    If you did this with C++ vectors, e.g. std::vector<std::pair<int, std::vector<std::pair<int, int> > > >, you wouldn't have to concern yourself with the array dimensions, as you could easily add members as you go. (std::vector takes a single element type, so each label is paired with its element vector using std::pair.)

    Parsing is simple. The label and each element are delimited by whitespace. The element index is delimited from its value by a colon. The algorithm goes something like this:
    Code:
    create results vector
    while not end-of-file
        read line
        if line read
            read label to white space
            if label found
                create vector for elements
                do
                    read next element index and value
                    if element read
                        insert index and value in element vector
                    endif
                while (element found)
                insert label and element vector into results vector
            endif
        endif
    endwhile
    How you implement this, in C, C++, Java, Python, or whatever is not important. You need to learn how to break these problems down into a generalized algorithm for processing the data. That is the all-important What-To-Do. Then comes the implementation, or the How-To-Do-It. If you don't know the What, then the How will never happen.
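    For readers who want to see one possible "How", here is a minimal C++ sketch of the pseudocode above, assuming every line has the form "label index:value index:value ..." with an integer label. The names Element, Dataset and parse_data are made up for this illustration and do not come from the thread; the layout (a vector of labels plus a vector of index/value structs per line) follows what the original poster asked for.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names, introduced only for this sketch.
struct Element {
    int index;
    double value;
};

struct Dataset {
    std::vector<int> labels;                      // one label per input line
    std::vector<std::vector<Element> > elements;  // elements[i] belongs to labels[i]
};

// Reads lines of the form "label index:value index:value ..." from `in`.
Dataset parse_data(std::istream& in) {
    Dataset data;
    std::string line;
    while (std::getline(in, line)) {         // while not end-of-file: read line
        std::istringstream fields(line);
        int label;
        if (!(fields >> label)) continue;    // no label found: skip this line
        std::vector<Element> row;            // create vector for elements
        Element e;
        char colon;
        // Read index, ':' and value until a field fails to parse.
        while (fields >> e.index >> colon >> e.value && colon == ':') {
            row.push_back(e);
        }
        data.labels.push_back(label);
        data.elements.push_back(row);
    }
    return data;
}
```

    To read from a file, one could include <fstream>, open a std::ifstream on the data file and pass it to parse_data. Anything on a line that does not parse as index:value (such as a literal "...") simply ends that line's element list.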

  5. #5
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    I missed that the values are floating point numbers. Your vector variable in C++ would therefore be defined as:

    std::vector<std::pair<int, std::vector<std::pair<int, float> > > > results;

    and not with int values, as I suggested.
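    Since std::vector takes a single element type, a declaration along these lines needs std::pair (or a small struct) to tie a label to its elements. The sketch below shows one way this might be spelled and filled in by hand for the first sample line; IndexValue, LabeledRow, Results and make_sample are hypothetical names used only here.

```cpp
#include <utility>
#include <vector>

// Hypothetical type names, introduced only for this sketch.
typedef std::pair<int, float> IndexValue;                    // (index, value)
typedef std::pair<int, std::vector<IndexValue> > LabeledRow; // (label, elements)
typedef std::vector<LabeledRow> Results;

// Builds the first sample line, "0 1:1.1 2:1.2 3:1.3", by hand.
Results make_sample() {
    std::vector<IndexValue> elements;
    elements.push_back(IndexValue(1, 1.1f));
    elements.push_back(IndexValue(2, 1.2f));
    elements.push_back(IndexValue(3, 1.3f));

    Results results;
    results.push_back(LabeledRow(0, elements));  // vectors grow as needed
    return results;
}
```

    With this layout, results[i].first is the label of line i and results[i].second[j].second is the value of its j-th element.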

  6. #6
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Quote Originally Posted by desp_to_learn View Post
    NO its not....its for my research findings
    What exactly do you want to do? What output do you want to get?

  7. #7
    Just Joined!
    Join Date
    Dec 2008
    Posts
    19
    Thank you for helping.
    What do you mean by "read label to white space"?

  8. #8
    Linux Guru Rubberman's Avatar
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,974
    Quote Originally Posted by desp_to_learn View Post
    thank you for helping
    what do you mean by read label to white space?
    Well, everything up to the first whitespace is the label. In your example the labels are the numbers 0 to N, where N is some number. What if a label is > 9 or > 99? What if you decided to use names instead of numbers? Or used the Julian date (a floating point number encoding date + time) instead?
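    In C++, "read to whitespace" is what operator>> on a std::string already does, so a label of any length or kind can be picked up as text first and converted later if needed. A minimal sketch, with read_label as a hypothetical helper name:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns everything up to the first whitespace
// on the line, or an empty string if the line is blank.
std::string read_label(const std::string& line) {
    std::istringstream fields(line);
    std::string label;
    fields >> label;  // skips leading whitespace, then stops at the next whitespace
    return label;
}
```

    Called on "123 1:2.5" this yields "123", and on a line starting with a name or a floating point date it yields that whole token, which is why treating the label as a string keeps the parser flexible.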

  9. #9
    Linux User
    Join Date
    Aug 2006
    Posts
    458
    Since this is not homework and I do not know what you want to do, here is some Python code that stores all your data in a dictionary:
    Code:
    #!/usr/bin/env python3
    # Map each label to its list of [index, value] string pairs.
    d = {}
    with open("file") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            d.setdefault(fields[0], [])
            for item in fields[1:]:
                d[fields[0]].append(item.split(":"))
    print(d)
    Output (with the sample lines, including their trailing dots, as the file contents):
    Code:
    # ./test.py
    {'0': [['1', '1.1'], ['2', '1.2'], ['3', '1.3'], ['...']], '1': [['1', '2.1'], ['2', '2.2'], ['3', '2.3'], ['....']], '2': [['1', '3.1'], ['2', '3.2'], ['3', '3.3'], ['....']]}
    Now you can get at the values by looking up the dictionary with the appropriate keys. Note that the indices and values are stored as strings.
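    For readers working in C++, the same dictionary idea could be sketched with std::map. This is an illustrative translation, not code from the thread: Items and load_dictionary are hypothetical names, and unlike the Python version it skips tokens that contain no colon instead of storing them.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: the (index, value) string pairs belonging to one label.
typedef std::vector<std::pair<std::string, std::string> > Items;

// Hypothetical function: maps each label to its (index, value) string pairs.
std::map<std::string, Items> load_dictionary(std::istream& in) {
    std::map<std::string, Items> d;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string label, item;
        if (!(fields >> label)) continue;   // blank line
        Items& items = d[label];            // creates an empty entry if missing
        while (fields >> item) {
            std::string::size_type colon = item.find(':');
            if (colon == std::string::npos) continue;  // not index:value, skip
            items.push_back(std::make_pair(item.substr(0, colon),
                                           item.substr(colon + 1)));
        }
    }
    return d;
}
```

    As in the Python version, everything stays a string; d["0"][1].second would be "1.2" for the first sample line.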

  10. #10
    Just Joined!
    Join Date
    Dec 2008
    Posts
    19
    Sorry, I didn't mention it: I am using C++. Thanks.
