  1. #1
    Just Joined!
    Join Date: Feb 2010
    Posts: 8

    split file when integer is found

    Hello folks,

    Here is my problem.
    I have a file that contains a single line with many floating-point numbers.
    At the very start of the line, and occasionally further along, there are a few integers surrounded by blank spaces.

    1 1.02-4 1.03-5 544 1.04-1 65 2.98-1 5.78-10 3.45-2 etc etc

    I want to split the file into several files, each containing one integer and the floating-point numbers that follow it, up to the next integer.

    I tried something like this:

    sed -n 's/[0-9]+ /&/w output.file' < input.file
    csplit -k -f output.file input.file '/\s[0-9]+\s/'

    It seems that my regex doesn't work.
    I also tried POSIX syntax, without success.

    Any suggestions?

    Thanks
    Stefano

  2. #2
    drl
    Linux Engineer
    Join Date: Apr 2006
    Location: Saint Paul, MN, USA / CentOS, Debian, Solaris, SuSE
    Posts: 1,117
    Hi, Stefano.
    Originally posted by ste2703:
    ... I have a file that contains a single line with many floating-point numbers.
    At the very start of the line, and occasionally further along, there are a few integers surrounded by blank spaces.

    1 1.02-4 1.03-5 544 1.04-1 65 2.98-1 5.78-10 3.45-2 etc etc

    I want to split the file into several files, each containing one integer and the floating-point numbers that follow it, up to the next integer ... Any suggestions?
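    A side note on the sed and csplit attempts in the first post, offered as a sketch rather than a full diagnosis. In basic regular expressions (the default for sed and grep), "+" is an ordinary character, so '[0-9]+ ' looks for a digit followed by a literal plus sign; the "one or more" meaning needs extended syntax (grep -E, sed -E). Separately, csplit cuts a file at line boundaries, and the "w" flag of the sed "s" command writes the whole line, so neither tool can cut inside a file that has only one line. The variable names below are only for illustration:

```shell
# Sample line in the same format as the question (shortened).
line='1 1.02-4 1.03-5 544 1.04-1 65 2.98-1'

# Basic regex: "+" is a literal plus sign, so no line matches.
bre_count=$(printf '%s\n' "$line" | grep -c '[0-9]+ ')

# Extended regex: "+" means "one or more of the preceding item".
ere_count=$(printf '%s\n' "$line" | grep -Ec '[0-9]+ ')

# Put one token per line so that line-oriented tools can see each
# integer; -x keeps only lines that consist entirely of digits.
int_tokens=$(printf '%s\n' "$line" | tr -s ' ' '\n' | grep -Ex '[0-9]+' | paste -sd' ' -)

printf 'BRE matching lines: %s\n' "$bre_count"
printf 'ERE matching lines: %s\n' "$ere_count"
printf 'integer tokens    : %s\n' "$int_tokens"
```

    Because the data sits on one line, a field-by-field loop (as in the awk script that follows) avoids the line-boundary limitation altogether.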
    Here is one solution. The script has three parts: first, some code to show the environment, versions, and data file; second, an awk script that does the work; third, a display of the results:
    Code:
    #!/usr/bin/env bash
    
    # @(#) s3	Demonstrate split into files at integer string.
    
    # Infrastructure details, environment, commands for forum posts. 
    # Uncomment export command to run script as external user.
    # export PATH="/usr/local/bin:/usr/bin:/bin"
    set +o nounset
    pe() { for i;do printf "%s" "$i";done; printf "\n"; }
    pl() { pe;pe "-----" ;pe "$*"; }
    LC_ALL=C ; LANG=C ; export LC_ALL LANG
    pe ; pe "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
    pe "(Versions displayed with local utility \"version\")"
    c=$( ps | grep $$ | awk '{print $NF}' )
    version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
    [ "$c" = "$s" ] && p="$s" || p="$c"
    version >/dev/null 2>&1 && version "=o" $p printf specimen awk
    set -o nounset
    pe
    
    # Remove files from previous runs.
    rm -f section.*
    
    FILE=${1-data1}
    
    # Display sample of data file, with head & tail as a last resort.
    pe " || start [ first:middle:last ]"
    specimen "$FILE" \
    || { pe "(head/tail)"; head -n 5 "$FILE"; pe " ||"; tail -n 5 "$FILE"; }
    pe " || end"
    
    pl " Results, files created:"
    awk '
    BEGIN { file_index = 1 ; collection = "" }
          {
            for (i = 1; i <= NF; i++) {
              if ( collection == "" ) {        # first token: assume it is an integer
                collection = $i
              } else if ( $i !~ /^[0-9]+$/ ) { # not an integer: append to current section
                collection = collection " " $i
              } else {                         # integer: write section, start the next one
                print collection > "section." file_index
                close("section." file_index)
                collection = $i
                file_index++
              }
            }
          }
    END   { if ( collection != "" ) print collection > "section." file_index } # write last file, if any
    ' "$FILE"
    
    ls -lgG section.*
    
    pl " Contents of files:"
    for i in section.*
    do
      printf " %s: " "$i"
      cat "$i"
    done
    
    exit 0
    producing:
    Code:
    % ./s3
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution        : Debian GNU/Linux 5.0 
    GNU bash 3.2.39
    printf - is a shell builtin [bash]
    specimen (local) 1.17
    GNU Awk 3.1.5
    
     || start [ first:middle:last ]
    Whole: 5:0:5 of 1 lines in file "data1"
    1 1.02-4 1.03-5 544 1.04-1 65 2.98-1 5.78-10 3.45-2
     || end
    
    -----
     Results, files created:
    -rw-r--r-- 1 16 Jun 26 09:06 section.1
    -rw-r--r-- 1 11 Jun 26 09:06 section.2
    -rw-r--r-- 1 25 Jun 26 09:06 section.3
    
    -----
     Contents of files:
     section.1: 1 1.02-4 1.03-5
     section.2: 544 1.04-1
     section.3: 65 2.98-1 5.78-10 3.45-2
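    For readers who only want the core of the approach without the environment-reporting preamble, here is a minimal sketch under a few assumptions: the output prefix "part." and the temporary working directory are hypothetical choices for this example, tokens are written straight to the current output file instead of being collected in a string first, and any tokens that appear before the first integer are skipped.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data in the same format as the question.
printf '%s\n' '1 1.02-4 1.03-5 544 1.04-1 65 2.98-1 5.78-10 3.45-2' > data1

awk '
{
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^[0-9]+$/) {               # integer token: start a new file
      if (out != "") { printf("\n") > out; close(out) }
      out = "part." (++n)
      printf("%s", $i) > out
    } else if (out != "") {              # other token: append to current file
      printf(" %s", $i) > out
    }
  }
}
END { if (out != "") { printf("\n") > out; close(out) } }  # finish last file
' data1

# Show what was written.
for f in part.*; do printf '%s: ' "$f"; cat "$f"; done
```

    With the sample line this should give the same three groups as the section.* files above. Closing each file before opening the next keeps the number of open files at one, which matters if the input contains many integers.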
    Best wishes ... cheers, drl
