- 10-18-2005 #1 Just Joined!
- Join Date
- May 2004
- Location
- Pennsylvania
- Posts
- 98
Stripping a subset of text from a file
alright...
I have file A
I have file B which is a subset of A
i want to strip file B out of file A
i.e. A - B = C
This sounds like it should be easy and i feel i'm missing something....
file B is multiple lines and has some special characters...
I'm not ready to put this in a perl script just yet... i just have a feeling that there is a simpler way
tried
Code:
cat A | grep -v "`cat B`" > C
as well as a similar sed command
I escaped the special chars in B for that as well....
i think they fail because of the multiple lines....
I have used
Code:
diff -bi A B | sed -e 's/^[0-9]*,[0-9]*[a-z]*[0-9]*$//g' | sed -e 's/^< //g' | sed -e 's/^> //g' | sed -e 's/---//g' > C
And this works.... however i don't like it because if B is not in A, then i get a mess....
Real world application:
Taking over a webmaster position and the previous webmaster used all static HTML with a long header.... i want to strip the header out and keep the "meat" of the page.... unfortunately the headers are not all exactly the same... there are about 3-4 different versions (hence the reason my diff script would make more of a mess than it's worth)
any suggestions?
- 10-18-2005 #2 Linux Newbie
- Join Date
- Oct 2004
- Posts
- 158
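Code:
sort -o A.srt A
sort -o B.srt B
# -1 hides the lines found only in A.srt; comm -23 A.srt B.srt would print only the lines unique to A
comm -1 A.srt B.srt > FileA_minus_FileB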
- 10-18-2005 #3 Just Joined!
- Join Date
- May 2004
- Location
- Pennsylvania
- Posts
- 98
Thanks for the suggestion...
but sorting will do no good...
for one, i need the text to stay in order.
for two, comm looks basically like a diff, which would leave me in a similar situation as the other script i posted
- 10-19-2005 #4 Linux Newbie
- Join Date
- Oct 2004
- Posts
- 158
comm in fact does do what you want.
But since you don't like it, try some C code:
Code:
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <string.h>

/* exit with a message if a pointer is NULL */
#define ck(x) if ((x) == NULL) { perror("Error"); exit(EXIT_FAILURE); }

/* read nbyte from a file, retrying on EINTR - can read whole file */
ssize_t readall(int fd, void *buf, size_t nbyte)
{
    size_t nread = 0;
    ssize_t n = 0;

    while (nread < nbyte) {
        n = read(fd, (char *)buf + nread, nbyte - nread);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)          /* end of file */
            break;
        nread += n;
    }
    return (ssize_t)nread;
}

/* get file size */
size_t file_size(FILE *in)
{
    struct stat st;

    if (fstat(fileno(in), &st) == -1) {
        perror("stat error");
        exit(EXIT_FAILURE);
    }
    return st.st_size;
}

/* argv[1] = FileA  argv[2] = FileB; prints the lines of FileA not found in FileB */
int main(int argc, char *argv[])
{
    char *buf = NULL;
    FILE *in, *in1;
    size_t filebytes = 0;
    char record[256] = {0x0};          /* lines longer than 255 bytes get split */

    if (argc != 3) {
        fprintf(stderr, "usage: %s FileA FileB\n", argv[0]);
        return EXIT_FAILURE;
    }
    ck(in = fopen(argv[1], "r"));      /* open FileA, check file errors */
    ck(in1 = fopen(argv[2], "r"));     /* open FileB, check file errors */

    filebytes = file_size(in1);        /* get size of buf we need for FileB */
    ck(buf = malloc(filebytes + 1));   /* create storage */
    memset(buf, 0x0, filebytes + 1);   /* init storage, keeps buf NUL-terminated */
    if (readall(fileno(in1), buf, filebytes) < 0) {   /* read entire FileB */
        fprintf(stderr, "file read error\n");
        exit(EXIT_FAILURE);
    }
    /* strstr is a substring test, so a line of FileA also counts as found
       when it matches only the tail of a longer line in FileB */
    while (fgets(record, sizeof(record), in) != NULL) {
        if (strstr(buf, record) == NULL)   /* not found */
            fputs(record, stdout);
    }
    free(buf);                         /* release the buffer */
    fclose(in1);
    if (!fclose(in))                   /* close file with error check */
        return 0;                      /* normal return */
    fprintf(stderr, "filesystem error\n");
    return EXIT_FAILURE;               /* file close error - return */
}
- 10-19-2005 #5 Just Joined!
- Join Date
- May 2004
- Location
- Pennsylvania
- Posts
- 98
Originally Posted by jim mcnamara
This leaves me with FileA_minus_FileB = B.srt....
not saying i don't believe you, just saying it doesn't work for me.
Thanks for the C code too... may have to try it out someday...
As it turns out i just wrote a perl script to:
read file A, put the full contents into a single string, using a special character string to represent newlines
read file B, put the full contents into a single string, using the same special character string for newlines
use a regex replace to remove string_of_B from string_of_A
write file C after a regex replace of newline_string with a real newline in string_of_A
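For anyone landing here later, the same idea can be sketched without the newline placeholder. This is only a rough sketch in Ruby rather than the perl script itself: it treats each file as one string and removes B by literal substring match instead of a regex. The helper names strip_block and strip_first_match are made up for illustration, and the sample page and header strings are invented.

```ruby
# Hypothetical helper: remove the first literal occurrence of block from text.
# Returns text unchanged when block is empty or does not occur in it.
def strip_block(text, block)
  return text if block.empty?
  idx = text.index(block)
  return text if idx.nil?
  text[0...idx] + text[(idx + block.length)..-1]
end

# Hypothetical helper for the "3-4 header versions" case: try each candidate
# in turn and strip the first one that is actually present in text.
def strip_first_match(text, candidates)
  candidates.each do |block|
    return strip_block(text, block) if !block.empty? && text.include?(block)
  end
  text
end

# Invented sample data standing in for file A (page) and file B (header).
page   = "<html>\n<div id=\"header\">\nold nav\n</div>\n<p>meat</p>\n</html>\n"
header = "<div id=\"header\">\nold nav\n</div>\n"
puts strip_block(page, header)
```

With real files this would be something like File.write("C", strip_block(File.read("A"), File.read("B"))). Because the match is literal, the special characters in B need no escaping, and a page that does not contain the header comes back untouched instead of turning into a mess.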
- 10-20-2005 #6 Just Joined!
- Join Date
- Oct 2005
- Posts
- 31
Are you stripping entire lines? Sounds like a job for Ruby.
Code:
blines = Array.new
clines = Array.new
# store each line in file b to an array
File.open("/path/fileb").each { |line| blines.push(line) }
# check each line of file a against all lines in b, keeping the ones not in b
File.open("/path/filea").each { |line|
  if not blines.include?(line)
    clines.push(line)
  end
}
# write the resulting lines into file c
filec = File.new("/path/filec", "w")
clines.each { |line| filec.puts(line) }
filec.close
I hope that you find the code interesting...


