Results 1 to 6 of 6
Hi all
I have a file of about 1500 lines odd long, which is an export from a database online. It is essentially a CSV, but instead of comma's is ...
- 07-09-2007 #1
sed scripting
Hi all
I have a file of about 1500 lines odd long, which is an export from a database online. It is essentially a CSV, but instead of comma's is a semi-colon. What i want is the 15th instance ofchanging to".*";."textfield:";
The problem is that the instances of ""; before the one i need to change are not all the same, and some have some text in, and some dont. So you cannot count the number of instances of "";, it will need to be something more like ".*"; (if using regexp). And will need to also consider that some of the fields have the html code for a " in it ofso needs to ideally look for the "; after each item."
Anyone know how this can be done? I have had a look on a sed thing online and come up with:
where "startline" is the first line i want the change to occur (dont want it on the first 30 odd lines).Code:sed 'startline,$ s/\"\"\;/\"textfield:\"\; /15'
Anyone here know sed that well?"I am not an alcoholic, alcoholics go to meetings"
Registered Linux user = #372327
- 07-09-2007 #2
- You have a file with a large number of rows, which contain fields delimited by a semi-colon -- ;
- You want to find the 15th field in each row and manipulate the value.
- You want to skip the first n number of rows in this operation.
Is that all true?
- 07-09-2007 #3
If i understand what i want to do properly, then yes!!
"I am not an alcoholic, alcoholics go to meetings"
Registered Linux user = #372327
- 07-09-2007 #4
I'll show you an example with simplified data.
Quick explanation of that awk statement:Code:[aqua@troy ~]$ cat my-data 01;field-a;field-b;field-c;field-d 02;field-a;field-b;field-c;field-d 03;field-a;field-b;field-c;field-d 04;field-a;field-b;field-c;field-d 05;field-a;field-b;field-c;field-d 06;field-a;field-b;field-c;field-d 07;field-a;field-b;field-c;field-d 08;field-a;field-b;field-c;field-d 09;field-a;field-b;field-c;field-d 10;field-a;field-b;field-c;field-d [aqua@troy ~]$ awk -F';' '{ if(NR < 7) next ; sub(/.*/,"new-text",$4) ; print }' my-data 07 field-a field-b new-text field-d 08 field-a field-b new-text field-d 09 field-a field-b new-text field-d 10 field-a field-b new-text field-d
1. The -F option is used to specify a semicolon / ; delimiter.
2. Any data rows prior to the 7th are skipped.
3. After that time, the 4th field in each row is replaced with 'new-text' and printed to stdout.
You can pipe the results to the less pager for viewing. Or you can redirect the results to an output file.
Why awk? No good reason. It just works.
(p.s. Let me know if you need semicolons in the output too -- there's another step required for that.)
- 07-09-2007 #5Linux Enthusiast
- Join Date
- Aug 2006
- Posts
- 631
If the csv file is seperated by a semicolon, the fields are within double quotes and you want to skip the first 30 lines:
RegardsCode:awk -F ";" '{if(NR<=30)next;$15="\42textfield:\42"}{gsub(/ /,";")}{print}' file
- 07-09-2007 #6
I have actually resolved this now with a sed command from someone else:
That outputs updated.csv with the relevent output. I had to remove the header lines manually, but i think simply adding in "30,$" before the s/... would have also done the trick, so:Code:sed -e 's/\(\("[^"]*";\)\{15\}\)"[^"]*";\(\("[^"]*";\)*\)/\1"Textfield:";\3/g' < csv_export_article_818967.csv > updated.csv
Though i have not tested this theory. Either way, this ran in no time at all (0.9 seconds) and saved my friend about a months worth of work updating the site manually (Which the provider said was the ONLY way to do it!) Dont you just love FOSS!!!Code:sed -e '30,$ s/\(\("[^"]*";\)\{15\}\)"[^"]*";\(\("[^"]*";\)*\)/\1"Textfield:";\3/g' < csv_export_article_818967.csv > updated.csv"I am not an alcoholic, alcoholics go to meetings"
Registered Linux user = #372327


Reply With Quote