- 10-23-2007 #1Just Joined!
- Join Date
- Oct 2007
- Posts
- 2
extract part of the file/read next lines if condition is fullfiled on the current/awk
Hi all!
I need some help.
I have a big file that looks like this:
Code:
<page>
<title>...</title>
<..../>
....
</page>...
This is what I want to extract:
Code:
<page>
<title>Talk:Atlas Shrugged</title>
<id>128</id>
<revision>
<id>152717854</id>
<timestamp>2007-08-21T16:32:33Z</timestamp>
<contributor>
<username>Marlith</username>
<id>4871029</id>
</contributor>
<minor />
<comment>/* The quotes do not belong here */</comment>
<text xml:space="preserve">{{NovelsWikiProject|class=B|importance=High}} ...... </text>
</revision>
</page>
So what I need is: when
Code:
if('<title>Talk:')
appears, start printing (or storing in another file) up to
Code:
if(!'<page>')
then stop reading and search for
Code:
if('<title>Talk:')
again.
I don't see how to read the lines that come after
Code:
<title>Talk:
because sed and grep read one line at a time, and next or getline is not enough for me. Is it possible to define a global flag, or are there any other ideas?
If possible, please comment your examples.
Thank you in advance,
Zina
- 10-24-2007 #2
Code:
awk '/<page/ { tag = $0 } /<title>Talk:/ { $0 = tag "\n" $0; f=1 } f && /<\/page/ { print; f=0 } f' filename
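For readers who want to try the one-liner, here is a minimal sketch of the same program written one rule per line and run against a tiny sample file. The file names sample.xml and talk_pages.xml and the sample content are assumptions made up for illustration; they are not from the original post.

```shell
# Build a small sample file (hypothetical data, for illustration only).
cat > sample.xml <<'EOF'
<page>
<title>Atlas Shrugged</title>
<id>127</id>
</page>
<page>
<title>Talk:Atlas Shrugged</title>
<id>128</id>
</page>
EOF

# Same logic as the one-liner, one rule per line.
awk '
  /<page/        { tag = $0 }                 # remember the latest opening <page> line
  /<title>Talk:/ { $0 = tag "\n" $0; f = 1 }  # prepend it to the title line, turn the flag on
  f && /<\/page/ { print; f = 0 }             # closing tag: print it, turn the flag off
  f                                           # no action: print the record while the flag is on
' sample.xml > talk_pages.xml

cat talk_pages.xml
```

Only the second page should end up in talk_pages.xml, because the flag is never switched on for the page whose title does not start with Talk:. Redirecting the output with > covers the "store in other file" part of the question.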
- 10-24-2007 #3Just Joined!
- Join Date
- Oct 2007
- Posts
- 2
perfect
Wow! It works perfectly! Thank you very much!
Still, I don't want to stay a dummy, so let me check that I understand your code.
You find <page and save it in tag.
Then you read the next line. It is not <page, but it satisfies the next condition, <title>Talk:, so you edit the current line by prepending tag "\n" to $0 and set the flag f=1.
Then you read the third line. The first and second conditions are not fulfilled, but the flag is 1 and the </page condition is not fulfilled; this line is $1. Reading goes on, comparing each line with all three conditions (<page, <title, </page); these lines are $2-${infinity}.
After we meet
Code:
</page
we print everything starting from $0 and set our flag to zero. Now we look again for a line that fulfills the condition.
Is that right?
But really wow!
Thank you and respect,
Zina
- 10-24-2007 #4
So,
when the current record matches the pattern "<title>Talk:",
we modify the current record $0: we prepend tag (the saved <page> line, which here is the previous record)
and a newline, then we set f (our flag) to 1.
We continue: if the flag is set to 1 (true) AND the current record matches
the pattern "</page", we print the record and set the flag to 0 (false).
The final f does the real work: it is a pattern with no action, so it is tested against every record, and if the flag is set to 1 (true), the record is printed.
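To make the role of the bare f concrete, here is a small sketch that runs the original one-liner next to a longhand version in which the last rule is spelled out as f { print $0 }. The file name pages.xml and its content are hypothetical, chosen only for this illustration.

```shell
# Hypothetical sample input, for illustration only.
cat > pages.xml <<'EOF'
<page>
<title>Main</title>
</page>
<page>
<title>Talk:Main</title>
<text>hello</text>
</page>
EOF

# Original form: the last rule is the bare pattern f.
short=$(awk '/<page/ { tag = $0 } /<title>Talk:/ { $0 = tag "\n" $0; f=1 } f && /<\/page/ { print; f=0 } f' pages.xml)

# Longhand form: the bare pattern written out with its default action.
long=$(awk '
  /<page/        { tag = $0 }
  /<title>Talk:/ { $0 = tag "\n" $0; f = 1 }
  f && /<\/page/ { print $0; f = 0 }
  f              { print $0 }
' pages.xml)

# Both forms should produce identical text.
[ "$short" = "$long" ] && echo "same output"
printf '%s\n' "$long"
```

The closing </page> line is not printed twice, because the third rule has already reset f to 0 by the time the last rule is tested for that record.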