- 05-19-2009 #1 Just Joined!
- Join Date
- May 2009
- Posts
- 5
Retrieve Value between tags
Hi all,
Say the HTML file shown below is the input:
<!--file.html-->
<html>
<head>
<title>1_3-BF-01</title>
</head>
<body>
<div class="TestPurpose">The content of the script element MUST
be treated as if its display property were set to the value "none"
and the content of the noscript
element printed</div>
<div class="AssertionsTested">Tests for assertions 1</div>
<script type="plain/text">The text must not be printed</script><noscript><p>This text must be printed</p></noscript>
</body>
I want the content between <div class="TestPurpose"> and </div>, so the output should be:
The content of the script element MUST
be treated as if its display property were set to the value "none"
and the content of the noscript
element printed
How can I do this?
I have a simple gawk script which, when run as ($ gawk -f getvalue.awk file.html), prints only
The content of the script element MUST
so it does the job only partially (one line). How and where do I tweak it to get the output shown above?
#getvalue.awk
function stripInputRecord ( inputRecord )
{
gsub ( /^\t*<div.*">/, "", inputRecord);
gsub ( /<\/div.*>/, "", inputRecord);
return inputRecord;
}
/TestPurpose/ {
str = str stripInputRecord( $0 );
}
END {
print str;
}
Thanks in advance for any input.
--
Thanks & Regards,
Siddu
- 05-19-2009 #2 Linux User
- Join Date
- Aug 2006
- Posts
- 458
if you have Python
Code:
#!/usr/bin/env python
import re
# re.DOTALL lets "." match newlines, so the multi-line div is captured whole
pat = re.compile(r'.*<div class="TestPurpose">(.*?)</div>.*', re.M|re.DOTALL)
data = open("file.html").read()
print(pat.findall(data)[0])
output
Code:
# ./test.py
The content of the script element MUST
be treated as if its display property were set to the value "none"
and the content of the noscript
element printed
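For anyone reusing this, the regex idea can be wrapped in a small function that returns None instead of raising IndexError when a file has no such div. This is only a sketch: the names TEST_PURPOSE, extract_test_purpose and sample are made up for illustration, and the sample string is a trimmed copy of the HTML from the question.

```python
import re

# Non-greedy capture between the opening and closing tags; re.DOTALL lets
# "." match newlines, so a div that spans several lines is captured whole.
TEST_PURPOSE = re.compile(r'<div class="TestPurpose">(.*?)</div>', re.DOTALL)


def extract_test_purpose(html):
    """Return the text inside the first TestPurpose div, or None if absent."""
    match = TEST_PURPOSE.search(html)
    return match.group(1) if match else None


# Trimmed copy of the HTML from the question (illustrative input only).
sample = (
    '<body>\n'
    '<div class="TestPurpose">The content of the script element MUST\n'
    'be treated as if its display property were set to the value "none"\n'
    'and the content of the noscript\n'
    'element printed</div>\n'
    '<div class="AssertionsTested">Tests for assertions 1</div>\n'
    '</body>\n'
)

if __name__ == "__main__":
    print(extract_test_purpose(sample))
```

The awk script in the question stops after one line because /TestPurpose/ only matches the line holding the opening tag; reading the whole file as one string and letting "." cross newlines avoids that.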
- 05-19-2009 #3 Just Joined!
- Join Date
- May 2009
- Posts
- 5
I do have Python.
Thanks, dude!
- 05-20-2009 #4 Just Joined!
- Join Date
- May 2009
- Posts
- 5
Dude, but then what if I had to iterate over several files?
I am sorry,
I don't know Python.
Help me!
- 05-20-2009 #5 Linux User
- Join Date
- Aug 2006
- Posts
- 458
assuming all files and the script are in the current directory. Please look at the documentation (see my sig) if you want to learn about Python.
Code:
#!/usr/bin/env python
import re, os
pat = re.compile(r'.*<div class="TestPurpose">(.*?)</div>.*', re.M|re.DOTALL)
for name in os.listdir("."):
    if not name.endswith(".html"):
        continue  # skip the script itself and other non-HTML files
    found = pat.findall(open(name).read())
    if found:  # some files may have no TestPurpose div
        print(found[0])
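If the regex turns out to be too brittle (extra attributes on the div, nested divs, different quoting), a parser-based variant may hold up better. The following is a rough sketch for Python 3 using the standard library's html.parser instead of a regex; PurposeParser and purpose_text are hypothetical names, and it assumes the HTML files sit in the current directory as in the post above.

```python
import glob
from html.parser import HTMLParser


class PurposeParser(HTMLParser):
    """Collect the text found inside <div class="TestPurpose"> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside a TestPurpose div
        self.chunks = []  # text pieces collected so far

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # a div nested inside the target div
        elif "TestPurpose" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def purpose_text(html):
    """Return the concatenated text of all TestPurpose divs ('' if none)."""
    parser = PurposeParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    # Print one result per HTML file in the current directory.
    for path in sorted(glob.glob("*.html")):
        with open(path) as handle:
            print(path + ": " + purpose_text(handle.read()))
```

Files without a TestPurpose div simply yield an empty string here, so the loop does not stop on them.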
- 05-20-2009 #6 Just Joined!
- Join Date
- May 2009
- Posts
- 5
Yes, I will sometime soon.

