  1. #1
    Just Joined!
    Join Date
    Jul 2009
    Posts
    70

    Win32 OLE in deeper detail in Perl


    Hello Gurus,
    I am a beginner in Perl. I would like to ask several questions, some about Perl and its syntax, but most about Win32 OLE. My main goal is to develop a script that checks the structure of a Word document (returning some information) and makes some changes to the document, if possible. I am not sure whether all of that can be done through Perl and OLE. First, I apologize for posting about Microsoft OLE here, but most of what I have found came from this site. I also tried MSDN, but it seems very unclear and useless to me. My latest experiments with OLE are driving me crazy, so I hope somebody can help.


    1st
    ===
    How do I properly start Win32 OLE? I found this approach somewhere:

    Code:
    my $word = Win32::OLE->GetActiveObject('Word.Application')
        || Win32::OLE->new('Word.Application','Quit')
        or die Win32::OLE->LastError();
    This seems clear, but I have a few questions to make sure I understand:
    1. Can I replace 'or' with '||'? Do both stand for logical or?
    2. What is really going on in this code?
    Does Perl try to attach to a running instance of the Word application, start its own instance if that fails, and print the error if that fails too? What would happen if it attaches to an existing instance, and how is such an instance created? Is it just an already started Word process?
    3. I read in the OLE documentation that the second argument is a destructor (though it does not seem to be mandatory). What is its real purpose? I know it is the opposite of a constructor, but does a method named 'Quit' have to be created, or is that done automatically?
    4. What do "::" and "->" in the code above stand for? Do they access methods in a package or class?
    5. Is 'use warnings;' the same as perl -w?

    2nd
    ===
    Some code I found on the internet regarding OLE continues as follows:

    Code:
    my $doc = $word->Documents->Open('C:\\Perl\\home\\001f.doc');
    but then I also found some odd constructs, such as:
    Code:
    my $doc = $word->Documents->Open( { FileName => 'C:\\Perl\\home\\001f.doc', ReadOnly =>1 }) or die Win32::OLE->LastError();
    SaveAs({FileName => 'exampletext.doc', FileFormat =>  wdFormatDocument,})
    $doc->Close( { SaveChanges => $wdc->{wdDoNotSaveChanges} } );
    The first approach seems pretty clear: it calls the Open method, which belongs to Documents (a class, package, or whatever it is). But what do the others do?
    What does {FileName => 'C:\\Perl\\home\\001f.doc', ReadOnly =>1} mean in the function arguments, and why the curly braces?
    What is the difference between '=>' and '->'? Is enclosing the properties (or whatever the correct name is) in '{}' necessary?
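The syntax asked about here is plain Perl rather than anything OLE-specific, so it can be tried without Word. The following is a minimal sketch; the file name 'demo.doc' is only an illustrative value. Win32::OLE treats a hash reference passed as the last argument of a method call as a set of named parameters, which is why the braces appear in the calls above.

```perl
use strict;
use warnings;

# '=>' is a "fat comma": it acts like ',' and also quotes a bare word on its left.
my @with_fat_comma = (FileName => 'demo.doc', ReadOnly => 1);
my @with_commas    = ('FileName', 'demo.doc', 'ReadOnly', 1);

# Curly braces around such a list build an anonymous hash and return a reference to it.
my $args = { FileName => 'demo.doc', ReadOnly => 1 };

# '->' dereferences the reference (and, on objects, calls methods).
print "FileName: $args->{FileName}\n";
print "ReadOnly: $args->{ReadOnly}\n";
print "same list\n" if "@with_fat_comma" eq "@with_commas";
```

So '=>' only separates list items, while '->' follows a reference or calls a method. The two are unrelated despite looking similar.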

    I have found that VBA has a nearly identical syntax, 'ComputeStatistics(Statistic:=wdStatisticWords)'. I assume the two are related, because with OLE I am in fact using Microsoft technologies from Perl. I also found a solution that works with the same functions and parameters as VBA (but without this strange assignment); here it is: Comment on

    3rd
    ===
    On this site (Regarding the "Comment" object in Perl's Win32::OLE) it is mentioned that M$ Word does not store the number of pages contained in a document. So every time OLE is used, the document is opened and the page count is recalculated according to styles, font size, and so on. According to M$ this seems to be no problem (WD98: Sample VBA Macro to Count Number of Pages in Document; WD97: VBA Macro to Return (Count) Total Number of Pages), or is the page count also recalculated when the file is opened? I also found two working solutions for Perl: one using wdPropertyPages (winapi - Why are the number of pages in a Word document different in Perl and Word VBA? - Stack Overflow) and one using wdStatisticPages (Comment on). So where is the truth: is the page count stored in the document or not?



    4th
    ===
    As I mentioned, the Perl, VBA, and PowerShell code I have found seems to have a lot in common (I am not an expert in any of those languages, but they access similar variables). The following page describes how to obtain the number of words and the number of pages in a document. As one user suggested, they can be obtained with the '$selection->Words->{Count};' and '$selection->pagenumbers->{Count};' constructs. However, when I search for the word 'selection' in the Object Browser of M$ Visual Basic (from a Word document, press Alt+F11 and then F2), I find the following:

    Code:
    Class Selection
        Member of Word
    ------------------------------------
    	Property Words As Words
        read-only
        Member of Word.Selection
    
    	
    	Property Characters As Characters
        read-only
        Member of Word.Selection
    ==========================
    Sub ShrinkDiscontiguousSelection()
        Member of Word.Selection
    ------------------------------------
    Property Words As Words
        read-only
        Member of Word.Selection
    
    Property Characters As Characters
        read-only
        Member of Word.Selection

    As you can see, both also contain variables (or properties; please correct me on the proper name) for 'Characters' and 'Words', but both 'Characters' and 'Words' seem to be members of 'Word.Selection'. How should I understand that? I also searched for 'pagenumbers', as mentioned in the link above, but found nothing except several 'wdPageNumberStyle' entries and 'PageNumbers', not 'pagenumbers' (lowercase). I also did not find in the Object Browser that 'Word.Selection.Words' or 'Word.Selection.Characters' has a 'Count' method (or property, whichever is the correct name). Where does this method (property) come from? And what does the word 'As' mean in the output above? Is it some data type?

    Here is the code I mentioned, slightly altered:

    Code:
    #!/usr/bin/perl
    use Cwd 'abs_path';
    use warnings;
    use strict;
    use Win32::OLE 'CP_UTF8';
    $Win32::OLE::CP = CP_UTF8;
    binmode STDOUT, 'encoding(utf8)';
    
    print abs_path($0) . "\n";
    print "=========\n";
    my $document_name = 'C:\\Perl\\home\\thisIsPerl.doc';
    my $word = Win32::OLE->GetActiveObject('Word.Application')
        || Win32::OLE->new('Word.Application')
        or die Win32::OLE->LastError();
    	
    $word-> {visible} = 0;
    $word->Application->Selection;
    
    my $document = $word->Documents->Open( { FileName => $document_name, ReadOnly =>1 }) or die Win32::OLE->LastError();
    my $paragraphs = $document->Paragraphs ();
    my $n_paragraphs = $paragraphs->Count ();
    
    print "Words:", $word->Selection->Words->{Count}, "\n";
    print "Characters:", $word->Selection->Characters->{Count}, "\n";
    print "Paragraphs: ", $word->Selection->Paragraphs->{Count}, "\n";
    
    $document->Close();
    $word->exit;
    $word->Quit;
    
    
    Administrator@cepido /cygdrive/c/Perl/home
    $ ./internet04_pgcnt.pl
    /cygdrive/c/Perl/home/internet04_pgcnt.pl
    =========
    Words:1
    Characters:1
    Paragraphs: 1



    But this code does not work properly. It always returns a word count of 1, no matter how many words are in the document. These investigations lead me to another, probably the most important, question: how are all those OLE objects organized? The Object Browser is unclear to me, and I also downloaded OLE/COM Object Viewer, with no luck either. I know this is not a standard Perl question, but I do not know where else to ask. One idea that comes to mind is to somehow list all methods (properties, variables, packages) available in OLE through Perl and then try several of them by name. Is this possible?



    5th
    ===
    Is it possible to process a Word document character by character? Or, even better, is it possible to query data from Word as if it were SQL? Simply put, something like select * from document where Font=Italic?

    I think I have managed to read the document word by word here, though there are some small mistakes:

    Code:
    #!/usr/bin/perl -w
    
    use strict;
    use warnings;
    use Win32::OLE::Const 'Microsoft Word';
    my $file = 'C:\\Perl\\home\\thisIsPerl.doc';
    
    my $Word = Win32::OLE->new('Word.Application', 'Quit');
    
    $Word->{'Visible'} = 0;
    my $doc = $Word->Documents->Open($file);
    my $paragraphs = $doc->Paragraphs() ;
    my $n_paragraphs = $paragraphs->Count ();
    
    for my $p (1..$n_paragraphs) {
    
    	my $paragraph = $paragraphs->Item ($p);
        my $words = Win32::OLE::Enum->new( $paragraph->{Range}->{Words} );
    
        while ( defined ( my $word = $words->Next() ) ) {
            my $font = $word->{Font};
    		print "IN_Text:", $word->{Text}, "\n" if $word->{Text} !~ /\r/;
    		#print $text;
            #$font->{Bold} = 1 if $word->{Text} =~ /Perl/;
    		
        }
    	print "=============\n";
    }
    
    $Word->ActiveDocument->Close ;
    $Word->exit;
    $Word->Quit;

    It works, but it throws an error at the end and does not process headers and footers.






    6th
    ===
    I searched and found the following in the Object Browser:
    Const wdNumberOfPagesInDocument = 4
    Member of Word.WdInformation

    Const wdStatisticPages = 2
    Member of Word.WdStatistic

    What do those numbers mean? I am sure they do not correspond to the actual number of pages in the Word document (I was playing with the working code from Comment on).






    7th
    ===
    Finally, the last question. I read somewhere that OLE needs the full path to open a Word document. I would like to pass the document to the script as an argument without having to specify the full path (the full path should be added after the name is passed to the script). I found somewhere that 'abs_path($0)' is used for something similar, but I had no luck. Also, on Windows the backslashes must be escaped, and so on.
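One possible approach to the path question, as a sketch only: the core File::Spec module can turn a relative name from the command line into an absolute path. Note that abs_path($0) resolves the path of the script itself, not of the document. The helper name to_absolute and the fallback file name are illustrative assumptions.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: turn whatever was given on the command line into an absolute path.
sub to_absolute {
    my ($name) = @_;
    return File::Spec->rel2abs($name);   # relative names are resolved against the current directory
}

my $given = @ARGV ? $ARGV[0] : 'thisIsPerl.doc';   # fall back to a sample name
my $full  = to_absolute($given);
print "$full\n";
```

A path that arrives through @ARGV needs no backslash escaping; escaping only matters for string literals written in the source. Under Cygwin the result would be a POSIX-style path, so it would probably still need converting to a Windows path (for example with the cygpath utility) before being handed to Word.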




    I am sorry for the long post, but I am stuck on the points I have described, and I hope somebody knows the answers. Thanks a lot for any ideas.

  2. #2
    Trusted Penguin
    Join Date
    May 2011
    Posts
    4,353
    Quote Originally Posted by wakatana
    1st
    ===
    How do I properly start Win32 OLE? I found this approach somewhere:
    I am not sure about that double-or syntax. I would code it more like the example in the official Win32::OLE documentation. The documentation is here:

    Win32::OLE - search.cpan.org

    Here is the example shown there, similar to what you posted, but using Excel instead of Word:

    Code:
    use Win32::OLE;
    
    # use existing instance if Excel is already running
    eval {$ex = Win32::OLE->GetActiveObject('Excel.Application')};
    die "Excel not installed" if $@;
    unless (defined $ex) {
      $ex = Win32::OLE->new('Excel.Application', sub {$_[0]->Quit;})
      or die "Oops, cannot start Excel";
    }
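    This seems clear, but I have a few questions to make sure I understand:
    1. Can I replace 'or' with '||'? Do both stand for logical or?
    For the most part, yes: both are logical or, but they differ in precedence ('||' binds more tightly than '=', while 'or' binds more loosely). When in doubt about a value being assigned, use ||. See this discussion for a thorough explanation.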
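The precedence difference matters exactly in statements like the one in the question. Here is a minimal sketch that runs without Word; get_active and start_new are hypothetical stand-ins for the two OLE calls.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two OLE calls in the question.
sub get_active { return undef }       # pretend no running instance was found
sub start_new  { return 'new-word' }  # pretend a new instance was started

# '||' binds tighter than '=', so the fallback value is what gets assigned.
my $with_pipes = get_active() || start_new();

# 'or' binds looser than '=', so this parses as (my $with_or = get_active()) or start_new();
# start_new() runs, but its result is thrown away.
my $with_or = get_active() or start_new();

print defined $with_pipes ? "||: $with_pipes\n" : "||: undef\n";
print defined $with_or    ? "or: $with_or\n"    : "or: undef\n";
```

With '||' the variable ends up holding the fallback; with 'or' in the same position it stays undefined. That is why the original statement uses '||' between the two candidates and 'or' only before die.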

    2. What is really going on in this code?
    Does Perl try to attach to a running instance of the Word application, start its own instance if that fails, and print the error if that fails too?
    Yes

    3. I read in the OLE documentation that the second argument is a destructor (though it does not seem to be mandatory). What is its real purpose? I know it is the opposite of a constructor, but does a method named 'Quit' have to be created, or is that done automatically?
    From the documentation:

    Please note the destructor specified on the Win32::OLE->new method. It ensures that Excel will shutdown properly even if the Perl program dies.
    Code:
    $ex = Win32::OLE->new('Excel.Application', \&OleQuit) or die "oops\n";
    sub OleQuit { my $self = shift; $self->Quit; }  # called when the object is destroyed

    4. What do "::" and "->" in the code above stand for? Do they access methods in a package or class?
    The :: is the Perl package name separator. The package (module) name here is Win32::OLE, and each part before a :: maps to a directory, so if you look around your file system, somewhere you will find something like:

    Code:
    c:\perl\site\lib\Win32\OLE.pm
    The arrow calls a method provided by the module. A method is like a function that receives the class or object it was called on as its first argument.
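Both pieces of syntax can be seen in a few lines of plain Perl. This is only a sketch: My::Greeter is a made-up package for illustration and has nothing to do with Win32::OLE itself.

```perl
use strict;
use warnings;

# Hypothetical package, only for illustration; '::' separates the name parts,
# just as in Win32::OLE (which lives in Win32/OLE.pm).
package My::Greeter;

sub new {
    my ($class, $name) = @_;          # a class method gets the class name first
    return bless { name => $name }, $class;
}

sub greet {
    my ($self) = @_;                  # an object method gets the object first
    return "Hello, $self->{name}";
}

package main;

my $greeter = My::Greeter->new('Word');   # class method call, like Win32::OLE->new(...)
my $message = $greeter->greet;            # object method call, like $word->Quit
print "$message\n";
```

In the question's code, Win32::OLE->new(...) is the first kind of call and $word->Documents is the second.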

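    5. Is 'use warnings;' the same as perl -w?
    Nearly. Both turn warnings on, but 'use warnings;' applies only to the file or block it appears in, while -w enables warnings globally, including inside modules you load.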
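One practical difference is that 'use warnings;' is lexically scoped, so it can be switched on and off per block. A minimal sketch that collects the warnings in an array so the effect is visible:

```perl
use strict;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };   # collect warnings instead of printing them

{
    use warnings;               # enabled only inside this block
    my $missing;
    my $text = "a" . $missing;  # warns about an uninitialized value
}
{
    no warnings;                # disabled only inside this block
    my $missing;
    my $text = "b" . $missing;  # silent
}

print scalar(@caught), " warning(s) caught\n";
```

Only the first block produces a warning, although both do the same thing.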

    -----

    It is hard to answer all those questions! You ought to consider breaking them up and focusing on one at a time (you'll get better feedback here, for sure).

    Also, be sure to check out perlmonks.org - it is a great site for troubleshooting perl code and seeking enlightenment (but again, they might ignore such a long post - or tell you something like I just did).
