    Rebuild Perl from source to speed up performance?

    Hi there,

    A friend who runs a small ERP system on an Intel Xeon processor asked me why perl was using about 40% of the CPU.

    That sparked my enthusiasm to research how to reduce the CPU cycles used by the "perl" process on that system.

    I downloaded Perl v5.8.8 (stable.tar.gz) and extracted it onto my Red Hat Enterprise Linux AS release 4 (Nahant Update 2) / Intel Xeon MP machine.

    I first used ./Configure to set up Perl's build configuration with gcc as the compiler, and built the perl program and libraries. I used "./perl" to run a test (attached to this thread), and I used Intel VTune(TM) Performance Analyzer for Linux to generate profiling data with "vtl activity perl_gcc -d 60 -c sampling -app "./perl," run"

    Now I can get the profiling data file (.tb5) from /root/VTune/Projects/Sampling
    and import it into the VTune Analyzer GUI. Results for the "perl" process:
    Event "Clockticks" = 8,359,449,000
    Event "Instructions Retired" = 5,340,216,000
    Clockticks per Instruction Retired (CPI) = 1.565
    Total duration = 5.81s
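The CPI figure follows directly from the two event counts. A minimal sketch of the arithmetic, assuming the clockticks count is 8,359,449,000 (the value consistent with the reported CPI of 1.565); the variable names are only illustrative:

```python
# Event counts reported by VTune for the gcc-built perl.
# Assumption: clockticks is 8,359,449,000, which matches the reported CPI.
clockticks = 8_359_449_000
instructions_retired = 5_340_216_000

# CPI = clockticks / instructions retired; lower means less stalling per instruction.
cpi = clockticks / instructions_retired
print(f"CPI = {cpi:.3f}")
```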

    I found that the CPU time is spread fairly evenly (a flat profile) across most functions, such as Perl_runops_standard(), Perl_pp_gvsv(), Perl_leave_scope(), Perl_sv_upgrade(), etc. That means optimizing any single function will improve overall performance only slightly.
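Why a flat profile limits the gain from tuning one function can be sketched with Amdahl's law. The 5% share and the 2x speedup below are hypothetical numbers chosen for illustration, not values taken from the VTune data:

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speedup of the whole run when only `fraction`
    of the run time is made `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Hypothetical: one function accounts for 5% of the time and is made 2x faster.
print(f"{overall_speedup(0.05, 2.0):.4f}")

# Upper bound if that same function cost nothing at all.
print(f"{1.0 / (1.0 - 0.05):.4f}")
```

With shares this small, even removing a function entirely gains only a few percent, which is the motivation for a whole-program approach rather than hand-tuning individual routines.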

    I had heard that the Intel C++ compiler is strong at optimizing C++ code, so I reconfigured the perl build and selected Intel icc as the compiler. However, when I rebuilt perl with icc options such as "-fast", "-ipo", "-parallel", and "-axP", the results were not satisfactory. Inspecting perl's source code, I found that it does limited data processing and has many large macro-defined functions and heavy branching inside deep loops. I realized that I could instead build a first perl with "-O2 -g -prof-gen" and then run it several times, so that the feedback (profiling) data is written into .dyn files for each of perl's source files. In a second build, I use icc with the options "-O2 -g -prof-use" to generate a new perl.
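The two-pass build described above can be laid out as a command sequence. This is a dry-run sketch that only prints the commands and does not run them. Assumptions not in the post: the workload script name `workload.pl`, the number of training runs, passing the flags through Configure's `-Doptimize`, and the `make clean` between passes.

```python
import shlex

def pgo_plan(compiler="icc", workload="workload.pl", runs=3):
    """Return the command sequence for a two-pass profile-guided build."""
    def configure(profile_flag):
        # -des accepts Configure defaults; -Dcc and -Doptimize set the
        # compiler and the optimization flags.
        return ["sh", "Configure", "-des", f"-Dcc={compiler}",
                f"-Doptimize=-O2 -g {profile_flag}"]

    # Pass 1: instrumented build, then training runs that emit .dyn files.
    plan = [configure("-prof-gen"), ["make"]]
    plan += [["./perl", workload] for _ in range(runs)]
    # Pass 2: rebuild using the collected profile data.
    # Assumption: `make clean` is enough to force recompilation here.
    plan += [["make", "clean"], configure("-prof-use"), ["make"]]
    return plan

for cmd in pgo_plan():
    print(shlex.join(cmd))
```

The training runs should resemble the real workload, since the compiler lays out code and branches according to what it observed in those runs.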

    I ran the perl test again and got the following results for the "perl" process:
    Event "Clockticks" = 7,063,479,000 (vs. gcc's 8,359,449,000)
    Event "Instructions Retired" = 5,139,120,000 (vs. gcc's 5,340,216,000)
    Clockticks per Instruction Retired (CPI) = 1.374 (vs. gcc's 1.565)
    Total duration = 5.53s (vs. gcc's 5.81s)
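The relative change between the two builds can be computed from the figures above. A small sketch, assuming the gcc clockticks count is 8,359,449,000 (the value consistent with its reported CPI); the dictionary layout is only illustrative:

```python
# Reported measurements for the two builds of perl.
gcc = {"clockticks": 8_359_449_000, "instructions": 5_340_216_000, "duration_s": 5.81}
icc = {"clockticks": 7_063_479_000, "instructions": 5_139_120_000, "duration_s": 5.53}

def reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before

for key in gcc:
    print(f"{key}: {reduction(gcc[key], icc[key]):.1%} lower with icc + profile feedback")

saved_s = gcc["duration_s"] - icc["duration_s"]
print(f"time saved: {saved_s:.2f}s")
```

Note that the reduction in clockticks for the "perl" process is larger than the reduction in total duration, so the two should not be read as the same measure.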

    It seems that I saved 0.28s (about 5% of the total time) when running perl with

    So if you want to generate an optimized perl for your system, you may choose the Intel C++ compiler: use "-prof-gen" to build an application that collects performance data, then use "-prof-use" to feed that data back to the compiler to build your final application. Optimize your code for instruction cache lines NOW!!!

    Intel VTune Analyzer can help you see the benefit gained in detail.

    Regards, Peter