- 09-05-2006 #1 Just Joined!
- Join Date
- Sep 2006
- Posts
- 3
bzip2 - Increase performance on traditional Linux utility
Hello,
I would like to use bzip2 v1.0.3 (http://www.bzip.org/downloads.html) to compress my documents before transferring them to another machine. When I used bzip2 to compress a 10 MB document, it took 7.36 s on a Pentium 4 2.8 GHz machine.
I used Intel(R) VTune(TM) Performance Analyzer (http://www3.intel.com/cd/software/pr...tune/index.htm) to profile bzip2 and found that the performance bottlenecks are in mainSort() in blocksort.c. So I changed the original code as follows:
a) Use the C runtime library's memset() instead of a hand-written loop
/*-- set up the 2-byte frequency table --*/
// for (i = 65536; i >= 0; i--) ftab[i] = 0;
/* pwang-enhance1: use memset() instead of zero-assignment in the loop */
memset(&ftab[0], 0, 65537 * sizeof(ftab[0])); // length is 4*65537 = 262148 bytes
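As a quick sanity check on change (a), here is a minimal sketch showing that a single memset() call leaves the table in the same state as the original loop. The helper names and the use of uint32_t are my assumptions for illustration; bzip2 itself uses its own typedefs and does this inside mainSort().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the table has 65537 entries (indices 0..65536),
   matching the bounds of the original loop. */
#define FTAB_ENTRIES 65537

/* Original approach: clear every entry with an explicit loop. */
static void clear_with_loop(uint32_t *ftab) {
    int i;
    for (i = FTAB_ENTRIES - 1; i >= 0; i--) ftab[i] = 0;
}

/* Replacement: one memset() call, with the byte count derived
   from the element type rather than hard-coded. */
static void clear_with_memset(uint32_t *ftab) {
    memset(ftab, 0, FTAB_ENTRIES * sizeof ftab[0]);
}

/* Hypothetical self-check: returns 1 if both methods produce
   identical tables when started from different garbage values. */
static int clear_methods_agree(void) {
    static uint32_t a[FTAB_ENTRIES], b[FTAB_ENTRIES];
    /* 4 bytes per entry * 65537 entries = 262148 bytes */
    assert(FTAB_ENTRIES * sizeof(uint32_t) == 262148);
    memset(a, 0xAB, sizeof a);
    memset(b, 0xCD, sizeof b);
    clear_with_loop(a);
    clear_with_memset(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Deriving the length from sizeof avoids having to keep a hard-coded byte count in sync with the element type.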
b) Unroll the large loop to reduce the number of iterations.
/*-- Complete the initial radix sort --*/
//for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];
/* pwang-enhance2 - unroll the loop, reduce the iteration */
for (i = 1; i <= 65536; i+=4) {
ftab[i] += ftab[i-1];
ftab[i+1] += ftab[i];
ftab[i+2] += ftab[i+1];
ftab[i+3] += ftab[i+2];
}
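To convince yourself that the unrolled loop in (b) computes the same running totals as the original, something like the sketch below could be used. The function names and the test data are hypothetical, and uint32_t stands in for bzip2's own typedef. Note that stepping by 4 only covers the range exactly because the 65536 iterations divide evenly by 4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FTAB_LAST 65536  /* highest index, as in the loop above */

/* Original form: one addition per iteration. */
static void prefix_sum_plain(uint32_t *ftab) {
    int i;
    for (i = 1; i <= FTAB_LAST; i++) ftab[i] += ftab[i-1];
}

/* Unrolled form: four additions per iteration. */
static void prefix_sum_unrolled(uint32_t *ftab) {
    int i;
    for (i = 1; i <= FTAB_LAST; i += 4) {
        ftab[i]   += ftab[i-1];
        ftab[i+1] += ftab[i];
        ftab[i+2] += ftab[i+1];
        ftab[i+3] += ftab[i+2];
    }
}

/* Hypothetical self-check: returns 1 if both forms give the same
   table for the same (made-up) input counts. */
static int prefix_sums_agree(void) {
    static uint32_t a[FTAB_LAST + 1], b[FTAB_LAST + 1];
    int i;
    /* Without this, the unrolled loop would skip or overrun entries. */
    assert(FTAB_LAST % 4 == 0);
    for (i = 0; i <= FTAB_LAST; i++) a[i] = b[i] = (uint32_t)(i % 7);
    prefix_sum_plain(a);
    prefix_sum_unrolled(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Each addition still depends on the previous result, so any gain comes from fewer loop-counter updates and branches, not from doing the additions in parallel.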
I also made mainSimpleSort() in blocksort.c an "inline" function. Finally, I used the optimized bzip2 to compress the 10 MB doc file, and the time dropped from 7.36 s to 5.9 s! That saves 1.46 s!
I will attach the modified blocksort.c. Would you like to rebuild the source and generate a new bzip2 on your side? Maybe you have other ideas to improve it further?
I have not yet finished optimizing decompression. Are you interested in using the VTune tool to find the bottlenecks in "bzip2 -d"? Please reply if you would like to...
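For anyone who wants to compare their own timings in the same terms, here is a small sketch of the arithmetic. The function names are made up for illustration; only the 7.36 s and 5.9 s figures come from my measurements.

```c
#include <assert.h>

/* Absolute time saved, in seconds. */
static double seconds_saved(double before, double after) {
    assert(before > 0.0 && after > 0.0);
    return before - after;
}

/* Reduction in run time as a percentage of the original time. */
static double percent_time_reduction(double before, double after) {
    assert(before > 0.0 && after > 0.0);
    return 100.0 * (before - after) / before;
}

/* Speedup factor: how many times faster the new build is. */
static double speedup_factor(double before, double after) {
    assert(before > 0.0 && after > 0.0);
    return before / after;
}
```

With before = 7.36 and after = 5.9, this works out to 1.46 s saved, roughly a 19.8% shorter run time, or a speedup of about 1.25x.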
Regards, Peter


