- 07-23-2009 #1 Just Joined!
- Join Date
- Jul 2009
- Posts
- 3
[SOLVED] Dealing with spaces in awk print
I am trying to get my Google Mini to index my Red Hat box. It needs to follow links to index a folder that is just full of PDF files.
I used:
Code:
ls | awk '{print "<a href=\"mydocs/"$1"\">"$1"</a>"}' >> jump.html
to create a page full of links that the Mini can find and follow.
80% of my filenames do not have spaces in them, but the ones that do have spaces do not show up in jump.html correctly. All I get is the characters before the first space.
audit_report_399.pdf shows up like:
Code:
<a href="/mydocs/audit_report_399.pdf">audit_report_399.pdf</a>
but audit report 400.pdf shows up like:
Code:
<a href="/mydocs/audit">audit</a>
Is there any way to change the awk print command I am using so that it uses %20 for the spaces? Or is there another solution?
Greg
- 07-23-2009 #2 Linux User
- Join Date
- Jan 2007
- Location
- cleveland
- Posts
- 452
Welcome to the forum.
One thing would be to run the "ls" output through "sed", replacing all blanks with something else, like this:
ls | sed 's/ /_/g'
Using the underscore, no filenames have any blanks now; proceed as before.
the sun is new every day (heraclitus)
- 07-23-2009 #3 Just Joined!
- Join Date
- Jul 2009
- Posts
- 3
I was thinking along those lines with sed, but if I actually rename the files it will break the links to them in my CMS (Joomla).
If I can replace the spaces with %20, then the links from the CMS should still work...
Hmm....
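The difference between the two ideas can be sketched like this. The helper names `underscored` and `encoded` are made up for illustration; both only transform text and neither touches the files on disk.

```shell
# underscored: what the earlier sed suggestion prints for a name.
# The result no longer matches the real file name unless the file is renamed too.
underscored() {
    printf '%s\n' "$1" | sed 's/ /_/g'
}

# encoded: the same name with each space written as %20.
# A web server decodes %20 back to a space, so the URL still points at the original file.
encoded() {
    printf '%s\n' "$1" | sed 's/ /%20/g'
}

underscored 'audit report 400.pdf'   # prints audit_report_400.pdf
encoded 'audit report 400.pdf'       # prints audit%20report%20400.pdf
```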
- 07-25-2009 #4 Linux User
- Join Date
- Aug 2006
- Posts
- 458
then try using $0 instead of $1
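A minimal sketch of that suggestion, assuming one file name per line on standard input. In awk, `$1` is only the first whitespace-separated field, which is why the names were cut at the first space, while `$0` is the whole line. The `make_links` name and the `href` variable are illustrative additions, and only spaces are encoded here.

```shell
# make_links: read one file name per line, print one HTML link per line.
# (Hypothetical helper name; "mydocs/" is the prefix from the original command.)
make_links() {
    awk '{
        href = $0                 # $0 is the whole line, spaces included
        gsub(/ /, "%20", href)    # encode spaces in the URL copy only
        print "<a href=\"mydocs/" href "\">" $0 "</a>"
    }'
}

# Example: two names, one of them containing spaces.
printf '%s\n' 'audit_report_399.pdf' 'audit report 400.pdf' | make_links
```

In the original pipeline this would be used as `ls | make_links >> jump.html`; the link text keeps the readable name while the href gets the encoded one.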
- 07-26-2009 #5 Just Joined!
- Join Date
- Jul 2009
- Posts
- 58
If you're trying to build a searchable index then you can't change the names of the files or remove the spaces, unless you do that on the file system as well.
To keep those characters in a URL you have to percent-encode them: replace each one with % followed by its ord value in hex.
So create an encode.sed file with:
Code:
s/%/%25/g
s/ /%20/g
s/\t/%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/&/%26/g
s/'/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/=/%3d/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g
Then you can do:
Code:
ENCODED=$(echo "${FOO}" | sed -f encode.sed)
Now $ENCODED should be the encoded string that you can store and request from the web server.
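To tie this back to the original question, here is a rough sketch of feeding a list of file names through a sed script and writing one link per name. It assumes a cut-down script with only four rules, written to a scratch directory so the example is self-contained; `workdir`, `encode_name` and `build_page` are hypothetical names, and the link text is not HTML-escaped.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)

# A cut-down encode.sed. The "%" rule must come first, otherwise the "%"
# characters inserted by later rules would themselves be re-encoded.
cat > "$workdir/encode.sed" <<'EOF'
s/%/%25/g
s/ /%20/g
s/#/%23/g
s/&/%26/g
EOF

# encode_name: percent-encode one file name using the sed script above.
encode_name() {
    printf '%s\n' "$1" | sed -f "$workdir/encode.sed"
}

# build_page: read names, one per line, and print a link for each.
# The href gets the encoded name; the link text keeps the readable one.
build_page() {
    while IFS= read -r name; do
        printf '<a href="mydocs/%s">%s</a>\n' "$(encode_name "$name")" "$name"
    done
}

printf '%s\n' 'audit report 400.pdf' '100% final #2.pdf' | build_page
```

With the full encode.sed in place of the short one, the same loop could be run as `ls | build_page > jump.html` from inside the PDF folder. The scratch directory can be deleted afterwards.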