- 02-05-2009 #1Just Joined!
- Join Date
- Jan 2009
- Location
- Halifax, NS
- Posts
- 19
A little grep problem
Hi everyone,
I have a file that contains links (of cities) and each is on a separate line.
Problem is that one of the cities in the link has a space.
For instance, if these were the links:
http://www.website.com/Toronto....
http://www.website.com/Montreal....
http://www.website.com/Niagara Falls... <-- this one is the problem
With this code:
Code:
cat cityLinks.txt | while read a
do
  echo `grep -o 'destination.*' | sed 's/destination=//' | sed 's/&country.*RACK//'`
done
it lists the cities all on one line, like this:
Toronto Montreal Niagara Falls
I want to make directories named after the cities in the links, so I replaced 'echo' with 'mkdir'. The problem is that it creates two folders for Niagara Falls (one for Niagara and one for Falls).
How can I get it to create one folder for cities with a space in between (thus a folder called 'Niagara Falls' in this case)?
Any help is appreciated.
- 02-05-2009 #2Just Joined!
- Join Date
- Oct 2004
- Posts
- 62
Why not substitute the space with '_' (underscore)?
By hand if such towns are few, or with
Code:
cat cityLinks.txt | sed 's/ /_/'
(this substitutes only the first blank on each line)
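A small sketch of the difference between replacing the first blank and replacing every blank, in case a city name has more than one space. The city name here is only an illustration and does not come from the original file; the `g` flag is the standard sed way to apply a substitution to every match on the line.

```shell
# Hypothetical city name with two spaces (not from cityLinks.txt).
city='Sault Ste Marie'

# Without the g flag, only the first blank on the line is replaced.
first=$(printf '%s\n' "$city" | sed 's/ /_/')

# With the g flag, every blank on the line is replaced.
all=$(printf '%s\n' "$city" | sed 's/ /_/g')

printf '%s\n' "$first" "$all"
```

With a single-space name such as Niagara Falls, both forms give the same result.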
- 02-05-2009 #3Just Joined!
- Join Date
- Jan 2009
- Location
- Halifax, NS
- Posts
- 19
- 02-06-2009 #4Linux Newbie
- Join Date
- Jul 2008
- Posts
- 181
This is horribly inefficient. Since you did not give the correct links, I'll have to make a few guesses, but it seems as if you wish to discard everything up to and including "destination=", and everything including and following "&country". Sed can do all that in one go, and handle quoting:
Code:
sed 's/.*destination=\(.*\)&country.*/"\1"/' cityLinks.txt | xargs mkdir
- 02-06-2009 #5Just Joined!
- Join Date
- Jan 2009
- Location
- Halifax, NS
- Posts
- 19
Thanks for showing me that. I was taught with all the piping, but I like your method.
Since I'm still new to this, can you explain what xargs does? I've never seen it used before.
edit: I found out that \( \) treats the expression inside as a group and saves the matched characters into a temporary holding area. So I'm assuming that you reference that temporary holding area with xargs?
- 02-06-2009 #6Linux Newbie
- Join Date
- Jul 2008
- Posts
- 181
No! When using sed's "s" command, you can create groups in the regular expression and refer to the string they matched in the substitute text. There is no way to reference this information from another process! In my example, sed replaces the entire line with the part we are interested in and then prints it out. The pipe redirects the output to xargs, which uses the input to construct a command line for the following program argument, in this case "mkdir". This is a very common pattern.
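To make the two steps visible separately, here is a rough sketch that runs them on a single line. The URL is made up, since the real links were never posted in full, and `printf` stands in for `mkdir` so nothing is created on disk.

```shell
# Hypothetical link; the real query-string layout is an assumption.
line='http://www.example.com/search?destination=Niagara Falls&country=CA'

# Step 1: sed keeps only the group matched by \( \) and wraps it in quotes.
quoted=$(printf '%s\n' "$line" | sed 's/.*destination=\(.*\)&country.*/"\1"/')
printf '%s\n' "$quoted"

# Step 2: xargs reads that text, honors the quotes, and passes the result
# as ONE argument to the command that follows it.
built=$(printf '%s\n' "$quoted" | xargs printf '[%s]\n')
printf '%s\n' "$built"

# Without the quotes, xargs splits on the blank and passes TWO arguments,
# which is exactly how the two-folder problem arises.
split=$(printf '%s\n' 'Niagara Falls' | xargs printf '[%s]\n')
printf '%s\n' "$split"
```

The group reference `\1` lives entirely inside sed; xargs only ever sees the finished text that arrives through the pipe.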
- 02-06-2009 #7Just Joined!
- Join Date
- Jan 2009
- Location
- Halifax, NS
- Posts
- 19
Ok I believe I understand everything so far, but I have another question.
I want to download the pages (the links inside cityLinks.txt) with wget, each into that city's directory I just created (e.g. download Toronto's page from its link in the txt file and store it in the Toronto directory).
Would it look something like this?
Code:
sed 's/.*destination=\(.*\)&country.*/"\1"/' cityLinks.txt | xargs mkdir | wget -w 5 -i cityLinks.txt -P ./city's_directory
I'm pretty sure that's wrong, but a hint in the right direction would be appreciated.
Edit: I've used wget with the -i option before, but that downloads all links in the file to one location.
What I'm trying to do is download each link in the file to a different location. How would I be able to do that?
- 02-09-2009 #8Linux Newbie
- Join Date
- Jul 2008
- Posts
- 181
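Code:
while read -r line
do
  # strip everything up to and including "destination="
  dir=${line##*destination=}
  # strip "&country" and everything after it
  dir=${dir%%&country*}
  mkdir -p "$dir"
  # quote the URL so a space in it does not split the argument
  wget -P "$dir" "$line"
done < cityLinks.txt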
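The two parameter expansions in the loop above can be tried on their own. This is only a sketch: the link is hypothetical (the thread never shows a complete URL), and the directory is created under a temporary location from `mktemp -d` so that nothing lands in the current directory.

```shell
# Hypothetical link; the query-string layout is an assumption.
line='http://www.example.com/search?destination=Niagara Falls&country=CA'

# ${var##pattern} removes the longest matching prefix.
dir=${line##*destination=}
# ${var%%pattern} removes the longest matching suffix.
dir=${dir%%&country*}
printf '%s\n' "$dir"

# Quoting "$dir" keeps the name as one argument, so one directory is made.
tmp=$(mktemp -d)
mkdir "$tmp/$dir"
ls "$tmp"
```

Because the shell does the string trimming itself, no sed or grep process is started for each line.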
- 02-09-2009 #9Just Joined!
- Join Date
- Jan 2009
- Location
- Halifax, NS
- Posts
- 19


