This is embarrassing. I have no idea what I've been smoking. (I don't smoke, but you get the idea.)
There was a running gag in the Rocky and Bullwinkle show where Bullwinkle would try the classic magic trick of pulling a rabbit out of his hat. My favorite three-liner?
"Hey, Rocky, watch me pull a rabbit out of my hat."
"That trick never works!"
"This time for sure."
Well, this time it'll work for sure. Let me build it up piece by piece, so you know what I'm doing.
First, we have this:
Code:
sort astats.txt bstats.txt
That combines the lines from both astats.txt and bstats.txt, and outputs them in sorted order. We don't care about the order, except that we want any duplicate lines to appear next to each other. We'll get duplicate lines if we have a line which exists in both input files.
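To see the merge in action, here's a small sketch using two throwaway files. The file contents are made up for illustration (I'm assuming one "number|name" record per line), and I've put the files in a temporary directory so nothing real gets touched. Only the sort command itself comes from the text above.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '123|fred' '456|wilma' > "$tmpdir/astats.txt"
printf '%s\n' '789|barney' '123|fred' > "$tmpdir/bstats.txt"

# Merge both files; identical lines end up next to each other.
merged=$(sort "$tmpdir/astats.txt" "$tmpdir/bstats.txt")
printf '%s\n' "$merged"

# Clean up the throwaway files.
rm -r "$tmpdir"
```

The line that exists in both files (123|fred) comes out twice, back to back, which is exactly what the next step relies on.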
The next step:
Code:
sort astats.txt bstats.txt | uniq -d
The uniq -d part outputs nothing at all for each line which only appears once. It outputs exactly one copy of each line which appears at least twice (but the copies must be next to each other, which explains why we did the sort). Since we already knew that the only way we can get duplicate lines was if a line appeared in both input files, we know that these are the lines we are interested in.
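The behavior described above matches what uniq -d does, so that's the command this sketch assumes. The input lines are hypothetical and already sorted, so the two copies of the duplicate sit next to each other.

```shell
# Hypothetical sorted input: one line appears twice, the others once.
dupes=$(printf '%s\n' '123|fred' '123|fred' '456|wilma' '789|barney' | uniq -d)

# Only the repeated line survives, and only one copy of it.
printf '%s\n' "$dupes"
```

If the two copies of 123|fred were not adjacent, uniq would not notice that they match, which is why the sort has to come first.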
Ok, on to sed.
Code:
sed -e 's/fred/barney/'
That would substitute "barney" for the first occurrence of "fred" in each line. If, instead, we said
Code:
sed -e 's/fred/barney/g'
that would mean substitute "barney" for every (not just the first) occurrence of "fred" in each line. But we won't need that "g" option, so we'll leave it out.
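Here's a quick way to see the difference for yourself. The input line is just something I made up so that "fred" shows up twice.

```shell
# Without g: only the first "fred" on the line is replaced.
first_only=$(printf '%s\n' 'fred and fred' | sed -e 's/fred/barney/')

# With g: every "fred" on the line is replaced.
every_one=$(printf '%s\n' 'fred and fred' | sed -e 's/fred/barney/g')

printf '%s\n' "$first_only"   # barney and fred
printf '%s\n' "$every_one"    # barney and barney
```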
Code:
sed -e 's+fred+barney+'
substitutes barney for fred again, but we'll use plus signs for the sed command delimiter, because we'll have slashes in the substitution string.
Code:
sed -e 's+^fred$+barney+'
The ^ means beginning of line, and $ means end of line. So this substitutes barney for fred, but only on those lines which contain exactly fred and nothing else.
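A small sketch of the anchors at work, using three made-up input lines. Only the first one is exactly fred, so only that one should change.

```shell
# Three hypothetical lines: an exact match, and two that merely contain "fred".
anchored=$(printf '%s\n' 'fred' 'fred flintstone' 'alfred' |
  sed -e 's+^fred$+barney+')

printf '%s\n' "$anchored"
# barney
# fred flintstone
# alfred
```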
Code:
sed -e 's+^.*$+barney+'
In this example, the period (".") means, roughly, any single character you can imagine. The asterisk ("*") means "as many of the preceding thing as you can find, and zero is an acceptable quantity of that preceding thing". So .*$ means "as many of any character as you can find, up to the end of the line". This command will replace every single line with barney. If the file has 534 text lines containing anything and everything and nothing on its lines, you will end up with 534 lines of barney.
Take note of the .*$ in that command; I'll revisit it later.
Anyway, that sounds useless, but now look at this:
Code:
sed -e 's+^.*|+barney+'
All we did was to add the pipe symbol after the asterisk, and remove the dollar sign, which meant end of line. So we're instructing sed that wherever it finds anything at the beginning of a line, for any number of characters, followed by a pipe symbol, replace all of that with barney. Anything after the pipe symbol remains in the output as it was in the input. If it doesn't find the pipe symbol at all, the line will be output exactly as it was input.
(If there's more than one pipe symbol in a line, sed will be "greedy" and absorb through the final one, and everything up to that will be replaced by barney. But I'm assuming you don't have pipe symbols in your file names, so that won't concern us here.)
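The three cases just described (one pipe symbol, more than one, none at all) can be checked with a sketch like this. The input lines are invented for the purpose.

```shell
# One pipe: everything through the pipe becomes barney.
one_pipe=$(printf '%s\n' '123|fred' | sed -e 's+^.*|+barney+')

# Two pipes: sed is greedy and swallows everything through the last one.
two_pipes=$(printf '%s\n' '123|fred|wilma' | sed -e 's+^.*|+barney+')

# No pipe: the pattern doesn't match, so the line passes through untouched.
no_pipe=$(printf '%s\n' 'no pipe here' | sed -e 's+^.*|+barney+')

printf '%s\n' "$one_pipe"    # barneyfred
printf '%s\n' "$two_pipes"   # barneywilma
printf '%s\n' "$no_pipe"     # no pipe here
```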
What we want to do, of course, is simply to remove everything up to and including the pipe symbol, so we just remove barney from the command:
Code:
sed -e 's+^.*|++'
So if we have a line that looks like this:
Code:
123|fred barney wilma betty
we'll get this output:
Code:
fred barney wilma betty
We want to grab that data and show it twice on the line, though. That's where the escaped parentheses come in, \( and \). If those appear in the regular expression just once, they mean that anything between them is to be regarded as subexpression 1. If twice, then you have subexpressions 1 and 2. They can be referred to in the replacement part by \1 and \2, respectively. So this command:
Code:
sed -e 's+^.*|\(.*\)$+\1\1+'
says to get rid of the stuff before and including the pipe symbol. The .* between the escaped parentheses, together with the dollar sign, means everything else, up to the end of the line, as in the earlier example that replaced every line with barney. The characters found in the input line which "fit" between the escaped parentheses are considered to be subexpression 1. So the above sed command, run against this data:
Code:
123|fred barney wilma betty
will give the output
Code:
fred barney wilma bettyfred barney wilma betty
If we have this command:
Code:
sed -e 's+^.*|\(.*\)$+rrr\1eee\1ddd+'
and give it the same input, we'll get:
Code:
rrrfred barney wilma bettyeeefred barney wilma bettyddd
Now, this is where I messed up the last time, in two ways. One was misplacing the plus signs; the other was forgetting the double quotation marks that protect spaces, not just in the first (constant) part of a file's path name, but also in the rest of the file name, which might vary from one line to the next. We want something like this:
Code:
cp -pv "source location" "destination location"
Fortunately, double quotation marks have no special meaning to sed in this situation, so we'll just put them where we want them in the replacement string:
Code:
sed -e 's+^.*|\(.*\)$+cp -pv "Volumes/Backup/Backup SAN/CreateSAN/CURRENT/ABA\1" "Volumes/SAN/CURRENT/ABA\1"+' > copysametime.sh
Notice the crucial lack of spaces before the \1 in each occurrence. That's one thing I weeded out. If the spaces are in there, you'll have file or directory names beginning with spaces. This is literal stuff we're working with here.
Do you want a slash before each occurrence of "Volumes"? If so, put it there, in both places if you want.
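Putting the pieces together, the whole thing might look roughly like the sketch below. Some assumptions on my part: the duplicate-finding step is uniq -d, the sample lines are made up (I'm guessing at a "number|/file name" layout), and everything is written to a temporary directory so you can inspect the result without touching real files. The script is only generated here, not run.

```shell
# Hypothetical sample data: one line is common to both files.
tmpdir=$(mktemp -d)
printf '%s\n' '123|/one file.txt' '456|/only in a.txt' > "$tmpdir/astats.txt"
printf '%s\n' '123|/one file.txt' '789|/only in b.txt' > "$tmpdir/bstats.txt"

# Sort, keep only lines present in both files, turn each into a cp command.
sort "$tmpdir/astats.txt" "$tmpdir/bstats.txt" | uniq -d |
  sed -e 's+^.*|\(.*\)$+cp -pv "Volumes/Backup/Backup SAN/CreateSAN/CURRENT/ABA\1" "Volumes/SAN/CURRENT/ABA\1"+' \
  > "$tmpdir/copysametime.sh"

# Look at the generated script before even thinking about running it.
generated=$(cat "$tmpdir/copysametime.sh")
printf '%s\n' "$generated"

# Clean up the throwaway files.
rm -r "$tmpdir"
```

It's worth reading through the generated script before running it for real. The double quotation marks take care of spaces, but a file name containing a double quotation mark, a dollar sign, a backtick or a backslash would still be interpreted by the shell inside those quotes.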
I hope I have it right this time. Are those rabbit ears I see twitching in that hat?