Using diff and patch
Contributed by Scott Robbins in Applications on 2006-03-06 13:38:21

The diff and patch utilities can be intimidating to the newcomer, but they are not all that difficult to use, even for the non-programmer. If you are at all familiar with makefiles, you might find yourself frequently wanting to patch a file, either to correct an error that you've found or to add something that you need to the makefile.

After I began using the mrxvt terminal, I wanted to give it Japanese capability. My main O/S is FreeBSD, which manages software with its ports and package system. To install a package from a port, one uses the port's Makefile, which will download and compile the source code, in a manner familiar to those who use Gentoo's portage (which was inspired by FreeBSD ports) or Arch Linux's makepkg.

In this case, I wanted to edit the port's Makefile to enable Japanese support.

To do this, I simply had to add a line to the Makefile.

CONFIGURE_ARGS+=  	--enable-xim --enable-cjk --with-encoding=eucj	

Ok, this is simple. However, after doing this, I thought that perhaps I should submit a patch to the port's maintainer, giving others the opportunity to include Japanese support. This was a little more complicated, because the change to the Makefile meant that I should include a message when they installed the port, telling them what to do if they wished to include Japanese capability.

So, I had to add the following lines, in their proper place in the port's Makefile.

.if defined(WITH_JAPANESE)
CONFIGURE_ARGS+=  	--enable-xim --enable-cjk --with-encoding=eucj
.endif # WITH_JAPANESE

pre-everything::
	@${ECHO_MSG} "=========================================>"
	@${ECHO_MSG} "For Japanese support use make -DWITH_JAPANESE install"
	@${ECHO_MSG} "=========================================>"

I created my new Makefile, and then, using the diff command, created a patch.

diff -uN Makefile Makefile.new > patch.Makefile

The diff command has various flags. Running diff with no flags on two files shows something like this (I'm just showing a few lines here):

5c5
< # $FreeBSD: /repoman/r/pcvs/ports/x11/mrxvt/Makefile,v 1.7 2005/07/22 22:38:58 pav Exp $
---
> # $FreeBSD: ports/x11/mrxvt/Makefile,v 1.7 2005/07/22 22:38:58 pav Exp $
22a23,26
> .if defined(WITH_JAPANESE)
> CONFIGURE_ARGS+=  	--enable-xim --enable-cjk --with-encoding=eucj	
> .endif # WITH_JAPANESE
> 
25a30,37

In this simple example, you can probably figure out its meaning. The < refers to lines in the original file that aren't in the new one and > refers to lines in the new file that aren't in the old one.

The 5c5 means that there is a difference in the 5th line. The c means something would have to be changed for them to match. The 25a30,37 means that text would be added after line 25 of the old file, where it becomes lines 30 to 37 of the new file. We don't have one in this case, but the letter d is also used, for text to be deleted.
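To see all three change commands at once, here is a small sketch. The file names old.txt and new.txt and their one-letter contents are made up for illustration; they have nothing to do with the mrxvt port.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# old.txt and new.txt are hypothetical example files, one letter per line.
printf '%s\n' a b c d e > old.txt
printf '%s\n' a B c e f > new.txt

# Plain diff exits with status 1 when the files differ; that is not an error here.
normal_diff=$(diff old.txt new.txt)
printf '%s\n' "$normal_diff"
```

The output should contain 2c2 (line 2 was changed), 4d3 (line 4 of the old file was deleted) and 5a5 (a line was added after line 5 of the old file).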

This is a bit hard to read, especially if there are many differences. Therefore, most people prefer unified diffs, diff with the -u flag. This gives us something like (again, with many lines snipped)

--- Makefile.orig	Sat Sep 10 17:16:53 2005
+++ Makefile	Fri Sep 16 03:13:52 2005
@@ -20,9 +20,21 @@
 USE_X_PREFIX=	yes
 GNU_CONFIGURE=	yes
 USE_REINPLACE=	yes
+.if defined(WITH_JAPANESE)
+CONFIGURE_ARGS+=  	--enable-xim --enable-cjk --with-encoding=eucj	
+.endif # WITH_JAPANESE

This shows a few lines before and after the change, which helps define context. (I've snipped the lines below this change, but you can see that three lines above it are included.) Let's examine this a bit. The first lines are fairly straightforward: they have --- and the old file's name, then +++ and the name of the new file. Each also contains the file's mtime (the time the file was last modified). Next is what is known as the hunk. It begins with a header line that starts with @@, then has the old file's starting line, the old number of lines, the new start and the new number of lines, then another @@.

Understand that the three lines above and below the change remain as they are; they are there simply to give context. The minus sign refers to the old file. Counting that context, the hunk starts at line 20 of the old file, and lines 20-22 will remain unchanged. Including the 3 lines above and the 3 lines below the differences, the hunk covers 9 lines of the old file. So -20,9 means 9 lines of the old file, starting from line 20.

Following that is the plus sign, which refers to the new file. The first number, 20, is the line of the new file where the hunk starts, and the hunk, including the 3 lines of context above and below, covers 21 lines of the new file. Note that I have not shown the entire patch, and some of those lines may simply be blank lines. So, the hunk starts with

@@ -20,9 +20,21 @@

Next comes the actual patch itself, the 3 lines of context and the change.

Note that in the patch, there is a space before the 3 lines of context, and then the lines below have a plus sign. A space before a line means that nothing will be changed. A plus sign means the line will be added. If there had been lines to be deleted, they would have had a minus sign in front of them.
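The numbers in a hunk header can be checked on a small example. This is only a sketch: old.txt and new.txt are hypothetical ten-line files, and the only difference between them is line 5.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical ten-line file; the new version changes only line 5.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > old.txt
sed 's/^5$/five/' old.txt > new.txt

# Keep only the hunk header from the unified diff.
# Three lines of context on each side: lines 2-8, so 7 lines in each file.
hunk=$(diff -u old.txt new.txt | grep '^@@')
printf '%s\n' "$hunk"     # @@ -2,7 +2,7 @@

# With one line of context (-U 1) the hunk shrinks to lines 4-6.
hunk1=$(diff -U 1 old.txt new.txt | grep '^@@')
printf '%s\n' "$hunk1"    # @@ -4,3 +4,3 @@
```

Both counts are 7 in the first case because one line was replaced by one line: 3 lines of context, the changed line, and 3 more lines of context.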

Let's create two files to make this a little clearer. Using your favorite text editor, create patchtest.txt and patchtest1.txt. The patchtest.txt will read

This is a file.
These first three lines are 
lines of context. They 
will remain unchanged. They will have spaces in front of them.
Here are the lines that will be changed.  They will begin with
minus signs, because they are being deleted.
Now, we will add three 
more lines that are only
context.  They will have spaces in the patch

Now, patchtest1.txt

This is a file.
These first three lines are 
lines of context. They 
will remain unchanged. They will have spaces in front of them.
These lines have been changed.  They will have plus
signs in front of them.
Now, we will add three 
more lines that are only
context.  They will have spaces in the patch
This is yet another line that is different.

Create the patch.

diff -uN patchtest.txt patchtest1.txt > patch.txt

View the patch

less patch.txt

You will see

--- patchtest	Sun Feb 26 19:35:43 2006
+++ patchtest1	Sun Feb 26 19:35:14 2006
@@ -2,8 +2,9 @@
 These first three lines are 
 lines of context. They 
 will remain unchanged. They will have spaces in front of them.
-Here are the lines that will be changed.  They will begin with
-minus signs, because they are being deleted.
+These lines have been changed.  They will have plus
+signs in front of them.
 Now, we will add three 
 more lines that are only
 context.  They will have spaces in the patch
+This is yet another line that is different.

You can see the first line, This is a file, wasn't included in the patch--that's because it was outside of the three lines of context.

Now that we've made our patch, we can apply it.

patch patchtest.txt < patch.txt

You will see

Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- patchtest	Sun Feb 26 19:35:43 2006
|+++ patchtest1	Sun Feb 26 19:35:14 2006
--------------------------
Patching file patchtest using Plan A...
Hunk #1 succeeded at 2.
done

Now, patchtest.txt has been patched. If you now do a diff between patchtest and patchtest1, you'll just be put back at your command prompt, showing that there are no differences.
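The whole round trip can also be scripted. The sketch below reuses the file names from this article but with shortened, made-up contents, and it uses cmp -s to check that the two files end up identical.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Shortened stand-ins for the two test files (contents are hypothetical).
printf '%s\n' 'This is a file.' 'context one' 'context two' 'context three' \
    'old line' 'context four' > patchtest.txt
sed 's/^old line$/new line/' patchtest.txt > patchtest1.txt

# Create the patch, then apply it to the old file.
diff -u patchtest.txt patchtest1.txt > patch.txt
patch patchtest.txt < patch.txt

# cmp -s is silent and exits with 0 only when the files are byte-for-byte equal.
if cmp -s patchtest.txt patchtest1.txt; then
    result="identical"
else
    result="different"
fi
echo "$result"
```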

This is the simplest form of creating diffs and using patches. Sometimes, you patch an entire directory--those who compile their own kernels may have done this. Rather than downloading an entire tarball of the new kernel, you can often download patches, especially for minor revisions. The README in /usr/src/linux has instructions for using these patches. When you are applying a patch to an entire directory tree, you may need to use the -p1 option. The number after -p tells patch how many leading directory components to strip from the file names recorded in the patch, which determines the path to the file or files being patched. See the patch(1) man page for details and examples. For instance, if you had a patch for the entire Linux kernel source tree, and were in /usr/src you might do

patch -p1 < mylinuxsource.patch

As this varies, depending not only upon your location when applying the patch but also upon the paths recorded in the patch, it's best to see the man page. Just keep in mind that if applying a patch that covers several files in a directory doesn't work, the -p number may be what is causing the difficulty.
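Here is a small sketch of how the -p number interacts with the paths in a patch. The directories a and b and the file src/greet.txt are made up for the example; a stands for the original tree and b for the modified copy.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# a/ is the hypothetical original tree, b/ the modified copy.
mkdir -p a/src b/src
printf 'hello\n' > a/src/greet.txt
printf 'hello, world\n' > b/src/greet.txt

# -r recurses into directories. The patch headers will name
# a/src/greet.txt and b/src/greet.txt.
diff -ruN a b > tree.patch

# From inside a/, the file is simply src/greet.txt, so one leading
# component ("a/" or "b/") has to be stripped. That is what -p1 does.
(cd a && patch -p1 < ../tree.patch)

patched=$(cat a/src/greet.txt)
echo "$patched"
```

Had the patch been applied from the directory above a, the recorded paths would already have been correct and -p0 would have been the number to use.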

Although this becomes more complex when making patches consisting of many hunks, or patching many files in a directory (such as the Linux kernel source tree), the basic concept is the same. It is hoped that this article gives the reader a better understanding of diff and patch, and will help them to read and understand patches. This can be very handy--sometimes, a patch has something you don't want, so it's always good to look at it before applying it.

Patches can also be reversed with the -R flag. Suppose you try an experimental patch and it breaks something. You can then patch the file again with the -R flag.

Take our patchtest and patchtest1. Let's run patch again with the -R flag.

patch -R patchtest.txt < patch.txt

Again, you'll see the Hmm...  Looks like a unified diff to me... message and a message that it succeeded; patchtest.txt is now back to its original contents. If you forget the -R flag when one is needed, patch often catches it. To see this, patch the file one more time with patch patchtest.txt < patch.txt, which should succeed, so that patchtest and patchtest1 are once again identical. Now run the same command again, still without the -R flag. You'll see a message

Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- patchtest	Sun Feb 26 19:35:43 2006
|+++ patchtest1	Sun Feb 26 19:35:14 2006
--------------------------
Patching file patchtest using Plan A...
Reversed (or previously applied) patch detected!  Assume -R? [y] 

If you type y then you should once again see that it succeeded.
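Applying and then reversing a patch can be checked non-interactively as well. This is a sketch with made-up names: file.txt is the file being patched, file.keep is an untouched copy of it, and file.new is the changed version.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf '%s\n' one two three > file.txt
cp file.txt file.keep                  # untouched copy for comparison
printf '%s\n' one TWO three > file.new

diff -u file.txt file.new > change.patch

# Apply the patch: file.txt should now match file.new.
patch file.txt < change.patch
if cmp -s file.txt file.new; then applied=yes; else applied=no; fi

# Reverse it with -R: file.txt should match the original copy again.
patch -R file.txt < change.patch
if cmp -s file.txt file.keep; then reverted=yes; else reverted=no; fi

echo "applied=$applied reverted=$reverted"
```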

If anyone is interested, my patch for mrxvt was accepted, and the port is now available with the option to enable Japanese.


 
Discussion(s)
Good intro
Written by Harry Sutton on 2006-03-12 09:41:36
When you get to the point of creating two files for demonstration purposes, the first time you refer to them, you call them "patchtest.txt" and "patchtestnew.txt". But thereafter, you call them "patchtest.txt" and "patchtest1.txt".

Also, after the second file is created, the command that should be run isn't

patch -uN patchtest.txt ........

it should be

diff -uN patchtest.txt ........

Finally, the output you describe (the lines that begin with "Hmm... Looks like a unified diff to me...") is specific to BSD. Those lines don't appear on my Red Hat Enterprise Linux 4 Workstation system.

This was a good introductory article - I learned something new, and for me that's always been a measure of success. Thanks!

Thanks for the corrections and another o
Written by Scott Robbins on 2006-03-21 08:04:11
As noted there were a few typos--thanks for pointing them out. Also, please note that the original patchtest1.txt should have had one more line at the end, the one that says

This is yet another line that is different

Thanks, and glad that you learned something from the article.

Windows user and happy ;-)
Written by Charles on 2006-04-03 21:48:55
No teasing intended. ;-) Only to thank you VERY MUCH, Scott!

The article is at an introductory level, and still, that was what I needed. Obviously, I could understand the --help, but you know, that doesn't always explain too much.

I had used diff and patch in the past, but mostly by example. That was a nice explanation, and as I use Windows console most of the time, it was nice to have an intro to these two.

Thanks man!

One more addition
Written by Scott Robbins on 2006-03-28 22:32:12
I just realized that nowhere in this article did I give credit to my friend and mentor, Gentoo developer (along with various other impressive credits) Josh Glover. Josh had done a posting on the tlug (Tokyo Linux User Group) about diff and patch that inspired this article, one of the best explanations I've seen. So, thanks Josh. :)

merging
Written by tanish on 2006-06-15 06:45:35
i am trying to merge 2 files using patch/diff :

file1.txt
---------

hello my name is ABC

file2.txt
---------

hello my name is DEF

and i want the output :
----------------------

hello my name is ABCDEF

is this possible ??

RE: merge 2 files
Written by Alan Wardroper on 2007-02-14 16:11:10
Quote:

i am trying to merge 2 files using patch/diff :
file1.txt
--------
hello my name is ABC
file2.txt
---------
hello my name is DEF
and i want the output :
----------------------
hello my name is ABCDEF
is this possible ??

Not sure about diff, but you can do it very simply with cut and paste on the command line:

cut -d ' ' -f 5 file2.txt | paste -d '' file1.txt -

(Note: the -d option to cut here is a space, while that for paste is empty, to get your 'ABCDEF'.)

Ignoring lines when patching
Written by Andy Nelson on 2007-05-06 19:50:52
Hi, thank you for your useful article.
I was just making a script to help automate some installation, and there are these config files from the previous installation which USUALLY don't change from build to build. So I've made the script not copy over the new config files unless there is a difference.

However, these files have some fields whose values are just path directories, and the config files that come with a new build have default values in those fields. So, to know if there's a real difference between the old and the new, I've used the -I flag of diff, which works fine: if any lines other than the ones I've listed after each -I are different, then the script copies over the new config file.

The problem is that the new config file will now need to have those default values set to the same values as the previous file. Eg:

old_config:
blah = /home/blah/3.4/
blah2 = /etc/init.d/blah2
do blah blah blah

new_config:
blah = <default>
blah2 = <default>
do zap zap zap

So I want to basically make the new file have those blah and blah2 fields set to the values of the old config file...but I don't want the rest of the config file I don't think.

Can this be done with diff?