Remove ligatures from a PDF with gs
Is it possible to replace all ligatures, such as fi and fl, with the actual letters in a PDF file? When I copy and paste text from some PDF files, the ligatures come out as 001E and 001F, and it is impossible to know which letters they should be.
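As a stopgap, the pasted text could be cleaned up after copying. The following is only a minimal sketch: the mapping of 001E to fi and 001F to fl is an unverified assumption (it depends on the font's encoding and must be checked against the PDF), and the names `LIGATURE_MAP` and `expand_ligatures` are hypothetical.

```python
# Hypothetical mapping from pasted code points to plain letters.
# ASSUMPTION: 001E stands for "fi" and 001F for "fl" in this font;
# verify this against the PDF (e.g. by comparing with pdftotext output).
LIGATURE_MAP = {
    "\u001e": "fi",  # assumed, not verified
    "\u001f": "fl",  # assumed, not verified
    "\ufb01": "fi",  # U+FB01 LATIN SMALL LIGATURE FI
    "\ufb02": "fl",  # U+FB02 LATIN SMALL LIGATURE FL
}


def expand_ligatures(text: str) -> str:
    """Replace each mapped code point with its plain-letter expansion."""
    return text.translate(str.maketrans(LIGATURE_MAP))


if __name__ == "__main__":
    pasted = "the \u001enal \u001foor"
    print(expand_ligatures(pasted))
```

This does not fix the PDF itself; it only repairs text that has already been copied out of it.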
So I am thinking that if I could convert the ligatures before I copy and paste, my problem would be solved. pdftotext can extract the real letters, but I need the PDF structure intact. If I use evince, the letters are correct, but evince cannot copy the text in the right order. I do the copying on Win XP, but all the PDF pages are on Linux servers, so it would be easy to convert the pages before they are shared over Samba (if I knew how).
The font used in the PDF is OpenType.
If it is possible, I would guess that Ghostscript is the tool to do it, but I can't find out how.