I recently found myself needing to replace some text in a PDF for confidentiality reasons. I tried LibreOffice and some other GUI applications, but they had problems with content sizing and canvas size. After searching online, I found a Stack Overflow answer showing how to do it with the command-line tools PDFtk and sed. The answer warns that it doesn’t always work, but it did in my case, changing the text and not messing anything else up (that I could see).
I didn’t have PDFtk already, so I used MacPorts to install it: sudo port install pdftk. MacPorts actually said the port was broken, but the pdftk command was there and seemed to work fine.
I tested the commands from the above answer, but ran into a problem. Another Stack Overflow answer gave me the solution: I had to set LANG=C to tell sed to treat the file as binary.
The commands then worked to do what I wanted. I put them together into a bash script for future use:
#!/bin/bash
if [ "$#" -ne 4 ]; then
    echo "Usage: pdfreplace.sh input.pdf 'string to replace' 'string replacement' output.pdf" >&2
    exit 1
fi
#--needed for encoding to be handled correctly on mac (exported so sed sees it)
export LANG=C
#--uncompress, replace text, recompress, and remove temp files
#--note: $2 and $3 are used as-is in the sed expression, so "/" and regex characters in them need escaping
pdftk "$1" output _tmp12345.pdf uncompress \
&& sed -e "s/$2/$3/g" <_tmp12345.pdf >_tmp22345.pdf \
&& pdftk _tmp22345.pdf output "$4" compress \
&& rm _tmp12345.pdf _tmp22345.pdf
It takes exactly four arguments, shown in the “Usage” line echoed when you run it with any other number of arguments.
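One thing to watch with the second and third arguments: the script pastes them straight into the sed expression s/$2/$3/g, so a "/" in either string, a regex character such as "." or "*" in the search string, or an "&" in the replacement will change what sed does. Here is a rough sketch of how the strings could be escaped first. The function names escape_sed_pattern and escape_sed_replacement are made up for illustration and are not part of the script above, and the sketch assumes the strings contain no newlines.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original script.

# Escape characters that are special in a basic regular expression
# ("[", "\", ".", "*", "^", "$") plus the "/" delimiter.
escape_sed_pattern() {
    printf '%s\n' "$1" | sed -e 's#[[\\/.*^$]#\\&#g'
}

# Escape characters that are special in a sed replacement:
# "\", "&" and the "/" delimiter.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed -e 's#[\\/&]#\\&#g'
}

# Example: both strings contain characters sed would otherwise misread.
search=$(escape_sed_pattern 'a/b.c')
replacement=$(escape_sed_replacement 'x&y/z')
printf '%s\n' 'path a/b.c here' | sed -e "s/$search/$replacement/g"
# prints: path x&y/z here
```

In the script, the same idea would mean escaping $2 and $3 into two variables before the sed line and using those variables in the expression instead.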
I make no promises that it’ll work for your needs.