This is my method of creating a PDF book for uploading to archive.org.
First, I scan the book into PDFs, usually about 10 scans (10 or 20 pages, depending on the size of the page).
I use an HP-Laserjet-200-colorMFP-m276nw. These scripts were all run on Fedora 35. Next, I split the images out of the PDFs:
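mkdir -p out
for D in t??0; do
    cd "$D" || continue
    # Rename the unnumbered first scan so it matches the scan0* pattern.
    test -f scan.pdf && mv scan.pdf scan0000.pdf
    for S in scan0*.pdf; do
        # Pull the number out of the file name, e.g. scan0003.pdf -> 0003.
        I=`echo "$S" | sed 's/scan\([0-9]*\)\.pdf/\1/'`
        echo "$D" "$I"
        pdfimages -png "scan${I}.pdf" "../out/s${D}_${I}"
    done
    cd ..
done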
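To make the resulting file names easier to follow, here is a small sketch of how the output prefix is built. The helper name out_prefix is hypothetical, and it uses shell parameter expansion in place of the sed call. pdfimages then appends an image counter and the extension to this prefix, so the first image of scan0003.pdf in t010 should end up as ../out/st010_0003-000.png.

```shell
# Hypothetical helper: build the prefix passed to pdfimages from a
# scan directory (e.g. t010) and a scan file name (e.g. scan0003.pdf).
out_prefix() {
    dir=$1
    file=$2
    num=${file#scan}   # drop the leading "scan"  -> 0003.pdf
    num=${num%.pdf}    # drop the trailing ".pdf" -> 0003
    printf '../out/s%s_%s\n' "$dir" "$num"
}

out_prefix t010 scan0003.pdf
```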
Next, I have three different ways of processing the book, depending on how it was scanned. For each of these, I usually open a page and work out what numbers to use for the crop. The simplest case is when I scanned two pages at once and am not splitting them:
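cd out
for I in st*.png
do
    # r* is the full-size cleaned page, t* is the reduced version.
    R=`echo "$I" | sed 's/s/r/' | sed 's/-/_/'`
    T=`echo "$I" | sed 's/s/t/' | sed 's/-/_/'`
    # Skip pages whose output is already newer than the source image.
    if test "$I" -ot "$R"
    then
        echo "$I" "$R" "$T" already done
    else
        echo "$I" "$R" "$T"
        convert "$I" -crop 2550x3510+0 -rotate 90 -despeckle "$R"
        convert "$R" -resize 33% -level 10%,90%,0.5 -posterize 32 "$T"
    fi
done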
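I find the crop numbers by opening a page, but a first guess can be computed from the page size. The sketch below assumes a 300 DPI scan, which is an assumption on my part and should be checked against the actual scanner setting. The helper name crop_size is hypothetical, and sizes are given in tenths of an inch so that plain integer shell arithmetic is enough.

```shell
# Hypothetical helper: pixel size for a crop, from a page size in
# tenths of an inch and a scan resolution in DPI (assumed, not measured).
crop_size() {
    w_tenths=$1
    h_tenths=$2
    dpi=$3
    echo "$(( w_tenths * dpi / 10 ))x$(( h_tenths * dpi / 10 ))"
}

crop_size 85 117 300   # 8.5in x 11.7in -> 2550x3510
crop_size 60 90 300    # 6in x 9in      -> 1800x2700
```

The result is only a starting point; the final numbers still come from looking at a real page.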
If I had to scan each page individually, half of them will be flipped, so I need to rotate those differently. Note that for this to work, I have to make sure that I always flip the odd pages and the even pages the same way when scanning.
cd out
# Even-numbered images are already the right way up.
for I in st*[02468].png
do
    R=`echo "$I" | sed 's/s/r/' | sed 's/-/_/'`
    T=`echo "$I" | sed 's/s/t/' | sed 's/-/_/'`
    if test "$I" -ot "$R"
    then
        echo "$I" "$R" "$T" already done
    else
        echo "$I" "$R" "$T"
        convert "$I" -crop 1800x2700+0 -despeckle "$R"
        convert "$R" -resize 33% -level 10%,90%,0.5 -posterize 32 "$T"
    fi
done
# Odd-numbered images were scanned upside down, so rotate them 180 degrees.
for I in st*[13579].png
do
    R=`echo "$I" | sed 's/s/r/' | sed 's/-/_/'`
    T=`echo "$I" | sed 's/s/t/' | sed 's/-/_/'`
    if test "$I" -ot "$R"
    then
        echo "$I" "$R" "$T" already done
    else
        echo "$I" "$R" "$T"
        convert "$I" -crop 1800x2700+0 -rotate 180 -despeckle "$R"
        convert "$R" -resize 33% -level 10%,90%,0.5 -posterize 32 "$T"
    fi
done
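The two loops above differ only in the rotation, so the choice could be sketched as one function keyed on the last digit of the image number. The name rotation_for is hypothetical, and which parity needs the 180 degree turn depends on how the pages were placed on the scanner; here it follows the loops above, where the odd images are rotated.

```shell
# Hypothetical helper: print the rotation in degrees for one extracted
# image, based on the last digit before ".png" (odd -> 180, even -> 0).
rotation_for() {
    name=${1%.png}
    case $name in
        *[13579]) echo 180 ;;
        *[02468]) echo 0 ;;
        *) echo "unexpected file name: $1" >&2; return 1 ;;
    esac
}

rotation_for st010_0000-000.png
rotation_for st010_0000-001.png
```

With this, a single loop over st*.png could pass "$(rotation_for "$I")" to the -rotate option instead of repeating the loop body.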
If I scanned two pages at once, and am planning to split them, I have a different script:
cd out
for I in st*.png
do
    # Each scan produces two pages: an "a" half and a "b" half.
    R=`echo "$I" | sed 's/s/r/' | sed 's/-/_/'`
    RA=`echo "$R" | sed 's/\.png$/a.png/'`
    RB=`echo "$R" | sed 's/\.png$/b.png/'`
    T=`echo "$I" | sed 's/s/t/' | sed 's/-/_/'`
    TA=`echo "$T" | sed 's/\.png$/a.png/'`
    TB=`echo "$T" | sed 's/\.png$/b.png/'`
    # Only the a/b files are written, so compare against one of those.
    if test "$I" -ot "$RA"
    then
        echo "$I" "$R" "$T" already done
    else
        echo "$I" "$R" "$T" "$RA" "$RB" "$TA" "$TB"
        convert "$I" -crop 1790x1350+0 -rotate 90 -despeckle "$RB"
        convert "$I" -crop 1790x1350+0+1350 -rotate 90 -despeckle "$RA"
        convert "$RA" -resize 33% -level 10%,90%,0.5 -posterize 32 "$TA"
        convert "$RB" -resize 33% -level 10%,90%,0.5 -posterize 32 "$TB"
    fi
done
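The two crop geometries in the split script are the same rectangle at two vertical offsets. A minimal sketch of that relationship follows, assuming both halves have the same height and the second half starts right where the first one ends. The helper name half_geometries is hypothetical, and it spells out the zero offsets of the first half in full.

```shell
# Hypothetical helper: print the two -crop geometries for a scan that
# holds two pages stacked vertically (before the 90 degree rotation).
half_geometries() {
    width=$1
    half_height=$2
    echo "${width}x${half_height}+0+0"               # first half  (the "b" page)
    echo "${width}x${half_height}+0+${half_height}"  # second half (the "a" page)
}

half_geometries 1790 1350
```

If the gutter of the book is not exactly in the middle of the scan, the second offset needs adjusting by hand.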
Lastly, I need to create PDFs out of the results: a small one from the reduced t images and a full-size one from the r images.
img2pdf tt*.png --author "Fred Smith" --title "Smithing" -o ../smithing_1925_small.pdf
img2pdf rt*.png --author "Fred Smith" --title "Smithing" -o ../smithing_1925.pdf
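img2pdf adds pages in the order the file names are given, and the shell expands the glob in sorted order, so the file names decide the page order. Here is a small sketch for checking that order before building the PDF. It only creates empty placeholder files in a temporary directory, and both the file names and the helper name page_order are illustrative.

```shell
# Illustrative check: list files in the same order the glob would pass
# them to img2pdf. Uses empty placeholder files, not real scans.
tmp=$(mktemp -d)
touch "$tmp/rt020_0000_000a.png" "$tmp/rt010_0000_001a.png" \
      "$tmp/rt010_0000_000b.png" "$tmp/rt010_0000_000a.png"

page_order() {
    # Print one file name per line, in glob (sorted) order.
    for f in "$1"/rt*.png; do
        basename "$f"
    done
}

page_order "$tmp"
rm -rf "$tmp"
```

In the split case this also shows the a half of each scan landing before the b half.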