Currently, I have a PDF that is not searchable, and I am wondering what the best process is for preparing the file so that ColdFusion can index it.
In particular, I am wondering whether a PDF file needs to be readable (that is, already contain searchable text) before using the extracttext action of cfpdf to pull the text from it.
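For reference, this is roughly the call I have in mind. It is only a sketch: the file path and variable names are placeholders, and I am assuming a ColdFusion version where cfpdf supports the extracttext action. My guess is that an image-only PDF would come back with little or no text, but I have not confirmed that.

```cfml
<!--- Placeholder path; point this at the real PDF --->
<cfset pdfPath = expandPath("./scanned.pdf")>

<!--- Extract whatever text the PDF already contains, as a plain string --->
<cfpdf action="extracttext" source="#pdfPath#" type="string" name="pdfText">

<!--- Assumption: an empty result suggests there is no text to index yet --->
<cfif len(trim(pdfText)) EQ 0>
    <cfoutput>No text was extracted; the file may need OCR first.</cfoutput>
<cfelse>
    <cfoutput>#len(pdfText)# characters extracted.</cfoutput>
</cfif>
```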
I would really appreciate any advice, and I hope it helps other people who are interested in indexing PDF files with ColdFusion.
I was considering extracting the text with Tesseract, as suggested here:
Performing Optical Character Recognition on PDF's from ColdFusion using a Java or .NET Library?
However, if there is a built-in feature in ColdFusion, I would much rather use that, and I think it would be helpful for other people to know whether ColdFusion can handle this task natively.