I just want to know whether the standard PDF to Text plugin can handle PDFs that contain text in languages other than English.
My use case is downloading PDFs from the Rajya Sabha site. For any PDF where the text is in Hindi, I want to convert that PDF to text and search it for relevant keywords. The keyword part is tried and tested, but I am unsure about the PDF-to-text part.
Can anyone help out?
Thanks in advance!
@AE_Knights - Do we have multi-language support for PDF to text?
Hello @inter,
Yes, Hindi-language (multi-language) PDFs can be converted to text using the PDFUtils: PDF To Text plugin step.
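For the keyword search on the extracted text, one thing that may be worth checking is Unicode normalization, since the same Devanagari character can be encoded in more than one way and a plain substring match can then miss it. Below is a minimal sketch in Python, independent of the plugin itself. It assumes the extracted text is already available as a Unicode string; the function names and the sample sentence are made up for illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    """Return text in NFC form so canonically equivalent sequences compare equal."""
    return unicodedata.normalize("NFC", text)


def find_keywords(text: str, keywords: list[str]) -> list[str]:
    """Return the keywords that occur in the text after NFC normalization."""
    haystack = normalize(text)
    return [kw for kw in keywords if normalize(kw) in haystack]


# Hypothetical sample standing in for text extracted from a Hindi PDF.
sample = "राज्य सभा में विधेयक पारित हुआ"
print(find_keywords(sample, ["विधेयक", "बजट"]))
```

Normalizing both the text and the keywords means a match does not depend on how the PDF producer happened to encode a character.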