PDF documents are beautiful things, but that beauty is often only skin deep. Inside, they can contain any number of structures that are difficult to understand and exasperating to get at. In the end, a beautiful PDF document is meant to be read, not to have its internals messed with. Below is a Python script you can use to get the contents of a PDF.
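As a minimal sketch of what such a script can look like, here is one way to pull the text out of every page. It assumes the third-party pypdf library is installed (`pip install pypdf`); that choice of library is an assumption, and the function names `join_pages` and `extract_pdf_text` are made up for this illustration.

```python
def join_pages(page_texts):
    """Join per-page text, skipping pages that yielded nothing."""
    cleaned = [t.strip() for t in page_texts if t and t.strip()]
    return "\n\n".join(cleaned)


def extract_pdf_text(pdf_path):
    """Return the text of every page in the PDF at pdf_path.

    Assumes pypdf is installed; it is imported here so that join_pages
    can still be used without it.
    """
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(str(pdf_path))
    # extract_text() can return an empty string for scanned or
    # image-only pages, so those pages are dropped by join_pages.
    return join_pages(page.extract_text() for page in reader.pages)


# Example call (the file name is hypothetical):
# print(extract_pdf_text("report.pdf"))
```

Text extraction like this only recovers what the PDF stores as text. Scanned pages usually come back empty, and the reading order of columns or tables may not match what you see on screen.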
I hope this tutorial helps you. If you have any questions or run into problems, please let me know.
Happy Hadooping with Patrick.