Extract highlighted text from PDF files for NLP.
Each line of the output corresponds to one highlighted paragraph.
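Because each output line holds one paragraph, downstream NLP code can load the file as a list of paragraphs. The following is a minimal sketch only, assuming the output is UTF-8 plain text and that blank lines carry no content. The class and method names (`HighlightParagraphs`, `readParagraphs`) are hypothetical and not part of the tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFHighLightExtractor.
class HighlightParagraphs {

    // Reads the extractor's output file and returns one string per paragraph.
    // Assumption: UTF-8 text, one paragraph per line; blank lines are skipped.
    static List<String> readParagraphs(Path outputFile) throws IOException {
        List<String> paragraphs = new ArrayList<>();
        for (String line : Files.readAllLines(outputFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to output.txt, the tool's default output name.
        Path file = Paths.get(args.length > 0 ? args[0] : "output.txt");
        List<String> paragraphs = readParagraphs(file);
        for (int i = 0; i < paragraphs.size(); i++) {
            System.out.println((i + 1) + ": " + paragraphs.get(i));
        }
    }
}
```

Each returned string can then be passed to a tokenizer or sentence splitter as one unit.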
java -jar PDFHighLightExtractor.jar -i <inFile | directory> [-o output.txt]
- -i is required (a PDF file or a directory)
- -o is optional; the default is output.txt
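The option rules above can be sketched in Java as follows. This is an illustration only, not the tool's actual source: the class name `CliOptionsSketch`, the `parse` method, and the choice to throw `IllegalArgumentException` on a missing `-i` or an unknown flag are all assumptions, since the README does not say how the tool reports bad arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the option rules; not the tool's real code.
class CliOptionsSketch {

    static final String DEFAULT_OUTPUT = "output.txt";

    // Returns a map with the keys "input" and "output".
    // Assumption: bad arguments are reported with IllegalArgumentException.
    static Map<String, String> parse(String[] args) {
        String input = null;
        String output = DEFAULT_OUTPUT;
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if (!flag.equals("-i") && !flag.equals("-o")) {
                throw new IllegalArgumentException("Unknown argument: " + flag);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[++i];
            if (flag.equals("-i")) {
                input = value;
            } else {
                output = value;
            }
        }
        if (input == null) {
            throw new IllegalArgumentException("-i is required");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("input", input);
        options.put("output", output);
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        System.out.println("input  = " + options.get("input"));
        System.out.println("output = " + options.get("output"));
    }
}
```

With only `-i paper.pdf` given, the sketch resolves the output to `output.txt`; adding `-o notes.txt` overrides it.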
Uses some code from https://github.com/juanerasmoe/PDFHighlightExtractor