Software to "convert" (interpret/segment) complex PDF files into structured text files (JSON)

*Update: attached the file "technicaldescription_and_pdfsamples", which contains PDF samples and a more detailed specification.

The goal is to create software or a script that interprets/converts a complex PDF file into a text file (JSON). The script/software must be written in English, and I must have complete access to the source files. ANY programming language can be used (Python, Java, C/C++, etc.).

There are several different types of PDF files, and the software should be able to break the important information in each PDF file into a structured JSON file. So, the problem is to segment the information in the PDF files.
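A minimal sketch of what such a structured output might look like. The class and field names (`Exam`, `Question`, `number`, `statement`, `options`, `images`) are assumptions made for illustration only; the actual structure is defined in the attached specification.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Question:
    # Hypothetical fields: the real schema comes from the detailed specification.
    number: int
    statement: str
    options: dict[str, str] = field(default_factory=dict)  # e.g. {"A": "...", "B": "..."}
    images: list[str] = field(default_factory=list)  # file names of extracted figures


@dataclass
class Exam:
    source_file: str
    subject: str
    questions: list[Question] = field(default_factory=list)


def exam_to_json(exam: Exam) -> str:
    # ensure_ascii=False keeps Portuguese characters (ã, ç, é) readable in the output.
    return json.dumps(asdict(exam), ensure_ascii=False, indent=2)
```

Each PDF would then map to one `Exam` object, and `exam_to_json` would produce the JSON file for it.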

The PDF files are TESTS (math, science, informatics, etc.) in Portuguese. The idea is to separate the important information so we can build a database with it. The scope of this project is only to extract the information from the PDF files and present it as JSON output.
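One possible way to separate the questions once the text has been extracted from the PDF, shown only as a rough sketch. It assumes that each question starts on its own line with "Questão <number>" and that each answer option starts on its own line as "A)" or "(A)". Both are assumptions about the layout, and real files will likely need more patterns. The function name `split_questions` is hypothetical.

```python
import re

# Assumption: each question starts on its own line with "Questão <number>".
QUESTION_RE = re.compile(
    r"^\s*quest[ãa]o\s+(\d+)\b[.:)\-]?\s*", re.IGNORECASE | re.MULTILINE
)
# Assumption: each option starts on its own line as "A)" or "(A)", letters A to E.
OPTION_RE = re.compile(r"^\s*\(?([A-E])\)\s*(.+)$", re.MULTILINE)


def split_questions(text: str) -> list[dict]:
    """Split plain exam text into a list of question dictionaries."""
    matches = list(QUESTION_RE.finditer(text))
    questions = []
    for i, match in enumerate(matches):
        # A question body runs until the next question header (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        options = {letter: value.strip() for letter, value in OPTION_RE.findall(body)}
        # Everything before the first option is treated as the statement.
        first_option = OPTION_RE.search(body)
        statement = body[:first_option.start()].strip() if first_option else body
        questions.append(
            {"number": int(match.group(1)), "statement": statement, "options": options}
        )
    return questions
```

Because the files are very different from each other, a fixed set of patterns like this would probably be only a starting point, with one pattern set per type of PDF.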

The complexity is that the files are (very) different, and many have figures and images that must be stored as well. I have an initial idea for an approach to the problem, but I'm open to discussing the problems and possible solutions as well. A detailed specification can be sent if you are interested.
I'm available on Skype to explain everything.
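Storing the figures and images alongside the JSON could be sketched as follows. This assumes the raw image bytes have already been pulled out of the PDF by some PDF library (not shown here). The function name `store_image` and the hash-based file naming are hypothetical choices, not part of the specification.

```python
import hashlib
from pathlib import Path


def store_image(image_bytes: bytes, extension: str, output_dir: Path) -> str:
    """Write one extracted image to disk and return the file name to put in the JSON."""
    # Hypothetical naming scheme: a content hash, so an identical figure is stored only once.
    digest = hashlib.sha256(image_bytes).hexdigest()[:16]
    file_name = f"{digest}.{extension.lstrip('.')}"
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / file_name
    if not target.exists():
        target.write_bytes(image_bytes)
    return file_name
```

The returned file name could then be listed in the JSON entry of the question the figure belongs to, so the database can link each question to its images.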

The only skills needed are good programming and problem solving.

I'm very keen to help and discuss alternatives.

The job is not easy, and price/time can be negotiated. I believe that good performance must be well rewarded ($$).


Category: Programming and Technology
Subcategory: Desktop applications
Is this a project or a position?: I don’t know yet
I currently have: I have specifications
Required availability: As needed
Experience in this type of project: Yes (I have managed this kind of project before)
Required platforms: Windows







Published: 2 years ago

Deadline: Not defined

