Software to "convert" (interpret/segment) complex PDF files into structured text files (JSON)


Published: One year ago Deadline: Not defined Proposals: 5 Interested freelancers: 11



*Update: ATTACHED FILE "TechnicalDescription_and_pdfsamples" with PDF samples and a more detailed specification

The goal is to create a software tool or script that interprets a complex PDF file and converts it into a text file (JSON). The script/software must be written in English, and I must have complete access to the source files. ANY programming language can be used (Python, Java, C/C++, ...).

There are several different types of PDF files, and the software should be able to break down the important information in each PDF file into a structured JSON file. In other words, the problem is to segment the information in the PDF files.
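
As a rough illustration of what this segmentation could look like, the sketch below assumes the text has already been extracted from the PDF into a plain string (the extraction library is not shown) and that each question starts with a marker such as "Questão 1" or "1.". The marker patterns, the function name `segment_questions`, and the JSON field names are all hypothetical; the real layouts are described in the attached specification.

```python
import json
import re

# Hypothetical markers (assumptions, not taken from the specification):
# a question starts with "Questão N" or "N." / "N)"; an option with "(A)" or "a)".
QUESTION_START = re.compile(
    r"^(?:Quest[ãa]o\s+(\d+)[.):]?|(\d+)[.)])\s*(.*)$", re.IGNORECASE
)
OPTION = re.compile(r"^\(?([A-Ea-e])\)\s*(.*)$")


def segment_questions(text):
    """Split already-extracted test text into a list of question dicts."""
    questions = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        start = QUESTION_START.match(line)
        if start:
            number = int(start.group(1) or start.group(2))
            current = {"number": number, "statement": start.group(3), "options": []}
            questions.append(current)
            continue
        if current is None:
            continue  # text before the first question (e.g. a title) is skipped
        option = OPTION.match(line)
        if option:
            current["options"].append(
                {"label": option.group(1).upper(), "text": option.group(2)}
            )
        elif current["options"]:
            # A wrapped line that continues the last option.
            current["options"][-1]["text"] += " " + line
        else:
            # A wrapped line that continues the question statement.
            current["statement"] = (current["statement"] + " " + line).strip()
    return questions


SAMPLE_TEXT = """PROVA DE MATEMÁTICA
1. Quanto é 2 + 2?
(A) 3
(B) 4
Questão 2
Qual é a capital
do Brasil?
a) Rio de Janeiro
b) Brasília
"""

if __name__ == "__main__":
    print(json.dumps(segment_questions(SAMPLE_TEXT), ensure_ascii=False, indent=2))
```

A line-based pattern like this only handles one layout; since the files differ a lot, a real solution would likely need one set of rules per PDF type, or a more robust approach.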

The PDF files are TESTS (math, science, informatics, etc.) in Portuguese. The idea is to separate out the important information so we can build a database from it. The scope of this project is only to EXTRACT the information from the PDF files and present it as JSON output.

The complexity is that the files are (very) different from one another, and many have figures and images that must be stored as well. I have an initial idea for how to approach the problem, but I'm open to discussing the problems and possible solutions as well. A detailed specification can be sent if you are interested. I'm available on Skype to explain everything.
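
One possible way to store figures alongside the JSON, shown only as a sketch: the image bytes are assumed to have been pulled out of the PDF already (the extraction step is not shown), each image is written to disk under a name derived from its content hash, and the JSON keeps a small reference to that file. The function name `store_figure` and the reference fields are hypothetical, and hash-based naming is just one option, not a requirement of the project.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def store_figure(image_bytes, extension, output_dir):
    """Write image bytes to output_dir and return a JSON-serializable reference."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Naming by content hash means the same figure is only stored once.
    path = out / f"{digest[:16]}.{extension}"
    if not path.exists():
        path.write_bytes(image_bytes)
    return {"file": path.name, "sha256": digest, "size_bytes": len(image_bytes)}


if __name__ == "__main__":
    # Placeholder bytes standing in for an image extracted from a PDF.
    fake_png = b"\x89PNG\r\n\x1a\n" + b"placeholder image data"
    with tempfile.TemporaryDirectory() as tmp:
        ref = store_figure(fake_png, "png", tmp)
        question = {"number": 1, "statement": "Observe a figura.", "figures": [ref]}
        print(json.dumps(question, ensure_ascii=False, indent=2))
```

Keeping the images as separate files and referencing them by name keeps the JSON small; embedding them (for example as base64) would be an alternative if a single output file is preferred.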

The only skills needed are good programming and problem solving.

I'm very keen to help and discuss alternatives.

The job is not easy, and price and time can be negotiated. I believe that good performance must be well rewarded ($$).


Category: IT & Programming
Subcategory: Desktop Applications
Is this a project or a position?: I don't know yet
I currently have: I have specifications
Experience in this type of project: Yes (I have managed this kind of project before)
Required availability: As needed
Required platforms: Windows
