Finished

REST API for reading how much color is in a PDF / MS Word file

Published on April 4, 2020 in Programming and Technology

About this project

Open

We need a REST API that reads PDF and MS Word (97-365) files and reports how much color each page has. For example:

Example 1: if page 1 of the document has 2,000 letters and 100 of them are a color other than black, then that page has 5% color (100 colored letters / 2,000 letters on the page).

Example 2: if the document has an 8.5 x 11 inch page that combines letters and an image, with 2,000 letters of which 100 are a color other than black, plus an image that is not grayscale and covers 50% of the page, then that page has 55% color (100 colored letters / 2,000 letters + 50% of the page covered by an image that has some color other than grayscale).
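The arithmetic in the two examples can be sketched as follows. This is only an illustration of the formula, not a required implementation: the function name `page_color_percent` and its parameters are hypothetical, and capping the result at 100 is an assumption, since the letter share and the image share are simply added.

```python
def page_color_percent(colored_letters, total_letters, color_image_fraction=0.0):
    """Return the color percentage of one page.

    colored_letters / total_letters: letter counts for the page.
    color_image_fraction: share of the page area (0.0 to 1.0) covered by
    images that are not grayscale.
    """
    # A page with no letters contributes nothing from text.
    letter_percent = 100 * colored_letters / total_letters if total_letters else 0.0
    image_percent = 100 * color_image_fraction
    # Assumption: cap at 100, because the two shares are simply added.
    return min(100.0, letter_percent + image_percent)


print(page_color_percent(100, 2000))       # Example 1 -> 5.0
print(page_color_percent(100, 2000, 0.5))  # Example 2 -> 55.0
```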

Note on the image scan: for this specific case it is not important to know how much color the image actually contains. It is only important to know that the image is not grayscale and what percentage of the whole page it occupies.
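One possible way to decide whether an image is grayscale is to check that every pixel has equal red, green and blue values. The sketch below assumes the pixels have already been decoded into `(r, g, b)` tuples by whatever imaging library is chosen; the function name `is_grayscale` and the `tolerance` parameter are hypothetical.

```python
def is_grayscale(pixels, tolerance=0):
    """Return True if every (r, g, b) pixel has (nearly) equal channels.

    tolerance: largest allowed difference between channels, useful for
    scans or compressed images where gray pixels are slightly tinted.
    """
    return all(max(p) - min(p) <= tolerance for p in pixels)


gray_image = [(0, 0, 0), (128, 128, 128), (255, 255, 255)]
color_image = [(10, 10, 10), (200, 30, 30)]

print(is_grayscale(gray_image))   # True: counts as 0% color
print(is_grayscale(color_image))  # False: counts by the area it covers
```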

You may use a machine learning (M/L) implementation (not required), and any programming language and operating system you prefer. A serverless REST API is preferred (not required).

The REST API input is a PDF / MS Word (97-365) file located in a path on the same server or in an S3 bucket, OneDrive, Google Drive, Dropbox, or any other storage (recommendations accepted).

The REST API output is a JSON document with the percentage of color for each page.

JSON result format:

{
  "Document": {
    "Name": "Test.PDF",
    "Pages": 6
  },
  "JobStatus": "SUCCEEDED",
  "Pages": [
    {"Page1": 20, "Page2": 0, "Page3": 50}
  ]
}
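As a rough illustration, a response in this shape could be assembled like this. The helper name `build_result` is hypothetical, and the sketch assumes that `Document.Pages` is the number of pages analysed and that all per-page values go into a single object inside the `Pages` list, as in the sample.

```python
import json


def build_result(name, page_percents, job_status="SUCCEEDED"):
    """Build the response body from a list of per-page color percentages."""
    # Keys follow the sample format: Page1, Page2, ...
    pages = {f"Page{i}": pct for i, pct in enumerate(page_percents, start=1)}
    return {
        "Document": {"Name": name, "Pages": len(page_percents)},
        "JobStatus": job_status,
        "Pages": [pages],
    }


print(json.dumps(build_result("Test.PDF", [20, 0, 50]), indent=2))
```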

Important note:
The solution must accept pages of any size. For the letter count on each page I don't think this matters much. For images it does: to know the percentage of color on the page, you have to compare the image size against the page size.
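The image-versus-page comparison could look roughly like the sketch below, which works for any page size as long as the page and the image boxes use the same units (inches, points, pixels). The function name `color_image_fraction` and the `(x0, y0, x1, y1)` box format are hypothetical, and overlapping images are simply summed and capped at the full page, which is an assumption.

```python
def color_image_fraction(page_width, page_height, image_boxes):
    """Return the share of the page (0.0 to 1.0) covered by the given images.

    image_boxes: (x0, y0, x1, y1) rectangles of the non-grayscale images,
    in the same units as the page size, with the origin at a page corner.
    """
    page_area = page_width * page_height
    if page_area <= 0:
        raise ValueError("page size must be positive")
    covered = 0.0
    for x0, y0, x1, y1 in image_boxes:
        # Clip each image to the page so parts outside the page do not count.
        width = max(0.0, min(x1, page_width) - max(x0, 0.0))
        height = max(0.0, min(y1, page_height) - max(y0, 0.0))
        covered += width * height
    # Assumption: overlaps are not merged, so cap the total at the full page.
    return min(1.0, covered / page_area)


# An image covering half of an 8.5 x 11 inch page.
print(color_image_fraction(8.5, 11, [(0, 0, 8.5, 5.5)]))  # 0.5
```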


Source code is required. Documentation on how to install and use the REST API is also required.


Category: Programming and Technology
Subcategory: Other
Project size: Small
Is this a project or a position? A project
I currently have: The specifications
Required availability: As needed
API integrations: Cloud Storage (Dropbox, Google Drive, etc.), Other (other APIs)

Delivery deadline: Not defined
