Optical Character Recognition (OCR)

The file storage integrates OCR (Optical Character Recognition) provided by the partner DBrain. Currently, only two operations are available: recognition of the main spread of the Russian Federation internal passport, and selfie comparison.

OCR is a technology that automatically converts text in images, document scans, photographs, PDF files, and other graphic formats into machine-readable text.

The extracted text can then be handled like regular text: copied, edited, indexed, and searched.

OCR Applications:

  • Automation of data entry from paper documents.
  • Search and analysis of information in scanned archives.
  • Text recognition on photographs and scans for subsequent use in electronic systems.

Settings

  • passport_ocr_tags - Tags for which passport recognition is launched
  • selfie_check_tags - Tags for which selfie verification is launched
  • confidence_check_mode - How the per-field confidence values are aggregated
    • avg - the average confidence value across all fields is used
    • min - the minimum confidence value across all fields is used (Default)
  • confidence_level - Minimum confidence threshold applied after recognition; used when forming the OCR response (Default = 0.98)
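
The way confidence_check_mode and confidence_level work together can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the function name `passes_confidence_check`, the field names in the sample data, the use of `>=` for the comparison, and the handling of an empty field set are all assumptions.

```python
from statistics import mean


def passes_confidence_check(field_confidences, mode="min", confidence_level=0.98):
    """Return True if the aggregated confidence reaches the threshold.

    field_confidences: hypothetical mapping of field name -> confidence in [0, 1].
    mode: "min" (default) or "avg", mirroring confidence_check_mode.
    """
    values = list(field_confidences.values())
    if not values:
        return False  # assumption: no recognized fields means the check fails
    if mode == "avg":
        aggregated = mean(values)
    elif mode == "min":
        aggregated = min(values)
    else:
        raise ValueError(f"unknown confidence_check_mode: {mode!r}")
    return aggregated >= confidence_level  # assumption: threshold is inclusive


# Hypothetical per-field confidences for a recognized passport spread.
fields = {"surname": 0.99, "number": 0.995, "birth_date": 0.97}
print(passes_confidence_check(fields, mode="min"))
print(passes_confidence_check(fields, mode="avg"))
```

With these sample values, one weak field (0.97) fails the check in min mode, while avg mode passes because the average stays above 0.98. This makes min the stricter of the two modes.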

OCR API Methods

| Method | URL | Description |
|---|---|---|
| POST | /api/files/{fileId}/ocr | Launch OCR processing for a file |
| GET, HEAD | /api/files/{fileId}/ocr/{ocrId} | Get OCR result by ID |

Launch OCR Processing for a File

POST /api/files/{fileId}/ocr

Description

This method starts Optical Character Recognition (OCR) for the specified file. The response contains the OCR result identifier, the recognized document type, and the process state.

When OCR succeeds, the corresponding document shows the "Verification passed" status in the user interface (UI).

Request Parameters

| Parameter | Type | Description |
|---|---|---|
| fileId | integer | Unique identifier of the file for which OCR is launched. |

Request Example

POST /api/files/238/ocr

Successful Response Example

json
{
  "status": "ok", // Operation execution status
  "timestamp": 1750765283000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "id": 1, // OCR result (or process) identifier
    "type": "passport", // Recognized document type
    "state": "OK" // OCR process state
  }
}
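
A client call to this method might look like the sketch below. It only builds the request and parses a response body, without sending anything. The host `https://storage.example.com` and the helper names are hypothetical, and authentication is left out because this page does not describe it.

```python
import json
from urllib.request import Request

BASE_URL = "https://storage.example.com"  # hypothetical host, not from the documentation


def build_launch_ocr_request(file_id, base_url=BASE_URL):
    """Build (but do not send) the POST request that launches OCR for a file."""
    return Request(f"{base_url}/api/files/{int(file_id)}/ocr", method="POST")


def extract_ocr_id(response_body):
    """Return data.id from a successful response body given as a JSON string."""
    envelope = json.loads(response_body)
    if envelope.get("status") != "ok":
        raise ValueError("OCR launch did not succeed")
    return envelope["data"]["id"]


request = build_launch_ocr_request(238)

# Response body taken from the example above, without the explanatory comments
# (comments are not valid JSON and will not appear in a real response).
sample_body = (
    '{"status": "ok", "timestamp": 1750765283000, '
    '"data": {"id": 1, "type": "passport", "state": "OK"}}'
)
ocr_id = extract_ocr_id(sample_body)
```

To actually send the request, pass `request` to `urllib.request.urlopen` and decode the returned body before handing it to `extract_ocr_id`. The returned `id` is the `ocrId` expected by the result method.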

Error Response Example

json
{
  "status": "error", // Operation execution status
  "timestamp": 1750834012000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "message": "OCR type not found for tags: 8", // Error message
    "code": "OCR_TYPE_NOT_FOUND", // Error code
    "type": "App\\Exceptions\\OCRTypeNotFoundException", // Error type
    "details": [] // Error details
  }
}
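
One way to handle such a response on the client side is to turn the error envelope into an exception. The sketch below assumes only the envelope fields shown above. The class `OcrApiError` and the function `unwrap` are hypothetical names, not part of the API.

```python
import json


class OcrApiError(Exception):
    """Hypothetical client-side exception wrapping an error envelope."""

    def __init__(self, code, message, error_type, details):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.error_type = error_type
        self.details = details


def unwrap(response_body):
    """Return the data object, or raise OcrApiError if status is "error"."""
    envelope = json.loads(response_body)
    data = envelope.get("data") or {}
    if envelope.get("status") == "error":
        raise OcrApiError(
            data.get("code"),
            data.get("message"),
            data.get("type"),
            data.get("details", []),
        )
    return data


# Error body mirroring the example above.
error_body = json.dumps({
    "status": "error",
    "timestamp": 1750834012000,
    "data": {
        "message": "OCR type not found for tags: 8",
        "code": "OCR_TYPE_NOT_FOUND",
        "type": "App\\Exceptions\\OCRTypeNotFoundException",
        "details": [],
    },
})

try:
    unwrap(error_body)
except OcrApiError as exc:
    caught = exc  # callers can branch on exc.code, e.g. "OCR_TYPE_NOT_FOUND"
```

Branching on `code` rather than on `message` keeps the client independent of the wording of the error text.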

Get OCR Result by ID

GET|HEAD /api/files/{fileId}/ocr/{ocrId}

Description

This method returns a specific Optical Character Recognition (OCR) result for the specified file: the result identifier, the recognized document type, and the OCR process state.

Request Parameters

| Parameter | Type | Description |
|---|---|---|
| fileId | integer | Unique file identifier. |
| ocrId | integer | Unique OCR result identifier. |

Request Example

GET /api/files/238/ocr/1

Successful Response Example

json
{
  "status": "ok", // Operation execution status
  "timestamp": 1750772435000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "id": 1, // OCR result (or process) identifier
    "type": "passport", // Recognized document type
    "state": "OK" // OCR process state
  }
}
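
Retrieving and interpreting a result could be sketched as follows. The host and the helper names are hypothetical. Treating `state == "OK"` as the success condition is an assumption based on the examples on this page, which show only the values "OK" and "FAILED".

```python
from urllib.request import Request

BASE_URL = "https://storage.example.com"  # hypothetical host, not from the documentation


def build_get_ocr_request(file_id, ocr_id, base_url=BASE_URL, head_only=False):
    """Build (but do not send) a GET or HEAD request for an OCR result."""
    method = "HEAD" if head_only else "GET"
    url = f"{base_url}/api/files/{int(file_id)}/ocr/{int(ocr_id)}"
    return Request(url, method=method)


def is_recognized(envelope):
    """True when the call succeeded and data.state is "OK" (assumed success value)."""
    data = envelope.get("data") or {}
    return envelope.get("status") == "ok" and data.get("state") == "OK"


get_request = build_get_ocr_request(238, 1)
head_request = build_get_ocr_request(238, 1, head_only=True)

# Parsed form of the successful response shown above.
result = {
    "status": "ok",
    "timestamp": 1750772435000,
    "data": {"id": 1, "type": "passport", "state": "OK"},
}
print(is_recognized(result))
```

A HEAD request returns no body, so it is only useful for checking that the result exists. Use GET to read `type` and `state`.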

Field Descriptions

| Field | Type | Description |
|---|---|---|
| status | string | Operation execution status ("ok" or "error") |
| timestamp | integer | Operation execution time (Unix timestamp, ms) |
| data | object | Operation result |
| data.id | integer | OCR result (or process) identifier |
| data.type | string | On success: recognized document type (e.g., "passport"). On error: error type |
| data.state | string | OCR process state (e.g., "OK", "FAILED") |
| data.message | string | Error message (error responses only) |
| data.code | string | Error code (error responses only) |
| data.details | array | Error details (error responses only) |