Optical Character Recognition (OCR)
The file storage includes Optical Character Recognition (OCR) provided by the partner DBrain. Currently, only two operations are available: recognition of the main spread of the Russian Federation internal passport, and selfie comparison.
Optical Character Recognition (OCR) is a technology that allows automatic conversion of text on images, document scans, or photographs into an editable and searchable text format.
OCR is used to extract text from paper documents, scans, photographs, PDF files, and other graphic formats so that they can be worked with as regular text: copied, edited, indexed, and searched.
OCR Applications:
- Automation of data entry from paper documents.
- Search and analysis of information in scanned archives.
- Text recognition on photographs and scans for subsequent use in electronic systems.
Settings
- passport_ocr_tags: Tags for which passport recognition should be launched
- selfie_check_tags: Tags for which selfie verification should be launched
- confidence_check_mode:
  - If avg, the average confidence value across all fields is calculated
  - If min, the minimum value across all fields is compared (Default)
- confidence_level: Minimum confidence threshold after recognition. Used for forming the OCR response. (Default = 0.98)
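The interaction between confidence_check_mode and confidence_level can be sketched as follows. This is only an illustration of the rule described in the settings, not the service's actual implementation: the function name, the per-field confidence dictionary, the field names, and the use of `>=` for the comparison are all assumptions.

```python
from statistics import mean


def passes_confidence_check(confidences, mode="min", confidence_level=0.98):
    """Aggregate per-field confidences and compare them with the threshold.

    `confidences` is a hypothetical mapping of field name -> confidence (0..1).
    """
    if not confidences:
        # Assumption: with no recognized fields there is nothing to approve.
        return False
    values = list(confidences.values())
    if mode == "avg":
        aggregate = mean(values)   # average across all fields
    elif mode == "min":
        aggregate = min(values)    # weakest field decides (default mode)
    else:
        raise ValueError(f"unknown confidence_check_mode: {mode!r}")
    # Assumption: a value equal to the threshold counts as passing.
    return aggregate >= confidence_level


# Hypothetical field names and confidence values, for illustration only.
fields = {"surname": 0.995, "first_name": 0.99, "series_and_number": 0.96}
print(passes_confidence_check(fields, mode="min"))
print(passes_confidence_check(fields, mode="avg"))
```

With these sample values the min mode rejects the document because one field is at 0.96, while the avg mode accepts it because the mean is about 0.982. This is why min is the stricter of the two modes.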
OCR API Methods
| Method | URL | Description |
|---|---|---|
| POST | /api/files/{fileId}/ocr | Launch OCR processing for a file |
| GET, HEAD | /api/files/{fileId}/ocr/{ocrId} | Get OCR result by ID |
Launch OCR Processing for a File
POST /api/files/{fileId}/ocr
Description
This method starts the Optical Character Recognition (OCR) process for the specified file. When processing finishes, the system returns the OCR result identifier, the recognized document type, and the process state.
When OCR succeeds, the corresponding document displays a "Verification passed" status in the user interface (UI).
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| fileId | integer | Unique identifier of the file for which OCR is launched. |
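A minimal sketch of how a client might build this request with the Python standard library. The host name is hypothetical, and this page does not describe authentication, so no credentials or headers are added here; a real deployment will most likely require them.

```python
import urllib.request

# Hypothetical host; replace it with the address of your file storage.
BASE_URL = "https://storage.example.com"


def build_launch_ocr_request(file_id, base_url=BASE_URL):
    """Build (but do not send) the POST request that launches OCR for a file."""
    # int() rejects non-numeric identifiers before they reach the URL.
    url = f"{base_url.rstrip('/')}/api/files/{int(file_id)}/ocr"
    return urllib.request.Request(url, method="POST")


request = build_launch_ocr_request(238)
print(request.get_method(), request.full_url)
```

The request can then be sent with `urllib.request.urlopen(request)`. Building and sending are kept separate so the URL construction can be checked without network access.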
Request Example
POST /api/files/238/ocr
Successful Response Example
{
  "status": "ok", // Operation execution status
  "timestamp": 1750765283000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "id": 1, // OCR result (or process) identifier
    "type": "passport", // Recognized document type
    "state": "OK" // OCR process state
  }
}
Error Response Example
{
  "status": "error", // Operation execution status
  "timestamp": 1750834012000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "message": "OCR type not found for tags: 8", // Error message
    "code": "OCR_TYPE_NOT_FOUND", // Error code
    "type": "App\\Exceptions\\OCRTypeNotFoundException", // Error type
    "details": [] // Error details
  }
}
Get OCR Result by ID
GET|HEAD /api/files/{fileId}/ocr/{ocrId}
Description
This method retrieves a specific Optical Character Recognition (OCR) result for the specified file. The response contains the result identifier, the recognized document type, and the OCR process state.
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| fileId | integer | Unique file identifier. |
| ocrId | integer | Unique OCR result identifier. |
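The lookup can be sketched as a small helper that sends the GET request and decodes the JSON body. The host name and the `opener` parameter are assumptions made for illustration: `opener` exists only so the example can run against a stand-in response instead of a live server. Authentication is not described on this page and is therefore omitted.

```python
import io
import json
import urllib.request


def get_ocr_result(base_url, file_id, ocr_id, opener=urllib.request.urlopen):
    """Fetch an OCR result and return the decoded JSON body as a dict."""
    url = f"{base_url.rstrip('/')}/api/files/{int(file_id)}/ocr/{int(ocr_id)}"
    request = urllib.request.Request(url, method="GET")
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_opener(request):
    """Stand-in for the server: returns the documented success body."""
    body = {
        "status": "ok",
        "timestamp": 1750772435000,
        "data": {"id": 1, "type": "passport", "state": "OK"},
    }
    return io.BytesIO(json.dumps(body).encode("utf-8"))


result = get_ocr_result("https://storage.example.com", 238, 1, opener=fake_opener)
print(result["data"]["state"])
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses. Whether the service returns its error body with such a status code is not stated here, so production code should be prepared for both cases.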
Request Example
GET /api/files/238/ocr/1
Successful Response Example
{
  "status": "ok", // Operation execution status
  "timestamp": 1750772435000, // Operation execution time (Unix timestamp, ms)
  "data": {
    "id": 1, // OCR result (or process) identifier
    "type": "passport", // Recognized document type
    "state": "OK" // OCR process state
  }
}
Field Descriptions
| Field | Type | Description |
|---|---|---|
| status | string | Operation execution status ("ok"/"error") |
| timestamp | integer | Operation execution time (Unix timestamp, ms) |
| data | object | Operation result |
| data.id | integer | OCR result (or process) identifier |
| data.type | string | Recognized document type (e.g., "passport") |
| data.state | string | OCR process state (e.g., "OK", "FAILED") |
| data.message | string | Error message (only for error) |
| data.code | string | Error code (only for error) |
| data.type | string | Error type: the exception class, e.g., "App\\Exceptions\\OCRTypeNotFoundException" (only for error) |
| data.details | array | Error details (only for error) |
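Because data.type carries the document type on success and the error type on failure, a client has to check status before reading data. The sketch below shows one way to do that; the class and function names are hypothetical, and it assumes the decoded body follows the field table above.

```python
from dataclasses import dataclass, field


@dataclass
class OcrResult:
    id: int
    document_type: str   # data.type on success, e.g. "passport"
    state: str           # e.g. "OK" or "FAILED"


@dataclass
class OcrError:
    message: str
    code: str
    exception_type: str  # data.type on error
    details: list = field(default_factory=list)


def parse_ocr_response(payload):
    """Turn a decoded response body into OcrResult or OcrError."""
    data = payload.get("data") or {}
    if payload.get("status") == "ok":
        return OcrResult(
            id=data["id"], document_type=data["type"], state=data["state"]
        )
    # Anything other than "ok" is treated as an error (an assumption).
    return OcrError(
        message=data.get("message", ""),
        code=data.get("code", ""),
        exception_type=data.get("type", ""),
        details=list(data.get("details") or []),
    )


success = parse_ocr_response(
    {"status": "ok", "timestamp": 1750772435000,
     "data": {"id": 1, "type": "passport", "state": "OK"}}
)
failure = parse_ocr_response(
    {"status": "error", "timestamp": 1750834012000,
     "data": {"message": "OCR type not found for tags: 8",
              "code": "OCR_TYPE_NOT_FOUND",
              "type": "App\\Exceptions\\OCRTypeNotFoundException",
              "details": []}}
)
print(success)
print(failure.code)
```

Keeping the two meanings of data.type in separately named attributes (document_type and exception_type) avoids confusing them later in client code.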