Returns a prediction from an OCR model for the specified image or PDF via URL or local file.
Request Parameters
Name | Type | Description | Available Version |
---|---|---|---|
| string | Standard form sent to the model. Use this parameter with the parameter | 2.0 |
| string | The path to the form template (in JSON format) of the input custom form. The form template defines the entities to extract and their key variations. See examples at Extract Data From Custom Forms. Either formType or formTemplate must be specified when task=form. | |
| string | ID of the model that makes the prediction. | 2.0 |
| string | Binary content of image or PDF file uploaded as multipart/form-data. | 2.0 |
| string | URL of the image or PDF file. Use this parameter when sending in a file from a web location. The URL must be a direct link to the file. | 2.0 |
| string | String that you can pass in to tag the prediction. Optional. Can be any value, and is returned in the response. | 2.0 |
| string | Optional. Designates the type of data in the image. Default is | 2.0 |
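As a rough illustration of how these parameters fit together, the sketch below assembles the form fields for a request that points at a file URL. The field names `modelId`, `sampleLocation`, `sampleId`, and the model ID `OCRModel` are assumptions, because the Name column is not present in the table above; only `task`, `formType`, and `formTemplate` are named in the text. The helper `build_ocr_fields` is hypothetical, and no request is actually sent.

```python
from typing import Dict, Optional


def build_ocr_fields(
    model_id: str,
    sample_location: str,
    task: Optional[str] = None,
    sample_id: Optional[str] = None,
    form_type: Optional[str] = None,
    form_template: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble form fields for a prediction request that references a file URL.

    Field names other than task, formType, and formTemplate are assumed.
    """
    # The parameter table states that task=form needs formType or formTemplate.
    if task == "form" and not (form_type or form_template):
        raise ValueError("task=form requires formType or formTemplate")

    fields = {"modelId": model_id, "sampleLocation": sample_location}
    optional = {
        "task": task,
        "sampleId": sample_id,
        "formType": form_type,
        "formTemplate": form_template,
    }
    # Leave out anything not supplied so the service applies its own defaults.
    fields.update({name: value for name, value in optional.items() if value is not None})
    return fields


fields = build_ocr_fields(
    "OCRModel",                      # assumed model ID
    "https://example.com/card.png",  # placeholder URL
    task="contact",
    sample_id="card-001",
)
print(fields)
```

The resulting dictionary could then be posted as multipart/form-data with whichever HTTP client you prefer; the endpoint and authentication are not covered here.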
Form Template Content (Beta)
Name | Type | Required | Description |
---|---|---|---|
| array of Key objects | Yes | Array of Key objects (see the Key Object table) |
| array of TableCell objects | No | Array of TableCell objects. Leave it as an empty array if no header information is specified. |
| boolean | Yes | Choose from [true, false]. If true, the automatically recognized tables are returned in the response. |
| string | Yes | Choose from ["1.0", "2.0", "3.0"] |
Key Object (Beta)
Name | Type | Required | Description |
---|---|---|---|
| object | Yes | See the Content of Key Object table for descriptions. |
Content of Key Object (Beta)
Name | Type | Required | Description |
---|---|---|---|
| string | Yes | The entity (field name) of this entity-value pair. Each entity must be a unique identifier within this form template. It can be any UTF-8 string. |
| array | Yes | A non-empty array of the key variations of the entity. If it is a virtual key, choose from ["person", "phone", "email", "address", "website", "org", "datetime"]. |
| array | No | The Salesforce form field type of this entity. Leave it as an empty array if not specified. If empty, the default entity_type TEXT will be used. There is no entity_type for virtual keys because the data format will be validated using the text information. |
Keep the following points in mind when sending a file in for prediction:
- Orientation: The model handles slight image or PDF orientation changes but not above 30-40%. Accuracy is better for files in which the text has a straight vertical orientation.
- Max File Size: The maximum image or PDF file size you can pass to this resource is 10 MB.
- Max Number of Pages: The maximum number of pages in a PDF is based on the value of the task parameter.
  - contact: The maximum number of pages is five. There should be only one business card per page.
  - table: The maximum number of pages is eight. The model can process multiple tables on a page, but a table that spans multiple pages is identified as separate tables. For example, if you have a table that spans pages one and two, the model returns results for two tables.
  - text: The maximum number of pages is eight.
  - form: The maximum number of pages is eight.
- File Types: The supported file types are PNG, JPG, JPEG, and PDF.
- Tables: The model can process multiple tables on a page, but a table that spans multiple pages is identified as separate tables. For example, if you have a table that spans pages one and two, the model returns results for two tables.
- Response Sort Order: The detected strings returned in the response are sorted by probability.
- Max Words Returned Per Image: When you send in an image file, the model returns a maximum of 600 words per image.
- Supported Languages: Einstein OCR supports English only. The characters supported are:
  !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~£≈
- Checkboxes: The model currently doesn't support checkboxes in any text.
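The file limits listed above can be checked locally before a file is sent. The following is a minimal sketch, not part of the API: the helper name `preflight_problems` is hypothetical, the caller is assumed to already know the PDF page count, and 10 MB is taken to mean 10 × 1,048,576 bytes.

```python
import os

# Limits taken from the list above; the byte interpretation of "10 MB" is an assumption.
MAX_FILE_BYTES = 10 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}
MAX_PDF_PAGES = {"contact": 5, "table": 8, "text": 8, "form": 8}


def preflight_problems(filename, size_bytes, task, pdf_pages=None):
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 10 MB")
    # Page limits apply to PDFs only and depend on the task value.
    if ext == ".pdf" and pdf_pages is not None:
        limit = MAX_PDF_PAGES.get(task)
        if limit is not None and pdf_pages > limit:
            problems.append(
                f"{pdf_pages} pages exceeds the {limit}-page limit for task={task}"
            )
    return problems


print(preflight_problems("card.png", 250_000, "contact"))
print(preflight_problems("cards.pdf", 250_000, "contact", pdf_pages=6))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a file in a single pass.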
Response Body
Name | Type | Description | Available Version |
---|---|---|---|
| string | Object returned; in this case, | 2.0 |
| array | Array of probabilities for the prediction. | 2.0 |
| string | Same value as request parameter. Returned only if the | 2.0 |
| string | Same value as request parameter. Returns | 2.0 |
Probabilities Response Body
Name | Type | Description | Available Version |
---|---|---|---|
| object | Contains additional attributes related to the | 2.0 |
| object | Contains the coordinates for the bounding box that encloses the detected text. | 2.0 |
| string | Content of the detected text when task is text or table. The label is “key-value” or “table” when the task is form, which indicates the type of this block. | 2.0 |
| float | Probability value for the input. Values are between 0–1. | 2.0 |
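Because each detected string carries a probability between 0 and 1, a client may want to discard low-confidence text. The sketch below assumes the two fields described above are named `label` and `probability` (the Name column is missing from the table); the helper name, the threshold, and the sample data are made up for illustration.

```python
def confident_text(probabilities, threshold=0.9):
    """Keep the detected text whose probability meets the threshold."""
    return [p["label"] for p in probabilities if p["probability"] >= threshold]


# Made-up sample entries shaped like the probabilities array.
sample = [
    {"label": "Invoice", "probability": 0.99},
    {"label": "Tota1", "probability": 0.62},
    {"label": "2024", "probability": 0.95},
]
print(confident_text(sample))  # ['Invoice', '2024']
```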
BoundingBox Response Body
Name | Type | Description | Available Version |
---|---|---|---|
| int | X-coordinate of the right side of the bounding box. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the bottom of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
| int | X-coordinate of the left side of the bounding box. The origin of the coordinate system is the top-left of the image. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the top of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
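Since all four coordinates are pixel offsets from the top-left corner of the image, the width and height of a box follow by subtraction. This sketch assumes the fields are named `minX`, `minY`, `maxX`, and `maxY`, which is a guess based on the descriptions above; the sample values are invented.

```python
def box_size(box):
    """Return (width, height) in pixels for a bounding box."""
    width = box["maxX"] - box["minX"]    # right edge minus left edge
    height = box["maxY"] - box["minY"]   # bottom edge minus top edge
    return width, height


sample_box = {"minX": 40, "minY": 10, "maxX": 190, "maxY": 42}
print(box_size(sample_box))  # (150, 32)
```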
Attributes Response Body
Returned when label is “key-value”.
Name | Type | Description | Available Version |
---|---|---|---|
| int | Unique ID for the key-value pair. Returned only when the | 2.0 |
| object | Contains the detected text in the form that’s part of the form. For example, in a driver's license, the key might be | 2.0 |
| string | Language of the key and value. Defaults to English. Only English is currently supported. Returned only when the | 2.0 |
| int | Page that contains the identified text. The model always returns 1, except when you send in a multi-page PDF. | 2.0 |
| object | Contains the detected text of the data that was entered in the form field. For example, in a driver's license, the value might be | 2.0 |
| string | Optional. Normalized representation of the key/value string. For example, a raw text value of 123 Main Street, Suite 54, San Francisco, CA, 94101, U.S.A. will be parsed as
Empty fields are excluded from the compound JSON response. Note: Currently only U.S. addresses are normalized. | 2.0 |
Attributes cellLocation Response Body
Returned when you pass a task parameter value of table.
Name | Type | Description | Available Version |
---|---|---|---|
| int | Index of the column that contains the detected text. | 2.0 |
| int | Index of the row that contains the detected text. | 2.0 |
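The row and column indexes make it possible to rebuild a table as a list of rows. The following sketch assumes the detected text is in `label` and that the indexes are exposed as `attributes.cellLocation.rowIndex` and `attributes.cellLocation.colIndex`; these names are assumptions, as is the sample data. It does not assume whether indexes start at 0 or 1, and it treats its input as a single table.

```python
def build_grid(probabilities):
    """Arrange detected cells into a list of rows using their cell locations."""
    cells = {}
    for item in probabilities:
        loc = item["attributes"]["cellLocation"]
        cells[(loc["rowIndex"], loc["colIndex"])] = item["label"]

    # Sort the indexes that actually occur, so 0-based and 1-based both work.
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    # Cells with no detected text become empty strings.
    return [[cells.get((r, c), "") for c in cols] for r in rows]


# Made-up entries, deliberately out of order.
sample = [
    {"label": "Qty", "attributes": {"cellLocation": {"rowIndex": 1, "colIndex": 2}}},
    {"label": "Item", "attributes": {"cellLocation": {"rowIndex": 1, "colIndex": 1}}},
    {"label": "Pen", "attributes": {"cellLocation": {"rowIndex": 2, "colIndex": 1}}},
    {"label": "3", "attributes": {"cellLocation": {"rowIndex": 2, "colIndex": 2}}},
]
grid = build_grid(sample)
print(grid)  # [['Item', 'Qty'], ['Pen', '3']]
```

For a multi-page PDF, where a table that spans pages comes back as separate tables, the entries would need to be split per table before calling a helper like this.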
Attributes Tag Response Body
Returned when you pass a task parameter value of contact.
Name | Type | Description | Available Version |
---|---|---|---|
| string | Entity that the model predicts for the detected text. Valid values: | 2.0 |
Attributes Key Response Body
Returned when you pass a task parameter value of form.
Name | Type | Description | Available Version |
---|---|---|---|
| object | Contains the coordinates for the bounding box that encloses the key. If text does not exist, it is [1,1,1,1]. | 2.0 |
| string | For the key text, specifies the type of form field. For example, in a driver's license, the key text can be | 2.0 |
| string | Detected text in the form that’s part of the form. For example, in a driver's license, the key text could be | 2.0 |
Attributes Key boundingBox Response Body
The bounding box for the form key.
Name | Type | Description | Available Version |
---|---|---|---|
| int | X-coordinate of the right side of the bounding box. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the bottom of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
| int | X-coordinate of the left side of the bounding box. The origin of the coordinate system is the top-left of the image. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the top of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
Attributes Value Response Body
Returned when you pass a task parameter value of form.
Name | Type | Description | Available Version |
---|---|---|---|
| object | Contains the coordinates for the bounding box that encloses the detected text value. | 2.0 |
| string | The data value for the specified key. For example, in a driver's license, if key text is | 2.0 |
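For form predictions, the key and value objects can be collapsed into a simple dictionary of field names to values. This is a sketch under assumptions: the paths `attributes.key.text` and `attributes.value.text` are guessed from the descriptions above, the helper name is hypothetical, and the sample entries are invented. Only the `key-value` and `table` label values come from the text.

```python
def form_fields(probabilities):
    """Map each detected key text to its value text, skipping table blocks."""
    fields = {}
    for item in probabilities:
        if item.get("label") != "key-value":
            continue  # blocks labeled "table" have a different attributes shape
        attrs = item["attributes"]
        fields[attrs["key"]["text"]] = attrs["value"]["text"]
    return fields


# Made-up entries shaped like a form response.
sample = [
    {
        "label": "key-value",
        "attributes": {"key": {"text": "Name"}, "value": {"text": "Ada Lovelace"}},
    },
    {"label": "table", "attributes": {}},
    {
        "label": "key-value",
        "attributes": {"key": {"text": "City"}, "value": {"text": "London"}},
    },
]
print(form_fields(sample))  # {'Name': 'Ada Lovelace', 'City': 'London'}
```

If the same key text appears more than once on a form, the last occurrence wins in this sketch; keeping a list per key would be a reasonable alternative.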
Attributes Value boundingBox Response Body
When label is “table”. The bounding box for the form value.
Name | Type | Description | Available Version |
---|---|---|---|
| int | X-coordinate of the right side of the bounding box. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the bottom of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
| int | X-coordinate of the left side of the bounding box. The origin of the coordinate system is the top-left of the image. Number of pixels from the left edge of the image. | 2.0 |
| int | Y-coordinate of the top of the bounding box. Number of pixels from the top edge of the image. | 2.0 |
Attributes Response Body
Returned when label is “table”.
Name | Type | Description |
---|---|---|
| string | Name of the table. Must be unique across all tableNames. It matches the user-defined tableName if a mapping is found; otherwise it is an automatically generated unique string with the format "table{number}{UUID}". |
| string | Count assigned to each table in the response. |
| array of tableCells objects | A list of tableCells objects (see the Attributes tableCells Object Response Body table). |
| string | Start page number of the table |
| int | ID of block. This ID is unique across the response and incrementally assigned after key value pairs. |
Attributes tableCells Object Response Body
Name | Type | Description |
---|---|---|
| object | Contains the coordinates for the bounding box that encloses the sub-label |
| string | The text of each element (cell) of the table |
| float | OCR confidence of text in the cell |
| cellLocation | See the description in Attributes tableCells Object cellLocation Response Body |
| cellLocation | See the description in Attributes tableCells Object cellLocation Response Body |
| string | Optional. It is set to user-defined entity of header if there is a match. Otherwise, it is non-existent. |
| string | Optional. It is one of {rowHeader, columnHeader, normal} if cellType inference is performed |
| string | Same as the normalizedText in Attributes Response Body |
Attributes tableCells Object cellLocation Response Body
Name | Type | Description |
---|---|---|
| int | The row index of the detected text |
| int | The column index of detected text |
| string | Optional. The entity of the row header of this cell. If the header does not have an entity, use the header's text instead. It does not exist if this cell does not have any headers or the header inference is not performed. |
| string | Optional. The entity of the column header of this cell. If the header does not have an entity, use the header's text instead. It does not exist if this cell does not have any headers or the header inference is not performed. |