Returns the metrics for a model, such as the f1 score, accuracy, and confusion matrix. Together, these metrics give you a picture of how accurate the model is and how well it's likely to perform. This call returns the metrics for the last epoch of the training that created the model.
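As a quick orientation, the response can be consumed like this minimal sketch. The payload here is a trimmed, illustrative example, not a real call's output; the exact field names and values are assumptions about the response shape.

```python
import json

# Trimmed, illustrative metrics payload; field names and values are
# assumptions about the response shape, not output from a real call.
raw = """{
  "object": "metrics",
  "metricsData": {
    "labels": ["hourly-forecast", "current-weather", "five-day-forecast"],
    "f1": [0.86, 0.93, 1.0],
    "testAccuracy": 0.92
  }
}"""

metrics = json.loads(raw)
data = metrics["metricsData"]

# Pair each per-label f1 score with its label via the labels array.
per_label_f1 = dict(zip(data["labels"], data["f1"]))
```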
Response Body
Name | Type | Description | Available Version |
---|---|---|---|
`algorithm` | string | Algorithm used to create the model. Returned only when the | 2.0 |
`createdAt` | date | Date and time that the model was created. | 2.0 |
`id` | string | ID of the model. Contains letters and numbers. | 2.0 |
`language` | string | Model language inherited from the dataset language. Default is | 2.0 |
`metricsData` | object | Model metrics values. | 2.0 |
`object` | string | Object returned; in this case, | 2.0 |
metricsData Response Body
Name | Type | Description | Available Version |
---|---|---|---|
`confusionMatrix` | array | Array of integers that contains the correct and incorrect classifications for each label in the dataset, based on testing done during the training process. | 2.0 |
`f1` | array | Array of floats that contains the harmonic mean of precision and recall for each label in the dataset. The corresponding label for each value in this array can be found in the `labels` array. | 2.0 |
`labels` | array | Array of strings that contains the dataset labels. These labels correspond, by position, to the values in the per-label arrays such as `f1`, `precision`, and `recall`. | 2.0 |
`macroF1` | float | Model-level f1 score. Average of all the per-label f1 scores. | 2.0 |
`precisionRecallCurve` | object | Contains label-independent precision, recall, and f1 values and their corresponding thresholds. | 2.0 |
`precision` | array | Array of floats that contains per-label precision values. | 2.0 |
`recall` | array | Array of floats that contains per-label recall values. | 2.0 |
`testAccuracy` | float | Accuracy of the test data. By default, 10% of your dataset is set aside and isn't used during training. This 10% is then sent to the model for prediction, and how often the prediction is correct is reported as `testAccuracy`. | 2.0 |
`testLoss` | float | Summary of the errors made in predictions on the validation data. The lower the value, the more accurate the model. | 2.0 |
| | object | Contains each model label and its corresponding per-label threshold value. Returned only for models created from a dataset with a | 2.0 |
`trainingAccuracy` | float | Accuracy of the training data. By default, the 90% of your dataset that remains after the test set is set aside is used for training. This 90% is then sent to the model for prediction, and how often the prediction is correct is reported as `trainingAccuracy`. | 2.0 |
`trainingLoss` | float | Summary of the errors made in predictions on the training and validation data. The lower the value, the more accurate the model. | 2.0 |
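As the table notes, each per-label f1 value is the harmonic mean of the corresponding precision and recall values, and the model-level score is the average of the per-label scores. A quick sketch with illustrative numbers:

```python
# Illustrative per-label values; in a real response these come from the
# per-label precision and recall arrays in the metrics data.
precision = [0.75, 1.0, 1.0]
recall = [1.0, 0.875, 1.0]

# Per-label f1 is the harmonic mean of precision and recall.
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

# The model-level f1 score is the average of the per-label f1 scores.
macro_f1 = sum(f1) / len(f1)
```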
Use the `labels` array and the `confusionMatrix` array to build the confusion matrix for a model. The labels in the array become the matrix rows and columns. Here's what the confusion matrix for the example results looks like.
| | hourly-forecast | current-weather | five-day-forecast |
---|---|---|---|
hourly-forecast | 3 | 0 | 0 |
current-weather | 1 | 7 | 0 |
five-day-forecast | 0 | 0 | 1 |
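Each row of the matrix counts how the examples of one actual label were classified, so the diagonal holds correct predictions and every off-diagonal cell is an error. A small sketch using the same example values as the table above:

```python
labels = ["hourly-forecast", "current-weather", "five-day-forecast"]

# Row i counts examples whose actual label is labels[i]; column j is the
# label they were classified as. Values match the example table above.
confusion_matrix = [
    [3, 0, 0],
    [1, 7, 0],
    [0, 0, 1],
]

# Off-diagonal cells are misclassifications: here, one current-weather
# example was classified as hourly-forecast.
errors = sum(
    cell
    for i, row in enumerate(confusion_matrix)
    for j, cell in enumerate(row)
    if i != j
)
total = sum(sum(row) for row in confusion_matrix)
accuracy = (total - errors) / total
```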
precisionRecallCurve Response Body
Name | Type | Description | Available Version |
---|---|---|---|
`f1` | array | Array of floats that contains the harmonic mean of the corresponding values in the `precision` and `recall` arrays. | 2.0 |
`precision` | array | Array of floats that contains the precision of the model when a particular threshold is chosen. Precision is computed using the true positives and false positives for the chosen threshold. | 2.0 |
`recall` | array | Array of floats that contains the recall of the model when a particular threshold is chosen. Recall is computed using the true positives and false negatives for the chosen threshold. | 2.0 |
`thresholds` | array | Array of floats that contains the thresholds set on the probability returned when predicting against the test set. Each f1, precision, and recall value is computed based on the corresponding threshold. | 2.0 |
Keep the following points in mind when analyzing the precision-recall metrics.
- The precision-recall curve arrays are calculated using the examples in the test set (as specified by the `trainSplitRatio`). The margin of error when these values are computed depends on the number of examples in the test set.
- The granularity of the threshold variations depends on the number of examples in the test set and the number of labels in the dataset from which the model was created.
- The maximum number of values returned in any one of these arrays is 2,000.
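A common use of these parallel arrays is to pick an operating threshold, for example the one with the best f1. A sketch with illustrative values (a real response can contain up to 2,000 entries per array):

```python
# Illustrative parallel arrays; index i gives the precision and recall
# observed when the prediction probability cutoff is thresholds[i].
thresholds = [0.2, 0.4, 0.6, 0.8]
precision = [0.70, 0.80, 0.90, 0.95]
recall = [0.95, 0.90, 0.70, 0.40]

# f1 at each threshold is the harmonic mean of the corresponding
# precision and recall values.
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

# Choose the threshold that maximizes f1.
best = max(range(len(thresholds)), key=lambda i: f1[i])
best_threshold = thresholds[best]
```

Raising the threshold typically trades recall for precision, which is why the best-f1 point usually sits somewhere in the middle of the curve.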