Get Model Metrics

Returns the metrics for a model, such as the f1 score, accuracy, and confusion matrix. Together, these metrics give you a picture of how accurate the model is and how well it's likely to perform. This call returns the metrics for the last epoch of the training run that created the model.
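Here's a minimal sketch of retrieving the metrics with Python's requests library. The base URL, model ID, and access token below are placeholders; substitute the endpoint and credentials for your own account.

```python
import requests

# Placeholder values -- substitute your own model ID and access token.
BASE_URL = "https://api.einstein.ai/v2/language/models"
MODEL_ID = "YOUR_MODEL_ID"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"

# GET the model; the response includes the metricsData object described below.
response = requests.get(
    f"{BASE_URL}/{MODEL_ID}",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
response.raise_for_status()
metrics = response.json()

print(metrics["metricsData"]["macroF1"])        # model-level f1 score
print(metrics["metricsData"]["testAccuracy"])   # accuracy on the held-out test set
```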

Response Body

| Name | Type | Description | Available Version |
| --- | --- | --- | --- |
| algorithm | string | Algorithm used to create the model. Returned only when the modelType is text-intent. | 2.0 |
| createdAt | date | Date and time that the model was created. | 2.0 |
| id | string | ID of the model. Contains letters and numbers. | 2.0 |
| language | string | Model language inherited from the dataset language. Default is en_US. | 2.0 |
| metricsData | object | Model metrics values. | 2.0 |
| object | string | Object returned; in this case, metrics. | 2.0 |

metricsData Response Body

| Name | Type | Description | Available Version |
| --- | --- | --- | --- |
| confusionMatrix | array | Array of integers that contains the correct and incorrect classifications for each label in the dataset, based on testing done during the training process. | 2.0 |
| f1 | array | Array of floats that contains the harmonic mean of precision and recall for each label in the dataset. The corresponding label for each value is at the same index in the labels array. For example, the first f1 score in the f1 array corresponds to the first label in the labels array. | 2.0 |
| labels | array | Array of strings that contains the dataset labels. These labels correspond, by position, to the values in the f1, confusionMatrix, precision, and recall arrays and to the entries in the threshold object. | 2.0 |
| macroF1 | float | Model-level f1 score. Average of all the per-label f1 scores. | 2.0 |
| precisionRecallCurve | object | Contains label-independent precision, recall, and f1 values and their corresponding thresholds. | 2.0 |
| precision | array | Array of floats that contains per-label precision values. | 2.0 |
| recall | array | Array of floats that contains per-label recall values. | 2.0 |
| testAccuracy | float | Accuracy of the test data. By default, 10% of your dataset is set aside and isn't used during training. This 10% is then sent to the model for prediction, and how often the model predicts correctly is reported as testAccuracy. | 2.0 |
| testLoss | float | Summary of the errors made in predictions using the validation data. The lower the value, the more accurate the model. | 2.0 |
| threshold | object | Contains each model label and its corresponding per-label threshold value. Returned only for models created from a dataset with a type of text-intent. | 2.0 |
| trainingAccuracy | float | Accuracy of the training data. By default, the 90% of your dataset that remains after the test set is set aside is used for training. This 90% is then sent to the model for prediction, and how often the model predicts correctly is reported as trainingAccuracy. | 2.0 |
| trainingLoss | float | Summary of the errors made in predictions using the training and validation data. The lower the value, the more accurate the model. | 2.0 |
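Because the per-label arrays are index-aligned with the labels array, you can pair them up directly. A short sketch, assuming metrics holds the parsed JSON response from the call above:

```python
data = metrics["metricsData"]

# The per-label arrays are index-aligned with the labels array.
for label, f1, precision, recall in zip(
    data["labels"], data["f1"], data["precision"], data["recall"]
):
    print(f"{label}: f1={f1:.3f} precision={precision:.3f} recall={recall:.3f}")

# Averaging the per-label f1 scores should match macroF1 up to rounding.
macro_f1 = sum(data["f1"]) / len(data["f1"])
```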

Use the labels array and the confusionMatrix array to build the confusion matrix for a model. The labels in the array become the matrix rows and columns. Here's what the confusion matrix for the example results looks like.

|  | hourly-forecast | current-weather | five-day-forecast |
| --- | --- | --- | --- |
| hourly-forecast | 3 | 0 | 0 |
| current-weather | 1 | 7 | 0 |
| five-day-forecast | 0 | 0 | 1 |
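A sketch of rebuilding that table in code, assuming metrics is the parsed response. The reshape below handles a confusionMatrix returned as a flat array of integers; if your response nests one row per label, the rows are used directly.

```python
data = metrics["metricsData"]
labels = data["labels"]
cm = data["confusionMatrix"]

# Accept either a nested array (one row per label) or a flat array
# that we reshape into an n x n matrix.
n = len(labels)
rows = cm if cm and isinstance(cm[0], list) else [cm[i * n:(i + 1) * n] for i in range(n)]

# Print the matrix with the labels as both row and column headers.
width = max(len(label) for label in labels)
print(" " * width, *(label.rjust(width) for label in labels))
for label, row in zip(labels, rows):
    print(label.rjust(width), *(str(value).rjust(width) for value in row))
```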

precisionRecallCurve Response Body

| Name | Type | Description | Available Version |
| --- | --- | --- | --- |
| f1 | array | Array of floats that contains the harmonic mean of the corresponding values in the precision and recall arrays. | 2.0 |
| precision | array | Array of floats that contains the precision of the model when a particular threshold is chosen. Precision is computed from the true positives and false positives at the chosen threshold. | 2.0 |
| recall | array | Array of floats that contains the recall of the model when a particular threshold is chosen. Recall is computed from the true positives and false negatives at the chosen threshold. | 2.0 |
| threshold | array | Array of floats that contains the thresholds applied to the probability returned when predicting against the test set. Each f1, precision, and recall value is computed based on the corresponding threshold. | 2.0 |
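One common use of the curve is picking the threshold that maximizes f1. A sketch, again assuming metrics is the parsed response; the four arrays are index-aligned, so each position describes one candidate threshold:

```python
curve = metrics["metricsData"]["precisionRecallCurve"]

# Each index holds one candidate threshold and the precision, recall,
# and f1 the model achieves on the test set at that threshold.
best = max(
    zip(curve["threshold"], curve["precision"], curve["recall"], curve["f1"]),
    key=lambda point: point[3],  # maximize f1
)
threshold, precision, recall, f1 = best
print(f"best threshold={threshold:.2f}: f1={f1:.3f}, "
      f"precision={precision:.3f}, recall={recall:.3f}")
```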

Keep the following points in mind when analyzing the precision-recall metrics.

  • The precision-recall curve arrays are calculated using the examples in the test set (as specified by the trainSplitRatio). The margin of error when these values are computed depends on the number of examples in the test set.
  • The granularity of the threshold variations depends on the number of examples in the test set and the number of labels in the dataset from which the model was created.
  • The maximum number of values returned in any one of these arrays is 2,000.