- The confidence score represents the (estimated) probability that the model's output is correct.
- A loss function is a way to quantify the error of a model. More specifically, it associates a prediction with a score that denotes how far the prediction is from the ground truth.
Cross-Entropy Loss
- The cross entropy loss takes in a probability vector as input.
- Let $\hat{y}$ be the input vector representing an estimate, $y$ be a vector representing the ground truth, and $N$ be the number of labels that we have (i.e., the dimension of $y$ and $\hat{y}$). The function is defined as:
$$L_{CE}(\hat{y}, y) = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$
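A minimal NumPy sketch of this definition (the function name and the `eps` clamp guarding against $\log 0$ are my own additions for numerical safety, not part of the definition):

```python
import numpy as np

def cross_entropy(y_hat, y, eps=1e-12):
    """Cross entropy between a predicted probability vector y_hat
    and a ground-truth (e.g. one-hot) vector y."""
    y_hat = np.clip(y_hat, eps, 1.0)  # avoid log(0)
    return -np.sum(y * np.log(y_hat))

# One-hot ground truth over N = 3 labels; the model puts 0.7 on the true label.
y = np.array([0.0, 1.0, 0.0])
y_hat = np.array([0.2, 0.7, 0.1])
print(cross_entropy(y_hat, y))  # -log(0.7) ≈ 0.357
```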
Interpretation
- It derives from cross entropy. We may see this by noting that the “actual probabilities” are those represented in our label vector $y$, and the “believed probabilities” are those of $\hat{y}$.
- The cross entropy loss not only penalizes the model’s errors: it measures the expected amount of information we would need to communicate the correct label using a code built from the model’s believed probabilities. Minimizing it therefore also minimizes the degree of uncertainty that the model has as a result of its predictions.
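To make the information-theoretic reading concrete: cross entropy decomposes as $H(p, q) = H(p) + D_{KL}(p \| q)$, so with the labels fixed, minimizing it minimizes the KL divergence from the believed to the actual distribution. A small numerical check (the distributions are made-up illustrative values):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]                     # 0 log 0 is taken to be 0
    return -np.sum(p * np.log(p))

def kl(p, q):
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.5, 0.25, 0.25])      # "actual" label distribution (soft labels)
q = np.array([0.6, 0.2, 0.2])        # "believed" model distribution

cross = -np.sum(p * np.log(q))       # cross entropy H(p, q)
print(np.isclose(cross, entropy(p) + kl(p, q)))  # True
```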
Relation to Softmax
- Suppose the input to the cross entropy loss is the output of a Softmax applied to a vector of logits $z$, i.e. $\hat{y}_i = \frac{e^{z_i}}{\sum_{k=1}^{N} e^{z_k}}$. We get the following equations (using $\sum_i y_i = 1$):
$$L(z, y) = -\sum_{i=1}^{N} y_i \log \hat{y}_i = -\sum_{i=1}^{N} y_i z_i + \log \sum_{k=1}^{N} e^{z_k}$$
Now consider the partial derivative with respect to any logit $z_j$:
$$\frac{\partial L}{\partial z_j} = \hat{y}_j - y_j$$
So the gradient is simply $\hat{y} - y$, which means that to perform gradient descent, all we need is the difference between the expected and observed probabilities (see the numerical check below).
- Now take the second derivative. We get
$$\frac{\partial^2 L}{\partial z_j^2} = \hat{y}_j (1 - \hat{y}_j)$$
As it turns out, this is equal to the variance of the Softmax output $\hat{y}_j$ (the variance of a Bernoulli variable with parameter $\hat{y}_j$).
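A quick finite-difference check of the gradient formula $\hat{y} - y$ (the logits and one-hot label are arbitrary illustrative values):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())           # shift for numerical stability
    return e / e.sum()

def loss(z, y):
    return -np.sum(y * np.log(softmax(z)))

z = np.array([1.0, 2.0, 0.5])         # logits
y = np.array([0.0, 1.0, 0.0])         # one-hot ground truth

analytic = softmax(z) - y             # the gradient derived above

numeric = np.zeros_like(z)            # central finite differences
h = 1e-6
for j in range(len(z)):
    dz = np.zeros_like(z)
    dz[j] = h
    numeric[j] = (loss(z + dz, y) - loss(z - dz, y)) / (2 * h)

print(np.allclose(analytic, numeric))  # True
```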
Mean Squared Error
- The Mean Squared Error, denoted $L_{MSE}$, is defined as:
$$L_{MSE}(\hat{y}, y) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$
- A convention is to multiply the above quantity by $\frac{1}{2}$, since it makes the derivative more mathematically convenient: the $\frac{1}{2}$ cancels the factor of $2$ produced by differentiating the square.
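A sketch of the $\frac{1}{2}$ convention (the vectors are made-up illustrative values); the gradient with respect to each prediction is then just $(\hat{y}_i - y_i)/N$:

```python
import numpy as np

def half_mse(y_hat, y):
    # The 1/2 convention: differentiating the square yields a factor
    # of 2 that the 1/2 cancels.
    return 0.5 * np.mean((y_hat - y) ** 2)

y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.1, 1.9, 3.3])

grad = (y_hat - y) / len(y)  # gradient of half_mse w.r.t. y_hat
print(half_mse(y_hat, y), grad)
```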
Smooth L1 Loss
- Defined as:
$$\text{smooth}_{L_1}(x) = \begin{cases} \frac{1}{2} x^2 & \text{if } |x| < 1 \\ |x| - \frac{1}{2} & \text{otherwise} \end{cases}$$
where $x$ is the residual $\hat{y} - y$.
- The $-\frac{1}{2}$ term is present to make the function continuous: both branches equal $\frac{1}{2}$ at $|x| = 1$.
- This acts as a combination of L1 and L2 regression: quadratic (L2-like) near zero, linear (L1-like) for large residuals, which makes it less sensitive to outliers than pure L2.
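A sketch of the piecewise definition above, applied elementwise to a residual vector (the sample residuals are illustrative):

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1 loss of a residual x: quadratic (L2-like) for |x| < 1,
    linear (L1-like) beyond."""
    return np.where(np.abs(x) < 1, 0.5 * x ** 2, np.abs(x) - 0.5)

x = np.linspace(-3, 3, 7)  # residuals [-3, -2, -1, 0, 1, 2, 3]
print(smooth_l1(x))
# Both branches give 0.5 at |x| = 1: the -1/2 term is what makes
# the two pieces meet continuously.
```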
Links
- Differential Calculus
- Optimization Algorithms in Machine Learning - how to actually minimize or maximize the loss.