Define a log loss
tefla.core.losses.log_loss_custom (predictions, labels, eps=1e-07, name='log')
Args
- predictions: 2D tensor or array, [batch_size, num_classes], predictions of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- eps: a constant used to clip the labels to upper and lower limits; smoothing factor.
- name: Optional scope/name for op_scope.
Returns
A tensor with the log loss.
Define a log loss
tefla.core.losses.log_loss_tf (predictions, labels, eps=1e-07, weights=1.0, name='log_loss')
Args
- predictions: 2D tensor or array, [batch_size, num_classes], predictions of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- eps: a constant used to clip the labels to upper and lower limits; smoothing factor.
- weights: a scalar or tensor used to scale the loss.
- name: Optional scope/name for op_scope.
Returns
A tensor with the log loss.
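A minimal usage sketch for both log-loss variants, assuming TensorFlow 1.x and that the installed tefla matches the signatures above; the shapes and values are illustrative:

```python
import tensorflow as tf
from tefla.core.losses import log_loss_custom, log_loss_tf

# Illustrative [batch_size, num_classes] predictions and one-hot labels.
predictions = tf.nn.softmax(tf.random_normal([32, 5]))
labels = tf.one_hot(tf.random_uniform([32], maxval=5, dtype=tf.int32), depth=5)

loss_a = log_loss_custom(predictions, labels, eps=1e-7)
loss_b = log_loss_tf(predictions, labels, eps=1e-7, weights=1.0)
```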
Define a kappa loss; a continuous, differentiable approximation of the discrete kappa loss
tefla.core.losses.kappa_loss (predictions, labels, y_pow=1, eps=1e-15, num_ratings=5, batch_size=32, name='kappa')
Args
- predictions: 2D tensor or array, [batch_size, num_classes], predictions of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- y_pow: int, power to which the labels are raised; useful if the model diverges, e.g. y_pow=2.
- num_ratings: number of ratings used, typically the num_classes of the model.
- batch_size: batch size of the training or validation ops.
- eps: a float, prevents divide by zero
- name: Optional scope/name for op_scope.
Returns
A tensor with the kappa loss.
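A minimal usage sketch, assuming TensorFlow 1.x; batch_size and num_ratings must match the shapes of the inputs:

```python
import tensorflow as tf
from tefla.core.losses import kappa_loss

batch_size, num_classes = 32, 5
predictions = tf.nn.softmax(tf.random_normal([batch_size, num_classes]))
labels = tf.one_hot(
    tf.random_uniform([batch_size], maxval=num_classes, dtype=tf.int32),
    depth=num_classes)
loss = kappa_loss(predictions, labels, y_pow=1,
                  num_ratings=num_classes, batch_size=batch_size)
```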
Define a joint kappa and log loss; kappa is a continuous, differentiable approximation of the discrete kappa loss
tefla.core.losses.kappa_log_loss (predictions, labels, label_smoothing=0.0, y_pow=1, batch_size=32, log_scale=0.5, num_classes=5, log_offset=0.5, name='kappa_log')
Args
- predictions: 2D tensor or array, [batch_size, num_classes], predictions of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- label_smoothing: a float; if greater than 0, smooth the labels for better generalization.
- y_pow: int, power to which the labels are raised; useful if the model diverges, e.g. y_pow=2.
- num_classes: number of ratings used, typically the num_classes of the model.
- batch_size: batch size of the training or validation ops.
- log_scale: a float used to multiply the clipped log loss, e.g. 0.5.
- log_offset: a float, minimum log loss offset to subtract from the original log loss, e.g. 0.50.
- name: Optional scope/name for op_scope.
Returns
A tensor with the kappa log loss.
Define a joint kappa and log loss where the log loss is clipped at a defined minimum value; kappa is a continuous, differentiable approximation of the discrete kappa loss
tefla.core.losses.kappa_log_loss_clipped (predictions, labels, label_smoothing=0.0, y_pow=1, batch_size=32, log_scale=0.5, log_cutoff=0.8, num_classes=5, name='kappa_log_clipped')
Args
- predictions: 2D tensor or array, [batch_size, num_classes], predictions of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- label_smoothing: a float; if greater than 0, smooth the labels for better generalization.
- y_pow: int, power to which the labels are raised; useful if the model diverges, e.g. y_pow=2.
- num_classes: number of ratings used, typically the num_classes of the model.
- batch_size: batch size of the training or validation ops.
- log_scale: a float used to multiply the clipped log loss, e.g. 0.5.
- log_cutoff: a float, minimum log loss value, e.g. 0.50.
- name: Optional scope/name for op_scope.
Returns
A tensor with the clipped kappa log loss.
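A minimal usage sketch for the clipped variant (kappa_log_loss takes the same inputs, with log_offset in place of log_cutoff), assuming TensorFlow 1.x:

```python
import tensorflow as tf
from tefla.core.losses import kappa_log_loss_clipped

batch_size, num_classes = 32, 5
predictions = tf.nn.softmax(tf.random_normal([batch_size, num_classes]))
labels = tf.one_hot(
    tf.random_uniform([batch_size], maxval=num_classes, dtype=tf.int32),
    depth=num_classes)
loss = kappa_log_loss_clipped(predictions, labels, label_smoothing=0.01,
                              y_pow=1, batch_size=batch_size,
                              log_scale=0.5, log_cutoff=0.8,
                              num_classes=num_classes)
```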
Define a cross entropy loss with label smoothing
tefla.core.losses.cross_entropy_loss (logits, labels, label_smoothing=0.0, weight=1.0, name='cross_entropy_loss')
Args
- logits: 2D tensor or array, [batch_size, num_classes], logits of the network.
- labels: 2D tensor or array, [batch_size, num_classes], ground truth or target labels.
- label_smoothing: a float; if greater than 0, smooth the labels for better generalization.
- weight: scale the loss by this factor.
- name: Optional scope/name for op_scope.
Returns
A tensor with the cross entropy loss.
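A minimal usage sketch, assuming TensorFlow 1.x; note the first argument is raw logits, not softmax outputs:

```python
import tensorflow as tf
from tefla.core.losses import cross_entropy_loss

logits = tf.random_normal([32, 5])  # raw, unnormalized network outputs
labels = tf.one_hot(tf.random_uniform([32], maxval=5, dtype=tf.int32), depth=5)
loss = cross_entropy_loss(logits, labels, label_smoothing=0.1, weight=1.0)
```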
Define a combined L1 and L2 loss, useful for regularization, i.e. weight decay
tefla.core.losses.l1_l2_regularizer (var, weight_l1=1.0, weight_l2=1.0, name='l1_l2_regularizer')
Args
- var: tensor to regularize.
- weight_l1: an optional weight to modulate the l1 loss.
- weight_l2: an optional weight to modulate the l2 loss.
- name: Optional scope/name for op_scope.
Returns
The L1 + L2 loss op.
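A minimal sketch of regularizing a weight variable, assuming TensorFlow 1.x; the weight values are illustrative:

```python
import tensorflow as tf
from tefla.core.losses import l1_l2_regularizer

weights = tf.get_variable('weights', shape=[256, 10])
reg = l1_l2_regularizer(weights, weight_l1=1e-5, weight_l2=1e-4)
# total_loss = task_loss + reg
```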
Returns a function that can be used to apply L1 regularization to weights
tefla.core.losses.l1_regularizer (scale, name='l1_regularizer') L1 regularization encourages sparsity.
Args
scale: A scalar multiplier Tensor; 0.0 disables the regularizer.
name: An optional name/scope name.
Returns
A function with signature l1(weights) that applies L1 regularization.
Returns a function that can be used to apply L2 regularization to weights
tefla.core.losses.l2_regularizer (scale, name='l2_regularizer') Small values of L2 can help prevent overfitting the training data.
Args
scale: A scalar multiplier Tensor; 0.0 disables the regularizer.
name: An optional name/scope name.
Returns
A function with signature l2(weights) that applies L2 regularization.
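A minimal sketch of the factory pattern shared by both regularizers, assuming TensorFlow 1.x; the scales are illustrative:

```python
import tensorflow as tf
from tefla.core.losses import l1_regularizer, l2_regularizer

weights = tf.get_variable('weights', shape=[256, 10])
l1 = l1_regularizer(scale=1e-5)  # returns a function l1(weights)
l2 = l2_regularizer(scale=1e-4)  # returns a function l2(weights)
penalty = l1(weights) + l2(weights)
```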
Log-likelihood for a mixture of discretized logistics; assumes the data has been rescaled to the [-1, 1] interval
tefla.core.losses.discretized_mix_logistic_loss (inputs, predictions, sum_all=True, name='disretized_mix_logistic_loss')
Args
- predictions: 4D tensor or array, [batch_size, width, height, out_channels], predictions of the network.
- inputs: 4D tensor or array, [batch_size, width, height, num_classes], ground truth or target labels.
- name: Optional scope/name for op_scope.
Returns
A tensor with the discretized mix logistic loss.
Pull Away loss calculation
tefla.core.losses.pullaway_loss (embeddings, name='pullaway_loss')
Args
- embeddings: The embeddings to be orthogonalized for varied faces. Shape [batch_size, embeddings_dim]
Calculate the loss from the logits and the labels
tefla.core.losses.segment_loss (logits, labels, num_classes, head=None)
Args
- logits: tensor, float, [batch_size * width * height, num_classes]; use vgg_fcn.up as logits.
- labels: labels tensor, int32, [batch_size * width * height, num_classes]; the ground truth of your data.
- head: numpy array, [num_classes], weighting the loss of each class; optional, to prioritize some classes.
Returns
loss: Loss tensor of type float.
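A minimal usage sketch, assuming TensorFlow 1.x; logits and labels are flattened to [batch_size * width * height, num_classes] as described above, and the spatial dimensions are illustrative:

```python
import numpy as np
import tensorflow as tf
from tefla.core.losses import segment_loss

num_classes = 21
n_pixels = 4 * 64 * 64  # batch_size * width * height
logits = tf.random_normal([n_pixels, num_classes])
labels = tf.one_hot(
    tf.random_uniform([n_pixels], maxval=num_classes, dtype=tf.int32),
    depth=num_classes)
head = np.ones(num_classes, dtype=np.float32)  # optional per-class weights
loss = segment_loss(logits, labels, num_classes, head=head)
```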
Calculate the triplet loss according to the FaceNet paper
tefla.core.losses.triplet_loss (anchor, positive, negative, alpha=0.2, name='triplet_loss')
Args
- anchor: 2-D tensor, [batch_size, embedding_size], the embeddings for the anchor images.
- positive: 2-D tensor, [batch_size, embedding_size], the embeddings for the positive images.
- negative: 2-D tensor, [batch_size, embedding_size], the embeddings for the negative images.
- alpha: the positive-to-negative triplet distance margin.
Returns
The triplet loss.
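A minimal usage sketch, assuming TensorFlow 1.x; FaceNet-style embeddings are typically L2-normalized before the loss:

```python
import tensorflow as tf
from tefla.core.losses import triplet_loss

anchor = tf.nn.l2_normalize(tf.random_normal([32, 128]), 1)
positive = tf.nn.l2_normalize(tf.random_normal([32, 128]), 1)
negative = tf.nn.l2_normalize(tf.random_normal([32, 128]), 1)
loss = triplet_loss(anchor, positive, negative, alpha=0.2)
```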
Decov loss as described in https://arxiv.org/pdf/1511.06068.pdf
tefla.core.losses.decov_loss (xs, name='decov_loss') 'Reducing Overfitting In Deep Networks by Decorrelating Representation'
Args
- xs: 4-D tensor, [batch_size, height, width, channels], input.
Returns
A float, the decov loss.
Center loss based on the paper "A Discriminative Feature Learning Approach for Deep Face Recognition"
tefla.core.losses.center_loss (features, label, alpha, num_classes, name='center_loss') (http://ydwen.github.io/papers/WenECCV16.pdf)
Args
- features: 2-D tensor, [batch_size, feature_length], input features.
- label: 1-D tensor, [batch_size], input labels.
- alpha: center loss parameter.
- num_classes: an int, number of classes for training.
Returns
A float, the center loss.
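A minimal usage sketch, assuming TensorFlow 1.x; alpha and the shapes are illustrative:

```python
import tensorflow as tf
from tefla.core.losses import center_loss

features = tf.random_normal([32, 128])                      # [batch_size, feature_length]
label = tf.random_uniform([32], maxval=10, dtype=tf.int32)  # [batch_size]
loss = center_loss(features, label, alpha=0.95, num_classes=10)
```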
Adds a similarity loss term, the correlation between two representations
tefla.core.losses.correlation_loss (source_samples, target_samples, weight, name='corr_loss')
Args
- source_samples: a tensor of shape [num_samples, num_features]
- target_samples: a tensor of shape [num_samples, num_features]
- weight: a scalar weight for the loss.
- name: optional name scope for summary tags.
Returns
a scalar tensor representing the correlation loss value.
Computes the Maximum Mean Discrepancy (MMD) of two samples: x and y
tefla.core.losses.maximum_mean_discrepancy (x, y, kernel=
Maximum Mean Discrepancy (MMD) is a distance-measure between the samples of the distributions of x and y. Here we use the kernel two sample estimate using the empirical mean of the two distributions.
MMD^2(P, Q) = || \E{\phi(x)} - \E{\phi(y)} ||^2 = \E{K(x, x)} + \E{K(y, y)} - 2 \E{K(x, y)},
where K(x, y) = <\phi(x), \phi(y)> is the desired kernel function, in this case a radial basis kernel.
Args
- x: a tensor of shape [num_samples, num_features]
- y: a tensor of shape [num_samples, num_features]
- kernel: a function which computes the kernel matrix used in the MMD estimate. Defaults to the Gaussian kernel matrix.
Returns
a scalar denoting the squared maximum mean discrepancy loss.
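For reference, a self-contained NumPy sketch of the empirical estimator defined by the formula above, with an RBF kernel; this illustrates the math and is not tefla's implementation:

```python
import numpy as np

def rbf_kernel_matrix(x, y, sigma=1.0):
    # K[i, j] = exp(-||x_i - y_j||^2 / (2 * sigma^2))
    sq_dists = (np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :]
                - 2.0 * x.dot(y.T))
    return np.exp(-sq_dists / (2.0 * sigma**2))

def mmd2(x, y, kernel=rbf_kernel_matrix):
    # Empirical estimate: E[K(x, x)] + E[K(y, y)] - 2 E[K(x, y)]
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()
```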
Adds a similarity loss term, the MMD between two representations
tefla.core.losses.mmd_loss (source_samples, target_samples, weight, name='mmd_loss')
This Maximum Mean Discrepancy (MMD) loss is calculated with a number of different Gaussian kernels.
Args
- source_samples: a tensor of shape [num_samples, num_features].
- target_samples: a tensor of shape [num_samples, num_features].
- weight: the weight of the MMD loss.
- name: optional name scope for summary tags.
Returns
a scalar tensor representing the MMD loss value.
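A minimal usage sketch for domain adaptation between two feature batches, assuming TensorFlow 1.x; the weight is illustrative:

```python
import tensorflow as tf
from tefla.core.losses import mmd_loss

source = tf.random_normal([64, 128])  # features from the source domain
target = tf.random_normal([64, 128])  # features from the target domain
loss = mmd_loss(source, target, weight=0.25)
```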
Adds the domain adversarial (DANN) loss
tefla.core.losses.dann_loss (source_samples, target_samples, weight, name='dann_loss')
Args
- source_samples: a tensor of shape [num_samples, num_features].
- target_samples: a tensor of shape [num_samples, num_features].
- weight: the weight of the loss.
- name: optional name scope for summary tags.
Returns
A scalar tensor representing the DANN loss value.
Adds the difference loss between the private and shared representations
tefla.core.losses.difference_loss (private_samples, shared_samples, weight=1.0, name='difference_loss')
Args
- private_samples: a tensor of shape [num_samples, num_features].
- shared_samples: a tensor of shape [num_samples, num_features].
- weight: the weight of the incoherence loss.
- name: the name of the tf summary.
A helper function to compute the error between quaternions
tefla.core.losses.log_quaternion_loss_batch (predictions, labels, name='log_quaternion_batch_loss')
Args
- predictions: A Tensor of size [batch_size, 4].
- labels: A Tensor of size [batch_size, 4].
- params: A dictionary of parameters. Expects 'use_logging' and 'batch_size'.
Returns
A Tensor of size [batch_size], denoting the error between the quaternions.
A helper function to compute the mean error between batches of quaternions
tefla.core.losses.log_quaternion_loss (predictions, labels, batch_size, name='log_quaternion_loss')
The caller is expected to add the loss to the graph.
Args
- predictions: A Tensor of size [batch_size, 4].
- labels: A Tensor of size [batch_size, 4].
- params: A dictionary of parameters. Expects 'use_logging' and 'batch_size'.
Returns
A Tensor of size 1, denoting the mean error between batches of quaternions.
Adds noise to embeddings and recomputes classification loss
tefla.core.losses.random_perturbation_loss (embedded, length, loss_fn, perturb_norm_length=0.1)
Args
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- length: an int, length of the mask.
- loss_fn: a callable that returns the loss.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
The perturbation loss.
Adds gradient to embedding and recomputes classification loss
tefla.core.losses.adversarial_loss (embedded, loss, loss_fn, perturb_norm_length=0.1)
Args
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- loss: float, the loss.
- loss_fn: a callable that returns the loss.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
The adversarial loss.
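A minimal sketch showing the loss_fn contract, assuming TensorFlow 1.x; the toy classifier (mean-pool plus a linear layer) is a hypothetical stand-in for the real model:

```python
import tensorflow as tf
from tefla.core.losses import adversarial_loss

embedded = tf.random_normal([32, 20, 256])  # [batch, timesteps, embedding_dim]
labels = tf.one_hot(tf.random_uniform([32], maxval=2, dtype=tf.int32), depth=2)
w = tf.get_variable('clf_w', shape=[256, 2])

def loss_fn(emb):
    # Toy classifier: mean-pool over time, then a linear layer.
    logits = tf.matmul(tf.reduce_mean(emb, axis=1), w)
    return tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))

clean_loss = loss_fn(embedded)
adv_loss = adversarial_loss(embedded, clean_loss, loss_fn, perturb_norm_length=0.1)
```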
Virtual adversarial loss
tefla.core.losses.virtual_adversarial_loss (logits, embedded, labels, length, logits_from_embedding_fn, num_classes, num_power_iteration=1, small_constant_for_finite_diff=0.001, perturb_norm_length=0.1) Computes virtual adversarial perturbation by finite difference method and power iteration, adds it to the embedding, and computes the KL divergence between the new logits and the original logits.
Args
- logits: 2-D float Tensor, [num_timesteps*batch_size, m], where m=1 if num_classes=2, otherwise m=num_classes.
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- labels: 1-D Tensor, input labels.
- length: an int, input length.
- logits_from_embedding_fn: callable that takes embeddings and returns classifier logits.
- num_classes: num_classes for training.
- vocab_size: an int, vocabulary size of the problem.
- num_power_iteration: an int, the number of power iterations.
- small_constant_for_finite_diff: a float, small constant for the finite difference method.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
A float scalar, the KL divergence.
Adds noise to embeddings and recomputes classification loss for bidirectional rnn models
tefla.core.losses.random_perturbation_loss_brnn (embedded, length, loss_fn, perturb_norm_length=0.1)
Args
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- length: an int, length of the mask.
- loss_fn: a callable that returns the loss.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
The perturbation loss.
Adds gradient to embeddings and recomputes classification loss for bidirectional rnn models
tefla.core.losses.adversarial_loss_brnn (embedded, loss, loss_fn, perurb_norm_length=0.1)
Args
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- loss: float, the loss.
- loss_fn: a callable that returns the loss.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
The adversarial loss.
Virtual adversarial loss for bidirectional models
tefla.core.losses.virtual_adversarial_loss_brnn (logits, embedded, labels, length, logits_from_embedding_fn, vocab_size, num_classes, num_power_iteration=1, small_constant_for_finite_diff=0.001, perturb_norm_length=0.1) Computes virtual adversarial perturbation by finite difference method and power iteration, adds it to the embedding, and computes the KL divergence between the new logits and the original logits.
Args
- logits: 2-D float Tensor, [num_timesteps*batch_size, m], where m=1 if num_classes=2, otherwise m=num_classes.
- embedded: 3-D float Tensor, [batch_size, num_timesteps, embedding_dim].
- labels: 1-D Tensor, input labels.
- length: an int, input length.
- logits_from_embedding_fn: callable that takes embeddings and returns classifier logits.
- num_classes: num_classes for training.
- vocab_size: an int, vocabulary size of the problem.
- num_power_iteration: an int, the number of power iterations.
- small_constant_for_finite_diff: a float, small constant for the finite difference method.
- perturb_norm_length: a float, norm length of the adversarial perturbation to be optimized with validation.
Returns
A float scalar, the KL divergence.
Generate a mask for the EOS token (1.0 on EOS, 0.0 otherwise)
tefla.core.losses._end_of_seq_mask (tokens, vocab_size)
Args
- tokens: 1-D integer Tensor, [num_timesteps*batch_size]; each element is an id from the vocab.
- vocab_size: an int, vocabulary size of the problem.
Returns
A float 1-D Tensor with the same shape as tokens, whose values are 1.0 at the end of a sequence and 0.0 elsewhere.
Returns weighted KL divergence between distributions q and p
tefla.core.losses._kl_divergence_with_logits (q_logits, p_logits, weights, num_classes)
Args
- q_logits: logits for the 1st argument of the KL divergence, shape [num_timesteps * batch_size, num_classes] if num_classes > 2, and [num_timesteps * batch_size] if num_classes == 2.
- p_logits: logits for the 2nd argument of the KL divergence, with the same shape as q_logits.
- weights: 1-D float tensor with shape [num_timesteps * batch_size]; elements should be 1.0 only on ends of sequences.
- num_classes: an int, number of training classes.
Returns
A float scalar, the KL divergence.
Calculates the per-example cross-entropy loss for a sequence of logits, masking out all losses past the sequence length
tefla.core.losses.cross_entropy_sequence_loss (logits, targets, sequence_length)
Args
- logits: Logits of shape [T, B, vocab_size].
- targets: Target classes of shape [T, B].
- sequence_length: An int32 tensor of shape [B] corresponding to the length of each input.
Returns
A tensor of shape [T, B] that contains the loss per example, per time step.
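A minimal usage sketch, assuming TensorFlow 1.x; T, B, and vocab_size are illustrative:

```python
import tensorflow as tf
from tefla.core.losses import cross_entropy_sequence_loss

T, B, V = 10, 4, 1000
logits = tf.random_normal([T, B, V])
targets = tf.random_uniform([T, B], maxval=V, dtype=tf.int32)
sequence_length = tf.constant([10, 7, 5, 9], dtype=tf.int32)
loss = cross_entropy_sequence_loss(logits, targets, sequence_length)  # [T, B]
```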