Dataflow handling class

tefla.dataset.dataflow.Dataflow (dataset, num_readers=1, shuffle=True, num_epochs=None, min_queue_examples=1024, capacity=2048)

Args

dataset: an instance of the dataset class
num_readers: number of readers used to read the dataset
shuffle: a bool, whether to shuffle the dataset
num_epochs: total number of epochs for training or validation
min_queue_examples: minimum number of items that remain in the queue after a dequeue
capacity: total queue capacity
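
The interaction between min_queue_examples and capacity can be pictured with a small pure-Python sketch. This is not tefla's implementation: the function name shuffle_buffer and the parameter name min_after_dequeue are hypothetical, and the sketch assumes min_queue_examples acts as a floor below which the queue is not drained while input remains.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def shuffle_buffer(items: Iterable[T], capacity: int = 2048,
                   min_after_dequeue: int = 1024, seed: int = 0) -> Iterator[T]:
    """Yield items in randomized order through a bounded buffer (illustrative only)."""
    if not 0 <= min_after_dequeue < capacity:
        raise ValueError("need 0 <= min_after_dequeue < capacity")
    rng = random.Random(seed)
    source = iter(items)
    buffer: List[T] = []
    exhausted = False
    while True:
        # Producer side: fill the buffer up to its capacity.
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if not buffer:
            return
        # Consumer side: keep at least min_after_dequeue items buffered
        # while input remains; drain fully once the input is exhausted.
        floor = 0 if exhausted else min_after_dequeue
        while len(buffer) > floor:
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
```

A larger floor gives better mixing because each dequeue draws from more candidates, at the cost of more memory and a longer warm-up before the first item is available.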

Methods

batch_inputs (batch_size, train, tfrecords_image_size, crop_size, im_size=None, bbox=None, image_preprocessing=None, num_preprocess_threads=16)

Args

dataset: instance of Dataset class specifying the dataset. See dataset.py for details.
batch_size: integer
train: boolean
crop_size: training-time image size, an int or tuple
tfrecords_image_size: a list with original image size used to encode image in tfrecords e.g.: [width, height, channel]
image_preprocessing: a function to preprocess the image
num_preprocess_threads: integer, total number of preprocessing threads

Returns

images: 4-D float Tensor of a batch of images
labels: 1-D integer Tensor of [batch_size].

get (items, image_size, resize_size=None)

Args

items: a list, with items to get from the dataset e.g.: ['image', 'label']
image_size: a list with original image size e.g.: [width, height, channel]
resize_size: if image resize required, provide a list of width and height e.g.: [width, height]

get_batch (batch_size, target_probs, image_size, resize_size=None, crop_size=[32, 32, 3], image_preprocessing=None, num_preprocess_threads=32, init_probs=None, enqueue_many=True, queue_capacity=2048, threads_per_queue=4, name='balancing_op', data_balancing=True)

Stochastically creates batches based on per-class probabilities. This method discards examples. Internally, it creates one queue to amortize the cost of disk reads, and one queue to hold the properly-proportioned batch.

Args

batch_size: an int, the batch size
target_probs: probabilities of class samples to be present in the batch
image_size: a list with original image size e.g.: [width, height, channel]
resize_size: if image resize required, provide a list of width and height e.g.: [width, height]
init_probs: initial class probabilities of the data samples in the first batch
enqueue_many: bool, if true, interpret input tensors as having a batch dimension.
queue_capacity: Capacity of the large queue that holds input examples.
threads_per_queue: Number of threads for the large queue that holds input examples and for the final queue with the proper class proportions.
name: an optional scope/name of the op
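
The idea of stochastically building batches from per-class probabilities while discarding examples can be sketched with plain rejection sampling. This is a minimal illustration, not tefla's code: the helper names acceptance_probs and balanced_batches are hypothetical, the two internal queues are omitted, and the acceptance rule (ratio of target to initial probability, scaled so the largest ratio is 1) is an assumption about how such balancing is commonly done.

```python
import random
from typing import Iterable, Iterator, List, Sequence, Tuple


def acceptance_probs(init_probs: Sequence[float],
                     target_probs: Sequence[float]) -> List[float]:
    """Per-class probability of keeping an example (hypothetical helper)."""
    if len(init_probs) != len(target_probs):
        raise ValueError("init_probs and target_probs must have equal length")
    if any(p <= 0 for p in init_probs):
        raise ValueError("every class needs a positive initial probability")
    ratios = [t / i for t, i in zip(target_probs, init_probs)]
    max_ratio = max(ratios)
    # Scale so the most under-represented class is always kept.
    return [r / max_ratio for r in ratios]


def balanced_batches(examples: Iterable[Tuple[object, int]], batch_size: int,
                     init_probs: Sequence[float], target_probs: Sequence[float],
                     seed: int = 0) -> Iterator[List[Tuple[object, int]]]:
    """Yield batches whose class mix approaches target_probs (illustrative only)."""
    rng = random.Random(seed)
    accept = acceptance_probs(init_probs, target_probs)
    batch: List[Tuple[object, int]] = []
    for example, label in examples:
        # Rejected examples are simply discarded.
        if rng.random() < accept[label]:
            batch.append((example, label))
            if len(batch) == batch_size:
                yield batch
                batch = []
    # A trailing partial batch is dropped in this sketch.
```

With init_probs of [0.9, 0.1] and target_probs of [0.5, 0.5], roughly eight of every nine majority-class examples are discarded, which is why a large input queue helps amortize the read cost.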

prefetch (tensor_dict, capacity)

Creates a FIFO queue to asynchronously enqueue tensor_dicts and returns a dequeue op that evaluates to a tensor_dict. This function is useful in prefetching preprocessed tensors so that the data is readily available for consumers.

Args

tensor_dict: a dictionary of tensors to prefetch.
capacity: the size of the prefetch queue.

Returns

a FIFO prefetcher queue
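
The prefetching pattern itself, a bounded FIFO queue filled by a background producer, can be sketched with the Python standard library. This is an analogy rather than the tefla or TensorFlow implementation: prefetch_iter is a hypothetical name, and plain dictionaries stand in for tensor_dicts.

```python
import queue
import threading
from typing import Dict, Iterable, Iterator

_DONE = object()  # sentinel marking the end of the source


def prefetch_iter(source: Iterable[Dict], capacity: int) -> Iterator[Dict]:
    """Iterate over source while a background thread keeps a FIFO queue filled."""
    buf: queue.Queue = queue.Queue(maxsize=capacity)

    def producer() -> None:
        # put() blocks when the queue is full, so at most `capacity`
        # items are held ahead of the consumer.
        for item in source:
            buf.put(item)
        buf.put(_DONE)

    # Start filling immediately, before the consumer asks for anything.
    threading.Thread(target=producer, daemon=True).start()

    def consumer() -> Iterator[Dict]:
        while True:
            item = buf.get()
            if item is _DONE:
                return
            yield item

    return consumer()
```

Items come out in the order they were enqueued. The sketch does not forward exceptions raised in the producer thread, so a failing source would leave the consumer waiting.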