Part 2 of the Clusterone TensorBoard tutorial. Learn how to output images to TensorBoard!
Welcome to part two of our TensorBoard tutorial. In the first part, we dug into the basics of TensorFlow’s visualization framework. We created a handsome graph for our neural network and summary plots for various parameters during training.
If you’ve missed part one, take a look at it here!
In this second part, you’re going to learn how to output images to TensorBoard. We’ll work with the MNIST dataset of handwritten digits. We will assess the performance of the trained network by comparing it to ourselves:
- How hard are the digits to read that the model misclassified?
- Maybe it’s not even the model’s fault because the numbers are really hard to read even for us?
- Or maybe we realize our network is especially bad at classifying a specific number?
Just like in the first part, this tutorial will explain step by step what you need to do to add image output to your TensorBoard. You can run the example code on your local machine and I’ll also explain how to run it on Clusterone, our deep learning platform.
Get the code!
All code in this tutorial can be found in our GitHub repository, together with a growing number of other machine learning examples and tutorials. To run the code, you should have Python 3.5 or higher installed, as well as TensorFlow and the Clusterone Python library. For installation instructions, check the README file in the GitHub repository.
What is an image in TensorFlow?
Before we dive into the code, it’s a good idea to understand what TensorFlow means when it says “image”. Since TensorFlow describes its whole world in tensors, an image is nothing else than that: a tensor.
More specifically, it’s a tensor of shape (x, y, c), where x and y are the dimensions of the image in pixels, while c is the number of channels. Grayscale images have one channel (0 for black, 255 for white), while color images have three (red, green, blue) or four (red, green, blue, alpha).
However, MNIST applies a simplification that makes its own life easier and ours a little more complicated. Because the relation of the pixels to each other doesn’t matter all that much for the digit recognition task, the images are reshaped into row vectors of shape (1, x*y). You can imagine this as simply lining up all pixels one after the other.
This modification is not a problem; we just need to keep in mind that we have to reverse this reshaping before displaying the image (we’ll do that later in the code!).
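To make the flattening and its reversal concrete, here is a minimal NumPy sketch. It is not taken from the tutorial repository: the array contents are made up, and the only facts it relies on are that MNIST digits are 28x28 pixels and grayscale (one channel).

```python
import numpy as np

# A made-up 28x28 grayscale image (MNIST digits are 28x28 pixels).
height, width = 28, 28
image = np.arange(height * width, dtype=np.float32).reshape(height, width)

# Flatten into a row vector by lining up the pixels row after row.
flat = image.reshape(1, height * width)

# Reverse the reshaping and add a channel axis for a grayscale image.
restored = flat.reshape(height, width, 1)

print(flat.shape)      # (1, 784)
print(restored.shape)  # (28, 28, 1)
```

Because reshaping only changes how the same pixels are indexed, the restored image is identical to the original one.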
To display an image in TensorBoard, we need to create a summary from it.
TensorBoard summaries are created just like any output in TensorFlow: an input variable is passed to some operation. This operation is then executed in a TensorFlow session that produces an output. For TensorBoard summaries, this output is then added to a tf.summary.FileWriter object.
Time to get coding!
Let’s dive into the code. We reuse the model from part one of this tutorial and simply add the necessary steps to display the images. We start with the input variable by creating a tf.placeholder called tb_images. Note that the None value for the first dimension means that the input tensor can have any size along that dimension.
To check the predictions, the tf.equal() function tests whether each value in predictions_op is equal to the corresponding value in labels (the list of correct classifications for the digits). Its output is a vector of boolean values indicating, for each MNIST image, whether the prediction was correct or incorrect.
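The element-wise comparison can be tried out without a TensorFlow session. The following sketch uses NumPy's np.equal as a stand-in for tf.equal; the prediction and label values are made up for illustration, and the variable names correct and wrong_positions are hypothetical.

```python
import numpy as np

# Made-up predictions and true labels for six test images.
predictions = np.array([7, 2, 1, 0, 4, 9])
labels = np.array([7, 2, 1, 0, 4, 8])

# Element-wise comparison: True where the prediction matches the label.
correct = np.equal(predictions, labels)

# Positions of the misclassified images in the test set.
wrong_positions = np.where(~correct)[0]

print(correct)          # [ True  True  True  True  True False]
print(wrong_positions)  # [5]
```

The positions of the False entries are what we need later to look up the misclassified images in the test data.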
After the last training step, we run the correct_predictions_op operation and feed it the test data set.
The output of that run is called pred_bool and is passed to a function called get_misclassified_images(), which... well... gets all the incorrectly classified images.
More specifically, it creates a summary operation that contains all these images, as well as an additional image for each misclassified one, showing the number that the network predicted. The output from running that summary operation is then passed to our existing test file writer.
Here’s how the get_misclassified_images() function works.
First, all incorrect predictions are extracted into wrong_predictions, together with their position in the original test data tensor. Then we create a matrix called images to hold all the misclassified images.
In the loop, the digit that the model predicted is written to predicted_number using the OpenCV library. Then the original MNIST image is stored in mnist_number. Both images are appended together and stored in the images matrix.
We still need to reshape the matrix and turn the rows of pixels back into images with the correct width and height. Then we create a summary operation from the image matrix using tf.summary.image.
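The steps described above can be sketched in plain NumPy. This is an illustration under assumptions, not the function from the repository: collect_misclassified is a hypothetical name, the test images are random noise, and a plain gray square stands in for the predicted digit that the tutorial draws with OpenCV. Unlike the description above, the sketch reshapes each image before storing it rather than afterwards.

```python
import numpy as np

SIDE = 28  # MNIST images are 28x28 pixels


def collect_misclassified(flat_images, pred_bool, side=SIDE):
    """Return side-by-side panels for every misclassified image.

    flat_images: array of shape (n, side*side), one flattened image per row.
    pred_bool:   boolean array of shape (n,), True where the prediction was right.
    """
    wrong_positions = np.where(~np.asarray(pred_bool))[0]
    panels = []
    for pos in wrong_positions:
        # Turn the row of pixels back into a square image.
        mnist_number = flat_images[pos].reshape(side, side)
        # Placeholder for the rendered prediction: a plain gray square.
        # (The tutorial draws the predicted digit here with OpenCV.)
        predicted_number = np.full((side, side), 0.5, dtype=flat_images.dtype)
        # Original digit on the left, prediction panel on the right.
        panels.append(np.hstack([mnist_number, predicted_number]))
    if not panels:
        return np.zeros((0, side, 2 * side, 1), dtype=flat_images.dtype)
    # Stack into (number of images, height, width, channels).
    return np.stack(panels)[..., np.newaxis]


# Five made-up "images" of random noise; two of them count as misclassified.
rng = np.random.default_rng(0)
flat_images = rng.random((5, SIDE * SIDE)).astype(np.float32)
pred_bool = np.array([True, False, True, False, True])

wrong_images = collect_misclassified(flat_images, pred_bool)
print(wrong_images.shape)  # (2, 28, 56, 1)
```

The result has one entry per misclassified image, each twice as wide as a single digit, with a trailing channel axis for grayscale.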
A closer look at tf.summary.image
Before moving on, let’s have a closer look at tf.summary.image. The input tensor, called wrong_images in this case, needs to have four dimensions:
- The number of images
- The pixels in vertical direction (height)
- The pixels in horizontal direction (width)
- The number of color channels (1, 3, or 4)
tf.summary.image itself takes an identifier (we call it image), the tensor of images, and the maximum number of images to output.
This can be any number (the default is 3). Limiting the number of images is useful because TensorBoard may get really slow when it has to load a huge number of images. Feel free to experiment with more images and see how quickly your TensorBoard loads them.
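Before handing a batch to tf.summary.image, it can help to check that it has the expected layout. The sketch below is a hypothetical NumPy helper, not part of TensorFlow or of the tutorial code: check_image_batch verifies the four dimensions and the channel count, and mimics the cap on the number of images by keeping at most max_outputs of them. The batch itself is made-up data.

```python
import numpy as np


def check_image_batch(batch, max_outputs=3):
    """Validate a batch of images and keep at most max_outputs of them."""
    batch = np.asarray(batch)
    if batch.ndim != 4:
        raise ValueError("expected 4 dimensions, got %d" % batch.ndim)
    if batch.shape[-1] not in (1, 3, 4):
        raise ValueError("expected 1, 3, or 4 channels, got %d" % batch.shape[-1])
    # Keep only the first max_outputs images.
    return batch[:max_outputs]


# Ten made-up grayscale images, each 28 pixels high and 56 pixels wide.
wrong_images = np.zeros((10, 28, 56, 1), dtype=np.float32)
shown = check_image_batch(wrong_images, max_outputs=3)
print(shown.shape)  # (3, 28, 56, 1)
```

A batch that is still flattened, for example of shape (10, 1568), would raise a ValueError here, which is a quick way to notice a forgotten reshape.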
Run the code and fire up TensorBoard
That’s it! Run the code from our GitHub repository (or your own, of course) and start up TensorBoard:
$ tensorboard --logdir /path/to/your/logs
Make sure to point TensorBoard to where you store the logs. Our example code creates a new subdirectory called logs in the code folder.
When you navigate to TensorBoard in the browser, you’ll see a tab called “images”. This is where your images appear! Please note that TensorBoard sometimes takes a while to load all the data from the log files. If there’s no “images” tab just yet, give it a few seconds.
Take a look at the MNIST images above that the model failed to classify correctly. The digits on white are the original MNIST images that the model was given as input. The numbers on gray on the right of each digit are what was predicted. Can you say with confidence what the correct prediction would be?
As you can see in my results, some digits are obvious, but others are harder to get right. For instance, take a look at the number at the bottom left. To me, that looks like a “9”, just like the model predicted. But it’s actually an “8”. Huh..
Or what about the rightmost images on the second and third line? Both show the number “2”, but it’s easy to see how our model could think it was a “7” instead.
TensorBoard on Clusterone
When running your TensorFlow code on our deep learning platform Clusterone, using TensorBoard becomes even easier. Clusterone supports TensorBoard natively for all jobs you run on the platform. You can even display the output of various jobs together to compare.
Once your job is running, open the Matrix, Clusterone’s graphical web interface. Find your job and click the “Add to TensorBoard” button. Now click the “TensorBoard” button at the top.
TensorBoard will open in a new tab and after a few seconds, your images will appear in the “images” tab. Easy, right?