Part 1 of the Clusterone TensorBoard tutorial. Learn how to output graphs, scalar plots, and histograms with a few lines of code, based on a hands-on example.
Machine learning — and deep learning in particular — can learn to solve problems that were deemed impossible to solve for computers not long ago. But it also adds a new layer of “obscurity”: neural networks can be complex and confusing. Sometimes it’s even hard to say why something worked well or failed horribly.
To counter this issue, TensorFlow comes with a visualization tool called TensorBoard. In TensorBoard you can plot all kinds of parameters in your model as it runs. You can create a neat-looking graph of your neural network, making it easy to grasp its structure. You can even output audio and images, and explore high-dimensional data such as embeddings.
In short, TensorBoard lets you look deep into the inner workings of your model as it runs.
But all these wonderful insights don’t just show up in TensorBoard out of nothing. You have to provide this information to TensorBoard from your code. In this tutorial, we’re going to look at how that’s done.
We’ll start with a simple bare-bone example of MNIST handwritten digit recognition without TensorBoard support and add the code necessary to create beautiful output in TensorBoard.
In the second part of the tutorial, we’re going to collect images that the network misclassified and print them to TensorBoard. This way we can see how our network performs in comparison to a human — us!
Get the Code!
All code in this tutorial can be found in our GitHub repository, together with a growing number of other machine learning examples and tutorials.
To run the code, you should have Python 3.5 or higher installed, as well as TensorFlow and the Clusterone Python library. For installation instructions, check the README file in the GitHub repository.
The Starting Line: Barebone MNIST
As a starting point, I have created a very basic implementation of handwritten digit classification using the MNIST dataset, sort of the “hello world” of machine learning.
My implementation closely resembles the implementation in the TensorFlow beginner tutorial, but I have rewritten it to serve as a basis for adding TensorBoard support. The network consists of two fully connected hidden layers with ReLU activations, followed by a linear output layer. The loss is the softmax cross-entropy of the output, and TensorFlow’s GradientDescentOptimizer takes care of the learning. See below for a graphical representation of the network (yes, that’s straight out of TensorBoard!).
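In code, a network of this shape might look roughly like the sketch below. It is not the contents of main_bare.py: the layer sizes, the learning rate, and the `dense` helper are my own assumptions, and random numbers stand in for the MNIST images so the snippet runs without downloading anything. It imports `tensorflow.compat.v1` so that the TensorFlow 1.x-style calls used in this tutorial also work on a TensorFlow 2.x install.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # build a static graph, TensorFlow 1.x style
tf.reset_default_graph()

# Hypothetical layer sizes; main_bare.py may use different ones.
N_INPUT, N_HIDDEN1, N_HIDDEN2, N_CLASSES = 784, 128, 32, 10


def dense(inputs, n_in, n_out, activation=None):
    """Fully connected layer: activation(inputs @ weights + biases)."""
    weights = tf.Variable(tf.truncated_normal([n_in, n_out], stddev=0.1))
    biases = tf.Variable(tf.zeros([n_out]))
    pre_activation = tf.matmul(inputs, weights) + biases
    return activation(pre_activation) if activation else pre_activation


x = tf.placeholder(tf.float32, [None, N_INPUT])
y = tf.placeholder(tf.int64, [None])

hidden1 = dense(x, N_INPUT, N_HIDDEN1, tf.nn.relu)
hidden2 = dense(hidden1, N_HIDDEN1, N_HIDDEN2, tf.nn.relu)
logits = dense(hidden2, N_HIDDEN2, N_CLASSES)  # linear output layer

# Softmax cross-entropy loss, minimized with plain gradient descent
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)

correct = tf.equal(tf.argmax(logits, 1), y)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

# Random stand-in batch instead of real MNIST data
rng = np.random.RandomState(0)
batch_x = rng.rand(64, N_INPUT).astype(np.float32)
batch_y = rng.randint(0, N_CLASSES, size=64)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    _, loss_value, acc_value = sess.run(
        [train_op, loss, accuracy], feed_dict={x: batch_x, y: batch_y})

print("loss:", loss_value, "accuracy:", acc_value)
```

With random labels the accuracy is meaningless, of course. The point is only the structure: two ReLU layers, a linear output, a softmax cross-entropy loss, and a gradient descent step.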
The barebone implementation can be found in the GitHub repository as main_bare.py. Feel free to run it.
Now, just for the heck of it, let’s launch TensorBoard. You can do that by running this in your console:
tensorboard --logdir .
Open the browser and navigate to the URL provided by the tensorboard command. As expected, your TensorBoard is sad and empty. So let’s change that!
Creating a Network Graph with tf.summary.FileWriter
So, how do we get information about our network into TensorBoard? We write data about the network to a log directory and point TensorBoard there. When TensorBoard runs, it automagically figures out what data is available in the logs and displays it.
To write log files, TensorFlow provides the tf.summary.FileWriter object. We’ll create one and pass it the target directory and the graph we want to save:
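A minimal sketch of this step, assuming the TensorFlow 1.x-style API (imported here through `tensorflow.compat.v1`) and a toy graph of two constants in place of the MNIST network. The tutorial writes to logs/; I use a temporary directory so the snippet leaves nothing behind. The scalar summary is only there so that `tf.summary.merge_all()` has something to merge, because it returns None when no summaries have been defined.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # FileWriter needs graph mode
tf.reset_default_graph()

# Toy graph standing in for the real network
a = tf.constant(2.0, name="a")
b = tf.constant(3.0, name="b")
total = tf.add(a, b, name="total")
tf.summary.scalar("total", total)  # gives merge_all() something to collect

# The tutorial uses "logs"; a temp dir keeps this sketch side-effect free.
log_dir = os.path.join(tempfile.mkdtemp(), "logs")

with tf.Session() as sess:
    # Target directory plus the graph we want to see in TensorBoard
    writer = tf.summary.FileWriter(log_dir, sess.graph)

    # One op that evaluates every summary defined so far
    merged = tf.summary.merge_all()

    summary = sess.run(merged)
    writer.add_summary(summary, global_step=0)
    writer.close()

print("Point TensorBoard at:", log_dir)
```

The `writer.add_summary()` call is what actually stores the evaluated summaries in the event file. For the graph alone, creating the writer with `sess.graph` is enough.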
The tf.summary.merge_all() call creates a single operation that we can run using sess.run(). This isn’t necessary for the graph yet, but we’ll need it when we want to display other data in TensorBoard later.
If you now run the code again and invoke TensorBoard (make sure to use --logdir logs/), you’ll see that there is a graph of your network! But it doesn’t look half as nice as the one I showed you above. That’s because we haven’t used scopes yet!
TensorFlow offers the tf.name_scope() function to group components of the network together under a common name. For example, we can create a scope “hidden1” for our first hidden layer:
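Here is a small sketch of such a scope, assuming TensorFlow 1.x-style calls via `tensorflow.compat.v1`. The layer size of 128 units is a hypothetical choice of mine.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()

x = tf.placeholder(tf.float32, [None, 784])

# Everything created inside the block is grouped under "hidden1"
with tf.name_scope("hidden1"):
    weights = tf.Variable(tf.truncated_normal([784, 128], stddev=0.1))
    biases = tf.Variable(tf.zeros([128]))
    hidden1 = tf.nn.relu(tf.matmul(x, weights) + biases)

# Op and variable names now carry the scope as a prefix
print(weights.name)
print(hidden1.name)
```

Both printed names start with `hidden1/`, and that prefix is what TensorBoard uses to draw the layer as a single collapsible block.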
This will look nice, but it will look even nicer if we describe our network in more detail. So let’s also add scopes for the weight and bias variables, as well as the activation function. We also give our variables names using the optional “name” parameter.
We do the same with the other hidden layer and the output layer, as well as our input placeholder objects. In a creative outburst, we’ll call them “input”! The softmax loss, the training operation, and the accuracy evaluation get equally creative scope names assigned.
Now every component of the network has been named and some components have been grouped together as layers.
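Put together, a fully scoped network could look something like the sketch below. The `fc_layer` helper, the layer sizes, and the scope names other than “hidden1” and “input” are assumptions of mine, so main_tensorboard_graph.py may well organize things differently. As before, it uses the TensorFlow 1.x-style API through `tensorflow.compat.v1`.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()


def fc_layer(inputs, n_in, n_out, scope, activation=None):
    """Fully connected layer whose parts each get their own name scope."""
    with tf.name_scope(scope):
        with tf.name_scope("weights"):
            weights = tf.Variable(
                tf.truncated_normal([n_in, n_out], stddev=0.1), name="weights")
        with tf.name_scope("biases"):
            biases = tf.Variable(tf.zeros([n_out]), name="biases")
        with tf.name_scope("Wx_plus_b"):
            pre_activation = tf.matmul(inputs, weights) + biases
        if activation is None:
            return pre_activation  # linear layer, no activation scope
        with tf.name_scope("activation"):
            return activation(pre_activation)


with tf.name_scope("input"):
    x = tf.placeholder(tf.float32, [None, 784], name="x")
    y = tf.placeholder(tf.int64, [None], name="y")

hidden1 = fc_layer(x, 784, 128, "hidden1", tf.nn.relu)
hidden2 = fc_layer(hidden1, 128, 32, "hidden2", tf.nn.relu)
logits = fc_layer(hidden2, 32, 10, "output")

with tf.name_scope("loss"):
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

with tf.name_scope("train"):
    train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.name_scope("accuracy"):
    correct = tf.equal(tf.argmax(logits, 1), y)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

# Top-level scopes are the blocks TensorBoard shows in the graph view
graph_ops = tf.get_default_graph().get_operations()
scopes = sorted({op.name.split("/")[0] for op in graph_ops})
print(scopes)
```

Wrapping the scopes in a helper keeps the three layers consistent, which makes the resulting graph easier to read.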
This stage of the code is stored in main_tensorboard_graph.py. Run it! When the script is done, start TensorBoard:
$ tensorboard --logdir logs
Now there’s a “Graph” tab at the top. Click it and behold the pretty graph of your neural network!
You can double-click on each block of the graph to see what’s going on inside of it. This representation of the network can be really useful to figure out what’s really happening and to spot errors in the setup of the network.
Adding Scalar Summaries
Let’s take a look at how to add plots of different parameters of the network. The most obvious parameter to plot is the accuracy parameter that is already printed out to the console. But there are a few more that can help you to understand your network better.
While the accuracy parameter should increase over time, the loss should decrease. But both accuracy and loss only describe what’s going on at the output end of the network. What about the hidden layers? To get better insight into the inner workings of our hidden layers, we use a little function that produces a few statistical values and call it for each weight and bias variable in our layers.
As you can see, we record the mean, the standard deviation, as well as the maximum and minimum value of the input variable. I have borrowed this function from the TensorFlow tutorial on TensorBoard.
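A function along these lines could look like the following sketch. It is modeled on the description above and on the helper from the TensorFlow tutorial, written against `tensorflow.compat.v1`. The toy variable with four known values is my own addition so the recorded statistics are easy to verify.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()


def variable_summaries(var):
    """Attach mean, stddev, max and min scalar summaries to a tensor."""
    with tf.name_scope("summaries"):
        mean = tf.reduce_mean(var)
        stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar("mean", mean)
        tf.summary.scalar("stddev", stddev)
        tf.summary.scalar("max", tf.reduce_max(var))
        tf.summary.scalar("min", tf.reduce_min(var))


# Toy variable with known statistics (mean 2.5, max 4, min 1)
with tf.name_scope("weights"):
    weights = tf.Variable([1.0, 2.0, 3.0, 4.0], name="weights")
    variable_summaries(weights)

merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    serialized = sess.run(merged)

# Decode the serialized Summary protobuf to inspect what was recorded
stats = {v.tag: v.simple_value for v in tf.Summary.FromString(serialized).value}
for tag in sorted(stats):
    print(tag, stats[tag])
```

TensorFlow may append a numeric suffix to a tag if the name is already taken in the graph, so don’t be surprised if a tag shows up as something like `mean_1` in TensorBoard.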
Now we can call variable_summaries() for each weight and bias Variable object in our code.
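To show how the pieces fit together, here is a sketch that calls the helper for the weights and biases of a layer and writes the merged summaries at every training step. To keep it short I use a single output layer and random stand-in data instead of the full MNIST network, and a temporary log directory instead of logs/. All of those are assumptions of mine, not what main_tensorboard.py does. The helper is repeated so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()


def variable_summaries(var):
    """Mean, stddev, max and min scalars for one tensor."""
    with tf.name_scope("summaries"):
        mean = tf.reduce_mean(var)
        tf.summary.scalar("mean", mean)
        tf.summary.scalar("stddev", tf.sqrt(tf.reduce_mean(tf.square(var - mean))))
        tf.summary.scalar("max", tf.reduce_max(var))
        tf.summary.scalar("min", tf.reduce_min(var))


x = tf.placeholder(tf.float32, [None, 784], name="x")
y = tf.placeholder(tf.int64, [None], name="y")

with tf.name_scope("output"):
    with tf.name_scope("weights"):
        weights = tf.Variable(
            tf.truncated_normal([784, 10], stddev=0.1), name="weights")
        variable_summaries(weights)
    with tf.name_scope("biases"):
        biases = tf.Variable(tf.zeros([10]), name="biases")
        variable_summaries(biases)
    logits = tf.matmul(x, weights) + biases

with tf.name_scope("loss"):
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
    tf.summary.scalar("loss", loss)

train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
merged = tf.summary.merge_all()  # one op for all summaries defined above

# Random stand-in data instead of real MNIST batches
rng = np.random.RandomState(0)
data_x = rng.rand(64, 784).astype(np.float32)
data_y = rng.randint(0, 10, size=64)

log_dir = tempfile.mkdtemp()  # the tutorial uses "logs"
with tf.Session() as sess:
    writer = tf.summary.FileWriter(log_dir, sess.graph)
    sess.run(tf.global_variables_initializer())
    for step in range(5):
        summary, _ = sess.run(
            [merged, train_op], feed_dict={x: data_x, y: data_y})
        writer.add_summary(summary, global_step=step)  # one point per plot
    writer.close()

print("Logs written to", log_dir)
```

Passing the step number as `global_step` is what gives the scalar plots their x-axis.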
Try it out by running main_tensorboard.py from the tutorial repository. Invoking TensorBoard afterwards shows us that several plots have appeared on the “Scalars” page. Awesome!
And Now for Histograms
Another useful type of summary in TensorBoard is the histogram. Histograms are created just as easily as scalars:
tf.summary.histogram() adds a histogram of a specific variable to the summary.
In this tutorial, we add a histogram for each component of our network. The easiest way to do this is by simply adding a histogram line to the variable_summaries() function. We’ll also add another histogram summary for the activation function of each layer, since the activations are not covered by variable_summaries().
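As a rough sketch of what this could look like, the snippet below attaches histograms to the weights, biases, and activations of one hidden layer. It uses `tensorflow.compat.v1`, and the layer size and tag names are hypothetical. For brevity the histogram calls sit directly in the layer code. The same `tf.summary.histogram()` line could go at the end of variable_summaries() instead.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()

x = tf.placeholder(tf.float32, [None, 784], name="x")

with tf.name_scope("hidden1"):
    with tf.name_scope("weights"):
        weights = tf.Variable(
            tf.truncated_normal([784, 128], stddev=0.1), name="weights")
        tf.summary.histogram("histogram", weights)
    with tf.name_scope("biases"):
        biases = tf.Variable(tf.zeros([128]), name="biases")
        tf.summary.histogram("histogram", biases)
    with tf.name_scope("activation"):
        hidden1 = tf.nn.relu(tf.matmul(x, weights) + biases)
        # Activations are not variables, so they get their own histogram
        tf.summary.histogram("activations", hidden1)

merged = tf.summary.merge_all()

# Random stand-in input instead of real MNIST images
batch = np.random.RandomState(0).rand(8, 784).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    serialized = sess.run(merged, feed_dict={x: batch})

# Collect the tags of all histogram entries in the evaluated summary
histogram_tags = [
    v.tag for v in tf.Summary.FromString(serialized).value if v.HasField("histo")
]
print(histogram_tags)
```

Note that the activation histogram depends on the input, so the merged summary op needs a `feed_dict` just like the training step does.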
Now, when we run the code again and start TensorBoard, a new tab named “Histograms” appears at the top:
Okay, you may say. These histogram plots do look really beautiful, but what are they supposed to tell me? For those not familiar with histograms, here’s a quick intro: a histogram shows how a set of values is distributed, and TensorBoard stacks one histogram per recorded step so you can watch the distribution change over time. Each slice in a histogram here is a snapshot in time. Take for example the bias histogram at the top right of the image above. The slice in the front tells us that right when training ended, the biases in the layer were distributed between -0.05 and 0.1, with a few outliers outside of this range. Looking at the darker slices farther back means going back in time. Earlier, the biases were distributed more uniformly between -0.05 and 0.05. Comparing the two, we see that the biases have changed during training, which means the layer has learned something.
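If the idea of “slices” still feels abstract, this little NumPy sketch may help. It computes one histogram per step for made-up bias values. The numbers are synthetic and only loosely mimic the ranges mentioned above; they are not taken from the actual training run.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic stand-ins for bias values logged at three training steps
snapshots = [
    (0, rng.uniform(-0.05, 0.05, size=1000)),   # early: flat and narrow
    (500, rng.normal(0.01, 0.03, size=1000)),   # later: starting to spread
    (1000, rng.normal(0.02, 0.04, size=1000)),  # end of training
]

# Shared bin edges make the slices comparable with each other
bin_edges = np.linspace(-0.2, 0.2, 9)

slices = []
for step, values in snapshots:
    counts, _ = np.histogram(values, bins=bin_edges)
    slices.append((step, counts))
    print("step %4d:" % step, counts)
```

Each printed row corresponds to one slice in the TensorBoard plot: the same bins, counted again at a later step.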
As another example, let’s take the weights at the bottom. Here we see that the values haven’t changed all that much: their distribution has shifted only slightly over the course of training.
For an intro to histograms from the folks at TensorFlow, check out this link.
TensorBoard on Clusterone
When running your machine learning code on our online platform, using TensorBoard becomes even easier. Clusterone supports TensorBoard natively for all jobs you run on the platform. You can even display the output of various jobs together to compare.
Try it out by running this tutorial on Clusterone. To run the code, simply follow the setup instructions in the GitHub repository.
Once your job is running, open the Matrix, Clusterone’s graphical web interface. Find your job and click the “Add to TensorBoard” button. Now click the “TensorBoard” button at the top. TensorBoard will open in a new tab and your summaries will appear. Voila!
In this example, we have taken a barebone neural network to learn handwritten digit recognition and added TensorBoard summaries to be able to see what’s going on inside our creation. Having these visualizations greatly improves our ability to judge how our network is performing. Even more, if something isn’t going well, TensorBoard summaries can help us to understand the reasons and improve our network.
To download the source code for this example, check out our GitHub.
In the next part of this tutorial series, we are going to discuss how you can add images to TensorBoard. In particular, we will output handwritten digits that our model has failed to classify correctly.