Wednesday, March 4, 2020

Understand the history and evolution of TensorFlow by revisiting TensorFlow 1.0, Part 1

TensorFlow 2.0 has been available since last year, and it is a completely different universe from its predecessor, TensorFlow 1.0. Even in 2020, though, it is important to understand the history and evolution of the TF library: how it got here, and why it chose Keras as its high-level API. It is also important to understand what a compute graph is, since it is a super useful concept in deep learning and you can still visualize and inspect it in TensorBoard.

Let's go back in time and talk about TensorFlow 1.0: its data flow graph, the Session object in which everything was executed, its C++ backend, and how it handled parallel computing. It offered both Python and C++ APIs. Though it had a steep learning curve at the time of its release, it was production ready and powerful, and had already been used internally at Google before being open-sourced. It supported CPU, GPU, and distributed processing in clusters. Its focus is on deep learning with neural networks, whereas scikit-learn focuses on traditional machine learning algorithms.

What is a data flow graph? It is a very important computer science concept. The nodes represent math operations, and the edges are multi-dimensional arrays called tensors, which flow through the graph, hence the name TensorFlow! See, the history is important. Using the graph, we can easily visualize neural networks. NumPy and scikit-learn would not give that result.
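To make the idea concrete, here is a toy data flow graph in plain Python. It is only a sketch of the concept and not how TensorFlow is implemented: the `Node` class and the `constant`, `add`, `multiply`, and `run` helpers are hypothetical names invented for this illustration. Note that building the graph computes nothing; the arithmetic only happens when `run` is called.

```python
# Toy data flow graph in plain Python (illustrative only, not TensorFlow code).

class Node:
    """A graph node: an operation plus the nodes whose outputs feed into it."""

    def __init__(self, op, inputs=(), name=""):
        self.op = op                # function that computes this node's output
        self.inputs = list(inputs)  # incoming edges (upstream nodes)
        self.name = name


def constant(value, name=""):
    # A node with no inputs that always outputs the same value.
    return Node(lambda: value, name=name)


def add(a, b, name="add"):
    return Node(lambda x, y: x + y, [a, b], name)


def multiply(a, b, name="mul"):
    return Node(lambda x, y: x * y, [a, b], name)


def run(node):
    """Evaluate a node by first evaluating everything upstream of it."""
    args = [run(n) for n in node.inputs]
    return node.op(*args)


# Building the graph performs no arithmetic yet.
a = constant(2.0, "a")
b = constant(3.0, "b")
c = constant(4.0, "c")
result_node = multiply(add(a, b), c)

# Only now do values flow along the edges: (2 + 3) * 4.
result = run(result_node)
print(result)  # 20.0
```

Because the graph exists as a data structure before anything runs, a tool can walk it and draw it, which is the idea behind graph visualization in TensorBoard.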

The frustrating part is that the graph needs to be built first, before running it in a Session. This is where the learning curve got steep: it was hard to prototype and iterate, and it required a bit more math and architecture skill than just engineering and coding.

A quick note on tensors, which are also a concept in math and relativity. In this case, a tensor just means a multi-dimensional array of numbers, usually with more than two dimensions (a 2D tensor is a matrix). It supports automatic gradient computation, can be moved to the CPU or GPU, and can parallelize vector computation where possible. Technically, even a vector (1D) or a single number (0D) is a tensor.

In deep learning, we usually have to convert data such as text or images into numbers, and we usually represent them using tensors. Each image, for example, is a 3-dimensional tensor with red, green, and blue channels. Each channel is a matrix corresponding to the width and height of the image, with each element representing the pixel brightness at that (w, h) coordinate. This is called a feature matrix, aka a feature tensor, though no one calls it a tensor in this case.
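The ranks described above can be sketched with nested Python lists. This is an illustration only: the `shape` helper and the tiny 2x2 image are made up for the example, and the channels-first layout (channel, height, width) simply follows the description above. Many libraries store images as (height, width, channel) instead.

```python
def shape(t):
    """Return the shape of a regularly nested list as a tuple."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


scalar = 7                          # 0D tensor: a single number
vector = [1, 2, 3]                  # 1D tensor
matrix = [[1, 2, 3], [4, 5, 6]]     # 2D tensor

# Hypothetical 2x2 RGB image, one brightness matrix (0-255) per channel.
image = [
    [[255, 0], [0, 255]],    # red channel
    [[0, 255], [0, 255]],    # green channel
    [[0, 0], [255, 255]],    # blue channel
]

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("image", image)]:
    print(name, "rank", len(shape(t)), "shape", shape(t))

# The top-left pixel is pure red: full red, no green, no blue.
top_left = (image[0][0][0], image[1][0][0], image[2][0][0])
```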

One cumbersome pattern in TensorFlow 1.0 was the need to define a placeholder with tf.placeholder(), with its type specified, before filling it with actual numbers. It has the benefit of contiguous memory, but it can take time to get used to, especially when TF is trying to court Python users accustomed to dynamic typing. Another benefit is being able to construct the graph without knowing or filling in specific numeric values. One minus is that it is harder to test, prototype, and iterate.

tf.Variable() allows initializing and filling in data that can later be changed. Each node is a unit of computation. Each edge is either an input or an output of an operation.
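Here is a minimal sketch of the placeholder, Variable, and Session pattern. It assumes a TensorFlow installation that ships the `tensorflow.compat.v1` module, so it can also run under TensorFlow 2.x; under a real 1.x install the code of the time would simply use `import tensorflow as tf`. The names `x`, `w`, `y` and the numbers are made up for illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # needed only when running under TensorFlow 2.x

# Placeholder: type (and optionally shape) declared now, values supplied later.
x = tf.placeholder(tf.float32, shape=(None, 2), name="x")

# Variable: starts from an initial value that can change later (e.g. weights).
w = tf.Variable([[1.0], [2.0]], name="w")

# This only adds a matmul node to the graph; nothing is computed yet.
y = tf.matmul(x, w)

with tf.Session() as sess:
    # Variables must be initialized inside the session before use.
    sess.run(tf.global_variables_initializer())
    # feed_dict fills the placeholder with actual numbers at run time.
    out = sess.run(y, feed_dict={x: [[1.0, 2.0]]})

print(out)  # [[5.]]
```

The same graph can be run again with a different feed_dict, which is the upside of separating graph construction from the values that flow through it.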

TensorFlow 1.0, like TensorFlow 2.0, has a Pythonic front end and API, and can be deployed to many platforms and devices: CPU, GPU, Android and other mobile OSes such as iOS, and JavaScript in the browser (TensorFlow 1.x+). It has always been quite production ready, hence it was popular before PyTorch 1.0 came along.

TensorFlow and PyTorch both focus on deep learning and are optimized for it.

Additional feature: automatic differentiation, which is important for gradient-based deep learning algorithms. Additional feature: optimizers for tuning weights efficiently.
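To show how these two features fit together, here is a toy sketch in plain Python. It uses forward-mode automatic differentiation with dual numbers, which is simpler to write down than the reverse-mode approach deep learning libraries rely on, so treat it as an illustration of the idea rather than how TensorFlow does it. The `Dual` class, the `loss` function, and all the numbers are hypothetical.

```python
class Dual:
    """A number that carries its value and its derivative together."""

    def __init__(self, value, grad=0.0):
        self.value = value
        self.grad = grad

    @staticmethod
    def _wrap(other):
        # Plain numbers are constants, so their derivative is 0.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._wrap(other)
        return Dual(self.value + other.value, self.grad + other.grad)

    __radd__ = __add__

    def __sub__(self, other):
        other = self._wrap(other)
        return Dual(self.value - other.value, self.grad - other.grad)

    def __mul__(self, other):
        other = self._wrap(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.grad * other.value + self.value * other.grad)

    __rmul__ = __mul__


def loss(w):
    # Squared error of the prediction w * 3.0 against the target 6.0.
    error = w * 3.0 - 6.0
    return error * error


# A bare-bones gradient descent "optimizer": step against the gradient.
w = 0.0
learning_rate = 0.05
for _ in range(100):
    out = loss(Dual(w, 1.0))           # seed dw/dw = 1
    w = w - learning_rate * out.grad   # the derivative came for free

print(round(w, 6))  # 2.0, since 2.0 * 3.0 == 6.0
```

The point is that the loss is written as ordinary arithmetic and the derivative is tracked automatically; nobody had to derive it by hand.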
