Friday, October 13, 2017

MATLAB eBook on Machine Learning

Working on graduate-level research? MATLAB now offers a free eBook on Machine Learning. It is a great resource for those in academia.

For those not familiar with MATLAB, it is the definitive math software: great for matrix manipulation and used throughout college and graduate school labs. It is solid software for anyone pursuing a PhD in quantitative fields or advanced social science studies. I had to use MATLAB for my work at a Stanford physics lab a long, long time ago, and a few physics PhDs still use it. You can get a student discount (very substantial, lifetime license) from most US universities.

Coursera Announces New Timeline for Deep Learning Specialization by Andrew Ng

After releasing Course 1 of the Deep Learning specialization on Coursera, Andrew Ng's team has been releasing the remaining courses. Courses 1-3 are available right now on Coursera; Course 4, Convolutional Neural Networks, is expected in late October and Course 5, Sequence Models, in late November.

  • Course 1 Neural Networks and Deep Learning
  • Course 2 Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, Optimization
  • Course 3 Structuring Machine Learning Projects
  • Course 4 Convolutional Neural Networks
  • Course 5 Sequence Models
For those new to the realm of Machine Learning, you will need CNNs (convolutional neural networks) to do machine learning work on images. Sequence models are used for Natural Language Processing and audio models, which is pretty advanced material.

My personal experience with the series so far: knowledgeable and info-packed. It is still a one-man class with Andrew Ng doing all the talking, and a great sequel to his Machine Learning course and to the Udacity Machine Learning Engineering Nanodegree, but you definitely need some background in ML techniques and neural networks.

Ask me about this course, or ask me about the Udacity Machine Learning Engineering Nanodegree. I have fairly in-depth experience with both.

Wednesday, October 11, 2017

Famous Machine Learning Datasets - Machine Learning Wiki

  • MNIST dataset, a collection of 70,000 labeled handwritten digits (60,000 for training, 10,000 for testing), a common starting point for machine learning practice
    • Beginner Machine Learning data
    • Each image is 28 by 28 pixels so 784 data points per image
    • Pixel value 0 to 255. Grayscale, zero means black, 255 means white or completely lit
    • Often used in Google TensorFlow demos
    • sklearn provides this dataset too
    • Small images of digits handwritten by high school students and US Census Bureau employees
  • Inception-v3: the pre-trained Inception-v3 model achieves state-of-the-art accuracy for recognizing general objects with 1000 classes, like "Zebra", "Dalmatian", and "Dishwasher"
  • VGG-19 image data
  • What is VGG-16?

    "Since 2010, ImageNet has hosted an annual challenge where research teams present solutions to image classification and other tasks by training on the ImageNet dataset. ImageNet currently has millions of labeled images; it’s one of the largest high-quality image datasets in the world. The Visual Geometry group at the University of Oxford did really well in 2014 with two network architectures: VGG-16, a 16-layer convolutional Neural Network, and VGG-19, a 19-layer Convolutional Neural Network."
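  • Models trained on ImageNet output 1000 classes. If we don't need that many, we can use transfer learning: keep the pre-trained layers up to the bottleneck and replace the final classification layer with one that outputs only the classes we need (for example, 1-10).
  • YouTube-8M video data (Kaggle)
  • ImageNet challenge data: 1000 different object classes in about 1.3 million high-resolution training images
  • Cornell Movie-Dialogs Corpus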
  • More famous datasets on github - amazing public databases
  • “Twenty Newsgroups” The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. To the best of our knowledge, it was originally collected by Ken Lang, probably for his paper “Newsweeder: Learning to filter netnews,” though he does not explicitly mention this collection. The 20 newsgroups collection has become a popular data set for experiments in text applications of machine learning techniques, such as text classification and text clustering.
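
Going back to the MNIST entry in the list above, the arithmetic "28 by 28 pixels so 784 data points" can be sketched in a few lines. This is only an illustration under stated assumptions: the image here is a hypothetical hand-built grid, not real MNIST data, and dividing by 255 is one common preprocessing choice rather than a requirement.

```python
# Hypothetical stand-in for one MNIST digit: a 28 x 28 grid of grayscale
# values (0 = black, 255 = white). Real data would come from a dataset loader.
ROWS, COLS = 28, 28
image = [[0] * COLS for _ in range(ROWS)]
for r in range(4, 24):
    image[r][14] = 255  # draw a vertical white stroke, roughly a "1"

# Flatten the 2-D grid row by row into a single feature vector.
flat = [pixel for row in image for pixel in row]

# Scale pixel values from the 0..255 range into 0.0..1.0.
scaled = [pixel / 255.0 for pixel in flat]

print(len(flat))                          # 784
print(max(scaled))                        # 1.0
print(sum(1 for p in flat if p == 255))   # 20
```

Flattening row by row means the pixel at row r, column c ends up at index r * 28 + c in the vector.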

Additional datasets, some famous some lesser known
  • Movie reviews (MovieLens 100K): 100,000 ratings from 1000 users on 1700 movies
  • Datasets on Keras

Inception - TensorFlow Wiki

  • Google's state-of-the-art image classifier
  • Pre-trained
  • Open sourced
  • Trained on 1.2 million images
  • Training took 2 weeks

Sunday, October 8, 2017

Time your python script

  • Time it in the terminal
  • Time it in your script

import time

start = time.time()  # wall-clock time before the work starts

# ... the code you want to time goes here ...

print('It took', time.time() - start, 'seconds.')

Source: StackOverflow
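
As a possible variation on the snippet above, the standard library also offers time.perf_counter (a clock intended for measuring short durations) and the timeit module (which repeats a call many times and returns the total). The sketch below assumes a made-up work() function standing in for whatever you actually want to time.

```python
import time
import timeit


def work():
    # Hypothetical workload standing in for the code you want to time.
    return sum(i * i for i in range(10_000))


# Time a single run with a high-resolution clock.
start = time.perf_counter()
result = work()
elapsed = time.perf_counter() - start
print(f"One run took {elapsed:.6f} seconds.")

# Time 100 runs and report the average, which smooths out noise.
total = timeit.timeit(work, number=100)
print(f"Average over 100 runs: {total / 100:.6f} seconds.")
```

For timing in the terminal, timeit can also be run as a module, for example python -m timeit "sum(range(1000))".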

You can now open Jupyter Notebook in Coursera! - this week in online learning

As seen in this Michigan Data Science MOOC, Coursera now allows you to open and edit Jupyter Notebooks right in the browser. Pretty amazing engineering! Truly the future of learning. Think of it as a super

Saturday, October 7, 2017

Augmented Reality in Painting of Mona Lisa by Leonardo Da Vinci

Leonardo Da Vinci carefully studied the anatomy of human smiles and experimented with new painting techniques to create the lifelike smile of the Mona Lisa. The smile is so elusive that it is picked up only by peripheral vision.

Thursday, October 5, 2017

Python Interview List Slicing

a = [1, 2, 3, 4]
a[0:3]  # slice from index 0 up to, but not including, index 3
-> [1, 2, 3]

a[0:3:2]  # slice with increment of 2
-> [1, 3]

a[::-1]  # reverse slicing
-> [4, 3, 2, 1]

t = (1, 2, 3, 4, 5)
t[0:4]  # slicing tuples
-> (1, 2, 3, 4)

sliceObj = slice(1, 3)
t[sliceObj]
-> (2, 3)

a[:]
-> returns a full (shallow) copy of the list
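
The slices above can be checked with a short script. The last part is a small addition of mine rather than something from the notes: it illustrates that a[:] makes a shallow copy, so nested lists are still shared between the original and the copy.

```python
a = [1, 2, 3, 4]
t = (1, 2, 3, 4, 5)

print(a[0:3])          # [1, 2, 3]
print(a[0:3:2])        # [1, 3]
print(a[::-1])         # [4, 3, 2, 1]
print(t[0:4])          # (1, 2, 3, 4)
print(t[slice(1, 3)])  # (2, 3)

# a[:] builds a new list, so changing the copy leaves the original alone.
copy = a[:]
copy.append(5)
print(a)     # [1, 2, 3, 4]
print(copy)  # [1, 2, 3, 4, 5]

# The copy is shallow: inner lists are shared, not duplicated.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
shallow[0].append(99)
print(nested)  # [[1, 2, 99], [3, 4]]
```

A common follow-up interview question is how to get a fully independent copy of a nested list; copy.deepcopy from the standard library is the usual answer.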

React UI, UI UX, Reactstrap React Bootstrap

  • React Material UI: install the icons with yarn add @material-ui/icons
  • Reactstrap: forms (controlled forms, uncontrolled forms), columns, grid