
Sunday, November 12, 2017

What is Tensorflow? - Tensorflow for Dummies Google Tensorflow 101

What is Tensorflow

Tensorflow is a deep learning framework, and deep learning is a hot field of machine learning. Think of it the way Rails is a framework for web applications and Bootstrap is a framework for front-end development.

More generally, Tensorflow is built for large scale numerical computation. Deep learning is one of its capabilities.

You can scale your machine learning code in the cloud with Google by employing more than one core. You can even use GPUs and parallel processing.

Tensorflow computes gradients, a non trivial calculation, fast.

TF provides a library of machine learning APIs, models, scoring metrics and optimizers. It also provides mathematical computation libraries and functions that support high-dimensional matrix calculation and manipulation for linear algebra.

Get a taste of Tensorflow for beginners here: https://www.tensorflow.org/get_started/mnist/beginners
Get a taste of Tensorflow for experts here: https://www.tensorflow.org/get_started/mnist/pros
Google Cloud app engine will soon offer Cloud ML

Matrix Vector Representation of an Image - Image Classification for Machine Learning

Each pixel's color can be represented with RGB values: red, green and blue. See this Coursera Deep Learning MOOC by Andrew Ng. The RGB data of an image is known as the three channels. Each channel is a matrix. We vectorize the RGB data into a feature vector X. The length of X = width_pixel multiplied by height_pixel multiplied by the number of channels (3). E.g. the digits in MNIST are 28 by 28 by 1 because they are grayscale, so instead of 3 channels there is only one channel.
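Here is a minimal sketch of that vectorization with numpy. The 64 by 64 image size and the all-zero pixel values are assumptions for illustration only; the point is how the length of X follows from width, height and number of channels.

```python
import numpy as np

# Hypothetical 64x64 RGB image (all zeros, just to show the shapes)
height_pixel, width_pixel, channels = 64, 64, 3
rgb_image = np.zeros((height_pixel, width_pixel, channels), dtype=np.uint8)

# Unroll the three channel matrices into one column vector
x_rgb = rgb_image.reshape(-1, 1)

# A grayscale 28x28 digit has a single channel
digit_image = np.zeros((28, 28, 1), dtype=np.uint8)
x_digit = digit_image.reshape(-1, 1)

print(x_rgb.shape)    # (12288, 1) because 64 * 64 * 3 = 12288
print(x_digit.shape)  # (784, 1) because 28 * 28 * 1 = 784
```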

Coursera Deep Learning MOOC by Andrew Ng Convolutional Neural Net

Udacity Offers Free Preview of its Deep Learning Nanodegree

"What will you create?" with Deep Learning? Previously the deep learning foundation was taught by YouTube star Siraj Raval, quite a personality, look him up! Luis Serrano, head of Machine Learning at Udacity, is revamping the series with a course developer, Matt. Luis is a machine learning specialist with a PhD who has taught, done research and worked as a Google engineer. Matt has used data science and Python for his PhD work. They just offered a free preview of this Nanodegree. Here are some reviews, observations and commentaries.

Luis, Matt
will guide you through this Udacity
Deep Learning Nanodegree process

You can meet the instructors. Learn about Neural Networks including:

  • Convolutional Networks
  • Recurrent Neural Networks
  • Generative Adversarial Networks
  • Deep Reinforcement Learning. 

Real world projects of this nanodegree:

  • Create original art like Picasso, Hokusai (Japanese wood print painting), using deep learning transfer learning (though you can do this right now. Google Developer demo shows you how)
  • Teach a car to navigate in simulated traffic (Deep Reinforcement Learning)
  • Train a gaming agent to play Flappy Bird
There will also be stories by Alexis Cook about Sebastian Thrun's skin cancer detection work at Stanford. Understand how Sebastian's team devised this new life-saving algorithm. Technically, after learning Convolutional Neural Networks (CNN) you can analyze medical MRIs, X-rays and more.


Exposure to technologies:

  • Keras
  • Tensorflow
Exposure to experts:
  • Sebastian Thrun
  • Google AI, Google Brain
Information covered in this nanodegree: CNN, RNN, GAN, Reinforcement learning, projects. See the infograph below


Deep Learning Nanodegree has 5 parts:


Introduction

Create art with transfer learning
Linear regression, machine learning

Neural Networks

Build simple neural networks from scratch using python 
Gradient Descent, backpropagation
Project 1, predict bike ridership using a NN
Model evaluation, validation, 
Guest instructor Andrew Trask author of Grokking Deep Learning
Predict text and predicting sentiment

Convolutional Networks

Computer vision
Build Convolutional networks in Tensorflow
Project 2 use CNN to classify dog breeds
Build autoencoder with CNN
A network architecture for Image compression and denoising
Use pretrained neural network VGGnet to classify images of flowers the network has not seen, using transfer learning

Recurrent Neural Networks (RNN)

Transform sequences like text, music, time series data,
Build an RNN to generate new text character by character
Natural language processing, Word embedding, Word2Vec model, Semantic relationship between words, 
Combine embedding and RNN to predict sentiment of movie reviews
Project 3: generate tv scripts from episodes of the simpsons

Generative Adversarial Networks (GANs)

Great for real-world data.
Generating image as in the CycleGAN project
Guest instructor Ian Goodfellow from CycleGAN implementation
Semi-supervised learning, a technique for training classifiers with data mostly missing labels.
Project 4: use a deep convolutional GAN to generate human faces.

Deep Reinforcement Learning

Artificial intelligence
AlphaGo by DeepMind, Video gaming agent, robotics
Design agents that can learn to take actions in a simulated environment
Project 5 Deep Reinforcement Learning agent to control several quadcopter flying tasks, including take-off, hover, and landing.

Learning Resources and support: there will be community forum, a slack channel and a waffle board.

https://www.udacity.com/course/deep-learning-nanodegree-foundation--nd101

Wednesday, November 1, 2017

Computer Science Tutorials for MBAs and Business Leaders

Amazing Computer Science tutorials for MBAs and Business Leaders by CS on Youtube
Link to the playlist
https://www.youtube.com/watch?v=WMYyD5zx9_c&list=PLhQjrBD2T383wBEMbMIpdWghyHVQU2wB_

Easy to understand, world class quality. Brought to you by the Harvard team that made the massively popular CS50 Series.

Behind the scene

Friday, October 13, 2017

MATLAB eBook on Machine Learning

Working on graduate study work? MATLAB now offers a free eBook on Machine Learning. Great resource for those in the academia.

https://www.mathworks.com/campaigns/products/display/machine-learning-with-matlab.html

For those not familiar with MATLAB, it is the definitive math software, great for matrix manipulation, used throughout college and graduate school labs. Solid software for anyone pursuing a PhD in quantitative fields and advanced social science studies. I had to use MATLAB for my work at a Stanford Physics lab a long, long time ago, and a few Physics PhDs still use it. You can get a student discount (very substantial, lifetime license) from most US universities.

Coursera Announces Deep Learning Machine Learning by Andrew Ng new timeline

After releasing Course 1 of the Deep Learning specialization on Coursera, Andrew Ng's team is working on releasing the next few courses: Courses 2, 3, 4 (Convolutional Neural Networks, late October) and 5 (Sequence Models, late November). Courses 1-3 are available right now on Coursera.

  • Course 1 Neural Networks and Deep Learning
  • Course 2 Improving Deep Neural Networks: Hyperparameter Tuning, Regularization, Optimization
  • Course 3 Structuring Machine Learning Projects
  • Course 4 Convolutional Neural Networks
  • Course 5 Sequence Models
For those new to the realm of Machine Learning, you will need CNNs to do machine learning work on images. Sequence models are used for Natural Language Processing and audio models, which are pretty advanced.

My personal experience with the series so far: knowledgeable, info packed, still Andrew Ng talking one man team one man class, great sequel to his Machine Learning course, and the Udacity Machine Learning Engineering Nanodegree, but definitely need some background on ML techniques and NN. 

Ask me about this course, ask me about the Udacity machine learning engineering nanodegree. I have quite some in-depth experience with the two. 


https://www.coursera.org/specializations/deep-learning

Wednesday, October 11, 2017

Famous Machine Learning Datasets - Machine Learning Wiki

  • MNIST dataset, a collection of 70,000+ labeled digits, starting point of machine learning practice
    • Beginner Machine Learning data
    • Each image is 28 by 28 pixels so 784 data points per image
    • Often used in Google Tensorflow demos
    • sklearn provides this dataset too
  • Inception-v3 pre-trained Inception-v3 model achieves state-of-the-art accuracy for recognizing general objects with 1000 classes, like "Zebra", "Dalmatian", and "Dishwasher"
  • vgg19 image data
  • What is VGG-16?

    "Since 2010, ImageNet has hosted an annual challenge where research teams present solutions to image classification and other tasks by training on the ImageNet dataset. ImageNet currently has millions of labeled images; it’s one of the largest high-quality image datasets in the world. The Visual Geometry group at the University of Oxford did really well in 2014 with two network architectures: VGG-16, a 16-layer convolutional Neural Network, and VGG-19, a 19-layer Convolutional Neural Network."
  • 1000+ different objects in 1.3 million high resolution training images
  • cornell movie dialog https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html

Inception - Tensorflow Wiki


  • Google's state of art image classifier
  • Pre-trained
  • Open sourced
  • Trained on 1.2 million images
  • Training took 2 weeks

Sunday, October 8, 2017

Time your python script


  • Time it in the terminal
time python file_name.py
  • Time it in your script

import time
start = time.time()

fun()  # replace with the code you want to time

print('It took', time.time() - start, 'seconds.')

Source:StackOverflow - https://stackoverflow.com/questions/6786990/find-out-time-it-took-for-a-python-script-to-complete-execution
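
If you time code often, the pattern above can be wrapped in a small helper. This is only a sketch: time_call is a hypothetical helper name, not from the StackOverflow answer, and it swaps in time.perf_counter(), a standard library clock intended for measuring short durations.

```python
import time

def time_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Example: time the built-in sum over one million integers
result, elapsed = time_call(sum, range(1_000_000))
print('It took', round(elapsed, 4), 'seconds.')
```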

You can now open Jupyter Notebook in Coursera! - this week in online learning

Open Jupyter Notebook in Coursera
As seen in this Michigan Data Science MOOC, Coursera now allows you to open and edit Jupyter Notebook right in the browser. Pretty amazing engineering! Truly the future of learning. Think of it as a super Codecademy.com

Saturday, October 7, 2017

Augmented Reality in Painting of Mona Lisa by Leonardo Da Vinci


Leonardo Da Vinci carefully studied the human anatomy of smiles and experimented with new painting techniques to create the lifelike, realistic smile of Mona Lisa. The smile is so elusive that it is only picked up by peripheral vision.

Thursday, October 5, 2017

Python Interview List Slicing

a = [1,2,3,4]
a[0:3]
-> [1, 2, 3]

a[0:3:2] # slice with increment of 2
-> [1, 3]

a[::-1] # reverse slicing
-> [4, 3, 2, 1]

t = (1,2,3,4,5) # slicing tuples
t[0:4]
-> (1, 2, 3, 4)

t = (1,2,3,4,5)
sliceObj = slice(1,3)
t[sliceObj]
-> (2, 3)
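
A few more slicing cases that tend to come up in interviews, as a quick sketch. The variable names are made up for illustration.

```python
a = [1, 2, 3, 4]

last_two = a[-2:]       # negative index counts from the end
all_but_last = a[:-1]   # stop is exclusive
past_the_end = a[2:100] # slicing past the end does not raise an error
shallow_copy = a[:]     # a new list with the same items

# A slice object can be reused on any sequence type
every_other = slice(None, None, 2)
letters = "abcd"[every_other]
numbers = a[every_other]
```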

Friday, September 22, 2017

Preview of Flying Car Nanodegree Program from Udacity

Udacity's Flying Car Nanodegree trailer featuring Sebastian Thrun, KittyHawk flying drones, flying smart cars and more.

Host an HTML website on Github in 5 minutes HD

Web development with Github in 5 minutes. Build your personal portfolio today.



Wednesday, September 20, 2017

Coding Interview Questions - OOP

Class definition



class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1

Note the class name is capitalized. __init__ is the constructor, where self.xyz defines instance attributes. Each following def defines a method. Methods take self as the first parameter.
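
To see the class in action, here is a short usage sketch. The name "Alice" and the age 30 are made-up values.

```python
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1

# Create an instance; Python passes the new object in as self
alice = Person("Alice", 30)
alice.birthday()  # same as Person.birthday(alice)
print(alice.name, alice.age)
```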

Salary for Self Driving Car Engineer

How much salary can you make with a self driving car engineer degree? If you graduated from Carnegie Mellon University at the epicenter of self driving car testing in Pittsburgh, your starting salary can be as high as $200,000 a year. In general, graduates of the Computer Vision program at Carnegie Mellon are receiving offers north of $200K. Majoring in STEM programs can really pay off, now especially in computer science and specifically computer vision. See the Time / CNBC report on how to get a high salary as a self driving car engineer and/or work for Uber!  http://time.com/money/4946034/high-paying-jobs-college-vision-pittsburgh-uber/

Tuesday, September 19, 2017

Rules of Sudoku for Algorithm Exercises

Need to code a Sudoku solver? Here are three rules of Sudoku:

  • A 9x9 grid,
  • Each row ...
  • Each column ...
  • Each of the 9 3x3 grids (example A1 A2 A3, B1 B2 B3, C1 C2 C3) must contain all digits from 1 to 9 and each digit is unique.

Read more detailed rules:
http://www.conceptispuzzles.com/?uri=puzzle/sudoku/rules
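
Before writing a solver, it helps to have a checker for a finished grid. The sketch below assumes the grid is a list of 9 lists of 9 integers and that every row, column and 3x3 box must hold the digits 1 to 9 exactly once. The function name and the formula that generates a solved grid are my own illustration.

```python
def is_valid_sudoku_solution(grid):
    """Return True if every row, column and 3x3 box holds 1-9 exactly once."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(grid[r][c] for r in range(9)) for c in range(9)]
    boxes = [
        set(grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3))
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(unit == digits for unit in rows + cols + boxes)

# A known valid grid built from a shifting pattern (illustration only)
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

# Copy it and duplicate one digit in the first row to break it
broken = [row[:] for row in solved]
broken[0][0] = broken[0][1]

print(is_valid_sudoku_solution(solved))
print(is_valid_sudoku_solution(broken))
```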

Friday, September 1, 2017

Roomba and Machine Learning Artificial Intelligence for the cleaning robot



The latest Roomba 980 can scan room size, identify obstacles and optimize routes. This robot is equipped with state of the art sensors and cameras to scan the room, scan for obstructions and record odometry (used by wheeled robots to estimate distance traveled from a starting position). Its decision tree may work something like this: scan a small room 3x, a medium-sized room 2x, and a large room only once. source

It is equipped with infrared receiver for sensors such as cliff sensors and object sensors. It calculates the room size based on distance traveled. The wall sensor allows iRobot to travel closely along the wall.  The iRobot 980 camera can look forward and up at a 45 degree angle.



In machine learning terms, the robot (the agent) interacts with the environment, records each state and the reward for moving into the next state, updates a Q table with the utility, and eventually chooses the best way to complete the task. Reinforcement learning like this allows the robot to optimize its routes and maximize the reward for completing a task based on a policy.
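
To make the Q table idea concrete, here is a toy sketch of tabular Q-learning. Everything in it is an assumption for illustration: a 5-cell corridor instead of a room, a reward of 1 for reaching the last cell, random exploration, and the learning rate and discount values. It is not how a Roomba actually works.

```python
import random

N_STATES = 5            # corridor cells 0..4, the goal is cell 4
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount (assumed values)

def step(state, action):
    """Move one cell; bumping into the left wall keeps the agent in place."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(500):  # episodes
    state = 0
    while state != GOAL:
        action = rng.choice((LEFT, RIGHT))  # explore randomly
        next_state, reward = step(state, action)
        best_next = max(q_table[next_state])
        # Q-learning update: nudge the estimate toward reward + discounted future value
        q_table[state][action] += ALPHA * (reward + GAMMA * best_next - q_table[state][action])
        state = next_state

# The learned policy picks the action with the higher Q value in each cell
policy = ["right" if q[RIGHT] > q[LEFT] else "left" for q in q_table[:GOAL]]
print(policy)
```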

In reality, Roombas are forgetful (resets after each run) but it's getting advanced AI functionalities fast. With its existing cameras and image processing software, Roomba iRobot can map out your room with surprising precision. The camera and software in the  Roomba iRobot 980 device can navigate much better than its predecessors which move around semi randomly (at one point Roomba Red travels in spirals, the SPOT cleaning feature still looks a bit like that). 980 has vision! It does not recognize objects yet.

Roomba uses simultaneous localization and mapping, or SLAM, an algorithm that takes significant time to optimize and is a lot to pack into a small device, according to researchers at MIT. MIT professor John Leonard says Google self driving cars already use navigation systems based on SLAM technology (the self driving car has also made significant improvements and uses a whole lot more data than SLAM for iRobot, which is only a simple localization task; source).

This little robot is mapping out your room. With the newest Roomba connecting to WIFI and working with Alexa and Google Homes, researchers are concerned about data collection and privacy. The user has to keep Roomba offline or explicitly opt out of data sharing for the advanced wireless models. Albert Gidari, director of privacy at the Stanford Center for Internet and Society, told NYTimes that sharing such data will draw legal ramifications.
source

Did you know that iRobot was created by MIT alumni?

Roomba reached 655 million in sales in 2016. (source)
https://uniqtech.tumblr.com/post/164869166245/irobot-roomba-980-algorithm-machine-learning-slam

Thursday, August 10, 2017

Host an HTML website on Github in 5 minutes HD

Tutorial how to set up web hosting, personal portfolio website on Github in 5 minutes. More content coming to this paid channel including insider information on company culture, tutorials, high tech job training and more!



Tuesday, August 8, 2017

Host an HTML website on Github in 5 minutes HD

99 cents paid mini tutorial: how to host an HTML website on Github in 5 minutes. Host your portfolio, host your personal website, showcase your capstone projects for Udacity Nanodegrees.



Saturday, July 15, 2017

Amazon Alexa 101 - How to make your first app

Code helloworld app using Amazon Alexa. Make your first voice search, voice assistant app. Amazon Alexa app making for dummies. Learn to code tutorial 101.





Sample tutorials How to build an Amazon Alexa fact app by Amazon Developers
Read our original blog post and notes on Bloc 's Amazon Alexa 101 tutorial here 


Wednesday, May 31, 2017

Udacity Hiring Partners, Course Partners for Udacity Nanodegrees

Udacity announced in its first conference Intersect 2017 the following Udacity Nanodegree hiring partners. Notable ones include Didi (Uber of China), Uber, Github, Google, Facebook and Slack.



Udacity Blitz allows nanodegree graduates to become "interns" and "contractors" in Udacity Blitz workforce and work for potential clients. International hiring partners include Rakuten (eBay ecommerce giant of Japan), BMW (smart car) and more.

A comprehensive guide to Nanodegree and corresponding hiring partners:

VR Developer Nanodegree course partners are Google VR, Vive, Upload, hiring partners are Lucid, Samsung, Upload, nod

Self Driving Car course partners include Mercedes-Benz, NVIDIA, Uber ATG, Didi (Uber of China), BMW, McLaren, NEXTEV,

Robotics Nanodegree course partners are Bosch Electric Movement iRobot, Lockheed Martin, Kuka, Uber ATG

Digital Marketing Nanodegree course partners are facebook, google, moz, hootsuite, hubspot, mailchimp

Front-End Web Developer Nanodegree course partners are AT&T, Google, Github, Hack Reactor,

Machine Learning Engineer Nanodegree course partner is Kaggle (now Google)

Artificial Intelligence course partners include IBM Watson, Amazon Alexa, Didi (Uber of China)

Data Analyst Nanodegree:

Business Analytics Nanodegree course partners include alteryx, tableau,

Android Basics Nanodegree by Google

Machine Learning Nanodegree course partners include kaggle,

Full Stack Web Developer Nanodegree course partners include amazon web services, github, AT&T, Google

Android Developer Nanodegree by Google

Become an iOS Developer course partner AT&T, Lyft, Google

Data Analyst Nanodegree course partner Facebook, Tableau,

For an overview see Udacity's Hiring Partner page



Wednesday, May 24, 2017

Machine Learning with Emoji for Fun?

Here's an interesting idea. Explain Machine Learning with Emojis! It's not trivial to convey complex ideas with symbols but it is a lot of fun. We introduce three concepts here: machine learning cheat sheet in emojis, KNN neighbor visualization with emojis, and Instacart math with emojis.








Wednesday, April 12, 2017

K mean clustering sklearn best practice - Udacity Machine Learning Nanodegree Unsupervised Learning

There are three key k means clustering parameters in sklearn that you will need to pay attention to:

  • Number of centroids (cluster centers) to initialize: n_clusters
  • Max number of iterations per run, used to optimize the algorithm: max_iter. Best practice recommended by Udacity is 300
  • Number of runs with different centroid initializations: n_init
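
A minimal sketch of how these three parameters are passed to sklearn's KMeans. The six sample points are made up, and n_init=10 and random_state=0 are assumed values chosen for a repeatable demo.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: two obvious blobs
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.3],
])

kmeans = KMeans(
    n_clusters=2,    # number of centroids
    max_iter=300,    # max iterations per run
    n_init=10,       # number of runs with different centroid seeds
    random_state=0,  # fixed seed so the demo is repeatable
)
labels = kmeans.fit_predict(points)

print(labels)
print(kmeans.cluster_centers_)
```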

Saturday, March 25, 2017

Difference between Batch Gradient Descent and Stochastic Gradient Descent - Udacity Machine Learning Nanodegree Coursera

I recommend this great, crystal clear 13-minute video by Andrew Ng on Coursera explaining the differences between batch gradient descent (aka gradient descent, aka normal gradient descent) and Stochastic Gradient Descent. https://www.coursera.org/learn/machine-learning/lecture/DoRHJ/stochastic-gradient-descent It's clear, simple and easy to understand without prerequisites. Andrew Ng shows you how the formula differs, how the step by step training strategy differs, and a visualization of the trajectory to find the global minimum (the center of all the ellipses in his graph).


  • Summary
    • Gradient Descent may have issues when the scale of the data is large
      • If the number of training samples is large
      • Gradient Descent algorithm requires summing over all of m
        • e.g. US population census population of 300MM
    • Stochastic Gradient Descent is a modification of gradient descent
      • In other words, the cost functions are different
    • Each iteration of Stochastic Gradient Descent is faster
    • Steps: randomly shuffle the dataset, then optimize on one training example at a time, improving the parameters early, one example at a time, instead of looking at all the examples together as a batch
    • Weakness: it generally moves towards the global minimum, but doesn't always get there; it reaches the general vicinity of the global minimum. It does not converge as nicely as gradient descent. 
    • In practical data science, once it gets close to the global minimum its parameters are good enough. In real life, it works out.
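
The difference in the update step can be sketched in plain Python. This is a toy example under my own assumptions: a one-parameter model y = w * x, noise-free data with true slope 2, and hand-picked learning rate and iteration counts. The function names are hypothetical.

```python
import random

# Toy data with a true slope of 2 (assumed for illustration)
xs = [0.1 * i for i in range(1, 11)]
ys = [2.0 * x for x in xs]

def batch_gradient_descent(xs, ys, lr=0.1, steps=500):
    """Each step sums the gradient over all m examples before updating w."""
    w = 0.0
    m = len(xs)
    for _ in range(steps):
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / m
        w -= lr * grad
    return w

def stochastic_gradient_descent(xs, ys, lr=0.1, epochs=50, seed=0):
    """Shuffle the data, then update w after every single example."""
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            w -= lr * (w * x - y) * x
    return w

w_batch = batch_gradient_descent(xs, ys)
w_sgd = stochastic_gradient_descent(xs, ys)
print(w_batch, w_sgd)
```

With noisy real data the stochastic version would wander around the minimum instead of settling exactly on it, which is the weakness noted in the summary.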

K Means Clustering Unsupervised Learning - Udacity Machine Learning Nanodegree Flash Card


  • Draw a line connecting two centroids and use the half way line as a division line for two hyperplanes (if two clusters). Results vary greatly.

  • Initial positions of centroid can strongly influence result. Different initial positions give completely different results.

  • Analogy "Rubber Band"

  • Center of the cluster is called a centroid

  • Number of centroids at initiation can heavily influence the result. 

  • Great for ... PROS:

  • Bad for ... CONS ... limitations:
    • Hill climbing algorithm.
    • Result depends on initiation
    • If initiation is close to local optima, may be sticky. Never move away. Ignore global optima. Bad initial centroids exist
    • If there are more potential clusters, there are more local optima. Run the algorithm many times with different initial centroids to avoid being stuck. 
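
Here is a small sketch of how initial centroids change the result. It is a simplified one-dimensional k-means written for this note, not sklearn's implementation, and the six points and both sets of starting centroids are made up.

```python
def kmeans_1d(points, centroids, max_iter=300):
    """Simplified 1-D k-means. Returns (final centroids, sum of squared distances)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assign each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged, possibly to a local optimum
        centroids = new_centroids
    inertia = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, inertia

points = [0, 1, 10, 11, 20, 21]

bad_centroids, bad_inertia = kmeans_1d(points, [0, 1, 15])     # sticky start
good_centroids, good_inertia = kmeans_1d(points, [0, 10, 20])  # better start

print(bad_centroids, bad_inertia)
print(good_centroids, good_inertia)
```

Running several starts and keeping the one with the lowest total squared distance is the usual way around the sticky local optimum.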

Thursday, March 23, 2017

Follow my new website - Zero Budget Growth Hacking for Small Businesses

Dear entrepreneurs, small business owners and startup techies, how do you go from zero to one with no marketing budget? I will show you how in my new blog. Here's my background highlighted in the first post http://www.matterr.co/2017/03/about-me.html

What makes me a special growth hacker? I don't just advertise. I code, I hack, and I have actually taken multiple stores, a YouTube channel and content from zero to one.

My Biography
TL;DR Dilys is a social media growth hacker. Dilys' background is at the intersection of business, technology and startups. She has experience working with giant corporations and top YCombinator startups. She contributed to USATODAY, Fast Company, VentureBeat and Crunchies by TechCrunch, and was invited to Google social media studies and tech conferences. She ran campaigns to kickstart e-Commerce stores: Chinese Alibaba Taobao 0 to Level 6, eBay 0 to PowerSeller, Shopify 0 to Shopify & Uber partner. She recently took an experimental Youtube partner channel from 0 to 400,000 minutes watched, 0 to 300,000 views, 0 to 900 subscribers in just one month (February 2017, the shortest month too!).

Can't wait to share all my unique experiences as a seller, growth hacker, startup growth person with you. FREE. Just content and some Google ads. That's it. No subscription needed. Follow my blog now.

Wednesday, March 22, 2017

Udacity Digital Marketing Nanodegree Reviews (updating in progress)

This review is updated continuously throughout the program. Yay I just joined the Udacity Nanodegree for Digital Marketing! I am such an Udacity and learning junkie LOL. What grabbed my attention was the line-up of partners, the real world projects and also Avinash Kaushik's presence. I wonder what's the oracle of Google Analytics doing promoting this course.


  • First impression, clean beautiful videos, unlike some of the programming Georgia Tech videos Udacity has
  • The partners really do show up early in the syllabus and seems like they will participate
  • Though jobs are not guaranteed, there are mentions of hiring partners
  • Classmates are young and energetic marketing veterans. Already very active on slack
  • Meet the students use hashtag #ImInDMND on instagram
  • Realworld like non-trivial business cases and owner / user statements
  • What are the projects like? Udacity allows you to use Udacity as a real-world marketing project.
  • Amazing speakers, famous authors and speakers including author of crossing the chasm, avinash kaushik Google Analytics evangelist
  • Mentorship - mentorship is available. My mentor has been unresponsive and unhelpful so far. I do not recommend.
  • Mini interviews with industry giants
  • The Facebook Ad project is extremely useful. The real world project experience can have tangible results. It is resume worthy. I got a sizable view and conversion of which I am comfortable to talk about in future interviews.


Tuesday, March 21, 2017

Udacity Machine Learning Nanodegree - Projects Step by Step Walkthrough High Level Cheat Sheet

High level steps to solve Udacity Machine Learning Nanodegree projects:

  • Import dependencies: numpy, pandas, sklearn, matplotlib
  • Data cleaning:
    • Replace all data with numeric value such as binaries 0 and 1 or scale down to between -1 to 1, or 0 to 1 (normalization). 
    • Replace yes/no binary answers with 1,0
    • Replace categorical data A, B, C with dummy columns |A|B|C| use 1 if true, 0 if false
  • Split data into features and target aka label
  • Perform initial exploration: load the data CSV into a pandas.DataFrame
    • Compute summary stats: mean, counts etc.
  • from sklearn import model
  • clf = sklearnmodel.model() #specify the classifier
  • clf.fit( ... ) #fit the model with parameters
  • clf.predict() #make predictions
  • Metrics:
    • R^2 R squared - great for linear regression 0 to 1, 1 being the best
  • Errors:
  • This list is under construction
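
The steps above can be sketched end to end on a tiny made-up table. The column names, the four rows and the choice of DecisionTreeClassifier are all assumptions for illustration; a real project would load a CSV and hold out test data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up data standing in for a project CSV
raw = pd.DataFrame({
    "internet": ["yes", "no", "yes", "no"],
    "school": ["A", "B", "C", "A"],
    "passed": ["yes", "no", "yes", "no"],
})

# Data cleaning: yes/no becomes 1/0, categories become dummy columns
clean = raw.copy()
clean["internet"] = clean["internet"].map({"yes": 1, "no": 0})
clean["passed"] = clean["passed"].map({"yes": 1, "no": 0})
clean = pd.get_dummies(clean, columns=["school"], dtype=int)

# Split into features and target (label)
features = clean.drop(columns=["passed"])
target = clean["passed"]

# Specify the classifier, fit it, make predictions
clf = DecisionTreeClassifier(random_state=0)
clf.fit(features, target)
predictions = clf.predict(features)

print(list(features.columns))
print(list(predictions))
```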

Sklearn machine learning model cheat sheet
What are the best algorithms to use for each machine learning problem?
Classification versus regression
Supervised versus unsupervised

Saturday, March 18, 2017

Commonly seen python error messages - Learn to code Python for Beginners


  • Python KeyError: if dict[key]: raises KeyError when the key is missing. Change it to if key in dict: 
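
A quick sketch of the error and two ways around it. The dictionary contents are made up.

```python
ages = {"alice": 30}

# Looking up a missing key raises KeyError
try:
    if ages["bob"]:
        print("found bob")
except KeyError as err:
    missing_key = err.args[0]

# Safer: test membership first
has_bob = "bob" in ages

# Or use .get() with a default value
bob_age = ages.get("bob", 0)

print(missing_key, has_bob, bob_age)
```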

Pandas Sample Code - Udacity Machine Learning


  • .groupby()
  • .count()
  • pandas.DataFrame.count
  • .sum()
  • df[df["class"]==1].count()["value"]
  • countOfColumn = myDataFrame[myDataFrame["conditionColumn"]=="myCondValue"].count()["conditionColumn"] get row count by column condition and value
  • pandas.Series.map
  • df[(df['A']>0) & (df['B']>0) & (df['C']>0)]
  • pandas.DataFrame.sum
  • df.groupby('a').count()
  • df.first()
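
Several of the snippets above, run on a small made-up DataFrame. The column names and values are placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "class": [1, 0, 1, 1],
    "value": [10, 20, 30, 40],
    "group": ["a", "b", "a", "b"],
})

# Row count by column condition and value
count_class_1 = df[df["class"] == 1].count()["value"]

# Sum of a column for the rows that match a condition
sum_class_1 = df[df["class"] == 1]["value"].sum()

# Combine conditions with & and parentheses
both = df[(df["class"] == 1) & (df["value"] > 10)]

# Count rows per group
per_group = df.groupby("group").count()

# Map values in a Series to new labels
labels = df["class"].map({1: "yes", 0: "no"})

print(count_class_1, sum_class_1, len(both))
```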

Tuesday, March 14, 2017

Startup small business tax part 4 - miscellaneous calendar dates

Startup or Small Business Tax Deadlines

  • March 15th +/- 5 days tax due for partnership LLCs
  • April 18th deadlines for corporations
    • Annual Delaware Franchise Tax (if startup is incorporated in Delaware)
    • Annual California Franchise Tax (if startup is incorporated in Delaware and doing business as a foreign entity in California)
    • Statement of Information - California  (if startup is incorporated in Delaware and doing business as a foreign entity in California)


Personal Tax Deadlines

  • Jan 31st +/- 5 days  W2 and 1099
    • 1099 DIV 1099 INT : Stock Dividend, Bank Interest. Examples include Scottrade, eTrade, Vanguard
  • April 18th deadlines for personal tax

Disclaimer: no post on this blog should be considered legal nor professional advice. Only CPA, professionals, certified financial advisors can provide legal or professional advice. All information for my personal use, and for entertainment purpose only. 

Sunday, March 12, 2017

Udacity Machine Learning Nanodegree Bayes Rule Bayesian Analysis Walkthrough

quiz
<xi, di>
di = f(xi) + err
x     d     h(x) = x mod 9     h(x) = x/3     h(x) = 2
1     1     1 % 9 = 1          1/3            2
3     0     3 % 9 = 3          1              2
6     5     6 % 9 = 6          2              2
10    2     10 % 9 = 1         10/3           2
11    1     11 % 9 = 2         11/3           2
13    4     13 % 9 = 4         13/3           2

sum of squared errors for each (excel calc)


h(x) = x mod 9
sum of squared errors = 12


h(x) = x/3
sum of squared errors = 19.44

h(x) = 2
sum of squared errors = 19

Use the hypothesis with the smallest sum of squared errors, here h(x) = x mod 9
Or better way: write a python script
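
A sketch of such a script, using the x and d values from the quiz table. The dictionary of hypotheses and the function name are my own.

```python
# Training data from the quiz table
xs = [1, 3, 6, 10, 11, 13]
ds = [1, 0, 5, 2, 1, 4]

# The three candidate hypotheses
hypotheses = {
    "x mod 9": lambda x: x % 9,
    "x / 3": lambda x: x / 3,
    "2": lambda x: 2,
}

def sum_squared_errors(h):
    """Sum of (d - h(x))^2 over all training pairs."""
    return sum((d - h(x)) ** 2 for x, d in zip(xs, ds))

errors = {name: sum_squared_errors(h) for name, h in hypotheses.items()}
best = min(errors, key=errors.get)

for name, sse in errors.items():
    print("h(x) =", name, "sum of squared errors =", round(sse, 2))
print("best:", best)
```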

Saturday, March 11, 2017

R Squared Coefficient of Determination - Machine Learning Concept

Coefficient of determination: http://stattrek.com/statistics/dictionary.aspx?definition=coefficient_of_determination

R^2

coefficient of determination
useful statistic for regression analysis
measures how well the model makes predictions.


R^2 is at most 1, 1 being the best
can be negative, for a model that is arbitrarily worse than always predicting the mean
squared correlation between predicted and actual values of the target variable

indicates what percentage of the target variable, using this model, can be explained by the **features**.


r2_score from sklearn.metrics
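
To see where the number comes from, here is a plain Python sketch of the usual formula, R^2 = 1 - SS_res / SS_tot. The function name and the sample values are made up; in a project you would normally call r2_score from sklearn.metrics instead.

```python
def r2_manual(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2_manual(y_true, [1.0, 2.0, 3.0, 4.0])    # predictions match exactly
mean_only = r2_manual(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
reversed_fit = r2_manual(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the mean

print(perfect, mean_only, reversed_fit)
```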

Wednesday, March 8, 2017

Pandas Numpy Data Analysis Tool Kit - Udacity Machine Learning Nanodegree 01

Numpy perfect for statistical analysis, matrix manipulation. Learn to Code Notes.
Numpy Documentation
https://docs.scipy.org/doc/numpy-dev/user/quickstart.html

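Code pattern 01 numpy use array().T to get matrix transpose
Example:
import numpy as np
X = np.array([[1, 2, 3]])  # 2-D row vector, shape (1, 3)
XT = X.T  # shape (3, 1); .T on a 1-D array changes nothing

np.dot(series1, series2)  # dot product of two equal-length sequences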
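
A short sketch showing why the array needs two dimensions before .T does anything, plus the dot product. The values are made up.

```python
import numpy as np

flat = np.array([1, 2, 3])   # 1-D array, shape (3,)
row = flat.reshape(1, 3)     # 2-D row vector, shape (1, 3)
col = row.T                  # 2-D column vector, shape (3, 1)

# Transposing a 1-D array leaves the shape unchanged
flat_t_shape = flat.T.shape

# Dot product of two 1-D arrays is a scalar: 1*1 + 2*2 + 3*3
dot_flat = np.dot(flat, flat)

# Row vector times column vector gives a 1x1 matrix
dot_matrix = np.dot(row, col)

print(flat_t_shape, col.shape, dot_flat, dot_matrix)
```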

Pandas Numpy Data Analysis Tool Kit - Udacity Machine Learning Nanodegree 00

SERIES & DATAFRAME

Basic units data structures of Pandas, data analysis using Python

Allows users to store a large amount of information and perform data analysis

Dataframe documentation: http://pandas.pydata.org/pandas-docs/version/0.17.0/dsintro.html#dataframe

A DataFrame accepts many kinds of input:
  • Dict of 1D ndarrays, lists, dicts, or Series
  • 2-D numpy.ndarray
  • Structured or record ndarray
  • Series
  • Another DataFrame


Sample Code: 

import pandas as pd
d = {'key_name': pd.Series([1, 2, 3], index=['a', 'b', 'c'])}

Analogy : Excel Spreadsheet
Will also return number of rows and columns

pandas.Series()
pandas.Series([], index=[])


----

More sample code:
my_data = pd.DataFrame(data)
print(my_data.dtypes)
print("")
print(my_data.describe())
print("")
print(my_data.head())
print("")
print(my_data.tail())


# Retrieve columns
df[['col_name','col2_name']]
# Retrieve rows
df.loc['a']


df[df['col_name'] >= 30]

get row column counts of Pandas Dataframe
.shape
len(DataFrame.index)
.count() count each column of the entire table
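
Putting these pieces together on a small made-up DataFrame built from a dict of Series. The column names and values are placeholders.

```python
import pandas as pd

# Dict of Series sharing the same index
d = {
    "age": pd.Series([25, 32, 41], index=["a", "b", "c"]),
    "score": pd.Series([88, 92, 79], index=["a", "b", "c"]),
}
df = pd.DataFrame(d)

n_rows, n_cols = df.shape       # row and column counts
row_count = len(df.index)       # row count only
column_counts = df.count()      # non-null count for each column

two_columns = df[["age", "score"]]  # retrieve columns
row_a = df.loc["a"]                 # retrieve a row by index label
over_30 = df[df["age"] >= 30]       # filter rows by condition

print(n_rows, n_cols, row_count)
print(over_30)
```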

Wednesday, March 1, 2017

Startup Tax How to get Turbotax Discount?

7 easy steps to get a Turbotax discount. A good reason to buy TurboTax? The IRS assumes people make 20% more mistakes when preparing their own taxes. Using TurboTax can potentially reduce audit risk (note, just lifehack tips, not professional advice, please consult your tax and legal professionals).

01 Google "turbotax discount" literally

Did you know that you can find discounts by literally googling for it? If you don't ask for it, it won't be given. It can take you to a landing page, or a bulk discount site that gives customers more favorable deals. Get $20 dollars off.

02 Use American Express Discount

Did you know that American Express Offer has TurboTax discount? But only for personal filing though.  Save 5% to 10%. 

03 Use Partnership Discounts - Fidelity TurboTax Discount

Some companies offer joint discount! Fidelity offers TurboTax discount for its customers. Save $20.

04 Use a Membership Business Toolbox Discount - FounderCard

Memberships like FounderCard are geared towards startup founders and users. They give discounts on all kinds of products and services including TurboTax and Moo Business Cards. Save 10%.

05 Buy TurboTax on Amazon

It's painfully obvious Amazon offers the steepest discount. TurboTax Business 2016 perfect for c corp startups incorporated in Delaware and doing business in California is $50 dollars off! Insane. State filings require additional though. Don't buy the Delaware one. You can't file via TurboTax anyway.

Use my Amazon referral for TurboTax for the steepest discount. http://amzn.to/2mNA7TR








06 Buy TurboTax Disc - Hard Copy

Buying online? You are in a hurry. No discount. Buying a disc? You are probably a real budgeting, accounting person who is price sensitive. Buying a TurboTax disc instead of a digital copy can sometimes save you money. Just keep in mind that you may have to pay more for special and specific filings.

07 Online Merchant Account Discount for TurboTax - eBay

eCommerce platforms like eBay and Etsy have special discount codes for online shops and merchants. Use your subscriber discount for TurboTax and QuickBooks.

Saturday, February 11, 2017

Udacity Machine Learning Udacity Connect Lesson 01 Syllabus

In-person Udacity Connect Machine Learning Nanodegree syllabus

  • Practice running Python from within a Jupyter Notebook (FKA IPython Notebook).

  • Become familiar with importing useful modules and packages, e.g. pandas, numpy, matplotlib.pyplot.

  • Learn about the pandas data structures, including the Series and DataFrame objects.

  • Create a DataFrame object from data in a comma-separated values (CSV) file using pandas.read_csv

  • Index and select data from Series and DataFrame objects using loc and iloc

  • Compute descriptive statistics on a Series or DataFrame, including the mean, the median, and the min & max

  • Explore a public data set found on Kaggle

  • Conduct some exploratory data analysis, and visualize trends in data using matplotlib
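
As a rough preview of the pandas items in this syllabus, here is a minimal sketch. The CSV text, column names and values are hypothetical stand-ins for a real file on disk (such as a Kaggle data set); the sketch only shows the general shape of read_csv, loc, iloc and the descriptive statistics.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for a file on disk.
csv_text = """name,age,fare
Ann,22,7.25
Bob,38,71.28
Cy,26,7.92
"""

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text), index_col="name")

ages = df["age"]              # a single column is a Series
bob_by_label = df.loc["Bob"]  # label-based selection
first_row = df.iloc[0]        # position-based selection

# Descriptive statistics on a Series.
stats = {
    "mean": ages.mean(),
    "median": ages.median(),
    "min": ages.min(),
    "max": ages.max(),
}
print(stats)
```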

Pre Lesson Activities
  • Student Handbook
  • Class schedule and holidays
    • It's an aggressive schedule
  • Logistics
  • Github repo
Lesson 1 in-person activity
  • Meet classmates
  • Meet and greet
  • First lunch is provided. No free lunches in future sessions


Friday, February 10, 2017

Seven Extraordinary Startup Founder Stories in Tweetable Sizes

Seven Power Hustle Stories about Early Startups

Andrew Chen wrote in a recent essay that the key to the future of growth is to execute (growth tactics) thoughtfully and iteratively. Ingenious growth-hacking moves can come from luck, a spark, a clever thought or simple persistence. You can call it hustling, bootstrapping or hacking. There are some crazy stories. Paul Graham calls it doing things that don't scale. Here are seven founders who did the unthinkable at the dawn of their startups, in byte-size stories:
  1. Ben S. #founder of @pinterest used to sneak into Palo Alto Apple Store & change safari homepages to Pinterest until he’s kicked out. #hustle
  2. Adora C. brushed teeth at McDonald’s to save $. Learned how to clean like a maid to found billion dollar HomeJoy on-demand cleaning service.
  3. @stripe #founder brothers would personally implement Stripe on #YC batchmate’s computers to get other developers to try the product
  4. @Codecademy never stopped pivoting at #YC its interactive #javascript tutorial wasn’t live until the night before #startup #demoday
  5. #Reddit #founder once created fake accounts to boost user number in the early days of reddit
  6. Founder of Muse #millennial #career site was flagged as a spammer on Gmail after sending out many cold emails in an effort to up user count
  7. @Airbnb founders turned their loft apartment into 1st Airbnb listing to fund rent & the #startup. Ditched Craigslist for being “impersonal”
Comment and favorite if you want to read more byte-size hustle stories like these. Psst, all the stories are perfectly tweetable.
Originally published on Medium

Wednesday, February 8, 2017

Machine Learning K Nearest Neighbors KNN Algorithm

At a high level, KNN looks for the k most similar existing instances (as defined by a chosen distance function), retrieves their labels, and then uses a voting method to decide the final label based on the k votes. The voting mechanism can be a simple majority, or it may weigh instances unevenly. A commonly used distance function is Euclidean, e.g. as in sklearn. FYI: Euclidean is a special case of Minkowski with p=2.

In sklearn you can modify the weights parameter: uniform treats all neighbors equally, while distance weights each neighbor by the inverse of its distance under the specified distance function. You can also pass a custom function.
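
To make the voting idea concrete, here is a minimal pure-Python sketch of a KNN classifier. It is not sklearn's implementation: the function names (minkowski, knn_predict), the toy points and the labels are all made up for illustration, and it uses a brute-force scan over every stored point.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=2 gives Euclidean, p=1 gives Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(points, labels, query, k=3, weights="uniform", p=2):
    """Label `query` by a vote among its k nearest stored points."""
    # Brute force: compute the distance to every stored point, then sort.
    ranked = sorted(
        (minkowski(pt, query, p), label) for pt, label in zip(points, labels)
    )
    votes = Counter()
    for dist, label in ranked[:k]:
        if weights == "uniform":
            votes[label] += 1.0                   # every neighbor counts equally
        else:
            votes[label] += 1.0 / (dist + 1e-12)  # closer neighbors count more
    return votes.most_common(1)[0][0]


# Hypothetical toy data.
points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (3.0, 3.1)]
labels = ["red", "red", "blue", "blue", "blue"]
query = (2.0, 2.0)

print(knn_predict(points, labels, query, k=3))
print(knn_predict(points, labels, query, k=5, weights="uniform"))
print(knn_predict(points, labels, query, k=5, weights="distance"))
```

With k=5 every point votes, so the uniform vote follows the majority class, while the distance-weighted vote follows the two points sitting right next to the query. That difference is the reason to weigh instances unevenly.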

KNN: the k nearest neighbors to a given point
k: the number of neighbors
n: the number of data points; the data is sorted to speed up the algorithm
distance: Euclidean distance (the shortest line connecting the points) OR a custom function that defines the distance
visualize: scatter plot; each point is a circle with a radius that includes a certain number of neighbors
KNN: a query-intensive algorithm, not a learning-intensive one
performance: Big O notation; log(n) is binary search, 1 is constant, n is linear
intuition: KNN stores all the data, then performs a binary search on the data when querying. Linear regression only stores the model y = mx+b. Key concepts: LEARN vs QUERY

Algorithm | Phase | Running Time | Space
1 NN (1-nearest neighbor, 1-dimensional list, e.g. [1 2 4 7 8 9]) | learning | 1 | n
1 NN | query | log(n) | 1
K NN | learning | 1 | n
K NN | query | log(n) + k | 1
linear regression | learning | n | 1
linear regression | query | 1 | 1

Notes on the table:
1 NN learning: KNN sends all data to storage without learning, so the running time is 1, which means constant in Big O notation, and the storage space is n for the number of data points.
1 NN query: binary search to find one point.
K NN query: binary search, log(n), to find one point, plus the k items next to it in a sorted list.
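
The log(n) + k query cost for the 1-dimensional case can be sketched as below. This is an illustrative helper (k_nearest_sorted is a made-up name), assuming the data is already stored as an ascending list: a binary search finds the insertion point, then the code walks outward to collect k neighbors.

```python
import bisect


def k_nearest_sorted(sorted_values, query, k):
    """Return up to k values closest to `query` from an ascending list."""
    # O(log n): binary search for where the query would be inserted.
    right = bisect.bisect_left(sorted_values, query)
    left = right - 1
    neighbors = []
    # O(k): walk outward, taking whichever side is closer at each step.
    while len(neighbors) < k and (left >= 0 or right < len(sorted_values)):
        take_left = right >= len(sorted_values) or (
            left >= 0
            and query - sorted_values[left] <= sorted_values[right] - query
        )
        if take_left:
            neighbors.append(sorted_values[left])
            left -= 1
        else:
            neighbors.append(sorted_values[right])
            right += 1
    return neighbors


data = [1, 2, 4, 7, 8, 9]  # the sorted example list from the table
print(k_nearest_sorted(data, 5, 1))
print(k_nearest_sorted(data, 5, 3))
```

Ties are broken toward the smaller value here, which is an arbitrary choice. Linear regression, by contrast, would answer a query by evaluating y = mx+b once, with no search at all.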

Monday, February 6, 2017

11 Things You Can Do at the Crunchies 2017

Hot off the press: use your superpowers and unleash your inner extrovert at the Crunchies award ceremony today in San Francisco! Adapted from Medium.

  1. Meet famous founders, tech investors and journalists such as Mark Zuckerberg, Dave McClure and Ron Conway. Kung fu hustle for your budding tech product at the after-party open bar
  2. Take fabulous runway pictures on the green carpet at the tech Oscars
  3. Get advice and mentorship from fellow founders, Y Combinator and 500 Startups alumni, and tech workers
  4. Find inspiration and actionables for your startup
  5. Trend spotting and policy crunching.
  6. Indulge in the sparkle and glamour of tech giants like Kevin of Instagram, Pinterest, Mark Zuckerberg of Facebook, and famous investors like Ron Conway. It’s hard to resist a selfie even if it is only mildly impressive.
  7. Take a selfie. Take lots of selfies.
  8. Chuckle at occasional tech nerd humor squeezed into the ceremony by guest artists, hosts, comedians and even opera singers. Past guest artists include the Daily Show actors.
  9. Drink and hustle. There is a bar with flowing champagne and cocktails that can give you a boost and bring out your inner networking extrovert. Prep dinero, it ain’t no free ride.
  10. This year’s nominees include SpaceX and Pokemon GO! Expect to see Elon Musk and the Niantic studio in the crowd?
  11. Network with reporters from VentureBeat, Mashable, TechCrunch … for your baby startup?

What is the Crunchies award ceremony?

Here’s a whimsical opera intro to Crunchies! Enjoy



In short, it is the Oscars for the tech community, where all the startups, tech giants and inner circles meet. Like HBO’s Silicon Valley show? You will likely really enjoy this award ceremony. But if you are not hustling in tech, this event may be nothing more than a selfie opportunity.

Crunchies’ Past

At past Crunchies, people arrived in style in their fabulous outfits and with their tech gadgets. Travis of Uber had his pampered lap dog with him. People showed off their Google Glass and Teslas. This year, expect Snap to make a splash with their unique personality and Snap Spectacles before their big IPO.
Past guest speakers included Mayor Ed Lee and Ron Conway, “the Godfather of Silicon Valley”. They provided great insights into the future of Silicon Valley.

My Crunchies’ Past

Fun fact: I shared the stage with Mark Zuckerberg, Marissa Mayer and Kevin Systrom of Instagram at the 2012 Crunchies award ceremony. I was at 2 YC startups. Dave McClure of 500 Startups and I follow each other on Twitter :P

Saturday, February 4, 2017

Growth Hacker Tool - Google Mobile Friendliness Online Tester

Building a website? Launching a startup? Make sure Google can find you and wants to find you. One key search metric is mobile responsiveness and friendliness. Use this official Google tool to test your website.
https://search.google.com/search-console/mobile-friendly

Friday, February 3, 2017

Machine Learning 101 Resources, Lessons and Tips

Thinking about getting started with Machine Learning? Silicon Vanity is your go-to resource on the Learn-to-Code movement #learntocode, tutorials and the hottest job trends in Silicon Valley. Educational tech is dear to my heart. Here are some resources to get you started.


  • Udacity Machine Learning Nanodegree.
  • Udacity Machine Learning Nanodegree Udacity Connect Intensive. An in-person, intensive bootcamp version of the full Nanodegree. Github workbooks.
  • Udacity Deep Learning Nanodegree. Created in collaboration with YCombinator startup founder / member Siraj, who has his own Machine Learning Youtube show. 
  • Coursera Machine Learning
  • Stanford Machine Learning course on Youtube
  • Khan Academy Machine Learning
  • Machine Learning competition on Kaggle. Kaggle is a great place for datasets and competitions. Speaking of competitions, Alibaba sponsored a customer flow competition on its Koubei product.
  • Udacity Machine Learning notes, slides, forum, online one-on-one mentoring
  • Books: Machine Learning for Dummies

The above resources are more academic than practical. Udacity has tried to marry practicality and industry requirements with academic coursework. The course is still in the early stages of becoming beginner friendly.

Is there a resource that you would recommend? Please share with me. 
