
Friday, June 12, 2020

Regularization in Machine Learning, Deep Learning

Regularization helps prevent overfitting and can also help an algorithm converge faster and perform better. It is especially useful in deep learning tasks with neural networks. Regularization acts on the loss function (cost function) by adding an extra penalty term. The exact form of the penalty depends on the regularization method, but in each case it penalizes the weight parameters, so it is a function of w.

Two common regularization methods:
  • Lasso 
    • Uses L1-norm
  • Ridge
    • Uses L2-norm
A trick to remember the norm is that letter L comes before letter R, so Lasso is L1 norm and Ridge is L2 norm. 

One of the two is more likely to produce sparse solutions, driving one or more coefficients to exactly zero. Which one do you think it is?

Quiz: which formula is Lasso? Which one is ridge?

  • Regularization penalizes overly complex models
  • Large weights make the penalty term larger, so smaller weights are preferred
    • Larger weights cost more
  • Regularized loss = regular_loss_function + extra_penalty_term(lambda, weights)
    • The extra penalty term depends on the weights and on the lambda parameter, which sets the regularization strength
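
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not any library's implementation: the function names (`l1_penalty`, `l2_penalty`, `regularized_loss`) and the use of mean squared error as the base loss are assumptions made for the example. The L2 penalty is written in its common squared form.

```python
def l1_penalty(weights, lam):
    """Lasso-style penalty: lambda times the sum of absolute weights (L1 norm)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Ridge-style penalty: lambda times the sum of squared weights (squared L2 norm)."""
    return lam * sum(w * w for w in weights)


def mse(y_true, y_pred):
    """Base loss (an assumption for this sketch): mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def regularized_loss(y_true, y_pred, weights, lam, penalty):
    """regular_loss_function + extra_penalty_term(lambda, weights)."""
    return mse(y_true, y_pred) + penalty(weights, lam)


weights = [1.0, -2.0, 3.0]
print(l1_penalty(weights, 0.5))  # 3.0
print(l2_penalty(weights, 0.5))  # 7.0
```

With a perfect fit the base loss is zero, so the total loss is the penalty alone, which makes it easy to see that larger weights cost more.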

Thursday, June 11, 2020

WeChat Basics


  • WeChat requires approval
  • Important to decouple the app from the data API so that each can be changed later without breaking the other
  • Where to host WeChat app? Best hosting best API?
  • WeChat Developer Tools (微信开发者工具)
    • Made up of three parts: a simulator, an editor, and a debugger
  • Developers also need a WeChat account to administer and manage WeChat apps and to log in to the developer tools. The account is used constantly for login, verification, and testing, so it is very important. It also works like two-factor authentication: it is often used to verify your identity before logging in.
  • index.js is the main page to write your code
  • WeChat Games
    • A striking example is the viral game 跳一跳 (Tiao Yi Tiao, "Jump Jump")
  • Performance
    • Compressed images
    • Stored locally in geo locations
  • WeChat Mini Program with AR
    • Example Armani cosmetic app allows users to try makeup
    • Other use cases for WeChat AR include gaming, real estate previews, open houses, hotel room previews, and house shopping
    • Limited API availability, only available for some brands and developers for AR
  • WeChat voice interface
  • WeChat is a platform, not just a messaging app: it also includes e-commerce, games, web browsing, search, and content publishing
  • Features
  • You can create a WeChat test app, previously known as a sandbox
  • To test your app, click Preview and scan the QR code with your iPhone to run the app on your own phone
Write HelloWorld in WeChat Mini Program
When the mini program launches and the index page loads, the page's onLoad function is invoked. In index.js, the code below calls console.log to print Hello World. As you can see, the syntax is plain JavaScript, stored in a .js script file.

Page({
  onLoad() {
    console.log("Hello World!")
  }
})


onLoad is a page lifecycle callback function.



Making WeChat Stickers - Sticker Developer
It's very important to follow the community guidelines and developer policies, including the content requirements for WeChat stickers.

Linear Algebra Review

A matrix is a rectangular grid of numbers.

Points and vectors

||w|| is the length of w, also known as the magnitude of w or its L2 norm.

U(U^T): the product of matrix U and its transpose, a matrix analogue of squaring. The result is always a square, symmetric matrix.
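
The two definitions above can be checked with a small sketch in plain Python. The helper names (`l2_norm`, `transpose`, `matmul`) are made up for this illustration; in practice a library such as NumPy would normally be used instead.

```python
import math


def l2_norm(w):
    """Length (magnitude) of vector w: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in w))


def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product: entry (i, j) is the dot product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


w = [3.0, 4.0]
print(l2_norm(w))  # 5.0

U = [[1, 2, 3],
     [4, 5, 6]]
print(matmul(U, transpose(U)))  # [[14, 32], [32, 77]]
```

Note that U is 2x3 but U(U^T) is 2x2 and symmetric, which holds for any U.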


Sudoku - Technical Interview

Each row can be represented using letters, stored as rows = 'ABCDEFGHI' (note there are 9 letters). Iteration: for r in rows: # do something

Each column can be represented using numbers, stored as cols = '123456789' (also 9 digits).

Use a dot . as the placeholder for an empty cell.
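
One way to put these pieces together is sketched below. It is only an illustration under some assumptions: the helper names `cross` and `grid_values`, and the choice to label each cell by row letter plus column digit (such as 'A1'), are not from the notes above.

```python
rows = "ABCDEFGHI"   # 9 row labels
cols = "123456789"   # 9 column labels


def cross(a, b):
    """All concatenations of one character from a with one character from b."""
    return [x + y for x in a for y in b]


# 81 cell labels in reading order: 'A1', 'A2', ..., 'I9'
boxes = cross(rows, cols)


def grid_values(grid):
    """Map each cell label to its character in an 81-character puzzle string.

    Empty cells are marked with the '.' placeholder.
    """
    if len(grid) != 81:
        raise ValueError("a Sudoku grid string must have exactly 81 characters")
    return dict(zip(boxes, grid))


puzzle = "53..7...." + "." * 72   # first row partly filled, the rest empty
values = grid_values(puzzle)
print(values["A1"], values["A3"], values["A5"])  # 5 . 7
```

Storing the board as a dictionary keyed by label makes it easy to look up a cell's row, column, or 3x3 box peers later.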

Saturday, May 30, 2020

React + React Native Basics in 2020

I am writing new blog posts for technologies every year because they change, they evolve. JavaScript today is nothing like the JavaScript 10 years ago. Today's topic is React and React Native. This is a narrative style cheat sheet of React Basics. It is my hope to organize those concepts in a cohesive way. It will tell you how the concepts are connected, but to find out more, to dive deeper, it is important to seek out external tutorials and resources. 

Post in progress, in construction. Updated daily.

Key Concepts in React:
Declarative programming: the opposite of imperative programming, where we specify step-by-step instructions and implement behaviors in detail. In declarative programming we tell React what we want back, such as a component. HTML is declarative, because we don't need to implement every detail; we just tell the browser to render a <div>.

JSX:
JSX lets you write HTML-like markup inside JavaScript.
const my_html = <h1>Hello World</h1>
It is a combination of JavaScript and HTML, and it is not valid in vanilla JavaScript.

Libraries:
React 
ReactDOM: inserts components into the DOM
Babel: compatibility; converts JSX to JavaScript (browsers understand JavaScript, not JSX)
React Native: build mobile apps using just JavaScript. Write once, deploy anywhere. Supports iOS and Android. Depends on React. The JavaScript is transpiled from ES6, ES7, and ESNext down to ES5, then minified, and multiple JavaScript files are bundled into one big JavaScript bundle (Source: CS50 Harvard). Separate threads handle UI, layout, and JavaScript (the JavaScript side is single-threaded and can get locked up).

Link to React, ReactDOM, and Babel using script tags in the page's <head>.

The basic unit of React is the component. A class component inherits from React.Component and usually contains a render function.

class CappedComponentName extends React.Component {
    render(){
        return <h1>Some HTML Code</h1>
    }
}

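Props: an object of values passed to a component. It looks similar to JSON, but unlike JSON, which is limited to strings, numbers, booleans, null, arrays, and plain objects, props can hold any JavaScript value, including functions.

Arrow function:
The benefit of using an arrow function for an event handler is that this is bound correctly: an arrow function has no this of its own and uses the this of the enclosing scope. Otherwise, this can behave unexpectedly in JavaScript.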

Best practice:

A workaround best practice to ensure compatibility is to compile JSX to JavaScript before deployment. 

What is the difference between props and state? State is something the component itself tracks and modifies. Props are more like initialization or configuration passed in from outside: similar to state, but they change less often and the component does not modify them itself.



Sunday, May 10, 2020

Intro to Data Visualization

REPOST from Medium with permission

Data Visualization in Machine Learning — Beyond the Basics

This is not a tutorial. These are my notes from various Machine Learning articles and tutorials. My personal cheatsheet for interviews and reviews. Any feedback and corrections are welcome. If you’d like to read more, please let me know as well. These notes are more applicable for python users. Does not include ggplot, great for R.

Prerequisites and Dependencies

This tutorial and overview is python based so we use matplotlib.pyplot. These commands can be run in command line and in Python Notebook with just a bit of modifications. Any reference to plt means the function is from the matplotlib library.
import matplotlib.pyplot as plt
# without this import, calls such as plt.bar or plt.scatter fail with NameError: name 'plt' is not defined

Plot a Bar Chart

Bar chart, bin chart: useful for frequency analysis, distributions and counts.
labels = ['A','B','C','D','E','F','G']
nums = [13,24,5,8,7,10,11]
xs = range(len(nums)) # 0, 1, 2, 3, 4, 5, 6
# xs is a conventional variable name for the x-axis positions
plt.bar(xs, nums)
plt.xticks(xs, labels) # label each bar with its category
plt.ylabel("Customize y label")
plt.title("Customize graph title")
plt.show() # display the plot


Don’t be deceived by its simple look. Frequency analysis is very powerful in data EDA, stats and machine learning.

Plot a Histogram

Histogram will automatically divide data into bins.
import matplotlib.pyplot as plt
import pandas as pd
nums = [99, 1, 3, 5, 7,33, 23,684, 13, 3 ,0, 4]
pd.Series(nums).hist(bins=30)
# <matplotlib.axes._subplots.AxesSubplot object at 0x10d340d90>
# returns object in memory
plt.show()


Also useful for visualizing distribution and outliers.

Scatter Plot

How is scatter plot beyond the basics? Scatter plot is extremely intuitive yet powerful. Just plot the vertical coordinate and horizontal coordinate of each data point in the sample to get its scatter plot. If the relationship is non-linear, or there may be the presence of an outlier, these targets will be clearly visible in the scatter plot. In the case of many features i.e. dimensions, a scatterplot matrix can be used.
Below is a screenshot of pandas scatterplot matrix in the official documentation.


Clearly the relationship is not linear. The diagonal is the variable vs itself, so it’s showing a distribution graph instead of scatter plot. Neat, looks like the variable is normally distributed.
Scatterplot is a great first visual. Too many features? Try sampling or generating data subsets before visualizing.
Use pandas.DataFrame.describe() to summarize and describe datasets that are simply too big. This function will generate summary stats.
Scatterplots are useful for pairwise comparison of features.
Scatterplots can go beyond two dimensions. We can use marker size and color to illustrate a 3rd dimension, even a 4th dimension, as in the famous TED talk on economic inequality. The presenter even used a timeline (animation) as the 5th dimension.

Visualizing Error

YouTube deep learning star Siraj shows a 3D visual of the error function while altering the y-intercept (aka bias) and the slope for linear regression. The global optimum, i.e. the global minimum in this case, is the goal of the gradient descent algorithm.
Error functions have shapes and can be visualized. Local optima, which can prevent your model from improving, can potentially be visualized too.


The gradient can be visualized as arrows along the surface of the 3D plot; gradient descent follows the negative gradient, the downhill direction, toward a minimum. It can also be visualized as a field of arrows in a matrix.
Each residual (y_i - y_hat_i) can be visualized as a vertical line connecting the data point with the fitted line in linear regression.
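
To make the residuals and the gradient concrete, here is a minimal sketch of gradient descent on the mean squared error of a straight-line fit. The function names, the learning rate, the step count, and the toy data are all assumptions chosen for illustration; this is not the code from the video mentioned above.

```python
def residuals(xs, ys, slope, intercept):
    """Vertical distances y_i - y_hat_i between each point and the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def mse(xs, ys, slope, intercept):
    """Error surface value at one (slope, intercept) point."""
    r = residuals(xs, ys, slope, intercept)
    return sum(e * e for e in r) / len(r)


def gradient(xs, ys, slope, intercept):
    """Partial derivatives of the MSE with respect to slope and intercept."""
    n = len(xs)
    r = residuals(xs, ys, slope, intercept)
    d_slope = -2.0 / n * sum(e * x for e, x in zip(r, xs))
    d_intercept = -2.0 / n * sum(r)
    return d_slope, d_intercept


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Repeatedly step against the gradient, i.e. downhill on the error surface."""
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        d_slope, d_intercept = gradient(xs, ys, slope, intercept)
        slope -= lr * d_slope
        intercept -= lr * d_intercept
    return slope, intercept


# Toy data lying exactly on y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = gradient_descent(xs, ys)
print(round(slope, 4), round(intercept, 4))  # 2.0 1.0
```

Evaluating `mse` over a grid of slope and intercept values gives the heights needed to draw the 3D error surface.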

Data Scientists Love Box Plots

Why? It displays essential stats about a distribution in a concise visual form. It resembles the candlestick chart popular in finance.
Max, 3rd quartile, median, 1st quartile, min.
This is also known as the box-and-whisker plot. It’s popular among statisticians and is used to visualize range. It can be drawn horizontally.
What’s between Q3 and Q1? The interquartile range (IQR), which is used in analyzing outliers. Values below Q1 - 1.5*IQR are too low; values above Q3 + 1.5*IQR are too high.
A box-and-whisker plot displays outliers as dots!
Check out Boston University’s Blood Pressure dataset box whisker plot with outliers.
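
The 1.5*IQR rule can be sketched with the standard library alone. The function name `iqr_outliers` is made up for this example, and quartiles come from `statistics.quantiles` with its default method; other tools use slightly different quartile conventions, so the fences can differ a little between libraries.

```python
import statistics


def iqr_outliers(data):
    """Return the values that fall outside the 1.5*IQR fences, in input order."""
    q1, _median, q3 = statistics.quantiles(data, n=4)  # three quartile cut points
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Same sample as the histogram example earlier in this post
nums = [99, 1, 3, 5, 7, 33, 23, 684, 13, 3, 0, 4]
print(iqr_outliers(nums))  # [99, 684]
```

These are the points a box-and-whisker plot would draw as individual dots beyond the whiskers.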


Heatmap

Did you say heat map? Heat maps have been in and out of favor. Web analytics tools still use heat maps to track events and clicks on a webpage to identify key screen real estate. Why should we use heat maps for machine learning?
It turns out that generating a heat map of all the feature variables (feature variables as row headers and column headers, and each variable vs itself on the diagonal) is an extremely powerful way to visualize relationships between variables in high-dimensional space.
For example, a correlation matrix with heat map coloring. A covariance matrix with heat map coloring. Even a massive confusion matrix with coloring.
Think less about the traditional use of heat maps and more of color as another dimension that can visually summarize the underlying data.
Correlation Matrix Heat Maps are frequently seen on Kaggle, for exploratory data analysis (EDA).
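
A correlation matrix heat map can be sketched as follows. The helper names (`pearson`, `correlation_matrix`, `plot_heatmap`) and the toy features are assumptions for illustration; on real data, pandas `DataFrame.corr()` would typically compute the matrix. The plotting helper imports matplotlib lazily, so the matrix can be computed even where matplotlib is not installed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Square matrix of pairwise correlations; each diagonal entry is a variable vs itself."""
    names = list(columns)
    return [[pearson(columns[a], columns[b]) for b in names] for a in names]


def plot_heatmap(matrix, names):
    """Color each cell by its correlation, from -1 to 1."""
    import matplotlib.pyplot as plt  # imported here so the math above runs without it
    plt.imshow(matrix, cmap="coolwarm", vmin=-1, vmax=1)
    plt.colorbar(label="correlation")
    plt.xticks(range(len(names)), names)
    plt.yticks(range(len(names)), names)
    plt.show()


# Hypothetical toy features
features = {
    "x": [1, 2, 3, 4],
    "double_x": [2, 4, 6, 8],
    "reversed_x": [4, 3, 2, 1],
}
matrix = correlation_matrix(features)
# plot_heatmap(matrix, list(features))  # uncomment to draw the heat map
```

Fixing the color scale to the range -1 to 1 keeps colors comparable across different datasets.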


More Data Visualization Magic

Did you know that you can visualize decision trees using graphviz? It may output a very large PNG file. Remember that the splits of a decision tree are not always stable, that is, not always consistent over time. Take it with a grain of salt. The benefit of visualizing a decision tree is to understand where and how the machine made its decision splits. Decision tree boundaries can be visualized too, see screenshot below from Sklearn documentation.


Visualizing models, decision boundaries, and prediction results may give hints about whether the model is a good fit or a poor fit for the data. For example, fitting a straight line to a circular scatter of dots ignores the nature of the data and is a case of high bias.
Researchers have even visualized different optimizers to compare their descent as they minimize the loss.
Did you know you can create interactive plots using Plotly right in Jupyter Notebook? Interactive plots allow you to visualize complex data, toggle and change parameters. For example, you can use a slider to change the values of your hyperparameters and visualize how the model performance changes in grid search and other systematic searches of the space.

Wednesday, April 29, 2020

JavaScript Basic 2020

Learn what's new with JavaScript in 2020. It has changed a lot from the JavaScript you know.
  • JavaScript is interpreted as opposed to compiled
    • C is compiled
    • No need to declare variable types
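    • Allows dynamic typing: a variable has no type associated with it until it is filled with a value, and the type can change later. Some languages are statically typed and do not allow this
  • ES6 (ECMAScript 6, also called ECMAScript 2015) is the major modern revision of JavaScript; newer editions have been released yearly since
    • Introduced Symbol
  • In modern JavaScript the semicolon ; is usually optional, thanks to automatic semicolon insertion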
  • JavaScript can check equality using double equal sign ==  , or triple equal sign  ===
    • == coerces the type
    • === requires an exact match of both type and value; it doesn't coerce the type
  • Node, the command line runtime for JavaScript, is built on the V8 engine
  • typeof null --> returns "object", one of those strange behaviors of JavaScript
  • JavaScript development is guided by the ECMAScript standard. ECMA is pronounced Ehk-MA. The E stands for European
  • You can think of it this way: the spec for JavaScript is written by ECMA
  • Each browser can have its own JavaScript engine, for example Chrome uses V8
  • Event Listener
    • Listening or subscribing events such as keydown 
    • The callback that runs is also known as the event handler
  • Modern JavaScript variations include TypeScript, frequently used in Angular
  • npm is the popular package manager for JavaScript
    • Has joined GitHub
    • In other words, both GitHub and npm are now owned by Microsoft
  • Define a constant : const CONSTANT = 0.5
  • Enclose strings in double quotes or single quotes
  • Arrays can contain values as well as functions
    • const arr = ["value1",5, function(){console.log("Hello World")}]
      • Run the function arr[2]()
    • Can contain different types
    • Can access using indexing, starting from position zero
    • use array with for loop
for (let i = 0; i < arr.length; i++) {
    console.log(arr[i])
}

  • JavaScript allows trailing comma
  • Types
    • primitives: 
      • immutable, with no methods of their own (JavaScript wraps them in temporary objects when a method is called)
        • boolean, string, null, number, symbol, undefined
        • Number includes both float and integer, there is no separate type
  • Subtle differences between undefined and null
  • JavaScript string
    • concatenating a string with another type is implicit coercion (type casting); if we use String(variable), that's explicit coercion
  • check the type of an input using typeof, e.g. typeof undefined // --> "undefined", typeof 5 // --> "number"
  • Try out JavaScript interactively using the Chrome browser's inspect element mode, or install Node and open its interactive JavaScript prompt. Either can be used as a JavaScript interpreter
  • In general, a function returns undefined if it does not explicitly return a value
  • JavaScript documentation by Mozilla https://developer.mozilla.org/en-US/docs/Web/javascript
