Saturday, May 30, 2020

React + React Native Basics in 2020

I am writing new blog posts for technologies every year because they change, they evolve. JavaScript today is nothing like the JavaScript 10 years ago. Today's topic is React and React Native. This is a narrative style cheat sheet of React Basics. It is my hope to organize those concepts in a cohesive way. It will tell you how the concepts are connected, but to find out more, to dive deeper, it is important to seek out external tutorials and resources. 

Post in progress, in construction. Updated daily.

Key Concepts in React:
Declarative program : opposite of imperative programming, where we specify step-by-step instruction, implementing behaviors in details. In declarative programming we tell React what we want back, like a component. HTML is a declarative, because we don't need to implement every detail just tell the browser to render a <div>.

JSX : 
React uses JavaScript to write HTML codes using JSX.
const my_html = <h1>Hello World</h1>
It is a combination of JavaScript and HTML, won't validate in vanilla JavaScript. 

ReactDOM : insert components into DOM
Babel : compatibility, helps convert JSX to JavaScript (Browsers can understand Js code, not JSX)
React Native : Build mobile apps using just JavaScript. Write once, deploy any where. Supports iOS and Android. Dependency is React. JavaScript is bundled, transpiled from ES7 ES6 ESNext down to ES5 code. Also minified (Source: CS50 Harvard). Multiple JavaScript files compiled into a one big JavaScript bundle. Separate threads for UI, layout, JavaScript (which is Single Thread and can get locked up). 

Link to React, ReactDOM, and Babel using script tags in the headers.

Basic unit of react organized around components. It inherits from React.component. Usually contains a render function. 

class CappedComponentName extends React.component{
        return <h1>Some HTML Code</h1>

Props : objects that are passed to elements. It looks similar to JSON, but unlike JSON which can only handle strings, props can handle other JavaScript data types. 

Arrow function: 
Benefit of using arrow function, is to handle the event variable and bind this correctly. this object can get kind of funky in js. 

Best practice:

A workaround best practice to ensure compatibility is to compile JSX to JavaScript before deployment. 

What is the different between props and states. States is something the component may want to track and modify. Props is somewhat like initialization, configuration, like states but with fewer changes

Post in progress, in construction. Updated daily.

Sunday, May 10, 2020

Intro to Data Visualization

REPOST from Medium with permission

Data Visualization in Machine Learning — Beyond the Basics

This is not a tutorial. These are my notes from various Machine Learning articles and tutorials. My personal cheatsheet for interviews and reviews. Any feedback and corrections are welcome. If you’d like to read more, please let me know as well. These notes are more applicable for python users. Does not include ggplot, great for R.

Prerequisites and Dependencies

This tutorial and overview is python based so we use matplotlib.pyplot. These commands can be run in command line and in Python Notebook with just a bit of modifications. Any reference to plt means the function is from the matplotlib library.
import matplotlib.pyplot as plt
# will get object does not have bar, scatter.... function_name error # if not imported

Plot a Bar Chart

Bar chart, bin chart: useful for frequency analysis, distributions and counts.
labels = ['A','B','C','D','E','F','G']
nums = [13,24,5,8,7,10,11]
xs = range(len(nums)) #[0, 1, 2, 3, 4, 5, 6]
#xs is a convention variable name for x axis,nums)
plt.ylabel("Customize y label") 
plt.title("Customize graph label") #display the plot

Don’t be deceived by its simple look. Frequency analysis is very powerful in data EDA, stats and machine learning.

Plot a Histogram

Histogram will automatically divide data into bins.
import matplotlib.pyplot as plt
import pandas as pd
nums = [99, 1, 3, 5, 7,33, 23,684, 13, 3 ,0, 4]
# <matplotlib.axes._subplots.AxesSubplot object at 0x10d340d90>
# returns object in memory

Also useful for visualizing distribution and outliers.

Scatter Plot

How is scatter plot beyond the basics? Scatter plot is extremely intuitive yet powerful. Just plot the vertical coordinate and horizontal coordinate of each data point in the sample to get its scatter plot. If the relationship is non-linear, or there may be the presence of an outlier, these targets will be clearly visible in the scatter plot. In the case of many features i.e. dimensions, a scatterplot matrix can be used.
Below is a screenshot of pandas scatterplot matrix in the official documentation.

Clearly the relationship is not linear. The diagonal is the variable vs itself, so it’s showing a distribution graph instead of scatter plot. Neat, looks like the variable is normally distributed.
Scatterplot is a great first visual. Too many features? Try sampling or generating data subsets before visualizing.
Use pandas.DataFrame.describe() to summarize and describe datasets that are simply too big. This function will generate summary stats.
Scatterplots are useful for pairwise comparison of features.
Scatterplots can go beyond two dimensions. We can use marker size and color to illustrate the 3rd dimension, even 4th dimension as in the famous TED talk of economical inequality. The presenter even used timeline (animation) as the 5th dimension.

Visualizing Error

Youtube deep learning star Sraj shows a 3D visual of error function while altering y intercept aka bias and slope for linear regression. The global optima i.e. the global minimum in this case is the goal of gradient descent algorithm.
Error functions have shapes and can be visualized. Local optima which prevents your model from improving can potentially be visualized.

Gradient can be visualize as directional arrows that travel in the direction of the global minima along the shape of the 3D plot. It can also be visualized as a field of arrows in a matrix.
Each residual (y_i — y_hat) can be visualize as a vertical line connecting the data point with the fitted line in linear regression.

Data Scientists Love Box Plots

Why? It displays essential stats about distribution in a concise visual form. Aka candle stick plot. Also popular in finance.
Max, 3rd Quartile, Median, 1st Quartile, min.
This is known as the box and whisker graph too. It’s popular among statisticians. Used to visualize range. It can be drawn horizontally.
What’s between Q3 and Q1? The interquartile range, which used in analyzing outliers. Q1–1.5*IQR is too low, Q3+1.5*IQR is too high.
Box whisker plot displays outliers as a dot!
Check out Boston University’s Blood Pressure dataset box whisker plot with outliers.


Did you say heat map? Heat map has been in and out of favor. Web analytics still use heat map to track events and clicks on a webpage to identify key screen real estates. Why should we use heat map for machine learning?
It turns out that generating a heat map of all the feature variables — feature variables as row headers and column headers, and the variable vs itself on the diagonal— is extremely powerful way to visualize relationships between variables in high dimensional space.
For example, a correlation matrix with heat map coloring. A covariance matrix with heat map coloring. Even a massive confusion matrix with coloring.
Think less about the traditional use of heat map, but more like color is another dimension that can visually summarize the underlining data.
Correlation Matrix Heat Maps are frequently seen on Kaggle, for exploratory data analysis (EDA).

More Data Visualization Magic

Did you know that you can visualize decision trees using graphviz. It may output a very large PNG file. Remember the split of decision tree is not always stable — consistent over time. Take it with a grain of salt. The benefit of visualizing a decision tree is to understand where and how machines made decision splits. Decision tree boundaries can be visualized too, see screenshot below from Sklearn documentation.

Visualizing models, decision boundaries and prediction results may give hints whether the model is indeed a good fit or it is a poor fit for the data. For example, it is high bias to ignore the nature of our data if use a straight line to fit a circular scatter of dots.
Researchers even visualized different optimizers to see their descend to minimize loss.
Did you know you can create interactive plots using Plotly right in Jupyter Notebook? Interactive plots allow you to visualize complex data, toggle and change parameters. For example you can slide to change values of your hyperparameters and visualize how the model performance change in gridsearch and other systematic search of the space.

Regularization in Machine Learning, Deep Learning

Regularization can prevent overfitting and potentially make algorithm converge faster and more performant. Useful in deep learning tasks, in...