
Friday, September 25, 2020

Firebase Tutorial - Cool things you can do with Firebase 2020

Introduction to Firebase

You control your Firebase projects from the Firebase console.

The console manages one or many projects.

In Cloud Firestore, the data in each project is organized into collections, which group related documents. Read the Firebase best-practices article for a better understanding of Firebase projects.

You can add apps to a Firebase project.


What is Firebase: an online, real-time database in the cloud, backed by Google and integrated into Google Cloud.

Firebase works on all major platforms.

It provides an iOS SDK, an Android SDK, and a Web SDK.

The full list of languages and libraries is in its documentation: https://firebase.google.com/docs/database/rest/start

Language: Libraries
Clojure: taika by Cloudfuji
Dart: IO Client in the official firebase-dart library
Go: Firego by Steven Berlanga and Tim Gossett; Go Firebase by Cosmin Nicolaescu and Justin Tulloss
Java: firebase4j by Brandon Gresham
Perl: Firebase-Perl by Kiran Kumar and JT Smith
PHP: firebase-php by kreait; firebase-php by Tamas Kalman
Python: Pyrebase by James Childs-Maidment; python-firebase by Özgür Vatansever; python-firebase by Michael Huynh
Ruby: firebase-ruby by Oscar Del Ben; BigBertha by Fernand Galiana; rest-firebase by Codementor


There is even an Admin SDK for administrative access.

Firebase and Google Cloud

You can add Firebase to an existing GCP project.

Use the Firebase API to manage a Firebase project programmatically.

Firebase works with Google Analytics and other Google Cloud offerings.
Firebase Hosting is also available.

Implement Search Functionality in Firebase Firestore
Use the query function to do SQL-like filtering.
However, Firestore does not natively support full-text search. Instead, it recommends Algolia, a search-as-a-service platform. Algolia requires you to upload all your data to its servers to make search possible. Although the terms of service seem to say Algolia cannot compete with you, you still have to expose your data to a third party, which is a bit of a tie-down. On the plus side, Algolia provides a search UI and one-click demo generation, so you can build a prototype fast.

To use Firebase, you first need to initialize the Firebase instance using a config variable, where the tokens and API keys are stored.
You can get this config variable from the Firebase console in your web browser: select the add Firebase to web app option.

Cost Saving

Every Firebase project corresponds to a GCP project. Be sure to set quotas, limits, and billing alerts on the GCP project.

To Use Firebase Functions, First Install Firebase Tools in the Command Line

First check that Node is installed before proceeding.

npm install -g firebase-tools

You will also want to log in to Firebase from the command line: firebase login
Initialize the project with: firebase init
Deploy with: firebase deploy
During firebase init, select Functions to set up a Firebase Functions project.

When initializing Firebase Functions you will be asked to choose a few configuration options. For example, you will pick the language, JavaScript or TypeScript, and different dependencies, modules, and lint settings will be installed accordingly. Each has pros and cons. Strongly typed TypeScript may make debugging and testing easier, but it has a learning curve.



Friday, September 18, 2020

Algolia Search API Basics Tutorial

I write full time now. Write to hi@uniqtech.co to say hi, request content, or be notified of new tutorials like this. Uniqtech writes about all the cool technologies.

  • In short, Algolia is a search engine offered as SaaS (software as a service)
  • Tables and records in the database 
  • There is a search bar that lets you test the Algolia service right away, and it is easy. 
  • The search results are highlighted (yay, no code)
  • It has indices and important attributes
  • To display an image, include a URL to the image, hosted for example on Google Cloud
  • Note you must upload your data to Algolia for this to work
  • In the terms of use it seems Algolia has a non-compete clause, which theoretically prevents them from making a clone of your app
  • I have not found a way to white label it
  • The price is reasonable and based on requests. 
  • It is super easy to use: build a demo site and share it with users! But it has the Algolia label everywhere. You will be doing free advertising for them. 
  • Remember not to put sensitive data here, especially not HIPAA-sensitive data. You will have to upload your data to Algolia for the search to work. 
  • You can upload your data using a JSON file or using the dashboard. One is fast, one is easy to use. Remember to use JSONLint to check for mistakes. 
  • Using a linter will save you time. 
  • Remember your data is not private to any online JSON linter you use.
  • Pricing : a hacker plan is available, free within certain quotas. Great for testing and hackathons. 
  • Example UI is available, again not white labeled.
  • Example code is available. Again, the logo appears in the input box :(
  • Easy configuration. 
  • You can get to search in minutes! That's easy. That's a true SaaS
  • You can easily configure auto-completion. 
  • The search is typo resistant. But when the database was small, auto-completion and typo handling didn't work for me. 

Wednesday, September 2, 2020

Getting started with Scikit Learn Machine Learning API - read documentat...





In this video, posted with permission, we talk about the code pattern to get started with Scikit Learn and how to read the sklearn documentation.

Thursday, July 23, 2020

AutoML Automatic Machine Learning landscape

  • Features: here are some features that may be offered
    • Automatic visualization: Automatically generate visualizations
    • Machine learning interpretability (possibly advanced feature)
      • Explain model results
        • K-LIME, Shapley, Variable Importance, Decision Tree, Partial Dependence (H2O example)
    • Model Deployment and Operations
      • Automatic deployment
      • Create REST API endpoint

H2O Driverless AI
  •  Recipes available

Friday, June 12, 2020

Regularization in Machine Learning, Deep Learning

Regularization can prevent overfitting and can potentially make an algorithm converge faster and perform better. It is useful in deep learning tasks and in neural networks. Regularization acts on the loss function (cost function) by adding an extra penalty term. The form of the penalty term depends on the regularization method; it penalizes the weight parameters, so it is a function of w.

Two common regularization methods:
  • Lasso 
    • Uses L1-norm
  • Ridge
    • Uses L2-norm
A trick to remember the norm is that letter L comes before letter R, so Lasso is L1 norm and Ridge is L2 norm. 

One of them is more likely to produce sparse solutions, turning one or more coefficients to zero. Which one do you think it is? 

Quiz: which formula is Lasso? Which one is ridge?

  • Regularization penalizes overly complex models
  • Large weights usually make the penalty term higher, so smaller effective weights are preferred
    • Larger weights cost more
  • Regularized loss = regular_loss_function + extra_penalty_term(lambda, weights)
    • The extra penalty term depends on the weight parameters and on lambda, the regularization strength
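
The formula in the list above can be sketched in plain Python. This is only an illustration, not any library's implementation: the function names (l1_penalty, l2_penalty, regularized_loss), the argument name lam (standing in for lambda), and the sample numbers are all made up for this example.

```python
def l1_penalty(weights, lam):
    """Lasso-style penalty: lambda times the sum of absolute weights (L1 norm)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Ridge-style penalty: lambda times the sum of squared weights (squared L2 norm)."""
    return lam * sum(w * w for w in weights)


def regularized_loss(base_loss, weights, lam, penalty):
    """Regular loss plus the extra penalty term."""
    return base_loss + penalty(weights, lam)


# Hypothetical numbers: same base loss, but the second model has 10x larger weights.
weights = [0.5, -2.0, 3.0]
small = regularized_loss(1.0, weights, 0.1, l2_penalty)
large = regularized_loss(1.0, [10 * w for w in weights], 0.1, l2_penalty)
print(small, large)  # the model with larger weights pays a larger penalty
```

Passing the penalty as a function keeps the "regular loss + penalty" structure visible and makes it easy to swap L1 for L2.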

Thursday, June 11, 2020

WeChat Basics


  • WeChat requires approval
  • Important to decouple the app from the data API so we can change more later
  • Where to host a WeChat app? Best hosting, best API?
  • WeChat Developer Tools (微信开发者工具)
    • Made up of three parts: simulator, editor, and debugger
  • Developers also need a WeChat account to administer and manage WeChat apps and to log in to the developer tools. It is used constantly for login, verification, and testing, so it is very important. It also works like two-factor authentication: it is often used to verify your identity before logging in. 
  • index.js is the main page to write your code
  • WeChat Games
    • Amazing example viral game 跳一跳
  • Performance
    • Compressed images
    • Stored locally in geo locations
  • WeChat Mini Program with AR
    • Example Armani cosmetic app allows users to try makeup
    • Other use cases for WeChat AR including gaming, real estate house previewing, open house, hotel room previewing, house shopping
    • Limited API availability, only available for some brands and developers for AR
  • WeChat voice interface
  • WeChat is a platform, not just a messaging app; it also includes e-commerce, games, web browsing, search, and content publishing
  • Features
  • Can create a WeChat test app, previously known as a sandbox
  • To test your app, click Preview and scan the QR code with your iPhone to run it on your own phone
Write HelloWorld in WeChat Mini Program
When the WeChat mini program launches, the onLoad function is invoked. In index.js, this code logs Hello World to the console. As you can see, the syntax is very similar to JavaScript and the code is stored in a .js script file. 

// index.js: register the page with WeChat
Page({
  // onLoad runs when the page loads
  onLoad() {
    console.log("Hello World!")
  }
})

onLoad is a lifecycle callback function.



Making WeChat Stickers - Sticker Developer
It's very important to follow community guideline and developer policies. Here's the content requirement for WeChat

Linear Algebra Review

A matrix is a grid of numbers.

Points and vectors

||w|| is the length of w, also known as the magnitude of w or its L2 norm.

U(U^T): the "square" of a matrix U is U multiplied by the transpose of U.
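
As a quick check of the two ideas above, here is a small pure-Python sketch. The helper names (l2_norm, transpose, matmul) and the sample vector and matrix are invented for this example; in practice a library such as NumPy would usually do this work.

```python
import math


def l2_norm(w):
    """Length (magnitude) of vector w: square root of the sum of squares."""
    return math.sqrt(sum(x * x for x in w))


def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply matrix a by matrix b (both lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


w = [3.0, 4.0]
print(l2_norm(w))  # 5.0, the classic 3-4-5 triangle

U = [[1, 2], [3, 4]]
print(matmul(U, transpose(U)))  # [[5, 11], [11, 25]]
```

Note that U(U^T) always comes out symmetric, which is visible in the result.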


Sudoku - Technical Interview

Each row can be represented by a letter, stored as rows = 'ABCDEFGHI' (note there are 9 letters). Iteration: for r in rows: # do something

Each column can be represented by a digit, stored as cols = '123456789' (also 9 digits). 

Use a dot . as the placeholder for an empty cell.   
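
A minimal sketch of how the row letters, column digits, and dot placeholder fit together. The helper names (cross, grid_to_dict) and the sample puzzle string are assumptions made for this example, not part of any standard solution.

```python
rows = 'ABCDEFGHI'
cols = '123456789'


def cross(a, b):
    """Every combination of one character from a followed by one from b."""
    return [r + c for r in a for c in b]


boxes = cross(rows, cols)  # 'A1', 'A2', ..., 'I9'


def grid_to_dict(grid):
    """Map each box label to its character in an 81-character grid string."""
    if len(grid) != 81:
        raise ValueError("a Sudoku grid needs exactly 81 characters")
    return dict(zip(boxes, grid))


# Hypothetical puzzle: only part of the first row is filled in, the rest are dots.
puzzle = '53..7....' + '.' * 72
values = grid_to_dict(puzzle)
print(len(boxes), boxes[:3], values['A1'], values['A3'])  # 81 ['A1', 'A2', 'A3'] 5 .
```

Labeling boxes as letter plus digit makes it easy to talk about a cell ("A3") in an interview without juggling two integer indices.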

Saturday, May 30, 2020

React + React Native Basics in 2020

I am writing new blog posts for technologies every year because they change, they evolve. JavaScript today is nothing like the JavaScript 10 years ago. Today's topic is React and React Native. This is a narrative style cheat sheet of React Basics. It is my hope to organize those concepts in a cohesive way. It will tell you how the concepts are connected, but to find out more, to dive deeper, it is important to seek out external tutorials and resources. 

Post in progress, in construction. Updated daily.

Getting Started
Try create-react-app.
It provides basic seeding and scaffolding.
All the JavaScript files are in the src folder.

It is normal to see a node_modules folder in all Node and React apps. That's where the packages are installed. 

Basic Node commands
Start the app from the command line; this launches the placeholder app at localhost:3000
$ npm start

Update npm to the latest version
$ npm install -g npm@latest

Check versions
node -v
npm -v


Key Concepts in React:
Declarative programming : the opposite of imperative programming, where we specify step-by-step instructions and implement behaviors in detail. In declarative programming we tell React what we want back, such as a component. HTML is declarative, because we don't need to implement every detail; we just tell the browser to render a <div>.

JSX : 
JSX mixes HTML and JavaScript together.
React uses JSX to write HTML-like markup inside JavaScript.
const my_html = <h1>Hello World</h1>
Because it combines JavaScript and HTML, it won't validate as vanilla JavaScript. 

Components:
React passes props to a component automatically, so inside the component we can refer to the props we want to use. 

A functional component is good for a simple, quick piece of UI. 

Component code is organized into the components folder in the src subdirectory. 

To be frank, a component is a block of reusable code. 

Libraries:
React 
ReactDOM : insert components into DOM
Babel : compatibility, helps convert JSX to JavaScript (Browsers can understand Js code, not JSX)
React Native : Build mobile apps using just JavaScript. Write once, deploy anywhere. Supports iOS and Android. It depends on React. JavaScript is bundled and transpiled from ES7, ES6, and ESNext down to ES5 code, and it is also minified (Source: CS50 Harvard). Multiple JavaScript files are compiled into one big JavaScript bundle. There are separate threads for UI, layout, and JavaScript (which is single-threaded and can get locked up). The prerequisite for learning React Native is knowing core React concepts.

Link to React, ReactDOM, and Babel using script tags in the header.

The basic unit of React is the component. A class component inherits from React.Component and usually contains a render function. 

class CappedComponentName extends React.Component {
    // render returns the JSX this component displays
    render(){
        return <h1>Some HTML Code</h1>
    }
}

Props : objects that are passed to elements. They look similar to JSON, but unlike JSON, which is limited to strings, numbers, booleans, null, arrays, and plain objects, props can carry any JavaScript value, including functions. 
React uses setState to change state; props are read-only inside the component.

Arrow function: 
The benefit of using an arrow function for an event handler is that it receives the event and binds this correctly. The this object can get kind of funky in JS. 

Best practice:

A best practice to ensure compatibility is to compile JSX to JavaScript before deployment. 

What is the difference between props and state? State is something the component may want to track and modify. Props are more like initialization or configuration: similar to state, but they change less often.

Use case:
    React can convert JSON data quickly to HTML
    React is really good for repetitive HTML: generating many HTML templates that vary with data
    React components make coding easy
    React helps developers handle state
    React helps an HTML page display data
    
Performance : 
    The virtual DOM makes React performant: it computes the delta and re-renders only what changed, which means less expensive DOM rendering, refreshing, and layout changes as data changes.
    The virtual DOM manages delta updates by monitoring changes and working out what needs a re-render, then re-rendering partially instead of regenerating the DOM from scratch, which is costly.

In general, front-end development frameworks help reduce DOM manipulation work and improve code quality. 



Components:
React components are independent, easy to maintain, and reusable. They describe how the UI should look based on state and properties.


State:
In the constructor we define the state of the component: what the component needs to keep track of.
this.state = {} holds property-value pairs. State can be thought of as a configuration.

State is dynamic
State contains data that can change, in contrast with props. 

Class components hold state in this.state. Since React 16.8, functional components can also hold state through the useState hook. 

Initial state vs changes of state

Styling React  | Make React App Pretty
Use React UI or Material UI.
React UI includes cards, breadcrumbs, trees, accordions, and more.

Render function
Related concept : Virtual DOM
ReactDOM.render takes 2 arguments : the element to render and the target DOM node


-----

If data changes during an eligible event, it can trigger changes in all the components that use the data. 

----
React modules
import React from 'react'
That's the core React module. There are also separate modules for the web (react-dom) and for mobile (react-native). It used to be one big package. 
There is even a module for VR. 
----

Files

App.js 
The main top-level component, 
a wrapper for the rest of the project. It points to other component code. 

App.css 
The main CSS file. 
You can organize CSS for other components as 
one CSS file per component. 

App.test.js 
covers testing
index.css --> CSS for index.html

.gitignore
is good for ignoring sensitive information and large unnecessary files such as node_modules or data files. 

package.json
Not specific to React;
 common in all Node projects.
 It describes the Node project and contains information about the app, such as the author and dependencies like react-dom and react-scripts (used by create-react-app).

The npm install command will install these dependencies.

---
Best practice: 

When removing a .js or .css file be sure to also remove any related import statements.

A service worker is used for push notifications.

Sunday, May 10, 2020

Intro to Data Visualization

REPOST from Medium with permission

Data Visualization in Machine Learning — Beyond the Basics

This is not a tutorial. These are my notes from various Machine Learning articles and tutorials. My personal cheatsheet for interviews and reviews. Any feedback and corrections are welcome. If you’d like to read more, please let me know as well. These notes are more applicable for python users. Does not include ggplot, great for R.

Prerequisites and Dependencies

This tutorial and overview is Python based, so we use matplotlib.pyplot. These commands can be run on the command line and in a Python notebook with only minor modifications. Any reference to plt means the function is from the matplotlib library.
import matplotlib.pyplot as plt
# without this import, plt is undefined and calls such as plt.bar or plt.scatter will fail

Plot a Bar Chart

Bar chart, bin chart: useful for frequency analysis, distributions and counts.
labels = ['A','B','C','D','E','F','G']
nums = [13,24,5,8,7,10,11]
xs = range(len(nums)) # positions 0..6 on the x axis
# xs is a conventional variable name for x-axis values
plt.bar(xs, nums)
plt.xticks(xs, labels) # show the category labels under the bars
plt.ylabel("Customize y label") 
plt.title("Customize graph title")
plt.show() # display the plot


Don’t be deceived by its simple look. Frequency analysis is very powerful in data EDA, stats and machine learning.

Plot a Histogram

Histogram will automatically divide data into bins.
import matplotlib.pyplot as plt
import pandas as pd
nums = [99, 1, 3, 5, 7,33, 23,684, 13, 3 ,0, 4]
pd.Series(nums).hist(bins=30)
# <matplotlib.axes._subplots.AxesSubplot object at 0x10d340d90>
# returns object in memory
plt.show()


Also useful for visualizing distribution and outliers.

Scatter Plot

How is a scatter plot beyond the basics? A scatter plot is extremely intuitive yet powerful. Just plot the horizontal and vertical coordinates of each data point in the sample. If the relationship is non-linear, or an outlier is present, it will be clearly visible in the scatter plot. When there are many features, i.e. dimensions, a scatterplot matrix can be used.
Below is a screenshot of pandas scatterplot matrix in the official documentation.


Clearly the relationship is not linear. The diagonal is the variable vs itself, so it’s showing a distribution graph instead of scatter plot. Neat, looks like the variable is normally distributed.
Scatterplot is a great first visual. Too many features? Try sampling or generating data subsets before visualizing.
Use pandas.DataFrame.describe() to summarize and describe datasets that are simply too big. This function will generate summary stats.
Scatterplots are useful for pairwise comparison of features.
Scatterplots can go beyond two dimensions. We can use marker size and color to illustrate a 3rd and even a 4th dimension, as in the famous TED talk on economic inequality. The presenter even used a timeline (animation) as the 5th dimension.

Visualizing Error

YouTube deep learning star Siraj shows a 3D visual of the error function while altering the y-intercept (aka bias) and the slope for linear regression. Reaching the global optimum, i.e. the global minimum in this case, is the goal of the gradient descent algorithm.
Error functions have shapes and can be visualized. Local optima, which prevent your model from improving, can potentially be visualized.


Gradient descent steps can be visualized as directional arrows that travel toward the global minimum along the shape of the 3D plot. The gradient can also be visualized as a field of arrows in a matrix.
Each residual (y_i - y_hat_i) can be visualized as a vertical line connecting the data point with the fitted line in linear regression.
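
To make the residual and the error surface concrete, here is a small pure-Python sketch. The function names (residuals, mean_squared_error) and the toy data are assumptions made for this example; the data was generated from y = 2x + 1 so the best slope is known in advance.

```python
def residuals(xs, ys, slope, intercept):
    """Vertical distance from each data point to the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def mean_squared_error(xs, ys, slope, intercept):
    """Average of the squared residuals: one point on the error surface."""
    res = residuals(xs, ys, slope, intercept)
    return sum(r * r for r in res) / len(res)


xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # toy data from y = 2x + 1

# Scan a few slopes with the intercept fixed at 1: the error dips to zero at slope 2.
for slope in (1.0, 2.0, 3.0):
    print(slope, mean_squared_error(xs, ys, slope, 1.0))
# 1.0 3.5
# 2.0 0.0
# 3.0 3.5
```

Repeating this scan over both slope and intercept gives the grid of error values that a 3D surface plot would draw.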

Data Scientists Love Box Plots

Why? A box plot displays essential stats about a distribution in a concise visual form. It resembles the candlestick chart popular in finance.
Max, 3rd quartile, median, 1st quartile, min.
It is also known as the box-and-whisker plot. It's popular among statisticians, is used to visualize range, and can be drawn horizontally.
What's between Q3 and Q1? The interquartile range (IQR), which is used in analyzing outliers: values below Q1 - 1.5*IQR are too low, and values above Q3 + 1.5*IQR are too high.
A box-and-whisker plot displays outliers as dots!
Check out Boston University’s Blood Pressure dataset box whisker plot with outliers.
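
The 1.5*IQR rule can be sketched with the standard library. The function name iqr_outliers is made up for this example, the sample list is the same one used in the histogram snippet earlier, and the 'inclusive' quantile method is just one of several conventions; plotting libraries may compute quartiles slightly differently.

```python
import statistics


def iqr_outliers(data):
    """Return the values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR, plus the bounds."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method='inclusive')
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high], (low, high)


nums = [99, 1, 3, 5, 7, 33, 23, 684, 13, 3, 0, 4]
outliers, bounds = iqr_outliers(nums)
print(outliers)  # [99, 684]
print(bounds)    # the lower and upper fences
```

These are the points a box plot would draw as individual dots beyond the whiskers.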


Heatmap

Did you say heat map? Heat maps have been in and out of favor. Web analytics still uses heat maps to track events and clicks on a webpage to identify key screen real estate. Why should we use heat maps for machine learning?
It turns out that a heat map of all the feature variables (feature variables as row headers and column headers, with each variable vs itself on the diagonal) is an extremely powerful way to visualize relationships between variables in high-dimensional space.
For example, a correlation matrix with heat map coloring. A covariance matrix with heat map coloring. Even a massive confusion matrix with coloring.
Think less about the traditional use of heat maps and more about color as another dimension that can visually summarize the underlying data.
Correlation matrix heat maps are frequently seen on Kaggle for exploratory data analysis (EDA).


More Data Visualization Magic

Did you know that you can visualize decision trees using graphviz? It may output a very large PNG file. Remember that the splits of a decision tree are not always stable, i.e. consistent over time, so take them with a grain of salt. The benefit of visualizing a decision tree is understanding where and how the machine made decision splits. Decision tree boundaries can be visualized too; see the screenshot below from the Sklearn documentation.


Visualizing models, decision boundaries, and prediction results may hint at whether the model is a good fit or a poor fit for the data. For example, fitting a straight line to a circular scatter of dots is high bias: it ignores the nature of our data.
Researchers have even visualized different optimizers to see their descent as they minimize loss.
Did you know you can create interactive plots using Plotly right in Jupyter Notebook? Interactive plots allow you to visualize complex data and to toggle and change parameters. For example, you can slide to change the values of your hyperparameters and visualize how model performance changes in grid search and other systematic searches of the space.

Wednesday, April 29, 2020

JavaScript Basic 2020

Learn what's new with JavaScript in 2020. It has changed a lot from the JavaScript you know.
  • JavaScript is interpreted as opposed to compiled
    • C is compiled
    • No need to declare variable types
    • Allows dynamic typing : a variable has no type associated with it until it is given a value, and the type can change later. Some languages do not allow this 
  • ES6, full name ECMAScript 6 (also called ECMAScript 2015), is the major modern revision of JavaScript; newer editions have been released yearly since
    • Symbol
  • In modern JavaScript the semicolon ; is usually optional
  • JavaScript can check equality using the double equal sign == or the triple equal sign ===
    • == coerces the type
    • === requires an exact match and doesn't coerce the type 
  • Node, the command-line runtime for JavaScript, is built on the V8 engine
  • typeof null --> returns "object", one of those strange behaviors of JavaScript
  • JavaScript development is guided by the ECMAScript standard. ECMA is pronounced Ehk-MA. The E stands for European
  • You can think : the spec for JavaScript is written by ECMA
  • Each browser can have its own JavaScript engine, for example Chrome uses V8
  • Event Listener
    • Listening or subscribing events such as keydown 
    • aka the event handler
  • Modern JavaScript variations include TypeScript, frequently used in Angular 
  • npm is the popular package manager for JavaScript
    • Has joined GitHub
    • In other words, both GitHub and npm are now owned by Microsoft
  • Define a constant : const CONSTANT = 0.5
  • Enclose strings in double quotes or single quotes
  • Arrays can contain values as well as functions
    • const arr = ["value1",5, function(){console.log("Hello World")}]
      • Run the function arr[2]()
    • Can contain different types
    • Can access using indexing, starting from position zero
    • use an array with a for loop
for (let i = 0; i < arr.length; i++) {
    console.log(arr[i])
}

  • JavaScript allows trailing comma
  • Types
    • primitives: 
      • no methods of their own (wrapper objects supply them), immutable
        • boolean, string, null, number, symbol, undefined
        • Number includes both float and integer; there is no separate type
  • Subtle differences between undefined and null
  • JavaScript string
    • concatenating a string with a non-string is implicit coercion (type casting); calling String(variable) is explicit coercion
  • check the type of an input using typeof, e.g. typeof undefined // --> "undefined", typeof 5 // --> "number"
  • Try out JavaScript interactively using the Chrome browser's inspect element console, or install Node and start its interactive JavaScript prompt. Either can serve as a JavaScript interpreter
  • In general, undefined is returned if nothing specific is returned
  • JavaScript documentation by Mozilla https://developer.mozilla.org/en-US/docs/Web/javascript

Friday, April 3, 2020

Bootstrap Basics


  • Bootstrap is a front end framework used to quickly design, organize and beautify a modern website. It generates css fast for common front end patterns and UI elements
  • Horizontal containers are called rows (class row)
  • Vertical containers are called cols (class col), short for columns, and are used in designing a grid system
  • Bootstrap lets you focus more on the HTML file than on the CSS file, and write a bit less CSS
  • And there is no need to reinvent the wheel by writing common UIs and interactions from scratch
  • Concept : Bootstrap requires using specific class names to generate the desired design 
  • Pro tip: use margin utilities to organize layout. For example, margin-left auto pushes things all the way to the right, and mt-3 means margin top 3
  • Pro tip : use chrome inspector on Bootstrap sample and tutorial page to see what class, and configurations are used.

Grid system

Bootstrap organizes html contents into grids. Each row of the grid is called a row, each column a column. Each row has 12 columns. 

Make the website responsive

Use media queries to query screen size and type. Specify the content attribute of the viewport meta tag to control how the page is sized on the device. 

The viewport is the visible area. Be sure to use the actual width of the phone, so the site is not rendered as the desktop version on a mobile device. 

<meta name="viewport" content="width=device-width, initial-scale=1.0">

Bootstrap can detect screen size and label it as lg for large, sm for small. If we use the lg and sm parameter in the class of the html element, we can specify how much space a grid column will take if the screen is small versus large (based on screen size).

<div class="row">
<div class="col-lg-3 col-sm-6"> This is a section.
</div>

<div class="col-lg-3 col-sm-6"> This is another section.
</div>
<div class="col-lg-3 col-sm-6"> This is a third section.
</div>
<div class="col-lg-3 col-sm-6"> This is a fourth section.
</div>
</div>

CSS review

Pseudo class : 

selector:pseudo-class {
  property: value;
}

w3schools

example 
a:link {
  color: #FF0000;
}

but more importantly in modern CSS
::after
::before
are two important pseudo-elements

p::after { 
  content: " - add a foot note";
}


Friday, March 27, 2020

My Little Green Book of Machine Learning and Deep Learning, Artificial Intelligence

Data pre-processing

Turn Complex Data into Numbers

Turn data into features. Turn data into feature vectors. Machine Learning models can only take numeric data. All input data must be represented numerically. For example, words need to be converted to word embeddings in some Natural Language Processing tasks. 

Training vs Inference Models

There are two major tasks in machine learning: 1. build and train a model, 2. deploy the model for inference. Part 1 takes known data and uses it to tune the parameters of the model, such as weights. Part 2 takes in unknown data (real-world data or test data) and calls a .predict method on the new data. 

Normalization, Scaling Data

Normalizing data means scaling it so that it is bounded. For example, in machine learning an error term can be arbitrarily large because the model can be arbitrarily bad, so the error of f(x) = wx+b essentially has no upper bound. Bounding the error term by scaling the numeric values of the features can make the result easier to compute and make the search space easier for gradient descent.
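
One common way to bound feature values is min-max scaling, sketched below in plain Python. The function name min_max_scale and the sample values are assumptions for this example; other scaling schemes (such as standardization) exist and may suit a given model better.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale; return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_scale([10, 20, 40]))  # first value becomes 0.0, last becomes 1.0
```

After scaling, every feature lives in the same [0, 1] range, so no single feature dominates the updates to w just because of its units.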

Bias Variance Tradeoff

High bias may refer to underfitting, where the model is too simple to make accurate predictions. It can also mean the model is practically ignoring the data.

High variance may refer to overfitting. That's when the model fits the training data too closely and hence cannot generalize well to future data. 

Tuesday, March 24, 2020

Natural Language Processing (NLP) 2020

It is the year 2020, a year for 20/20 vision. It is time to do another survey article on the Natural Language Processing (NLP) field. What is there to learn / know? What is new?

Getting started with NLP

Great sources for NLP

Social media : Twitter, Facebook, comments, posts, forums
Transcripts : events, conferences, speech, call transcripts, zoom transcripts
Smart voice assistants : Alexa, Siri, Google Home

Libraries for NLP

spaCy: supports up to 45 languages, including Japanese, Chinese, and English (Source 4)
Scikit-Learn (sklearn): TF-IDF vectorizer

Advanced NLP

Machine Translation
Transfer learning in NLP

Algorithms

Latent Dirichlet Allocation (LDA) topic modeling

Projects

Sentiment analysis
Fake news classifier
Trump tweet maker

You can combine sentiment analysis with tweet analysis. A great Natural Language Processing (NLP) project for a hackathon: retrieve keywords from tweets (entity recognition for hashtags and brand names), pipe the result into a sentiment analysis model, and predict sentiment from negative to positive, zero stars to five stars. 

Monday, March 23, 2020

Growth Marketing - Technical Marketing


  • Hiring people can be a pitch for your app, startup 
  • How to use freemium monetization on content: blur out important infographic images in the newsletter to create a digital tease that motivates users to leave the newsletter and head to the content page 
  • Growth hack vanity address
  • Embed surveys in emails. 
  • Use AMP for interactive emails. It is easy to fix typos and swap content if the email is generated dynamically. 
  • When working with a growth manager, get to know her style and her vision. 
  • Turn super users into community managers, staff, and employees
  • Unfortunately, on YouTube drama drives clicks. Dramatic YouTubers seem to grab attention. There was one joke: an Uber hits a pothole, and the YouTuber describes it as "Gosh, I almost died today."
  • Startups are all about growth. Funded startups are definitely about growth. Even bootstrapped startups need to think about growth, large scale growth asap. 
  • Youtube
    • Very important to have attention grabbing thumbnails

Advanced Python


  • Installation
    • import numpy as np
  • Request is deprecated
  • Python 2.x is deprecated: it reached end of life on January 1, 2020 (Python 2.7 is usually pre-installed on Macs that are 3-7 years old as of 2020). 
  • BeautifulSoup
  • Pyecharts
  • Scrapy
  • Pycharm
  • mylist = [1,2,3,4]
  • np_array= np.array(mylist)
  • Slicing
    • Entire list mylist[:]
    • Slice the 0th and 1st elements: mylist[0:2]
    • The second position (the end index) is exclusive
  • Function signature
    • What type of input is expected? Example: CSV
    • What type of output is expected?
    • What is the functionality?
    • Each function does one task
  • Using an IDE
    • Anaconda Spyder (Scientific Python Development Environment)
    • PyCharm
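
The slicing notes in the list above can be tried directly in a Python prompt. This short sketch reuses mylist from the list; the other variable names are made up for this example.

```python
mylist = [1, 2, 3, 4]

whole = mylist[:]        # the entire list, as a new (shallow) copy
first_two = mylist[0:2]  # elements at index 0 and 1; index 2 is excluded
rest = mylist[2:]        # from index 2 to the end

print(whole)      # [1, 2, 3, 4]
print(first_two)  # [1, 2]
print(rest)       # [3, 4]
print(whole is mylist)  # False: slicing made a copy, not an alias
```

Because the end index is exclusive, mylist[0:2] and mylist[2:] split the list with no overlap and no gap.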

Sunday, March 22, 2020

Technical Interview Tips - Technical Interview with Python cheat sheet


  • Time complexity
    • Exponential time: O(2**n)
    • Worst case, Big O
    • Best scenario
    • Average 
  • Data structure & algorithms
  • Test case: ZeroDivisionError
  • Implement function from scratch
  • Tree algorithms
  • Use two pointers: lo = 0 (the 0th index), hi = len(input) - 1; while lo < hi: do xyz
  • Language: Java, Python slightly preferred
  • Exponential 
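
A minimal sketch of two items from the list above: the lo/hi two-pointer loop and a test case that guards against ZeroDivisionError. The function names is_palindrome and safe_average are hypothetical examples, not taken from any particular interview.

```python
def is_palindrome(text: str) -> bool:
    """Two-pointer scan: compare characters from both ends, moving inward."""
    lo, hi = 0, len(text) - 1
    while lo < hi:
        if text[lo] != text[hi]:
            return False
        lo += 1
        hi -= 1
    return True


def safe_average(values: list) -> float:
    """Average of a list; an empty list would divide by zero, so handle it."""
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        return 0.0


print(is_palindrome("racecar"))  # True
print(safe_average([]))          # 0.0
```

The two-pointer scan runs in O(n) time and O(1) extra space, which is the kind of complexity statement interviewers usually ask for.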

Firebase APIs Basics


  • A project is a container for resources on Google Cloud
  • Install Firebase tools first to use command line utilities
  • The defer attribute on a script tag in HTML means the script is downloaded in parallel and executed only after the page has finished parsing
  • firebase serve hosts the site locally on port 5000 by default

Google Cloud Basics


  • .yaml extension of configuration file
  • Export python packages and environment dependencies as requirements.txt
  • name.py python code
  • Google Cloud Function
    • By default cloud function is authenticated into other Google APIs
    • Serverless, fully managed, event triggered, considered a micro-service in the cloud
  • Google Cloud OAuth
  • Use Stackdriver (which can also monitor AWS resources) to track performance
  • Role management using IAM
  • Google Cloud Discount
    • Google Cloud discount for students available
  • Cloud DNS is an available API
    • Add an A record pointing to the IPv4 address
    • Copy the external IP address
    • The A record is linked to the external IP
    • The CNAME record for www points to the domain name (.com)
    • Now go to the domain provider and do the same
    • Add the Google Cloud DNS name servers at the domain provider
    • Need to generate an SSL certificate to enable HTTPS
  • Integrates with Firebase
    • Firebase APIs
  • Resources - Conference: Google Cloud Next Conference usually in April
  • Static versus ephemeral IP address
  • Market place: Google Cloud wordpress
  • Google Cloud partner: top partners include Accenture
  • Tutorial : Qwiklab
  • Concept: serverless in the cloud. It runs code in the cloud without the need to know or manage the machine infrastructure it operates on. No need to worry about automatic scaling, load balancing, security or patching.
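
To make the Cloud Function idea above concrete, here is a rough sketch of what an HTTP-triggered function can look like in Python. In the Python runtime the argument behaves like a Flask request object; the function name hello_http and the FakeRequest stand-in are assumptions, used only so the snippet runs locally without deploying anything.

```python
def hello_http(request):
    """Entry point sketch: `request` is expected to behave like a Flask request."""
    name = request.args.get("name", "World")
    return f"Hello, {name}!"


class FakeRequest:
    """Hypothetical stand-in for the real request object, for local testing only."""

    def __init__(self, args):
        self.args = args


print(hello_http(FakeRequest({"name": "Firebase"})))  # Hello, Firebase!
print(hello_http(FakeRequest({})))                    # Hello, World!
```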

Friday, March 20, 2020

Learn SQL — It’s on every job listing — Part 1

SQL is not obsolete. You can now build machine learning models with SQL and query real-time or big data with SQL. Look around and you will find plenty of job postings listing SQL as a desired skill, even from FAANG companies (Facebook, Apple, Amazon, Netflix, Google). Uniqtech writes technical tutorials for coding bootcamp graduates, freelancers, self-learners and MOOC students in the realms of data science, software engineering, machine learning and deep learning. Read our disclaimer here. This disclaimer applies to our entire site. Please take our words with a grain of salt. They are not considered professional advice, nor are they considered professional opinions. Reposted from Uniqtech Medium with permission.

A Microsoft Excel workbook contains worksheets, just like a database contains tables.
Each table can be queried separately. To query tables jointly, we need join statements and keys to look up the corresponding data.
Each table row should have a unique ID, known as the primary key (PK). It can also have a foreign key (FK), which associates the row, aka record, with the primary key of a row in another table.
For example, each e-commerce transaction has a unique ID, which can be generated from the timestamp of when the transaction happened. Each transaction can carry an FK such as a customer ID, which uniquely identifies the customer who made the transaction. His or her full information resides in the customers table.
That is the perfect segue to the philosophy and convention behind table names. You can think of table names as the natural divisions of the data we want to model, expressed as plural nouns: transactions, customers, products etc. Each row in the transactions table is a transaction (singular). Each row in the customers table is a customer. Each column represents a customer attribute, such as gender, age etc.
When designing the database, an architect or Database Admin (DBA) will construct a digital blueprint stating how the tables are connected with each other or whether they stand alone in the database. This diagram and the relations it specifies are called the database schema.
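
The primary key and foreign key relationship can be tried out with Python's built-in sqlite3 module. This is a small sketch using an in-memory database; the table layout and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        gender TEXT
    );
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),  -- foreign key
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'F'), (2, 'Bob', 'M');
    INSERT INTO transactions VALUES (100, 1, 25.0), (101, 2, 10.0), (102, 1, 5.5);
""")

# Join the two tables on the key they share to look up each customer's name.
rows = conn.execute("""
    SELECT t.transaction_id, c.name, t.amount
    FROM transactions AS t
    JOIN customers AS c ON t.customer_id = c.customer_id
    ORDER BY t.transaction_id
""").fetchall()
print(rows)  # [(100, 'Ada', 25.0), (101, 'Bob', 10.0), (102, 'Ada', 5.5)]
```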

What is SQL

SQL is a database query language. No matter which relational database you use, SQL concepts are helpful. The Pandas analytics library uses similar joins and query methods. Google BigQuery allows SQL-like syntax.
Newer databases such as NoSQL and graph databases use different query languages. Sample code from the Google Cloud Datastore NoSQL database:
// List Google companies with fewer than 400 employees.
var companies = query.filter('name =', 'Google').filter('size <', 400);

Important SQL Keywords

SELECT

The one select statement to select them all is using the wildcard.
SELECT * FROM table_name
It is important to slow down and read the statement. It reads: select all from table_name. * means all columns.
Nested Select Statements
SELECT * FROM (SELECT "A" AS A, "B" AS B);
AS specifies the alias. When column names are long or not reader friendly, an alias is your friend.
SELECT chooses the columns of data to return.

FROM

The FROM keyword is usually followed by a table name, for example FROM database.CUSTOMERS. It can also be followed by a nested query.
It specifies the table to operate on.

WHERE

The WHERE clause narrows down the query results by specifying conditions such as WHERE TABLE_NAME.gender = 'Female'. Note that SQL uses a single = for comparison. It filters the rows of data.
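
The three keywords described so far (SELECT, FROM, WHERE) can be run together against an in-memory SQLite database from Python. This is only a sketch; the customers table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("alice", "Female"), ("bob", "Male"), ("carol", "Female")],
)

# SELECT picks the column, FROM picks the table, WHERE filters the rows.
sql = """
    SELECT c.name
    FROM customers AS c
    WHERE c.gender = 'Female'
"""
female_names = sorted(row[0] for row in conn.execute(sql))
print(female_names)  # ['alice', 'carol']
```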

Putting it all together SELECT FROM WHERE

query = """
    SELECT my_column
    FROM my_table AS m
    WHERE m.gender = 'F'
 """

WITH

“The SQL WITH clause allows you to give a sub-query block a name (a process also called sub-query refactoring), which can be referenced in several places within the main SQL query.” — Geek for geeks
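
A short sketch of the WITH clause, again using Python's sqlite3 module. The sub-query name customer_totals and the sample data are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 25.0), (2, 10.0), (1, 5.5)],
)

# Name the sub-query once with WITH, then query it like a regular table.
sql = """
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM transactions
        GROUP BY customer_id
    )
    SELECT customer_id, total
    FROM customer_totals
    WHERE total > 20
"""
big_spenders = conn.execute(sql).fetchall()
print(big_spenders)  # [(1, 30.5)]
```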

ORDER BY

Sort the query result by columns ascending or descending.
ORDER BY column_name ASC
ORDER BY column_name DESC

LIMIT

LIMIT 1000
LIMIT 25
Show the first N rows of a query result.
LIMIT usually comes at the end of the query, as the last line of the SQL query.
Note that in big data, where managing cost and resource use is important, LIMIT does not guarantee that less data is scanned: the entire table may still be read.
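
ORDER BY and LIMIT are often combined to get a "top N" result. A minimal sketch with sqlite3 follows; the products table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("pen", 1.5), ("book", 12.0), ("lamp", 30.0), ("desk", 150.0)],
)

# Sort by price from high to low, then keep only the first two rows.
top_two = conn.execute(
    "SELECT name, price FROM products ORDER BY price DESC LIMIT 2"
).fetchall()
print(top_two)  # [('desk', 150.0), ('lamp', 30.0)]
```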

Data Structure and Algorithms

Time Efficiency and Space Efficiency Both Matter

How to write effective test cases

Wednesday, March 18, 2020

Google Colab Basics

In Google's own words: Colab is zero configuration, free GPU, and easy to share. I honestly have to agree with that. Google Colab is the easiest environment to get started on machine learning with scikit-learn, or deep learning with Tensorflow and Pytorch. Seriously, zero installation is awesome! Back in the day, when Ruby on Rails was hot, we had installation parties all the time, because installation was what took the most time for everyone. Now Colab even has access to free TPU!

Google Colab is to Jupyter Notebook what Google Docs is to Microsoft Office. It basically lets you create, edit and host Jupyter Notebooks in the cloud.

Google Colab for Training Models

Training is essentially free and easy on Google Colab. You have access to both GPU and TPU. However, the free version can lose temporary variables and files in the home directory, because the runtime resets every 12 hours or less. There are ways to save and download the files to avoid such a catastrophe.

Use Google Colab for Demo Purpose

It is easy to build an example in Colab and share it with an audience, giving it away instead of Github source code or slides.

Tuesday, March 17, 2020

Flask basics 2020

Flask is known as lightweight yet feature rich. It is also known as a micro framework, while Django is the heavyweight one. Flask is used to dynamically code and update web applications, including Single Page Applications (SPA). It is a web development framework, not a library. Frameworks have a certain philosophy and strategy as well as code patterns (for example, you must follow a certain folder structure because the framework automatically looks for assets in those folders, and you must follow file and resource naming conventions). Frameworks help us do routine tasks, which we have to do over and over again in web programming, faster and more efficiently. They use tried and true methods and code patterns to implement common web app components, so there is no need to reinvent the wheel every time.

Model-View Controller framework

MVC framework, popularized by Ruby on Rails. MVC is a separation of concerns, a way of organizing and designing code projects.

Controller: use a decorator with a URL-like path to define how each URL will be handled. These are called routes. url_for() is a shorthand that builds the URL for a function, instead of having to type out a long URL.
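
To show the idea behind decorator-based routes without installing anything, here is a plain Python sketch. It is not Flask's implementation; the routes dictionary and the route, url_for and dispatch helpers are simplified stand-ins written only to illustrate the pattern.

```python
routes = {}  # maps a URL path to the function that handles it


def route(path):
    """Decorator factory: register the decorated function under `path`."""
    def register(handler):
        routes[path] = handler
        return handler
    return register


@route("/")
def index():
    return "Hello, world"


@route("/about")
def about():
    return "About this site"


def url_for(endpoint):
    """Look up the URL path by function name, so URLs are not typed by hand."""
    for path, handler in routes.items():
        if handler.__name__ == endpoint:
            return path
    raise KeyError(endpoint)


def dispatch(path):
    """Call the handler registered for `path`, or return a 404 message."""
    handler = routes.get(path)
    return handler() if handler else "404 Not Found"


print(dispatch("/"))     # Hello, world
print(url_for("about"))  # /about
```

In real Flask the equivalent pieces are the @app.route(...) decorator and flask.url_for, which do considerably more (HTTP methods, URL parameters, request context).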

Database

SQL
Manage Flask ORM resources and records using SQLAlchemy

Launching Flask app via local server
$ flask run
Use above to launch a local website

Additional Flask concepts

Hello world: the first script is commonly called app.py
$ python app.py

Jinja mixes Python and HTML together. url_for() can also be used in templates to link to functions instead of typing long URLs.
Conditional HTML: the Jinja templating language, along with Flask, allows us to write if-else statements in HTML.

How to host flask applications: one example is that you can easily deploy flask applications on Heroku as well as separately on Google Cloud app engine.

__name__: the __name__ variable can determine whether a script is the main program being run or is being imported by another script. If the script is run directly, __name__ == "__main__" is true; if it is imported, __name__ is the module's name. Passing __name__ to Flask(__name__) also tells Flask where to look for templates and static files.

Flask has debugging mode, which should never be used in production but is handy when developing.

To make a local Flask app available online, use ngrok

Web Programming Web Development Basics 2020



Model view controller framework (MVC)
"a separation of concern"

  • Model contains the data and database logic
  • Controller contains the code logic and routes
  • View contains the presentation code (HTML, CSS)


CSS
  • Block versus inline
  • Can use CSS layout
  • Built-in layout tool called flexbox, which can lay out elements automatically
  • Flexbox is great for making a bunch of cards that rearrange themselves as we shrink the page
    • 4 on each row
    • 3 on each row
    • ..
    • 1 on each row stacking.
HTTP protocol
  • Text protocol
  • HTTP requests contain headers
  • Headers are key: value pairs
  • HTTP status codes: 1xx, 2xx, 3xx, 4xx, 5xx
  • Http response
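
Because HTTP is a text protocol, a raw request can be taken apart with ordinary string operations. The sketch below parses a hand-written request; the example host and the status_class helper are assumptions for illustration.

```python
raw_request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)

# The head (request line + headers) is separated from the body by a blank line.
head, _, body = raw_request.partition("\r\n\r\n")
request_line, *header_lines = head.split("\r\n")
method, path, version = request_line.split(" ")

# Each header line is a "Key: value" pair.
headers = dict(line.split(": ", 1) for line in header_lines)


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit (1xx to 5xx)."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]


print(method, path, version)  # GET /index.html HTTP/1.1
print(headers["Host"])        # example.com
print(status_class(404))      # client error
```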
Cookies

Frameworks

  • Using a web development framework allows us to bypass a lot of hard work and write web apps quickly without knowing all the functions to call and libraries to import
Web hooks

Data

  • D3.js can visualize data as flow charts and org charts

Monday, March 16, 2020

API Design 2020

Designing API

  • What kind of resources are needed
  • What kind of actions will be taken
  • What kind of endpoints should be designed

Testing API

Testing API using curl

GET method
curl 'https://[URL]/[resource].json'

PUT method
curl -X PUT -d '{"key":{"nested_key":"value"}}' \
  'https://[PROJECT_ID].firebaseio.com/users/tom.json'

PATCH method
curl -X PATCH -d '{"key":"value"}' \
  'https://[URL]/[resource].json'

POST method
curl -X POST -d '{"key":{"nested_key":"value"}}' \
  'https://[PROJECT_ID].firebaseio.com/users/tom.json'

DELETE method
curl -X DELETE \
  'https://[PROJECT_ID].firebaseio.com/users/tom.json'

Check the documentation to see what the API does, then use curl to verify it
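
The same requests can be prepared from Python with the standard library. The sketch below only builds the request objects and does not send anything; the base URL example-project.firebaseio.com and the build_request helper are placeholders, not a real project or an official client.

```python
import json
from urllib.request import Request

BASE_URL = "https://example-project.firebaseio.com"  # hypothetical project URL


def build_request(method, path, payload=None):
    """Build (but do not send) a JSON request for the given HTTP method."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return Request(
        f"{BASE_URL}/{path}.json",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


put_req = build_request("PUT", "users/tom", {"key": {"nested_key": "value"}})
patch_req = build_request("PATCH", "users/tom", {"key": "value"})
delete_req = build_request("DELETE", "users/tom")

print(put_req.get_method(), put_req.full_url)
# To actually send a request: urllib.request.urlopen(put_req)
```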

Shopify Partner Basics

  • In 2019 Shopify started to version its APIs. You can now refer to APIs by their version number.
  • Career opportunities: Shopify store owner turned developer, turned partner
  • Shopify Buy Button is available for WordPress blogs
  • Utilize the Shopify partner blog, a great resource
  • Shopify Ping chat and Kit CRM robo assistant
  • Can use Apple Business Chat with Shopify channels
  • Can get approved to run product tagging and ads on Instagram
  • Shopify Lounge provides co-working opportunities and light box photo shoot sessions
  • Point DNS on Shopify custom domain

GraphQL Basics on Shopify

GraphQL eliminates the need to define a potentially infinite number of endpoints for developers to interact with an API. There is no more need to predefine every endpoint. With a traditional API, many calls may be needed to get complex results back. "Multiple API calls from different schema hard for developers and slow for users." - Shopify forum discussion. GraphQL can give back all the information in one nested JSON response. Use GraphQL admin to manage the API. A REST API needs multiple endpoints to be designed and written for each resource, while GraphQL technically needs just one endpoint. In Shopify, for example: POST https://{shop}.myshopify.com/admin/api/2019-04/graphql.json. Shopify has a GraphQL app ready for installation. Shopify Developers and Shopify Partners can potentially use this to design reporting fast. Traditional API call: HTTP request GET /api/user?id=1, HTTP response {"id":1, "name":"xyz"}
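
As a rough sketch of the single-endpoint idea, the snippet below builds the JSON body of a GraphQL POST in Python without sending it. The shop name example-shop and the small shop { name } query are illustrative assumptions; a real call would also need an access token header, so check the Shopify documentation for the exact requirements.

```python
import json

shop = "example-shop"  # hypothetical shop name
endpoint = f"https://{shop}.myshopify.com/admin/api/2019-04/graphql.json"

# One query describes exactly the nested fields we want back.
graphql_query = """
{
  shop {
    name
  }
}
"""

# A GraphQL request is a POST whose JSON body carries the query string.
payload = json.dumps({"query": graphql_query})

print(endpoint)
print(payload)
```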

Wednesday, March 4, 2020

Evaluating Classification Tasks in Machine Learning and Deep Learning

Confusion Matrix

Keywords: recall (also called sensitivity), specificity

ROC curve, ROC AUC (area under the curve)

Use a real example: doctors, medicine, cancer screening
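
The confusion matrix counts and the metrics named above can be computed by hand in a few lines of Python. The labels below are invented toy data (1 = has the disease, 0 = healthy), only meant to show the arithmetic.

```python
y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # actual labels
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]  # model predictions

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

recall = tp / (tp + fn)       # sensitivity: share of sick patients caught
specificity = tn / (tn + fp)  # share of healthy patients correctly cleared

print(tp, fn, tn, fp)         # 3 1 3 1
print(recall, specificity)    # 0.75 0.75
```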

Technical presentation components


  • Story telling
  • Workflow flow charts, where does it fit in the big picture
  • Code snippets
  • Take aways, action items after the talk
  • Link to slides
  • Link to codebase
  • Visualizations
  • Tricks to memorize, remember

Women in Data Science Conference (WiDS) summary, transcripts, notes from my personal experience

WiDS started small but is now a global movement with many regional events and branches. It is 5 years old in 2020. This year it is hosted at Stanford University.

Volunteer opportunities with WiDS: ambassadors, regional events and branches, 500+ ambassadors worldwide

Understand the history and evolution of Tensorflow by revisiting Tensorflow 1.0 Part 1

Tensorflow 2.0 was released last year, and it is a completely different universe from its predecessor Tensorflow 1.0. But even in 2020 it is important to understand the history and evolution of the TF library: how it got here, and why it chose Keras as its high-level API. It is also important to understand what a compute graph is, as it is a super useful concept in deep learning and you can still visualize and inspect it in TensorBoard.

Let's go back in time and talk about Tensorflow 1.0: its data flow graph, everything being executed in a Session object, its C/C++ backend and how it handled parallel computing. It offered both Python and C++ APIs. Though it had a steep learning curve at the time of its release, it was production ready and powerful, and had already been used internally at Google before being open sourced. It supported CPU, GPU and distributed processing in clusters. Its focus is on deep learning neural networks, whereas scikit-learn focuses on traditional machine learning algorithms.

What is a data flow graph? It is a very important computer science concept. The nodes represent math operations, and the edges are multi-dimensional arrays called tensors, which flow through the graph, hence the name Tensorflow! See, the history is important. Using the graph we can easily visualize neural networks. NumPy and scikit-learn do not give that result.

The frustrating part is that the graph needs to be built first before running it in a Session. This is where the learning curve got steep: it was hard to prototype and iterate, and it required a bit more math and architecture skill than just engineering and coding.

A quick note on tensors, which are also a concept in math and relativity. In this case, a tensor just means a multi-dimensional array of numbers, usually more than 2D (a matrix is 2D), with automatic gradient computation support and the capability to move to CPU or GPU and parallelize vector compute where possible. Technically even a vector (1D) or a number (0D) is a tensor.

In deep learning, we usually have to convert data such as texts or images into numbers, and we usually represent them using tensors. Each color image, for example, is a 3-dimensional tensor with red, green and blue channels. Each channel has a matrix corresponding to the width and height of the image, with each element representing the pixel brightness at each w, h coordinate. This is called a feature matrix, aka a feature tensor. Though no one calls it a tensor in this case.
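
To make the dimensions concrete, here is a tiny image tensor built from nested Python lists, one matrix per color channel. The pixel values and the shape helper are made up for illustration; real code would normally use NumPy, Tensorflow or Pytorch arrays instead.

```python
# A 2 x 3 pixel image: one brightness matrix (height x width) per channel.
red = [[255, 0, 0], [255, 0, 0]]
green = [[0, 255, 0], [0, 255, 0]]
blue = [[0, 0, 255], [0, 0, 255]]
image = [red, green, blue]  # channels x height x width


def shape(nested):
    """Return the size of each dimension of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


print(shape(image))      # (3, 2, 3)  a 3D tensor
print(shape(red))        # (2, 3)     a matrix, 2D
print(shape([1, 2, 3]))  # (3,)       a vector, 1D
print(shape(7))          # ()         a single number, 0D
```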

One cumbersome pattern in Tensorflow 1.0 was the need to define a placeholder with tf.placeholder(), with its type specified, before filling it with actual numbers. It has the benefit of contiguous memory, but it can take time to get used to, especially when TF is trying to court Python users accustomed to dynamic typing. Another benefit is being able to construct the graph without knowing or filling in specific numeric values. One minus is that it is harder to test, prototype and iterate.

tf.Variable() allows initializing and filling in data that can be changed later. Each node is a unit of computation. Each edge is either an input or an output of an operation.
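
The build-first, run-later pattern can be imitated in plain Python. The sketch below is not Tensorflow code: the Placeholder, Constant and Operation classes and the run function are simplified stand-ins invented to show how a graph is defined without values and only evaluated when values are fed in.

```python
import operator


class Placeholder:
    """A named slot in the graph; its value is supplied only at run time."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]


class Constant:
    """A fixed value stored in the graph."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Operation:
    """A node that applies a function to the outputs of its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self, feed):
        return self.fn(*(node.evaluate(feed) for node in self.inputs))


# Step 1: build the graph y = w * x + b. No numbers flow yet.
x = Placeholder("x")
w = Constant(3.0)
b = Constant(1.0)
y = Operation(operator.add, Operation(operator.mul, w, x), b)


# Step 2: run the graph by feeding a value for the placeholder.
def run(node, feed):
    return node.evaluate(feed)


print(run(y, {"x": 2.0}))   # 7.0
print(run(y, {"x": 10.0}))  # 31.0
```

The same graph object is reused for both runs, which is the upside of the pattern; the downside, as noted above, is that nothing can be inspected until the whole graph is built and run.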

Tensorflow 1.0, like Tensorflow 2.0, has a Pythonic front-end API and can be deployed on many devices and platforms such as CPU, GPU, Android and other mobile OSes such as iOS, and JavaScript in the browser (Tensorflow 1.x+). It has always been quite production ready. Hence it was popular before Pytorch 1.0 came along.

Tensorflow and Pytorch both focus on deep learning and are optimized for it.

Additional feature: auto differentiation, which is important for gradient-based deep learning algorithms. Additional feature: optimizers for fine-tuning weights efficiently.

Sunday, February 23, 2020

SpaCy for Natural Language Processing (NLP)

Documents are tokenized into sentences, then into words. Additional or readily available features can be built from these documents to work on a task. One SpaCy task could be to identify whether a tweet is positive or negative and whether a specific product is mentioned in its text.

Installation

Install SpaCy like any other Python package using pip. The easiest way to install and configure it is to use Google Colab.

pip install spacy and then import spacy. It works on Google Colab too, which is perhaps the fastest way to get started.

Step 1. Need to import a language model before proceeding. SpaCy supports many language models.

Step 2. Load the English model:

nlp = spacy.load('en')    # load a pre-trained English model
nlp = spacy.blank('en')   # or start from a blank English model

Other models are supported too, including en_core_web_sm

Update the model (optional)
nlp.update

Step 3. Run the nlp object on text to get a processed Doc
doc = nlp(u"document sentence here")

print out items from the spacy nlp model
- print out tokens
# Iterate over tokens in a Doc
for token in doc:
    print(token.text)

- print out entities
for ent in doc.ents:
    print(ent.text, ent.start_char, ent.end_char, ent.label_)

This will print out each entity as well as its beginning and ending character index and its label.

Can also query the doc using slicing, for example doc[1:4]
Can access the .text attribute of the token


Additional functions

Pipeline

print(nlp.pipeline)

Disable pipeline components for custom training
nlp.disable_pipes

Source: https://spacy.io/usage/training
Why disable pipes? Say you are using SpaCy for just one task such as NER; you can disable the other pipeline components to skip the tasks you don't need.
Check the list of pipeline component names with nlp.pipe_names


Save the trained model
nlp.to_disk


NER
Bloom embedding, a type of optimized word embedding
1D CNN 1D convolutional neural network


NLTK
NLTK is not a part of SpaCy; it is an entry-level toolkit for NLP. It can do basic part-of-speech tagging, but it does not have advanced functionality like SpaCy.

Spacy Deep Learning

Training source: https://spacy.io/usage/training

SpaCy word embedding

To print out the word embedding, access the .vector attribute of a token.

SpaCy for Biomedical Research

scispaCy is a Python package that provides SpaCy models for pre-processing biomedical, clinical and scientific texts.
Source: https://allenai.github.io/scispacy/

Why is Natural Language Processing hard?

Limitation of pre-trained models

" A model trained on Wikipedia, where sentences in the first person are extremely rare, will likely perform badly on Twitter. Similarly, a model trained on romantic novels will likely perform badly on legal text." - Spacy documentation

Friday, February 21, 2020

OpenCV cheat sheet


  • import cv2
  • cv2.imread()
  • cv2.resize()
  • .transpose() on arrays
  • .reshape() on arrays

Google Cloud AutoML


  • Functionality provided by AutoML: 1. Single-label classification: predict the ONE correct label that you want to assign to a document. 2. Multi-label classification: predict ALL the correct labels that you want to assign to a document. 3. Entity extraction: identify entities within your text items. 4. Sentiment analysis: understand the overall sentiment expressed in a block of text. (Source: direct quote from the AutoML documentation.)

Thursday, February 20, 2020

My experience with TripleByte technical interview and quiz

I read a few really good posts on TripleByte experience. They were helpful so I am also posting my two cents here.

First of all, TripleByte is legit. It went through Y Combinator and it is being actively promoted by YC.

Amazing selection of quizzes:
I am so happy that they have full stack, data science as well as Machine Learning quizzes as of Feb 2020! The Data Science and Machine Learning Quizzes both have a NEW sign.

It is about 2 minutes per question.

I really like the FastTrack feature. It is a quick validation. It is encouraging and it quickly moves candidates to the next step: actually doing or practicing technical interviews. Honestly, this part is not avoidable.

I haven't figured out a way to take other quizzes when passing one with FastTrack.

It is not very hard for me to do well enough to get FastTrack, but scoring exceptionally well is rarer and more meaningful, and there may even be an opportunity to be matched with top companies. I don't think Exceptionally Well is trivial to obtain. TripleByte visualizes your skill set with subcategories shown either on a 1-5 rating scale or on a radar map with a similar scale. One does not want to score a 3 in any of the subcategories: visually it makes the radar map look weak.

With a little bit of review and brief study, the quizzes should be easily passable. If you don't pass the quiz, maybe it is time to learn more and get more experience, because it is not that hard to pass.

From most of what I gathered online in forums, the technical interview portion is difficult. There is quite a bit of requirement in coding exercises and setting up the coding environment in the console. Because I come from a non-traditional background, I don't know C++ ... yet. I plan to learn it. Some of the exercises, quizzes and interview questions can be in C++. And that's a problem for me. The quiz C++ is easy to figure out even if you don't know the language. But the coding exercise in C++ cannot be figured out without prior knowledge.

Apparently you will be sent an interview guide if you do schedule a technical interview.

One trick to doing well in a technical interview is to have practiced the problem. Then you will know the caveats, won't stress over understanding the problem (comprehension), and will potentially know roughly what the optimal solution looks like.

During the interview, it'd be good to think of a similar problem that you solved and recall how you solved it. Being able to discuss the problem in a real-world setting is always helpful for finding the optimal solution and also showcases your understanding of the technical problem.

How does TripleByte compare to HackerRank and Leetcode

TripleByte is more developer-friendly and better for candidates than HackerRank and Leetcode. First of all, it tests knowledge more than trivia. As long as you understand the problem, you can likely answer the question fast, within 2 minutes (the requirement). It focuses on one or two missing lines, or on the final returned result. This means you won't have to spend 45 minutes conjuring each solution. I like that a lot. I can demonstrate that I understand the problem and its edge cases without having to get every detail right.

Leetcode is more detailed, and there is a lot of competition on time performance, so even a good solution may not be enough. HackerRank has a nice trajectory for leveling up and is interesting, but like Leetcode it also requires the candidate to write a lot of code every time. Eventually, though, you should probably still use HackerRank or Leetcode to prepare for the screen-shared first-round interview.

HackerRank supports a few choices of languages. TripleByte lets you choose the category of your quiz, but there is no explicit language choice. Leetcode supports many languages.
