
Saturday, December 29, 2018

Sample .gitignore file


Some good files to ignore: .DS_Store, a hidden file generated by macOS; .ipynb_checkpoints, hidden folders generated by Jupyter Notebooks (Python notebooks); and large image datasets used for machine learning and deep learning training. The .gitignore file itself is prefixed with a dot, so it is a hidden file too. It usually doesn't show up in Finder or other file viewers, but it can be opened in an editor such as Sublime Text.
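
As a rough sketch, the ignore rules above could be written out like this. The data/ folder name is an assumption standing in for wherever your large datasets live, and write_gitignore is a hypothetical helper, not part of git.

```python
from pathlib import Path
import tempfile

# One pattern per line; lines starting with # are comments in a .gitignore file.
GITIGNORE_LINES = [
    "# macOS Finder metadata",
    ".DS_Store",
    "# Jupyter Notebook autosave checkpoints",
    ".ipynb_checkpoints/",
    "# large training datasets (hypothetical folder name)",
    "data/",
]

def write_gitignore(repo_dir):
    """Write the sample rules to <repo_dir>/.gitignore and return the path."""
    path = Path(repo_dir) / ".gitignore"
    path.write_text("\n".join(GITIGNORE_LINES) + "\n")
    return path

# A temporary directory stands in for a real repository root.
with tempfile.TemporaryDirectory() as repo_dir:
    print(write_gitignore(repo_dir).read_text())
```

In a real project you would simply type the same lines into a .gitignore file at the repository root.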

Pytorch Torchvision model performance cheatsheet


Network vs. Top-1 error vs. Top-5 error, including VGG, ResNet, and DenseNet.

Sunday, December 2, 2018

Visualize tensors - Machine Learning Deep Learning Cheat Sheet

Tensors are the basic units of deep learning frameworks: neural network computations are functions of tensors. A one-dimensional tensor is like a list of elements. A two-dimensional tensor is like an Excel sheet: it has a row dimension and a column dimension. A 3D tensor is like an RGB image: height and width locate each pixel, and the third dimension holds that pixel's red, green, and blue values.
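
Here is a minimal sketch of the three cases using plain nested Python lists instead of a framework. The shape helper is hypothetical and assumes regularly nested, non-empty lists; real frameworks expose the same idea through a shape attribute.

```python
def shape(tensor):
    """Return the dimensions of a regularly nested list, outermost first."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)

vector = [1.0, 2.0, 3.0]              # 1-D: a list of elements
matrix = [[1, 2, 3],
          [4, 5, 6]]                  # 2-D: 2 rows x 3 columns
# 3-D: a 2 x 2 pixel image, each pixel holding (red, green, blue)
image = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

print(shape(vector))   # (3,)
print(shape(matrix))   # (2, 3)
print(shape(image))    # (2, 2, 3)
```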


Trivia: Tensorflow is named after tensors. Duh

Thursday, October 11, 2018

Codecademy - Machine Learning Fundamentals - Syllabus

Upgrade your skills with Codecademy's Pro Intensive, Machine Learning Fundamentals.
Each unit will cover conceptual and syntax lessons and quizzes. There will also be a few cumulative off-platform projects throughout the Intensive. Articles and videos will be available to supplement your learning.
Unit 1- What is Machine Learning?
Learn about the types of problems to solve with machine learning.
Machine Learning Process
Learn about Scikit
Why Data?
Unit 2 - Regression
Predict continuous-valued output based on the input value(s).
Distance Formula
Linear Regression
Multiple Linear Regression
Precision vs Recall
Unit 3 - Classification
Classify data into different categories.
Bayes’ Theorem
Naive Bayes Classifier
K-Nearest Neighbors
The Ethics of Overfitting
Unit 4 - Unsupervised Learning
Find patterns and structures in unlabeled data points.
K-Means Clustering
K-Means++ Clustering
Unit 5 - Neural Network Teaser
Implement a single neuron - the building block of neural networks.
Perceptron
Unit 6 - Capstone Project
Apply your new knowledge to complex projects reviewed by experts.
Yelp recommender
Date-a-Scientist

Monday, September 24, 2018

Natural Language Processing NLP - Useful libraries, tools and code samples

Basic Concepts
  • Stop words removal
    • Stop words are words that may not carry valuable information
    • In some cases stop words matter. For example researchers found that stop words are useful in identifying negative reviews or recommendations. People use sentences such as "This is not what I want." "This may not be a good match." People may use stop words more in negative reviews. Researchers found this out by keeping the stop words and achieving better prediction results. 
    • Removing punctuation may also yield better results in some situations
  • Tokenization: breaking text into tokens. Example: breaking sentences into words, or into groups of words depending on the scenario. There are also the n-gram and skip-gram models.
    • Basic tokenization is 1-gram. An n-gram (multi-gram) is useful when a phrase yields better results than a single word. For example, for "I do not like banana." the 1-grams are I, do, not, like, banana. A 3-gram model may yield better results: "I do not", "do not like", "not like banana".
    • n-gram: n is the number of words we want in each token. Frequently, n = 1.
  • Lemmatization and stemming: reduce words to a root form. Example: economics, micro-economics, macro-economists, economists, economist, economy, economical, economic forum can all be related back to a shared root such as econ, which can mean the text or article is largely about economics, finance or economic issues. Useful in situations such as topic labeling. Common NLTK tools: WordNetLemmatizer (lemmatization), PorterStemmer (stemming)
  • An illustration of sentence tagging
  • Example of tokenization and lemmatization for ngrams = 1. source
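
The n-gram idea above can be sketched in a few lines of plain Python. The ngrams helper is a hypothetical name, and splitting on whitespace is a simplifying assumption (a real tokenizer would also handle punctuation).

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "I do not like banana".split()

print(ngrams(tokens, 1))
# [('I',), ('do',), ('not',), ('like',), ('banana',)]
print(ngrams(tokens, 3))
# [('I', 'do', 'not'), ('do', 'not', 'like'), ('not', 'like', 'banana')]
```

Notice that the 3-grams keep "do not like" together, which is the kind of phrase a 1-gram model would split apart.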
Python Basics
  • Python library NLTK
    • includes a list of stop words in English and many other languages; you may want to customize this list
    • Example: "The Sun" and "Sun" can mean different things; in certain analytics situations, it matters.
    • from nltk.corpus import stopwords
    • stop_words = set(stopwords.words('english'))
    • clean_tokens = [token for token in tokens if token not in stop_words] #important pattern
      • source: Towards Data Science, Emma Grimaldi, "How Machines understand our language: an introduction to Natural Language processing"
  • from nltk.tokenize import RegexpTokenizer, a regex-based tokenizer
  • RegexpTokenizer(r'\w+') keeps every run of one or more word characters as a token, effectively removing all punctuation
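
If you want to try the pattern without installing NLTK, here is a rough standard-library sketch. STOP_WORDS is a tiny hand-picked set made up for illustration (it is not NLTK's list), and re.findall with the same \w+ pattern stands in for RegexpTokenizer.

```python
import re

# Hypothetical, hand-picked stop words for illustration only.
STOP_WORDS = {"this", "is", "what", "i", "a", "the", "be", "may"}

def tokenize(text):
    """Lowercase, then keep runs of word characters, which drops punctuation."""
    return re.findall(r"\w+", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Same list-comprehension pattern as the NLTK example above."""
    return [token for token in tokens if token not in stop_words]

tokens = tokenize("This is not what I want.")
print(tokens)                     # ['this', 'is', 'not', 'what', 'i', 'want']
print(remove_stop_words(tokens))  # ['not', 'want']
```

The set deliberately leaves out "not", so the negation survives. That is one way to customize a stop-word list for the negative-review case mentioned earlier.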
Sklearn Basics
  • Sklearn text classification with sparse matrix http://scikit-learn.org/stable/auto_examples/text/document_classification_20newsgroups.html
  • Read our article about the TF-IDF model for information retrieval and document search

Count Vectorizer

What does it do? "Convert a collection of text documents to a matrix of token counts" (sklearn documentation). It returns a sparse matrix of type scipy.sparse.csr_matrix.

Feature Dimension : equal to the vocabulary size found by analyzing the data.
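
To make the idea concrete, here is a small pure-Python sketch of a token count matrix. It is not sklearn's implementation: count_matrix is a hypothetical helper, it splits on whitespace, and it returns dense lists, whereas the real CountVectorizer uses its own tokenizer and returns a sparse matrix.

```python
from collections import Counter

def count_matrix(documents):
    """Return (vocabulary, rows): one row of token counts per document."""
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is every distinct token found in the data, sorted.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing words count as 0
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows

docs = ["good food good service", "bad service"]
vocabulary, rows = count_matrix(docs)
print(vocabulary)  # ['bad', 'food', 'good', 'service']
print(rows)        # [[0, 1, 2, 1], [1, 0, 0, 1]]
```

Each row has 4 columns because the vocabulary has 4 words, which is the feature dimension described above.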

NLP Use Case

  • Classify whether a review is positive or negative (sentiment analysis)

Saturday, August 25, 2018

Getting Started Coding Amazon Alexa Skills for Developers

Getting Started

First read our blog post on how to get started with Alexa Skill building or watch our youtube video on the topic


Visit the Alexa Skill Store to see more than 25000 skills available.  https://www.amazon.com/b/?ie=UTF8&node=13727921011&ref_=topnav_storetab_a2s

Alexa Skill Persona https://developer.amazon.com/blogs/alexa/post/1884bc03-66f0-49ea-819b-e5db6407ec68/hear-it-from-a-skill-builder-how-to-create-a-persona-for-your-alexa-skill

Monetizing Your Alexa Skill - Earn Money with Your Alexa Skill

https://developer.amazon.com/alexa-skills-kit/rewards

Deep Dive for In-Skill Purchasing
https://developer.amazon.com/blogs/alexa/post/12279973-0f16-4fef-9286-649552a06767/metadata-deep-dive-for-in-skill-purchasing

Amazon Alexa Skill Developer Community, Developer Support, Developer Relations

Amazon rewards developers with promotions and token rewards
https://developer.amazon.com/alexa-skills-kit/rewards

Alexa devday for a city near you https://developer.amazon.com/alexa/devday

Super developer award, developer incentives and promotions https://developer.amazon.com/alexa-skills-kit/super-developer

Developer 3 skill promotion
https://developer.amazon.com/alexa-skills-kit/alexa-developer-skill-promotion

Amazon Alexa Skill Dev Tools

Alexa Settings API for time zone, temperature and measurement unit personalization and localization. For sports events, for example, this really matters.
https://developer.amazon.com/blogs/alexa/post/c2ba44fa-4bd8-4b49-925d-29dbc0330b1e/personalize-your-alexa-skill-with-customer-specific-time-zones-and-measurements-using-the-alexa-settings-api

Amazon Alexa Skill Development Best Practice

Naming your Alexa Skill 
https://developer.amazon.com/blogs/alexa/post/66ab64e1-3099-441a-971b-3dcea4683c34/storyline-how-to-pick-an-invocation-name-for-alexa-skills

Improve customer engagement
https://developer.amazon.com/blogs/alexa/post/2cda040c-b432-493c-92b2-842cf4c7aab6/hear-it-from-a-skill-builder-4-ways-to-optimize-your-skills-for-customer-engagement

Dialog Management
https://developer.amazon.com/alexa-skills-kit/dialog-management

Thursday, August 23, 2018

ImportError: No module named flaskr - Python Flask Flaskr Tutorial Error Debug

Error Message

/flaskr/__init__.py", line 1, in <module> from .flaskr import app ImportError: No module named flaskr 

Related github issues

Issue 2058 https://github.com/pallets/flask/issues/2058
Issue 1902 https://github.com/pallets/flask/issues/1902
Issue 2437 The example flaskr can't easily be run by a beginner in Flask framework https://github.com/pallets/flask/issues/2437



Issue Summary

Ran into the above error because some files are put into the wrong directory and need to be moved.

Proposed solution

Better understanding of the tutorial folder structure. Here's a better diagram than the original.


flaskr/
├── MANIFEST.in
├── flaskr
│   ├── __init__.py
│   ├── flaskr.py
│   ├── schema.sql
│   ├── static
│   └── templates
└── setup.py


The indentation of the original tutorial is hard to read!

After a successful $ flask run command, you will get the updated folder structure below:


flaskr/
├── MANIFEST.in
├── flaskr
│   ├── __init__.py
│   ├── __init__.pyc
│   ├── flaskr.py
│   ├── flaskr.pyc
│   ├── schema.sql
│   ├── static
│   └── templates
├── flaskr.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
└── setup.py



happy coding.

Background:

I found the Flaskr tutorial through Nicole White's talk about Neo4j and Flaskr. Plus Flask is a popular python web framework. I have always wanted to code a helloworld. 

Additional documentation

https://media.readthedocs.org/pdf/flask/latest/flask.pdf
https://github.com/pallets/flask/tree/master/docs/tutorial
http://flask.pocoo.org/docs/0.12/tutorial/

Friday, August 17, 2018

Password protect and zip compress files on Mac

The following step is only recommended for developers who have experience working in the command line. This is the built-in zip compress and password protect function on Mac, so it is free.

It's very difficult if not impossible to undo command line changes. So please, only use this option if you are familiar with command line scripts.

Data Science beyond the basics

Exploratory Data Analysis

  • Histogram plotting: the input is a list of values (or several distributions) we want to plot. Specify the bins; each sample can also be weighted differently, so it doesn't have to count as 1. The hist function returns values as well as the plot: how many items fall in each bin.
  • It is also important to do feature extraction and dimensionality reduction to simplify the data and reduce computational cost before feeding data into a machine learning algorithm. Algorithms will run faster, use less memory, and in some cases even perform better.
  • Anomaly detection and outlier detection: handle or remove outliers and abnormalities in the data to help the model generalize better and represent the data more accurately.
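
To show what bins and weights do in a histogram, here is a plain-Python sketch. It is not the plotting library's hist function: histogram is a hypothetical helper that assumes equal-width bins over a given range and only returns the numbers, without drawing anything.

```python
def histogram(values, bins, value_range, weights=None):
    """Count (or sum the weights of) values falling into equal-width bins."""
    low, high = value_range
    width = (high - low) / bins
    if weights is None:
        weights = [1] * len(values)   # default: every sample counts as 1
    totals = [0] * bins
    for value, weight in zip(values, weights):
        if low <= value <= high:
            # A value exactly on the top edge belongs to the last bin.
            index = min(int((value - low) / width), bins - 1)
            totals[index] += weight
    edges = [low + i * width for i in range(bins + 1)]
    return totals, edges

counts, edges = histogram([1, 2, 2, 3, 9], bins=2, value_range=(0, 10))
print(counts)  # [4, 1]
print(edges)   # [0.0, 5.0, 10.0]

# Weighting samples differently changes the bin totals, not the bin edges.
weighted, _ = histogram([1, 2, 2, 3, 9], bins=2, value_range=(0, 10),
                        weights=[0.5, 0.5, 0.5, 0.5, 2])
print(weighted)  # [2.0, 2]
```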

Machine Learning

Machine Learning is emerging as a popular field of data science. It has predictive power and employs applied statistics and pattern recognition techniques.

Machine learning is taking data mining to the next level.

Major machine learning tasks include classification, regression and clustering.

Questions that Business Analysts and Decision Makers are Interested In

  • Who are the best customers? aka Who are the customers with the best Customer Lifetime Value
  • Causal relationship: 
    • Results of recent experiments (More prevalent in Startup Culture)
    • Hypothesis testing: is one segment actually different from another?
    • Is the result significant or is it random chance?
    • Please note that causal relationship determination requires controlled studies to control for extraneous variables. In many industries, such as biotech, statistical significance is a must, a prerequisite for next step analysis or more business investments. 
    • Demographics of customers. Summary statistics, customer segmentation and more. 
    • How to measure profitability and other Key Performance Indicators (KPI)
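
One simple way to ask "is the difference between two segments significant or random chance?" is a permutation test. The sketch below is an illustration, not the only method: permutation_p_value is a hypothetical helper, the order values are made up, and enumerating every split only works for small samples.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    # Try every way of relabeling the pooled values into two groups.
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating point round-off.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical order values for two customer segments.
segment_a = [10, 12, 11, 13]
segment_b = [20, 22, 21, 23]
p = permutation_p_value(segment_a, segment_b)
print(round(p, 4))  # 0.0286
```

Only 2 of the 70 possible relabelings are as extreme as the observed split, so a gap this large is unlikely to be random chance. This says nothing about causation by itself, which still needs a controlled study as noted above.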

Statistical Hypothesis Testing

Python for Data Science

  • Use the conda command, similar to pip, for installing and managing packages
  • Anaconda comes with a wonderful Python IDE called Spyder

Scientific Computing using Scipy

  • scipy.integrate.quad: use the quad function to compute a definite integral. Arguments: the function to integrate, the lower bound, and the upper bound. It returns a tuple: an approximation of the result and an estimate of the error.

Becoming a Data Engineer

A data engineer takes care of data quality and provides data fast and reliably.
For example, a data funnel starts when a JavaScript tag is installed to gather user browsing data, and ends with a SaaS product where the client can visualize the data.
Many things can happen in between: data gathering, aggregation, storage and delivery. ContentSquare collects browser data, so its data grows fast: 70 million web pages per day, 3 terabytes of new data each month, and on the order of 10**15 bytes (petabytes) per year. Technologies mentioned: Kafka, Spark, Elastic, Scala, Akka.
https://youtu.be/hFsGKjPVOn8?list=WL

Sunday, August 5, 2018

Lifehack Productivity Tips for Business Bay Area Professionals No.101

General Productivity
  • Did you know that having a daily routine improves efficiency and productivity? Mark Zuckerberg of Facebook famously wears the same grey shirt and hoodie on a daily basis to simplify wardrobe choices and save minutes each day. 
  • Automate everything: use API connectors to connect applications such as Gmail, Shopify and Trello without coding: Zapier, IFTTT, Do Button
  • Fix your iphone, your glasses your gadgets DIY style https://www.ifixit.com/
  • Tool for Youtube Creators SoundCloud audio editing https://blog.soundcloud.com/2012/09/20/rec/
  • Join an online initiative to go complaint free for one month and induce positivity in your life https://gonoco.com/
  • Easily distracted? Having trouble finishing meaningful tasks? Try a 30/30 timer rule: switching tasks every 30 minutes. There are iOS apps that time you and chime for you to make a switch and move on. http://3030.binaryhammer.com/

Password Security and Privacy

  • Check if your email has been hacked https://haveibeenpwned.com/
  • Control what Google knows about you
    • http://venturebeat.com/2015/11/10/googles-new-about-me-tool-lets-you-control-personal-information-shown-by-gmail-youtube-maps-and-more/

Social Network, Social Marketing and Growth Lifehacks

  • Use Tweepi to flush Twitter followers that are inactive or don't follow you back

Personal Finance Productivity

  • Use a stock, mutual fund screener to find stocks and funds that match your investment goals for your 401K plan

Developer Productivity:

  • Pair programming for productivity - AirPair and Pivot Labs, a premium development consulting agency for startups and new technology companies, talk about pair programming for developer productivity http://www.airpair.com/pair-programming/
  • Use code linting, code validation, and auto formatting (such as Android Studio code formatting) to get error-free code
  • Always look for shortcuts to do more things faster. Some developers even use fast note-taking apps like Notational Velocity and combine them with hot keys to shave fractions of a second off their daily routine. 
  • Code a mobile app without learning native iOS or Android development by using the Ionic framework http://ionicframework.com/

Startup Productivity

  • Use prototypes and wireframes as visual aids to communicate product visions and designs clearly. 
  • Did you know that having a 3D printed prototype generates 3x more feedback for architecture and physical product designers than just having a concept drawing? 
  • Did you know that famous universities like Stanford teach students to print or draw iOS UIs and designs on paper and walk users through imaginary steps to get design feedback before they code?
  • Looking for great business ideas? Use a startup name or domain generator to get inspired!
  • Google design often teaches paper prototyping - fast, easy-to-use and effective


    This Week in Silicon Valley Byte Size Newsletter No.101

    Udacity launches AI for Trading with WorldQuant, which is also its hiring partner. If you are ready to do artificial intelligence for fintech, this may be your nanodegree! What's the ultimate dream? Probably to join a quantitative hedge fund, eventually. It is said that a little less than 30% of all US trades are done by computers. Specifically, you want Python for finance and historical data skills.
    - https://blog.udacity.com/2018/08/introducing-the-artificial-intelligence-for-trading-nanodegree-program.html

    Author Adam Fisher launches Valley of Genius as told by the hackers, founders, and freaks who made it. If you like HBO's Silicon Valley, you will probably like these unicorn and innovator stories of Silicon Valley


    Great Escape! Medium is running an August author challenge: tell Medium why and how you quit your job! https://medium.com/s/greatescape/tell-us-about-the-best-time-you-quit-your-bad-job-aaaf6d5b4e20 Your story may be featured. See this challenge post by Medium's editor.
    - https://medium.com/s/greatescape

    What does it feel like to be Steve Jobs's daughter? Her memoir is now available for readers. See this article in Vanity Fair.
    - https://www.vanityfair.com/news/2018/08/lisa-brennan-jobs-small-fry-steve-jobs-daughter




    YouTube machine learning and artificial intelligence celebrity Siraj wants to start his own School of AI. He wants it to be a "nonprofit". Strange but true. He's now recruiting Deans to head cities.





    Business Intelligence Data Warehousing BIDW Basics 101

    BIDW may employ a more stable, heavy-duty and less flexible architecture, schema and data store than startups in Silicon Valley. This may be a sacrifice for the security and stability that many Fortune 500 companies rely on.

    Structured Query Language (SQL)

    Despite the popularity of many new data stores and technologies such as Hadoop, Spark, Pandas etc., many companies still require Business Analysts to be fluent in SQL. Never forget SQL.

    Graphical User Interface (GUI)

    A GUI helps business users query and drill into data without the help of the development department. The schema and database are still designed and implemented by dev.


    Online Analytical Processing (OLAP)

    Provides a GUI query platform for business users to explore data with minimal help from the dev department. 

    Analysts and decision makers can quickly and efficiently do data analysis and ad hoc reporting without too much help from a data scientist or database administrator. 

    The schema, reports, and drilling depth may need to be pre-planned, designed and tested before being released to business users.

    This is also a large-scale system, suitable for companies such as Macy's, Gap, Walmart which have millions of new sales records per hour. 

    OLAP is for data exploration by large businesses.

    Data Warehousing 

    Data Warehousing is a serious challenge for large companies with many transactional records, product offerings across many departments. 

    Many DW providers can also provide integrated data mining and business intelligence services built on top of proprietary DW hardware (including server stack) and software.

    Best Practice

    • Sales teams on the road often need faster, better data on mobile devices to seal a deal. Don't be surprised if they get mad when numbers are off! They bring home the dough. 

    Questions that Business Analysts and Decision Makers are Interested In

    • Who are the best customers? aka Who are the customers with the best Customer Lifetime Value
    • Causal relationship: 
      • Results of recent experiments (More prevalent in Startup Culture)
      • Hypothesis testing: is one segment actually different from another?
      • Is the result significant or is it random chance?
      • Please note that causal relationship determination requires controlled studies to control for extraneous variables. In many industries, such as biotech, statistical significance is a must, a prerequisite for next step analysis or more business investments. 
      • Demographics of customers. Summary statistics, customer segmentation and more. 
      • How to measure profitability and other Key Performance Indicators (KPI)

    SQL Basics 101

    SELECT, INSERT, UPDATE with SQL

    The Equivalent of HelloWorld of SQL

    SELECT *
    FROM table_name

    Select all columns and rows from a table. In real-life practice, we may want to avoid SELECT * because it may request and display a lot of unnecessary data, using up precious computing resources, especially on large systems at companies with large databases. 

    A Basic Select Statement

    SELECT ProductID, Name
    FROM Product
    WHERE Price > 2.00


    A Fancier Select Statement

    SELECT * FROM CUSTOMERS WHERE AGE > 25 AND SEX = 'F' AND REGION='CA'

    The * means all columns. In this statement, all columns are returned, but only for the rows that match the WHERE conditions. 

    An Advanced Select Statement with Join Statement

    SELECT p.[Name] AS ProductName,
    c.[Name] AS CategoryName
    FROM SalesLT.Product AS p
    JOIN SalesLT.ProductCategory AS c
    ON p.ProductCategoryID = c.ProductCategoryID;

    An Insert Statement


    INSERT INTO table_name (column1, column2, column3, ...)
    VALUES (value1, value2, value3, ...);



    Useful SQL interview skills

    Be able to read and comprehend SQL scripts

    Be able to compose advanced SQL queries including aggregation, slicing and dicing.

    Advanced SQL Query Select Count and Group By

    It's easy to use SQL to display all the data columns and rows. But that's not practical. It's not practical for the business user to get the entire database, nor is it memory efficient. 

    How to view aggregate data? Use Group By, don't forget to use Count() too, else the result is again not meaningful. 

    SELECT COUNT(CUSTOMER_ID), STATE
    FROM CUSTOMERS
    GROUP BY STATE
    ORDER BY COUNT(CUSTOMER_ID) DESC;

    Group By aggregates rows that share a value. In this case we are interested in aggregating data by State in the Customers table. What kind of statewide information are we trying to get? We are trying to count the number of customers in each state, as measured by customer_id. In addition, once the data is aggregated, we order the results in descending order by count(customer_id), from the largest count to the smallest. 
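
To try the query without a database server, here is a small sketch using Python's built-in sqlite3 module. The six customer rows are made up for illustration; the table and column names follow the query above.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (CUSTOMER_ID INTEGER PRIMARY KEY, STATE TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMERS (CUSTOMER_ID, STATE) VALUES (?, ?)",
    [(1, "CA"), (2, "CA"), (3, "NY"), (4, "CA"), (5, "NY"), (6, "WA")],
)

rows = conn.execute(
    """
    SELECT COUNT(CUSTOMER_ID), STATE
    FROM CUSTOMERS
    GROUP BY STATE
    ORDER BY COUNT(CUSTOMER_ID) DESC
    """
).fetchall()
print(rows)  # [(3, 'CA'), (2, 'NY'), (1, 'WA')]
conn.close()
```

Six rows go in and three rows come out, one per state, largest count first.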

    Compare a Select all statement which just returns all the data rows
    to
    Select Count() and Group By statement that aggregates data by state



    SQL is great for the following queries:


    • SQL segmentation example, analyze by location: SELECT location, COUNT(*) FROM sales GROUP BY location

    Additional Tools

    Why should you learn SQL - common SQL usage

    Swift iOS Development: Core Data uses SQLite as a persistent data store.

    Alternatives to SQL language, sqlite, and relational databases
    • ORM and ActiveRecords used in Rails
    • Hadoop with Hive, which provides a SQL-like language (HiveQL)
    • Spark and the new way to run SQL queries on structured, distributed data
    • Firebase real time database and JSON
    • JSON objects
    • NoSQL databases like MongoDB

    SQL Security
    Cross Site Scripting and SQL Injection
    If users are allowed to enter special characters in input boxes and forms on a website, hackers may inject code that runs SQL queries against your database and extracts data illegally. Many websites, such as Yelp, do not allow special characters. Some websites escape the user input before processing it on the server, so special characters are treated as plain string data, reducing the security risk. 
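
Here is a minimal sketch of the problem and one common fix, parameterized queries, using Python's built-in sqlite3 module. The users table, its rows, and the two function names are hypothetical and only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)

def find_user_unsafe(name):
    # DON'T: the user input is pasted into the SQL text itself.
    query = "SELECT email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # DO: the ? placeholder passes the input as data, never as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "nobody' OR '1'='1"
print(find_user_unsafe(malicious))  # leaks every email in the table
print(find_user_safe(malicious))    # []
```

The quote characters in the malicious input change the meaning of the first query, while the second query just looks for a user with that odd name and finds none.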

    Saturday, August 4, 2018

    3D Printing Basics 101

    Learn how to 3D Print with Shapeways - Getting Started with 3D Printing

    Take a 3D printing class with shapeways - a famous European printing house http://www.skillshare.com/classes/design/Introduction-to-3D-Printing-An-Easy-Start-to-Your-First-3D-Design/2097968974

    My personal favorite app to get started is TinkerCAD. You don't need to know 3D modeling to get started.

    Getting Started with 3rd Party Printing

    Here are just a few of the ways our 3D tools make it easy to 3D print with Shapeways:

    1. Check to make sure your design is ready for 3D printing
    2. Reinforce designs that are too thin
    3. Identify loose shells in a 3D scan
    4. Save on labor cost by adding a sinter shell container to multi-part designs
    5. Get feedback from our engineers if we're unable to manufacture your design
    Note: Shapeways ships from Europe. 

    Useful Apps


    • 123D Scan - an app that can scan real world object into 3D object
    • Shapeways 3D printing model checking tool

    Friday, August 3, 2018

    Android Basics - Views and ViewGroups

    Common views

    TextView, ImageView, Button

    Nested View Groups

    Putting views inside other views. Nesting can be costly and get expensive if there are too many layers of nesting. 

    Best practice using Android Views

    • The view that contains all the views is the root view. 
    • You can organize material design cards into view groups; the button, icon and image inside the card are all nested views. 
    • The LinearLayout horizontal and vertical orientations are extremely important.
    • If the elements are not horizontally or vertically spread out, RelativeLayout may be better.
    • Draw a view hierarchy diagram to organize views
    • Indent children views
    • Set layout_width, layout_height dynamically using match_parent or wrap_content

    Thursday, August 2, 2018

    Android ImageView with Drawable images

    The goal is for mobile developers to load images onto mobile applications when limited memory is available.

    Android drawable images such as @drawable/my_img can be set as the source of an ImageView. The image file extension is left out of the reference. Drawable refers to the fact that the image can be drawn on the screen. Android manages all drawables in the res/drawable directory.
    https://developer.android.com/guide/topics/resources/drawable-resource

    Drawable supports mainly bitmap format including .jpg, .png, .gif. The unit element for these images is a pixel.

    Density-independent pixels (dp or dip) allow an ImageView to scale and resize across screen sizes and pixel densities, across the wide variety of Android devices. Specifying button size in dp instead of px makes sure the button is still reasonably sized and clickable on high-resolution, high-density screens (a high number of dots or pixels per inch).
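
The dp idea comes down to one conversion: px = dp * (dpi / 160), where 160 dpi (mdpi) is the baseline at which 1 dp equals 1 px. The sketch below is plain Python for illustration, not Android code; dp_to_px is a hypothetical helper and 48 dp is just an example button size.

```python
# Nominal dots per inch for the standard Android density buckets.
DENSITY_DPI = {"mdpi": 160, "hdpi": 240, "xhdpi": 320, "xxhdpi": 480}

def dp_to_px(dp, dpi):
    """Convert density-independent pixels to physical pixels."""
    return round(dp * dpi / 160)

for bucket, dpi in DENSITY_DPI.items():
    print(bucket, dp_to_px(48, dpi))
# mdpi 48
# hdpi 72
# xhdpi 96
# xxhdpi 144
```

The same 48 dp button needs three times as many pixels on an xxhdpi screen as on an mdpi screen, which is why a size given in px would look tiny on dense displays.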

    Best practice is to include different image sizes for the different screen densities, so each device loads an appropriately sized image. Place each version in the matching density folder (mdpi, hdpi, xhdpi, xxhdpi); Android then automatically loads the drawable that corresponds to the device's density.

    Developers also use ImageMagick to compress photos and Android Drawable Importer to convert images to drawables https://plugins.jetbrains.com/plugin/7658-android-drawable-importer

    Bash - Command Line productivity for developers

    Bash can improve developer productivity. It is available on Mac via terminals. Developers can use bash to write build scripts, enhance dev productivity, use curl to visit and process websites, interact with file systems, modify files, pipe outputs into files.
    • ~ home directory
    • pwd command to show current working directory
    • cd change current directory command
    • ls list files command
    • ls -l list files with the long flag to display detailed info on access, directory, owner, date, file name
    • ls -a list all files, including hidden files
    • . current working directory
    • .. parent directory
    • cd .. to go up a directory
    • Vim is a terminal-based text editor you can launch from bash

    More reading
    - http://lifehacker.com/5633909/who-needs-a-mouse-learn-to-use-the-command-line-for-almost-anything

    Machine Learning SVM

    SVM can use kernel functions to map data into a space where it becomes linearly separable, so it can give non-linear, intricate decision boundaries. For a linear SVM, the decision boundary is a straight line. One check: apply a linear SVM; if it has 0% training error, your data is linearly separable.

    The C parameter controls the trade-off between a smooth decision boundary and classifying training points correctly. The effect of C is especially obvious with the RBF kernel. A large C means getting more training points classified correctly, at the risk of not generalizing well: larger C --> more intricate boundaries

    Gamma Parameter
    Gamma defines how far the influence of a single training example reaches. If gamma has a low value, each point has a far reach; if gamma has a high value, each point has a closer reach. A high gamma value will make decision boundaries pay close attention to those points that are close, but ignore those that are far. A high value of gamma could mean a very wiggly decision boundary.

    With a high gamma, a point close to the frontier can carry a lot of weight and pull the frontier close to itself. A low gamma means more points have influence on the frontier, so the frontier ends up being smoother.
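
The "reach" of a training point can be seen directly in the RBF kernel, exp(-gamma * squared distance). The sketch below is plain Python and only evaluates the kernel; it does not train an SVM. The points and gamma values are made up for illustration.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

origin = (0.0, 0.0)
near = (1.0, 0.0)   # 1 unit from the origin
far = (3.0, 0.0)    # 3 units from the origin

for gamma in (0.1, 10.0):
    # Similarity of the origin to a near point and to a far point.
    print(gamma, rbf_kernel(origin, near, gamma), rbf_kernel(origin, far, gamma))
```

With gamma = 0.1 the far point still has a noticeable similarity to the origin, so it keeps some influence. With gamma = 10 both values collapse toward zero and only points that are very close matter, which is what produces the wiggly boundary.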

    svm kernel http://scikit-learn.org/stable/modules/svm.html#svm-kernels


    Use SVM for Stock finance https://en.wikipedia.org/wiki/Support_vector_machine

    Sunday, July 29, 2018

    Game Design Concepts 101

    (draft in progress)

    Gaming Mechanics

    Four types of gamers especially in MMOGs
    • Achiever
    • Explorer
    • Socializer
    • Killer
    Concept: how do you measure gaming experience?

    Modern games often design addictive cycles to keep gamers engaged.
    It's a big deal because it can shake your moral ground: making a game addictive makes for a successful product, but it can also do harm to gamers.

    Game Algorithms

    In-Game Economy Design
    Virtual goods are all the rage. That's how a lot of freemium games and social games make a buck these days.

    Online Social Games
    Examples include Facebook games like FarmVille: get lots of traffic, viral factor, millions of people can play it each day (at its height 100 million plus players play online social games each day)

    Some gaming companies got so huge that their entire focus shifted to analytics instead of game design.

    Concept - Gamification
    Giving things that are not pure games gaming elements and incentives to drive results. Gamification takes advantage of fun and addictive gaming mechanics to encourage results.

    Metrics
    Summary statistics, segmentation, average per segment, user acquisition, conversion, customer lifetime value, lifetime spending

    http://nativex.com/science/how-do-i-know-if-my-mobile-game-is-healthy/

    Tools for Game Developers:

    • HTML Canvas can be used for gaming. The drawback is that it doesn't have undo or redo; you have to redraw everything again.

    Tuesday, July 24, 2018

    List of Natural Language Processing NLP and Machine Learning Papers

    • Andreas, J., Rohrbach, M., Darrell, T., Klein, D., 2016. Neural Module Networks, CVPR
    • Auli, M., Galley, M., Quirk, C. and Zweig, G., 2013. Joint language and translation modeling with recurrent neural networks. In EMNLP.
    • Auli, M., and Gao, J., 2014. Decoder integration and expected bleu training for recurrent neural network language models. In ACL.
    • Bahdanau, D., Cho, K., and Bengio, Y. 2015. Neural machine translation by jointly learning to align and translate, in ICLR 2015.
    • Bejar, I., Chaffin, R. and Embretson, S. 1991. Cognitive and psychometric analysis of analogical problem solving. Recent research in psychology.
    • Bengio, Y., 2009. Learning deep architectures for AI. Foundations and Trends in Machine Learning, vol. 2.
    • Bengio, Y., Courville, A., and Vincent, P. 2013. Representation learning: A review and new perspectives. IEEE Trans. PAMI, vol. 35, pp. 1798-1828.
    • Bengio, Y., Ducharme, R., and Vincent, P., 2000. A Neural Probabilistic Language Model, in NIPS.
    • Berant, J., Chou, A., Frostig, R., Liang, P. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP.
    • Berant, J., and Liang, P. 2014. Semantic parsing via paraphrasing. In ACL.
    • Bian, J., Gao, B., Liu, T. 2014. Knowledge-Powered Deep Learning for Word Embedding. In ECML.
    • Blei, D., Ng, A., and Jordan M. 2001. Latent dirichlet allocation. In NIPS.
    • Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J. and Yakhnenko, O. 2013. Translating Embeddings for Modeling Multi-relational Data. In NIPS.
    • Bordes, A., Chopra, S., and Weston, J. 2014. Question answering with subgraph embeddings. In EMNLP.
    • Bordes, A., Glorot, X., Weston, J. and Bengio Y. 2012. Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing. In AISTATS.
    • Brown, P., deSouza, P., Mercer, R., Della Pietra, V., and Lai, J. 1992. Class-based n-gram models of natural language. Computational Linguistics 18 (4).
    • Chandar, A. P. S., Lauly, S., Larochelle, H., Khapra, M. M., Ravindran, B., Raykar, V., and Saha, A. (2014). An autoencoder approach to learning bilingual word representations. In NIPS.
    • Chang, K., Yih, W., and Meek, C. 2013. Multi-Relational Latent Semantic Analysis. In EMNLP.
    • Chang, K., Yih, W., Yang, B., and Meek, C. 2014. Typed Tensor Decomposition of Knowledge Bases for Relation Extraction. In EMNLP.
    • Collobert, R., and Weston, J. 2008. A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. In ICML.
    • Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., and Kuksa, P., 2011. Natural language processing (almost) from scratch. in JMLR, vol. 12.
    • Cui, L., Zhang, D., Liu, S., Chen, Q., Li, M., Zhou, M., and Yang, M. (2014). Learning topic representation for SMT with neural networks. In ACL.
    • Dahl, G., Yu, D., Deng, L., and Acero, A. 2012. Context-dependent, pre-trained deep neural networks for large vocabulary speech recognition, IEEE Trans. Audio, Speech, & Language Proc., Vol. 20 (1), pp. 30-42.
    • Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T., and Harshman, R. 1990. Indexing by latent semantic analysis. J. American Society for Information Science, 41(6): 391-407
    • Devlin, J., Cheng, H., Fang, H., Gupta, S., Deng, L., He, X., Zweig, G., and Mitchell, M., 2015. Language Models for Image Captioning: The Quirks and What Works, ACL
    • Deng, L., He, X., Gao, J., 2013. Deep stacking networks for information retrieval, ICASSP
    • Deng, L., Seltzer, M., Yu, D., Acero, A., Mohamed, A., and Hinton, G., 2010. Binary Coding of Speech Spectrograms Using a Deep Auto-encoder, in Interspeech.
    • Deng, L., Tur, G., He, X., and Hakkani-Tur, D. 2012. Use of kernel deep convex networks and end-to-end learning for spoken language understanding, Proc. IEEE Workshop on Spoken Language Technologies.
    • Deng, L., Yu, D. and Acero, A. 2006. Structured speech modeling, IEEE Trans. on Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1492-1504.
    • Deng, L., Yu, D., and Platt, J. 2012. Scalable stacking and learning for building deep architectures, Proc. ICASSP.
    • Deng, L. and Yu, D. 2014. Deep learning: methods and applications. Foundations and Trends in Signal Processing 7:3-4.
    • Deoras, A., and Sarikaya, R., 2013. Deep belief network based semantic taggers for spoken language understanding, in INTERSPEECH.
    • Devlin, J., Zbib, R., Huang, Z., Lamar, T., Schwartz, R., and Makhoul, J., 2014. Fast and Robust Neural Network Joint Models for Statistical Machine Translation, ACL.
    • Duh, K. 2014. Deep learning for natural language processing and machine translation. Tutorial. 2014.
    • Duh, K., Neubig, G., Sudoh, K., and Tsukada, H. (2013). Adaptation data selection using neural language models: Experiments in machine translation. In ACL.
    • Fader, A., Zettlemoyer, L., and Etzioni, O. 2013. Paraphrase-driven learning for open question answering. In ACL.
    • Fang, H., Gupta, S., Iandola, F., Srivastava, R., Deng, L., Dollár, P., Gao, J., He, X., Mitchell, M., Platt, J., Zitnick, L., Zweig, G., “From Captions to Visual Concepts and Back,” arXiv:1411.4952
    • Faruqui, M. and Dyer, C. (2014). Improving vector space word representations using multilingual correlation. In EACL.
    • Faruqui, M., Dodge, J., Jauhar, S., Dyer, C., Hovy, E., Smith, N. 2015. Retrofitting Word Vectors to Semantic Lexicons. In NAACL-HLT.
    • Faruqui, M., Tsvetkov, Y., Yogatama, D., Dyer, C., Smith, N. 2015. Sparse Overcomplete Word Vector Representations. In ACL.
    • Firth, J. R. 1957. Papers in Linguistics 1934–1951, Oxford University Press, 1957
    • Frome, A., Corrado, G., Shlens, J., Bengio, S., Dean, J., Ranzato, M., and Mikolov, T., 2013. DeViSE: A Deep Visual-Semantic Embedding Model, Proc. NIPS.
    • Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F. 2013. Association Rule Mining Under Incomplete Evidence in Ontological Knowledge Bases. In WWW.
    • Gao, J., He, X., Yih, W-t., and Deng, L. 2014a. Learning continuous phrase representations for translation modeling. In ACL.
    • Gao, J., He, X., and Nie, J-Y. 2010. Clickthrough-based translation models for web search: from word models to phrase models. In CIKM.
    • Gao, J., Pantel, P., Gamon, M., He, X., Deng, L., and Shen, Y. 2014b. Modeling interestingness with deep neural networks. In EMNLP
    • Gao, J., Toutanova, K., Yih, W-T. 2011. Clickthrough-based latent semantic models for web search. In SIGIR.
    • Gao, J., Yuan, W., Li, X., Deng, K., and Nie, J-Y. 2009. Smoothing clickthrough data for web search ranking. In SIGIR.
    • Gao, J., and He, X. 2013. Training MRF-based translation models using gradient ascent. In NAACL-HLT.
    • Getoor, L., and Taskar, B. editors. 2007. Introduction to Statistical Relational Learning. The MIT Press.
    • Graves, A., Jaitly, N., and Mohamed, A., 2013a. Hybrid speech recognition with deep bidirectional LSTM, Proc. ASRU.
    • Graves, A., Mohamed, A., and Hinton, G., 2013. Speech recognition with deep recurrent neural networks, Proc. ICASSP.
    • He, J., Chen, J., He, X., Gao, J., Li, L., Deng, L., Ostendorf, M., 2015. Deep Reinforcement Learning with an Action Space Defined by Natural Language, arXiv:1511.04636 (to appear on EMNLP16)
    • He, X. and Deng, L., 2013. Speech-Centric Information Processing: An Optimization-Oriented Approach, in Proceedings of the IEEE.
    • He, X. and Deng, L., 2012. Maximum Expected BLEU Training of Phrase and Lexicon Translation Models, ACL.
    • He, X., Deng, L., and Chou, W., 2008. Discriminative learning in sequential pattern recognition, Sept. IEEE Sig. Proc. Mag.
    • Hermann, K. M. and Blunsom, P. (2014). Multilingual models for compositional distributed semantics. In ACL.
    • Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T., and Kingsbury, B., 2012. Deep Neural Networks for Acoustic Modeling in Speech Recognition, IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97.
    • Hinton, G., Osindero, S., and Teh, Y-W. 2006. A fast learning algorithm for deep belief nets. Neural Computation, 18: 1527-1554.
    • Hinton, G., and Salakhutdinov, R., 2010. Discovering binary codes for documents by learning deep generative models. Topics in Cognitive Science.
    • Hu, Y., Auli, M., Gao, Q., and Gao, J. 2014. Minimum translation modeling with recurrent neural networks. In EACL.
    • Huang, E., Socher, R., Manning, C., and Ng, A. 2012. Improving word representations via global context and multiple word prototypes, Proc. ACL.
    • Huang, P., He, X., Gao, J., Deng, L., Acero, A., and Heck, L. 2013. Learning deep structured semantic models for web search using clickthrough data. In CIKM.
    • Hutchinson, B., Deng, L., and Yu, D., 2012. A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition, Proc. ICASSP.
    • Hutchinson, B., Deng, L., and Yu, D., 2013. Tensor deep stacking networks, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 35, pp. 1944 - 1957.
    • Jansen, P., Surdeanu, M., Clark, P. 2014. Discourse Complements Lexical Semantics for Non-factoid Answer Reranking. In ACL.
    • Jurgens, D., Mohammad, S., Turney, P. and Holyoak, K. 2012. SemEval-2012 Task 2: Measuring degrees of relational similarity. In SemEval.
    • Jurafsky, D., & Martin, J. H. (2014). Speech and language processing (Vol. 3). London: Pearson.
    • Kafle, K., Kanan, C., 2016. Answer-Type Prediction for Visual Question Answering, CVPR
    • Kalchbrenner, N. and Blunsom, P. (2013). Recurrent continuous translation models. In EMNLP.
    • Kiros, R., Zemel, R., and Salakhutdinov, R. 2013. Multimodal Neural Language Models, Proc. NIPS Deep Learning Workshop.
    • Klementiev, A., Titov, I., and Bhattarai, B. (2012). Inducing crosslingual distributed representations of words. In COLING.
    • Kocisky, T., Hermann, K. M., and Blunsom, P. (2014). Learning bilingual word representations by marginalizing alignments. In ACL.
    • Koehn, P. 2009. Statistical Machine Translation. Cambridge University Press.
    • Krizhevsky, A., Sutskever, I., and Hinton, G., 2012. ImageNet Classification with Deep Convolutional Neural Networks, NIPS.
    • Landauer, T., 2002. On the computational basis of learning and cognition: Arguments from LSA. Psychology of Learning and Motivation, 41:43–84.
    • Lao, N., Mitchell, T., and Cohen, W. 2011. Random walk inference and learning in a large scale knowledge base. In EMNLP.
    • Lauly, S., Boulanger, A., and Larochelle, H. (2013). Learning multilingual word representations using a bag-of-words autoencoder. In NIPS.
    • Le, H-S, Oparin, I., Allauzen, A., Gauvain, J-L., Yvon, F., 2013. Structured output layer neural network language models for speech recognition, IEEE Transactions on Audio, Speech and Language Processing.
    • LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. 1998. Gradient-based learning applied to document recognition, Proceedings of the IEEE, Vol. 86, pp. 2278-2324.
    • Levy, O., and Goldberg, Y. 2014. Linguistic Regularities in Sparse and Explicit Word Representations. In CoNLL.
    • Levy, O., and Goldberg, Y. 2014. Neural Word Embeddings as Implicit Matrix Factorization. In NIPS.
    • Li, P., Hastie, T., and Church, K. 2006. Very sparse random projections, in Proc. SIGKDD.
    • Li, P., Liu, Y., and Sun, M. (2013). Recursive autoencoders for ITG-based translation. In EMNLP.
    • Li, P., Liu, Y., Sun, M., Izuha, T., and Zhang, D. (2014b). A neural reordering model for phrase-based translation. In COLING.
    • Liu, S., Yang, N., Li, M., and Zhou, M. (2014). A recursive recurrent neural network for statistical machine translation. In ACL.
    • Liu, X., Gao, J., He, X., Deng, L., Duh, K., Wang, Y., 2015. Representation learning using multi-task deep neural networks for semantic classification and information retrieval, NAACL
    • Liu, L., Watanabe, T., Sumita, E., and Zhao, T. (2013). Additive neural networks for statistical machine translation. In ACL.
    • Lu, S., Chen, Z., and Xu, B. (2014). Learning new semi-supervised deep auto-encoder features for statistical machine translation. In ACL.
    • Maskey, S., and Zhou, B. 2012. Unsupervised deep belief feature for speech translation, in ICASSP.
    • Mesnil, G., He, X., Deng, L., and Bengio, Y., 2013. Investigation of Recurrent-Neural-Network Architectures and Learning Methods for Spoken Language Understanding, in Interspeech.
    • Mikolov, T., Kombrink, S., Burget, L., Cernocky, J., Khudanpur, S. 2011. Extensions of recurrent neural network based language model. In ICASSP.
    • Mikolov, T. 2012. Statistical Language Models based on Neural Networks, Ph.D. thesis, Brno University of Technology.
    • Mikolov, T., Chen, K., Corrado, G., and Dean, J. 2013. Efficient estimation of word representations in vector space, Proc. ICLR.
    • Mikolov, T., Kombrink, S., Burget, L., Cernocky, J., Khudanpur, S., 2011. Extensions of Recurrent Neural Network LM. ICASSP.
    • Mikolov, T., Yih, W., Zweig, G., 2013. Linguistic Regularities in Continuous Space Word Representations. In NAACL-HLT.
    • Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. 2013. Distributed Representations of Words and Phrases and their Compositionality. In NIPS.
    • Mnih, A., Kavukcuoglu, K. 2013. Learning word embeddings efficiently with noise-contrastive estimation. In NIPS.
    • Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M., 2013. Playing Atari with Deep Reinforcement Learning, NIPS
    • Mohamed, A., Yu, D., and Deng, L. 2010. Investigation of full-sequence training of deep belief networks for speech recognition, Proc. Interspeech.
    • Mohammad, S., Dorr, B., and Hirst, G. 2008. Computing word pair antonymy. In EMNLP.
    • Narasimhan, K., Kulkarni, T., Barzilay, R., 2015. Language Understanding for Text-based Games Using Deep Reinforcement Learning. EMNLP
    • Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., and Ng, A. 2011. Multimodal deep learning, Proc. ICML.
    • Nickel, M., Tresp, V., and Kriegel, H. 2011. A three-way model for collective learning on multi-relational data. In ICML.
    • Niehues, J., Waibel, A. 2013. Continuous space language models using Restricted Boltzmann Machines. In IWSLT.
    • Noh, H., Seo, P., Han, B., 2016. Image Question Answering Using Convolutional Neural Network With Dynamic Parameter Prediction, CVPR
    • Palangi, H., Deng, L., Shen, Y., Gao, J., He, X., Chen, J., Song, X., Ward R., 2016. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval, IEEE/ACM Transactions on Audio, Speech, and Language Processing 24 (4), 694-707
    • Pennington, J., Socher, R., Manning, C. 2014. GloVe: Global Vectors for Word Representation. In EMNLP.
    • Reddy, S., Lapata, M., and Steedman, M. 2014. Large-scale semantic parsing without question-answer pairs. Transactions of the Association for Computational Linguistics (TACL).
    • Sainath, T., Mohamed, A., Kingsbury, B., and Ramabhadran, B. 2013. Convolutional neural networks for LVCSR, Proc. ICASSP.
    • Salakhutdinov, R., and Hinton, G., 2007. Semantic hashing. In Proc. SIGIR Workshop Information Retrieval and Applications of Graphical Models
    • Salton, G. and McGill, M. 1983. Introduction to Modern Information Retrieval. McGraw Hill.
    • Sarikaya, R., Hinton, G., and Ramabhadran, B., 2011. Deep belief nets for natural language call-routing, in Proceedings of the ICASSP.
    • Schwenk, H. 2012. Continuous space translation models for phrase-based statistical machine translation, in COLING.
    • Schwenk, H., Rousseau, A., and Attik, M., 2012. Large, pruned or continuous space language models on a gpu for statistical machine translation, in NAACL-HLT 2012 Workshop.
    • Seide, F., Li, G., and Yu, D. 2011. Conversational speech transcription using context-dependent deep neural networks, Proc. Interspeech
    • Shen, Y., He, X., Gao, J., Deng, L., and Mesnil, G. 2014. Learning Semantic Representations Using Convolutional Neural Networks for Web Search, in Proceedings of WWW.
    • Shen, Y., He, X., Gao, J., Deng, L., and Mesnil, G. 2014. A convolutional latent semantic model for web search. CIKM
    • Shih, K., Singh, S., Hoiem, D., 2016. Where to Look: Focus Regions for Visual Question Answering, CVPR
    • Silver, D., Huang, A., Maddison, C., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D., 2016. Mastering the game of Go with deep neural networks and tree search, Nature
    • Simonyan, K., Zisserman, A., 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015
    • Socher, R., Chen, D., Manning, C., and Ng, A. 2013. Reasoning With Neural Tensor Networks For Knowledge Base Completion. In NIPS.
    • Socher, R., Huval, B., Manning, C., Ng, A., 2012. Semantic compositionality through recursive matrix-vector spaces. In EMNLP.
    • Socher, R., Lin, C., Ng, A., and Manning, C. 2011. Learning continuous phrase representations and syntactic parsing with recursive neural networks, Proc. ICML.
    • Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., and Potts, C. 2013. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Proc. EMNLP
    • Son, L. H., Allauzen, A., and Yvon, F. (2012). Continuous space translation models with neural networks. In NAACL.
    • Song, X., He, X., Gao, J., and Deng, L. 2014. Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model. MSR Tech Report.
    • Song, Y., Wang, H., and He, X., 2014. Adapting Deep RankNet for Personalized Search. Proc. WSDM.
    • Songyot, T. and Chiang, D. (2014). Improving word alignment using word similarity. In EMNLP.
    • Sundermeyer, M., Alkhouli, T., Wuebker, J., and Ney, H. (2014). Translation modeling with bidirectional recurrent neural networks, in EMNLP.
    • Sutton, R., Barto, A., 1998. Reinforcement Learning: An Introduction. MIT Press.
    • Tamura, A., Watanabe, T., and Sumita, E. (2014). Recurrent neural networks for word alignment model. In ACL.
    • Tapaswi, M., Zhu, Y., Stiefelhagen, R., Torralba, A., Urtasun, R., Fidler, S., 2016. MovieQA: Understanding Stories in Movies Through Question-Answering, CVPR
    • Tran, K. M., Bisazza, A., and Monz, C. (2014). Word translation prediction for morphologically rich languages with bilingual neural networks. In EMNLP.
    • Tran, K., He, X., Zhang, L., Sun, J., Carapcea, C., Thrasher, C., Buehler, C., Sienkiewicz, C., “Rich Image Captioning in the Wild,” DeepVision, CVPR 2016
    • Tur, G., Deng, L., Hakkani-Tur, D., and He, X., 2012. Towards Deeper Understanding Deep Convex Networks for Semantic Utterance Classification, in ICASSP.
    • Turney, P. 2008. A uniform approach to analogies, synonyms, antonyms, and associations. In COLING.
    • Vaswani, A., Zhao, Y., Fossum, V., and Chiang, D. 2013. Decoding with large-scale neural language models improves translation. In EMNLP.
    • Wang, H., He, X., Chang, M., Song, Y., White, R., Chu, W., 2013. Personalized ranking model adaptation for web search, SIGIR
    • Wang, Z., Zhang, J., Feng, J., Chen, Z. 2014. Knowledge Graph and Text Jointly Embedding. In EMNLP.
    • Watkins, C., and Dayan, P., 1992. Q-learning. Machine Learning
    • Wright, S., Kanevsky, D., Deng, L., He, X., Heigold, G., and Li, H., 2013. Optimization Algorithms and Applications for Speech and Language Processing, in IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 11.
    • Wu, Q., Wang, P., Shen, C., Dick, A., Hengel, A., 2016. Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge From External Sources, CVPR
    • Wu, H., Dong, D., Hu, X., Yu, D., He, W., Wu, H., Wang, H., and Liu, T. (2014a). Improve statistical machine translation with context-sensitive bilingual semantic embedding model. In EMNLP.
    • Wu, Y., Watanabe, T., and Hori, C. (2014b). Recurrent neural network-based tuple sequence model for machine translation. In COLING.
    • Xu, C., Bai, Y., Bian, J., Gao, B., Wang, G., Liu, X., Liu, T. 2014. RC-NET: A General Framework for Incorporating Knowledge into Word Representations. In CIKM.
    • Yang, B., Yih, W., He, X., Gao, J., and Deng L. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In ICLR.
    • Yang, N., Liu, S., Li, M., Zhou, M., and Yu, N. 2013. Word alignment modeling with context dependent deep neural network. In ACL.
    • Yang, Y., Chang, M. 2015. S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking. In ACL.
    • Yao, K., Zweig, G., Hwang, M-Y., Shi, Y., Yu, D., 2013. Recurrent neural networks for language understanding, submitted to Interspeech.
    • Yao, X., Van Durme, B. 2014. Information Extraction over Structured Data: Question Answering with Freebase. In ACL.
    • Dauphin, Y., Tur, G., Hakkani-Tur, D., Heck, L., 2014. Zero-Shot Learning and Clustering for Semantic Utterance Classification Using Deep Learning. In ICLR
    • Yogatama, D., Faruqui, M., Dyer, C., Smith, N. 2015. Learning Word Representations with Hierarchical Sparse Coding. In ICML.
    • Yih, W., Toutanova, K., Platt, J., and Meek, C. 2011. Learning discriminative projections for text similarity measures. In CoNLL.
    • Yih, W., Zweig, G., Platt, J. 2012. Polarity Inducing Latent Semantic Analysis. In EMNLP-CoNLL.
    • Yih, W., Chang, M., Meek, C., Pastusiak, A. 2013. Question Answering Using Enhanced Lexical Semantic Models. In ACL.
    • Yih, W., He, X., Meek, C. 2014. Semantic Parsing for Single-Relation Question Answering. In ACL.
    • Yih, W., Chang, M., He, X., Gao, J. 2015. Semantic parsing via staged query graph generation: Question answering with knowledge base. In ACL.
    • Zeiler, M. and Fergus, R. 2013. Visualizing and understanding convolutional networks, arXiv:1311.2901, pp. 1-11.
    • Zhang, J., Liu, S., Li, M., Zhou, M., and Zong, C. (2014). Bilingually-constrained phrase embeddings for machine translation. In ACL.
    • Zhu, Y., Groth, O., Bernstein, M., Fei-Fei, L., 2016. Visual7W: Grounded Question Answering in Images, CVPR
    • Zou, W. Y., Socher, R., Cer, D., and Manning, C. D. (2013). Bilingual word embeddings for phrase-based machine translation. In EMNLP.

    Thursday, July 19, 2018

    Reinforcement Learning Q Learning

    Explore: <s, a> ---> s' reads as "move from the current state s to the next state s' via action a". Each action earns a reward, which can be positive (positive reinforcement) or negative (punishment or discouragement). As the agent (for example, a robot) explores the environment, it updates the Q table, which tracks the accumulated score for each state-action pair.
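
    To make the Q table idea concrete, here is a minimal sketch of tabular Q-learning in Python. Everything in it is an assumption made for illustration and not taken from any particular library: a made-up corridor with four states, a reward of +1 for reaching the last state, and hand-picked values for the learning rate, discount factor, and exploration rate.

```python
import random

# Tabular Q-learning on a made-up 4-state corridor (illustrative names and values).
N_STATES = 4            # states 0..3; reaching state 3 ends the episode
GOAL = 3
ACTIONS = (-1, +1)      # move left, move right
ALPHA = 0.5             # learning rate (assumed)
GAMMA = 0.9             # discount factor (assumed)
EPSILON = 0.2           # exploration rate (assumed)


def step(state, action):
    """Return (next_state, reward): +1 for reaching the goal, otherwise 0."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # The Q table: one score per (state, action) pair, starting at zero.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy: mostly follow the table, sometimes explore at random.
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2, r = step(s, a)
            best_next = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in ACTIONS)
            # Move Q(s, a) toward the reward plus the discounted best future score.
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
            s = s2
    return q


q_table = train()
for s in range(N_STATES - 1):
    best = max(ACTIONS, key=lambda act: q_table[(s, act)])
    print("state", s, "best action:", best)
```

    After training, the highest-scoring action in every non-terminal state should be +1 (move right, toward the goal), because the reward gets propagated backwards through the table one state at a time.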

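    The Bellman equation is one of the utility equations used to track scores.
    U(s) = R(s) + γ max_a Σ_s' T(s, a, s') U(s')
    Here T(s, a, s') is the probability of reaching state s' when taking action a in state s. The function is nonlinear because of the max. In plain terms, the utility of the current state is its immediate reward plus a discounted fraction (γ, between 0 and 1) of the best expected utility reachable through future actions.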

    Start with arbitrary utilities, explore, and update each state's utility based on the neighboring states it can reach through allowed moves. Repeat the update at every iteration.
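
    The update loop described above can be sketched as value iteration. This is only an illustration under stated assumptions: the corridor world, the reward values, and the discount factor of 0.9 are all made up, and moves are deterministic, so the sum over next states in the Bellman equation collapses to a single term.

```python
# Value iteration on a tiny, made-up corridor world (all names are illustrative).
# States 0..3 sit in a row; state 3 is terminal and pays a reward of +1.
GAMMA = 0.9                      # discount factor (assumed value)
STATES = [0, 1, 2, 3]
TERMINAL = 3
REWARD = {0: 0.0, 1: 0.0, 2: 0.0, 3: 1.0}
ACTIONS = (-1, +1)               # move left or move right


def next_state(s, a):
    """Deterministic move, clipped to the ends of the corridor."""
    return min(max(s + a, 0), len(STATES) - 1)


def value_iteration(sweeps=50):
    utility = {s: 0.0 for s in STATES}   # arbitrary starting utilities
    for _ in range(sweeps):
        updated = {}
        for s in STATES:
            if s == TERMINAL:
                updated[s] = REWARD[s]
            else:
                # Bellman update: reward now plus the discounted best reachable utility.
                best = max(utility[next_state(s, a)] for a in ACTIONS)
                updated[s] = REWARD[s] + GAMMA * best
        utility = updated
    return utility


utilities = value_iteration()
for s in STATES:
    print(s, round(utilities[s], 3))
```

    Each state ends up worth 0.9 times the best neighbor it can reach, so utility decays with distance from the goal: 1.0, 0.9, 0.81, 0.729 going from state 3 back to state 0.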

    Wednesday, July 18, 2018

    F1 Score - Machine Learning

    F1 score is a useful metric for classification models rather than regression models. It pairs well with the confusion matrix and is also frequently used in statistical analysis. You can read more about F1 score on the Wikipedia page and in the sklearn F1 score documentation below:

    • https://en.wikipedia.org/wiki/F1_score
    • http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
    F1 score and accuracy are both used in classification tasks. Accuracy has some shortfalls, for example when the dataset is heavily imbalanced. If most of the input data belongs to the negative class, say 99.99%, the machine does not need to learn anything intelligent: it can guess "negative" every time and still be 99.99% accurate. F1 score summarizes part of the confusion matrix (true positives, false positives, and false negatives; it ignores true negatives) in a single number.

    F1 score combines recall and precision: it is their harmonic mean, F1 = 2 * precision * recall / (precision + recall). It is a shorthand for how accurate and useful the result is.

    Accuracy is simply the fraction of correctly classified objects over the total number of objects.

    It can be misleading to only focus on accuracy, especially when data labels are imbalanced, even if data is representative. Certain scenarios are simply more prevalent in the population data. For example, by definition orphan diseases are the minority data points in the real world. 
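
    Here is a small sketch of that pitfall in plain Python. The data and both "models" are made up for illustration, and the helper functions are hypothetical stand-ins written for this example; in practice you would more likely call the sklearn f1_score function linked above.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # convention chosen here for the case with no true positives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up, heavily imbalanced data: 10 positives out of 1000 samples.
y_true = [1] * 10 + [0] * 990
always_negative = [0] * 1000
# A hypothetical model that finds 8 of the 10 positives and raises 4 false alarms.
some_model = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 986

print(accuracy(y_true, always_negative), f1(y_true, always_negative))
print(accuracy(y_true, some_model), round(f1(y_true, some_model), 3))
```

    The always-negative guesser scores 99% accuracy but an F1 of 0, while the model that actually finds most positives has nearly the same accuracy and a much higher F1.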

    Get the most out of your Udacity Nanodegree and Subscription

    Udacity is pricey. You are on a budget and desperately need a better job. Here are some great tips to take advantage of your Udacity subscription.

    Udacity Career Partners and Career Hub

    Udacity offers video tutorials on technology as well as on how to write a resume, start a startup, and more. In addition, each nanodegree is created in partnership with top tech companies. Take advantage of these hidden connections: reach out to content creators and industry leaders from Google.

    Udacity Career Conferences

    It's real. It works. Top Silicon Valley companies actually come to review your resume and interview you in person. Highly recommended. I can share plenty of personal anecdotes about how well this worked out for me.

    Udacity Career Profile

    Completed multiple nanodegrees? You can turn your "ADHD" and inability to stop learning into a career advantage: show on your career profile that you had the grit and resourcefulness to complete multiple nanodegrees. Update it regularly.

    Make your Capstone Project Portfolio Ready

    These days, companies hire you for a great portfolio, not a great label. Turn your capstone project into a recruiter-ready, professional Medium post, a GitHub repo, or LinkedIn-ready slides or a PDF. Do this while completing the capstone; it is so much easier. Once you are done with the nanodegree, it's really hard to go back.

    For example, Udacity digital marketing project slides are presentation ready. And you get real-world experience marketing for Udacity on Adwords, Facebook and Instagram. 

    Mentor

    Though not always helpful, the Udacity Nanodegree subscription does come with an online mentor. Remember, you can always request a change of mentor if you have any trouble.

    Udacity Forum

    A great place for discussion, gaining traffic, and asking for advice.

    Tuesday, July 17, 2018

    Better Relationships in Silicon Valley

    You are here to win and start a startup, but the journey of being an entrepreneur can be lonely, especially if you are a solopreneur. Have you thought about starting a meaningful relationship while you are here? Here are some tips and resources for you.

    Hinge: a professional, Tinder-like dating app, often used by Ivy League-educated young professionals.

    Coffee Meets Bagel blog: another dating app, which offers advice on its blog.

    TED talks about relationships: https://www.ted.com/topics/relationships and more TED relationship talks: http://ideas.ted.com/tag/relationships/

    A former OkCupid blogger published famous insights and findings about relationships and online dating profiles. He has turned those insights into a full book.



    Monday, July 16, 2018

    100 Social Networks, Resources and Sites for Jobs in Silicon Valley

    Muse

    A women-friendly job site complete with great tips, startup office info and perks, and other pretty things that help ease the stress of job searching.

    Weave

    Tinder, but for business professionals. Weave matches you with young business professionals you should connect with.

    Hinge

    Who says work and love have to be separated? Meet young business and tech professionals using Hinge, a dating app popular in Silicon Valley. 

    Silicon Valley HBO TV Show

    Need to learn to talk the talk and walk the walk in the Valley? Get inspired and entertained by watching Silicon Valley, the HBO show.




    100 Amazing Tools and Resources for Startup Founders

    Brainstorm Startup Ideas and Domain Names Using Generators

    One surprisingly easy hack is to precede the startup name with "try" or "get", for example: getAlto, tryAlto.

    http://itsthisforthat.com/

    Bootstrap - Front End Framework

    Bootstrap, previously known as Twitter Bootstrap, is a very popular framework for front-end development.

    Use GSuite or Google Domain to Host Your Custom Domain Email

    Want to have email@mycompany.co instead of mycompany@gmail.com so that your company looks official and trustworthy? Use Google Work's gmail hosting or the email forwarding service of Google Domain (it only works one way: it only receives custom domain emails).

    Trello Board

    Organize development cycles and sprints with Trello Board.

    Material Design - Front End Framework and Stylebook

    Google's Material Design helps you design Android-like apps with paper-like animations and layouts.

    Flat UI - CSS Framework

    Design a flat UI similar to Google's Material Design.

    Prototyping, Wireframing Tools

    Invision, Marvel

    Marvel can easily mock up mobile apps in minutes, for free.

    https://www.flinto.com/

    Free Professional Apps for iPad

    Expensive apps like Adobe and premium MailChimp features are actually available in various forms as iPad apps. You can use advanced features for free! MailChimp even has an offline app for collecting emails at events and conferences.

    Fiverr

    Get gigs done on Fiverr for $5 and up.

    Reddit, Product Hunt, Imgur, Hacker News - are all important social networks for founders at Startups


    Outsource and Delegate to Offsite Teams

    Google Voice

    Use Google Voice to set up a separate number for work.

    Protect Your Intellectual Property

    File patents and secure copyright protection.

    Example https://www.patentmonk.com/

    Google Trend - Product Research & Market Research

    Growthhacker.org - Growth

    Yoast - SEO for WordPress

    Make Gifs and Screenshots

    Some Chrome Extensions also have the capability. 

          

    Unsplash - Stock Photo

    Unsplash provides high-quality, startup-friendly, royalty-free stock photos.

    Code Libraries

    In addition to frameworks, there are also jQuery UI and WordPress themes.

    WordPress Themes

    Startup themes are available for purchase on ThemeForest. These templates will make your website look instantly like a legitimate startup. However, for WordPress speed is a serious concern. Without the snappy speed, startup websites will give off the wrong vibe. How can you raise funds for your tech startup if your website is slow? 

    Bootstrap, a popular front-end framework, can be easily integrated with WordPress.

    Hire Designers on 99 Design and Fiverr

    Splashthat - Make Instant Event Invite Pages

    These pages are called splash landing pages. "Splash" refers to how instant and short-lived the pages are; they are usually used for a particular event or purpose, or for an Optimizely experiment.

    Pinterest Board for Web Design Mood Boarding

    Pinterest Rich Pins are content-friendly, sophisticated pins (with specialized meta tags) that display detailed information such as recipes, blog articles, and Buyable Pins.
    https://help.pinterest.com/en/articles/enable-rich-pins-your-site

    Add E-Commerce Capability to Your Youtube Channel and Games

    Alto's Adventure and Odyssey, a top-selling iOS game, uses Shopify as its e-commerce platform to sell specialty merchandise such as a stuffed llama, a beloved avatar in the game.


    ----------------------- ----------------------- ----------------------- -----------------------
    More Tips and Tools:

    • Understand your customer journey with Google Customer Journey: https://www.thinkwithgoogle.com/tools/customer-journey-to-online-purchase.html
    • Practice the philosophy of running a lean startup http://runninglean.co/
    • How to boost employee happiness without spending any money by Fast Company
      • http://www.fastcompany.com/3043863/hit-the-ground-running/how-to-boost-employee-happiness-without-spending-any-money
    • MailChimp newsletter integration with Shopify and eBay
      • https://connect.mailchimp.com/integrations/
      • https://connect.mailchimp.com/integrations/shopify-integration--4
    • Co-working spaces are more than just physical spaces for an office. They are also great places to connect and meet people, a real community that is valuable for entrepreneurs, especially solopreneurs.
    • Mock up API calls https://www.mockable.io/
    • Web scraping tool http://scrapy.org/ Be careful: most sites are protected by Terms of Use, which generally prohibit scraping for commercial purposes.
    • Organize your code snippets with public and private gists on GitHub
    • Version control for designers: Pixelapse (acquired by Dropbox)
    • Use lint and validator tools to validate code
    • Business Plan http://www.bplans.com
    • Wells Fargo Business Plan tool and Business Intelligence tool
    • Business Plan tool http://www.liveplan.com/
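
    For the web scraping tip in the list above, one cautious habit is to check a site's robots.txt before fetching a page. The sketch below uses Python's standard urllib.robotparser; the rules, the bot name "my-bot", and the example.com URLs are made up for illustration. Passing a robots.txt check is not a substitute for reading the site's Terms of Use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "my-bot", "https://example.com/public/page"))
print(is_allowed(ROBOTS_TXT, "my-bot", "https://example.com/private/page"))
```

    In a real crawler you would load the live file with set_url() and read() instead of an inlined string.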

    Statistics Basics 101

    Statistics is a dark science ... until you understand it.

    Core concepts:

    Sample vs Population
    Establish a hypothesis for the research
    Statistical significance and statistical theories
    Probabilities

    Evaluating Statistics:
    Did we achieve the research objective?
    Did we find support for the hypothesis?
    What is the conclusion?
    What is the next step?
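
    To make the ideas of a hypothesis and statistical significance concrete, here is a minimal sketch of a permutation test using only the Python standard library. The null hypothesis is that two groups share the same mean; the p-value estimates how often a random relabeling of the data produces a difference at least as large as the one observed. The function name, the sample numbers, and the 0.05 threshold are illustrative assumptions, not a recommendation from any particular course.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Made-up samples drawn from two hypothetical populations.
control = [1, 2, 3, 4, 5, 6, 7, 8]
variant = [11, 12, 13, 14, 15, 16, 17, 18]

p_value = permutation_p_value(control, variant)
print("p-value:", p_value)
print("support for a difference at the 0.05 level:", p_value < 0.05)
```

    A small p-value suggests the observed difference is unlikely under the null hypothesis. It does not prove the hypothesis, which is why the evaluation questions above still matter.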

    100 Amazing Coding, Machine Learning, Data Science Courses Tutorials on the Internet

    While learning to code or bettering your coding skills, online learners should avoid getting lost in the ocean of tutorials and videos: do not get stuck in learners' limbo. It is impractical to know all the details of a framework. Not every pilot knows how to build a plane, and not every machine learner needs to know all the math behind every algorithm!
    • Udacity
      • Web Development
        • https://www.udacity.com/course/viewer#!/c-cs253/l-48737165/m-313672917
      • JavaScript Design patterns
        • https://www.udacity.com/course/javascript-design-patterns--ud989
      • How to Build a Startup
        • https://www.udacity.com/course/how-to-build-a-startup--ep245
      • OOP with JavaScript
        • https://www.udacity.com/course/object-oriented-javascript--ud015
      • iOS Swift
        • https://developer.apple.com/videos/wwdc/2014/
        • optionals https://s3.amazonaws.com/udacity-hosted-downloads/ud585/docs/Optionals.pdf
      • https://www.udacity.com/course/intro-to-jquery--ud245
      • MONGODB
      • Algorithms
      • Data Analytics
      • Intro to Machine Learning
      • Product Design https://www.udacity.com/course/product-design--ud509
      • Firebase for iOS
      • Introduction to Firebase
      • Linear Algebra Review
      • JAVA
        • https://classroom.udacity.com/courses/cs046/lessons/183784769/concepts/1869544900923
      • D3
        • https://www.udacity.com/course/viewer#!/c-ud507/l-3068848585/e-3095208733/m-3095208735
      • JavaScript
        • https://www.udacity.com/course/intro-to-ajax--ud110
        • OOP JavaScript https://www.udacity.com/course/object-oriented-javascript--ud015
      • Startup
        • how to build a startup https://www.udacity.com/course/how-to-build-a-startup--ep245
      • Git and Github, version control
        • https://www.udacity.com/course/how-to-use-git-and-github--ud775
        • http://blog.teamtreehouse.com/getting-started-github-basics

    • Coursera 
    • Berkeley School of Information
    • Stanford CS 101 on Coursera https://class.coursera.org/cs101-selfservice
    • R programming and genetics algorithms by Johns Hopkins on Coursera
    • Bioinformatics and Data Science by Johns Hopkins on Coursera
    • Check out our blog post on free Udacity Baidu self driving car seminar http://www.siliconvanity.com/2018/07/learning-self-driving-car-engineer-for.html
    • Stanford intro to logic past https://www.coursera.org/course/intrologic
    • LittleBits gives kids hands-on hardware and software engineering experience. It is slightly more accessible than Raspberry Pi. It comes with a variety of sensors and components, such as a pressure sensor, light sensor, and temperature sensor.
      • http://littlebits.cc/tips-tricks/fridays-tips-tricks-light-sensor-light-trigger
    • Tutorial code test quiz - https://codefights.com/
    • W3Schools
      • HTML Dom Events
        • http://www.w3schools.com/jsref/dom_obj_event.asp
      • Free tutorials on Angular by w3schools
        • http://www.w3schools.com/angular/angular_tables.asp
      • CSS
      • Angular
    • AngularJS
      • Coursera
      • https://www.airpair.com/angularjs/posts/angularjs-tutorial
      • https://www.airpair.com/angularjs#9-directives-custom-
      • https://www.airpair.com/angularjs/posts/angularjs-tutorial#8-directives-core-
    • Platzi learning
      • Once funded by YCombinator. It invites prominent speakers, including YC leaders, to talk about startups, finance, data, growth and more.
    • Manning Book Practical Data Science with R
      • https://www.manning.com/books/practical-data-science-with-r
    • Coursera
    • Chinese MOOCs http://mooc.guokr.com/
    • Code School was once pretty good and creative. It became a more generic professional training site after being acquired by Pluralsight.
    • Staying Sharp with Angular.js JAVASCRIPT
    • Building Blocks of Express.js JAVASCRIPT
    • Mastering GitHub GIT
    • Shaping up with Angular.js JAVASCRIPT
    • Surviving APIs with Rails RUBY
    • Warming Up With Ember.js JAVASCRIPT
    • Front-end Formations HTML/CSS
    • Core iOS 7 IOS
    • Rails 4 Patterns RUBY
    • jQuery: The Return Flight JAVASCRIPT
    • Try iOS IOS
    • Ruby Bits Part 2 RUBY
    • Ruby Bits RUBY
    • Try Git GIT
    • Real-time Web with Node.js JAVASCRIPT
    • Microsoft AI school https://aischool.microsoft.com/learning-paths
    • Coursera others
      • Wesleyan - creative writing program
      • Johns Hopkins - Rails, AngularJS, MongoDB, HTML, CSS, JavaScript
      • Wharton - analytics, marketing
      • https://www.coursera.org/learn/meteor-development/home/week/1
      • https://www.coursera.org/learn/search-engine-optimization/home/week/1
      • https://www.coursera.org/learn/server-side-development/home/welcome
      • https://www.coursera.org/learn/web-frameworks/home/week/2
      • https://www.coursera.org/utoronto
    • D3: high-quality, well-explained tutorials http://cs.wellesley.edu/~mashups/pages/am5/d3tutorial1.html
    • http://paperjs.org/reference/shape/
    • Meteor Tut+ https://www.youtube.com/watch?v=hgjyr6BPAtA&list=PLLnpHn493BHECNl9I8gwos-hEfFrer7TV
    • https://teamtreehouse.com/library/building-modern-web-applications-with-meteor
    • Rails
      • rails https://teamtreehouse.com/library/build-a-todo-list-application-with-rails-4
      • One Month Rails
      • Soup to Bits by Code School: Rails and real-life examples
      • https://www.codeschool.com/shows/soup-to-bits
    • Stanford iOS iPhone app class on iTunes U
    • JavaScript AirPair http://www.airpair.com/javascript/language-introduction
    • Design lessons for developers https://hackdesign.org/lessons/
    • Startup prototyping code4startups
    • Google developer channel
    • Game probabilities https://sinepost.wordpress.com/2012/10/26/probability-in-games-xcom/

    WordPress Tutorials and Classes

    Treehouse offers WordPress tutorials and classes for a reasonable price: a flat monthly subscription. Treehouse also teaches theme development.

    Product Design, Product Management Classes


    Udacity offers a product design class - https://www.udacity.com/course/product-design--ud509

    Learn how to build vector graphics: Sketch website has its own tutorials - https://www.sketchapp.com/learn/
    - Online book on Sketch https://designcode.io/sketch
    - Treehouse offers a Sketch prototyping class https://teamtreehouse.com/library/sketch-basics
    - - I downloaded a discounted version of Sketch using a treehouse promo code once. 

    Web Design Ledger, a website by designers for designers https://webdesignledger.com/#22a3873003

    How to start a startup by Stanford and YCombinator - the best startup accelerator https://startupclass.co/courses/how-to-start-a-startup/lectures/64050

    A startup class by Platzi, a YCombinator alum, in collaboration with YC partners https://courses.platzi.com/courses/startup-class/

    JavaScript Classes, Front End and Full Stack JavaScript

    Treehouse offers Node.js classes. Udacity offers Firebase JavaScript and Angular classes.

    React, a JavaScript library from Facebook, has been very popular. Get started with the tutorial on the official React website https://reactjs.org/tutorial/tutorial.html

    Data Visualization Data Analysis, Data Science Classes

    Treehouse offers a D3 class. Udacity is the king of Python Data Science, Data Analysis and Machine Learning, Artificial Intelligence classes.  Udacity offers Firebase classes in collaboration with Google!

    Web Security, Crypto Classes

    One month offers a white hat hacker class - https://onemonth.com/courses/web-security

    Cloud Engine

    For Google Cloud, look on Google's website and Coursera. For Amazon, use Amazon's developer site, which offers lots of free webinars.


    Pro Tips for Online Learners

    Use a Kindle e-ink display to avoid eye strain.




    Machine Learning Workflow

    Data cleaning
    • Missing data
    • Outliers
    • Others: duplicates, typos, special characters
    Strategy for missing data: imputation, mean, median...
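
    The cleaning steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: missing values are represented as None, outliers are flagged with the common 1.5 x IQR rule (one option among many), and all function names and sample data are hypothetical. In practice a library such as pandas is the usual choice.

```python
import re
import statistics


def impute_missing(values, strategy="median"):
    """Replace None entries with the median (default) or mean of the rest."""
    present = [v for v in values if v is not None]
    if strategy == "median":
        fill = statistics.median(present)
    else:
        fill = statistics.mean(present)
    return [fill if v is None else v for v in values]


def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def clean_text(text):
    """Trim, lowercase, and strip special characters from a text field."""
    return re.sub(r"[^a-z0-9 ]", "", text.strip().lower())


# Made-up data for illustration.
ages = impute_missing([1, None, 3, None, 100])
rows = drop_duplicates([("a", 1), ("b", 2), ("a", 1)])
flagged = iqr_outliers([10, 12, 11, 13, 12, 11, 10, 95])
label = clean_text("  Hello, World! ")
print(ages, rows, flagged, label)
```

    The median is often preferred over the mean for imputation when the data has outliers, since a single extreme value (like 100 above) pulls the mean much further than the median.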