- Intermediate Concepts. Source: Kaggle Live Coding
- Bloom filter: a probabilistic data structure that tests whether an element is a member of a set. Here it is used to check for overlap (crossover) between train and test data.
- Use in NLP with n-grams. The choice of n is somewhat arbitrary (8-grams, for example). 20-grams are typical because sentences are around 20 words long; 7-grams are motivated by human memory span being around seven words, so average spoken language may be roughly 7-grams. Both can be run to compare the amount of overlap. Collect all n-grams into a set, then compare pairwise: how many n-grams already exist in the set?
- An empty Bloom filter is a bit array of m bits, all set to 0 (Wikipedia). Each of the k hash functions maps an element to one of the m bit positions. k is much smaller than m. Lookups can return false positives but never false negatives.
- Kaggle competition with Google Cloud: New York City Taxi Fare Prediction, https://www.kaggle.com/c/new-york-city-taxi-fare-prediction
- Playground competition in partnership with Google Cloud, Coursera and Kaggle
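The Bloom filter overlap check from the notes above can be sketched in Python as follows. This is a minimal sketch, not the code from the live-coding session: the class `BloomFilter`, the helpers `ngrams` and `overlap_fraction`, the default sizes (m = 2^20 bits, k = 7), whitespace tokenization, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: m bits, k hash positions per element."""

    def __init__(self, m=1 << 20, k=7):
        self.m = m  # number of bits (assumed default)
        self.k = k  # number of hash functions (assumed default)
        self.bits = bytearray((m + 7) // 8)  # empty filter: all bits are 0

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly in the set"; False means "definitely not".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def ngrams(text, n):
    """Return word n-grams using naive lowercase whitespace tokenization."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap_fraction(train_texts, test_texts, n=7):
    """Fraction of test n-grams that the filter reports as seen in train."""
    bf = BloomFilter()
    for text in train_texts:
        for gram in ngrams(text, n):
            bf.add(gram)
    test_grams = [g for t in test_texts for g in ngrams(t, n)]
    if not test_grams:
        return 0.0
    hits = sum(1 for g in test_grams if g in bf)
    return hits / len(test_grams)


if __name__ == "__main__":
    train = ["the quick brown fox jumps over the lazy dog near the river bank"]
    test = ["yesterday the quick brown fox jumps over the lazy dog again"]
    print(overlap_fraction(train, test, n=7))
```

Because a Bloom filter can report false positives but never false negatives, the returned fraction should be read as an upper-bound estimate of the true overlap. Running it with both n=7 and n=20 gives the two views of overlap mentioned in the notes.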
Saturday, March 9, 2019
Kaggle Intermediate Cheat Sheet
Using Kaggle on Google Colab
Install the Kaggle API client, and also install catboost
!pip install kaggle catboost
# Google Colab file access feature
# allows Colab to import data directly into Colab
from google.colab import files
# Go to your Kaggle account page and create a new API token, which downloads as kaggle.json
# Upload kaggle.json here so that Colab has our Kaggle credentials
uploaded = files.upload()
# Move kaggle.json into the folder where the Kaggle API expects to find it,
# and restrict its permissions so only the owner can read it
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json
# Now we can access the Kaggle competition list
!kaggle competitions list