
Saturday, March 9, 2019

Kaggle Earthquake Prediction Challenge





Objective:

Think like a data scientist

Categorical gradient boosting: the CatBoost algorithm

Support vector machines for regression (SVMs are more commonly known for classification)

Syllabus
Earthquake prediction background & helpful resources
Step 1 - installing dependencies
Step 2 - importing dataset
Step 3 - Exploratory data analysis
Step 4 - Feature engineering (statistical features added)
Step 5 - Implement Catboost model
Step 6 - Implement support vector machine + radial basis function (RBF) kernel model
Step 7 - Future Directions (Genetic programming, recurrent networks etc.)
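To make Step 4 concrete, here is a minimal sketch of what "statistical features" could look like: a long signal is cut into fixed-size segments and each segment is summarized by its mean, standard deviation, minimum and maximum. The function names, the segment size and the synthetic signal are assumptions made for illustration; they are not taken from the competition data or from the notebook itself.

```python
import numpy as np


def statistical_features(segment):
    """Summarize one 1-D signal segment with simple statistics."""
    x = np.asarray(segment, dtype=float)
    return {
        "mean": x.mean(),
        "std": x.std(),   # population standard deviation (ddof=0)
        "min": x.min(),
        "max": x.max(),
    }


def build_feature_rows(signal, segment_size):
    """Split a signal into non-overlapping segments and featurize each one.

    Any leftover samples that do not fill a whole segment are dropped.
    """
    n_segments = len(signal) // segment_size
    rows = []
    for i in range(n_segments):
        seg = signal[i * segment_size:(i + 1) * segment_size]
        rows.append(statistical_features(seg))
    return rows


# Synthetic stand-in for a seismic signal (NOT competition data).
rng = np.random.default_rng(0)
signal = rng.normal(size=1000)

rows = build_feature_rows(signal, segment_size=250)
print(len(rows), sorted(rows[0]))
```

Each row could then become one training example for a regressor, with the list of dictionaries passed to pd.DataFrame to get a feature table.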




Comment: maybe we can use a more advanced RNN for earthquake prediction, since the data has a time-series element.

Installations & dependencies: install the important libraries.
!pip install kaggle
!pip install numpy==1.15.0
!pip install catboost 
import pandas as pd
import numpy as np
from catboost import CatBoostRegressor, Pool
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import NuSVR, SVR
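# kernel ridge regression (a kernel method related to, but distinct from, SVM regression)
from sklearn.kernel_ridge import KernelRidge
# Kernel methods are a way of improving support vector machine predictions. The kernel implicitly maps the inputs into a higher-dimensional feature space, where a straight classifier line or regression line can fit the data, even though that fit looks curved in the original, lower-dimensional feature space we can visualize.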
#data visualization
import matplotlib.pyplot as plt

# Google Colab file access feature
# allows Colab to import data directly into colab
from google.colab import files
# retrieve uploaded file
uploaded = files.upload()
# Upload the kaggle.json file above so that Colab knows our Kaggle authentication.
# (In your Kaggle account, create a new API token; it is downloaded as a JSON file.)
# Move kaggle.json into the folder where the Kaggle API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json
# Now we can access the Kaggle competition list.
!kaggle competitions list
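The kernel idea mentioned next to the KernelRidge import can be sketched in plain NumPy. This is a hand-rolled illustration of kernel ridge regression with an RBF kernel, not scikit-learn's implementation; the helper names, the gamma and alpha values, and the sine-wave toy data are all assumptions chosen for the example. It compares a straight-line fit against the kernel fit on data that a straight line cannot follow.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between 1-D sample arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-gamma * diff ** 2)


def fit_kernel_ridge(x, y, gamma=1.0, alpha=1e-3):
    """Solve (K + alpha * I) w = y for the dual weights w."""
    K = rbf_kernel(x, x, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(x)), y)


def predict_kernel_ridge(x_train, weights, x_new, gamma=1.0):
    """Predict as a weighted sum of kernel similarities to the training points."""
    return rbf_kernel(x_new, x_train, gamma) @ weights


# Toy nonlinear data (illustrative only).
x = np.linspace(-3, 3, 40)
y = np.sin(x)

# Baseline: ordinary straight-line fit in the original input space.
slope, intercept = np.polyfit(x, y, 1)
linear_mse = np.mean((slope * x + intercept - y) ** 2)

# Kernel fit: linear in the implicit feature space, curved in the input space.
w = fit_kernel_ridge(x, y)
kernel_mse = np.mean((predict_kernel_ridge(x, w, x) - y) ** 2)

print("linear MSE:", linear_mse)
print("kernel MSE:", kernel_mse)
```

Both errors are measured on the training points, so this only shows that the kernel model is flexible enough to follow the curve; judging real predictive quality would need held-out data.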

