
Wednesday, April 3, 2019

Getting Started with Automated Data Pipelines, Day 2: Validation and URL...







  • Data validation, and creating data from a URL
  • When do you need data from a URL? For maps, e.g. to get the shapes needed to draw them
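A minimal sketch of the first bullet: download CSV text, then check it before anything downstream uses it. The helper names (`fetch_csv_text`, `validate_rows`) and the column names are hypothetical, chosen only for illustration; these notes do not record which tools the stream itself used.

```python
from __future__ import annotations

import csv
import io
from urllib.request import urlopen


def fetch_csv_text(url: str, timeout: float = 10.0) -> str:
    """Download a CSV file and return its contents as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def validate_rows(csv_text: str, required_columns: set[str]) -> list[dict[str, str]]:
    """Parse CSV text; raise ValueError on missing columns, no rows, or blank values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    missing = required_columns - header
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = list(reader)
    if not rows:
        raise ValueError("no data rows")
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        for column in required_columns:
            # Short rows yield None for absent fields, so treat None as blank.
            if not (row[column] or "").strip():
                raise ValueError(f"blank {column!r} on line {line_no}")
    return rows
```

In use, the two steps would chain as `validate_rows(fetch_csv_text(url), {"name", "lat"})`, so a changed or broken upstream file fails loudly at load time rather than somewhere later in the pipeline.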

Kaggle Challenge (LIVE)





  • Architecture: UNet
  • Use Google Colab to avoid dependency issues
  • Salt is correlated with oil and gas in places where salt deposits are heavy
  • !pip install imageio (for image processing)
  • !pip install torch






Kaggle Live-Coding: Code Reviews!







  • Make code robust and reproducible: if column names change later, can the code still handle it?
  • Use R's column-selection helpers starts_with(), ends_with(), and contains(); they make the query more robust and harder to break downstream.
  • Avoid numeric column indexing, since the order of columns may change.
  • Avoid redundancy in code and comments.
  • To make a file a bit shorter, avoid inline images and use a script to generate the images instead.
  • Make sure the logic matches the code comments and the function signature.
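The advice to select columns by name rather than by position can be sketched outside R as well. Below is a rough Python analogue of starts_with() and contains(), using only the standard library. The helper names and the sample data are made up for illustration and are not from the stream.

```python
import csv
import io


def columns_starting_with(columns, prefix):
    """Return the column names that begin with prefix, in their original order."""
    return [name for name in columns if name.startswith(prefix)]


def columns_containing(columns, fragment):
    """Return the column names that contain fragment, in their original order."""
    return [name for name in columns if fragment in name]


def select_columns(rows, names):
    """Keep only the named columns in each row dict."""
    return [{name: row[name] for name in names} for row in rows]


# Illustrative data; the column order could change without breaking the selection.
csv_text = "id,score_math,score_art,city\n1,90,75,Oslo\n2,60,88,Lima\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

score_columns = columns_starting_with(rows[0].keys(), "score_")
scores = select_columns(rows, score_columns)
```

Because the selection depends on the "score_" prefix rather than on positions 1 and 2, adding or reordering columns upstream leaves `scores` unchanged.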

Applying for jobs at the Lending Club

We tried to figure out Lending Club's tech stack for 2019. Our analysis shows Lending Club asks for skills in Python, Tableau, SQL and ...