Sunday, November 10, 2019

Machine Learning Workflow


  • Data cleaning
    • Missing data
    • Outliers
    • Others: duplicates, typos, special characters
  • Strategies for missing data: impute with the mean or median, leave as np.nan, or label as "unknown"
  • Outliers: visualize them, demo how a linear regression fit changes with an outlier, detect with the IQR
  • Curse of dimensionality: count of columns (aka features) vs count of rows
  • Data transformation:
    • Encoding
      • Categorical data: one-hot encoding to make it machine readable; ordinal versus independent categories
    • Scaling
    • Skewed data
  • Sampling
  • Stratification
  • Class imbalance
  • Feature engineering
    • Rank transformation
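
Two of the cleaning steps in the outline above, median imputation and IQR-based outlier detection, can be sketched in a few lines. This is only a minimal illustration using the Python standard library; the function names `impute_median` and `iqr_outliers`, the sample `ages` list, and the conventional 1.5 multiplier are assumptions made for the example, not something taken from these notes. In practice a library such as pandas or scikit-learn would likely be used instead.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing (hypothetical helper for this sketch)."""
    return value is None or math.isnan(value)


def impute_median(values):
    """Replace missing entries with the median of the observed values."""
    observed = [v for v in values if not is_missing(v)]
    fill = statistics.median(observed)
    return [fill if is_missing(v) else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample data: two missing entries and one suspicious value.
ages = [23, 25, None, 27, 24, float("nan"), 26, 95]
filled = impute_median(ages)
outliers = iqr_outliers(filled)
print(filled)
print(outliers)
```

The median is used here rather than the mean because a single extreme value (95 in the sample) would pull the mean upward, while the median stays close to the bulk of the data.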


Key concepts
  • One-hot encoding: a categorical column with three possible values (married, single, divorced) becomes three separate columns, each holding 1 or 0
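
The marital-status example can be sketched as follows. This is a minimal pure-Python illustration; the function name `one_hot`, the `status` column prefix, and the sample rows are assumptions made for the example. Libraries such as pandas and scikit-learn provide their own encoders that would normally be preferred.

```python
def one_hot(values, prefix="status"):
    """Turn a list of category labels into rows of 0/1 indicator columns."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{c}": int(v == c) for c in categories}
        for v in values
    ]


# Made-up sample column with the three values from the notes.
statuses = ["married", "single", "divorced", "single"]
rows = one_hot(statuses)
for row in rows:
    print(row)
```

Each row has exactly one column set to 1, so no ordering between married, single, and divorced is implied, which is the point of using this encoding for independent categories.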

