Tuesday, December 10, 2019

Exploratory Data Analysis EDA Cheat Sheet


  • pandas.DataFrame.shape -- > (row_count, col_count)
  • pandas.DataFrame.shape[0] --> number of records, number of samples in the dataset
  • my_dataframe['my_series_name'].unique() --> returns a unique values of a column, "radio button choices"
  • dataframe.describe() --> returns summary data
  • len(my_dataframe['my_series_name'].unique()) --> number of unique values
  • import os os.listdir('name_of_directory_or_just_use_.') --> list the files in the current directory '.' os.listdir('.')  or a specific directory with a name 
  • import os len(os.listdir('.') ) --> returns the number of files in the current directory
  • my_dataframe.groupby(['col_1', 'col_2']) --> groupby column 1 first then groupby column 2
  • Converting a Pandas GroupBy output from Series to DataFrame: .groupby() returns a groupby object with MultiIndex instead of a dataframe with a single index. it is also known as a hierarchical index. Will need to rename columns and reset index my_groupby.add_suffix('_Count').reset_index() or call the .size().reset_index() important to note that .size() is called on the groupby object not the usual dataframe. pandas.core.groupby.GroupBy.size calculates : Series Number of rows in each group
  • group = ['col_1', 'col_2']; my_df.groupby(group).size().reset_index(name="colum_name")
  • df = df[(df.col_name < 1) & (df.col_name_2 < 1)] complex condition query / filter in dataframe
  • pd = pd.query('col_name != "my_value"')
  • .value_count df.column.value_count()
  • pandas cheatsheet
  • .copy()
  • .head()
  • .unique()
  • Another way to use unique pd.unique(df.col_name)
  • df['col_name'].isnull().sum()
  • df['col_name'].min() 
  • df['col_name'].max()
  • df.fillna(0) #fill the dataframe with zero the entire table
  • df.reset_index(drop=True, inplace = True)
  • remove target column or any column data.drop(['target'], axis = 1, inplace = True)

No comments:

Post a Comment

Developing apps for airtable using Airtable Blocks

The airtable smart sheets now has an app platform called Airtable Blocks, which allows developers to add custom code, and build apps quickly...