Ad

Wednesday, March 8, 2017

Pandas Numpy Data Analysis Tool Kit - Udacity Machine Learning Nanodegree 00

SERIES & DATAFRAME

Basic units data structures of Pandas, data analysis using Python

Allows users to store a large amount of information and perform data analysis

Dataframe documentation: http://pandas.pydata.org/pandas-docs/version/0.17.0/dsintro.html#dataframe

A dictionary
  • Dict of 1D ndarrays, lists, dicts, or Series
  • 2-D numpy.ndarray
  • Structured or record ndarray
  • Series
  • Another DataFrame


Sample Code: 

d = {'key_name':Series([1,2,3], index=['a','b','c'])}

Analogy : Excel Spreadsheet
Will also return number of rows and columns

Pandas.Series()
Pandas.Series([],index=[])


----

More sample code:
   my_data = pd.DataFrame(data)
    print my_data.dtypes
    print ""
    print my_data.describe()
    print ""
    print my_data.head()
    print ""

    print my_data.tail()


# Retrieve columns
df[['col_name','col2_name']]
# Retrieve rows
df.loc['a']


df[df['col_name'] >= 30]

get row column counts of Pandas Dataframe
.shape
len(DataFrame.index)
.count() count each column of the entire table

No comments:

Post a Comment

Debug Google Cloud Error - Failed to enable API please make sure you have the IAM permission to enable API

 If you are not expecting this message,  and think you have the permission to enable API read on. Before using Google Cloud services, genera...