Ad

Wednesday, March 8, 2017

Pandas Numpy Data Analysis Tool Kit - Udacity Machine Learning Nanodegree 00

SERIES & DATAFRAME

Basic units data structures of Pandas, data analysis using Python

Allows users to store a large amount of information and perform data analysis

Dataframe documentation: http://pandas.pydata.org/pandas-docs/version/0.17.0/dsintro.html#dataframe

A dictionary
  • Dict of 1D ndarrays, lists, dicts, or Series
  • 2-D numpy.ndarray
  • Structured or record ndarray
  • Series
  • Another DataFrame


Sample Code: 

d = {'key_name':Series([1,2,3], index=['a','b','c'])}

Analogy : Excel Spreadsheet
Will also return number of rows and columns

Pandas.Series()
Pandas.Series([],index=[])


----

More sample code:
   my_data = pd.DataFrame(data)
    print my_data.dtypes
    print ""
    print my_data.describe()
    print ""
    print my_data.head()
    print ""

    print my_data.tail()


# Retrieve columns
df[['col_name','col2_name']]
# Retrieve rows
df.loc['a']


df[df['col_name'] >= 30]

get row column counts of Pandas Dataframe
.shape
len(DataFrame.index)
.count() count each column of the entire table

No comments:

Post a Comment

Applying for jobs at the Lending Club

We tried to figure out Lending Club 's tech stack for 2019. Our analysis shows Lending Club asks for skills in Python, Tableau, SQL and ...