How to create DataFrames and Series using Pandas

In this article you will learn how to create a Series and a DataFrame with Pandas, and you will see several ways to build a dataframe. First, though, let’s define data manipulation.

So what is data manipulation?

As the term suggests, data manipulation means changing data so that it is easier to read and work with, through operations such as reading, writing, and merging. Clean, well-organized data in turn helps improve the accuracy of any model built on it.

In Python, data manipulation is usually performed with the Pandas library, which is known for fast data cleaning, processing, and analysis. This makes it easier to move on to data visualization and machine learning.

Pandas is built on top of NumPy, so it works well with arrays and matrices. Its two main data structures are the Series and the DataFrame.

Series

Series – A Series is a one-dimensional labeled array that can hold any data type. It holds a single sequence of values, and by default its index runs from 0 to n-1, where n is the number of values in the series.

# importing the required libraries
import pandas as pd

# Creating a series
name = pd.Series(['Sam', 'Walton', 'Bill', 'George'])
print(name)
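
To make the index behaviour concrete, here is a small sketch that compares the default index with a custom one. The labels 'a' to 'd' and the variable name `labeled` are assumptions made for this example only.

```python
import pandas as pd

# Default index: integer positions starting at 0
name = pd.Series(['Sam', 'Walton', 'Bill', 'George'])
print(list(name.index))

# Custom index: the labels 'a'..'d' are hypothetical, chosen for illustration
labeled = pd.Series(['Sam', 'Walton', 'Bill', 'George'],
                    index=['a', 'b', 'c', 'd'])

# Values can now be looked up by label instead of position
print(labeled['b'])
```

With a custom index, each value is addressed by its label, which is often more readable than a numeric position.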

Dataframes

Dataframes – A dataframe is a 2-dimensional labeled data structure in which values are arranged in rows and columns. Each column is a Series, so combining multiple series forms a dataframe.

# Creating an empty dataframe
df= pd.DataFrame()
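
As a rough illustration of how several series combine into a dataframe, the sketch below builds two Series and passes them to pd.DataFrame as a dictionary. The variable names `names` and `ages` and the sample values are assumptions for this example.

```python
import pandas as pd

# Two Series that share the same default index (0, 1, 2)
names = pd.Series(['Ram', 'Rohan', 'Raj'])
ages = pd.Series([10, 12, 13])

# Each dictionary key becomes a column name; each Series becomes a column
df = pd.DataFrame({'Name': names, 'Age': ages})
print(df)

# Selecting a single column gives back a Series
print(type(df['Age']))
```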

There are several ways to create a dataframe:

  • Creating a dataframe with the help of the numpy library
  • Creating a dataframe with the help of a list
  • Creating a dataframe from a list of lists
  • Creating a dataframe with the help of a dictionary
  • Creating a dataframe with the help of a list of dictionaries

Creating a dataframe with the help of the numpy library

You can create a pandas DataFrame from a NumPy array with the pd.DataFrame() function. Here is an example that converts a 1D NumPy array into a single-column DataFrame with custom index labels:

import numpy as np
import pandas as pd

data = np.array([1, 2, 3, 4, 5])
df = pd.DataFrame(data, columns=['values'], index=['a', 'b', 'c', 'd', 'e'])
df
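
The same function also accepts a 2D NumPy array, in which case each row of the array becomes a row of the dataframe. The following is a minimal sketch; the column names 'x' and 'y' and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# A 2D array with 3 rows and 2 columns
data_2d = np.array([[1, 2], [3, 4], [5, 6]])

# One column name per array column; 'x' and 'y' are made-up names
df_2d = pd.DataFrame(data_2d, columns=['x', 'y'])
print(df_2d)

# shape reports (number of rows, number of columns)
print(df_2d.shape)
```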

Creating a dataframe with the help of a list

# Creating a dataframe with the help of a list
df = pd.DataFrame([1, 2, 3, 4, 5], columns=["values"])
df

In the resulting dataframe, the column is named “values” and the rows get the default index 0, 1, 2, 3, 4.

Creating a dataframe from a list of lists

Next, we create a dataframe from a list of lists, that is, a list whose elements are themselves lists. Each inner list becomes one row of the dataframe.

data = [['Ram', 10], ['Rohan', 12], ['Raj', 13]]

# Creating a dataframe from a list of lists
df = pd.DataFrame(data, columns=['Name', 'Age'])
df

Here data is the list of lists, and columns is a list of column names for the DataFrame.
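
One way to check how the list of lists was laid out is to inspect the shape and read a row back. This is only a sketch that reuses the sample data from this section.

```python
import pandas as pd

data = [['Ram', 10], ['Rohan', 12], ['Raj', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Three inner lists of two items each give 3 rows and 2 columns
print(df.shape)

# The first inner list comes back as the first row
print(df.iloc[0].tolist())

# The second item of every inner list ends up in the 'Age' column
print(df['Age'].tolist())
```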

Creating a dataframe with the help of a dictionary

Next, we create a dataframe from a dictionary. First, initialize a dictionary with the keys “name” and “Age”, each mapped to a list of values. Each key becomes a column name.

# Creating a dictionary
abc = {
    "name": ["Rohan", "Raj", "Sunil", "Abhi", "Arun"],
    "Age": [10, 11, 12, 13, 14],
}

# Creating a dataframe with the help of a dictionary
df=pd.DataFrame(abc)
df

You can also set the index labels by passing the index parameter. By default, the index is the numbers 0, 1, 2, 3, 4. Here we replace those values with the labels a, b, c, d, e.

df = pd.DataFrame(abc, index=["a", "b", "c", "d", "e"])
df
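
Once custom labels are in place, a row can be looked up by its label. The sketch below uses df.loc for that; the variable name `row_c` is an assumption for this example.

```python
import pandas as pd

abc = {
    "name": ["Rohan", "Raj", "Sunil", "Abhi", "Arun"],
    "Age": [10, 11, 12, 13, 14],
}
df = pd.DataFrame(abc, index=["a", "b", "c", "d", "e"])

# Look up a single row by its index label; the result is a Series
row_c = df.loc["c"]
print(row_c["name"], row_c["Age"])
```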

Creating a dataframe with the help of a list of dictionaries

In this section, we create a dataframe from a list of dictionaries. A dictionary stores data as key-value pairs. Here, several dictionaries are placed inside a list, and the list is then passed to the pd.DataFrame function. Each dictionary becomes one row, and the keys become the column names.

data = [{'Name': 'Ram', 'Age': 10}, {'Name': 'Sam', 'Age': 12}, {'Name': 'Watson', 'Age': 13}]

# Creating a dataframe with the help of a list of dictionaries
df = pd.DataFrame(data, columns=['Name', 'Age'])
df
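
The dictionaries in the list do not all need the same keys. As a small sketch with made-up data, the second record below has no 'Age' key, and pandas fills the missing cell with NaN.

```python
import pandas as pd

# The second record deliberately omits 'Age' (hypothetical data)
records = [{'Name': 'Ram', 'Age': 10}, {'Name': 'Sam'}, {'Name': 'Watson', 'Age': 13}]
df = pd.DataFrame(records)
print(df)

# isna() marks the cell that had no value in the input
print(df['Age'].isna().tolist())
```

Because NaN is a floating-point value, the 'Age' column is stored as floats in this case rather than integers.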

Conclusion

Pandas is a powerful data manipulation library in Python that provides easy-to-use data structures and data analysis tools. The Series and the DataFrame are the two primary data structures in pandas for storing and manipulating data. Both can be created in multiple ways, such as from lists, dictionaries, and NumPy arrays. If you have any questions or suggestions about this post, feel free to comment.
