Convert DataFrame to List in Pandas using Python:
The DataFrame.values property returns the data as a NumPy array, and calling tolist() on that array converts it to a list. A DataFrame is a two-dimensional data structure made up of rows and columns, with data aligned in tabular form. Lists are also used to store data: a Python list is an ordered, changeable collection whose items need not be of the same type.
Pandas DataFrame to List
To convert a Pandas DataFrame to a list in Python, use df.values.tolist(). Note that values is a property, not a method, so it takes no parentheses. This is the standard way of conversion, but there are several other options depending on your scenario.
- Use Pandas df.values.tolist()
- Use Pandas Series.tolist()
- Use zip(*df.values)
- Use iloc
- Use df.columns.values.tolist()
Steps to convert Pandas DataFrame to List
Please follow the below steps to convert Pandas DataFrame to List.
Step 1: Create a DataFrame
See the following code.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin

To create a DataFrame, we need data. So, we created a dictionary in which the keys are column names and the values are lists of column values.
Then we used the pd.DataFrame() function to create the DataFrame from the dictionary.
Step 2: Use df.values to get a numpy array of values
The next step is to use the DataFrame's values property, which returns the values as a NumPy array. Keep in mind that only the values in the DataFrame are returned; the axis labels (index and column names) are removed.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    print('Use df.values property to get NumPy array')
    vals = df.values
    print(vals)

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Use df.values property to get NumPy array
    [['Stranger Things' 25 'Millie']
     ['Money Heist' 40 'Alvaro']
     ['House of Cards' 45 'Kevin']]
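A quick way to see what df.values hands back is to inspect the array itself. The sketch below assumes the same three-column sample data as above; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
})

vals = df.values

# The result is a plain NumPy array: it carries no index and no column names.
print(type(vals))
print(vals.shape)   # (number of rows, number of columns)
print(vals.dtype)   # text and integer columns are stored under one common dtype
```

Because a NumPy array has a single dtype, mixing text and integer columns means the array cannot keep a separate integer dtype for the episodes column.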
The Pandas DataFrame.values attribute returns a NumPy representation of the given DataFrame. The last step is to convert the NumPy array to a list using the tolist() function.
Step 3: Use Pandas tolist() function
Both Pandas Series and NumPy arrays provide a tolist() method. Series.tolist() converts a Series, a one-dimensional ndarray with axis labels (including time series), to a list. In our case, df.values is not a Series but a two-dimensional numpy.ndarray, so calling tolist() on it returns a nested list with one inner list per row.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    print('Use df.values property to get NumPy array')
    vals = df.values
    print(vals)

    # convert to list
    print('---------------------')
    print('Convert values to list')
    print(vals.tolist())

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Use df.values property to get NumPy array
    [['Stranger Things' 25 'Millie']
     ['Money Heist' 40 'Alvaro']
     ['House of Cards' 45 'Kevin']]
    ---------------------
    Convert values to list
    [['Stranger Things', 25, 'Millie'], ['Money Heist', 40, 'Alvaro'], ['House of Cards', 45, 'Kevin']]
df.values returns the values present in the DataFrame, and tolist() converts those values into a list of rows.
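The pandas documentation recommends DataFrame.to_numpy() over the values property in new code. As a minimal sketch, assuming the same sample data, both routes should produce the same nested list here:

```python
import pandas as pd

df = pd.DataFrame({
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
})

# Route used in this article: the values property, then tolist()
rows_from_values = df.values.tolist()

# Alternative route: the to_numpy() method, then tolist()
rows_from_to_numpy = df.to_numpy().tolist()

print(rows_from_values)
print(rows_from_values == rows_from_to_numpy)
```

Either way, each inner list corresponds to one row of the DataFrame.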
Use Pandas Series.tolist()
A Pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A Series is comparable to a single column in an Excel sheet. Labels need not be unique but must be of a hashable type.
Pandas series can be created using various inputs like:
- Array
- Dictionary
- Scalar value or constant
Pandas Series.tolist() is an inbuilt function that returns a list of the values.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    print("Get the list of actors")
    print(df['actors'].tolist())
    print("Get the list of episodes")
    print(df['episodes'].tolist())
    print("Get the list of series")
    print(df['series'].tolist())

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Get the list of actors
    ['Millie', 'Alvaro', 'Kevin']
    Get the list of episodes
    [25, 40, 45]
    Get the list of series
    ['Stranger Things', 'Money Heist', 'House of Cards']

In this code, we select each column by name using df['column name'], which returns a Series, and then call tolist() on that Series to convert it into a list.
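When every column is needed as its own list, the per-column call can be wrapped in a dictionary comprehension. The sketch below assumes the same sample data; column_lists is a name chosen for illustration. It also compares the result with DataFrame.to_dict(orient='list'), which builds the same kind of mapping.

```python
import pandas as pd

df = pd.DataFrame({
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
})

# One Series.tolist() call per column, keyed by column name
column_lists = {col: df[col].tolist() for col in df.columns}
print(column_lists['episodes'])

# Series.tolist() returns Python scalars for integer data
print(type(column_lists['episodes'][0]))

# The same mapping via to_dict with the 'list' orientation
print(column_lists == df.to_dict(orient='list'))
```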
Use zip(*df.values) function to convert df to list
A DataFrame can also be converted to a list using zip(). The Python zip() function takes zero or more iterables and returns an iterator of tuples, where the i-th tuple contains the i-th item from each iterable. If we do not pass any argument, zip() returns an empty iterator. The * operator unpacks the rows of df.values into separate arguments, so zip() groups the values by position, which turns rows into columns.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    print('Use zip() function to aggregate DataFrame into List')
    listOfVal = [list(i) for i in zip(*df.values)]
    print(listOfVal)

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Use zip() function to aggregate DataFrame into List
    [['Stranger Things', 'Money Heist', 'House of Cards'], [25, 40, 45], ['Millie', 'Alvaro', 'Kevin']]

From the output, you can see that zip() converted df to a list in one step. Note that each inner list holds one column of the DataFrame, not one row.
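To make the role of the * operator concrete, the following sketch applies zip() to a plain nested list that mirrors the sample rows, with no pandas involved. The names rows, columns, and round_trip are illustrative.

```python
rows = [
    ['Stranger Things', 25, 'Millie'],
    ['Money Heist', 40, 'Alvaro'],
    ['House of Cards', 45, 'Kevin'],
]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2])
columns = [list(col) for col in zip(*rows)]
explicit = [list(col) for col in zip(rows[0], rows[1], rows[2])]
print(columns)
print(columns == explicit)

# Applying the same trick a second time flips the columns back into rows
round_trip = [list(row) for row in zip(*columns)]
print(round_trip == rows)
```

This is why the zip() approach yields one list per column: unpacking the rows and regrouping by position transposes the data.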
Use iloc[] approach
If we want to convert only one column to a list, we can use iloc. DataFrame.iloc is an indexer for purely integer-location-based selection by position. It is useful when the index labels of the DataFrame are something other than 0, 1, 2, 3, …, n, or when the user doesn't know the index labels. With df.iloc[:, i], the first slot selects all rows and the second selects the column at position i.

    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    listOfVal0 = df.iloc[:, 0].tolist()
    listOfVal1 = df.iloc[:, 1].tolist()
    listOfVal2 = df.iloc[:, 2].tolist()
    print('Use iloc[] to select the list of column 0')
    print(listOfVal0)
    print('Use iloc[] to select the list of column 1')
    print(listOfVal1)
    print('Use iloc[] to select the list of column 2')
    print(listOfVal2)

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Use iloc[] to select the list of column 0
    ['Stranger Things', 'Money Heist', 'House of Cards']
    Use iloc[] to select the list of column 1
    [25, 40, 45]
    Use iloc[] to select the list of column 2
    ['Millie', 'Alvaro', 'Kevin']

We have converted each column of the DataFrame to a list individually using iloc[] and printed the lists on the screen.
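For comparison, df.iloc[:, i] picks the column at position i, while df.iloc[i] picks the row at position i. The sketch below assumes the same sample data but gives the DataFrame hypothetical string index labels ('a', 'b', 'c') to show that iloc works by position regardless of the labels.

```python
import pandas as pd

data = {
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
}

# Hypothetical non-numeric index labels, for illustration only
df = pd.DataFrame(data, index=['a', 'b', 'c'])

# Row at position 0 (labelled 'a'), as a list
first_row = df.iloc[0].tolist()
print(first_row)

# Column at position 1 ('episodes'), as a list
second_column = df.iloc[:, 1].tolist()
print(second_column)
```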
Use df.columns.values.tolist()
If we want to convert the DataFrame's column names to a list, we can use df.columns.values.tolist(). Here, df.columns.values returns the column names as a NumPy array, and tolist() converts them to a list. See the following code.
    # app.py

    import pandas as pd

    # Creating Dictionary
    data = {
        'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
        'episodes': [25, 40, 45],
        'actors': ['Millie', 'Alvaro', 'Kevin']
    }

    # Creating Dataframe
    df = pd.DataFrame(data)
    print(df)

    # get the values
    print('---------------------')
    print('Use df.columns.values.tolist() to get the list of columns')
    print(df.columns.values.tolist())

Output

    python3 app.py
                series  episodes  actors
    0  Stranger Things        25  Millie
    1      Money Heist        40  Alvaro
    2   House of Cards        45   Kevin
    ---------------------
    Use df.columns.values.tolist() to get the list of columns
    ['series', 'episodes', 'actors']
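There are shorter spellings that should give the same list of column names. This sketch, assuming the same sample data, compares df.columns.values.tolist() with df.columns.tolist() and with list(df), which iterates over the column labels.

```python
import pandas as pd

df = pd.DataFrame({
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
})

via_values = df.columns.values.tolist()  # approach used in this article
via_index = df.columns.tolist()          # tolist() called on the Index directly
via_list = list(df)                      # iterating a DataFrame yields column labels

print(via_values)
print(via_values == via_index == via_list)
```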
Conclusion:
A pandas.DataFrame, a pandas.Series, and Python's built-in list can be converted to each other. We have seen five ways to convert a Pandas DataFrame to a list. Use whichever of the five approaches fits your requirement.
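As a closing illustration of converting in the other direction, the sketch below turns the sample DataFrame into a list of rows and then rebuilds a DataFrame and a Series from plain lists. The names rows, cols, rebuilt, and episodes are illustrative, and the column names must be passed back in because the nested list does not carry them.

```python
import pandas as pd

df = pd.DataFrame({
    'series': ['Stranger Things', 'Money Heist', 'House of Cards'],
    'episodes': [25, 40, 45],
    'actors': ['Millie', 'Alvaro', 'Kevin'],
})

# DataFrame -> list of rows, plus the column names as a separate list
rows = df.values.tolist()
cols = df.columns.tolist()

# list of rows -> DataFrame (column names supplied explicitly)
rebuilt = pd.DataFrame(rows, columns=cols)
print(rebuilt)

# plain list -> Series -> plain list
episodes = pd.Series([25, 40, 45], name='episodes')
print(episodes.tolist())
```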