How to Combine Python Lists Into a Pandas DataFrame
Lists are a useful way to collect data from other Python processes, but they’re not easy to analyze.
If you have a Python process which appends data to multiple lists, the first step to analyzing this data is usually to get it into a Pandas dataframe.
Here’s an example of how to go about doing this.
First let’s import Pandas and define two Python lists:
import pandas as pd
customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']
The goal is to produce a dataframe which looks like this:
customer_id | customer_name |
---|---|
100 | Jack |
101 | Jill |
102 | Tom |
103 | Jerry |
You can do this by passing the output of the list() and zip() functions to the DataFrame constructor.
df = pd.DataFrame(list(zip(customer_id, customer_name)),
                  columns=['customer_id', 'customer_name'])
How does this work?
The zip() function iterates through our input lists customer_id and customer_name. It takes the first item in each list, 100 and 'Jack' in our case, and puts them together in a tuple. Then it moves on to the second item in each list, and so on.
What you end up with is a zip object which yields these tuples:
(100, 'Jack'), (101, 'Jill'), (102, 'Tom'), (103, 'Jerry')
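To see the pairing in isolation, here is a minimal sketch using the same two lists. The short_names list is a hypothetical addition, included only to show that zip() stops at the shortest input, which matters if your lists could end up with different lengths.

```python
customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

# zip() pairs items by position; list() collects the pairs so we can inspect them.
pairs = list(zip(customer_id, customer_name))
print(pairs)

# Hypothetical shorter list: zip() stops at the shortest input
# and silently drops the unmatched items from the longer one.
short_names = ['Jack', 'Jill']
truncated = list(zip(customer_id, short_names))
print(truncated)
```

If your lists may differ in length, it is worth comparing their len() values before zipping them.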
The list() function is wrapped around the zip object, turning it into a list of tuples that the DataFrame class can read.
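One detail worth knowing when you work with the zip object directly: it is an iterator, so it can only be consumed once. The sketch below illustrates this with the same lists; the variable names zipped, first_pass and second_pass are just for illustration.

```python
customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

zipped = zip(customer_id, customer_name)

# The first list() call walks through the iterator and collects every tuple.
first_pass = list(zipped)

# The iterator is now exhausted, so a second call produces an empty list.
second_pass = list(zipped)

print(first_pass)
print(second_pass)
```

Storing the result of list(zip(...)) in a variable avoids this if you need the pairs more than once.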
The columns option in the DataFrame class names the columns in the output dataframe. These names can be different from your list names.
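As a quick illustration of the columns option, the sketch below builds the dataframe twice: once with column names that differ from the list names ('id' and 'name' are arbitrary choices for this example), and once with no columns argument at all, in which case pandas falls back to integer labels.

```python
import pandas as pd

customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

rows = list(zip(customer_id, customer_name))

# Column names are independent of the list variable names.
renamed = pd.DataFrame(rows, columns=['id', 'name'])

# Without the columns argument, pandas labels the columns 0 and 1.
unnamed = pd.DataFrame(rows)

print(list(renamed.columns))
print(list(unnamed.columns))
```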
Your lists have now been combined into a dataframe ready for analysis.
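Before moving on to analysis, you may want to sanity-check the result. The following is a small sketch, not a required step: it confirms the dataframe has one row per list item and the expected column names.

```python
import pandas as pd

customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

df = pd.DataFrame(list(zip(customer_id, customer_name)),
                  columns=['customer_id', 'customer_name'])

# (rows, columns): should match the list length and the number of lists.
shape = df.shape
print(shape)

# A row count below len(customer_id) would mean zip() truncated a longer list.
rows_match = len(df) == len(customer_id) == len(customer_name)
print(rows_match)

# Peek at the first few rows.
print(df.head())
```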
Now that you have your dataframe, take a look at this post if you’d like to create a permanent copy of it as an SQLite table.