How to Combine Python Lists Into a Pandas DataFrame
Lists are a useful way to collect data from other Python processes, but they’re not easy to analyze.
If you have a Python process which appends data to multiple lists, the first step to analyzing this data is usually to get it into a Pandas dataframe.
Here’s an example of how to go about doing this.
First let’s import Pandas and define two Python lists:
import pandas as pd
customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']
The goal is to produce a dataframe which looks like this:
| customer_id | customer_name |
|---|---|
| 100 | Jack |
| 101 | Jill |
| 102 | Tom |
| 103 | Jerry |
You can do this by passing the output of Python's built-in zip() and list() functions to the DataFrame constructor.
df = pd.DataFrame(list(zip(customer_id, customer_name)),
                  columns=['customer_id', 'customer_name'])
How does this work?
The zip() function iterates through our input lists customer_id and customer_name in parallel. It takes the first item from each list (100 and 'Jack' in our case) and puts them together in a tuple. Then it moves on to the second item from each list, and so on, stopping when the shortest list runs out.
What you end up with is a zip object, an iterator which yields these tuples:
(100, 'Jack'), (101, 'Jill'), (102, 'Tom'), (103, 'Jerry')
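If you'd like to see this for yourself, here is a small sketch using the same two lists. The variable names pairs, rows and leftover are only illustrative. Note that a zip object is lazy and can be consumed only once, which is one reason to turn it into a list straight away.

```python
customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

# zip() returns a lazy iterator; no tuples are built until it is consumed
pairs = zip(customer_id, customer_name)

# list() consumes the iterator and collects the tuples
rows = list(pairs)
print(rows)
# [(100, 'Jack'), (101, 'Jill'), (102, 'Tom'), (103, 'Jerry')]

# The iterator is now exhausted, so a second pass yields nothing
leftover = list(pairs)
print(leftover)
# []
```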
The list() function is wrapped around the zip object, which turns it into a list of tuples. The DataFrame constructor reads each tuple as one row.
The columns parameter of the DataFrame constructor names the columns in the output dataframe. These names can be different from your list names.
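As a quick illustration of that last point, the sketch below builds the same dataframe with shorter column names. The labels 'id' and 'name' and the variable df_short are made up for this example.

```python
import pandas as pd

customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

# 'id' and 'name' are illustrative labels; they need not match the list names
df_short = pd.DataFrame(list(zip(customer_id, customer_name)),
                        columns=['id', 'name'])

print(df_short.columns.tolist())
# ['id', 'name']
```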
Your lists have now been combined into a dataframe ready for analysis.
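As an aside, zip() is not the only route. A common alternative, not the method used in this post, is to pass a dictionary that maps each column name to its list. The sketch below builds the dataframe both ways so you can compare them; df_zip and df_dict are just illustrative names.

```python
import pandas as pd

customer_id = [100, 101, 102, 103]
customer_name = ['Jack', 'Jill', 'Tom', 'Jerry']

# The approach from this post: zip the lists into rows
df_zip = pd.DataFrame(list(zip(customer_id, customer_name)),
                      columns=['customer_id', 'customer_name'])

# Alternative: a dict whose keys become the column names
df_dict = pd.DataFrame({'customer_id': customer_id,
                        'customer_name': customer_name})

# Compare the two results column by column
same_columns = list(df_zip.columns) == list(df_dict.columns)
same_values = df_zip.values.tolist() == df_dict.values.tolist()
print(same_columns, same_values)
```

One difference worth knowing: zip() silently drops extra items if one list is longer than the other, whereas the dictionary form raises an error when the lists have different lengths.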
Now that you have your dataframe, take a look at this post if you’d like to create a permanent copy of it as an SQLite table.
