How to Explode Arrays with Presto

Learn how to use the equivalent of HiveQL’s lateral view explode in Presto and Trino.

If you ever come across a column in a relational table which includes arrays or maps (key value pairs), then you’ll usually want to expand the array so that each element gets its own row. This makes it much easier to use the data in your analysis.

Lateral view explode

If you’re using SQL or HiveQL then you can use a LATERAL VIEW EXPLODE statement.

It puts each element of the array into its own row which is combined with the other columns of the table - confused yet? Don’t worry! - there’s an example coming up. 🤔

I came across this situation when I was working on a piece of analysis to map out which channels customers interact with in the 30 days pre-purchase. Luckily for me, the data already existed at transaction level. But, as the table maintained one row per transaction, the channels which customers interacted with were in an array.

Presto and Trino don’t have the lateral view explode function which you’d find in SQL or HiveQL, but there is an alternative - you can use the UNNEST() function with a CROSS JOIN

Before we look at how to do this in Presto and Trino, let’s first look at an example of what a lateral view explode can be used for.

Example of a lateral view explode

A lateral view explode expands your array elements across multiple rows.

As you can see in the example below, the channel_array elements have now been distributed across multiple rows for each customer_id.

Before lateral view explode

transaction_id channel_array
11111 ["Email","App","Website"]
22222 ["Social","Website"]
33333 ["Website","App","Email","App"]

After lateral view explode

transaction_id channel_number channel
11111 1 Email
11111 2 App
11111 3 Website
22222 1 Social
22222 2 Website
33333 1 Website
33333 2 App
33333 3 Email
33333 4 App

How can I do this with Presto and Trino?

You can use CROSS JOIN UNNEST as in this query:

    b.n AS channel_number,

FROM db.transactions a

CROSS JOIN UNNEST(a.channel_path) WITH ORDINALITY AS b (channel, n)

There are a few things to unpack here. Let’s go through them one by one:

Note that if you’re doing this on a mapping of key value pairs instead of an array then you can remove the redundant WITH ORDINALITY statement and refer to the key values using b.key and b.value in the SELECT statement when b, key and value are defined like this:

CROSS JOIN UNNEST(a.channel_path) AS b (key, value) 

Now you now know how to perform a lateral view explode in Presto and Trino.

This is a handy trick to know as more processes use API data returned in JSON format which finds its way downstream to the analysts!