Loading a Heroku PostgreSQL Database with Data
How to load data into PostgreSQL - and some of the issues you might encounter along the way.
Recently, I needed to get a PostgreSQL database running on Heroku with some sample data. Here's what I had to do to make it happen - and some of the limitations I discovered in Heroku's PostgreSQL offering.
Heroku versus...well, everything else
First, a quick note about Heroku.
I admittedly don't use the platform much. As a former AWS (and current Microsoft) employee, my first choices for new projects are AWS or Azure. But I've been diving into Heroku more as part of the work I do for TinyStacks.
Heroku is a PaaS, or Platform as a Service (PaaS) offering. In reality, that means it's basically a layer above AWS. In essence, Heroku uses AWS so you don't have to.
At first blush, there's a real benefit to this. The AWS Management Console is notoriously hard to use. And AWS itself (and, to be fair, Azure as well) is chock full of a plethora of features that seem to compete with one another. Want to deploy an application? Congratulations - there are about six to eight options for you to choose from! (See some of my articles for TinyStacks comparing some of the available options.)
For medium- to large-size projects, this flexibility is (mostly) nice. You can mix and match various AWS features to create a secure, scalable application that grows with your business. However, for smaller projects, the abundance of choice and the console usability issues quickly frustrate those new to the cloud.
Enter Heroku. Heroku provides a sleek, simplified interface above AWS for application deployment. This ease of use is great when your project's just starting out.
Over time, however, Heroku's downsides pile up. Heroku's limited scaling options and geographic reach put an upper cap on how large your app can grow. And the premium the company charges on top of its AWS costs gets costlier as your app scales.
Personally, I think it's worth the extra effort to figure out how to deploy your application directly onto AWS or Azure. Yes, the learning curve is steeper. But both AWS and Azure have PaaS-like offerings that allow you to keep things simpler during your initial exploratory and v1 phases. And it's a lot easier to grow directly on a cloud provider than it is to port your app off of Heroku after the fact.
Prerequisites
Anyway, rant over. You didn't come here to listen to me opine on the cloud but for practical advice. Assuming you still want/need to get a PostgreSQL DB up and running on Heroku, here's how to get it done.
To get started, you'll need the following installed on your machine:
- Heroku CLI
- PostgreSQL client tools (included with main PostgreSQL install)
- A Heroku app. If you don't have one, follow the instructions to create one here.
For the purposes of this article, I followed the Python walk-through. If you follow the tutorial on up to this command, you'll get a basic Python database with a PostgreSQL database already attached to it:
git push heroku main
You can see the details for this database in your Heroku console. You'll note, as I pointed out above, that it's hosted on AWS.
Running a .sql file on Heroku
For this sample app, I wanted to import some data from a larger data set than the one provided by the default Heroku "Greetings" app. So I downloaded the SQL insertion file for the USDA sample database.
Importing this specific sample data set was pretty easy. (Although I had problems with other sample data sets; see below.) You just have to launch the PostgreSQL command line tools using the Heroku CLI from within your Heroku application's directory. The Heroku CLI will take care of authenticating you into your database:
heroku pg:psql
(Note: If you're on a Windows system, you may get an error that psql
could not be found. For some reason, the installer chose fit not to add the path to the PostgreSQL bin
directory to my environment PATH
. I added it myself; on my system, the path PostgreSQL 14 is C:\Program Files\PostgreSQL\14\bin
.)
From there, download the usda.sql
file into the same directory. Inserting the data is as simple as running the following command:
\i usda.sql
You can then run a simple SQL query to verify the data inserted correctly:
Issues running other SQL imports
When I started this article, I tried to import Pagila, a version of the MySQL Sakila database tailored specifically to Postgres. However, I immediately got a series of errors:
So I dug into why. I soon found out that PostgreSQL creates a default set of database credentials for you that contains all permissions just below superuser.
This means that you can do all of the normal CRUD (Create/Read/Update/Delete) operations on existing tables plus grant some permissions with the default credentials. You can even create tables and load data. However, the Pagila import attempts to create a series of other PostgreSQL objects such as types, domains, and functions. Apparently, you can't perform these operations with the default credentials.
This seems like a major restriction on the Heroku platform. Heroku is adamant that they don't let users have superuser permissions on their PostgreSQL databases. Hobby tier databases are restricted to a single PostgreSQL credential: the default credential. Hosting levels above hobby can create additional sets of credentials. However, it doesn't seem like granting permissions to these advanced functions is possible even at that tier. (I haven't found anything to that effect.)
I wouldn't mind being told I'm wrong. However, for me, this just highlights why I'd go to a cloud provider first before Heroku. Not having admin access to my own databases strikes me as fundamentally silly. I'm all for systems that make the cloud easier - but not for ones that enforce overly restrictive guardrails.
I'm a software engineering manager with 25+ years experience in the industry who's done application dev for both AWS and Microsoft. Want to tap into my expertise? Contact me via LinkedIn to set up a consultation or reach out to me on Twitter.