Connecting your Distilled data warehouse to Jupyter Notebook.
Welcome to this documentation guide on how to connect your Distilled database to Jupyter Notebook. By following the instructions outlined in this guide, you will be able to utilize the available connection string in your account to establish a connection between your Distilled data and Jupyter Notebook. This will enable you to access and manipulate your data in a Python environment, leveraging the power and flexibility of Python’s data analysis tools. Whether you are a data scientist, analyst, or researcher, this guide will equip you with the skills necessary to effectively analyze your data in a Jupyter Notebook environment.
Before you begin:
Before you begin, gather your connection string details for your Distilled PostgreSQL database. To find your Distilled database name and port details, access your Distilled account and navigate to Analyze > Fetch your data and use the connection string details
Libraries to Install:
- A python database API (DBAPI) library
The libraries you’ll need can be installed by using the pip install commands:
pip install ipython-sql pip install sqlalchemy pip install psycopg2
Created by catherinedevlin on Github, this enables the use of SQL magic functions that contain % and %% , allowing you to write SQL style code right in Jupyter Notebook.
sqlalchemy is touted as a “SQL toolkit and object-relational-mapper” for Python. It will mainly be used to store SQL queries into a pandas dataframe.
Now that we have our libraries installed, let’s move to Jupyter Notebook.
How to Start
To load ipython-sql, use the following magic command:
Next, we will only need the create_engine() function from sqlalchemy so let’s import that with the following line:
from sqlalchemy import create_engine
From here, we can now connect to your Distilled PostgreSQL database.
Connecting to Your Distilled PostgreSQL Database
To connect ipython-sql to your database, use the following format:
# Format %sql dialect+driver://username:password@host:port/database # Example format %sql postgresql://postgres:password123@localhost/dvdrental
- dialect+driver in this case would just be postgresql
- username:password is where you will substitute your username and password
- host is can be found in your connection details
- port can be found in your conncetion details
- database can be found in your connection details, but may not be needed in all use-cases
To see if it successfully connected, Jupyter Notebook will print the following:
To connect sqlalchemy to the database, the format will largely be the same, but we will create a new object called engine using the create_engine() function:
# Format engine = create_engine('dialect+driver://username:password@host:port/database') # Example format engine = create_engine('postgresql://postgres:password123@localhost/dvdrental')
You should be able to copy & paste what you wrote after the %sql magic command when connecting to ipython-sql and encapsulate it in quotes as a parameter in the create_engine() function.
As long as you see no errors in the output, you should be connected to your Distilled PostgreSQL database.