Multicorn - powerful foreign data wrapper for PostgreSQL
Hello my dear friends. In this article I will talk about Multicorn: what is this, how to install it and use it with PostgreSQL.
What is Multicorn?
Multicorn is a PostgreSQL 9.1+ extension meant to make Foreign Data Wrapper development easy, by allowing the programmer to use the Python programming language. “Foreign Data Wrappers” (FDW) were introduced in PostgreSQL 9.1, providing a way of accessing external data sources from within PostgreSQL using SQL.
For installing Multicorn I will use Ubuntu. First, we need install some packages:
We can install Multicorn by using pgxn client or from source. I prefer install from source:
To complete the installation we need to enable the extension in the database:
Now let consider how to use it.
For connection to another RDBMS database Multicorn use SQLalchemy library. It support MySQL, PostgreSQL, Microsoft SQL Server, and more. Let’s try how it is work with MySQL. First of all we should install additional libs:
In MySQL database “testing” we have table “companies”:
First of all we should create server for FDW in PostgreSQL:
Now we can create foreign table, which will contain data from MySQL table “companies” (I called this table in PostgreSQL “mysql_companies”):
db_url (string) - an sqlalchemy connection string (examples: “mysql://<user>:<password>@<host>/<dbname>”, “mssql: mssql://<user>:<password>@<dsname>”). See the sqlalchemy dialects documentation.
tablename (string) - the table name in the remote RDBMS.
And now we can check how it is work:
As you can see, it is work.
We can use Multicorn to get your emails from inbox by IMAP protocol. We need install additional libraries:
Next steps similar to previous. We should create server and table, where we will get data:
host (string) - the IMAP host to connect to.
port (string) - the IMAP host port to connect to.
login (string) - the login to connect with.
password (string) - the password to connect with.
payload_column (string) - the name of the column which will store the payload.
flags_column (string) - the name of the column which will store the IMAP flags, as an array of strings.
ssl (boolean) - wether to use ssl or not.
imap_server_charset (string) - the name of the charset used for IMAP search commands. Defaults to UTF8. For the cyrus IMAP server, it should be set to “utf-8”.
And we can read emails from inbox by using table “my_inbox”:
Added flag to email “Test email”:
Multicorn can get use RSS as source of information. Again, we need install additional libraries:
We should create server and table, where we will get data:
url (string) - the RSS feed URL.
Also, you should be sure, what your database use UTF-8 charset. Because in another encodings you can get errors :)
This FDW can be used to access data stored in CSV files. We should create server and table, where we will get data:
filename (string) - the full path to the CSV file containing the data. This file must be readable to the postgres user.
delimiter (character) - the CSV delimiter (defaults to “,”).
quotechar (character) - the CSV quote character (defaults to “).
skip_header (integer) - the number of lines to skip (defaults to 0).
Let’s check how it work:
The Multicorn also contain LDAP and FileSystem Foreign Data Wrappers. LDAP FDW can be used to access directory servers via the LDAP protocol. FileSystem FDW can be used to access data stored in various files, in a filesystem.
Your custom FDWs
Multicorn provides a simple interface for writing own foreign data wrappers. More information you can find here.
The original implementation of FDWs in PostgreSQL 9.1 and 9.2 was read-only, but in PostgreSQL 9.3 FDWs also have write access as well. Right now Multicorn support write access API in version >= 1.0.0.
As you can be seen, Multicorn is very useful extensions, which provide for PostgreSQL communicate with many external types of data source and provide for Python developers create own custom FDW for PostgreSQL.
That’s all folks! Thank you for reading till the end.