Alexey Vasiliev

  • 6+ years experience
    • Linux and Databases administrator
    • Web and Mobile developer (Ruby, Java, JavaScript, Objective-C, C/C++)
  • Open-Source developer
    • WebP-ffi
    • MongodbLogger for Rails
    • Piro - Chrome extension for PivotalTracker
    • SMTRails and SHTRails (shared templates for rails)
    • SkypeKit for Ruby
me

PostgreSQL features

Flexible Datatypes

PostgreSQL

Arrays

Flexible Datatypes

        select array_agg(id) from endpoints group by application_id;
      
        select (array['hi', 'there', 'everyone', 'at', 'hotcode'])[random()*2 + 1];
      
        select name, tags from posts where tags @> array['it', 'sql'];
      
        select unnest(tags) as tag from posts where title = 'About PostgreSQL';
      

Ranges (9.2+)

Flexible Datatypes

        SELECT int4range(10, 20) @> 3;
        SELECT daterange('["Jan 1 2013", "Jan 15 2013")') @> 'Jan 10 2013'::date;
      
        $ ALTER TABLE reservation ADD EXCLUDE USING gist (during WITH &&);

        $ INSERT INTO reservation VALUES (1108, '[2010-01-01 11:30, 2010-01-01 13:00)');
        INSERT 0 1

        $ INSERT INTO reservation VALUES (1108, '[2010-01-01 14:45, 2010-01-01 15:45)');
        ERROR:  conflicting key value violates exclusion constraint "reservation_during_excl"
        DETAIL:  Key (during)=([ 2010-01-01 14:45:00, 2010-01-01 15:45:00 )) conflicts
        with existing key (during)=([ 2010-01-01 14:30:00, 2010-01-01 15:30:00 )).
      

XML and JSON (9.2+)

Flexible Datatypes

        $ SELECT xpath('/my:a/text()', 'test', ARRAY[ARRAY['my', 'http://example.com']]);
         xpath
        --------
         {test}
      
        $ SELECT * from json_demo;
          id | username |       email       | posts_count
         ----+----------+-------------------+-------------
           1 | john     | john@gmail.com    |          10
           2 | mickael  | mickael@gmail.com |          50
         $ SELECT row_to_json(json_demo) FROM json_demo;
                                         row_to_json
         ----------------------------------------------------------------------------
          {"id":1,"username":"john","email":"john@gmail.com","posts_count":10}
          {"id":2,"username":"mickael","email":"mickael@gmail.com","posts_count":50}
      

JSON and PLV8 for "schemaless" sql

Flexible Datatypes

        CREATE OR REPLACE FUNCTION get_numeric(json_raw json, key text)
        RETURNS numeric AS $$
          var o = JSON.parse(json_raw);
          return o[key];
        $$ LANGUAGE plv8 IMMUTABLE STRICT;
      
        SELECT * FROM members WHERE get_numeric(profile, 'age') = 36;
        Time: 9340.142 ms
      
        CREATE INDEX member_age ON members (get_numeric(profile, 'age'));
      
         SELECT * FROM members WHERE get_numeric(profile, 'age') = 36;
         Time: 57.429 ms
      

JSON functions (comming in 9.3)

Flexible Datatypes

  • array_to_json (present in 9.2)
  • row_to_json (present in 9.2)
  • to_json
  • json_array_length
  • json_each
  • json_each_text
  • json_extract_path
  • json_extract_path_text
  • json_object_keys
  • json_populate_record
  • json_populate_recordset
  • json_array_elements

WITH

WITH examples

WITH

        WITH a AS ( SELECT 'a' as a ) SELECT * FROM a;
      
        WITH
          prepared_data AS ( ... )
        SELECT data, count(data),
               min(data), max(data)
        FROM prepared_data
        GROUP BY data;
      

WITH examples

WITH

      $ WITH RECURSIVE t(n) AS (
          VALUES (1)
        UNION ALL
          SELECT n+1 FROM t WHERE n < 100
        )
        SELECT sum(n) FROM t;

         sum
        ------
         5050
        (1 row)
      

LISTEN / NOTIFY

Example

LISTEN / NOTIFY

        $ LISTEN delay_worker;
        LISTEN
        $ NOTIFY delay_worker, '44924';
        NOTIFY
        Asynchronous notification "delay_worker"
        with payload "44924" received from server process with PID 29118.
        $ SELECT pg_notify('delay_worker', '44924');
         pg_notify
        -----------

        (1 row)

        Asynchronous notification "delay_worker"
        with payload "44924" received from server process with PID 29118.
      

What is this?

LISTEN / NOTIFY

  • LISTEN on a channel
  • NOTIFY messages are delivered asynchronously w/payload
  • useful to fan out messages to other clients

Great for

  • broadcasting events to other clients
  • work distribution
  • cache busting

Window functions

Example

Window functions

        $ SELECT depname, empno, salary, avg(salary)
          OVER (PARTITION BY depname) FROM empsalary;
         depname  | empno | salary |          avg
        -----------+-------+--------+-----------------------
         develop   |    11 |   5200 | 5020.0000000000000000
         develop   |     7 |   4200 | 5020.0000000000000000
         develop   |     9 |   4500 | 5020.0000000000000000
         develop   |     8 |   6000 | 5020.0000000000000000
         develop   |    10 |   5200 | 5020.0000000000000000
         personnel |     5 |   3500 | 3700.0000000000000000
         personnel |     2 |   3900 | 3700.0000000000000000
         sales     |     3 |   4800 | 4866.6666666666666667
         sales     |     1 |   5000 | 4866.6666666666666667
         sales     |     4 |   4800 | 4866.6666666666666667
        (10 rows)
      

Example

Window functions

        $ SELECT salary, sum(salary) OVER () FROM empsalary;
         salary |  sum
        --------+-------
           5200 | 47100
           5000 | 47100
           3500 | 47100
           4800 | 47100
           3900 | 47100
           4200 | 47100
           4500 | 47100
           4800 | 47100
           6000 | 47100
           5200 | 47100
        (10 rows)
      

Example

Window functions

        $ SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
         salary |  sum
        --------+-------
           3500 |  3500
           3900 |  7400
           4200 | 11600
           4500 | 16100
           4800 | 25700
           4800 | 25700
           5000 | 30700
           5200 | 41100
           5200 | 41100
           6000 | 47100
        (10 rows)
      

Table Inheritance

How to use Partitioning

Partitioning

        CREATE TABLE my_logs(
          id SERIAL PRIMARY KEY,
          logdate TIMESTAMP NOT NULL,
          data JSON
        );
      
        CREATE TABLE my_logs2012m10 (
        CHECK ( logdate >= DATE '2012−10−01' AND logdate < DATE '2012−11−01' )
        ) INHERITS (my_logs);
        CREATE INDEX my_logs2012m10_logdate ON my_logs2012m10 (logdate);
      

Management Partition

Partitioning

Simple cleanup

        DROP TABLE my_logs2012m06;
      

or remove partition from partitioning

        ALTER TABLE my_logs2012m06
        NO INHERIT my_logs;
      

Smart Query Optimization

Partitioning

        partitioning_test=# EXPLAIN SELECT ∗ FROM my_logs WHERE logdate > '2012−08−01';
                                  QUERY PLAN
        −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
        Result ( cost =6.81..41.87 rows=660 width=52)
          −> Append ( cost =6.81..41.87 rows=660 width=52)
            −> Bitmap Heap Scan on my_logs ( cost =6.81..20.93 rows=330 width=52)
              Recheck Cond: ( logdate > '2012−08−01 00:00:00' : : timestamp without time zone)
                −> Bitmap Index Scan on my_logs_logdate ( cost =0.00..6.73 rows=330 width=0)
                  Index Cond: (logdate > '2012−08−01 00:00:00' : : timestamp without time zone)
            −> Bitmap Heap Scan on my_logs2012m08 my_logs ( cost =6.81..20.93 rows=330 width=52)
              Recheck Cond: ( logdate > '2012−08−01 00:00:00' : : timestamp without time zone)
                −> Bitmap Index Scan on my_logs2012m08_logdate ( cost =0.00..6.73 rows=330 width=0)
                  Index Cond: (logdate > '2010−08−01 00:00:00' : : timestamp without time zone)
        (10 rows)
      

Indexes

Functional Indexes

Functional and Partial Indexes

Index on expression

        CREATE INDEX foo_name_first_idx
        ON foo ((lower(substr(foo_name, 1, 1))));
      

for selects

        SELECT * FROM foo
        WHERE lower(substr(foo_name, 1, 1)) = 's';
      

Partial Indexes

Functional and Partial Indexes

Index refers to the predicate WHERE

        CREATE INDEX access_log_client_ip_ix ON access_log (client_ip)
        WHERE NOT (client_ip > inet '192.168.100.0' AND
                   client_ip < inet '192.168.100.255');
      

for selects

        SELECT * FROM access_log
        WHERE client_ip = '192.168.100.45';
      

Create Index Concurrently

Instead

        CREATE INDEX sales_quantity_index ON sales_table (quantity);
      

this better for huge table

        CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
      

but be careful

        Indexes:
            "idx" btree (col) INVALID
      

Foreign Data Wrappers

Foreign Data Wrappers (FDWs)

FDWs

Wrappers

Foreign Data Wrappers (FDWs)

  • oracle_fdw
  • mysql_fdw
  • odbc_fdw
  • jdbc_fdw
  • couchdb_fdw
  • mongo_fdw
  • redis_fdw
  • file_fdw, file_text_array_fdw, file_fixed_length_record_fdw
  • twitter_fdw
  • ldap_fdw
  • s3_fdw
  • www_fdw
  • Multicorn (CSV, FS, RSS, Hive)

Wrappers support

Foreign Data Wrappers (FDWs)

  • From PostgreSQL 9.1 - wrappers can only read
  • From PostgreSQL 9.3 - wrappers can read and write

Extensions

  • PostGIS
  • PostPic
  • PL/Proxy
  • PgMemcache
  • Prefix
  • pgSphere
  • Multicorn
  • Hstore
  • PLV8
  • Intarray
  • Dblink
  • Smlar
  • many others...
Extensions

PostgreSQL 9.3 is comming

  • Configuration directive 'include_dir'
  • COPY FREEZE for more efficient bulk loading
  • Custom Background Workers
  • JSON: Additional functionality
  • LATERAL JOIN
  • Switch to Posix shared memory and mmap() (No need SHMMAX and SHMALL)
  • Parallel pg_dump for faster backups
  • 'pg_isready' server monitoring tool
  • Materialized Views
  • Recursive View Syntax
  • Updatable Views
  • Writeable Foreign Tables
  • Streaming-Only Remastering
  • Fast Failover
  • Architecture-Independent Streaming

<Thank You!>

Contact information