Leopard Blog2024-03-02T21:47:28+00:00http://leopard.in.uaAlexey Vasilievrwpod.com[at]gmail.comSafe and unsafe operations for high volume PostgreSQL2016-09-20T00:00:00+00:00http://leopard.in.ua/2016/09/20/safe-and-unsafe-operations-postgresql<p><a href="https://www.postgresql.org/">PostgreSQL</a> is an object-relational database management system that I use for many products. Some of these products need high availability and have to work without any downtime. This means I have to run database schema migrations while the app is up and serving requests, so I have to be very careful about which database operations I run. A bad command can lock out updates to a table for a long time. For example, if I create a new index on a table, I cannot create new records in this table while that index is building. Anyone who tries to insert a record into this table will block, and possibly time out, causing a partial outage. In general, I am OK with database operations taking a long time. However, any operation that locks a table against updates for more than a few seconds means downtime for me.</p>
<p>I decided to make a list of operations that can be done safely (without downtime) and those that are unsafe.</p>
<h1 id="add-a-new-column-safe">Add a new column (safe)</h1>
<p>This operation does not block the table and can be done safely. However, there are some cases in which it can lock your table.</p>
<h2 id="add-a-column-with-a-default-unsafe-if-postgresql--11">Add a column with a default (unsafe if PostgreSQL < 11)</h2>
<p>Before PostgreSQL 11, adding a column with a default requires updating each row of the table (to store the new column value). For a big table this creates a long-running operation that locks it. So if you intend to fill the column with mostly non-default values, it’s best to add the column with no default, insert the correct values using <code class="language-plaintext highlighter-rouge">UPDATE</code> (the correct way is to do batched updates, for example 1000 rows at a time, because one big update holds locks on every row it touches until it finishes), and then add any desired default.</p>
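<p>The batched backfill described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a migration tool: it assumes the table has an integer <code class="language-plaintext highlighter-rouge">id</code> primary key, that table and column names come from trusted code, and the helper names <code class="language-plaintext highlighter-rouge">id_batches</code> and <code class="language-plaintext highlighter-rouge">backfill_statements</code> are made up for this example. It only builds the SQL text; each statement would then be executed in its own short transaction.</p>

```python
from typing import Iterator, List, Tuple


def id_batches(min_id: int, max_id: int, batch_size: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) id ranges that cover [min_id, max_id]."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


def backfill_statements(table: str, column: str, value_sql: str,
                        min_id: int, max_id: int, batch_size: int = 1000) -> List[str]:
    """Build one UPDATE per id range (hypothetical helper, trusted identifiers only).

    The "IS NULL" filter makes every batch safe to re-run after a failure.
    """
    return [
        f"UPDATE {table} SET {column} = {value_sql} "
        f"WHERE id BETWEEN {start} AND {end} AND {column} IS NULL;"
        for start, end in id_batches(min_id, max_id, batch_size)
    ]
```

<p>For a table with ids from 1 to 2500, <code class="language-plaintext highlighter-rouge">backfill_statements("users", "foo_factor", "42", 1, 2500)</code> produces three statements, so no single transaction touches more than 1000 rows.</p>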
<p><strong>UPDATE</strong>: Starting with PostgreSQL 11, DDL statements like this one execute in constant time:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">users</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">foo_factor</span> <span class="nb">integer</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">DEFAULT</span> <span class="mi">42</span><span class="p">;</span></code></pre></figure>
<p>Rows are not touched when the statement is executed, and are instead updated “lazily”.</p>
<h2 id="add-a-column-that-is-non-nullable-unsafe-if-postgresql--11">Add a column that is non-nullable (unsafe if PostgreSQL < 11)</h2>
<p>This has the same problem as “Add a column with a default”. To perform this operation without locking, you can create a new table that includes the non-nullable column, write to both tables, backfill, and then switch to the new table. This workaround is incredibly onerous and needs twice as much disk space as the table itself takes.</p>
<p><strong>UPDATE</strong>: Starting with PostgreSQL 11, DDL statements like this one execute in constant time:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">users</span> <span class="k">ADD</span> <span class="k">COLUMN</span> <span class="n">foo_factor</span> <span class="nb">integer</span> <span class="k">NOT</span> <span class="k">NULL</span> <span class="k">DEFAULT</span> <span class="mi">42</span><span class="p">;</span></code></pre></figure>
<p>Rows are not touched when the statement is executed, and are instead updated “lazily”.</p>
<h1 id="drop-a-column-safe">Drop a column (safe)</h1>
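<p>Dropping a column is very quick, but PostgreSQL won’t reclaim the disk space right away: the space is freed gradually as existing rows are updated, or all at once when the table is rewritten, for example by “VACUUM FULL”.</p>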
<h1 id="change-the-type-of-a-column-unsafe">Change the type of a column (unsafe)</h1>
<p>This is not strictly unsafe for all changes. Increasing the length of a varchar, for example, does not require a table rewrite. Whether a column type change requires a rewrite depends on the data types involved; when it does, the operation has to update every row of the table while the table is locked. As a workaround, you can add a new column with the needed type, change the code to write to both columns, and backfill the new column.</p>
<h1 id="add-a-default-value-to-an-existing-column-safe">Add a default value to an existing column (safe)</h1>
<p>This operation does not block the table and can be done safely.</p>
<h1 id="add-an-index-unsafe">Add an index (unsafe)</h1>
<p>Normally PostgreSQL locks the table to be indexed against writes and performs the entire index build with a single scan of the table. Other transactions can still read the table, but if they try to insert, update, or delete rows in the table they will block until the index build is finished.</p>
<p>PostgreSQL supports building indexes without locking out writes. This method is invoked by specifying the <code class="language-plaintext highlighter-rouge">CONCURRENTLY</code> option of <code class="language-plaintext highlighter-rouge">CREATE INDEX</code>. When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially modify or use the index to terminate. Thus this method requires more total work than a standard index build and takes significantly longer to complete. However, since it allows normal operations to continue while the index is built, this method is useful for adding new indexes in a production environment. Of course, the extra CPU and I/O load imposed by the index creation might slow other operations.</p>
<p>If a problem arises while scanning the table, such as a uniqueness violation in a unique index, the <code class="language-plaintext highlighter-rouge">CREATE INDEX</code> command will fail but leave behind an “invalid” index. This index will be ignored for querying purposes because it might be incomplete; however it will still consume update overhead. The psql <code class="language-plaintext highlighter-rouge">\d</code> command will report such an index as INVALID:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="n">postgres</span><span class="o">=#</span> <span class="err">\</span><span class="n">d</span> <span class="n">tab</span>
<span class="k">Table</span> <span class="nv">"public.tab"</span>
<span class="k">Column</span> <span class="o">|</span> <span class="k">Type</span> <span class="o">|</span> <span class="n">Modifiers</span>
<span class="c1">--------+---------+-----------</span>
<span class="n">col</span> <span class="o">|</span> <span class="nb">integer</span> <span class="o">|</span>
<span class="n">Indexes</span><span class="p">:</span>
<span class="nv">"idx"</span> <span class="n">btree</span> <span class="p">(</span><span class="n">col</span><span class="p">)</span> <span class="n">INVALID</span></code></pre></figure>
<p>The recommended recovery method in such cases is to drop the index and try again to perform <code class="language-plaintext highlighter-rouge">CREATE INDEX CONCURRENTLY</code>.</p>
<p>Another difference is that a regular <code class="language-plaintext highlighter-rouge">CREATE INDEX</code> command can be performed within a transaction block, but <code class="language-plaintext highlighter-rouge">CREATE INDEX CONCURRENTLY</code> cannot.</p>
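<p>The recovery procedure above (find the invalid index, drop it, build it again) can be sketched as follows. The catalog column <code class="language-plaintext highlighter-rouge">pg_index.indisvalid</code> and <code class="language-plaintext highlighter-rouge">DROP INDEX CONCURRENTLY</code> are real PostgreSQL features; the helper name <code class="language-plaintext highlighter-rouge">rebuild_statements</code> is hypothetical, and the sketch assumes index, table and column names come from trusted code. Both generated statements have to be run outside a transaction block.</p>

```python
from typing import List

# Lists indexes left behind in the "invalid" state by a failed concurrent build.
INVALID_INDEXES_SQL = """
SELECT c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE NOT i.indisvalid;
"""


def rebuild_statements(index_name: str, table: str, columns: List[str],
                       unique: bool = False) -> List[str]:
    """Build the drop-and-retry statements for one invalid index (illustrative helper)."""
    cols = ", ".join(columns)
    kind = "UNIQUE INDEX" if unique else "INDEX"
    return [
        # Dropping concurrently avoids blocking writers while the broken index goes away.
        f"DROP INDEX CONCURRENTLY IF EXISTS {index_name};",
        f"CREATE {kind} CONCURRENTLY {index_name} ON {table} ({cols});",
    ]
```

<p>For the <code class="language-plaintext highlighter-rouge">idx</code> index from the psql output above, <code class="language-plaintext highlighter-rouge">rebuild_statements("idx", "tab", ["col"])</code> returns the two statements to run one after another.</p>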
<h1 id="add-a-column-with-a-unique-constraint-unsafe">Add a column with a unique constraint (unsafe)</h1>
<p>This operation locks the table. As a workaround, you can add the column, create a unique index concurrently, and then add the constraint to the table using that index:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">UNIQUE</span> <span class="k">INDEX</span> <span class="n">CONCURRENTLY</span> <span class="n">token_is_unique</span> <span class="k">ON</span> <span class="n">large_table</span><span class="p">(</span><span class="n">token</span><span class="p">);</span>
<span class="k">ALTER</span> <span class="k">TABLE</span> <span class="n">large_table</span> <span class="k">ADD</span> <span class="k">CONSTRAINT</span> <span class="n">token</span> <span class="k">UNIQUE</span> <span class="k">USING</span> <span class="k">INDEX</span> <span class="n">token_is_unique</span><span class="p">;</span></code></pre></figure>
<h1 id="drop-a-constraint-safe">Drop a constraint (safe)</h1>
<p>This operation does not block the table and can be done safely.</p>
<h1 id="vacuum-full-unsafe">VACUUM FULL (unsafe)</h1>
<p><code class="language-plaintext highlighter-rouge">VACUUM</code> reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a <code class="language-plaintext highlighter-rouge">VACUUM</code> is done. <code class="language-plaintext highlighter-rouge">VACUUM FULL</code> rewrites the entire contents of the table into a new disk file with no extra space, allowing unused space to be returned to the operating system. This form is much slower and requires an exclusive lock on each table while it is being processed.</p>
<p>To solve this problem you can use the <a href="https://github.com/reorg/pg_repack">pg_repack</a> PostgreSQL extension. To perform a full-table repack, pg_repack will:</p>
<ol>
<li>create a log table to record changes made to the original table;</li>
<li>add a trigger onto the original table, logging INSERTs, UPDATEs and DELETEs into our log table;</li>
<li>create a new table containing all the rows in the old table;</li>
<li>build indexes on this new table;</li>
<li>apply all changes which have accrued in the log table to the new table;</li>
<li>swap the tables, including indexes and toast tables, using the system catalogs;</li>
<li>drop the original table;</li>
</ol>
<p>Pg_repack will only hold an <code class="language-plaintext highlighter-rouge">ACCESS EXCLUSIVE</code> lock for a short period during initial setup (steps 1 and 2 above) and during the final swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack only needs to hold an <code class="language-plaintext highlighter-rouge">ACCESS SHARE</code> lock on the original table, meaning INSERTs, UPDATEs, and DELETEs may proceed as usual.</p>
<p>Performing a full-table repack requires free disk space about twice as large as the target table(s) and its indexes.</p>
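<p>The copy-then-replay idea behind the steps above can be illustrated with a toy model in plain Python. This is not how pg_repack is implemented: tables are modelled as dictionaries, the trigger as an in-memory list, and the function name <code class="language-plaintext highlighter-rouge">repack</code> and the tuple format of the changes are assumptions made for this example.</p>

```python
from typing import Any, Dict, List, Optional, Tuple

Change = Tuple[str, int, Optional[Any]]  # (operation, primary key, new value)


def repack(original: Dict[int, Any], concurrent_changes: List[Change]) -> Dict[int, Any]:
    """Toy model of a repack: copy the rows, log concurrent writes, replay the log."""
    log: List[Change] = []          # step 1: the log table
    new_table = dict(original)      # step 3: copy all rows of the old table

    # While the copy and index build run, writes still hit the original table
    # and the trigger (step 2) records each of them in the log.
    for op, key, value in concurrent_changes:
        if op == "delete":
            original.pop(key, None)
        else:                       # "insert" or "update"
            original[key] = value
        log.append((op, key, value))

    # Step 5: apply the accumulated changes to the new table.
    for op, key, value in log:
        if op == "delete":
            new_table.pop(key, None)
        else:
            new_table[key] = value

    return new_table                # step 6: this copy replaces the original
```

<p>After the replay the new table contains exactly what the original table contains, which is why the final swap only needs a short exclusive lock.</p>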
<h1 id="alter-table-set-tablespace-unsafe">ALTER TABLE SET TABLESPACE (unsafe)</h1>
<p>Normally all PostgreSQL data resides in a single directory. But you might have some additional SSD disks or, quite the contrary, some slow but very large disks, and you may want to put some of the data on another set of disks. This is what tablespaces are for.</p>
<p>The default tablespace is simply the <code class="language-plaintext highlighter-rouge">$PGDATA/base</code> directory, but you can create as many others as you need with the following command:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="n">TABLESPACE</span> <span class="n">xxx</span> <span class="k">LOCATION</span> <span class="s1">'/wherever'</span><span class="p">;</span></code></pre></figure>
<p>Afterwards you can move some tables or indexes to this new tablespace with:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">ALTER</span> <span class="k">TABLE</span><span class="o">/</span><span class="k">INDEX</span> <span class="n">whatever</span> <span class="k">SET</span> <span class="n">TABLESPACE</span> <span class="n">xxx</span><span class="p">;</span></code></pre></figure>
<p>This is a locking operation. To solve this problem you can use pg_repack with the <code class="language-plaintext highlighter-rouge">--tablespace</code> option.</p>
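<p>As a rough sketch, the pg_repack invocation for such a move could be assembled like this. The options <code class="language-plaintext highlighter-rouge">--dbname</code>, <code class="language-plaintext highlighter-rouge">--table</code>, <code class="language-plaintext highlighter-rouge">--tablespace</code> and <code class="language-plaintext highlighter-rouge">--moveidx</code> are pg_repack command-line options; the helper name <code class="language-plaintext highlighter-rouge">repack_move_command</code> and the database name <code class="language-plaintext highlighter-rouge">mydb</code> used below are placeholders for this example.</p>

```python
import shlex
from typing import List


def repack_move_command(dbname: str, table: str, tablespace: str,
                        move_indexes: bool = True) -> List[str]:
    """Build the argument list for moving one table to another tablespace."""
    argv = [
        "pg_repack",
        f"--dbname={dbname}",
        f"--table={table}",
        f"--tablespace={tablespace}",
    ]
    if move_indexes:
        # Also move the table's indexes to the tablespace given above.
        argv.append("--moveidx")
    return argv


def as_shell(argv: List[str]) -> str:
    """Render the argument list as a shell-safe command line for logging."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

<p>The list form can be passed directly to <code class="language-plaintext highlighter-rouge">subprocess.run</code>; <code class="language-plaintext highlighter-rouge">as_shell</code> is only there to print the command before running it.</p>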
<h1 id="summary">Summary</h1>
<p>As you can see, every unsafe operation has a workaround. You just need to remember how these unsafe operations behave in PostgreSQL and be very careful about which operations you run on a production database.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
What is Accelerated Mobile Pages (AMP) and how you can use it2016-09-07T00:00:00+00:00http://leopard.in.ua/2016/09/07/what-is-amp-and-how-you-can-use-it<p>Hello my dear friends. Today we will talk about <a href="https://www.ampproject.org/">Accelerated Mobile Pages (AMP)</a> and how it can help you speed up your website.</p>
<h1 id="what-is-accelerated-mobile-pages-amp">What is Accelerated Mobile Pages (AMP)?</h1>
<p>The loading speed of your web page really matters today. Many studies have shown that page load time has a direct impact on sales. For instance, <a href="http://www.gduchamp.com/media/StanfordDataMining.2006-11-28.pdf">every 100ms delay costs 1% of sales for Amazon store</a>. This is especially important on mobile phones and tablets. <a href="https://instantarticles.fb.com/">Facebook’s Instant Articles</a> and <a href="https://www.apple.com/news/">Apple News</a> are answering these issues in their own way. Fortunately, the <a href="https://www.ampproject.org/">Accelerated Mobile Pages (AMP)</a> Project, while promoted by Google, is an open-source project and you should give it a try. AMP in action consists of three different parts:</p>
<ul>
<li><strong>AMP HTML</strong> is HTML with some restrictions for reliable performance and some extensions for building rich content beyond basic HTML;</li>
<li>The <strong>AMP JS</strong> library ensures the fast rendering of AMP HTML pages;</li>
<li>The <strong>Google AMP Cache</strong> can be used to serve cached AMP HTML pages;</li>
</ul>
<video width="320" height="720" autoplay="" loop="" controls="">
<source src="https://www.google.com/images/google-blog-assets/amp-phone-10062015.mp4" type="video/mp4" />
Your browser does not support the video tag.
</video>
<h3 id="amp-html">AMP HTML</h3>
<p>AMP HTML is basically HTML extended with custom AMP properties. Though most tags in an AMP HTML page are regular HTML tags, some HTML tags are replaced with AMP-specific tags. These custom elements, called AMP HTML components, make common patterns easy to implement in a performant way. For example, the amp-img tag provides full srcset support even in browsers that don’t support it yet.</p>
<h3 id="amp-js">AMP JS</h3>
<p>The AMP JS library implements all of AMP’s best performance practices, manages resource loading and gives you the custom tags mentioned above, all to ensure a fast rendering of your page. Among the biggest optimizations is the fact that it makes everything that comes from external resources asynchronous, so nothing in the page can block anything from rendering. Other performance techniques include the sandboxing of all iframes, the pre-calculation of the layout of every element on page before resources are loaded and the disabling of slow CSS selectors.</p>
<h3 id="google-amp-cache">Google AMP Cache</h3>
<p>The <a href="https://developers.google.com/amp/cache/">Google AMP Cache</a> is a proxy-based content delivery network for delivering all valid AMP documents. It fetches AMP HTML pages, caches them, and improves page performance automatically. When using the Google AMP Cache, the document, all JS files and all images load from the same origin that is using HTTP 2.0 for maximum efficiency.</p>
<p>The cache also comes with a built-in validation system which confirms that the page is guaranteed to work, and that it doesn’t depend on external resources. The validation system runs a series of assertions confirming the page’s markup meets the AMP HTML specification. Another version of the validator comes bundled with every AMP page. This version can log validation errors directly to the browser’s console when the page is rendered, allowing you to see how complex changes in your code might impact performance and user experience.</p>
<h1 id="anatomy-of-an-amp-page">Anatomy of an AMP Page</h1>
<p>Here is a minimalist AMP page:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="cp"><!doctype html></span>
<span class="nt"><html</span> <span class="na">amp</span><span class="nt">></span>
<span class="nt"><head></span>
<span class="nt"><meta</span> <span class="na">charset=</span><span class="s">"utf-8"</span><span class="nt">></span>
<span class="nt"><link</span> <span class="na">rel=</span><span class="s">"canonical"</span> <span class="na">href=</span><span class="s">"hello-world.html"</span><span class="nt">></span>
<span class="nt"><meta</span> <span class="na">name=</span><span class="s">"viewport"</span> <span class="na">content=</span><span class="s">"width=device-width,minimum-scale=1,initial-scale=1"</span><span class="nt">></span>
<span class="nt"><style </span><span class="na">amp-boilerplate</span><span class="nt">>body</span><span class="p">{</span><span class="nl">-webkit-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">-moz-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">-ms-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">}</span><span class="k">@-webkit-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span 
class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-moz-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-ms-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-o-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="nt"></style><noscript><style </span><span class="na">amp-boilerplate</span><span class="nt">>body</span><span 
class="p">{</span><span class="nl">-webkit-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">-moz-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">-ms-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">animation</span><span class="p">:</span><span class="nb">none</span><span class="p">}</span><span class="nt"></style></noscript></span>
<span class="nt"><script </span><span class="na">async</span> <span class="na">src=</span><span class="s">"https://cdn.ampproject.org/v0.js"</span><span class="nt">></script></span>
<span class="nt"></head></span>
<span class="nt"><body></span>Hello World!<span class="nt"></body></span>
<span class="nt"></html></span></code></pre></figure>
<p>An AMP page is simply a regular HTML page with a few extra rules and restrictions:</p>
<ul>
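<li>The <code class="language-plaintext highlighter-rouge">html</code> tag at the top of the page must carry the <code class="language-plaintext highlighter-rouge">amp</code> attribute:</li>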
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><html</span> <span class="na">amp</span><span class="nt">></span></code></pre></figure>
<p>You can also use the lightning bolt form:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><html</span> <span class="err">⚡</span><span class="nt">></span></code></pre></figure>
<ul>
<li>The page must contain a <code class="language-plaintext highlighter-rouge">&lt;link rel="canonical" href="$SOME_URL" /&gt;</code> tag inside its head that points to the regular HTML version of the AMP HTML document, or to itself if no such HTML version exists;</li>
<li>The page must contain a <code class="language-plaintext highlighter-rouge">&lt;meta name="viewport" content="width=device-width,minimum-scale=1"&gt;</code> tag inside its head. It’s also recommended to include <code class="language-plaintext highlighter-rouge">initial-scale=1</code>;</li>
<li>You must inline all your CSS in the head (no external stylesheets allowed) using a <code class="language-plaintext highlighter-rouge">&lt;style amp-custom&gt;</code> tag;</li>
<li>You must include the following code as the last items before the closing <code class="language-plaintext highlighter-rouge">&lt;/head&gt;</code> tag:</li>
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><style </span><span class="na">amp-boilerplate</span><span class="nt">>body</span><span class="p">{</span><span class="nl">-webkit-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">-moz-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">-ms-animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">;</span><span class="nl">animation</span><span class="p">:</span><span class="n">-amp-start</span> <span class="m">8s</span> <span class="n">steps</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="n">end</span><span class="p">)</span> <span class="m">0s</span> <span class="m">1</span> <span class="nb">normal</span> <span class="nb">both</span><span class="p">}</span><span class="k">@-webkit-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span 
class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-moz-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-ms-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@-o-keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="k">@keyframes</span> <span class="n">-amp-start</span><span class="p">{</span><span class="nt">from</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">hidden</span><span class="p">}</span><span class="nt">to</span><span class="p">{</span><span class="nl">visibility</span><span class="p">:</span><span class="nb">visible</span><span class="p">}}</span><span class="nt"></style><noscript><style 
</span><span class="na">amp-boilerplate</span><span class="nt">>body</span><span class="p">{</span><span class="nl">-webkit-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">-moz-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">-ms-animation</span><span class="p">:</span><span class="nb">none</span><span class="p">;</span><span class="nl">animation</span><span class="p">:</span><span class="nb">none</span><span class="p">}</span><span class="nt"></style></noscript></span>
<span class="nt"><script </span><span class="na">async</span> <span class="na">src=</span><span class="s">"https://cdn.ampproject.org/v0.js"</span><span class="nt">></script></span></code></pre></figure>
<ul>
<li>You must remove any other JavaScript from your code (whether inline or external);</li>
<li>Image tags (<code class="language-plaintext highlighter-rouge">&lt;img&gt;</code>) must be replaced with amp-img tags (<code class="language-plaintext highlighter-rouge">&lt;amp-img&gt;</code>), and similarly for other media. This of course means that normal HTML readers can no longer parse or display the page contents without executing the AMP JavaScript;</li>
<li>Other items (e.g. forms) must also be removed;</li>
<li>You must implement <a href="http://schema.org/NewsArticle">Schema.org NewsArticle</a>, <a href="http://schema.org/Article">Schema.org Article</a> or <a href="http://schema.org/BlogPosting">Schema.org BlogPosting</a> metadata in your head and also include an image at least 696 pixels wide, if you want Google to use your AMP pages in the Top stories carousel;</li>
</ul>
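<p>A few of the rules above are easy to check mechanically. The following sketch uses Python’s standard <code class="language-plaintext highlighter-rouge">html.parser</code> module to look for the required markers. It is a rough pre-check only, not a replacement for the official AMP validator: it covers just the <code class="language-plaintext highlighter-rouge">amp</code> attribute, the canonical link, the viewport meta tag, the AMP JS script and leftover <code class="language-plaintext highlighter-rouge">img</code> tags, and the names <code class="language-plaintext highlighter-rouge">AmpChecker</code> and <code class="language-plaintext highlighter-rouge">amp_problems</code> are made up for this example.</p>

```python
from html.parser import HTMLParser
from typing import List

AMP_JS_URL = "https://cdn.ampproject.org/v0.js"


class AmpChecker(HTMLParser):
    """Collects which of a handful of required AMP markers appear in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.found = {"html_amp": False, "canonical": False,
                      "viewport": False, "amp_js": False}
        self.plain_img = 0  # number of regular <img> tags seen

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "html" and ("amp" in a or "⚡" in a):
            self.found["html_amp"] = True
        elif tag == "link" and a.get("rel") == "canonical" and a.get("href"):
            self.found["canonical"] = True
        elif tag == "meta" and a.get("name") == "viewport":
            content = a.get("content") or ""
            if "width=device-width" in content and "minimum-scale=1" in content:
                self.found["viewport"] = True
        elif tag == "script" and "async" in a and a.get("src") == AMP_JS_URL:
            self.found["amp_js"] = True
        elif tag == "img":
            self.plain_img += 1


def amp_problems(html: str) -> List[str]:
    """Return the names of the checks that failed; an empty list means none did."""
    checker = AmpChecker()
    checker.feed(html)
    problems = [name for name, ok in checker.found.items() if not ok]
    if checker.plain_img:
        problems.append("plain_img")
    return problems
```

<p>An empty result only means that these particular markers are present; the boilerplate styles, the ban on custom JavaScript and the other rules still need the real validator.</p>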
<p>Next, make sure that your AMP page is actually valid AMP, or it won’t get discovered and distributed by third-party platforms like Google Search. To validate:</p>
<ul>
<li>Open your page in your browser;</li>
<li>Add <code class="language-plaintext highlighter-rouge">#development=1</code> to the URL, for example, <a href="http://leopard.in.ua/#development=1">http://leopard.in.ua/#development=1</a>;</li>
<li>Open the Chrome DevTools console and check for validation errors;</li>
</ul>
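<p>If you want to check many pages, the validation URL from the steps above can be generated instead of typed by hand. This is a small sketch using the standard <code class="language-plaintext highlighter-rouge">urllib.parse</code> module; the function name <code class="language-plaintext highlighter-rouge">with_amp_validator</code> is an assumption of this example, and any existing fragment in the URL is replaced.</p>

```python
from urllib.parse import urlsplit, urlunsplit


def with_amp_validator(url: str) -> str:
    """Return the URL with its fragment set to development=1."""
    parts = urlsplit(url)
    # Replace whatever fragment was there; the rest of the URL is kept as is.
    return urlunsplit(parts._replace(fragment="development=1"))
```

<p>The resulting URLs can then be opened in the browser one by one while watching the DevTools console.</p>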
<h1 id="migration-leopardinua-to-amp">Migrating leopard.in.ua to AMP</h1>
<p>I decided to migrate this website to AMP. It runs on top of <a href="https://jekyllrb.com/">Jekyll</a>, and this is what I changed in it:</p>
<ul>
<li>Added all the needed HTML tags and attributes and removed all JS code from the pages (except the AMP scripts);</li>
<li>Changed <code class="language-plaintext highlighter-rouge">&lt;img&gt;</code> tags to <code class="language-plaintext highlighter-rouge">&lt;amp-img&gt;</code> and provided width and height attributes for them;</li>
<li>Inlined the CSS in <code class="language-plaintext highlighter-rouge">&lt;style amp-custom&gt;</code>. For this I used a new Jekyll feature that compiles scss/sass files on the fly:</li>
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html">{% capture include_to_scssify %}{% include sass/styles.scss %}{% endcapture %}
<span class="nt"><style </span><span class="na">amp-custom</span><span class="nt">></span><span class="p">{</span><span class="err">{</span> <span class="err">include_to_scssify</span> <span class="err">|</span> <span class="err">scssify</span> <span class="p">}</span><span class="err">}</span><span class="nt"></style></span>
</code></pre></figure>
<p>Note that for this to work the scss/sass files had to be moved into the <code class="language-plaintext highlighter-rouge">_includes</code> directory.</p>
<ul>
<li>Brought back the Google Analytics script using the <code class="language-plaintext highlighter-rouge">amp-analytics</code> component. Example:</li>
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><amp-analytics</span> <span class="na">type=</span><span class="s">"googleanalytics"</span> <span class="na">id=</span><span class="s">"googleAnalytics"</span><span class="nt">></span>
<span class="nt"><script </span><span class="na">type=</span><span class="s">"application/json"</span><span class="nt">></span>
<span class="p">{</span>
<span class="dl">"</span><span class="s2">vars</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">"</span><span class="s2">account</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">{{ site.analytics.google_tracking_id }}</span><span class="dl">"</span>
<span class="p">},</span>
<span class="dl">"</span><span class="s2">triggers</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">"</span><span class="s2">trackPageview</span><span class="dl">"</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">"</span><span class="s2">on</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">visible</span><span class="dl">"</span><span class="p">,</span>
<span class="dl">"</span><span class="s2">request</span><span class="dl">"</span><span class="p">:</span> <span class="dl">"</span><span class="s2">pageview</span><span class="dl">"</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nt"></script></span>
<span class="nt"></amp-analytics></span>
</code></pre></figure>
<ul>
<li>Bring back the social buttons with the <code class="language-plaintext highlighter-rouge">amp-social-share</code> component. Example:</li>
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><amp-social-share</span>
<span class="na">type=</span><span class="s">"twitter"</span>
<span class="na">width=</span><span class="s">"40"</span>
<span class="na">height=</span><span class="s">"30"</span><span class="nt">></amp-social-share></span>
<span class="nt"><amp-social-share</span>
<span class="na">type=</span><span class="s">"facebook"</span>
<span class="na">width=</span><span class="s">"40"</span>
<span class="na">height=</span><span class="s">"30"</span>
<span class="na">data-param-app_id=</span><span class="s">"{{ site.sharing.facebook_app_id }}"</span><span class="nt">></amp-social-share></span>
<span class="nt"><amp-social-share</span>
<span class="na">type=</span><span class="s">"gplus"</span>
<span class="na">width=</span><span class="s">"40"</span>
<span class="na">height=</span><span class="s">"30"</span><span class="nt">></amp-social-share></span>
<span class="nt"><amp-social-share</span>
<span class="na">type=</span><span class="s">"email"</span>
<span class="na">width=</span><span class="s">"40"</span>
<span class="na">height=</span><span class="s">"30"</span><span class="nt">></amp-social-share></span>
</code></pre></figure>
<ul>
<li>Bringing back <a href="https://disqus.com/">Disqus</a> was a little tricky, because AMP doesn’t provide a component for it. I found a solution on GitHub that uses the <code class="language-plaintext highlighter-rouge">amp-iframe</code> component:</li>
</ul>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><amp-iframe</span>
<span class="na">height=</span><span class="s">"400"</span>
<span class="na">sandbox=</span><span class="s">"allow-scripts allow-same-origin allow-popups allow-popups-to-escape-sandbox"</span>
<span class="na">frameborder=</span><span class="s">"0"</span>
<span class="na">src=</span><span class="s">"https://tempest.services.disqus.com/engage-iframe/amp/?forum={{ site.comments.disqus_short_name }}&amp;disqus_url={{ page.url | prepend: site.baseurl | prepend: site.url | cgi_escape }}"</span>
<span class="nt">></span>
<span class="nt"><div</span> <span class="na">placeholder</span> <span class="na">class=</span><span class="s">"disqus-placeholder__wrap"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"disqus-placeholder"</span><span class="nt">></span>
<span class="nt"><svg</span> <span class="na">version=</span><span class="s">"1.1"</span> <span class="na">xmlns=</span><span class="s">"http://www.w3.org/2000/svg"</span> <span class="na">xmlns:xlink=</span><span class="s">"http://www.w3.org/1999/xlink"</span> <span class="na">width=</span><span class="s">"1024"</span> <span class="na">height=</span><span class="s">"1024"</span> <span class="na">viewBox=</span><span class="s">"0 0 1024 1024"</span> <span class="na">class=</span><span class="s">"disqus-placeholder__svg"</span><span class="nt">><path</span> <span class="na">d=</span><span class="s">"M524.456259,1012.5 C404.195712,1012.5 294.23718,968.012444 209.221899,894.419296 L0,923.35537 L80.8289496,721.399852 C52.6676835,658.493741 36.8688345,588.659481 36.8688345,515 C36.8688345,240.254704 255.169612,17.5 524.456259,17.5 C793.721065,17.5 1012,240.254704 1012,515 C1012,789.796889 793.728345,1012.5 524.456259,1012.5 L524.456259,1012.5 Z M790.685065,520.577519 L790.685065,519.191889 C790.685065,375.631815 690.679079,273.264741 518.245928,273.264741 L332.008806,273.264741 L332.008806,770.764741 L515.48659,770.764741 C689.259367,770.772111 790.685065,664.130222 790.685065,520.577519 L790.685065,520.577519 L790.685065,520.577519 Z M520.29905,648.534519 L465.825784,648.534519 L465.825784,395.531815 L520.29905,395.531815 C600.305295,395.531815 653.409813,441.707185 653.409813,521.344037 L653.409813,522.729667 C653.409813,603.037222 600.305295,648.534519 520.29905,648.534519 L520.29905,648.534519 Z"</span><span class="nt">></path></svg></span>
LOADING DISCUSSION
<span class="nt"></div></span>
<span class="nt"></div></span>
<span class="nt"></amp-iframe></span>
</code></pre></figure>
<h1 id="results">Results</h1>
<p>After all these changes, it is time to check the results. Page without AMP:</p>
<p><a href="/assets/images/web/amp/without_amp.png"><img src="/assets/images/web/amp/without_amp.png" alt="without_amp" title="without_amp" class="aligncenter size-full wp-image-1950" /></a></p>
<p>With AMP:</p>
<p><a href="/assets/images/web/amp/with_amp.png"><img src="/assets/images/web/amp/with_amp.png" alt="with_amp" title="with_amp" class="aligncenter size-full wp-image-1950" /></a></p>
<p>The results are impressive, even for a simple site like this without much bloat. The Start Render time drops from 2.091 seconds to 0.793 seconds, the overall page weight plummets (from 493kb to 231kb), and so does the number of resources loaded (from 31 to 11). Very nice!</p>
<h1 id="summary">Summary</h1>
<p>This article describes what AMP is and how it can help to build fast-rendering web pages for static content. This technology can be used for news portals, blogs and similar websites, where static content is the main resource for customers. You can check who <a href="https://www.ampproject.org/who/">already supports AMP</a> (this site does too). I have not covered every use of AMP, especially Google AMP Cache, which can speed up AMP pages even more. In that case, though, it is better to create separate AMP pages and link to them through the AMP Cache with a <code class="language-plaintext highlighter-rouge"><link rel="amphtml" href="https://cdn.ampproject.org/c/s/YOUR_AMP_PAGE"></code> tag inside the non-AMP page.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Improving security of your web applications with the Content Security Policy2015-10-13T00:00:00+00:00http://leopard.in.ua/2015/10/13/content-security-policy<p>Hello my dear friends.</p>
<p>Today we will talk about Content Security Policy and how it can help you improve the security of your web applications.</p>
<h1 id="what-is-content-security-policy">What is Content Security Policy?</h1>
<p><a href="https://en.wikipedia.org/wiki/Content_Security_Policy">Content Security Policy (CSP)</a> is an added layer of security that helps to detect and mitigate certain types of attacks, including <a href="https://en.wikipedia.org/wiki/Cross-site_scripting">Cross Site Scripting (XSS)</a> and data injection attacks. These attacks are used for everything from data theft to site defacement or distribution of malware.</p>
<h1 id="usage-example">Usage example</h1>
<p>Many web applications do not need CSP, as long as they do not store and display HTML/CSS data entered by users. But when your web application has to show custom user content (HTML pages with CSS, files, etc.), you need a good filter engine that removes any inline JavaScript code (<code class="language-plaintext highlighter-rouge"><script></code> tags, <code class="language-plaintext highlighter-rouge">onclick</code> attributes on <code class="language-plaintext highlighter-rouge"><a></code> links, etc.). For example, if you build a web mail client, you cannot strip all HTML tags from an email, because the email will be broken and the customer will not be happy to read it. So you need another way to rule out any possibility of attackers injecting inline JavaScript through the HTML of an email.</p>
<h1 id="how-to-use-it">How to use it</h1>
<p>Instead of blindly trusting everything that a server delivers, CSP defines the <code class="language-plaintext highlighter-rouge">Content-Security-Policy</code> HTTP header, which allows you to create a whitelist of sources of trusted content and instructs the browser to execute or render only resources from those sources. Even if an attacker can find a hole through which to inject a script, the script won’t match the whitelist, and therefore won’t be executed.</p>
<p>For example, if we trust <code class="language-plaintext highlighter-rouge">cdn.example.com</code> to deliver valid code, and we trust ourselves to do the same, let’s define a policy that only allows script to execute when it comes from one of those two sources:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">Content-Security-Policy: script-src <span class="s1">'self'</span> cdn.example.com</code></pre></figure>
<p>This header can contain the following directives:</p>
<table class="describe-table">
<tr>
<th class="w180">Directive</th>
<th class="w200">Example Value</th>
<th>Description</th>
</tr>
<tr>
<td class="center">default-src</td>
<td class="center">'self' cdn.example.com</td>
<td>The "default-src" is the default policy for loading content such as JavaScript, images, CSS, fonts, AJAX requests, frames and HTML5 media. See the <b>Source List</b> for possible values</td>
</tr>
<tr>
<td class="center">script-src</td>
<td class="center">'self' js.example.com</td>
<td>Defines valid sources of JavaScript</td>
</tr>
<tr>
<td class="center">style-src</td>
<td class="center">'self' css.example.com</td>
<td>Defines valid sources of stylesheets</td>
</tr>
<tr>
<td class="center">img-src</td>
<td class="center">'self' img.example.com</td>
<td>Defines valid sources of images</td>
</tr>
<tr>
<td class="center">connect-src</td>
<td class="center">'self'</td>
<td>Applies to XMLHttpRequest (AJAX), WebSocket or EventSource. If it is not allowed, the browser emulates a 400 HTTP status code</td>
</tr>
<tr>
<td class="center">font-src</td>
<td class="center">font.example.com</td>
<td>Defines valid sources of fonts</td>
</tr>
<tr>
<td class="center">object-src</td>
<td class="center">'self'</td>
<td>Defines valid sources of plugins, eg <object>, <embed> or <applet></td>
</tr>
<tr>
<td class="center">media-src</td>
<td class="center">media.example.com</td>
<td>Defines valid sources of audio and video, eg HTML5 <audio>, <video> elements</td>
</tr>
<tr>
<td class="center">child-src (old version frame-src)</td>
<td class="center">'self'</td>
<td>Defines valid sources for loading frames</td>
</tr>
<tr>
<td class="center">sandbox</td>
<td class="center">allow-forms allow-scripts</td>
<td>Enables a sandbox for the requested resource, similar to the iframe sandbox attribute. The sandbox applies a same-origin policy and blocks popups, plugins and script execution. You can keep the sandbox value empty to keep all restrictions in place, or add values: allow-forms, allow-same-origin, allow-scripts and allow-top-navigation</td>
</tr>
<tr>
<td class="center">report-uri</td>
<td class="center">/report</td>
<td>Instructs the browser to POST reports of policy failures to this URI. You can also append "-Report-Only" to the HTTP header name to instruct the browser to only send reports (it does not block anything)</td>
</tr>
</table>
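<p>Since the header value is just a list of directives separated by semicolons, each followed by its space-separated sources, it can be assembled from a plain hash. The following is a minimal sketch; the <code class="language-plaintext highlighter-rouge">build_csp</code> helper is a hypothetical name used only for illustration and is not part of Rails or any gem:</p>

```ruby
# Hypothetical helper (not a Rails or gem API): joins a Hash of
# directive => sources into a single Content-Security-Policy value.
def build_csp(directives)
  directives.map do |name, sources|
    # Array() lets a directive be given as one string or a list of strings.
    "#{name} #{Array(sources).join(' ')}"
  end.join('; ')
end

policy = build_csp(
  'default-src' => "'self'",
  'script-src'  => ["'self'", 'cdn.example.com'],
  'img-src'     => ['*', 'data:']
)

puts "Content-Security-Policy: #{policy}"
```

<p>Keeping the directives in a hash makes it easy to vary them per environment before they are joined into the final header.</p>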
<h2 id="source-list">Source List</h2>
<p>All of the directives that end with “-src” support similar values known as a source list. Multiple source list values can be space separated, with the exception of “*” and “none”, each of which should be the only value.</p>
<table class="describe-table">
<tr>
<th class="w180">Source Value</th>
<th class="w200">Example</th>
<th>Description</th>
</tr>
<tr>
<td class="center">*</td>
<td class="center">img-src *</td>
<td>Wildcard, allows anything</td>
</tr>
<tr>
<td class="center">'none'</td>
<td class="center">object-src 'none'</td>
<td>Prevents loading resources from any source</td>
</tr>
<tr>
<td class="center">'self'</td>
<td class="center">script-src 'self'</td>
<td>Allows loading resources from the same origin (same scheme, host and port)</td>
</tr>
<tr>
<td class="center">data:</td>
<td class="center">img-src 'self' data:</td>
<td>Allows loading resources via the data scheme (eg Base64 encoded images)</td>
</tr>
<tr>
<td class="center">domain.example.com</td>
<td class="center">img-src img.example.com</td>
<td>Allows loading resources from the specified domain name</td>
</tr>
<tr>
<td class="center">*.example.com</td>
<td class="center">img-src *.example.com</td>
<td>Allows loading resources from any subdomain under example.com</td>
</tr>
<tr>
<td class="center">https:</td>
<td class="center">img-src https:</td>
<td>Allows loading resources only over HTTPS on any domain</td>
</tr>
<tr>
<td class="center">'unsafe-inline'</td>
<td class="center">script-src 'unsafe-inline'</td>
<td>Allows use of inline source elements such as style attribute, onclick, or script tag bodies (depends on the context of the source it is applied to)</td>
</tr>
<tr>
<td class="center">'unsafe-eval'</td>
<td class="center">script-src 'unsafe-eval'</td>
<td>Allows unsafe dynamic code evaluation such as JavaScript eval()</td>
</tr>
</table>
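<p>To get a feeling for how a browser evaluates a source list, here is a deliberately simplified sketch of host matching. It is an assumption-laden illustration, not the real algorithm from the CSP specification: it only looks at host names and ignores schemes, ports, paths, <code class="language-plaintext highlighter-rouge">data:</code> and the <code class="language-plaintext highlighter-rouge">'unsafe-*'</code> keywords. The <code class="language-plaintext highlighter-rouge">host_allowed?</code> method is a hypothetical name:</p>

```ruby
# Simplified, illustrative host matching for a CSP source list.
# Ignores schemes, ports, paths, data: and the 'unsafe-*' keywords.
def host_allowed?(host, sources, own_host:)
  sources.any? do |source|
    case source
    when "'none'" then false
    when '*'      then true
    when "'self'" then host == own_host
    else
      if source.start_with?('*.')
        # "*.example.com" matches subdomains only, not example.com itself
        host.end_with?(source[1..-1])
      else
        host == source
      end
    end
  end
end

sources = ["'self'", '*.example.com']
puts host_allowed?('img.example.com', sources, own_host: 'app.test')
puts host_allowed?('evil.test', sources, own_host: 'app.test')
```

<p>The first call matches the wildcard entry, while the second host matches nothing in the list and would be blocked.</p>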
<h1 id="usage-example-1">Usage example</h1>
<p>As you can see, we can combine all these values in the “Content Security Policy” header and create a very flexible rule for our app. Let’s create a little example. I am using a Rails app with the <a href="https://github.com/railsware/global">global gem</a> to make it work with “Content Security Policy”. First I create a global yml file with the configuration (<code class="language-plaintext highlighter-rouge">config/global/content_security_policy.yml</code>):</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">default</span><span class="pi">:</span>
<span class="na">enabled</span><span class="pi">:</span> <span class="no">false</span>
<span class="na">default_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">*"</span>
<span class="na">script_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'</span><span class="nv"> </span><span class="s">localhost:3000</span><span class="nv"> </span><span class="s">localhost:9292"</span>
<span class="na">object_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'"</span>
<span class="na">style_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'</span><span class="nv"> </span><span class="s">'unsafe-inline'</span><span class="nv"> </span><span class="s">'unsafe-eval'"</span>
<span class="na">img_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">*</span><span class="nv"> </span><span class="s">data:"</span>
<span class="na">media_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'none'"</span>
<span class="na">child_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'"</span>
<span class="na">font_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'</span><span class="nv"> </span><span class="s">data:"</span>
<span class="na">connect_src</span><span class="pi">:</span> <span class="s2">"</span><span class="s">'self'</span><span class="nv"> </span><span class="s">ws://localhost:9292"</span>
<span class="na">development</span><span class="pi">:</span>
<span class="na">enabled</span><span class="pi">:</span> <span class="no">true</span></code></pre></figure>
<p>And use <code class="language-plaintext highlighter-rouge">default_headers</code> inside the Rails app (<code class="language-plaintext highlighter-rouge">config/application.rb</code>):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="s1">'../boot'</span><span class="p">,</span> <span class="kp">__FILE__</span><span class="p">)</span>
<span class="nb">require</span> <span class="s1">'rails/all'</span>
<span class="nb">require</span> <span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="s1">'../../lib/loaderio_redis_config'</span><span class="p">,</span> <span class="kp">__FILE__</span><span class="p">)</span>
<span class="no">Bundler</span><span class="p">.</span><span class="nf">require</span><span class="p">(</span><span class="ss">:default</span><span class="p">,</span> <span class="no">Rails</span><span class="p">.</span><span class="nf">env</span><span class="p">)</span>
<span class="c1"># global initialize</span>
<span class="no">Global</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="n">config</span><span class="p">.</span><span class="nf">environment</span> <span class="o">=</span> <span class="no">Rails</span><span class="p">.</span><span class="nf">env</span><span class="p">.</span><span class="nf">to_s</span>
<span class="n">config</span><span class="p">.</span><span class="nf">config_directory</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="s1">'../global'</span><span class="p">,</span> <span class="kp">__FILE__</span><span class="p">)</span>
<span class="n">config</span><span class="p">.</span><span class="nf">namespace</span> <span class="o">=</span> <span class="s2">"Global"</span>
<span class="k">end</span>
<span class="k">module</span> <span class="nn">YourCoolApp</span>
<span class="k">class</span> <span class="nc">Application</span> <span class="o"><</span> <span class="no">Rails</span><span class="o">::</span><span class="no">Application</span>
<span class="n">secure_headers</span> <span class="o">=</span> <span class="p">{</span>
<span class="s1">'X-Frame-Options'</span> <span class="o">=></span> <span class="s1">'SAMEORIGIN'</span><span class="p">,</span>
<span class="s1">'X-XSS-Protection'</span> <span class="o">=></span> <span class="s1">'1; mode=block'</span><span class="p">,</span>
<span class="s1">'X-Content-Type-Options'</span> <span class="o">=></span> <span class="s1">'nosniff'</span>
<span class="p">}</span>
<span class="k">if</span> <span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">enabled?</span>
<span class="n">secure_headers</span><span class="p">.</span><span class="nf">merge!</span><span class="p">(</span>
<span class="s1">'Content-Security-Policy'</span> <span class="o">=></span> <span class="s2">"default-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">default_src</span><span class="si">}</span><span class="s2">; script-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">script_src</span><span class="si">}</span><span class="s2">; object-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">object_src</span><span class="si">}</span><span class="s2">; style-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">style_src</span><span class="si">}</span><span class="s2">; img-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">img_src</span><span class="si">}</span><span class="s2">; media-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">media_src</span><span class="si">}</span><span class="s2">; child-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">child_src</span><span class="si">}</span><span class="s2">; frame-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span 
class="nf">child_src</span><span class="si">}</span><span class="s2">; font-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">font_src</span><span class="si">}</span><span class="s2">; connect-src </span><span class="si">#{</span><span class="no">Global</span><span class="p">.</span><span class="nf">content_security_policy</span><span class="p">.</span><span class="nf">connect_src</span><span class="si">}</span><span class="s2">"</span>
<span class="p">)</span>
<span class="k">end</span>
<span class="n">config</span><span class="p">.</span><span class="nf">action_dispatch</span><span class="p">.</span><span class="nf">default_headers</span> <span class="o">=</span> <span class="n">secure_headers</span>
<span class="o">...</span></code></pre></figure>
<p>After restarting the Rails app you should see the “Content Security Policy” header in every HTTP response from your app. If someone tries to inject JS code into your app (for example, <code class="language-plaintext highlighter-rouge">onclick</code> in a link), the browser will report a JS error like this:</p>
<p><a href="/assets/images/security/csp/csp1.png"><img src="/assets/images/security/csp/csp1.png" alt="CSP error" title="CSP error" class="aligncenter size-full" /></a></p>
<h1 id="subresource-integrity">Subresource Integrity</h1>
<p>Many sites use a content delivery network (CDN) to serve static assets such as JavaScript, CSS, and images to their users. The CDN makes web browsing faster by delivering assets from data centers that are geographically close to the end user and by using hardware and software that is optimized for quickly serving static assets. The compromise of a major CDN could be devastating to the security of the hundreds of thousands of sites that depend on it. If our CDN were to be compromised, it could be used to serve malicious JavaScript to all our users, rendering our many XSS mitigations and transport security useless. Content Security Policy is invaluable for protecting against traditional XSS attacks, but it provides no defense against an attacker who can control assets served from whitelisted sources.</p>
<p>To prevent this type of attack, you can use <a href="http://www.w3.org/TR/SRI/">Subresource Integrity</a> browser technology. The website author includes an <code class="language-plaintext highlighter-rouge">integrity</code> attribute on JavaScript and CSS tags, specifying the cryptographic digest of the resource being loaded from the third party. When the browser fetches the resource, it computes the file’s digest and compares it with the value from the <code class="language-plaintext highlighter-rouge">integrity</code> attribute. If the values match, the resource is loaded. Otherwise, the browser refuses to load the resource. Example:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><script </span><span class="na">src=</span><span class="s">"/assets/application-asdhhwheruhsjkadlslkdl.js"</span> <span class="na">integrity=</span><span class="s">"sha256-TvVUHzSfftWg1rcfL6TIJ0XKEGrgLyEq6lsd29qs="</span><span class="nt">></script></span></code></pre></figure>
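<p>The comparison that the browser performs can be sketched in a few lines of Ruby. This is only an illustration of the idea, assuming a SHA-256 digest; the <code class="language-plaintext highlighter-rouge">sri_value</code> and <code class="language-plaintext highlighter-rouge">integrity_ok?</code> helpers are hypothetical names, and real browsers also accept other hash algorithms:</p>

```ruby
require 'digest'

# Hypothetical helper: builds an integrity value of the form
# "sha256-<base64 of the SHA-256 digest>" for the given content.
def sri_value(content)
  "sha256-#{Digest::SHA256.base64digest(content)}"
end

# Mimics the browser check: recompute the digest of what was fetched
# and compare it with the value from the integrity attribute.
def integrity_ok?(fetched_content, integrity_attribute)
  sri_value(fetched_content) == integrity_attribute
end

original = "console.log('hello');"
integrity = sri_value(original)

puts integrity_ok?(original, integrity)               # unchanged file
puts integrity_ok?(original + 'alert(1);', integrity) # tampered file
```

<p>Any change to the file on the CDN, even a single byte, produces a different digest, so the tampered file would be rejected.</p>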
<p>If you are using Rails with the sprockets-rails gem (version >= 3), you can pass the <code class="language-plaintext highlighter-rouge">integrity</code> option to your <code class="language-plaintext highlighter-rouge">javascript_include_tag</code> helper to activate this feature:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">javascript_include_tag</span> <span class="ss">:application</span><span class="p">,</span> <span class="ss">integrity: </span><span class="kp">true</span>
<span class="c1"># => "<script src="/assets/application.js" integrity="sha256-TvVUHzSfftWg1rcfL6TIJ0XKEGrgLyEq6lEpcmrG9qs="></script>"</span></code></pre></figure>
<p>You can read more about Subresource Integrity in <a href="http://githubengineering.com/subresource-integrity/">this article</a>.</p>
<h1 id="browser-support">Browser Support</h1>
<p>CSP is designed to be fully backward compatible; browsers that don’t support it still work with servers that implement it, and vice-versa. Browsers that don’t support CSP simply ignore it, functioning as usual, defaulting to the standard same-origin policy for web content. If the site doesn’t offer the CSP header, browsers likewise use the standard same-origin policy.</p>
<table class="describe-table">
<tr>
<th>Header</th>
<th>Chrome</th>
<th>Firefox</th>
<th>Safari</th>
<th>Internet Explorer/Edge</th>
</tr>
<tr>
<td>Content-Security-Policy (1.0)</td>
<td>25+</td>
<td>23+</td>
<td>7+</td>
<td>-</td>
</tr>
<tr>
<td>X-Content-Security-Policy</td>
<td>-</td>
<td>4.0+</td>
<td>-</td>
<td>10+ (limited)</td>
</tr>
<tr>
<td>X-Webkit-CSP</td>
<td>14+</td>
<td>-</td>
<td>6+</td>
<td>-</td>
</tr>
</table>
<p>As you can see in this table, CSP has good support in major browsers. Internet Explorer 10-11 and Edge have partial support for CSP via the <code class="language-plaintext highlighter-rouge">X-Content-Security-Policy</code> header, but even then they only appear to support the optional “sandbox” directive. More info on <a href="http://caniuse.com/#feat=contentsecuritypolicy">caniuse</a>.</p>
<h1 id="summary">Summary</h1>
<p>Content Security Policy can provide an additional security layer for your apps against XSS and data injection attacks (XSS ranks third in the 2013 OWASP list of key risks for web applications).</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
PostgreSQL Indexes2015-04-13T00:00:00+00:00http://leopard.in.ua/2015/04/13/postgresql-indexes<p>Hello my dear friends. This is my new article in which I would like to tell you about PostgreSQL indexes.</p>
<h1 id="first-of-all-what-is-index">First of all, what is an index?</h1>
<p>To begin with, let us remind ourselves what a table in a relational database is.</p>
<p>A table in a relational database is a list of rows, and each row consists of cells. The number of cells in a row and their types match the column scheme of the table.
Each row in this list has a consecutive sequence number, the RowId. So we can consider a table as a list of pairs: (RowId, row).</p>
<p>Indexes are the inverse relation: (row, RowId). In an index, a row must contain at least one cell. Obviously, if a row is not unique (there are two identical rows), the relation maps the row to a list of RowIds.</p>
<p><a href="/assets/images/postgresql/pg_indexes/pg_indexes1.png"><img src="/assets/images/postgresql/pg_indexes/pg_indexes1.png" alt="Indexes" title="Indexes" class="aligncenter size-full" /></a></p>
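<p>The two relations described above can be sketched with plain Ruby data structures. This is only a toy model, assuming rows with a single cell; <code class="language-plaintext highlighter-rouge">build_index</code> is a hypothetical helper and has nothing to do with how PostgreSQL stores data on disk:</p>

```ruby
# A table as a list of (RowId, row) pairs; each row has one cell here.
table = [[1, 'alice'], [2, 'bob'], [3, 'alice']]

# Hypothetical helper: inverts the relation into row value => list of RowIds.
def build_index(table)
  table.each_with_object({}) do |(row_id, value), index|
    (index[value] ||= []) << row_id
  end
end

index = build_index(table)

p index['alice'] # both RowIds, because the value is not unique
p index['bob']
```

<p>A lookup by value now goes straight to the RowIds instead of scanning the whole list of rows.</p>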
<p>An index is an additional data structure that can help us with:</p>
<ul>
<li>Data search - all indexes support searching for values by equality. Some indexes also support prefix search (like “abc%”) and arbitrary range search</li>
<li>Optimizer - B-Tree and R-Tree indexes represent a histogram of arbitrary precision</li>
<li>Join - indexes can be used for merge and index join algorithms</li>
<li>Relation - indexes can be used for except/intersect operations</li>
<li>Aggregations - indexes can efficiently calculate some aggregate functions (count, min, max, etc)</li>
<li>Grouping - indexes can efficiently calculate arbitrary grouping and aggregate functions (sort-group algorithm)</li>
</ul>
<h1 id="postgresql-index-types">PostgreSQL Index Types</h1>
<p>There are many types of indexes in PostgreSQL, as well as different ways to use them. Let’s review all these indexes.</p>
<h3 id="b-tree-index">B-Tree index</h3>
<p>B-Tree is the default index that you get when you do <code class="language-plaintext highlighter-rouge">CREATE INDEX</code>. Virtually all databases will have some B-Tree indexes. The B stands for Balanced (Boeing/Bayer/Balanced/Broad/Bushy-Tree), and the idea is that the amount of data on both sides of the tree is roughly the same. Therefore the number of levels that must be traversed to find rows is always approximately the same. B-Tree indexes can be used efficiently for equality and range queries. They can operate against all datatypes, and can also be used to retrieve NULL values. B-Trees are designed to work very well with caching, even when only partially cached.</p>
<p><a href="/assets/images/postgresql/pg_indexes/btree1.gif"><img src="/assets/images/postgresql/pg_indexes/btree1.gif" alt="B-Tree" title="B-Tree" class="aligncenter size-full" /></a></p>
<p>Advantages:</p>
<ul>
<li>Keep the data sorted</li>
<li>Support searching with unary and binary predicates</li>
<li>Allow estimating cardinality (the number of entries) for the entire index (and therefore the table) or for a range, with arbitrary precision and without scanning all the data</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Building them requires a full sort of the (row, RowId) pairs (a slow operation)</li>
<li>Take up a lot of disk space. An index on unique “Integers” weighs about twice as much as the column itself (because the RowId has to be stored as well)</li>
<li>Writes constantly unbalance the tree, so data ends up stored sparsely and access time grows with the amount of data on disk. That is why B-Tree indexes require monitoring and periodic rebuilding</li>
</ul>
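<p>The reason sorted data helps with both equality and range queries can be shown with a sorted array and binary search. This sketch only stands in for the leaf level of a B-Tree and does not model pages or balancing; <code class="language-plaintext highlighter-rouge">range_lookup</code> is a hypothetical helper:</p>

```ruby
# Sorted (value, RowId) pairs stand in for the leaf level of a B-Tree.
entries = [[10, 3], [20, 1], [30, 4], [40, 2]]

# Hypothetical helper: binary search for the first value >= low,
# then walk forward while values stay <= high.
def range_lookup(entries, low, high)
  start = entries.bsearch_index { |value, _row_id| value >= low }
  return [] if start.nil?

  entries[start..-1]
    .take_while { |value, _row_id| value <= high }
    .map { |_value, row_id| row_id }
end

p range_lookup(entries, 15, 35) # RowIds for values 20 and 30
p range_lookup(entries, 30, 30) # equality is a range with equal bounds
```

<p>Finding the starting point takes a logarithmic number of steps, and the rest of the range is read sequentially, which is what makes such queries cheap.</p>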
<h3 id="r-tree-index">R-Tree index</h3>
<p>An R-Tree (rectangle tree) index stores pairs of numeric (X, Y) values (for example, coordinates). R-Tree is very similar to B-Tree. The only difference is the information written to the intermediate pages of the tree. In a B-Tree node, the i-th value is the maximum of the i-th subtree. In an R-Tree, it is the minimum rectangle that encloses all the rectangles of the child nodes. Details can be seen in the figure:</p>
<p><a href="/assets/images/postgresql/pg_indexes/pg_indexes2.jpg"><img src="/assets/images/postgresql/pg_indexes/pg_indexes2.jpg" alt="R-Tree" title="R-Tree" class="aligncenter size-full" /></a></p>
<p>Advantages:</p>
<ul>
<li>Search for arbitrary regions and points</li>
<li>Allows us to estimate the number of points in a region without a full data scan</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Significant redundancy in the data storage</li>
<li>Slow update</li>
</ul>
<p>In general, the pros and cons are very similar to those of B-Tree.</p>
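<p>The “minimum rectangle that encloses all the rectangles of the child” can be written down directly. The following is a small sketch of that single idea, not an R-Tree implementation; the <code class="language-plaintext highlighter-rouge">Rect</code> struct and <code class="language-plaintext highlighter-rouge">bounding_rect</code> helper are hypothetical names:</p>

```ruby
# A rectangle described by its lower-left and upper-right corners.
Rect = Struct.new(:min_x, :min_y, :max_x, :max_y) do
  # True when the two rectangles overlap or touch.
  def intersects?(other)
    min_x <= other.max_x && other.min_x <= max_x &&
      min_y <= other.max_y && other.min_y <= max_y
  end
end

# Hypothetical helper: the minimum rectangle enclosing all child rectangles,
# which is what an intermediate R-Tree page stores for a subtree.
def bounding_rect(rects)
  Rect.new(rects.map(&:min_x).min, rects.map(&:min_y).min,
           rects.map(&:max_x).max, rects.map(&:max_y).max)
end

children = [Rect.new(0, 0, 2, 2), Rect.new(1, 1, 5, 3)]
parent = bounding_rect(children)

p parent.to_a
# A search region that misses the parent cannot hit any child,
# so the whole subtree can be skipped.
p parent.intersects?(Rect.new(6, 6, 7, 7))
```

<p>Checking the query region against the parent rectangle first is what lets a search discard entire subtrees without visiting their points.</p>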
<h3 id="hash-index">Hash index</h3>
<p>A Hash index doesn’t store the values themselves, but their hashes. This reduces the size of the index (and therefore speeds up processing) for large indexed fields. A query that uses a Hash index does not compare the field value itself, but the hash of the searched value with the stored hashes.</p>
<p>Because hash functions are non-linear, such an index cannot be sorted. This makes it impossible to use greater/less comparisons and “IS NULL” with this index. In addition, since hashes are not unique, matching hashes have to be handled with collision resolution methods.</p>
<p><a href="/assets/images/postgresql/pg_indexes/hash_indexes.png"><img src="/assets/images/postgresql/pg_indexes/hash_indexes.png" alt="Hash indexes" title="Hash indexes" class="aligncenter size-full wp-image-1950" /></a></p>
<p>Advantages:</p>
<ul>
<li>Very fast search O(1)</li>
<li>Stability - the index does not need to be rebuilt</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Hash is very sensitive to collisions. In the case of a “bad” data distribution, most of the entries will be concentrated in a few buckets, and in fact the search will happen through collision resolution.</li>
</ul>
<p>As you can see, Hash indexes are only useful for equality comparisons. Before PostgreSQL 10 you pretty much never wanted to use them, since they were not transaction safe, needed to be manually rebuilt after crashes, and were not replicated to followers (all of this was fixed in PostgreSQL 10).</p>
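<p>To make the collision problem more concrete, here is a minimal Ruby sketch of a bucket-based hash index. It is a toy model under stated assumptions: <code class="language-plaintext highlighter-rouge">ToyHashIndex</code>, <code class="language-plaintext highlighter-rouge">build_index</code> and the sample names are made up for illustration, and the two hash functions (byte sum and string length) are chosen only to contrast a reasonable and a poor distribution.</p>

```ruby
# Toy hash index: an array of buckets, each holding [key, row_id] entries.
class ToyHashIndex
  attr_reader :buckets

  def initialize(bucket_count, &hash_fn)
    @bucket_count = bucket_count
    @hash_fn = hash_fn
    @buckets = Array.new(bucket_count) { [] }
  end

  def insert(key, row_id)
    bucket_for(key) << [key, row_id]
  end

  # Equality lookup: only one bucket is inspected, then collisions are
  # resolved by comparing the real keys inside that bucket.
  def lookup(key)
    bucket_for(key).select { |k, _| k == key }.map { |_, row_id| row_id }
  end

  private

  def bucket_for(key)
    @buckets[@hash_fn.call(key) % @bucket_count]
  end
end

NAMES = %w[anna bob carl dave emma fred gina hank].freeze

# Index every name with its position in NAMES as the "row id".
def build_index(bucket_count, &hash_fn)
  index = ToyHashIndex.new(bucket_count, &hash_fn)
  NAMES.each_with_index { |name, row_id| index.insert(name, row_id) }
  index
end

reasonable = build_index(8) { |key| key.sum }    # byte checksum
skewed     = build_index(8) { |key| key.length } # almost every name has 4 letters

puts reasonable.lookup('emma').inspect
puts skewed.buckets.map(&:size).inspect
```

<p>With the length-based function seven of the eight names land in one bucket, so a lookup degrades into scanning that bucket, which is the "bad" distribution described above.</p>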
<h3 id="bitmap-index">Bitmap index</h3>
<p>A bitmap index creates a separate bitmap (a sequence of 0s and 1s) for each possible value of the column, where each bit corresponds to a row with the indexed value. Bitmap indexes are optimal for data with few unique values (for example, a gender field).</p>
<p><a href="/assets/images/postgresql/pg_indexes/bitmap.png"><img src="/assets/images/postgresql/pg_indexes/bitmap.png" alt="Bitmap indexes" title="Bitmap indexes" class="aligncenter size-full" /></a></p>
<p>Advantages:</p>
<ul>
<li>Compact representation (small amount of disk space)</li>
<li>Fast reading and searching for the predicate “is”</li>
<li>Effective algorithms for packing masks (even more compact representation, than indexed data)</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>You cannot change the method of encoding values while the data is being updated. It follows that if the data distribution changes, the index has to be completely rebuilt</li>
</ul>
<p>PostgreSQL does not provide a persistent bitmap index, but it builds bitmaps in memory to combine multiple indexes. PostgreSQL scans each needed index and prepares a bitmap in memory giving the locations of table rows that are reported as matching that index’s conditions. The bitmaps are then ANDed and ORed together as needed by the query. Finally, the actual table rows are visited and returned.</p>
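<p>The ANDing and ORing of bitmaps can be sketched in a few lines of Ruby, using an Integer as the bit string. This is a simplified illustration, not PostgreSQL's in-memory format; the <code class="language-plaintext highlighter-rouge">PEOPLE</code> rows and both helper methods are hypothetical.</p>

```ruby
# Sample rows; the position in the array plays the role of the row location.
PEOPLE = [
  { gender: 'f', city: 'Kyiv' },
  { gender: 'm', city: 'Lviv' },
  { gender: 'f', city: 'Lviv' },
  { gender: 'm', city: 'Kyiv' },
  { gender: 'f', city: 'Kyiv' }
].freeze

# Build a bitmap where bit i is set when row i satisfies the given condition.
def bitmap_for(rows)
  rows.each_with_index.reduce(0) do |bits, (row, i)|
    yield(row) ? bits | (1 << i) : bits
  end
end

# Turn a bitmap back into the list of matching row positions.
def row_positions(bitmap, row_count)
  (0...row_count).select { |i| bitmap[i] == 1 }
end

female = bitmap_for(PEOPLE) { |row| row[:gender] == 'f' }
kyiv   = bitmap_for(PEOPLE) { |row| row[:city] == 'Kyiv' }

# gender = 'f' AND city = 'Kyiv'
puts row_positions(female & kyiv, PEOPLE.size).inspect
# gender = 'f' OR city = 'Kyiv'
puts row_positions(female | kyiv, PEOPLE.size).inspect
```

<p>Only after the bitmaps are combined are the actual rows visited, which is why two separate single-column indexes can serve one query with two conditions.</p>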
<h3 id="gist-index">GiST index</h3>
<p>Generalized Search Tree (GiST) indexes allow you to build general balanced tree structures, and can be used for operations beyond equality and range comparisons. The tree structure does not change: each non-leaf node still stores (value, page number) pairs, and the number of children equals the number of pairs in the node.</p>
<p>The essential difference lies in the organization of the key. B-Trees are tuned for range searches and store the maximum of the child subtree. R-Trees store a region on the coordinate plane. GiST lets non-leaf nodes store whatever information we consider essential for deciding whether the values we are interested in (those satisfying the predicate) are in the child subtree. The specific form of the stored information depends on the type of search that we wish to perform. Thus, by parameterizing the R-Tree and B-Tree with predicates and values, we automatically get an index specialized for the task (examples: PostGIS, pg_trgm, hstore, ltree, etc.). GiST indexes are used to index geometric data types, as well as for full-text search.</p>
<p>Advantages:</p>
<ul>
<li>Efficient search</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Large redundancy</li>
<li>A specialized implementation is necessary for each group of queries</li>
</ul>
<p>The remaining pros and cons are similar to those of B-Tree and R-Tree.</p>
<h3 id="gin-index">GIN index</h3>
<p>Generalized Inverted Indexes (GIN) are useful when an index must map many values to one row, whereas B-Tree indexes are optimized for when a row has a single key value. GINs are good for indexing array values as well as for implementing full-text search.</p>
<p><a href="/assets/images/postgresql/pg_indexes/fulltext-gist-vs-gin.png"><img src="/assets/images/postgresql/pg_indexes/fulltext-gist-vs-gin.png" alt="GIN index" title="GIN index" class="aligncenter size-full" /></a></p>
<p>Key features:</p>
<ul>
<li>Well suited for full-text search</li>
<li>Look for a full match (“is”, but not “less” or “more”).</li>
<li>Well suited for semi-structured data search</li>
<li>Allows you to perform several different searches (queries) in a single pass</li>
<li>Scales much better than GiST (supports large volumes of data)</li>
<li>Works well when elements recur frequently (and is therefore perfect for full-text search)</li>
</ul>
<h1 id="block-range-brin-index-95">Block Range (BRIN) Index (9.5+)</h1>
<p>BRIN stands for Block Range INdex; it stores metadata about a range of pages. At the moment this means the minimum and maximum values per block.</p>
<p>This results in an inexpensive index that occupies a very small amount of space, and can speed up queries in extremely large tables. This allows the index to determine which blocks are the only ones worth checking, and all others can be skipped. So if a 10GB table of orders contained rows that were generally in order of order date, a BRIN index on the order_date column would allow the majority of the table to be skipped rather than performing a full sequential scan. This will still be slower than a regular BTREE index on the same column, but with the benefits of being far smaller and requiring less maintenance.</p>
<p>You can read more about this index in <a href="http://pythonsweetness.tumblr.com/post/119568339102/block-range-brin-indexes-in-postgresql-95">this article</a>.</p>
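<p>The "minimum and maximum values per block" idea can be sketched in Ruby. This is a toy model that assumes a table is just an array of values in physical order; <code class="language-plaintext highlighter-rouge">brin_summary</code> and <code class="language-plaintext highlighter-rouge">candidate_ranges</code> are hypothetical helpers, not PostgreSQL functions.</p>

```ruby
# Summarize a "table" (values in physical order) into block ranges,
# keeping only [min, max] for each range.
def brin_summary(values, range_size)
  values.each_slice(range_size).map { |block| block.minmax }
end

# Numbers of the block ranges that may contain `target`.
# Every other range is skipped without being read.
def candidate_ranges(summary, target)
  summary.each_index.select do |i|
    min, max = summary[i]
    target >= min && target <= max
  end
end

# Values stored roughly in order (like order dates): one range to check.
ordered = (1..100).to_a
puts candidate_ranges(brin_summary(ordered, 10), 42).inspect

# Values stored in no particular order: the ranges overlap, little is skipped.
scattered = [5, 90, 12, 70, 33, 8, 95, 1]
puts candidate_ranges(brin_summary(scattered, 2), 40).inspect
```

<p>The second call shows why BRIN pays off mainly when the physical row order correlates with the indexed column: with scattered values most ranges still have to be read.</p>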
<h1 id="partial-indexes">Partial Indexes</h1>
<p>A partial index covers just a subset of a table’s data. It is an index with a <code class="language-plaintext highlighter-rouge">WHERE</code> clause. The idea is to increase the efficiency of the index by reducing its size. A smaller index takes less storage, is easier to maintain, and is faster to scan.</p>
<p>For example, suppose you log some information about network activity in a table, and very often you need to check the logs for a local IP range. You may want to create an index like so:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">access_log_client_ip_ix</span> <span class="k">ON</span> <span class="n">access_log</span> <span class="p">(</span><span class="n">client_ip</span><span class="p">)</span>
<span class="k">WHERE</span> <span class="p">(</span><span class="n">client_ip</span> <span class="o">></span> <span class="n">inet</span> <span class="s1">'192.168.100.0'</span> <span class="k">AND</span>
<span class="n">client_ip</span> <span class="o"><</span> <span class="n">inet</span> <span class="s1">'192.168.100.255'</span><span class="p">);</span></code></pre></figure>
<p>and a SQL query like this will use the index:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">access_log</span> <span class="k">WHERE</span> <span class="n">client_ip</span> <span class="o">=</span> <span class="s1">'192.168.100.45'</span><span class="p">;</span></code></pre></figure>
<p>This index will remain fairly small, and can also be used alongside other indexes in more complex queries that may require it.</p>
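<p>Why the partial index stays small can be shown with a rough Ruby model: only rows that satisfy the predicate get an index entry. The log entries and the <code class="language-plaintext highlighter-rouge">build_partial_index</code> helper are hypothetical; the range bounds are the ones from the <code class="language-plaintext highlighter-rouge">WHERE</code> clause above.</p>

```ruby
require 'ipaddr'

LOCAL_LOW  = IPAddr.new('192.168.100.0')
LOCAL_HIGH = IPAddr.new('192.168.100.255')

# Hypothetical log rows; the array position stands in for the row location.
ACCESS_LOG = [
  { client_ip: '192.168.100.45' },
  { client_ip: '8.8.8.8' },
  { client_ip: '192.168.100.7' },
  { client_ip: '10.0.0.1' },
  { client_ip: '192.168.100.45' }
].freeze

# Index only the rows matching the predicate of the partial index.
def build_partial_index(log)
  index = {}
  log.each_with_index do |entry, position|
    ip = IPAddr.new(entry[:client_ip])
    next unless ip > LOCAL_LOW && ip < LOCAL_HIGH
    (index[entry[:client_ip]] ||= []) << position
  end
  index
end

index = build_partial_index(ACCESS_LOG)
puts index.inspect
puts index.fetch('192.168.100.45', []).inspect
```

<p>Rows outside the local range never enter the index, so it does not grow with the rest of the traffic. The flip side is that a query for an address outside the range cannot use it.</p>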
<h1 id="expression-indexes">Expression Indexes</h1>
<p>Expression indexes are useful for queries that match on some function or modification of your data. Postgres allows you to index the result of that function so that searches become as efficient as searching by raw data values.</p>
<p>For example, suppose you very often search by the lowercased first letter of the <code class="language-plaintext highlighter-rouge">name</code> field. You may want to create an index like so:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">users_name_first_idx</span> <span class="k">ON</span> <span class="n">users</span> <span class="p">((</span><span class="k">lower</span><span class="p">(</span><span class="n">substr</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))));</span></code></pre></figure>
<p>and a SQL query like this will use the index:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">users</span> <span class="k">WHERE</span> <span class="k">lower</span><span class="p">(</span><span class="n">substr</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="o">=</span> <span class="s1">'a'</span><span class="p">;</span></code></pre></figure>
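<p>Conceptually, an expression index stores the result of the expression as the key, so the function does not have to be evaluated for every row at query time. A small Ruby sketch of that idea, with hypothetical sample data and helper names:</p>

```ruby
# Hypothetical rows of a users table.
USERS = [
  { id: 1, name: 'Alice' },
  { id: 2, name: 'bob' },
  { id: 3, name: 'anna' },
  { id: 4, name: 'Carl' }
].freeze

# Ruby counterpart of lower(substr(name, 1, 1)).
def first_letter_key(name)
  name[0, 1].downcase
end

# Precompute the expression once per row and group rows by its result.
def build_expression_index(users)
  users.group_by { |user| first_letter_key(user[:name]) }
end

index = build_expression_index(USERS)

# WHERE lower(substr(name, 1, 1)) = 'a' becomes a single key lookup.
puts index.fetch('a', []).map { |user| user[:id] }.inspect
```

<p>The lookup only works when the query uses exactly the same expression as the index, which is also how PostgreSQL decides whether an expression index is applicable.</p>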
<h1 id="unique-indexes">Unique Indexes</h1>
<p>A unique index guarantees that the table won’t have more than one row with the same value. It’s advantageous to create unique indexes for two reasons: data integrity and performance. Lookups on a unique index are generally very fast.</p>
<p>There is little distinction between unique indexes and unique constraints. Unique indexes can be thought of as lower level, since expression indexes and partial indexes cannot be created as unique constraints. Even partial unique indexes on expressions are possible.</p>
<h1 id="multi-column-indexes">Multi-column Indexes</h1>
<p>While Postgres has the ability to create multi-column indexes, it’s important to understand when it makes sense to do so. The Postgres query planner has the ability to combine and use multiple single-column indexes in a multi-column query by performing a bitmap index scan (see “Bitmap index” for more info). In general, you can create an index on every column that covers query conditions and in most cases Postgres will use them, so make sure to benchmark and justify the creation of a multi-column index before you create one. As always, indexes come with a cost, and multi-column indexes can only optimize the queries that reference the columns in the index in the same order, while multiple single-column indexes provide performance improvements to a larger number of queries.</p>
<p>However there are cases where a multi-column index clearly makes sense. An index on columns (a, b) can be used by queries containing <code class="language-plaintext highlighter-rouge">WHERE a = x AND b = y</code>, or queries using <code class="language-plaintext highlighter-rouge">WHERE a = x</code> only, but will not be used by a query using <code class="language-plaintext highlighter-rouge">WHERE b = y</code>. So if this matches the query patterns of your application, the multi-column index approach is worth considering. Also note that in this case creating an index on a alone would be redundant.</p>
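<p>The reason an index on (a, b) helps <code class="language-plaintext highlighter-rouge">WHERE a = x</code> but not <code class="language-plaintext highlighter-rouge">WHERE b = y</code> comes from the sort order of its entries. A minimal Ruby sketch, assuming the index is just a sorted array of pairs (the data and helper names are invented for the example):</p>

```ruby
# Toy multi-column index on (a, b): entries sorted by a, then by b.
INDEX_AB = [[2, 'z'], [1, 'y'], [3, 'y'], [1, 'x'], [2, 'x']].sort.freeze

# WHERE a = ?: binary search finds the first matching entry, then we read
# neighbouring entries while the leading column still matches.
def find_by_a(index, a)
  start = index.bsearch_index { |entry| entry[0] >= a }
  return [] if start.nil?
  index.drop(start).take_while { |entry| entry[0] == a }
end

# WHERE b = ?: entries with the same b are scattered across the index,
# so there is nothing to binary-search on and every entry is checked.
def find_by_b(index, b)
  index.select { |entry| entry[1] == b }
end

puts find_by_a(INDEX_AB, 2).inspect
puts find_by_b(INDEX_AB, 'y').inspect
```

<p>The same ordering explains why a separate index on <code class="language-plaintext highlighter-rouge">a</code> alone is redundant: the (a, b) index already keeps all entries for one value of <code class="language-plaintext highlighter-rouge">a</code> next to each other.</p>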
<h1 id="summary">Summary</h1>
<p>Indexes are a common way to enhance database performance. An index allows the database server to find and retrieve specific rows much faster than it could without one. But indexes also add overhead to the database system as a whole, so they should be used sensibly.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
SQL Joins Visualizer - build SQL JOIN between two tables by using of Venn diagrams2015-01-05T00:00:00+00:00http://leopard.in.ua/2015/01/05/sql-joins-visualizer<p>Hello my dear friends.</p>
<p>Today we will learn about SQL joins and my new little app, which helps to build and understand them.</p>
<h1 id="sql-join">SQL join</h1>
<p>A SQL join clause combines records from two or more tables in a database. It creates a set that can be saved as a table or used as it is. A JOIN is a means for combining fields from two tables by using values common to each. ANSI-standard SQL specifies five types of JOIN: INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER and CROSS. As a special case, a table (base table, view, or joined table) can JOIN to itself in a self-join.</p>
<h1 id="sql-joins-visualizer">SQL Joins Visualizer</h1>
<p>If you have tried to understand how joins work and constantly get confused about which join to use, you just need to use a new simple app - <a href="http://sql-joins.leopard.in.ua/">SQL Joins Visualizer</a>. It uses a Venn diagram to build a valid SQL join with an explanation. The application can work offline.</p>
<p><a href="http://sql-joins.leopard.in.ua/"><img src="/assets/images/sql/visualizer/sql_visyalizer.png" alt="" title="1" class="aligncenter size-full wp-image-1950" /></a></p>
<p>To select the type of join you need between two tables, click on the sectors of the Venn diagram. For example, if you want results that completely contain table A, you will see that it is sufficient to use “LEFT JOIN”. You will get “INNER JOIN” if your results need to include only the rows present in both A and B.</p>
<p>Of course, this application is <a href="https://github.com/le0pard/sql-joins-app">open source</a>.</p>
<h1 id="cross-join">CROSS join</h1>
<p>There’s also a cartesian product or cross join, which as far as I know, can’t be expressed as a Venn diagram:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">TableA</span>
<span class="k">CROSS</span> <span class="k">JOIN</span> <span class="n">TableB</span></code></pre></figure>
<p>This joins “everything to everything”, resulting in 4 x 4 = 16 rows, far more than we had in the original sets. If you do the math, you can see why this is a very dangerous join to run against large tables.</p>
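<p>The row-count arithmetic is easy to check with Ruby's <code class="language-plaintext highlighter-rouge">Array#product</code>, which builds the same "everything to everything" pairing. The two four-row tables below are placeholders, not the data from the diagrams.</p>

```ruby
# Two hypothetical tables with four rows each.
TABLE_A = %w[a1 a2 a3 a4].freeze
TABLE_B = %w[b1 b2 b3 b4].freeze

# A cross join pairs every row of the left table with every row of the right one.
def cross_join(left, right)
  left.product(right)
end

rows = cross_join(TABLE_A, TABLE_B)
puts rows.size          # 4 * 4 = 16
puts rows.first.inspect

# The result grows multiplicatively: two tables of 1,000 rows give a million pairs.
puts 1_000 * 1_000
```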
<h1 id="summary">Summary</h1>
<p>SQL Joins Visualizer helps you build a SQL JOIN between two tables by using Venn diagrams. I hope it will help you understand how SQL joins work.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Zopfli-ffi - Ruby wrapper for zopfli library2014-10-29T00:00:00+00:00http://leopard.in.ua/2014/10/29/zopfli-ffi<p>Hello my dear friends.</p>
<p>Today we will learn about Zopfli and how you can use it with Ruby.</p>
<h1 id="what-is-zopfli">What is Zopfli?</h1>
<p><a href="https://code.google.com/p/zopfli/">Zopfli</a> Compression Algorithm is a new zlib (gzip, deflate) compatible compressor, which is 3.7-8.3% more efficient than the standard zlib library at its maximum compression level. Initially the algorithm was designed for the lossless compression of the <a href="/2013/11/23/rails-and-webp/">WebP format</a>, but it can be applied to other content.</p>
<p>The new algorithm is a standard “deflate” algorithm, so it is compatible with the zlib and gzip, and decompression of data is already supported by all browsers. Just connect zopfli to a server (for example, it can be used with a web server Nginx without changes in the module gzip, simply specifying a new “compressor”).</p>
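<p>The practical meaning of "deflate compatible" is that the decompressor does not care how much effort the compressor spent. The sketch below shows this with Ruby's standard <code class="language-plaintext highlighter-rouge">zlib</code> library only; Zlib at two compression levels stands in for "a cheaper and a more expensive compressor", it is not Zopfli itself, and the sample text is arbitrary.</p>

```ruby
require 'zlib'

# Arbitrary, highly repetitive sample data.
SAMPLE = ('PostgreSQL and Zopfli ' * 500).freeze

# Compress with the given level, then restore with the one and only inflate routine.
def deflate_roundtrip(data, level)
  compressed = Zlib::Deflate.deflate(data, level)
  { size: compressed.bytesize, restored: Zlib::Inflate.inflate(compressed) }
end

[Zlib::BEST_SPEED, Zlib::BEST_COMPRESSION].each do |level|
  result = deflate_roundtrip(SAMPLE, level)
  puts "level #{level}: #{result[:size]} bytes, restored intact: #{result[:restored] == SAMPLE}"
end
```

<p>Both streams are read back by the same <code class="language-plaintext highlighter-rouge">Zlib::Inflate.inflate</code> call, which is the property that lets browsers decompress Zopfli output without any changes.</p>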
<p>However, compression using Zopfli requires about 100 times more resources than gzip (~100x slower), but the decompression is done in the browser at the same speed.</p>
<h1 id="ruby-and-zopfli">Ruby and Zopfli</h1>
<p>I wrote a zopfli gem - <a href="http://leopard.in.ua/zopfli-ffi/">zopfli-ffi</a>. You can use this gem to work with zopfli (it works with MRI, JRuby and RBX). The gem has only one main method - <code class="language-plaintext highlighter-rouge">compress</code>. You pass it the file you want to compress and the file that should store the compressed result.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">uncompressed_file</span> <span class="o">=</span> <span class="s1">'spec/fixtures/test.txt'</span>
<span class="n">compressed_file</span> <span class="o">=</span> <span class="s1">'spec/fixtures/test.txt.gz'</span>
<span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">)</span></code></pre></figure>
<p>You can define the compression format (:zlib is the default):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:zlib</span><span class="p">)</span>
<span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:deflate</span><span class="p">)</span>
<span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:gzip</span><span class="p">)</span></code></pre></figure>
<p>You can also define the number of iterations (a greater number gives better compression but takes longer; the default is 15):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:zlib</span><span class="p">,</span> <span class="mi">15</span><span class="p">)</span> <span class="c1"># default format</span>
<span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:deflate</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
<span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="n">uncompressed_file</span><span class="p">,</span> <span class="n">compressed_file</span><span class="p">,</span> <span class="ss">:zlib</span><span class="p">,</span> <span class="mi">25</span><span class="p">)</span></code></pre></figure>
<h1 id="benchmarking">Benchmarking</h1>
<p>Let’s look at how long Zopfli and Zlib take and what compression results they give. For this benchmark I created a little Ruby script:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'zopfli_ffi'</span>
<span class="nb">require</span> <span class="s1">'zlib'</span>
<span class="nb">require</span> <span class="s1">'benchmark'</span>
<span class="n">in_dir</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="no">File</span><span class="p">.</span><span class="nf">dirname</span><span class="p">(</span><span class="kp">__FILE__</span><span class="p">))</span>
<span class="n">out_dir</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="no">File</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="no">File</span><span class="p">.</span><span class="nf">dirname</span><span class="p">(</span><span class="kp">__FILE__</span><span class="p">),</span> <span class="s2">"../tmp/"</span><span class="p">))</span>
<span class="no">Benchmark</span><span class="p">.</span><span class="nf">bm</span><span class="p">(</span><span class="mi">7</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">x</span><span class="o">|</span>
  <span class="n">x</span><span class="p">.</span><span class="nf">report</span><span class="p">(</span><span class="s2">"Gzip:"</span><span class="p">)</span> <span class="k">do</span>
    <span class="no">Zlib</span><span class="o">::</span><span class="no">GzipWriter</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">out_dir</span><span class="si">}</span><span class="s2">/1.jpg.gz"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">gz</span><span class="o">|</span>
      <span class="n">gz</span><span class="p">.</span><span class="nf">write</span> <span class="no">IO</span><span class="p">.</span><span class="nf">binread</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">in_dir</span><span class="si">}</span><span class="s2">/1.jpg"</span><span class="p">)</span>
    <span class="k">end</span>
  <span class="k">end</span>
  <span class="n">x</span><span class="p">.</span><span class="nf">report</span><span class="p">(</span><span class="s2">"Zopfli (5 iterations):"</span><span class="p">)</span> <span class="k">do</span>
    <span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">in_dir</span><span class="si">}</span><span class="s2">/1.jpg"</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">out_dir</span><span class="si">}</span><span class="s2">/1_5.jpg.zfl"</span><span class="p">,</span> <span class="ss">:zlib</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
  <span class="k">end</span>
  <span class="n">x</span><span class="p">.</span><span class="nf">report</span><span class="p">(</span><span class="s2">"Zopfli (50 iterations):"</span><span class="p">)</span> <span class="k">do</span>
    <span class="no">Zopfli</span><span class="p">.</span><span class="nf">compress</span><span class="p">(</span><span class="s2">"</span><span class="si">#{</span><span class="n">in_dir</span><span class="si">}</span><span class="s2">/1.jpg"</span><span class="p">,</span> <span class="s2">"</span><span class="si">#{</span><span class="n">out_dir</span><span class="si">}</span><span class="s2">/1_50.jpg.zfl"</span><span class="p">,</span> <span class="ss">:zlib</span><span class="p">,</span> <span class="mi">50</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>As a result we get these execution times:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>bundle <span class="nb">exec </span>ruby spec/benchmark.rb
user system total real
Gzip: 0.600000 0.190000 0.790000 <span class="o">(</span>0.944868<span class="o">)</span>
Zopfli <span class="o">(</span>5 iterations<span class="o">)</span>: 124.330000 20.880000 145.210000 <span class="o">(</span>145.643881<span class="o">)</span>
Zopfli <span class="o">(</span>50 iterations<span class="o">)</span>: 558.800000 152.280000 711.080000 <span class="o">(</span>713.134134<span class="o">)</span></code></pre></figure>
<p>As you can see, Zopfli is ~150 times slower than Zlib. You can also see that execution time grows as we increase the number of iterations.</p>
<p>But what about file sizes? Here is the result:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">vagrant 11578722 Oct 29 18:52 1.jpg
vagrant 9807633 Oct 29 18:05 1.jpg.gz
vagrant 9747611 Oct 29 18:07 1_5.jpg.zfl
vagrant 9733181 Oct 29 18:19 1_50.jpg.zfl</code></pre></figure>
<p>As we can see, Zlib reduced the file size by 15.30%. Zopfli with 5 iterations reduced the file size by 15.814%, and with 50 iterations by 15.939%. The difference is not very big. So why would you even consider using Zopfli?</p>
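<p>These percentages follow directly from the file sizes in the listing above. A short Ruby check (the constant and method names are just for this example):</p>

```ruby
# File sizes in bytes, taken from the directory listing above.
ORIGINAL_SIZE = 11_578_722
COMPRESSED_SIZES = {
  'Zlib'                   => 9_807_633,
  'Zopfli (5 iterations)'  => 9_747_611,
  'Zopfli (50 iterations)' => 9_733_181
}.freeze

# Share of the original size that compression removed, in percent.
def reduction_percent(original, compressed)
  ((original - compressed) * 100.0 / original).round(3)
end

COMPRESSED_SIZES.each do |name, size|
  puts "#{name}: #{reduction_percent(ORIGINAL_SIZE, size)}%"
end

# Extra bytes saved by Zopfli (50 iterations) compared with Zlib.
puts COMPRESSED_SIZES['Zlib'] - COMPRESSED_SIZES['Zopfli (50 iterations)']
```

<p>On this JPEG the gain over Zlib is about 74 KB out of roughly 11 MB. Keep in mind that a JPEG is already compressed, so it is not the kind of content where either compressor shines.</p>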
<h1 id="use-cases">Use cases</h1>
<p>Zopfli is not good for real-time compression in the way Zlib is. That is why it is not a good idea to enable it in Nginx (which I mentioned at the beginning of the article).</p>
<p>Zopfli can be very useful for systems that prepare compressed files (static HTML pages, JS/CSS files, etc.) for distribution over HTTP. For example, the jQuery CDN can use gigabytes of network traffic to distribute the jQuery library (I don’t know the real numbers). Zopfli can save a huge amount of network traffic and speed up content distribution, because even 1% is a really big number in this case.</p>
<h1 id="summary">Summary</h1>
<p>Zopfli is a zlib (gzip, deflate) compatible compressor that can compress your files better (by 3.7-8.3%). However, you pay for this with very slow compression (decompression in the browser runs at the same speed).</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Pagination Done the PostgreSQL Way2014-10-11T00:00:00+00:00http://leopard.in.ua/2014/10/11/postgresql-paginattion<p>Hello my dear friends. In this article I will talk about PostgreSQL and pagination.</p>
<h1 id="pagination-in-simple-way">Pagination in simple way</h1>
<p>Let’s start with simple examples. A query to fetch the 10 most recent news:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="nb">date</span> <span class="k">DESC</span><span class="p">,</span> <span class="n">id</span> <span class="k">DESC</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span></code></pre></figure>
<p>In SQL we use <code class="language-plaintext highlighter-rouge">ORDER BY</code> to get the most recent news first and <code class="language-plaintext highlighter-rouge">LIMIT</code> to fetch only the first 10 news.</p>
<h2 id="worst-case-no-index-for-order-by">Worst Case: No index for ORDER BY</h2>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">27678</span><span class="p">.</span><span class="mi">15</span><span class="p">..</span><span class="mi">27678</span><span class="p">.</span><span class="mi">18</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">393</span><span class="p">.</span><span class="mi">361</span><span class="p">..</span><span class="mi">393</span><span class="p">.</span><span class="mi">363</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Sort</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">27678</span><span class="p">.</span><span class="mi">15</span><span class="p">..</span><span class="mi">28922</span><span class="p">.</span><span class="mi">17</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">393</span><span class="p">.</span><span class="mi">359</span><span class="p">..</span><span class="mi">393</span><span class="p">.</span><span class="mi">360</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Sort</span> <span class="k">Key</span><span class="p">:</span> <span class="n">id</span>
<span class="n">Sort</span> <span class="k">Method</span><span class="p">:</span> <span class="n">top</span><span class="o">-</span><span class="n">N</span> <span class="n">heapsort</span> <span class="n">Memory</span><span class="p">:</span> <span class="mi">25</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">foo</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">16925</span><span class="p">.</span><span class="mi">00</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">024</span><span class="p">..</span><span class="mi">277</span><span class="p">.</span><span class="mi">040</span> <span class="k">rows</span><span class="o">=</span><span class="mi">499071</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">500929</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">233</span><span class="p">.</span><span class="mi">021</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>The limiting factor is the number of rows that match the <code class="language-plaintext highlighter-rouge">WHERE</code> condition. The database might use an index to satisfy the <code class="language-plaintext highlighter-rouge">WHERE</code> condition, but must still fetch all matching rows to sort them.</p>
<p><a href="/assets/images/postgresql/pagination/no_index.png" target="_blank"><img src="/assets/images/postgresql/pagination/no_index.png" alt="no_index" title="no_index" class="aligncenter size-full" /></a></p>
<h1 id="fetch-next-page">Fetch Next Page</h1>
<p>To get the next 10 most recent news, <code class="language-plaintext highlighter-rouge">OFFSET</code> is used in most cases:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="nb">date</span> <span class="k">DESC</span><span class="p">,</span> <span class="n">id</span> <span class="k">DESC</span> <span class="k">OFFSET</span> <span class="mi">10</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span></code></pre></figure>
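<p>What the database has to do for such a query when no index matches the <code class="language-plaintext highlighter-rouge">ORDER BY</code> can be modelled in Ruby. This is a simplified sketch with made-up rows; <code class="language-plaintext highlighter-rouge">fetch_page</code> is a hypothetical helper, not something PostgreSQL exposes.</p>

```ruby
# 30 hypothetical news rows; even ids belong to category 1234.
NEWS = (1..30).map do |id|
  { id: id, category_id: id.even? ? 1234 : 1, date: format('2014-10-%02d', id) }
end.freeze

# Filter, sort every matching row, and only then cut out the requested page.
def fetch_page(rows, category_id, offset, limit)
  matching = rows.select { |row| row[:category_id] == category_id }
  # Equivalent of ORDER BY date DESC, id DESC
  sorted = matching.sort_by { |row| [row[:date], row[:id]] }.reverse
  sorted.drop(offset).first(limit)
end

puts fetch_page(NEWS, 1234, 0, 10).map { |row| row[:id] }.inspect
puts fetch_page(NEWS, 1234, 10, 10).map { |row| row[:id] }.inspect
```

<p>Both calls sort all 15 matching rows even though each returns at most 10, and the second one also throws away the first 10 sorted rows. That discarded work is what grows with the page number.</p>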
<h2 id="worst-case-no-index-for-order-by-1">Worst Case: No index for ORDER BY</h2>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">10</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">30166</span><span class="p">.</span><span class="mi">22</span><span class="p">..</span><span class="mi">30166</span><span class="p">.</span><span class="mi">25</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">388</span><span class="p">.</span><span class="mi">711</span><span class="p">..</span><span class="mi">388</span><span class="p">.</span><span class="mi">714</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Sort</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">30166</span><span class="p">.</span><span class="mi">20</span><span class="p">..</span><span class="mi">31410</span><span class="p">.</span><span class="mi">22</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">388</span><span class="p">.</span><span class="mi">706</span><span class="p">..</span><span class="mi">388</span><span class="p">.</span><span class="mi">711</span> <span class="k">rows</span><span class="o">=</span><span class="mi">20</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Sort</span> <span class="k">Key</span><span class="p">:</span> <span class="n">id</span>
<span class="n">Sort</span> <span class="k">Method</span><span class="p">:</span> <span class="n">top</span><span class="o">-</span><span class="n">N</span> <span class="n">heapsort</span> <span class="n">Memory</span><span class="p">:</span> <span class="mi">25</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">16925</span><span class="p">.</span><span class="mi">00</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">020</span><span class="p">..</span><span class="mi">271</span><span class="p">.</span><span class="mi">130</span> <span class="k">rows</span><span class="o">=</span><span class="mi">499071</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">500929</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">388</span><span class="p">.</span><span class="mi">761</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span>
<span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">100</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">36285</span><span class="p">.</span><span class="mi">62</span><span class="p">..</span><span class="mi">36285</span><span class="p">.</span><span class="mi">65</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">389</span><span class="p">.</span><span class="mi">534</span><span class="p">..</span><span class="mi">389</span><span class="p">.</span><span class="mi">536</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Sort</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">36285</span><span class="p">.</span><span class="mi">37</span><span class="p">..</span><span class="mi">37529</span><span class="p">.</span><span class="mi">40</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">389</span><span class="p">.</span><span class="mi">512</span><span class="p">..</span><span class="mi">389</span><span class="p">.</span><span class="mi">524</span> <span class="k">rows</span><span class="o">=</span><span class="mi">110</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Sort</span> <span class="k">Key</span><span class="p">:</span> <span class="n">id</span>
<span class="n">Sort</span> <span class="k">Method</span><span class="p">:</span> <span class="n">top</span><span class="o">-</span><span class="n">N</span> <span class="n">heapsort</span> <span class="n">Memory</span><span class="p">:</span> <span class="mi">30</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">16925</span><span class="p">.</span><span class="mi">00</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">029</span><span class="p">..</span><span class="mi">274</span><span class="p">.</span><span class="mi">907</span> <span class="k">rows</span><span class="o">=</span><span class="mi">499071</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">500929</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">389</span><span class="p">.</span><span class="mi">588</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span>
<span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">1000</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">44246</span><span class="p">.</span><span class="mi">58</span><span class="p">..</span><span class="mi">44246</span><span class="p">.</span><span class="mi">61</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">389</span><span class="p">.</span><span class="mi">982</span><span class="p">..</span><span class="mi">389</span><span class="p">.</span><span class="mi">986</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="n">Sort</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">44244</span><span class="p">.</span><span class="mi">08</span><span class="p">..</span><span class="mi">45488</span><span class="p">.</span><span class="mi">10</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">389</span><span class="p">.</span><span class="mi">765</span><span class="p">..</span><span class="mi">389</span><span class="p">.</span><span class="mi">930</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1010</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Sort</span> <span class="k">Key</span><span class="p">:</span> <span class="n">id</span>
<span class="n">Sort</span> <span class="k">Method</span><span class="p">:</span> <span class="n">top</span><span class="o">-</span><span class="n">N</span> <span class="n">heapsort</span> <span class="n">Memory</span><span class="p">:</span> <span class="mi">96</span><span class="n">kB</span>
<span class="o">-></span> <span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">16925</span><span class="p">.</span><span class="mi">00</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">024</span><span class="p">..</span><span class="mi">271</span><span class="p">.</span><span class="mi">414</span> <span class="k">rows</span><span class="o">=</span><span class="mi">499071</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">500929</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">390</span><span class="p">.</span><span class="mi">049</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">8</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>As the <code class="language-plaintext highlighter-rouge">EXPLAIN</code> output shows, each subsequent page needs more memory to sort the rows (25kB, 30kB, then 96kB here) before <code class="language-plaintext highlighter-rouge">OFFSET</code> and <code class="language-plaintext highlighter-rouge">LIMIT</code> are applied. This can become the limiting factor when browsing farther back: fetching the last page can take considerably longer than fetching the first page.</p>
<p><a href="/assets/images/postgresql/pagination/no_index2.png" target="_blank"><img src="/assets/images/postgresql/pagination/no_index2.png" alt="no_index2" title="no_index2" class="aligncenter size-full" /></a></p>
<h1 id="improvement-1-indexed-order-by">Improvement #1: Indexed ORDER BY</h1>
<p>To improve pagination, we should have indexes on the columns used in <code class="language-plaintext highlighter-rouge">ORDER BY</code>:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">index_news_on_id_type</span> <span class="k">ON</span> <span class="n">news</span> <span class="k">USING</span> <span class="n">btree</span> <span class="p">(</span><span class="n">id</span><span class="p">);</span>
<span class="k">CREATE</span> <span class="k">INDEX</span>
<span class="o">#</span> <span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">index_news_on_category_id</span> <span class="k">ON</span> <span class="n">news</span> <span class="k">USING</span> <span class="n">btree</span> <span class="p">(</span><span class="n">category_id</span><span class="p">);</span>
<span class="k">CREATE</span> <span class="k">INDEX</span></code></pre></figure>
<p>The same index can be used for both <code class="language-plaintext highlighter-rouge">WHERE</code> and <code class="language-plaintext highlighter-rouge">ORDER BY</code>.</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">10</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">----------------------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">1</span><span class="p">.</span><span class="mi">07</span><span class="p">..</span><span class="mi">1</span><span class="p">.</span><span class="mi">71</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">087</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">112</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Index</span> <span class="k">Only</span> <span class="n">Scan</span> <span class="k">using</span> <span class="n">index_news_on_id_type</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">42</span><span class="p">..</span><span class="mi">31872</span><span class="p">.</span><span class="mi">47</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">057</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">109</span> <span class="k">rows</span><span class="o">=</span><span class="mi">20</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">Index</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="n">Heap</span> <span class="n">Fetches</span><span class="p">:</span> <span class="mi">20</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">158</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
<span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">100</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-----------------------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">6</span><span class="p">.</span><span class="mi">83</span><span class="p">..</span><span class="mi">7</span><span class="p">.</span><span class="mi">47</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">315</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">338</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Index</span> <span class="k">Only</span> <span class="n">Scan</span> <span class="k">using</span> <span class="n">index_news_on_id_type</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">42</span><span class="p">..</span><span class="mi">31872</span><span class="p">.</span><span class="mi">47</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">058</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">318</span> <span class="k">rows</span><span class="o">=</span><span class="mi">110</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">Index</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="n">Heap</span> <span class="n">Fetches</span><span class="p">:</span> <span class="mi">110</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">409</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span>
<span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">OFFSET</span> <span class="mi">1000</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">------------------------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">64</span><span class="p">.</span><span class="mi">48</span><span class="p">..</span><span class="mi">65</span><span class="p">.</span><span class="mi">12</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">1</span><span class="p">.</span><span class="mi">651</span><span class="p">..</span><span class="mi">1</span><span class="p">.</span><span class="mi">663</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Index</span> <span class="k">Only</span> <span class="n">Scan</span> <span class="k">using</span> <span class="n">index_news_on_id_type</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">42</span><span class="p">..</span><span class="mi">31872</span><span class="p">.</span><span class="mi">47</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497609</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">041</span><span class="p">..</span><span class="mi">1</span><span class="p">.</span><span class="mi">596</span> <span class="k">rows</span><span class="o">=</span><span class="mi">1010</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">Index</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">(</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span>
<span class="n">Heap</span> <span class="n">Fetches</span><span class="p">:</span> <span class="mi">1010</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">1</span><span class="p">.</span><span class="mi">698</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>As you can see, fetching the next page is also faster. But to return, for example, page 10 (with 10 rows per page), PostgreSQL still has to fetch 100 rows and discard the first 90.</p>
<p><a href="/assets/images/postgresql/pagination/index.png" target="_blank"><img src="/assets/images/postgresql/pagination/index.png" alt="index" title="index" class="aligncenter size-full" /></a></p>
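<p>A sketch of a single index that could serve both clauses, assuming the filter is always an equality on <code class="language-plaintext highlighter-rouge">category_id</code>. The index name is hypothetical and is not one of the indexes created above:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- Hypothetical composite index: the equality column comes first,
-- the sort column second. Within one category_id the entries are
-- already in id order, so no separate sort step is needed.
CREATE INDEX index_news_on_category_id_and_id ON news USING btree (category_id, id);</code></pre></figure>
<p>Even with such an index, <code class="language-plaintext highlighter-rouge">OFFSET</code> still walks over every skipped index entry, so deep pages remain slower than the first one.</p>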
<h1 id="improvement-2-the-seek-method">Improvement #2: The Seek Method</h1>
<p>To skip the rows from previous pages, we can use a <code class="language-plaintext highlighter-rouge">WHERE</code> filter instead of <code class="language-plaintext highlighter-rouge">OFFSET</code>.</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">AND</span> <span class="p">(</span><span class="nb">date</span><span class="p">,</span> <span class="n">id</span><span class="p">)</span> <span class="o"><</span> <span class="p">(</span><span class="n">prev_date</span><span class="p">,</span> <span class="n">prev_id</span><span class="p">)</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="nb">date</span> <span class="k">DESC</span><span class="p">,</span> <span class="n">id</span> <span class="k">DESC</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span></code></pre></figure>
<p>In this case, neither the size of the base set nor the fetched page number affects the response time. And the memory footprint is very low!</p>
<p>Examples:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">#</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">news</span> <span class="k">WHERE</span> <span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span> <span class="k">AND</span> <span class="n">id</span> <span class="o"><</span> <span class="mi">12345678</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">id</span> <span class="k">DESC</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-------------------------------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Limit</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">42</span><span class="p">..</span><span class="mi">1</span><span class="p">.</span><span class="mi">09</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">036</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">060</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="o">-></span> <span class="k">Index</span> <span class="k">Only</span> <span class="n">Scan</span> <span class="k">Backward</span> <span class="k">using</span> <span class="n">index_news_on_id_type</span> <span class="k">on</span> <span class="n">news</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">42</span><span class="p">..</span><span class="mi">33116</span><span class="p">.</span><span class="mi">37</span> <span class="k">rows</span><span class="o">=</span><span class="mi">497603</span> <span class="n">width</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">035</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">053</span> <span class="k">rows</span><span class="o">=</span><span class="mi">10</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">Index</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">((</span><span class="n">category_id</span> <span class="o">=</span> <span class="mi">1234</span><span class="p">::</span><span class="nb">integer</span><span class="p">)</span> <span class="k">AND</span> <span class="p">(</span><span class="n">id</span> <span class="o"><</span> <span class="mi">12345678</span><span class="p">::</span><span class="nb">integer</span><span class="p">))</span>
<span class="n">Heap</span> <span class="n">Fetches</span><span class="p">:</span> <span class="mi">10</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">098</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p><a href="/assets/images/postgresql/pagination/index2.png" target="_blank"><img src="/assets/images/postgresql/pagination/index2.png" alt="index2" title="index2" class="aligncenter size-full" /></a></p>
<p>But the Seek Method has serious limitations:</p>
<ul>
<li>You cannot directly navigate to arbitrary pages (because you need the values from the previous page)</li>
<li>Bi-directional navigation is possible but tedious (you need to reverse the <code class="language-plaintext highlighter-rouge">ORDER BY</code> direction and the <code class="language-plaintext highlighter-rouge">WHERE</code> comparison)</li>
<li>Works best with full row values support (workaround is possible, but ugly and less performant)</li>
</ul>
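<p>The bi-directional case is easier to judge in code. Below is only a minimal Ruby sketch: the helper name <code class="language-plaintext highlighter-rouge">seek_page_sql</code>, its parameters and the selected column are assumptions made for illustration, while the table and column names are taken from the query plan above. It builds the SQL text for one page and flips both the comparison and the sort order when going back.</p>

```ruby
# Hypothetical helper (not part of any library): builds the SQL text for one
# page of seek-method pagination over the `news` table.
# `boundary_id` is the id of the last row seen on the current page (:next)
# or of the first row seen on the current page (:prev).
def seek_page_sql(category_id:, boundary_id:, direction: :next, per_page: 10)
  # Going back flips BOTH the WHERE comparison and the ORDER BY direction.
  comparison, order = (direction == :prev) ? ['>', 'ASC'] : ['<', 'DESC']

  # Integer() raises on non-numeric input, so only numbers reach the SQL text.
  format(
    'SELECT id FROM news WHERE category_id = %d AND id %s %d ORDER BY id %s LIMIT %d',
    Integer(category_id), comparison, Integer(boundary_id), order, Integer(per_page)
  )
end

puts seek_page_sql(category_id: 1234, boundary_id: 12345678)
puts seek_page_sql(category_id: 1234, boundary_id: 12345678, direction: :prev)
```

<p>For the “previous” direction the rows come back in ascending order, so the caller has to reverse them before rendering the page.</p>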
<h2 id="use-case">Use case</h2>
<p>The Seek Method is perfect for “Infinite Scrolling” and “Next-Prev” (only these two buttons) navigation:</p>
<p><a href="/assets/images/postgresql/pagination/pagination_example.jpg" target="_blank"><img src="/assets/images/postgresql/pagination/pagination_example.jpg" alt="index2" title="index2" class="aligncenter size-full" /></a></p>
<p>These types of pagination don’t need to:</p>
<ul>
<li>navigate to arbitrary pages</li>
<li>browse backwards (only for “Prev-Next” navigation)</li>
<li>show total pages</li>
</ul>
<h1 id="summary">Summary</h1>
<p>As you can see, pagination can be improved by using indexes (duh..) and the seek method. The latter improves pagination performance, but it fits only certain types of pagination.</p>
<p>This article is based on the slides of Markus Winand’s talk <a href="https://wiki.postgresql.org/wiki/File:Pagination_Done_the_PostgreSQL_Way.pdf">“Pagination Done the PostgreSQL Way”</a>, given at PGDay on 1st Feb 2013 in Brussels. Another good article is <a href="http://use-the-index-luke.com/no-offset">“We need tool support for keyset pagination”</a>.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Working with PostgreSQL: tuning and scaling (4-th edition, russian) and Cooking Infrastructure by Chef (1-st edition)2014-08-25T00:00:00+00:00http://leopard.in.ua/2014/08/25/postgresql-and-chef-books<p>Hello my dear friends. After a long silence on the blog, I present to you the result of several months of work: two FREE books.</p>
<p>The first of these books is <a href="http://postgresql.leopard.in.ua/">“Working with PostgreSQL: tuning and scaling”</a>. This book is about the <a href="http://www.postgresql.org/">PostgreSQL</a> database: how you can use it, and possible approaches to tuning and scaling it. This is not a new book; it is the 4th edition of my book. I added and updated a huge amount of information in it. Right now this book is available only in Russian.</p>
<p><a href="http://postgresql.leopard.in.ua/" target="_blank"><img src="/assets/images/postgresql/postgresql4.png" alt="postgresql" title="postgresql" class="aligncenter size-full" /></a>
<a href="http://chef.leopard.in.ua/" target="_blank"><img src="/assets/images/chef/cover.jpg" alt="chef" title="chef" class="aligncenter size-full" /></a></p>
<p>The second book is <a href="http://chef.leopard.in.ua/">“Cooking Infrastructure by Chef”</a>. This book should help you understand how to start working with <a href="http://www.getchef.com/chef/">Chef (DevOps tool)</a> and improve your knowledge of it. This is the first release of this book.</p>
<p>Both books are available in PDF, HTML, ePub and Mobi formats.</p>
<p>BTW, these books are also open source (<a href="http://creativecommons.org/licenses/by-nc/4.0/">CC-by-NC</a>). Both are written in LaTeX.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Cooking Infrastructure by Chef: Free open source book about Chef2014-05-26T00:00:00+00:00http://leopard.in.ua/2014/05/26/chef-book<p>Hello my dear friends. After a number of articles about Chef, I decided to write a book about it. Today I released the first draft of the <a href="http://chef.leopard.in.ua/">“Cooking Infrastructure by Chef”</a> book. It is a free and open source book (<a href="http://creativecommons.org/licenses/by-nc/4.0/">CC-by-NC</a>).</p>
<div class="aligncenter">
<a href="http://chef.leopard.in.ua/" target="_blank"><img src="/assets/images/chef/cover.jpg" alt="chef" title="chef" class="aligncenter size-full" /></a>
</div>
<p>It is my second open source book. The first one was about PostgreSQL and was written in Russian. I decided to write this one in English. That is why I need the help of native speakers to find and fix typos in it. This will help to make a good book about Chef. You can use the GitHub pull request system or send me fixes directly by email.</p>
<p>Thanks in advance!</p>
PgTune - Tuning PostgreSQL config by your hardware2014-03-24T00:00:00+00:00http://leopard.in.ua/2014/03/24/pgtune-for-postgresql<p>Hello my dear friends. In this article I will talk about my new little app - PgTune.</p>
<p><a href="http://pgtune.leopard.in.ua/" target="_blank"><img src="/assets/images/postgresql/pgtune/pgtune.png" alt="pgtune" title="pgtune" class="aligncenter size-full" /></a></p>
<h1 id="pgtune">PgTune?</h1>
<p>In 2008, Gregory Smith created the <a href="http://pgfoundry.org/projects/pgtune/">pgtune</a> utility to tune PostgreSQL settings for maximum performance on a given hardware configuration. The utility is easy to use and is packaged for many Linux distributions. But this tool has several problems:</p>
<ul>
<li>It is not maintained (last release: October 29, 2009), which is why it uses outdated methods of calculating the PostgreSQL configuration</li>
<li>It has to be downloaded and installed before use</li>
</ul>
<p>That is why I created an online version of <a href="http://pgtune.leopard.in.ua/">PgTune</a>. Main benefits:</p>
<ul>
<li>Updated calculation for PostgreSQL config</li>
<li>No need to download or install anything</li>
<li>Can work offline</li>
<li>Can work as mobile application</li>
</ul>
<p>And, of course, it is <a href="https://github.com/le0pard/pgtune">open source</a>.</p>
<h2 id="offline-mode">Offline mode</h2>
<p>After the first page load, you no longer need access to the Internet to use PgTune. It will work offline (without an internet connection) by using <a href="http://www.html5rocks.com/en/tutorials/appcache/beginner/">Application Cache</a> technology.</p>
<h2 id="mobile-app">Mobile app</h2>
<p>We can “install” PgTune as a mobile app, because the application can operate without access to the internet.</p>
<p>Steps for iOS:</p>
<ul>
<li>Open up Safari on your iOS device</li>
<li>Navigate to the <a href="http://pgtune.leopard.in.ua/">PgTune</a> page</li>
<li>Tap the Share button (it’s an icon that’s a box with an arrow sticking out from it)</li>
<li>Tap on <a href="http://support.apple.com/kb/TI42">Add to Home Screen</a></li>
<li>On the next page you’ll give the shortcut a name and confirm the web address</li>
<li>After that, tap on Add in the upper-right corner to add the PgTune app to your home screen</li>
</ul>
<p>Steps for Android:</p>
<ul>
<li>Open the Chrome browser on your Android device</li>
<li>Navigate to the <a href="http://pgtune.leopard.in.ua/">PgTune</a> page</li>
<li>Hit the settings button (three vertical dots, located in the top right of the screen)</li>
<li>Tap on “Add to Homescreen”</li>
<li>On the next page you’ll give the shortcut a name</li>
<li>After that, tap the OK button to add the PgTune app to your home screen</li>
</ul>
<p>As you can see, it is very simple to add it as a mobile app.</p>
<h1 id="summary">Summary</h1>
<p>PgTune calculates a PostgreSQL configuration aimed at maximum performance for a given hardware configuration. It isn’t a “silver bullet” for PostgreSQL tuning. Many settings depend not only on the hardware configuration, but also on the size of the database, the number of clients and the complexity of queries, so an optimal configuration can be found only by taking all these parameters into account. But I hope it will help you get started with tuning PostgreSQL.</p>
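<p>To give a feeling for the kind of arithmetic involved, here is a minimal Ruby sketch of two widely cited rules of thumb for a dedicated database server: roughly 25% of RAM for <code class="language-plaintext highlighter-rouge">shared_buffers</code> and roughly 75% for <code class="language-plaintext highlighter-rouge">effective_cache_size</code>. These are not PgTune’s actual formulas; the method name and the returned keys are assumptions made for illustration only.</p>

```ruby
# Rule-of-thumb starting values for a dedicated PostgreSQL server.
# NOTE: this is an illustration, not the calculation PgTune performs.
def starting_memory_settings(total_ram_mb)
  {
    shared_buffers_mb: total_ram_mb / 4,            # about 25% of RAM
    effective_cache_size_mb: total_ram_mb * 3 / 4   # about 75% of RAM
  }
end

settings = starting_memory_settings(4096)
puts "shared_buffers = #{settings[:shared_buffers_mb]}MB"
puts "effective_cache_size = #{settings[:effective_cache_size_mb]}MB"
```

<p>Values like these are only a starting point; as noted above, the right numbers also depend on the database size, the number of clients and the queries.</p>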
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Chef cookbooks development by TDD2013-12-01T00:00:00+00:00http://leopard.in.ua/2013/12/01/chef-and-tdd<blockquote>
<p><strong>WARNING</strong>: This article may be outdated. It is better to read my book about Chef: <a href="http://chef.leopard.in.ua/">Cooking Infrastructure by Chef</a></p>
</blockquote>
<p>Hello my dear friends. Today we will continue to talk about Chef, and this article is about developing Chef cookbooks with <a href="http://en.wikipedia.org/wiki/Test-driven_development">TDD</a>. If you don’t know what Chef is and how to use it, you should start with my <a href="/2013/01/04/chef-solo-getting-started-part-1/">articles</a> about it. All code examples can be found here: <a href="https://github.com/le0pard/chef-tdd-monit">github.com/le0pard/chef-tdd-monit</a>.</p>
<h1 id="chef-testing-tools">Chef testing tools</h1>
<p>First, let’s look at what tools exist today for testing Chef cookbooks.</p>
<h2 id="foodcritic">Foodcritic</h2>
<ul>
<li><a href="http://acrmp.github.io/foodcritic/">Site</a></li>
</ul>
<p>Foodcritic is a lint tool for your Opscode Chef cookbooks. Foodcritic has two goals:</p>
<ul>
<li>To make it easier to flag problems in your Chef cookbooks that will cause Chef to blow up when you attempt to converge. This is about faster feedback. If you automate checks for common problems you can save a lot of time.</li>
<li>To encourage discussion within the Chef community on the more subjective stuff - what does a good cookbook look like? Opscode have avoided being overly prescriptive which by and large I think is a good thing. Having a set of rules to base discussion on helps drive out what we as a community think is good style.</li>
</ul>
<p>On the main site you can find the <a href="http://acrmp.github.io/foodcritic/#FC001">list of rules</a>. You can also define your own rules if you need them. Foodcritic is like jslint for cookbooks. At the bare minimum, you should run Foodcritic against all your cookbooks.</p>
<h2 id="fauxhai">Fauxhai</h2>
<ul>
<li><a href="http://technology.customink.com/fauxhai/">Site</a></li>
</ul>
<p>Ohai is a tool that is used to detect attributes on a node, and then provide these attributes to the chef-client at the start of every chef-client run. Ohai is required by the chef-client and must be present on a node. It’s awesome, but it can be a problem for testing, because the detected data depends on the machine that runs the tests. That is why Fauxhai exists. Fauxhai is a gem for mocking out Ohai data in your Chef testing. Example:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'chefspec'</span>
<span class="n">describe</span> <span class="s1">'awesome_cookbook::default'</span> <span class="k">do</span>
<span class="n">before</span> <span class="k">do</span>
<span class="no">Fauxhai</span><span class="p">.</span><span class="nf">mock</span><span class="p">(</span><span class="n">platform</span><span class="ss">:'ubuntu'</span><span class="p">,</span> <span class="n">version</span><span class="ss">:'12.04'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'should install awesome'</span> <span class="k">do</span>
<span class="vi">@runner</span> <span class="o">=</span> <span class="no">ChefSpec</span><span class="o">::</span><span class="no">ChefRunner</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">converge</span><span class="p">(</span><span class="s1">'awesome_cookbook::default'</span><span class="p">)</span>
<span class="vi">@runner</span><span class="p">.</span><span class="nf">should</span> <span class="n">install_package</span> <span class="s1">'awesome'</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<h2 id="chefspec">ChefSpec</h2>
<ul>
<li><a href="http://sethvargo.com/chefspec/">Site</a></li>
</ul>
<p>ChefSpec is a unit testing framework for testing Chef cookbooks. ChefSpec makes it easy to write examples and get fast feedback on cookbook changes without the need for virtual machines or cloud servers. Example:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'chefspec'</span>
<span class="n">describe</span> <span class="s1">'example::default'</span> <span class="k">do</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:chef_run</span><span class="p">)</span> <span class="p">{</span> <span class="no">ChefSpec</span><span class="o">::</span><span class="no">Runner</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">converge</span><span class="p">(</span><span class="n">described_recipe</span><span class="p">)</span> <span class="p">}</span>
<span class="n">it</span> <span class="s1">'installs foo'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">install_package</span><span class="p">(</span><span class="s1">'foo'</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<h2 id="cucumber-chef">Cucumber-chef</h2>
<ul>
<li><a href="http://www.cucumber-chef.org/">Site</a></li>
</ul>
<p>Cucumber-chef is a library of tools to enable the emerging discipline of infrastructure as code to practice test driven development. It provides a testing platform within which <a href="http://cukes.info/">Cucumber tests</a> can be run which provision virtual machines, configure them by applying the appropriate Chef roles to them, and then run acceptance and integration tests against the environment.</p>
<h2 id="test-kitchen">Test-kitchen</h2>
<ul>
<li><a href="https://github.com/test-kitchen/test-kitchen">Site</a></li>
</ul>
<p>Test-kitchen is a convergence integration test harness for configuration management systems.</p>
<h2 id="chef-zero">Chef Zero</h2>
<ul>
<li><a href="https://github.com/opscode/chef-zero">Site</a></li>
</ul>
<p>Chef Zero is a simple, easy-install, in-memory Chef server that can be useful for Chef Client testing and chef-solo-like tasks that require a full Chef Server. Because Chef Zero runs in memory, it’s super fast and lightweight. This makes it perfect for testing against a “real” Chef Server without mocking the entire Internet.</p>
<h1 id="enough-words-lets-start-with-the-practice">Enough words. Let’s start with the practice</h1>
<p>First of all, you should have Ruby and RubyGems installed. Let’s create a <a href="http://mmonit.com/monit/">monit</a> cookbook using TDD. I generated the structure of the cookbook with <a href="http://berkshelf.com/">berkshelf</a>:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>ruby <span class="nt">-v</span>
ruby 2.0.0p353 <span class="o">(</span>2013-11-22 revision 43784<span class="o">)</span> <span class="o">[</span>x86_64-darwin13.0.0]
<span class="nv">$ </span>gem <span class="nb">install </span>berkshelf
Successfully installed berkshelf-2.0.10
1 gem installed
<span class="nv">$ </span>berks cookbook monit
create monit/files/default
create monit/templates/default
create monit/attributes
create monit/definitions
create monit/libraries
create monit/providers
create monit/recipes
create monit/resources
create monit/recipes/default.rb
create monit/metadata.rb
create monit/LICENSE
create monit/README.md
create monit/Berksfile
create monit/Thorfile
create monit/chefignore
create monit/.gitignore
run git init from <span class="s2">"./monit"</span>
create monit/Gemfile
create monit/Vagrantfile
<span class="nv">$ </span><span class="nb">cd </span>monit</code></pre></figure>
<p>Now we need to add the gems that we will use for testing to the Gemfile:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">source</span> <span class="s1">'https://rubygems.org'</span>
<span class="n">gem</span> <span class="s1">'berkshelf'</span>
<span class="n">gem</span> <span class="s1">'foodcritic'</span>
<span class="n">gem</span> <span class="s1">'fauxhai'</span>
<span class="n">gem</span> <span class="s1">'chefspec'</span>
<span class="n">gem</span> <span class="s1">'busser-bats'</span>
<span class="n">gem</span> <span class="s1">'busser-minitest'</span>
<span class="n">gem</span> <span class="s1">'test-kitchen'</span><span class="p">,</span> <span class="s1">'1.0.0.rc.2'</span>
<span class="n">group</span> <span class="ss">:integration</span> <span class="k">do</span>
<span class="n">gem</span> <span class="s1">'kitchen-vagrant'</span><span class="p">,</span> <span class="s1">'0.12.0'</span>
<span class="k">end</span></code></pre></figure>
<p>Then run the “bundle” command to install these gems.</p>
<h2 id="using-chefspec">Using ChefSpec</h2>
<p>First of all we should create tests for our monit cookbook:</p>
<p>File: spec/spec_helper.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'chefspec'</span>
<span class="nb">require</span> <span class="s1">'chefspec/berkshelf'</span> <span class="c1"># I use berkshelf, but ChefSpec also has librarian support</span>
<span class="no">RSpec</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">config</span><span class="o">|</span>
<span class="c1">#empty</span>
<span class="k">end</span></code></pre></figure>
<p>File: spec/unit/recipes/default_spec.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'chefspec'</span>
<span class="n">describe</span> <span class="s1">'monit::default'</span> <span class="k">do</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:chef_run</span><span class="p">)</span> <span class="p">{</span> <span class="no">ChefSpec</span><span class="o">::</span><span class="no">Runner</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">converge</span><span class="p">(</span><span class="n">described_recipe</span><span class="p">)</span> <span class="p">}</span>
<span class="n">it</span> <span class="s1">'install monit package'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">install_package</span><span class="p">(</span><span class="s1">'monit'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'enable monit service'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">enable_service</span><span class="p">(</span><span class="s1">'monit'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'create direcory for custom services'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_directory</span><span class="p">(</span><span class="s1">'/etc/monit/conf.d/'</span><span class="p">).</span><span class="nf">with</span><span class="p">(</span>
<span class="ss">user: </span><span class="s1">'root'</span><span class="p">,</span>
<span class="ss">group: </span><span class="s1">'root'</span>
<span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'create main monit config'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_template</span><span class="p">(</span><span class="s1">'/etc/monit/monitrc'</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>Of course, the tests fail:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package <span class="o">(</span>FAILED - 1<span class="o">)</span>
<span class="nb">enable </span>monit service <span class="o">(</span>FAILED - 2<span class="o">)</span>
create direcory <span class="k">for </span>custom services <span class="o">(</span>FAILED - 3<span class="o">)</span>
create main monit config <span class="o">(</span>FAILED - 4<span class="o">)</span>
...
Finished <span class="k">in </span>0.07275 seconds
4 examples, 4 failures
Failed examples:
rspec ./spec/unit/recipes/default_spec.rb:6 <span class="c"># monit::default install monit package</span>
rspec ./spec/unit/recipes/default_spec.rb:10 <span class="c"># monit::default enable monit service</span>
rspec ./spec/unit/recipes/default_spec.rb:14 <span class="c"># monit::default create direcory for custom services</span>
rspec ./spec/unit/recipes/default_spec.rb:21 <span class="c"># monit::default create main monit config</span></code></pre></figure>
<p>Let’s fix these tests by writing cookbook code:</p>
<p>File: attributes/default.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:notify_email</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"notify@example.com"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:logfile</span><span class="p">]</span> <span class="o">=</span> <span class="s1">'syslog facility log_daemon'</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:poll_period</span><span class="p">]</span> <span class="o">=</span> <span class="mi">60</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:poll_start_delay</span><span class="p">]</span> <span class="o">=</span> <span class="mi">120</span>
<span class="o">...</span></code></pre></figure>
<p>File: templates/default/monitrc.erb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">set</span> <span class="n">daemon</span> <span class="o"><</span><span class="sx">%= @node[:monit][:poll_period] %>
<% if @node[:monit][:poll_start_delay] %>
with start delay <%=</span> <span class="vi">@node</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:poll_start_delay</span><span class="p">]</span> <span class="o">%></span>
<span class="o"><</span><span class="sx">% end </span><span class="o">%></span>
<span class="o">...</span></code></pre></figure>
<p>File: recipes/default.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">package</span> <span class="s2">"monit"</span></code></pre></figure>
<p>The first test should pass:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package
<span class="nb">enable </span>monit service <span class="o">(</span>FAILED - 1<span class="o">)</span>
create direcory <span class="k">for </span>custom services <span class="o">(</span>FAILED - 2<span class="o">)</span>
create main monit config <span class="o">(</span>FAILED - 3<span class="o">)</span>
...
Finished <span class="k">in </span>0.06091 seconds
4 examples, 3 failures
Failed examples:
rspec ./spec/unit/recipes/default_spec.rb:10 <span class="c"># monit::default enable monit service</span>
rspec ./spec/unit/recipes/default_spec.rb:14 <span class="c"># monit::default create direcory for custom services</span>
rspec ./spec/unit/recipes/default_spec.rb:21 <span class="c"># monit::default create main monit config</span></code></pre></figure>
<p>Perfect! Let’s fix the rest of the tests:</p>
<p>File: recipes/default.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">package</span> <span class="s2">"monit"</span>
<span class="n">service</span> <span class="s2">"monit"</span> <span class="k">do</span>
<span class="n">action</span> <span class="p">[</span><span class="ss">:enable</span><span class="p">,</span> <span class="ss">:start</span><span class="p">]</span>
<span class="n">enabled</span> <span class="kp">true</span>
<span class="n">supports</span> <span class="p">[</span><span class="ss">:start</span><span class="p">,</span> <span class="ss">:restart</span><span class="p">,</span> <span class="ss">:stop</span><span class="p">]</span>
<span class="k">end</span>
<span class="n">directory</span> <span class="s2">"/etc/monit/conf.d/"</span> <span class="k">do</span>
<span class="n">owner</span> <span class="s1">'root'</span>
<span class="n">group</span> <span class="s1">'root'</span>
<span class="n">mode</span> <span class="mo">0755</span>
<span class="n">action</span> <span class="ss">:create</span>
<span class="n">recursive</span> <span class="kp">true</span>
<span class="k">end</span>
<span class="n">template</span> <span class="s2">"/etc/monit/monitrc"</span> <span class="k">do</span>
<span class="n">owner</span> <span class="s2">"root"</span>
<span class="n">group</span> <span class="s2">"root"</span>
<span class="n">mode</span> <span class="mo">0700</span>
<span class="n">source</span> <span class="s1">'monitrc.erb'</span>
<span class="n">notifies</span> <span class="ss">:restart</span><span class="p">,</span> <span class="n">resources</span><span class="p">(</span><span class="ss">:service</span> <span class="o">=></span> <span class="s2">"monit"</span><span class="p">),</span> <span class="ss">:delayed</span>
<span class="k">end</span></code></pre></figure>
<p>And again I will run the tests:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package
<span class="nb">enable </span>monit service
create direcory <span class="k">for </span>custom services
create main monit config
Finished <span class="k">in </span>0.06594 seconds
4 examples, 0 failures</code></pre></figure>
<p>Ok, tests passed.</p>
<h2 id="checking-by-foodcritic">Checking by Foodcritic</h2>
<p>Now we need to check our cookbook with Foodcritic:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>foodcritic <span class="nb">.</span>
FC002: Avoid string interpolation where not required: ./templates/default/monitrc.erb:31
FC019: Access node attributes <span class="k">in </span>a consistent manner: ./attributes/default.rb:8
FC027: Resource sets internal attribute: ./recipes/default.rb:12
FC043: Prefer new notification syntax: ./recipes/default.rb:26</code></pre></figure>
<p>We have a few warnings in the code. Let’s fix them:</p>
<p>FC002: Avoid string interpolation where not required: ./templates/default/monitrc.erb:31</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nt">---</span> a/templates/default/monitrc.erb
+++ b/templates/default/monitrc.erb
@@ <span class="nt">-28</span>,7 +28,7 @@ <span class="nb">set </span>alert <%<span class="o">=</span> @node[:monit][:notify_email] %> NOT ON <span class="o">{</span> action, instance, pid, pp
<span class="nb">set </span>httpd port <%<span class="o">=</span> node[:monit][:port] %>
<%<span class="o">=</span> <span class="s2">"use address #{node[:monit][:address]}"</span> <span class="k">if </span>node[:monit][:address] %>
<% node[:monit][:allow].each <span class="k">do</span> |a| %>
- allow <%<span class="o">=</span> <span class="s2">"#{a}"</span> %>
+ allow <%<span class="o">=</span> a.to_s %>
<% end %>
<% <span class="k">if </span>node[:monit][:ssl] %>
ssl <span class="nb">enable</span></code></pre></figure>
<p>FC019: Access node attributes in a consistent manner: ./attributes/default.rb:8</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nt">---</span> a/attributes/default.rb
+++ b/attributes/default.rb
@@ <span class="nt">-5</span>,7 +5,7 @@ default[:monit][:poll_period] <span class="o">=</span> 60
default[:monit][:poll_start_delay] <span class="o">=</span> 120
default[:monit][:mail_format][:subject] <span class="o">=</span> <span class="s2">"</span><span class="nv">$SERVICE</span><span class="s2"> </span><span class="nv">$EVENT</span><span class="s2">"</span>
<span class="nt">-default</span><span class="o">[</span>:monit][:mail_format][:from] <span class="o">=</span> <span class="s2">"monit@#{node['fqdn']}"</span>
+default[:monit][:mail_format][:from] <span class="o">=</span> <span class="s2">"monit@#{node[:fqdn]}"</span>
default[:monit][:mail_format][:message] <span class="o">=</span> <span class="o"><<-</span><span class="no">EOS</span><span class="sh">
Monit </span><span class="nv">$ACTION</span><span class="sh"> </span><span class="nv">$SERVICE</span><span class="sh"> at </span><span class="nv">$DATE</span><span class="sh"> on </span><span class="nv">$HOST</span><span class="sh">: </span><span class="nv">$DESCRIPTION</span><span class="sh">.
Yours sincerely,</span></code></pre></figure>
<p>FC027: Resource sets internal attribute: ./recipes/default.rb:12
FC043: Prefer new notification syntax: ./recipes/default.rb:26</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nt">---</span> a/recipes/default.rb
+++ b/recipes/default.rb
@@ <span class="nt">-11</span>,7 +11,6 @@ package <span class="s2">"monit"</span>
service <span class="s2">"monit"</span> <span class="k">do
</span>action <span class="o">[</span>:enable, :start]
- enabled <span class="nb">true
</span>supports <span class="o">[</span>:start, :restart, :stop]
end
@@ <span class="nt">-28</span>,5 +27,5 @@ template <span class="s2">"/etc/monit/monitrc"</span> <span class="k">do
</span>group <span class="s2">"root"</span>
mode 0700
<span class="nb">source</span> <span class="s1">'monitrc.erb'</span>
- notifies :restart, resources<span class="o">(</span>:service <span class="o">=></span> <span class="s2">"monit"</span><span class="o">)</span>, :delayed
+ notifies :restart, <span class="s2">"service[monit]"</span>, :delayed
end</code></pre></figure>
<p>And I will run Foodcritic and the tests again:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>foodcritic <span class="nb">.</span>
<span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package
<span class="nb">enable </span>monit service
create direcory <span class="k">for </span>custom services
create main monit config
Finished <span class="k">in </span>0.07382 seconds
4 examples, 0 failures</code></pre></figure>
<h2 id="working-with-different-operating-systems">Working with different operating systems</h2>
<p>Now I will show how to work with different operating systems. I will add the following defaults to our attributes file:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="k">case</span> <span class="n">node</span><span class="p">[</span><span class="ss">:platform_family</span><span class="p">]</span>
<span class="k">when</span> <span class="s2">"rhel"</span><span class="p">,</span> <span class="s2">"fedora"</span><span class="p">,</span> <span class="s2">"suse"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:main_config_path</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit.conf"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:includes_dir</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit.d"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:cert</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit.pem"</span>
<span class="k">else</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:main_config_path</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit/monitrc"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:includes_dir</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit/conf.d"</span>
<span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:cert</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"/etc/monit/monit.pem"</span>
<span class="k">end</span></code></pre></figure>
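<p>To make the branching easier to see outside of Chef, here is a minimal plain Ruby sketch of the same selection. The helper name <code class="language-plaintext highlighter-rouge">monit_paths</code> and the constant <code class="language-plaintext highlighter-rouge">RHEL_LIKE_FAMILIES</code> are hypothetical and are not part of the cookbook; only the paths come from the attributes above.</p>

```ruby
# Hypothetical helper (not part of the cookbook): mirrors the case statement
# from the attributes file so the path selection can be checked in plain Ruby.
RHEL_LIKE_FAMILIES = %w[rhel fedora suse].freeze

def monit_paths(platform_family)
  if RHEL_LIKE_FAMILIES.include?(platform_family)
    {
      main_config_path: '/etc/monit.conf',
      includes_dir: '/etc/monit.d',
      cert: '/etc/monit.pem'
    }
  else
    # Every other platform family falls back to the Debian-style layout.
    {
      main_config_path: '/etc/monit/monitrc',
      includes_dir: '/etc/monit/conf.d',
      cert: '/etc/monit/monit.pem'
    }
  end
end

puts monit_paths('rhel')[:main_config_path]
puts monit_paths('debian')[:main_config_path]
```

<p>In the real cookbook Chef does this work itself when it loads the attributes file; the sketch only shows which paths each platform family ends up with.</p>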
<p>And I will change tests:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'chefspec'</span>
<span class="n">describe</span> <span class="s1">'monit::default'</span> <span class="k">do</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:platfom</span><span class="p">)</span> <span class="p">{</span> <span class="s1">'ubuntu'</span> <span class="p">}</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:platfom_version</span><span class="p">)</span> <span class="p">{</span> <span class="s1">'12.04'</span> <span class="p">}</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:chef_run</span><span class="p">)</span> <span class="p">{</span> <span class="no">ChefSpec</span><span class="o">::</span><span class="no">Runner</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">platform: </span><span class="n">platfom</span><span class="p">,</span> <span class="ss">version: </span><span class="n">platfom_version</span><span class="p">).</span><span class="nf">converge</span><span class="p">(</span><span class="n">described_recipe</span><span class="p">)</span> <span class="p">}</span>
<span class="n">it</span> <span class="s1">'install monit package'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">install_package</span><span class="p">(</span><span class="s1">'monit'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'enable monit service'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">enable_service</span><span class="p">(</span><span class="s1">'monit'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'create direcory for custom services'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_directory</span><span class="p">(</span><span class="s1">'/etc/monit/conf.d'</span><span class="p">).</span><span class="nf">with</span><span class="p">(</span>
<span class="ss">user: </span><span class="s1">'root'</span><span class="p">,</span>
<span class="ss">group: </span><span class="s1">'root'</span>
<span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'create main monit config'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_template</span><span class="p">(</span><span class="s1">'/etc/monit/monitrc'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'reload daemon on change config'</span> <span class="k">do</span>
<span class="n">resource</span> <span class="o">=</span> <span class="n">chef_run</span><span class="p">.</span><span class="nf">template</span><span class="p">(</span><span class="s1">'/etc/monit/monitrc'</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">resource</span><span class="p">).</span><span class="nf">to</span> <span class="n">notify</span><span class="p">(</span><span class="s1">'service[monit]'</span><span class="p">).</span><span class="nf">to</span><span class="p">(</span><span class="ss">:restart</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">context</span> <span class="s2">"rhel system"</span> <span class="k">do</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:platfom</span><span class="p">)</span> <span class="p">{</span> <span class="s1">'centos'</span> <span class="p">}</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:platfom_version</span><span class="p">)</span> <span class="p">{</span> <span class="s1">'6.3'</span> <span class="p">}</span>
<span class="n">it</span> <span class="s1">'create direcory for custom services'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_directory</span><span class="p">(</span><span class="s1">'/etc/monit.d'</span><span class="p">).</span><span class="nf">with</span><span class="p">(</span>
<span class="ss">user: </span><span class="s1">'root'</span><span class="p">,</span>
<span class="ss">group: </span><span class="s1">'root'</span>
<span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'create main monit config'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">create_template</span><span class="p">(</span><span class="s1">'/etc/monit.conf'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'reload daemon on change config'</span> <span class="k">do</span>
<span class="n">resource</span> <span class="o">=</span> <span class="n">chef_run</span><span class="p">.</span><span class="nf">template</span><span class="p">(</span><span class="s1">'/etc/monit.conf'</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">resource</span><span class="p">).</span><span class="nf">to</span> <span class="n">notify</span><span class="p">(</span><span class="s1">'service[monit]'</span><span class="p">).</span><span class="nf">to</span><span class="p">(</span><span class="ss">:restart</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>Now we should change the recipe so that the tests pass:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">directory</span> <span class="n">node</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:includes_dir</span><span class="p">]</span> <span class="k">do</span>
<span class="n">owner</span> <span class="s1">'root'</span>
<span class="n">group</span> <span class="s1">'root'</span>
<span class="n">mode</span> <span class="mo">0755</span>
<span class="n">action</span> <span class="ss">:create</span>
<span class="n">recursive</span> <span class="kp">true</span>
<span class="k">end</span>
<span class="n">template</span> <span class="n">node</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:main_config_path</span><span class="p">]</span> <span class="k">do</span>
<span class="n">owner</span> <span class="s2">"root"</span>
<span class="n">group</span> <span class="s2">"root"</span>
<span class="n">mode</span> <span class="mo">0700</span>
<span class="n">source</span> <span class="s1">'monitrc.erb'</span>
<span class="n">notifies</span> <span class="ss">:restart</span><span class="p">,</span> <span class="s2">"service[monit]"</span><span class="p">,</span> <span class="ss">:delayed</span>
<span class="k">end</span></code></pre></figure>
<p>And we will run the tests again:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package
<span class="nb">enable </span>monit service
create direcory <span class="k">for </span>custom services
create main monit config
reload daemon on change config
rhel system
create direcory <span class="k">for </span>custom services
create main monit config
reload daemon on change config
Finished <span class="k">in </span>0.15188 seconds
8 examples, 0 failures</code></pre></figure>
<h2 id="working-with-fauxhai">Working with Fauxhai</h2>
<p>In default attributes we have such attribute:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">default</span><span class="p">[</span><span class="ss">:monit</span><span class="p">][</span><span class="ss">:mail_format</span><span class="p">][</span><span class="ss">:from</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"monit@</span><span class="si">#{</span><span class="n">node</span><span class="p">[</span><span class="ss">:fqdn</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span></code></pre></figure>
<p>where “fqdn” is an attribute from ohai. Let’s use Fauxhai to check that this attribute works correctly. Add this test:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">context</span> <span class="s2">"mail to attribute"</span> <span class="k">do</span>
<span class="n">before</span> <span class="k">do</span>
<span class="no">Fauxhai</span><span class="p">.</span><span class="nf">mock</span><span class="p">(</span><span class="ss">platform: </span><span class="n">platfom</span><span class="p">,</span> <span class="ss">version: </span><span class="n">platfom_version</span><span class="p">)</span> <span class="c1"># fqdn == fauxhai.local</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'it should be monit@fauxhai.local'</span> <span class="k">do</span>
<span class="n">expect</span><span class="p">(</span><span class="n">chef_run</span><span class="p">).</span><span class="nf">to</span> <span class="n">render_file</span><span class="p">(</span><span class="s1">'/etc/monit/monitrc'</span><span class="p">).</span><span class="nf">with_content</span><span class="p">(</span><span class="sr">/monit@fauxhai\.local/</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>And run tests:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>rspec
monit::default
<span class="nb">install </span>monit package
<span class="nb">enable </span>monit service
create direcory <span class="k">for </span>custom services
create main monit config
reload daemon on change config
rhel system
create direcory <span class="k">for </span>custom services
create main monit config
reload daemon on change config
mail to attribute
it should be monit@fauxhai.local
Finished <span class="k">in </span>0.1829 seconds
9 examples, 0 failures</code></pre></figure>
<p>Maybe this example is not perfect for Fauxhai (because we couldn’t change “fqdn” using the “mock” method), but it should help you understand how you can use it.</p>
<h2 id="using-test-kitchen-bats-and-minitest">Using test-kitchen, bats and minitest</h2>
<p>Now let’s begin testing using test-kitchen. First we need to initialize it:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>kitchen init
create .kitchen.yml
append Thorfile
create <span class="nb">test</span>/integration/default
append .gitignore
append .gitignore</code></pre></figure>
<p>This command will create file “.kitchen.yml”, which contains all settings for test-kitchen:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="nn">---</span>
<span class="na">driver</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">vagrant</span>
<span class="na">provisioner</span><span class="pi">:</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">chef_solo</span>
<span class="na">platforms</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ubuntu-12.04</span>
<span class="na">suites</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">default</span>
<span class="na">run_list</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">recipe[monit::default]</span>
<span class="na">attributes</span><span class="pi">:</span> <span class="pi">{}</span></code></pre></figure>
<p>You can read about these settings on <a href="https://github.com/test-kitchen/test-kitchen#the-kitchen-yaml-format">this page</a>. Let’s add integration tests. For this I use <a href="https://github.com/sstephenson/bats">bats</a>:</p>
<p>File: test/integration/default/bats/default.bats</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">@test <span class="s2">"monit is installed and in the path"</span> <span class="o">{</span>
which monit
<span class="o">}</span>
@test <span class="s2">"monit configuration dir exists"</span> <span class="o">{</span>
<span class="o">[</span> <span class="nt">-d</span> <span class="s2">"/etc/monit"</span> <span class="o">]</span>
<span class="o">}</span></code></pre></figure>
<p>And <a href="https://github.com/seattlerb/minitest">minitest</a>:</p>
<p>File: test/integration/default/minitest/test_default.rb</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'minitest/autorun'</span>
<span class="n">describe</span> <span class="s1">'monit::default'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s2">"install monit"</span> <span class="k">do</span>
<span class="n">assert</span> <span class="nb">system</span><span class="p">(</span><span class="s1">'apt-cache policy monit | grep Installed | grep -v none'</span><span class="p">)</span>
<span class="k">end</span>
<span class="n">describe</span> <span class="s2">"services"</span> <span class="k">do</span>
<span class="c1"># You can assert that a service must be running following the converge:</span>
<span class="n">it</span> <span class="s2">"runs as a daemon"</span> <span class="k">do</span>
<span class="n">assert</span> <span class="nb">system</span><span class="p">(</span><span class="s1">'/etc/init.d/monit status'</span><span class="p">)</span>
<span class="k">end</span>
<span class="c1"># And that it will start when the server boots:</span>
<span class="n">it</span> <span class="s2">"boots on startup"</span> <span class="k">do</span>
<span class="n">assert</span> <span class="no">File</span><span class="p">.</span><span class="nf">exist?</span><span class="p">(</span><span class="no">Dir</span><span class="p">.</span><span class="nf">glob</span><span class="p">(</span><span class="s2">"/etc/rc5.d/S*monit"</span><span class="p">).</span><span class="nf">first</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span></code></pre></figure>
<p>It is not necessary to use bats and minitest together in the same cookbook. I use both in this cookbook to show a simple example of each.</p>
<p>Finally, run the command “kitchen test” to begin testing:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>kitchen <span class="nb">test</span> <span class="nt">--parallel</span>
<span class="nt">-----</span><span class="o">></span> Starting Kitchen <span class="o">(</span>v1.0.0.rc.2<span class="o">)</span>
<span class="nt">-----</span><span class="o">></span> Cleaning up any prior instances of <default-ubuntu-1204>
<span class="nt">-----</span><span class="o">></span> Destroying <default-ubuntu-1204>...
Finished destroying <default-ubuntu-1204> <span class="o">(</span>0m0.00s<span class="o">)</span><span class="nb">.</span>
<span class="nt">-----</span><span class="o">></span> Testing <default-ubuntu-1204>
<span class="nt">-----</span><span class="o">></span> Creating <default-ubuntu-1204>...
...
<span class="nt">-----</span><span class="o">></span> Running bats <span class="nb">test </span>suite
✓ monit is installed and <span class="k">in </span>the path
✓ monit configuration <span class="nb">dir </span>exists
2 tests, 0 failures
<span class="nt">-----</span><span class="o">></span> Running minitest <span class="nb">test </span>suite
/opt/chef/embedded/bin/ruby <span class="nt">-I</span><span class="s2">"/opt/chef/embedded/lib/ruby/1.9.1"</span> <span class="s2">"/opt/chef/embedded/lib/ruby/1.9.1/rake/rake_test_loader.rb"</span> <span class="s2">"/tmp/busser/suites/minitest/test_default.rb"</span>
Run options: <span class="nt">--seed</span> 42931
<span class="c"># Running tests:</span>
Installed: 1:5.3.2-1
<span class="nb">.</span> <span class="k">*</span> monit is running
..
Finished tests <span class="k">in </span>0.027586s, 108.7507 tests/s, 108.7507 assertions/s.
3 tests, 3 assertions, 0 failures, 0 errors, 0 skips
Finished verifying <default-ubuntu-1204> <span class="o">(</span>0m2.94s<span class="o">)</span><span class="nb">.</span>
<span class="nt">-----</span><span class="o">></span> Destroying <default-ubuntu-1204>...</code></pre></figure>
<p>Of course, my tests are not designed to work on different types of systems (on CentOS they will fail). My goal was to show how you can test the environment after your cookbook has run. You can read more about test-kitchen <a href="https://github.com/test-kitchen/test-kitchen/wiki/Getting-Started">here</a>.</p>
<h1 id="summary">Summary</h1>
<p>In this article I covered how to write a Chef cookbook using TDD. I hope it will help you write better cookbooks for Chef. You can find all the code examples here: <a href="https://github.com/le0pard/chef-tdd-monit">github.com/le0pard/chef-tdd-monit</a>.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Speed up your Ruby on Rails application using WebP images2013-11-23T00:00:00+00:00http://leopard.in.ua/2013/11/23/rails-and-webp<p>Hello my dear friends.</p>
<p>Today we will speed up our rails application using webp images.</p>
<h1 id="what-is-webp">What is WebP?</h1>
<p><a href="https://developers.google.com/speed/webp/">WebP</a> is an image format that employs both lossy and lossless compression. It was developed by Google. As we can see <a href="http://caniuse.com/webp">on this page</a>, today only Google Chrome (+ Android) and Opera support this type of image. But that is not a problem: we can show webp images in these browsers and show png, jpg or gif images in other browsers.</p>
<h1 id="webp-and-ruby-on-rails">WebP and Ruby on Rails</h1>
<p>After the release of the webp library I wrote a webp gem - <a href="http://leopard.in.ua/webp-ffi/">webp-ffi</a>. You can use this gem to work with webp images in Ruby. Two good Ruby gems were released later - <a href="https://github.com/kavu/sprockets-webp">sprockets-webp</a> and <a href="https://github.com/kavu/carrierwave-webp">carrierwave-webp</a>. Sprockets-webp provides a Rails Asset Pipeline hook for converting PNG and JPEG assets to the WebP format. First, just add this gem to the Gemfile:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">gem</span> <span class="s1">'sprockets-webp'</span></code></pre></figure>
<p>And run:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>bundle</code></pre></figure>
<p>Put some PNGs and JPGs into “app/assets/images” and you can test converter locally with the Rake task:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>bundle <span class="nb">exec </span>rake assets:precompile <span class="nv">RAILS_ENV</span><span class="o">=</span>production</code></pre></figure>
<p>Now a webp version will be created for each image:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">ls </span>public/assets | <span class="nb">grep </span>webp
app_view-c1c11c4587eb1e6583df7825b76354eb.png.webp
bg-51830e3641c265cb246d752c57df3c20.jpg.webp
favicon-c594206ef399e642bd7024986158976c.png.webp
feature01-62e93a154b58089fe25e63a4e5087cf2.png.webp
feature02-24ee54e80423624080977fc17828bf23.png.webp
feature03-67de13538e75226fcdb1e4dd15d258bb.png.webp
feature04-28c1d1d55b982341f371ef519df06d36.png.webp
feature05-54a3ca87cd4ea2b4d8c6476181940b95.png.webp
feature06-7a7881a2640f8b879ec4defeb07e7d9b.png.webp
feature07-4b206a44dbf7181a1f653b157b8183de.png.webp
feature08-800ee165d1e6e854f55d92282a2df09d.png.webp
icon01-93581fe0eeaab8d135282a15f5ef8e3f.png.webp
icon02-e8efdd17d3f1f5d13f1d878240d55970.png.webp
icon03-1f3e94e02160b9fbeb2036490973150d.png.webp
logo-e424b8ca2552edb04331faf0da7f213c.png.webp
signin_block-3af7fa120ca9a6ea11d7bf63c1ec062b.png.webp
toggle-ae225a8eda983bb3344c2f496749cb3e.png.webp</code></pre></figure>
<p>If you want to convert images uploaded by users in your application, you can use carrierwave-webp (of course, only if you use the <a href="https://github.com/carrierwaveuploader/carrierwave">carrierwave</a> gem for uploads).</p>
<h1 id="nginx-with-webp">Nginx with WebP</h1>
<p>We have to show webp images only in browsers that support this format. For this purpose we will use the <a href="http://nginx.org/">Nginx</a> web server. Chrome and Opera advertise image/webp in their Accept header for all image requests. Now we have to configure our Nginx server to choose the right file automatically.</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">location ~ ^/<span class="o">(</span>assets<span class="o">)</span>/ <span class="o">{</span>
<span class="c"># check Accept header for webp, check if .webp is on disk</span>
<span class="k">if</span> <span class="o">(</span><span class="nv">$http_accept</span> ~<span class="k">*</span> <span class="s2">"webp"</span><span class="o">)</span> <span class="o">{</span> <span class="nb">set</span> <span class="nv">$webp</span> <span class="s2">"true"</span><span class="p">;</span> <span class="o">}</span>
<span class="k">if</span> <span class="o">(</span><span class="nt">-f</span> <span class="nv">$request_filename</span>.webp<span class="o">)</span> <span class="o">{</span> <span class="nb">set</span> <span class="nv">$webp</span> <span class="s2">"</span><span class="k">${</span><span class="nv">webp</span><span class="k">}</span><span class="s2">-local"</span><span class="p">;</span> <span class="o">}</span>
<span class="k">if</span> <span class="o">(</span><span class="nv">$webp</span> <span class="o">=</span> <span class="s2">"true-local"</span><span class="o">)</span> <span class="o">{</span>
add_header Vary Accept<span class="p">;</span>
access_log off<span class="p">;</span>
expires 30d<span class="p">;</span>
rewrite <span class="o">(</span>.<span class="k">*</span><span class="o">)</span> <span class="nv">$1</span>.webp <span class="nb">break</span><span class="p">;</span>
<span class="o">}</span>
root /some/folder/current/public<span class="p">;</span>
expires max<span class="p">;</span>
add_header Cache-Control public<span class="p">;</span>
access_log off<span class="p">;</span>
gzip_static on<span class="p">;</span>
gzip_proxied any<span class="p">;</span>
<span class="nb">break</span><span class="p">;</span>
<span class="o">}</span></code></pre></figure>
<p>First, we check whether the Accept header advertises WebP. Then we check whether there is a corresponding file with a .webp extension on disk. If both conditions match, we serve the WebP asset and add the “Vary: Accept” header.</p>
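<p>The same decision can be written down as a small plain Ruby sketch. This is only an illustration of the two checks, not code that Nginx or Rails runs; the method name <code class="language-plaintext highlighter-rouge">asset_path_for</code>, its parameters and the sample file names are hypothetical.</p>

```ruby
# Hypothetical illustration of the Nginx rules above: serve "<file>.webp"
# only when the client advertises webp AND the converted file exists.
def asset_path_for(request_path, accept_header, files_on_disk)
  # Like the case-insensitive "~*" match against $http_accept.
  accepts_webp = accept_header.to_s.downcase.include?('webp')

  # Like the "-f $request_filename.webp" check.
  webp_path = "#{request_path}.webp"
  webp_on_disk = files_on_disk.include?(webp_path)

  if accepts_webp && webp_on_disk
    { path: webp_path, headers: { 'Vary' => 'Accept' } }
  else
    { path: request_path, headers: {} }
  end
end

files = ['/assets/logo.png', '/assets/logo.png.webp', '/assets/old.png']

p asset_path_for('/assets/logo.png', 'image/webp,*/*;q=0.8', files)
p asset_path_for('/assets/logo.png', 'image/png,*/*;q=0.8', files)
p asset_path_for('/assets/old.png', 'image/webp,*/*;q=0.8', files)
```

<p>Only the first call returns the “.webp” path: the second client does not advertise webp, and the third image has no converted copy on disk.</p>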
<h1 id="results">Results</h1>
<p>Now we can see results of our work. First we check the speed of loading of our assets in Firefox (<a href="/assets/images/rails/webp1.png">full image</a>):</p>
<p><a href="/assets/images/rails/webp2.png"><img src="/assets/images/rails/webp2.png" alt="Firefox Webp" title="Firefox Webp" class="aligncenter" /></a></p>
<p>As we can see, one of the big images has a size of 145.07 Kb. Now let’s check the result in Chrome (<a href="/assets/images/rails/webp3.png">full image</a>):</p>
<p><a href="/assets/images/rails/webp4.png"><img src="/assets/images/rails/webp4.png" alt="Chrome Webp" title="Chrome Webp" class="aligncenter" /></a></p>
<p>As we can see, the image with a size of 145.07 Kb was converted into a webp image with a size of 17.2 Kb. The other images are also smaller than their png or jpeg versions. By the way, the visual quality of the images has not become worse.</p>
<p><a href="/assets/images/rails/webp5.png"><img src="/assets/images/rails/webp5.png" alt="Webp" title="Webp" class="aligncenter" /></a></p>
<p>As a result, we reduced the load time of the Rails application by more than a third: the average response time of the page was 800ms before and is 500ms now.</p>
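<p>For readers who want to turn the numbers quoted above into ratios, here is a small arithmetic sketch. The helper names <code class="language-plaintext highlighter-rouge">percent_saved</code> and <code class="language-plaintext highlighter-rouge">times_smaller</code> are hypothetical; the input values are the ones from the screenshots and timings in this article.</p>

```ruby
# Hypothetical helpers: plain arithmetic on the figures quoted in the article.
def percent_saved(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

def times_smaller(before, after)
  (before / after.to_f).round(1)
end

# Image size in Kb: png in Firefox vs webp in Chrome.
puts percent_saved(145.07, 17.2)  # 88.1
puts times_smaller(145.07, 17.2)  # 8.4

# Average page response time in ms: before vs after.
puts percent_saved(800, 500)      # 37.5
puts times_smaller(800, 500)      # 1.6
```

<p>So the single image shrinks by about 88%, while the page as a whole gets about 1.6 times faster, because images are only one part of the total load time.</p>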
<h1 id="summary">Summary</h1>
<p>As we can see, by using webp images we have sped up the loading of our application in the Chrome and Opera browsers. Hopefully support for this image format will also appear in Firefox (and maybe in IE).</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
World of the NoSQL databases2013-11-08T00:00:00+00:00http://leopard.in.ua/2013/11/08/nosql-world<p>Hello my dear friends. In this article I will talk about NoSQL databases.</p>
<h1 id="relational-database-management-system-rdbms">Relational database management system (RDBMS)</h1>
<p>The relational database model focuses on the organization of the data in the form of two-dimensional tables. Each relational table is a two-dimensional array that has the following properties:</p>
<ul>
<li>each element of the table is a data element;</li>
<li>all cells in a column are homogeneous: all elements in the column are of the same type (numeric, character, etc.);</li>
<li>each column has a unique name;</li>
<li>there are no identical rows in the table;</li>
<li>the order of the rows and columns can be arbitrary.</li>
</ul>
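<p>The properties listed above can be checked mechanically. Below is a minimal Ruby sketch that models a table as an array of column names plus an array of rows; the method name <code class="language-plaintext highlighter-rouge">relational_table?</code> and the sample data are hypothetical and serve only as an illustration.</p>

```ruby
# Hypothetical illustration: checks three of the properties listed above
# for a table given as column names and an array of rows (arrays of values).
def relational_table?(columns, rows)
  # Each column has a unique name.
  unique_names = columns.uniq.size == columns.size

  # There are no identical rows.
  no_duplicate_rows = rows.uniq.size == rows.size

  # All values in one column are of the same type.
  homogeneous = columns.each_index.all? do |i|
    rows.map { |row| row[i].class }.uniq.size <= 1
  end

  unique_names && no_duplicate_rows && homogeneous
end

columns = %w[id name]
puts relational_table?(columns, [[1, 'Ann'], [2, 'Bob']])   # true
puts relational_table?(columns, [[1, 'Ann'], [1, 'Ann']])   # false
puts relational_table?(columns, [[1, 'Ann'], ['2', 'Bob']]) # false
```

<p>A real RDBMS enforces such rules through its type system and constraints; the sketch only restates the definition in code.</p>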
<p><a href="http://en.wikipedia.org/wiki/Edgar_F._Codd">Edgar Codd</a> is the author of the ‘relational’ concept.</p>
<p><a href="/assets/images/nosql/Edgar_F_Codd.jpg"><img src="/assets/images/nosql/Edgar_F_Codd.jpg" alt="Edgar_F_Codd" title="Edgar_F_Codd" class="aligncenter" /></a></p>
<p>The architecture of the relational model dates back to the 1970s. The main task of a database at that time was to support the massive shift from paper records to computerized business activities that was launched in 1960. A huge amount of information from paper documents was transferred into the databases of accounting systems, which had to store all incoming information securely and, when necessary, find it quickly. These requirements led to the architectural features of an RDBMS that have remained virtually unchanged until now: row-based data storage, indexing of records and logging of operations.</p>
<p>Examples of databases:</p>
<ul>
<li><a href="http://www.mysql.com/">MySQL</a></li>
<li><a href="http://www.postgresql.org/">PostgreSQL</a></li>
<li><a href="http://www-01.ibm.com/software/data/db2/">DB2</a></li>
<li><a href="https://www.microsoft.com/en-us/sqlserver/default.aspx">SQL Server</a></li>
</ul>
<h1 id="nosql">NoSQL</h1>
<p>NoSQL (not only SQL) is a number of approaches and projects aimed at implementing database models that differ significantly from those used in traditional relational database management systems, where data is accessed with SQL. In NoSQL, the data schema can be described using different data structures: hash tables, arrays, trees, etc.</p>
<p>The term “NoSQL” was first used in the late 90’s, but it acquired its current meaning only in mid-2009. Originally, it was the name of an open-source database created by Carlo Strozzi, which stored all data as ASCII files and used shell scripts instead of SQL to access the data. This database had nothing in common with “NoSQL” in its current form.</p>
<p><a href="/assets/images/nosql/carlo-strozzi.jpg"><img src="/assets/images/nosql/carlo-strozzi.jpg" alt="carlo-strozzi" title="carlo-strozzi" class="aligncenter" /></a></p>
<p>In June 2009 in San Francisco, Johan Oskarsson organized a meeting to discuss new technologies for storing and processing data. The main stimulus for the meeting was new products such as BigTable and Dynamo. The meeting needed a brief term to use as a Twitter hashtag, and “NoSQL” was suggested by Eric Evans from RackSpace. The term was meant to be used only for this meeting and did not have a deep meaning. But it spread across the worldwide network like viral advertising and became the de facto name of a trend in IT.</p>
<p>The term “NoSQL” has an entirely informal origin and has no universally accepted definition or scientific institution behind it. The name rather describes a direction in which IT is moving away from relational databases. Pramod J. Sadalage and Martin Fowler tried to group and organize knowledge about the NoSQL world in the book <a href="http://www.amazon.com/NoSQL-Distilled-Emerging-Polyglot-Persistence/dp/0321826620">“NoSQL Distilled”</a>.</p>
<p>Now there are about 150 kinds of NoSQL databases (<a href="http://nosql-database.org/">nosql-database.org</a>). Let’s consider the main development directions of NoSQL.</p>
<h2 id="wide-column-store--column-families">Wide Column Store / Column Families</h2>
<p>A column-oriented DBMS is a database management system that stores data tables as sections of columns of data rather than as rows of data. Physically, a table is a collection of columns, each of which is essentially a table with a single field. These databases are generally used for analytic systems, business intelligence and analytical data warehouses.</p>
<p>Advantages:</p>
<ul>
<li>Data can be compressed significantly, because the values in a single column of a table are usually of the same type;</li>
<li>Query performance on cheap, low-powered hardware can improve by 5, 10 and sometimes even 100 times; in addition, thanks to compression, the data takes 5-10 times less disk space than in a traditional RDBMS.</li>
</ul>
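<p>To illustrate why column storage compresses well, here is a minimal Ruby sketch that turns rows into columns and applies run-length encoding to one of them. It is an assumption-laden toy: the method names <code class="language-plaintext highlighter-rouge">to_columns</code> and <code class="language-plaintext highlighter-rouge">run_length_encode</code> and the sample data are hypothetical, and real column stores use more sophisticated encodings.</p>

```ruby
# Hypothetical toy example: row-oriented data converted to a column layout.
rows = [
  { id: 1, country: 'UA', status: 'active' },
  { id: 2, country: 'UA', status: 'active' },
  { id: 3, country: 'UA', status: 'blocked' },
  { id: 4, country: 'PL', status: 'blocked' }
]

# Collect the values of every field into its own array (one array per column).
def to_columns(rows)
  rows.each_with_object({}) do |row, columns|
    row.each { |name, value| (columns[name] ||= []) << value }
  end
end

# Run-length encoding: store each run of equal neighbours as [value, count].
def run_length_encode(values)
  values.chunk_while { |a, b| a == b }.map { |run| [run.first, run.size] }
end

columns = to_columns(rows)
p columns[:country]                     # ["UA", "UA", "UA", "PL"]
p run_length_encode(columns[:country])  # [["UA", 3], ["PL", 1]]
```

<p>Values of the same type and with many repetitions sit next to each other in a column, so even this simple encoding shortens them; in a row layout the same values are interleaved with other fields.</p>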
<p>Disadvantages:</p>
<ul>
<li>In general there are no transactions;</li>
<li>There are a number of limitations for a developer who is used to a traditional RDBMS.</li>
</ul>
<p>Examples of databases:</p>
<ul>
<li><a href="http://hbase.apache.org/">HBase</a></li>
<li><a href="http://cassandra.apache.org/">Cassandra</a></li>
<li><a href="http://accumulo.apache.org/">Accumulo</a></li>
<li><a href="http://aws.amazon.com/simpledb/">Amazon SimpleDB</a></li>
</ul>
<h2 id="key-value--tuple-store">Key Value / Tuple Store</h2>
<p>Such a database lets you store key/value pairs in a persistent store and later read the values back by key. What is the benefit of a solution that looks so limited at first glance? Saving and reading values by key is extremely efficient, because there are no heavy layers of SQL processing, indexing, profiling, vacuuming (as in PostgreSQL), etc. Such a solution offers high performance and a low cost of implementation and scaling.</p>
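<p>The whole interface of such a store fits in a few methods. The sketch below is a toy, assuming a Hash persisted to a file with <code>Marshal</code>; the <code>TinyKV</code> class and the file name are hypothetical and have nothing to do with the real products listed later.</p>

```ruby
require "tmpdir"

# A tiny persistent key/value store: a Hash that is written to disk with
# Marshal after every change. No query language, no indexes, no schema.
class TinyKV
  def initialize(path)
    @path = path
    @data = File.exist?(path) ? Marshal.load(File.binread(path)) : {}
  end

  def get(key)
    @data[key]
  end

  def put(key, value)
    @data[key] = value
    persist
    value
  end

  def delete(key)
    removed = @data.delete(key)
    persist
    removed
  end

  private

  def persist
    File.binwrite(@path, Marshal.dump(@data))
  end
end

path = File.join(Dir.mktmpdir, "tiny_kv.bin")
store = TinyKV.new(path)
store.put("user:1", "Alexey")

reopened = TinyKV.new(path)   # a fresh instance reads the same file
puts reopened.get("user:1")   # Alexey
```

<p>Rewriting the whole file on every change is of course not what a production store does; the point is only how small the read/write path is.</p>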
<p>Advantages:</p>
<ul>
<li>No RDBMS overhead: an RDBMS can be too slow because of its heavy layer of SQL cursors;</li>
<li>An RDBMS is too costly a solution for storing small amounts of data;</li>
<li>There is no need for SQL queries, indexes, triggers, stored procedures, temporary tables, forms, views, etc;</li>
<li>A key/value database is easily scalable and fast due to its light weight.</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>The constraints of a relational database ensure data integrity at the lowest level. Key/value stores have no such constraints, so data integrity is controlled by the application and may be compromised by errors in the application code;</li>
<li>In an RDBMS with a well-designed model, the database contains a logical structure that fully reflects the structure of the stored data and can differ from the structure of the application (the data is independent of the application). With a key/value store this is harder to achieve.</li>
</ul>
<p>Examples of databases:</p>
<ul>
<li><a href="http://aws.amazon.com/dynamodb/">Amazon DynamoDB</a></li>
<li><a href="http://docs.basho.com/riak/latest/">Riak</a></li>
<li><a href="http://redis.io/">Redis</a></li>
<li><a href="https://code.google.com/p/leveldb/">LevelDB</a></li>
<li><a href="https://code.google.com/p/scalaris/">Scalaris</a></li>
<li><a href="http://memcachedb.org/">MemcacheDB</a></li>
<li><a href="http://fallabs.com/kyotocabinet/">Kyoto Cabinet</a></li>
</ul>
<h2 id="document-store">Document Store</h2>
<p>Document stores are programs designed to store, search, and manage document-oriented information (semi-structured data). The central concept is a document. Implementations differ, but in general they encapsulate and encode the data (documents) in one of several standard formats: XML, YAML, JSON, BSON, PDF, etc.</p>
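<p>A rough idea of what “semi-structured” means in practice can be shown with plain JSON and Ruby. This is a sketch only: the documents and the helpers <code>find</code> and <code>find_by_member</code> are invented for the example and do not mirror the query API of any database listed below.</p>

```ruby
require "json"

# Each document is a self-describing JSON object; documents in the same
# collection do not have to share the same fields.
collection = [
  '{"_id": 1, "title": "NoSQL overview", "tags": ["nosql", "databases"]}',
  '{"_id": 2, "title": "Rails 4 upgrade", "author": {"name": "Alexey"}}'
].map { |raw| JSON.parse(raw) }

# Return documents whose top-level fields match every key/value in criteria.
def find(collection, criteria)
  collection.select do |document|
    criteria.all? { |field, expected| document[field] == expected }
  end
end

# Return documents whose array field contains the given value.
def find_by_member(collection, field, value)
  collection.select { |document| Array(document[field]).include?(value) }
end

puts find(collection, { "_id" => 2 }).first["author"]["name"]      # Alexey
puts find_by_member(collection, "tags", "nosql").first["title"]    # NoSQL overview
```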
<p>Advantages:</p>
<ul>
<li>A sufficiently flexible query language;</li>
<li>Easy horizontal scaling.</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Atomicity in most cases is conditional.</li>
</ul>
<p>Examples of such databases:</p>
<ul>
<li><a href="http://www.mongodb.org/">MongoDB</a></li>
<li><a href="http://www.couchbase.com/">Couchbase</a></li>
<li><a href="http://couchdb.apache.org/">CouchDB</a></li>
<li><a href="http://www.rethinkdb.com/">RethinkDB</a></li>
</ul>
<h2 id="graph-databases">Graph Databases</h2>
<p>A graph database is a database that uses graph structures with nodes, edges, and properties to represent and store data. By definition, a graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent elements, so index lookups are not necessary.</p>
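<p>Index-free adjacency can be sketched in a few lines of Ruby: every node holds references to its neighbours, and a traversal only follows those references. The <code>Node</code> class and <code>reachable_within</code> helper are hypothetical names for this illustration, not part of any graph database API.</p>

```ruby
# Each node keeps direct references to its neighbours (index-free adjacency):
# following an edge is a pointer dereference, not a lookup in a global index.
class Node
  attr_reader :name, :properties, :neighbours

  def initialize(name, properties = {})
    @name = name
    @properties = properties
    @neighbours = []
  end

  def connect(other)
    neighbours << other
    other.neighbours << self   # undirected edge
    self
  end
end

# Breadth-first search that only walks neighbour references.
def reachable_within(start, max_hops)
  seen = { start => 0 }
  queue = [start]
  until queue.empty?
    node = queue.shift
    next if seen[node] == max_hops
    node.neighbours.each do |neighbour|
      next if seen.key?(neighbour)
      seen[neighbour] = seen[node] + 1
      queue << neighbour
    end
  end
  seen.keys.reject { |node| node.equal?(start) }.map(&:name)
end

alice, bob, carol, dave = %w[alice bob carol dave].map { |name| Node.new(name) }
alice.connect(bob)
bob.connect(carol)
carol.connect(dave)

p reachable_within(alice, 2)   # ["bob", "carol"]
```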
<p>Advantages:</p>
<ul>
<li>Often faster for associative data sets;</li>
<li>Can scale more naturally to large data sets as they do not typically require expensive join operations.</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>RDBMS can be used in more general cases. Graph databases are suitable for graph-like data.</li>
</ul>
<p>Examples of such databases:</p>
<ul>
<li><a href="http://www.neo4j.org/">Neo4j</a></li>
<li><a href="https://github.com/twitter/flockdb">FlockDB</a></li>
<li><a href="http://infogrid.org/trac/">InfoGrid</a></li>
<li><a href="http://www.orientdb.org/">OrientDB</a></li>
</ul>
<h2 id="multimodel-databases">Multimodel Databases</h2>
<p>These databases include features of multiple database types.</p>
<p>There are two different groups of products that can be considered as multi-model:</p>
<ul>
<li>Multimodel databases that have been developed specifically to support multiple data models and use cases;</li>
</ul>
<p>For example: ArangoDB, which promises the benefits of a key-value store, a document store and a graph database in one product.</p>
<ul>
<li>General-purpose database with support for multiple model variants.</li>
</ul>
<p>For example: Oracle MySQL 5.6, which can support SQL-like access and key-value access via the Memcached API.</p>
<p>Examples of such databases:</p>
<ul>
<li><a href="http://www.arangodb.org/">ArangoDB</a></li>
<li><a href="http://www.aerospike.com/">Aerospike</a></li>
<li><a href="http://www.datomic.com/">Datomic</a></li>
</ul>
<h2 id="object-databases">Object Databases</h2>
<p>An object database is a database in which data is modeled as objects with their attributes, methods, and classes. Object-oriented databases are usually recommended for applications that require high-performance processing of data with a complex structure.</p>
<p>Advantages:</p>
<ul>
<li>An object model represents the real world better than relational tuples do. This is especially true for complex and multi-faceted objects;</li>
<li>Data with hierarchical characteristics is organized naturally;</li>
<li>A separate query language is not required, because data is accessed directly as objects. Queries can still be used, however.</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>In an RDBMS, schema changes caused by creating, modifying or deleting tables usually do not depend on the application. In applications that work with an object database, changing the schema of a class usually means that other application classes associated with it must change too. This can lead to corrections across the whole system;</li>
<li>An object database is usually tied to a particular language with its own API, and the data is available only through that API. An RDBMS has a great advantage here thanks to a common query language.</li>
</ul>
<p>Examples of such databases:</p>
<ul>
<li><a href="http://www.velocitydb.com/">VelocityDB</a></li>
<li><a href="http://www.objectivity.com/">Objectivity</a></li>
<li><a href="http://www.zodb.org/en/latest/">ZODB</a></li>
<li><a href="http://siaqodb.com/">Siaqodb</a></li>
<li><a href="http://www.eyedb.org/">EyeDB</a></li>
</ul>
<h2 id="multidimensional-databases">Multidimensional Databases</h2>
<p>Databases optimized for online analytical processing (OLAP). They can receive data from a variety of relational databases and structure the information into categories and sections, so that each value is addressed by its coordinates along several dimensions.</p>
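<p>The idea of addressing a value by coordinates can be sketched with a pre-aggregated Hash. This is an assumption-laden toy: the sales facts and the <code>build_cube</code> helper are made up for the example, and real multidimensional databases use far more elaborate storage.</p>

```ruby
# Facts as they might arrive from a relational source: one row per sale.
sales = [
  { year: 2012, region: "EU", product: "book", amount: 10 },
  { year: 2012, region: "US", product: "book", amount: 20 },
  { year: 2013, region: "EU", product: "book", amount: 30 },
  { year: 2013, region: "EU", product: "pen",  amount: 5 }
]

# Pre-aggregate a measure along the chosen dimensions. Each cell of the
# resulting "cube" is addressed by its coordinates, e.g. [2013, "EU"].
def build_cube(facts, dimensions, measure)
  facts.each_with_object(Hash.new(0)) do |fact, cube|
    cube[dimensions.map { |dimension| fact[dimension] }] += fact[measure]
  end
end

cube = build_cube(sales, [:year, :region], :amount)
puts cube[[2013, "EU"]]   # 35
puts cube[[2012, "US"]]   # 20
```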
<p>Examples of such databases:</p>
<ul>
<li><a href="http://globalsdb.org/">GlobalsDB</a></li>
<li><a href="http://www.intersystems.com/cache/index.html">Intersystems Cache</a></li>
<li><a href="http://scidb.org/">SciDB</a></li>
<li><a href="http://www.rasdaman.org/">Rasdaman</a></li>
</ul>
<h2 id="multivalue-databases">Multivalue Databases</h2>
<p>A variety of multidimensional database. The main feature is support for attributes that can store a list of values.</p>
<p>Examples of such databases:</p>
<ul>
<li><a href="http://www.rocketsoftware.com/brand/rocket-u2">Rocket U2</a></li>
<li><a href="http://www.revsoft.co.uk/openinsight.aspx">OpenInsight</a></li>
<li><a href="http://www.northgate-reality.com/">Reality</a></li>
</ul>
<h1 id="summary">Summary</h1>
<p>The NoSQL movement is gaining popularity at enormous speed. However, this does not mean that relational databases are becoming a relic or something archaic. Most likely they will still be used actively, but more and more in symbiosis with NoSQL databases. We live in an era of polyglot persistence, an era of using different data stores for different needs. Relational databases no longer have a monopoly as the only possible source of data. Increasingly, architects choose storage based on the nature of the data itself, how it will be manipulated and what volumes are expected.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
How I update project from Rails 3 to Rails 42013-11-01T00:00:00+00:00http://leopard.in.ua/2013/11/01/update-from-rails3-to-rails4<p>Hello my dear friends.</p>
<p>Working on large projects often requires paying off technical debt: updating libraries, refactoring code, etc. In my case I had to move a project from Rails 3 to Rails 4. In this article I will describe how I did it.</p>
<h2 id="step-0-tests">Step 0: Tests!!!</h2>
<p>First of all, your project must have high test coverage. In my project 93.52% of the code is covered by tests. Yes, it is hard. But without tests, updating libraries or refactoring code is like playing “Russian roulette”: maybe you will win, maybe not. So before upgrading to Rails 4, check the code coverage of your project (<a href="https://github.com/colszowka/simplecov">simplecov</a> is a very good library for this) and cover the missing parts with tests.</p>
<h2 id="step-1-ruby-193-or-200">Step 1: Ruby 1.9.3 or 2.0.0</h2>
<p>Rails 4 requires Ruby 1.9.3 and recommends Ruby 2.0.0. Attempting to run Rails 4 with a Ruby version below 1.9.3 will cause syntax errors or runtime issues. So first of all you should migrate your project to a supported Ruby version. We were already using the latest Ruby 2.0.0 in this project, so we did not have this problem :)</p>
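<p>If you want to check the requirement in a script (for example on a CI box), a small sketch could look like this. The <code>ruby_ok_for_rails4?</code> helper and the <code>MINIMUM_RUBY</code> constant are hypothetical names; <code>Gem::Version</code> comes with RubyGems.</p>

```ruby
# Compare versions numerically; a plain string comparison would get
# "1.10.0" vs "1.9.3" wrong.
MINIMUM_RUBY = Gem::Version.new("1.9.3")

def ruby_ok_for_rails4?(version = RUBY_VERSION)
  Gem::Version.new(version) >= MINIMUM_RUBY
end

puts ruby_ok_for_rails4?("1.8.7")   # false
puts ruby_ok_for_rails4?("2.0.0")   # true
```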
<h2 id="step-2-strong_parameters">Step 2: Strong_Parameters</h2>
<p>The first huge change in Rails 4 is <a href="https://github.com/rails/strong_parameters">strong_parameters</a>. I began by removing whitelist_attributes and migrating to strong_parameters. This requires changing some settings in the project. First of all, add strong_parameters to the Gemfile:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">gem</span> <span class="s2">"strong_parameters"</span></code></pre></figure>
<p>The next change is the whitelist_attributes setting in “config/application.rb”:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">config</span><span class="p">.</span><span class="nf">active_record</span><span class="p">.</span><span class="nf">whitelist_attributes</span> <span class="o">=</span> <span class="kp">false</span></code></pre></figure>
<p>Then add a “strong_parameters.rb” file to “config/initializers” with this content:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="no">ActiveRecord</span><span class="o">::</span><span class="no">Base</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="ss">:include</span><span class="p">,</span> <span class="no">ActiveModel</span><span class="o">::</span><span class="no">ForbiddenAttributesProtection</span><span class="p">)</span></code></pre></figure>
<p>After this change you have to remove “attr_accessible” from the models and use “strong_parameters” in the controllers. Use the tests to verify that the migration to strong_parameters went without problems (I ran the tests at each step).</p>
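<p>The idea behind this step is that the controller, not the model, decides which request keys may reach mass assignment. In a Rails controller this is written as <code>params.require(:user).permit(:name, :email)</code>. The plain-Ruby sketch below only models that filtering so it can run without Rails; the <code>permit_only</code> helper and the sample params are hypothetical.</p>

```ruby
# Plain-Ruby model of attribute whitelisting: only the listed keys survive.
def permit_only(params, *allowed)
  allowed_keys = allowed.map(&:to_s)
  params.select { |key, _value| allowed_keys.include?(key.to_s) }
end

# What a malicious form submission might contain.
request_params = { "name" => "Alexey", "email" => "a@example.com", "admin" => "true" }

safe = permit_only(request_params, :name, :email)
p safe.keys   # ["name", "email"]
```

<p>The “admin” key never reaches the model, which is exactly what “attr_accessible” used to guarantee on the model side.</p>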
<h2 id="step-3-routes">Step 3: Routes</h2>
<p>In Rails 4 the “match” method must specify the HTTP verbs explicitly, so use it only for routes that should accept several types of requests (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">match</span> <span class="s1">'subscriptions/:subscription_id'</span> <span class="o">=></span> <span class="s1">'news#create'</span><span class="p">,</span> <span class="ss">via: </span><span class="p">[</span><span class="ss">:post</span><span class="p">,</span> <span class="ss">:put</span><span class="p">,</span> <span class="ss">:patch</span><span class="p">]</span></code></pre></figure>
<p>But if you have something like this:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">match</span> <span class="s1">'/terms'</span> <span class="o">=></span> <span class="s1">'site#terms'</span></code></pre></figure>
<p>you have to change it to the right type of request:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">get</span> <span class="s1">'/terms'</span> <span class="o">=></span> <span class="s1">'site#terms'</span></code></pre></figure>
<p>Again, check all changes by tests :)</p>
<h2 id="step-4-gems">Step 4: Gems</h2>
<p>Migrating a Rails 3 app to Rails 4 requires gem updates. Some gems in your Gemfile can be updated to a new version that works with both Rails 3 and Rails 4. For example, I started with the devise gem. First I read the changelog of the gem to understand the differences between versions and know what to change during migration. Then I checked everything with tests (and manually, since tests do not guarantee 100% success) and picked the next gem to migrate.</p>
<p>The “rails4_upgrade” gem helps to automate some of the steps required to upgrade to Rails 4. It adds a rake task that checks for outdated gem versions:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>bundle <span class="nb">exec </span>rake rails4:check_gems</code></pre></figure>
<p>Some gems have dropped support for Rails 3 and support only Rails 4. In this case it is better to read in the changelog what has changed in the gem and write migration notes, in case you have to do some work with this gem after the update to Rails 4.</p>
<h2 id="step-5-change-rails-version-in-gemfile">Step 5: Change Rails version in Gemfile</h2>
<p>Open Gemfile in a text editor and change the line that starts with gem ‘rails’ to:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">gem</span> <span class="s1">'rails'</span><span class="p">,</span> <span class="s1">'~> 4.0.1.rc3'</span></code></pre></figure>
<p>Why did I use an RC? Because 4.0.0 has some bugs that broke my app (my main problem was fixed by <a href="https://github.com/rails/rails/pull/11496">this pull request</a>).</p>
<p>Rails 4 also depends on newer versions of gems that drive the asset pipeline (assets group must be removed):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">gem</span> <span class="s1">'sass-rails'</span><span class="p">,</span> <span class="s1">'~> 4.0.0'</span>
<span class="n">gem</span> <span class="s1">'coffee-rails'</span><span class="p">,</span> <span class="s1">'~> 4.0.0'</span>
<span class="n">gem</span> <span class="s1">'uglifier'</span><span class="p">,</span> <span class="s1">'>= 1.3.0'</span></code></pre></figure>
<p>Rails 4 moves many features that were previously shipped with Rails itself into gems. You can find the list of these gems on <a href="http://www.andylindeman.com/2013/03/05/gems-extracted-in-rails-4.html">this page</a>. In my case I needed:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1"># used cache_page</span>
<span class="n">gem</span> <span class="s1">'actionpack-page_caching'</span><span class="p">,</span> <span class="s1">'1.0.0'</span>
<span class="c1"># xml requests from some services</span>
<span class="n">gem</span> <span class="s1">'actionpack-xml_parser'</span><span class="p">,</span> <span class="s1">'1.0.0'</span></code></pre></figure>
<p>After “bundle update rails” you should update the files of your Rails app. <a href="http://railsdiff.org/">railsdiff.org</a> will help you a lot with this. In my project I had to use <a href="http://railsdiff.org/html/v3.2.15-v4.0.1.rc3.html">this diff</a>.</p>
<p>Finally, I continued by updating the gems that work only with Rails 4.</p>
<h2 id="step-6-fix-fallen-tests">Step 6: Fix failing tests</h2>
<p>No comments, just check your code and tests.</p>
<h1 id="summary">Summary</h1>
<p>As you can see, the migration of even a big Rails project is not so complicated if you follow a clear migration plan. There is an open source book, <a href="http://www.upgradingtorails4.com/">upgradingtorails4</a>, which can be very helpful (I did this migration before the book was published).</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Binary serialization formats2013-10-13T00:00:00+00:00http://leopard.in.ua/2013/10/13/binary-serialization-formats<p>Hello my dear friends. Today we will talk about binary serialization formats.</p>
<h1 id="binary-serialization-formats">Binary serialization formats</h1>
<p>Serialization is the process of translating data structures or object state into a format that can be stored and resurrected later in the same or another computer environment. In most cases binary serialization formats are not human-readable, but they can encode the data compactly, which is very useful for caches, inter-process communication, message brokers, etc. For my development tasks it is very important to select a good binary serialization format for communication and storage inside a distributed system. Let’s look at the most interesting of these formats.</p>
<h2 id="bson">BSON</h2>
<ul>
<li><a href="http://bsonspec.org/">Site</a></li>
</ul>
<p>BSON is a computer data interchange format used mainly as a data storage and network transfer format in the MongoDB database. It is a binary form for representing simple data structures and associative arrays (called objects or documents in MongoDB). BSON has a huge number of implementations. Compared to JSON, BSON is designed to be efficient both in storage space and scan-speed. The key advantage is its traversability, which makes it suitable for storage purposes, but comes at the cost of over-the-wire encoding size (in some cases, BSON will use more space than JSON due to the length prefixes and explicit array indices).</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">002</span> <span class="o">></span> <span class="n">a</span> <span class="o">=</span> <span class="p">{</span> <span class="ss">key: </span><span class="s2">"example"</span><span class="p">}.</span><span class="nf">to_bson</span>
<span class="o">=></span> <span class="s2">"</span><span class="se">\x16\x00\x00\x00\x02</span><span class="s2">key</span><span class="se">\x00\b\x00\x00\x00</span><span class="s2">example</span><span class="se">\x00\x00</span><span class="s2">"</span>
<span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">003</span> <span class="o">></span> <span class="no">BSON</span><span class="o">::</span><span class="no">Document</span><span class="p">.</span><span class="nf">from_bson</span><span class="p">(</span><span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">a</span><span class="p">))</span>
<span class="o">=></span> <span class="p">{</span><span class="s2">"key"</span><span class="o">=></span><span class="s2">"example"</span><span class="p">}</span></code></pre></figure>
<h2 id="messagepack">MessagePack</h2>
<ul>
<li><a href="http://msgpack.org/">Site</a></li>
</ul>
<p>MessagePack is a binary format for representing simple data structures such as arrays and associative arrays. It aims to be as compact and simple as possible, is designed to be fast, and has a huge number of implementations. Its data model maps transparently onto JSON, even though BSON is better known as the binary JSON equivalent. In my humble opinion, MessagePack is better for network communication and BSON is better for storage.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">002</span> <span class="o">></span> <span class="n">msg</span> <span class="o">=</span> <span class="no">MessagePack</span><span class="p">.</span><span class="nf">pack</span><span class="p">({</span> <span class="ss">key: </span><span class="s2">"example"</span><span class="p">})</span>
<span class="o">=></span> <span class="s2">"</span><span class="se">\x81\xA3</span><span class="s2">key</span><span class="se">\xA7</span><span class="s2">example"</span>
<span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">003</span> <span class="o">></span> <span class="no">MessagePack</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
<span class="o">=></span> <span class="p">{</span><span class="s2">"key"</span><span class="o">=></span><span class="s2">"example"</span><span class="p">}</span></code></pre></figure>
<h2 id="bert">BERT</h2>
<ul>
<li><a href="http://bert-rpc.org/">Site</a></li>
</ul>
<p>BERT originally comes from Erlang. The specification was written by Tom Preston-Werner, a founder of GitHub, and it is used heavily there. BERT has a huge number of implementations. In some simple Ruby benchmarking, I noticed that BERT was an order of magnitude slower at serialization than BSON or MessagePack. This may not be true in other language ports, however.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">002</span> <span class="o">></span> <span class="n">a</span> <span class="o">=</span> <span class="no">BERT</span><span class="p">.</span><span class="nf">encode</span><span class="p">({</span> <span class="ss">key: </span><span class="s2">"example"</span><span class="p">})</span>
<span class="o">=></span> <span class="s2">"</span><span class="se">\x83</span><span class="s2">h</span><span class="se">\x03</span><span class="s2">d</span><span class="se">\x00\x04</span><span class="s2">bertd</span><span class="se">\x00\x04</span><span class="s2">dictl</span><span class="se">\x00\x00\x00\x01</span><span class="s2">h</span><span class="se">\x02</span><span class="s2">d</span><span class="se">\x00\x03</span><span class="s2">keym</span><span class="se">\x00\x00\x00\a</span><span class="s2">examplej"</span>
<span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">003</span> <span class="o">></span> <span class="no">BERT</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
<span class="o">=></span> <span class="p">{</span><span class="ss">:key</span><span class="o">=></span><span class="s2">"example"</span><span class="p">}</span></code></pre></figure>
<h2 id="protocol-buffers">Protocol Buffers</h2>
<ul>
<li><a href="https://code.google.com/p/protobuf/">Site</a></li>
</ul>
<p>Protocol Buffers are a method of serializing structured data. As such, they are useful in developing programs to communicate with each other over a wire or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates from that description source code in various programming languages for generating or parsing a stream of bytes that represents the structured data. Google developed Protocol Buffers for use internally. Does not provide an RPC mechanism but instead focuses on interchange protocols. Widely implemented, though not all are of the same quality/completion.</p>
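<p>The generated code ultimately writes fields in a compact tagged form. As a small illustration of that byte stream, here is a sketch of the publicly documented varint encoding for an integer field. The helpers <code>encode_varint</code> and <code>encode_varint_field</code> are hypothetical and written by hand; real projects use the code generated by the protocol compiler.</p>

```ruby
# Base-128 varint: 7 bits per byte, least significant group first,
# high bit set on every byte except the last.
def encode_varint(number)
  raise ArgumentError, "non-negative integers only" if number.negative?
  bytes = []
  loop do
    low_bits = number & 0x7F
    number >>= 7
    if number.zero?
      bytes << low_bits
      break
    end
    bytes << (low_bits | 0x80)
  end
  bytes.pack("C*")
end

# A field is prefixed with a key: (field_number << 3) | wire_type.
# Wire type 0 means "varint".
def encode_varint_field(field_number, value)
  encode_varint((field_number << 3) | 0) + encode_varint(value)
end

message = encode_varint_field(1, 150)
puts message.unpack("C*").map { |byte| format("%02x", byte) }.join(" ")   # 08 96 01
```

<p>Field names never appear on the wire, only field numbers, which is why both sides need the same description file.</p>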
<h2 id="capn-proto">Cap’n Proto</h2>
<ul>
<li><a href="http://kentonv.github.io/capnproto/">Site</a></li>
</ul>
<p>Cap’n Proto is an insanely fast data interchange format and capability-based RPC system. It is much faster than Protocol Buffers because there is no encoding/decoding step: the Cap’n Proto encoding is appropriate both as a data interchange format and as an in-memory representation. It is developed by Kenton Varda, who was the primary author of Protocol Buffers version 2. At the moment there are C++ and Python implementations; Erlang, Ruby and Rust implementations are under development.</p>
<h2 id="apache-thrift">Apache Thrift</h2>
<ul>
<li><a href="http://thrift.apache.org/">Site</a></li>
</ul>
<p>Thrift is an interface definition language that is used to define and create services for numerous languages. It is used as a remote procedure call (RPC) framework and was developed at Facebook. Thrift’s goal is “to enable efficient and reliable communication across programming languages”. Solving many aspects of cross-platform services, it generates RPC code for clients and servers, providing a compact, deterministic, and versionable interchange protocol.</p>
<h2 id="apache-avro">Apache Avro</h2>
<ul>
<li><a href="http://avro.apache.org/docs/current/">Site</a></li>
</ul>
<p>Apache Avro is a data serialization system. Avro requires schemas when data is written or read. Most interesting is that you can use different schemas for serialization and deserialization, and Avro will handle the missing/extra/modified fields. Providing a schema with binary data allows each datum to be written without per-value overhead. The result is a more compact data encoding and faster data processing. Avro might be a good choice if the rigidity of Protocol Buffers or Thrift is too much for you.</p>
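<p>The handling of missing and extra fields can be modelled in a few lines. This is a simplified sketch of the idea only, assuming records are plain Hashes: the <code>resolve</code> helper and <code>READER_FIELDS</code> are hypothetical and do not use the Avro library or its binary encoding.</p>

```ruby
# Reader schema: the fields this consumer expects, with optional defaults.
READER_FIELDS = [
  { name: "id" },
  { name: "email" },
  { name: "plan", default: "free" }   # added after the data was written
].freeze

# Resolve a record written with an older or newer schema against the
# reader's fields: extra fields are dropped, missing ones take the default.
def resolve(record, reader_fields)
  reader_fields.each_with_object({}) do |field, resolved|
    name = field[:name]
    if record.key?(name)
      resolved[name] = record[name]
    elsif field.key?(:default)
      resolved[name] = field[:default]
    else
      raise KeyError, "field #{name} is missing and has no default"
    end
  end
end

written = { "id" => 7, "email" => "a@example.com", "legacy_flag" => true }
resolved = resolve(written, READER_FIELDS)
p resolved.keys         # ["id", "email", "plan"]
puts resolved["plan"]   # free
```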
<h2 id="blink-protocol">Blink protocol</h2>
<ul>
<li><a href="http://blinkprotocol.org/">Site</a></li>
</ul>
<p>The Blink Protocol is a standardized method for defining how to exchange messages in and between systems. Blink makes it easy for people to define what information to exchange and how. It also eliminates friction in the communications machinery. The philosophy of Blink is that efficient communication follows from making every word tell. At the moment there is a Java implementation. Compared with Blink, Protocol Buffers is relatively more efficient at encoding a message with many absent fields, but decoding Protocol Buffers messages is somewhat more expensive because of its relaxed field ordering. Blink requires fields to be encoded strictly in specification order.</p>
<h2 id="sereal">Sereal</h2>
<ul>
<li><a href="https://github.com/Sereal/Sereal">Site</a></li>
</ul>
<p>Sereal is a binary data serialization format that was written for Perl, but now also has Go, Java, Python, Objective-C and Ruby ports. According to the <a href="https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs">benchmarks</a>, it has good encoding and decoding speed.</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">002</span> <span class="o">></span> <span class="n">a</span> <span class="o">=</span> <span class="no">Sereal</span><span class="p">.</span><span class="nf">encode</span><span class="p">({</span> <span class="ss">key: </span><span class="s2">"example"</span><span class="p">})</span>
<span class="o">=></span> <span class="s2">"=srl</span><span class="se">\x01\x00</span><span class="s2">Qckeygexample"</span>
<span class="mf">2.0</span><span class="o">.</span><span class="mi">0</span><span class="n">p247</span> <span class="p">:</span><span class="mo">003</span> <span class="o">></span> <span class="no">Sereal</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="n">a</span><span class="p">)</span>
<span class="o">=></span> <span class="p">{</span><span class="s2">"key"</span><span class="o">=></span><span class="s2">"example"</span><span class="p">}</span></code></pre></figure>
<h2 id="bencode">Bencode</h2>
<ul>
<li><a href="https://wiki.theory.org/BitTorrentSpecification">Site</a></li>
</ul>
<p>Bencode is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data. Bencoding is most commonly used in torrent files. These metadata files are simply bencoded dictionaries.</p>
<p>While less efficient than a pure binary encoding, bencoding is simple and (because numbers are encoded as text in decimal notation) is unaffected by endianness, which is important for a cross-platform application like BitTorrent. It is also fairly flexible, as long as applications ignore unexpected dictionary keys, so that new ones can be added without creating incompatibilities.</p>
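<p>Because the format is so small, a complete encoder fits on one screen. A minimal sketch, assuming only integers, strings, arrays and hashes as input; the <code>bencode</code> method is written for this example and is not taken from any gem.</p>

```ruby
# Bencode has four types: integers (i42e), byte strings (7:example),
# lists (l...e) and dictionaries (d...e, keys sorted as raw strings).
def bencode(value)
  case value
  when Integer then "i#{value}e"
  when String, Symbol
    text = value.to_s
    "#{text.bytesize}:#{text}"
  when Array then "l#{value.map { |item| bencode(item) }.join}e"
  when Hash
    pairs = value.map { |key, item| [key.to_s, item] }.sort_by(&:first)
    "d#{pairs.map { |key, item| bencode(key) + bencode(item) }.join}e"
  else
    raise ArgumentError, "cannot bencode #{value.class}"
  end
end

puts bencode({ key: "example" })                 # d3:key7:examplee
puts bencode({ "b" => [1, "two"], "a" => -3 })   # d1:ai-3e1:bli1e3:twoee
```

<p>Numbers are written as decimal text, so the output is the same on any platform regardless of byte order.</p>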
<h2 id="gobs">Gobs</h2>
<ul>
<li><a href="http://golang.org/pkg/encoding/gob/">Site</a></li>
</ul>
<p>A binary serialization format specific to the Go language. There is no need for a separate interface definition language or “protocol compiler”: the data structure itself is all the package should need to figure out how to encode and decode it. The main goal: “to transmit a data structure across a network or to store it in a file, it must be encoded and then decoded again”.</p>
<h2 id="abstract-syntax-notation-one-asn1">Abstract Syntax Notation One (ASN.1)</h2>
<ul>
<li><a href="http://www.itu.int/ITU-T/asn1/index.html">Site</a></li>
</ul>
<p>Abstract Syntax Notation One (ASN.1) is a standard and notation that describes rules and structures for representing, encoding, transmitting, and decoding data in telecommunications and computer networking. The formal rules enable representation of objects that are independent of machine-specific encoding techniques. Because of its widespread use, ASN.1 was moved to its own standard, X.208, in 1988. Since 1995 the substantially revised ASN.1 has been described by the X.680 standard.</p>
<h2 id="boost-serialization">Boost Serialization</h2>
<ul>
<li><a href="http://www.boost.org/libs/serialization/">Site</a></li>
</ul>
<p>A binary serialization format specific to the C++ language. The biggest problem is the lack of compatibility between different versions of the library. In my humble opinion, Boost Serialization is great for storing an application’s local data, but not for network communication.</p>
<h1 id="summary">Summary</h1>
<p>As can be seen in the list, there are many implementations of binary serialization formats. Which one to choose depends on you, and of course, the problem that it will solve.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Speed up testing of ruby projects on Travis-CI2013-10-12T00:00:00+00:00http://leopard.in.ua/2013/10/12/speed-testing-on-travis<p>Hello my dear friends. Today we will talk about <a href="https://travis-ci.org/">Travis-CI</a> testing and how to speed up testing for ruby projects.</p>
<h1 id="caching-bundle-directory">Caching .bundle directory</h1>
<p>For many Ruby projects, Travis-CI spends a lot of time preparing a build. The main reason is the installation of the gems required by your project. Eric Barendt and Matias Korhonen found a good solution for this problem: caching all the needed gems in AWS S3. For this purpose they created the <a href="https://rubygems.org/gems/bundle_cache">bundle_cache</a> gem. Let’s set it up.</p>
<p>First of all, you need AWS credentials and an AWS S3 bucket where the cached gems will be stored. The gem is configured through environment variables:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">AWS_S3_KEY</span><span class="o">=</span><your aws access key>
<span class="nv">AWS_S3_SECRET</span><span class="o">=</span><your aws secret>
<span class="nv">AWS_S3_BUCKET</span><span class="o">=</span><your bucket name>
<span class="nv">AWS_S3_REGION</span><span class="o">=</span><optional, aws s3 region, default us-east-1>
<span class="nv">BUNDLE_ARCHIVE</span><span class="o">=</span><the filename to use <span class="k">for </span>your cache></code></pre></figure>
<p>Next, we need to change the .travis.yml file in our project:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">language</span><span class="pi">:</span> <span class="s">ruby</span>
<span class="na">bundler_args</span><span class="pi">:</span> <span class="s">--without development --path=~/.bundle</span>
<span class="na">rvm</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">2.0.0</span>
<span class="na">before_install</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s1">'</span><span class="s">echo</span><span class="nv"> </span><span class="s">'</span><span class="s1">'</span><span class="s">gem:</span><span class="nv"> </span><span class="s">--no-ri</span><span class="nv"> </span><span class="s">--no-rdoc'</span><span class="s1">'</span><span class="nv"> </span><span class="s">></span><span class="nv"> </span><span class="s">~/.gemrc'</span>
<span class="pi">-</span> <span class="s">gem install bundler bundle_cache</span>
<span class="pi">-</span> <span class="s">bundle_cache_install</span>
<span class="na">before_script</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">cp config/database.travis.yml config/database.yml</span>
<span class="pi">-</span> <span class="s">bundle exec rake db:create db:migrate db:schema:load</span>
<span class="na">after_script</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">bundle_cache</span>
<span class="na">env</span><span class="pi">:</span>
  <span class="na">global</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">BUNDLE_ARCHIVE="test-bundle"</span>
    <span class="pi">-</span> <span class="s">AWS_S3_BUCKET="travisci-bundler-cache"</span>
    <span class="pi">-</span> <span class="s">RAILS_ENV=test</span></code></pre></figure>
<p>The main commands are located under the keys “before_install”, “bundler_args”, “env” and “after_script”. Let’s look at what each command does:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">bundler_args</span><span class="pi">:</span> <span class="s">--without development --path=~/.bundle</span> <span class="c1"># tell bundler to install all gems except the "development" group, into the "~/.bundle" folder</span>
<span class="na">before_install</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s1">'</span><span class="s">echo</span><span class="nv"> </span><span class="s">'</span><span class="s1">'</span><span class="s">gem:</span><span class="nv"> </span><span class="s">--no-ri</span><span class="nv"> </span><span class="s">--no-rdoc'</span><span class="s1">'</span><span class="nv"> </span><span class="s">></span><span class="nv"> </span><span class="s">~/.gemrc'</span> <span class="c1"># skip installing docs for gems</span>
<span class="pi">-</span> <span class="s">gem install bundler bundle_cache</span> <span class="c1"># install bundler and bundle_cache gems</span>
<span class="pi">-</span> <span class="s">bundle_cache_install</span> <span class="c1"># download cached gems and unpack them in "~/.bundle" folder</span>
<span class="na">after_script</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">bundle_cache</span> <span class="c1"># pack "~/.bundle" folder and upload to AWS S3</span>
<span class="na">env</span><span class="pi">:</span>
  <span class="na">global</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">BUNDLE_ARCHIVE="test-bundle"</span> <span class="c1"># set name for cached gems file</span>
    <span class="pi">-</span> <span class="s">AWS_S3_BUCKET="travisci-bundler-cache"</span> <span class="c1"># set S3 bucket name</span></code></pre></figure>
<p>You also need to add your AWS credentials to a secure section of this file. Adding these credentials is simple:</p>
<ol>
<li>Install the travis gem using command <code class="language-plaintext highlighter-rouge">gem install travis</code></li>
<li>Log into Travis with <code class="language-plaintext highlighter-rouge">travis login --auto</code> (from inside your project repository directory). For Travis Pro, use the command <code class="language-plaintext highlighter-rouge">travis login --pro</code>.</li>
<li>Encrypt your S3 credentials using command <code class="language-plaintext highlighter-rouge">travis encrypt AWS_S3_KEY="" AWS_S3_SECRET="" --add</code> (be sure you add your actual credentials inside the double quotes)</li>
</ol>
<p>Something like this will be added to your .travis.yml file:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">env</span><span class="pi">:</span>
  <span class="na">global</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">BUNDLE_ARCHIVE="test-bundle"</span>
    <span class="pi">-</span> <span class="s">AWS_S3_BUCKET="travisci-bundler-cache"</span>
    <span class="pi">-</span> <span class="s">RAILS_ENV=test</span>
    <span class="pi">-</span> <span class="na">secure</span><span class="pi">:</span> <span class="s">wqeqweheo3H743Iob4s8qweqwec0tcv0JGlg8JBhccCPnIiFUArqwe=</span></code></pre></figure>
<p>The first time you run a build on Travis-CI, you will see output like this:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">echo</span> <span class="s1">'gem: --no-ri --no-rdoc'</span> <span class="o">></span> ~/.gemrc
<span class="nv">$ </span>gem <span class="nb">install </span>bundler bundle_cache
Fetching: bundler-1.3.5.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed bundler-1.3.5
Fetching: uuidtools-2.1.4.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed uuidtools-2.1.4
Fetching: nokogiri-1.5.10.gem <span class="o">(</span>100%<span class="o">)</span>
Building native extensions. This could take a <span class="k">while</span>...
Successfully installed nokogiri-1.5.10
Fetching: aws-sdk-1.21.0.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed aws-sdk-1.21.0
Fetching: bundle_cache-0.1.0.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed bundle_cache-0.1.0
5 gems installed
<span class="nv">$ </span>bundle_cache_install
<span class="o">=></span> Downloading the bundle
There<span class="s1">'s no such archive!
$ bundle install --without development --path=~/.bundle
....
$ bundle_cache
Checking for changes
=> There was no existing digest, uploading a new version of the archive
=> Preparing bundle archive
=> Uploading the bundle
=> Uploading 18 parts
=> Uploading /home/travis/test-bundle-x86_64.tgz.aaa
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aab
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aac
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aad
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aae
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aaf
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aag
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aah
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aai
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aaj
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aak
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aal
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aam
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aan
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aao
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aap
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aaq
=> Uploaded
=> Uploading /home/travis/test-bundle-x86_64.tgz.aar
=> Uploaded
=> Completed multipart upload
=> Uploading the digest file
All done now.</span></code></pre></figure>
<p>On subsequent builds, you will see something like this instead:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">echo</span> <span class="s1">'gem: --no-ri --no-rdoc'</span> <span class="o">></span> ~/.gemrc
<span class="nv">$ </span>gem <span class="nb">install </span>bundler bundle_cache
Fetching: bundler-1.3.5.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed bundler-1.3.5
Fetching: uuidtools-2.1.4.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed uuidtools-2.1.4
Fetching: nokogiri-1.5.10.gem <span class="o">(</span>100%<span class="o">)</span>
Building native extensions. This could take a <span class="k">while</span>...
Successfully installed nokogiri-1.5.10
Fetching: aws-sdk-1.21.0.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed aws-sdk-1.21.0
Fetching: bundle_cache-0.1.0.gem <span class="o">(</span>100%<span class="o">)</span>
Successfully installed bundle_cache-0.1.0
5 gems installed
<span class="nv">$ </span>bundle_cache_install
<span class="o">=></span> Downloading the bundle
<span class="o">=></span> Completed bundle download
<span class="o">=></span> Extract the bundle
<span class="o">=></span> Downloading the digest file
<span class="o">=></span> Completed digest download
<span class="o">=></span> All <span class="k">done</span><span class="o">!</span>
<span class="nv">$ </span>bundle <span class="nb">install</span> <span class="nt">--without</span> development <span class="nt">--path</span><span class="o">=</span>~/.bundle
....
<span class="nv">$ </span>bundle_cache
Checking <span class="k">for </span>changes
<span class="o">=></span> There were no changes, doing nothing
All <span class="k">done </span>now.</code></pre></figure>
<p>The “bundle install” command now runs very fast, because all the needed gems are already present in the “~/.bundle” folder. For my project this made the build 5 minutes faster (the duration decreased from 13 to 8 minutes).</p>
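<p>The difference between the two logs comes down to the “Checking for changes” step: the archive is uploaded only when the stored digest is missing or no longer matches. The sketch below illustrates that idea in Python, assuming the digest is a SHA-256 hash of the contents of Gemfile.lock. This is an assumption made for illustration, not the actual code of the bundle_cache gem, and the function names are hypothetical.</p>

```python
import hashlib


def lockfile_digest(lockfile_text):
    """Return a hex SHA-256 digest of the Gemfile.lock contents."""
    return hashlib.sha256(lockfile_text.encode("utf-8")).hexdigest()


def needs_upload(lockfile_text, stored_digest):
    """Decide whether the bundle archive should be uploaded again.

    stored_digest is the digest saved by a previous build,
    or None when no digest file exists yet (the very first build).
    """
    if stored_digest is None:
        return True  # no existing digest: upload a new archive
    # Any change in Gemfile.lock changes the digest, so the cache is stale
    return lockfile_digest(lockfile_text) != stored_digest
```

<p>With a check like this, a build whose Gemfile.lock is unchanged skips the upload entirely, which is consistent with the “There were no changes, doing nothing” line in the second log.</p>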
<h1 id="loading-database-scheme-without-rake-command">Loading the database schema without the rake command</h1>
<p>When you run <code class="language-plaintext highlighter-rouge">rake db:create db:migrate</code>, the command loads the Rails environment, which of course is not so fast. You can load your database schema much faster with the database’s own command-line tools (mysql, psql). Set the following in “config/application.rb” of your Rails project:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">config</span><span class="p">.</span><span class="nf">active_record</span><span class="p">.</span><span class="nf">schema_format</span> <span class="o">=</span> <span class="ss">:sql</span></code></pre></figure>
<p>In this case the database schema of your Rails app is stored in SQL format. Next, add the following commands to the “.travis.yml” file (example for PostgreSQL):</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="ss">before_install:
</span><span class="o">-</span> <span class="s2">"psql -c 'create database DB_NAME;' -U postgres"</span>
<span class="o">-</span> <span class="s2">"psql -U postgres -q -d DB_NAME -f db/structure.sql"</span>
<span class="o">-</span> <span class="s2">"cp -f config/database.travis.yml config/database.yml"</span></code></pre></figure>
<p>Your project should contain the file “config/database.travis.yml”, which points to the “DB_NAME” database. For my project this made the build 1 minute faster (the duration decreased from 8 to 7 minutes).</p>
<h1 id="parallelizing-your-builds-across-virtual-machines">Parallelizing your builds across virtual machines</h1>
<p>To speed up a test suite, you can split it into several parts using the Travis-CI build <a href="http://about.travis-ci.org/docs/user/build-configuration/#The-Build-Matrix">matrix</a> feature. Say you want to split up your unit tests and your integration tests into two different build jobs. They’ll run in parallel and fully utilize the available build capacity for your account.</p>
<p>Here’s an example of how to use this feature in your .travis.yml:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">TEST_SUITE=units</span>
<span class="pi">-</span> <span class="s">TEST_SUITE=integration</span>
<span class="na">script</span><span class="pi">:</span> <span class="s2">"</span><span class="s">bundle</span><span class="nv"> </span><span class="s">exec</span><span class="nv"> </span><span class="s">rake</span><span class="nv"> </span><span class="s">test:$TEST_SUITE"</span></code></pre></figure>
<p>Travis will determine the build matrix based on the environment variables and schedule two builds to run.</p>
<p>Depending on the size and complexity of your test suite, you can split it up even further. You could separate different concerns of the integration tests into different subfolders and run them as separate jobs of the build matrix.</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">env</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">TESTFOLDER=spec/features</span>
<span class="pi">-</span> <span class="s">TESTFOLDER=spec/requests</span>
<span class="pi">-</span> <span class="s">TESTFOLDER=spec/models</span>
<span class="na">script</span><span class="pi">:</span> <span class="s2">"</span><span class="s">bundle</span><span class="nv"> </span><span class="s">exec</span><span class="nv"> </span><span class="s">rspec</span><span class="nv"> </span><span class="s">$TESTFOLDER"</span></code></pre></figure>
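<p>If the folders differ a lot in size, the jobs will finish at very different times. One possible alternative is to split the spec files evenly between the jobs. Here is a minimal sketch in Python, assuming two hypothetical environment variables, JOB_INDEX and JOB_COUNT, that you would define yourself in the “env” matrix (Travis does not set them for you):</p>

```python
import os


def files_for_job(spec_files, job_index, job_count):
    """Return the share of spec files for one job (round-robin split)."""
    # Sort first so that every job sees the files in the same order
    return sorted(spec_files)[job_index::job_count]


if __name__ == "__main__":
    # JOB_INDEX and JOB_COUNT are hypothetical names, not Travis built-ins
    index = int(os.environ.get("JOB_INDEX", "0"))
    count = int(os.environ.get("JOB_COUNT", "1"))
    specs = []
    for root, _dirs, names in os.walk("spec"):
        specs.extend(os.path.join(root, name)
                     for name in names if name.endswith("_spec.rb"))
    # Print the selected files so they can be passed to rspec
    print(" ".join(files_for_job(specs, index, count)))
```

<p>The printed list could then be passed to rspec in the “script” step. Each job gets roughly the same number of files, although not necessarily the same running time.</p>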
<h1 id="paralellizing-your-build-on-one-vm">Parallelizing your build on one VM</h1>
<p>Travis CI VMs run on 1.5 virtual cores. That is not much room for concurrency, but it can still give a nice speedup depending on your use case. How to parallelize the test suite on one VM depends on the language and test runner you use, so you will have to research your options. For Ruby and RSpec you can use the <a href="https://github.com/grosser/parallel_tests">parallel_tests</a> gem.</p>
<h1 id="summary">Summary</h1>
<p>This article described some techniques for speeding up the testing of your Ruby project on Travis-CI. Of course, there are other techniques that I did not mention, but these helped me reduce the testing time of my projects considerably.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Multicorn - powerful foreign data wrapper for PostgreSQL2013-09-28T00:00:00+00:00http://leopard.in.ua/2013/09/28/postgresql-multicorn<p>Hello my dear friends. In this article I will talk about Multicorn: what it is, how to install it, and how to use it with PostgreSQL.</p>
<h1 id="what-is-multicorn">What is Multicorn?</h1>
<p><a href="http://multicorn.org/">Multicorn</a> is a PostgreSQL 9.1+ extension meant to make <a href="http://wiki.postgresql.org/wiki/Foreign_data_wrappers">Foreign Data Wrapper</a> development easy, by allowing the programmer to use the Python programming language. “Foreign Data Wrappers” (FDW) were introduced in PostgreSQL 9.1, providing a way of accessing external data sources from within PostgreSQL using SQL.</p>
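<p>To give an idea of what this looks like, here is a minimal sketch of a custom wrapper. It follows the pattern Multicorn expects: subclass ForeignDataWrapper and implement the execute method, which yields one dictionary per row. The class name CounterFdw and the “rows” option are hypothetical. The fallback stub exists only so that the file can also be run outside PostgreSQL, where the multicorn module is not importable.</p>

```python
try:
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in for experiments outside PostgreSQL (not part of Multicorn)
    class ForeignDataWrapper(object):
        def __init__(self, options, columns):
            pass


class CounterFdw(ForeignDataWrapper):
    """Hypothetical wrapper that returns a fixed number of generated rows."""

    def __init__(self, options, columns):
        super(CounterFdw, self).__init__(options, columns)
        # columns: the columns declared in CREATE FOREIGN TABLE
        self.columns = columns
        # options arrive as strings; "rows" is an option invented for this sketch
        self.rows = int(options.get("rows", 3))

    def execute(self, quals, columns):
        # Yield one dict per row, keyed by column name
        for index in range(self.rows):
            yield {name: "%s %d" % (name, index) for name in self.columns}
```

<p>A class like this is referenced by its dotted Python path in the “wrapper” option of CREATE SERVER, so the module has to be importable by the Python interpreter that PostgreSQL uses.</p>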
<h1 id="install">Install</h1>
<p>I will install Multicorn on Ubuntu. First, we need to install some packages:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">sudo </span>aptitude <span class="nb">install </span>build-essential postgresql-server-dev-9.3 python-dev python-setuptools</code></pre></figure>
<p>We can install Multicorn with the <a href="http://pgxnclient.projects.postgresql.org/">pgxn client</a> or from source. I prefer to install from source:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>git clone git@github.com:Kozea/Multicorn.git
<span class="nv">$ </span><span class="nb">cd </span>Multicorn
<span class="nv">$ </span>make
<span class="nv">$ </span><span class="nb">sudo </span>make <span class="nb">install</span></code></pre></figure>
<p>To complete the installation we need to enable the extension in the database:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">multicorn</span><span class="p">;</span>
<span class="k">CREATE</span> <span class="n">EXTENSION</span></code></pre></figure>
<p>Now let’s look at how to use it.</p>
<h1 id="usage">Usage</h1>
<h2 id="rdbms-databases">RDBMS databases</h2>
<p>To connect to another RDBMS, Multicorn uses the <a href="http://www.sqlalchemy.org/">SQLAlchemy</a> library. It supports MySQL, PostgreSQL, Microsoft SQL Server, and <a href="http://docs.sqlalchemy.org/en/latest/dialects/">more</a>. Let’s see how it works with MySQL. First of all, we should install additional libraries:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">sudo </span>aptitude <span class="nb">install </span>python-sqlalchemy python-mysqldb</code></pre></figure>
<p>In the MySQL database “testing” we have a table “companies”:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>mysql <span class="nt">-u</span> root <span class="nt">-p</span> testing
mysql> SELECT <span class="k">*</span> FROM companies<span class="p">;</span>
+----+---------------------+---------------------+
| <span class="nb">id</span> | created_at | updated_at |
+----+---------------------+---------------------+
| 1 | 2013-07-16 14:06:09 | 2013-07-16 14:06:09 |
| 2 | 2013-07-16 14:30:00 | 2013-07-16 14:30:00 |
| 3 | 2013-07-16 14:33:41 | 2013-07-16 14:33:41 |
| 4 | 2013-07-16 14:38:42 | 2013-07-16 14:38:42 |
| 5 | 2013-07-19 14:38:29 | 2013-07-19 14:38:29 |
+----+---------------------+---------------------+
5 rows <span class="k">in </span><span class="nb">set</span> <span class="o">(</span>0.00 sec<span class="o">)</span></code></pre></figure>
<p>First, we should create a server for the FDW in PostgreSQL:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">SERVER</span> <span class="n">alchemy_srv</span> <span class="k">foreign</span> <span class="k">data</span> <span class="n">wrapper</span> <span class="n">multicorn</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">wrapper</span> <span class="s1">'multicorn.sqlalchemyfdw.SqlAlchemyFdw'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="n">SERVER</span></code></pre></figure>
<p>Now we can create a foreign table that exposes the data from the MySQL table “companies” (I called this table “mysql_companies” in PostgreSQL):</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span> <span class="n">mysql_companies</span> <span class="p">(</span>
<span class="n">id</span> <span class="nb">integer</span><span class="p">,</span>
<span class="n">created_at</span> <span class="nb">timestamp</span> <span class="k">without</span> <span class="nb">time</span> <span class="k">zone</span><span class="p">,</span>
<span class="n">updated_at</span> <span class="nb">timestamp</span> <span class="k">without</span> <span class="nb">time</span> <span class="k">zone</span>
<span class="p">)</span> <span class="n">server</span> <span class="n">alchemy_srv</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">tablename</span> <span class="s1">'companies'</span><span class="p">,</span>
<span class="n">db_url</span> <span class="s1">'mysql://root:password@127.0.0.1/testing'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span></code></pre></figure>
<p>Main options:</p>
<ul>
<li>db_url (string) - a SQLAlchemy connection string (examples: “mysql://<user>:<password>@<host>/<dbname>”, “mssql://<user>:<password>@<dsname>”). See the <a href="http://docs.sqlalchemy.org/en/latest/dialects/">sqlalchemy dialects documentation</a>.</li>
<li>tablename (string) - the table name in the remote RDBMS.</li>
</ul>
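<p>If the user name or password contains characters such as “@” or “/”, they have to be URL-encoded before being placed in db_url. A small sketch in Python 3 (the helper name build_db_url is hypothetical):</p>

```python
from urllib.parse import quote_plus  # Python 3 standard library


def build_db_url(dialect, user, password, host, dbname):
    """Build a SQLAlchemy connection string with an escaped user and password."""
    return "%s://%s:%s@%s/%s" % (
        dialect, quote_plus(user), quote_plus(password), host, dbname)


print(build_db_url("mysql", "root", "p@ss/word", "127.0.0.1", "testing"))
# prints mysql://root:p%40ss%2Fword@127.0.0.1/testing
```

<p>The resulting string can be used as the value of the db_url option.</p>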
<p>Now we can check that it works:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">mysql_companies</span><span class="p">;</span>
<span class="n">id</span> <span class="o">|</span> <span class="n">created_at</span> <span class="o">|</span> <span class="n">updated_at</span>
<span class="c1">----+---------------------+---------------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">09</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">06</span><span class="p">:</span><span class="mi">09</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">00</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">00</span>
<span class="mi">3</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">33</span><span class="p">:</span><span class="mi">41</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">33</span><span class="p">:</span><span class="mi">41</span>
<span class="mi">4</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">38</span><span class="p">:</span><span class="mi">42</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">16</span> <span class="mi">14</span><span class="p">:</span><span class="mi">38</span><span class="p">:</span><span class="mi">42</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">19</span> <span class="mi">14</span><span class="p">:</span><span class="mi">38</span><span class="p">:</span><span class="mi">29</span> <span class="o">|</span> <span class="mi">2013</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">19</span> <span class="mi">14</span><span class="p">:</span><span class="mi">38</span><span class="p">:</span><span class="mi">29</span>
<span class="p">(</span><span class="mi">5</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>As you can see, it works.</p>
<h2 id="imap-servers">IMAP servers</h2>
<p>We can use Multicorn to read emails from an inbox over the IMAP protocol. We need to install additional libraries:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">sudo </span>aptitude <span class="nb">install </span>python-pip
<span class="nv">$ </span><span class="nb">sudo </span>pip <span class="nb">install </span>imapclient</code></pre></figure>
<p>The next steps are similar to the previous ones. We should create a server and a foreign table from which we will read the data:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">SERVER</span> <span class="n">multicorn_imap</span> <span class="k">FOREIGN</span> <span class="k">DATA</span> <span class="n">WRAPPER</span> <span class="n">multicorn</span> <span class="k">options</span> <span class="p">(</span> <span class="n">wrapper</span> <span class="s1">'multicorn.imapfdw.ImapFdw'</span> <span class="p">);</span>
<span class="k">CREATE</span> <span class="n">SERVER</span>
<span class="o">=#</span> <span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span> <span class="n">my_inbox</span> <span class="p">(</span>
<span class="nv">"Message-ID"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="nv">"From"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="nv">"Subject"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="nv">"payload"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="nv">"flags"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">[],</span>
<span class="nv">"To"</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">)</span> <span class="n">server</span> <span class="n">multicorn_imap</span> <span class="k">options</span> <span class="p">(</span>
<span class="k">host</span> <span class="s1">'imap.gmail.com'</span><span class="p">,</span>
<span class="n">port</span> <span class="s1">'993'</span><span class="p">,</span>
<span class="n">payload_column</span> <span class="s1">'payload'</span><span class="p">,</span>
<span class="n">flags_column</span> <span class="s1">'flags'</span><span class="p">,</span>
<span class="n">ssl</span> <span class="s1">'True'</span><span class="p">,</span>
<span class="n">login</span> <span class="s1">'example@gmail.com'</span><span class="p">,</span>
<span class="n">password</span> <span class="s1">'supersecretpassword'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span></code></pre></figure>
<p>Main options:</p>
<ul>
<li>host (string) - the IMAP host to connect to.</li>
<li>port (string) - the IMAP host port to connect to.</li>
<li>login (string) - the login to connect with.</li>
<li>password (string) - the password to connect with.</li>
<li>payload_column (string) - the name of the column which will store the payload.</li>
<li>flags_column (string) - the name of the column which will store the IMAP flags, as an array of strings.</li>
<li>ssl (boolean) - whether to use SSL or not.</li>
<li>imap_server_charset (string) - the name of the charset used for IMAP search commands. Defaults to UTF8. For the cyrus IMAP server, it should be set to “utf-8”.</li>
</ul>
<p>Now we can read emails from the inbox using the table “my_inbox”:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">flags</span><span class="p">,</span> <span class="nv">"Subject"</span><span class="p">,</span> <span class="n">payload</span> <span class="k">FROM</span> <span class="n">my_inbox</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">flags</span> <span class="o">|</span> <span class="n">Subject</span> <span class="o">|</span> <span class="n">payload</span>
<span class="c1">------------+-------------------+---------------------</span>
<span class="p">{</span><span class="nv">"</span><span class="se">\\</span><span class="nv">Seen"</span><span class="p">}</span> <span class="o">|</span> <span class="n">Test</span> <span class="n">email</span> <span class="o">|</span> <span class="n">Test</span> <span class="n">email</span><span class="err">\</span><span class="n">r</span> <span class="o">+</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">{</span><span class="nv">"</span><span class="se">\\</span><span class="nv">Seen"</span><span class="p">}</span> <span class="o">|</span> <span class="n">Test</span> <span class="k">second</span> <span class="n">email</span> <span class="o">|</span> <span class="n">Test</span> <span class="k">second</span> <span class="n">email</span><span class="err">\</span><span class="n">r</span><span class="o">+</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>After a flag is added to the email “Test email”, the same query shows it:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">SELECT</span> <span class="n">flags</span><span class="p">,</span> <span class="nv">"Subject"</span><span class="p">,</span> <span class="n">payload</span> <span class="k">FROM</span> <span class="n">my_inbox</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">flags</span> <span class="o">|</span> <span class="n">Subject</span> <span class="o">|</span> <span class="n">payload</span>
<span class="c1">--------------------------------------+-------------------+---------------------</span>
<span class="p">{</span><span class="err">$</span><span class="n">MailFlagBit1</span><span class="p">,</span><span class="nv">"</span><span class="se">\\</span><span class="nv">Flagged"</span><span class="p">,</span><span class="nv">"</span><span class="se">\\</span><span class="nv">Seen"</span><span class="p">}</span> <span class="o">|</span> <span class="n">Test</span> <span class="n">email</span> <span class="o">|</span> <span class="n">Test</span> <span class="n">email</span><span class="err">\</span><span class="n">r</span> <span class="o">+</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">{</span><span class="nv">"</span><span class="se">\\</span><span class="nv">Seen"</span><span class="p">}</span> <span class="o">|</span> <span class="n">Test</span> <span class="k">second</span> <span class="n">email</span> <span class="o">|</span> <span class="n">Test</span> <span class="k">second</span> <span class="n">email</span><span class="err">\</span><span class="n">r</span><span class="o">+</span>
<span class="o">|</span> <span class="o">|</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>It works.</p>
<h2 id="rss-source">RSS source</h2>
<p>Multicorn can use an RSS feed as a source of data. Again, we need to install an additional library:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span><span class="nb">sudo </span>aptitude <span class="nb">install </span>python-lxml</code></pre></figure>
<p>We should create a server and a foreign table from which we will read the data:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">SERVER</span> <span class="n">rss_srv</span> <span class="k">foreign</span> <span class="k">data</span> <span class="n">wrapper</span> <span class="n">multicorn</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">wrapper</span> <span class="s1">'multicorn.rssfdw.RssFdw'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="n">SERVER</span>
<span class="o">=#</span> <span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span> <span class="n">my_rss</span> <span class="p">(</span>
<span class="nv">"pubDate"</span> <span class="nb">timestamp</span><span class="p">,</span>
<span class="n">description</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">title</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">link</span> <span class="nb">character</span> <span class="nb">varying</span>
<span class="p">)</span> <span class="n">server</span> <span class="n">rss_srv</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">url</span> <span class="s1">'http://news.yahoo.com/rss/entertainment'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span></code></pre></figure>
<p>Main options:</p>
<ul>
<li>url (string) - the RSS feed URL.</li>
</ul>
<p>Also, make sure that your database uses the UTF-8 encoding, because with other encodings you can get errors :)</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">SELECT</span> <span class="nv">"pubDate"</span><span class="p">,</span> <span class="n">title</span><span class="p">,</span> <span class="n">link</span> <span class="k">from</span> <span class="n">my_rss</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="nv">"pubDate"</span> <span class="k">DESC</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">pubDate</span> <span class="o">|</span> <span class="n">title</span> <span class="o">|</span> <span class="n">link</span>
<span class="c1">---------------------+----------------------------------------------------+--------------------------------------------------------------------------------------</span>
<span class="mi">2013</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">28</span> <span class="mi">14</span><span class="p">:</span><span class="mi">11</span><span class="p">:</span><span class="mi">58</span> <span class="o">|</span> <span class="n">Royal</span> <span class="n">Mint</span> <span class="n">coins</span> <span class="k">to</span> <span class="n">mark</span> <span class="n">Prince</span> <span class="n">George</span> <span class="n">christening</span> <span class="o">|</span> <span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">news</span><span class="p">.</span><span class="n">yahoo</span><span class="p">.</span><span class="n">com</span><span class="o">/</span><span class="n">royal</span><span class="o">-</span><span class="n">mint</span><span class="o">-</span><span class="n">coins</span><span class="o">-</span><span class="n">mark</span><span class="o">-</span><span class="n">prince</span><span class="o">-</span><span class="n">george</span><span class="o">-</span><span class="n">christening</span><span class="o">-</span><span class="mi">115906242</span><span class="p">.</span><span class="n">html</span>
<span class="mi">2013</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">28</span> <span class="mi">11</span><span class="p">:</span><span class="mi">47</span><span class="p">:</span><span class="mi">03</span> <span class="o">|</span> <span class="n">Miss</span> <span class="n">Philippines</span> <span class="n">wins</span> <span class="n">Miss</span> <span class="n">World</span> <span class="k">in</span> <span class="n">Indonesia</span> <span class="o">|</span> <span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">news</span><span class="p">.</span><span class="n">yahoo</span><span class="p">.</span><span class="n">com</span><span class="o">/</span><span class="n">miss</span><span class="o">-</span><span class="n">philippines</span><span class="o">-</span><span class="n">wins</span><span class="o">-</span><span class="n">miss</span><span class="o">-</span><span class="n">world</span><span class="o">-</span><span class="n">indonesia</span><span class="o">-</span><span class="mi">144544381</span><span class="p">.</span><span class="n">html</span>
<span class="mi">2013</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">28</span> <span class="mi">10</span><span class="p">:</span><span class="mi">59</span><span class="p">:</span><span class="mi">15</span> <span class="o">|</span> <span class="n">Billionaire</span><span class="s1">'s daughter in NJ court in will dispute | http://news.yahoo.com/billionaires-daughter-nj-court-dispute-144432331.html
2013-09-28 08:40:42 | Security tight at Miss World final in Indonesia | http://news.yahoo.com/security-tight-miss-world-final-indonesia-123714041.html
2013-09-28 08:17:52 | Guest lineups for the Sunday news shows | http://news.yahoo.com/guest-lineups-sunday-news-shows-183815643.html
2013-09-28 07:37:02 | Security tight at Miss World crowning in Indonesia | http://news.yahoo.com/security-tight-miss-world-crowning-indonesia-113634310.html
2013-09-27 20:49:32 | Simons stamps his natural mark on Dior | http://news.yahoo.com/simons-stamps-natural-mark-dior-223848528.html
2013-09-27 19:50:30 | Jackson jury ends deliberations until Tuesday | http://news.yahoo.com/jackson-jury-ends-deliberations-until-tuesday-235030969.html
2013-09-27 19:23:40 | Eric Clapton-owned Richter painting to sell in NYC | http://news.yahoo.com/eric-clapton-owned-richter-painting-sell-nyc-201447252.html
2013-09-27 19:14:15 | Report: Hollywood is less gay-friendly off-screen | http://news.yahoo.com/report-hollywood-less-gay-friendly-off-screen-231415235.html
(10 rows)</span></code></pre></figure>
<h2 id="csv-source">CSV source</h2>
<p>This FDW can be used to access data stored in CSV files. We create the server and the foreign table that will expose the data:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">CREATE</span> <span class="n">SERVER</span> <span class="n">csv_srv</span> <span class="k">foreign</span> <span class="k">data</span> <span class="n">wrapper</span> <span class="n">multicorn</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">wrapper</span> <span class="s1">'multicorn.csvfdw.CsvFdw'</span>
<span class="p">);</span>
<span class="k">CREATE</span> <span class="n">SERVER</span>
<span class="o">=#</span> <span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span> <span class="n">csvtest</span> <span class="p">(</span>
<span class="n">sort_order</span> <span class="nb">numeric</span><span class="p">,</span>
<span class="n">common_name</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">formal_name</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">main_type</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">sub_type</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">sovereignty</span> <span class="nb">character</span> <span class="nb">varying</span><span class="p">,</span>
<span class="n">capital</span> <span class="nb">character</span> <span class="nb">varying</span>
<span class="p">)</span> <span class="n">server</span> <span class="n">csv_srv</span> <span class="k">options</span> <span class="p">(</span>
<span class="n">filename</span> <span class="s1">'/var/data/countrylist.csv'</span><span class="p">,</span>
<span class="n">skip_header</span> <span class="s1">'1'</span><span class="p">,</span>
<span class="k">delimiter</span> <span class="s1">','</span><span class="p">);</span>
<span class="k">CREATE</span> <span class="k">FOREIGN</span> <span class="k">TABLE</span></code></pre></figure>
<p>Main options:</p>
<ul>
<li>filename (string) - the full path to the CSV file containing the data. This file must be readable to the postgres user.</li>
<li>delimiter (character) - the CSV delimiter (defaults to “,”).</li>
<li>quotechar (character) - the CSV quote character (defaults to the double quote, ").</li>
<li>skip_header (integer) - the number of lines to skip (defaults to 0).</li>
</ul>
<p>Let’s check how it works:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="o">=#</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">csvtest</span> <span class="k">LIMIT</span> <span class="mi">10</span><span class="p">;</span>
<span class="n">sort_order</span> <span class="o">|</span> <span class="n">common_name</span> <span class="o">|</span> <span class="n">formal_name</span> <span class="o">|</span> <span class="n">main_type</span> <span class="o">|</span> <span class="n">sub_type</span> <span class="o">|</span> <span class="n">sovereignty</span> <span class="o">|</span> <span class="n">capital</span>
<span class="c1">------------+---------------------+-----------------------------------------+-------------------+----------+-------------+------------------</span>
<span class="mi">1</span> <span class="o">|</span> <span class="n">Afghanistan</span> <span class="o">|</span> <span class="n">Islamic</span> <span class="k">State</span> <span class="k">of</span> <span class="n">Afghanistan</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Kabul</span>
<span class="mi">2</span> <span class="o">|</span> <span class="n">Albania</span> <span class="o">|</span> <span class="n">Republic</span> <span class="k">of</span> <span class="n">Albania</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Tirana</span>
<span class="mi">3</span> <span class="o">|</span> <span class="n">Algeria</span> <span class="o">|</span> <span class="n">People</span><span class="s1">'s Democratic Republic of Algeria | Independent State | | | Algiers
4 | Andorra | Principality of Andorra | Independent State | | | Andorra la Vella
5 | Angola | Republic of Angola | Independent State | | | Luanda
6 | Antigua and Barbuda | | Independent State | | | Saint John'</span><span class="n">s</span>
<span class="mi">7</span> <span class="o">|</span> <span class="n">Argentina</span> <span class="o">|</span> <span class="n">Argentine</span> <span class="n">Republic</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Buenos</span> <span class="n">Aires</span>
<span class="mi">8</span> <span class="o">|</span> <span class="n">Armenia</span> <span class="o">|</span> <span class="n">Republic</span> <span class="k">of</span> <span class="n">Armenia</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Yerevan</span>
<span class="mi">9</span> <span class="o">|</span> <span class="n">Australia</span> <span class="o">|</span> <span class="n">Commonwealth</span> <span class="k">of</span> <span class="n">Australia</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Canberra</span>
<span class="mi">10</span> <span class="o">|</span> <span class="n">Austria</span> <span class="o">|</span> <span class="n">Republic</span> <span class="k">of</span> <span class="n">Austria</span> <span class="o">|</span> <span class="n">Independent</span> <span class="k">State</span> <span class="o">|</span> <span class="o">|</span> <span class="o">|</span> <span class="n">Vienna</span>
<span class="p">(</span><span class="mi">10</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<h2 id="another-fdws">Another FDWs</h2>
<p>Multicorn also contains LDAP and FileSystem foreign data wrappers. The LDAP FDW can be used to access directory servers via the LDAP protocol. The FileSystem FDW can be used to access data stored in files in a filesystem.</p>
<h2 id="your-custom-fdws">Your custom FDWs</h2>
<p>Multicorn provides a simple interface for writing your own foreign data wrappers. You can find more information <a href="http://multicorn.org/implementing-an-fdw/">here</a>.</p>
<h1 id="postgresql-93">PostgreSQL 9.3+</h1>
<p>The original implementation of FDWs in PostgreSQL 9.1 and 9.2 was read-only, but since PostgreSQL 9.3 FDWs can also have write access. Multicorn supports the write API in version >= 1.0.0.</p>
<h1 id="conclusion">Conclusion</h1>
<p>As you can see, Multicorn is a very useful extension: it lets PostgreSQL communicate with many external types of data sources and lets Python developers create their own custom FDWs for PostgreSQL.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Check your build status in Travis CI by Sphero2013-09-19T00:00:00+00:00http://leopard.in.ua/2013/09/19/check-your-build-status-by-sphero<p>Hello my dear friends. In this article we will learn how to check our Travis CI build status with a Sphero.</p>
<h1 id="what-is-sphero">What is Sphero?</h1>
<p><a href="/assets/images/fun/sphero/sphero1.png"><img src="/assets/images/fun/sphero/sphero1.png" alt="sphero" title="sphero" class="aligncenter" /></a></p>
<p>Sphero is a small spherical robot. You can find more info on the <a href="http://www.gosphero.com/">official page</a> and see it in action in <a href="http://www.youtube.com/watch?v=5Bg88VkWGOQ">this video</a>.</p>
<p>Yes, this is just a toy. But this toy has <a href="https://github.com/orbotix/DeveloperResources">development docs</a> and <a href="https://developer.gosphero.com/">many SDKs</a>, so you can create applications that control Sphero for your needs.</p>
<h1 id="travis-ci">Travis CI</h1>
<p>Travis CI is a hosted continuous integration service for the open source community. It is integrated with GitHub and can test many languages. I use Travis CI for my open source projects, and it is useful for me to know when one of my projects has failing tests.</p>
<h1 id="code">Code</h1>
<p>I chose Ruby to write “SBSC” (Sphero Build Status Checker). Ruby has a very good library for working with robots - <a href="http://artoo.io/">Artoo</a>. Let’s create a Gemfile and add the needed libraries to it:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="n">source</span> <span class="s1">'http://rubygems.org'</span>
<span class="n">gem</span> <span class="s1">'artoo'</span>
<span class="n">gem</span> <span class="s1">'artoo-sphero'</span>
<span class="n">gem</span> <span class="s1">'hybridgroup-sphero'</span>
<span class="n">gem</span> <span class="s1">'hybridgroup-serialport'</span>
<span class="n">gem</span> <span class="s1">'travis'</span></code></pre></figure>
<p>Next, we write the code for our “SBSC”:</p>
<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="nb">require</span> <span class="s1">'artoo'</span>
<span class="nb">require</span> <span class="s1">'artoo/robot'</span>
<span class="nb">require</span> <span class="s1">'travis'</span>
<span class="k">class</span> <span class="nc">CISpheroRobot</span> <span class="o"><</span> <span class="no">Artoo</span><span class="o">::</span><span class="no">Robot</span>
<span class="n">connection</span> <span class="ss">:sphero</span><span class="p">,</span> <span class="ss">adaptor: :sphero</span>
<span class="n">device</span> <span class="ss">:sphero</span><span class="p">,</span> <span class="ss">driver: :sphero</span>
<span class="n">work</span> <span class="k">do</span>
<span class="n">ci_repo</span> <span class="o">=</span> <span class="s2">"le0pard/omniauth-yammer"</span> <span class="c1"># change on your github repo</span>
<span class="n">ci_branch</span> <span class="o">=</span> <span class="s2">"master"</span>
<span class="n">repo</span> <span class="o">=</span> <span class="no">Travis</span><span class="o">::</span><span class="no">Repository</span><span class="p">.</span><span class="nf">find</span><span class="p">(</span><span class="n">ci_repo</span><span class="p">)</span>
<span class="n">sphero</span><span class="p">.</span><span class="nf">set_color</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># reset the color (RGB 0, 0, 0 is black, i.e. no light)</span>
<span class="n">every</span><span class="p">(</span><span class="mi">60</span><span class="p">.</span><span class="nf">seconds</span><span class="p">)</span> <span class="k">do</span>
<span class="k">case</span> <span class="n">repo</span><span class="p">.</span><span class="nf">branch</span><span class="p">(</span><span class="n">ci_branch</span><span class="p">).</span><span class="nf">state</span>
<span class="k">when</span> <span class="s1">'passed'</span>
<span class="nb">puts</span> <span class="s2">"Your </span><span class="si">#{</span><span class="n">ci_repo</span><span class="si">}</span><span class="s2"> is green"</span>
<span class="n">sphero</span><span class="p">.</span><span class="nf">set_color</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># reset the color (RGB 0, 0, 0 is black, i.e. no light)</span>
<span class="k">when</span> <span class="s1">'failed'</span>
<span class="nb">puts</span> <span class="s2">"Your </span><span class="si">#{</span><span class="n">ci_repo</span><span class="si">}</span><span class="s2"> is failed"</span>
<span class="mi">5</span><span class="p">.</span><span class="nf">times</span> <span class="k">do</span>
<span class="n">sphero</span><span class="p">.</span><span class="nf">set_color</span> <span class="nb">rand</span><span class="p">(</span><span class="mi">255</span><span class="p">),</span><span class="nb">rand</span><span class="p">(</span><span class="mi">255</span><span class="p">),</span><span class="nb">rand</span><span class="p">(</span><span class="mi">255</span><span class="p">)</span>
<span class="n">sphero</span><span class="p">.</span><span class="nf">roll</span> <span class="mi">20</span><span class="p">,</span> <span class="nb">rand</span><span class="p">(</span><span class="mi">360</span><span class="p">)</span>
<span class="nb">sleep</span> <span class="mf">0.5</span>
<span class="k">end</span>
<span class="k">else</span>
<span class="n">sphero</span><span class="p">.</span><span class="nf">set_color</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="no">CISpheroRobot</span><span class="p">.</span><span class="nf">work!</span><span class="p">(</span><span class="no">CISpheroRobot</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">connections: </span><span class="p">{</span><span class="ss">sphero: </span><span class="p">{</span><span class="ss">port: </span><span class="s1">'/dev/tty.Sphero-GGY-AMP-SPP'</span><span class="p">}}))</span></code></pre></figure>
<p>So I created the class “CISpheroRobot”, which has a “work” block. Inside this block you can control the Sphero through the “sphero” object with methods such as “set_color” and “roll”. For the port I used ‘/dev/tty.Sphero-GGY-AMP-SPP’. To find the port of your device on macOS, run “ls /dev/tty*” in a terminal.</p>
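<p>The state handling inside the “work” block can also be kept in a small lookup, which is easy to try without a robot attached. This is only a sketch: the helper name “color_for_state” and the green/red colors are assumptions made for illustration, while the ‘passed’ and ‘failed’ strings are the ones used in the script above.</p>

```ruby
# Hypothetical helper, not part of Artoo or the travis gem.
# Maps a Travis CI build state to an RGB triple.
STATE_COLORS = {
  'passed' => [0, 255, 0], # green (assumed color choice)
  'failed' => [255, 0, 0]  # red (assumed color choice)
}.freeze

# Any other state (for example a build that is still running) gets RGB 0, 0, 0.
DEFAULT_COLOR = [0, 0, 0].freeze

def color_for_state(state)
  STATE_COLORS.fetch(state, DEFAULT_COLOR)
end

# Print the mapping for a few states.
%w[passed failed started].each do |state|
  puts "#{state}: #{color_for_state(state).inspect}"
end
```

<p>Inside the robot, the triple could then be passed on with something like “sphero.set_color(*color_for_state(state))”, assuming “set_color” takes three numbers as it does in the script above.</p>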
<p><a href="/assets/images/fun/sphero/sphero.png"><img src="/assets/images/fun/sphero/sphero.png" alt="sphero" title="sphero" class="aligncenter" /></a></p>
<h1 id="result">Result</h1>
<p>You can see the result in this video (sorry for the poor quality):</p>
<div class="video-container">
<iframe width="480" height="270" src="https://www.youtube.com/embed/P4JlJ3KrduA" frameborder="0" allowfullscreen="">
</iframe>
</div>
<p><em>That’s all folks!</em></p>
Setting up shared memory for PostgreSQL2013-09-05T00:00:00+00:00http://leopard.in.ua/2013/09/05/postgresql-sessting-shared-memory<p>Hello my dear friends. In this article I will cover setting up shared memory on Linux for PostgreSQL.</p>
<h1 id="what-is-shared-memory">What is shared memory?</h1>
<p><strong>Shared memory</strong> is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between programs. Using memory for communication inside a single program, for example among its multiple threads, is also referred to as shared memory.</p>
<h1 id="postgresql-and-shared-memory">PostgreSQL and shared memory</h1>
<p>The “shared_buffers” configuration parameter determines how much memory is dedicated to PostgreSQL to use for caching data. One reason the defaults are low is because on some platforms (like older Solaris versions and SGI), having large values requires invasive action like recompiling the kernel. Even on a modern Linux system, the stock kernel will likely not allow setting shared_buffers to over 32MB without adjusting kernel settings first.</p>
<p>If you have a system with 1GB or more of RAM, a reasonable starting value for shared_buffers is 1/4 of the memory in your system. If you have less RAM you’ll have to account more carefully for how much RAM the OS is taking up; closer to 15% is more typical there. There are some workloads where even larger settings for shared_buffers are effective, but given the way PostgreSQL also relies on the operating system cache, it’s unlikely you’ll find using more than 40% of RAM to work better than a smaller amount.</p>
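<p>The rule of thumb above can be written down as a few lines of arithmetic. This is a rough sketch, not a tuning tool: the function names are made up for illustration, and the 25%, 15% and 40% figures are simply the ones quoted in the paragraph above.</p>

```ruby
GIB = 1024**3
MIB = 1024**2

# Starting point for shared_buffers in bytes:
# 25% of RAM on machines with 1GB or more, about 15% on smaller ones.
def suggested_shared_buffers(total_ram_bytes)
  percent = total_ram_bytes >= GIB ? 25 : 15
  total_ram_bytes * percent / 100
end

# Values above roughly 40% of RAM are unlikely to help.
def shared_buffers_ceiling(total_ram_bytes)
  total_ram_bytes * 40 / 100
end

[512 * MIB, 2 * GIB, 8 * GIB].each do |ram|
  start = suggested_shared_buffers(ram) / MIB
  ceiling = shared_buffers_ceiling(ram) / MIB
  puts "RAM #{ram / MIB}MB: start near #{start}MB, rarely above #{ceiling}MB"
end
```

<p>The result is only a starting value; the right setting still depends on the workload and on how much memory the operating system cache needs.</p>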
<p>It’s likely you will have to increase the amount of memory your operating system allows you to allocate at once to set the value for shared_buffers this high. On UNIX-like systems, if you set it above what’s supported, you’ll get a message like this:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">FATAL: could not create shared memory segment: Invalid argument
DETAIL: Failed system call was shmget<span class="o">(</span><span class="nv">key</span><span class="o">=</span>5440001, <span class="nv">size</span><span class="o">=</span>4011376640, 03600<span class="o">)</span><span class="nb">.</span>
This error usually means that PostgreSQL<span class="s1">'s request for a shared memory
segment exceeded your kernel'</span>s SHMMAX parameter. You can either
reduce the request size or reconfigure the kernel with larger SHMMAX.
To reduce the request size <span class="o">(</span>currently 4011376640 bytes<span class="o">)</span>, reduce
PostgreSQL<span class="s1">'s shared_buffers parameter (currently 50000) and/or
its max_connections parameter (currently 12).</span></code></pre></figure>
<p>This error probably means your kernel’s limit on the size of shared memory is smaller than the work area PostgreSQL is trying to create (4011376640 bytes in this example). Or it could mean that you do not have System-V-style shared memory support configured into your kernel at all. As a temporary workaround, you can try starting the server with a smaller-than-normal number of buffers (shared_buffers).</p>
<h1 id="configure-shared-memory">Configure shared memory</h1>
<p>We must set up shared memory on our Unix system if we want to tune PostgreSQL. <a href="http://www.postgresql.org/docs/current/static/kernel-resources.html">This link</a> describes how to set it up. But what values should we set for shmmax and shmall? To answer this, I made a small script. Its mission is to calculate and display shared memory limits equal to half of the physical memory on the server:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c">#!/bin/bash</span>
<span class="c"># simple shmsetup script</span>
<span class="nv">page_size</span><span class="o">=</span><span class="sb">`</span>getconf PAGE_SIZE<span class="sb">`</span>
<span class="nv">phys_pages</span><span class="o">=</span><span class="sb">`</span>getconf _PHYS_PAGES<span class="sb">`</span>
<span class="nv">shmall</span><span class="o">=</span><span class="sb">`</span><span class="nb">expr</span> <span class="nv">$phys_pages</span> / 2<span class="sb">`</span>
<span class="nv">shmmax</span><span class="o">=</span><span class="sb">`</span><span class="nb">expr</span> <span class="nv">$shmall</span> <span class="se">\*</span> <span class="nv">$page_size</span><span class="sb">`</span>
<span class="nb">echo </span>kernel.shmmax <span class="o">=</span> <span class="nv">$shmmax</span>
<span class="nb">echo </span>kernel.shmall <span class="o">=</span> <span class="nv">$shmall</span></code></pre></figure>
<p>For example, for a server with 2GB of RAM the script will generate the following:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">kernel.shmmax <span class="o">=</span> 1055092736
kernel.shmall <span class="o">=</span> 257591</code></pre></figure>
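<p>The arithmetic behind this output can be reproduced in plain Ruby. This is a minimal sketch, assuming a 4096-byte page and a hypothetical count of 515182 physical pages, chosen only because it reproduces the numbers above; the function name “shm_settings” is also made up for illustration.</p>

```ruby
# Mirrors the shell script: shmall is half of the physical pages,
# shmmax is the same amount expressed in bytes.
def shm_settings(page_size, phys_pages)
  shmall = phys_pages / 2 # integer division, like expr
  { shmall: shmall, shmmax: shmall * page_size }
end

# Assumed inputs: 4096-byte pages, 515182 physical pages (about 2GB of RAM).
settings = shm_settings(4096, 515_182)

puts "kernel.shmmax = #{settings[:shmmax]}"
puts "kernel.shmall = #{settings[:shmall]}"
puts format('shmmax is about %.2f GiB', settings[:shmmax] / (1024.0**3))
```

<p>Note the different units: shmall is counted in pages, while shmmax is counted in bytes.</p>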
<p>Here SHMMAX, the maximum size (in bytes) of a single shared memory segment, is set to about 1 GB. SHMALL is the total amount of shared memory (in pages) that all processes on the server can use. The number of bytes in a page depends on the operating system; a common default is 4096 bytes. To apply these settings on Linux (Ubuntu, Debian), run this command as root (./shmsetup is the script):</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>./shmsetup <span class="o">>></span> /etc/sysctl.conf</code></pre></figure>
<p>Then load the new settings and check the printed values:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>sysctl <span class="nt">-p</span></code></pre></figure>
<p>Also do not forget about semaphores in the system:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>ipcs <span class="nt">-l</span>
<span class="nt">------</span> Semaphore Limits <span class="nt">--------</span>
max number of arrays <span class="o">=</span> 128
max semaphores per array <span class="o">=</span> 250
max semaphores system wide <span class="o">=</span> 32000
max ops per semop call <span class="o">=</span> 32
semaphore max value <span class="o">=</span> 32767</code></pre></figure>
<p>The values in sysctl:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>sysctl kernel.sem
kernel.sem <span class="o">=</span> 250 32000 32 128</code></pre></figure>
<p>All four values may need to be increased on systems with a large number of processes.</p>
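<p>The four numbers in kernel.sem are positional, in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI, which is not the order in which “ipcs -l” prints them. A small sketch that labels them (the helper name “parse_kernel_sem” is made up for illustration):</p>

```ruby
# Order of the fields in the kernel.sem sysctl value.
SEM_FIELDS = %i[semmsl semmns semopm semmni].freeze

# Turns a string such as "250 32000 32 128" into a labelled hash.
def parse_kernel_sem(value)
  numbers = value.split.map { |n| Integer(n) }
  raise ArgumentError, "expected 4 values, got #{numbers.size}" unless numbers.size == 4

  SEM_FIELDS.zip(numbers).to_h
end

limits = parse_kernel_sem('250 32000 32 128')

puts "max semaphores per array   (SEMMSL) = #{limits[:semmsl]}"
puts "max semaphores system wide (SEMMNS) = #{limits[:semmns]}"
puts "max ops per semop call     (SEMOPM) = #{limits[:semopm]}"
puts "max number of arrays       (SEMMNI) = #{limits[:semmni]}"
```

<p>With the values shown above, the labelled output matches the limits reported by “ipcs -l”.</p>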
<h1 id="postgresql-93">PostgreSQL 9.3</h1>
<p>In 9.3, PostgreSQL has switched from using SysV shared memory to using Posix shared memory and mmap for memory management. This allows easier installation and configuration of PostgreSQL, and means that except in unusual cases, system parameters such as SHMMAX and SHMALL no longer need to be adjusted.</p>
<h1 id="summary">Summary</h1>
<p>As can be seen, setting up shared memory on Linux for PostgreSQL is not so hard. Good luck.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>
Using ltree for hierarchical structures in PostgreSQL2013-09-02T00:00:00+00:00http://leopard.in.ua/2013/09/02/postgresql-ltree<p>Hello my dear friends. In <a href="/2013/07/11/storing-trees-in-rdbms/">the previous article</a> we learned about storing tree structures in an RDBMS. In this article we will learn how to work with the ltree module for PostgreSQL, which allows storing data in a hierarchical tree-like structure.</p>
<h1 id="what-is-ltree">What is ltree?</h1>
<p>Ltree is a PostgreSQL module. It implements a data type ltree for representing labels of data stored in a hierarchical tree-like structure. Extensive facilities for searching through label trees are provided.</p>
<h2 id="why-ltree">Why ltree?</h2>
<ul>
<li>The ltree implements a materialized path, which is very quick for INSERT/UPDATE/DELETE and pretty quick for SELECT operations</li>
<li>It will be generally faster than using a recursive CTE or recursive function that constantly needs to recalculate the branching</li>
<li>Has built-in query syntax and operators specifically designed for querying and navigating trees</li>
<li>Indexes!!!</li>
</ul>
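<p>To see why a materialized path avoids recursion, it helps to look at what the path string already contains. The sketch below is plain Ruby, not ltree itself, and the helper names and sample paths are made up for illustration: every ancestor of a node is just a prefix of its dotted path, so no repeated lookups are needed to walk up the tree.</p>

```ruby
# All ancestors of a node, from the root down to its parent.
def ancestor_paths(path)
  labels = path.split('.')
  (1...labels.size).map { |depth| labels.first(depth).join('.') }
end

# Depth of a node: the number of labels in its path.
def depth_of(path)
  path.split('.').size
end

# True when path lies strictly below ancestor in the tree.
# The trailing dot keeps "0001.00021" from matching "0001.0002".
def strict_descendant?(path, ancestor)
  path.start_with?("#{ancestor}.")
end

node = '0001.0001.0001.0002'
puts "ancestors: #{ancestor_paths(node).inspect}"
puts "depth: #{depth_of(node)}"
puts "below 0001.0001? #{strict_descendant?(node, '0001.0001')}"
```

<p>The ltree type provides operators and index support for this kind of question inside the database, so in practice the comparison happens in SQL rather than in application code.</p>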
<h1 id="initial-data">Initial data</h1>
<p>First of all, you should enable the extension in your database. You can do this with this command:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">ltree</span><span class="p">;</span></code></pre></figure>
<p>Let’s create a table and add some data to it:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span> <span class="nb">integer</span><span class="p">,</span> <span class="n">description</span> <span class="nb">text</span><span class="p">,</span> <span class="n">path</span> <span class="n">ltree</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0001.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0001.0001.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">1</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0001.0001.0002'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0001.0001.0003'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">6</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0002'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">6</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0002.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">6</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">8</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">9</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">11</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">2</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">5</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0003'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">7</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0001'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">20</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0002'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">31</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0003'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">22</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0004'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">34</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0005'</span><span class="p">);</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">22</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0002.0002.0006'</span><span class="p">);</span></code></pre></figure>
<p>We should also add some indexes:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">path_gist_comments_idx</span> <span class="k">ON</span> <span class="n">comments</span> <span class="k">USING</span> <span class="n">GIST</span><span class="p">(</span><span class="n">path</span><span class="p">);</span>
<span class="k">CREATE</span> <span class="k">INDEX</span> <span class="n">path_comments_idx</span> <span class="k">ON</span> <span class="n">comments</span> <span class="k">USING</span> <span class="n">btree</span><span class="p">(</span><span class="n">path</span><span class="p">);</span></code></pre></figure>
<p>As you can see, I created the ‘comments’ table with a ‘path’ field, which contains the full path to the comment in the tree. Each label in the path is four digits, and labels are separated by a dot.</p>
<p>Let’s find all comments whose path begins with ‘0001.0003’:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o"><@</span> <span class="s1">'0001.0003'</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="p">(</span><span class="mi">12</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>We can check this query with the EXPLAIN command:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o"><@</span> <span class="s1">'0001.0003'</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">----------------------------------------------------------------------------------------------------</span>
<span class="n">Seq</span> <span class="n">Scan</span> <span class="k">on</span> <span class="n">comments</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">1</span><span class="p">.</span><span class="mi">24</span> <span class="k">rows</span><span class="o">=</span><span class="mi">2</span> <span class="n">width</span><span class="o">=</span><span class="mi">38</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">013</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">017</span> <span class="k">rows</span><span class="o">=</span><span class="mi">12</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Filter</span><span class="p">:</span> <span class="p">(</span><span class="n">path</span> <span class="o"><@</span> <span class="s1">'0001.0003'</span><span class="p">::</span><span class="n">ltree</span><span class="p">)</span>
<span class="k">Rows</span> <span class="n">Removed</span> <span class="k">by</span> <span class="n">Filter</span><span class="p">:</span> <span class="mi">7</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">038</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">4</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>For the test, let’s disable sequential scans:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SET</span> <span class="n">enable_seqscan</span><span class="o">=</span><span class="k">false</span><span class="p">;</span>
<span class="k">SET</span>
<span class="err">$</span> <span class="k">EXPLAIN</span> <span class="k">ANALYZE</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o"><@</span> <span class="s1">'0001.0003'</span><span class="p">;</span>
<span class="n">QUERY</span> <span class="n">PLAN</span>
<span class="c1">-----------------------------------------------------------------------------------------------------------------------------------</span>
<span class="k">Index</span> <span class="n">Scan</span> <span class="k">using</span> <span class="n">path_gist_comments_idx</span> <span class="k">on</span> <span class="n">comments</span> <span class="p">(</span><span class="n">cost</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">00</span><span class="p">..</span><span class="mi">8</span><span class="p">.</span><span class="mi">29</span> <span class="k">rows</span><span class="o">=</span><span class="mi">2</span> <span class="n">width</span><span class="o">=</span><span class="mi">38</span><span class="p">)</span> <span class="p">(</span><span class="n">actual</span> <span class="nb">time</span><span class="o">=</span><span class="mi">0</span><span class="p">.</span><span class="mi">023</span><span class="p">..</span><span class="mi">0</span><span class="p">.</span><span class="mi">034</span> <span class="k">rows</span><span class="o">=</span><span class="mi">12</span> <span class="n">loops</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="k">Index</span> <span class="n">Cond</span><span class="p">:</span> <span class="p">(</span><span class="n">path</span> <span class="o"><@</span> <span class="s1">'0001.0003'</span><span class="p">::</span><span class="n">ltree</span><span class="p">)</span>
<span class="n">Total</span> <span class="n">runtime</span><span class="p">:</span> <span class="mi">0</span><span class="p">.</span><span class="mi">076</span> <span class="n">ms</span>
<span class="p">(</span><span class="mi">3</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>Now it’s slower, but you can see how it uses the GiST index. The first query used a sequential scan because the table holds very little data.</p>
<p>We can write the “path <@ ‘0001.0003’” condition in another way:</p>
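<p>Since enable_seqscan was switched off only for this experiment, it is worth restoring it before running anything else in the same session. A minimal sketch, assuming you are still in the same psql session as above:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- Restore the planner setting to its default (SET above affected only this session)
RESET enable_seqscan;
-- Check the current value
SHOW enable_seqscan;</code></pre></figure>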
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.0003.*'</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="p">(</span><span class="mi">12</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>Also, do not forget about ordering: without ORDER BY, the rows are not guaranteed to come back in tree order. Example:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">9</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0001.0001'</span><span class="p">);</span>
<span class="err">$</span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">9</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0001.0002'</span><span class="p">);</span>
<span class="err">$</span> <span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">comments</span> <span class="p">(</span><span class="n">user_id</span><span class="p">,</span> <span class="n">description</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span> <span class="k">VALUES</span> <span class="p">(</span> <span class="mi">9</span><span class="p">,</span> <span class="n">md5</span><span class="p">(</span><span class="n">random</span><span class="p">()::</span><span class="nb">text</span><span class="p">),</span> <span class="s1">'0001.0003.0001.0003'</span><span class="p">);</span>
<span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.0003.*'</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="p">(</span><span class="mi">15</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>Now with ORDER BY:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.0003.*'</span> <span class="k">ORDER</span> <span class="k">by</span> <span class="n">path</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="p">(</span><span class="mi">15</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>There are several modifiers that can be put at the end of a non-star label in lquery to make it match more than just the exact match:</p>
<ul>
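<li><code>@</code> - match case-insensitively, for example <code>a@</code> matches <code>A</code></li>
<li><code>*</code> - match any label with this prefix, for example <code>foo*</code> matches <code>foobar</code></li>
<li><code>%</code> - match initial underscore-separated words, for example <code>foo_bar%</code> matches <code>foo_bar_baz</code> but not <code>foo_barbaz</code></li>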
</ul>
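<p>A minimal sketch of these modifiers, using literal values instead of the <code>comments</code> table so it can be run on its own (it assumes the <code>ltree</code> extension is already installed). The expected results in the comments follow the examples given in the ltree documentation:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- "@": case-insensitive match
SELECT 'A'::ltree ~ 'a@'::lquery;              -- expected: true
-- "*": prefix match on a single label
SELECT 'foobar'::ltree ~ 'foo*'::lquery;       -- expected: true
-- "%": match initial underscore-separated words
SELECT 'foo_bar_baz'::ltree ~ 'foo_bar%'::lquery;  -- expected: true
SELECT 'foo_barbaz'::ltree ~ 'foo_bar%'::lquery;   -- expected: false</code></pre></figure>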
<p>Also, you can write several possibly-modified labels separated with <code>|</code> (OR) to match any of those labels, and you can put <code>!</code> (NOT) at the start to match any label that doesn’t match any of the alternatives. Example:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.*{1,2}.0001|0002.*'</span> <span class="k">ORDER</span> <span class="k">by</span> <span class="n">path</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">1</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="p">(</span><span class="mi">19</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
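<p>The query above only exercises OR. As a small sketch of NOT, the following checks use literal paths (not rows from the <code>comments</code> table) and assume the <code>ltree</code> extension is installed. The pattern requires the second label to be anything except <code>0003</code>:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- second label is 0002, which is not 0003, so the path matches
SELECT '0001.0002.0001'::ltree ~ '0001.!0003.*'::lquery;  -- expected: true
-- second label is 0003, so the NOT item rejects the path
SELECT '0001.0003.0001'::ltree ~ '0001.!0003.*'::lquery;  -- expected: false</code></pre></figure>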
<p>Now let’s use it for our comment system. Find all direct children of the parent ‘0001.0003’:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.0003.*{1}'</span> <span class="k">ORDER</span> <span class="k">by</span> <span class="n">path</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+----------------</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="p">(</span><span class="mi">2</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
<p>Find all children of the parent ‘0001.0003’ (the result also includes the parent itself, because <code>*</code> matches zero or more labels):</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">~</span> <span class="s1">'0001.0003.*'</span> <span class="k">ORDER</span> <span class="k">by</span> <span class="n">path</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+--------------------------</span>
<span class="mi">6</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">8</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">9</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">11</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">7</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0001</span>
<span class="mi">20</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="mi">31</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0004</span>
<span class="mi">34</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0005</span>
<span class="mi">22</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0006</span>
<span class="mi">5</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0003</span>
<span class="p">(</span><span class="mi">15</span> <span class="k">rows</span><span class="p">)</span></code></pre></figure>
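<p>Because a trailing <code>*</code> also matches zero labels, the result above contains the row for ‘0001.0003’ itself. If only the descendants are wanted, one possible approach (a sketch against the same <code>comments</code> table, not the only way to do it) is to require at least one extra label with <code>*{1,}</code>, or to use the descendant operator <code>&lt;@</code> and exclude the node explicitly:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- *{1,} requires at least one label after the 0001.0003 prefix
SELECT user_id, path FROM comments
WHERE path ~ '0001.0003.*{1,}'
ORDER BY path;

-- equivalent idea with the "is a descendant of (or equal to)" operator
SELECT user_id, path FROM comments
WHERE path &lt;@ '0001.0003' AND path &lt;&gt; '0001.0003'
ORDER BY path;</code></pre></figure>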
<p>Find the parent of the child ‘0001.0003.0002.0002.0005’:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql"><span class="err">$</span> <span class="k">SELECT</span> <span class="n">user_id</span><span class="p">,</span> <span class="n">path</span> <span class="k">FROM</span> <span class="n">comments</span> <span class="k">WHERE</span> <span class="n">path</span> <span class="o">=</span> <span class="n">subpath</span><span class="p">(</span><span class="s1">'0001.0003.0002.0002.0005'</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">ORDER</span> <span class="k">by</span> <span class="n">path</span><span class="p">;</span>
<span class="n">user_id</span> <span class="o">|</span> <span class="n">path</span>
<span class="c1">---------+---------------------</span>
<span class="mi">2</span> <span class="o">|</span> <span class="mi">0001</span><span class="p">.</span><span class="mi">0003</span><span class="p">.</span><span class="mi">0002</span><span class="p">.</span><span class="mi">0002</span>
<span class="p">(</span><span class="mi">1</span> <span class="k">row</span><span class="p">)</span></code></pre></figure>
<p>If your paths are not unique, this query will return several records.</p>
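<p>The same idea extends from the direct parent to the whole chain of ancestors. As a sketch against the same <code>comments</code> table, the ancestor operator <code>@&gt;</code> (true when the left path is an ancestor of, or equal to, the right one) returns every comment above the given one; the extra condition drops the comment itself:</p>
<figure class="highlight"><pre><code class="language-sql" data-lang="sql">-- all ancestors of the comment, from the root down to its direct parent
SELECT user_id, path FROM comments
WHERE path @&gt; '0001.0003.0002.0002.0005'
  AND path &lt;&gt; '0001.0003.0002.0002.0005'
ORDER BY path;</code></pre></figure>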
<h1 id="summary">Summary</h1>
<p>As can be seen, working with an ltree materialized path is very simple. This article does not list every possible use of ltree; for example, full-text-style search with ltxtquery is not covered. You can find the rest in the <a href="http://www.postgresql.org/docs/current/static/ltree.html">documentation</a>.</p>
<p><em>That’s all folks!</em> Thank you for reading till the end.</p>