<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="http://0.0.0.0:4000/draft-blog3/feed.xml" rel="self" type="application/atom+xml" /><link href="http://0.0.0.0:4000/draft-blog3/" rel="alternate" type="text/html" /><updated>2026-03-27T20:10:04-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/feed.xml</id><title type="html">Julian Hyde on Streaming Data, Open Source OLAP. And stuff.</title><subtitle>Julian Hyde&apos;s blog</subtitle><entry><title type="html">Scratch</title><link href="http://0.0.0.0:4000/draft-blog3/9999/12/31/scratch.html" rel="alternate" type="text/html" title="Scratch" /><published>9999-12-31T12:00:00-08:00</published><updated>9999-12-31T12:00:00-08:00</updated><id>http://0.0.0.0:4000/draft-blog3/9999/12/31/scratch</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/9999/12/31/scratch.html"><![CDATA[<h2 id="sql-fragments-for-more-than-query-talk">SQL fragments for “More than Query” talk.</h2>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">fun</span> <span class="nf">mask</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">answer</span><span class="p">)</span> <span class="p">=</span>
  <span class="kr">let</span>
    <span class="kr">fun</span> <span class="nf">mask2</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="p">[],</span> <span class="n">answer</span><span class="p">)</span> <span class="p">=</span> <span class="n">m</span>
      <span class="p">|</span> <span class="nf">mask2</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">letter</span> <span class="n">::</span> <span class="n">rest</span><span class="p">,</span> <span class="n">answer</span><span class="p">)</span> <span class="p">=</span>
          <span class="n">mask2</span> <span class="p">((</span><span class="n">m</span> <span class="n">*</span> <span class="mi">3</span>
          <span class="n">+</span> <span class="p">(</span><span class="kr">if</span> <span class="n">sub</span><span class="p">(</span><span class="n">answer</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span> <span class="p">=</span> <span class="n">letter</span>
               <span class="kr">then</span> <span class="mi">2</span>
             <span class="kr">else</span> <span class="kr">if</span> <span class="n">isSubstring</span><span class="p">(</span><span class="n">str</span> <span class="n">letter</span><span class="p">)</span> <span class="n">answer</span>
               <span class="kr">then</span> <span class="mi">1</span>
             <span class="kr">else</span> <span class="mi">0</span><span class="p">)),</span> <span class="n">i</span> <span class="n">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">rest</span><span class="p">,</span> <span class="n">answer</span><span class="p">)</span>
  <span class="kr">in</span>
    <span class="n">mask2</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">explode</span> <span class="n">guess</span><span class="p">,</span> <span class="n">answer</span><span class="p">)</span>
  <span class="kr">end</span><span class="p">;</span>

<span class="kr">fun</span> <span class="nf">maskToString</span> <span class="n">m</span> <span class="p">=</span>
  <span class="kr">let</span>
    <span class="kr">fun</span> <span class="nf">maskToString2</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">s</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="p">=</span> <span class="n">s</span>
      <span class="p">|</span> <span class="nf">maskToString2</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">s</span><span class="p">,</span> <span class="n">k</span><span class="p">)</span> <span class="p">=</span>
        <span class="n">maskToString2</span> <span class="p">(</span><span class="n">m</span> <span class="n">div</span> <span class="mi">3</span><span class="p">,</span>
          <span class="nn">List</span><span class="p">.</span><span class="n">nth</span><span class="p">([</span><span class="s2">"b"</span><span class="p">,</span> <span class="s2">"y"</span><span class="p">,</span> <span class="s2">"g"</span><span class="p">],</span> <span class="n">m</span> <span class="n">mod</span> <span class="mi">3</span><span class="p">)</span> <span class="n">^</span> <span class="n">s</span><span class="p">,</span>
          <span class="n">k</span> <span class="n">-</span> <span class="mi">1</span><span class="p">)</span>
  <span class="kr">in</span>
    <span class="n">maskToString2</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
  <span class="kr">end</span><span class="p">;</span>

<span class="kr">val</span> <span class="nv">words</span> <span class="p">=</span> <span class="n">from</span> <span class="n">w</span> <span class="kr">in</span> <span class="nn">file</span><span class="p">.</span><span class="nn">wordle</span><span class="p">.</span><span class="n">words</span> <span class="n">yield</span> <span class="nn">w</span><span class="p">.</span><span class="n">word</span><span class="p">;</span>

<span class="kr">fun</span> <span class="nf">maskCount</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">remainingWords</span><span class="p">)</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">w</span> <span class="kr">in</span> <span class="n">remainingWords</span>
    <span class="n">group</span> <span class="n">m</span> <span class="p">=</span> <span class="n">mask</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">w</span><span class="p">)</span> <span class="n">compute</span> <span class="n">c</span> <span class="p">=</span> <span class="n">count</span>
    <span class="n">compute</span> <span class="n">count</span><span class="p">;</span>

<span class="kr">fun</span> <span class="nf">bestGuesses</span> <span class="n">words</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">w</span> <span class="kr">in</span> <span class="n">words</span><span class="p">,</span>
    <span class="n">maskCount</span> <span class="p">=</span> <span class="n">maskCount</span> <span class="p">(</span><span class="n">w</span><span class="p">,</span> <span class="n">words</span><span class="p">)</span>
    <span class="n">order</span> <span class="n">maskCount</span> <span class="n">desc</span><span class="p">;</span>

<span class="kr">fun</span> <span class="nf">remaining</span> <span class="p">(</span><span class="n">words</span><span class="p">,</span> <span class="p">[])</span> <span class="p">=</span> <span class="n">words</span>
  <span class="p">|</span> <span class="nf">remaining</span> <span class="p">(</span><span class="n">words</span><span class="p">,</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">m</span><span class="p">)</span> <span class="n">::</span> <span class="n">rest</span><span class="p">)</span> <span class="p">=</span>
      <span class="n">from</span> <span class="n">w</span> <span class="kr">in</span> <span class="p">(</span><span class="n">remaining</span> <span class="p">(</span><span class="n">words</span><span class="p">,</span> <span class="n">rest</span><span class="p">))</span>
      <span class="kr">where</span> <span class="n">maskToString</span> <span class="p">(</span><span class="n">mask</span> <span class="p">(</span><span class="n">guess</span><span class="p">,</span> <span class="n">w</span><span class="p">))</span> <span class="p">=</span> <span class="n">m</span><span class="p">;</span>
</code></pre></div></div>

<p>Functional programming: values, types, operators</p>
<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="mi">1</span> <span class="n">+</span> <span class="mi">2</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span> <span class="mi">3</span> <span class="p">:</span> <span class="n">int</span>
<span class="s2">"Hello, "</span> <span class="n">^</span> <span class="s2">"world!"</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span> <span class="s2">"Hello, world!"</span> <span class="p">:</span> <span class="n">string</span>
<span class="kr">val</span> <span class="nv">integers</span> <span class="p">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">];</span>
<span class="n">&gt;</span> <span class="kr">val</span> <span class="nv">integers</span> <span class="p">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">7</span><span class="p">,</span><span class="mi">8</span><span class="p">]</span> <span class="p">:</span> <span class="n">int</span> <span class="n">list</span>
<span class="kr">fun</span> <span class="nf">filter</span> <span class="n">f</span> <span class="p">[]</span> <span class="p">=</span> <span class="p">[]</span>
  <span class="p">|</span> <span class="nf">filter</span> <span class="n">f</span> <span class="p">(</span><span class="n">first</span> <span class="n">::</span> <span class="n">rest</span><span class="p">)</span> <span class="p">=</span>
      <span class="kr">if</span> <span class="p">(</span><span class="n">f</span> <span class="n">first</span><span class="p">)</span>
        <span class="kr">then</span> <span class="n">first</span> <span class="n">::</span> <span class="p">(</span><span class="n">filter</span> <span class="n">f</span> <span class="n">rest</span><span class="p">)</span>
        <span class="kr">else</span> <span class="n">filter</span> <span class="n">f</span> <span class="n">rest</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="kr">val</span> <span class="nv">filter</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="p">(</span><span class="nd">'a</span> <span class="p">-&gt;</span> <span class="n">bool</span><span class="p">)</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span>
<span class="n">filter</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">i</span> <span class="p">=&gt;</span> <span class="n">i</span> <span class="n">mod</span> <span class="mi">2</span> <span class="p">=</span> <span class="mi">0</span><span class="p">)</span> <span class="n">integers</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">8</span><span class="p">]</span> <span class="p">:</span> <span class="n">int</span> <span class="n">list</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">val</span> <span class="nv">union</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="nd">'a</span> <span class="n">list</span> <span class="n">*</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span>
<span class="kr">val</span> <span class="nv">except</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="nd">'a</span> <span class="n">list</span> <span class="n">*</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span>
<span class="kr">val</span> <span class="nv">intersect</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="nd">'a</span> <span class="n">list</span> <span class="n">*</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span>
<span class="kr">val</span> <span class="nv">filter</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="p">(</span><span class="nd">'a</span> <span class="p">-&gt;</span> <span class="n">bool</span><span class="p">)</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span>
<span class="kr">val</span> <span class="nv">map</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">:</span> <span class="p">(</span><span class="nd">'a</span> <span class="p">-&gt;</span> <span class="nd">'b</span><span class="p">)</span> <span class="p">-&gt;</span> <span class="nd">'a</span> <span class="n">list</span> <span class="p">-&gt;</span> <span class="nd">'b</span> <span class="n">list</span>
<span class="kr">val</span> <span class="nv">join</span> <span class="p">=</span> <span class="kr">fn</span>
  <span class="p">:</span> <span class="nd">'a</span> <span class="n">list</span> <span class="n">*</span> <span class="nd">'b</span> <span class="n">list</span> <span class="n">*</span> <span class="p">(</span><span class="nd">'a</span> <span class="n">*</span> <span class="nd">'b</span> <span class="p">-&gt;</span> <span class="n">bool</span><span class="p">)</span>
    <span class="p">-&gt;</span> <span class="p">(</span><span class="nd">'a</span> <span class="n">*</span> <span class="nd">'b</span><span class="p">)</span> <span class="n">list</span>
</code></pre></div></div>
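
<p>The signatures above leave the implementations open. As a minimal sketch, assuming plain Standard ML lists and the Basis Library, the set and join operators could be defined as below. The helper <code class="language-plaintext highlighter-rouge">contains</code> is a hypothetical name, <code class="language-plaintext highlighter-rouge">union</code> keeps duplicates (like SQL <code class="language-plaintext highlighter-rouge">UNION ALL</code>), and <code class="language-plaintext highlighter-rouge">except</code> and <code class="language-plaintext highlighter-rouge">intersect</code> get equality types (<code class="language-plaintext highlighter-rouge">''a</code>) because they compare elements.</p>
<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Sketch only: naive list-based versions, not necessarily how Morel defines them. *)

(* Hypothetical helper: true if x occurs in ys. *)
fun contains (x, ys) = List.exists (fn y =&gt; y = x) ys;

(* Concatenation; duplicates are kept. *)
fun union (xs, ys) = xs @ ys;

(* Elements of xs that do not occur in ys. *)
fun except (xs, ys) = List.filter (fn x =&gt; not (contains (x, ys))) xs;

(* Elements of xs that also occur in ys. *)
fun intersect (xs, ys) = List.filter (fn x =&gt; contains (x, ys)) xs;

(* Nested-loop join: every pair (x, y) for which pred holds. *)
fun join (xs, ys, pred) =
  List.concat
    (List.map
      (fn x =&gt; List.map (fn y =&gt; (x, y))
                 (List.filter (fn y =&gt; pred (x, y)) ys))
      xs);

join ([1, 2, 3], [2, 3, 4], fn (x, y) =&gt; x = y);
&gt; val it = [(2,2),(3,3)] : (int * int) list
</code></pre></div></div>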

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">db</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span>
  <span class="n">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">name</span><span class="p">,</span> <span class="n">pay</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">+</span> <span class="nn">e</span><span class="p">.</span><span class="n">comm</span><span class="p">}</span>
</code></pre></div></div>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- SQL</span>
<span class="k">SELECT</span> <span class="n">item</span><span class="p">,</span> <span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">AS</span> <span class="k">c</span><span class="p">,</span>
  <span class="k">SUM</span><span class="p">(</span><span class="n">sales</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total</span>
<span class="k">FROM</span> <span class="n">ProduceSales</span>
<span class="k">WHERE</span> <span class="n">item</span> <span class="o">!=</span> <span class="s1">'bananas'</span>
  <span class="k">AND</span> <span class="n">category</span> <span class="k">IN</span> <span class="p">(</span><span class="s1">'fruit'</span><span class="p">,</span> <span class="s1">'nut'</span><span class="p">)</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">item</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">item</span> <span class="k">DESC</span><span class="p">;</span>

<span class="n">item</span>      <span class="k">c</span> <span class="n">total</span>
<span class="o">======</span> <span class="o">====</span> <span class="o">=====</span>
<span class="n">apples</span>    <span class="mi">2</span>     <span class="mi">9</span>
</code></pre></div></div>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- GoogleSQL pipe syntax</span>
<span class="k">FROM</span> <span class="n">ProduceSales</span>
<span class="o">|&gt;</span> <span class="k">WHERE</span> <span class="n">item</span> <span class="o">!=</span> <span class="s1">'bananas'</span>
    <span class="k">AND</span> <span class="n">category</span> <span class="k">IN</span> <span class="p">(</span><span class="s1">'fruit'</span><span class="p">,</span> <span class="s1">'nut'</span><span class="p">)</span>
<span class="o">|&gt;</span> <span class="k">AGGREGATE</span> <span class="k">COUNT</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">AS</span> <span class="k">c</span><span class="p">,</span> <span class="k">SUM</span><span class="p">(</span><span class="n">sales</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total</span>
   <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">item</span>
<span class="o">|&gt;</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="n">item</span> <span class="k">DESC</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">from</span> <span class="n">p</span> <span class="kr">in</span> <span class="n">produceSales</span>
  <span class="kr">where</span> <span class="nn">p</span><span class="p">.</span><span class="n">item</span> <span class="n">&lt;&gt;</span> <span class="s2">"bananas"</span>
    <span class="kr">andalso</span> <span class="nn">p</span><span class="p">.</span><span class="n">category</span> <span class="n">elem</span> <span class="p">[</span><span class="s2">"fruit"</span><span class="p">,</span> <span class="s2">"nut"</span><span class="p">]</span>
  <span class="n">group</span> <span class="nn">p</span><span class="p">.</span><span class="n">item</span> <span class="n">compute</span> <span class="n">c</span> <span class="p">=</span> <span class="n">count</span><span class="p">,</span>
    <span class="n">total</span> <span class="p">=</span> <span class="n">sum</span> <span class="kr">of</span> <span class="nn">p</span><span class="p">.</span><span class="n">sales</span>
  <span class="n">order</span> <span class="n">item</span> <span class="n">desc</span>
</code></pre></div></div>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">e</span><span class="p">.</span><span class="n">ename</span><span class="p">,</span> <span class="n">e</span><span class="p">.</span><span class="n">sal</span>
<span class="k">FROM</span> <span class="n">emps</span> <span class="k">AS</span> <span class="n">e</span>
<span class="k">WHERE</span> <span class="n">e</span><span class="p">.</span><span class="n">deptno</span> <span class="o">=</span> <span class="mi">10</span>
<span class="k">AND</span> <span class="n">e</span><span class="p">.</span><span class="n">sal</span> <span class="o">&gt;</span> <span class="p">(</span><span class="k">SELECT</span> <span class="k">MAX</span><span class="p">(</span><span class="n">e2</span><span class="p">.</span><span class="n">sal</span><span class="p">)</span>
             <span class="k">FROM</span> <span class="n">emps</span> <span class="k">AS</span> <span class="n">e2</span>
             <span class="k">WHERE</span> <span class="n">e2</span><span class="p">.</span><span class="n">deptno</span> <span class="o">=</span> <span class="mi">20</span>
             <span class="k">AND</span> <span class="n">e2</span><span class="p">.</span><span class="n">job</span> <span class="o">=</span> <span class="s1">'PROGRAMMER'</span><span class="p">)</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span>
    <span class="kr">andalso</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">&gt;</span>
    <span class="p">(</span><span class="n">from</span> <span class="n">e2</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
      <span class="kr">where</span> <span class="nn">e2</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span>
        <span class="kr">andalso</span> <span class="nn">e2</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"PROGRAMMER"</span>
      <span class="n">compute</span> <span class="n">max</span> <span class="mi">0</span> <span class="kr">of</span> <span class="nn">e2</span><span class="p">.</span><span class="n">sal</span><span class="p">)</span>
  <span class="n">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emp</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span>
    <span class="kr">andalso</span>
    <span class="p">(</span><span class="n">forall</span> <span class="n">e2</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emp</span>
      <span class="kr">where</span> <span class="nn">e2</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span>
        <span class="kr">andalso</span> <span class="nn">e2</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"PROGRAMMER"</span>
      <span class="n">require</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">&gt;</span> <span class="nn">e2</span><span class="p">.</span><span class="n">sal</span><span class="p">)</span>
  <span class="n">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">datatype</span> <span class="kt">personnel_id</span> <span class="p">=</span>
    <span class="nc">EMPLOYEE</span> <span class="kr">of</span> <span class="n">int</span>
  <span class="p">|</span> <span class="nc">CONTRACTOR</span> <span class="kr">of</span> <span class="p">{</span><span class="n">ssid</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">agency</span><span class="p">:</span> <span class="n">string</span><span class="p">};</span>

<span class="kr">type</span> <span class="kt">member</span> <span class="p">=</span> <span class="p">{</span><span class="n">name</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">deptno</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">id</span><span class="p">:</span> <span class="n">personnel_id</span><span class="p">};</span>

<span class="kr">val</span> <span class="nv">members</span> <span class="p">=</span> <span class="p">[</span>
  <span class="p">{</span><span class="n">name</span> <span class="p">=</span> <span class="s2">"Smith"</span><span class="p">,</span> <span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span><span class="p">,</span> <span class="n">id</span> <span class="p">=</span> <span class="n">EMPLOYEE</span> <span class="mi">100</span><span class="p">},</span>
  <span class="p">{</span><span class="n">name</span> <span class="p">=</span> <span class="s2">"Jones"</span><span class="p">,</span> <span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span><span class="p">,</span>
   <span class="n">id</span> <span class="p">=</span> <span class="n">CONTRACTOR</span> <span class="p">{</span><span class="n">ssid</span> <span class="p">=</span> <span class="s2">"xxx-xx-xxxx"</span><span class="p">,</span> <span class="n">agency</span> <span class="p">=</span> <span class="s2">"Cheap &amp; cheerful"</span><span class="p">}}];</span>

<span class="kr">val</span> <span class="nv">departments</span> <span class="p">=</span> <span class="nn">scott</span><span class="p">.</span><span class="n">depts</span><span class="p">;</span>

<span class="kr">val</span> <span class="nv">primes</span> <span class="p">=</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">11</span><span class="p">];</span>

<span class="kr">val</span> <span class="nv">bands</span> <span class="p">=</span> <span class="p">[[</span><span class="s2">"john"</span><span class="p">,</span> <span class="s2">"paul"</span><span class="p">,</span> <span class="s2">"george"</span><span class="p">,</span> <span class="s2">"ringo"</span><span class="p">],</span> <span class="p">[</span><span class="s2">"simon"</span><span class="p">,</span> <span class="s2">"garfunkel"</span><span class="p">]];</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">from</span> <span class="n">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">],</span>
    <span class="n">j</span> <span class="kr">in</span> <span class="p">[</span><span class="s2">"a"</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">],</span>
    <span class="n">k</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">where</span> <span class="n">i</span> <span class="n">+</span> <span class="n">k</span> <span class="n">&lt;</span> <span class="mi">6</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="p">{</span><span class="n">i</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">j</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">k</span><span class="p">:</span> <span class="n">int</span><span class="p">}</span> <span class="n">list</span>

<span class="n">from</span> <span class="n">dept</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">30</span><span class="p">],</span>
    <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="n">dept</span>
  <span class="n">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">;</span>
<span class="n">&gt;</span> <span class="n">string</span> <span class="n">bag</span>

<span class="n">from</span> <span class="n">dept</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">30</span><span class="p">],</span>
    <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="n">dept</span>
  <span class="n">order</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">desc</span>
  <span class="n">take</span> <span class="mi">3</span>
  <span class="n">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">deptno</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">};</span>
<span class="n">&gt;</span> <span class="p">{</span><span class="n">deptno</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">ename</span><span class="p">:</span> <span class="n">string</span><span class="p">}</span> <span class="n">list</span>
</code></pre></div></div>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- Delete employees who earn more than 1,000.</span>
<span class="k">DELETE</span> <span class="k">FROM</span> <span class="n">scott</span><span class="p">.</span><span class="n">emps</span>
<span class="k">WHERE</span> <span class="n">sal</span> <span class="o">&gt;</span> <span class="mi">1000</span><span class="p">;</span>

<span class="c1">-- Add one employee.</span>
<span class="k">INSERT</span> <span class="k">INTO</span> <span class="n">scott</span><span class="p">.</span><span class="n">emps</span> <span class="p">(</span><span class="n">empno</span><span class="p">,</span> <span class="n">deptno</span><span class="p">,</span> <span class="n">ename</span><span class="p">,</span> <span class="n">job</span><span class="p">,</span> <span class="n">sal</span><span class="p">)</span>
<span class="k">VALUES</span> <span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="s1">'HYDE'</span><span class="p">,</span> <span class="s1">'ANALYST'</span><span class="p">,</span> <span class="mi">1150</span><span class="p">);</span>

<span class="c1">-- Double the salary of all managers.</span>
<span class="k">UPDATE</span> <span class="n">scott</span><span class="p">.</span><span class="n">emps</span>
<span class="k">SET</span> <span class="n">sal</span> <span class="o">=</span> <span class="n">sal</span> <span class="o">*</span> <span class="mi">2</span>
<span class="k">WHERE</span> <span class="n">job</span> <span class="o">=</span> <span class="s1">'MANAGER'</span><span class="p">;</span>

<span class="c1">-- Commit.</span>
<span class="k">COMMIT</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> Delete employees who earn more than 1,000. *)</span>
<span class="n">delete</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">&gt;</span> <span class="mi">1000</span><span class="p">;</span>
<span class="c">(*</span><span class="cm"> Add one employee. *)</span>
<span class="n">insert</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="p">[{</span><span class="n">empno</span> <span class="p">=</span> <span class="mi">100</span><span class="p">,</span> <span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span><span class="p">,</span> <span class="n">ename</span> <span class="p">=</span> <span class="s2">"HYDE"</span><span class="p">,</span>
    <span class="n">job</span> <span class="p">=</span> <span class="s2">"ANALYST"</span><span class="p">,</span> <span class="n">sal</span> <span class="p">=</span> <span class="mi">1150</span><span class="p">}];</span>
<span class="c">(*</span><span class="cm"> Double the salary of all managers. *)</span>
<span class="n">update</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"MANAGER"</span>
  <span class="n">assign</span> <span class="p">(</span><span class="n">e</span><span class="p">,</span> <span class="p">{</span><span class="n">e</span> <span class="kr">with</span> <span class="n">sal</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">*</span> <span class="mi">2</span><span class="p">});</span>
<span class="c">(*</span><span class="cm"> Commit. *)</span>
<span class="n">commit</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> Delete employees who earn more than 1,000. *)</span>
<span class="kr">val</span> <span class="nv">emps2</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
    <span class="kr">where</span> <span class="n">not</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">&gt;</span> <span class="mi">1000</span><span class="p">);</span>
<span class="c">(*</span><span class="cm"> Add one employee. *)</span>
<span class="kr">val</span> <span class="nv">emps3</span> <span class="p">=</span> <span class="n">emps2</span> <span class="n">union</span>
  <span class="p">[{</span><span class="n">empno</span> <span class="p">=</span> <span class="mi">100</span><span class="p">,</span> <span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span><span class="p">,</span> <span class="n">ename</span> <span class="p">=</span> <span class="s2">"HYDE"</span><span class="p">,</span> <span class="n">job</span> <span class="p">=</span> <span class="s2">"ANALYST"</span><span class="p">,</span>
    <span class="n">sal</span> <span class="p">=</span> <span class="mi">1150</span><span class="p">}];</span>
<span class="c">(*</span><span class="cm"> Double the salary of all managers. *)</span>
<span class="kr">val</span> <span class="nv">emps4</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">emps3</span>
    <span class="n">yield</span> <span class="kr">if</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"MANAGER"</span>
      <span class="kr">then</span> <span class="p">{</span><span class="n">e</span> <span class="kr">with</span> <span class="n">sal</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">*</span> <span class="mi">2</span><span class="p">}</span>
      <span class="kr">else</span> <span class="n">e</span><span class="p">;</span>
<span class="c">(*</span><span class="cm"> Commit. *)</span>
<span class="n">commit</span> <span class="p">{</span><span class="n">scott</span> <span class="kr">with</span> <span class="n">emps</span> <span class="p">=</span> <span class="n">emps4</span><span class="p">};</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> New and removed employees. *)</span>
<span class="kr">val</span> <span class="nv">empsAdded</span> <span class="p">=</span> <span class="n">emps4</span> <span class="n">except</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span><span class="p">;</span>
<span class="kr">val</span> <span class="nv">empsRemoved</span> <span class="p">=</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="n">except</span> <span class="n">emps4</span><span class="p">;</span>
<span class="c">(*</span><span class="cm"> Compute the updated summary table. *)</span>
<span class="kr">val</span> <span class="nv">summary2</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">s</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">summary</span>
    <span class="n">union</span>
      <span class="p">(</span><span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">empsAdded</span>
        <span class="n">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">deptno</span><span class="p">,</span> <span class="n">c</span> <span class="p">=</span> <span class="mi">1</span><span class="p">,</span> <span class="n">sum_sal</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">}</span>
    <span class="n">union</span>
      <span class="p">(</span><span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">empsRemoved</span>
        <span class="n">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">deptno</span><span class="p">,</span> <span class="n">c</span> <span class="p">=</span> <span class="mi">~1</span><span class="p">,</span> <span class="n">sum_sal</span> <span class="p">=</span> <span class="nn">~e</span><span class="p">.</span><span class="n">sal</span><span class="p">})</span>
    <span class="n">group</span> <span class="nn">s</span><span class="p">.</span><span class="n">deptno</span> <span class="n">compute</span> <span class="n">c</span> <span class="p">=</span> <span class="n">sum</span> <span class="kr">of</span> <span class="n">c</span><span class="p">,</span> <span class="n">sum_sal</span> <span class="p">=</span> <span class="n">sum</span> <span class="kr">of</span> <span class="n">sum_sal</span>
    <span class="kr">where</span> <span class="n">c</span> <span class="n">!=</span> <span class="mi">0</span><span class="p">);</span>
<span class="c">(*</span><span class="cm"> Commit. *)</span>
<span class="n">commit</span> <span class="p">{</span><span class="n">scott</span> <span class="kr">with</span> <span class="n">summary</span> <span class="p">=</span> <span class="n">summary2</span><span class="p">};</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> Morel "forwards" relation *)</span>
<span class="c">(*</span><span class="cm"> Relation defined using algebra. *)</span>
<span class="kr">fun</span> <span class="nf">clerks</span> <span class="p">()</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">emps</span>
    <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"CLERK"</span><span class="p">;</span>
<span class="c">(*</span><span class="cm"> Query uses regular iteration. *)</span>
<span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="n">clerks</span> <span class="p">(),</span>
    <span class="n">d</span> <span class="kr">in</span> <span class="n">depts</span>
  <span class="kr">where</span> <span class="nn">d</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span>
  <span class="kr">andalso</span> <span class="nn">d</span><span class="p">.</span><span class="n">loc</span> <span class="p">=</span> <span class="s2">"DALLAS"</span>
  <span class="n">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">name</span><span class="p">;</span>
<span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span>
  <span class="p">[</span><span class="s2">"SMITH"</span><span class="p">,</span> <span class="s2">"ADAMS"</span><span class="p">]</span> <span class="p">:</span> <span class="n">string</span> <span class="n">list</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> Morel "backwards" relation *)</span>
<span class="c">(*</span><span class="cm"> Relation defined using a predicate. *)</span>
<span class="kr">fun</span> <span class="nf">isClerk</span> <span class="n">e</span> <span class="p">=</span>
  <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"CLERK"</span><span class="p">;</span>
<span class="c">(*</span><span class="cm"> Query uses a mixture of constrained
   and regular iteration. *)</span>
<span class="n">from</span> <span class="n">e</span><span class="p">,</span>
    <span class="n">d</span> <span class="kr">in</span> <span class="n">depts</span>
  <span class="kr">where</span> <span class="n">isClerk</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
    <span class="kr">andalso</span> <span class="nn">d</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span>
    <span class="kr">andalso</span> <span class="nn">d</span><span class="p">.</span><span class="n">loc</span> <span class="p">=</span> <span class="s2">"DALLAS"</span>
  <span class="n">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">name</span><span class="p">;</span>
<span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span>
  <span class="p">[</span><span class="s2">"SMITH"</span><span class="p">,</span> <span class="s2">"ADAMS"</span><span class="p">]</span> <span class="p">:</span> <span class="n">string</span> <span class="n">list</span><span class="p">;</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kr">datatype</span> <span class="nd">'a</span> <span class="kt">option</span> <span class="p">=</span> <span class="nc">NONE</span> <span class="p">|</span> <span class="nc">SOME</span> <span class="kr">of</span> <span class="nd">'a</span><span class="p">;</span>
<span class="n">SOME</span> <span class="mi">1</span><span class="p">;</span>

<span class="n">-</span> <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emp</span>
<span class="p">=</span>   <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span>
<span class="p">=</span>   <span class="kr">andalso</span> <span class="p">(</span><span class="n">forall</span> <span class="n">e2</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emp</span>
<span class="p">=</span>     <span class="kr">where</span> <span class="nn">e2</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"PROGRAMMER"</span>
<span class="p">=</span>     <span class="n">require</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="n">&gt;</span> <span class="nn">e2</span><span class="p">.</span><span class="n">sal</span><span class="p">)</span>
<span class="p">=</span>   <span class="n">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">};</span>
<span class="kr">val</span> <span class="nv">it</span> <span class="p">=</span>
  <span class="p">[{</span><span class="n">ename</span><span class="p">=</span><span class="s2">"CLARK"</span><span class="p">,</span><span class="n">sal</span><span class="p">=</span><span class="mf">2450.0</span><span class="p">},{</span><span class="n">ename</span><span class="p">=</span><span class="s2">"KING"</span><span class="p">,</span><span class="n">sal</span><span class="p">=</span><span class="mf">5000.0</span><span class="p">},</span>
   <span class="p">{</span><span class="n">ename</span><span class="p">=</span><span class="s2">"MILLER"</span><span class="p">,</span><span class="n">sal</span><span class="p">=</span><span class="mf">1300.0</span><span class="p">}]</span> <span class="p">:</span> <span class="p">{</span><span class="n">ename</span><span class="p">:</span><span class="n">string</span><span class="p">,</span> <span class="n">sal</span><span class="p">:</span><span class="n">real</span><span class="p">}</span> <span class="n">list</span>
</code></pre></div></div>

<div class="language-sml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(*</span><span class="cm"> Initial database value and type (schema). *)</span>
<span class="kr">val</span> <span class="nv">scott1</span> <span class="p">=</span> <span class="n">db</span>
  <span class="p">:</span> <span class="p">{</span><span class="n">emps</span><span class="p">:</span> <span class="p">{</span><span class="n">name</span><span class="p">:</span> <span class="n">string</span><span class="p">,</span> <span class="n">empno</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">deptno</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span>
            <span class="n">hiredate</span><span class="p">:</span> <span class="n">string</span><span class="p">}</span> <span class="n">bag</span><span class="p">,</span>
    <span class="n">depts</span><span class="p">:</span> <span class="p">{</span><span class="n">deptno</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">name</span><span class="p">:</span> <span class="n">string</span><span class="p">}</span> <span class="n">bag</span><span class="p">};</span>
<span class="c">(*</span><span class="cm"> Shim that makes a v1 database look like v2. *)</span>
<span class="kr">fun</span> <span class="nf">scott2on1shim</span> <span class="n">scott1</span> <span class="p">=</span>
  <span class="p">{</span><span class="n">emps</span> <span class="p">=</span>
    <span class="kr">fn</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott1</span><span class="p">.</span><span class="n">emps</span>
      <span class="n">yield</span> <span class="p">{</span><span class="n">e</span> <span class="kr">with</span> <span class="n">hiredate</span> <span class="p">=</span> <span class="nn">Date</span><span class="p">.</span><span class="n">fromString</span><span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">hiredate</span><span class="p">)},</span>
   <span class="n">depts</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="nn">scott1</span><span class="p">.</span><span class="n">depts</span><span class="p">};</span>
<span class="c">(*</span><span class="cm"> Shim that makes a v3 database look like v1. *)</span>
<span class="kr">fun</span> <span class="nf">scott1on3shim</span> <span class="n">scott3</span> <span class="p">=</span>
  <span class="p">{</span><span class="n">emps</span> <span class="p">=</span>
    <span class="kr">fn</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott3</span><span class="p">.</span><span class="n">emps</span>
      <span class="n">yield</span> <span class="p">{</span><span class="n">e</span> <span class="kr">with</span> <span class="n">hiredate</span> <span class="p">=</span> <span class="nn">Date</span><span class="p">.</span><span class="n">toString</span><span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">hiredate</span><span class="p">)</span>
               <span class="n">removing</span> <span class="n">rating</span><span class="p">},</span>
   <span class="n">depts</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">()</span> <span class="p">=&gt;</span> <span class="nn">scott3</span><span class="p">.</span><span class="n">depts</span><span class="p">};</span>
<span class="c">(*</span><span class="cm"> An application writes its queries &amp; views against version 2;
   shims make it work on any actual version. *)</span>
<span class="kr">val</span> <span class="nv">scott</span> <span class="p">=</span> <span class="n">scott2</span><span class="p">;</span>
<span class="kr">fun</span> <span class="nf">recentHires</span> <span class="p">()</span> <span class="p">=</span>
  <span class="n">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
    <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">hiredate</span> <span class="n">&gt;</span> <span class="nn">Date</span><span class="p">.</span><span class="n">subtract</span><span class="p">(</span><span class="nn">Date</span><span class="p">.</span><span class="n">now</span><span class="p">(),</span> <span class="mi">100</span><span class="p">);</span>
</code></pre></div></div>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[SQL fragments for “More than Query” talk.]]></summary></entry><entry><title type="html">What Humans Do When Agents Write Code</title><link href="http://0.0.0.0:4000/draft-blog3/2026/03/23/tools-languages-agentic-era.html" rel="alternate" type="text/html" title="What Humans Do When Agents Write Code" /><published>2026-03-23T13:00:00-07:00</published><updated>2026-03-23T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2026/03/23/tools-languages-agentic-era</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2026/03/23/tools-languages-agentic-era.html"><![CDATA[<p>Agents can write code. So what’s left for humans to do? The answer,
I’ll argue, is architecture — and specifically, the design of
tools.  That turns out to be hard, important, and distinctly human
work.</p>

<p><img src="/draft-blog3/assets/img/tools.jpg" alt="Tools on a workbench" /></p>

<p>An agent can turn out code much faster than any human developer,
but like a human, it has trouble grasping the big picture. A 100K-line
system is too large for a single context, but if it is divided into
smaller pieces, the agent can write and maintain them one at a time.</p>

<p>This is where the human developer — the architect — can
help.  I define architecture as a <em>strategy for organizing
complexity</em>, which usually involves dividing the system into components
with clean interfaces.  An agent (or human) working on one component
needs only a brief, accurate picture of the others — not their source.</p>

<p>Components take many forms, such as microservices, APIs, and
libraries.  A single project can contain several components, provided that
care is taken to clearly define each component’s boundary. This post
focuses on tools, which are the simplest kind of component.</p>

<h2 id="unix-tools">Unix tools</h2>

<p>Unix tools are effective components because they have a simple
interface contract (the man page), are easy to invoke (from the
shell), and compose with other tools (in pipelines and scripts).</p>

<!-- SECTION 2: Architecture as the answer
Divide the system into components with clean interfaces. An agent (or
human) working on one component needs only a brief, accurate picture
of the others — not their source. The right model for this is the
Unix tool: a program with a man page. The man page IS the interface
contract. If your component can't be described in a man page, it
isn't a tool yet.
-->

<!-- SECTION 3: Power through language
`grep` has enormous functionality, but you invoke it concisely because
regexp is a powerful declarative language. This is the pattern: a good
tool comes with a small declarative language. The return of tools is
the return of DSLs.
-->

<!-- SECTION 4: darn as a concrete example
`darn` (https://github.com/hydromatic/morel/issues/345) processes
Markdown, but it also invokes the Morel kernel to validate and execute
embedded code fragments. It has a clear man page, does one thing, and
uses Morel as its language for code-block evaluation. Calcite's
relational algebra interface is another example of a tool with a clean
declarative language at its interface.
-->

<!-- SECTION 5: Designing tools is hard — which makes it good human work
Choosing the right abstraction, picking the right language, writing the
man page (which is the spec) — this requires taste and experience.
The difficulty is the point: it's exactly the kind of high-leverage
work that humans should be doing.
-->

<!-- SECTION 6: The toolchain as a form of architecture
Lint rules that enforce patterns — e.g. enum variants must be sorted —
are a different kind of tool. Even if you don't read agent-generated
code, the compiler rejects code that violates the invariant. This
forces the agent toward consistent patterns and prevents edits that
can't be merged. Humans encode architectural decisions into the
toolchain so they don't have to review every line.
-->

<!-- CONCLUSION
The measure of a well-designed component is whether you can write its
man page. If you can, an agent can use it. Beyond individual tools,
the toolchain itself encodes architecture: a lint rule that rejects
unsorted enum variants doesn't just catch a style nit — it forces the
agent to adopt a pattern that will merge cleanly. In the agentic era,
humans set the constraints; agents work within them.
-->

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter.</p>

<!--
This article
[has been updated](https://github.com/julianhyde/share/commits/main/blog/_posts/2026-03-23-tools-languages-agentic-era.md).
-->]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[Agents can write code. So what’s left for humans to do? The answer, I’ll argue, is architecture — and specifically, the design of tools. That turns out to be hard, important, and distinctly human work.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/tools.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/tools.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Datalog in Morel</title><link href="http://0.0.0.0:4000/draft-blog3/2026/03/09/datalog-in-morel.html" rel="alternate" type="text/html" title="Datalog in Morel" /><published>2026-03-09T13:00:00-07:00</published><updated>2026-03-09T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2026/03/09/datalog-in-morel</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2026/03/09/datalog-in-morel.html"><![CDATA[<p>This week we
<a href="https://github.com/hydromatic/morel/commit/62581437ac9c8dc415b159fdc9d6abc7eb588e9a">added Datalog support</a>
to Morel — not by building a Datalog engine,
but by adding a language feature called predicate inversion.</p>

<p>You can now write queries in the
<a href="https://souffle-lang.github.io/">Soufflé</a> dialect of Datalog
and execute them using Morel’s usual runtime.</p>

<p>This demonstrates that Morel now supports both query paradigms —
Datalog’s relational calculus and Morel’s native relational algebra
— and you can freely switch between them. But what are these
paradigms, and why does it matter?</p>

<h2 id="the-two-paradigms">The two paradigms</h2>

<p>The two paradigms originate in set theory, and continue
through the relational model into modern query languages.</p>

<p>Set theory provides two ways to define a set: the <strong>intensional</strong>
method defines the set by its properties (for example, “red cars” is
the set of all cars whose color is red), and the <strong>extensional</strong>
method creates the set by performing operations on existing sets
(intersect the set of all cars with the set of all red objects).</p>

<p>The relational model for databases provides two ways to specify a
query which mirror intensional and extensional set definitions. In
<strong>relational calculus</strong>, one specifies the logical properties of the
tuples to retrieve from the input relations; in <strong>relational
algebra</strong>, one specifies the input relations and a sequence of
operations (intersect, join, filter, project) to apply to them.
<a href="https://en.wikipedia.org/wiki/Codd%27s_theorem">Codd’s Theorem</a>
proves that these languages have equivalent expressive power.</p>
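
<p>As a rough illustration of the two styles, here is a minimal sketch in
Python (used only as a neutral notation; it is not Morel, and the data and
the names <code class="language-plaintext highlighter-rouge">cars</code>, <code class="language-plaintext highlighter-rouge">red_things</code> and <code class="language-plaintext highlighter-rouge">is_red</code> are invented for
the example). The first definition states a property that members must
satisfy, in the manner of calculus; the second applies an operation to
existing sets, in the manner of algebra.</p>

```python
# Hypothetical data, invented for this illustration.
cars = {"beetle", "mini", "mustang"}
red_things = {"beetle", "mustang", "firetruck"}


def is_red(thing):
    """Predicate: does this object have the property 'red'?"""
    return thing in red_things


# Calculus style: describe the members by a property they satisfy.
red_cars_by_property = {c for c in cars if is_red(c)}

# Algebra style: build the set by operating on existing sets.
red_cars_by_operation = cars.intersection(red_things)

print(sorted(red_cars_by_property))
print(red_cars_by_property == red_cars_by_operation)
```

<p>Both definitions yield the same set; the difference lies only in how
the set is specified.</p>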

<p>Query languages are generally based on one of those paradigms. SQL is
largely based on algebra (although its <code class="language-plaintext highlighter-rouge">EXISTS</code> keyword shows the
influence of calculus). Datalog is based on calculus. Functional
programming languages (including Morel) are in the algebra camp; they
provide relational operators via higher-order functions like <code class="language-plaintext highlighter-rouge">map</code>,
<code class="language-plaintext highlighter-rouge">filter</code> and <code class="language-plaintext highlighter-rouge">reduce</code>, and sometimes provide syntactic sugar like
list-comprehensions.</p>

<p>If the languages are equivalent, why does it matter? The languages
have different strengths.</p>

<p>Algebra’s strengths:</p>
<ul>
  <li>Algebra naturally extends to <strong>bags and lists</strong> (collections with
ordering and/or duplicate values), while calculus only works on
sets;</li>
  <li><strong>Aggregate functions</strong> are a more natural extension to algebra
than calculus;</li>
  <li>Mainstream programming languages are functional or procedural, so
there is lower <strong>impedance mismatch</strong> embedding a query in a
program or writing a user-defined function to be called from a
query;</li>
  <li>Developers familiar with mainstream programming languages
find the calculus paradigm <strong>difficult to learn</strong>.</li>
</ul>

<p>Calculus (epitomized by Datalog) excels at graph and deductive
queries, such as queries that iterate until they reach a fixed
point. As we shall see, it is just easier to write recursive queries
if they return a boolean than if they return a complex data type like
a set of tuples.</p>

<p>For simple fixed-point queries such as computing the transitive
closure of a relation, the algebra query returns a set that is the
union of the points that are one step away, two steps away, and so
forth. In calculus, the value is boolean: whether there is a path from
one point to another.</p>
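
<p>The contrast can be sketched in Python (again as a neutral notation,
not Morel; the <code class="language-plaintext highlighter-rouge">edges</code> data and both function names are hypothetical).
The algebra version accumulates a set of pairs until it stops growing;
the calculus version answers a yes/no question about one pair of
points.</p>

```python
# Hypothetical edge relation, invented for this illustration.
edges = {(1, 2), (2, 3), (3, 4)}


def closure_by_algebra(edges):
    """Union the one-step, two-step, ... results until nothing new appears."""
    paths = set(edges)
    while True:
        # Extend every known path by one more edge.
        step = {(x, z) for (x, y) in paths for (y2, z) in edges if y == y2}
        if step.issubset(paths):
            return paths  # fixed point reached
        paths = paths.union(step)


def has_path(x, y, seen=frozenset()):
    """Boolean answer for one pair: is there a path from x to y?"""
    if (x, y) in edges:
        return True
    visited = seen.union({x})  # guards against looping around a cycle
    return any(
        has_path(z, y, visited)
        for (w, z) in edges
        if w == x and z not in visited
    )


closure = closure_by_algebra(edges)
print(sorted(closure))
print(has_path(1, 4), has_path(4, 1))
```

<p>The set-valued version has to manage its own iteration and
termination test, while the boolean version reads much like the
definition of a path.</p>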

<p>For more complex fixed-point queries, the algebra programmer must
define a data type with a semilattice structure. Consider, for
example, a query to find all pairs of nodes connected by no more than
five steps. In algebra, the data type is now a set of <code class="language-plaintext highlighter-rouge">(source,
destination, distance)</code> triples combined by taking the minimum
distance. In calculus, the data type remains boolean: the result of
the function <code class="language-plaintext highlighter-rouge">has_path_within(source, destination, distance)</code>. The
boolean function is easier to write, and easier for the query planner
to understand.</p>
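
<p>A minimal sketch of the bounded-distance query, in Python rather than
Morel and under invented data: <code class="language-plaintext highlighter-rouge">distances_within</code> is a hypothetical
algebra-style function that keeps the minimum distance for each pair,
and <code class="language-plaintext highlighter-rouge">has_path_within</code> borrows its name from the text above but is
otherwise an assumption about how such a predicate might look.</p>

```python
# Hypothetical edge relation, invented for this illustration.
edges = {(1, 2), (2, 3), (3, 4), (1, 3)}


def distances_within(edges, limit):
    """Map each (source, destination) pair to its minimum distance."""
    dist = {edge: 1 for edge in edges}
    changed = True
    while changed:
        changed = False
        for (x, y), d in list(dist.items()):
            for (y2, z) in edges:
                candidate = d + 1
                if y != y2 or candidate > limit:
                    continue
                # Combine with any existing value by taking the minimum.
                best = dist.get((x, z))
                if best is None or best > candidate:
                    dist[(x, z)] = candidate
                    changed = True
    return dist


def has_path_within(source, destination, distance):
    """Boolean: is there a path of at most `distance` steps?"""
    return distance > 0 and (
        (source, destination) in edges
        or any(
            has_path_within(mid, destination, distance - 1)
            for (start, mid) in edges
            if start == source
        )
    )


dist = distances_within(edges, 2)
print(sorted(dist.items()))
print(has_path_within(1, 4, 2), has_path_within(1, 4, 1))
```

<p>The algebra version has to carry the distance in its data and say how
two distances for the same pair combine; the boolean version only has to
count down.</p>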

<p>Until now, if a programmer had to solve a problem with a mixed workload,
they would need to switch languages. Because of a new feature called
predicate inversion, Morel now supports both paradigms.</p>

<h2 id="the-datalog-interface">The Datalog interface</h2>

<p>The following program, in the
<a href="https://souffle-lang.github.io/">Soufflé</a> dialect of Datalog,
computes the transitive closure of an <code class="language-plaintext highlighter-rouge">edge</code> relation.</p>

<div class="language-prolog highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">.</span><span class="ss">decl</span> <span class="ss">edge</span><span class="p">(</span><span class="ss">x</span><span class="o">:</span><span class="ss">number</span><span class="p">,</span> <span class="ss">y</span><span class="o">:</span><span class="ss">number</span><span class="p">)</span>
<span class="p">.</span><span class="ss">decl</span> <span class="ss">path</span><span class="p">(</span><span class="ss">x</span><span class="o">:</span><span class="ss">number</span><span class="p">,</span> <span class="ss">y</span><span class="o">:</span><span class="ss">number</span><span class="p">)</span>
<span class="ss">edge</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">).</span>
<span class="ss">edge</span><span class="p">(</span><span class="m">2</span><span class="p">,</span><span class="m">3</span><span class="p">).</span>
<span class="ss">path</span><span class="p">(</span><span class="nv">X</span><span class="p">,</span><span class="nv">Y</span><span class="p">)</span> <span class="p">:-</span> <span class="ss">edge</span><span class="p">(</span><span class="nv">X</span><span class="p">,</span><span class="nv">Y</span><span class="p">).</span>
<span class="ss">path</span><span class="p">(</span><span class="nv">X</span><span class="p">,</span><span class="nv">Z</span><span class="p">)</span> <span class="p">:-</span> <span class="ss">path</span><span class="p">(</span><span class="nv">X</span><span class="p">,</span><span class="nv">Y</span><span class="p">),</span> <span class="ss">edge</span><span class="p">(</span><span class="nv">Y</span><span class="p">,</span><span class="nv">Z</span><span class="p">).</span>
<span class="p">.</span><span class="ss">output</span> <span class="ss">path</span>
</code></pre></div></div>

<p>In a graph with nodes 1, 2 and 3, the <code class="language-plaintext highlighter-rouge">edge</code> relation defines edges
from 1 → 2 and 2 → 3. The derived <code class="language-plaintext highlighter-rouge">path</code> relation says that
there is a path between two nodes if (a) there is an edge, or (b)
there is a path to an intermediate node and an edge from that
intermediate node to the destination node. From the edges {1 →
2, 2 → 3} it deduces the paths {1 → 2, 2 → 3, 1 →
3}.</p>
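<p>To make the deduction concrete, here is a minimal Python sketch of naive
bottom-up evaluation of the two <code class="language-plaintext highlighter-rouge">path</code> rules. It is only an
illustration of the fixed-point idea; the function name
<code class="language-plaintext highlighter-rouge">transitive_closure</code> is made up for this example, and this is not how
Soufflé or Morel is implemented.</p>

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the two path rules (illustrative only)."""
    # Rule 1: path(X,Y) :- edge(X,Y).
    path = set(edges)
    while True:
        # Rule 2: path(X,Z) :- path(X,Y), edge(Y,Z).
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        if derived.issubset(path):
            return path  # fixed point: the rules derive nothing new
        path |= derived


edges = {(1, 2), (2, 3)}
print(sorted(transitive_closure(edges)))  # [(1, 2), (1, 3), (2, 3)]
```

<p>The loop stops on the second pass, when applying the recursive rule yields
only the fact 1 → 3 that is already known.</p>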

<p>You can now run the following program from Morel’s shell:</p>

<!-- morel
Datalog.execute "
.decl edge(x:int, y:int)
.decl path(x:int, y:int)
edge(1,2).
edge(2,3).
path(X,Y) :- edge(X,Y).
path(X,Z) :- path(X,Y), edge(Y,Z).
.output path";
> val it = {path=[{x=1,y=2},{x=2,y=3},{x=1,y=3}]}
>   : {path:{x:int, y:int} list} variant
-->

<div class="code-block">
<div class="code-input"><span class="nn">Datalog</span><span class="p">.</span><span class="n">execute</span> <span class="s2">"
.decl edge(x:int, y:int)
.decl path(x:int, y:int)
edge(1,2).
edge(2,3).
path(X,Y) :- edge(X,Y).
path(X,Z) :- path(X,Y), edge(Y,Z).
.output path"</span><span class="p">;</span></div>
<div class="code-output">val it = {path=[{x=1,y=2},{x=2,y=3},{x=1,y=3}]}
  : {path:{x:int, y:int} list} variant</div>
</div>

<p>The program is passed (as a string literal) as an argument to the
<code class="language-plaintext highlighter-rouge">Datalog.execute</code> function. The Soufflé <code class="language-plaintext highlighter-rouge">symbol</code> and
<code class="language-plaintext highlighter-rouge">number</code> types in the <code class="language-plaintext highlighter-rouge">.decl</code> directives have been mapped to Morel
<code class="language-plaintext highlighter-rouge">string</code> and <code class="language-plaintext highlighter-rouge">int</code> types; otherwise, the program is unchanged.</p>

<p>(Adding a <code class="language-plaintext highlighter-rouge">Datalog</code> structure, with functions <code class="language-plaintext highlighter-rouge">execute</code>, <code class="language-plaintext highlighter-rouge">translate</code>
and <code class="language-plaintext highlighter-rouge">validate</code>, seemed preferable to writing a whole Datalog shell and
testing framework. Facts and rules have the same syntax as
Soufflé, as does the <code class="language-plaintext highlighter-rouge">.output</code> directive. The <code class="language-plaintext highlighter-rouge">.input</code>
directive, not shown in this example, has a new optional <em>filePath</em>
argument.)</p>

<h2 id="translating-datalog-to-morel">Translating Datalog to Morel</h2>

<p>The translation makes concrete the equivalence that Codd’s Theorem
promises: each Datalog construct has a direct counterpart in Morel.</p>

<p>One way to support Datalog would have been to implement a Datalog
engine, but this would have been a major task and would not have
benefited the rest of Morel. Instead, we have extended the Morel
language with Datalog-like constructs; this has made the Morel
language more powerful, and made Datalog translation straightforward.</p>

<p>The Datalog-to-Morel translator has a structure that will be familiar
to anyone who has implemented a compiler that translates a high-level
language to a lower-level language. Three steps are executed in
succession:</p>

<ol>
  <li>The <em>parser</em> converts a Datalog string to a parse tree.</li>
  <li>The <em>validator</em> makes sure that the program is valid (that rules
are safe, grounded and stratified) and deduces its type.</li>
  <li>The <em>translator</em> generates a Morel program that is equivalent to
the Datalog program.</li>
</ol>
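<p>As a rough illustration of the kind of check a validator performs, the
Python sketch below tests one common form of rule safety: every variable in a
rule's head must also occur in its body. The tuple representation of atoms and
the names <code class="language-plaintext highlighter-rouge">is_variable</code> and <code class="language-plaintext highlighter-rouge">unsafe_variables</code> are assumptions made
for this example, not Morel's validator, and the sketch ignores negation and
stratification.</p>

```python
def is_variable(term):
    # Datalog convention: variables start with an upper-case letter.
    return isinstance(term, str) and term[:1].isupper()


def unsafe_variables(head, body):
    """Return head variables that no body atom binds (empty set means safe)."""
    head_vars = {t for t in head[1] if is_variable(t)}
    body_vars = {t for atom in body for t in atom[1] if is_variable(t)}
    return head_vars - body_vars


# path(X,Z) :- path(X,Y), edge(Y,Z).   Safe: X and Z both occur in the body.
good = (("path", ("X", "Z")), [("path", ("X", "Y")), ("edge", ("Y", "Z"))])
# path(X,W) :- edge(X,Y).              Unsafe: W is never bound.
bad = (("path", ("X", "W")), [("edge", ("X", "Y"))])

print(unsafe_variables(*good))  # set()
print(unsafe_variables(*bad))   # {'W'}
```

<p>An atom here is a hypothetical <code class="language-plaintext highlighter-rouge">(name, args)</code> pair, and a rule is a
<code class="language-plaintext highlighter-rouge">(head, body)</code> pair whose body is a list of atoms.</p>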

<p>Parsing and validation follow standard patterns, but let’s look at
the translation algorithm in a little more detail.
Here is the translation to Morel of the earlier Datalog program:</p>

<!-- morel skip
let
  val edge_facts = [(1, 2), (2, 3)]
  fun edge (x, y) = (x, y) elem edge_facts
  fun path (x, y) =
    edge (x, y) orelse
    (exists v0 where path (x, v0) andalso edge (v0, y))
in
  {path = from x, y where path (x, y)}
end
-->

<div class="code-block">
<div class="code-input"><span class="kr">let</span>
  <span class="kr">val</span> <span class="nv">edge_facts</span> <span class="p">=</span> <span class="p">[(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">)]</span>
  <span class="kr">fun</span> <span class="nf">edge</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="kr">elem</span> <span class="n">edge_facts</span>
  <span class="kr">fun</span> <span class="nf">path</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">=</span>
    <span class="n">edge</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="kr">orelse</span>
    <span class="p">(</span><span class="kr">exists</span> <span class="n">v0</span> <span class="kr">where</span> <span class="n">path</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">v0</span><span class="p">)</span> <span class="kr">andalso</span> <span class="n">edge</span> <span class="p">(</span><span class="n">v0</span><span class="p">,</span> <span class="n">y</span><span class="p">))</span>
<span class="kr">in</span>
  <span class="p">{</span><span class="n">path</span> <span class="p">=</span> <span class="kr">from</span> <span class="nv">x</span><span class="p">,</span> <span class="nv">y</span> <span class="kr">where</span> <span class="n">path</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)}</span>
<span class="kr">end</span></div>
</div>

<p>You’ll notice that the Datalog and Morel programs have the same
structure. Datalog rules without a body (such as <code class="language-plaintext highlighter-rouge">edge(1,2)</code> and
<code class="language-plaintext highlighter-rouge">edge(2,3)</code>) are gathered into a list of tuples (<code class="language-plaintext highlighter-rouge">edge_facts</code>).</p>

<p>Each rule becomes a boolean function. If there are several
comma-separated predicates in a rule’s body, they are combined using
<code class="language-plaintext highlighter-rouge">andalso</code>. If there are several rules of the same name, their
conditions are combined using <code class="language-plaintext highlighter-rouge">orelse</code>. Invocations of a rule become
function calls, which, like rules, may be recursive.</p>

<p>The body of the rule <code class="language-plaintext highlighter-rouge">path(X,Z) :- path(X,Y), edge(Y,Z)</code> has a
variable, <code class="language-plaintext highlighter-rouge">Y</code>, that does not occur in the head. It is translated to
<code class="language-plaintext highlighter-rouge">exists v0</code>.</p>

<p>A Datalog program may have several <code class="language-plaintext highlighter-rouge">.output</code> directives. The Morel
program returns a single value, a record with one field for each
directive. This program has one directive, <code class="language-plaintext highlighter-rouge">.output path</code>, so the
record has a single field named <code class="language-plaintext highlighter-rouge">path</code> that is a list of
<code class="language-plaintext highlighter-rouge">{x:int, y:int}</code> records.</p>

<h2 id="how-morel-does-it">How Morel does it</h2>

<p>The magic lies not in the Datalog-to-Morel converter but in the Morel
language itself. Over the last few months, we have added to Morel a
capability called <em>predicate inversion</em>, the ability to deduce a set
from a boolean expression.</p>

<p>At the heart of the generated Morel program is a query: <code class="language-plaintext highlighter-rouge">from x, y
where path (x, y)</code>. It differs from a regular query in that the
variables <code class="language-plaintext highlighter-rouge">x</code> and <code class="language-plaintext highlighter-rouge">y</code> are <em>unbounded</em>. (In a conventional query,
every variable is <em>bounded</em>, meaning it iterates over a collection, as
do <code class="language-plaintext highlighter-rouge">d</code> and <code class="language-plaintext highlighter-rouge">e</code> in <code class="language-plaintext highlighter-rouge">from d in departments, e in employees</code>.)</p>

<p>In principle, an unbounded variable iterates over every possible value
of its data type. This is fine for “small” data types like <code class="language-plaintext highlighter-rouge">boolean</code>,
<code class="language-plaintext highlighter-rouge">char</code>, and <code class="language-plaintext highlighter-rouge">enum Color { RED | GREEN | BLUE }</code>, but problematic for
“large” data types like <code class="language-plaintext highlighter-rouge">int</code> and <code class="language-plaintext highlighter-rouge">{b: boolean, i: int}</code> and infinite
data types like <code class="language-plaintext highlighter-rouge">string</code> and <code class="language-plaintext highlighter-rouge">int list</code>.</p>

<p>Morel allows unbounded variables in a program as long as there is a
predicate like <code class="language-plaintext highlighter-rouge">where x &gt; 0 andalso x &lt; 10</code> or <code class="language-plaintext highlighter-rouge">where e elem
employees</code> that connects it with a finite set. Invertible predicates
provide a way to generate the values of the variable. In Datalog
parlance, they ensure that the variable is <em>grounded</em>.</p>
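<p>The idea can be sketched in a few lines of Python. The example assumes a
made-up, simplified description of a predicate (a tagged tuple) and a
hypothetical function <code class="language-plaintext highlighter-rouge">invert</code>; Morel's actual algorithm works on
expressions and handles many more patterns.</p>

```python
def invert(predicate):
    """Turn a (hypothetical) predicate description into a finite generator."""
    kind = predicate[0]
    if kind == "between":
        # Integer variable strictly between two bounds.
        _, lower, upper = predicate
        return range(lower + 1, upper)
    if kind == "elem":
        # Variable that must be a member of a collection.
        _, collection = predicate
        return list(collection)
    # Nothing connects the variable to a finite set, so it is not grounded.
    raise ValueError("cannot ground variable from predicate: " + kind)


print(list(invert(("between", 0, 10))))     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(invert(("elem", ["Fred", "Velma"])))  # ['Fred', 'Velma']
```

<p>A predicate that matches neither pattern raises an error, which corresponds
to a variable that cannot be grounded.</p>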

<p>Morel’s predicate inversion algorithm recognizes various predicate
patterns, including boolean functions that check collection membership
(like <code class="language-plaintext highlighter-rouge">edge</code>) or compute transitive closure (like <code class="language-plaintext highlighter-rouge">path</code>).</p>

<h2 id="mixing-styles">Mixing styles</h2>

<p>The net result is that predicate inversion allows you to freely mix
Datalog-style queries (defined by boolean expressions and functions)
with relational algebra-style queries (defined by <code class="language-plaintext highlighter-rouge">from</code>,
<code class="language-plaintext highlighter-rouge">exists</code>, <code class="language-plaintext highlighter-rouge">join</code> and set operations).</p>

<p>The following query is in a hybrid style.</p>

<!-- morel skip
(* Calculus style: recursive reachability *)
fun edge (x, y) = (x, y) elem [(1,2), (2,3), (3,4), (2,4)];
fun reachable (x, y) =
  edge (x, y) orelse
  exists z where edge (x, z) andalso reachable (z, y);

(* Algebra style: count reachable nodes per source *)
from source in [1, 2, 3, 4]
  yield {source,
         reachable_count = count (from target
                                    where reachable (source, target))}
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Calculus style: recursive reachability *)</span>
<span class="kr">fun</span> <span class="nf">edge</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">=</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="kr">elem</span> <span class="p">[(</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">),</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">),</span> <span class="p">(</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">),</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span><span class="mi">4</span><span class="p">)];</span>
<span class="kr">fun</span> <span class="nf">reachable</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="p">=</span>
  <span class="n">edge</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="kr">orelse</span>
  <span class="kr">exists</span> <span class="n">z</span> <span class="kr">where</span> <span class="n">edge</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="kr">andalso</span> <span class="n">reachable</span> <span class="p">(</span><span class="n">z</span><span class="p">,</span> <span class="n">y</span><span class="p">);</span>

<span class="c">(*</span><span class="cm"> Algebra style: count reachable nodes per source *)</span>
<span class="kr">from</span> <span class="nv">source</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="p">{</span><span class="n">source</span><span class="p">,</span>
         <span class="n">reachable_count</span> <span class="p">=</span> <span class="n">count</span> <span class="p">(</span><span class="kr">from</span> <span class="nv">target</span>
                                    <span class="kr">where</span> <span class="n">reachable</span> <span class="p">(</span><span class="n">source</span><span class="p">,</span> <span class="n">target</span><span class="p">))}</span></div>
</div>

<p>The <code class="language-plaintext highlighter-rouge">edge</code> and <code class="language-plaintext highlighter-rouge">reachable</code> functions define graph reachability in a
Datalog style, using recursion and boolean return values. The <code class="language-plaintext highlighter-rouge">from</code>
query is in the algebra style, but uses predicate inversion to
generate all values of the unbounded <code class="language-plaintext highlighter-rouge">target</code> variable for which
<code class="language-plaintext highlighter-rouge">reachable (source, target)</code> is true. Predicate inversion provides the
junction between the two styles.</p>
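<p>For comparison, here is a rough Python rendering of the same hybrid query.
Python has no predicate inversion, so this sketch tests
<code class="language-plaintext highlighter-rouge">target</code> against an explicit node list (<code class="language-plaintext highlighter-rouge">NODES</code>, an assumption of the
example) instead of deriving its values from the predicate. It also relies on
the graph being acyclic; with a cycle, the recursive function would not
terminate.</p>

```python
EDGES = {(1, 2), (2, 3), (3, 4), (2, 4)}
NODES = [1, 2, 3, 4]  # assumed finite domain for source and target


def edge(x, y):
    return (x, y) in EDGES


def reachable(x, y):
    # Mirrors: edge (x, y) orelse
    #          exists z where edge (x, z) andalso reachable (z, y)
    return edge(x, y) or any(edge(x, z) and reachable(z, y) for z in NODES)


def reachable_counts():
    # Brute force: every candidate target is tested against the predicate.
    return [
        {"source": s, "reachable_count": sum(1 for t in NODES if reachable(s, t))}
        for s in NODES
    ]


for row in reachable_counts():
    print(row)
```

<p>The brute-force loop over <code class="language-plaintext highlighter-rouge">NODES</code> is exactly the work that predicate
inversion makes unnecessary: in the Morel query, nobody has to say which
values <code class="language-plaintext highlighter-rouge">target</code> ranges over.</p>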

<h2 id="conclusion">Conclusion</h2>

<p>Morel now unifies the calculus and algebra styles of writing queries.
The new Datalog interface showcases this capability, but you can also
use the calculus style in Morel programs, where you can freely mix it
with the algebra style and functional programming.</p>

<p>Notice that we have made no claims about the <strong>efficiency</strong> of the
implementation. Our goal was to increase the expressive power of the
language, and we have achieved that goal. Morel’s internal
representation is algebraic — using relational operators, other
operators provided by functions, and iteration to a fixed point
— and from this point we can apply conventional
query-optimization techniques.</p>

<p>To keep things simple, we have not discussed <strong>evaluation models</strong>.
Datalog uses forward chaining (bottom-up evaluation) while boolean
functions give the impression that backward chaining (top-down
evaluation) is being used. For most queries both approaches are valid,
and the planner would ideally consider both strategies along with
optimizations such as join re-ordering, magic sets, semi-naïve
evaluation, and materialized views. But there are queries where the
evaluation model matters (say, they would terminate under one model
but not another), and for these cases it is important that we define
Morel’s operational semantics.</p>
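<p>A small Python sketch shows how the two models can disagree on
termination. It evaluates the left-recursive <code class="language-plaintext highlighter-rouge">path</code> rule both ways; the
function names are invented for the example, and it says nothing about how
Morel actually evaluates the query.</p>

```python
EDGES = {(1, 2), (2, 3)}
NODES = [1, 2, 3]


def path_top_down(x, y):
    # Backward chaining over the left-recursive rule
    # path(X,Z) :- path(X,Y), edge(Y,Z): a subgoal can repeat itself forever.
    if (x, y) in EDGES:
        return True
    return any(path_top_down(x, v) and (v, y) in EDGES for v in NODES)


def path_bottom_up():
    # Forward chaining: start from the facts and apply the rule until
    # no new fact appears.
    path = set(EDGES)
    while True:
        new = {(x, z) for (x, y) in path for (y2, z) in EDGES if y == y2} - path
        if not new:
            return path
        path |= new


def top_down_terminates(x, y):
    try:
        path_top_down(x, y)
        return True
    except RecursionError:
        return False


print(top_down_terminates(1, 2))  # True: answered directly from a fact
print(top_down_terminates(1, 3))  # False: the subgoal path(1, 1) calls itself
print(sorted(path_bottom_up()))   # [(1, 2), (1, 3), (2, 3)]
```

<p>Here 1 → 3 is a genuine path, yet the naive top-down evaluator never
finds it, while the bottom-up loop reaches a fixed point in two passes.</p>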

<p>The predicate inversion algorithm needs to evolve and mature. It has
been tested over a wide array of queries, but there are still cases
where it fails to invert a predicate, or fails to remove a condition
that has been fully satisfied by a generator. (We hope to write more
about predicate inversion, generators, and subsuming predicates, in a
future article.)</p>

<p>Please download Morel and give it a try! (Morel has both
<a href="https://github.com/hydromatic/morel">Java</a> and
<a href="https://github.com/hydromatic/morel-rust">Rust</a> versions, but Datalog
and predicate inversion require the Java version for now.)</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">How we added Datalog support to <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>... and why you might want to just write a Morel query. <a href="https://t.co/bws0HF4xHl">https://t.co/bws0HF4xHl</a> <a href="https://t.co/z6wfZGyamn">pic.twitter.com/z6wfZGyamn</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/2031116250278211833?ref_src=twsrc%5Etfw">March 9, 2026</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2026-03-09-datalog-in-morel.md">has been updated</a>.</p>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[This week we added Datalog support to Morel — not by building a Datalog engine, but by adding a language feature called predicate inversion.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">History of lambda syntax</title><link href="http://0.0.0.0:4000/draft-blog3/2025/10/26/history-of-lambda-syntax.html" rel="alternate" type="text/html" title="History of lambda syntax" /><published>2025-10-26T13:00:00-07:00</published><updated>2025-10-26T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/10/26/history-of-lambda-syntax</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/10/26/history-of-lambda-syntax.html"><![CDATA[<p>Lambda syntax varies widely across languages; more widely, I think, than
other language features. I wish it weren’t so. It’s difficult to see the
elegance in a new language if the syntax is unfamiliar.</p>

<p>The following table lists the year that various programming languages
introduced lambda syntax (not always the year in which the language
was born). If a language introduced an alternate syntax at a different
date, I have noted the year of introduction.</p>

<table>
  <thead>
    <tr>
      <th>Language</th>
      <th>Year</th>
      <th>Syntax</th>
      <th>Alternate(s)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Lambda calculus</td>
      <td>1930s<sup id="fnref:32"><a href="#fn:32" class="footnote" rel="footnote" role="doc-noteref">1</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">λx.x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Lisp</td>
      <td>1960<sup id="fnref:1"><a href="#fn:1" class="footnote" rel="footnote" role="doc-noteref">2</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">(lambda (x) (+ x 1))</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>ML</td>
      <td>1978<sup id="fnref:2"><a href="#fn:2" class="footnote" rel="footnote" role="doc-noteref">3</a></sup></td>
      <td><code>&lambda;x.x+1</code></td>
      <td>Evolved to <code class="language-plaintext highlighter-rouge">fun x.x+1</code> (1983), then <code class="language-plaintext highlighter-rouge">fn x =&gt; x + 1</code> (1985)</td>
    </tr>
    <tr>
      <td>Hope</td>
      <td>1980<sup id="fnref:36"><a href="#fn:36" class="footnote" rel="footnote" role="doc-noteref">4</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">lambda x =&gt; x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Smalltalk</td>
      <td>1981<sup id="fnref:35"><a href="#fn:35" class="footnote" rel="footnote" role="doc-noteref">5</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">[ :x | x + 1 ]</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Erlang</td>
      <td>1987<sup id="fnref:3"><a href="#fn:3" class="footnote" rel="footnote" role="doc-noteref">6</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">fun(X) -&gt; X + 1 end</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Haskell</td>
      <td>1990<sup id="fnref:4"><a href="#fn:4" class="footnote" rel="footnote" role="doc-noteref">7</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">\x -&gt; x + 1</code></td>
      <td><code class="language-plaintext highlighter-rouge">(+ 1)</code> (1999<sup id="fnref:5"><a href="#fn:5" class="footnote" rel="footnote" role="doc-noteref">8</a></sup>)</td>
    </tr>
    <tr>
      <td>Python</td>
      <td>1991<sup id="fnref:6"><a href="#fn:6" class="footnote" rel="footnote" role="doc-noteref">9</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">lambda x: x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Lua</td>
      <td>1993<sup id="fnref:33"><a href="#fn:33" class="footnote" rel="footnote" role="doc-noteref">10</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">function (x) return x + 1 end</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Perl</td>
      <td>1994<sup id="fnref:7"><a href="#fn:7" class="footnote" rel="footnote" role="doc-noteref">11</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">sub { $_[0] + 1 }</code></td>
      <td><code class="language-plaintext highlighter-rouge">sub { my $x = shift; $x + 1 }</code></td>
    </tr>
    <tr>
      <td>JavaScript</td>
      <td>1995<sup id="fnref:8"><a href="#fn:8" class="footnote" rel="footnote" role="doc-noteref">12</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">function(x) { return x + 1; }</code></td>
      <td><code class="language-plaintext highlighter-rouge">x =&gt; x + 1</code> (2015 <sup id="fnref:9"><a href="#fn:9" class="footnote" rel="footnote" role="doc-noteref">13</a></sup>)</td>
    </tr>
    <tr>
      <td>Ruby</td>
      <td>1995<sup id="fnref:10"><a href="#fn:10" class="footnote" rel="footnote" role="doc-noteref">14</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">Proc.new { | x | x + 1 }</code><br /><code class="language-plaintext highlighter-rouge">proc { | x | x + 1 }</code></td>
      <td><code class="language-plaintext highlighter-rouge">lambda { |x| x + 1 }</code> (2003 <sup id="fnref:11"><a href="#fn:11" class="footnote" rel="footnote" role="doc-noteref">15</a></sup>)<br /><code class="language-plaintext highlighter-rouge">-&gt;(x) { x + 1 }</code> (2007 <sup id="fnref:12"><a href="#fn:12" class="footnote" rel="footnote" role="doc-noteref">16</a></sup>)</td>
    </tr>
    <tr>
      <td>OCaml</td>
      <td>1996<sup id="fnref:13"><a href="#fn:13" class="footnote" rel="footnote" role="doc-noteref">17</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">fun x -&gt; x + 1</code></td>
      <td><code class="language-plaintext highlighter-rouge">(+) 1</code> (1996 <sup id="fnref:14"><a href="#fn:14" class="footnote" rel="footnote" role="doc-noteref">18</a></sup>)</td>
    </tr>
    <tr>
      <td>APL</td>
      <td>1996<sup id="fnref:20"><a href="#fn:20" class="footnote" rel="footnote" role="doc-noteref">19</a></sup></td>
      <td><code>{&omega;+1}</code></td>
      <td><code class="language-plaintext highlighter-rouge">+∘1</code> (1978 <sup id="fnref:21"><a href="#fn:21" class="footnote" rel="footnote" role="doc-noteref">20</a></sup>)</td>
    </tr>
    <tr>
      <td>Groovy</td>
      <td>2003<sup id="fnref:16"><a href="#fn:16" class="footnote" rel="footnote" role="doc-noteref">21</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">{ x -&gt; x + 1 }</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Scala</td>
      <td>2003<sup id="fnref:17"><a href="#fn:17" class="footnote" rel="footnote" role="doc-noteref">22</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">x =&gt; x + 1</code></td>
      <td><code class="language-plaintext highlighter-rouge">_ + 1</code> (2007 <sup id="fnref:18"><a href="#fn:18" class="footnote" rel="footnote" role="doc-noteref">23</a></sup>)</td>
    </tr>
    <tr>
      <td>MATLAB</td>
      <td>2004<sup id="fnref:19"><a href="#fn:19" class="footnote" rel="footnote" role="doc-noteref">24</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">@(x) x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>C#</td>
      <td>2007<sup id="fnref:15"><a href="#fn:15" class="footnote" rel="footnote" role="doc-noteref">25</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">x =&gt; x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Clojure</td>
      <td>2007<sup id="fnref:22"><a href="#fn:22" class="footnote" rel="footnote" role="doc-noteref">26</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">(fn [x] (+ x 1))</code></td>
      <td><code class="language-plaintext highlighter-rouge">#(+ % 1)</code><br /><code class="language-plaintext highlighter-rouge">(partial + 1)</code></td>
    </tr>
    <tr>
      <td>Go</td>
      <td>2009<sup id="fnref:23"><a href="#fn:23" class="footnote" rel="footnote" role="doc-noteref">27</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">func(x int) int { return x + 1 }</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Delphi</td>
      <td>2009<sup id="fnref:34"><a href="#fn:34" class="footnote" rel="footnote" role="doc-noteref">28</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">f := function(x: Integer): Integer begin Result := x + 1; end;</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Rust</td>
      <td>2010<sup id="fnref:24"><a href="#fn:24" class="footnote" rel="footnote" role="doc-noteref">29</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">|x| x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Dart</td>
      <td>2011<sup id="fnref:25"><a href="#fn:25" class="footnote" rel="footnote" role="doc-noteref">30</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">(x) =&gt; x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Elixir</td>
      <td>2011<sup id="fnref:26"><a href="#fn:26" class="footnote" rel="footnote" role="doc-noteref">31</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">fn x -&gt; x + 1 end</code></td>
      <td><code class="language-plaintext highlighter-rouge">&amp;(&amp;1 + 1)</code></td>
    </tr>
    <tr>
      <td>Kotlin</td>
      <td>2011<sup id="fnref:27"><a href="#fn:27" class="footnote" rel="footnote" role="doc-noteref">32</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">{ x -&gt; x + 1 }</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>C++</td>
      <td>2011<sup id="fnref:28"><a href="#fn:28" class="footnote" rel="footnote" role="doc-noteref">33</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">[](int x) { return x + 1; }</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Julia</td>
      <td>2012<sup id="fnref:29"><a href="#fn:29" class="footnote" rel="footnote" role="doc-noteref">34</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">x -&gt; x + 1</code></td>
      <td> </td>
    </tr>
    <tr>
      <td>Swift</td>
      <td>2014<sup id="fnref:30"><a href="#fn:30" class="footnote" rel="footnote" role="doc-noteref">35</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">{ x in x + 1 }</code></td>
      <td><code class="language-plaintext highlighter-rouge">{$0 + 1}</code></td>
    </tr>
    <tr>
      <td>Java</td>
      <td>2014<sup id="fnref:31"><a href="#fn:31" class="footnote" rel="footnote" role="doc-noteref">36</a></sup></td>
      <td><code class="language-plaintext highlighter-rouge">x -&gt; x + 1</code></td>
      <td> </td>
    </tr>
  </tbody>
</table>

<p>(Please let me know if there are mistakes in syntax or year of
introduction. Claude was my research assistant. I omitted languages in
the same family with the same syntax, e.g. Lisp-Scheme-Racket,
OCaml-F#. Did I miss any early, major languages?)</p>

<p>Here is the original tweet:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">Lambda syntax varies widely across languages; more widely, I think, than other language features. I wish it weren&#39;t so. It&#39;s difficult to see the elegance in a new language if the syntax is unfamiliar. <a href="https://t.co/kz1KrtsrbU">pic.twitter.com/kz1KrtsrbU</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1950681730568143094?ref_src=twsrc%5Etfw">July 30, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2025-10-26-history-of-lambda-syntax.md">has been updated</a>.</p>

<h3 id="footnotes">Footnotes</h3>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:32">
      <p>The lambda calculus was invented in the 1930s by Alonzo Church.
   The original notation used a Greek letter lambda (λ) to denote
   anonymous functions. It is a mathematical formalism rather than
   a programming language. <a href="#fnref:32" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:1">
      <p>Lisp was invented in 1958, but the lambda syntax appeared in the
  1960 paper “Recursive Functions of Symbolic Expressions and
  Their Computation by Machine, Part I” by John McCarthy. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2">
      <p>ML was invented in 1973 by Robin Milner et al. In
  <a href="https://dl.acm.org/doi/pdf/10.1145/512760.512773">“A Metalanguage for Interactive Proof in LCF”</a>
  (Gordon, Milner et al., 1978), the syntax was
  “<code>&lambda;x.x+1</code>”. By
  <a href="https://smlfamily.github.io/history/SML-proposal-6-83.pdf">“A Proposal for Standard ML (second draft)”</a>
  (Milner, 1983), the syntax was “<code class="language-plaintext highlighter-rouge">fun x . x + 1</code>”.
  The final syntax “<code>fn x =&gt; x + 1</code>” first appeared in
  <a href="https://smlfamily.github.io/history/SML-proposal-9-85.pdf">“The Standard ML Core Language (Revised)”</a>
  (Milner, 1985). <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:36">
      <p>Per “<a href="https://dl.acm.org/doi/pdf/10.1145/3386336">The History of Standard ML</a>”
   (MacQueen, Harper, Reppy, 2020), HOPE was developed in
   Edinburgh just after LCF/ML from 1977 to 1980. See
   “<a href="https://dl.acm.org/doi/pdf/10.1145/800087.802799">HOPE: An experimental applicative language</a>”
   (Burstall, MacQueen, Sannella, 1980). <a href="#fnref:36" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:35">
      <p>Smalltalk was created in the early 1970s at Xerox PARC by Alan
   Kay, Dan Ingalls, Adele Goldberg, and others. Smalltalk-76
   added block literals with no arguments. Smalltalk-80 (1981)
   allowed code blocks to have arguments. See
   “<a href="http://stephane.ducasse.free.fr/FreeBooks/BlueBook/Bluebook.pdf">Smalltalk-80: The Language and its implementation</a>”
   by Adele Goldberg and David Robson, page 35. <a href="#fnref:35" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3">
      <p>Erlang was created in 1987 by Joe Armstrong et al. The syntax
  appeared in the 1993 book “Concurrent Programming in Erlang” by Armstrong,
  Virding, and Williams. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4">
      <p>Haskell was first defined in 1990 by a committee. The syntax
  appeared in the 1990 paper “Haskell: A Non-strict, Purely
  Functional Language” by Simon Peyton Jones et al. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:5">
      <p>The operator section syntax <code class="language-plaintext highlighter-rouge">(+ 1)</code> was introduced in the
  Haskell 98 Report (1999). <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:6">
      <p>Python introduced the <code class="language-plaintext highlighter-rouge">lambda</code> syntax in version 1.0, released
  in January 1994. The syntax was present in the 1991 “Python
  Tutorial” by Guido van Rossum. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:33">
      <p>Lua was created in 1993 by Roberto Ierusalimschy, Luiz Henrique
   de Figueiredo, and Waldemar Celes. The function syntax appeared
   in the original Lua documentation. <a href="#fnref:33" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:7">
      <p>Perl introduced anonymous subroutines in version 5.0, released
  in 1994. The syntax was documented in the “Programming Perl”
  book by Larry Wall et al. The use of <code class="language-plaintext highlighter-rouge">my $x = shift;</code> within
  such a block is a standard way to access arguments passed to the
  subroutine. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:8">
      <p>JavaScript was created in 1995 by Brendan Eich. The <code class="language-plaintext highlighter-rouge">function</code>
  syntax appeared in the original specification “JavaScript
  Language Specification” by Netscape. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:9">
      <p>The arrow function syntax <code class="language-plaintext highlighter-rouge">x =&gt; x + 1</code> was introduced in
  ECMAScript 6 (2015). Prior to that, JavaScript did not have a
  concise lambda syntax. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:10">
      <p>Ruby was created in 1995 by Yukihiro Matsumoto. The initial
   release, Ruby 0.95, contained the <code class="language-plaintext highlighter-rouge">Proc</code> class and block
   syntax. The <code class="language-plaintext highlighter-rouge">Kernel#proc</code> method was equivalent to <code class="language-plaintext highlighter-rouge">Proc.new</code>. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:11">
      <p>Ruby introduced <code class="language-plaintext highlighter-rouge">lambda</code> in Ruby 1.8 (2003) as a way to create
   lambda functions with stricter argument checking. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:12">
      <p>The stabby lambda syntax <code class="language-plaintext highlighter-rouge">-&gt;</code> was introduced in Ruby 1.9 (2007)
   as a more concise way to define lambdas. <code class="language-plaintext highlighter-rouge">Kernel#proc</code> was
   changed to be equivalent to <code class="language-plaintext highlighter-rouge">Proc.new</code>, which has slightly
   different behavior than a lambda. <a href="#fnref:12" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:13">
      <p>OCaml was created in 1996 by Xavier Leroy et al. The syntax
   appeared in the 1996 paper “The Objective Caml System” by
   Leroy. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:14">
      <p>OCaml supports partial application of functions, so <code class="language-plaintext highlighter-rouge">(+) 1</code> is
   valid syntax for a function that adds 1. <a href="#fnref:14" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:20">
      <p><a href="https://en.wikipedia.org/wiki/John_M._Scholes">John Scholes</a>
   invented direct functions or dfns (pronounced “dee funs”) in
   1996. <a href="#fnref:20" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:21">
      <p>The tacit programming style (also known as point-free style)
   was introduced by Kenneth E. Iverson in the 1978 book “APL: An
   Interactive Approach” co-authored with Philip S. Abrams.  See
   also <a href="https://aplwiki.com/wiki/Bind">Bind</a>. <a href="#fnref:21" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:16">
      <p>Groovy was created in 2003 by James Strachan. The closure
   syntax appeared in the original Groovy documentation. <a href="#fnref:16" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:17">
      <p>Scala was created in 2003 by Martin Odersky. The syntax
   appeared in the 2004 paper “The Scala Language Specification”
   by Odersky et al. <a href="#fnref:17" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:18">
      <p>Scala “placeholder syntax” was introduced around 2007, and
   appears in the 2008 “Programming in Scala” book by Odersky,
   Spoon, and Venners. <a href="#fnref:18" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:19">
      <p>MATLAB introduced function handles in release R12 (MATLAB 6.0),
   which was released in November 2000. However, in this version,
   calling them still required the use of the <code class="language-plaintext highlighter-rouge">feval</code>
   function. Anonymous and nested functions, which expanded the
   capabilities related to function handles, were introduced later
   in release R14 (MATLAB 7.0), released in June 2004. <a href="#fnref:19" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:15">
      <p>While C# had <code class="language-plaintext highlighter-rouge">delegate</code> in version 2.0 (2005), lambda
   expressions did not arrive until version 3.0 (2007). The syntax
   appeared in the “C# Language Specification” by Anders Hejlsberg
   et al. <a href="#fnref:15" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:22">
      <p>Clojure was created in 2007 by Rich Hickey. The <code class="language-plaintext highlighter-rouge">fn</code> syntax,
   function literal syntax <code class="language-plaintext highlighter-rouge">#(+ % 1)</code>, and <code class="language-plaintext highlighter-rouge">partial</code> all
   appeared in the original Clojure documentation. <a href="#fnref:22" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:23">
      <p>Go was created in 2009 by Robert Griesemer, Rob Pike, and Ken
   Thompson.  The <code class="language-plaintext highlighter-rouge">func</code> syntax appeared in the original Go
   specification. <a href="#fnref:23" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:34">
      <p>Delphi introduced anonymous methods in Delphi 2009. The syntax
   appeared in the “Delphi Language Guide” by Embarcadero. Anonymous
   methods must be used immediately (assigned to a variable, passed
   as a parameter, or applied to arguments). <a href="#fnref:34" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:24">
      <p>Rust was created in 2010 by Graydon Hoare. The closure syntax
   appeared in the original Rust documentation. <a href="#fnref:24" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:25">
      <p>Dart was created in 2011 by Lars Bak and Kasper Lund. The arrow
   syntax appeared in the original Dart language specification. <a href="#fnref:25" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:26">
      <p>Elixir was created in 2011 by José Valim. The <code class="language-plaintext highlighter-rouge">fn</code> syntax and
   capture operator <code class="language-plaintext highlighter-rouge">&amp;</code> appeared in the original Elixir
   documentation. <a href="#fnref:26" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:27">
      <p>Kotlin was created in 2011 by JetBrains. The lambda syntax
   appeared in the original Kotlin documentation. <a href="#fnref:27" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:28">
      <p>C++ introduced lambda expressions in C++11 (2011). The syntax
   appeared in the “C++11 Standard” by the ISO/IEC JTC1/SC. <a href="#fnref:28" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:29">
      <p>Julia was created in 2012 by Jeff Bezanson, Stefan Karpinski,
   Viral B. Shah, and Alan Edelman. The arrow syntax appeared in
   the original Julia documentation. <a href="#fnref:29" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:30">
      <p>Swift was created in 2014 by Apple Inc. The closure syntax, and
   shorthand argument names like <code class="language-plaintext highlighter-rouge">$0</code>, have been part of Swift
   since version 1.0. <a href="#fnref:30" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:31">
      <p>Java 8 introduced lambda expressions in 2014. The syntax
   appeared in the “Java Language Specification, Java SE 8
   Edition” by James Gosling et al. <a href="#fnref:31" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[Lambda syntax varies widely across languages; more widely, I think, than other language features. I wish it weren’t so. It’s difficult to see the elegance in a new language if the syntax is unfamiliar.]]></summary></entry><entry><title type="html">Morel Rust release 0.2.0</title><link href="http://0.0.0.0:4000/draft-blog3/2025/10/23/morel-rust-release-0-2-0.html" rel="alternate" type="text/html" title="Morel Rust release 0.2.0" /><published>2025-10-23T02:30:00-07:00</published><updated>2025-10-23T02:30:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/10/23/morel-rust-release-0-2-0</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/10/23/morel-rust-release-0-2-0.html"><![CDATA[<p>I am pleased to announce
<a href="https://github.com/hydromatic/morel-rust/blob/main/CHANGELOG.md#020--2025-10-23">release 0.2.0</a>
of <a href="https://github.com/hydromatic/morel-rust/">Morel Rust</a>.</p>

<p>The Morel language has an existing implementation in Java
(<a href="https://github.com/hydromatic/morel/">Morel Java</a> version 0.7 was
<a href="/draft-blog3/2025/06/08/morel-release-0-7-0.html">released in June</a> and
0.8 is coming soon) but this is the beginning of a brand-new Rust
runtime.</p>

<h3 id="whats-in-release-020">What’s in release 0.2.0</h3>

<p>This release focuses on Morel’s underpinnings as a functional
programming language. It can parse any program, and execute
simple programs that consist of expressions, function
declarations, and lambdas (closures).</p>

<p>Here’s a quick example showing what works today.
First, use <code class="language-plaintext highlighter-rouge">cargo</code> to build Morel and start a shell:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo run
morel-rust version 0.2.0 <span class="o">(</span>rust version 1.90.0<span class="o">)</span>
-
</code></pre></div></div>

<p>Next, you can enter some commands:</p>

<!-- morel skip
(* Define a recursive function *)
fun factorial n =
  if n <= 1 then 1
  else n * factorial (n - 1);

(* Use lambdas and higher-order functions *)
val squares = List.map (fn x => x * x) [1, 2, 3, 4, 5];

(* Compose functions *)
val sumOfSquares =
  List.foldl (fn (x, y) => x + y) 0 (List.map (fn x => x * x) [1, 2, 3, 4, 5]);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Define a recursive function *)</span>
<span class="kr">fun</span> <span class="nf">factorial</span> <span class="n">n</span> <span class="p">=</span>
  <span class="kr">if</span> <span class="n">n</span> <span class="o">&lt;</span><span class="p">=</span> <span class="mi">1</span> <span class="kr">then</span> <span class="mi">1</span>
  <span class="kr">else</span> <span class="n">n</span> <span class="o">*</span> <span class="n">factorial</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>

<span class="c">(*</span><span class="cm"> Use lambdas and higher-order functions *)</span>
<span class="kr">val</span> <span class="nv">squares</span> <span class="p">=</span> <span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">x</span> <span class="o">=&gt;</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span><span class="p">)</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">];</span>

<span class="c">(*</span><span class="cm"> Compose functions *)</span>
<span class="kr">val</span> <span class="nv">sumOfSquares</span> <span class="p">=</span>
  <span class="nn">List</span><span class="p">.</span><span class="n">foldl</span> <span class="p">(</span><span class="kr">fn</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="p">)</span> <span class="mi">0</span> <span class="p">(</span><span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">x</span> <span class="o">=&gt;</span> <span class="n">x</span> <span class="o">*</span> <span class="n">x</span><span class="p">)</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span></div>
</div>

<p>This demonstrates core functional programming: recursion,
higher-order functions, and composition. Support for programs
that contain queries—the <code class="language-plaintext highlighter-rouge">from</code>, <code class="language-plaintext highlighter-rouge">exists</code>, and <code class="language-plaintext highlighter-rouge">forall</code>
keywords—and user-defined types will come later.</p>

<p>A caveat: this is pre-alpha software. Expect bugs, crashes, and
minimal error handling. We’ve focused on getting the foundations
right—the Hindley-Milner type deduction algorithm, an
evaluation environment that handles recursive functions and
closures—rather than polish. If you’d like to
contribute—fixing bugs, adding features, or improving
documentation—please join us!</p>

<h3 id="why-rust">Why Rust?</h3>

<p>Why create a Rust runtime for Morel when there is already a
Java runtime? Rust brings significant advantages for data
processing workloads.</p>

<p>Rust processes in-memory data at exceptional speed, with
zero-cost abstractions and no garbage collection pauses. It
integrates naturally with modern data infrastructure: Apache
DataFusion for query execution, Arrow for columnar processing,
and Parquet for efficient storage. Memory safety comes without
runtime overhead, and the resulting binaries are ideal for
cloud-native deployments.</p>

<p>Having multiple runtimes also underscores a key design principle: when
writing a Morel program, you don’t need to think about implementation
details. Your choice of runtime is separate from your choice of
language. Choose Java for its ecosystem and JVM integration, or Rust
for performance and modern infrastructure. Programs are portable
across both.</p>

<p>But the most important “runtime” is wherever your data already
lives—Iceberg tables on object storage, Kafka topics, or SQL
engines like Snowflake, BigQuery, or Postgres. That’s why query
planning, federation, and SQL dialect translation are central to
Morel’s design. The compiler can push computation to the data,
regardless of which Morel implementation you’re using.</p>

<h3 id="morel-is-a-language-not-a-framework">Morel is a language, not a framework</h3>

<p>Morel is a complete language, not a framework. When you have a
data problem, you can solve it entirely in Morel—no jumping
between languages, no glue code, and no framework boundaries to
cross.</p>

<p>With a framework, you’re constantly context-switching: Python
for orchestration, SQL for queries, Java for business logic,
and Spark for transformations. Morel lets you express the entire
solution in one language, bringing the benefits of functional
programming—type safety, composability, and refactoring—to data
engineering.</p>

<h3 id="choose-your-runtime-keep-your-code">Choose your runtime, keep your code</h3>

<p>Because Morel is a language with multiple implementations, your
Morel programs are portable across runtimes. Write your code
once, and run it on either Java or Rust—whichever fits your
deployment needs. Users shouldn’t notice—and don’t need to
care—which implementation they’re running.</p>

<p>One reason that Morel Rust has developed quickly is that we can
run Morel Java’s test scripts unchanged. We are gradually
enabling tests as functionality comes online, proving
portability in practice.</p>

<h3 id="learn-more">Learn more</h3>

<p>To find out more about Morel, read about its
<a href="/draft-blog3/2020/02/25/morel-a-functional-language-for-data.html">goals</a>
and <a href="/draft-blog3/2020/03/03/morel-basics.html">basic language</a>,
and find a full definition of the language in the
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md">query reference</a>
and the
<a href="https://github.com/hydromatic/morel/blob/main/docs/reference.md">language reference</a>.</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">Morel is now in Rust! I just made the first release of the new Rust toolchain for <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>. Morel-Rust implements same language as Morel-Java. It&#39;s early days, but potentially performance will be much better. <a href="https://t.co/15BJXA8lLe">https://t.co/15BJXA8lLe</a> <a href="https://t.co/cysSLvMbPP">pic.twitter.com/cysSLvMbPP</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1981440836467642880?ref_src=twsrc%5Etfw">October 23, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2025-10-23-morel-rust-release-0-2-0.md">has been updated</a>.</p>

<p><small>Apache Arrow, Apache DataFusion, Apache Iceberg, Apache
Parquet, and Apache Kafka are trademarks of the Apache Software
Foundation.</small></p>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[I am pleased to announce release 0.2.0 of Morel Rust.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Sorting on expressions</title><link href="http://0.0.0.0:4000/draft-blog3/2025/06/20/sorting-on-expressions.html" rel="alternate" type="text/html" title="Sorting on expressions" /><published>2025-06-20T13:00:00-07:00</published><updated>2025-06-20T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/06/20/sorting-on-expressions</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/06/20/sorting-on-expressions.html"><![CDATA[<p>Morel’s design philosophy of “everything is an expression” has
transformed how we think about queries, making them more composable
and flexible than traditional SQL.  One stubborn holdout was the
<code class="language-plaintext highlighter-rouge">order</code> step, which required a special syntax with comma-separated
order-items rather than a single expression. In this post, we describe
how we
<a href="https://github.com/hydromatic/morel/issues/244">evolved the syntax of the <code class="language-plaintext highlighter-rouge">order</code> step</a>
in Morel release 0.7, and the benefits of this change.</p>

<h2 id="why-expressions">Why expressions?</h2>

<p>In release 0.6, Morel’s
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md#syntax">query syntax</a>
(simplified a little) looked like this:</p>

<pre><code><i>query</i> &rarr; <b>from</b> <i>scan</i> [ , <i>scan</i> ... ] [ <i>step</i> ... ]

<i>step</i> &rarr; <b>distinct</b>
    | <b>except</b> [ <b>distinct</b> ] <i>exp</i> [ , <i>exp</i> ... ]
    | <b>group</b> <i>groupKey</i> [ , <i>groupKey</i> ... ] [ <b>compute</b> <i>agg</i> [ , <i>agg</i> ... ] ]
    | <b>intersect</b> [ <b>distinct</b> ] <i>exp</i> [ , <i>exp</i> ... ]
    | <b>join</b> <i>scan</i> [ , <i>scan</i> ... ]
    | <b>order</b> <i>orderItem</i> [ , <i>orderItem</i> ... ]
    | <b>skip</b> <i>exp</i>
    | <b>take</b> <i>exp</i>
    | <b>union</b> [ <b>distinct</b> ] <i>exp</i> [ , <i>exp</i> ... ]
    | <b>where</b> <i>exp</i>
    | <b>yield</b> <i>exp</i>

<i>scan</i> &rarr; <i>pat</i> <b>in</b> <i>exp</i> [ <b>on</b> <i>exp</i> ]

<i>orderItem</i> &rarr; <i>exp</i> [ <b>desc</b> ]

<i>groupKey</i> &rarr; [ <i>id</i> <b>=</b> ] <i>exp</i>

<i>agg</i> &rarr; [ <i>id</i> <b>=</b> ] <i>exp</i> [ <b>of</b> <i>exp</i> ]</code></pre>

<p>Almost everything is an expression. The argument to the <code class="language-plaintext highlighter-rouge">yield</code> step
is an expression (whereas SQL’s <code class="language-plaintext highlighter-rouge">SELECT</code> has a list of expressions
with optional <code class="language-plaintext highlighter-rouge">AS</code> aliases); the scan in a <code class="language-plaintext highlighter-rouge">from</code> query or <code class="language-plaintext highlighter-rouge">join</code> step
is over an expression (which, unlike SQL, is not necessarily a query);
the arguments to the <code class="language-plaintext highlighter-rouge">where</code>, <code class="language-plaintext highlighter-rouge">skip</code>, <code class="language-plaintext highlighter-rouge">take</code>, <code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">intersect</code>,
and <code class="language-plaintext highlighter-rouge">except</code> steps are also expressions.</p>

<p>(The <em>groupKey</em> and <em>agg</em> items in <code class="language-plaintext highlighter-rouge">group</code> and <code class="language-plaintext highlighter-rouge">compute</code> have some way
to go, and we will be looking at those for Morel 0.8, but at least the
aggregate function (before <code class="language-plaintext highlighter-rouge">of</code>) may be a (function-valued)
expression.)</p>

<p>Making everything an expression pays dividends. Queries can return a
collection of any value, not just records. You can easily join a
collection to a set of nested records (say an order to its nested
order-lines). If you need a custom aggregate function, you can roll
your own. And each of these expressions can be made into function
arguments, so that you can parameterize your query.</p>

<p>From Morel 0.7 onwards, the syntax of the <code class="language-plaintext highlighter-rouge">order</code> step is simpler:</p>

<pre><code><i>step</i> &rarr; ...
  | <b>order</b> <i>exp</i></code></pre>

<p>The argument is now just an expression, and the <em>orderItem</em> concept
has disappeared.</p>

<p>Let’s look at how we got here. What was wrong with the previous
syntax, which alternatives did we consider for the new syntax, and
what changes were necessary in order to make it possible?</p>

<h2 id="the-order-step">The <code class="language-plaintext highlighter-rouge">order</code> step</h2>

<p>In the previous syntax, the argument of the <code class="language-plaintext highlighter-rouge">order</code> step was a
comma-separated list of order-items, each of which was an expression
with an optional <code class="language-plaintext highlighter-rouge">desc</code> keyword.</p>

<p>One problem is the commas. In the expression</p>

<!-- morel skip
let
  val pairs = [(1, "a"), (2, "b"), (1, "c")];
in
  foo (from (i, j) in pairs order i desc, j)
end;
-->

<div class="code-block">
<div class="code-input"><span class="kr">let</span>
  <span class="kr">val</span> <span class="nv">pairs</span> <span class="p">=</span> <span class="p">[(</span><span class="mi">1</span><span class="p">,</span> <span class="s2">"a"</span><span class="p">),</span> <span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">),</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s2">"c"</span><span class="p">)];</span>
<span class="kr">in</span>
  <span class="n">foo</span> <span class="p">(</span><span class="kr">from</span> <span class="p">(</span><span class="nv">i</span><span class="p">,</span> <span class="nv">j</span><span class="p">)</span> <span class="kr">in</span> <span class="n">pairs</span> <span class="kr">order</span> <span class="n">i</span> <span class="kr">desc</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span>
<span class="kr">end</span><span class="p">;</span></div>
</div>

<p>it is not immediately clear whether <code class="language-plaintext highlighter-rouge">j</code> is a second argument for the
call to the function <code class="language-plaintext highlighter-rouge">foo</code> or the second item in the <code class="language-plaintext highlighter-rouge">order</code> clause.</p>

<p>Another problem was the fact that the <code class="language-plaintext highlighter-rouge">order</code> clause could not be
empty. The
<a href="https://github.com/hydromatic/morel/issues/273">ordered and unordered collections</a>
feature introduced an <code class="language-plaintext highlighter-rouge">unorder</code> step to convert a <code class="language-plaintext highlighter-rouge">list</code> to a <code class="language-plaintext highlighter-rouge">bag</code>,
and we need the opposite of that, a trivial sort whose
key has the same value for every element.</p>

<p>We can’t just get rid of the <code class="language-plaintext highlighter-rouge">desc</code> keyword and convert the list to a
singleton. Real queries require complex sorting behaviors like
composite keys, descending keys, and nulls-first or nulls-last
specifications. So, how can we put all that complexity in a single
expression?</p>

<p>One approach is to do what many programming languages do, and use a
comparator function. Let’s explore this approach.</p>

<h2 id="comparator-functions">Comparator functions</h2>

<p>In Standard ML, a comparator function is any function that takes a
pair of arguments of the same type and returns a value of the <code class="language-plaintext highlighter-rouge">order</code>
enum (<code class="language-plaintext highlighter-rouge">LESS</code>, <code class="language-plaintext highlighter-rouge">EQUAL</code>, <code class="language-plaintext highlighter-rouge">GREATER</code>). Its type is
<code class="language-plaintext highlighter-rouge">alpha * alpha -&gt; order</code>.</p>

<p>For <code class="language-plaintext highlighter-rouge">int</code>, I can write a simple function:</p>

<!-- morel
fun compareInt (x: int, y: int) =
  if x < y then LESS
  else if x > y then GREATER
  else EQUAL;
> val compareInt = fn : int * int -> order
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">compareInt</span> <span class="p">(</span><span class="n">x</span><span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">int</span><span class="p">)</span> <span class="p">=</span>
  <span class="kr">if</span> <span class="n">x</span> <span class="o">&lt;</span> <span class="n">y</span> <span class="kr">then</span> <span class="n">LESS</span>
  <span class="kr">else</span> <span class="kr">if</span> <span class="n">x</span> <span class="o">&gt;</span> <span class="n">y</span> <span class="kr">then</span> <span class="n">GREATER</span>
  <span class="kr">else</span> <span class="n">EQUAL</span><span class="p">;</span></div>
<div class="code-output">val compareInt = fn : int * int -&gt; order</div>
</div>

<p>In fact, most data types have a built-in <code class="language-plaintext highlighter-rouge">compare</code> function:</p>

<!-- morel
Int.compare;
> val it = fn : int * int -> order
Real.compare;
> val it = fn : real * real -> order
String.compare;
> val it = fn : string * string -> order
-->

<div class="code-block">
<div class="code-input"><span class="nn">Int</span><span class="p">.</span><span class="n">compare</span><span class="p">;</span></div>
<div class="code-output">val it = fn : int * int -&gt; order</div>
<div class="code-input"><span class="nn">Real</span><span class="p">.</span><span class="n">compare</span><span class="p">;</span></div>
<div class="code-output">val it = fn : real * real -&gt; order</div>
<div class="code-input"><span class="nn">String</span><span class="p">.</span><span class="n">compare</span><span class="p">;</span></div>
<div class="code-output">val it = fn : string * string -&gt; order</div>
</div>

<p>For more complex orderings, I can write a comparator that combines
other comparators. For example, this function compares two
<code class="language-plaintext highlighter-rouge">string * real</code> pairs, the <code class="language-plaintext highlighter-rouge">string</code> first, then the <code class="language-plaintext highlighter-rouge">real</code>
descending:</p>

<!-- morel
fun compareStringRealPair ((s1, r1), (s2, r2)) =
    case String.compare (s1, s2) of
        EQUAL => Real.compare (r2, r1)
      | result => result;
> val compareStringRealPair = fn : string * real * (string * real) -> order
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">compareStringRealPair</span> <span class="p">((</span><span class="n">s1</span><span class="p">,</span> <span class="n">r1</span><span class="p">),</span> <span class="p">(</span><span class="n">s2</span><span class="p">,</span> <span class="n">r2</span><span class="p">))</span> <span class="p">=</span>
    <span class="kr">case</span> <span class="nn">String</span><span class="p">.</span><span class="n">compare</span> <span class="p">(</span><span class="n">s1</span><span class="p">,</span> <span class="n">s2</span><span class="p">)</span> <span class="kr">of</span>
        <span class="n">EQUAL</span> <span class="o">=&gt;</span> <span class="nn">Real</span><span class="p">.</span><span class="n">compare</span> <span class="p">(</span><span class="n">r2</span><span class="p">,</span> <span class="n">r1</span><span class="p">)</span>
      <span class="p">|</span> <span class="n">result</span> <span class="o">=&gt;</span> <span class="n">result</span><span class="p">;</span></div>
<div class="code-output">val compareStringRealPair = fn : string * real * (string * real) -&gt; order</div>
</div>
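
<p>To check the direction of each key, we can call the function on a
couple of hand-picked pairs. The values are arbitrary, and the results
shown are the ones that follow from the definition above:</p>

<!-- morel skip
(* Same string, so the reals decide, in descending order. *)
compareStringRealPair (("a", 1.0), ("a", 2.0));
> val it = GREATER : order
(* Different strings, so the strings decide. *)
compareStringRealPair (("a", 1.0), ("b", 2.0));
> val it = LESS : order
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Same string, so the reals decide, in descending order. *)</span>
<span class="n">compareStringRealPair</span> <span class="p">((</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span><span class="p">),</span> <span class="p">(</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">2</span><span class="p">.</span><span class="mi">0</span><span class="p">));</span></div>
<div class="code-output">val it = GREATER : order</div>
<div class="code-input"><span class="c">(*</span><span class="cm"> Different strings, so the strings decide. *)</span>
<span class="n">compareStringRealPair</span> <span class="p">((</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">1</span><span class="p">.</span><span class="mi">0</span><span class="p">),</span> <span class="p">(</span><span class="s2">"b"</span><span class="p">,</span> <span class="mi">2</span><span class="p">.</span><span class="mi">0</span><span class="p">));</span></div>
<div class="code-output">val it = LESS : order</div>
</div>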

<p>If we were to add comparators to Morel, we could add <code class="language-plaintext highlighter-rouge">order using</code>
syntax like this:</p>

<!-- morel skip
(* Sort employees by job, and then by descending salary. *)
from e in scott.emps
  order using fn (emp1, emp2) =>
    case String.compare (emp1.job, emp2.job) of
       EQUAL => Real.compare (emp2.sal, emp1.sal)
     | result => result;
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Sort employees by job, and then by descending salary. *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="n">using</span> <span class="kr">fn</span> <span class="p">(</span><span class="n">emp1</span><span class="p">,</span> <span class="n">emp2</span><span class="p">)</span> <span class="o">=&gt;</span>
    <span class="kr">case</span> <span class="nn">String</span><span class="p">.</span><span class="n">compare</span> <span class="p">(</span><span class="nn">emp1</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">emp2</span><span class="p">.</span><span class="n">job</span><span class="p">)</span> <span class="kr">of</span>
       <span class="n">EQUAL</span> <span class="o">=&gt;</span> <span class="nn">Real</span><span class="p">.</span><span class="n">compare</span> <span class="p">(</span><span class="nn">emp2</span><span class="p">.</span><span class="n">sal</span><span class="p">,</span> <span class="nn">emp1</span><span class="p">.</span><span class="n">sal</span><span class="p">)</span>
     <span class="p">|</span> <span class="n">result</span> <span class="o">=&gt;</span> <span class="n">result</span><span class="p">;</span></div>
</div>

<p>(The comparator expression in this query is basically an inline
version of the <code class="language-plaintext highlighter-rouge">compareStringRealPair</code> function, but working on <code class="language-plaintext highlighter-rouge">emp</code>
records rather than <code class="language-plaintext highlighter-rouge">string * real</code> pairs.)</p>

<p>But this is much longer than the equivalent in SQL. Comparator
functions are clearly powerful, but they fail the “make simple things
simple” test – forcing developers to write complex code for common
sorting patterns.</p>

<p>Let’s look instead at value-based sorting, which is simpler, but
provides most of the flexibility of comparator functions.</p>

<h2 id="structured-values-for-complex-orderings">Structured values for complex orderings</h2>

<p>The idea behind value-based sorting is that any values of the same
type can be compared, and that the Morel system generates comparison
logic for any type. If you require a complex sorting behavior, you can
construct an expression with a complex type.</p>

<p>Previously, if you wanted a composite ordering, with one of the keys
descending, you would write something like this:</p>

<!-- morel skip
(* Old syntax. *)
from e in scott.emps
  order e.job, e.sal desc;
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Old syntax. *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="kr">desc</span><span class="p">;</span></div>
</div>

<p>As of Morel 0.7, you can write the same query using a single
expression:</p>

<!-- morel skip
(* New syntax. *)
from e in scott.emps
  order (e.job, DESC e.sal);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> New syntax. *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">);</span></div>
</div>

<p>Note that:</p>
<ul>
  <li>For a composite ordering, we use a tuple type. Morel compares the
values lexicographically.</li>
  <li>For a descending ordering, we wrap the value in the <code class="language-plaintext highlighter-rouge">descending</code>
data type using its <code class="language-plaintext highlighter-rouge">DESC</code> constructor. Morel compares the values
in the usual way, then reverses the direction.</li>
</ul>
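
<p>Here is a small sketch of both rules working together on an
in-memory list. The record fields <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">score</code> and the data
are made up for this illustration:</p>

<!-- morel skip
(* Hypothetical data: sort by name, then by descending score. *)
from p in [{name = "b", score = 1},
           {name = "a", score = 1},
           {name = "a", score = 2}]
  order (p.name, DESC p.score);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Hypothetical data: sort by name, then by descending score. *)</span>
<span class="kr">from</span> <span class="nv">p</span> <span class="kr">in</span> <span class="p">[{</span><span class="n">name</span> <span class="p">=</span> <span class="s2">"b"</span><span class="p">,</span> <span class="n">score</span> <span class="p">=</span> <span class="mi">1</span><span class="p">},</span>
           <span class="p">{</span><span class="n">name</span> <span class="p">=</span> <span class="s2">"a"</span><span class="p">,</span> <span class="n">score</span> <span class="p">=</span> <span class="mi">1</span><span class="p">},</span>
           <span class="p">{</span><span class="n">name</span> <span class="p">=</span> <span class="s2">"a"</span><span class="p">,</span> <span class="n">score</span> <span class="p">=</span> <span class="mi">2</span><span class="p">}]</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">p</span><span class="p">.</span><span class="n">name</span><span class="p">,</span> <span class="n">DESC</span> <span class="nn">p</span><span class="p">.</span><span class="n">score</span><span class="p">);</span></div>
</div>

<p>By the two rules above, the rows should come out with the names
ascending and, within a name, the higher score first: <code class="language-plaintext highlighter-rouge">"a"</code> with
score 2, then <code class="language-plaintext highlighter-rouge">"a"</code> with score 1, then <code class="language-plaintext highlighter-rouge">"b"</code>.</p>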

<p>Sorting is defined for all other data types, including tuples,
records, sum types such as <code class="language-plaintext highlighter-rouge">option</code> and <code class="language-plaintext highlighter-rouge">descending</code>, lists, bags, and
any combination thereof.</p>

<p>Morel’s compiler has two tricks to make this powerful and efficient.</p>

<p>First, Morel is effectively generating a comparator function at
compile time based on the type of the <code class="language-plaintext highlighter-rouge">order</code> expression.  This makes
value-based sorting as powerful as comparator functions, but with less
code for the user to write.</p>

<p>(The change included a new library function, <code class="language-plaintext highlighter-rouge">Relational.compare</code>,
that allows you to compare any two values of the same type, even if
you are not performing a sort. This is a somewhat strange function,
because it takes the type as an implicit argument, then drives its
behavior by introspecting that type.)</p>
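
<p>A call might look like the following sketch. The pair-of-arguments
calling convention is assumed here by analogy with <code class="language-plaintext highlighter-rouge">Int.compare</code>, so
check the language reference for the exact signature:</p>

<!-- morel skip
(* Assumption: Relational.compare takes a pair, like Int.compare. *)
Relational.compare ((1, "a"), (1, "b"));
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Assumption: Relational.compare takes a pair, like Int.compare. *)</span>
<span class="nn">Relational</span><span class="p">.</span><span class="n">compare</span> <span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="s2">"a"</span><span class="p">),</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">));</span></div>
</div>

<p>If that convention holds, the lexicographic rule for tuples implies
a result of <code class="language-plaintext highlighter-rouge">LESS</code>: the first components are equal, and <code class="language-plaintext highlighter-rouge">"a"</code> sorts
before <code class="language-plaintext highlighter-rouge">"b"</code>.</p>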

<p>Second, the <code class="language-plaintext highlighter-rouge">order</code> clause uses a form of lazy evaluation. If the
query</p>

<!-- morel skip
from e in scott.emps
  order (e.job, DESC e.sal);
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">);</span></div>
</div>

<p>created a tuple <code class="language-plaintext highlighter-rouge">(e.job, DESC(e.sal))</code> for every element, we would
worry about the impact on performance, but those tuples are never
constructed. Morel operates on the employee records <code class="language-plaintext highlighter-rouge">e</code> directly,
and the performance is the same as if we had specified the ordering
using a list of order-items or a comparator function.</p>

<h2 id="benefits-of-sorting-on-expressions">Benefits of sorting on expressions</h2>

<p>Now that the <code class="language-plaintext highlighter-rouge">order</code> step takes an expression, what is possible that
wasn’t before?</p>

<p>We can pass the expression as an argument to a function, like this:</p>

<!-- morel skip
fun rankedEmployees extractKey =
  from e in scott.emps
    order extractKey e;

rankedEmployees (fn e => e.ename);
rankedEmployees (fn e => (e.job,  DESC e.sal));
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">rankedEmployees</span> <span class="n">extractKey</span> <span class="p">=</span>
  <span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
    <span class="kr">order</span> <span class="n">extractKey</span> <span class="n">e</span><span class="p">;</span>

<span class="n">rankedEmployees</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">e</span> <span class="o">=&gt;</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">);</span>
<span class="n">rankedEmployees</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">e</span> <span class="o">=&gt;</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span>  <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">));</span></div>
</div>

<p>We can also achieve the trivial sort required to convert a <code class="language-plaintext highlighter-rouge">bag</code> to a
<code class="language-plaintext highlighter-rouge">list</code>. You can sort by any constant value, such as the integer <code class="language-plaintext highlighter-rouge">0</code> or
the <code class="language-plaintext highlighter-rouge">Option</code> constructor <code class="language-plaintext highlighter-rouge">NONE</code>, but the norm would be to sort by the
empty tuple <code class="language-plaintext highlighter-rouge">()</code>:</p>

<!-- morel skip
from e in scott.emps
  yield e.ename
  order ();
> val it =
>   ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK",
>    "SCOTT","KING","TURNER","ADAMS","JAMES","FORD","MILLER"]
>   : string list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span>
  <span class="kr">order</span> <span class="p">();</span></div>
<div class="code-output">val it =
  ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK",
   "SCOTT","KING","TURNER","ADAMS","JAMES","FORD","MILLER"]
  : string list</div>
</div>

<p>Note that the result is a <code class="language-plaintext highlighter-rouge">list</code>, even though <code class="language-plaintext highlighter-rouge">scott.emps</code> (a relational
database table) is a <code class="language-plaintext highlighter-rouge">bag</code>.  The elements are in
arbitrary order (because any order is consistent with the empty sort
key) but in converting the collection to a <code class="language-plaintext highlighter-rouge">list</code> the arbitrary order
has become frozen and repeatable.</p>

<h2 id="future-work">Future work</h2>

<p>Several challenges remain to be addressed.</p>

<h3 id="nulls-first-and-nulls-last">NULLS FIRST and NULLS LAST</h3>

<p>Real-world data sets often contain null values, and at various times
you wish to sort nulls low (as if they were zero or negative infinity)
or high (as if they were positive infinity). Morel uses the <code class="language-plaintext highlighter-rouge">option</code>
type rather than <code class="language-plaintext highlighter-rouge">NULL</code> to represent optional values, but the same
requirement exists.</p>

<p>SQL has <code class="language-plaintext highlighter-rouge">NULLS FIRST</code> and <code class="language-plaintext highlighter-rouge">NULLS LAST</code> keywords to control how nulls
are sorted, but Morel does not have an equivalent syntax.</p>

<p>Currently, the behavior is the same as SQL’s <code class="language-plaintext highlighter-rouge">NULLS FIRST</code>.  This
happens because Morel sorts datatype values based on the declaration
order of their constructors. The <code class="language-plaintext highlighter-rouge">option</code> type is declared as:</p>

<!-- morel skip
datatype 'a option = NONE | SOME of 'a;
-->

<div class="code-block">
<div class="code-input"><span class="kr">datatype</span> <span class="nn">'a</span> <span class="n">option</span> <span class="p">=</span> <span class="n">NONE</span> <span class="p">|</span> <span class="n">SOME</span> <span class="kr">of</span> <span class="nn">'a</span><span class="p">;</span></div>
</div>

<p>Since <code class="language-plaintext highlighter-rouge">NONE</code> appears before <code class="language-plaintext highlighter-rouge">SOME</code> in this declaration, the <code class="language-plaintext highlighter-rouge">NONE</code>
value sorts lower than all <code class="language-plaintext highlighter-rouge">SOME</code> values:</p>

<!-- morel
from i in [SOME 1, SOME ~100, NONE]
  order i;
> val it = [NONE,SOME ~100,SOME 1] : int option list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="n">SOME</span> <span class="mi">1</span><span class="p">,</span> <span class="n">SOME</span> ~<span class="mi">100</span><span class="p">,</span> <span class="n">NONE</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">i</span><span class="p">;</span></div>
<div class="code-output">val it = [NONE,SOME ~100,SOME 1] : int option list</div>
</div>
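
<p>The same rule should apply to a datatype that you declare yourself.
As a sketch, here is a hypothetical <code class="language-plaintext highlighter-rouge">size</code> datatype (it is not part of
Morel’s library; it is invented for this illustration):</p>

<!-- morel skip
(* Hypothetical datatype; constructors are declared smallest first. *)
datatype size = SMALL | MEDIUM | LARGE;
from s in [LARGE, SMALL, MEDIUM]
  order s;
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Hypothetical datatype; constructors are declared smallest first. *)</span>
<span class="kr">datatype</span> <span class="n">size</span> <span class="p">=</span> <span class="n">SMALL</span> <span class="p">|</span> <span class="n">MEDIUM</span> <span class="p">|</span> <span class="n">LARGE</span><span class="p">;</span>
<span class="kr">from</span> <span class="nv">s</span> <span class="kr">in</span> <span class="p">[</span><span class="n">LARGE</span><span class="p">,</span> <span class="n">SMALL</span><span class="p">,</span> <span class="n">MEDIUM</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">s</span><span class="p">;</span></div>
</div>

<p>Because <code class="language-plaintext highlighter-rouge">SMALL</code> is declared first and <code class="language-plaintext highlighter-rouge">LARGE</code> last, we would expect
the values to come back as <code class="language-plaintext highlighter-rouge">SMALL</code>, <code class="language-plaintext highlighter-rouge">MEDIUM</code>, <code class="language-plaintext highlighter-rouge">LARGE</code>.</p>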

<p>We haven’t yet figured out how to express the equivalent of <code class="language-plaintext highlighter-rouge">NULLS
LAST</code>.  One idea is to add a <code class="language-plaintext highlighter-rouge">noneLast</code> datatype</p>

<!-- morel skip
datatype 'a noneLast = NONE_LAST of 'a;
-->

<div class="code-block">
<div class="code-input"><span class="kr">datatype</span> <span class="nn">'a</span> <span class="n">noneLast</span> <span class="p">=</span> <span class="n">NONE_LAST</span> <span class="kr">of</span> <span class="nn">'a</span><span class="p">;</span></div>
</div>

<p>and use it in a query like this:</p>

<!-- morel skip
from i in [SOME 1, SOME ~100, NONE]
  order NONE_LAST i;
> val it = [SOME ~100, SOME 1, NONE] : int option list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="n">SOME</span> <span class="mi">1</span><span class="p">,</span> <span class="n">SOME</span> ~<span class="mi">100</span><span class="p">,</span> <span class="n">NONE</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">NONE_LAST</span> <span class="n">i</span><span class="p">;</span></div>
<div class="code-output">val it = [SOME ~100, SOME 1, NONE] : int option list</div>
</div>

<p>When we use <code class="language-plaintext highlighter-rouge">NONE_LAST</code> and <code class="language-plaintext highlighter-rouge">DESC</code> together in a query</p>

<!-- morel skip
from i in [SOME 1, SOME ~100, NONE]
  order DESC (NONE_LAST i);
> val it = [NONE, SOME 1, SOME ~100] : int option list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="n">SOME</span> <span class="mi">1</span><span class="p">,</span> <span class="n">SOME</span> ~<span class="mi">100</span><span class="p">,</span> <span class="n">NONE</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="p">(</span><span class="n">NONE_LAST</span> <span class="n">i</span><span class="p">);</span></div>
<div class="code-output">val it = [NONE, SOME 1, SOME ~100] : int option list</div>
</div>

<p>the <code class="language-plaintext highlighter-rouge">NONE</code> value appears first. It’s what we asked for,
but not what we would expect if we assumed that <code class="language-plaintext highlighter-rouge">DESC</code>
and <code class="language-plaintext highlighter-rouge">NONE_LAST</code> commute.</p>

<p>Until we figure out something intuitive, Morel will not have a
solution for <code class="language-plaintext highlighter-rouge">NULLS LAST</code>.</p>

<h3 id="comparator-functions-1">Comparator functions</h3>

<p>Under the “make hard things possible” principle, we might still want
to support comparator functions at some point. The syntax could be as
follows:</p>

<pre><code><i>step</i> &rarr; ...
  | <b>order</b> <i>exp</i>
  | <b>order using</b> <i>comparator</i></code></pre>

<p>Is value-based sorting strictly less powerful than comparator
functions? It’s an interesting theoretical question, and I honestly
don’t know. A comparator function can be an arbitrarily complex piece
of code, but perhaps it is always possible to create a value that
matches the structure of the code.</p>

<h3 id="aggregation-syntax">Aggregation syntax</h3>

<p>The syntax for <code class="language-plaintext highlighter-rouge">group</code> and <code class="language-plaintext highlighter-rouge">compute</code> steps is still not an expression.
For Morel 0.8 and beyond, we will be looking at
<a href="https://github.com/hydromatic/morel/issues/288">several improvements</a>.</p>

<p>First, making the group-key and compute-items an expression, with
field aliasing provided via record syntax, as in the current <code class="language-plaintext highlighter-rouge">yield</code>
step.</p>

<p>Second, allowing complex compute expressions with expressions both
inside and outside the aggregate function, as in the SQL expression
“<code class="language-plaintext highlighter-rouge">1 + AVG(sal * 2)</code>”. This will mean the <code class="language-plaintext highlighter-rouge">of</code> keyword, which is
currently part of the <em>agg</em> syntax, will be transitioned to a new
keyword that is part of the expression syntax, possibly <code class="language-plaintext highlighter-rouge">over</code>.</p>

<p>Third, further exploring the relationship between the argument to an
aggregate function and a query. SQL aggregate function syntax by now
includes most relational operators (<code class="language-plaintext highlighter-rouge">FILTER</code>,
<code class="language-plaintext highlighter-rouge">DISTINCT</code>, <code class="language-plaintext highlighter-rouge">WITHIN DISTINCT</code>, <code class="language-plaintext highlighter-rouge">ORDER BY</code>), so we will consider making the
argument (introduced by the <code class="language-plaintext highlighter-rouge">over</code> keyword just mentioned) a kind of query
expression.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Making sorting expression-based represents more than just a syntax
change – it exemplifies Morel’s commitment to principled language
design. By eliminating the special-case syntax for <code class="language-plaintext highlighter-rouge">order</code>, we’ve
resolved parsing ambiguities and enabled new forms of query
composition.</p>

<p>In the next few releases, we shall continue to evolve Morel to make it
more uniform and composable. The result, we hope, will be a query
language that feels both familiar to SQL users and naturally
functional to developers who think in terms of higher-order functions
and data transformation pipelines.</p>

<p>To find out more about Morel, read about its
<a href="/draft-blog3/2020/02/25/morel-a-functional-language-for-data.html">goals</a>
and <a href="/draft-blog3/2020/03/03/morel-basics.html">basic language</a>, peruse the
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md">query reference</a>
or
<a href="https://github.com/hydromatic/morel/blob/main/docs/reference.md">language reference</a>,
or download it from <a href="https://github.com/hydromatic/morel/">GitHub</a> and
give it a try.</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">How we simplified the syntax of <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>&#39;s &quot;order&quot; step <a href="https://t.co/pLUHBVoURN">https://t.co/pLUHBVoURN</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1936229301621604772?ref_src=twsrc%5Etfw">June 21, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2025-06-20-sorting-on-expressions.md">has been updated</a>.</p>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[Morel’s design philosophy of “everything is an expression” has transformed how we think about queries, making them more composable and flexible than traditional SQL. One stubborn holdout was the order step, which required a special syntax with comma-separated order-items rather than a single expression. In this post, we describe how we evolved the syntax of the order step in Morel release 0.7, and the benefits of this change.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Morel release 0.7.0</title><link href="http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0.html" rel="alternate" type="text/html" title="Morel release 0.7.0" /><published>2025-06-08T13:00:00-07:00</published><updated>2025-06-08T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0.html"><![CDATA[<p>I am pleased to announce Morel
<a href="https://github.com/hydromatic/morel/blob/main/HISTORY.md#070--2025-06-07">release 0.7.0</a>,
just one month after
<a href="https://github.com/hydromatic/morel/blob/main/HISTORY.md#060--2025-05-02">release 0.6.0</a>.</p>

<p>This release has actually been under development for a long time.
<a href="#1-ordered-and-unordered-collections-and-queries">Ordered and unordered collections and queries</a>,
which are the centerpiece of this release, required major changes to
the type inference algorithm, not to mention a new
<a href="https://github.com/hydromatic/morel/issues/235">data type</a> (<code class="language-plaintext highlighter-rouge">bag</code>),
<a href="https://github.com/hydromatic/morel/issues/277">query step</a> (<code class="language-plaintext highlighter-rouge">unorder</code>),
and
<a href="https://github.com/hydromatic/morel/issues/276">expression</a> (<code class="language-plaintext highlighter-rouge">ordinal</code>).
The type inference changes have been under development for six months
(during which time there were two other Morel releases), and were so
extensive that we got
<a href="#2-function-overloading">function overloading</a> practically for free.</p>

<p>There are other changes to query syntax:
<a href="#3-sorting-on-expressions">sorting on expressions</a>,
<a href="#4-atomic-yield-steps">atomic <code class="language-plaintext highlighter-rouge">yield</code> steps</a>, and
<a href="#5-set-operators-in-pipelines">set operators in pipelines</a>.</p>

<p>Morel aims to be a solid implementation of Standard ML and a good
general-purpose programming language, in addition to being a
revolutionary query language, which means gradually completing our
implementation of Standard ML’s
<a href="https://smlfamily.github.io/Basis/">Basis Library</a>. In this release, we
have completed the
<a href="#6-string-and-char-structures"><code class="language-plaintext highlighter-rouge">String</code> and <code class="language-plaintext highlighter-rouge">Char</code> structures</a>.</p>

<p>Let’s explore the key features. For complete details, see the
<a href="https://github.com/hydromatic/morel/blob/main/HISTORY.md#070--2025-06-07">official release notes</a>.</p>

<h2 id="1-ordered-and-unordered-collections-and-queries">1. Ordered and unordered collections and queries</h2>

<p>The biggest change in 0.7.0 is the introduction of
<a href="https://github.com/hydromatic/morel/issues/273">ordered and unordered collections and queries</a>.
Previously, every query was over a <code class="language-plaintext highlighter-rouge">list</code> type, whose elements are
ordered and may include duplicates.</p>

<p>But saying that every collection and query is over a <code class="language-plaintext highlighter-rouge">list</code> type
is a white lie. Consider this query:</p>

<!-- morel skip
from e in scott.emps
  where e.sal > 1000.0
  yield e.ename;
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="o">&gt;</span> <span class="mi">1000</span><span class="p">.</span><span class="mi">0</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">;</span></div>
</div>

<p>The collection <code class="language-plaintext highlighter-rouge">scott.emps</code> maps to the <code class="language-plaintext highlighter-rouge">EMP</code> table in the <code class="language-plaintext highlighter-rouge">scott</code>
database, and Morel’s goal is to push as much of the processing as
possible to where the data resides. In this case, Morel can generate
the SQL query</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">ENAME</span>
<span class="k">FROM</span> <span class="n">SCOTT</span><span class="p">.</span><span class="n">EMP</span>
<span class="k">WHERE</span> <span class="n">SAL</span> <span class="o">&gt;</span> <span class="mi">1000</span><span class="p">.</span><span class="mi">0</span><span class="p">;</span>
</code></pre></div></div>

<p>SQL makes no guarantees about the order of results. If you execute
the query twice, a DBMS is free to return the results in a different
order each time. So Morel is being dishonest if it says that the result
is a <code class="language-plaintext highlighter-rouge">list</code>.</p>

<p>Could we redefine <code class="language-plaintext highlighter-rouge">list</code> so that its iteration order is undefined?
Yes, but then we would be short-changing queries such as</p>

<!-- morel
from i in ["a", "b"],
    j in [1, 2, 3]
  yield (i, j);
> val it = [("a",1),("a",2),("a",3),("b",1),("b",2),("b",3)]
>   : (string * int) list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="s2">"a"</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">],</span>
    <span class="nv">j</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">);</span></div>
<div class="code-output">val it = [("a",1),("a",2),("a",3),("b",1),("b",2),("b",3)]
  : (string * int) list</div>
</div>

<p>which do have a defined order.</p>

<p>The fact is – even though the relational model tells us it ain’t so
– some data sets are ordered, and some are unordered. Adding distinct
<code class="language-plaintext highlighter-rouge">bag</code> and <code class="language-plaintext highlighter-rouge">list</code> types, relational operators that can work on both,
and relational operators to convert between them, was the way to go.</p>

<p>The features that we implemented are described in the article
“<a href="http://blog.hydromatic.net/2025/06/06/ordered-unordered.html">Ordered and unordered data</a>”.</p>

<h2 id="2-function-overloading">2. Function overloading</h2>

<p>In Standard ML, and in Morel until recently, a name could only have
one binding.  Functions are values, and therefore inhabit the same
namespace as regular values.  If I declare <code class="language-plaintext highlighter-rouge">x</code> to be an <code class="language-plaintext highlighter-rouge">int</code> value</p>

<!-- morel skip
val x = 42;
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">x</span> <span class="p">=</span> <span class="mi">42</span><span class="p">;</span></div>
</div>

<p>and then later try to declare <code class="language-plaintext highlighter-rouge">x</code> to be a function</p>

<!-- morel skip
val x = fn y => y + 1;
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">x</span> <span class="p">=</span> <span class="kr">fn</span> <span class="n">y</span> <span class="o">=&gt;</span> <span class="n">y</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span></div>
</div>

<p>then the previous declaration of <code class="language-plaintext highlighter-rouge">x</code> is no longer accessible.</p>

<!-- morel fail
val z = x - 2;
> stdIn:1.9 Error: unbound variable or constructor: x
>   raised at: stdIn:1.9
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">z</span> <span class="p">=</span> <span class="n">x</span> <span class="o">-</span> <span class="mi">2</span><span class="p">;</span></div>
<div class="code-error">stdIn:1.9 Error: unbound variable or constructor: x
  raised at: stdIn:1.9</div>
</div>

<p>To create
<a href="https://github.com/hydromatic/morel/issues/237">overloaded functions</a>,
we need to declare that an identifier is special; we do this using the
new <code class="language-plaintext highlighter-rouge">over</code> keyword:</p>

<!-- morel
over f;
> over f
-->

<div class="code-block">
<div class="code-input"><span class="kr">over</span> <span class="n">f</span><span class="p">;</span></div>
<div class="code-output">over f</div>
</div>

<p>Now we can define several instances of <code class="language-plaintext highlighter-rouge">f</code>:</p>

<!-- morel
val inst f = fn (x : int, y : int) => x + y;
> val f = fn : int * int -> int
val inst f = fn list => length list;
> val f = fn : 'a list -> int
val inst f = fn SOME x => x ^ "!" | NONE => ":(";
> val f = fn : string option -> string
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="kr">inst</span> <span class="nv">f</span> <span class="p">=</span> <span class="kr">fn</span> <span class="p">(</span><span class="n">x</span> <span class="p">:</span> <span class="n">int</span><span class="p">,</span> <span class="n">y</span> <span class="p">:</span> <span class="n">int</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="p">;</span></div>
<div class="code-output">val f = fn : int * int -&gt; int</div>
<div class="code-input"><span class="kr">val</span> <span class="kr">inst</span> <span class="nv">f</span> <span class="p">=</span> <span class="kr">fn</span> <span class="n">list</span> <span class="o">=&gt;</span> <span class="n">length</span> <span class="n">list</span><span class="p">;</span></div>
<div class="code-output">val f = fn : 'a list -&gt; int</div>
<div class="code-input"><span class="kr">val</span> <span class="kr">inst</span> <span class="nv">f</span> <span class="p">=</span> <span class="kr">fn</span> <span class="n">SOME</span> <span class="n">x</span> <span class="o">=&gt;</span> <span class="n">x</span> ^ <span class="s2">"!"</span> <span class="p">|</span> <span class="n">NONE</span> <span class="o">=&gt;</span> <span class="s2">":("</span><span class="p">;</span></div>
<div class="code-output">val f = fn : string option -&gt; string</div>
</div>

<p>All instances must be functions, because the overloads are resolved based on
the type of the first argument.</p>

<p>Calls to <code class="language-plaintext highlighter-rouge">f</code> will be resolved based on the types of the arguments:</p>

<!-- morel
(* Call the "int * int -> int" overload. *)
f (7, 8);
> val it = 15 : int
(* Call the "'a list -> int" overload. *)
f ["a", "b", "c"];
> val it = 3 : int
f [1, 2, 3, 4];
> val it = 4 : int
f [];
> val it = 0 : int
(* Call the "string option -> string" overload. *)
f (SOME "happy");
> val it = "happy!" : string
f NONE;
> val it = ":(" : string
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Call the "int * int -&gt; int" overload. *)</span>
<span class="n">f</span> <span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">8</span><span class="p">);</span></div>
<div class="code-output">val it = 15 : int</div>
<div class="code-input"><span class="c">(*</span><span class="cm"> Call the "'a list -&gt; int" overload. *)</span>
<span class="n">f</span> <span class="p">[</span><span class="s2">"a"</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">,</span> <span class="s2">"c"</span><span class="p">];</span></div>
<div class="code-output">val it = 3 : int</div>
<div class="code-input"><span class="n">f</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">];</span></div>
<div class="code-output">val it = 4 : int</div>
<div class="code-input"><span class="n">f</span> <span class="p">[];</span></div>
<div class="code-output">val it = 0 : int</div>
<div class="code-input"><span class="c">(*</span><span class="cm"> Call the "string option -&gt; string" overload. *)</span>
<span class="n">f</span> <span class="p">(</span><span class="n">SOME</span> <span class="s2">"happy"</span><span class="p">);</span></div>
<div class="code-output">val it = "happy!" : string</div>
<div class="code-input"><span class="n">f</span> <span class="n">NONE</span><span class="p">;</span></div>
<div class="code-output">val it = ":(" : string</div>
</div>

<!-- morel fail
(* No overloads match "int option" or "int * int * int" arguments. *)
f (SOME 42);
> 0.0-0.0 Error: Cannot deduce type: no valid overloads
>   raised at: 0.0-0.0
f (1, 2, 3);
> 0.0-0.0 Error: Cannot deduce type: no valid overloads
>   raised at: 0.0-0.0
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> No overloads match "int option" or "int * int * int" arguments. *)</span>
<span class="n">f</span> <span class="p">(</span><span class="n">SOME</span> <span class="mi">42</span><span class="p">);</span></div>
<div class="code-error">0.0-0.0 Error: Cannot deduce type: no valid overloads
  raised at: 0.0-0.0</div>
<div class="code-input"><span class="n">f</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span></div>
<div class="code-error">0.0-0.0 Error: Cannot deduce type: no valid overloads
  raised at: 0.0-0.0</div>
</div>

<h2 id="3-sorting-on-expressions">3. Sorting on expressions</h2>

<p>There are only a few places in Morel syntax where you do not use an
expression, and the <code class="language-plaintext highlighter-rouge">order</code> step used to be one of them.  Previously,
<code class="language-plaintext highlighter-rouge">order</code> was followed by a list of “order items”, each an expression
optionally followed by <code class="language-plaintext highlighter-rouge">desc</code>. The items were separated by commas, and
the list could not be empty.</p>

<p>The commas were a problem. In the expression</p>

<!-- morel skip
foo (from i in [1, 2, 3] order i desc, j);
-->

<div class="code-block">
<div class="code-input"><span class="n">foo</span> <span class="p">(</span><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span> <span class="kr">order</span> <span class="n">i</span> <span class="kr">desc</span><span class="p">,</span> <span class="n">j</span><span class="p">);</span></div>
</div>

<p>it is not clear whether <code class="language-plaintext highlighter-rouge">j</code> is a second argument for the call to the
function <code class="language-plaintext highlighter-rouge">foo</code> or the second item in the <code class="language-plaintext highlighter-rouge">order</code> clause.</p>

<p>Another problem was the fact that the <code class="language-plaintext highlighter-rouge">order</code> clause could not be
empty. The
<a href="#1-ordered-and-unordered-collections-and-queries">ordered/unordered collections</a>
feature introduced an <code class="language-plaintext highlighter-rouge">unorder</code> step to convert a <code class="language-plaintext highlighter-rouge">list</code> to a <code class="language-plaintext highlighter-rouge">bag</code>,
and we needed the opposite of that, a trivial sort whose
key has the same value for every element.</p>

<p>The answer was to
<a href="https://github.com/hydromatic/morel/issues/244">make the argument to <code class="language-plaintext highlighter-rouge">order</code> an expression</a>.
A composite sort specification is now a tuple, still separated by
commas, but now enclosed in parentheses.  If a sort key is descending,
you now wrap it in the <code class="language-plaintext highlighter-rouge">Descending</code> data type by preceding it with the
<code class="language-plaintext highlighter-rouge">DESC</code> constructor.  Thus:</p>

<!-- morel skip
(* Old syntax *)
from e in scott.emps
  order e.job, e.sal desc;

(* New syntax *)
from e in scott.emps
  order (e.job, DESC e.sal);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Old syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="kr">desc</span><span class="p">;</span>

<span class="c">(*</span><span class="cm"> New syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">);</span></div>
</div>

<p>You can now sort by any data type, including tuples, records,
sum-types such as <code class="language-plaintext highlighter-rouge">Option</code> and <code class="language-plaintext highlighter-rouge">Descending</code>, lists, bags, and any
combination thereof.</p>
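
<p>Because <code class="language-plaintext highlighter-rouge">DESC</code> is a constructor rather than special syntax, it
should be possible to apply it to a whole composite key as well as to
a single field. The following is an untested sketch, assuming that
<code class="language-plaintext highlighter-rouge">DESC</code> may wrap a tuple and that doing so reverses the ordering of
the tuple as a whole:</p>

<!-- morel skip
(* Sketch: assumes DESC may wrap an entire tuple key. *)
from e in scott.emps
  order DESC (e.job, e.sal);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Sketch: assumes DESC may wrap an entire tuple key. *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">);</span></div>
</div>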

<p>To achieve the trivial sort, you can sort by any constant value, such
as the integer <code class="language-plaintext highlighter-rouge">0</code> or the <code class="language-plaintext highlighter-rouge">Option</code> constructor <code class="language-plaintext highlighter-rouge">NONE</code>, but
conventionally you would sort by the empty tuple <code class="language-plaintext highlighter-rouge">()</code>:</p>

<!-- morel
from e in scott.emps
  yield e.ename
  order ();
> val it =
>   ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK","SCOTT","KING",
>    "TURNER","ADAMS","JAMES",...] : string list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span>
  <span class="kr">order</span> <span class="p">();</span></div>
<div class="code-output">val it =
  ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK","SCOTT","KING",
   "TURNER","ADAMS","JAMES",...] : string list</div>
</div>

<p>The key thing is that the result is a <code class="language-plaintext highlighter-rouge">list</code>.  The elements are in
arbitrary order (because any order is consistent with the empty sort
key) but in converting the collection to a <code class="language-plaintext highlighter-rouge">list</code> the arbitrary order
has become frozen and repeatable.</p>

<h2 id="4-atomic-yield-steps">4. Atomic yield steps</h2>

<p>At any step in a Morel query, there are generally several named fields
you can use to reference parts of the current row.  For example, the
<code class="language-plaintext highlighter-rouge">where</code> step in the following query refers to both fields, <code class="language-plaintext highlighter-rouge">i</code> and
<code class="language-plaintext highlighter-rouge">j</code>.</p>

<!-- morel silent
Sys.set ("output", "tabular");
> val it = () : unit
-->
<!-- morel
from i in [1, 2, 3],
    j in [4, 5, 6]
  where i + j > 7;
> i j
> - -
> 2 6
> 3 5
> 3 6
>
> val it : {i:int, j:int} list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span>
    <span class="nv">j</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">where</span> <span class="n">i</span> <span class="o">+</span> <span class="n">j</span> <span class="o">&gt;</span> <span class="mi">7</span><span class="p">;</span></div>
<div class="code-output">i j
- -
2 6
3 5
3 6

val it : {i:int, j:int} list</div>
</div>

<p>But there is one circumstance where a step does not produce any named
fields: a <code class="language-plaintext highlighter-rouge">yield</code> whose expression is not a record, what we call an
“atomic yield”. Here is an example:</p>

<!-- morel skip
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j;
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span>
    <span class="nv">j</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="n">i</span> <span class="o">+</span> <span class="n">j</span><span class="p">;</span></div>
</div>

<p>That query is valid, but suppose we wished to sort or filter the
results.  If we added an <code class="language-plaintext highlighter-rouge">order</code> or <code class="language-plaintext highlighter-rouge">where</code> step it would have no way
to refer to the current row. We allowed atomic yields because we
needed queries with non-record elements, but we made a rule that the
atomic yield had to be the last step.</p>

<p>That restriction was becoming more of a burden, and the final straw
was ordered/unordered queries, which often end in <code class="language-plaintext highlighter-rouge">order</code> or
<code class="language-plaintext highlighter-rouge">unorder</code>. So we decided to fix the problem.</p>

<p>We
<a href="https://github.com/hydromatic/morel/issues/265">added a new expression, <code class="language-plaintext highlighter-rouge">current</code></a>,
that refers to the current element. (It is only available in query
steps, but you can use it inside a sub-expression or sub-query.)  If
the value is atomic, <code class="language-plaintext highlighter-rouge">current</code> is that value; if there are named
fields, <code class="language-plaintext highlighter-rouge">current</code> is a record consisting of those fields. (In the
previous example, <code class="language-plaintext highlighter-rouge">current</code> would be equivalent to <code class="language-plaintext highlighter-rouge">{i, j}</code>.)</p>

<p>If a <code class="language-plaintext highlighter-rouge">yield</code> is atomic but the expression has a clear name, as in
<code class="language-plaintext highlighter-rouge">yield i</code> or <code class="language-plaintext highlighter-rouge">yield e.deptno</code>, you can also use that name.  (The
expression is still considered atomic, and the result of the query
will be a collection of that type, not a collection of records.)</p>

<p>Here are some examples of <code class="language-plaintext highlighter-rouge">current</code> in action.</p>

<!-- morel
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j
  order DESC current;
> val it = [9,8,8,7,7,7,6,6,5] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span>
    <span class="nv">j</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="n">i</span> <span class="o">+</span> <span class="n">j</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="kr">current</span><span class="p">;</span></div>
<div class="code-output">val it = [9,8,8,7,7,7,6,6,5] : int list</div>
</div>

<!-- morel
from maker in ["ford", "ferrari"],
    color in ["red", "green"]
  order current.color;
> color maker
> ----- -------
> green ford
> green ferrari
> red   ford
> red   ferrari
>
> val it : {color:string, maker:string} list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">maker</span> <span class="kr">in</span> <span class="p">[</span><span class="s2">"ford"</span><span class="p">,</span> <span class="s2">"ferrari"</span><span class="p">],</span>
    <span class="nv">color</span> <span class="kr">in</span> <span class="p">[</span><span class="s2">"red"</span><span class="p">,</span> <span class="s2">"green"</span><span class="p">]</span>
  <span class="kr">order</span> <span class="kr">current</span><span class="p">.</span><span class="n">color</span><span class="p">;</span></div>
<div class="code-output">color maker
----- -------
green ford
green ferrari
red   ford
red   ferrari

val it : {color:string, maker:string} list</div>
</div>

<!-- morel
from i in [1, 2, 3, 4]
  yield 4 * (i mod 2) + (i div 2)
  order current;
> val it = [1,2,4,5] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="mi">4</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="kr">mod</span> <span class="mi">2</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="n">i</span> <span class="kr">div</span> <span class="mi">2</span><span class="p">)</span>
  <span class="kr">order</span> <span class="kr">current</span><span class="p">;</span></div>
<div class="code-output">val it = [1,2,4,5] : int list</div>
</div>

<!-- morel
from e in scott.emps
  yield e.deptno
  distinct
  order current;
> val it = [10,20,30] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span>
  <span class="kr">distinct</span>
  <span class="kr">order</span> <span class="kr">current</span><span class="p">;</span></div>
<div class="code-output">val it = [10,20,30] : int list</div>
</div>

<!-- morel
from e in scott.emps
  yield e.deptno
  distinct
  order deptno;
> val it = [10,20,30] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span>
  <span class="kr">distinct</span>
  <span class="kr">order</span> <span class="n">deptno</span><span class="p">;</span></div>
<div class="code-output">val it = [10,20,30] : int list</div>
</div>

<h2 id="5-set-operators-in-pipelines">5. Set operators in pipelines</h2>

<p>The set operators (<code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">intersect</code> and <code class="language-plaintext highlighter-rouge">except</code>) were previously
available via functions but now have
<a href="https://github.com/hydromatic/morel/issues/253">dedicated steps</a> in
the query pipeline.</p>

<p>The steps have slightly different semantics for ordered and unordered
collections, and have an optional <code class="language-plaintext highlighter-rouge">distinct</code> keyword to eliminate
duplicates.</p>

<p>For example, here is a query that finds all employees in departments
10 and 20, but excludes those who are managers or clerks:</p>

<!-- morel skip
from e in scott.emps
  where e.deptno = 10
  union (from e in scott.emps where e.deptno = 20)
  except (from e in scott.emps where e.job = "MANAGER"),
     (from e in scott.emps where e.job = "CLERK");
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">10</span>
  <span class="kr">union</span> <span class="p">(</span><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="mi">20</span><span class="p">)</span>
  <span class="kr">except</span> <span class="p">(</span><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"MANAGER"</span><span class="p">),</span>
     <span class="p">(</span><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"CLERK"</span><span class="p">);</span></div>
</div>
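
<p>As a smaller illustration of the optional <code class="language-plaintext highlighter-rouge">distinct</code> keyword, here
is a sketch that applies <code class="language-plaintext highlighter-rouge">union distinct</code> to two literal lists. The
lists are made up for the example and the query has not been run, so
no output is shown; the intent is that each value appears at most once
in the result:</p>

<!-- morel skip
(* Sketch: "distinct" asks the step to eliminate duplicates. *)
from i in [1, 2, 2, 3]
  union distinct [2, 3, 4];
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Sketch: "distinct" asks the step to eliminate duplicates. *)</span>
<span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span>
  <span class="kr">union</span> <span class="kr">distinct</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">];</span></div>
</div>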

<p>If you have ever wondered about the semantics of <code class="language-plaintext highlighter-rouge">intersect</code> and
<code class="language-plaintext highlighter-rouge">except</code> with duplicates, wonder no more!
<a href="/draft-blog3/2025/06/03/intersect-fractions.html">INTERSECT ALL, EXCEPT ALL, and the arithmetic of fractions</a>
explains everything using a fun example.</p>

<h2 id="6-string-and-char-structures">6. String and Char structures</h2>

<p>Morel now includes complete
<a href="https://github.com/hydromatic/morel/issues/279"><code class="language-plaintext highlighter-rouge">String</code></a> and
<a href="https://github.com/hydromatic/morel/issues/264"><code class="language-plaintext highlighter-rouge">Char</code></a> structures
following the
<a href="https://smlfamily.github.io/Basis/">Standard ML Basis Library</a>
specification.</p>

<p>This gives you comprehensive text manipulation capabilities:</p>

<!-- morel
String.size "hello world";
> val it = 11 : int

String.substring ("hello world", 6, 5);
> val it = "world" : string

String.tokens (fn c => c = #" ") "hello world morel";
> val it = ["hello","world","morel"] : string list

Char.isAlpha #"a";
> val it = true : bool

Char.toUpper #"a";
> val it = #"A" : char

String.map Char.toUpper "hello";
> val it = "HELLO" : string
-->

<div class="code-block">
<div class="code-input"><span class="nn">String</span><span class="p">.</span><span class="n">size</span> <span class="s2">"hello world"</span><span class="p">;</span></div>
<div class="code-output">val it = 11 : int</div>
<div class="code-input">
<span class="nn">String</span><span class="p">.</span><span class="n">substring</span> <span class="p">(</span><span class="s2">"hello world"</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">);</span></div>
<div class="code-output">val it = "world" : string</div>
<div class="code-input">
<span class="nn">String</span><span class="p">.</span><span class="n">tokens</span> <span class="p">(</span><span class="kr">fn</span> <span class="n">c</span> <span class="o">=&gt;</span> <span class="n">c</span> <span class="p">=</span> #<span class="s2">" "</span><span class="p">)</span> <span class="s2">"hello world morel"</span><span class="p">;</span></div>
<div class="code-output">val it = ["hello","world","morel"] : string list</div>
<div class="code-input">
<span class="nn">Char</span><span class="p">.</span><span class="n">isAlpha</span> #<span class="s2">"a"</span><span class="p">;</span></div>
<div class="code-output">val it = true : bool</div>
<div class="code-input">
<span class="nn">Char</span><span class="p">.</span><span class="n">toUpper</span> #<span class="s2">"a"</span><span class="p">;</span></div>
<div class="code-output">val it = #"A" : char</div>
<div class="code-input">
<span class="nn">String</span><span class="p">.</span><span class="n">map</span> <span class="nn">Char</span><span class="p">.</span><span class="n">toUpper</span> <span class="s2">"hello"</span><span class="p">;</span></div>
<div class="code-output">val it = "HELLO" : string</div>
</div>

<p>Together, these structures cover most everyday text processing, from
basic operations like substring extraction to tokenization and
character classification.</p>
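
<p>A few more functions from the same signatures, shown as a sketch
rather than as verified Morel output. Going by the Basis Library
specification, the three expressions below should evaluate to
<code class="language-plaintext highlighter-rouge">true</code>, <code class="language-plaintext highlighter-rouge">"a-b-c"</code> and <code class="language-plaintext highlighter-rouge">true</code> respectively:</p>

<!-- morel skip
(* Sketch: functions defined by the Standard ML Basis Library. *)
String.isPrefix "hello" "hello world";
String.concatWith "-" ["a", "b", "c"];
Char.isDigit #"7";
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Sketch: functions defined by the Standard ML Basis Library. *)</span>
<span class="nn">String</span><span class="p">.</span><span class="n">isPrefix</span> <span class="s2">"hello"</span> <span class="s2">"hello world"</span><span class="p">;</span>
<span class="nn">String</span><span class="p">.</span><span class="n">concatWith</span> <span class="s2">"-"</span> <span class="p">[</span><span class="s2">"a"</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">,</span> <span class="s2">"c"</span><span class="p">];</span>
<span class="nn">Char</span><span class="p">.</span><span class="n">isDigit</span> #<span class="s2">"7"</span><span class="p">;</span></div>
</div>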

<h2 id="7-breaking-changes">7. Breaking changes</h2>

<p>This release includes some breaking changes to be aware of.</p>

<h3 id="database-schema-updates">Database schema updates</h3>

<p>The <code class="language-plaintext highlighter-rouge">scott</code> sample database now uses
<a href="https://github.com/hydromatic/morel/issues/255">pluralized table names</a>:
the <code class="language-plaintext highlighter-rouge">emps</code> value maps to the <code class="language-plaintext highlighter-rouge">EMP</code> table, and <code class="language-plaintext highlighter-rouge">depts</code> to the
<code class="language-plaintext highlighter-rouge">DEPT</code> table.</p>

<!-- morel skip
(* Old *)
from e in scott.emp
  join d in scott.dept on e.deptno = d.deptno;

(* New *)
from e in scott.emps
  join d in scott.depts on e.deptno = d.deptno;
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Old *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emp</span>
  <span class="kr">join</span> <span class="nv">d</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">dept</span> <span class="kr">on</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="nn">d</span><span class="p">.</span><span class="n">deptno</span><span class="p">;</span>

<span class="c">(*</span><span class="cm"> New *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">join</span> <span class="nv">d</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">depts</span> <span class="kr">on</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="nn">d</span><span class="p">.</span><span class="n">deptno</span><span class="p">;</span></div>
</div>

<p>This change aligns with the modern programming convention that
collections have plural names.</p>

<h3 id="type-based-orderings">Type-based orderings</h3>

<p>The previous <code class="language-plaintext highlighter-rouge">order</code> syntax no longer works.</p>

<p>You should convert a trailing <code class="language-plaintext highlighter-rouge">desc</code> to a preceding <code class="language-plaintext highlighter-rouge">DESC</code>:</p>

<!-- morel skip
(* Old syntax *)
from e in scott.emps
  order e.sal desc;

(* New syntax *)
from e in scott.emps
  order DESC e.sal;
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Old syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="kr">desc</span><span class="p">;</span>

<span class="c">(*</span><span class="cm"> New syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">;</span></div>
</div>

<p>and put parentheses around composite orderings:</p>

<!-- morel skip
(* Old syntax *)
from e in scott.emps
  order e.job, e.sal desc;

(* New syntax *)
from e in scott.emps
  order (e.job, DESC e.sal);
-->

<div class="code-block">
<div class="code-input"><span class="c">(*</span><span class="cm"> Old syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="kr">desc</span><span class="p">;</span>

<span class="c">(*</span><span class="cm"> New syntax *)</span>
<span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">);</span></div>
</div>

<h2 id="conclusion">Conclusion</h2>

<p>Release 0.7.0 represents a major evolution in Morel’s
capabilities. Extensions to the query language, type system, and
standard library make Morel a good solution for a wide range of
data processing tasks, from simple queries to complex data
transformations.</p>

<p>As always, you can get started with Morel by visiting
<a href="https://github.com/hydromatic/morel">GitHub</a>.
For more background, read about its
<a href="/draft-blog3/2020/02/25/morel-a-functional-language-for-data.html">goals</a>
and <a href="/draft-blog3/2020/03/03/morel-basics.html">basic language</a>,
and find a full definition of the language in the
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md">query reference</a>
and the
<a href="https://github.com/hydromatic/morel/blob/main/docs/reference.md">language reference</a>.</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">I&#39;m pleased to announce release 0.7 of <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>! This is a huge release, adding support for ordered/unordered data, set operators, and revised order syntax. A major rework of Morel&#39;s type inference algorithm delivered function overloading. <a href="https://t.co/hERffT3Kxn">https://t.co/hERffT3Kxn</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1931931352729079968?ref_src=twsrc%5Etfw">June 9, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2025-06-08-morel-release-0-7-0.md">has been updated</a>.</p>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[I am pleased to announce Morel release 0.7.0, just one month after release 0.6.0.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Morel release 0.7.0</title><link href="http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0.md-save" rel="alternate" type="text/html" title="Morel release 0.7.0" /><published>2025-06-08T13:00:00-07:00</published><updated>2025-06-08T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/06/08/morel-release-0-7-0.md-save"><![CDATA[I am pleased to announce Morel
[release 0.7.0](https://github.com/hydromatic/morel/blob/main/HISTORY.md#070--2025-06-07),
just one month after
[release 0.6.0](https://github.com/hydromatic/morel/blob/main/HISTORY.md#060--2025-05-02).

This release has actually been under development for a long time.
[Ordered and unordered collections and queries](#1-ordered-and-unordered-collections-and-queries),
which are the centerpiece of this release, required major changes to
the type inference algorithm, not to mention a new
[data type](https://github.com/hydromatic/morel/issues/235) (`bag`),
[query step](https://github.com/hydromatic/morel/issues/277) (`unorder`),
and
[expression](https://github.com/hydromatic/morel/issues/276) (`ordinal`).
The type inference changes have been under development for six months
(during which time there were two other Morel releases), and were so
extensive that we got
[function overloading](#2-function-overloading) practically for free.

There are other changes to query syntax:
[sorting on expressions](#3-sorting-on-expressions),
[atomic `yield` steps](#4-atomic-yield-steps), and
[set operators in pipelines](#5-set-operators-in-pipelines).

Morel aims to be a solid implementation of Standard ML and a good
general-purpose programming language, in addition to being a
revolutionary query language, which means gradually completing our
implementation of Standard ML's
[Basis Library](https://smlfamily.github.io/Basis/). In this release we
have completed the
[`String` and `Char` structures](#6-string-and-char-structures).

Let's explore the key features. For complete details, see the
[official release notes](https://github.com/hydromatic/morel/blob/main/HISTORY.md#070--2025-06-07).

## 1. Ordered and unordered collections and queries

The biggest change in 0.7.0 is the introduction of
[ordered and unordered collections and queries](https://github.com/hydromatic/morel/issues/273).
Previously, every query was over a `list` type, an ordered collection
that allows duplicates.

But saying that every collection and query is over a `list` type
is a white lie. Consider this query:

<!-- morel skip
from e in scott.emps
  where e.sal > 1000.0
  yield e.ename;
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="n">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span> <span class="o">&gt;</span> <span class="mi">1000</span><span class="p">.</span><span class="mi">0</span>
  <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">;</span></div>
</div>


The collection `scott.emps` maps to the `EMP` table in the `scott`
database, and Morel's goal is to push as much of the processing as
possible to where the data resides. In this case, Morel can generate
the SQL query

```sql
SELECT ENAME
FROM SCOTT.EMP
WHERE SAL > 1000.0;
```

SQL makes no guarantees about the order of results. If you execute
the query twice, a DBMS is free to return the results in a different
order each time. So Morel is being dishonest if it says that the result
is a `list`.

Could we redefine `list` so that its iteration order is undefined?
Yes, but then we would be short-changing queries such as

<!-- morel
from i in ["a", "b"],
    j in [1, 2, 3]
  yield (i, j);
> val it = [("a",1),("a",2),("a",3),("b",1),("b",2),("b",3)]
>   : (string * int) list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="n">i</span> <span class="kr">in</span> <span class="p">[</span><span class="s2">"a"</span><span class="p">,</span> <span class="s2">"b"</span><span class="p">],</span>
    <span class="n">j</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">);</span></div>
<div class="code-output">val it = [("a",1),("a",2),("a",3),("b",1),("b",2),("b",3)]
  : (string * int) list</div>
</div>


which do have a defined order.

The fact is -- even though the relational model tells us it ain't so
-- some data sets are ordered, and some are unordered. Adding distinct
`bag` and `list` types, relational operators that can work on both,
and relational operators to convert between them, was the way to go.
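
To make the distinction concrete, here is a short sketch of the two
conversions, using the `unorder` and `order` steps: discarding the
order of a list gives a `bag`, and sorting a `bag` gives a `list`.

```sml
(* "unorder" forgets the order of a list, giving a bag. *)
from i in [3, 1, 4, 1, 5]
  unorder;
(*[> val it = [3,1,4,1,5] : int bag]*)

(* "order" sorts a bag, giving a list. *)
from i in bag [3, 1, 4, 1, 5]
  order DESC i;
(*[> val it = [5,4,3,1,1] : int list]*)
```

The bag happens to print its elements in their original order, but
nothing guarantees that order.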

The features that we implemented are described in the article
"[Ordered and unordered data](http://blog.hydromatic.net/2025/06/06/ordered-unordered.html)".

## 2. Function overloading

In Standard ML, and in Morel until recently, a name could only have
one binding.  Functions are values, and therefore inhabit the same
namespace as regular values.  If I declare `x` to be an `int` value

<!-- morel skip
val x = 42;
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">x</span> <span class="p">=</span> <span class="mi">42</span><span class="p">;</span></div>
</div>


and then later try to declare `x` to be a function

<!-- morel skip
val x = fn y => y + 1;
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">x</span> <span class="p">=</span> <span class="kr">fn</span> <span class="n">y</span> <span class="o">=&gt;</span> <span class="n">y</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span></div>
</div>


then the previous declaration of `x` is no longer accessible.
<!-- morel fail
int z = x - 2;
> stdIn:1.5 Error: unbound variable or constructor: z
>   raised at: stdIn:1.5
-->

<div class="code-block">
<div class="code-input"><span class="n">int</span> <span class="n">z</span> <span class="p">=</span> <span class="n">x</span> <span class="o">-</span> <span class="mi">2</span><span class="p">;</span></div>
<div class="code-output">stdIn:1.5 Error: unbound variable or constructor: z
  raised at: stdIn:1.5</div>
</div>


To create
[overloaded functions](https://github.com/hydromatic/morel/issues/237),
we need to declare that an identifier is special; we do this using the
new `over` keyword:

<!-- morel
over f;
> over f
-->

<div class="code-block">
<div class="code-input"><span class="kr">over</span> <span class="n">f</span><span class="p">;</span></div>
<div class="code-output">over f</div>
</div>


Now we can define several instances of `f`:

<!-- morel
val inst f = fn (x : int, y : int) => x + y;
> val f = fn : int * int -> int
val inst f = fn list => length list;
> val f = fn : 'a list -> int
val inst f = fn SOME x => x ^ "!" | NONE => ":(";
> val f = fn : string option -> string
-->

All must be functions, because the overloads are resolved based on
the type of the first argument.

Calls to `f` will be resolved based on the types of the arguments:
```sml
(* Call the "int * int -> int" overload. *)
f (7, 8);
(*[> val it = 15 : int]*)

(* Call the "'a list -> int" overload. *)
f ["a", "b", "c"];
(*[> val it = 3 : int]*)
f [1, 2, 3, 4];
(*[> val it = 4 : int]*)
f [];
(*[> val it = 0 : int]*)

(* Call the "string option -> string" overload. *)
f (SOME "happy");
(*[> val it = "happy!" : string]*)
f NONE;
(*[> val it = ":(" : string]*)

(* No overloads match "int option" or "int * int * int" arguments. *)
f (SOME 42);
(*[> 0.0-0.0 Error: Cannot deduce type: no valid overloads
>   raised at: 0.0-0.0]*)
f (1, 2, 3);
(*[> 0.0-0.0 Error: Cannot deduce type: no valid overloads
>   raised at: 0.0-0.0]*)
```

## 3. Sorting on expressions

There are only a few places in Morel syntax where you do not use an
expression, and the `order` step used to be one of them.  Previously,
`order` was followed by a list of "order items", each an expression
optionally followed by `desc`. The items were separated by commas, and
the list could not be empty.

The commas were a problem. In the expression

```sml
foo (from i in [1, 2, 3] order i desc, j);
```

it is not clear whether `j` is a second argument for the call to the
function `foo` or the second item in the `order` clause.

Another problem was the fact that the `order` clause could not be
empty. The
[ordered/unordered collections](#1-ordered-and-unordered-collections-and-queries)
feature introduced an `unorder` step to convert a `list` to a `bag`,
and we need the opposite of that, a trivial sort whose
key has the same value for every element.

The answer was to
[make the argument to `order` an expression](https://github.com/hydromatic/morel/issues/244).
A composite sort specification is now a tuple, still separated by
commas, but now enclosed in parentheses.  If a sort key is descending,
you now wrap it in the `Descending` data type by preceding it with
`DESC`.  Thus:

```sml
(* Old syntax *)
from e in scott.emps
  order e.job, e.sal desc;

(* New syntax *)
from e in scott.emps
  order (e.job, DESC e.sal);
```

You can now sort by any data type, including tuples, records,
sum-types such as `Option` and `Descending`, lists, bags, and any
combination thereof.
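
For instance, a sort key can be computed rather than read from a
field. The following is an illustrative sketch, not an excerpt from
the release notes: it sorts by a tuple whose first component is the
parity of `i` and whose second is `i` descending. Assuming tuples
compare component by component, the even number 4 should come first,
followed by the odd numbers 5, 3, 1, 1.

```sml
(* Illustrative sketch: sort by parity, then by value descending. *)
from i in [3, 1, 4, 1, 5]
  order (i mod 2, DESC i);
```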

To achieve the trivial sort, you can sort by any constant value, such
as the integer `0` or the `Option` constructor `NONE`, but
conventionally you would sort by the empty tuple `()`:

```sml
from e in scott.emps
  yield e.ename
  order ();
(*[> val it =
>   ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK",
>    "SCOTT","KING","TURNER","ADAMS","JAMES","FORD","MILLER"]
>   : string list]*)
```

The key thing is that the result is a `list`.  The elements are in
arbitrary order (because any order is consistent with the empty sort
key) but in converting the collection to a `list` the arbitrary order
has become frozen and repeatable.

## 4. Atomic yield steps

At any step in a Morel query, there are generally several named fields
you can use to reference parts of the current row.  For example, the
`where` step in the following query refers to both fields, `i` and
`j`.

```sml
from i in [1, 2, 3],
    j in [4, 5, 6]
  where i + j > 7;

(*[> i j
> - -
> 2 6
> 3 5
> 3 6
>
> val it : {i:int, j:int} list]*)
```

But there is one circumstance where a step does not produce any named
fields: a `yield` whose expression is not a record, what we call an
"atomic yield". Here is an example:

```sml
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j;
```

That query is valid, but suppose we wished to sort or filter the
results.  If we added an `order` or `where` step it would have no way
to refer to the current row. We allowed atomic yields because we
needed queries with non-record elements, but we made a rule that the
atomic yield had to be the last step.

That restriction was becoming more of a burden, and the final straw
was ordered/unordered queries, which often end in `order` or
`unorder`. So we decided to fix the problem.

We
[added a new expression, `current`](https://github.com/hydromatic/morel/issues/265),
that refers to the current element. (It is only available in query
steps, but you can use it inside a sub-expression or sub-query.)  If
the value is atomic, `current` is that value; if there are named
fields, `current` is a record consisting of those fields. (In the
previous example, `current` would be equivalent to `{i, j}`.)

If a `yield` is atomic but the expression has a clear name, as in
`yield i` or `yield e.deptno`, you can also use that name.  (The
expression is still considered atomic, and the result of the query
will be a collection of that type, not a collection of records.)

Here are some examples of `current` in action.

```sml
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j
  order DESC current;
(*[> val it = [9,8,8,7,7,7,6,6,5] : int list]*)

from maker in ["ford", "ferrari"],
    color in ["red", "green"]
  order current.color;
(*[> color maker
> ----- -------
> green ford
> green ferrari
> red   ford
> red   ferrari
>
> val it : {color:string, maker:string} list]*)

from i in [1, 2, 3, 4]
  yield 4 * (i mod 2) + (i div 2)
  order current;
(*[> val it = [1,2,4,5] : int list]*)

from e in scott.emps
  yield e.deptno
  distinct
  order current;
(*[> val it = [10,20,30] : int list]*)

from e in scott.emps
  yield e.deptno
  distinct
  order deptno;
(*[> val it = [10,20,30] : int list]*)
```
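
The same expression lets a `where` step follow an atomic yield, which
was the other case the old restriction ruled out. As a sketch (our
illustration, with output omitted), the following query should keep
only the sums greater than 7, namely 8, 8 and 9:

```sml
(* Illustrative sketch: filter the results of an atomic yield. *)
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j
  where current > 7;
```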

## 5. Set operators in pipelines

The set operators (`union`, `intersect` and `except`) were previously
available via functions but now have
[dedicated steps](https://github.com/hydromatic/morel/issues/253) in
the query pipeline.

The steps have slightly different semantics for ordered and unordered
collections, and have an optional `distinct` keyword to eliminate
duplicates.

For example, here is a query that finds all employees in departments
10 and 20, but excludes those who are managers or clerks:

```sml
from e in scott.emps
  where e.deptno = 10
  union (from e in scott.emps where e.deptno = 20)
  except (from e in scott.emps where e.job = "MANAGER"),
     (from e in scott.emps where e.job = "CLERK");
```

If you have ever wondered about the semantics of `intersect` and
`except` with duplicates, wonder no more!
[INTERSECT ALL, EXCEPT ALL, and the arithmetic of fractions](/draft-blog3/2025/06/03/intersect-fractions.html)
explains everything using a fun example.

## 6. String and Char structures

Morel now includes complete
[`String`](https://github.com/hydromatic/morel/issues/279) and
[`Char`](https://github.com/hydromatic/morel/issues/264) structures
following the
[Standard ML Basis Library](https://smlfamily.github.io/Basis/)
specification.

Here are a few of the functions that are now available:

```sml
String.size "hello world";
(*[> val it = 11 : int]*)

String.substring ("hello world", 6, 5);
(*[> val it = "world" : string]*)

String.tokens (fn c => c = #" ") "hello world morel";
(*[> val it = ["hello","world","morel"] : string list]*)

Char.isAlpha #"a";
(*[> val it = true : bool]*)

Char.toUpper #"a";
(*[> val it = #"A" : char]*)

String.map Char.toUpper "hello";
(*[> val it = "HELLO" : string]*)
```

These structures cover the text-processing functions defined by the
Basis Library, from basic operations such as substring extraction to
tokenization and character classification.

## 7. Breaking changes

This release includes some breaking changes to be aware of.

### Database schema updates

The `scott` sample database now uses
[pluralized table names](https://github.com/hydromatic/morel/issues/255),
mapping the `emps` value to the `EMP` table, and `depts` to the
`DEPT` table.

```sml
(* Old *)
from e in scott.emp
  join d in scott.dept on e.deptno = d.deptno;

(* New *)
from e in scott.emps
  join d in scott.depts on e.deptno = d.deptno;
```

This change aligns with the modern programming convention that
collections have plural names.

### Type-based orderings

The previous `order` syntax no longer works.

You should convert a following `desc` to a preceding `DESC`:

```sml
(* Old syntax *)
from e in scott.emps
  order e.sal desc;

(* New syntax *)
from e in scott.emps
  order DESC e.sal;
```

and put parentheses around composite orderings:

```sml
(* Old syntax *)
from e in scott.emps
  order e.job, e.sal desc;

(* New syntax *)
from e in scott.emps
  order (e.job, DESC e.sal);
```

## Conclusion

Release 0.7.0 represents a major evolution in Morel's
capabilities. Extensions to the query language, type system, and
standard library make Morel a good solution for a wide range of
data processing tasks, from simple queries to complex data
transformations.

As always, you can get started with Morel by visiting
[GitHub](https://github.com/hydromatic/morel).
For more background, read about its
[goals](/draft-blog3/2020/02/25/morel-a-functional-language-for-data.html)
and [basic language](/draft-blog3/2020/03/03/morel-basics.html),
and find a full definition of the language in the
[query reference](https://github.com/hydromatic/morel/blob/main/docs/query.md)
and the
[language reference](https://github.com/hydromatic/morel/blob/main/docs/reference.md).

If you have comments, please reply on
[Bluesky @julianhyde.bsky.social](https://bsky.app/profile/julianhyde.bsky.social)
or Twitter:

<div data_dnt="true">
<div class='jekyll-twitter-plugin'><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">I&#39;m pleased to announce release 0.7 of <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>! This is a huge release, adding support for ordered/unordered data, set operators, and revised order syntax. A major rework of Morel&#39;s type inference algorithm delivered function overloading. <a href="https://t.co/hERffT3Kxn">https://t.co/hERffT3Kxn</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1931931352729079968?ref_src=twsrc%5Etfw">June 9, 2025</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<!--
This article
[has been updated](https://github.com/julianhyde/share/commits/main/blog/_posts/2025-06-08-morel-release-0-7-0.md-save).
-->]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[I am pleased to announce Morel [release 0.7.0](https://github.com/hydromatic/morel/blob/main/HISTORY.md#070--2025-06-07), just one month after [release 0.6.0](https://github.com/hydromatic/morel/blob/main/HISTORY.md#060--2025-05-02).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ordered and unordered data</title><link href="http://0.0.0.0:4000/draft-blog3/2025/06/06/ordered-unordered.html" rel="alternate" type="text/html" title="Ordered and unordered data" /><published>2025-06-06T13:00:00-07:00</published><updated>2025-06-06T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/06/06/ordered-unordered</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/06/06/ordered-unordered.html"><![CDATA[<p>Despite what the relational model says, some data is <em>ordered</em>.</p>

<p>I’m not talking about <em>sorted</em> data. If you sort a collection,
applying a comparator function to its elements, then you have no
more information than you had before.</p>

<p>No, the integer list</p>

<!-- morel skip
[3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
> val it = [3,1,4,1,5,9,2,6,5,3] : int list
-->

<div class="code-block">
<div class="code-input"><span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span></div>
<div class="code-output">val it = [3,1,4,1,5,9,2,6,5,3] : int list</div>
</div>

<p>and the string list</p>

<!-- morel skip
["Shall I compare thee to a summer's day?",
  "Thou art more lovely and more temperate",
  "Rough winds do shake the darling buds of May",
  "And summer's lease hath all too short a date"]
> val it =
>   ["Shall I compare thee to a summer's day?",
>    "Thou art more lovely and more temperate",
>    "Rough winds do shake the darling buds of May",
>    "And summer's lease hath all too short a date"] : string list
-->

<div class="code-block">
<div class="code-input"><span class="p">[</span><span class="s2">"Shall I compare thee to a summer's day?"</span><span class="p">,</span>
  <span class="s2">"Thou art more lovely and more temperate"</span><span class="p">,</span>
  <span class="s2">"Rough winds do shake the darling buds of May"</span><span class="p">,</span>
  <span class="s2">"And summer's lease hath all too short a date"</span><span class="p">]</span></div>
<div class="code-output">val it =
  ["Shall I compare thee to a summer's day?",
   "Thou art more lovely and more temperate",
   "Rough winds do shake the darling buds of May",
   "And summer's lease hath all too short a date"] : string list</div>
</div>

<p>depend on the order of their elements for their meaning.</p>

<p>But of course, some data is <em>unordered</em>, for good reason. A relational
database would be foolish to guarantee that if you write rows into a
table in a particular order, they will be read back in the same
order. Such a guarantee would seriously limit the database’s
scalability.</p>

<p>This post is about how we allow ordered and unordered data to coexist
in <a href="https://github.com/hydromatic/morel">Morel</a>.</p>

<p>We achieved this with a collection of new features, including
<a href="https://github.com/hydromatic/morel/issues/235">adding a <code class="language-plaintext highlighter-rouge">bag</code> type</a>,
the
<a href="https://github.com/hydromatic/morel/issues/273">ordered relational operators</a>,
the
<a href="https://github.com/hydromatic/morel/issues/276"><code class="language-plaintext highlighter-rouge">ordinal</code> keyword</a>,
and a new
<a href="https://github.com/hydromatic/morel/issues/277"><code class="language-plaintext highlighter-rouge">unorder</code> step</a>.
All of these features will appear shortly in Morel release 0.7.</p>

<h2 id="list-and-bag-types">List and bag types</h2>

<p>As a functional query language, Morel spans the worlds of database and
functional programming.</p>

<p>Databases’ fundamental type, the relation, is an unordered collection
of records.  (Though curiously, modern SQL allows columns to contain
“nested tables”, which can be either of the ordered <code class="language-plaintext highlighter-rouge">ARRAY</code> type or
the unordered <code class="language-plaintext highlighter-rouge">MULTISET</code> type.)</p>

<p>Functional programming languages’ fundamental type is the list, an
ordered type. Functional programs are often defined by structural
induction on lists.  For example, the function</p>

<!-- morel
fun allPositive [] = true
  | allPositive (x::xs) = x > 0 andalso allPositive xs;
> val allPositive = fn : int list -> bool
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">allPositive</span> <span class="p">[]</span> <span class="p">=</span> <span class="n">true</span>
  <span class="p">|</span> <span class="n">allPositive</span> <span class="p">(</span><span class="n">x</span><span class="o">::</span><span class="n">xs</span><span class="p">)</span> <span class="p">=</span> <span class="n">x</span> <span class="o">&gt;</span> <span class="mi">0</span> <span class="kr">andalso</span> <span class="n">allPositive</span> <span class="n">xs</span><span class="p">;</span></div>
<div class="code-output">val allPositive = fn : int list -&gt; bool</div>
</div>

<p>inductively defines that a list of numbers is “all-positive” if it is
empty, or if its first element is positive and the rest of the list is
“all-positive”. This kind of inductive definition requires a firm
distinction between the first element of a list and the rest of the
list, a distinction that is not present in an unordered collection.</p>

<p>So, Morel needs to support both ordered and unordered collections.</p>

<p>Earlier versions of Morel papered over the difference. All collections
had type <code class="language-plaintext highlighter-rouge">list</code>, even the unordered collections backed by database
tables. Morel’s relational operators produced results in deterministic
order if you applied them to in-memory collections using the
in-process interpreter, but order was not guaranteed when Morel
converted the query to SQL for execution in a DBMS.</p>

<p>To fix the problem, the first step was to add a <code class="language-plaintext highlighter-rouge">bag</code> type.  (Bag is a
synonym for <a href="https://en.wikipedia.org/wiki/Multiset">multiset</a>,
implying a given element may occur more than once, but iteration order
is not defined.) <code class="language-plaintext highlighter-rouge">bag</code> is the unordered counterpart to the ordered
<code class="language-plaintext highlighter-rouge">list</code> type, and has similar operations.</p>

<!-- morel
val b = bag [3, 1, 4, 1, 5];
> val b = [3,1,4,1,5] : int bag
Bag.length b;
> val it = 5 : int
Bag.toList b;
> val it = [3,1,4,1,5] : int list
Bag.fromList [3, 1, 4, 1, 5];
> val it = [3,1,4,1,5] : int bag
-->

<div class="code-block">
<div class="code-input"><span class="kr">val</span> <span class="nv">b</span> <span class="p">=</span> <span class="n">bag</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">];</span></div>
<div class="code-output">val b = [3,1,4,1,5] : int bag</div>
<div class="code-input"><span class="nn">Bag</span><span class="p">.</span><span class="n">length</span> <span class="n">b</span><span class="p">;</span></div>
<div class="code-output">val it = 5 : int</div>
<div class="code-input"><span class="nn">Bag</span><span class="p">.</span><span class="n">toList</span> <span class="n">b</span><span class="p">;</span></div>
<div class="code-output">val it = [3,1,4,1,5] : int list</div>
<div class="code-input"><span class="nn">Bag</span><span class="p">.</span><span class="n">fromList</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">];</span></div>
<div class="code-output">val it = [3,1,4,1,5] : int bag</div>
</div>

<p>Order-dependent operations from the <code class="language-plaintext highlighter-rouge">list</code> type, such as <code class="language-plaintext highlighter-rouge">hd</code> and
<code class="language-plaintext highlighter-rouge">drop</code>, are defined for <code class="language-plaintext highlighter-rouge">bag</code> instances, but they are not guaranteed
to return the same result every time you call them.</p>

<!-- morel
Bag.hd b;
> val it = 3 : int
Bag.drop (b, 2);
> val it = [4,1,5] : int bag
-->

<div class="code-block">
<div class="code-input"><span class="nn">Bag</span><span class="p">.</span><span class="n">hd</span> <span class="n">b</span><span class="p">;</span></div>
<div class="code-output">val it = 3 : int</div>
<div class="code-input"><span class="nn">Bag</span><span class="p">.</span><span class="n">drop</span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="mi">2</span><span class="p">);</span></div>
<div class="code-output">val it = [4,1,5] : int bag</div>
</div>

<p>Collections backed by database tables now have type <code class="language-plaintext highlighter-rouge">bag</code>:</p>

<!-- morel skip
from e in scott.depts;
> deptno dname      loc
> ------ ---------- --------
> 10     ACCOUNTING NEW YORK
> 20     RESEARCH   DALLAS
> 30     SALES      CHICAGO
> 40     OPERATIONS BOSTON
>
> val it : {deptno:int, dname:string, loc:string} bag
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">depts</span><span class="p">;</span></div>
<div class="code-output">deptno dname      loc
------ ---------- --------
10     ACCOUNTING NEW YORK
20     RESEARCH   DALLAS
30     SALES      CHICAGO
40     OPERATIONS BOSTON

val it : {deptno:int, dname:string, loc:string} bag</div>
</div>

<p>(You may notice that the <code class="language-plaintext highlighter-rouge">scott.depts</code> collection, backed by the <code class="language-plaintext highlighter-rouge">DEPT</code>
table of the <code class="language-plaintext highlighter-rouge">SCOTT</code> JDBC data source, has changed its name as well
as its type. It used to be called <code class="language-plaintext highlighter-rouge">scott.dept</code>. Morel collection names
should be plural and lower-case, and improvements to the
<a href="https://github.com/hydromatic/morel/issues/255">name mapping system</a>
make it easier to derive proper collection names.)</p>

<p>Next, we provide relational operators to convert between <code class="language-plaintext highlighter-rouge">list</code> and
<code class="language-plaintext highlighter-rouge">bag</code>.</p>

<h2 id="converting-between-ordered-and-unordered">Converting between ordered and unordered</h2>

<p>Now that queries can reference <code class="language-plaintext highlighter-rouge">list</code> and <code class="language-plaintext highlighter-rouge">bag</code> collections, we need
operators to convert from one to the other. To do this, we use the
existing <code class="language-plaintext highlighter-rouge">order</code> step, and add an <code class="language-plaintext highlighter-rouge">unorder</code> step and an <code class="language-plaintext highlighter-rouge">ordinal</code>
expression.</p>

<p>In previous versions of Morel, the <code class="language-plaintext highlighter-rouge">order</code> step converted a list to a
<code class="language-plaintext highlighter-rouge">list</code> with a different ordering; now its input can be a list <em>or</em> a
bag:</p>

<!-- morel
from i in [3, 1, 4, 1, 5]
  order DESC i;
> val it = [5,4,3,1,1] : int list
from i in bag [3, 1, 4, 1, 5]
  order DESC i;
> val it = [5,4,3,1,1] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="n">i</span><span class="p">;</span></div>
<div class="code-output">val it = [5,4,3,1,1] : int list</div>
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="n">bag</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="n">i</span><span class="p">;</span></div>
<div class="code-output">val it = [5,4,3,1,1] : int list</div>
</div>

<p>If the sort key does not create a total ordering, the results will be
nondeterministic but still a list. For example, we can sort integers
so that even numbers occur before odd numbers</p>

<!-- morel skip
from i in bag [3, 1, 4, 1, 5]
  order i mod 2;
> val it = [4, 1, 5, 1, 3]
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="n">bag</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">order</span> <span class="n">i</span> <span class="kr">mod</span> <span class="mi">2</span><span class="p">;</span></div>
<div class="code-output">val it = [4, 1, 5, 1, 3]</div>
</div>

<p>Or we can sort on the unit value, <code class="language-plaintext highlighter-rouge">()</code>, to convert a bag to a list in arbitrary order:</p>

<!-- morel skip
from i in bag [3, 1, 4, 1, 5]
  order ();
> val it = [5, 4, 1, 1, 3]
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="n">bag</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">order</span> <span class="p">();</span></div>
<div class="code-output">val it = [5, 4, 1, 1, 3]</div>
</div>

<p>To go the opposite direction, the new <code class="language-plaintext highlighter-rouge">unorder</code> step converts a list
to a bag:</p>

<!-- morel
from i in [3, 1, 4, 1, 5]
  unorder;
> val it = [3,1,4,1,5] : int bag
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">unorder</span><span class="p">;</span></div>
<div class="code-output">val it = [3,1,4,1,5] : int bag</div>
</div>

<p>(You are also free to apply <code class="language-plaintext highlighter-rouge">unorder</code> to a <code class="language-plaintext highlighter-rouge">bag</code>; it will have no
effect.)</p>

<p>As we said above, a <code class="language-plaintext highlighter-rouge">bag</code> contains less information than its
corresponding <code class="language-plaintext highlighter-rouge">list</code>. If you plan to convert the <code class="language-plaintext highlighter-rouge">bag</code> to a <code class="language-plaintext highlighter-rouge">list</code>
at a later stage, you need to store the ordering in an extra field.
The new <code class="language-plaintext highlighter-rouge">ordinal</code> expression lets us do this:</p>

<!-- morel
from i in [3, 1, 4, 1, 5]
  yield {i, j = ordinal}
  unorder
  order j
  yield i;
> val it = [3,1,4,1,5] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="p">{</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span> <span class="p">=</span> <span class="kr">ordinal</span><span class="p">}</span>
  <span class="kr">unorder</span>
  <span class="kr">order</span> <span class="n">j</span>
  <span class="kr">yield</span> <span class="n">i</span><span class="p">;</span></div>
<div class="code-output">val it = [3,1,4,1,5] : int list</div>
</div>

<p>The <code class="language-plaintext highlighter-rouge">ordinal</code> expression can be used in any step whose input is
ordered, except the steps whose expressions are evaluated before the
query starts (<code class="language-plaintext highlighter-rouge">except</code>, <code class="language-plaintext highlighter-rouge">intersect</code>, <code class="language-plaintext highlighter-rouge">skip</code>,
<code class="language-plaintext highlighter-rouge">take</code>, and <code class="language-plaintext highlighter-rouge">union</code>). It evaluates to 0 for the first element,
1 for the next element, and so on. But as we shall see, the optimizer
avoids evaluating <code class="language-plaintext highlighter-rouge">ordinal</code> if it can.</p>
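
<p>For readers who think in loops, the round trip above can be modelled
in a few lines of Python. This is only a sketch of the semantics, not
of how Morel evaluates the query: <code class="language-plaintext highlighter-rouge">enumerate</code> stands in for
<code class="language-plaintext highlighter-rouge">ordinal</code>, and <code class="language-plaintext highlighter-rouge">random.shuffle</code> stands in for the loss of order
caused by <code class="language-plaintext highlighter-rouge">unorder</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

items = [3, 1, 4, 1, 5]

# yield {i, j = ordinal}: pair each element with its 0-based position
tagged = [(i, j) for j, i in enumerate(items)]

# unorder: forget the order (modelled here by shuffling)
random.shuffle(tagged)

# order j, then yield i: the stored position recovers the original list
restored = [i for i, j in sorted(tagged, key=lambda pair: pair[1])]
print(restored)  # [3, 1, 4, 1, 5]
</code></pre></div></div>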

<p>Here is a query that computes the salary rank of each employee,
then returns only the poorly-paid clerks.</p>

<!-- morel skip
from e in scott.emps
  order DESC e.sal
  yield {e, rank = 1 + ordinal}
  where e.job = "CLERK";
> ename  rank
> ------ ----
> MILLER 9
> ADAMS  12
> JAMES  13
> SMITH  14
>
> val it : {ename:string, rank:int} list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span>
  <span class="kr">yield</span> <span class="p">{</span><span class="n">e</span><span class="p">,</span> <span class="n">rank</span> <span class="p">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="kr">ordinal</span><span class="p">}</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="p">=</span> <span class="s2">"CLERK"</span><span class="p">;</span></div>
<div class="code-output">ename  rank
------ ----
MILLER 9
ADAMS  12
JAMES  13
SMITH  14

val it : {ename:string, rank:int} list</div>
</div>

<p>The main reason to apply <code class="language-plaintext highlighter-rouge">order</code> and <code class="language-plaintext highlighter-rouge">unorder</code> in a query is
to control the target collection type. But there is a more subtle
reason which relates to performance. The ordered and unordered
versions of the relational operators may produce the same results
(modulo ordering) but ordered execution may be less efficient (such
as running with reduced parallelism). If a query contains an <code class="language-plaintext highlighter-rouge">order</code>
or <code class="language-plaintext highlighter-rouge">unorder</code>, the order of the input to that step is irrelevant, and
the optimizer can use a more efficient execution plan.</p>

<p>This, by the way, is why the specification of the <code class="language-plaintext highlighter-rouge">order</code> step does
not guarantee stability. If <code class="language-plaintext highlighter-rouge">order</code> were stable, the optimizer would
have to use ordered execution of upstream steps whenever the sort key
does not define a total ordering.</p>

<p>If you want <code class="language-plaintext highlighter-rouge">order</code> to be stable, you can add <code class="language-plaintext highlighter-rouge">ordinal</code> to the
trailing edge of the sort key:</p>

<!-- morel skip
from e in scott.emps
  order DESC e.sal
  where e.deptno <> 20
  yield {e.ename, e.job, e.sal}
  order (e.job, ordinal);
> val it =
>   [{ename="MILLER",job="CLERK",sal=1300.0},
>    {ename="JAMES",job="CLERK",sal=950.0},
>    {ename="BLAKE",job="MANAGER",sal=2850.0},
>    {ename="CLARK",job="MANAGER",sal=2450.0},
>    {ename="KING",job="PRESIDENT",sal=5000.0},
>    {ename="ALLEN",job="SALESMAN",sal=1600.0},
>    {ename="TURNER",job="SALESMAN",sal=1500.0},
>    {ename="WARD",job="SALESMAN",sal=1250.0},
>    {ename="MARTIN",job="SALESMAN",sal=1250.0}]
>   : {ename:string, job:string, sal:real} list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span>
  <span class="kr">order</span> <span class="n">DESC</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="o">&lt;</span><span class="o">&gt;</span> <span class="mi">20</span>
  <span class="kr">yield</span> <span class="p">{</span><span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">sal</span><span class="p">}</span>
  <span class="kr">order</span> <span class="p">(</span><span class="nn">e</span><span class="p">.</span><span class="n">job</span><span class="p">,</span> <span class="kr">ordinal</span><span class="p">);</span></div>
<div class="code-output">val it =
  [{ename="MILLER",job="CLERK",sal=1300.0},
   {ename="JAMES",job="CLERK",sal=950.0},
   {ename="BLAKE",job="MANAGER",sal=2850.0},
   {ename="CLARK",job="MANAGER",sal=2450.0},
   {ename="KING",job="PRESIDENT",sal=5000.0},
   {ename="ALLEN",job="SALESMAN",sal=1600.0},
   {ename="TURNER",job="SALESMAN",sal=1500.0},
   {ename="WARD",job="SALESMAN",sal=1250.0},
   {ename="MARTIN",job="SALESMAN",sal=1250.0}]
  : {ename:string, job:string, sal:real} list</div>
</div>

<p>Materializing <code class="language-plaintext highlighter-rouge">ordinal</code> as a 0-based, contiguous sequence of integers
is expensive because it forces sequential execution, and the
optimizer will avoid this if it can. In this case, because <code class="language-plaintext highlighter-rouge">ordinal</code>
is used for sorting but is not returned, the optimizer downgrades
<code class="language-plaintext highlighter-rouge">ordinal</code> to a virtual expression. The plan might use an ordered
implementation of the <code class="language-plaintext highlighter-rouge">where</code> and <code class="language-plaintext highlighter-rouge">yield</code> steps followed by a stable
sort, or it might replace <code class="language-plaintext highlighter-rouge">ordinal</code> with the previous sort key
(<code class="language-plaintext highlighter-rouge">DESC e.sal</code>).</p>
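
<p>The reason a trailing <code class="language-plaintext highlighter-rouge">ordinal</code> is enough can be seen in a small
Python model. It is a sketch under assumptions, not Morel's execution
plan: the rows are copied from the output above in descending-salary
order, and the shuffle imitates a plan that does not preserve order.
Because the composite key <code class="language-plaintext highlighter-rouge">(job, ordinal)</code> has no ties, the result
does not depend on whether the sort algorithm is stable.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random

# (ename, job, sal) rows as they leave the first sort, by descending salary.
rows = [
    ("KING", "PRESIDENT", 5000.0),
    ("BLAKE", "MANAGER", 2850.0),
    ("CLARK", "MANAGER", 2450.0),
    ("ALLEN", "SALESMAN", 1600.0),
    ("TURNER", "SALESMAN", 1500.0),
    ("MILLER", "CLERK", 1300.0),
    ("WARD", "SALESMAN", 1250.0),
    ("MARTIN", "SALESMAN", 1250.0),
    ("JAMES", "CLERK", 950.0),
]

# Attach the ordinal, then scramble the rows on purpose.
tagged = list(enumerate(rows))
random.shuffle(tagged)

# order (e.job, ordinal): every key is unique, so the order is fully determined.
by_job = sorted(tagged, key=lambda t: (t[1][1], t[0]))
names = [row[0] for ordinal, row in by_job]
print(names)
</code></pre></div></div>

<p>However the rows are shuffled, the names come out in the same order
as in the Morel output above.</p>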

<h2 id="ordered-relational-operators">Ordered relational operators</h2>

<p>We need to define the semantics of the relational operators
over all types of collection.</p>

<p>Part of the job has been done already:</p>
<ul>
  <li>The relational model defines the semantics of operators over
<strong>sets</strong> (unordered collections without duplicates).</li>
  <li>The SQL standard specifies the relational operators
over <strong>tables</strong> (unordered collections with duplicates).</li>
  <li>Previous versions of Morel defined semantics for (and implemented)
relational operators over <strong>multisets</strong> (unordered collections with
duplicates).  While the collection type was at the time called
<code class="language-plaintext highlighter-rouge">list</code>, we were actually defining the current <code class="language-plaintext highlighter-rouge">bag</code> type.  Unlike
SQL, elements need not be records.</li>
</ul>

<p>What remains is to define the semantics of queries over <strong>lists</strong>
(ordered collections with duplicates), and of hybrid queries that
combine lists and multisets. (We define hybrid semantics in the <a href="#hybrid-relational-operators">next
section</a>.)</p>

<p>Because a
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md">query</a>
consists of a sequence of steps, each corresponding to a relational
operator, we define the semantics of each step over input that is a
<code class="language-plaintext highlighter-rouge">list</code>:</p>

<ul>
  <li>The first step in a query – <code>from <i>pat</i> in
<i>exp</i></code>, <code>forall <i>pat</i> in <i>exp</i></code>, or
<code>exists <i>pat</i> in <i>exp</i></code> – returns elements in
the same order that they are emitted from <i>exp</i>.</li>
  <li><code>join <i>pat</i> in <i>exp</i> [ on <i>condition</i> ]</code>,
for each element of its input, evaluates <i>exp</i> and, in the order
of the resulting elements, emits a record consisting of the fields of
the two elements, skipping records for which <i>condition</i> is false.</li>
  <li>If a <code class="language-plaintext highlighter-rouge">from</code>, <code class="language-plaintext highlighter-rouge">forall</code>, <code class="language-plaintext highlighter-rouge">exists</code> or <code class="language-plaintext highlighter-rouge">join</code> step has more than one
scan, each subsequent scan behaves as if it were a separate <code class="language-plaintext highlighter-rouge">join</code>
step.</li>
  <li><code>yield <i>exp</i></code> preserves order.</li>
  <li><code>where <i>condition</i></code> preserves order, dropping rows
for which <i>condition</i> is false.</li>
  <li><code>skip <i>count</i></code> and <code>take <i>count</i></code>
preserve order (respectively dropping the first <i>count</i> rows,
or taking the first <i>count</i> rows).</li>
  <li><code class="language-plaintext highlighter-rouge">distinct</code> preserves order, emitting only the first occurrence
of each element.</li>
  <li><code>group <i>groupKey<sub>1</sub></i>, ...,
<i>groupKey<sub>g</sub></i> [ compute <i>agg<sub>1</sub></i>, ...,
<i>agg<sub>a</sub></i> ]</code> preserves order, emitting groups in
the order that the first element in the group was seen; each
aggregate function <code><i>agg<sub>i</sub></i></code> is invoked
with a list of the input elements that belong to that group, in
arrival order.</li>
  <li><code>compute <i>agg<sub>1</sub></i>, ...,
<i>agg<sub>a</sub></i></code> behaves as a <code class="language-plaintext highlighter-rouge">group</code> step where all
input elements are in the same group.</li>
  <li><code>union [ distinct ] <i>exp<sub>1</sub></i>, ...,
<i>exp<sub>n</sub></i></code> outputs the elements of the input in
order, followed by the elements of each <i>exp<sub>i</sub></i>
argument in order (just like the UNIX <code class="language-plaintext highlighter-rouge">cat</code> command). If <code class="language-plaintext highlighter-rouge">distinct</code>
is specified, outputs only the first occurrence of each element.</li>
  <li><code>intersect [ distinct ] <i>exp<sub>1</sub></i>, ...,
<i>exp<sub>n</sub></i></code> outputs the elements of the input in
order, provided that every <i>exp<sub>i</sub></i> argument contains
at least the number of occurrences of this element so far.  If
<code class="language-plaintext highlighter-rouge">distinct</code> is specified, outputs only the first occurrence of each
element.</li>
  <li><code>except [ distinct ] <i>exp<sub>1</sub></i>, ...,
<i>exp<sub>n</sub></i></code> outputs the elements of the input in
order, provided that the number of occurrences of that element so
far is greater than the number of occurrences of that element in all
the <i>exp<sub>i</sub></i> arguments.  If <code class="language-plaintext highlighter-rouge">distinct</code> is specified,
outputs only the first occurrence of each element.</li>
  <li><code>require <i>condition</i></code> (which can only occur in a
<code class="language-plaintext highlighter-rouge">forall</code> query) has the same behavior as the unordered case.</li>
  <li><code class="language-plaintext highlighter-rouge">order</code> and <code class="language-plaintext highlighter-rouge">unorder</code>, as discussed earlier, have the same
semantics as in the unordered case.</li>
</ul>
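
<p>Several of these rules can be modelled in a few lines of Python. The
functions below are a sketch of the ordered semantics only, not of
Morel's implementation; their names are invented for illustration, and
they assume that elements are hashable.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def ordered_where(xs, condition):
    # Keeps the surviving elements in their original order.
    return [x for x in xs if condition(x)]

def ordered_distinct(xs):
    # A dict remembers insertion order, so only first occurrences survive.
    return list(dict.fromkeys(xs))

def ordered_group(xs, key):
    # Groups appear in the order their first element was seen; the
    # members of each group stay in arrival order.
    groups = {}
    for x in xs:
        groups.setdefault(key(x), []).append(x)
    return list(groups.items())

xs = [3, 1, 4, 1, 5]
print(xs[1:])                                # skip 1: [1, 4, 1, 5]
print(xs[:2])                                # take 2: [3, 1]
print(ordered_where(xs, lambda i: i != 1))   # [3, 4, 5]
print(ordered_distinct(xs))                  # [3, 1, 4, 5]
print(ordered_group(xs, lambda i: i % 2))    # [(1, [3, 1, 1, 5]), (0, [4])]
</code></pre></div></div>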

<p>The rules for <code class="language-plaintext highlighter-rouge">from</code> and <code class="language-plaintext highlighter-rouge">join</code> produce the same familiar ordering as
a nested “for” loop in a language such as C, Python or Java:</p>

<!-- morel silent
Sys.set ("printLength", ~1);
> val it = () : unit
-->
<!-- morel
from hundreds in [100, 200, 300],
    tens in [10, 20, 30]
  join units in [1, 2, 3]
  yield hundreds + tens + units;
> val it =
>   [111,112,113,121,122,123,131,132,133,211,212,213,221,222,223,231,232,233,311,
>    312,313,321,322,323,331,332,333] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">hundreds</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">100</span><span class="p">,</span> <span class="mi">200</span><span class="p">,</span> <span class="mi">300</span><span class="p">],</span>
    <span class="nv">tens</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">30</span><span class="p">]</span>
  <span class="kr">join</span> <span class="nv">units</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="n">hundreds</span> <span class="o">+</span> <span class="n">tens</span> <span class="o">+</span> <span class="n">units</span><span class="p">;</span></div>
<div class="code-output">val it =
  [111,112,113,121,122,123,131,132,133,211,212,213,221,222,223,231,232,233,311,
   312,313,321,322,323,331,332,333] : int list</div>
</div>
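
<p>For comparison, here is the same computation written as a nested
loop in Python. It is a sketch for illustration only; the outermost
loop corresponds to the first scan and the innermost loop to the
<code class="language-plaintext highlighter-rouge">join</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>result = []
for hundreds in [100, 200, 300]:
    for tens in [10, 20, 30]:
        for units in [1, 2, 3]:
            result.append(hundreds + tens + units)

print(result[:4])  # [111, 112, 113, 121]
</code></pre></div></div>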

<p>The rules for <code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">intersect</code> and <code class="language-plaintext highlighter-rouge">except</code> are rather subtle, and
are best illustrated by example:</p>

<!-- morel
from i in [3, 1, 4, 1, 5, 9, 2, 6]
  union [2, 7, 1, 8, 2, 8, 1, 8];
> val it = [3,1,4,1,5,9,2,6,2,7,1,8,2,8,1,8] : int list
from i in [3, 1, 4, 1, 5, 9, 2, 6]
  intersect [2, 7, 1, 8, 2, 8, 1, 8];
> val it = [1,1,2] : int list
from i in [3, 1, 4, 1, 5, 9, 2, 6]
  except [2, 7, 1, 8, 2, 8, 1, 8];
> val it = [3,4,5,9,6] : int list
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">union</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">];</span></div>
<div class="code-output">val it = [3,1,4,1,5,9,2,6,2,7,1,8,2,8,1,8] : int list</div>
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">intersect</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">];</span></div>
<div class="code-output">val it = [1,1,2] : int list</div>
<div class="code-input"><span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">]</span>
  <span class="kr">except</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">8</span><span class="p">];</span></div>
<div class="code-output">val it = [3,4,5,9,6] : int list</div>
</div>
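
<p>The counting behind these results can be sketched in Python. This is
a model of the rules as stated above, not Morel's implementation; the
function names are invented, elements are assumed to be hashable, and
<code class="language-plaintext highlighter-rouge">ordered_intersect</code> assumes at least one argument. Each function
gives every element a budget computed from the arguments and then walks
the input in order.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import Counter

def ordered_union(xs, *args):
    # The input, followed by each argument in turn, like "cat".
    out = list(xs)
    for arg in args:
        out.extend(arg)
    return out

def ordered_intersect(xs, *args):
    counters = [Counter(arg) for arg in args]
    # An element may be emitted as often as its rarest argument allows.
    budget = {x: min(c[x] for c in counters) for x in set(xs)}
    out = []
    for x in xs:
        if budget[x]:
            budget[x] -= 1
            out.append(x)
    return out

def ordered_except(xs, *args):
    counters = [Counter(arg) for arg in args]
    # The arguments cancel this many leading occurrences of each element.
    budget = {x: sum(c[x] for c in counters) for x in set(xs)}
    out = []
    for x in xs:
        if budget[x]:
            budget[x] -= 1
        else:
            out.append(x)
    return out

xs = [3, 1, 4, 1, 5, 9, 2, 6]
ys = [2, 7, 1, 8, 2, 8, 1, 8]
print(ordered_intersect(xs, ys))  # [1, 1, 2]
print(ordered_except(xs, ys))     # [3, 4, 5, 9, 6]
</code></pre></div></div>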

<h2 id="hybrid-relational-operators">Hybrid relational operators</h2>

<p>We have specified the behavior of queries where input collections are
all lists or all bags. But what if a query has a mix of list and bag
inputs?</p>

<p>The mixing can occur if the first step of the query (<code class="language-plaintext highlighter-rouge">from</code>, <code class="language-plaintext highlighter-rouge">exists</code>,
or <code class="language-plaintext highlighter-rouge">forall</code>) has more than one scan, or in steps that introduce
another collection (<code class="language-plaintext highlighter-rouge">join</code>, <code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">intersect</code>, or <code class="language-plaintext highlighter-rouge">except</code>). In all
cases, unordered wins: if any input is a <code class="language-plaintext highlighter-rouge">bag</code>, the step becomes
unordered, and unordered semantics apply from then on.</p>

<p>For example, if we join a <code class="language-plaintext highlighter-rouge">list</code> of department numbers (ordered) to a
table of employees (unordered), selecting only the clerks and
managers, the result is a <code class="language-plaintext highlighter-rouge">bag</code> (unordered):</p>

<!-- morel skip
from deptno in [10, 20, 30]
  join e in scott.emps on e.deptno = deptno
  where e.job elem ["CLERK", "MANAGER"]
  yield {deptno, e.ename};
> deptno ename
> ------ ------
> 30     JAMES
> 10     CLARK
> 20     ADAMS
> 10     MILLER
> 20     SMITH
> 30     BLAKE
> 20     JONES
>
> val it : {deptno:int, ename:string} bag
-->

<div class="code-block">
<div class="code-input"><span class="kr">from</span> <span class="nv">deptno</span> <span class="kr">in</span> <span class="p">[</span><span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">30</span><span class="p">]</span>
  <span class="kr">join</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="kr">on</span> <span class="nn">e</span><span class="p">.</span><span class="n">deptno</span> <span class="p">=</span> <span class="n">deptno</span>
  <span class="kr">where</span> <span class="nn">e</span><span class="p">.</span><span class="n">job</span> <span class="kr">elem</span> <span class="p">[</span><span class="s2">"CLERK"</span><span class="p">,</span> <span class="s2">"MANAGER"</span><span class="p">]</span>
  <span class="kr">yield</span> <span class="p">{</span><span class="n">deptno</span><span class="p">,</span> <span class="nn">e</span><span class="p">.</span><span class="n">ename</span><span class="p">};</span></div>
<div class="code-output">deptno ename
------ ------
30     JAMES
10     CLARK
20     ADAMS
10     MILLER
20     SMITH
30     BLAKE
20     JONES

val it : {deptno:int, ename:string} bag</div>
</div>

<h2 id="type-inference-challenges">Type inference challenges</h2>

<p>This feature was challenging to implement because it required
major changes to Morel’s type inference algorithm. (We mention this
only in the spirit of sharing war-stories, and for the interest of
those who understand the internal workings of Morel’s compiler.
Hopefully, the changes to the type-inference algorithm will be invisible
to the casual user.)</p>

<p>The problem is evident in a program such as</p>

<!-- morel skip
let
  fun f (xs, ys) =
    from i in xs
      intersect ys
in
  f ((from e in scott.emps yield e.empno), [7521, 7782, 8000])
end;
> val it = [7521,7782] : int bag
-->

<div class="code-block">
<div class="code-input"><span class="kr">let</span>
  <span class="kr">fun</span> <span class="nf">f</span> <span class="p">(</span><span class="n">xs</span><span class="p">,</span> <span class="n">ys</span><span class="p">)</span> <span class="p">=</span>
    <span class="kr">from</span> <span class="nv">i</span> <span class="kr">in</span> <span class="n">xs</span>
      <span class="kr">intersect</span> <span class="n">ys</span>
<span class="kr">in</span>
  <span class="n">f</span> <span class="p">((</span><span class="kr">from</span> <span class="nv">e</span> <span class="kr">in</span> <span class="nn">scott</span><span class="p">.</span><span class="n">emps</span> <span class="kr">yield</span> <span class="nn">e</span><span class="p">.</span><span class="n">empno</span><span class="p">),</span> <span class="p">[</span><span class="mi">7521</span><span class="p">,</span> <span class="mi">7782</span><span class="p">,</span> <span class="mi">8000</span><span class="p">])</span>
<span class="kr">end</span><span class="p">;</span></div>
<div class="code-output">val it = [7521,7782] : int bag</div>
</div>

<p>While resolving the type of function <code class="language-plaintext highlighter-rouge">f</code> and its embedded query, the
types of the arguments <code class="language-plaintext highlighter-rouge">xs</code> and <code class="language-plaintext highlighter-rouge">ys</code> have not yet been
determined. Morel’s previous type inference algorithm allowed us to
say “<i><code class="language-plaintext highlighter-rouge">xs</code> and <code class="language-plaintext highlighter-rouge">ys</code> must be lists with the same element type</i>” or
“<i><code class="language-plaintext highlighter-rouge">xs</code> and <code class="language-plaintext highlighter-rouge">ys</code> must be bags with the same element type</i>”. It was
based on
<a href="https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W">Hindley-Milner’s Algorithm W</a>
and unification, which basically means finding an assignment of
logical variables so that two trees are structurally identical.</p>
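
<p>To give a flavor of what unification does, here is a toy unifier in
Python. It is a teaching sketch, not Morel's code: types are modelled
as tuples such as <code class="language-plaintext highlighter-rouge">("list", "int")</code>, type variables are strings that
start with <code class="language-plaintext highlighter-rouge">?</code>, and there is no occurs check.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def resolve(t, subst):
    # Follow variable bindings until we reach a non-variable or an unbound variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    """Return a substitution that makes a and b identical, or None."""
    a = resolve(a, subst)
    b = resolve(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

print(unify(("list", "?a"), ("list", "int"), {}))  # {'?a': 'int'}
print(unify(("list", "?a"), ("bag", "int"), {}))   # None
</code></pre></div></div>

<p>The second call fails because <code class="language-plaintext highlighter-rouge">list</code> and <code class="language-plaintext highlighter-rouge">bag</code> are different
constructors: plain unification can only say that two types are the
same, never that either of two shapes is acceptable.</p>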

<p>But the type inference rules for queries with a mixture of ordered and
unordered collections require conditions that contain the word
‘or’. For example, resolving the <code class="language-plaintext highlighter-rouge">intersect</code> expression above requires
that we say “<i>we can allow <code class="language-plaintext highlighter-rouge">xs</code> and <code class="language-plaintext highlighter-rouge">ys</code> to both be bags, or both be
lists, or one to be a bag and the other a list, but they must have
the same element type</i>”.  Furthermore, we need to derive the result
type, saying “<i>the result of the query is a list if both arguments
are lists, otherwise a bag, with the same element type as the
arguments</i>”.</p>

<p>We needed a system where we can place a number of constraints on type
variables, and then solve for those constraints. The new type
inference algorithm extends Hindley-Milner with constraints, using the
approach described in
<a href="https://dl.acm.org/doi/pdf/10.1145/224164.224195">“A Second Look at Overloading” by Odersky, Wadler &amp; Wehr (1995)</a>.
As the title of that paper suggests, we have <a href="https://github.com/hydromatic/morel/issues/237">added a kind of
overloading</a> to Morel;
it is as if the <code class="language-plaintext highlighter-rouge">intersect</code> operator now has four forms:</p>

<ul>
  <li><code>intersect: &alpha; bag * &alpha; bag &rarr; &alpha; bag</code></li>
  <li><code>intersect: &alpha; bag * &alpha; list &rarr; &alpha; bag</code></li>
  <li><code>intersect: &alpha; list * &alpha; bag &rarr; &alpha; bag</code></li>
  <li><code>intersect: &alpha; list * &alpha; list &rarr; &alpha; list</code></li>
</ul>

<p>(and similar overloads for the other relational operators) and the
type inference algorithm solves the constraints to land on one valid
assignment of types.</p>
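
<p>The four forms collapse into a single rule, which a short Python
sketch makes explicit. This models only the outcome, not the constraint
solver; the table and the function name are invented for illustration,
and collection kinds are represented as plain strings.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The four overloads of intersect, keyed by the kinds of its two inputs.
OVERLOADS = {
    ("bag", "bag"): "bag",
    ("bag", "list"): "bag",
    ("list", "bag"): "bag",
    ("list", "list"): "list",
}

def result_kind(*kinds):
    # Unordered wins: the result is a list only if every input is a list.
    return "list" if all(k == "list" for k in kinds) else "bag"

for (left, right), expected in OVERLOADS.items():
    assert result_kind(left, right) == expected

print(result_kind("list", "bag"))  # bag
</code></pre></div></div>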

<p>The algorithm took several months of hard work to implement, but the
results are pleasing.  Morel retains the key benefits of a
Hindley-Milner type system: strong static typing, runtime efficiency,
and type inference without the need for type annotations.</p>

<p>Like any other major change in architecture, constraint-based type
inference will take a while to mature;
[<a href="https://github.com/hydromatic/morel/issues/270">MOREL-270</a>]
and
[<a href="https://github.com/hydromatic/morel/issues/271">MOREL-271</a>]
describe some of the remaining issues.</p>

<h2 id="conclusion">Conclusion</h2>

<p>The ability to combine ordered and unordered data sets, and process
both using relational operators, is a major new feature in Morel. It
allows Morel to handle, with equal ease, data from files and
relational databases, and data that is generated programmatically.</p>

<p>This feature will be available in Morel release 0.7.</p>

<p>To find out more about Morel, read about its
<a href="/draft-blog3/2020/02/25/morel-a-functional-language-for-data.html">goals</a>
and <a href="/draft-blog3/2020/03/03/morel-basics.html">basic language</a>, peruse the
<a href="https://github.com/hydromatic/morel/blob/main/docs/query.md">query reference</a>
or
<a href="https://github.com/hydromatic/morel/blob/main/docs/reference.md">language reference</a>,
or download it from <a href="https://github.com/hydromatic/morel/">GitHub</a> and
give it a try.</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">Database tables are unordered, but functional programming languages work best over ordered lists. Which should <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a> prefer? Both! We now have &quot;list&quot; and &quot;bag&quot; types, and full relational algebra over both. <a href="https://t.co/n8vZx0pUmG">https://t.co/n8vZx0pUmG</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1931153173097660591?ref_src=twsrc%5Etfw">June 7, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<!--
This article
[has been updated](https://github.com/julianhyde/share/commits/main/blog/_posts/2025-06-06-ordered-unordered.md).
-->]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[Despite what the relational model says, some data is ordered.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">INTERSECT ALL, EXCEPT ALL, and the arithmetic of fractions</title><link href="http://0.0.0.0:4000/draft-blog3/2025/06/03/intersect-fractions.html" rel="alternate" type="text/html" title="INTERSECT ALL, EXCEPT ALL, and the arithmetic of fractions" /><published>2025-06-03T13:00:00-07:00</published><updated>2025-06-03T13:00:00-07:00</updated><id>http://0.0.0.0:4000/draft-blog3/2025/06/03/intersect-fractions</id><content type="html" xml:base="http://0.0.0.0:4000/draft-blog3/2025/06/03/intersect-fractions.html"><![CDATA[<p>SQL’s <code class="language-plaintext highlighter-rouge">INTERSECT ALL</code> and <code class="language-plaintext highlighter-rouge">EXCEPT ALL</code> operators rarely get attention,
but they elegantly solve a classic math problem. The problem is
computing the <strong>greatest common divisor (GCD)</strong> and <strong>least common
multiple (LCM)</strong> of two integers, using the prime factors of those
integers.  In this post we show how to do this using <code class="language-plaintext highlighter-rouge">intersect</code> and
<code class="language-plaintext highlighter-rouge">except</code>, Morel’s equivalent of <code class="language-plaintext highlighter-rouge">INTERSECT ALL</code> and <code class="language-plaintext highlighter-rouge">EXCEPT ALL</code>.</p>

<p>SQL’s set operators (<code class="language-plaintext highlighter-rouge">UNION</code>, <code class="language-plaintext highlighter-rouge">INTERSECT</code>, and <code class="language-plaintext highlighter-rouge">EXCEPT</code>) have set and
multiset variants.  The multiset variants retain duplicates and use
the <code class="language-plaintext highlighter-rouge">ALL</code> keyword; the set variants discard duplicates, and you can
use the optional <code class="language-plaintext highlighter-rouge">DISTINCT</code> keyword if you want to be explicit.</p>

<p>Morel has <a href="https://github.com/hydromatic/morel/issues/253">just added</a>
<code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">intersect</code> and <code class="language-plaintext highlighter-rouge">except</code> query steps, achieving parity
with both Standard SQL and
<a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/pipe-syntax#union_pipe_operator">GoogleSQL’s pipe syntax</a>.
(This post is about multiset mode, which retains duplicate values; to
use the set mode, which discards duplicates and is far more common,
use the <code class="language-plaintext highlighter-rouge">distinct</code> keyword, for example <code class="language-plaintext highlighter-rouge">intersect distinct</code>.)</p>

<p>Using these steps, we can compute GCD and LCM. The queries are more
concise than their SQL equivalents because Morel queries over integer
values do not require column names.</p>

<h2 id="adding-fractions">Adding fractions</h2>

<p>Remember how – probably in middle school – you learned how to add
two fractions, and to reduce a fraction to its lowest terms?</p>

<p>Suppose you need to add 5/36 and 7/120. First, find the least
common multiple (LCM) of their denominators (36 and 120).</p>

<p>Next, convert each fraction to an equivalent fraction with the LCM
(360) as the denominator.</p>
<ul>
  <li>For 5/36: Multiply the numerator and denominator by 10
(since 36 * 10 = 360).</li>
  <li>For 7/120: Multiply the numerator and denominator by 3
(since 120 * 3 = 360).</li>
</ul>

<p>Last, add the fractions.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  5      7        5 * 10    7 * 3        50      21       71
---- + -----  =  ------- + -------  =  ----- + -----  =  -----
 36     120      36 * 10   120 * 3      360     360       360
</code></pre></div></div>
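
<p>The same arithmetic can be checked mechanically. The following is a
minimal sketch in Python rather than Morel (a choice made here purely for
illustration; <code class="language-plaintext highlighter-rouge">Fraction</code> and <code class="language-plaintext highlighter-rouge">math.lcm</code> come from Python’s standard
library, not from Morel). It follows the three steps above.</p>

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9 and later

# Step 1: find the LCM of the two denominators.
common = lcm(36, 120)

# Step 2: scale each numerator to the common denominator.
a = 5 * (common // 36)    # 5 * 10
b = 7 * (common // 120)   # 7 * 3

# Step 3: add the numerators over the common denominator.
total = Fraction(a + b, common)

print(common, a, b, total)  # 360 50 21 71/360
```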

<h2 id="computing-gcd-and-lcm">Computing GCD and LCM</h2>

<p>To compute the GCD of two numbers, you start by finding their prime
factors.  Prime factors can be repeated, so these are multisets, not
sets.  Let’s find the GCD of 36 and 120.</p>

<ul>
  <li>36 is 2<sup>2</sup> * 3<sup>2</sup>, so has factors [2, 2, 3, 3]</li>
  <li>120 is 2<sup>3</sup> * 3 * 5, so has factors [2, 2, 2, 3, 5]</li>
</ul>

<p>For each prime factor, we take the lower of its two repetition counts:
two 2s, one 3, and no 5s (because 36 has none). The GCD is therefore
2<sup>2</sup> * 3, which is 12.</p>

<p>The crucial step of the algorithm is to combine two multisets and take
the minimum repetition count; that is exactly what <code class="language-plaintext highlighter-rouge">intersect</code> does.</p>
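
<p>To see the minimum-count rule in isolation, here is a small sketch
that uses Python’s <code class="language-plaintext highlighter-rouge">collections.Counter</code> as a stand-in for a multiset.
This is an analogy rather than Morel code: the <code class="language-plaintext highlighter-rouge">&amp;</code> operator on
counters keeps each element with the smaller of its two counts.</p>

```python
from collections import Counter
from math import prod

factors_36 = Counter([2, 2, 3, 3])
factors_120 = Counter([2, 2, 2, 3, 5])

# '&' keeps each element with the minimum of its two counts,
# the same rule that INTERSECT ALL applies to duplicate rows.
common = factors_36 & factors_120

print(sorted(common.elements()))  # [2, 2, 3]
print(prod(common.elements()))    # 12
```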

<p>The LCM is similar, but takes the higher repetition count.
This can be achieved by taking the union of both factor multisets,
then subtracting their intersection. Here’s why: The union gives us
all factors from both numbers, but it adds the counts together. Since
we want the maximum count (not the sum), we subtract the intersection,
which contains the overlapping factors we double-counted.
The LCM is therefore 2<sup>3</sup> * 3<sup>2</sup> * 5, which is 360.</p>
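
<p>The union-then-subtract argument relies on the identity
max(a, b) = a + b - min(a, b), applied to each factor’s count. A
sketch of that reasoning, again using Python’s <code class="language-plaintext highlighter-rouge">Counter</code> as an
illustrative stand-in for Morel’s multisets:</p>

```python
from collections import Counter
from math import prod

a = Counter([2, 2, 3, 3])          # factors of 36
b = Counter([2, 2, 2, 3, 5])       # factors of 120

union_all = a + b                  # counts are added: five 2s, three 3s, one 5
overlap = a & b                    # minimum counts: two 2s, one 3
lcm_factors = union_all - overlap  # three 2s, two 3s, one 5

# Per element, a + b - min(a, b) equals max(a, b), which is what '|' computes.
assert lcm_factors == a | b

print(sorted(lcm_factors.elements()))  # [2, 2, 2, 3, 3, 5]
print(prod(lcm_factors.elements()))    # 360
```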

<h2 id="using-morel-to-compute-lcm-and-gcd">Using Morel to compute LCM and GCD</h2>

<p>To convert this algorithm to code, we will need three things:</p>
<ul>
  <li>a <code class="language-plaintext highlighter-rouge">factorize</code> function that splits each number into a multiset of
prime factors;</li>
  <li>the <code class="language-plaintext highlighter-rouge">intersect</code> step, which combines the multisets;</li>
  <li>a <code class="language-plaintext highlighter-rouge">product</code> function that converts a multiset back to a number.</li>
</ul>

<p>Here are the <code class="language-plaintext highlighter-rouge">factorize</code> and <code class="language-plaintext highlighter-rouge">product</code> functions.</p>

<!-- morel
fun factorize n =
  let
    fun factorize' n d =
      if n < d then [] else
      if n mod d = 0 then d :: (factorize' (n div d) d)
      else factorize' n (d + 1)
  in
    factorize' n 2
  end;
> val factorize = fn : int -> int list

fun product [] = 1
  | product (x::xs) = x * (product xs);
> val product = fn : int list -> int
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">factorize</span> <span class="n">n</span> <span class="p">=</span>
  <span class="kr">let</span>
    <span class="kr">fun</span> <span class="nf">factorize'</span> <span class="n">n</span> <span class="n">d</span> <span class="p">=</span>
      <span class="kr">if</span> <span class="n">n</span> <span class="o">&lt;</span> <span class="n">d</span> <span class="kr">then</span> <span class="p">[]</span> <span class="kr">else</span>
      <span class="kr">if</span> <span class="n">n</span> <span class="kr">mod</span> <span class="n">d</span> <span class="p">=</span> <span class="mi">0</span> <span class="kr">then</span> <span class="n">d</span> <span class="o">::</span> <span class="p">(</span><span class="n">factorize'</span> <span class="p">(</span><span class="n">n</span> <span class="kr">div</span> <span class="n">d</span><span class="p">)</span> <span class="n">d</span><span class="p">)</span>
      <span class="kr">else</span> <span class="n">factorize'</span> <span class="n">n</span> <span class="p">(</span><span class="n">d</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
  <span class="kr">in</span>
    <span class="n">factorize'</span> <span class="n">n</span> <span class="mi">2</span>
  <span class="kr">end</span><span class="p">;</span></div>
<div class="code-output">val factorize = fn : int -&gt; int list</div>
<div class="code-input">
<span class="kr">fun</span> <span class="nf">product</span> <span class="p">[]</span> <span class="p">=</span> <span class="mi">1</span>
  <span class="p">|</span> <span class="n">product</span> <span class="p">(</span><span class="n">x</span><span class="o">::</span><span class="n">xs</span><span class="p">)</span> <span class="p">=</span> <span class="n">x</span> <span class="o">*</span> <span class="p">(</span><span class="n">product</span> <span class="n">xs</span><span class="p">);</span></div>
<div class="code-output">val product = fn : int list -&gt; int</div>
</div>

<p>Here’s how they work:</p>

<!-- morel
factorize 120;
> val it = [2,2,2,3,5] : int list
product (factorize 120);
> val it = 120 : int
-->

<div class="code-block">
<div class="code-input"><span class="n">factorize</span> <span class="mi">120</span><span class="p">;</span></div>
<div class="code-output">val it = [2,2,2,3,5] : int list</div>
<div class="code-input"><span class="n">product</span> <span class="p">(</span><span class="n">factorize</span> <span class="mi">120</span><span class="p">);</span></div>
<div class="code-output">val it = 120 : int</div>
</div>

<p>So, we can compute GCD like this:</p>

<!-- morel skip
fun gcd (m, n) =
  from f in factorize m
    intersect factorize n
    compute product;
> val gcd = fn : int * int -> int
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">gcd</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="p">=</span>
  <span class="kr">from</span> <span class="nv">f</span> <span class="kr">in</span> <span class="n">factorize</span> <span class="n">m</span>
    <span class="kr">intersect</span> <span class="n">factorize</span> <span class="n">n</span>
    <span class="kr">compute</span> <span class="n">product</span><span class="p">;</span></div>
<div class="code-output">val gcd = fn : int * int -&gt; int</div>
</div>

<p>The last step uses <code class="language-plaintext highlighter-rouge">compute</code> because <code class="language-plaintext highlighter-rouge">product</code> fulfills Morel’s only
criterion for an aggregate function: its argument is a collection
of values. (At least one SQL dialect agrees with us, and has a
<a href="https://duckdb.org/docs/stable/sql/functions/aggregates#productarg">PRODUCT</a>
aggregate function.)</p>

<p>LCM can be computed from GCD:</p>

<!-- morel skip
fun lcm (m, n) =
  (m * n) div gcd (m, n);
> val lcm = fn : int * int -> int
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">lcm</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="p">=</span>
  <span class="p">(</span><span class="n">m</span> <span class="o">*</span> <span class="n">n</span><span class="p">)</span> <span class="kr">div</span> <span class="n">gcd</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span></div>
<div class="code-output">val lcm = fn : int * int -&gt; int</div>
</div>

<p>But, as we discussed above, it can also be computed directly using
<code class="language-plaintext highlighter-rouge">union</code>, <code class="language-plaintext highlighter-rouge">except</code> and <code class="language-plaintext highlighter-rouge">intersect</code>:</p>

<!-- morel skip
fun lcm' (m, n) =
  let
    val m_factors = factorize m
    val n_factors = factorize n
  in
    from f in m_factors
      union (n_factors)
      except (from f in m_factors
        intersect n_factors)
    compute product
  end;
> val lcm' = fn : int * int -> int
-->

<div class="code-block">
<div class="code-input"><span class="kr">fun</span> <span class="nf">lcm'</span> <span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="n">n</span><span class="p">)</span> <span class="p">=</span>
  <span class="kr">let</span>
    <span class="kr">val</span> <span class="nv">m_factors</span> <span class="p">=</span> <span class="n">factorize</span> <span class="n">m</span>
    <span class="kr">val</span> <span class="nv">n_factors</span> <span class="p">=</span> <span class="n">factorize</span> <span class="n">n</span>
  <span class="kr">in</span>
    <span class="kr">from</span> <span class="nv">f</span> <span class="kr">in</span> <span class="n">m_factors</span>
      <span class="kr">union</span> <span class="p">(</span><span class="n">n_factors</span><span class="p">)</span>
      <span class="kr">except</span> <span class="p">(</span><span class="kr">from</span> <span class="nv">f</span> <span class="kr">in</span> <span class="n">m_factors</span>
        <span class="kr">intersect</span> <span class="n">n_factors</span><span class="p">)</span>
    <span class="kr">compute</span> <span class="n">product</span>
  <span class="kr">end</span><span class="p">;</span></div>
<div class="code-output">val lcm' = fn : int * int -&gt; int</div>
</div>

<p>Let’s test them:</p>

<!-- morel skip
gcd (36, 120);
> val it = 12 : int
lcm (36, 120);
> val it = 360 : int
lcm' (36, 120);
> val it = 360 : int
-->

<div class="code-block">
<div class="code-input"><span class="n">gcd</span> <span class="p">(</span><span class="mi">36</span><span class="p">,</span> <span class="mi">120</span><span class="p">);</span></div>
<div class="code-output">val it = 12 : int</div>
<div class="code-input"><span class="n">lcm</span> <span class="p">(</span><span class="mi">36</span><span class="p">,</span> <span class="mi">120</span><span class="p">);</span></div>
<div class="code-output">val it = 360 : int</div>
<div class="code-input"><span class="n">lcm'</span> <span class="p">(</span><span class="mi">36</span><span class="p">,</span> <span class="mi">120</span><span class="p">);</span></div>
<div class="code-output">val it = 360 : int</div>
</div>
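
<p>A single pair of inputs is a thin test. As a broader check, the
sketch below ports the approach to Python (an illustrative port, not the
Morel implementation) and compares it with the standard library’s
<code class="language-plaintext highlighter-rouge">math.gcd</code> and <code class="language-plaintext highlighter-rouge">math.lcm</code> for every pair of integers from 1 to 60.
The Python names <code class="language-plaintext highlighter-rouge">factorize</code>, <code class="language-plaintext highlighter-rouge">gcd</code> and <code class="language-plaintext highlighter-rouge">lcm</code> mirror the Morel functions
above, with <code class="language-plaintext highlighter-rouge">Counter</code> standing in for Morel’s multisets.</p>

```python
from collections import Counter
from math import gcd as builtin_gcd, lcm as builtin_lcm, prod

def factorize(n):
    """Prime factors of n by trial division, smallest first."""
    factors, d = [], 2
    while n >= d:
        if n % d == 0:
            factors.append(d)
            n //= d
        else:
            d += 1
    return factors

def gcd(m, n):
    # Multiset intersection: minimum count of each prime factor.
    return prod((Counter(factorize(m)) & Counter(factorize(n))).elements())

def lcm(m, n):
    # Multiset sum minus intersection: maximum count of each prime factor.
    fm, fn = Counter(factorize(m)), Counter(factorize(n))
    return prod(((fm + fn) - (fm & fn)).elements())

# Compare against the standard library for all pairs in 1..60.
for m in range(1, 61):
    for n in range(1, 61):
        assert gcd(m, n) == builtin_gcd(m, n)
        assert lcm(m, n) == builtin_lcm(m, n)

print(gcd(36, 120), lcm(36, 120))  # 12 360
```

<p>Note that 1 has no prime factors, so its multiset is empty and the
product of an empty multiset is 1; this is why coprime inputs correctly
yield a GCD of 1.</p>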

<h2 id="conclusion">Conclusion</h2>

<p>The <code class="language-plaintext highlighter-rouge">intersect</code>, <code class="language-plaintext highlighter-rouge">except</code>, and <code class="language-plaintext highlighter-rouge">union</code> steps neatly solve the problem
of computing GCD and LCM because they handle repeated factors in
exactly the way we need.</p>

<p>These steps will be available shortly in Morel release 0.7.</p>

<p>If you have comments, please reply on
<a href="https://bsky.app/profile/julianhyde.bsky.social">Bluesky @julianhyde.bsky.social</a>
or Twitter:</p>

<div data_dnt="true">
<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet" data-cards="hidden"><p lang="en" dir="ltr">Be honest, did you ever find a real-world use for SQL&#39;s &quot;INTERSECT ALL&quot; operator? Now we did! This post explains how you can use <a href="https://twitter.com/morel_lang?ref_src=twsrc%5Etfw">@morel_lang</a>&#39;s &quot;intersect&quot; to compute GCD (greatest common divisor). <a href="https://t.co/bulN6RHr96">https://t.co/bulN6RHr96</a></p>&mdash; Julian Hyde (@julianhyde) <a href="https://twitter.com/julianhyde/status/1930104410375630901?ref_src=twsrc%5Etfw">June 4, 2025</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>
</div>

<p>This article
<a href="https://github.com/julianhyde/share/commits/main/blog/_posts/2025-06-03-intersect-fractions.md">has been updated</a>.</p>]]></content><author><name>Julian Hyde</name></author><summary type="html"><![CDATA[SQL’s INTERSECT ALL and EXCEPT ALL operators rarely get attention, but they elegantly solve a classic math problem. The problem is computing the greatest common divisor (GCD) and least common multiple (LCM) of two integers, using the prime factors of those integers. In this post we show how to do this using intersect and except, Morel’s equivalent of INTERSECT ALL and EXCEPT ALL.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" /><media:content medium="image" url="http://0.0.0.0:4000/draft-blog3/assets/img/OldDesignShop_MushroomSpringMorel-240x240.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>