Fractals in BigQuery

Doing recursive queries without the option of using the RECURSIVE clause

I’ve been working a lot with cloud databases lately. They say the best way to get good at a language is to learn by doing. The obvious problem is that if you don’t have access to a gigantic database of real-world data, toy databases will only get you so far. That’s why for this Bigquery tutorial, I was inspired by Michael Malis’ presentation on SQL Fractals. I figured it would be interesting to try this out with this in BigQuery. You can follow the code below either in the terminal or using the Google BigQuery Console.

The Fractals

Michael Malis’ SQL fractals made use of an SQL technique called recursive CTEs. Recursive CTEs are normally used to identify the hierarchies of data, such as an organizational structure, employee-manager, bill-of-materials, and document hierarchy. In this case, they provide a way to express fractal iteration in the queries.

With some work, Michael was able map fractals of two types into SQL queries:

  • Escape-time fractals: A class of fractals based on repeatedly evaluating a formula at every point in a grid. In the original SQL formulation, use one recursive CTE to iterate through every point in a grid. I then use a second recursive CTE to repeatedly evaluate a function over each point to test whether or not the point is part of the fractal.
  • Self-similar fractals: A class of fractals that can be described by what’s known as an ”L-system”:. An L-system encodes a fractal as an initial string and rules for modifying that string. It is then possible to decode the fractal from the string. I wrote a SQL query that takes an arbitrary L-system. It runs a few iterations of the L-system and then maps the resulting string back to the fractal. Mapping the L-system to the final fractal is a bit tedious, but not too hard.

This all seems pretty straightforward, but we have a problem.

While many relational databases (e.g., Teradata, Oracle, etc.) support recursive queries, Google BigQuery does not support features like Recursive CTE or VIEWS. By extention, Google BigQuery does not support either WITH RECURSIVE clause or using the RECURSIVE clause in a CREATE VIEW statement.

Without these features, we’re going to need to get creative with some alternatives to RECURSIVE.

Sierpinski’s Triangle.

Michael Malis’ original version

The Sierpinski’s Triangle is a fractal that will be familiar to most people. It’s probably the first fractal you learned to make on your TI-84 calculator, and has been the subject of far too many “Illuminati Confirmed” memes (give it a rest people, that meme’s getting old).

This falls into the category of escape-time fractals. The BigQuery query for our fractal is below, which you can ompare to the original:

WITH cells AS (
  SELECT r, c
  FROM UNNEST(GENERATE_ARRAY(0, 63)) AS r
  CROSS JOIN UNNEST(GENERATE_ARRAY(0, 63)) AS c
),
marked_points AS (
  SELECT r,
         c,
         CASE WHEN r & c = 0
              THEN '**'
              ELSE '  ' END AS marker
  FROM cells
),
combined_rows AS (
  SELECT string_agg(marker, '') AS row_text
  FROM marked_points
  GROUP BY r
  ORDER BY r DESC
)
SELECT string_agg(row_text, '\n') FROM combined_rows;

…which outputs

**                                                                                                                              
****                                                                                                                            
**  **                                                                                                                          
********                                                                                                                        
**      **                                                                                                                      
****    ****                                                                                                                    
**  **  **  **                                                                                                                  
****************                                                                                                                
**              **                                                                                                              
****            ****                                                                                                            
**  **          **  **                                                                                                          
********        ********                                                                                                        
**      **      **      **                                                                                                      
****    ****    ****    ****                                                                                                    
**  **  **  **  **  **  **  **                                                                                                  
********************************                                                                                                
**                              **                                                                                              
****                            ****                                                                                            
**  **                          **  **                                                                                          
********                        ********                                                                                        
**      **                      **      **                                                                                      
****    ****                    ****    ****                                                                                    
**  **  **  **                  **  **  **  **                                                                                  
****************                ****************                                                                                
**              **              **              **                                                                              
****            ****            ****            ****                                                                            
**  **          **  **          **  **          **  **                                                                          
********        ********        ********        ********                                                                        
**      **      **      **      **      **      **      **                                                                      
****    ****    ****    ****    ****    ****    ****    ****                                                                    
**  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **                                                                  
****************************************************************                                                                
**                                                              **                                                              
****                                                            ****                                                            
**  **                                                          **  **                                                          
********                                                        ********                                                        
**      **                                                      **      **                                                      
****    ****                                                    ****    ****                                                    
**  **  **  **                                                  **  **  **  **                                                  
****************                                                ****************                                                
**              **                                              **              **                                              
****            ****                                            ****            ****                                            
**  **          **  **                                          **  **          **  **                                          
********        ********                                        ********        ********                                        
**      **      **      **                                      **      **      **      **                                      
****    ****    ****    ****                                    ****    ****    ****    ****                                    
**  **  **  **  **  **  **  **                                  **  **  **  **  **  **  **  **                                  
********************************                                ********************************                                
**                              **                              **                              **                              
****                            ****                            ****                            ****                            
**  **                          **  **                          **  **                          **  **                          
********                        ********                        ********                        ********                        
**      **                      **      **                      **      **                      **      **                      
****    ****                    ****    ****                    ****    ****                    ****    ****                    
**  **  **  **                  **  **  **  **                  **  **  **  **                  **  **  **  **                  
****************                ****************                ****************                ****************                
**              **              **              **              **              **              **              **              
****            ****            ****            ****            ****            ****            ****            ****            
**  **          **  **          **  **          **  **          **  **          **  **          **  **          **  **          
********        ********        ********        ********        ********        ********        ********        ********        
**      **      **      **      **      **      **      **      **      **      **      **      **      **      **      **      
****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    ****    
**  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  
********************************************************************************************************************************

Mandelbrot Set

Michael Malis’ original version

The Mandelbrot set is the set of complex numbers cc for which the function fc(z)=z2+cf_{c}(z)=z^{2}+c does not diverge when iterated from z=0z=0, i.e., for which the sequence fc(0)f_{c}(0), fc(fc(0))f_{c}(f_{c}(0)), etc., remains bounded in absolute value.

Once again, this falls into the category of escape-time fractals. However this time, we need to find a more clever way of doing without the RECURSIVE functionality we’d rely on with an SQL language like Postgres. The BigQuery query for our fractal is below, which you can ompare to the original:

WITH RECURSIVE points AS (
  SELECT r, c FROM UNNEST(GENERATE_ARRAY(-2, 2, 0.05)) AS r
  CROSS JOIN UNNEST(GENERATE_ARRAY(-2, 1, 0.05)) AS c
  ORDER BY r DESC, c ASC
),
iterations AS (
   SELECT r, c, CAST(0.0 as FLOAT64) AS zr, CAST(0.0 as FLOAT64) AS zc, 0 AS iteration FROM points
   UNION ALL
   SELECT r, c, zr*zr - zc*zc + c AS zr, 2*zr*zc + r AS zc, iteration+1 AS iteration
   FROM iterations WHERE zr*zr + zc*zc < 4 AND iteration < 1000
),
final_iteration AS (
  SELECT * FROM iterations WHERE iteration = 1000
),
marked_points AS (
   SELECT r, c, (CASE WHEN EXISTS (SELECT 1 FROM final_iteration i WHERE p.r = i.r AND p.c = i.c)
                  THEN '**'
                  ELSE '  '
                  END) AS marker
   FROM points p
   ORDER BY r DESC, c ASC
),
lines AS (
   SELECT r, string_agg(marker, '') AS r_text
   FROM marked_points
   GROUP BY r
   ORDER BY r DESC
)
SELECT string_agg(r_text, '\n') FROM lines;

…which outputs

** 

Julia Set Fractal

Michael Malis’ original version

The Julia Set fractal.

Now we’re getting into the territory of self-similar fractals, which will be a bit tricker than the escape-time fractals. Once again we need to find a way of doing without the RECURSIVE functionality we’d rely on with an SQL language like Postgres. The BigQuery query for our fractal is below, which you can ompare to the original:

WITH RECURSIVE points AS (
  SELECT ROW, col FROM UNNEST(GENERATE_ARRAY(-2, 2, 0.05)) AS a(ROW)
  CROSS JOIN UNNEST(GENERATE_ARRAY(-2, 2, 0.05)) AS b(col)
  ORDER BY ROW DESC, col ASC
),
iterations AS (
   SELECT ROW, col, col::float AS zr, row::float AS zc, 0 AS iteration FROM points
   UNION ALL
   SELECT ROW, col, zr*zr - zc*zc + 1 - 1.61803398875 AS zr, 2*zr*zc AS zc, iteration+1 AS iteration
   FROM iterations WHERE zr*zr + zc*zc < 4 AND iteration < 1000
),
final_iteration AS (
  SELECT * FROM iterations WHERE iteration = 1000
),
marked_points AS (
  SELECT ROW, col, (CASE WHEN EXISTS (SELECT 1 FROM final_iteration i WHERE p.ROW = i.ROW AND p.col = i.col)
                  THEN '**'
                  ELSE '  '
                  END) AS marker
  FROM points p
  ORDER BY ROW DESC, col ASC
),
ROWS AS (
   SELECT ROW, string_agg(marker, '') AS row_text
   FROM marked_points
   GROUP BY row
   ORDER BY ROW DESC
)
SELECT string_agg(row_text, '\n') FROM ROWS;

…which outputs

** 

Hilbert Curve

Michael Malis’ original version

The Hilbert Curve is a continuous fractal space-filling curve first described by the German mathematician David Hilbert in 1891.

As a self-similar fractal, the learning curve is beginning to steepen. Once again we need to find a way of doing without the RECURSIVE functionality we’d rely on with an SQL language like Postgres. The BigQuery query for our fractal is below, which you can ompare to the original:

WITH RECURSIVE iterations AS (
  SELECT 'A' AS PATH, 0 AS iteration
  UNION ALL
  SELECT replace(replace(replace(PATH, 'A', '-CF+AFA+FC-'), 'B', '+AF-BFB-FA+'), 'C', 'B'), iteration+1 AS iteration
  FROM iterations WHERE iteration < 3
),
segments AS (
  SELECT 0 AS r1, 0 AS c1, 0 AS r2, 0 AS c2, 0 AS r3, 0 AS c3, 0 AS dr, 1 AS dc, (SELECT path FROM iterations ORDER BY iteration DESC LIMIT 1) AS path_left
  UNION ALL
  SELECT r3 AS r1, c3 AS c1, r3 + dr * movement AS r2, c3 + dc * movement AS c2, r3 + 2 * dr * movement AS r3, c3 + 2 * dc * movement AS c3, dr, dc, SUBSTRING(path_left FROM 2) AS path_left
  FROM (
    SELECT r1, c1, r2, c2, r3, c3,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN -dc
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN dc
           ELSE dr
      END AS dr,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN dr
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN -dr
           ELSE dc
           END AS dc,
           path_left,
      CASE WHEN SUBSTRING(path_left FOR 1) = 'F' THEN 1 ELSE 0 END AS movement
  FROM segments
  WHERE CHAR_LENGTH(path_left) > 0
  ) sub
),
end_points AS (SELECT r1 AS r, c1 AS c FROM segments UNION SELECT r3, c3 FROM segments),
points AS (
  SELECT r, c FROM UNNEST(GENERATE_ARRAY((SELECT MIN(r) FROM end_points), (SELECT MAX(r) FROM end_points))) AS a(r)
  CROSS JOIN UNNEST(GENERATE_ARRAY((SELECT MIN(c) FROM end_points), (SELECT MAX(c) FROM end_points))) AS b(c)
),
marked_points AS (
  SELECT r, c, (CASE WHEN
    EXISTS (SELECT 1 FROM end_points e WHERE p.r = e.r AND p.c = e.c)
    THEN '*'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dc != 0)
    THEN '-'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dr != 0)
    THEN '|'

    ELSE ' '
    END
    ) AS marker
  FROM points p
),
lines AS (
   SELECT r, string_agg(marker, '') AS row_text
   FROM marked_points
   GROUP BY r
   ORDER BY r DESC
)
SELECT string_agg(row_text, E'\n') FROM lines;

…which outputs

** 

Compare to the original below:

Dragon Curve

The Dragon Curve

If there was a ‘hard mode’ for the self-similar fractal queries, this would be it. Once again we need to find a way of doing without the RECURSIVE functionality we’d rely on with an SQL language like Postgres. The BigQuery query for our fractal is below, which you can compare to the original:

WITH RECURSIVE iterations AS (
  SELECT 'FX' AS PATH, 0 AS iteration
  UNION ALL
  SELECT replace(replace(replace(PATH, 'X', 'X+ZF+'), 'Y', '-FX-Y'), 'Z', 'Y'), iteration+1 AS iteration
  FROM iterations WHERE iteration < 8
),
segments AS (
  SELECT 0 AS r1, 0 AS c1, 0 AS r2, 0 AS c2, 0 AS r3, 0 AS c3, 0 AS dr, 1 AS dc, (SELECT path FROM iterations ORDER BY iteration DESC LIMIT 1) AS path_left
  UNION ALL
  SELECT r3 AS r1, c3 AS c1, r3 + dr * movement AS r2, c3 + dc * movement AS c2, r3 + 2 * dr * movement AS r3, c3 + 2 * dc * movement AS c3, dr, dc, SUBSTRING(path_left FROM 2) AS path_left
  FROM (
    SELECT r1, c1, r2, c2, r3, c3,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN -dc
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN dc
           ELSE dr
      END AS dr,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN dr
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN -dr
           ELSE dc
           END AS dc,
           path_left,
      CASE WHEN SUBSTRING(path_left FOR 1) = 'F' THEN 1 ELSE 0 END AS movement
  FROM segments
  WHERE CHAR_LENGTH(path_left) > 0
  ) sub
),
end_points AS (
  SELECT r1 AS r, c1 AS c FROM segments UNION SELECT r3, c3 FROM segments
),
points AS (
  SELECT r, c FROM UNNEST(GENERATE_ARRAY((SELECT MIN(r) FROM end_points), (SELECT MAX(r) FROM end_points))) AS a(r)
  CROSS JOIN UNNEST(GENERATE_ARRAY((SELECT MIN(c) FROM end_points), (SELECT MAX(c) FROM end_points))) AS b(c)
),
marked_points AS (
SELECT r, c, (CASE WHEN
    EXISTS (SELECT 1 FROM end_points e WHERE p.r = e.r AND p.c = e.c)
    THEN '*'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dc != 0)
    THEN '-'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dr != 0)
    THEN '|'

    ELSE ' '
    END
    ) AS marker
FROM points p
),
lines AS (
   SELECT r, string_agg(marker, '') AS row_text
   FROM marked_points
   GROUP BY r
   ORDER BY r DESC
)
SELECT string_agg(row_text, '\n') FROM lines;

…which outputs

** 

Board

Our final self-similar fractal. Once again we need to find a way of doing without the RECURSIVE functionality we’d rely on with an SQL language like Postgres. The BigQuery query for our fractal is below, which you can ompare to the original:

WITH RECURSIVE iterations AS (
  SELECT 'F+F+F+F' AS PATH, 0 AS iteration
  UNION ALL
  SELECT replace(replace(replace(PATH, 'F', 'FF+F+F+F+FF'), 'R', '+LF-RFR-FL+'), 'Z', 'R'), iteration+1 AS iteration
  FROM iterations WHERE iteration < 3
),
segments AS (
  SELECT 0 AS r1, 0 AS c1, 0 AS r2, 0 AS c2, 0 AS r3, 0 AS c3, 0 AS dr, 1 AS dc, (SELECT path FROM iterations ORDER BY iteration DESC LIMIT 1) AS path_left
  UNION ALL
  SELECT r3 AS r1, c3 AS c1, r3 + dr * movement AS r2, c3 + dc * movement AS c2, r3 + 2 * dr * movement AS r3, c3 + 2 * dc * movement AS c3, dr, dc, SUBSTRING(path_left FROM 2) AS path_left
  FROM (
    SELECT r1, c1, r2, c2, r3, c3,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN -dc
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN dc
           ELSE dr
      END AS dr,
      CASE WHEN SUBSTRING(path_left FOR 1) = '-' THEN dr
           WHEN SUBSTRING(path_left FOR 1) = '+' THEN -dr
           ELSE dc
           END AS dc,
           path_left,
      CASE WHEN SUBSTRING(path_left FOR 1) = 'F' THEN 1 ELSE 0 END AS movement
  FROM segments
  WHERE CHAR_LENGTH(path_left) > 0
  ) sub
),
end_points AS (
  SELECT r1 AS r, c1 AS c FROM segments UNION SELECT r3, c3 FROM segments
),
points AS (
  SELECT r, c FROM UNNEST(GENERATE_ARRAY((SELECT MIN(r) FROM end_points), (SELECT MAX(r) FROM end_points))) AS a(r)
  CROSS JOIN UNNEST(GENERATE_ARRAY((SELECT MIN(c) FROM end_points), (SELECT MAX(c) FROM end_points))) AS b(c)
),
marked_points AS (
  SELECT r, c, (CASE WHEN
    EXISTS (SELECT 1 FROM end_points e WHERE p.r = e.r AND p.c = e.c)
    THEN '*'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dc != 0)
    THEN '-'

    WHEN EXISTS (SELECT 1 FROM segments s WHERE p.r = s.r2 AND p.c = s.c2 AND dr != 0)
    THEN '|'

    ELSE ' '
    END
    ) AS marker
  FROM points p
),
lines AS (
   SELECT r, string_agg(marker, '') AS row_text
   FROM marked_points
   GROUP BY r
   ORDER BY r DESC
)
SELECT string_agg(row_text, '\n') FROM lines;

…which outputs

** 

End Notes

This is a fun trick and all. Of course, when it comes to interesting code snippets there’s always a bigger fish. I’d recommend checking out Fabien’s implementation of a Turing Machine built with SQL queries.


Cited as:

@article{mcateer2019fracpost,
    title = "Fractals in BigQuery",
    author = "McAteer, Matthew",
    journal = "matthewmcateer.me",
    year = "2019",
    url = "https://matthewmcateer.me/blog/fractals-in-bigquery/"
}

If you notice mistakes and errors in this post, don’t hesitate to contact me at [contact at matthewmcateer dot me] and I would be very happy to correct them right away!

See you in the next post 😄

I write about AI, Biotech, and a bunch of other topics. Subscribe to get new posts by email!


This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

At least this isn't a full-screen popup

That'd be more annoying. Anyways, subscribe to my newsletter to get new posts by email! I write about AI, Biotech, and a bunch of other topics.


This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.