O'Reilly Databases

oreilly.comSafari Books Online.Conferences.

We've expanded our coverage and improved our search! Search for all things Database across O'Reilly!

Search Search Tips

advertisement
AddThis Social Bookmark Button

Print Subscribe to Databases Subscribe to Newsletters

Batch Updates with PL/pgSQL

by David Wheeler
09/07/2006

The previous article in this series, Practical PL/pgSQL: Managing Ordered Sets, created four functions to simplify the management of ordered collections as many-to-many relationships. The two more complex functions, entry_coll_tag_set() and entry_coll_tag_add(), take an iterative approach to managing those relationships. By iterative, I mean that they use loops to iterate over an array of IDs in order to do the right thing for each.

The downside to this approach is that the performance of those functions is directly proportional to the number of IDs in the array (Ο(n)). It would be ideal to make the runtime of the functions constant, regardless of the number of IDs in the array (Ο(1)).

Fortunately, there is a way to do just that in PostgreSQL. Before then however, think back to the Fibonacci examples from the first article in this series, Introduction to PostgreSQL PL/pgSQL. Returning to those examples, I'll introduce some new concepts in a simpler format than the collection functions allow.

Set-Returning Functions

As I mentioned in the first article, PostgreSQL functions can return a value of any supported data type. I didn't mention that they can also return sets of a particular type. A set is a list of values of a particular type, but rather than returning those values as a list or an array (as you might expect in a dynamic programming language), PostgreSQL functions return them as rows of data.

Suppose that you need to get a list of Fibonacci numbers up to a particular place in the Fibonacci sequence. Writing such a function in PL/pgSQL is as simple as modifying the fib_fast() function to return each Fibonacci number as it's calculated. It does so using the PL/pgSQL RETURN NEXT statement. Here are fib_fast() and the new set-returning function fibs_to():


1  CREATE OR REPLACE FUNCTION fib_fast(
2      fib_for integer
3  ) RETURNS integer AS $$
4  DECLARE
5      ret integer := 0;
6      nxt integer := 1;
7      tmp integer;
8  BEGIN
9      FOR num IN 1..fib_for LOOP
10
11         tmp := ret;
12         ret := nxt;
13         nxt := tmp + nxt;
14     END LOOP;
15
16     RETURN ret;
17 END;
18 $$ LANGUAGE plpgsql;

1  CREATE OR REPLACE FUNCTION fibs_to(
2      max_num integer
3  ) RETURNS SETOF integer AS $$
4  DECLARE
5      ret integer := 0;
6      nxt integer := 1;
7      tmp integer;
8  BEGIN
9      FOR num IN 1..max_num LOOP
10         RETURN NEXT ret;
11         tmp := ret;
12         ret := nxt;
13         nxt := tmp + nxt;
14     END LOOP;
15
16    RETURN NEXT ret;
17 END;
18 $$ LANGUAGE plpgsql;

There are really only three differences aside from the function names, and I've emphasized them in fibs_to(). The first difference is on line three, where the fibs_to() declaration indicates that it returns a SETOF integer instead of simply an integer. The SETOF keyword tells PostgreSQL that this function returns a set.

The other differences are that, rather than simply returning the value of the ret integer variable, fibs_to() uses the RETURN NEXT statement to return each Fibonacci number after its calculation in the loop. The final RETURN NEXT statement returns the final Fibonacci number in the sequence.

Those are the only changes necessary to create a set-returning function. As such a function, fibs_to() must also be called in a different context. While you can call fib_fast() in a SELECT statement:

try=% select fib_fast(8);
 fib_fast 
----------
       21
(1 row)

fibs_to() essentially behaves like a table, and you must treat it as such by using it in a FROM clause:

try=% select * from fibs_to(8);
 fibs_to 
---------
       0
       1
       1
       2
       3
       5
       8
      13
      21
(9 rows)

Be warned, however, that while it looks like fibs_to() and behaves like a continuation, (and, for most practical purposes, is treatable like a continuation), PostgreSQL actually buffers all of the values returned by RETURN NEXT and only returns them to the calling context after the function has calculated them all. That means that if you write a set-returning function that returns a lot of values, you need to make sure that your server's memory can handle it.

That caveat aside, set-returning functions can be extremely useful.

Pages: 1, 2, 3, 4, 5

Next Pagearrow




Tagged Articles

Be the first to post this article to del.icio.us

Related to this Article

Data Jujitsu: The Art of Turning Data into Product Data Jujitsu: The Art of Turning Data into Product
November 2012
$0.00 USD

Designing Great Data Products Designing Great Data Products
March 2012
$0.00 USD

Sponsored Resources

  • Inside Lightroom
Advertisement
O'reilly

© 2013, O’Reilly Media, Inc.

(707) 827-7019 (800) 889-8969

All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.

About O'Reilly

  • Academic Solutions
  • Jobs
  • Contacts
  • Corporate Information
  • Press Room
  • Privacy Policy
  • Terms of Service
  • Writing for O'Reilly

Community

  • Authors
  • Community & Featured Users
  • Forums
  • Membership
  • Newsletters
  • O'Reilly Answers
  • RSS Feeds
  • User Groups

Partner Sites

  • makezine.com
  • makerfaire.com
  • craftzine.com
  • igniteshow.com
  • PayPal Developer Zone
  • O'Reilly Insights on Forbes.com

Shop O'Reilly

  • Customer Service
  • Contact Us
  • Shipping Information
  • Ordering & Payment
  • The O'Reilly Guarantee