23.2. Description

23.2.1. Structure of PL/pgSQL

PL/pgSQL is a block structured language. All keywords and identifiers can be used in mixed upper and lower-case. A block is defined as:

[<<label>>]
[DECLARE
    declarations]
BEGIN
    statements
END;

There can be any number of sub-blocks in the statement section of a block. Sub-blocks can be used to hide variables from outside a block of statements.

The variables declared in the declarations section preceding a block are initialized to their default values every time the block is entered, not only once per function call. For example:

CREATE FUNCTION somefunc() RETURNS INTEGER AS '
DECLARE
   quantity INTEGER := 30;
BEGIN
   RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 30
   quantity := 50;
   --
   -- Create a sub-block
   --
   DECLARE
      quantity INTEGER := 80;
   BEGIN
      RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 80
   END;

   RAISE NOTICE ''Quantity here is %'',quantity;  -- Quantity here is 50
END;
' LANGUAGE 'plpgsql';

It is important not to confuse the use of BEGIN/END for grouping statements in PL/pgSQL with the database commands for transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do not start or end a transaction. Functions and trigger procedures are always executed within a transaction established by an outer query --- they cannot start or commit transactions, since Postgres does not have nested transactions.

23.2.2. Comments

There are two types of comments in PL/pgSQL. A double dash -- starts a comment that extends to the end of the line. A /* starts a block comment that extends to the next occurrence of */. Block comments cannot be nested, but double dash comments can be enclosed into a block comment and a double dash can hide the block comment delimiters /* and */.

23.2.3. Variables and Constants

All variables, rows and records used in a block or its sub-blocks must be declared in the declarations section of a block. The exception being the loop variable of a FOR loop iterating over a range of integer values.

PL/pgSQL variables can have any SQL datatype, such as INTEGER, VARCHAR and CHAR. All variables have as default value the SQL NULL value.

Here are some examples of variable declarations:

user_id INTEGER;
quantity NUMBER(5);
url VARCHAR;

23.2.3.1. Constants and Variables With Default Values

The declarations have the following syntax:

name [ CONSTANT ] type [ NOT NULL ] [ { DEFAULT | := } value ];

The value of variables declared as CONSTANT cannot be changed. If NOT NULL is specified, an assignment of a NULL value results in a runtime error. Since the default value of all variables is the SQL NULL value, all variables declared as NOT NULL must also have a default value specified.

The default value is evaluated every time the function is called. So assigning 'now' to a variable of type timestamp causes the variable to have the time of the actual function call, not when the function was precompiled into its bytecode.

Examples:

quantity INTEGER := 32;
url varchar := ''http://mysite.com'';
user_id CONSTANT INTEGER := 10;

23.2.3.2. Parameters Passed to Functions

Parameters passed to functions are named with the identifiers $1, $2, etc. Optionally, aliases can be declared for the $n parameter names for increased readability. Some examples:

CREATE FUNCTION sales_tax(REAL) RETURNS REAL AS '
DECLARE
    subtotal ALIAS FOR $1;
BEGIN
    return subtotal * 0.06;
END;
' LANGUAGE 'plpgsql';


CREATE FUNCTION instr(VARCHAR,INTEGER) RETURNS INTEGER AS '
DECLARE
    v_string ALIAS FOR $1;
    index ALIAS FOR $2;
BEGIN
    -- Some computations here
END;
' LANGUAGE 'plpgsql';

23.2.3.3. Attributes

Using the %TYPE and %ROWTYPE attributes, you can declare variables with the same datatype or structure of another database item (e.g: a table field).

variable%TYPE

%TYPE provides the datatype of a variable or database column. You can use this to declare variables that will hold database values. For example, let's say you have a column named user_id in your users table. To declare a variable with the same datatype as users.user_id you write:

user_id   users.user_id%TYPE;

By using %TYPE you don't need to know the datatype of the structure you are referencing, and most important, if the datatype of the referenced item changes in the future (e.g: you change your table definition of user_id to become a REAL), you won't need to change your function definition.

table%ROWTYPE

%ROWTYPE provides the composite datatype corresponding to a whole row of the specified table. table must be an existing table or view name of the database. The fields of the row are accessed in the dot notation. Parameters to a function can be composite types (complete table rows). In that case, the corresponding identifier $n will be a rowtype, and fields can be selected from it, for example $1.user_id.

Only the user-defined attributes of a table row are accessible in a rowtype variable, not OID or other system attributes (because the row could be from a view). The fields of the rowtype inherit the table's field sizes or precision for char() etc. data types.

DECLARE
    users_rec users%ROWTYPE;
    user_id users.user_id%TYPE;
BEGIN
    user_id := users_rec.user_id;
    ...

create function cs_refresh_one_mv(integer) returns integer as '
   DECLARE
        key ALIAS FOR $1;
        table_data cs_materialized_views%ROWTYPE;
   BEGIN
        SELECT INTO table_data * FROM cs_materialized_views
               WHERE sort_key=key;

        IF NOT FOUND THEN
           RAISE EXCEPTION ''View '' || key || '' not found'';
           RETURN 0;
        END IF;

        -- The mv_name column of cs_materialized_views stores view
        -- names.
 
        TRUNCATE TABLE table_data.mv_name;
        INSERT INTO table_data.mv_name || '' '' || table_data.mv_query;

        return 1;
end;
' LANGUAGE 'plpgsql';

23.2.3.4. RENAME

Using RENAME you can change the name of a variable, record or row. This is useful if NEW or OLD should be referenced by another name inside a trigger procedure.

Syntax and examples:

RENAME oldname TO newname;

RENAME id TO user_id;
RENAME this_var TO that_var;

23.2.4. Expressions

All expressions used in PL/pgSQL statements are processed using the backend's executor. Expressions that appear to contain constants may in fact require run-time evaluation (e.g. 'now' for the timestamp type) so it is impossible for the PL/pgSQL parser to identify real constant values other than the NULL keyword. All expressions are evaluated internally by executing a query

SELECT expression
using the SPI manager. In the expression, occurrences of variable identifiers are substituted by parameters and the actual values from the variables are passed to the executor in the parameter array. All expressions used in a PL/pgSQL function are only prepared and saved once. The only exception to this rule is an EXECUTE statement if parsing of a query is needed each time it is encountered.

The type checking done by the Postgres main parser has some side effects to the interpretation of constant values. In detail there is a difference between what these two functions do:

CREATE FUNCTION logfunc1 (text) RETURNS timestamp AS '
    DECLARE
        logtxt ALIAS FOR $1;
    BEGIN
        INSERT INTO logtable VALUES (logtxt, ''now'');
        RETURN ''now'';
    END;
' LANGUAGE 'plpgsql';
and
CREATE FUNCTION logfunc2 (text) RETURNS timestamp AS '
    DECLARE
        logtxt ALIAS FOR $1;
        curtime timestamp;
    BEGIN
        curtime := ''now'';
        INSERT INTO logtable VALUES (logtxt, curtime);
        RETURN curtime;
    END;
' LANGUAGE 'plpgsql';
In the case of logfunc1(), the Postgres main parser knows when preparing the plan for the INSERT, that the string 'now' should be interpreted as timestamp because the target field of logtable is of that type. Thus, it will make a constant from it at this time and this constant value is then used in all invocations of logfunc1() during the lifetime of the backend. Needless to say that this isn't what the programmer wanted.

In the case of logfunc2(), the Postgres main parser does not know what type 'now' should become and therefore it returns a data type of text containing the string 'now'. During the assignment to the local variable curtime, the PL/pgSQL interpreter casts this string to the timestamp type by calling the text_out() and timestamp_in() functions for the conversion.

This type checking done by the Postgres main parser got implemented after PL/pgSQL was nearly done. It is a difference between 6.3 and 6.4 and affects all functions using the prepared plan feature of the SPI manager. Using a local variable in the above manner is currently the only way in PL/pgSQL to get those values interpreted correctly.

If record fields are used in expressions or statements, the data types of fields should not change between calls of one and the same expression. Keep this in mind when writing trigger procedures that handle events for more than one table.

23.2.5. Statements

Anything not understood by the PL/pgSQL parser as specified below will be put into a query and sent down to the database engine to execute. The resulting query should not return any data.

23.2.5.1. Assignment

An assignment of a value to a variable or row/record field is written as:

identifier := expression;
If the expressions result data type doesn't match the variables data type, or the variable has a size/precision that is known (as for char(20)), the result value will be implicitly casted by the PL/pgSQL bytecode interpreter using the result types output- and the variables type input-functions. Note that this could potentially result in runtime errors generated by the types input functions.

user_id := 20;
tax := subtotal * 0.06;

23.2.5.2. Calling another function

All functions defined in a Postgres database return a value. Thus, the normal way to call a function is to execute a SELECT query or doing an assignment (resulting in a PL/pgSQL internal SELECT).

But there are cases where someone is not interested in the function's result. In these cases, use the PERFORM statement.

PERFORM query
This executes a SELECT query over the SPI manager and discards the result. Identifiers like local variables are still substituted into parameters.

PERFORM create_mv(''cs_session_page_requests_mv'',''
     select   session_id, page_id, count(*) as n_hits,
              sum(dwell_time) as dwell_time, count(dwell_time) as dwell_count
     from     cs_fact_table
     group by session_id, page_id '');

23.2.5.3. Executing dynamic queries

Often times you will want to generate dynamic queries inside your PL/pgSQL functions. Or you have functions that will generate other functions. PL/pgSQL provides the EXECUTE statement for these occasions.

EXECUTE query-string
where query-string is a string of type text containing the query to be executed.

When working with dynamic queries you will have to face escaping of single quotes in PL/pgSQL. Please refer to the table available at the "Porting from Oracle PL/SQL" chapter for a detailed explanation that will save you some effort.

Unlike all other queries in PL/pgSQL, a query run by an EXECUTE statement is not prepared and saved just once during the life of the server. Instead, the query is prepared each time the statement is run. The query-string can be dynamically created within the procedure to perform actions on variable tables and fields.

The results from SELECT queries are discarded by EXECUTE, and SELECT INTO is not currently supported within EXECUTE. So, the only way to extract a result from a dynamically-created SELECT is to use the FOR ... EXECUTE form described later.

An example:

EXECUTE ''UPDATE tbl SET ''
        || quote_ident(fieldname)
        || '' = ''
        || quote_literal(newvalue)
        || '' WHERE ...'';

This example shows use of the functions quote_ident(TEXT) and quote_literal(TEXT). Variables containing field and table identifiers should be passed to function quote_ident(). Variables containing literal elements of the dynamic query string should be passed to quote_literal(). Both take the appropriate steps to return the input text enclosed in single or double quotes and with any embedded special characters.

Here is a much larger example of a dynamic query and EXECUTE:

CREATE FUNCTION cs_update_referrer_type_proc() RETURNS INTEGER AS '
DECLARE
    referrer_keys RECORD;  -- Declare a generic record to be used in a FOR
    a_output varchar(4000);
BEGIN 
    a_output := ''CREATE FUNCTION cs_find_referrer_type(varchar,varchar,varchar) 
                  RETURNS varchar AS '''' 
                     DECLARE 
                         v_host ALIAS FOR $1; 
                         v_domain ALIAS FOR $2; 
                         v_url ALIAS FOR $3; ''; 

    -- 
    -- Notice how we scan through the results of a query in a FOR loop
    -- using the FOR <record> construct.
    --

    FOR referrer_keys IN select * from cs_referrer_keys order by try_order LOOP
        a_output := a_output || '' if v_'' || referrer_keys.kind || '' like '''''''''' 
                 || referrer_keys.key_string || '''''''''' then return '''''' 
                 || referrer_keys.referrer_type || ''''''; end if;''; 
    END LOOP; 
  
    a_output := a_output || '' return null; end; '''' language ''''plpgsql'''';''; 
 
    -- This works because we are not substituting any variables
    -- Otherwise it would fail. Look at PERFORM for another way to run functions
    
    EXECUTE a_output; 
end; 
' LANGUAGE 'plpgsql';

23.2.5.4. Obtaining other results status

GET DIAGNOSTICS variable = item [ , ... ]
This command allows retrieval of system status indicators. Each item is a keyword identifying a state value to be assigned to the specified variable (which should be of the right datatype to receive it). The currently available status items are ROW_COUNT, the number of rows processed by the last SQL query sent down to the SQL engine; and RESULT_OID, the Oid of the last row inserted by the most recent SQL query. Note that RESULT_OID is only useful after an INSERT query.

23.2.5.5. Returning from a function

RETURN expression
The function terminates and the value of expression will be returned to the upper executor. The return value of a function cannot be undefined. If control reaches the end of the top-level block of the function without hitting a RETURN statement, a runtime error will occur.

The expressions result will be automatically casted into the function's return type as described for assignments.

23.2.6. Control Structures

Control structures are probably the most useful (and important) part of PL/SQL. With PL/pgSQL's control structures, you can manipulate PostgreSQL data in a very flexible and powerful way.

23.2.6.1. Conditional Control: IF statements

IF statements let you take action according to certain conditions. PL/pgSQL has three forms of IF: IF-THEN, IF-THEN-ELSE, IF-THEN-ELSE IF. NOTE: All PL/pgSQL IF statements need a corresponding END IF statement. In ELSE-IF statements you need two: one for the first IF and one for the second (ELSE IF).

IF-THEN

IF-THEN statements is the simplest form of an IF. The statements between THEN and END IF will be executed if the condition is true. Otherwise, the statements following END IF will be executed.

IF v_user_id <> 0 THEN
    UPDATE users SET email = v_email WHERE user_id = v_user_id;
END IF;

IF-THEN-ELSE

IF-THEN-ELSE statements adds to IF-THEN by letting you specify the statements that should be executed if the condition evaluates to FALSE.

IF parentid IS NULL or parentid = ''''
THEN 
    return fullname;
ELSE
    return hp_true_filename(parentid) || ''/'' || fullname;
END IF;


IF v_count > 0 THEN 
    INSERT INTO users_count(count) VALUES(v_count);
    return ''t'';
ELSE 
    return ''f'';
END IF;

IF statements can be nested and in the following example:

IF demo_row.sex = ''m'' THEN
  pretty_sex := ''man'';
ELSE
  IF demo_row.sex = ''f'' THEN
    pretty_sex := ''woman'';
  END IF;
END IF;

IF-THEN-ELSE IF

When you use the "ELSE IF" statement, you are actually nesting an IF statement inside the ELSE statement. Thus you need one END IF statement for each nested IF and one for the parent IF-ELSE.

For example:

IF demo_row.sex = ''m'' THEN
   pretty_sex := ''man'';
ELSE IF demo_row.sex = ''f'' THEN
        pretty_sex := ''woman'';
     END IF;
END IF;

23.2.6.2. Iterative Control: LOOP, WHILE, FOR and EXIT

With the LOOP, WHILE, FOR and EXIT statements, you can control the flow of execution of your PL/pgSQL program iteratively.

LOOP

[<<label>>]
LOOP
    statements
END LOOP;
An unconditional loop that must be terminated explicitly by an EXIT statement. The optional label can be used by EXIT statements of nested loops to specify which level of nesting should be terminated.

EXIT

EXIT [ label ] [ WHEN expression ];
If no label is given, the innermost loop is terminated and the statement following END LOOP is executed next. If label is given, it must be the label of the current or an upper level of nested loop blocks. Then the named loop or block is terminated and control continues with the statement after the loops/blocks corresponding END.

Examples:

LOOP
    -- some computations
    IF count > 0 THEN
        EXIT;  -- exit loop
    END IF;
END LOOP;

LOOP
    -- some computations
    EXIT WHEN count > 0;
END LOOP;

BEGIN
    -- some computations
    IF stocks > 100000 THEN
        EXIT;  -- illegal. Can't use EXIT outside of a LOOP
    END IF;
END;

WHILE

With the WHILE statement, you can loop through a sequence of statements as long as the evaluation of the condition expression is true.

[<<label>>]
WHILE expression LOOP
    statements
END LOOP;
For example:
WHILE amount_owed > 0 AND gift_certificate_balance > 0 LOOP
    -- some computations here
END LOOP;

WHILE NOT boolean_expression LOOP
    -- some computations here
END LOOP;

FOR

[<<label>>]
FOR name IN [ REVERSE ] expression .. expression LOOP
    statements
END LOOP;
A loop that iterates over a range of integer values. The variable name is automatically created as type integer and exists only inside the loop. The two expressions giving the lower and upper bound of the range are evaluated only when entering the loop. The iteration step is always 1.

Some examples of FOR loops (see Section 23.2.7 for iterating over records in FOR loops):

FOR i IN 1..10 LOOP
  -- some expressions here

    RAISE NOTICE 'i is %',i;
END LOOP;

FOR i IN REVERSE 1..10 LOOP
    -- some expressions here
END LOOP;

23.2.7. Working with RECORDs

Records are similar to rowtypes, but they have no predefined structure. They are used in selections and FOR loops to hold one actual database row from a SELECT operation.

23.2.7.1. Declaration

One variables of type RECORD can be used for different selections. Accessing a record or an attempt to assign a value to a record field when there is no actual row in it results in a runtime error. They can be declared like this:

name RECORD;

23.2.7.2. Assignments

An assignment of a complete selection into a record or row can be done by:

SELECT  INTO target expressions FROM ...;
target can be a record, a row variable or a comma separated list of variables and record-/row-fields. Note that this is quite different from Postgres' normal interpretation of SELECT INTO, which is that the INTO target is a newly created table. (If you want to create a table from a SELECT result inside a PL/pgSQL function, use the equivalent syntax CREATE TABLE AS SELECT.)

If a row or a variable list is used as target, the selected values must exactly match the structure of the target(s) or a runtime error occurs. The FROM keyword can be followed by any valid qualification, grouping, sorting etc. that can be given for a SELECT statement.

Once a record or row has been assigned to a RECORD variable, you can use the "." (dot) notation to access fields in that record:

DECLARE
    users_rec RECORD;
    full_name varchar;
BEGIN
    SELECT INTO users_rec * FROM users WHERE user_id=3;

  full_name := users_rec.first_name || '' '' || users_rec.last_name;

There is a special variable named FOUND of type boolean that can be used immediately after a SELECT INTO to check if an assignment had success.

SELECT INTO myrec * FROM EMP WHERE empname = myname;
IF NOT FOUND THEN
    RAISE EXCEPTION ''employee % not found'', myname;
END IF;
You can also use the IS NULL (or ISNULL) conditionals to test for NULLity of a RECORD/ROW. If the selection returns multiple rows, only the first is moved into the target fields. All others are silently discarded.

DECLARE
    users_rec RECORD;
    full_name varchar;
BEGIN
    SELECT INTO users_rec * FROM users WHERE user_id=3;

    IF users_rec.homepage IS NULL THEN
        -- user entered no homepage, return "http://"

        return ''http://'';
    END IF;
END;

23.2.7.3. Iterating Through Records

Using a special type of FOR loop, you can iterate through the results of a query and manipulate that data accordingly. The syntax is as follow:

[<<label>>]
FOR record | row IN select_clause LOOP
    statements
END LOOP;
The record or row is assigned all the rows resulting from the select clause and the loop body executed for each. Here is an example:

create function cs_refresh_mviews () returns integer as '
DECLARE
     mviews RECORD;

     -- Instead, if you did:
     -- mviews  cs_materialized_views%ROWTYPE;
     -- this record would ONLY be usable for the cs_materialized_views table

BEGIN
     PERFORM cs_log(''Refreshing materialized views...'');

     FOR mviews IN SELECT * FROM cs_materialized_views ORDER BY sort_key LOOP

         -- Now "mviews" has one record from cs_materialized_views

         PERFORM cs_log(''Refreshing materialized view '' || mview.mv_name || ''...'');
         TRUNCATE TABLE mview.mv_name;
         INSERT INTO mview.mv_name || '' '' || mview.mv_query;
     END LOOP;

     PERFORM cs_log(''Done refreshing materialized views.'');
     return 1;
end;
' language 'plpgsql';
If the loop is terminated with an EXIT statement, the last assigned row is still accessible after the loop.

The FOR-IN EXECUTE statement is another way to iterate over records:

[<<label>>]
FOR record | row IN EXECUTE text_expression LOOP 
    statements
END LOOP;
This is like the previous form, except that the source SELECT statement is specified as a string expression, which is evaluated and re-planned on each entry to the FOR loop. This allows the programmer to choose the speed of a pre-planned query or the flexibility of a dynamic query, just as with a plain EXECUTE statement.

23.2.8. Aborting and Messages

Use the RAISE statement to throw messages into the Postgres elog mechanism.

RAISE level 'format' [, variable [...]];
Possible levels are DEBUG (write the message into the postmaster log), NOTICE (write the message into the postmaster log and forward it to the client application) and EXCEPTION (raise an error, aborting the transaction). Inside the format string, % is replaced by the next optional argument's external representation. Write %% to emit a literal %. Note that the optional arguments must presently be simple variables, not expressions, and the format must be a simple string literal.

RAISE NOTICE ''Calling cs_create_job(%)'',v_job_id;
In this example, the value of v_job_id will replace the % in the string.

RAISE EXCEPTION ''Inexistent ID --> %'',user_id;
This will abort the transaction with the given error message.

23.2.9. Exceptions

Postgres does not have a very smart exception handling model. Whenever the parser, planner/optimizer or executor decide that a statement cannot be processed any longer, the whole transaction gets aborted and the system jumps back into the main loop to get the next query from the client application.

It is possible to hook into the error mechanism to notice that this happens. But currently it is impossible to tell what really caused the abort (input/output conversion error, floating point error, parse error). And it is possible that the database backend is in an inconsistent state at this point so returning to the upper executor or issuing more commands might corrupt the whole database. And even if, at this point the information, that the transaction is aborted, is already sent to the client application, so resuming operation does not make any sense.

Thus, the only thing PL/pgSQL currently does when it encounters an abort during execution of a function or trigger procedure is to write some additional DEBUG level log messages telling in which function and where (line number and type of statement) this happened.