Blog do Marcio Portes: Pipelined Function

Wednesday, May 11, 2005

Pipelined Function

Pipelined Function é uma peça de código pl/sql para gerar registros, como se fosse uma uma tabela virtual. A grande vantagem dessa modalidade é a habilidade de devolver linhas dinamicamente a partir da geração da mesma, ou seja, quando emitimos a instrução pipe row( n ), ele envia, de volta ao cliente, antes de gerar a próxima, ao contrário da função normal ou um array (varray ou plsql_table) que constróem todos os registros em batch antes de retorná-los ao cliente.

As funções pipelined podem ser usadas sempre que surge a necessidade de usar select * from plsql_function. As vezes, a aplicação necessita balisar uma seqüência de números ou datas e tabela está incompleta, outras vezes um requerimento especial onde a linha deve ser repetida n vezes, porém n está no mesmo registro dificultando o cenário (recentemente respondi essa dúvina na lista oracle_br - muito boa lista por sinal). Enfim, teremos tantos exemplos quanto nossa criatividade permitir.

Demonstrando:

Primeiro vamos gerar uma seqüência de datas usando pipelined.


ops$marcio@ORA10G> create or replace type n as table of number;
  2  /

Type created.

ops$marcio@ORA10G>
ops$marcio@ORA10G> create or replace
  2  function f( p_times in number ) return n pipelined
  3  is
  4  begin
  5      for i in 1 .. p_times
  6      loop
  7          pipe row(i);
  8      end loop;
  9      return;
 10  end;
 11  /

Function created.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select * from table( f( 10 ) );

 COLUMN_VALUE
-------------
            1
            2
            3
            4
            5
            6
            7
            8
            9
           10

10 rows selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> var dtini varchar2(20)
ops$marcio@ORA10G> var dtfim varchar2(20)
ops$marcio@ORA10G>
ops$marcio@ORA10G> alter session set nls_date_format = 'dd/mm/rr';

Session altered.

ops$marcio@ORA10G>
ops$marcio@ORA10G> exec :dtini := trunc(sysdate)

PL/SQL procedure successfully completed.


DTINI
--------------------------------
11/05/05

ops$marcio@ORA10G> exec :dtfim := trunc(sysdate)+10

PL/SQL procedure successfully completed.


DTFIM
--------------------------------
21/05/05

ops$marcio@ORA10G>
ops$marcio@ORA10G> select to_date(:dtfim, 'dd/mm/rr')
  2         - to_date(:dtini, 'dd/mm/rr') dias
  3    from dual;

         DIAS
-------------
           10

1 row selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select to_date(:dtini, 'dd/mm/rr') + (column_value - 1) datas
  2    from table ( f(to_date(:dtfim, 'dd/mm/rr') - to_date(:dtini, 'dd/mm/rr')) )
  3  /

DATAS
--------
11/05/05
12/05/05
13/05/05
14/05/05
15/05/05
16/05/05
17/05/05
18/05/05
19/05/05
20/05/05

10 rows selected.

Agora vamos provar a escalabilidade do uso da pipelined.
Sempre que preciso fazer pivot ou balisar alguma query como demonstrado acima, eu normalmente usava all_objects, por ser uma view disponível para todos os usuários e com grande quantidade de registros, entretanto o preço é caro.


ops$marcio@ORA10G> set autotrace traceonly statistics
ops$marcio@ORA10G>
ops$marcio@ORA10G> select to_date(:dtini, 'dd/mm/rr') + (rownum - 1) datas
 2    from all_objects
 3   where rownum <= 10000  
 4  / 

10000 rows selected.  

Statistics 
----------------------------------------------------------           
         0  recursive calls           
         0  db block gets       
     16500  consistent gets           
         0  physical reads           
         0  redo size      
    103314  bytes sent via SQL*Net to client        
      1597  bytes received via SQL*Net from client         
       101  SQL*Net roundtrips to/from client           
         0  sorts (memory)           
         0  sorts (disk)       
     10000  rows processed

Como visto, 16500 consistent gets. No próximo exemplo, é o que estou atualmente usando em substituição a all_objects, uma combinação da dual e connect by 1=1 (looping eterno).


ops$marcio@ORA10G> select to_date(:dtini, 'dd/mm/rr') + (r - 1) datas
 2    from (
 3  select rownum r
 4    from dual
 5  connect by 1=1 and level <= 10000  
 6         )  
 7  / 

10000 rows selected. 

Statistics
----------------------------------------------------------          
        0  recursive calls          
        0  db block gets          
        0  consistent gets          
        0  physical reads          
        0  redo size     
   103314  bytes sent via SQL*Net to client       
     1597  bytes received via SQL*Net from client        
      101  SQL*Net roundtrips to/from client          
        1  sorts (memory)          
        0  sorts (disk)      
    10000  rows processed

Excelente plano, porém 1 sort, em memória, mas sort. Essa é a grande vantagem da pipelined já comentada acima - o uso mínimo de memória. Isso escala, porque não deixa recurso preso, e aumenta a velocidade da query, porque não tem que organizar nada.


ops$marcio@ORA10G>
ops$marcio@ORA10G> select to_date(:dtini, 'dd/mm/rr') + (column_value - 1) datas
 2    from table ( f(10000) )
 3  /

10000 rows selected.

Statistics
----------------------------------------------------------
         0  recursive calls
         0  db block gets
         0  consistent gets
         0  physical reads
         0  redo size
    103314  bytes sent via SQL*Net to client
      1597  bytes received via SQL*Net from client
       101  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
     10000  rows processed

ops$marcio@ORA10G>

Voce pode também medir o uso de memória que sua pga está gastando para usar a pipelined, no 10g, ele bind isso na memória e usa. Rodei essas queries na mesma sessão onde eu realizei o teste acima, portanto não houve diferença em usar f() para 10 ou para 1 milhão.


ops$marcio@ORA10G> column value new_val v
ops$marcio@ORA10G> column name format a40
ops$marcio@ORA10G>
ops$marcio@ORA10G> select a.name, b.value
 2    from v$statname a, v$mystat b
 3   where a.statistic# = b.statistic#
 4     and a.name = 'session pga memory max';

NAME                                             VALUE
---------------------------------------- -------------
session pga memory max                         2529868

1 row selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select count(*) from table( f(10) )
 2  /

    COUNT(*)
-------------
          10

1 row selected.

ops$marcio@ORA10G> select a.name, b.value, b.value - &v diff
 2    from v$statname a, v$mystat b
 3   where a.statistic# = b.statistic#
 4     and a.name = 'session pga memory max'
 5  /

NAME                                             VALUE          DIFF
---------------------------------------- ------------- -------------
session pga memory max                         2529868             0

1 row selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select count(*) from table( f(1000000) )
 2  /

    COUNT(*)
-------------
     1000000

1 row selected.

ops$marcio@ORA10G> select a.name, b.value, b.value - &V diff
 2    from v$statname a, v$mystat b
 3   where a.statistic# = b.statistic#
 4     and a.name = 'session pga memory max'
 5  /

NAME                                             VALUE          DIFF
---------------------------------------- ------------- -------------
session pga memory max                         2529868             0

1 row selected.

E para finalizar, o exemplo mencionado acima.
Repetiremos a linha conforme times na tabela T abaixo:


ops$marcio@ORA10G> create or replace type n as table of number
 2  /

Type created.

ops$marcio@ORA10G>
ops$marcio@ORA10G> create or replace
 2  function f( p_cursor in sys_refcursor ) return n pipelined
 3  is
 4      l_rec  t%rowtype;
 5  begin
 6      loop
 7          fetch p_cursor into l_rec;
 8          exit when p_cursor%notfound;
 9          for i in 1 .. l_rec.times
10          loop
11               pipe row(l_rec.linha);
12          end loop;
13      end loop;
14      close p_cursor;
15      return;
16  end;
17  /

Function created.

ops$marcio@ORA10G> show error
No errors.
ops$marcio@ORA10G>
ops$marcio@ORA10G> select * from t;

       LINHA         TIMES
------------- -------------
           1             1
           2             2
           3             1
           4             3
           5             0

5 rows selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select t.*
 2    from t, table( f( cursor( select * from t ) ) )
 3   where column_value = linha
 4  /

       LINHA         TIMES
------------- -------------
           1             1
           2             2
           2             2
           3             1
           4             3
           4             3
           4             3

7 rows selected.

ops$marcio@ORA10G>
ops$marcio@ORA10G> update t set times = 2 where linha = 5;

1 row updated.

ops$marcio@ORA10G> commit;

Commit complete.

ops$marcio@ORA10G>
ops$marcio@ORA10G> select t.*
 2    from t, table( f( cursor( select * from t ) ) )
 3   where column_value = linha
 4  /

       LINHA         TIMES
------------- -------------
           1             1
           2             2
           2             2
           3             1
           4             3
           4             3
           4             3
           5             2
           5             2

9 rows selected.

Abraços,

# posted by Marcio : 2:22 AM

Comments:

show de bola esse exemplo tio. A gente normalmente usa isso pra fazer carga de uma dimensão de tempo dentro de um DW, justamente pq vc precisa gerar várias linhas de referências aos anos e isso não tem uma origem exata. Muito legal.

# posted by

Anonymous : 2:36 PM

Exatamente, muito usado em etls.
abraço,

# posted by

Marcio : 5:43 PM

Se eu quizer usar esse processo a partir de um banco remoto, isto é, com database link, é possível ???

# posted by

Márcio de Souza Almeida : 2:30 PM

sim, é possível.

# posted by

Marcio : 11:09 PM

Blog do Marcio Portes

Wednesday, May 11, 2005

Pipelined Function

About Me

Dúvida?

Humor

Como Fazer & Exemplos

Previous Posts

Links

Archives