Intra-field list pivot and unpivot
ORACLE STUFFS
17-Feb-2009
Besides standard unpivot tasks (where we create n rows using values from n columns), the following unpivot task also appears:
One record with comma-separated values in a column should produce several rows, one for each part of the comma-separated field.

Let's show it on the following example of an intra-column unpivot task:

create table tt (a integer,b varchar2(500));
insert into tt values(1,'John,Steve,Jim,Anne');
insert into tt values(2,'Karen,Betty,Rebecca,Tom,Henry');
commit;

Querying the created sample data:

SQL> Select a,b from tt
  2  /

         A B
---------- --------------------------------------------------
         1 John,Steve,Jim,Anne
         2 Karen,Betty,Rebecca,Tom,Henry

The following query returns the unpivoted set of data:

Select tt.a,regexp_substr(tt.b,'(^|[,])([^,]*)',1,upiv.lvl,'ni',2) b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=regexp_count(tt.b,'[,]')+1
/

         A B
---------- --------------------------------------------------
         1 John
         1 Steve
         1 Jim
         1 Anne
         2 Karen
         2 Betty
         2 Rebecca
         2 Tom
         2 Henry

Note: The presented example does not handle more complex cases such as commas within fields. A more complex regular expression would be necessary to solve them.
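The occurrence-based extraction the query relies on can be cross-checked outside the database. The following is a minimal Python sketch of the same regular expression idea (the function name `split_field` is mine, not Oracle's): each match captures an optional leading separator (subexpression 1) and the token itself (subexpression 2), which is what `regexp_substr(..., 'ni', 2)` returns per occurrence.

```python
import re

def split_field(value, sep=","):
    """Mimic regexp_substr(value, '(^|[,])([^,]*)', 1, n, 'ni', 2) for n = 1..count+1.

    Group 1 is the optional leading separator, group 2 is the token
    Oracle returns as subexpression 2.
    """
    s = re.escape(sep)
    pattern = r"(^|[" + s + r"])([^" + s + r"]*)"
    return [m.group(2) for m in re.finditer(pattern, value)]

print(split_field("John,Steve,Jim,Anne"))
# ['John', 'Steve', 'Jim', 'Anne']
```

Note that empty tokens between two adjacent separators are preserved, which matches the behavior of the `'(^|[,])([^,]*)'` pattern.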

For other separating characters, you just have to replace the [,] and [^,] strings with others. The following example shows values delimited by a pipe sign [|].

Select tt.a,regexp_substr(tt.b,'(^|[|])([^|]*)',1,upiv.lvl,'ni',2) b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=regexp_count(tt.b,'[|]')+1
/

In case we have a constant, a variable, or a parameter and we want to tokenize it in a SQL query, we can use the following simplification:

Select regexp_substr(p_wanted_patterns,'(^|[,])([^,]*)',1,level,'ni',2) b 
      from dual connect by level<=regexp_count(p_wanted_patterns,'[,]')+1


Inverse task - Pivot:
Let's prepare an unpivoted sample table using the query above:

Create table tta as
Select tt.a,regexp_replace(tt.b,'(^([^,]*,){'||to_char(upiv.lvl-1)||'})([^,]*)(,.*|$)','\3') b
from tt
join (select level as lvl from dual connect by level<=100) upiv on upiv.lvl<=length(regexp_replace(tt.b,'[^,]'))+1 
/

SQL> select a,b from tta;

         A B
---------- ---------------
         1 John
         2 Karen
         1 Steve
         2 Betty
         1 Jim
         2 Rebecca
         1 Anne
         2 Tom
         2 Henry
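This variant extracts the n-th token by replacement rather than by occurrence: groups 1 and 2 swallow the first n-1 tokens, group 3 keeps the wanted token, and group 4 swallows the rest, so replacing the whole match with backreference \3 leaves just that token. The same logic can be sanity-checked with Python's `re.sub` (a sketch only; the helper `nth_token` is hypothetical, not part of the article's SQL):

```python
import re

def nth_token(value, n):
    """Mimic regexp_replace(value, '(^([^,]*,){n-1})([^,]*)(,.*|$)', '\\3'):
    groups 1-2 consume the first n-1 tokens, group 3 captures the n-th,
    group 4 consumes the remainder; the match is replaced by group 3."""
    pattern = r"(^([^,]*,){" + str(n - 1) + r"})([^,]*)(,.*|$)"
    return re.sub(pattern, r"\3", value)

print(nth_token("John,Steve,Jim,Anne", 3))
# Jim
```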

The following query reverts the data into the intra-column pivoted form:

select a 
 ,listagg(tta.b,',')within group(order by b) as b  
from tta  
group by a;
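What LISTAGG does here can be illustrated with a small Python sketch (not Oracle code, just the same grouping idea): sort by key and value, then concatenate per group, analogous to GROUP BY a with LISTAGG(b, ',') WITHIN GROUP (ORDER BY b).

```python
from itertools import groupby
from operator import itemgetter

# The unpivoted rows from table tta: (a, b) pairs.
rows = [(1, "John"), (2, "Karen"), (1, "Steve"), (2, "Betty"),
        (1, "Jim"), (2, "Rebecca"), (1, "Anne"), (2, "Tom"), (2, "Henry")]

# Sorting by (a, b) gives both the GROUP BY key order and the
# WITHIN GROUP (ORDER BY b) order inside each group.
pivoted = {a: ",".join(b for _, b in grp)
           for a, grp in groupby(sorted(rows), key=itemgetter(0))}
print(pivoted)
# {1: 'Anne,Jim,John,Steve', 2: 'Betty,Henry,Karen,Rebecca,Tom'}
```

Note the alphabetical order inside each list, matching the ORDER BY b in the WITHIN GROUP clause.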



For versions older than Oracle 11g, the following workaround can be used:

select a
 ,rtrim(xmlagg(xmlelement(c,tta.b||',')).extract('//text()'),',') as b 
from tta 
group by a;

Returns:

SQL> select a,rtrim(xmlagg(xmlelement(c,tta.b||',')).extract('//text()'),',') as b
  2  from tta group by a;

         A B
---------- --------------------------------------------------
         1 John,Steve,Jim,Anne
         2 Karen,Henry,Betty,Rebecca,Tom

Note: The last mentioned method, pivot using xmlagg, works well for small amounts of data. In Oracle 10g I met some difficulties when the data exceeded 5000 records.

Note: From Oracle 11g on, the listagg function satisfies the needs of text aggregation well.

See also: a solution through a user-defined aggregate function.


CSV decomposition skeleton
The following example shows the decomposition of a CSV file with a given number of columns into particular fields.
It also supports enclosed fields.

Declare 
  c_delimiter Varchar2(1 CHAR):='|';
  c_encloser  Varchar2(1 CHAR):='"';
  c_linefeed  Varchar2(1 CHAR):=chr(10);
  c_carriage  Varchar2(1 CHAR):=chr(13);
  c_fields    Integer:=4;
  v_SRC       CLOB; --the delimited CLOB
Begin
  for r1 in(
    Select regexp_substr(v_SRC,'(^|['||c_linefeed||'])([^'||c_linefeed||']*)('||c_carriage||'?|\s*)',1,level,'ni',2) line
      from dual connect by level<=regexp_count(v_SRC,'['||c_linefeed||']')+1
  )LOOP
    ------------------------------------
    DBMS_OUTPUT.PUT_LINE('>'||r1.line||'<');
    ------------------------------------
    for r2 in(
      Select regexp_substr(r1.line,'(^|['||c_delimiter||'])((['||c_encloser||'][^'||c_encloser||']*)|([^'||c_delimiter||']*))',1,level,'ni',2) field
        from dual connect by level<=c_fields
    )LOOP
      r2.field:=ltrim(r2.field,c_encloser);
      -------------------------------------
      DBMS_OUTPUT.PUT_LINE('['||r2.field||']');
      -------------------------------------
    end LOOP;   
  end LOOP;
End;
/


Source CLOB:

AAAAAA|"BBBBBB"|CCC,DDDD|"EEE|WW"
AAAxAA|"BBzzBB"|CxC,DD000|"ExaWW"
AAAxxA|"BBzzzzz"|CyC,DDtttD|"EUU||WW"

Result:

>AAAAAA|"BBBBBB"|CCC,DDDD|"EEE|WW"<
[AAAAAA]
[BBBBBB]
[CCC,DDDD]
[EEE|WW]
>AAAxAA|"BBzzBB"|CxC,DD000|"ExaWW"<
[AAAxAA]
[BBzzBB]
[CxC,DD000]
[ExaWW]
>AAAxxA|"BBzzzzz"|CyC,DDtttD|"EUU||WW"<
[AAAxxA]
[BBzzzzz]
[CyC,DDtttD]
[EUU||WW]
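The field-level pattern is the interesting part: the alternative `(["][^"]*)|([^|]*)` lets an enclosed field contain the delimiter, and the leading encloser is stripped afterwards. A small Python sketch of that same pattern (the helper `split_line` is mine, for illustration only):

```python
import re

def split_line(line, delim="|", encl='"'):
    """Mimic regexp_substr(line, '(^|[|])((["][^"]*)|([^|]*))', 1, n, 'ni', 2):
    the first alternative matches an enclosed field (which may contain the
    delimiter), the second a plain field; the leading encloser is then
    stripped, as in the PL/SQL skeleton."""
    d, e = re.escape(delim), re.escape(encl)
    pattern = r"(^|[" + d + r"])(([" + e + r"][^" + e + r"]*)|([^" + d + r"]*))"
    return [m.group(2).lstrip(encl) for m in re.finditer(pattern, line)]

print(split_line('AAAAAA|"BBBBBB"|CCC,DDDD|"EEE|WW"'))
# ['AAAAAA', 'BBBBBB', 'CCC,DDDD', 'EEE|WW']
```

Like the PL/SQL version, this keeps the delimiter inside enclosed fields ("EEE|WW") intact while splitting plain fields on it.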

The same in one query, using WITH clauses and PIVOT to get the fields into columns:

with CONST as( 
  Select 
     '|'      as c_delimiter
    ,'"'      as c_encloser
    ,chr(10)  as c_linefeed
    ,chr(13)  as c_carriage
    ,4        as c_fields
  from DUAL
)
,SRC as(
   Select 
     CL_VAL  as v_SRC            --the delimited CLOB
    ,CONST.* 
   from       CONST
   cross join ZZZ_BJA_CLOB       --source table with CLOB
)
,LINES as(
   Select 
      regexp_substr(v_SRC,'(^|['||c_linefeed||'])([^'||c_linefeed||']*)('||c_carriage||'?|\s*)',1,level,'ni',2) line
     ,level as line_nr
     ,c_delimiter
     ,c_encloser
     ,c_fields
   from SRC connect by level<=regexp_count(v_SRC,'['||c_linefeed||']')+1
)
,FIELDS as(
   Select cast(ltrim(regexp_substr(line,'(^|['||c_delimiter||'])((['||c_encloser||'][^'||c_encloser||']*)|([^'||c_delimiter||']*))',1,field_nr,'ni',2),c_encloser) as VARCHAR2(4000)) field
   ,field_nr
   ,line_nr
   from LINES cross join (select level as field_nr from CONST connect by level<=c_fields)
)
Select * from FIELDS 
pivot(max(FIELD) for FIELD_NR in(1 as "F1",2 as "F2", 3 as "F3", 4 as "F4"))

Ludek Bob Jankovsky
All Rights Reserved © 2007, Designed by Bob Jankovsky