HomeArticlesSlideshowsAsk BobLogin or Register
NewsCustom SearchContactLinks
 Class:   Search:   for:      |<  <  > 
SIMPLELOAD - Elementary 'last phase' solution for ETL
    by Ludek Bob Jankovsky, 31-May-2017 (ETL PATTERNS)
Despite on poorest of the solution in comparison to true metadata driven ETL, I often meet transformations realized as views with almost the same format as the target table, followed by the pattern based implementation of the final stage - loading the transformed source data into target table, considering current data and maintaining history.
I have to admit the smart approach is proper for small and quick win solutions.
For easy implementation I scripted a skeleton of the last phase load element based on data dictionary information, supporting the most common patterns. ...more
    by Ludek Bob Jankovsky, 01-Dec-2016 (ETL PATTERNS)
METASWAMP is a set of metadata driven pattern and repository requirements implemented primarily as an XML schema. It is intended to serve as an interface between ETL patterns and Repository storage(s). ...more
Quantum eyeballing method of evaluation - comming soon
    by Ludek Bob Jankovsky, 03-Nov-2016 (NEWS)
ETL and the ORA-01502: index *** or partition of such index is in unusable state error, ORA-14063
    by Ludek Bob Jankovsky, 02-Nov-2016 (ORACLE STUFFS)
Especially in huge ETL processes we often increase efficiency of the process by setting some indexes on the target table UNUSABLE. That way we speed-up active operations such as inserts and merges into the table with unusable indexes, but the ORA-01502 error can back-stab us in some phases of the process.
We should be worried about reading operations what we can slow down (sometimes even freeze) by lack of the unusable index, but most of ETL operations are provided in an exclusive time period. By the other hand many concurrent ETL processes and even most overall analytical queries do not need indexes for its work.
However, time after time it happens, the error mentioned above appears in the process. ...more
Simpletask is under the UPL license now
    by Ludek Bob Jankovsky, 29-Oct-2016 (HEAP WORKFLOW)
To make all the license stuff clearer in the case of the Simpletask solution, the UPL - Universal permissive license has been applied for all parts of the software and documentation published on this site under the Simpletask name.
Differently from plain Copyleft license it allows easier incorporation of the software and parts of the software into Larger works without obligation of using the same Copyleft license for these larger works.  ...more
APEX - Downloading files
    by Ludek Bob Jankovsky, 11-Jan-2016 (ORACLE STUFFS)
Using APplication EXpress as a simple GUI cover for Matadata based generators of XML and SQL files led me to task to download these files.
Browsing over various helps I decided to add another one to the rubbish.
There are two basic scenarios:
1. We want to download result of PL/SQL method (function) directly to client machine, or
2. We want to download generated codes from stage - some storage of files prepared for download.  ...more
Lazy hint - go faster for bulk processing in Oracle
    by Ludek Bob Jankovsky, 24-Aug-2015 (ORACLE STUFFS)
Following an example of politicians, we add a tax to index usage to stimulate hash processing using a hint:
/*+ OPT_PARAM('optimizer_index_cost_adj', 30000)*/
Shake It!
    by Ludek Bob Jankovsky, 17-Mar-2015 (ORACLE STUFFS)
Time after time we need to move tables in development databases to "heal" non-standard operation gaps and chains. Following script could serve as a template. ...more
Real-time ETL patterns and how to piss to pit
    by Ludek Bob Jankovsky, 16-Mar-2014 (ETL PATTERNS)
Recently I published an article about real-time ETL. Designing it we hit on several challenges and various different approaches. To be certain, we speak about Real-time ETL realizing Data integration of sources, not a Service integration. We have to react on smallest changes in source data and propagate the changes to transformed target Data structure. Advantages and disadvantages I described in previous article. ...more
Edges of Real-time ETL
    by Ludek Bob Jankovsky, 11-Jan-2014 (ETL PATTERNS)
Nowadays Real-time and message based ETL take more significant place in IT infrastructure. Differently from Messaging based operational data, in the Real-time approach target data are refreshed based on changes of source data in source systems. That uses means what was earlier often used for replication of data. Differently from replication, realtime ETL is hardly 1:1 but the transformation takes place in the process. Real-time ETL brings many limitation in comparison to standard batch ETL. ...more
XML Import and export utility for SIMPLETASK
    by Ludek Bob Jankovsky, 25-Aug-2013 (HEAP WORKFLOW)
According to principle of maintaining Logical process separately from physical implementation in SIMPLETASK the Export / Import format is necessary.
Following Add-on package contains API for import XML format. The primary intention is to use the feature for importing metadata generated from Logical process model, but it also can be used to archive physical metadata or to transfer them between environments. ...more
    by Ludek Bob Jankovsky, 14-Aug-2013 (HEAP WORKFLOW)
Passive API of SIMPLETASK are solved as a set of views. Most of the views are related to particular tables enriched by useful columns. ...more
    by Ludek Bob Jankovsky, 11-Aug-2013 (HEAP WORKFLOW)
BWTA_OPER is an active API in the form of oracle PL/SQL package. It is designed as a basic layer of API allowing active manipulation with tasks in their runtime within the SIMPLETASK solution.  ...more
    by Ludek Bob Jankovsky, 11-Aug-2013 (HEAP WORKFLOW)
BWTA_METADATA is an active API in the form of oracle PL/SQL package. It is designed as a basic layer of API allowing active manipulation with metadata of the SIMPLETASK solution.  ...more
Scripts for creating of a deployment script of database object through SQL scripts
    by Ludek Bob Jankovsky, 11-Aug-2013 (ORACLE STUFFS)
Sometimes it happens you want to create package script and the tool you can use doesn't satisfy your needs. Sometimes it happens you need to embed the creation scripts into script independent on any such tools. For these case I stored simple script extraction scripts in the article. ...more
APEX GUI for SIMPLETASK heap workflow
    by Ludek Bob Jankovsky, 10-Jul-2013 (HEAP WORKFLOW)
Simpletask Heap Workflow is a simple Oracle based solution for ETL process management based on Oracle DBMS_SCHEDULER and Heap Workflow principles. Now we present APEX GUI, allowing to manipulate metadata and operate ETL easier way than using the PL/SQL based API only. ...more
Heap Workflow
    by Ludek Bob Jankovsky, 29-Jun-2013 (HEAP WORKFLOW)
Long ago when I started to design ETL systems I realized, that robust ETL process management is as important as any other pillar of ETL approach.
Most current workflow systems are focused to easy visualization in structured form.
Structured workflow systems are good for simple process management. I have never seen ETL system with just a little focus on robustness and performance. For that you need a fine grain heap based workflow system allowing to use operational research methods.

What's different?
When you put task into structured workflow, you wire it in certain place to be exacuted at. In heap workflow you put all task in one heap and define constraints of their executions.
The constraints are of following types:
  • Predecessor dependency (solves dependencies comming from referential integrity, data rediness etc.)
  • Resource consumption (solves system resource balancing, exclusivity of filling certain segments and finally using pending resource consumption it can solve temporarily disability of certain objects)
  • Condition function (solves additional timing requirement, branches, skips etc.)
Use data type arithmetic to convert from 'Timestamp' world to 'Date' world and opposite way
    by Ludek Bob Jankovsky, 09-Mar-2013 (ORACLE STUFFS)
There is not straight conversion between the TIMESTAMP and DATE datatypes. Easy way to 'misuse' the standard data type arithmetic to do particular conversions in our queries.
Following article will briefly explain the ways. ...more
Analytical functions and interpolation - samples
    by Ludek Bob Jankovsky, 23-Feb-2013 (ORACLE STUFFS)
There are various situations you want to fill missing values based on last filled or neighbor filled values. Following simple example show the approach: ...more
Aging table - drop column of partitioned table with compressed old partitions
    by Ludek Bob Jankovsky, 11-Nov-2012 (ORACLE STUFFS)
Using fast aging data (fact tables, logs) you will have to consider some kind of aging strategies like partitioning by time slices, maintaining old partitions compressing them, moving them to slower storage or dropping them at the end. One of factor considered during decision COMPRESS - NO COMPRESS is potential problem of dropping column from table. That is impossible while table or some partitions are compressed. Uncompress all the partitions of huge aged table can make space difficulties. Following solution avoid the problem and offer robust way how to drop column from aged table with compressed partitions. ...more
Usage of SUBQUERY_PRUNING undocumented hint
    by Ludek Bob Jankovsky, 20-Sep-2011 (ORACLE STUFFS)
There are cases when you want to merge data from small table to large one, where partitioning of target table allows to merge data just into last few partitions. Anyway, to know what partitions should be used you need query didstinst partition keys from source table. Oprimizer usually works itself, but if it doesn't, following hint can help. No more words, just an example: ...more
Partitioning sreategy patterns - principal element of Data architecture
    by Ludek Bob Jankovsky, 19-Aug-2011 (ETL PATTERNS)
One of basic chapters in data architecture is a decision about partitioning strategy for particular data layers, table stereotypes and ETL patterns.
Following article will describe basic drivers influencing decisions about partitioning strategy. Details on particular patterns will be linked along.  ...more
Equipartitioning - solution for very large data transformations, but not only for them
    by Ludek Bob Jankovsky, 16-Aug-2011 (ORACLE STUFFS)
There are situations, you want to join two huge tables unselectively for bulk data operartion.
Using NESTED LOOP operation is not effective for big number of rows.
Using HASH JOIN operation is usually effective in such cases, but it has limits when nomber of rows exceeds tens of millions.
EQUIPARTITIONING is a way allowing to make huge bulk operations on hundreds of millions of rows. ...more
Ridiculous (well just for nonparticipants) behaviour of LEFT JOIN in Oracle 11g
    by Ludek Bob Jankovsky, 12-Apr-2011 (ORACLE STUFFS)
There is great feature in Oracle 11g, named Optimizer join elimination. It's designed to reduce effort of database following way:
When there is a Left outer joined table in query, and no column from the table used in result set, optimizer simplifies query and doesn't access the outer joined table - as not necessary.
It sounds great, doesn't it?
Well, it would sound well, if there is no bug.  ...more
ETL limit
    by Andre Enslin, 27-Jan-2011 (ASK BOB READY)
Is there a limit in the number of statistics ETL can aggregate in 30 minutes? We try to run a report every hour, but some times it fall behind with +- 2 hours. ...more
Quick check of index condition
    by Ludek Bob Jankovsky, 11-Jan-2011 (ORACLE STUFFS)
One of task when checking Data Warehouse schema condition could be following quick index check. We look for:
  • Global indexes (suspicious in general, tere must be good reason to use them in data warehouse)
  • B-TREE indexes with low cardinality
  • B-TREE with high number of AVG_DATA_BLOCKS_PER_KEY (over 300) - such are reasonable to be made unusable before load process and rebuilt after
Failover technics - validation during the load phase
    by Ludek Bob Jankovsky, 09-May-2010 (ETL PATTERNS)
There are several places within the ETL process to catch wrong and low quality data. Generally there are four basic moments:
  • Validation during the load phase
  • Validation during the transformation phase
  • Limited validation during transformation phase just defaulting invalid values
  • Ex-post validation - data quality management

Following article is about the first one. ...more
SQL*Loader-500: Unable to open file
    by Ludek Bob Jankovsky, 02-Mar-2010 (ORACLE STUFFS)
There is a misleading error message in Oracle SQL*LOADER - following chain of messages could appear despite the problem is in another place:  ...more
Rigid Exchange Partition Data Acquisition approach
    by Ludek Bob Jankovsky, 05-Feb-2010 (ETL PATTERNS)
There are several ways to support consistency of data warehouses and data marts within business hoursperiod. In standard simple approaches it usually suppresses loading time window to really short period between readiness of source data and required readiness of target data. Following approach is just another way allowing keep all data in the mart consistent at the moment, anyway I consider it as an interesting stuff. ...more
Oracle MERGE to remote DB - ORA-01008
    by Ludek Bob Jankovsky, 24-Sep-2009 (ORACLE STUFFS)
Oracle Merge statement has some additional requirements when we process it thru DB-link. The first one is Oracle error "ORA-01008: not all variables bound" appearing despite it seems no variables are bound in the statement. ...more
Articles: 1 .. 30   first  prev  next 
All Right Reserved © 2007, Designed by Bob Jankovsky