Oracle PL/SQL Tutorial Day 17 - Parallel Processing

Introduction


Oracle PL/SQL is a powerful procedural language extension for SQL that is used extensively in enterprise applications. While PL/SQL is efficient and optimized, there are scenarios where serial execution becomes a bottleneck, especially when dealing with large volumes of data.


This is where parallel processing in Oracle comes into play.


In this blog, we'll cover:

  - What is parallel processing?
  - Benefits and use cases
  - Oracle parallel processing options
  - PL/SQL parallel execution using DBMS_SCHEDULER and DBMS_PARALLEL_EXECUTE
  - A complete working example
  - Tips and best practices


What is Parallel Processing?


Parallel processing means executing multiple tasks simultaneously rather than sequentially. In the context of Oracle, this can involve dividing a large task (like processing a big table) into smaller chunks and executing them in parallel using multiple database sessions or processes.


Benefits of Parallel Processing


- Reduces execution time

- Efficient use of CPU and memory

- Better throughput in ETL/data warehouse jobs

- Ideal for batch processing and large updates


Parallel Processing Options in Oracle


Oracle provides several options for parallelism:


1. Parallel DML (INSERT, UPDATE, DELETE)

2. Parallel Query (PQ)

3. PL/SQL-based parallel execution using DBMS_SCHEDULER

4. Chunk-based parallelism using DBMS_PARALLEL_EXECUTE


In this article, we focus on parallel execution in PL/SQL using DBMS_PARALLEL_EXECUTE.
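For comparison, options 1 and 2 are enabled declaratively with session settings and hints. A minimal sketch (the degree of 4 is illustrative, not a recommendation):

ALTER SESSION ENABLE PARALLEL DML;

-- Parallel query: scan with 4 parallel servers
SELECT /*+ PARALLEL(c, 4) */ COUNT(*)
FROM customers c;

-- Parallel DML: the UPDATE itself runs in parallel
UPDATE /*+ PARALLEL(customers, 4) */ customers
SET status = 'INACTIVE'
WHERE last_purchase < ADD_MONTHS(SYSDATE, -12);

COMMIT;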


Use Case


Suppose you have a table called CUSTOMERS with 10 million rows and a column STATUS that needs to be recalculated and updated. Doing this in one massive update could:

- Lock the entire table

- Run for hours

- Consume lots of UNDO and REDO


Instead, we'll break the work into chunks and process each chunk in parallel.


Step-by-Step: PL/SQL Parallel Processing Using DBMS_PARALLEL_EXECUTE


We’ll use the following Oracle components:

- DBMS_PARALLEL_EXECUTE: For chunk creation and task execution

- DBMS_SQL: Supplies the NATIVE language-flag constant used by RUN_TASK

- A custom PL/SQL procedure that processes one chunk


Table Setup (CUSTOMERS)


Let’s assume the table structure is as follows:


CREATE TABLE customers (
  customer_id   NUMBER PRIMARY KEY,
  name          VARCHAR2(100),
  status        VARCHAR2(20),
  last_purchase DATE
);


We'll update the status column based on the last purchase date:

- "ACTIVE" if purchase in last 12 months

- "INACTIVE" otherwise


Step 1: Create a Task


BEGIN
  DBMS_PARALLEL_EXECUTE.create_task(task_name => 'UPDATE_CUSTOMER_STATUS');
END;
/


Step 2: Chunk the Table by ROWID


BEGIN
  DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
    task_name   => 'UPDATE_CUSTOMER_STATUS',
    table_owner => 'YOUR_SCHEMA',
    table_name  => 'CUSTOMERS',
    by_row      => TRUE,    -- chunk_size counts rows, not blocks
    chunk_size  => 10000    -- roughly 1,000 chunks for a 10-million-row table
  );
END;
/


This creates internal metadata for each chunk based on the table's ROWIDs.
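You can inspect the generated chunk boundaries before running anything; every chunk starts out UNASSIGNED:

SELECT chunk_id, status, start_rowid, end_rowid
FROM user_parallel_execute_chunks
WHERE task_name = 'UPDATE_CUSTOMER_STATUS'
ORDER BY chunk_id;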


Step 3: Create a PL/SQL Procedure to Process Each Chunk


DBMS_PARALLEL_EXECUTE does not invoke a per-chunk callback; instead, RUN_TASK executes a SQL statement or PL/SQL block once per chunk, binding that chunk's boundary ROWIDs to :start_id and :end_id. So the procedure just needs two ROWID parameters:


CREATE OR REPLACE PROCEDURE update_customer_chunk (
  p_start_rowid IN ROWID,
  p_end_rowid   IN ROWID
) IS
BEGIN
  UPDATE customers
  SET status = CASE
                 WHEN last_purchase >= ADD_MONTHS(SYSDATE, -12) THEN 'ACTIVE'
                 ELSE 'INACTIVE'
               END
  WHERE ROWID BETWEEN p_start_rowid AND p_end_rowid;

  COMMIT;  -- each chunk is committed independently
END;
/


Step 4: Run the Task in Parallel


BEGIN
  DBMS_PARALLEL_EXECUTE.run_task(
    task_name      => 'UPDATE_CUSTOMER_STATUS',
    sql_stmt       => 'BEGIN update_customer_chunk(:start_id, :end_id); END;',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 10
  );
END;
/


RUN_TASK submits up to 10 scheduler jobs; each job repeatedly claims an unassigned chunk, binds its start and end ROWID to :start_id and :end_id, and executes the block until no chunks remain.


Step 5: Monitor Progress


Use the views USER_PARALLEL_EXECUTE_TASKS and USER_PARALLEL_EXECUTE_CHUNKS (per-chunk timestamps, if you need them, are in the START_TS and END_TS columns of the chunks view):


SELECT task_name, status
FROM user_parallel_execute_tasks;


SELECT chunk_id, status, error_code
FROM user_parallel_execute_chunks
WHERE task_name = 'UPDATE_CUSTOMER_STATUS'
ORDER BY chunk_id;
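Besides the views, the DBMS_PARALLEL_EXECUTE.TASK_STATUS function returns the task state as a package constant you can branch on:

DECLARE
  l_status NUMBER;
BEGIN
  l_status := DBMS_PARALLEL_EXECUTE.task_status('UPDATE_CUSTOMER_STATUS');

  IF l_status = DBMS_PARALLEL_EXECUTE.FINISHED THEN
    DBMS_OUTPUT.put_line('All chunks processed successfully');
  ELSIF l_status = DBMS_PARALLEL_EXECUTE.FINISHED_WITH_ERROR THEN
    DBMS_OUTPUT.put_line('Some chunks failed - check the chunks view');
  END IF;
END;
/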


Step 6: Clean Up


After completion, drop the task metadata:


BEGIN

  DBMS_PARALLEL_EXECUTE.drop_task('UPDATE_CUSTOMER_STATUS');

END;

/


Sample Output


TASK_NAME               STATUS
----------------------- --------
UPDATE_CUSTOMER_STATUS  FINISHED


CHUNK_ID  STATUS     ERROR_CODE
--------  ---------  ----------
       1  PROCESSED
       2  PROCESSED
...


Handling Failures


You can re-run only the chunks that did not finish (any status other than PROCESSED) with RESUME_TASK, which also works for tasks that crashed partway through:


BEGIN
  DBMS_PARALLEL_EXECUTE.resume_task('UPDATE_CUSTOMER_STATUS');
END;
/
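A common pattern, adapted from Oracle's documentation example, is to resume a bounded number of times until the task reports FINISHED:

DECLARE
  l_try    NUMBER := 0;
  l_status NUMBER;
BEGIN
  l_status := DBMS_PARALLEL_EXECUTE.task_status('UPDATE_CUSTOMER_STATUS');

  WHILE l_try < 2 AND l_status != DBMS_PARALLEL_EXECUTE.FINISHED LOOP
    l_try := l_try + 1;
    DBMS_PARALLEL_EXECUTE.resume_task('UPDATE_CUSTOMER_STATUS');
    l_status := DBMS_PARALLEL_EXECUTE.task_status('UPDATE_CUSTOMER_STATUS');
  END LOOP;
END;
/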


Benefits of DBMS_PARALLEL_EXECUTE


- Easy chunking of big tables (by ROWID, column, or SQL WHERE clause)

- Built-in task management

- Error tracking for failed chunks

- Avoids single massive update, which helps with undo/redo and locking
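For the column-based chunking option, CREATE_CHUNKS_BY_NUMBER_COL splits on a numeric key instead of ROWIDs, and :start_id/:end_id are then bound as numbers (so the chunk procedure would take NUMBER parameters and filter on CUSTOMER_ID BETWEEN them):

BEGIN
  DBMS_PARALLEL_EXECUTE.create_chunks_by_number_col(
    task_name    => 'UPDATE_CUSTOMER_STATUS',
    table_owner  => 'YOUR_SCHEMA',
    table_name   => 'CUSTOMERS',
    table_column => 'CUSTOMER_ID',
    chunk_size   => 10000   -- width of the key range per chunk
  );
END;
/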


Comparison: Manual Threads vs. DBMS_PARALLEL_EXECUTE


| Feature                     | Manual Parallelism | DBMS_PARALLEL_EXECUTE |
|-----------------------------|--------------------|-----------------------|
| Chunk management            | Manual             | Automatic             |
| Retry failed chunks         | Complex            | Built-in              |
| Error logging               | Manual             | Automatic             |
| Resume/retry on failure     | Manual logic       | Built-in              |
| Suitable for massive tables | Somewhat           | Excellent             |


Performance Tips


- Tune chunk size to the workload: very small chunks (e.g. 100 rows against a 10-million-row table) create excessive scheduling overhead, so start in the thousands of rows per chunk and measure

- Monitor redo/undo space during large updates

- Avoid parallel DML hints inside chunks

- If the logic is complex, process chunk data into intermediate tables


Best Practices


- Always test on a small sample before running on full data

- Log your task and chunk status

- Consider PRAGMA AUTONOMOUS_TRANSACTION for logging inside chunks, so log rows survive a chunk rollback

- Use bind variables for better performance and memory usage
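A minimal autonomous logging procedure might look like this (the CHUNK_LOG table is hypothetical, not part of the example above):

CREATE OR REPLACE PROCEDURE log_chunk_event (p_msg IN VARCHAR2) IS
  PRAGMA AUTONOMOUS_TRANSACTION;  -- commits independently of the chunk's transaction
BEGIN
  INSERT INTO chunk_log (log_time, message)
  VALUES (SYSTIMESTAMP, p_msg);

  COMMIT;  -- commits only the log row
END;
/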


Conclusion


Parallel processing in PL/SQL using DBMS_PARALLEL_EXECUTE is a powerful, scalable, and reliable way to process large volumes of data efficiently. It avoids common issues such as long-held locks and heavy undo/redo generation that a single massive update would cause.


Whether you're building data pipelines, batch jobs, or large update processes, this approach can significantly boost performance.


With proper planning, chunking, and monitoring, you can achieve near-linear scalability for many database workloads.


What’s Next?

- Use DBMS_SCHEDULER for scheduled background execution of parallel jobs

- Combine with DBMS_PIPE or Oracle Advanced Queuing (AQ) for distributed processing

- Try direct-path parallel INSERTs with the APPEND and PARALLEL hints in data warehousing
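For the DBMS_SCHEDULER idea, a nightly job could wrap the whole task lifecycle in one procedure (RUN_STATUS_REFRESH is a hypothetical wrapper, not defined above):

BEGIN
  DBMS_SCHEDULER.create_job(
    job_name        => 'NIGHTLY_STATUS_REFRESH',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN run_status_refresh; END;',
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',
    enabled         => TRUE
  );
END;
/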


Stay tuned for more PL/SQL performance tips and advanced techniques!

