Parse XML Documents, Database Articles, Global Guide Line Technology is a Mega IT Portal.
Google
Join Global Guide Line community

     Home                    

   First Website Guide       

   Learn HTML                 

   Learn CSS                    

 

   Learn XML                    

   Learn XSLT                  

   Learn Java Script          

   Learn SEO                   

   Learn SQL                   

   Database Articles         

   Web Hosting Guide      

   Services                       

   Contacts                       

Parse XML Documents


     Back             Next     

The use of XML for data transfer has increased dramatically over the last few years. Since most applications still rely on structured relational data it is necessary to explode the non-structured XML data into relations tables. Oracle9i Release 2 includes the XDK for PL/SQL packages loaded into the SYS schema by default, but these have been superseded by a set of integrated DBMS_% packages within the XDB schema. I shall present a simple example of using the new XDB packages to load employee records into the EMP table:
First we create a directory object pointing to the source XML file and the EMP table which is the final destination of the data and :
 
-- As SYS
CREATE DIRECTORY xml_dir AS 'c:\';
GRANT READ ON DIRECTORY xml_dir TO <user-name>;

-- As schema owner
CREATE TABLE EMP (
  EMPNO     NUMBER(4),
  ENAME     VARCHAR2(10),
  JOB       VARCHAR2(9),
  MGR       NUMBER(4),
  HIREDATE  DATE,
  SAL       NUMBER(7, 2),
  COMM      NUMBER(7, 2),
  DEPTNO   NUMBER(2));

The source of the data is the emp.xml document. Extracting the data from the document involves several steps. First the XML document must be parsed and a DOMDocument created from it. Once the DOMDocument is created the parser is no longer needed so it's resources can be freed:
 

l_parser := dbms_xmlparser.newParser;

dbms_xmlparser.parseClob(l_parser, l_clob);
l_doc := dbms_xmlparser.getDocument(l_parser);

dbms_xmlparser.freeParser(l_parser);

Next the XPATH syntax is used to get a DOMNodeList containing all the EMP nodes:
 

l_nl := dbms_xslprocessor.selectNodes(dbms_xmldom.makeNode(l_doc),
'/EMPLOYEES/EMP');

Once we have the DOMNodeList we can loop through it getting the values for each node in turn. The values are returned using the XPATH sytax and placed in a table collection:
 

l_n := dbms_xmldom.item(l_nl, cur_emp);
dbms_xslprocessor.valueOf(l_n,'EMPNO/text()',
t_tab(t_tab.last).empno);

With all the data retrieved into the table collection the inserts can be performed. Putting it all together we get:
 

DECLARE
  l_bfile   BFILE;
  l_clob    CLOB;
  l_parser  dbms_xmlparser.Parser;
  l_doc     dbms_xmldom.DOMDocument;
  l_nl      dbms_xmldom.DOMNodeList;
  l_n       dbms_xmldom.DOMNode;
  l_temp    VARCHAR2(1000);

  TYPE tab_type IS TABLE OF emp%ROWTYPE;
  t_tab  tab_type := tab_type();
BEGIN

  l_bfile := BFileName('XML_DIR', 'emp.xml');
  dbms_lob.createtemporary(l_clob, cache=>FALSE);
  dbms_lob.open(l_bfile, dbms_lob.lob_readonly);
  dbms_lob.loadFromFile(dest_lob => l_clob,
                  src_lob  => l_bfile,
                  amount   => dbms_lob.getLength(l_bfile));
  dbms_lob.close(l_bfile);
  
  -- make sure implicit date conversions are performed 
  correctly
  dbms_session.set_nls('NLS_DATE_FORMAT',
  '''DD-MON-YYYY''');

  -- Create a parser.
  l_parser := dbms_xmlparser.newParser;

  -- Parse the document and create a new DOM document.
  dbms_xmlparser.parseClob(l_parser, l_clob);
  l_doc := dbms_xmlparser.getDocument(l_parser);

  -- Free resources associated with the CLOB and 
  Parser now they are no longer needed.
  dbms_lob.freetemporary(l_clob);
  dbms_xmlparser.freeParser(l_parser);

  -- Get a list of all the EMP nodes in the document 
  using the XPATH syntax.
  l_nl := 
  dbms_xslprocessor.selectNodes(dbms_xmldom.makeNode(l_doc),
  '/EMPLOYEES/EMP');

  -- Loop through the list and create a new record in a 
  tble collection
  -- for each EMP record.
  FOR cur_emp IN 0 .. dbms_xmldom.getLength(l_nl) - 1 LOOP
    l_n := dbms_xmldom.item(l_nl, cur_emp);

    t_tab.extend;

    -- Use XPATH syntax to assign values to he elements of 
    the collection.
    dbms_xslprocessor.valueOf(l_n,'EMPNO/text()',
    t_tab(t_tab.last).empno);
    dbms_xslprocessor.valueOf(l_n,'ENAME/text()',
    t_tab(t_tab.last).ename);
    dbms_xslprocessor.valueOf(l_n,'JOB/text()',
    t_tab(t_tab.last).job);
    dbms_xslprocessor.valueOf(l_n,'MGR/text()',
    t_tab(t_tab.last).mgr);
    dbms_xslprocessor.valueOf(l_n,'HIREDATE/text()',
    t_tab(t_tab.last).hiredate);
    dbms_xslprocessor.valueOf(l_n,'SAL/text()',
    t_tab(t_tab.last).sal);
    dbms_xslprocessor.valueOf(l_n,'COMM/text()',
    t_tab(t_tab.last).comm);
    dbms_xslprocessor.valueOf(l_n,'DEPTNO/text()',
    t_tab(t_tab.last).deptno);
  END LOOP;

  -- Insert data into the real EMP table from the table collection.
  -- Form better performance multiple collections should be used to allow
  -- bulk binding using the FORALL construct but this would make the code
  -- too long-winded for this example.
  FOR cur_emp IN t_tab.first .. t_tab.last LOOP
    INSERT INTO emp
    (empno,
     ename,
     job,
     mgr,
     hiredate,
     sal,
     comm,
     deptno)
    VALUES
    (t_tab(cur_emp).empno,
     t_tab(cur_emp).ename,
     t_tab(cur_emp).job,
     t_tab(cur_emp).mgr,
     t_tab(cur_emp).hiredate,
     t_tab(cur_emp).sal,
     t_tab(cur_emp).comm,
     t_tab(cur_emp).deptno);
  END LOOP;

  COMMIT; 

  -- Free any resources associated with the document now it
  -- is no longer needed.
  dbms_xmldom.freeDocument(l_doc);

EXCEPTION
  WHEN OTHERS THEN
    dbms_lob.freetemporary(l_clob);
    dbms_xmlparser.freeParser(l_parser);
    dbms_xmldom.freeDocument(l_doc);
END;
/

In an attempt to make the code a little shorter I've avoided multiple collections required for bulk binding of inserts. In a production environment it would be advisable to consider bulk binding.
 

XML Parser Issues

At the time of writing this article there are a few bugs/issues relating to the new integrated XML implementation on NT/2000:
 

  • The "/text()" reference is necessary when using the dbms_xslprocessor.valueOf procedure. This bug has been fixed in 9.2.0.3.0 so these references can be removed.
  • When using the "/text()" reference, if a tag contains no text value an "ORA-03113: end-of-file on communication channel" error is reported and the session is killed. This reference is not necessary in 9.2.0.3.0.
  • In 9.2.0.3.0 the SRC_BFILE parameter in DBMS_LOB.LOADFROMFILE should be replaced by SRC_LOB.
  • Parsing a document with an NT/2000 style path gives a "ORA-29280: invalid directory path" error. If all "\" references are replaced by "/" the document is parsed corretly. Since the UTL_FILE package works properly I can only assume the parse procedure is splitting the document path incorrectly. To avoid this issue I've used a directory object to specify the path and loaded the XML document via a CLOB.

Check your current platform for these issues before launching into any development. Until these issues are resolved I would suggest sticking with the Oracle8i method described on this site.

For more information see:
 

Hope this helps. Regards Tim...


     Back             Next     

 

[ About ] [ Contact ] [ Home ]
[ Links ] [ Site Map ] [ Services ] [ Privacy ]

Copyright © 2005-2006 www.globalguideline.com All rights reserved. Join Global Guide Line community.