Transformasjon fra XML til SQL
Versjon 1
Vi skriver en transformasjon med outputmetode text.
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text" encoding="UTF-8"/> <xsl:template match="/"> USE olympic; DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); <xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/> </xsl:template> <xsl:template match="//result"> INSERT INTO results(place,year,event,name,nation,result) values( '<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@place"/>', '<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@year"/>', '<xsl:value-of select="ancestor::athlet/ancestor::event/@dist"/>', '<xsl:value-of select="ancestor::athlet/name"/>', '<xsl:value-of select="ancestor::athlet/nation"/>', '<xsl:value-of select="ancestor::athlet/result"/>'); </xsl:template> </xsl:stylesheet>
Vi forutsetter at databasen (olympic) er opprettet, fjerner tabellen results, oppretter den på ny og fyller den fra XML-dataene.
Resultatet av transformasjonen (et utdrag) er slik:
USE olympic: DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Maurice Greene', 'USA', '9.87'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Obadele Thompson', 'BAR', '10.04'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Dwain Chambers', 'GBR', '10.08'); ...
Vi kan så bruke denne fila, dbresult.txt, mot en database. F.eks. i følgende dialog i et dos-vindu:
C:\temp\xsldemo>mysql -u root -p Enter password:****** Welcome to the MySQL monitor. Commands end with . or \g. Your MySQL connection id is 3 to server version: 5.0.15-nt Type 'help' or '\h' for help. Type '\c' to clear the buffer mysql>source dbresult.txt;
Versjon 2
Vi beholder den samme databasestrukturen, og endrer transformasjonen som lager SQL. Endringene er to:
- Vi bruker kortformen ../ for å angi foreldre noden.
- Vi lage en samlet SQL-setning for insert oppgaven.
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text" encoding="UTF-8"/> <xsl:template match="/"> USE olympic; DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values <xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/> ; </xsl:template> <xsl:template match="//result"> ( '<xsl:value-of select="../../../@place"/>', '<xsl:value-of select="../../../@year"/>', '<xsl:value-of select="../../@dist"/>', '<xsl:value-of select="../name"/>', '<xsl:value-of select="../nation"/>', '<xsl:value-of select="."/>' ) <xsl:if test="not( position() = last())">,</xsl:if> </xsl:template> </xsl:stylesheet>
Resultatet blir slik (utdrag):
USE olympic: DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values ( 'Sidney', '2000', '100m', 'Maurice Greene', 'USA', '09.87') , ( 'Sidney', '2000', '100m', 'Obadele Thompson', 'BAR', '10.04') , ..... ( 'Athens', '2004', '400m', 'Michael Blackwood', 'JAM', '45.55') ;
Oppdatering av basen blir som i versjon 1, men det går vesentlig raskere.
DB-oppdatering fra Python
I denne løsningen skriver vi et Pythonskript som både foretar tansformasjonen og oppdaterer databasen. Vi trenger vi to biblioteker for å få til dette:
- Vi benytter oss av biblioteket lxml for å få til en transformasjon fra et Pytonskript. Mer om lxml i modulen lxml . Vi bruker den samme transformasjonen som vi bruket i versjon 2 ovenfor.
- Vi benytter også en kopling mellom MySQL og Python, se modulen Databaser
Pythonkoden blir slik:
import MySQLdb from lxml import etree """ Use lxml and MySQL Transform XML-structure and update database, one record for each result """ XML_FILE='c:\\web\\commondata\\olympiade\\all_results.xml' XSLT_FILE='c:\\web\\dw\\olymp\\ol2db\\dbtrans2.xslt' #---------------------------------- # file IO def getTextFile(filename): try: file=open(filename,'r') intext=file.read() file.close() return intext except: print 'Error reading file ',filename return None #---------------------------------- # connect and execute a sql-request def connectAndExecute(sql): try: myBase=MySQLdb.connect(host='localhost', user='root', passwd='hemmelig', db='olympic') myTab=myBase.cursor() myTab.execute(sql) myBase.commit() myBase.close() except MySQLdb.Error, e: print "Error %d: %s" % (e.args[0], e.args[1]) myBase.close() #---------------------------------- # Perform transformation def transform(): xmlTree=etree.parse(XML_FILE) xsltTree=etree.parse(XSLT_FILE) transform=etree.XSLT(xsltTree) resultTree=transform(xmlTree) # and update database sqlsentences=str(resultTree).split(';') print sqlsentences for sentence in sqlsentences: if len(sentence.strip()) > 0: #print sentence connectAndExecute(sentence) transform()