Transformasjon fra XML til SQL
Versjon 1
Vi skriver en transformasjon med outputmetode text.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="/">
USE olympic;
DROP TABLE IF EXISTS results;
CREATE TABLE results(
resultid INT PRIMARY KEY AUTO_INCREMENT,
place VARCHAR(20),
year INT,
event VARCHAR(10),
name VARCHAR(35),
nation VARCHAR(10),
result VARCHAR(10)
);
<xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/>
</xsl:template>
<xsl:template match="//result">
INSERT INTO results(place,year,event,name,nation,result)
values(
'<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@place"/>',
'<xsl:value-of select="ancestor::athlet/ancestor::event/ancestor::OlympicGame/@year"/>',
'<xsl:value-of select="ancestor::athlet/ancestor::event/@dist"/>',
'<xsl:value-of select="ancestor::athlet/name"/>',
'<xsl:value-of select="ancestor::athlet/nation"/>',
'<xsl:value-of select="ancestor::athlet/result"/>');
</xsl:template>
</xsl:stylesheet>
Vi forutsetter at databasen (olympic) er opprettet, fjerner tabellen results, oppretter den på ny og fyller den fra XML-dataene.
Resultatet av transformasjonen (et utdrag) er slik:
USE olympic: DROP TABLE IF EXISTS results; CREATE TABLE results( resultid INT PRIMARY KEY AUTO_INCREMENT, place VARCHAR(20), year INT, event VARCHAR(10), name VARCHAR(35), nation VARCHAR(10), result VARCHAR(10) ); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Maurice Greene', 'USA', '9.87'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Obadele Thompson', 'BAR', '10.04'); INSERT INTO results(place,year,event,name,nation,result) values( 'Sidney', '2000', '100m', 'Dwain Chambers', 'GBR', '10.08'); ...
Vi kan så bruke denne fila, dbresult.txt, mot en database. F.eks. i følgende dialog i et dos-vindu:
C:\temp\xsldemo>mysql -u root -p Enter password:****** Welcome to the MySQL monitor. Commands end with . or \g. Your MySQL connection id is 3 to server version: 5.0.15-nt Type 'help' or '\h' for help. Type '\c' to clear the buffer mysql>source dbresult.txt;
Versjon 2
Vi beholder den samme databasestrukturen, og endrer transformasjonen som lager SQL. Endringene er to:
- Vi bruker kortformen ../ for å angi foreldre noden.
- Vi lage en samlet SQL-setning for insert oppgaven.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:template match="/">
USE olympic;
DROP TABLE IF EXISTS results;
CREATE TABLE results(
resultid INT PRIMARY KEY AUTO_INCREMENT,
place VARCHAR(20),
year INT,
event VARCHAR(10),
name VARCHAR(35),
nation VARCHAR(10),
result VARCHAR(10)
);
INSERT INTO results(place,year,event,name,nation,result)
values
<xsl:apply-templates select="IOC/OlympicGame/event/athlet/result"/>
;
</xsl:template>
<xsl:template match="//result">
(
'<xsl:value-of select="../../../@place"/>',
'<xsl:value-of select="../../../@year"/>',
'<xsl:value-of select="../../@dist"/>',
'<xsl:value-of select="../name"/>',
'<xsl:value-of select="../nation"/>',
'<xsl:value-of select="."/>'
)
<xsl:if test="not( position() = last())">,</xsl:if>
</xsl:template>
</xsl:stylesheet>
Resultatet blir slik (utdrag):
USE olympic:
DROP TABLE IF EXISTS results;
CREATE TABLE results(
resultid INT PRIMARY KEY AUTO_INCREMENT,
place VARCHAR(20),
year INT,
event VARCHAR(10),
name VARCHAR(35),
nation VARCHAR(10),
result VARCHAR(10)
);
INSERT INTO results(place,year,event,name,nation,result)
values
(
'Sidney',
'2000',
'100m',
'Maurice Greene',
'USA',
'09.87')
,
(
'Sidney',
'2000',
'100m',
'Obadele Thompson',
'BAR',
'10.04')
,
.....
(
'Athens',
'2004',
'400m',
'Michael Blackwood',
'JAM',
'45.55')
;
Oppdatering av basen blir som i versjon 1, men det går vesentlig raskere.
DB-oppdatering fra Python
I denne løsningen skriver vi et Pythonskript som både foretar tansformasjonen og oppdaterer databasen. Vi trenger vi to biblioteker for å få til dette:
- Vi benytter oss av biblioteket lxml for å få til en transformasjon fra et Pytonskript. Mer om lxml i modulen lxml . Vi bruker den samme transformasjonen som vi bruket i versjon 2 ovenfor.
- Vi benytter også en kopling mellom MySQL og Python, se modulen Databaser
Pythonkoden blir slik:
import MySQLdb
from lxml import etree
"""
Use lxml and MySQL
Transform XML-structure
and update database, one record for each result
"""
XML_FILE='c:\\web\\commondata\\olympiade\\all_results.xml'
XSLT_FILE='c:\\web\\dw\\olymp\\ol2db\\dbtrans2.xslt'
#----------------------------------
# file IO
def getTextFile(filename):
try:
file=open(filename,'r')
intext=file.read()
file.close()
return intext
except:
print 'Error reading file ',filename
return None
#----------------------------------
# connect and execute a sql-request
def connectAndExecute(sql):
try:
myBase=MySQLdb.connect(host='localhost',
user='root',
passwd='hemmelig',
db='olympic')
myTab=myBase.cursor()
myTab.execute(sql)
myBase.commit()
myBase.close()
except MySQLdb.Error, e:
print "Error %d: %s" % (e.args[0], e.args[1])
myBase.close()
#----------------------------------
# Perform transformation
def transform():
xmlTree=etree.parse(XML_FILE)
xsltTree=etree.parse(XSLT_FILE)
transform=etree.XSLT(xsltTree)
resultTree=transform(xmlTree)
# and update database
sqlsentences=str(resultTree).split(';')
print sqlsentences
for sentence in sqlsentences:
if len(sentence.strip()) > 0:
#print sentence
connectAndExecute(sentence)
transform()