Task: You have a huge XML file (300 MB) and need to extract from it just one tag whose attribute has a specific value.
On Ubuntu 12.04, install the XSLT processor:
sudo apt-get install xsltproc
Structure of the huge XML file (hugeFile.xml):
<?xml version="1.0" encoding="ISO-8859-1"?>
<Feed ExtractDate="07/25/2013" ExtractTime="15:30:15">
..... a lot of companies information
<COMPANY ... LegalName="MyCompany" .....>
..... a lot of inner tags .....
</COMPANY>
..... a lot of companies information
</Feed>
Create the extract.xsl file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:element name="TagToExtract">
      <xsl:apply-templates select="//COMPANY[@LegalName='MyCompany']" />
    </xsl:element>
  </xsl:template>
  <xsl:template match="//COMPANY[@LegalName='MyCompany']">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
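To make the selection logic concrete, here is a rough Python sketch of what the stylesheet does: pick every COMPANY whose LegalName is "MyCompany" and copy it, with all its attributes and inner tags, into a TagToExtract wrapper. It is not part of the xsltproc workflow. The sample data, the Address tag and the extract_company helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for hugeFile.xml. Element and attribute names follow the
# structure shown above; the Address tag and its values are invented.
SAMPLE = """<Feed ExtractDate="07/25/2013" ExtractTime="15:30:15">
  <COMPANY LegalName="OtherCompany"><Address>A</Address></COMPANY>
  <COMPANY LegalName="MyCompany"><Address>B</Address></COMPANY>
</Feed>"""


def extract_company(xml_text, legal_name):
    """Return a TagToExtract element holding every matching COMPANY."""
    root = ET.fromstring(xml_text)
    wrapper = ET.Element("TagToExtract")
    # Same predicate as //COMPANY[@LegalName='MyCompany'] in the stylesheet.
    for company in root.iter("COMPANY"):
        if company.get("LegalName") == legal_name:
            wrapper.append(company)  # keeps attributes and all inner tags
    return wrapper


result = extract_company(SAMPLE, "MyCompany")
print(ET.tostring(result, encoding="unicode"))
```

Only the second COMPANY ends up inside the wrapper; everything else in the Feed is dropped, which is the behavior expected from extract.xsl.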
Run the XSLT processor (takes 12 seconds):
xsltproc extract.xsl hugeFile.xml > 1.xml
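If memory becomes a problem on even larger files, a streaming parse is one possible alternative, since an XSLT 1.0 processor such as xsltproc works on the whole document tree. The sketch below swaps XSLT for Python's xml.etree.ElementTree.iterparse. The stream_extract function and the sample data are hypothetical, and it assumes COMPANY elements are direct children of Feed and are never nested inside each other.

```python
import io
import xml.etree.ElementTree as ET

# Small in-memory stand-in for hugeFile.xml (made-up data, same shape as above).
SAMPLE = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b'<Feed ExtractDate="07/25/2013" ExtractTime="15:30:15">'
    b'<COMPANY LegalName="OtherCompany"><Address>A</Address></COMPANY>'
    b'<COMPANY LegalName="MyCompany"><Address>B</Address></COMPANY>'
    b'</Feed>'
)


def stream_extract(source, out, tag="COMPANY", attr="LegalName", value="MyCompany"):
    """Write matching elements to out, wrapped in TagToExtract; return the match count."""
    matches = 0
    out.write(b"<TagToExtract>")
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            if elem.get(attr) == value:
                elem.tail = None  # drop whitespace that followed the element
                out.write(ET.tostring(elem))
                matches += 1
            root.clear()  # discard finished children so memory stays bounded
    out.write(b"</TagToExtract>")
    return matches


buf = io.BytesIO()
count = stream_extract(io.BytesIO(SAMPLE), buf)
print(count, buf.getvalue().decode("ascii"))
```

For the real file, pass open("hugeFile.xml", "rb") and open("1.xml", "wb") in place of the two BytesIO objects.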
