Showing posts with label xml. Show all posts

Tuesday, January 3, 2017

Process Maven's pom.xml file with xmlstarlet

General command to get the version of a specific plugin dependency from a pom file:

xmlstarlet sel -N pom=http://maven.apache.org/POM/4.0.0 -t -m "/pom:project/pom:build/pom:pluginManagement/pom:plugins/pom:plugin[pom:artifactId='maven-checkstyle-plugin']/pom:dependencies/pom:dependency[pom:artifactId='checkstyle']/pom:version" -v . pom.xml
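
For machines without xmlstarlet, the same namespace-aware lookup can be sketched in Python with the standard library. This is only an illustration, not a replacement for the command above: the function name plugin_dependency_version, the embedded SAMPLE_POM and its version value 7.4 are made up for the example.

```python
import xml.etree.ElementTree as ET

# Same prefix-to-URI mapping that -N passes to xmlstarlet.
POM_NS = {"pom": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical minimal pom, only for demonstration.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <build><pluginManagement><plugins>
    <plugin>
      <artifactId>maven-checkstyle-plugin</artifactId>
      <dependencies>
        <dependency>
          <artifactId>checkstyle</artifactId>
          <version>7.4</version>
        </dependency>
      </dependencies>
    </plugin>
  </plugins></pluginManagement></build>
</project>"""


def plugin_dependency_version(root, plugin_id, dependency_id):
    """Return the <version> text of a plugin's dependency, or None if absent.

    root is the <project> element, so the path starts below it.
    """
    path = (
        "pom:build/pom:pluginManagement/pom:plugins/"
        f"pom:plugin[pom:artifactId='{plugin_id}']/"
        "pom:dependencies/"
        f"pom:dependency[pom:artifactId='{dependency_id}']/pom:version"
    )
    return root.findtext(path, namespaces=POM_NS)


root = ET.fromstring(SAMPLE_POM)
print(plugin_dependency_version(root, "maven-checkstyle-plugin", "checkstyle"))
```

For a real file, ET.parse("pom.xml").getroot() would supply the root element instead of the sample string.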

Examples for editing and selection are at:
https://github.com/sevntu-checkstyle/sevntu.checkstyle/blob/master/pom-version-bump.sh

https://github.com/checkstyle/checkstyle/wiki/How-to-generate-Checkstyle-report-for-Google-Guava-project

A few XPath conditions for queries:
http://stackoverflow.com/questions/28370054/find-a-specific-groupid-from-a-maven-pom-xml-using-xmstarlet


Useful command to print all elements (with attributes and their values) from an XML file, to help construct an XPath:
xmlstarlet el -v pom.xml
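
A rough Python sketch of the same idea, listing a slash-separated path for every element. It only approximates the xmlstarlet output (no attributes, namespaces dropped), and the name element_paths is made up for the example.

```python
import xml.etree.ElementTree as ET


def element_paths(root):
    """Yield a slash-separated path for every element, in document order."""

    def local(tag):
        # ElementTree stores namespaced tags as "{uri}name"; keep only "name".
        return tag.rsplit("}", 1)[-1]

    def walk(elem, prefix):
        path = f"{prefix}/{local(elem.tag)}" if prefix else local(elem.tag)
        yield path
        for child in elem:
            yield from walk(child, path)

    yield from walk(root, "")


# Tiny inline document, only for demonstration.
demo = ET.fromstring(
    '<project xmlns="http://maven.apache.org/POM/4.0.0">'
    "<build><plugins><plugin/></plugins></build></project>"
)
for path in element_paths(demo):
    print(path)
```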


Sunday, July 28, 2013

Extract certain tag with inner tags from huge XML

Task: You have a huge XML file (300 MB) and need to extract (filter out/grab) from it only one tag, together with its inner tags, that has a specific attribute value.

Under Ubuntu 12.04, install the XSLT processor:
sudo apt-get install xsltproc

Structure of huge XML (hugeFile.xml):
<?xml version="1.0" encoding="ISO-8859-1"?>
<Feed ExtractDate="07/25/2013" ExtractTime="15:30:15">
  ..... a lot of companies information
  <COMPANY ... LegalName="MyCompany" .....>
    ..... a lot of inner tags .....
  </COMPANY>
  ..... a lot of companies information
</Feed>
 

Create extract.xsl file:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root: wrap every matching COMPANY element in a TagToExtract element -->
  <xsl:template match="/">
    <xsl:element name="TagToExtract">
      <xsl:apply-templates select="//COMPANY[@LegalName='MyCompany']" />
    </xsl:element>
  </xsl:template>
  <!-- Identity template: copy each selected element with all its attributes and inner tags -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Run the XSLT processor (takes 12 seconds):
xsltproc extract.xsl hugeFile.xml > 1.xml
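
xsltproc builds the whole document tree in memory before applying the stylesheet. If memory becomes a problem on even bigger files, a streaming variant can be sketched in Python with iterparse. This is an assumption-laden sketch, not a measured alternative: it assumes COMPANY elements sit below the root and are not nested in each other, and the function name extract_elements and the SAMPLE data are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for hugeFile.xml, only for demonstration.
SAMPLE = (
    b'<Feed ExtractDate="07/25/2013" ExtractTime="15:30:15">'
    b'<COMPANY LegalName="Other"><Inner>a</Inner></COMPANY>'
    b'<COMPANY LegalName="MyCompany"><Inner>b</Inner></COMPANY>'
    b"</Feed>"
)


def extract_elements(source, tag, attr, value):
    """Stream through source and return elements named tag with attr == value."""
    matches = []
    events = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(events)  # first event is the start of the root element
    for event, elem in events:
        if event == "end" and elem.tag == tag:
            if elem.get(attr) == value:
                matches.append(elem)
            # Detach everything parsed so far from the root to keep memory flat;
            # matched elements survive because the list still references them.
            root.clear()
    return matches


# Wrap the matches the same way the stylesheet does and print the result.
wrapper = ET.Element("TagToExtract")
wrapper.extend(extract_elements(io.BytesIO(SAMPLE), "COMPANY", "LegalName", "MyCompany"))
print(ET.tostring(wrapper, encoding="unicode"))
```

For the real file, pass the file name (for example "hugeFile.xml") as source instead of the in-memory sample.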