Kudos to Flo for sharing these examples!

 

Groovy inside Scriptella <query> and <script> elements

To use Groovy scripts inside Scriptella, follow these steps:

  1. Download jsr223-engines.zip and move groovy-engine.jar into the scriptella/lib directory.
  2. Download Groovy from http://groovy.codehaus.org/Download and put groovy-all-1.7.2.jar from the embeddable folder into the scriptella/lib directory.
  3. Test the Groovy installation with the following ETL file:

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="groovy" driver="script">language=groovy</connection>
    <script connection-id="groovy">
        println "Hello World"
    </script>
</etl>
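
If the test file fails because the language is not recognised, it can help to check which scripting engines the JVM actually sees. The sketch below assumes that Scriptella's script driver resolves language=groovy through the standard javax.script (JSR 223) lookup, and that it is run with the same jars on the classpath as scriptella/lib. The helper name availableEngines is made up for this illustration.

```groovy
import javax.script.ScriptEngineManager

// Hypothetical helper: collects every name under which a JSR 223 engine is registered.
List<String> availableEngines() {
    def manager = new ScriptEngineManager()
    manager.engineFactories.collectMany { it.names }
}

def engines = availableEngines()
println "Registered script engines: ${engines}"
println(engines.any { it.equalsIgnoreCase('groovy') } ? 'Groovy engine found' : 'Groovy engine NOT found')
```

If "groovy" is not listed, the engine jar or the Groovy jar is probably missing from the classpath.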

Example 1 - Searching a folder for XML files, storing references to them, and processing each one in a separate Scriptella ETL file:

<etl>
    <properties>
        datasetPath=C:/datasets/
    </properties>
    <connection id="groovy" driver="script">language=groovy</connection>
    <connection id="scriptella" driver="scriptella" />
    <query connection-id="groovy"> 
        def dir = new File("$datasetPath")
        dir.eachFile{source->
            if(source.name.endsWith(".xml")) {
                datasetFileName = source.name
                datasetFileUrl = "$datasetPath$datasetFileName"
                query.next()
            }
        }
        <script connection-id="scriptella">
            modules/version1/processXmlFiles.etl.xml
        </script>
    </query>
</etl>
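
The folder scan in Example 1 can be tried outside Scriptella first. This is a minimal sketch in plain Groovy: findXmlDatasets is a hypothetical helper, and the temporary folder and file names exist only for the demo. It returns the same two values that Example 1 exposes as datasetFileName and datasetFileUrl.

```groovy
import java.nio.file.Files

// Hypothetical helper: returns name and full path of every .xml file in a folder.
List<Map> findXmlDatasets(File dir) {
    def found = []
    dir.eachFile { source ->
        // Skip sub-folders and match the extension case-insensitively
        if (source.isFile() && source.name.toLowerCase().endsWith('.xml')) {
            found << [datasetFileName: source.name, datasetFileUrl: source.path]
        }
    }
    return found.sort { it.datasetFileName }
}

// Demo on a throwaway folder (created here only for illustration)
def tmp = Files.createTempDirectory('datasets').toFile()
new File(tmp, 'a.xml').text = '<a/>'
new File(tmp, 'b.txt').text = 'ignored'
findXmlDatasets(tmp).each { println "${it.datasetFileName} -> ${it.datasetFileUrl}" }
tmp.deleteDir()
```

Unlike Example 1, the path is taken from the File object, so it does not depend on datasetPath ending with a slash.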

You can also call Java classes from a Groovy <query> or <script> element and use the resulting variables in nested elements. The following snippet generates a UUID; Groovy imports java.util.* by default, so the package prefix is optional:
<script connection-id="groovy">
	uniqueIdentifier = java.util.UUID.randomUUID().toString().replace('-','');
</script>
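
A standalone sketch of the same call, for trying it in a Groovy console. The helper name compactUuid is hypothetical. Note that the Scriptella snippet assigns the variable without def, whereas this sketch uses a local variable.

```groovy
// Hypothetical helper name; java.util.* is imported by default in Groovy,
// so UUID needs neither an import nor a package prefix.
String compactUuid() {
    UUID.randomUUID().toString().replace('-', '')
}

def uniqueIdentifier = compactUuid()
// The canonical UUID text has 36 characters, 4 of them hyphens, so 32 hex digits remain
assert uniqueIdentifier ==~ /[0-9a-f]{32}/
println uniqueIdentifier
```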

The other way round: busy Ants inside Scriptella (really groovy)

From inside Groovy you can use the power of Ant through Groovy's AntBuilder.

To use Ant from inside a Groovy element:

  1. Download the Ant binary distribution from http://ant.apache.org/bindownload.cgi
  2. Move ant.jar and ant-launcher.jar into the scriptella/lib directory.

Example 2 - Validate the contents of a ZIP archive:

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="groovy" driver="script">language=groovy</connection>
    <connection id="scriptella" driver="scriptella" />
    <query connection-id="groovy">
        def ant = new AntBuilder()
        ant.resources(id: "artifacts.set.discrepancies") {
            difference() {
                zipfileset(src: "$archiveFileUrl" ) {
                    include(name: "**/*")
                }
                resources() {
                    zipentry(archive: "$archiveFileUrl", name: "metadata.xml")
                    zipentry(archive: "$archiveFileUrl", name: "pictures/picture.tif")
                }
            }
        }
        ant.pathconvert(property: "set.difference", refid: "artifacts.set.discrepancies");
        ant.condition(property: "artifacts.verified") {
            resourcecount(count: 0, when: "eq", refid: "artifacts.set.discrepancies")
        }
        ant.fail(unless: "artifacts.verified", message:"The archive is missing some files")
        query.next()
        <script connection-id="scriptella">
            unzip.etl.xml
        </script>
    </query>
</etl>
  • Good performance: the archive contents are checked without extracting them first.
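
For comparison, here is a minimal sketch of a similar check without Ant, using java.util.zip from the JDK. The helper missingEntries and the demo archive are assumptions made for illustration. Unlike the Ant version, which compares the full entry set, it only reports required entries that are absent.

```groovy
import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

// Hypothetical helper: returns the required entry names that are absent from the archive.
// Only the archive's directory is read; nothing is extracted.
List<String> missingEntries(File archive, List<String> required) {
    def zip = new ZipFile(archive)
    try {
        return required.findAll { zip.getEntry(it) == null }
    } finally {
        zip.close()
    }
}

// Demo: build a small archive that contains metadata.xml but no picture
def archive = File.createTempFile('dataset', '.zip')
def out = new ZipOutputStream(new FileOutputStream(archive))
out.putNextEntry(new ZipEntry('metadata.xml'))
out.write('<metadata/>'.bytes)
out.closeEntry()
out.close()

def missing = missingEntries(archive, ['metadata.xml', 'pictures/picture.tif'])
println(missing ? "The archive is missing: ${missing}" : 'Archive verified')
archive.delete()
```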

Example 3 - Unzip files from a ZIP archive:

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="groovy" driver="script">language=groovy</connection>
    <script>
        def ant = new AntBuilder()
        ant.unzip(src: "$datasetFileUrl", dest: "$destPath") {
            patternset { include(name: "**/*.tif") }
            mapper(type: "glob", from: "*.tif", to: "$datasetFileName"+".tif")
        }
    </script>
</etl>

Example 4 - Validating XML against an XSD schema:

<etl>
    <connection driver="script">language=groovy</connection>
    <script>
        def ant = new AntBuilder()
        ant.xmlvalidate(file: "G:/scriptella-1.0rc2/modules/test/xml/meta.xml",
                        classname: "org.apache.xerces.parsers.SAXParser",
                        lenient: "false", failonerror: "true", warn: "true") {
            attribute(name: "http://apache.org/xml/features/validation/schema", value: "true")
            property(name: "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
                     value: "G:/scriptella-1.0rc2/modules/ver/ISO19115_schema.xsd")
        }
    </script>
</etl>
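
If Ant or Xerces is not available, a similar check can be sketched with the JDK's javax.xml.validation API. This is an alternative technique, not what Example 4 does; the helper validationError, the tiny schema and the two sample documents are all made up for illustration.

```groovy
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.SAXException

// Hypothetical helper: returns null when the document is valid, else the parser's error message.
String validationError(String xml, String xsd) {
    def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    try {
        def schema = factory.newSchema(new StreamSource(new StringReader(xsd)))
        schema.newValidator().validate(new StreamSource(new StringReader(xml)))
        return null
    } catch (SAXException e) {
        return e.message
    }
}

// Tiny made-up schema, for illustration only
def xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="meta">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

println validationError('<meta><title>Dataset</title></meta>', xsd)   // valid document
println validationError('<meta><name>Dataset</name></meta>', xsd)     // unexpected child element
```

To validate files instead of strings, StreamSource also accepts a java.io.File.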

Example 5 - Send files to an FTP server:

<etl>
    <connection id="groovy" driver="script">language=groovy</connection>
    <script connection-id="groovy">
        ant = new AntBuilder()
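        // The opening brace must follow the closing parenthesis on the same line;
        // otherwise Groovy does not treat the closure as an argument of ant.ftp(...)
        ant.ftp(server: "$hostFTP",
                userid: "$userNameFTP",
                password: "$userPswdFTP",
                remotedir: "$dirFTP") {
            fileset(dir: "$datasetPath") {
                include(name: "$datasetFileName*")
            }
        }
    </script>
</etl>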

Example 6 - Calling a command-line tool and reading its output:

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <!--  -->
    <description>
        Command line tool: G:\scriptella-1.0rc2\tools\windows\gdal\gdalinfo.exe C:\data\raster\479b90525f4148ae8db4cf534ebaf12a.tif
    </description>
    <connection id="groovy" driver="script">language=groovy</connection>
    <query connection-id="groovy" >
        uniqueIdentifier = java.util.UUID.randomUUID().toString().replace('-','');
        def cmd = "$gdalinfo $GeotiffPath$datasetFileName"+".tif"
        def rasterInfo = cmd.execute().text
        // startsWith is used instead of source[0..6] so that lines shorter than
        // seven characters do not raise an index error
        rasterInfo.eachLine{source->
            if(source.startsWith("Upper L")) {
                upperLeftX = source[14..24]
                upperLeftY = source[27..37]
            }
            if(source.startsWith("Upper R")) {
                upperRightX = source[14..24]
                upperRightY = source[27..37]
            }
            if(source.startsWith("Lower R")) {
                lowerRightX = source[14..24]
                lowerRightY = source[27..37]
            }
            if(source.startsWith("Lower L")) {
                lowerLeftX = source[14..24]
                lowerLeftY = source[27..37]
            }
        }
        println "---Extract BBOX---"
        println ""
        query.next();
        <script connection-id="groovy">
            // The ring is closed by repeating the first corner; the last corner uses lowerLeftX
            gt = "GeomFromText('POLYGON(($upperLeftX $upperLeftY, $upperRightX $upperRightY, $lowerRightX $lowerRightY, $lowerLeftX $lowerLeftY, $upperLeftX $upperLeftY))',$srid)";
            updateQuery = "INSERT INTO $footprintTable (the_geom, uuid) VALUES ($gt, '$uniqueIdentifier')";
            println updateQuery
        </script>
    </query>
</etl>
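
The fixed character offsets in Example 6 depend on the exact column layout of the gdalinfo output. As a hedged alternative, the sketch below extracts the corners with a regular expression. The helpers parseCorners and toWktPolygon are hypothetical, and the sample text is only modelled on gdalinfo's "Corner Coordinates" block with made-up numbers.

```groovy
// Hypothetical helper: extracts the four corner coordinates from gdalinfo-style output.
Map<String, List<String>> parseCorners(String rasterInfo) {
    def corners = [:]
    def pattern = ~/^(Upper|Lower) (Left|Right)\s*\(\s*(-?[\d.]+),\s*(-?[\d.]+)\)/
    rasterInfo.eachLine { line ->
        def m = pattern.matcher(line)
        if (m.find()) {
            // Key is e.g. "Upper Left", value is [x, y] as strings
            corners["${m.group(1)} ${m.group(2)}".toString()] = [m.group(3), m.group(4)]
        }
    }
    return corners
}

// Builds a closed WKT ring: upper left, upper right, lower right, lower left, upper left again
String toWktPolygon(Map<String, List<String>> c) {
    def ring = ['Upper Left', 'Upper Right', 'Lower Right', 'Lower Left', 'Upper Left']
    return 'POLYGON((' + ring.collect { c[it].join(' ') }.join(', ') + '))'
}

// Sample text for illustration only; the numbers are made up
def sample = '''Corner Coordinates:
Upper Left  (  440720.000, 3751320.000)
Lower Left  (  440720.000, 3720600.000)
Upper Right (  471440.000, 3751320.000)
Lower Right (  471440.000, 3720600.000)
Center      (  456080.000, 3735960.000)
'''
println toWktPolygon(parseCorners(sample))
```

The resulting string can be placed inside GeomFromText('...', srid) in the same way as the gt variable in Example 6.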

Final thoughts

If you put much of this into your ETL files, stacking them together can get messy when the ETL process is not linear, because you end up with many ETL files containing similar processing code. Building a system-specific or file-format-specific driver could be much cleaner: the ETL files would hold only the data flow instructions (in a domain-specific language) and the driver would do the processing. That way you stay closer to the standard Scriptella way of querying and scripting against system or file-format connections.