Thanks, I will give this a go. I’m a bit of a novice at this, so there might be a few more questions to follow :’)
Hi @marpet
I tried creating the operator as you said, annoyingly it ran fine and created it!
I’ve looked at the compiled jar I’m running on the cluster and it definitely contains the CalibrationOp.class etc.
Here’s a screenshot to show:
I’m at a loss now; I can access the classes, but the OperatorSpis just aren’t being populated…
If I manually add the operator using:
import org.esa.snap.core.gpf.{GPF, OperatorSpi}

val calibrationSpi: OperatorSpi = new org.esa.s1tbx.calibration.gpf.CalibrationOp().getSpi
GPF.getDefaultInstance.getOperatorSpiRegistry.addOperatorSpi(calibrationSpi)
I then get:
Exception in thread "main" org.esa.snap.core.gpf.OperatorException: Operator 'CalibrationOp': Value for 'Source Band' is invalid: 'Intensity_VH'
I don’t want to do this manually, though, as locally that operator is already available. I also know the Source Band parameters are correct, because locally it runs fine.
For reference, we now believe that a call to getService(serviceName) in DefaultServiceRegistry (in com.bc.ceres.core) is not returning the Operator. While debugging we can see that the services HashMap only holds 12 entries. I’m not sure how this gets populated and whether it is system dependent?
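For reference, this is how I’m inspecting the registry (assuming I’m using SNAP’s OperatorSpiRegistry API correctly here; loadOperatorSpis() should trigger a scan of the SPI files on the classpath):

import scala.collection.JavaConverters._
import org.esa.snap.core.gpf.GPF

val registry = GPF.getDefaultInstance.getOperatorSpiRegistry
registry.loadOperatorSpis() // scans the classpath for registered OperatorSpi services
registry.getOperatorSpis.asScala.foreach(spi => println(spi.getOperatorAlias))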
Meanwhile I have a guess. It seems that you compile the classes differently than we do.
Do you also consider the files in resources/META-INF/services?
That’s where the OperatorSpis are defined.
See:
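For illustration, such a file is named after the service interface and lists the implementing classes, one per line. The exact entry below is an assumption about what s1tbx-op-calibration ships:

resources/META-INF/services/org.esa.snap.core.gpf.OperatorSpi:

org.esa.s1tbx.calibration.gpf.CalibrationOp$Spi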
Really do appreciate the help @marpet
For reference, here’s my whole pom.xml for the code I’m running; any pointers as to where I should change it are appreciated:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.mint</groupId>
    <artifactId>sar-imagery</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <scala.tools.version>2.11</scala.tools.version>
        <scala.version>${scala.tools.version}.8</scala.version>
        <spark.version>2.0.0.cloudera1</spark.version>
        <geotrellis.version>1.1.0-RC2</geotrellis.version>
        <spark.scope>compile</spark.scope>
        <snap.version>5.0.3</snap.version>
        <s1tbx.version>5.0.0</s1tbx.version>
    </properties>
    <dependencies>
        <!-- GIS -->
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-spark_${scala.tools.version}</artifactId>
            <version>${geotrellis.version}</version>
            <scope>${spark.scope}</scope>
        </dependency>
        <dependency>
            <groupId>com.vividsolutions</groupId>
            <artifactId>jts-core</artifactId>
            <version>1.14.0</version>
            <scope>${spark.scope}</scope>
        </dependency>
        <!-- Scala -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
            <scope>${spark.scope}</scope>
        </dependency>
        <!-- Spark -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.tools.version}</artifactId>
            <version>${spark.version}</version>
            <scope>${spark.scope}</scope>
        </dependency>
        <!-- SNAP -->
        <dependency>
            <groupId>org.esa.snap</groupId>
            <artifactId>snap-core</artifactId>
            <version>${snap.version}</version>
        </dependency>
        <dependency>
            <groupId>org.esa.s1tbx</groupId>
            <artifactId>s1tbx-io</artifactId>
            <version>${s1tbx.version}</version>
        </dependency>
        <dependency>
            <groupId>org.esa.s1tbx</groupId>
            <artifactId>s1tbx-op-calibration</artifactId>
            <version>${s1tbx.version}</version>
        </dependency>
        <dependency>
            <groupId>org.esa.s1tbx</groupId>
            <artifactId>s1tbx-commons</artifactId>
            <version>${s1tbx.version}</version>
        </dependency>
        <dependency>
            <groupId>org.esa.s1tbx</groupId>
            <artifactId>s1tbx-op-sar-processing</artifactId>
            <version>${s1tbx.version}</version>
        </dependency>
        <dependency>
            <groupId>org.esa.snap</groupId>
            <artifactId>snap-bigtiff</artifactId>
            <version>${snap.version}</version>
        </dependency>
        <!-- Test -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.specs2</groupId>
            <artifactId>specs2-core_${scala.tools.version}</artifactId>
            <version>3.7.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.specs2</groupId>
            <artifactId>specs2-junit_${scala.tools.version}</artifactId>
            <version>3.7.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_${scala.tools.version}</artifactId>
            <version>3.0.0-M15</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>${scala.version}</scalaVersion>
                    <scalaCompatVersion>${scala.tools.version}</scalaCompatVersion>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.2</version>
                <configuration>
                    <filters>
                        <filter>
                            <artifact>*:*</artifact>
                            <excludes>
                                <exclude>META-INF/*.SF</exclude>
                                <exclude>META-INF/*.DSA</exclude>
                                <exclude>META-INF/*.RSA</exclude>
                            </excludes>
                        </filter>
                    </filters>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>snap-repo-public</id>
            <name>Public Maven Repository for SNAP</name>
            <url>http://nexus.senbox.net/nexus/content/repositories/public/</url>
            <releases>
                <enabled>true</enabled>
                <checksumPolicy>warn</checksumPolicy>
            </releases>
            <snapshots>
                <enabled>true</enabled>
                <checksumPolicy>warn</checksumPolicy>
            </snapshots>
        </repository>
    </repositories>
    <profiles>
        <profile>
            <id>cluster</id>
            <properties>
                <spark.scope>provided</spark.scope>
            </properties>
        </profile>
    </profiles>
</project>
I think you will need this transformer:
https://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer
It merges multiple META-INF/services files into a single one.
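Concretely, that means extending the maven-shade-plugin configuration in your pom above along these lines (a sketch; the transformer class is the one from the linked documentation):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.2</version>
    <configuration>
        <transformers>
            <!-- merges the META-INF/services files of all dependencies into one -->
            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
        <!-- your existing <filters> section stays as it is -->
    </configuration>
</plugin>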
Brilliant! Now past where it was breaking, many thanks for that @marpet!
Really appreciate you taking the time to help, I understand it’s not the usual SNAP/snappy questions haha!
@marpet With your cluster, do you know if you do a similar thing for reading as you do for writing? E.g. ProductIO.writeProduct()
to a temporary file, then adding this to HDFS? It doesn’t look like I can write a product to a path in HDFS directly from the SNAP classes.
I get a java.io.IOException: failed to create data output directory: <my directory in hdfs>
It’s either an issue similar to the reading one, or possibly permissions? Although I’m running the code as the owner of the folders in HDFS.
Yes, we do the same for writing. First write it locally and then copy to HDFS.
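For the read direction it would look roughly like this (a sketch, not our actual code — the paths are made up, and it assumes Hadoop’s FileUtil.copy overload that copies from a FileSystem to a local File):

import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.esa.snap.core.dataio.ProductIO

val conf = new Configuration()
val fs = FileSystem.get(conf)
// copy the product out of HDFS to a local temp file first
val localCopy = File.createTempFile("product", ".tif")
FileUtil.copy(fs, new Path("/data/products/product.tif"), localCopy, false, conf)
// then let SNAP read from the local file system as usual
val product = ProductIO.readProduct(localCopy)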
Just in case you are curious, this is the about page of our cluster. It’s a bit outdated.
The disk space is by now around ~1.5 PB.
Hi @marpet
When trying to write the file locally and copy it to HDFS, I’m having an issue with saving as BEAM-DIMAP: it appears that writing to a temporary file with ProductIO.writeProduct()
does not produce the .data directory, only the .dim file. How would you go about doing this, as ProductIO.writeProduct()
only takes one File
object to write to?
I think I’ll drop out as the man in the middle for now and delegate you to my colleague @mzuehlke. He works more with the cluster.
No worries! Many thanks for the help again!
@mzuehlke Could you offer any advice on my comment above?
http://forum.step.esa.int/t/snap-and-hdfs/5250/23?u=ciaranevans
Hi @CiaranEvans,
the DIMAP writer tries to write the .data
directory into the same directory that you specify for the .dim
file.
Could it be that this directory is not writable? It might be better to create a temporary directory first and then write the .dim
into that directory.
If I’m pointing in the wrong direction, could you share the lines of code that involve the writing?
Cheers,
Marco
I think I’m taking too much of a shortcut in how I write the temporary file:
import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.esa.snap.core.dataio.ProductIO

val configuration: Configuration = new Configuration()
val fileSystem: FileSystem = FileSystem.get(configuration)
val filepath = productBaseDir + filename + fileExtension
// write to a single local temp file, then move it into HDFS (deleteSource = true)
var fileToSave: File = File.createTempFile(filename, fileExtension)
ProductIO.writeProduct(product, fileToSave, fileType, false)
FileUtil.copy(fileToSave, fileSystem, new Path(filepath), true, configuration)
With FileUtil from org.apache.hadoop.fs.FileUtil
, the file extension being “.dim”, and the rest just file paths relating to my folder structure.
I tried using .tif and this worked, but I assume that’s because it writes one file to one temp file, whereas .dim actually produces two outputs (the .dim file plus the .data directory).
I’m writing Java code below; my Scala is currently read-only.
import java.io.File;
import java.nio.file.Files;
import org.esa.snap.core.dataio.ProductIO;
File tmpDir = Files.createTempDirectory("dummy").toFile();
File dimFile = new File(tmpDir, "my_dimap_product.dim");
ProductIO.writeProduct(product, dimFile, fileType, false);
After that, in your tmpDir
there should be a .dim
file and a .data
directory.
You should then open an OutputStream
to HDFS and wrap it inside a ZipOutputStream:
// fileSystem.create() opens a stream that writes straight into HDFS
OutputStream os = fileSystem.create(new Path(filepath));
ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(os));
And then recursively write all files and directories into the zip stream, e.g. addDirToZipArchive(zos, tmpDir) — see the sketch below.
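Such a helper could look roughly like this (a sketch, here in Scala since that’s what you’re working in — only the name addDirToZipArchive comes from the example above, the rest is assumed):

import java.io.{BufferedInputStream, File, FileInputStream}
import java.util.zip.{ZipEntry, ZipOutputStream}

def addDirToZipArchive(zos: ZipOutputStream, fileOrDir: File, parentPath: String = ""): Unit = {
  // zip entries are named relative to the directory you start from
  val entryName = if (parentPath.isEmpty) fileOrDir.getName else parentPath + "/" + fileOrDir.getName
  if (fileOrDir.isDirectory) {
    fileOrDir.listFiles().foreach(child => addDirToZipArchive(zos, child, entryName))
  } else {
    zos.putNextEntry(new ZipEntry(entryName))
    val in = new BufferedInputStream(new FileInputStream(fileOrDir))
    try {
      val buffer = new Array[Byte](8192)
      Iterator.continually(in.read(buffer)).takeWhile(_ != -1).foreach(n => zos.write(buffer, 0, n))
    } finally in.close()
    zos.closeEntry()
  }
}

Once everything is written, remember to close zos, which also flushes the underlying HDFS stream.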
Hope this helps.
Hadoop can be tricky.
Brilliant! Will give that a go, no worries! I’m a student so even Java isn’t my best! Luckily Scala and Java are very compatible.
Many thanks for the pointers!
Hello @CiaranEvans,
I coded it like you described above to create a temp file, and HDFS can load this file, but when I use product.getBands() there is an exception: java.io.IOException: Stream closed at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy. Can you help me solve this?
Best wishes!
Xiaojian.
@Xiaojian_Gan Could you possibly show me the code you’re using to do this?
I can’t really help with what you’ve given me right now; the most I can offer is that it seems your file system was closed before you tried to access it.
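One classic pitfall worth checking (just a guess, since I can’t see your code): FileSystem.get() returns a cached, JVM-wide instance, so closing it anywhere also closes it for code that is still reading. A hypothetical sketch of how that bites:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

val fs = FileSystem.get(new Configuration()) // cached, shared across the whole JVM
fs.close() // closes the same instance every other caller of FileSystem.get() received
// any stream still open on it now fails, e.g. with "java.io.IOException: Stream closed"

The usual fix is to keep the FileSystem (and any stream SNAP reads from) open until you are completely done with the Product, or to copy the file to local disk first, as discussed above.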