Introducing snapista, a GPT wrapper for Python

Hello!

Let me introduce snapista, a GPT wrapper for Python. The goal is to provide an easy and pythonic way (to some extent) to write and run SNAP graphs using Python (a script or a Jupyter Notebook).

To create a graph, one would write:

from snapista import Graph
from snapista import Operator
g = Graph()
g.add_node(operator=Operator('Read'), 
           node_id='read_1')

calibration = Operator('Calibration')

calibration.createBetaBand = 'false'

g.add_node(operator=calibration, 
           node_id='calibration', 
           source='read_1')

It is documented here: https://snap-contrib.github.io/snapista/ and the software repo is on GitHub here: https://github.com/snap-contrib/snapista

There’s a demo on Binder (linked in the software repo README) and a set of examples in the documentation.

You’ll also find other elements that may support your activities in the GitHub organization https://github.com/snap-contrib, e.g.:

  • SNAP packaged with conda
  • a docker container with SNAP and Python
  • how to use Visual Studio Code with remote containers
  • a Jupyter/Theia docker image with SNAP and Python

Feedback and comments welcome!


I’d like to ask about graphs already created in SNAP: for instance, how to find a node, edit it, and make it processable with Python.
For example:
file path = Process.xml
find the Read node
change it so that it runs inside a Python for loop, in order to process multiple tiles within a directory.

Could you give an example, if snapista can do this?
Thanks a lot.

Hello @falahfakhri,

The goal here is to get away from XML and instead write code to create and process Graphs.
So although snapista can be used to serialize a snapista.Graph object to XML, it is not meant to deserialize an XML graph and create snapista objects (Graph and Operator).

Fabrice
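
For an existing graph file, the loop asked about above can also be done outside snapista with the Python standard library. The following is only a minimal sketch, not a snapista feature: it assumes the usual SNAP graph layout (node elements holding operator and parameters/file children), and the inline graph and the tile file names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal SNAP graph used only for this illustration.
GRAPH_XML = """<graph id="Graph">
  <version>1.0</version>
  <node id="Read">
    <operator>Read</operator>
    <sources/>
    <parameters>
      <file>placeholder.dim</file>
    </parameters>
  </node>
</graph>"""


def graph_for_input(graph_xml, input_path):
    """Return a copy of the graph XML whose Read node points at input_path."""
    root = ET.fromstring(graph_xml)
    for node in root.iter("node"):
        # Only nodes whose operator is Read carry the input file parameter.
        if node.findtext("operator") == "Read":
            node.find("parameters/file").text = input_path
    return ET.tostring(root, encoding="unicode")


# One graph per input tile; each string could be written to disk and run with gpt.
graphs = {path: graph_for_input(GRAPH_XML, path)
          for path in ["tile_a.dim", "tile_b.dim"]}
```

Each generated graph differs only in the file parameter of the Read node, so the rest of the processing chain stays identical across tiles.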


Hi @fabricebrito,
sounds awesome! I’ll try it.
Is your wrapper using the snappy package behind the curtains?
I ask because, after finishing a complex Python script with snappy, it was very painful to see that the processing was very slow, much slower than using the command line (pure Java).
I would love to see a way using python for my large batch operations with comparable speed…

I tried every way to install it (Anaconda Python 3.8.8, Windows 10), but all of them failed.

Hello @florian.beyer,

It uses snappy to get the information about the operators, but relies on gpt for the execution by doing a system call. So the processing performance is that of gpt.
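
To illustrate what "a system call to gpt" means in general, here is a minimal sketch using subprocess. It is not snapista's actual implementation; the function names build_gpt_command and run_graph are hypothetical, and it assumes gpt is on the PATH.

```python
import shutil
import subprocess


def build_gpt_command(graph_path, gpt_executable="gpt"):
    """Return the argument list for running one graph file with gpt."""
    return [gpt_executable, str(graph_path)]


def run_graph(graph_path, gpt_executable="gpt"):
    """Run gpt on a graph file and return its exit code.

    Raises FileNotFoundError when the gpt executable cannot be found.
    """
    if shutil.which(gpt_executable) is None:
        raise FileNotFoundError("gpt executable not found: " + gpt_executable)
    completed = subprocess.run(build_gpt_command(graph_path, gpt_executable))
    return completed.returncode


# Building the command does not require SNAP to be installed.
command = build_gpt_command("graph.xml")
```

Because the heavy lifting happens in the gpt process, the Python side only adds the cost of starting that process.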


Hello @falahfakhri, snapista relies on SNAP packaged with conda, and that package targets Linux.
For Windows and Mac OS, there’s guidance on how to use VS Code remote containers to run a Linux based container for the development activities.
See https://snap-contrib.github.io/snapista/installation/

Hope this helps

Hi @fabricebrito!
This sounds really good! Thanks for the efforts and sharing with the community.

I have seen that it relies on SNAP 8.0.0, and I am wondering whether it automatically checks for plugin updates and is able to install them once they are available.

Additionally, is there an easy way to make it compatible with previous SNAP versions, or with newer ones?

Thanks!


Hello @mdelgado,

It does rely on snap 8.0.0, so when you install snapista you get snap 8.0.0 installed.
For updating it, you need to follow the same approach as for snap headless updates.

Here’s an example for installing idepix for Sentinel-3

${PREFIX}/snap/bin/snap --nosplash --nogui --modules --install org.esa.snap.idepix.core                 
${PREFIX}/snap/bin/snap --nosplash --nogui --modules --install org.esa.snap.idepix.olci 

For older or newer versions, one would have to do the equivalent of https://github.com/snap-contrib/snap-conda for these versions.

I hope you have all the info requested


Great contribution Fabrice, I was wondering if we could automate generating new versions of snapista with each SNAP module update so that novice users would not need to deal with Conda?

Hello @mengdahl,
snapista does not have to change when SNAP gets an update. What changes is its underlying dependency, SNAP packaged with conda (snap-conda). So we don’t have to rebuild snapista when that happens.
snap-conda will follow major updates of SNAP. For additional modules or their updates, users (novice users included) have to update the SNAP installation following the headless update with

snap --nosplash --nogui --modules --update-all

which for a snap installed with conda is:

${PREFIX}/snap/bin/snap --nosplash --nogui --modules --update-all
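
If these headless commands need to be scripted from Python, a small helper can assemble them. This is just a sketch: the function name snap_modules_command is hypothetical, the prefix is whatever directory your conda environment lives in, and only the flags quoted in this thread are used.

```python
def snap_modules_command(prefix, action, module=None):
    """Build the headless SNAP module command as an argument list.

    action is "install" (requires module) or "update-all".
    """
    command = [f"{prefix}/snap/bin/snap", "--nosplash", "--nogui", "--modules"]
    if action == "install":
        if module is None:
            raise ValueError("install requires a module name")
        command += ["--install", module]
    elif action == "update-all":
        command.append("--update-all")
    else:
        raise ValueError("unknown action: " + action)
    return command


# The prefix below is a made-up example path.
update_cmd = snap_modules_command("/opt/conda/envs/snap", "update-all")
install_cmd = snap_modules_command(
    "/opt/conda/envs/snap", "install", "org.esa.snap.idepix.core"
)
```

The resulting lists can be passed to subprocess.run as they are.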

Hello @falahfakhri

There’s a new update in snapista, version 0.1.3. You can now load a graph from a local path or from a remote URL on an HTTP(S) server and update it:

from snapista import read_file
from snapista import Operator

g = read_file("https://gist.githubusercontent.com/fabricebrito/fe7df152e9f0df3a3ff6d3974b87e9e2/raw/294b5d8fec9b2b1d4fdc7468611c9bb7756f9e7a/graph.xml")

g.view()

g.add_node(
        operator=Operator(
            "Read",
            formatName="DIMAP",
            file='a file',
        ),
        node_id="read",
    )

g.view()

I hope this helps!


Dear @fabricebrito

Thanks a lot for this update. I’d like to raise a question here.

Let’s say we have created the following graph:

g = read_file("https://gist.githubusercontent.com/fabricebrito/fe7df152e9f0df3a3ff6d3974b87e9e2/raw/294b5d8fec9b2b1d4fdc7468611c9bb7756f9e7a/graph.xml")

g.view()

g.add_node(
        operator=Operator(
            "Read",
            formatName="DIMAP",
            file='a file',
        ),
        node_id="read",
    )

g.add_node(
        operator=Operator("Apply-Orbit-File"),
        node_id="apply_orbit",
        source="read",
    )

g.view()

How could I access the Read node so that I can add a for loop to my Python script, in order to read multiple files and apply the operator(s) to each of them?

And the second question:

Do you have a cheat sheet, or might you create one, for updating the operators of the *.xml file without going back to SNAP?

@falahfakhri
I’d go for something like this:

from snapista import read_file
from snapista import Operator

g = read_file("https://gist.githubusercontent.com/fabricebrito/fe7df152e9f0df3a3ff6d3974b87e9e2/raw/294b5d8fec9b2b1d4fdc7468611c9bb7756f9e7a/graph.xml")

for myfile in ['filea', 'fileb']:

    g.add_node(
            operator=Operator(
                "Read",
                formatName="DIMAP",
                file=myfile,
            ),
            node_id="read",
        )

    g.view()
    
    g.run() 

A second suggestion is to use CWL and Docker as explained here: https://github.com/snap-contrib/cwl-snap-graph-runner

This approach is convenient for batch processing files against an existing graph file. It’s completely detached from snapista as it uses a SNAP Docker image. This allows running SNAP against local EO data without installing SNAP.


@fabricebrito

Thanks a lot for the clarifications. Since I, and probably other colleagues, have many questions, and I’m not sure many researchers know about this great code, I have some suggestions that I hope you have time to take into account.

What do you think about holding two or three hours of webinars, for instance one hour each week, covering the following:

First: snapista installation under Windows 10
snapista installation under Linux

Second: a worked processing example on different data
Read the data, apply some operators, write the data

I think this is the best way to cut down on silly questions like mine and to give people plenty of room to share their suggestions. It would also serve as a reference for your script.

You could take and reproduce any two projects from RUS Copernicus, for instance, one about S1 and the other about S2.

I hope you are able to find time in your schedule to implement this suggestion.

Do you have a cheat sheet, or might you create one, for updating the operators of the *.xml file without going back to SNAP?

Hello @pavithra

Here’s the link to the documentation: https://snap-contrib.github.io/snapista/gettingstarted/#load-an-existing-graph


@fabricebrito, the snapista package looks very promising! Kudos for such nice work! :wink: :love_you_gesture: :muscle:

I have been using it recently and it makes life easier!
Looking forward to seeing its full potential!


@fabricebrito, very glad to see a promising solution for combining SNAP and Python, looking forward to trying it!

I was just wondering about the performance settings: how are they configured within snapista?

Thanks for your contribution!

PS: I’m using something like:

    gpt_cli = ['gpt',
               graph_path,
               '-q', str(MAX_CORES),  # Maximum parallelism
               '-J-Xms2G',  # Initially allocated memory
               '-J-Xmx{}'.format(bytes2snap(MAX_MEM)),  # Maximum allocated memory
               '-J-Dsnap.log.level=WARNING',
               '-J-Dsnap.jai.defaultTileSize={}'.format(TILE_SIZE),  # Tile size, set to 4096 or lower for ESD operator
               '-J-Dsnap.dataio.reader.tileWidth={}'.format(TILE_SIZE),
               '-J-Dsnap.dataio.reader.tileHeight={}'.format(TILE_SIZE),
               '-J-Dsnap.jai.prefetchTiles=true',
               '-c', '{}'.format(bytes2snap(0.75 * MAX_MEM)),  # Tile cache, up to 75% of max memory
               # '-x', # Clears the internal tile cache after writing a complete row to the target file
               *other_args]
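
The gpt_cli snippet above calls a bytes2snap helper that is not shown in the post. A minimal sketch of what such a helper might look like, assuming it turns a byte count into a size string with an M suffix as accepted by the JVM memory options and the gpt cache option (the real helper may well differ):

```python
def bytes2snap(n_bytes):
    """Format a byte count as a whole number of megabytes, e.g. '2048M'.

    Assumed behaviour: fractions of a megabyte are dropped and the
    result is never smaller than '1M'.
    """
    megabytes = max(1, int(n_bytes) // (1024 * 1024))
    return "{}M".format(megabytes)


MAX_MEM = 8 * 1024 ** 3  # 8 GiB, an example value
max_heap = "-J-Xmx{}".format(bytes2snap(MAX_MEM))
tile_cache = bytes2snap(0.75 * MAX_MEM)
```

Working in megabytes keeps the 75% tile cache value a whole number even when the memory limit is not a multiple of a gigabyte.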

Snapista is a Python wrapper and has the same performance as the Java gpt.