Potential fix/work-around for the problems observed in these posts:
This issue has been driving me nuts. Neither I nor anyone else could find a way to have snappy process multiple images one after the other without it consuming all of the memory on the machine and never releasing it.
After a lot of digging around I stumbled across the subprocess module in Python. Effectively, you spawn a new Python process from a .py file, which runs and then terminates. That termination frees all the memory snappy was using, much like killing the script normally.
The line of code I use to spawn my processing pipeline is:
pipeline_out = subprocess.check_output(['python', 'src/SarPipeline.py', location_wkt], stderr=subprocess.STDOUT)
Note: pipeline_out is the STDOUT from the script, so in my case, to find out which file has just been processed, I have print("filepath: " + path_to_file) in src/SarPipeline.py. That lets me traverse pipeline_out, extract the line that begins with filepath:, and then serve that file via my Flask API.
This is by no means ideal or pretty, but it works. I can call my SAR API as many times as I like and the memory usage always drops back to next to zero.
Main issues now:
- Needs a load of extra error handling and code to find the desired output lines (some of this is sketched after this list)
- You lose the console output from snappy in the logs/when following the process on the command line
- You must reference the file you're spawning in the subprocess relative to where the parent script was run from (the sketch after this list shows one way around that)
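For the first and third points, here's a minimal sketch of one way to tidy things up (not my actual code; SCRIPT_PATH and run_sar_pipeline are illustrative names, and the path join assumes the parent script sits next to the src/ folder): it wraps the call in error handling and resolves SarPipeline.py relative to the parent script's own file instead of the working directory.

```python
import os
import subprocess

# Resolve SarPipeline.py relative to this file, not the current working
# directory, so the parent can be launched from anywhere.
SCRIPT_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                           'src', 'SarPipeline.py')

def run_sar_pipeline(location_wkt):
    try:
        out = subprocess.check_output(
            ['python', SCRIPT_PATH, location_wkt],
            stderr=subprocess.STDOUT,
        )
    except subprocess.CalledProcessError as err:
        # The child died: err.output holds everything it printed, which is
        # the only place the snappy error messages end up.
        raise RuntimeError(
            'SarPipeline.py failed:\n' + err.output.decode('utf-8', 'replace')
        ) from err
    return out.decode('utf-8')
```

For the second point, one option would be to switch from check_output to subprocess.Popen and read the child's stdout line by line, echoing each line to your own logger as it arrives, but I haven't tried that here.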
Sorry for the huge post!
Hopefully this will help alleviate any memory/caching issues people are having when trying to provide a service utilising snappy or a pipeline that processes multiple images.
Please feel free to reply to this/message me with any concerns!
Ciaran