Node order in graph files

Is the order in which nodes are defined inside a graph XML file relevant?
Does it affect processing performance in any way, or could it have other implications?
Furthermore, is there a maximum number of nodes that can be defined in a single graph?

The order of the nodes in the XML does not matter. The order of
connections within the graph, of course, does. Performance can be
affected when you have a very large chain of operators, depending on the
operations and the size of your tile cache.
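
For example, in a graph like the hypothetical one below, the Write node is defined before its own source node in the file, yet the `refid` connections still give the chain read → subset → write. Node ids and parameter values are just for illustration:

```xml
<graph id="example">
  <version>1.0</version>

  <!-- Defined before its source node "subset": the refid connections,
       not the file order, define the processing chain. -->
  <node id="write">
    <operator>Write</operator>
    <sources>
      <sourceProduct refid="subset"/>
    </sources>
    <parameters>
      <file>output.dim</file>
    </parameters>
  </node>

  <node id="subset">
    <operator>Subset</operator>
    <sources>
      <sourceProduct refid="read"/>
    </sources>
    <parameters>
      <region>0,0,1000,1000</region>
    </parameters>
  </node>

  <node id="read">
    <operator>Read</operator>
    <parameters>
      <file>input.dim</file>
    </parameters>
  </node>
</graph>
```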

Hi @lveci,
could you comment on what is best practice when using graph files with a large chain? We did notice that there is intelligence built into the GPF which ensures, for example, that if you have a SubsetOp near the end of your chain, the preceding nodes only process the data within the specified subset. Could you give a high-level explanation of the mechanism behind this? We'd like to make sure that we don't have unnecessary performance loss due to non-optimal ordering within our graph files.

Thanks a lot in advance

SvH, it works on a pull model. Something at the end of the chain requests a tile to be processed, and the data is pulled through the chain based on the requests from each of the operators. If a subset operator requests only a part of the image, then the operators before it will also request only that part of the image, unless they need more for their own computation.

Where you could get into performance problems is when one operator requires all the data, such as a classifier or a statistics operator.
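
As an illustration of the pull model, here is a minimal operator sketch using the SNAP GPF `Operator` API. The operator itself (name, band, scaling) is hypothetical, not a shipped one; the point is the `computeTile`/`getSourceTile` pattern:

```java
import java.awt.Rectangle;

import com.bc.ceres.core.ProgressMonitor;
import org.esa.snap.core.datamodel.Band;
import org.esa.snap.core.datamodel.Product;
import org.esa.snap.core.datamodel.ProductData;
import org.esa.snap.core.gpf.Operator;
import org.esa.snap.core.gpf.OperatorException;
import org.esa.snap.core.gpf.Tile;
import org.esa.snap.core.gpf.annotations.SourceProduct;

/**
 * Hypothetical pixel-wise operator illustrating the pull model:
 * computeTile is invoked only for the tiles a downstream node requests,
 * and it pulls just the matching rectangle from its source.
 */
public class ScaleOp extends Operator {

    @SourceProduct
    private Product source;

    @Override
    public void initialize() throws OperatorException {
        // Target product mirrors the source dimensions.
        Product target = new Product("scaled", "scaled",
                source.getSceneRasterWidth(), source.getSceneRasterHeight());
        target.addBand("band_1", ProductData.TYPE_FLOAT32);
        setTargetProduct(target);
    }

    @Override
    public void computeTile(Band targetBand, Tile targetTile, ProgressMonitor pm) {
        // Only the rectangle requested from downstream (e.g. by a
        // Subset node) arrives here.
        Rectangle rect = targetTile.getRectangle();

        // Pull exactly that region from the source. The request propagates
        // upstream, so earlier operators also compute only this rectangle.
        // An operator that needs the whole image (a classifier, a statistics
        // operator) would instead request the full scene rectangle here,
        // which is where performance suffers.
        Tile sourceTile = getSourceTile(source.getBand("band_1"), rect);

        for (int y = rect.y; y < rect.y + rect.height; y++) {
            for (int x = rect.x; x < rect.x + rect.width; x++) {
                // Hypothetical per-pixel operation: scale by 2.
                targetTile.setSample(x, y, 2.0 * sourceTile.getSampleDouble(x, y));
            }
        }
    }
}
```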

Hi lveci,
I have the following follow-up question.

Assume I want to cut 100 small sections from a product, which are clustered together in 10 clusters of 10 sections each. Also assume the sections in a single cluster are so close together that they fit on one of your internal tiles. Under these circumstances, would it matter in what order the overall 100 sections are pulled through the processing chain? In other words, does each operator keep track of which tiles were already computed, or could it happen in a large graph that tiles are computed twice?

When a tile is processed, it's put in a tile cache. If other operators in the graph need the same tile area, they first look in the tile cache. If the tile cache gets full, older tiles will be removed.
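
In other words, the behaviour is essentially that of an LRU cache, so a tile can end up being computed twice if it was evicted before being reused. A toy sketch of that behaviour (not SNAP's actual cache implementation):

```java
import java.awt.Rectangle;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy illustration of the behaviour described above: computed tiles are
 * stored keyed by band and tile region, lookups are served from the cache
 * when possible, and the least recently used tile is evicted once the
 * capacity is exceeded (after which it would have to be recomputed).
 */
public class ToyTileCache {

    private final Map<String, float[]> tiles;

    public ToyTileCache(final int maxTiles) {
        // accessOrder=true turns the map into an LRU structure.
        this.tiles = new LinkedHashMap<String, float[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, float[]> eldest) {
                // Evict the least recently used tile once the cache is full.
                return size() > maxTiles;
            }
        };
    }

    /** Return the cached tile, or compute and cache it on a miss. */
    public float[] getTile(String bandName, Rectangle region) {
        String key = bandName + ":" + region;
        return tiles.computeIfAbsent(key, k -> computeTile(region));
    }

    private float[] computeTile(Rectangle region) {
        // Stand-in for the real (expensive) tile computation.
        return new float[region.width * region.height];
    }
}
```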

Is there possibly a way to control the size of the TileCache? I can imagine it could have a significant effect on our overall processing time.
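
In case it helps later readers: SNAP's tile cache builds on JAI, so in Java code the capacity can be set on the default JAI instance, as sketched below. If I remember correctly, gpt also accepts a `-c <cache-size>` option on the command line, and there is a `snap.jai.tileCacheSize` property in snap.properties; treat those exact names as assumptions to verify.

```java
import javax.media.jai.JAI;

public class TileCacheConfig {
    public static void main(String[] args) {
        // Set the default JAI tile cache capacity to 2 GiB.
        long capacityBytes = 2048L * 1024 * 1024;
        JAI.getDefaultInstance().getTileCache().setMemoryCapacity(capacityBytes);

        System.out.println("Tile cache capacity: "
                + JAI.getDefaultInstance().getTileCache().getMemoryCapacity()
                + " bytes");
    }
}
```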