Blender
V3.3
During the preparation of the execution, every ReadBufferOperation receives an offset. This offset is used during execution as an optimization trick. Next, all operations are initialized for execution.
Render priority is a priority of an output node. A user needs different priorities for output nodes during rendering than during editing. For example, the active ViewerNode has top priority during editing, but during rendering a CompositeNode has. Every NodeOperation has a render-priority setting, but it only has effect for output NodeOperations. In ExecutionSystem.execute all priorities are checked. For every priority, the ExecutionGroups are checked to see whether their priority matches. When it matches, the ExecutionGroup is executed (this happens serially).
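The priority loop described above can be sketched as follows. This is a minimal illustration, not the Blender implementation: the `Priority` enum with three levels, the `Group` struct and `execute_by_priority` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not the real Blender classes.
enum class Priority { High, Medium, Low };

struct Group {
  std::string name;
  Priority render_priority;
};

// Walks the priority levels from highest to lowest and "executes" every
// group whose priority matches. Returns the names in execution order.
std::vector<std::string> execute_by_priority(const std::vector<Group> &groups)
{
  std::vector<std::string> executed;
  const Priority levels[] = {Priority::High, Priority::Medium, Priority::Low};
  for (const Priority level : levels) {
    for (const Group &group : groups) {
      if (group.render_priority == level) {
        executed.push_back(group.name);  // Serial: one group after the other.
      }
    }
  }
  return executed;
}
```

Because the outer loop runs over priority levels, a group with a lower priority never starts before all groups of a higher priority have run, regardless of the order in which the groups are stored.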
When an ExecutionGroup is executed, the order of its chunks is determined first. The settings are stored in the ViewerNode inside the ExecutionGroup. ExecutionGroups that have no ViewerNode fall back to a default. There are several possible chunk orders.
Once the chunk order is determined, the first few chunks are checked to see whether they can be scheduled. Each chunk is in one of three states.
An ExecutionGroup can depend on other ExecutionGroups. Data passed from one ExecutionGroup to another is stored in 'chunks'. If not all input chunks are available, the chunk execution will not be scheduled.
+-------------------------------------+              +--------------------------------------+
| ExecutionGroup A                    |              | ExecutionGroup B                     |
| +----------------+  +-------------+ |              | +------------+   +-----------------+ |
| | NodeOperation a|  | WriteBuffer | |              | | ReadBuffer |   | ViewerOperation | |
| |                *==* Operation   | |              | | Operation  *===*                 | |
| |                |  |             | |              | |            |   |                 | |
| +----------------+  +-------------+ |              | +------------+   +-----------------+ |
|                                |    |              |   |                                  |
+--------------------------------|----+              +---|----------------------------------+
                                 |                       |
                                 |                       |
                               +---------------------------+
                               | MemoryProxy               |
                               | +----------+  +---------+ |
                               | | Chunk a  |  | Chunk b | |
                               | |          |  |         | |
                               | +----------+  +---------+ |
                               |                           |
                               +---------------------------+
In the above example ExecutionGroup B has an output operation (ViewerOperation) and is being executed. The first chunk is evaluated [ExecutionGroup.schedule_chunk_when_possible], but not all input chunks are available. The relevant ExecutionGroup (the one that can calculate the missing chunks; ExecutionGroup A) is asked to calculate the area ExecutionGroup B is missing [ExecutionGroup.schedule_area_when_possible]. ExecutionGroup A checks which chunks the area spans and tries to schedule these chunks. If all input data is available, these chunks are scheduled [ExecutionGroup.schedule_chunk].
+-------------------------+        +----------------+                              +----------------+
| ExecutionSystem.execute |        | ExecutionGroup |                              | ExecutionGroup |
+-------------------------+        | (B)            |                              | (A)            |
           O                       +----------------+                              +----------------+
           O                                |                                               |
           O       ExecutionGroup.execute   |                                               |
           O------------------------------->O                                               |
           .                                O                                               |
           .                                O-------\                                       |
           .                                .       | ExecutionGroup.schedule_chunk_when_possible
           .                                .  O----/ (*)                                   |
           .                                .  O                                            |
           .                                .  O                                            |
           .                                .  O ExecutionGroup.schedule_area_when_possible |
           .                                .  O------------------------------------------->O
           .                                .  .                                            O----------\ ExecutionGroup.schedule_chunk_when_possible
           .                                .  .                                            .          | (*)
           .                                .  .                                            .  O-------/
           .                                .  .                                            .  O
           .                                .  .                                            .  O
           .                                .  .                                            .  O-------\ ExecutionGroup.schedule_chunk
           .                                .  .                                            .  .       |
           .                                .  .                                            .  .  O----/
           .                                .  .                                            .  O<=O
           .                                .  .                                            O<=O
           .                                .  .                                            O
           .                                .  O<===========================================O
           .                                .  O                                            |
           .                                O<=O                                            |
           .                                O                                               |
           .                                O                                               |
This continues until all chunks of ExecutionGroup B have finished executing or the user breaks the process.
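The scheduling handshake between the two groups can be sketched with a toy model. This is only an illustration under stated assumptions: chunks are numbered in one dimension, chunk i of a group needs exactly chunk i of its input group, and the `ChunkState` names, the `Group` struct and `run_scheduled` are hypothetical. Only the method names `schedule_chunk_when_possible` and `schedule_area_when_possible` come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical state names for this sketch.
enum class ChunkState { NotScheduled, Scheduled, Executed };

struct Group {
  Group *input;                    // Group that produces this group's input, if any.
  std::vector<ChunkState> chunks;  // One state per chunk.

  explicit Group(std::size_t chunk_count, Group *input_group = nullptr)
      : input(input_group), chunks(chunk_count, ChunkState::NotScheduled)
  {
  }

  // Returns true only when the chunk has already been executed.
  bool schedule_chunk_when_possible(std::size_t index)
  {
    assert(index < chunks.size());
    if (chunks[index] == ChunkState::Executed) {
      return true;
    }
    if (chunks[index] == ChunkState::Scheduled) {
      return false;
    }
    // Toy dependency: chunk i needs chunk i of the input group.
    const bool inputs_ready = (input == nullptr) ||
                              input->schedule_area_when_possible(index, index);
    if (inputs_ready) {
      chunks[index] = ChunkState::Scheduled;
    }
    return false;
  }

  // Returns true when every chunk in [first, last] has been executed.
  bool schedule_area_when_possible(std::size_t first, std::size_t last)
  {
    bool all_executed = true;
    for (std::size_t i = first; i <= last; i++) {
      if (!schedule_chunk_when_possible(i)) {
        all_executed = false;
      }
    }
    return all_executed;
  }

  // Stand-in for the worker threads: finishes every scheduled chunk.
  void run_scheduled()
  {
    for (ChunkState &state : chunks) {
      if (state == ChunkState::Scheduled) {
        state = ChunkState::Executed;
      }
    }
  }
};
```

With `Group a(2)` and `Group b(2, &a)`, the first call to `b.schedule_chunk_when_possible(0)` only schedules chunk 0 of `a`. After `a.run_scheduled()`, a second call schedules chunk 0 of `b` itself, which mirrors the repeated polling shown in the diagram.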
A NodeOperation such as the ScaleOperation can influence the area of interest by reimplementing the [NodeOperation.determine_area_of_interest] method.
+--------------------------+                               +---------------------------------+
| ExecutionGroup A         |                               | ExecutionGroup B                |
|                          |                               |                                 |
+--------------------------+                               +---------------------------------+
          Needed chunks from ExecutionGroup A               |   Chunk of ExecutionGroup B (to be evaluated)
            +-------+ +-------+                             |                  +--------+
            |Chunk 1| |Chunk 2|               +----------------+               |Chunk 1 |
            |       | |       |               | ScaleOperation |               |        |
            +-------+ +-------+               +----------------+               +--------+

            +-------+ +-------+
            |Chunk 3| |Chunk 4|
            |       | |       |
            +-------+ +-------+
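To make the effect of a scaling operation on the area of interest concrete, here is a small sketch. It assumes a uniform scale around the origin and a half-open pixel rectangle. The `Rect` type, the free function `determine_area_of_interest` and `count_spanned_chunks` are hypothetical simplifications, not the signatures used in the compositor.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical rectangle: min is inclusive, max is exclusive, in pixels.
struct Rect {
  int xmin, xmax, ymin, ymax;
};

// Input area needed to produce `output_area` when the operation scales its
// input by `scale` around the origin (assumption of this sketch).
Rect determine_area_of_interest(const Rect &output_area, float scale)
{
  assert(scale > 0.0f);
  Rect input_area;
  input_area.xmin = static_cast<int>(std::floor(output_area.xmin / scale));
  input_area.xmax = static_cast<int>(std::ceil(output_area.xmax / scale));
  input_area.ymin = static_cast<int>(std::floor(output_area.ymin / scale));
  input_area.ymax = static_cast<int>(std::ceil(output_area.ymax / scale));
  return input_area;
}

// Number of square chunks of size `chunk_size` that `area` overlaps.
int count_spanned_chunks(const Rect &area, int chunk_size)
{
  assert(chunk_size > 0 && area.xmin >= 0 && area.ymin >= 0);
  assert(area.xmax > area.xmin && area.ymax > area.ymin);
  const int columns = (area.xmax - 1) / chunk_size - area.xmin / chunk_size + 1;
  const int rows = (area.ymax - 1) / chunk_size - area.ymin / chunk_size + 1;
  return columns * rows;
}
```

For a 256 by 256 output chunk at the origin and a scale of 0.5, the input area is 512 by 512 pixels, which spans four chunks of size 256. That matches the situation in the diagram, where one chunk of ExecutionGroup B needs four chunks of ExecutionGroup A.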
The WorkScheduler is implemented as a static class. Its responsibility is to balance WorkPackages across the available and free devices. The work-scheduler can work in 2 states. To switch between these states you need to recompile Blender.
By default the work-scheduler places all work as WorkPackages in a queue. For every CPU core a worker thread is created. These worker threads ask the WorkScheduler whether there is work for a specific Device. The work-scheduler finds work for the device, and the device is asked to execute the WorkPackage.
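The queue-based model can be sketched with standard C++ threads. This is a simplified illustration, not the WorkScheduler code: `ToyScheduler`, `run_workers` and the one-field `WorkPackage` are hypothetical, all work is queued before the workers start, and "executing" a package only increments a counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct WorkPackage {
  int chunk_number;
};

// Hypothetical thread-safe work queue.
class ToyScheduler {
  std::mutex mutex_;
  std::queue<WorkPackage> queue_;

 public:
  void schedule(const WorkPackage &package)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push(package);
  }

  // Hands out the next package; returns false when no work is left.
  bool pop(WorkPackage &r_package)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (queue_.empty()) {
      return false;
    }
    r_package = queue_.front();
    queue_.pop();
    return true;
  }
};

// Starts one worker per hardware thread and returns how many packages ran.
int run_workers(ToyScheduler &scheduler)
{
  std::atomic<int> executed{0};
  const unsigned int worker_count = std::max(1u, std::thread::hardware_concurrency());
  std::vector<std::thread> workers;
  for (unsigned int i = 0; i < worker_count; i++) {
    workers.emplace_back([&scheduler, &executed]() {
      WorkPackage package{};
      while (scheduler.pop(package)) {
        executed++;  // A real device would calculate the chunk here.
      }
    });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
  return executed.load();
}
```

Each worker pulls work instead of having work pushed to it, so a slow chunk on one thread does not hold up the others. The mutex guarantees that every package is handed out exactly once.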
For debugging purposes multi-threading can be disabled. This is done by changing COM_threading_model to ThreadingModel::SingleThreaded. When compiled this way, the work-scheduler supports no threading and runs everything on the CPU.
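A compile-time switch of this kind could look like the sketch below. The names `COM_threading_model` and `ThreadingModel::SingleThreaded` are taken from the text. The `Queue` enumerator, the `constexpr` function form and `schedule_path` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// `Queue` is an assumed name for the default, multi-threaded model.
enum class ThreadingModel { SingleThreaded, Queue };

// Edit the return value and recompile to switch models.
constexpr ThreadingModel COM_threading_model()
{
  return ThreadingModel::SingleThreaded;
}

// Describes which path a scheduled package would take.
std::string schedule_path()
{
  if (COM_threading_model() == ThreadingModel::SingleThreaded) {
    return "execute on the calling thread";
  }
  return "queue for worker threads";
}
```

Because the model is a compile-time constant, the compiler can discard the unused branch, which is also why switching requires a rebuild.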
A Device within the compositor context is a hardware component that can be used to calculate chunks. Each chunk is encapsulated in a WorkPackage. The WorkScheduler controls the devices and selects the device on which a WorkPackage will be calculated.
The WorkScheduler controls all Devices. When the compositor is initialized, the WorkScheduler selects all devices that will be used during compositing. There are two types of Devices, CPUDevice and OpenCLDevice. When an ExecutionGroup schedules a chunk, the schedule method of the WorkScheduler is called. The WorkScheduler determines whether the chunk can be run on an OpenCLDevice (and whether an OpenCLDevice is available). If so, the chunk is added to the work-list for OpenCLDevices; otherwise the chunk is added to the work-list of CPUDevices.
A thread reads the work-list and sends a WorkPackage to its device.
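The choice between the two work-lists can be sketched as follows. This is a minimal illustration under assumptions: `DeviceRouter`, its fields and the `opencl_capable` flag are hypothetical names, and the real scheduler decides OpenCL capability from the operations inside the ExecutionGroup rather than from a flag on the package.

```cpp
#include <cassert>
#include <vector>

struct WorkPackage {
  int chunk_number;
  bool opencl_capable;  // Hypothetical flag: can this chunk run on OpenCL?
};

// Hypothetical stand-in for the device selection in the scheduler.
struct DeviceRouter {
  int opencl_device_count = 0;
  std::vector<WorkPackage> cpu_work_list;
  std::vector<WorkPackage> opencl_work_list;

  void schedule(const WorkPackage &package)
  {
    assert(opencl_device_count >= 0);
    if (package.opencl_capable && opencl_device_count > 0) {
      opencl_work_list.push_back(package);
    }
    else {
      cpu_work_list.push_back(package);  // Fallback: always possible.
    }
  }
};
```

A chunk that could run on OpenCL still lands on the CPU work-list when no OpenCL device was selected at initialization, so the CPU list acts as the universal fallback.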
When a CPUDevice gets a WorkPackage, the Device gets the input buffer that is needed to calculate the chunk. Allocation has already been done by the ExecutionGroup. The output buffer of the chunk is created. The OutputOperation of the ExecutionGroup is then called to execute the area of the output buffer.
To be completed!
Finally the last step, the node functionality :)