GPU vs CPU: how do they divide the work?

The basic components of an NLE are:

1. Media files stored on an external HD
- Media access speed is determined by HD speed, RAID level and connection speed to the workstation.
2. CPU: decodes, encodes and calculates non-accelerated effects.
3. GPU: accelerates the following effects and transitions:
Alpha Adjust, Basic 3D, Black & White, Brightness & Contrast
Color Balance (RGB), Color Pass (Windows only)
Color Replace, Crop, Drop Shadow, Edge Feather, Eight- & Four-Point Garbage Matte
Extract, Fast Color Corrector, Gamma Correction, Garbage Matte (4, 8, 16)
Gaussian Blur, Horizontal Flip, Levels, Luma Corrector, Luma Curve, Noise
Proc Amp, RGB Curves, RGB Color Corrector, Sharpen
Sixteen-Point Garbage Matte, Three-way Color Corrector, Timecode
Tint, Track Matte Key, Ultra Keyer, Video Limiter, Vertical Flip
Cross Dissolve, Dip to Black, Dip to White, Directional Blur, Fast Blur, Invert, Additive Dissolve, Film Dissolve, Warp Stabilizer
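
As a rough illustration of the split described above, here is a minimal Python sketch that sorts a clip's effect stack into GPU-accelerated and CPU-only effects. The set holds only a small sample of the names from the list, and the function name and the third-party effect are made up for the example; this is not how Premiere implements it internally.

```python
# Small sample of the GPU-accelerated effects listed above (not the full list).
GPU_ACCELERATED = {
    "Gaussian Blur",
    "Three-way Color Corrector",
    "Ultra Keyer",
    "Cross Dissolve",
    "Warp Stabilizer",
}


def split_effects(effect_stack):
    """Partition effect names into (gpu, cpu) lists, keeping their order."""
    gpu, cpu = [], []
    for name in effect_stack:
        # Anything not explicitly written for the GPU stays on the CPU.
        (gpu if name in GPU_ACCELERATED else cpu).append(name)
    return gpu, cpu


# "Some Third-Party Glow" is a hypothetical non-accelerated effect.
gpu_effects, cpu_effects = split_effects(
    ["Gaussian Blur", "Ultra Keyer", "Some Third-Party Glow"]
)
print("GPU:", gpu_effects)
print("CPU:", cpu_effects)
```

The point of the sketch: a single non-accelerated effect in the stack still lands on the CPU, no matter how fast the video card is.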

7 steps to render or play back a frame:
1. Frame is fetched from disk.
2. CPU decodes the frame (the codec has to be properly installed).
3. Frame is converted to an intermediate format and put into memory.
4. If GPU acceleration can be used, the intermediate format is uploaded to the video card for processing.
5. The intermediate results are then downloaded from the video card back into memory.
6. The CPU starts encoding to the final delivery format.
7. The final results are written to disk or played on screen.
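
The steps above can be sketched as a simple list of stages. This is only a toy model in Python: the function name and stage labels are invented for illustration, and the CPU-only branch (effects calculated on the CPU when no GPU acceleration applies) is an assumption based on the component list earlier.

```python
def frame_stages(gpu_accelerated):
    """Return the ordered stages one frame goes through.

    gpu_accelerated: True if the frame's effects can run on the video card.
    """
    stages = [
        "fetch frame from disk",
        "CPU decodes frame",
        "convert to intermediate format in memory",
    ]
    if gpu_accelerated:
        # The round trip to the video card only happens for accelerated work.
        stages.append("upload to GPU and process")
        stages.append("download results back to memory")
    else:
        # Assumption: non-accelerated effects are calculated on the CPU.
        stages.append("CPU calculates effects")
    stages.append("CPU encodes to delivery format")
    stages.append("write to disk or play on screen")
    return stages


for number, stage in enumerate(frame_stages(gpu_accelerated=True), start=1):
    print(number, stage)
```

Note that the CPU sits at both ends of the chain (decode and encode) in either case, which is why a faster GPU alone does not speed up every frame.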

A video file gets decoded on the CPU. Decode performance is determined by how well the codec is optimized and whether it can offload tasks to the GPU. A lot of codecs on PC call the QT32 process, which runs in a 4 GB memory space.
ProRes playback is not GPU accelerated. More cores matter more than multiple GPUs, because not all instructions have been written to run outside the CPU. It’s a common misconception that adding a GPU will automatically take over and run all the regular instructions from the CPU. Software engineers must write applications to send instructions to a specific processor before they can expect speed improvements.
A typical system with eight cores needs 3 GB per core, which equals 24 GB of RAM required by After Effects to run multiprocessing properly. Although that fits into a 32 GB computer, After Effects may still reserve as much as 6 GB of RAM for other applications, bringing the RAM left for After Effects down to about 26 GB, which is already close to the 24 GB it needs.
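
Plugging the figures quoted above into a few lines of Python makes the margin easy to see. The variable names are made up for this example, and the numbers are just the ones from the paragraph (8 cores, 3 GB per core, 32 GB installed, up to 6 GB reserved for other applications).

```python
cores = 8
gb_per_core = 3
installed_gb = 32
reserved_for_other_apps_gb = 6

# RAM After Effects wants for multiprocessing on all cores.
required_gb = cores * gb_per_core

# RAM actually left for After Effects after the reservation.
available_gb = installed_gb - reserved_for_other_apps_gb

# What remains once multiprocessing has taken its share.
headroom_gb = available_gb - required_gb

print("required:", required_gb, "GB")    # required: 24 GB
print("available:", available_gb, "GB")  # available: 26 GB
print("headroom:", headroom_gb, "GB")    # headroom: 2 GB
```

With only a couple of GB to spare, going to more cores or a larger reservation for other applications would push the same machine past its 32 GB.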

This applies to Windows/PC only:
Below: playback the way it should be, with low CPU load and disk I/O at the bit rate of the codec. Note the files on the timeline being read in the Disk I/O list.
[Screenshot: 1AVID_AVID-MXF_PLAY_NO_FILTER]
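
To get a feel for what "disk I/O at the bit rate of the codec" means in practice, here is a small Python sketch that converts a stream bit rate into the sustained read speed the drive has to deliver. The 145 Mbit/s figure is only an assumed example value, not a measurement from the screenshot, and the function name is hypothetical.

```python
def required_disk_mb_per_s(bitrate_mbit_s, streams=1):
    """Sustained read rate in MB/s for a number of simultaneous streams.

    Uses 8 bits per byte and decimal megabytes; ignores file system overhead.
    """
    return bitrate_mbit_s * streams / 8


# Assumed example: one and three streams at 145 Mbit/s each.
print(required_disk_mb_per_s(145))     # 18.125
print(required_disk_mb_per_s(145, 3))  # 54.375
```

If the Disk I/O list shows reads far above this kind of figure for the codec in use, something other than plain playback is hitting the drive.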
Now look at what happens when playing back a ProRes QuickTime file in Premiere on a PC:
1. CPU cycles taken up by Premiere
2. CPU cycles taken up by the QT32 process
3-8. The files on disk being read by the QT32 process, including:
9. pagefile.sys, the page file, which the process has to keep writing to and reading from the drive to compensate for the 4 GB memory limit.
[Screenshot: PR_PRO_PLAY_NO_FILTER_pagefile]
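
A back-of-the-envelope calculation in Python shows how quickly a 4 GB memory space fills up with decoded video. The frame size and pixel format below are assumptions chosen for illustration (1920x1080 at 4 bytes per pixel), and the 4 GB limit from the text is taken as 2**32 bytes; the real process also needs room for its own code and buffers, so the true number is lower.

```python
ADDRESS_SPACE_BYTES = 2 ** 32  # the 4 GB limit, taken as 4 GiB

# Assumed decoded frame: 1920x1080 pixels, 4 bytes per pixel.
width, height, bytes_per_pixel = 1920, 1080, 4
frame_bytes = width * height * bytes_per_pixel

# Upper bound on decoded frames that fit before paging must start.
max_frames = ADDRESS_SPACE_BYTES // frame_bytes

print("bytes per frame:", frame_bytes)
print("frames that fit in 4 GB:", max_frames)
print("seconds at 25 fps:", max_frames / 25)
```

Under these assumptions only a few hundred frames fit, which is consistent with the page file being hit constantly during playback.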