I’ve begun to use a lot more subframes (1000s) to maximize my efficiency in my heavily light-polluted area and have really started to notice the slow read speeds of the SGP grader. I have one of the faster NVMe SSDs out there (a Samsung 970 PRO) and I still only see 32 MB/s read speeds while the grader is running. I also have an Intel 660P, which has about a quarter of the random read throughput, and it shows exactly the same performance within SGP. This works out to right about 256 Mbps, so I suspect there’s something coded in SGP to limit disk IO/bandwidth. Is anyone else able to get more than 32 MB/s read?
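For reference, the conversion behind that figure is just a factor of eight (bits per byte). A tiny sketch; the function name is made up for illustration:

```python
def mb_per_s_to_mbps(mb_per_s):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mb_per_s * 8


observed_mb_per_s = 32  # read rate seen while the grader is running
print(f"{observed_mb_per_s} MB/s = {mb_per_s_to_mbps(observed_mb_per_s)} Mbps")
```

The round number is what caught my eye, but on its own it does not prove a limit is coded anywhere.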
I think your analysis is overlooking that after each image is loaded it must be analysed, and I suspect this analysis is quite CPU intensive.
I cannot see why the SGP devs would go to the trouble of writing a grader ‘speed limiter’. More likely, in my opinion (for what it’s worth), the design and implementation choices made by the SGP devs were simply not optimal for grading images at the highest achievable rate.
For comparison I ran Image Grader on a folder containing 80 FITS image files of 17.95 KB each, using an Intel NUC model with i5-8259U processor and M2-PCIe SSD.
a) With a CPU intensive task running in the background, SGP image grader took approx 4 mins on the task
b) With the background CPU task suspended, the same analysis took approx 2min 20s.
As an alternative, if you already use ASTAP for platesolving, you might find it useful to explore its image analysis capabilities: see the Stack tab, Analyse and organise images button.
On the same NUC with the background CPU task suspended as in b) above, ASTAP analysed the same folder of images in c. 90 seconds. And the image analysis is a little more comprehensive than Image Grader.
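To put those three runs on a common footing, here is a rough per-file breakdown. It uses only the approximate timings quoted above for the 80-file folder; the labels and the helper name are mine, purely for illustration:

```python
FILES = 80  # number of FITS files in the test folder

# Approximate wall-clock times from runs a), b) and the ASTAP run, in seconds.
runs = {
    "SGP, background CPU task running": 4 * 60,
    "SGP, background task suspended": 2 * 60 + 20,
    "ASTAP, background task suspended": 90,
}


def seconds_per_file(total_s, files=FILES):
    """Average time spent on each file for a run of total_s seconds."""
    return total_s / files


for label, total_s in runs.items():
    print(f"{label}: {seconds_per_file(total_s):.2f} s per file")
```

That comes to roughly 3 s, 1.75 s and a little over 1 s per file. The fact that a competing CPU load changes the result so much is what makes me think the grader is CPU bound rather than disk bound.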
While that’s a possibility, I don’t believe it’s the case. This workstation is dedicated to processing. Its only purpose in life is editing 8k video and working with complex CAD projects. All acquisition occurs on another system. Files are copied over, and then processed locally from the NVMe.
By the numbers:
The NVMe (Samsung 970 Pro) has 330 GB free of 476 GB (500 GB drive). NVMe drives can get slower when full; that is not the case here.
With SGP grading I show 10 GB of 64 GB DDR4 memory in use.
With SGP grading I show 11% of the i9-10900K utilized.
With SGP off I show 3-5% of the i9-10900K utilized.
There’s plenty of horsepower there, it’s just not being leveraged. The task does appear to be single threaded, however, and possibly prime for parallelization.
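A quick sanity check on the "single threaded" reading of those numbers. This assumes the utilization figure is a percentage of all logical processors, and that Hyper-Threading is on, so the 10-core i9-10900K shows up as 20 logical CPUs; the helper name is just for illustration:

```python
LOGICAL_CPUS = 20  # i9-10900K: 10 cores / 20 threads (assumes Hyper-Threading enabled)


def busy_threads(total_pct, idle_pct, logical_cpus=LOGICAL_CPUS):
    """Estimate how many logical CPUs' worth of load a task adds over idle."""
    return (total_pct - idle_pct) / 100 * logical_cpus


low = busy_threads(11, 5)   # measured against the 5% idle reading
high = busy_threads(11, 3)  # measured against the 3% idle reading

print(f"one fully busy thread = {100 / LOGICAL_CPUS:.0f}% of the whole CPU")
print(f"grading adds roughly {low:.1f} to {high:.1f} threads of load")
```

So the grader adds somewhere between one and two threads' worth of work, which is consistent with one busy worker thread plus a bit of UI and IO overhead.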
Happy to donate my machine “to the cause” if there’s anything the DEVs would like to test/beta test. I’ll also start testing other grading workflows with APP and Pixinsight as grading in SGP alone (811 objects at 31MB) adds about 20 minutes to the process.
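Working that out as an average rate, assuming the 31 MB is per file and taking "about 20 minutes" at face value (the function names are only for illustration):

```python
def average_throughput_mb_s(files, size_mb, elapsed_s):
    """Average data rate over the whole grading run, in MB/s."""
    return files * size_mb / elapsed_s


def seconds_per_file(files, elapsed_s):
    """Average wall-clock time spent per file."""
    return elapsed_s / files


files = 811
size_mb = 31         # assumed to be the size of each file
elapsed_s = 20 * 60  # "about 20 minutes"

print(f"total data: {files * size_mb} MB")
print(f"average throughput: {average_throughput_mb_s(files, size_mb, elapsed_s):.1f} MB/s")
print(f"time per file: {seconds_per_file(files, elapsed_s):.2f} s")
```

That is about 21 MB/s averaged over the run, or roughly 1.5 seconds per file, so the sustained rate sits somewhat below the 32 MB/s I see while it is reading.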
I am open to the idea that the recent widespread availability of large-sensor, high-speed CMOS cameras may be highlighting performance issues in the way SGP IG was implemented, but that is not the same as suggesting the devs deliberately throttled back performance. But maybe I’ve misinterpreted what you intended to say.
I don’t know how key to their way of working most SGP users would view image grader. Personally I use it only occasionally to weed out very obviously poor subs. So I’m not sure what priority the devs would put on further improving this function.
I’ve observed that on the occasions when I use different image processing tools I only rarely see 100% agreement as to, say, the worst 10% of images, so I personally prefer to allow the final selection to be made by my image stacking software.
That said, it seems to me that IG is an easily identifiable function that could be open to development by some enterprising person with appropriate knowledge and coding skills to provide a well-optimised, callable function in a similar manner to Guiding and PlateSolving.
Oh no no, I certainly didn’t mean to point the finger. What I was highlighting was that there may have been a product design choice in the past, made to ensure maximum compatibility with the wide range of machines out there, that is now limiting SGP’s access to actual system capabilities. Or it may have simply been overlooked as storage IO has drastically increased. I’m also in tech and we do this all the time. Sometimes decisions to single thread tasks are made because, yes, it’s easier, but sometimes we also do it to protect older systems from crashing. This is why programs like APP allow you to select the number of threads/cores the app utilizes. If you leave one for the OS, you’re almost guaranteed not to crash while granting more compute power to the application.
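To illustrate what I mean by a user-selectable thread count, here is a minimal sketch of a worker pool that defaults to all cores but one. This is emphatically not SGP's or APP's code: `grade_file` is a placeholder that just reads the file and checksums it as a stand-in for real star analysis, and `grade_all` and its parameters are names I made up.

```python
import os
import zlib
from concurrent.futures import ProcessPoolExecutor


def grade_file(path):
    """Placeholder for per-image analysis: read the file, then do some CPU work.

    A real grader would measure stars, HFR and so on; the CRC is only a stand-in.
    """
    with open(path, "rb") as fh:
        data = fh.read()
    return path, len(data), zlib.crc32(data)


def grade_all(paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Grade files in parallel, leaving one core for the OS by default."""
    if workers is None:
        workers = max(1, (os.cpu_count() or 2) - 1)
    with executor_cls(max_workers=workers) as pool:
        # map() returns results in the same order as the input paths.
        return list(pool.map(grade_file, paths))
```

Calling `grade_all(list_of_fits_paths, workers=8)` would cap it at eight workers. On Windows the call needs to sit under an `if __name__ == "__main__":` guard because worker processes are spawned rather than forked. Whether something like this would actually help depends on how much of the grader's time is analysis versus file reading.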
256 Mbps is right in line with one of those 1024 factors we look out for, which point to coded performance limits, whether in someone else’s library or in the application itself. My only goal with this point was to highlight that SGP is leaving some performance on the table when it comes to storage IO (and CPU threads) in the grading application, and A) make sure the devs are aware of it and B) hopefully hear that performance will be markedly improved in version x.x.x