Maybe this is the answer?
" I saw this thread mentioned elsewhere and thought I’d put my abstention from CN aside to add some insight to this issue, an issue that will become more prominent as more people use cameras that have vastly higher pixel counts than the typical astrophotography cameras did from even just 3 or so years ago.
The core of the issue is what are called data types, and how ASCOM packages up the pixel data from the camera before passing it to the host application, the host application being SGP (in this case) or any other app that is running the camera's ASCOM driver. If you're a programmer, you might know what I mean. If you're not a programmer and don't have the slightest idea what a data type is, I'll try to explain it in simple terms, and then tie that into the explanation of this issue.
Programmatically, the image data from a sensor is an array. You can think of an array as a box that's filled with distinct items, each item corresponding to a pixel of the sensor. That means if you have a 20mp sensor, this array (the container) has 20 million items in it. Now let's talk about each item (pixel). We know that imaging sensors output a certain number of bits per pixel. This bitness describes the maximum range of values that can be encoded for each pixel. The more bits per pixel, the more resolution the ADC has to describe the charge of the pixel. An 8 bit pixel can describe brightness in a range from 0 to 255, a 14 bit pixel from 0 to 16383, a 16 bit pixel from 0 to 65535, and so on. Obviously, for imaging, more resolution means being more tonally descriptive, which means things like higher dynamic range.

The bitness in this case equates to the size of the items in the box (the array). The more bits, the more space each item takes up in the box, and that means the box must be large enough to hold them all. So if you have a 61mp sensor like what's in the QHY600/ASI6200, and it produces image data at 16 bits per pixel, that array is going to be 16 x 61,000,000 = 976,000,000 bits big, or 122,000,000 bytes (there are 8 bits to a byte). So a single raw image is 122 megabytes. That's a pretty big box compared to the one needed to hold the data from a 12 or 16 megapixel sensor running at 8, 12, or 14 bits per pixel.
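If you want to sanity-check that arithmetic yourself, here's a quick back-of-the-envelope sketch in Java (ASCOM and NINA are .NET, but the arithmetic is the same in any language; the pixel count and bit depth are just the figures from the example above, nothing camera-specific):

```java
// Back-of-the-envelope buffer size for a ~61mp sensor delivering 16 bits per pixel.
public class BufferSize {
    public static void main(String[] args) {
        long pixels = 61_000_000L;   // ~61mp, QHY600/ASI6200-class sensor
        int bitsPerPixel = 16;       // delivered bit depth
        long bits = pixels * bitsPerPixel;
        long bytes = bits / 8;       // 8 bits to a byte
        System.out.printf("Raw frame: %,d bits = %,d bytes (~%d MB)%n",
                bits, bytes, bytes / 1_000_000);
    }
}
```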
Also in programming, we have to deal with what are called data types. A data type describes what a thing is in memory and the size of it, in bits (just like the pixel bitness). Memory, after all, is also an array. So we have data types that are 8 bits, 16 bits, 32 bits, even 64 and 128 bits large - those last two can hold some very large numbers. But the point is that the computer can deal with items only in these terms. Everything has a type associated with it, and with that type comes a size. When the camera spits its image data onto the wire, either the camera firmware or the camera's driver on your computer must convert the data (if required) from its native sensor format (which might be 10, 12, or 14 bits for some sensors) to either 8 or 16 bits. So a pixel with 14 bits will get scaled to 16 bits - a data type a computer can use. There is a speed cost to this conversion, but it's not really noticeable in the grand scheme of things.
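As a rough illustration of that scaling step, here's one plausible way to pad 14 bit samples out to a 16 bit data type, sketched in Java. A left shift by two is a common approach, but this is a sketch, not any particular vendor's driver code:

```java
public class BitScaling {
    // One plausible way a driver might pad 14-bit sensor values out to a
    // 16-bit data type. A left shift by 2 is common, but real drivers vary.
    public static short[] scale14to16(short[] raw14) {
        short[] out = new short[raw14.length];
        for (int i = 0; i < raw14.length; i++) {
            int v = raw14[i] & 0x3FFF;   // mask to the 14-bit range 0..16383
            out[i] = (short) (v << 2);   // pad to 16 bits: 0..65532
        }
        return out;
    }
}
```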
Now ASCOM slides into the picture. Being an API that tries to present a consistent interface for programmatically interacting with cameras, it demands that the image data be delivered in a generic data type that's the same no matter what camera produced it. In ASCOM's case, it specifies that the image data be presented as an Object. An object is kind of an amorphous data type - it's a box, but a box where all the items have melted together into one large blob with no discernible separation or organization to them. That's great, programmatically, because an object can hold any kind of data, and that data has no real form. The programmer has to bestow form on it by breaking that one large blob into many 8 bit things or 16 bit things - whichever data type is appropriate or desired.
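ASCOM itself is a COM/.NET API, so a literal example would be C#, but the idea translates to any language with an untyped Object. Here's a small Java sketch of a driver handing back a blob that the client must cast back into a typed array; the getImageArray method is a stand-in for the driver call, not a real API:

```java
public class ImageArrayDemo {
    // Stand-in for a driver call; a real ASCOM driver would fill this from
    // the camera. (ASCOM's ImageArray is an 'object' wrapping an int[,].)
    static Object getImageArray() {
        return new int[4][6];   // tiny dummy frame for illustration
    }

    public static void main(String[] args) {
        Object blob = getImageArray();         // no type information here
        if (blob instanceof int[][]) {
            int[][] pixels = (int[][]) blob;   // give the blob its shape back
            System.out.println("Frame is " + pixels.length + " x " + pixels[0].length);
        }
    }
}
```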
If you're keeping track, you might notice that data is being converted between types an awful lot here. 10, 12, or 14 bit sensor data is getting scaled and converted into 16 bit data. That 16 bit data is then being converted into a formless object, and then that object is being converted back into an array of 16 bit-sized items for it to be usable by the host application. The programmatic flexibility of an Object is great, but it comes at a cost. When low pixel count sensors were the norm, there wasn't a whole lot of data to convert back and forth, so the time cost of doing it was negligible or went unnoticed. CCD cameras were also the norm, and they are relatively pokey at reading out their sensors in the first place, so the additional time cost of conversions between data types and objects for 8 or 10 megapixel cameras was just part of CCD life. Now we're trying to convert 4 to 8 times that number of pixels from CMOS cameras with astounding readout speeds, and this is where the conversion costs become quite noticeable. Type conversion in programming languages has never been a particularly speedy thing in the first place. Only under certain, well-prepared circumstances can it be relatively quick… but by and large it's a lumbering process compared to everything else. Speeding it up comes down more or less to pure brute force - how quickly your CPU and its memory controller can shove blocks of memory about, on top of how well the programming language in question manages conversions.
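To see why the last leg of that round trip hurts, here's an illustrative Java sketch; I'm assuming the boxed data comes back as 32 bit integers (as ASCOM's ImageArray does) and that the application ultimately wants a flat 16 bit buffer:

```java
public class Unboxing {
    // The generic Object hides a 2-D array of 32-bit ints; to get the flat
    // 16-bit buffer the application wants, every one of the (possibly 61
    // million) pixels has to be touched again on the way out.
    public static short[] unboxTo16Bit(Object imageArray) {
        int[][] src = (int[][]) imageArray;   // recover the typed array
        int height = src.length, width = src[0].length;
        short[] dst = new short[height * width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                dst[y * width + x] = (short) src[y][x];   // narrow 32-bit -> 16-bit
            }
        }
        return dst;
    }
}
```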
What “native” camera drivers do is avoid the dismal speed penalty incurred by boxing and unboxing that intermediate ASCOM ImageArray object. If the camera or its vendor-provided SDK presents 16 bit data, that 16 bit data is consumed directly by the host application and generally stays that way throughout its use. There is no need to package it into, or unpack it from, a generic data type such as an object. For example, in NINA, we take the image data array as it's handed to us by the camera's SDK, and we keep it as-is. One internal process copies it, wraps it in a FITS or XISF header, and writes it out as a file to disk. Another process takes a copy of it, runs it through image statistics and a midtone stretching algorithm, and presents it on the screen… and that's that. The memory is deallocated and life moves on to the next exposure. The downside to native drivers is that the application developer is in charge of interacting directly with the camera. Sometimes this is easy, sometimes it is complicated (looking at you, Canon). There are pros and cons to it. But with sensor pixel counts growing, the pros far outweigh the cons. Instead of wasting 10 or 20 seconds each frame just for the data to trickle into the application, we gain that time back, which can be used for more light frame exposures over the course of the night."
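To make that last part concrete, here's roughly the shape of the native path the quoted post describes, sketched in Java. The method names are hypothetical stand-ins, not NINA's actual code; the point is just that the 16 bit buffer from the SDK stays a 16 bit buffer, and no generic Object ever enters the path:

```java
public class NativePath {
    // Hypothetical sketch: keep the SDK's 16-bit buffer as-is and give each
    // consumer its own copy; no boxing into a generic Object anywhere.
    public static void handleExposure(short[] sdkBuffer) {
        short[] forDisk = sdkBuffer.clone();     // copy for the file writer
        short[] forScreen = sdkBuffer.clone();   // copy for stats + screen stretch
        writeToDisk(forDisk);
        displayStretched(forScreen);
        // Both copies go out of scope here; memory is reclaimed before the next frame.
    }

    static void writeToDisk(short[] data) {
        // Stand-in: a real implementation wraps the data in a FITS/XISF header
        // and writes it out as a file.
    }

    static void displayStretched(short[] data) {
        // Stand-in: a real implementation computes image statistics, applies a
        // midtone stretch, and presents the result on screen.
    }
}
```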