INS-DAS-20
This document is not in itself part of the UltraDAS architecture.
Hence, the first problem is how to achieve a set of up to 10 windows in the frame read out on one channel. The same solution is then applied to each channel of a camera.
This is a harder problem than has previously been solved by any user of SDSU controllers. The possible overlapping of windows in the y direction is what makes it hard, as the controller cannot then fully read out one window before starting on the next.
Any readout on a CCD can be described by a sequence of four basic operations:
For a given pattern of windows, it is fairly straightforward to write an assembly-code routine to describe the readout in terms of loops over sequences of the four basic instructions above. However, the number of possible patterns for n windows is a multiple of n^2 and is hence far too high to provide fixed subroutines for each option.
A very general program could start from the definitions of the individual windows and emit sequences of basic readout instructions, perhaps a detector row at a time, to the SDSU video board. This is deemed too difficult to do in the controller, firstly because the program would be complex, subtle and extremely hard to express in assembly code; secondly because the program code might grow to be too large for the controller's memory; and thirdly because the DSP on the SDSU timing board is committed to real-time clocking operations during a readout and is not available to do the geometry calculations.
If the host computer (the DAS SPARCstation) did the geometry calculations, it could, in principle, upload a detector-row's worth of readout instructions at a time to the controller. This approach fails, partly because of the limited bandwidth on the uplink lead (the time to upload the instructions would be at least 0.0025s for a full row of an EEV42 CCD described in 24 bits per readout instruction), but more importantly because the DAS SPARCstation does not have a good real-time response. The time between readout operations would vary wildly and the data quality would be degraded.
The problem becomes tractable if the readout is described in terms of strips of pixels (all to be read out or all to be skipped) that are contiguous in x in a given row, and blocks of identically-patterned rows that are contiguous in y.
In a system that allows up to n windows, any given row can have up to n strips of pixels to read out and n+1 strips of pixels to skip. If the count of pixels in each strip is represented by one word of controller memory, then the row is described by 2n+1 words. To make the table easier to parse in the controller, the windows are ordered in increasing order of the x coordinate of their leftmost column.
The description of a row can be grown into the description of a block by adding one word to hold the repeat count.
To help in working out when to use a row-skip operation, it is helpful to put a one-word flag in each block description, after the repeat count and before the description of the first strip. The flag is set to one for a skipped row and zero for a row that is going to be read out. This makes the size of each block description 2n+3 words.
The same n-windowed system can have up to 2n+1 blocks in the most-complicated case: one block before the first window, one block inside each window and one block after each one. Partly overlapping any pair of the n windows in y does not change the number of distinct blocks; however, making the y-range of one window a sub-set of another reduces the number of blocks by 2.
Thus, the storage required to represent a readout of n windows is (2n+1)(2n+3) = 4n^2 + 8n + 3 words. A storage requirement of 483 words for 10 windows is high, but not impossibly so for a controller with around 16 Kwords of application space.
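The storage arithmetic can be checked with a few lines of Python. This is only a sketch; the function name table_words is introduced here for illustration and is not part of UltraDAS.

```python
def table_words(n_windows: int) -> int:
    """Words of controller memory for a fixed-size table of n windows."""
    blocks = 2 * n_windows + 1           # most-complicated case in y
    words_per_block = 2 * n_windows + 3  # repeat count + flag + (2n+1) strips
    return blocks * words_per_block


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, table_words(n))
```

For 10 windows the function returns 483, matching the figure above.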
The table of strips and blocks is compiled by the camera's server-program on the host SPARCstation and uploaded to the detector controller whenever the camera is initialized, or the pattern of windows is changed.
For simplicity, the table should be of a fixed size for a given maximum number of windows. That is, the table does not grow or shrink when the user activates a different number of windows. Windows that the user is not using (those set in the server program to zero extent in one or both dimensions) still have columns in the table. They generate strips of zero pixels skipped and zero pixels read.
As an example, consider a readout format for a CCD of 2048 by 4028 pixels. The CCD is on a spectrograph, with spectral dispersion along the y axis. The observer wants two narrow windows, each of 100 pixels in x, covering the full spectral range except for 20 rows of poor-quality pixels at the low-y end. The windows start at x=500 and x=1500 respectively and the observer has arbitrarily set windows 8 and 5. The table is (sized for up to 10 windows):
    20 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    4008 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 499 100 900 100 549
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    ... (last line repeated 17 times)

The first block is 20 row-skips (the second word of the line is the row-skip flag). The second block is the remaining 4008 rows of the detector (including overscan), in which the pattern is "skip 499, read 100, skip 900, read 100, skip remaining 549". Windows 1, 2, 3, 4, 6, 7, 9 and 10 (which the observer hasn't set) have a low x-value of zero and hence are sorted to the front of the row description; they generate "skip zero, read zero" pairs. Windows 5 and 8 (which are set) are sorted to the back. Window 8 has been sorted in front of window 5 because it extends to lower x (hence there is no fixed relationship between window numbers and column numbers in the table). The third and subsequent lines are filled with zeros as all rows of the readout have already been described.
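The compilation of such a table on the host can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the camera server's actual code: the name compile_table and the tuple layout (x_start, x_size, y_start, y_size) are hypothetical, coordinates are taken as 1-based, every window is assumed to lie inside the detector, and windows that share rows are assumed not to overlap in x. The row length of 2148 pixels is inferred from the sum of the strips in the example row.

```python
from typing import List, Tuple

# Hypothetical layout: (x_start, x_size, y_start, y_size), 1-based starts.
# An unused window is all zeros.
Window = Tuple[int, int, int, int]


def is_used(w: Window) -> bool:
    return w[1] > 0 and w[3] > 0


def compile_table(windows: List[Window], n_cols: int, n_rows: int) -> List[List[int]]:
    """Build the fixed-size table of blocks and strips for one readout channel."""
    n = len(windows)
    # Unused windows have a low x-value of zero, so they sort to the front.
    ordered = sorted(windows, key=lambda w: w[0] if is_used(w) else 0)

    # Row boundaries (0-based, half-open) at which the window pattern changes.
    edge_set = {0, n_rows}
    for w in windows:
        if is_used(w):
            edge_set.update((w[2] - 1, w[2] - 1 + w[3]))
    edges = sorted(edge_set)

    table = []
    for lo, hi in zip(edges, edges[1:]):
        active = [is_used(w) and w[2] - 1 <= lo and hi <= w[2] - 1 + w[3]
                  for w in ordered]
        if not any(active):
            # No window covers these rows: a block of row-skips (flag = 1).
            table.append([hi - lo, 1] + [0] * (2 * n + 1))
            continue
        strips, pos = [], 0
        for w, on in zip(ordered, active):
            if on:
                skip = (w[0] - 1) - pos
                if skip < 0:
                    raise ValueError("windows overlap in x within a block")
                strips += [skip, w[1]]
                pos = w[0] - 1 + w[1]
            else:
                strips += [0, 0]          # "skip zero, read zero" pair
        strips.append(n_cols - pos)       # skip the rest of the row
        table.append([hi - lo, 0] + strips)

    # Pad with zero lines so the table always has 2n+1 blocks.
    while len(table) < 2 * n + 1:
        table.append([0] * (2 * n + 3))
    return table


if __name__ == "__main__":
    windows = [(0, 0, 0, 0)] * 10
    windows[4] = (1500, 100, 21, 4008)    # window 5
    windows[7] = (500, 100, 21, 4008)     # window 8
    for line in compile_table(windows, n_cols=2148, n_rows=4028)[:3]:
        print(" ".join(str(word) for word in line))
```

Run on the example, the first two lines of the result are the row-skip block and the 4008-row block shown in the table above; the remaining lines are zeros.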
Some detector programs may not leave enough memory free for a table of 10 windows. In these cases, a smaller number of windows may be sufficient; 4 windows (as allowed in the Data-Cell DAS) can be described in 99 words using this new system. Two windows (as in the old Perkin-Elmer DAS) can be described in 35 words. If this approach is to be allowed, even as an option on future cameras, then the host program will have to ask the controller program for the size of the table each time it sets up a window pattern.
Here, I list one possible way of compiling the table. I believe that the code produced is as compact as can be obtained from a "just-in-time" compiler without optimization.
The machine-code representation uses two levels of the hardware do-loops that are part of the architecture of Motorola DSPs. The two loops are nested.
The inner loops each contain a jump instruction leading to a subroutine that reads or skips one pixel. The outer loop contains 2n+1 of the inner loops, one for each possible strip in the table. The repeat counts for the inner loops are the numbers in the body of the table. The repeat counts for the outer loops are the numbers at the start of each line of the table.
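The effect of the two nested loops can be modelled on the host to check a table before it is uploaded. The sketch below is illustrative only: the function name count_operations and the four labels it uses are introduced here, and the model simply counts how many times each basic operation would run rather than imitating the controller's timing.

```python
from collections import Counter


def count_operations(table):
    """Count basic readout operations implied by a table of blocks and strips."""
    ops = Counter()
    for block in table:
        repeat, skip_flag, strips = block[0], block[1], block[2:]
        # Outer loop: one row operation per repeat of the block.
        ops["row_skip" if skip_flag else "row_read"] += repeat
        # Inner loops: strips alternate skip, read, skip, ..., ending in a skip.
        ops["pixel_skip"] += repeat * sum(strips[0::2])
        ops["pixel_read"] += repeat * sum(strips[1::2])
    return ops


if __name__ == "__main__":
    example = [
        [20, 1] + [0] * 21,
        [4008, 0] + [0] * 16 + [499, 100, 900, 100, 549],
    ]
    print(dict(count_operations(example)))
```

A useful sanity check is that the pixel_read count equals the total area of the windows requested, here 4008 rows of 200 pixels.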
Consider again the example table given above:
    20 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    4008 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 499 100 900 100 549
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    ...

The parsing goes as follows. I have shown the assembly code equivalent to the machine code that the parser would plant. In the do instructions, the notation +i indicates an address i higher than the location of the do itself: this is the address of the first instruction after the loop.
    do +66, 20
For the loop over the rows in this block.

    jmp rowskip
where rowskip is the address of the row-skip subroutine.

    do +66, 4008
for the outer loop.

    jmp rowread
where rowread is the address of the subroutine to clock a row into the output register.

    do +2, 0
    nop
    nop
The instruction at the loop-end address (the first instruction after the loop) cannot be another do, so we have to nop between inner loops.

    do +2, 499
    jmp pixelskip
    nop
where pixelskip is the address of the pixel-skip subroutine.

    do +2, 100
    jmp pixelread
    nop
pixelread is the address of the pixel-readout subroutine.

    do +2, 900
    jmp pixelskip
    nop

    do +2, 100
    jmp pixelread
    nop

    do +2, 549
    jmp pixelskip
    nop
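The translation from one line of the table to the listing above can be sketched in Python. This is a rough illustration only: emit_block is a hypothetical name, the loop-end offsets +66 and +2 are copied from the listing rather than derived (so the outer offset applies only to a table sized for 10 windows), and the choice of a nop body for a zero-length strip follows the "do +2, 0" case shown above. The abort check is not included.

```python
OUTER_OFFSET = 66  # copied from the listing; assumed valid for 10-window tables
INNER_OFFSET = 2   # copied from the listing


def emit_block(block):
    """Return the assembly-equivalent lines for one block of the table."""
    repeat, skip_flag, strips = block[0], block[1], block[2:]
    lines = [
        f"do +{OUTER_OFFSET}, {repeat}",
        "jmp rowskip" if skip_flag else "jmp rowread",
    ]
    for i, count in enumerate(strips):
        # Even-numbered strips are skipped, odd-numbered strips are read.
        target = "pixelskip" if i % 2 == 0 else "pixelread"
        body = f"jmp {target}" if count > 0 else "nop"
        # The trailing nop separates this inner loop from the next do.
        lines += [f"do +{INNER_OFFSET}, {count}", body, "nop"]
    return lines


if __name__ == "__main__":
    block = [4008, 0] + [0] * 16 + [499, 100, 900, 100, 549]
    print("\n".join(emit_block(block)))
```

Each block produces the same number of lines whatever its contents, which is what lets the outer loop use a fixed loop-end offset.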
The code would be much shorter if the DSP's rep instruction could be used to loop over pixels in a strip. This is not possible, as rep cannot repeat a jump instruction.
Readouts need to be interruptible as they are sometimes aborted. The outer loop for each block needs to include a check of a flag with a possible jump out of the readout code.
If there are n windows on the camera and m channels, and if the windows are allowed to overlap the boundaries of the frames on each readout channel, then there can be a total of mn sub-windows on the camera with n in each readout table.
This generality makes it hard to implement windows. An SDSU detector-controller has no built-in multi-tasking, but runs a single sequence of code during readout. The readout tables for each channel have to be compiled together. Simplistically, one imagines the inner loops of the code discussed above being expanded to have one jmp instruction per channel.
The problem is harder than that. Where the pattern of windows is different on the frame attached to each readout channel, the pattern of blocks in the readout tables differs too. Each of the mn windows on the camera can generate blocks in each readout table, making the readout tables 2mn+1 blocks long. There is a danger of running out of memory.
There is a further problem. The DAS computer expects to divide the pixel stream by readout channel. In a full-frame readout, one pixel is sent in turn on each channel and the DAS can identify the pixels by their position in the stream. In a window pattern, the interleaving is not necessarily uniform.
UltraDAS cannot afford to have different windowing code optimized for many cameras and for many patterns of windows; a single, general solution is required. There are seven apparent solutions, of which only one survives closer inspection.
Method 4 is a specialization of method 7 in that the controller sets the value of all ghost pixels to zero. The memory cost for this refinement is prohibitive in large, mosaicked cameras and method 4 is unsuitable as a standard algorithm.
Method 5 is not supported by SDSU's protocol for transmitting pixels from the controller to the DAS, so it cannot be used with standard SDSU products.
Method 6 is thought to be feasible with controllers of optical CCDs, although this assertion has not been proven in practice. However, method 6 does not work for the IR camera INGRID; it requires the INGRID controller to use more code for reading amplifiers than will fit in the controller's memory. Method 6 cannot be the standard algorithm.
Method 7 is thought to be feasible on INGRID: it requires the least-possible volume of readout code. The method will also work on cameras where the readout channels cannot be clocked independently, such as the INT WFC and single CCDs with two or more amplifiers.
Hence, the standard windowing solution for UltraDAS is the seventh method listed above. For any camera, the DAS downloads to the controller one window table and requires the controller to apply the table equally to all amplifiers of the camera.
The controller must read the amplifiers in the same sequence that it would read them if windowing were not applied; this sequence must be fixed before the application code for the controller is installed in the observing system, but may be changed after the code for the DAS is installed. That is, the order of readout is dictated to the DAS by a configuration file that also defines the file of object code that the DAS downloads to the controller to work the readout. The controller may never change the sequence of interleaving by omitting one or more channels. If the user wants to ignore a channel, the data from this channel must be sent to the DAS and then discarded.
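Because the interleaving sequence is fixed, the DAS can divide the pixel stream by position alone. The Python sketch below illustrates the idea; the function name split_by_channel and the channel names "A" and "B" are hypothetical, and the channel order is assumed to come from the configuration file mentioned above.

```python
def split_by_channel(stream, channel_order):
    """Divide an interleaved pixel stream into one list of pixels per channel."""
    m = len(channel_order)
    if len(stream) % m != 0:
        raise ValueError("stream length is not a multiple of the channel count")
    # Pixel i in the stream belongs to channel i mod m.
    return {name: list(stream[i::m]) for i, name in enumerate(channel_order)}


if __name__ == "__main__":
    channels = split_by_channel([10, 20, 11, 21, 12, 22], ["A", "B"])
    # A channel the user wants to ignore is discarded here, on the DAS side,
    # after its data has been received.
    channels.pop("B")
    print(channels)
```

The same routine serves full-frame and windowed readouts, since the one window table is applied equally to every amplifier.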