2023-02-19

What if VIC-II had vcbase writable

Ciriciao folks!
This one needs some explanation: what is vcbase? According to the VIC-II Article, vc is an internal 10-bit counter register used to scan the 1000 cells of the screen, and vcbase is another register used to initialize vc at the start of each scanline. At the start of a frame vcbase is set to 0, so in text mode the screen starts with the first byte of the video page selected by register $18. vc is incremented after each column. At the start of each scanline vc is reloaded from vcbase; once a full row (8 scanlines) has been displayed it goes the other way around, and vcbase is set to vc. So vcbase is 40 at the beginning of the second row (the byte at offset 40 of the video page is read), 80 for the next row, and so on.
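The counter behaviour described above can be sketched in a few lines of C. This is a simplified model, not code from Vice or VIC-II Kawari: it ignores bad lines, idle state and cycle-exact timing, and the names `vic_counters`, `vic_frame_start` and `vic_scanline` are made up for illustration.

```c
#include <stdint.h>

/* Simplified model of the VC/VCBASE/RC counters in text mode.
 * Assumption: every display scanline fetches 40 cells; bad lines,
 * idle state and cycle-exact timing are ignored. */
typedef struct {
    uint16_t vc;      /* 10-bit video counter */
    uint16_t vcbase;  /* 10-bit video counter base */
    uint8_t  rc;      /* 3-bit row counter (scanline within a text row) */
} vic_counters;

/* Start of frame: VCBASE is cleared. */
void vic_frame_start(vic_counters *c)
{
    c->vc = 0;
    c->vcbase = 0;
    c->rc = 0;
}

/* One display scanline: VC is reloaded from VCBASE, stepped once per
 * column, and copied back to VCBASE only after the last scanline of
 * a text row. */
void vic_scanline(vic_counters *c)
{
    c->vc = c->vcbase;
    for (int col = 0; col < 40; col++) {
        c->vc = (uint16_t)((c->vc + 1) & 0x3ff);
    }
    if (c->rc == 7) {
        c->vcbase = c->vc;
        c->rc = 0;
    } else {
        c->rc++;
    }
}
```

Calling `vic_scanline` eight times after `vic_frame_start` leaves vcbase at 40, sixteen times at 80, and so on up to 1000 after 25 rows.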

What if we could set vcbase to something other than 0 at the beginning of each frame?
What for?

One effect that requires precise timing, also explained in the VIC-II Article, is called DMA Delay.
It is used to increment vc and vcbase before the screen starts being displayed, thus scrolling it sideways by the desired number of characters.
vc and vcbase are 10-bit registers, so somewhere in the last part of the scrolled screen the counter wraps and the first bytes of the video page are displayed.

Continuing with the experiment of using adders where VIC-II used juxtapositions, we'll extend vc and vcbase to 14 bits (no wrap-around until the end of the 16KB bank) and add a new write-only register, vcbase_latch. To write to vcbase_latch we'll use another pair of read-only registers, $D01E and $D01F, normally used to read the status of sprite collisions.
In the previous post the adder was only 4 bits wide and gave no issues in VIC-II Kawari. Now vc is extended from 10 to 14 bits, and the 4 additional bits overlap the video page bits and need another 4-bit adder.
Another advantage of having 14 bits is that the screen can also be scrolled vertically by any number of characters.

Mind that the color memory is still only 1K, so for the colors the counter will wrap at 10-bit boundaries, as before.
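To make the difference between the two address spaces concrete, here is a small C sketch. It is only an illustration under my own assumptions: the function names `wiv_matrix_addr` and `wiv_color_offset` are hypothetical and do not appear in Vice or VIC-II Kawari.

```c
#include <stdint.h>

/* Address of a video matrix cell inside the 16 KB VIC bank.
 * vm: upper nibble of $D018 (video page, in 1 KB units)
 * vc: 14-bit extended video counter
 * The top 4 bits of vc are added to vm; the sum wraps at 16 KB. */
uint16_t wiv_matrix_addr(uint8_t vm, uint16_t vc)
{
    uint16_t base = (uint16_t)((vm & 0x0f) << 10);
    return (uint16_t)((base + (vc & 0x3fff)) & 0x3fff);
}

/* Offset into the 1 KB color memory: only the low 10 bits count,
 * so the colors wrap where the characters no longer do. */
uint16_t wiv_color_offset(uint16_t vc)
{
    return (uint16_t)(vc & 0x03ff);
}
```

With the default video page at $0400 (vm = 1), a counter value of 1024 reads characters from $0800, while the colors start again from the first color cell.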

Implementation in Vice

There is no special flag to enable this feature: the Kernal initializes vcbase_latch to 0, so everything works as before if you don't change it.
A new internal register is added in file vice/src/viciisc/viciitypes.h:

    /* Internal memory pointer (VCBASE).  */
    int vcbase;
    int vcbase_latch;

In file vice/src/viciisc/vicii-mem.c write accesses to registers $1E and $1F are diverted to vcbase_latch:

inline static void vcbase_store(const uint16_t addr, const uint8_t value)
{
    VICII_DEBUG_REGISTER(("WIV vc base register %s: $%02X", (addr == 0x1e) ? "LOW" : "HIGH", value));

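    /* Use masks and shifts rather than byte pointers, so the result
       does not depend on the byte order of the host. */
    if (addr == 0x1e) {
        vicii.vcbase_latch = (vicii.vcbase_latch & 0x3f00) | value;
    } else {
        vicii.vcbase_latch = (vicii.vcbase_latch & 0x00ff) | ((value & 0x3f) << 8);
    }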
}

...

        case 0x1e:                /* $D01E: Sprite-sprite collision or VIC-WIV vc base low */
        case 0x1f:                /* $D01F: Sprite-background collision or VIC-WIV vc base high */
            if (IS_WIV) {
                vcbase_store(addr, value);
            } else {
                collision_store(addr, value);
            }
            break;

In file vice/src/viciisc/vicii-cycle.c vcbase is initialized at the beginning of each frame to vcbase_latch:

    vicii.vcbase = IS_WIV ? vicii.vcbase_latch : 0;

Implementation in VIC-II Kawari

In file hdl/matrix.v vcbase is initialized:

`ifdef WIV_EXTENSIONS
                vc_base <= wiv_vcbase_latch;
                vc <= wiv_vcbase_latch;
`else
                vc_base <= 10'd0;
                vc <= 10'd0;
`endif

And in file hdl/addressgen.v it's used for bitmap mode, now with 11 bits instead of the previous 10:

                    if (wiv_xmp)
                        vic_addr = {cb + vc[10:7], vc[6:0], rc}; // bitmap data
                    else
                        vic_addr = {cb[3] + vc[10], vc[9:0], rc}; // bitmap data

And for text mode, the whole 14 bits:

`ifdef WIV_EXTENSIONS
            vic_addr = {vm + vc[13:10], vc[9:0]}; // video matrix c-access
`else
            vic_addr = {vm, vc}; // video matrix c-access
`endif

BASIC example

Only one example, in text mode: with increments or decrements of 1 you get horizontal scrolling, with steps of 40 you get vertical scrolling.

10 rem ** vertical scroll
20 for i=24 to 0 step -1
25 a=i*40
30 poke 53278,(a and 255)
35 poke 53279,(a / 256)
40 for p=0 to 200:next
45 next
50 rem ** horizontal scroll
60 for i=39 to 0 step -1
70 poke 53278,i
80 for p=0 to 200:next
85 next
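
The arithmetic behind lines 25 to 35 of the listing (turning a row/column position into the two values to POKE) can also be written out in C. This is just a sketch of the calculation; `wiv_scroll_offset` and `wiv_latch_bytes` are hypothetical helper names, not part of any existing code.

```c
#include <stdint.h>

/* Offset (in characters) of the cell shown in the top-left corner
 * when the screen is scrolled by `rows` text rows and `cols` columns. */
uint16_t wiv_scroll_offset(unsigned rows, unsigned cols)
{
    return (uint16_t)((rows * 40u + cols) & 0x3fff);
}

/* Split a 14-bit offset into the two values written to $D01E/$D01F. */
void wiv_latch_bytes(uint16_t offset, uint8_t *low, uint8_t *high)
{
    *low  = (uint8_t)(offset & 0xff);         /* POKE 53278 */
    *high = (uint8_t)((offset >> 8) & 0x3f);  /* POKE 53279 */
}
```

For the first step of the vertical scroll (24 rows, offset 960) this gives 192 for the low byte and 3 for the high byte.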

2023-02-06

What if VIC-II had 1K aligned charsets (and bitmaps)

Ciriciao folks!

Another feature that VIC (I'm talking about the graphics chip of the VIC-20) has and VIC-II doesn't is the ability to point to charsets at 1K boundaries.
Charsets are 2K in size on both VIC-20 and C64, but VIC-20 can place them in memory at 1K boundaries while C64 must place them at 2K boundaries.
One reason VIC-20 needed that is the small amount of memory accessible by the VIC: theoretically 16K, but practically only 9K. Of these, 4K are ROM (two charsets, each in plain and reverse versions) and 5K are RAM. Of those 5K, 256 bytes are shared with the zero page, another 256 bytes are shared with the stack, and some of the next 512 bytes are used by the Kernal, leaving little more than 4K of RAM available for programs.

So how does the 1K boundary of the charset come into play?

Let's suppose you want to write a game. Your charset, containing the shapes of your background and sprites, has to point to RAM, but you also need to print messages and scores, so you would have to copy some of the characters from ROM to RAM. What a waste!
Instead you point your charset so that 1K is in RAM and 1K is in ROM, and you get 128 tiles for your graphics plus 128 standard characters.
Plain and reverse characters will be swapped, but the RV bit might come to the rescue! (More on this in another article, maybe.)

Register $05 of VIC, mapped at $9005, looks like register $18 of VIC-II, mapped at $D018:

VIC |Bt7|Bt6|Bt5|Bt4|Bt3|Bt2|Bt1|Bt0|Function
----+---+---+---+---+---+---+---+---+---------------
9005|V13|V12|V11|V10|C13|C12|C11|C10|Memory pointers
----+---+---+---+---+---+---+---+---+---------------
VIC2|Bt7|Bt6|Bt5|Bt4|Bt3|Bt2|Bt1|Bt0|Function
----+---+---+---+---+---+---+---+---+---------------
D018|V13|V12|V11|V10|C13|C12|C11| - |Memory pointers
----+---+---+---+---+---+---+---+---+---------------

The difference is the CB10 bit (C10 in my table), bit #10 of the address of character data.

Why?

Look at chapter 3.7.3.1 of the VIC-II Article, at how the g-access address is composed:

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 13| 12| 11| 10|  9|  8|  7|  6|  5|  4|  3|  2|  1|  0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|C13|C12|C11| D7| D6| D5| D4| D3| D2| D1| D0|RC2|RC1|RC0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

If we want to put bit C10 into the picture then we need to add it to D7 and propagate the carry up to C13. That is, instead of a simple juxtaposition of data coming from different sources, we need to add some of them. When dealing with circuits, juxtaposition is free and immediate, while an adder has a cost in terms of space (number of gates) and time (will it introduce delays?).
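
The difference between juxtaposing and adding can be shown with a small C sketch. The function names `g_addr_juxtaposed` and `g_addr_added` are made up for this illustration; the only assumption is the bit layout of the table above, with the result kept inside the 16 KB bank.

```c
#include <stdint.h>

/* Stock VIC-II g-access in text mode: pure juxtaposition.
 * cb3: bits 3..1 of $D018 (CB13..CB11)
 * d:   character code (D7..D0), rc: row counter 0..7 */
uint16_t g_addr_juxtaposed(uint8_t cb3, uint8_t d, uint8_t rc)
{
    return (uint16_t)(((cb3 & 0x07) << 11) | (d << 3) | (rc & 0x07));
}

/* With CB10: the 4-bit base overlaps D7 in address bit 10, so the two
 * fields have to be added and the carry propagates upwards. */
uint16_t g_addr_added(uint8_t cb4, uint8_t d, uint8_t rc)
{
    uint16_t offset = (uint16_t)((d << 3) | (rc & 0x07));
    return (uint16_t)((((cb4 & 0x0f) << 10) + offset) & 0x3fff);
}
```

With an even base both functions agree: `g_addr_juxtaposed(3, 128, 0)` and `g_addr_added(6, 128, 0)` both give $1C00. With the odd base 7, character 128 lands at $2000, which a bitwise OR of the two fields could not produce because bit 10 is set in both.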

That's about characters, what about bitmaps?

Same, and worse. VIC-II uses only the CB13 bit; the other bits come from the video counter VC and the row counter RC:

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 13| 12| 11| 10|  9|  8|  7|  6|  5|  4|  3|  2|  1|  0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|C13|VC9|VC8|VC7|VC6|VC5|VC4|VC3|VC2|VC1|VC0|RC2|RC1|RC0|
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Here three more bits of CB (CB12 to CB10) come into the picture: they are added to VC9 to VC7, with the carry propagating into CB13.
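
As a worked example for the bitmap case, here is the same addition in C. Again a sketch under the bit layout shown above: `bitmap_addr_added` is a hypothetical name, and the final mask assumes the address simply wraps inside the 16 KB bank.

```c
#include <stdint.h>

/* Bitmap g-access with a 1 KB aligned base.
 * cb4: low nibble of $D018 (CB13..CB10)
 * vc:  10-bit video counter, rc: row counter 0..7 */
uint16_t bitmap_addr_added(uint8_t cb4, uint16_t vc, uint8_t rc)
{
    uint16_t offset = (uint16_t)(((vc & 0x3ff) << 3) | (rc & 0x07));
    return (uint16_t)((((cb4 & 0x0f) << 10) + offset) & 0x3fff);
}
```

A base of 8 reproduces the stock behaviour with CB13 set (bitmap at $2000), a base of 1 starts the bitmap at $0400, and with a base of 15 the last cell (vc = 999, rc = 7) wraps around to $1B3F.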

Implementation in Vice

In Vice we don't need to worry too much about the 'performance loss' that a sum introduces with respect to a bitwise OR.
As usual, a new macro is defined in file vice/src/viciisc/viciitypes.h:

/* VIC-II WIV flags */
...
#define WIV_XMP (vicii.regs[0x13] & 0x02) /* eXtended Memory Pointers: enable all bits of register $18 */

Reading register $18 will return the lowest bit only if XMP is set (otherwise it reads as 1), file vice/src/viciisc/vicii-mem.c:

        case 0x18:              /* $D018: Video and char matrix base address */
            VICII_DEBUG_REGISTER(("Video memory address register: $%02X",
                                  vicii.regs[addr]));
            value = vicii.regs[addr] | (IS_WIV && WIV_XMP ? 0x0 : 0x1);
            break;

And finally, in file vice/src/viciisc/vicii-fetch.c, when the memory address is built a sum takes place instead of a bitwise OR:

inline static uint16_t g_fetch_addr(uint8_t mode)
{
    ...
    /* BMM */
    if (mode & 0x20) {
        a = (vicii.vc << 3) | vicii.rc;

        if (IS_WIV) {
            a += (vicii.regs[0x18] & (WIV_XMP ? 0xf : 0x8)) << 10;
            a &= 0x3fff;
        } else {
            a |= (vicii.regs[0x18] & 0x8) << 10;
        }
    } else {
        a = (vicii.vbuf[vicii.vmli] << 3) | vicii.rc;

        if (IS_WIV) {
            a += (vicii.regs[0x18] & (WIV_XMP ? 0xf : 0xe)) << 10;
            a &= 0x3fff;
        } else {
            a |= (vicii.regs[0x18] & 0xe) << 10;
        }
    }
    ...
}

Implementation in VIC-II Kawari

In file hdl/registers.v the output cb (character base) is expanded from 3 to 4 bits:

`ifdef WIV_EXTENSIONS
           output reg [3:0] cb,
`else
           output reg [2:0] cb,
`endif

A new flag wiv_xmp is added to register CR3:

           output reg wiv_cre = 1'b0, // VIC-WIV control registers read enable
           output reg wiv_xmp = 1'b0, // VIC-WIV extended memory pointers: enable all bits of register $18
           output reg wiv_dvb = 1'b0, // VIC-WIV disable vertical border
           output reg wiv_dmb = 1'b0, // VIC-WIV disable main border

It is read and written like this:

                            dbo[0] <= wiv_cre;
                            dbo[1] <= wiv_xmp;
                            dbo[2] <= wiv_dvb;
                            dbo[3] <= wiv_dmb;
...
                        wiv_cre <= dbi[0];
                        wiv_xmp <= dbi[1];
                        wiv_dvb <= dbi[2];
                        wiv_dmb <= dbi[3];

Used to enable reading the least significant bit of the expanded cb:

                    /* 0x18 */ `REG_MEMORY_SETUP: begin
`ifdef WIV_EXTENSIONS
                        dbo[0] <= cb[0] | ~wiv_xmp;
                        dbo[3:1] <= cb[3:1];
`else
                        dbo[0] <= 1'b1;
                        dbo[3:1] <= cb[2:0];
`endif
                        dbo[7:4] <= vm[3:0];
                    end

While writing always goes through:

                    /* 0x18 */ `REG_MEMORY_SETUP: begin
`ifdef WIV_EXTENSIONS
                        cb[3:0] <= dbi[3:0];
`else
                        cb[2:0] <= dbi[3:1];
`endif
                        vm[3:0] <= dbi[7:4];
                    end

The new flag is used in file hdl/addressgen.v where a sum is used when XMP is set:

`ifdef WIV_EXTENSIONS
                    if (wiv_xmp)
                        vic_addr = {cb + { 1'b0, vc[9:7]}, vc[6:0], rc}; // bitmap data
                    else
                        vic_addr = {cb[3], vc, rc}; // bitmap data
`else
                    vic_addr = {cb[2], vc, rc}; // bitmap data
`endif
                else
`ifdef WIV_EXTENSIONS
                    if (wiv_xmp)
                        vic_addr = {cb + { 3'b000, char_ptr[7]}, char_ptr[6:0], rc}; // character pixels
                    else
                        vic_addr = {cb[3:1], char_ptr, rc}; // character pixels
`else
                    vic_addr = {cb, char_ptr, rc}; // character pixels
`endif

BASIC example

Text mode example

The first example shows how to use the new feature in text mode: the charset points to the second half of the lowercase ROM font, so that the upper 128 chars are in RAM and free to be redefined and displayed.

10 rem ** half charset from rom
11 rem ** and half charset from ram
20 poke 53267,2:poke 53272,23
30 rem ** smiley face
40 data 129,36,36,0,66,66,60,129
50 for i=0to7:read a:poke 8192+i,a:next
60 print"smile! "chr$(18)+"@@@"

Output: [screenshot of the text mode example]
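
To see why the listing pokes 23 into 53272 and then writes the smiley at 8192, the value can be decoded by hand or with a few lines of C. This is a sketch only: the helper names are hypothetical, XMP is assumed to be set, and addresses are relative to the current 16 KB VIC bank (bank 0 by default).

```c
#include <stdint.h>

/* Start of the video matrix selected by a $D018 value (1 KB units). */
uint16_t d018_screen_base(uint8_t d018)
{
    return (uint16_t)((d018 >> 4) << 10);
}

/* Start of the charset with XMP set: all four low bits count. */
uint16_t d018_charset_base_xmp(uint8_t d018)
{
    return (uint16_t)((d018 & 0x0f) << 10);
}

/* Address of the first byte of a character's 8-byte shape. */
uint16_t char_shape_addr(uint8_t d018, uint8_t code)
{
    return (uint16_t)((d018_charset_base_xmp(d018) + code * 8u) & 0x3fff);
}
```

For the value 23 the screen stays at $0400 and the charset starts at $1C00, so screen code 128 (the reversed @ printed by line 60) takes its shape from $2000, that is 8192.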

Bitmap mode example

The second example shows how the bitmap buffer can now be set at 1K boundaries.

10 rem ** move start of bitmap
11 rem ** with 1kb steps
20 print chr$(147)
30 poke 53267,2:poke 53265,55
40 for i=0to15
50 poke 53272,16+i
60 for p=0to1000:next
70 next
80 goto 40

Output: [screenshot of the bitmap mode example]