## [C] Understanding an interesting piece of code

### [C] Understanding an interesting piece of code

Hello everyone!

Some times ago I found an interesting bomberman clone written in C with some cool graphics effects in it.
Trying to understand how it achieved those effects, I put up togheter the minimum code required to reproduce them, but still have trouble understanding some very obscure (to me) pieces...
Especially this routine:

{l Code}: {l Select All Code}
`void Rotator(void){   u32   ix, iy;   u16   px, py, px2, py2, incx, incy;   u32   nMultiplier;   u8   *pScr, *pPic;   SDL_LockSurface(screen);   // Multiplicateur (Zoom).    #if RZ_TYPE   == 0        nMultiplier = 0x08000 + ((0x0060 + pCos[(gRoto.nAngle + 192) & 0xFF]/2) << 9);    #else        nMultiplier = 0x08000 + ((0x0060 + pCos[gRoto.nZoom]/2) << 9);    #endif   // Val sin et cos (Roto).   incx = (pCos[gRoto.nAngle] * nMultiplier) >> 16;   incy = (pSin[gRoto.nAngle] * nMultiplier) >> 16;   // Point de départ.   // x' = x cos f - y sin f   // y' = y cos f + x sin f   px = (128 << 8) - ((ROTO_CTR_X * incx) - (ROTO_CTR_Y * incy));   py = (128 << 8) - ((ROTO_CTR_Y * incx) + (ROTO_CTR_X * incy));   // Loop.   pScr = (u8*)screen->pixels;   pPic = (u8*)pRotoPic->pixels;   pPic += (gRoto.nScroll & 63) << 8;   // Scroll.   s32   nPitchDiff = screen->pitch - SCREEN_WIDTH;       for (iy = 0; iy < SCREEN_HEIGHT; iy++)   {      px2 = px;      py2 = py;      for (ix = 0; ix < SCREEN_WIDTH; ix++)      {         *pScr++ = *(pPic + ((py2 >> 8) << 8) + (px2 >> 8));         px2 += incx;         py2 += incy;      }      pScr += nPitchDiff;      //incx++;      // Déformation.      px -= incy;      py += incx;   }    #if RZ_TYPE   == 0        gRoto.nAngle += 1;    #else        // Déplacement des rotations, zooms, scrolls...        gRoto.nAngCnt--;        gRoto.nAngCnt &= 63;        if (gRoto.nAngCnt == 0)        {            gRoto.nAngVar = (gRoto.nAngVar + 1) & 3;            gRoto.nAngCnt = (rand() & 63) | 16;        }        if (gRoto.nAngVar == 1) gRoto.nAngle++;        else if (gRoto.nAngVar == 3) gRoto.nAngle--;        else gRoto.nScroll++;        gRoto.nZoomCnt--;        gRoto.nZoomCnt &= 63;        if (gRoto.nZoomCnt == 0)        {            gRoto.nZoomVar = (gRoto.nZoomVar + 1) & 1;            gRoto.nZoomCnt = (rand() & 63) | 32;        }        if (gRoto.nZoomVar == 0) gRoto.nZoom++;    #endif   SDL_UnlockSurface(screen);}`

I can't understand some numbers come from, like in
nMultiplier = 0x08000 + ((0x0060 + pCos[(gRoto.nAngle + 192) & 0xFF]/2) << 9);
or
px = (128 << 8) - ((ROTO_CTR_X * incx) - (ROTO_CTR_Y * incy));
, are them some kind of mathematical constants?

I've tracked down the author website (in french) and found a simplistic explanations about some of the operations, but I can't completly understand them all.

I attached the full example to this post.
Attachments
roto.zip
Last edited by Pix3l on 15 Jan 2023, 22:02, edited 1 time in total.
http://www.pix3lworkshop.altervista.org/ - Your 8bit choice since 2006!

Pix3l

Posts: 55
Joined: 10 Sep 2010, 21:00
Location: Italy

### Re: [C] Understanding an interesting piece of code

You haven't picked easy code to understand It's and efficient oldschool code (judging from C89 and old SDL), the guy is using sin/cos lookup tables and fixed point math and some other hacks with pointers that help efficiency -- this is good but it'll be hard to understand and decypher, especially with French comments etc. I've been briefly trying to decypher it and got some idea how it works, but TBH I would just rewrite it if I were to do this myself, it would be faster than trying to understand it, the code is not extremely well documented. If you want to dig through it, take a look at affine transformations and transform matrices (he's not using matrices per se, but he's effectively implementing the same effect), take a look at bitwise operators and how they're used (e.g. & is used to mask out bits and can be used e.g. as fast modulo, bitwise shifts << and >> are used for fast multiplication by powers of two and utilized with fixed point math, | is used to set bits, i.e. combine different flags etc.). Sorry I can't be of more help, but you can perhaps find some help on my wiki (e.g. https://www.tastyfish.cz/lrs/fixed_point.html, https://www.tastyfish.cz/lrs/bit_hack.html etc.).

drummyfish

Posts: 448
Joined: 29 Jul 2018, 20:30
Location: Moravia

### Re: [C] Understanding an interesting piece of code

I forgot to include the link to the author's website in the previous post.
What I can tell you, is that, this example use a circle of 256 degrees, instead of 360 (for simplicity he says), and like you pointed, he precalculate all the cos and sin around.

Anyway, I agree with you that will be faster rewriting all of this code than trying to understand, but I was just curious.
The rotation and zooming effect applied to the image in this example, it's quite amazing for me to see it doesn't use some external libraries to achieve these effects.
I'd just like to understand better how this thing works, mainly for personal interest.

Anyway, many thanks for your time and your suggestions, your blog seems very interesting, I will take an eye on it soon :]
http://www.pix3lworkshop.altervista.org/ - Your 8bit choice since 2006!

Pix3l

Posts: 55
Joined: 10 Sep 2010, 21:00
Location: Italy

### Re: [C] Understanding an interesting piece of code

I like that attitude, most people would just do it with some library and not even care how it works. The principle is the same as mode 7 rendering, look that up for details.

Here I've written basically the same thing with my tiny C engine (https://codeberg.org/drummyfish/saf) -- maybe this will be better understandable; it also doesn't use any libraries for the transformation and it only uses fixed point. The code shows the idea without trying to be hardcore optimized because that obscures it (for speed you should e.g. write directly to frame buffer etc.).

Here is the code (compile by putting saf.h in the directory of the source and do e.g. gcc -DSAF_PLATFORM_SDL2=1 tmp.c -lSDL2):

{l Code}: {l Select All Code}
`#define SAF_PROGRAM_NAME "test"#include "saf.h"uint8_t image[1026] = {0x20,0x20,0xb6,0x92,0x92,0xb6,0x4e,0x49,0xb6,0xb6,0x92,0x49,0x9a,0xb6,0x92,0x92,0x92,0x92,0x92,0x92,0x49,0xb6,0x92,0x76,0x92,0x92,0x92,0x92,0x6d,0x6d,0x49,0x76,0x92,0x92,0x92,0x92,0x92,0x51,0x4a,0x20,0xb6,0xf3,0x6d,0x49,0x92,0xb6,0x92,0x92,0x92,0x92,0x92,0x92,0x49,0x92,0xb6,0x92,0x92,0xb6,0x92,0xb6,0x92,0x4d,0x6d,0x72,0x76,0xb6,0x31,0x31,0x31,0x6d,0x49,0x49,0x72,0x92,0x49,0x49,0x6d,0x6d,0x92,0x6d,0x92,0x6d,0x92,0x6d,0x28,0x49,0x49,0xb6,0x92,0x6d,0x51,0x6d,0x31,0x49,0x4d,0x49,0x49,0x31,0x92,0x92,0x92,0x92,0x49,0x6d,0x6d,0x6d,0x44,0x44,0x6d,0x92,0x6d,0x4e,0x49,0x49,0x69,0x44,0x49,0x49,0x09,0x49,0x25,0x31,0x49,0x49,0x00,0x6d,0x92,0x92,0x92,0x92,0x76,0x92,0x92,0x92,0x49,0xdb,0xdb,0xdb,0xb6,0xdb,0x92,0x92,0xdb,0xdb,0xb6,0xdb,0xb6,0xdb,0xdb,0xdb,0xdb,0xdb,0xb6,0xdb,0xb6,0xdb,0xdb,0x6d,0x92,0x92,0x92,0xb6,0x92,0x92,0x92,0x92,0x6d,0xdb,0xb6,0xb6,0xb6,0x92,0x92,0xb6,0xb6,0x92,0xb6,0x92,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x6d,0x92,0x92,0x92,0x36,0x36,0x92,0x76,0x92,0x6d,0xdb,0xb6,0x92,0x92,0x92,0x92,0xb6,0xb6,0x92,0x92,0xb6,0x92,0xb6,0x92,0xb6,0x92,0xb6,0x92,0xb6,0x92,0xb6,0xb6,0x49,0xb6,0x92,0x92,0x92,0x92,0x92,0x92,0x6d,0x6d,0xdb,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x76,0xb6,0x76,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x51,0x92,0x92,0x92,0x92,0x6d,0x92,0x92,0x6d,0x6d,0xdb,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0x9a,0x92,0xb6,0x92,0xb6,0xdb,0x6d,0x36,0x92,0x92,0x92,0x49,0x49,0x49,0x24,0x6d,0xdb,0x92,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0x92,0xb6,0x92,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0x51,0x92,0x2a,0x25,0x31,0x6d,0x49,0x49,0x49,0x6d,0xdb,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0x9a,0xb6,0xb6,0xb6,0x6d,0x49,0x6d,0x92,0x6d,0xb6,0x92,0xb6,0x6d,0x6d,0xdb,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x6d,0x6d,0x92,0x92,0xb6,0x92,0x92,0x92,0x92,0x6d,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0x31,0x6d,0x92,0x92,0x92,0x92,0x92,0x92,0x6d,0x6d,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0x49,0x6d,0x92,0x92,0x92,0x92,0xb6,0x92,0x4d,0x6d,0xdb,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0x6d,0x6d,0xb6,0x92,0xb6,0x92,0xb6,0x92,0x6d,0x6d,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xdb,0x49,0x6d,0x92,0x92,0xb6,0x92,0x92,0x92,0x05,0x6d,0xdb,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xdb,0x49,0x6d,0x92,0x92,0x92,0xb6,0x92,0x92,0x00,0x6d,0xdb,0xb6,0x95,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xdb,0x6d,0x6d,0xb6,0xb6,0x92,0x6d,0x6d,0x6d,0x24,0x6d,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0x6d,0x4d,0x6d,0x6d,0x51,0x36,0x92,0x6d,0x4d,0x6d,0xdb,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x6d,0x24,0x6d,0x6d,0x92,0x92,0x92,0x92,0x6d,0x6d,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x95,0xb6,0x92,0xb6,0x92,0x6d,0xb6,0x92,0x92,0xb6,0x92,0xb6,0x6d,0x6d,0xdb,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x6d,0x49,0x92,0x92,0x92,0x92,0xb6,0xb6,0x76,0x6d,0xdb,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0x92,0xb6,0x92,0xb6,0x6d,0x6d,0xb6,0x92,0x92,0x76,0xb6,0x92,0x92,0x6d,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xdb,0x92,0xb6,0xb6,0xb6,0xb6,0x95,0xb6,0xb6,0xb6,0x92,0x92,0xb6,0x92,0xb6,0xb6,0x92,0x6d,0x92,0xb6,0xb6,0x92,0x92,0x92,0x92,0x6d,0xdb,0xb6,0xb6,0xb6,0x92,0xb6,0xb1,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0x95,0xb6,0x92,0xb6,0x6d,0x6d,0x9a,0x92,0xb6,0x92,0xb6,0xb6,0x6d,0x6d,0xdb,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xb6,0xdb,0xb6,0xb6,0xb6,0x92,0xb6,0xb6,0xdb,0x92,0x6d,0x92,0xb6,0xb6,0x6d,0x6d,0x6d,0x31,0x4e,0x92,0x6d,0x92,0x6d,0x92,0x6d,0x92,0x92,0x6d,0x92,0x4d,0x6d,0x6d,0x4d,0x92,0x4d,0x6d,0x92,0x6d,0x49,0x49,0x31,0x4d,0x6d,0x92,0x92,0x51,0x92,0x92,0x92,0x92,0x49,0x4d,0x25,0x05,0x49,0x49,0x49,0x49,0x49,0x49,0x25,0x6d,0x6d,0x49,0x49,0x44,0x31,0x6d,0x49,0x6d,0x6d,0x92,0x6d,0x92,0x49,0x31,0x6d,0xb6,0xb6,0x92,0xb6,0xb6,0x49,0x6d,0x92,0x92,0x92,0x44,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0x49,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0xb6,0x49,0x92,0xb6,0xb6,0x76,0x76,0x92,0xb6,0x92,0x6d,0xb6,0xb6,0x92,0x49,0x92,0x92,0x6d,0x92,0x92,0x92,0x92,0x6d,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0x92,0xf3,0x49,0x92,0x9a,0x92,0x92,0x92,0x92,0x92,0x92,0x89,0xb6,0xb6,0x92,0x28,0x92,0x92,0x92,0x6d,0x92,0x92,0x92,0x6d,0x6d,0x92,0x92,0x92,0x92,0x6d,0x92,0x92,0x6d,0xb6,0x6d,0xb6,0x92,0x93,0x92,0x36,0x92,0x92,0x92,0x49,0xb6,0xd2,0x92,0x49,0x92,0x92,0x92,0x92,0x92,0x76,0x92,0x6d,0x6d,0xb6,0x92,0x92,0x92,0x92,0x92,0x92,0x6d,0xdb,0x49,0x92,0x92,0x92};void SAF_init(void){}uint8_t imagePixel(int x, int y){  // wrap-around the coordinates so that they're inside the image  return image[2 + (((unsigned int) x) % 32) + (((unsigned int) y) % 32) * 32];}#define FIXED_POINT_SCALE 512void drawBackground(int centerX, int centerY, int scale, int rotation){  int stepX = (scale * SAF_sin(rotation)) / 256, // traversal direction vector      stepY = (scale * SAF_cos(rotation)) / 256;  centerX *= FIXED_POINT_SCALE; // convert to fixed point  centerY *= FIXED_POINT_SCALE;  for (uint8_t y = 0; y < SAF_SCREEN_HEIGHT; ++y)  {    int pictureX = centerX, pictureY = centerY;    for (uint8_t x = 0; x < SAF_SCREEN_WIDTH; ++x) // draw the line    {      SAF_drawPixel(x,y,        imagePixel(pictureX / FIXED_POINT_SCALE,pictureY / FIXED_POINT_SCALE));      pictureX += stepX;      pictureY += stepY;    }     centerX += stepY; // step in 90 degree direction to the next line    centerY -= stepX;  }}  int x = 0, y = 0, s = FIXED_POINT_SCALE * 2, r = 0;int dx = 0, dy = 0, ds = 0, dr = 0;uint8_t SAF_loop(void){  SAF_clearScreen(255);  drawBackground(x,y,s,r);  x += dx;  y += dy;  s += ds;  r += dr;  if (SAF_frame() % 64 == 0)  {    dx = -2 + SAF_random() % 4;    dy = -2 + SAF_random() % 4;    dr = -2 + SAF_random() % 4;    ds = -8 + SAF_random() % 16;  }  if (s < FIXED_POINT_SCALE)    ds = 5;  else if (s > FIXED_POINT_SCALE * 3)    ds = -5;  return 1;}`

drummyfish

Posts: 448
Joined: 29 Jul 2018, 20:30
Location: Moravia

### Re: [C] Understanding an interesting piece of code

* "// Multiplicateur (Zoom)." -> "multiplier (zoom)" obviously.
* "// Val sin et cos (Roto)." -> "sin and cos value (roto)"
* "// Point de départ." -> "starting point"
* "// Déformation." -> hm... don't know english word for this... basically, modifying the shape. Transforming, except it's not scaling, rotating, nor translating which are technically transformations, too.
* "// Déplacement des rotations, zooms, scrolls..." -> "moving rotations, zooms, scrolls..."

I doubt this helps though. If there's questions about french stuff, I can certainly help anyway. As for trigonometry and old optimisation tricks, I have some knowledge and understanding, but overall my math level is not very high, and I usually need to focus more than a bit to do 3D transformations, as well as browsing again bookmarks and whatnot to remember what those things (dot products, for example) are used for.
Note that those old tricks were mostly useful in the past for several reasons:

* old CPUs didn't necessarily had a unit dedicated to floating point numbers
* they didn't even had instructions to run parallel instructions like we now have, that is, something like: " for( struct foo const* bar = start; bar != end; ++bar ){ bar.i++; bar.j++; bar.z++; }" can nowadays run all the for loop content in parallel, and it's even better when doing some loop unrolling (have to measure on a case by case though).
* there was obviously a single core
* and having a GPU was not the norm
* there were no SRAM caches in processors, they accessed DRAM modules directly (SRAM: static ram. Cost an eye, but is fast as hell. Used mostly for CPU caches: L1, L2, L3, etc. DRAM: dynamic ram, cost nothing, but is rather slow. This is what you have in those DDR RAM, obviously).

I said those tricks were useful in the past... they still are, but not much (not much, means they still are) on normal desktops or laptops. They can bring you benefits if you write code for micro-controllers though.

Nowadays, for laptops/desktops, a good way to get some speed is by taking care about your memory layout (avoiding cache misses can really help, and this actually means avoiding padding in structures, and using types as small as possible, not to save RAM, but to save bandwidth).
Of course, avoiding cosine, sine or square root calculations is still important, and you might want to avoid doing square roots when you want to compare distances. As always, reducing number of instructions used will lead to improved performances. Will this be worth it though, is a different problem. What can make optimisations not worth it is when it makes the code hard to maintain, while not giving much, if any, performance improvements because the time spent in a routine only represents 0.01% of total time.

Having highly specialized structures would allow for both (usually) easier to maintain and faster code.
Easier to maintain, because no need to worry what all those members are for, so you only bother about less stuff, and faster because this means members are close in memory, allowing CPU to store more of them in memory lines (which are usually 64 bytes only, mind you... on modern systems, it means only 8 pointers, so I think it's quite fun to use pointers to address elements in arrays that won't have more than 64K elements: an unsigned short index would save 75% size here... but I'm out of topic now I think?).
Close data in memory also allows compilers to use sse instruction set, sometimes, which is about doing things in parallel. And people sometimes forcefully use SSE instructions, because no, compilers don't know how to optimise everything (really, loop unrolling can notably give pretty impressive perf boosts, but again: measure before and after, so that you can know if it's worth it).
freem
BN Moderator

Posts: 106
Joined: 18 May 2015, 19:38

### Re: [C] Understanding an interesting piece of code

True, though if you write for microcontrollers things like cache friendliness may not matter because simple MCUs often don't have memory caches -- in the end it all depends on the platform. My approach is to always optimize for the weakest platform I want to support -- if I make it run fast on a weak embedded device, it will definitely run fast on desktop, even if the code isn't optimimal for desktop. On my wiki I keep a list and quick-how-to on optimization I have learned so far: https://www.tastyfish.cz/lrs/optimization.html.

drummyfish

Posts: 448
Joined: 29 Jul 2018, 20:30
Location: Moravia

### Who is online

Users browsing this forum: No registered users and 1 guest