So pitch here is the number of bytes per row, including both the pixels and the padding. That means the multiples of pitch are the byte offsets where each row starts.
E.g. say you have a 10 pixel wide image in 24-bit RGB (3 bytes per pixel); that makes each row of pixels 3*10 = 30 bytes. Now the BMP spec says each row must be padded to a multiple of 32 bits (4 bytes). 30/4 is 7.5, so clearly not a multiple. 30%4 == 2 means the row ends 2 bytes past the last 4-byte boundary, so it needs 4-2 = 2 bytes of padding, giving 32 bytes, which is a multiple (32/4 = 8). So that gives `width=10` and `pitch=32`. You might also see pitch referred to as the "stride".
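To make the arithmetic concrete, here is a minimal sketch of the pitch calculation using the usual round-up-to-32-bits formula. The function names `row_pitch` and `pixel_offset` are just ones I made up for illustration, not from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per row for a BMP-style image: the pixel bits of one row,
   rounded up to the next multiple of 32 bits (4 bytes). */
size_t row_pitch(size_t width, size_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return ((width * bits_per_pixel + 31) / 32) * 4;
}

/* Byte offset of pixel (x, y) in the pixel buffer. Rows are `pitch`
   bytes apart, so row y starts at y * pitch. */
size_t pixel_offset(size_t x, size_t y, size_t pitch, size_t bytes_per_pixel)
{
    return y * pitch + x * bytes_per_pixel;
}
```

With the example above, `row_pitch(10, 24)` comes out as 32, and the second row (y = 1) starts at byte offset 32, not 30.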
Sorry, "sx" and "sy" were maybe not the best variable names for a quick example. I was referring to "source x" and "source y", while x,y are "destination x" and "destination y". The "+ 0", "+ 1", "+ 2" are for the three 8-bit components of a pixel. You can see this on the 3 assignment lines: (x,y) is on the left side while (sx,sy) is on the right side.
"x * in.width / out.width" lines being an integer scale I came up with really quickly that I believe should be correct for a simple nearest neighbour. e.g. say "in.width" is 50 and "out.width" is 150, for x starting from zero [0, 0, 0, 1, 1, 1, 2, 2, 2, .... 48, 48, 48, 49, 49, 49] so 3x scale up repeating each pixel 3 times, or for say "in.width" is 100 and "out.width" is 50, [0, 2, 4, 6, ... 94, 96, 98] so 2x scale down taking every other pixel.