This post describes base64 encoding and how it works at the bit level. Although I’ve used it extensively in my career, I never needed to know the underlying implementation. I had a decent grasp of it, but most descriptions felt lacking when I actually tried to understand how to implement it.
What follows is a quick write-up of what I learned and what I believe to be true about base64 encoding.
Given a string of “ABC”, we want to construct a base64 encoded representation. The final string will be “QUJD”.
Understanding the bits
One standard byte is 8 bits, so the bit-level representation of ABC is shown below.
Standard ASCII representation of bits
Each character is represented by 8 bits. The maximum value that can be represented using 8 bits is 255. The value is used as an index in a table of chars.
A  B  C 

01000001  01000010  01000011 
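To check those bit patterns yourself, here is a minimal Python sketch. The variable names are my own, chosen just for illustration.

```python
text = "ABC"

# ord() gives the ASCII code of each character; format(..., "08b") renders
# it as a zero-padded 8-bit binary string.
bits = [format(ord(ch), "08b") for ch in text]

print(bits)  # ['01000001', '01000010', '01000011']
```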
Base64 representation of bits
Each character is represented by 6 bits. The maximum value that can be represented using 6 bits is 63. The value is used as an index in a table of chars.
Base64 Character Table
Each value from 0 to 63 can be used as an index to this lookup table to perform the encoding or decoding.
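As a rough sketch, the standard base64 alphabet (A-Z, a-z, 0-9, then + and /) can be built in Python as a 64-character string, so that a 6-bit value is simply an index into it. The name B64_TABLE is my own.

```python
import string

# Standard base64 alphabet: 26 uppercase + 26 lowercase + 10 digits + "+/".
B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

print(len(B64_TABLE))               # 64
print(B64_TABLE[0], B64_TABLE[25])  # A Z
print(B64_TABLE[26], B64_TABLE[51]) # a z
print(B64_TABLE[52], B64_TABLE[61]) # 0 9
print(B64_TABLE[62], B64_TABLE[63]) # + /
```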


Encoding
Q  U  J  D 

010000  010100  001001  000011 
If we work in groups of 3 bytes, we have a stream of 24 bits: 3 characters of 8 bits = 24 bits. To convert this stream to base64, we can concatenate those 24 bits and split them into 6-bit groups, giving us 4 base64 characters. This works since 4 characters of 6 bits also equals 24 bits.
This is accomplished with bit shifting and looks like this:


 shift 6 bits from first char (010000)


 shift 2 bits from first char, 4 from second char (2 + 4 = 6)


 shift 4 bits from second char (0010), 2 from third char (01) (4 + 2 = 6)


 shift 6 bits from third char (all that’s left) into ch_4
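The four steps above can be sketched in Python roughly like this. This is one way to write the shifts and masks, not necessarily the only one; the names b1, b2, b3 and ch_1 to ch_3 are my own, following the ch_4 naming used above.

```python
# Unpacking a bytes object yields integers: 65, 66, 67.
b1, b2, b3 = b"ABC"

# Top 6 bits of the first byte.
ch_1 = b1 >> 2
# Low 2 bits of the first byte, then the top 4 bits of the second.
ch_2 = ((b1 & 0b11) << 4) | (b2 >> 4)
# Low 4 bits of the second byte, then the top 2 bits of the third.
ch_3 = ((b2 & 0b1111) << 2) | (b3 >> 6)
# Low 6 bits of the third byte (all that's left).
ch_4 = b3 & 0b111111

print([format(v, "06b") for v in (ch_1, ch_2, ch_3, ch_4)])
# ['010000', '010100', '001001', '000011']
print(ch_1, ch_2, ch_3, ch_4)  # 16 20 9 3
```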


Now that we have all 4 new values, we can use a table of allowable base64 characters to pick from, since each value will be in the range 0 to 63. If the final group has fewer than 3 bytes, we add a padding character (=) in each output position that has no input data behind it.
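The lookup step itself is tiny. A sketch, assuming the standard alphabet and the four values 16, 20, 9 and 3 worked out for "ABC" (B64_TABLE and values are illustrative names of mine):

```python
import string

# Standard base64 alphabet; a 6-bit value is an index into this string.
B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# The four 6-bit values from "ABC": 010000, 010100, 001001, 000011.
values = [0b010000, 0b010100, 0b001001, 0b000011]

encoded = "".join(B64_TABLE[v] for v in values)
print(encoded)  # QUJD
```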


original  base64
ABC  QUJD
Now if we base64 encode ABCD we will end up with QUJDRA==.
This is because encoding and decoding always work on groups of 24 bits. Adding the new character D bumps our bit count from 24 to 32, and 32 bits (4 bytes) is not a multiple of 24 bits (3 bytes), which base64 requires.
4 % 3 = 1
One byte is left over after the first full group. Since we want groups of 3 bytes, or 24 bits, that leftover byte is filled out with 2 padding bytes, giving 6 bytes in total, which IS divisible by 3. The two padded positions appear in the output as the padding character (=), hence the trailing ==.
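The padding arithmetic can be sketched as a small helper and checked against Python's standard base64 module. The function name padding_needed is my own and not part of any library.

```python
import base64


def padding_needed(num_bytes: int) -> int:
    """Number of '=' characters needed for an input of num_bytes bytes."""
    leftover = num_bytes % 3      # bytes that do not fill a 3-byte group
    return (3 - leftover) % 3     # 0 when the input is already a multiple of 3


for data in (b"ABC", b"ABCD"):
    encoded = base64.b64encode(data).decode("ascii")
    print(data, padding_needed(len(data)), encoded)
# b'ABC' 0 QUJD
# b'ABCD' 2 QUJDRA==
```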
Readable base64 encode/decode in python
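Here is a minimal sketch of what a readable encoder and decoder could look like, built only from the steps described above. The names b64_encode, b64_decode, B64_TABLE and PAD are my own, and the decoder assumes well-formed input (length a multiple of 4, standard alphabet, = only at the end). In real code, Python's built-in base64 module is the better choice.

```python
import string

B64_TABLE = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
PAD = "="


def b64_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)                 # 0, 1 or 2 bytes short
        b1, b2, b3 = chunk + b"\x00" * missing   # zero-fill a short final group
        values = [
            b1 >> 2,                             # 6 bits from the first byte
            ((b1 & 0b11) << 4) | (b2 >> 4),      # 2 from first + 4 from second
            ((b2 & 0b1111) << 2) | (b3 >> 6),    # 4 from second + 2 from third
            b3 & 0b111111,                       # 6 bits from the third byte
        ]
        chars = [B64_TABLE[v] for v in values]
        if missing:
            # Positions with no real input behind them become '='.
            chars[4 - missing:] = [PAD] * missing
        out.extend(chars)
    return "".join(out)


def b64_decode(text: str) -> bytes:
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        pad = group.count(PAD)
        # Treat '=' as value 0; the bytes it produces are trimmed below.
        c1, c2, c3, c4 = (0 if ch == PAD else B64_TABLE.index(ch) for ch in group)
        triple = bytes([
            (c1 << 2) | (c2 >> 4),
            ((c2 & 0b1111) << 4) | (c3 >> 2),
            ((c3 & 0b11) << 6) | c4,
        ])
        out += triple[:3 - pad]
    return bytes(out)


print(b64_encode(b"ABC"))      # QUJD
print(b64_encode(b"ABCD"))     # QUJDRA==
print(b64_decode("QUJDRA=="))  # b'ABCD'
```

Using B64_TABLE.index for decoding is a linear scan per character. That keeps the sketch readable, but a dict mapping each character to its value would be faster for large inputs.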


Yields the following output:

