Bill Gates Legend: 40 Years Since BASIC for 6502

Let’s first go back in time and see what Bill Gates was doing in his early days. Here is a brief account from Wikipedia:

After Gates read the January 1975 issue of Popular Electronics, which demonstrated the Altair 8800, he contacted Micro Instrumentation and Telemetry Systems (MITS), the creators of the new microcomputer, to inform them that he and others were working on a BASIC interpreter for the platform. In reality, Gates and Allen did not have an Altair and had not written code for it; they merely wanted to gauge MITS’s interest. MITS president Ed Roberts agreed to meet them for a demo, and over the course of a few weeks they developed an Altair emulator that ran on a minicomputer, and then the BASIC interpreter. The demonstration, held at MITS’s offices in Albuquerque, was a success and resulted in a deal with MITS to distribute the interpreter as Altair BASIC. Paul Allen was hired into MITS, and Gates took a leave of absence from Harvard to work with Allen at MITS in Albuquerque in November 1975. They named their partnership “Micro-Soft” and had their first office located in Albuquerque. Within a year, the hyphen was dropped, and on November 26, 1976, the trade name “Microsoft” was registered with the Office of the Secretary of the State of New Mexico. Gates never returned to Harvard to complete his studies.

Microsoft BASIC for 6502 was based on Altair BASIC, which Bill Gates and Paul Allen famously created in 1975 for the MITS Altair 8800 (which used Intel’s 8080 CPU). That same year, MOS Technology created the 6502 microprocessor as a cheaper alternative to other microprocessors of the day. The 6502 and its close variants would eventually be used in a number of popular computers, such as the Apple I, the Apple II, and the Commodore VIC-20 and 64 systems, as well as gaming consoles such as the Atari 2600.

For hobbyists, here is one of the machines that was built around the 6502 microprocessor and ran Microsoft BASIC.

Here are some characteristics:

  • OS: Commodore BASIC 1.0
  • CPU: 6502 @ 1 MHz
  • RAM: 4 KB (early version), then 8 KB
  • VRAM: 1 KB
  • ROM: 14 KB
  • Text modes: 40 x 25
  • Graphic modes: none

What can you do with 4 KB of RAM, when today we need a minimum of 4 GB? Was it possible to write software with only 4 KB of RAM?

Yes, it was, and here is the proof: the oldest publicly available piece of source code written by Bill Gates, dating from 1978.

Let’s discover some facts about this piece of code. First, let’s generate a word cloud from the source code comments to find out what the developers’ main preoccupations were:

String, Pointer, Variable, Value and Space are the most frequent words, which suggests that memory management was a major concern on this kind of machine. The code is optimized to make the most of the available RAM. Another interesting observation is that the word Garbage appears many times: garbage collection was very important for optimizing memory usage.

Here is how strings are managed, according to the source code comments:

STRINGS
		IN THE VARIABLE TABLES STRINGS ARE STORED JUST LIKE
		NUMERIC VARIABLES. SIMPLE STRINGS HAVE THREE VALUE
		BYTES WHICH ARE INITIALIZED TO ALL ZEROS (WHICH
		REPRESENTS THE NULL STRING). THE ONLY DIFFERENCE
		IN HANDLING IS THAT WHEN "PTRGET" SEES A "$" AFTER THE
		NAME OF A VARIABLE, "PTRGET" SETS [VALTYP]
		TO NEGATIVE ONE AND TURNS
		ON THE MSB (MOST-SIGNIFIGANT-BIT) OF THE VALUE OF
		THE FIRST CHARACTER OF THE VARIABLE NAME.
		HAVING THIS BIT ON IN THE NAME OF THE VARIABLE ENSURES
		THAT THE SEARCH ROUTINE WILL NOT MATCH
		'A' WITH 'A$' OR 'A$' WITH 'A'. THE MEANING OF
		THE THREE VALUE BYTES ARE:
			LOW
				LENGTH OF THE STRING
				LOW 8 BITS
				HIGH 8 BITS  OF THE ADDRESS
					OF THE CHARACTERS IN THE
					STRING IF LENGTH.NE.0.
					MEANINGLESS OTHERWISE.
			HIGH
		THE VALUE OF A STRING VARIABLE (THESE 3 BYTES)
		IS CALLED THE STRING DESCRIPTOR TO DISTINGUISH
		IT FROM THE ACTUAL STRING DATA. WHENEVER A
		STRING CONSTANT IS ENCOUNTERED IN A FORMULA OR AS
		PART OF AN INPUT STRING, OR AS PART OF DATA, "STRLIT"
		IS CALLED, CAUSING A DESCRIPTOR TO BE BUILT FOR
		THE STRING. WHEN ASSIGNMENT IS MADE TO A STRING POINTING INTO
		"BUF" THE VALUE IS COPIED INTO STRING SPACE SINCE [BUF]
		IS ALWAYS CHANGING.
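
To make the comment above more concrete, here is a minimal sketch in Python of the two ideas it describes: the three-byte string descriptor (length, low address byte, high address byte) and the MSB flag that keeps 'A' and 'A$' apart. This is not the original 6502 assembly; the function names `name_key`, `make_descriptor` and `read_string`, and the 4 KB `memory` array, are hypothetical and chosen only for illustration. The flag is placed on the first character because that is what the quoted comment says.

```python
MSB = 0x80  # most-significant bit of a single byte


def name_key(name: str) -> bytes:
    """Build a search key for a variable name (hypothetical helper).

    A trailing '$' marks a string variable: the '$' is dropped and the
    MSB of the first character is turned on, as the quoted comment says.
    """
    is_string = name.endswith("$")
    letters = name.rstrip("$").encode("ascii")
    first = letters[0] | MSB if is_string else letters[0]
    return bytes([first]) + letters[1:]


def make_descriptor(length: int, address: int) -> bytes:
    """Pack the three value bytes: length, low 8 bits, high 8 bits."""
    if not 0 <= length <= 0xFF:
        raise ValueError("length must fit in one byte")
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return bytes([length, address & 0xFF, address >> 8])


def read_string(memory: bytes, descriptor: bytes) -> bytes:
    """Follow a descriptor to the characters it points at."""
    length, low, high = descriptor
    if length == 0:
        return b""  # null string: the address bytes are meaningless
    address = low | (high << 8)
    return bytes(memory[address:address + length])


# Assumed setup: a 4 KB memory image with "HELLO" stored at 0x0F00.
memory = bytearray(4 * 1024)
memory[0x0F00:0x0F05] = b"HELLO"
descriptor = make_descriptor(5, 0x0F00)

print(descriptor.hex())
print(read_string(memory, descriptor))
print(name_key("A") != name_key("A$"))
```

Because the descriptor has a fixed size of three bytes, a string variable takes the same room in the variable table regardless of how long its text is, and only the characters themselves live in string space.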

And here is how errors are handled:

	ERROR MESSAGES
		WHEN AN ERROR CONDITION IS DETECTED,
		[ACCX] MUST BE SET UP TO INDICATE WHICH ERROR
		MESSAGE IS APPROPRIATE AND A BRANCH MUST BE MADE
		TO "ERROR". THE STACK WILL BE RESET AND ALL
		PROGRAM CONTEXT WILL BE LOST. VARIABLES
		VALUES AND THE ACTUAL PROGRAM REMAIN INTACT.
		ONLY THE VALUE OF [ACCX] IS IMPORTANT WHEN
		THE BRANCH IS MADE TO ERROR. [ACCX] IS USED AS AN
		INDEX INTO "ERRTAB" WHICH GIVES THE TWO
		CHARACTER ERROR MESSAGE THAT WILL BE PRINTED ON THE
		USER'S TERMINAL.
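
The error mechanism described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contents and order of `ERRTAB` here are placeholders, `accx` is treated as a byte offset into the table, and the `Interpreter` class and the "?XX ERROR" output format are hypothetical stand-ins rather than the original code.

```python
# Placeholder table: three two-character codes packed back to back.
# The real table's contents and order are not reproduced here.
ERRTAB = b"NFSNOM"


def error_message(accx: int) -> str:
    """Return the two-character code found at offset accx in ERRTAB."""
    if accx < 0 or accx + 2 > len(ERRTAB):
        raise IndexError("accx does not point at a full two-character code")
    return ERRTAB[accx:accx + 2].decode("ascii")


class Interpreter:
    """Tiny stand-in for interpreter state (hypothetical)."""

    def __init__(self) -> None:
        self.stack = []        # program context (for example, return points)
        self.variables = {}    # variable values
        self.program = {}      # line number -> source text

    def error(self, accx: int) -> str:
        """Model the branch to ERROR: only accx matters.

        The stack is reset, so program context is lost, while the
        variables and the program itself are left untouched.
        """
        code = error_message(accx)
        self.stack.clear()
        return f"?{code} ERROR"


interp = Interpreter()
interp.program[10] = "PRINT A"
interp.variables["A"] = 42
interp.stack.append(10)

print(interp.error(2))
print(interp.stack, interp.variables, interp.program)
```

Storing each message as two characters keeps the table tiny, which fits the memory constraints discussed earlier: every error costs two bytes of table space plus the index passed in `accx`.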

What is interesting in this source code is how clearly it explains many aspects of the program. There is no need to search elsewhere to understand what the code does.

Some facts about the code quality:

  • The naming is easy to understand

When exploring the source code, you don’t find variable names such as a, b or x, as you do in some recently developed projects. The names are well chosen and well commented.

  • The code is split into many small subroutines

The 6502 assembly language is very low level, so to make the code easier to understand and maintain, the “Divide and Conquer” principle is applied: the code is split into many small subroutines, which makes them easy to read and maintain. Here’s an example of a small subroutine defined in the source code:

After exploring the code, in my opinion, Bill Gates was a genius developer as well as a good businessman. It is no surprise that he became one of the most influential figures in the computer world, like Steve Jobs and a few other legends.