Let’s first go back in time and see what Bill Gates was doing in his early days. Here is a brief account from Wikipedia:
After Gates read the January 1975 issue of Popular Electronics, which demonstrated the Altair 8800, he contacted Micro Instrumentation and Telemetry Systems (MITS), the creators of the new microcomputer, to inform them that he and others were working on a BASIC interpreter for the platform. In reality, Gates and Allen did not have an Altair and had not written code for it; they merely wanted to gauge MITS’s interest. MITS president Ed Roberts agreed to meet them for a demo, and over the course of a few weeks they developed an Altair emulator that ran on a minicomputer, and then the BASIC interpreter. The demonstration, held at MITS’s offices in Albuquerque, was a success and resulted in a deal with MITS to distribute the interpreter as Altair BASIC. Paul Allen was hired into MITS, and Gates took a leave of absence from Harvard to work with Allen at MITS in Albuquerque in November 1975. They named their partnership “Micro-Soft” and had their first office located in Albuquerque. Within a year, the hyphen was dropped, and on November 26, 1976, the trade name “Microsoft” was registered with the Office of the Secretary of the State of New Mexico. Gates never returned to Harvard to complete his studies.
Microsoft BASIC for 6502 was based on Altair BASIC, which Bill Gates and Paul Allen famously created for the MITS Altair 8800 (which used Intel’s 8080 CPU) in 1975, writing it in a motel in Albuquerque, New Mexico. That same year, MOS Technology created the 6502 microprocessor as a cheaper alternative to other microprocessors of the day. The 6502 would eventually be used in a number of popular computers, such as the Apple I, the Apple II, Commodore VIC-20 and 64 systems, as well as gaming consoles such as the Atari 2600.
For hobbyists, here’s one of the machines that used the 6502 microprocessor and ran Microsoft BASIC.
Here are some characteristics:
- OS: Commodore BASIC 1.0
- CPU: 6502 @ 1 MHz
- RAM: 4 KB (early version), then 8 KB
- VRAM: 1 KB
- ROM: 14 KB
- Text modes: 40 x 25
- Graphic modes: none
What can you do with 4 KB of RAM, when today’s machines need at least 4 GB? Was it possible to write software with only 4 KB of RAM?
Yes, it was, and here’s the proof: the oldest publicly available piece of source code written by Bill Gates, dating from 1978.
Let’s discover some facts about this piece of code. First, let’s generate a word cloud from the comments in the source code to discover what the developers’ main preoccupations were:
String, Pointer, Variable, Value, and Space are the most frequent words, which suggests that memory management was a major concern on this kind of machine. The code is optimized to make the most of the available RAM. Another interesting point is the word Garbage, which appears many times: garbage collection, that is, reclaiming string space that is no longer in use, was important for keeping memory usage low.
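For readers who want to reproduce this kind of count, here is a minimal Python sketch of the idea behind the word cloud. It assumes comments start at the first `;` on a line, which is the usual assembler convention. The function name `comment_word_counts` and the three sample lines are made up for illustration and are not taken from the real listing.

```python
import re
from collections import Counter

def comment_word_counts(source_text, min_length=4):
    """Count the words that appear in ';' comments of an assembly listing."""
    counts = Counter()
    for line in source_text.splitlines():
        # Assumption: everything after the first ';' is a comment.
        _, sep, comment = line.partition(";")
        if not sep:
            continue  # no comment on this line
        for word in re.findall(r"[A-Za-z]+", comment):
            if len(word) >= min_length:  # skip short filler words
                counts[word.upper()] += 1
    return counts

# Made-up sample lines, only here to exercise the function.
sample = """\
        LDA PTR         ; GET POINTER TO STRING
        STA TMP         ; SAVE POINTER TO VARIABLE
        JSR COLLECT     ; GARBAGE COLLECT STRING SPACE
"""

counts = comment_word_counts(sample)
print(counts.most_common(3))
```

Feeding the full source file to the same function and passing the counts to any word-cloud tool should give a picture similar to the one described above.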
Here is how strings are managed, according to the source code comments:
STRINGS
IN THE VARIABLE TABLES STRINGS ARE STORED JUST LIKE NUMERIC VARIABLES. SIMPLE STRINGS HAVE THREE VALUE BYTES WHICH ARE INITIALIZED TO ALL ZEROS (WHICH REPRESENTS THE NULL STRING). THE ONLY DIFFERENCE IN HANDLING IS THAT WHEN "PTRGET" SEES A "$" AFTER THE NAME OF A VARIABLE, "PTRGET" SETS [VALTYP] TO NEGATIVE ONE AND TURNS ON THE MSB (MOST-SIGNIFIGANT-BIT) OF THE VALUE OF THE FIRST CHARACTER OF THE VARIABLE NAME. HAVING THIS BIT ON IN THE NAME OF THE VARIABLE ENSURES THAT THE SEARCH ROUTINE WILL NOT MATCH 'A' WITH 'A$' OR 'A$' WITH 'A'.
THE MEANING OF THE THREE VALUE BYTES ARE:
LOW
    LENGTH OF THE STRING
    LOW 8 BITS   OF THE ADDRESS OF THE CHARACTERS IN THE
    HIGH 8 BITS  STRING IF LENGTH.NE.0. MEANINGLESS OTHERWISE.
HIGH
THE VALUE OF A STRING VARIABLE (THESE 3 BYTES) IS CALLED THE STRING DESCRIPTOR TO DISTINGUISH IT FROM THE ACTUAL STRING DATA. WHENEVER A STRING CONSTANT IS ENCOUNTERED IN A FORMULA OR AS PART OF AN INPUT STRING, OR AS PART OF DATA, "STRLIT" IS CALLED, CAUSING A DESCRIPTOR TO BE BUILT FOR THE STRING. WHEN ASSIGNMENT IS MADE TO A STRING POINTING INTO "BUF" THE VALUE IS COPIED INTO STRING SPACE SINCE [BUF] IS ALWAYS CHANGING.
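To make the comment above more concrete, here is a small Python sketch of a three-byte string descriptor and of the name-tagging trick. It is only a model of what the comment describes, not the interpreter’s actual code: the helper names `tag_string_name`, `make_descriptor` and `read_string` are hypothetical, the 64-byte `memory` array is a toy, and the byte order (length first, then the low and high address bytes) is my reading of the LOW/HIGH layout in the comment.

```python
import struct

STRING_FLAG = 0x80  # most significant bit of a byte

def tag_string_name(name):
    """Return the stored name bytes; names ending in '$' get the MSB
    of their first character turned on, as the comment describes."""
    is_string = name.endswith("$")
    raw = bytearray(name.rstrip("$").encode("ascii"))
    if is_string:
        raw[0] |= STRING_FLAG
    return bytes(raw)

def make_descriptor(length, address):
    """Pack a descriptor: 1 length byte, then the 16-bit address, low byte first."""
    return struct.pack("<BH", length, address)

def read_string(memory, descriptor):
    """Follow a descriptor to the characters it points at."""
    length, address = struct.unpack("<BH", descriptor)
    return bytes(memory[address:address + length])

# Toy memory with the characters of one string stored at address 16.
memory = bytearray(64)
memory[16:21] = b"HELLO"
desc = make_descriptor(5, 16)

print(read_string(memory, desc))
print(tag_string_name("A"), tag_string_name("A$"))
```

Because the tagged first byte of `A$` differs from the plain `A`, a byte-by-byte name search cannot confuse the two variables, and an all-zero descriptor naturally reads back as the empty string.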
And here is how errors are handled:
ERROR MESSAGES
WHEN AN ERROR CONDITION IS DETECTED, [ACCX] MUST BE SET UP TO INDICATE WHICH ERROR MESSAGE IS APPROPRIATE AND A BRANCH MUST BE MADE TO "ERROR". THE STACK WILL BE RESET AND ALL PROGRAM CONTEXT WILL BE LOST. VARIABLES VALUES AND THE ACTUAL PROGRAM REMAIN INTACT. ONLY THE VALUE OF [ACCX] IS IMPORTANT WHEN THE BRANCH IS MADE TO ERROR. [ACCX] IS USED AS AN INDEX INTO "ERRTAB" WHICH GIVES THE TWO CHARACTER ERROR MESSAGE THAT WILL BE PRINTED ON THE USER'S TERMINAL.
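The error scheme described in that comment can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `Interpreter` class is hypothetical, the four codes packed into `ERRTAB` are illustrative rather than the real table contents, the index is treated as a byte offset into the table, and the `?XX ERROR` output format is my assumption about how the two characters are shown.

```python
# Illustrative two-character codes packed back to back, two bytes per entry.
ERRTAB = b"NFSNRGOD"

class Interpreter:
    """Toy model of the state that the error routine touches."""

    def __init__(self):
        self.stack = []      # program context (pending loops, calls, ...)
        self.variables = {}  # survives an error
        self.program = {}    # survives an error
        self.output = []     # lines "printed" on the terminal

    def error(self, accx):
        # accx plays the role of [ACCX]: an offset into ERRTAB (assumption).
        code = ERRTAB[accx:accx + 2].decode("ascii")
        self.stack.clear()                    # the stack is reset
        self.output.append(f"?{code} ERROR")  # assumed message format

interp = Interpreter()
interp.stack = ["pending loop", "pending call"]
interp.variables = {"A": 1}
interp.program = {10: 'PRINT "HI"'}

interp.error(2)
print(interp.output[-1], interp.stack, interp.variables)
```

Storing only two characters per error keeps the table tiny, which fits the memory constraints discussed earlier, while the single index register is all the caller has to set up before branching.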
What’s interesting in this source code is how clearly the comments explain many aspects of the program. There is no need to look elsewhere to understand what the code does.
Some facts about the code quality:
- The naming is easy to understand
When exploring the source code, you don’t find variable names such as a, b or x, as you do in some recent projects. The names are well chosen and well commented.
- The code is split into many small subroutines
6502 assembly language is very low level, so to make the code easier to understand and maintain, the “Divide and Conquer” principle is applied: the code is split into many small subroutines, which makes it easy to read and maintain. Here’s an example of a small subroutine from the source code:
After exploring the code, my opinion is that Bill Gates was a genius developer as well as a good businessman. It’s no surprise that he became one of the most influential people in the computer world, alongside Steve Jobs and a few other legends.