Thought experiment: Bare metal to interpreter

andi
Posts: 5
Joined: Wed Jan 30, 2019 8:57 pm

Thought experiment: Bare metal to interpreter

Post by andi »

Say you want to build an 8 (12 if it's easier) bit computer from scratch.

Assume most of the hardware is built (registers, ALU, bus, RAM, EEPROM, etc). However, there is no I/O. You have to write instructions into RAM by hand.

What opcodes should you implement in hardware? Then, using only those opcodes, what form of I/O hardware could you implement to allow a user to (a) view what’s at a specific memory location and (b) set the contents of a memory location to a user-specified value (no cheating w/ an Arduino or ATTiny)?
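To make the scale of the question concrete, here is roughly the size of instruction set I have in mind, sketched in C (every mnemonic, encoding, and the little demo program are made up on the spot; nothing about them is meant to be the answer):

#include <stdio.h>
#include <stdint.h>

/* Hypothetical 8-bit accumulator machine: each instruction is one byte,
 * a 4-bit opcode plus a 4-bit operand. Nothing here is fixed; it is
 * only the rough scale of ISA I have in mind. */
enum opcode {
    OP_NOP = 0x0, /* do nothing */
    OP_LDA = 0x1, /* load accumulator from memory[operand] */
    OP_ADD = 0x2, /* accumulator += memory[operand] */
    OP_SUB = 0x3, /* accumulator -= memory[operand] */
    OP_STA = 0x4, /* memory[operand] = accumulator */
    OP_LDI = 0x5, /* load 4-bit immediate into accumulator */
    OP_JMP = 0x6, /* jump to operand */
    OP_JZ  = 0x7, /* jump to operand if accumulator == 0 */
    OP_OUT = 0xE, /* latch accumulator onto the output port */
    OP_HLT = 0xF  /* stop the clock */
};

#define INSN(op, arg) ((uint8_t)(((op) << 4) | ((arg) & 0x0F)))

int main(void)
{
    /* "output memory[14] + memory[15]", hand-assembled into RAM */
    uint8_t ram[16] = {
        INSN(OP_LDA, 14),
        INSN(OP_ADD, 15),
        INSN(OP_OUT, 0),
        INSN(OP_HLT, 0),
        [14] = 28, [15] = 14,
    };
    for (int i = 0; i < 4; i++)
        printf("%02d: 0x%02X\n", i, ram[i]);
    return 0;
}

With something along those lines, (a) would just be LDA + OUT and (b) IN + STA, plus whatever switch register or serial latch the I/O hardware exposes.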

Once you can read/set memory using the computer, your next goal is to implement a higher level language, again using the opcodes you have at hand.

Alternatively: using your existing custom opcodes, write a separate program to communicate with another microprocessor, like an ATMega328, and send it firmware over SPI (or the like) that implements an I/O routine. This gives you a more powerful architecture (AVR) to work with, and should eliminate most hardware constraints, since modern microcontrollers seem like they should be more than capable of handling this. But remember, you can't have any existing firmware on it, except maybe a bootloader to let it accept new firmware over the wire.

I would guess the first step would be to make a simple assembler: write ASCII into memory via your I/O mechanism, then have a small program (built from the opcodes you defined) convert that ASCII into machine code.
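The conversion routine itself wouldn't need to be much more than a table lookup. As a model of the logic only (written in C with made-up mnemonics and encodings; on the real machine this would have to be keyed in as raw machine code first), it might look like:

#include <stdio.h>
#include <string.h>

/* Model of a one-pass "assembler": read "MNEMONIC operand" lines that
 * were typed into memory as ASCII and emit one machine-code byte each.
 * The mnemonics and encoding are placeholders, not a real ISA. */
struct { const char *name; unsigned char opcode; } table[] = {
    { "LDA", 0x10 }, { "ADD", 0x20 }, { "STA", 0x40 },
    { "JMP", 0x60 }, { "OUT", 0xE0 }, { "HLT", 0xF0 },
};

int assemble_line(const char *line, unsigned char *out)
{
    char mnem[8];
    unsigned operand = 0;
    if (sscanf(line, "%7s %u", mnem, &operand) < 1)
        return -1;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(mnem, table[i].name) == 0) {
            *out = table[i].opcode | (operand & 0x0F);
            return 0;
        }
    }
    return -1; /* unknown mnemonic */
}

int main(void)
{
    unsigned char byte;
    if (assemble_line("ADD 15", &byte) == 0)
        printf("0x%02X\n", byte); /* prints 0x2F */
    return 0;
}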

Armed with an assembler, you could begin to write a miniature compiler. Question here is what language should your compiler target? I’m thinking maybe Forth or Lisp, but even those are a ton of effort. I read that apparently C was originally written in BCPL but that in itself was influenced by ALGOL….basically, by the time C was made, people had long evolved past writing direct assembly, and were rather heading down the path of using existing compilers to make better compilers. We don’t have that luxury though; we need to jump straight from assembly to compiler. So maybe we should also make a micro-language that is just a few steps above ASM? I feel like this path has already been walked, but it was last walked several decades ago.

The ideal end goal would be to have a keyboard/display setup with a tiny REPL, running Lisp or one of its variants. With Lisp it would be very easy to implement everything else that’s missing in order to make this a “real” general purpose computer.

The biggest milestones, in order, would be (a) figuring out some way to implement I/O with the fewest opcodes possible, (b) hand-writing machine code for a first assembler that turns assembly text into real machine code, (c) writing a compiler in assembly that is at the very least capable of generating better compilers, and (d) evolving those compilers until they can build a Lisp interpreter, and hooking that up to your I/O setup.

It’s definitely possible, since people obviously managed to do it (or else we wouldn’t be here), but the difficulty would be doing it without millions of man hours and sitting on the shoulders of giants. My big bet is that we can offset that effort using what we know now (and what we’ve learned from the past) to avoid all the time spent experimenting and using it to beeline from nothing to a working solution.

Essentially what I'm trying to do in the broadest sense is simulate the advancement of computer technology from the very beginning. Way before we had GUIs, keyboards, interpreters, compilers, hell, even punchcards.

How could we get from there to a little closer to now?

How would you do it?
tabz
Member
Posts: 35
Joined: Fri Apr 20, 2018 9:15 am
Location: Cambridge, UK

Re: Thought experiment: Bare metal to interpreter

Post by tabz »

andi wrote:Say you want to build an 8 (12 if it's easier) bit computer from scratch.

Assume most of the hardware is built (registers, ALU, bus, RAM, EEPROM, etc). However, there is no I/O. You have to write instructions into RAM by hand.

What opcodes should you implement in hardware?
Are you essentially saying that all the hardware is implemented but the instruction decode, so we'd have to design our own ISA and instruction decode hardware?
andi
Posts: 5
Joined: Wed Jan 30, 2019 8:57 pm

Re: Thought experiment: Bare metal to interpreter

Post by andi »

tabz wrote:Are you essentially saying that all the hardware is implemented but the instruction decode, so we'd have to design our own ISA and instruction decode hardware?
Yes, more or less. You get full authority over the microcode and corresponding opcodes.
iansjack
Member
Posts: 4703
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Thought experiment: Bare metal to interpreter

Post by iansjack »

Is this a coursework assignment? It reads awfully like one.
andi
Posts: 5
Joined: Wed Jan 30, 2019 8:57 pm

Re: Thought experiment: Bare metal to interpreter

Post by andi »

Not at all. I am in a computer architecture class, but this is really just a question I came up with myself because I'm interested in the evolution of higher-level languages. I've mainly been hooked by this series by Ben Eater: https://www.youtube.com/channel/UCS0N5b ... UrhCEo8WlA and have been trying to think about how I could expand it.
Korona
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Thought experiment: Bare metal to interpreter

Post by Korona »

I will not comment on the technical issues, but let me comment on this:
andi wrote:It’s definitely possible, since people obviously managed to do it (or else we wouldn’t be here), but the difficulty would be doing it without millions of man hours and sitting on the shoulders of giants. My big bet is that we can offset that effort using what we know now (and what we’ve learned from the past) to avoid all the time spent experimenting and using it to beeline from nothing to a working solution.
This is a questionable statement. Today, information about high level languages (and also x86 implementation details) is broadly available on the internet. Information on how to efficiently deal with raw machine code - probably not so much. You have to relearn this "lost knowledge" from older resources (e.g. papers published at that time); this will inevitably require additional effort.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
andi
Posts: 5
Joined: Wed Jan 30, 2019 8:57 pm

Re: Thought experiment: Bare metal to interpreter

Post by andi »

This is a fair point. I'm not looking for code, of course, just a general idea of what could work if you had enough time and resources to do it yourself.
eekee
Member
Posts: 891
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Thought experiment: Bare metal to interpreter

Post by eekee »

andi wrote:You have to write instructions into RAM by hand.
This is indeed how early computers were started, but you may be assuming too much about the necessary steps. I'm not sure they even had microcode at the time. I don't know too much about that time, but I have gathered that the normal process involved "hand-assembling": Write your assembly language on paper, perhaps with one element per line (operation or operand), then write the octal code next to it.

Octal? There's no point using hexadecimal on a 17- or 23-bit computer, :) and octal is easier to decode. You don't really want to write the extra instructions to convert hex when you're entering instructions by toggling front panel switches. (I assume. I've never actually done it. :) ) Why the odd and large numbers of bits? I've tried to design an 8-bit CPU a couple of times, and found that just trying to cram the design into so few bits makes a lot of work. You need experience and tricks to do it. Then there's the addressing issue: for single-typed languages (i.e. simple ones), you really want your word size to match your address width.
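To spell the "easier to decode" point out: each octal digit is exactly three switches, so if your instruction fields are three bits wide you can read them straight off the panel. A throwaway illustration (the 12-bit layout is invented):

#include <stdio.h>

int main(void)
{
    /* Imaginary 12-bit instruction word: 3-bit opcode, 9-bit address.
     * In octal every digit is exactly three bits, so the fields can be
     * read straight off the front-panel switches; in hex the 3-bit
     * opcode smears across a digit boundary. */
    unsigned word = (05 << 9) | 0123;      /* opcode 5, address 0123 */
    printf("octal: %04o  hex: %03X\n", word, word);
    /* octal: 5123 -- opcode and address visible as-is.
     * hex:   A53  -- you have to do bit arithmetic in your head. */
    return 0;
}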
andi wrote:Question here is what language should your compiler target? I’m thinking maybe Forth or Lisp, but even those are a ton of effort.
No... :) Download JonesForth and have a look at it. It's two files, one assembly language, the other Forth, with an extensive tutorial in the comments. If I remember right, it's about 2,000 lines *including* the tutorial. It's not a minimal Forth, and it's not even a compiler; it's an interpreter already. Forth is an astonishingly powerful and compact language: you can write everything from the very lowest level to the very highest in the same language, building higher and higher level constructs in the language itself. (JonesForth itself targets Linux, but plenty of interpreted Forths run on bare hardware.) Of course, at the lowest level you will need inline assembly language, but not as much as you might think. You can write your scheduler in Forth itself -- interpreted!
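If it helps to see how little machinery the idea needs, here's a toy model in C of "words defined in terms of other words". It is not how JonesForth is actually built (that's assembly plus a proper threaded inner interpreter and a dictionary you can extend at run time); it only shows the compositional trick:

#include <stdio.h>

/* Toy model of a Forth-style dictionary: a word is either a C primitive
 * or a sequence of previously defined words. Only meant to show how
 * higher-level words are composed from lower-level ones. */

int stack[64], sp = 0;                   /* data stack */
void push(int v) { stack[sp++] = v; }
int  pop(void)   { return stack[--sp]; }

struct word {
    void (*prim)(void);                  /* primitive ("code" word), or */
    struct word **body;                  /* NULL-terminated definition  */
};

void execute(struct word *w)
{
    if (w->prim) { w->prim(); return; }
    for (struct word **p = w->body; *p; p++)
        execute(*p);
}

/* A few primitives, the kind you would hand-code in machine code. */
void p_dup(void) { int a = pop(); push(a); push(a); }
void p_mul(void) { push(pop() * pop()); }
void p_dot(void) { printf("%d ", pop()); }

struct word DUP = { p_dup, NULL };
struct word MUL = { p_mul, NULL };
struct word DOT = { p_dot, NULL };

/* : SQUARE  DUP * ;   and   : CUBE  DUP SQUARE * ;  */
struct word *square_body[] = { &DUP, &MUL, NULL };
struct word SQUARE = { NULL, square_body };
struct word *cube_body[] = { &DUP, &SQUARE, &MUL, NULL };
struct word CUBE = { NULL, cube_body };

int main(void)
{
    push(5);
    execute(&CUBE);   /* 5 CUBE . */
    execute(&DOT);    /* prints 125 */
    printf("\n");
    return 0;
}

In a real Forth the body entries live in the dictionary and an instruction pointer walks them instead of C recursion, but the layering is the same.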

Lisp, on the other hand, is bloatware. :) You need a lot of assembly language (or C) to get a Lisp system up, despite its simple syntax.

Note: an 8-bit Forth isn't really minimal. In fact, there's no such thing. On 8-bit hardware, the Forth is still a 16-bit language because its core is single-typed. You make other types with the language itself.
andi wrote:I read that apparently C was originally written in BCPL but that in itself was influenced by ALGOL….basically, by the time C was made, people had long evolved past writing direct assembly, and were rather heading down the path of using existing compilers to make better compilers.
Not quite and definitely no. :)

Algol influenced many languages in the 60s; BCPL was just one of them. Ken Thompson apparently didn't like BCPL's syntax, so he made a language with a minimal syntax called B. B was interpreted and had only one type. As the guys in the lab used it, they realised other types would be desirable. Dennis Ritchie implemented this, and he also wrote a (very slow) compiler, calling the new language C. It had no type checking! :) I don't know where or when type checking was first implemented, but ANSI forced it into the language very much against the wishes of Ken and Dennis :shock: :lol: . (I might be wrong, that might not have been one of the parts they argued about.) Eventually, they gave up fighting the standard. After that, they wrote Plan 9 exclusively using type-checked C! They also wrote very much faster compilers, but those didn't get open-sourced until 2000, so almost nobody uses them.

The "definitely no" part concerns the abandonment of assembly language. Many working programmers were programming for low-end hardware. I think they became the bulk of the programming work force in the 70s. They stuck with assembly language because no compiled language they could afford could possibly compete on performance, and performance was critical on that hardware. The first language to get into this segment was C, which became the most popular language and remained so until C++ took over in... 2007 I think.

The majority of programmers don't work in the fields of programming you hear about, they work in corporations making software for the corporation's own use. All the hype that's come and gone about Java and everything else has had relatively little impact on that field until this decade, when even embedded hardware became powerful enough to run comparatively bloated and slow languages like Python. ;)


I haven't answered the original question. I don't want to answer it exactly, because designing an instruction set is quite a task on its own. Designing a *good* instruction set is up there with OS design. I've mostly been avoiding the question for this whole post. :lol:

I'll start with some really primitive hardware; 8-bit is cheap, not primitive. Let's have 14 bits at least, the same for data and address buses. There's no microcode; that won't be invented for another decade or two (I think). RISC hasn't been invented either, so the instruction set is technically CISC, but still simpler than such extravagant, luxurious designs as the 8086. :lol: It was designed by someone more experienced than myself. ;)

This machine isn't going to have a built-in screen because the hardware to drive it would be more complex than the computer itself. Instead, we have serial ports. A teletypewriter is connected to one of them. (We can't afford those flashy new "glass TTYs". Even if we could, we'd still need to print out our program listings so we could look at them properly.)

It's also got front-panel toggle switches, because interfacing with a serial port is non-trivial, especially when the data is coming intermittently from a TTY. There's no ROM to instruct the CPU to do that task.

So now, the program I personally would toggle into the front panel would be just enough to read octal from the serial port and write it into memory, if I type slowly enough. No buffer!
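Spelled out in C purely as a description of the logic (the memory-mapped serial registers are invented, uint16_t stands in for the 14-bit word, and the real thing would be a couple of dozen words toggled in by hand), that first loader is nothing more than:

#include <stdint.h>

/* The very first front-panel program, described in C only for clarity.
 * SERIAL_STATUS / SERIAL_DATA are invented memory-mapped register
 * addresses on the imaginary machine. */
#define SERIAL_STATUS (*(volatile uint8_t *)0x3FFE) /* bit 0: char ready */
#define SERIAL_DATA   (*(volatile uint8_t *)0x3FFF)

void loader(uint16_t *dest)
{
    uint16_t word = 0;
    for (;;) {
        while (!(SERIAL_STATUS & 1))
            ;                                /* wait for the next keypress */
        uint8_t c = SERIAL_DATA;
        if (c >= '0' && c <= '7')
            word = (word << 3) | (c - '0');  /* accumulate octal digits */
        else {
            *dest++ = word & 0x3FFF;         /* any other key: deposit word */
            word = 0;                        /* and start the next one */
        }
    }
}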

Typing in octal, I then input a better input program. It still accepts only octal, but lets me type faster and cancel a character or a line if I make a mistake. (Idea from Unix v6, 3rd Q here.) It still can't buffer more than one line, so now it prints a prompt to say, "your turn."

(At this point, I can very much see why they had punched-card readers. Typing into the tty could be replaced by inserting pre-punched cards. Mistakes could be fixed on the cards or new cards made. Carrying on without one.)

Now I write the primitives of a Forth interpreter. (Yes, I know it hasn't been invented yet. :P ) It's smaller than JonesForth but still a long slog, especially without macros. By the end of it, I can enter all characters and define what Forth calls 'code' words; somewhere between in-line assembler and the built-in functions of other languages. There is no assembler though; I'm so used to octal by this point that I just write octal straight into the definition. ;)

(Ever used colorForth? Charles Moore wrote hex straight into definitions. Worse, you'll sometimes see cyan hex code in the middle of regular (green) Forth definitions with no explanation! In fact, before colorForth he abandoned Forth and wrote an OS with GUI elements in raw hex. 8) It was called OKAD. colorForth was basically the scripting language for OKAD2, the kernel of which wasn't in Forth. Even in this fantasy, I'm not yet as experienced as Charles Moore. :) )

Now where am I? Oh yes, soon I can write normal Forth definitions, and can get on with the work of building up the system. I can even re-define words to add functionality or correct mistakes. What luxury!

I haven't mentioned permanent storage. I was thinking core memory, which is non-volatile, but considering the lack of any sort of memory protection it would be much better to have a tape drive. This is connected to another serial port, with some of what you might call GPIO to control the tape motor and switch between reading and writing. In the last phase above, I implement a tape driver and start recording the system onto tape with a rotating backup scheme. In future boots, I can 'just' toggle in a tape driver and load my previous work.

Finis

That was long enough without designing an instruction set and its encoding! I'm seriously tired now. I think next time I'll quote a couple of sentences and just reply, "Dunning-Kruger effect detected." :P
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
linguofreak
Member
Posts: 510
Joined: Wed Mar 09, 2011 3:55 am

Re: Thought experiment: Bare metal to interpreter

Post by linguofreak »

Note: I tried to post this last night, but the forum had become unreachable by the time I hit post. Some of what's here may already have been said by others.
andi wrote:Say you want to build an 8 (12 if it's easier) bit computer from scratch.

Assume most of the hardware is built (registers, ALU, bus, RAM, EEPROM, etc). However, there is no I/O. You have to write instructions into RAM by hand.

What opcodes should you implement in hardware? Then, using only those opcodes, what form of I/O hardware could you implement to allow a user to (a) view what’s at a specific memory location and (b) set the contents of a memory location to a user-specified value (no cheating w/ an Arduino or ATTiny)?
What was generally done on early computers was that a front panel was provided to load the initial boot code into memory without executing any code on the CPU; that code would then pull in larger programs from a punch card reader (typical on IBM machines), a paper tape reader (possibly attached to a teletype terminal, typical on DEC machines), or a disk, if your site could afford one. On the very, very earliest computers, code was often input by physically rewiring the machine.
andi wrote:Essentially what I'm trying to do in the broadest sense is simulate the advancement of computer technology from the very beginning. Way before we had GUIs, keyboards, interpreters, compilers, hell, even punchcards.
Punchcards actually predate computers, and other than front panels (which didn't operate entirely under CPU control) were probably the first I/O devices used on computers.
andi
Posts: 5
Joined: Wed Jan 30, 2019 8:57 pm

Re: Thought experiment: Bare metal to interpreter

Post by andi »

Thank you both for your insightful replies. The front panel is a good idea, and it seems like Forth is the way to go.

This is all very fascinating. Do you know where I can find literature from the era to better explain how people originally programmed computers?

I assume TTYs were entirely electromechanical, with nothing resembling microcontrollers driving their input? Did they just use a serial protocol?
davidv1992
Member
Posts: 223
Joined: Thu Jul 05, 2007 8:58 am

Re: Thought experiment: Bare metal to interpreter

Post by davidv1992 »

For explicit literature, especially for early machines, I am not sure. There are very few books on programming systems from the very early period, since these machines were typically one of a kind (or one of very few), and training on them wasn't yet as formalized. For example, UNIVAC (one of the first commercially sold computers, though not THE first) had an install base of 46 systems. For a general overview, searching on the internet should give a reasonable picture of some early machines (try searching for the Manchester Baby, ENIAC, and early computing history). For context, it might also be useful to dive into how computing was done before the first electronic computers, as there were influences from mechanical calculators and tabulation machines.

As for programming, the earliest machines were programmed in machine code, one would assume typically through some sort of hand-assembling process. The IEEE points to the EDSAC as the first computer with an assembler; I am unsure how similar it was to the modern concept. The idea seems to have spread rather quickly though, no doubt due to the fact that hand-assembling is rather laborious, at least in my experience.

As for early I/O: as mentioned earlier, machines typically had either plugboards or some sort of front panel with switches for entering simple programs. This was very quickly supplemented with punchcards and punched tape for loading larger programs, with the early loaders typically still entered through the front panel. For output, most early machines did include printers.

As for the origin of the TTY, it probably already reveals a lot when you realize that it is shorthand for TeleTYpe. These were essentially the follow-on from simple Morse-code telegraphs, were already widespread before the Second World War, and consisted of a printer/keyboard combination (rather like a typewriter), where the printer portion could (also?) be controlled by a keyboard on the other end of a telegraph wire. Early computer TTYs were similar, printing output on paper with a keyboard below sending a stream of characters to the machine it was connected to. I'm not sure how common their use was though, as early systems were typically batch-based instead of interactive. Again, a search engine is your friend if you want more detail.
eekee
Member
Posts: 891
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Thought experiment: Bare metal to interpreter

Post by eekee »

andi wrote:This is all very fascinating. Do you know where I can find literature from the era to better explain how people originally programmed computers?
I'm afraid I don't, but [some?] computers still had front-panel switches as late as the early '70s. There are stories from the '60s and '70s out there. If I remember right, Unix was either developed on a PDP-11/70 or ported to it quite early on. The Unix History Society may have some relevant info; their mailing list archives go back a long way. The PDP-8 was a simpler but also quite well-respected machine. The Altair 8800 was launched with front-panel toggles in the mid-70s, but probably generated more complaints than stories of heroics. :) It was cheap; programmers working the toggles wouldn't have been paid the rates of previous decades. SIMH simulates those three and many more historic systems; lots of names to search for.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
eekee
Member
Posts: 891
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee
Contact:

Re: Thought experiment: Bare metal to interpreter

Post by eekee »

Charles Moore wrote about inputting a Forth interpreter via the switches in section 9 of Programming a Problem-Oriented-Language, preserved here: https://colorforth.github.io/POL.htm . He admits it's not the best-quality work, but it is a fascinating look into the computing world of 1970. :) Two quotes, the second of which shows he recommended (based on experience) doing things differently from what I assumed would be easiest:
First you'll have to know some things: how to turn the computer on and off (odd hours), how to enter and display data from the console switches, and how to avoid damaging data stored on disk. You may have to corner operators or engineers to discover such information; it is so rarely of interest it doesn't get written down.
Now establish the stacks, the dictionary search subroutine and entries for WORD and NUMBER. Be very careful to do it right the first time; that is, don't simplify NUMBER and plan to re-do it later. The total amount of work is greater, even using the switches.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie