Writing a compiler - how?

Programming, for all ages and all languages.
User avatar
NickJohnson
Member
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Re: Writing a compiler - how?

Post by NickJohnson »

This reminds me of when a friend of mine wrote a calculator program by iteratively searching through the input and evaluating +,-,*,/, etc. :lol:
User avatar
iansjack
Member
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Writing a compiler - how?

Post by iansjack »

You wrote a compiler in 20 minutes? I don't believe you, however simple it was.

It takes me more than 20 minutes to switch the computer on and have the first cup of coffee, let alone do any productive work.
User avatar
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany
Contact:

Re: Writing a compiler - how?

Post by Solar »

ACcurrent wrote:And after all, the 1st version of JS# was written in 20 minutes.
Wow. Including version control setup, source comments, and manual? I am impressed. But then again, as was said before, JS# is a source-to-source converter, not a compiler...
Every good solution is obvious once you've found it.
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

You are discussing different things (or different subsets of the same thing) I belive.

I agree with JamesM, that a full language cannot be parsed that way efficiently (more likely your code became a mess and won't work at all).

But ACcurrent probably referring to a partial language, and he is right about it could be enough and more efficient under some circumstances. For example a partial language (some descriptor language, like CSS or a language with extremely strict syntax like Sinclair-BASIC) can be parsed faster without lexing. I never saw anybody writing a lexer for a key=value configuration language for example, neither do I saw any C compiler without.

(For clarity I use the word "full language" contra "partial language" in a sense that it lacks either any of three control structures (or you cannot include their combinations in infinite depth), or complex expressions with precedence and unlimited number of sub-expressions. Sorry, I can't translate the right technical term for this. Dictionary translates it to "full", which sounds stupid, but I've no better.)
User avatar
Yoda
Member
Member
Posts: 255
Joined: Tue Mar 09, 2010 8:57 am
Location: Moscow, Russia

Re: Writing a compiler - how?

Post by Yoda »

I don't believe in compiler from scratch in 20 minutes. It may be one of the two possibilities: either it is built from a ready pre-built blocks, or it was just a stub and the rest 2 days were spent to debug it and to stuff with the rest of features.
Yet Other Developer of Architecture.
OS Boot Tools.
Russian national OSDev forum.
User avatar
miker00lz
Member
Member
Posts: 144
Joined: Wed Dec 08, 2010 3:16 am
Location: St. Louis, MO USA

Re: Writing a compiler - how?

Post by miker00lz »

some nice links in this thread. writing a compiler is something i've been considering too. i already understand how to do the majority of it, but what i haven't wrapped my head around yet is how to parse expressions. i mean, things like handling operator precedence in mathematical expressions, etc... expressions can get incredibly complicated, and one little error in parsing breaks the entire program.

then i'd have to come up with a good way to create a "stack" of smaller operations, and do them in exactly the right order and then plug the various results into each other correctly. if i sat down and gave it some real serious thought i can probably come up with a solution, but it seems to be such a big task i haven't wanted to try. :o

everything else should be relatively easy by comparison. (keyword: relatively)
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

berkus wrote:Well, when you build an expression tree
Well, you don't have to build a tree. It's easier to use polish notation (which uses only stacks), and the result will be the same as walking through the tree.
http://en.wikipedia.org/wiki/Polish_notation
Also there're lot of examples on the web in different programming languages how to do that.
Here's one in Javascript: http://scriptasylum.com/tutorials/infix ... stfix.html
User avatar
Yoda
Member
Member
Posts: 255
Joined: Tue Mar 09, 2010 8:57 am
Location: Moscow, Russia

Re: Writing a compiler - how?

Post by Yoda »

@turdus,
I totally agree with you, - polish notation greatly simplifies parsing an expressions and you don't need anything else if you create hobby compiler.
But IMHO, building a tree provides more possibilities to optimize code for size, speed and memory usage.
Yet Other Developer of Architecture.
OS Boot Tools.
Russian national OSDev forum.
User avatar
Thomas
Member
Member
Posts: 281
Joined: Thu Jun 04, 2009 11:12 pm

Re: Writing a compiler - how?

Post by Thomas »

Hi,
berkus wrote:
turdus wrote:
berkus wrote:Well, when you build an expression tree
Well, you don't have to build a tree. It's easier to use polish notation (which uses only stacks), and the result will be the same as walking through the tree.
That's an implicit tree (or what comes from breadth-first traversal of such tree), so I'm not buying that.
Not really ... . It is strictly not required to create an expression tree for parsing .

Actually RPN like parsing leads to form of parsing known as operator precedence parsing , it is actually an LR form of parsing and not an LL type . It is actually an efficient method of parsing expressions. But during later stages of development its considered far more elegant to have an AST and then make use of the visitor pattern for rest of the process . As we are hobbyist's we need not strictly follow the norm !

You guys might find this interesting : http://www.cs.usfca.edu/~galles/compilerdesign/
You might also find this thread interesting : http://forum.osdev.org/viewtopic.php?t=21349

Link : http://www.ipdatacorp.com/mmpd.html . DASM that comes with the MMURTL package makes use of operator precedence parsing for the most part and generates "okay code " and that's a good enough start !.

If you can figure out a good way to pass semantic and other info in a good way ( generally people use some form of trees for it :D ) to later stages of the compiler you are good to go 8)


Hopefully something in my rambling is useful to you guys
--Thomas
User avatar
NickJohnson
Member
Member
Posts: 1249
Joined: Tue Mar 24, 2009 8:11 pm
Location: Sunnyvale, California

Re: Writing a compiler - how?

Post by NickJohnson »

User avatar
AndrewAPrice
Member
Member
Posts: 2303
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: Writing a compiler - how?

Post by AndrewAPrice »

Thomas wrote:Not really ... . It is strictly not required to create an expression tree for parsing.
Expression trees, while not necessary, make things a lot more natural to parse. Grammars are represented by a hierarchy of
rules (with tokens and lexemes as the leaves), so it's very intuitive to represent an instance of the grammar (your program) as a tree. For example:

Code: Select all

int a = someFunc(sin(a), cos(b + c * 2));

Code: Select all

- declare variable
  - type: int
  - name: a
  - value:
     - sumFunc
       - parameter 1:
         -sin
           - a
       - parameter 2:
         - cos
           - +
              - b
              - *
                - c
                - 2
A popular alternative is to output to an intermediate stack based language.
My OS is Perception.
User avatar
JamesM
Member
Member
Posts: 2935
Joined: Tue Jul 10, 2007 5:27 am
Location: York, United Kingdom
Contact:

Re: Writing a compiler - how?

Post by JamesM »

... I think some people are missing the point.

RPN is extremely easy to parse as there is no precedence. However, you need to have your language defined such that it uses RPN. You're shifting the burden of definition onto the programmer.

Infix expressions (or even prefix expressions as in s-exprs in lisp) are easier for a human to write and grok than pure reverse polish notation.

So your hands are tied parsing wise by what your language defines.
User avatar
miker00lz
Member
Member
Posts: 144
Joined: Wed Dec 08, 2010 3:16 am
Location: St. Louis, MO USA

Re: Writing a compiler - how?

Post by miker00lz »

MessiahAndrw wrote:
Thomas wrote:Not really ... . It is strictly not required to create an expression tree for parsing.
Expression trees, while not necessary, make things a lot more natural to parse. Grammars are represented by a hierarchy of
rules (with tokens and lexemes as the leaves), so it's very intuitive to represent an instance of the grammar (your program) as a tree. For example:

Code: Select all

int a = someFunc(sin(a), cos(b + c * 2));

Code: Select all

- declare variable
  - type: int
  - name: a
  - value:
     - sumFunc
       - parameter 1:
         -sin
           - a
       - parameter 2:
         - cos
           - +
              - b
              - *
                - c
                - 2
A popular alternative is to output to an intermediate stack based language.
yeah a tree like that would make it pretty straightforward. even if not needed, i would definitely do that for my first compiler.
User avatar
Yoda
Member
Member
Posts: 255
Joined: Tue Mar 09, 2010 8:57 am
Location: Moscow, Russia

Re: Writing a compiler - how?

Post by Yoda »

MessiahAndrw wrote:Expression trees, while not necessary, make things a lot more natural to parse...
A popular alternative is to output to an intermediate stack based language.
We are talking about that the parsing of expression doesn't need building a tree for stack approach like RPN. And that expression stack may easily (but not necessarily effectively) be implemented in common x86 architecture.
Yet Other Developer of Architecture.
OS Boot Tools.
Russian national OSDev forum.
User avatar
turdus
Member
Member
Posts: 496
Joined: Tue Feb 08, 2011 1:58 pm

Re: Writing a compiler - how?

Post by turdus »

JamesM wrote:... I think some people are missing the point.

RPN is extremely easy to parse as there is no precedence. However, you need to have your language defined such that it uses RPN.
Sorry, but you're totally wrong. Any infix notation with different precedence operators can be converted into predecende-free postfix notation without problems, and it's totally independent of the language.

Have you checked the link I posted?
Post Reply