Conca : a small interpreter

Claude Marinier

2014-04-29 17:24:59 UTC

Bonjour,

I was inspired by reading "The joy of Joy" and other things about
concatenative languages. Conca is an interpreter for a language like Joy
and Cat. It is still young and is missing file output. It can do a few
things. Here is a quick example.

C:\Util\conca>conca
conca 0.5, built on 2014-04-23
define built-in words
define parser and evaluator

[dup *] "sq" define
12 sq .

144

[1 2 3 4 5 6 7 8 9] [sq] map .

[ 1 4 9 16 25 36 49 64 81 ]

"fibonacci.conca" load
18 fibonacci .

[ 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 ]

quit

You will notice that Conca definitions are postfix like the rest of the
language. I could not think of a good reason to deviate from the postfix
syntax to handle definitions. Is there a technical reason Cat and Joy use
something else?

- Joy
square == dup *
- Cat
define square { dup * }
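
To make the postfix form concrete, here is a minimal sketch in Python of an interpreter where `define` is an ordinary word executed at run time. It is not Conca's implementation: the names `Word`, `parse`, `run`, `evaluate` and `BUILTINS` are invented for illustration, and only the handful of words used in the session above are provided.

```python
import re

class Word(str):
    """A word to execute; plain str values are string literals."""

def parse(src):
    """Read source into nested lists: each [ ... ] quotation becomes a list."""
    frames = [[]]
    for tok in re.findall(r'\[|\]|"[^"]*"|[^\s\[\]]+', src):
        if tok == "[":
            frames.append([])
        elif tok == "]":
            quotation = frames.pop()
            frames[-1].append(quotation)
        elif tok.startswith('"'):
            frames[-1].append(tok[1:-1])        # string literal, quotes removed
        else:
            try:
                frames[-1].append(int(tok))
            except ValueError:
                frames[-1].append(Word(tok))
    return frames[0]

def run(program, stack, words, out):
    """Execute items left to right; anything that is not a word pushes itself."""
    for item in program:
        if not isinstance(item, Word):
            stack.append(item)
        elif item not in words:
            raise NameError(f"unknown word: {item}")
        elif callable(words[item]):
            words[item](stack, words, out)      # built-in word
        else:
            run(words[item], stack, words, out) # user-defined quotation

def _dup(stack, words, out):
    stack.append(stack[-1])

def _mul(stack, words, out):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

def _print(stack, words, out):
    out.append(stack.pop())

def _define(stack, words, out):
    # Postfix definition: the quotation and the name are already on the stack.
    name, body = stack.pop(), stack.pop()
    words[name] = body

def _map(stack, words, out):
    quotation, items = stack.pop(), stack.pop()
    result = []
    for x in items:
        sub = [x]                               # run the quotation on a scratch stack
        run(quotation, sub, words, out)
        result.append(sub.pop())
    stack.append(result)

BUILTINS = {"dup": _dup, "*": _mul, ".": _print, "define": _define, "map": _map}

def evaluate(src):
    """Run a program with a fresh dictionary and return what '.' printed."""
    words, stack, out = dict(BUILTINS), [], []
    run(parse(src), stack, words, out)
    return out
```

With this sketch, `evaluate('[dup *] "sq" define 12 sq .')` returns `[144]`. Because `define` only runs when execution reaches it, nothing is known about `sq` before that point, which is the property the replies below discuss.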

I am not sure how to deal with errors. Currently, Conca prints a message
and skips the rest of the line. Should errors in a script be treated
differently? So many choices to make. This is more complicated than I first
thought. :-)

You can get a Win32 binary distribution here. It includes a Fibonacci
definition script and a test script.

http://sourceforge.net/projects/conca/

I will read and consider comments. If there is interest, I may develop it
further.

Merci.

P.S. Has interest in concatenative languages waned in recent years?

--
Claude Marinier

Jon Purdy

2014-04-29 18:20:31 UTC

I was inspired by reading "The joy of Joy" and other things about concatenative languages. Conca is an interpreter for a language like Joy and Cat.

This looks like a good project for learning about how concatenative
languages are put together.

You will notice that Conca definitions are postfix like the rest of the language. I could not think of a good reason to deviate from the postfix syntax to handle definitions. Is there a technical reason Cat and Joy use something else?

Cat is statically typed, and it needs to know ahead of time what
definitions are available for type checking. I don't know much about
Joy, but it doesn't seem to have the same facility for mixing
compile-time and runtime evaluation that Forth does, so it seems
simpler to make "==" just part of the syntax.

Since Conca is heavily dynamic, it makes sense to keep it consistently postfix.

I am not sure how to deal with errors. Currently, Conca prints a message and skips the rest of the line. Should errors in a script be treated differently? So many choices to make. This is more complicated than I first thought. :-)

Error handling is a hard problem! You might implement an exception
system of some kind; it's entirely up to you.
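
One way to picture the exception approach is the Python sketch below. It is only an illustration under stated assumptions: `EvalError`, `eval_line`, `run_interactive` and `run_script` are hypothetical names, and the toy evaluator knows only integers and `+`. The point is that a single exception type lets the interactive loop and the script runner apply different policies to the same error.

```python
class EvalError(Exception):
    """Raised by a word when it cannot continue (hypothetical error type)."""

def eval_line(line, stack):
    """Evaluate one line of a toy postfix language: integers and '+' only."""
    for token in line.split():
        if token.lstrip("-").isdigit():
            stack.append(int(token))
        elif token == "+":
            if len(stack) < 2:
                raise EvalError("'+' needs two values")
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise EvalError(f"unknown word: {token}")

def run_interactive(lines):
    """Interactive policy: record the message, skip the rest of that line, go on."""
    stack, messages = [], []
    for line in lines:
        try:
            eval_line(line, stack)
        except EvalError as err:
            messages.append(str(err))
    return stack, messages

def run_script(lines):
    """Script policy: the first error stops the script and reports the line."""
    stack = []
    for number, line in enumerate(lines, start=1):
        try:
            eval_line(line, stack)
        except EvalError as err:
            raise EvalError(f"line {number}: {err}") from err
    return stack
```

Note that in this sketch the values pushed before the failing word stay on the stack; whether to roll the stack back to its state at the start of the line is another of the choices to make.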

I will read and consider comments. If there is interest, I may develop it further.

You should develop it further because of your own interest, not
because of other people's. The question is: what problem do you want
Conca to solve? If you can answer that clearly, even if the answer is
"I just want to write a programming language for fun", you will know
exactly what to do. :)

P.S. Has interest in concatenative languages waned in recent years?

The community is rather quiet, but fairly active. You should join
#concatenative on Freenode, where we talk about Factor and I post
updates about my own statically typed concatenative language project
called Kitten.

Claude Marinier

2014-04-29 22:39:07 UTC

Post by Jon Purdy

Post by Claude Marinier
You will notice that Conca definitions are postfix like the rest of
the language. I could not think of a good reason to deviate from the
postfix syntax to handle definitions. Is there a technical reason Cat
and Joy use something else?

Cat is statically typed, and it needs to know ahead of time what
definitions are available for type checking. I don't know much about
Joy, but it doesn't seem to have the same facility for mixing
compile-time and runtime evaluation that Forth does, so it seems
simpler to make "==" just part of the syntax.
Since Conca is heavily dynamic, it makes sense to keep it consistently postfix.

Ah. OK.

Post by Jon Purdy

Post by Claude Marinier
I am not sure how to deal with errors. Currently, Conca prints a
message and skips the rest of the line. Should errors in a script be
treated differently? So many choices to make. This is more complicated
than I first thought. :-)

Error handling is a hard problem! You might implement an exception
system of some kind; it's entirely up to you.

I will consider this.

Post by Jon Purdy

Post by Claude Marinier
I will read and consider comments. If there is interest, I may develop it further.

You should develop it further because of your own interest, not
because of other people's. The question is: what problem do you want
Conca to solve? If you can answer that clearly, even if the answer is
"I just want to write a programming language for fun", you will know
exactly what to do. :)

It is mostly the fun of developing it. :-)

--
Claude Marinier

William Tanksley, Jr

2014-04-29 19:33:53 UTC

Jon's advice is good.

You will notice that Conca definitions are postfix like the rest of the language. I could not think of a good reason to deviate from the postfix syntax to handle definitions. Is there a technical reason Cat and Joy use something else?

Making definitions "postfix" in a concatenative language actually
means that they're executed at runtime, which means it's possible to
build definitions at runtime. That's a major problem in a typechecked
language, or one that's trying to be theoretically pure in some
defined way.

At least, so far as we know...

P.S. Has interest in concatenative languages waned in recent years?

It comes and goes. There are some unsolved questions about the idea,
and some people think there's no solution. There are some reasonable
success stories, but all of them have shortcomings.

-Wm

Jon Purdy

2014-04-29 19:58:06 UTC

Post by William Tanksley, Jr
Making definitions "postfix" in a concatenative language actually
means that they're executed at runtime, which means it's possible to
build definitions at runtime.

Not necessarily. "define" could be a compile-time word that adds a
definition to the dictionary. So the language would need forward
declarations like C, but would still be statically checkable.

Claude Marinier

2014-04-29 22:53:54 UTC

Post by Jon Purdy

Post by William Tanksley, Jr
Making definitions "postfix" in a concatenative language actually
means that they're executed at runtime, which means it's possible to
build definitions at runtime.

Not necessarily. "define" could be a compile-time word that adds a
definition to the dictionary. So the language would need forward
declarations like C, but would still be statically checkable.

The code in a quotation could be type checked as it is read; this could be
done for all cases: a quotation before a conditional or a loop as well as
code for a function definition.

Conca does all its type checking at run time. I see that this will delay
error detection to a less convenient time.
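
As a rough illustration of checking a quotation as it is read, the Python sketch below computes only the stack arity of a sequence of words: how many values it needs and how many it leaves. It is a simplification under stated assumptions: the `EFFECTS` table and the `stack_effect` function are invented here, it counts values rather than checking their types, and it does not handle words such as `map` whose effect depends on a quotation argument, which is where the real difficulty lies.

```python
# Declared (consumed, produced) counts for a few built-in words (assumed table).
EFFECTS = {
    "dup": (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "+": (2, 1),
    "*": (2, 1),
}

def stack_effect(words, effects=EFFECTS):
    """Return (needed, left): values required on entry and values left on exit."""
    needed = 0   # values the sequence takes from the caller's stack
    depth = 0    # values currently available that the sequence itself produced
    for word in words:
        if word.lstrip("-").isdigit():
            consumed, produced = 0, 1          # a number literal pushes itself
        elif word in effects:
            consumed, produced = effects[word]
        else:
            raise LookupError(f"no declared effect for {word!r}")
        if depth < consumed:
            needed += consumed - depth         # borrow the shortfall from the caller
            depth = 0
        else:
            depth -= consumed
        depth += produced
    return needed, depth
```

For example, `stack_effect("dup *".split())` returns `(1, 1)`: the body of `sq` needs one value and leaves one. A check like this could run when the closing bracket of a quotation is read, before anything is executed.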

--
Claude Marinier

p***@gmail.com

2014-04-30 10:34:21 UTC

Interestingly, from the point of view of error checking, the simplified version of my own Furphy (see http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible compile time errors apart from insufficient memory. ALL possible source is syntactically valid, and run time checks are the only meaningful ones! But I also have immediate words to run during compilation to provide syntactic sugar, which not only makes some syntax like matching square brackets and quotation marks necessary, it also provides natural places to test for errors in them. Ideally, the compiler builds up a pretty printed listing as it goes, and syntax failures simply drop error annotations into that while keeping track of the error level reached and making safe-ish default assumptions to try to fill in mismatches; at the end, just before the compiler triggers its compile-and-go run time execution, it should check the error level and only run if an immediate word has specified a safe higher error level than a default of no errors (and there should be a default output of the listing at that stage whether the program runs or not, unless another immediate word has stopped that).

A good all round guide to this whole area is P.J.Brown's "Writing Interactive Compilers and Interpreters".

Claude Marinier

2014-04-30 22:53:53 UTC

Post by p***@gmail.com
Interestingly, from the point of view of error checking, the simplified
version of my own Furphy (see
http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible
compile time errors apart from insufficient memory. ALL possible source
is syntactically valid, and run time checks are the only meaningful
ones! But I also have immediate words to run during compilation to
provide syntactic sugar, which not only makes some syntax like matching
square brackets and quotation marks necessary, it also provides natural
places to test for errors in them. Ideally, the compiler builds up a
pretty printed listing as it goes, and syntax failures simply drop error
annotations into that while keeping track of the error level reached and
making safe-ish default assumptions to try to fill in mismatches; at the
end, just before the compiler triggers its compile-and-go run time
execution, it should check the error level and only run if an immediate
word has specified a safe higher error level than a default of no errors
(and there should be a default output of the listing at that stage
whether the program runs or not, unless another immediate word has
stopped that).
A good all round guide to this whole area is P.J.Brown's "Writing
Interactive Compilers and Interpreters".

I tend to read e-mail from this account in the evening (North America,
EDT).

I am reading and digesting what you say. That's a lot to think about.

Yes, there is so little syntax that all programs are valid. The issue here
is type checking. At run-time, built-in words check the type of the data
they use. The dynamic nature of the language makes this the default
behaviour. Static type checking requires knowledge of what the caller
expects (usually easy) and of what the caller is providing (much more
difficult).

This is starting to look like "more than I can chew"(1).

Thanks for the comments.

(1) American English has an abundant supply of colourful idiomatic
expressions. :-)

--
Claude Marinier

p***@gmail.com

2014-05-01 10:17:34 UTC

"Claude Marinier"

Post by p***@gmail.com
Interestingly, from the point of view of error checking, the simplified
version of my own Furphy (see
http://users.beagle.com.au/peterl/furphy.html) doesn't have any possible
compile time errors apart from insufficient memory. ALL possible source
is syntactically valid, and run time checks are the only meaningful
ones! But I also have immediate words to run during compilation to
provide syntactic sugar, which not only makes some syntax like matching
square brackets and quotation marks necessary, it also provides natural
places to test for errors in them. Ideally, the compiler builds up a
pretty printed listing as it goes, and syntax failures simply drop error
annotations into that while keeping track of the error level reached and
making safe-ish default assumptions to try to fill in mismatches; at the
end, just before the compiler triggers its compile-and-go run time
execution, it should check the error level and only run if an immediate
word has specified a safe higher error level than a default of no errors
(and there should be a default output of the listing at that stage
whether the program runs or not, unless another immediate word has
stopped that).
A good all round guide to this whole area is P.J.Brown's "Writing
Interactive Compilers and Interpreters".

I tend to read e-mail from this account in the evening (North America,
EDT).

I am in Australia; it's around 8 p.m. right now.

I am reading and digesting what you say. That's a lot to think about.
Yes, there is so little syntax that all programs are valid. The issue here
is type checking. At run-time, built-in words check the type of the data
they use. The dynamic nature of the language makes this the default
behaviour. Static type checking requires knowledge of what the caller
expects (usually easy) and of what the caller is providing (much more
difficult).

The issue is partly what the syntax features offer a programmer by holding his hand and making him do the right thing, and partly what functionality can be provided most easily that way. For instance, you can do object oriented programming in C even though it doesn't have the features for coding it directly, but the features for that in C++ make it easier.

Since I come from a Forth-ish orientation, I don't want to constrain the programmer too much. Rather, I have found that careful choice of meaningful word names helps a lot here by making him think through what he is doing. So called "Hungarian notation" - naming guidelines to reflect conceptual types (i.e. what the programmer has in mind as a type, not what the virtual machine will insist on) - may help too.

Over and above that, I have found Forth stack commenting helpful, e.g. ( true/false val1 val2 --- val ) is a comment for Furphy's IF that should make it clear that it expects three parameters, of which the first should be conceptually Boolean but the others are unrestricted, and leaves just one unrestricted value. I have thought about using a variant of that comment formalism to set up "assertions", i.e. pieces of run time code that are only triggered during testing and return error messages if parameters and results don't match the assertions. That might perhaps be extended to static testing during compilation.

For me, types don't look as promising as a coding guideline so much as for implementing object oriented stuff; Wirth's Oberon family of languages implements object orientation by using type extension, so it should be possible. If I ever add it to Furphy, first I will look hard at Oberon and FICL (an object oriented Forth).

This is starting to look like "more than I can chew"(1).
Thanks for the comments.
(1) American English has an abundant supply of colourful idiomatic
expressions. :-)

Although I am in Australia, I am in fact British, of a third generation of world travellers. "Biting off more than you can chew" isn't U.S. English in particular, just English. Then again, French has not a few such expressions of its own; "faire suer le burnous" (my parents lived and worked in Algiers between 1944 and 1948, and met there) and "aller aux fraises" come to mind, just off the top of my head (which is itself an expression). PML.

John Cowan

2014-05-01 14:04:03 UTC

Post by p***@gmail.com
Although I am in Australia, I am in fact British, of a third
generation of world travellers. "Biting off more than you can chew"
isn't U.S. English in particular, just English.

As I said in connection with "till the cows come home" once, the
Sundering Sea has not sundered our homely metaphors. (Alas, however,
"homely" means "ugly" in North America.)

--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
That you can cover for the plentiful and often gaping errors, misconstruals
and disinformation in your posts through sheer volume -- that is another
misconception. --Mike to Peter

h***@yahoo.de

2014-05-01 14:23:39 UTC

Hello Claude, hello *,

Joy's decision to allow definitions only at compile time may push development in a direction like term rewriting or something similar; I don't know.

What I think is: introducing a define primitive opens the door to a Scheme-like style of programming, perhaps going a bit further than Scheme, in a more natural way than Scheme macros.

Scheme and Joy both say: Program == Data == List
So put something on the stack, do all the data transformation you want until you get the definition you need, and define.

Like you, I am one of those with a self-made Joy implementation.
This is how my Joy does it:

zip before transformation:
zip: {l l -- l}
( nil? | nip
| nild? | zap
others | 2uncons pair^^ loop cons )dsplit cond*

and afterwards:
(zip) {l l -- l} (((nil? )(nip )(nild? )(zap )(else )(2uncons (pair )dip2 zip cons ))cond* ) define

About type checking at compile time:

Programming in Haskell or OCaml, when coming from Scheme or Lisp, is a cool experience:
when it runs, it is correct.

But there are very successful languages without this property, such as Scheme/Lisp or Python.

best regards
Heiko

Claude Marinier

2014-05-01 22:59:14 UTC

Post by p***@gmail.com
So called "Hungarian notation" - naming guidelines to reflect conceptual
types (i.e. what the programmer has in mind as a type, not what the
virtual machine will insist on) - may help too.

This is a controversial topic in some circles but in this case it is a
valuable practice.

Post by p***@gmail.com
Over and above that, I have found Forth stack commenting helpful, e.g. (
true/false val1 val2 --- val ) is a comment for Furphy's IF that should
make it clear that it expects three parameters, of which the first
should be conceptually Boolean but the others are unrestricted, and
leaves just one unrestricted value. I have thought about using a variant
of that comment formalism to set up "assertions", i.e. pieces of run
time code that are only triggered during testing and return error
messages if parameters and results don't match the assertions. That
might perhaps be extended to static testing during compilation.

That is a good idea: type assertions which can be turned off for
production code. It could be handled (internally) by one function which
compares the assertions with the contents of the stack. It might
not be too hard. Hum ...
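
A minimal Python sketch of that single checking function follows. Everything in it is an assumption made for illustration: `parse_stack_comment`, `with_stack_check` and `select` are hypothetical names, the check compares only the number of values before and after the word runs (not their types), and `select` is a guess at the behaviour of an IF with the stack comment quoted above.

```python
def parse_stack_comment(comment):
    """Split '( a b -- c )' or '( a b --- c )' into input and output name lists."""
    names = comment.strip().strip("()").split()
    for index, name in enumerate(names):
        if len(name) >= 2 and set(name) == {"-"}:
            return names[:index], names[index + 1:]
    raise ValueError(f"no '--' separator in stack comment: {comment!r}")

def with_stack_check(comment, word, enabled=True):
    """Wrap a word so its stack comment is asserted; return it unchanged if disabled."""
    if not enabled:
        return word                      # production: no checking overhead at all
    inputs, outputs = parse_stack_comment(comment)

    def checked(stack):
        before = len(stack)
        if before < len(inputs):
            raise AssertionError(
                f"{comment}: needs {len(inputs)} values, stack has {before}")
        word(stack)
        expected = before - len(inputs) + len(outputs)
        if len(stack) != expected:
            raise AssertionError(
                f"{comment}: expected depth {expected}, got {len(stack)}")
    return checked

def select(stack):
    """Assumed IF semantics: keep val1 if the flag is true, otherwise val2."""
    val2, val1, flag = stack.pop(), stack.pop(), stack.pop()
    stack.append(val1 if flag else val2)

checked_if = with_stack_check("( true/false val1 val2 --- val )", select)
```

Calling `checked_if` on a stack holding `True 10 20` leaves `10`; calling it on a stack with only two values raises an `AssertionError` before the word runs. Passing `enabled=False` hands back the original function, which is one way to turn the assertions off for production code.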

--
Claude Marinier

William Tanksley, Jr

2014-05-01 23:36:45 UTC

Post by Claude Marinier
That is a good idea: type assertions which can be turned off for
production code. It could be handled (internally) by one function which
compares the assertions with the contents of the stack. It might
not be too hard. Hum ...

Another solution would involve type assertions that are handled
statically when possible, and dynamically when not.

Factor, by the way, is a concatenative language that performs static
typechecking. It's a very impressive project.

-Wm

p***@gmail.com

2014-04-30 10:37:17 UTC

Testing... I'm not seeing confirmation that my replies to WT Jr. and CM have gone through.

eas lab

2014-05-01 01:36:57 UTC

Thanks for the good high-level overview of Furphy, in these Facebook/Twitter days,
where there's nothing but repetitive eye candy and pointers to ...
I.e. no meat!

== Chris Glur.

Claude Marinier

2014-04-29 22:49:28 UTC

Post by William Tanksley, Jr
Jon's advice is good.

Post by Claude Marinier
You will notice that Conca definitions are postfix like the rest of
the language. I could not think of a good reason to deviate from the
postfix syntax to handle definitions. Is there a technical reason Cat
and Joy use something else?

Making definitions "postfix" in a concatenative language actually
means that they're executed at runtime, which means it's possible to
build definitions at runtime. That's a major problem in a typechecked
language, or one that's trying to be theoretically pure in some
defined way.

I think I see the problem: it is nice to have assurances at the time the
function is defined rather than later that the code will work as expected.

--
Claude Marinier

p***@gmail.com

2014-04-30 10:11:54 UTC

No, my own Furphy (see http://users.beagle.com.au/peterl/furphy.html) uses postfix (Reverse Polish) naming during compilation, and it doesn't have any distinct run time stuff during compilation apart from what is implemented by immediate words. In fact, that was the most natural way to work it with a simple compile-and-go compiler, so as not to need any special construct to be actioned during compilation to carry out a definition - and, without a defining word at all, there isn't a defining word available later at run time. With that approach, the compiler starts a new word and then compiles tokens and numbers from the source into a new word until it reaches the end or an unrecognised token, either of which terminates the current word with a return or tail call optimisation. An unrecognised token is assigned as the current word's name and then the compiler starts a new word and continues as before; at the end of the source there is one final anonymous word, and that is just run (it can save an executable and halt before falling through into application words that the executable will start at, if you want compile-and-save behaviour).
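
The naming rule described above can be sketched in a few lines of Python. This is only a reading of the description, not Furphy's actual compiler: `compile_source` is a hypothetical name, word bodies are kept as plain token lists, and immediate words, returns and tail-call optimisation are left out.

```python
def compile_source(source, primitives):
    """Compile tokens into word bodies; an unrecognised token names the body so far.

    Returns (definitions, final_word), where final_word is the trailing
    anonymous word that a compile-and-go system would simply run.
    """
    known = set(primitives)
    definitions = {}
    current = []                              # body of the word being compiled
    for token in source.split():
        if token in known or token.lstrip("-").isdigit():
            current.append(token)             # recognised word or number: compile it
        else:
            definitions[token] = current      # unrecognised: it names the current word
            known.add(token)                  # later uses of the name now compile
            current = []                      # start a new word
    return definitions, current
```

Under these assumptions, `compile_source("dup * square 12 square .", {"dup", "*", "."})` returns `({"square": ["dup", "*"]}, ["12", "square", "."])`. No defining word is involved, so none exists at run time either.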

John Cowan

2014-04-30 18:26:02 UTC

Post by Claude Marinier
You will notice that Conca definitions are postfix like the rest of the
language. I could not think of a good reason to deviate from the postfix
syntax to handle definitions. Is there a technical reason Cat and Joy use
something else?

Joy programs rely on the fact that you can't redefine Joy words at runtime.
So yes, the way definitions are written is just syntax.

--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply: "Many happy returns of the day!"
