compiling pattern matching to point-free combinators sequence

Discussion:

Cyrille Duret

2014-04-30 07:04:36 UTC

hello,
I have good interest in postfix concatenative languages for some years now,
I discover the postfix notation with forth and saw the beauty of joy and
now I try to experiment the benefit of a concatenative language for
distributed application, user interface ( I am thinking of a visual touch
based programming environment a bit like hopscotch
https://www.gethopscotch.com/) , etc..

My language of prototyping is javascript and my language of choice for
implementation is ATS2 (http://www.ats-lang.org/)

I have already done a conventional implementation of a very basic language
here https://github.com/cduret/stk.js , but I want to go further.
At the time I read this paper (
http://mitarbeiter.hs-heilbronn.de/~herzberg/Publications/ICSOFT.2009.pdf),
I saw the power and simplicity of a concatenative language based on a
rewriting system.

So I began my implementation of a basic rewrite system :
The words can be defined with pattern matching as follow :

x dup => x x.
x y swap => y x.
[x_] call => x_.
...

Rewriting system is powerful and can compute symbolic expression as well
but the explosion of rewriting steps can lead to slow execution time.
That's why I try to think about a compiler that would transform the
rule-based program into a stack oriented bytecode.

My first problem is to find a way of rewriting a rule with named parameters
into a point-free word (in a forth or factor meaning).

How I could compile the rule :

a b toto => a a * 3 b * +.

into

toto => 3 * swap dup * +.

I am looking for such an algorithm if you have some advices..

thank you very much ;-)

William Tanksley, Jr

2014-04-30 16:18:40 UTC

Permalink

That's fun!

There have been a couple of concatenative systems based on rewriting,
and you'll probably want to learn from them; the one I can recall
offhand is "Enchilada" and its author's next language "SPREAD", the
latter at http://oercode.blogspot.nl/. (I don't remember where
Enchilada lives.)

However, rewriting lambdas is a little different. iepos wrote a paper
that includes discussion of that at http://tunes.org/~iepos/joy.html.
It's fairly simple compared to general concatenative rewriting.

-Wm

Post by Cyrille Duret
hello,
I have good interest in postfix concatenative languages for some years now, I discover the postfix notation with forth and saw the beauty of joy and now I try to experiment the benefit of a concatenative language for distributed application, user interface ( I am thinking of a visual touch based programming environment a bit like hopscotch https://www.gethopscotch.com/) , etc..
My language of prototyping is javascript and my language of choice for implementation is ATS2 (http://www.ats-lang.org/)
I have already done a conventional implementation of a very basic language here https://github.com/cduret/stk.js , but I want to go further.
At the time I read this paper (http://mitarbeiter.hs-heilbronn.de/~herzberg/Publications/ICSOFT.2009.pdf), I saw the power and simplicity of a concatenative language based on a rewriting system.
x dup => x x.
x y swap => y x.
[x_] call => x_.
...
Rewriting system is powerful and can compute symbolic expression as well but the explosion of rewriting steps can lead to slow execution time.
That's why I try to think about a compiler that would transform the rule-based program into a stack oriented bytecode.
My first problem is to find a way of rewriting a rule with named parameters into a point-free word (in a forth or factor meaning).
a b toto => a a * 3 b * +.
into
toto => 3 * swap dup * +.
I am looking for such an algorithm if you have some advices..
thank you very much ;-)

William Tanksley, Jr

2014-04-30 18:22:21 UTC

Permalink

I found Enchilada:

http://www.enchiladacode.nl/

On Wed, Apr 30, 2014 at 9:18 AM, William Tanksley, Jr

Post by William Tanksley, Jr
That's fun!
There have been a couple of concatenative systems based on rewriting,
and you'll probably want to learn from them; the one I can recall
offhand is "Enchilada" and its author's next language "SPREAD", the
latter at http://oercode.blogspot.nl/. (I don't remember where
Enchilada lives.)
However, rewriting lambdas is a little different. iepos wrote a paper
that includes discussion of that at http://tunes.org/~iepos/joy.html.
It's fairly simple compared to general concatenative rewriting.
-Wm

Jon Purdy

2014-04-30 18:20:09 UTC

Permalink

Post by Cyrille Duret
Rewriting system is powerful and can compute symbolic expression as well but the explosion of rewriting steps can lead to slow execution time.

I think it is possible to produce a fast implementation of a
concatenative system based on pure rewriting. I was working on such a
thing a few years ago. My approach (as I remember it) was to collect
all rewrite rules, sort them in order of specificity, and group those
by the head word if there was one; the result was an O(1) hash table
lookup plus an O(log n) search for the most specific pattern, each
rewriting step. Seemed fast enough, though I never benchmarked it. I
will see if I can dig up the code or rewrite it for fun.

Post by Cyrille Duret
My first problem is to find a way of rewriting a rule with named parameters into a point-free word (in a forth or factor meaning).

Thatâs called an âabstraction algorithmââconverting lambdas into
combinators. The Brent Kirby article that William Tanksley linked is a
good start, but it would require a good deal of optimisation to be
suitable.

You can also think of a concatenative expression with local variables
as a data flow graphâdirected and acyclic, hopefullyâand try to reduce
the graph as much as possible. But that breaks down with higher-order
functions, which seem to be the best way to make concatenative
expressions succinct, legible, and efficient.

John Cowan

2014-04-30 18:23:36 UTC

Permalink

Post by Jon Purdy
I think it is possible to produce a fast implementation of a
concatenative system based on pure rewriting.

You should look at Pure, which is not concatenative but is based on
eager bottom-up term rewriting.

--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
"Why yes, I'm ten percent Jewish on my manager's side."
--Connie Francis

Cyrille Duret

2014-05-01 08:09:38 UTC

Permalink

@william
Thanks for the references, indeed enchilada and SPREAD are the closest to
my implementation and I will look carefully at their code.

For the lambda reduction I will try this simple algorithm first but I think
a lot of optimizations are required after reducing the amazing number of
combinator words produced

if you think simple lambda expression as

A\ B A [C A] is reduced to

[B] dip dup [i] dip [[C] dip i] cons

you can suspect that the generated code will be way too fat to be efficient.

@jon

I would be interested to see ur implementation.
The heart of my implementation is the match and rewrite functions of this
amazing book about term rewriting
http://www.amazon.com/Term-Rewriting-That-Franz-Baader/dp/0521779200
I adapted it to concatenative needs

My ultimate goal would be to have a very simple rewriting core and doing
all with it : type-checking, optimization, compilation/code generation.
I don't know if it is possible but I have the feeling that the combination
of rewriting and concatenative paradigm is so powerful.

@john

thanks for the reference I got the book
http://www.amazon.com/Functional-Programming-Parallel-Rewriting-International/dp/0201416638that
describe the underlying implementation of the Clean (now Pure)
programming language.
I don't know if the high level part is suitable for my case of postfix,
concatenative language, but the VM bytecode generation and the type check
system and the IO implementation is amazing ;-)

William Tanksley, Jr

2014-05-01 13:10:32 UTC

Permalink

For the lambda reduction I will try this simple algorithm first but I think a lot of optimizations are required after reducing the amazing number of combinator words produced
if you think simple lambda expression as
A\ B A [C A] is reduced to
[B] dip dup [i] dip [[C] dip i] cons
you can suspect that the generated code will be way too fat to be efficient.

That's completely correct. This is a purely textual algorithm, which
has advantages for reasoning on paper but disadvantages otherwise. If
you lift the phrase into a dataflow graph you'll be able to perform
translation and optimization with the same step -- but it's a more
complicated step.

The heart of my implementation is the match and rewrite functions of this amazing book about term rewriting
http://www.amazon.com/Term-Rewriting-That-Franz-Baader/dp/0521779200
I adapted it to concatenative needs

I'm delighted to hear it.

Check out the back history of this group for talk about static
typechecking. It's not a simple issue, but several people have done
the basics.

-Wm