Discussion:
slate gui slowness needs fixing before proceeding
Timmy Douglas
2006-06-11 19:09:15 UTC
Well, I've added patches to my repos for simple one-way undo, and now
you can type in all the characters fine. The problem is that there is
no point in going further (selections, copy/paste, searching,
kill-line, cleaning up my ugly code, etc.) when the current GUI is too
slow to type in. I guess my ideal path after adding those features to
the text buffer would be to modify demo/inspector.slate to elegantly
edit the current environment. But enough of the future talk.


So I'd like to take a look at speeding up Slate (or at least finding
something to speed up the GUI portion), but the only real dealings I
have with compilers are from when I wrote a Tiger compiler in SML
(following Appel's book for a class). Anyways, I didn't see any hint of
where to start in the last 1000 messages on this list, so I thought
I'd start a thread. So what are the options?


In about a week I'm going to get busier, though, since another
summer class will start, but until then I should have time to get
something done.
Brian Rice
2006-06-11 23:15:03 UTC
Post by Timmy Douglas
Well, I've added patches to my repos for simple one-way undo and now
you can type in all the characters fine. The problem is that there is
no point in going further (selections, copy paste, searching,
kill-line, cleaning up my ugly code, etc) when the current gui is too
slow to type in. I guess my ideal path after those features to the
text buffer would be to modify demo/inspector.slate to elegantly edit
the current environment. But enough of the future talk.
Okay. I've pulled these into the site repositories.
Post by Timmy Douglas
So I'd like to take a look at speeding up (or something to speed up
the GUI portion) of slate, but the only real dealings I have with
compilers are from when I wrote a tiger compiler in sml (following
appel's book for a class). Anyways, I didn't see any hint of where to
start from the last 1000 messages on this list, so I thought I'd start
a thread. So what are the options?
I've discussed this off-list with someone who was going to work on
it, but I haven't heard back from him after initial email exchanges.
I've CC'd him just to get some basic communication re-established,
hopefully. The options that we went over were:

1) Improve/port the experimental_jit.c. It already gives a 2-4x
speedup. The problem is that it does no dynamic inlining, so the
heavy message-send layering, which accounts for most of the
performance problem, is not taken care of.
2) Fix/complete Lee's optimizing compiler framework. This has a
couple of sub-options.
The direct option is to finish his x86 code generator, figure out
how to link it with the image, etc. Basically lots of stuff that I
have no idea how to do, and I don't know if I can get him to come
back to do it (although I'd try if enough people asked... or maybe
they should try themselves).
The other option is to add a back-end target to LLVM from the IR
code. This is slightly problematic because Lee wrote the IR to use
SSU (static single use) instead of SSA (static single assignment)
form, which are inversions of each other from a data-flow
perspective. Other than that, Lee's framework does "speak" LLVM at
least from the abstract perspective.
3) Write an inlining bytecode compiler (my idea, would work
independently of a JIT) and associated caching/flushing system.
4) Translate the VM into a direct-threaded style, which Eliot Miranda
endorsed at Smalltalk Solutions this year when I spoke with him. It
makes inlining much easier and has other benefits in terms of
architectural/code-manipulation simplifications.
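
To make option 4 a little more concrete, here is a minimal sketch of
threaded dispatch in C. It is not Slate's VM code: the Cell, VM, and
op_* names are hypothetical, and it uses function pointers (sometimes
called call threading) as a portable stand-in, since true direct
threading usually relies on a compiler extension such as GCC's
labels-as-values. What it tries to show is that the instruction stream
holds handler addresses directly, so the interpreter loop needs no
opcode decode or switch.

```c
#include <stddef.h>

/* Hypothetical miniature VM; none of these names come from Slate. */
typedef struct VM VM;
typedef union Cell Cell;
typedef void (*Handler)(VM *vm);

union Cell {
    Handler op;   /* address of the routine implementing an instruction */
    long    arg;  /* inline operand that follows some instructions */
};

struct VM {
    const Cell *ip;   /* next cell to execute */
    long stack[64];
    int  sp;          /* number of values currently on the stack */
    int  running;
};

/* Push the operand stored in the cell right after this instruction. */
static void op_push(VM *vm) { vm->stack[vm->sp++] = (vm->ip++)->arg; }

/* Pop two values and push their sum. */
static void op_add(VM *vm) {
    long b = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += b;
}

static void op_halt(VM *vm) { vm->running = 0; }

/* The whole interpreter loop: fetch a handler address and call it. */
static long run(const Cell *code) {
    VM vm = { code, {0}, 0, 1 };
    while (vm.running) {
        Handler h = (vm.ip++)->op;
        h(&vm);
    }
    return vm.stack[vm.sp - 1];
}

/* Builds and runs the program: push a; push b; add; halt. */
static long demo_sum(long a, long b) {
    Cell program[6];
    program[0].op = op_push; program[1].arg = a;
    program[2].op = op_push; program[3].arg = b;
    program[4].op = op_add;
    program[5].op = op_halt;
    return run(program);
}
```

With these definitions, demo_sum(2, 3) runs the four-instruction
program and yields 5. Because every instruction is just an address in
the stream, splicing one handler sequence into another (the basis for
inlining) becomes a matter of copying cells.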
Post by Timmy Douglas
In about a week, I'm going to get more busy though since another
summer class will start, but until then, I should have time to get
something done.
That likely won't be enough time to accomplish a deep change, but
some sketch code to start with would be feasible. Slate's VM
structure is pretty simple and malleable. All the bytecode-related
code is in vm.slate, for example.

I hope that David won't mind, but I've attached his initial VM
proposal email from a couple of months ago.
Mark Haniford
2006-06-12 01:28:28 UTC
This seems to be the second time that someone has had to stop work on
the gui because of the slowness of Slate. It might be time to ask Lee
to come back and work on the compiler/JIT stuff. Time keeps on
marching by and it's not waiting for Slate.
Post by Brian Rice
If anyone wants to discuss this intensively, I recommend Skype or IM.
My ID there is "water451" and my IM identifiers are in my signature
vCard.
--
-Brian
http://tunes.org/~water/brice.vcf
Brian Rice
2006-06-12 01:38:34 UTC
Post by Mark Haniford
This seems to be the second time that someone has had to stop work on
the gui because of the slowness of Slate. It might be time to ask Lee
to come back and work on the compiler/JIT stuff. Time keeps on
marching by and it's not waiting for Slate.
I agree; this observation has been impressed on me since the moment
he declared his lack of enthusiasm.

I just want this project to work and do useful things for people. I'd
sacrifice quite a bit of control to achieve that.

To Lee:
How can we persuade you to return somehow? Would it need to involve
shedding some of the formalities of a public open-source project?
--
-Brian
http://tunes.org/~water/brice.vcf
Mark Haniford
2006-06-12 02:43:08 UTC
So what is the "real" reason for Lee quitting in the first place?
Maybe if Lee were given the position of "Benevolent Dictator" then
he would come back.

But wasn't Lee the only person working on the VM anyway? Were other
people trying to bogart in on the VM, or were there arguments about the
design, or what?

I think Slate has some great ideas behind it.
Post by Brian Rice
How can we persuade you to return somehow? Would it need to involve
shedding some of the formalities of a public open-source project?
Brian Rice
2006-06-12 13:53:37 UTC
Post by Mark Haniford
So what is the "real" reason for Lee quitting in the first place?
Maybe if Lee was bestowed the position of "Benevolent Dictator" then
he would come back.
The last relevant quotes (over instant messaging) I got from him
about his stance towards the project were:

"i can say with absolute certainty i have no interest in continuing it"

"i'm disinterested in where the project is going and i'm not sure we
are capable of running it together to the mutual benefit of us both"

"as it has been it has been nothing more than parasitic to me"

If I may paraphrase further, basically he did not want to maintain a
public project at all, and only cared about a hobby-level
language+compiler+OS toolchain that ran on bare hardware. He actually
*suggested* that we focus on the IDE to make Slate more useful rather
than fret about performance. I'm not sure why he never took the
performance issue seriously enough to realize that the UI-based IDE
would be unusable without a dynamic inliner.

I am CC'ing him so that he can clarify his stance on this if he wishes.
Post by Mark Haniford
But wasn't Lee the only person working on the VM anyway?
Lee actually didn't originally like the idea of using a VM. I simply
introduced a (buggy) Slate-to-C translator a la Squeak and sketched
out a VM design just to get the ball rolling when we were stuck in a
Common Lisp interpreter. If we hadn't done that, we'd still be there
on CL since he never completed his compiler.

That said, he wrote most of the VM code and wound up maintaining it.
He has a propensity to write pages and pages of really interesting
code with no comments; I spoke with other people he has collaborated
with and they've confirmed this. So it's non-trivial to pick up code
that he wrote and run with it, and he got stuck maintaining what he
had written, with little to no desire to do so (apparently).
Post by Mark Haniford
Were other people trying to bogart in on the VM or were there
arguments about the design or what?
Towards the end there was a decent amount of clamoring for
continuation support, which he apparently found unwarranted. He
basically silently refused to code any support for it, while making
hand-waving explanations about how easy it would be to get a subset
of the functionality. At least a few people disagreed with him on the
matter.

No one really criticized or tried to mess with the basic VM design or
such; in fact I think he was its biggest critic.

The last that I heard, Lee was learning his father's real estate
business and selling real estate in/near Las Vegas. I suggested a few
open positions for the type of work he was doing with Slate, but it
didn't interest him. He probably makes good money and is totally
wasting his technical talent (or not - he occasionally just
contributes to a few projects as a donor).
Post by Mark Haniford
I think Slate has some great ideas behind it.
Post by Brian Rice
How can we persuade you to return somehow? Would it need to involve
shedding some of the formalities of a public open-source project?
--
-Brian
http://tunes.org/~water/brice.vcf
Timmy Douglas
2006-06-14 05:04:32 UTC
Post by Brian Rice
Post by Mark Haniford
But wasn't Lee the only person working on the VM anyway?
Lee actually didn't originally like the idea of using a VM. I simply
introduced a (buggy) Slate-to-C translator a la Squeak and sketched
out a VM design just to get the ball rolling when we were stuck in a
Common Lisp interpreter. If we hadn't done that, we'd still be there
on CL since he never completed his compiler.
Was the VM chosen because it's easier to make? I'm not really
up to date on the upsides of VMs other than portability and maybe
simplicity, but it's probably too late to argue one way or another
about that.
Post by Brian Rice
Post by Mark Haniford
Were other people trying to bogart in on the VM or were there
arguments about the design or what?
Towards the end there was a decent amount of clamoring for
continuation support, which he apparently found unwarranted. He
basically silently refused to code any support for it, while making
hand-waving explanations about how easy it would be to get a subset
of the functionality. At least a few people disagreed with him on
the matter.
Well, I took a look at "Continuations, concurrency, yada yada..." It
seems like he was looking for a good idiom for continuations to
represent. I can't really argue for them, though, since I've never
really used them (the call/cc type at least). Oh well.
Brian Rice
2006-06-14 06:58:18 UTC
Post by Timmy Douglas
Post by Brian Rice
Post by Mark Haniford
But wasn't Lee the only person working on the VM anyway?
Lee actually didn't originally like the idea of using a VM. I simply
introduced a (buggy) Slate-to-C translator a la Squeak and sketched
out a VM design just to get the ball rolling when we were stuck in a
Common Lisp interpreter. If we hadn't done that, we'd still be there
on CL since he never completed his compiler.
Was the reason behind making a VM because it's easier to make? I'm not
really up-to-date with the good-sides of VMs other than portability
and maybe simplicity, but it's probably too late to argue one way or
another with that.
1) Tiny: Common Lisp images are huge - 3 MB at the minimum. The Slate
VM starts at 32 KB if you build it minimally.
2) Simple: Implements only the PMD (Prototypes with Multiple Dispatch)
dispatch algorithm plus basic memory, control-flow, and operations.
3) Portable: We generate ANSI C (vm.c and vm.h) and specialize to
POSIX/Win32/etc. as is suitable.

Also, it is the minimal way in which Slate system code could be
written in Slate - self-hosting was a major plus for us, to get an
idea of what the next stage problems would be in terms of using Slate
for systems code. It also *is* faster than the Common Lisp interpreter.
Post by Timmy Douglas
Post by Brian Rice
Post by Mark Haniford
Were other people trying to bogart in on the VM or were there
arguments about the design or what?
Towards the end there was a decent amount of clamoring for
continuation support, which he apparently found unwarranted. He
basically silently refused to code any support for it, while making
hand-waving explanations about how easy it would be to get a subset
of the functionality. At least a few people disagreed with him on
the matter.
Well, I took a look at "Continuations, concurrency, yada yada..." It
seems like he was looking for a good idiom for continuations to
represent. I can't really argue for them though since I've never
really used them (the call/cc type at least). oh well
Yeah, but at issue was the fact that he resisted email interchange of
any kind - his replies only dampened further discussion. I wound up
writing emails for him, most of the time.

--
-Brian
http://tunes.org/~water/brice.vcf
Timmy Douglas
2006-06-12 05:36:22 UTC
Post by Brian Rice
Post by Timmy Douglas
Well, I've added patches to my repos for simple one-way undo and now
you can type in all the characters fine. The problem is that there is
no point in going further (selections, copy paste, searching,
kill-line, cleaning up my ugly code, etc) when the current gui is too
slow to type in. I guess my ideal path after those features to the
text buffer would be to modify demo/inspector.slate to elegantly edit
the current environment. But enough of the future talk.
Okay. I've pulled these into the site repositories.
Thanks. I made another patch with drop-mark, delete-region, and
kill-line (which for now behaves differently than Emacs since it does
drop-mark, end-of-line, and delete-region in one command, which sucks
up the next line I think).
Post by Brian Rice
Post by Timmy Douglas
So I'd like to take a look at speeding up (or something to speed up
the GUI portion) of slate, but the only real dealings I have with
compilers are from when I wrote a tiger compiler in sml (following
appel's book for a class). Anyways, I didn't see any hint of where to
start from the last 1000 messages on this list, so I thought I'd start
a thread. So what are the options?
I've discussed this off-list with someone who was going to work on
it, but I haven't heard back from him after initial email exchanges.
I've CC'd him just to get some basic communication re-established,
1) Improve/port the experimental_jit.c. It already gives a 2-4x
speedup. The problem is that it does no dynamic inlining, so that the
huge message-send layering which is the majority of the performance
problem is not taken care of.
Hm, this sort of seems like a temporary patch to speed things up, but
I'm not sure it is enough. You'd have to try out the UI to see for
yourself, but 2-4x doesn't seem like it will fix the problem. It looks
like it might be the easiest option though. It's weird because
sometimes there are long delays and then the events will all get
processed by the UI fairly quickly (1 sec between typed characters
showing up) compared to the delay before the events registered (like
2-10 sec after the first keypress). But if you try inspector.slate and
try to drag the inspector windows you will go crazy. I guess mouse
motion events really drag it down or something along those lines.


I'm not really that familiar with how Slate's compiler/interpreter
works or builds now. I was hoping to find something that would tell me
how mobius/ is built, since it's already Slate code and I'm not
sure what can actually build that first stage (which I assume produces
the vm.c file in the base directory). Can someone point me to docs on
how the whole build process works? I think it'd go a long way toward
understanding the VM... so far I've just been opening up millions
of source files, but it's hard to get the whole picture from that and
mobius.pdf.
Post by Brian Rice
2) Fix/complete Lee's optimizing compiler framework. This has a
couple of sub-options.
The direct option is to finish his x86 code generator, figure out
how to link it with the image, etc. Basically lots of stuff that I
have no idea how to do, and I don't know if I can get him to come
back to do it (although I'd try if enough people asked... or maybe
they should try themselves).
Yeah, there is quite a bit of code in src/mobius/optimizer. So you're
saying none of that is being used at the moment?
Post by Brian Rice
The other option is to add a back-end target to LLVM from the IR
code. This is slightly problematic because Lee wrote the IR to use
SSU (single static usage) instead of SSA (single static assignment)
form, which are inversions of each other from a data-flow
perspective. Other than that, Lee's framework does "speak" LLVM at
least from the abstract perspective.
Sounds like we probably wouldn't have to worry about optimizations
once we got it into LLVM's hands. I don't know how good that'd be. I
guess I'd have to spend a week figuring out how the optimization
framework works first.
Post by Brian Rice
3) Write an inlining bytecode compiler (my idea, would work
independently of a JIT) and associated caching/flushing system.
Well, I'm pretty new to this stuff so I don't really know what would
get cached and flushed... I went to the library today since I saw you
thought ACD&I(?) was a good book (in some IRC conversation a while
back) but they only let me check it out for 2 hours in-library... I
gave up after 30 min since there wasn't much I could do there.
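
For what it's worth, one common reading of "caching/flushing" in this
context (an assumption here, not something spelled out in the thread)
is: cache the result of each expensive method lookup, keyed by
selector and receiver type, and flush the cache whenever a method
definition changes so that stale targets are never used. A minimal
sketch in C follows. All names are hypothetical, and slow_lookup is
only a stand-in for real dispatch.

```c
#include <stddef.h>

#define CACHE_SIZE 64   /* must be a power of two for the mask below */

typedef int (*Method)(int);

typedef struct {
    int    selector;   /* hypothetical selector id */
    int    type_id;    /* hypothetical receiver type id */
    Method target;     /* cached lookup result, NULL when the slot is empty */
} CacheEntry;

static CacheEntry cache[CACHE_SIZE];
static int slow_lookups = 0;   /* counts how often the slow path ran */

static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* Stand-in for full dispatch: the expensive path we want to avoid. */
static Method slow_lookup(int selector, int type_id) {
    (void)type_id;
    slow_lookups++;
    return selector == 0 ? twice : negate;
}

/* Cached lookup: consult the table first, fall back to the slow path. */
static Method lookup(int selector, int type_id) {
    unsigned idx = ((unsigned)selector * 31u + (unsigned)type_id)
                   & (CACHE_SIZE - 1);
    CacheEntry *e = &cache[idx];
    if (e->target != NULL && e->selector == selector
        && e->type_id == type_id) {
        return e->target;               /* hit */
    }
    e->selector = selector;             /* miss: fill the slot */
    e->type_id  = type_id;
    e->target   = slow_lookup(selector, type_id);
    return e->target;
}

/* Call this whenever a method is added, removed, or redefined. */
static void flush_cache(void) {
    size_t i;
    for (i = 0; i < CACHE_SIZE; i++) {
        cache[i].target = NULL;
    }
}
```

Calling lookup(0, 1) twice runs slow_lookup only once; after
flush_cache() the next call takes the slow path again. An inlining
compiler would need the same kind of invalidation for any code it has
inlined based on a lookup result.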
Post by Brian Rice
4) Translate the VM into a direct-threaded style, which Eliot Miranda
endorsed at Smalltalk Solutions this year when I spoke with him. It
makes inlining much easier and has other benefits in terms of
architectural/code-manipulation simplifications.
ok
Post by Brian Rice
Post by Timmy Douglas
In about a week, I'm going to get more busy though since another
summer class will start, but until then, I should have time to get
something done.
That likely won't be enough time to accomplish a deep change, but
some sketch code to start with would be feasible. Slate's VM
structure is pretty simple and malleable. All the bytecode-related
code is in vm.slate, for example.
I hope that David won't mind, but I've attached his initial VM
proposal email from a couple of months ago.
Well, if he takes the PowerPC asm route, I hope there is a way that I
could run it on this old Athlon box.

Thanks for the reply.
Tony Garnock-Jones
2006-06-12 10:24:16 UTC
Post by Timmy Douglas
processed by the ui fairly quickly (1 sec between typed characters
showing up) compared to the delay before the events registered (like
2-10 sec after the first keypress). But if you try inspector.slate and
try to drag the inspector windows you will go crazy. I guess mouse
motion events really drag it down or something along those lines.
It sounds like there's something rotten in the interface to SDL, rather
than basic slowness of Slate... well, I'd be surprised if it was Slate
since I've always been pleasantly surprised at the speed of the system
in general. Perhaps there's something sub-par in the polling logic?

I can't say for sure, of course, since I've been concentrating on other
aspects of the system and haven't tried the SDL interfaces for a long,
long time!
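
Whether the delay really comes from the polling logic is not
established in this thread. As a sketch of one common remedy if it
does: drain the whole event queue once per redraw and collapse each
run of consecutive mouse-motion events into the most recent one, so a
drag triggers one update per frame instead of one per motion event.
The Event type and coalesce_motion function below are hypothetical and
deliberately avoid any real SDL calls.

```c
#include <stddef.h>

typedef enum { EV_KEY, EV_MOTION } EventKind;

typedef struct {
    EventKind kind;
    int x, y;   /* pointer position, meaningful for EV_MOTION */
    int key;    /* key code, meaningful for EV_KEY */
} Event;

/* Compacts events[0..n) in place so that every run of consecutive
   EV_MOTION events is replaced by the last event of that run.
   Key events are kept untouched and in order.
   Returns the new number of events. */
static size_t coalesce_motion(Event *events, size_t n) {
    size_t out = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        if (out > 0 && events[i].kind == EV_MOTION
            && events[out - 1].kind == EV_MOTION) {
            events[out - 1] = events[i];   /* newer motion replaces older */
        } else {
            events[out++] = events[i];
        }
    }
    return out;
}
```

Given the queue motion, motion, key, motion, motion, the function
leaves three events: the second motion, the key, and the last motion.
Motion events are never merged across a key event, so the ordering of
input relative to pointer position is preserved.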
Post by Timmy Douglas
Well I'm pretty new to this stuff so I don't really know what would
get cached and flushed... I went to the library today since I saw you
thought ACD&I(?) was a good book (on some irc conversation a while
back) but they only let me check it out for 2 hours in-library... I
gave up after 30 min since there wasn't much I could do there.
Try the Self papers:

http://citeseer.ist.psu.edu/chambers92design.html
http://citeseer.ist.psu.edu/chambers91making.html
http://citeseer.ist.psu.edu/hlzle94adaptive.html
http://citeseer.ist.psu.edu/chambers90iterative.html

They changed my life ;-)
Post by Timmy Douglas
Well, if he takes the powerpc asm route, I hope there is a way that I
could run it on this old athlon box.
One important aspect of a direct-threaded design, ISTM, is the
instruction encodings... if someone can design a fixed-width "VLIW"
format for direct-threaded bytecodes, then the same architecture ought
to be able to apply to x86 as well as PPC.
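
As a rough illustration of what a fixed-width format could look like
(the field layout below is purely an assumption, not a proposal from
this thread): each instruction is a single 32-bit word with an 8-bit
opcode and three 8-bit operand fields, so a loader on any architecture
can decode it with shifts and masks before mapping opcodes to handler
addresses.

```c
#include <stdint.h>

/* Hypothetical fixed-width instruction: one opcode, three operands. */
typedef struct {
    uint8_t op;        /* opcode */
    uint8_t a, b, c;   /* operand fields, e.g. register or slot indices */
} Insn;

/* Packs the fields into one 32-bit word: op in the top byte, c in the
   bottom byte. The word's numeric value is the same on x86 and PPC;
   only its byte order in memory differs. */
static uint32_t insn_encode(Insn i) {
    return ((uint32_t)i.op << 24) | ((uint32_t)i.a << 16)
         | ((uint32_t)i.b << 8)   |  (uint32_t)i.c;
}

/* Unpacks a 32-bit word back into its four fields. */
static Insn insn_decode(uint32_t w) {
    Insn i;
    i.op = (uint8_t)(w >> 24);
    i.a  = (uint8_t)(w >> 16);
    i.b  = (uint8_t)(w >> 8);
    i.c  = (uint8_t)w;
    return i;
}
```

Since every instruction has the same size, instruction N sits at a
fixed offset from the start of the method, which keeps jump targets
and code patching simple.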

Tony
Nick Forde
2006-06-12 12:27:32 UTC
Post by Tony Garnock-Jones
Post by Timmy Douglas
processed by the ui fairly quickly (1 sec between typed characters
showing up) compared to the delay before the events registered (like
2-10 sec after the first keypress). But if you try inspector.slate and
try to drag the inspector windows you will go crazy. I guess mouse
motion events really drag it down or something along those lines.
It sounds like there's something rotten in the interface to SDL, rather
than basic slowness of Slate... well, I'd be surprised if it was Slate
since I've always been pleasantly surprised at the speed of the system
in general. Perhaps there's something sub-par in the polling logic?
I haven't tried the SDL interface but you can get a very rough idea
of the basic VM performance by running 'make benchmark' and
comparing the results to those at: http://shootout.alioth.debian.org/
Last time I tried this the results were unfortunately not good :-(

The source for these tests can be found in test/benchmark and there is
some documentation in doc/benchmarks.txt.

Nick.
Brian Rice
2006-06-12 13:27:18 UTC
Post by Nick Forde
Post by Tony Garnock-Jones
Post by Timmy Douglas
processed by the ui fairly quickly (1 sec between typed characters
showing up) compared to the delay before the events registered (like
2-10 sec after the first keypress). But if you try inspector.slate and
try to drag the inspector windows you will go crazy. I guess mouse
motion events really drag it down or something along those lines.
It sounds like there's something rotten in the interface to SDL, rather
than basic slowness of Slate... well, I'd be surprised if it was Slate
since I've always been pleasantly surprised at the speed of the system
in general. Perhaps there's something sub-par in the polling logic?
I haven't tried the SDL interface but you can get a very rough idea
of the basic VM performance by running 'make benchmark' and
comparing the results to those at: http://shootout.alioth.debian.org/
Last time I tried this the results were unfortunately not good :-(
*ahem* Just to be pedantic, that's not the VM's performance, that's
the VM+image's performance. There are a lot of message sends that
happen for basic control-flow operations, for example, or even for
just executing a non-literal (argument to a method) block.

That said, the VM design is rather naive and could be much better,
but not by the order-of-magnitude gap that the shootout benchmarks show.
The performance problem we have relates to the Strongtalk design - we
just have no inlining compiler to rely on.
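
To illustrate the cost being described, here is a small C sketch (not
Slate code; every name in it is hypothetical) that models a
conditional the way a pure message-passing image does: the condition
is an object, the conditional is a send to that object, and the chosen
block is evaluated by another send. The second function shows what an
inlining compiler could reduce the same loop to.

```c
typedef long (*Block)(void);

typedef struct Bool Bool;
struct Bool {
    long (*if_true_if_false)(Block t, Block f);
};

static unsigned long sends = 0;   /* counts simulated message sends */

/* Evaluating a block is itself a send. */
static long send_value(Block b) { sends++; return b(); }

static long true_branch(Block t, Block f)  { (void)f; return send_value(t); }
static long false_branch(Block t, Block f) { (void)t; return send_value(f); }

static const Bool TRUE_OBJ  = { true_branch };
static const Bool FALSE_OBJ = { false_branch };

/* The conditional is a send to the boolean object. */
static long send_if(const Bool *cond, Block t, Block f) {
    sends++;
    return cond->if_true_if_false(t, f);
}

static long one(void)  { return 1; }
static long zero(void) { return 0; }

/* Uninlined: two sends per element just to decide whether to add 1. */
static long count_positive_sends(const long *xs, int n) {
    long total = 0;
    int i;
    for (i = 0; i < n; i++) {
        const Bool *cond = xs[i] > 0 ? &TRUE_OBJ : &FALSE_OBJ;
        total += send_if(cond, one, zero);
    }
    return total;
}

/* Inlined: the same logic as a plain branch, with no sends at all. */
static long count_positive_inlined(const long *xs, int n) {
    long total = 0;
    int i;
    for (i = 0; i < n; i++) {
        if (xs[i] > 0) total += 1;
    }
    return total;
}
```

For a four-element array, the first version performs eight simulated
sends and the second performs none, while both return the same count.
In a real image the comparison and the loop itself would also be
sends, so the actual ratio is likely worse than this sketch suggests.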
Post by Nick Forde
The source for these tests can be found in test/benchmark and there is
some documentation in doc/benchmarks.txt.
--
-Brian
http://tunes.org/~water/brice.vcf