Tom Edelson's Songline

Writing about computers, life, and society from the perspective of a "poly Quaker Taoist" living in the Triangle region of North Carolina.

Wednesday, April 16, 2008

Comparing JScheme and SISC: Tradeoffs in Programming Language Design (part 4)

As I explained in the previous post, in JScheme, an integer value is always tagged with a specific type. It might be the default Java integer type, int; or it might be one of the alternatives, such as long. But if it's an integer value which will be handled by JScheme's built-in facilities for same, then its type, as seen by the JScheme interpreter, will be one of the built-in Java integer types.

Furthermore, each of those built-in Java integer types has a size limit, a largest positive [and largest negative] value that can be represented. A long takes up 64 bits, as opposed to 32 bits for an int, and so the maximum positive value for a long is 2⁶³-1, as opposed to 2³¹-1. But there is always a limit; and as a result, there is always the possibility that a seemingly simple calculation may produce a wrong answer, in JScheme as in Java.
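To make that kind of wrong answer concrete, here is a minimal Java sketch. The class and method names are made up for illustration; the only thing it relies on is Java's standard behavior, in which int and long arithmetic silently wraps around on overflow.

```java
class OverflowDemo {
    // Adds 1 to the largest value an int can hold.
    static int incrementMaxInt() {
        int largest = Integer.MAX_VALUE; // 2^31 - 1 = 2147483647
        return largest + 1;              // silently wraps to Integer.MIN_VALUE
    }

    // The same thing with a long: a bigger limit, but still a limit.
    static long incrementMaxLong() {
        long largest = Long.MAX_VALUE;   // 2^63 - 1
        return largest + 1;              // silently wraps to Long.MIN_VALUE
    }

    public static void main(String[] args) {
        System.out.println(incrementMaxInt());  // prints -2147483648
        System.out.println(incrementMaxLong()); // prints -9223372036854775808
    }
}
```

No error is reported in either case; the program simply carries on with a negative number where a large positive one was expected.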

In SISC, on the other hand, an integer is, so far as is visible to the SISC programmer, simply an integer. The SISC interpreter does some work behind the scenes to keep track of the amount of memory that must be reserved for the value, since that is variable, with no predefined limit. But integers of different sizes are not considered to be of different types.

As a result of this design difference, SISC cannot produce the sort of wrong answers to which I referred above, those due to "integer overflow". The memory space occupied by the result is as big as it needs to be in order to accommodate the mathematically correct answer.
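Java itself offers a rough analogy in its standard java.math.BigInteger class, whose values grow as needed. The sketch below is only that, an analogy: it is not a claim about how SISC stores its integers, and the class and method names are invented for the example.

```java
import java.math.BigInteger;

class BigIntegerDemo {
    // Adds 1 to the largest long, using arbitrary-precision arithmetic.
    static BigInteger incrementMaxLong() {
        BigInteger largest = BigInteger.valueOf(Long.MAX_VALUE); // 2^63 - 1
        return largest.add(BigInteger.ONE);                      // 2^63, no wraparound
    }

    public static void main(String[] args) {
        System.out.println(incrementMaxLong()); // prints 9223372036854775808
    }
}
```

The result no longer fits in a long, but it is the mathematically correct one.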

This can be viewed as an advantage of SISC over JScheme, and also over Java ... and over most other programming languages, for that matter.

Furthermore, in respect to both the original design difference, and the difference in mathematical correctness that results from it, SISC is conforming to the R5RS Scheme standard, and JScheme is not.

That's all a recap of what I've said before. The new business for the present post is to explain why I say that this advantage for SISC necessarily comes with a disadvantage, when it comes to calling a Java method ("method" being, pretty much, Java's name for what might otherwise be called a routine, procedure, or function) ... specifically, when passing integer parameters from Scheme code to the Java method.

As background to this, one must know and remember certain facts about Java methods themselves. The first of these is that when you create a Java method, you explicitly declare a type for each of its parameters; and in order to call the method, you must supply a list of parameters (or, more strictly speaking, "arguments") whose types match the types in the declared parameter list, either exactly or by way of one of the few implicit "widening" conversions that Java allows (an int can be passed where a long is expected, but not the reverse). For example, if the method's definition says that there must be two arguments, the first an int and the second a String, you cannot call it with a long and a String, instead.

The second fact is that a Java class, or source file, can contain two (or more) methods with the same name. You can have one which expects an int and a String, and another, with the same name, which expects a long and a String. Their implementations will [necessarily] be different as well, but the difference in parameter types is the only difference which the underlying language software can use in order to figure out which of the two methods will be called. (We would say, of these two example methods, that the type declared for the first parameter is the only difference in their "signatures". That's because "signature", for a method, is effectively defined to mean: that collection of facts about it which are relevant to determining whether or not it should be called (or "invoked") in any given situation.)
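Here is a small Java sketch of two such methods. The names (OverloadDemo, describe) are hypothetical, chosen only for the example; the point is that the two signatures differ solely in the type of the first parameter, and the type of the argument supplied decides which one runs.

```java
class OverloadDemo {
    // Two methods with the same name; only the first parameter's type differs.
    static String describe(int n, String label) {
        return "int version: " + label + "=" + n;
    }

    static String describe(long n, String label) {
        return "long version: " + label + "=" + n;
    }

    public static void main(String[] args) {
        int small = 42;
        long big = 42L;
        System.out.println(describe(small, "x")); // prints int version: x=42
        System.out.println(describe(big, "x"));   // prints long version: x=42
    }
}
```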

(When the calling code is written in Java as well, this determination can be made at compile time. When the calling code is Scheme (JScheme or SISC), this can, in general, only be determined at run time. This difference between the languages, while important in other contexts, is not very relevant to the particular difference between JScheme and SISC which is our topic today.)

(I will also mention that this particular fact about Java, the fact that you can have two methods with the same name, and thus distinguishable only by their parameter lists, is one of my least favorite things about Java. It is undoubtedly considered a good thing by many who program primarily in Java itself; but for those interested in calling Java from a "dynamic" language like Scheme, it is the source of much pain, largely because of just the sort of consequences of it that are under discussion here. However, I generally try not to fuss much about what I do or don't like about Java; Java is just there, y'know?)

So anyway, now we can look at the consequences of these facts about Java, as they affect calling Java from JScheme; and contrast that with how they affect calling Java from SISC.

From JScheme, this is relatively straightforward. I'll continue to use the example in which there are two candidate methods, having the same name, with the only difference in their signatures being that the first method requires an int in the first argument position, while the second method requires a long.

When the JScheme interpreter encounters a method call in which the given method name, and other context, point at a choice between these two candidate methods, it decides between them by examining the internal tags which indicate the types of the arguments being supplied. If the arguments are tagged as an int and a String, the first method will be invoked. If they are tagged as a long and a String, the second method will be invoked. If anything else, a run-time error will be signaled.
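The idea of choosing a method at run time from the tags on the arguments can be sketched in Java with reflection. This is an illustration of the general technique only, under the assumption that each value carries its Java type as its run-time class; it is not JScheme's actual implementation, and every name in it is invented.

```java
import java.lang.reflect.Method;

class TaggedDispatchDemo {
    static String describe(int n, String label) {
        return "int version: " + label + "=" + n;
    }

    static String describe(long n, String label) {
        return "long version: " + label + "=" + n;
    }

    // Translate a value's run-time "tag" (its boxed class) into a Java parameter type.
    static Class<?> parameterTypeFor(Object arg) {
        if (arg instanceof Integer) return int.class;
        if (arg instanceof Long) return long.class;
        if (arg instanceof String) return String.class;
        throw new IllegalArgumentException("no Java type for " + arg.getClass());
    }

    // Look up the method whose signature matches the tags exactly, then invoke it.
    static Object callByName(String name, Object first, Object second) {
        try {
            Method m = TaggedDispatchDemo.class.getDeclaredMethod(
                    name, parameterTypeFor(first), parameterTypeFor(second));
            return m.invoke(null, first, second); // null receiver: the methods are static
        } catch (ReflectiveOperationException e) {
            // Stands in for the run-time error signaled when nothing matches.
            throw new IllegalStateException("no applicable method: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("describe", 42, "x"));  // prints int version: x=42
        System.out.println(callByName("describe", 42L, "x")); // prints long version: x=42
    }
}
```

Because every integer value here is tagged as either an Integer or a Long, the lookup never has to guess.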

Now consider the same situation, except that SISC, not JScheme, is the Scheme implementation being used. In SISC, just as in JScheme (or any Scheme implementation), the values are tagged, internally, with some indication of the "type" of each. But in SISC, as I said above, an integer is just an integer ... meaning that it is tagged only as such, not as one of the various specific Java integer types such as int or long.

Let me make clear that in SISC, you can have a value tagged as a Java int or as a Java long. Indeed, one needs to make use of this capability in order to resolve the difficulty under discussion. But a value tagged in either of these ways is not considered to be an "integer" by SISC: it doesn't let you use either of them in its native arithmetic operations. There's really not much you can do with them, other than to pass them to Java methods, or to convert them to actual Scheme integers.

But to return to the situation at hand: the SISC program wants to call one of these two candidate methods. It supplies an argument list in which the first parameter is tagged internally simply as a Scheme integer, not as a Java int or long. But only by treating this argument as an int, or as a long, can the SISC interpreter legitimately determine which method to call. So how can it make this decision?

The answer is that it can't. If the SISC program attempts this, an error will be signaled. Since there's no way for the SISC interpreter to make the decision, the person writing the SISC program must make it. SISC must, and does, require that the SISC code first explicitly convert this Scheme integer value into either a Java int or a Java long, before attempting to pass it to one of these methods. And [obviously?], the type to which it is converted determines which method is invoked.

The conversion is not, technically, part of the method call, but it is a prerequisite to it ... if you want to call one of these methods, and the value you want to pass as the first argument is [for example] the result of an arithmetic calculation done in the Scheme code.

Having to do an explicit conversion means having to write additional code; but we're not talking about a lot of additional code here. Let me [finally!] show you what the code would actually look like. In so doing, I'll simplify the example: now our two candidate methods each requires one argument, the first method requiring it to be an int, and the second a long.

The conversion, together with the method call, could look like this in SISC:

(my-method (->jint scheme-value))

in order to call the first method, or

(my-method (->jlong scheme-value))

in order to call the second. Whereas in JScheme, it could look simply like this:

(my-method scheme-value)

(Of course, in either implementation, it couldn't look exactly like what I've shown ... unless you had first defined the symbol my-method so that its value is the appropriate "generic Java method". That prior step would look different in SISC from how it would look in JScheme; I'm assuming that it's already been made, in order to abstract away this syntactic difference, and focus on the difference concerning the data type conversion, or lack of same, in SISC or JScheme, respectively.

In fact, what I am doing here is the opposite of what I did in part 2. There, my original SISC example was

(define later (->boolean ((generic-java-method 'after) date2 date1)))

in which the "->boolean", plus one set of parentheses, represented one of these type conversions which SISC requires, and JScheme does not. Then I waved my hand at the type conversion, promising to explain it later ("planning to come back, in a subsequent post, to the issues around it"), and then abstracted it out, in order to focus on the syntactic difference. The "subsequent post" in question is the one you are reading now.)
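For readers who think more readily in Java, the same choice can be sketched there, with java.math.BigInteger standing in for an unbounded Scheme integer. This is an analogy under that assumption, with hypothetical names throughout; in particular, it makes no claim about how ->jint or ->jlong behave when the value is out of range.

```java
import java.math.BigInteger;

class ExplicitConversionDemo {
    static String myMethod(int n) {
        return "int version";
    }

    static String myMethod(long n) {
        return "long version";
    }

    // myMethod(value) with a BigInteger argument would not compile: neither
    // overload accepts it, so the caller has to pick a conversion explicitly.
    static String callAsInt(BigInteger value) {
        return myMethod(value.intValueExact());  // throws ArithmeticException if too big
    }

    static String callAsLong(BigInteger value) {
        return myMethod(value.longValueExact()); // throws ArithmeticException if too big
    }

    public static void main(String[] args) {
        BigInteger value = BigInteger.valueOf(7);
        System.out.println(callAsInt(value));  // prints int version
        System.out.println(callAsLong(value)); // prints long version
    }
}
```

The conversion the caller chooses is what selects the method, just as in the SISC calls above.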

I will claim that I have now shown you a very good example of a necessary tradeoff in language design. I stress that it is in language design, not language choice: it's not a question of having to accept the whole package of choices that one or another language designer made, with it being [merely] statistically close-to-impossible that either designer will have made each choice exactly as you would wish.

The tradeoff here could be described, either from the [presumed] perspective of the JScheme designers/implementors, or from the perspective of those creating SISC.

For the people designing JScheme, it seems clear that a top priority was given to making calls from JScheme to Java as easy ... and succinct ... as possible. That means you can't require explicit type conversions, for those are additional code. The only way to make the calls, unambiguously, without requiring explicit type conversions (in at least some cases) is to have the JScheme types correspond one-to-one with Java types. Each of the built-in integer types in Java has a predefined size limit; so the same must be true of each one in JScheme.

Finally, if there is a finite set of integer data types, each with a predefined size limit, then there must be some cases in which you can code an arithmetic operation, and give it input values, each of which fits within the range of values that can be represented; but the actual correct result of the arithmetic operation, as that operation is understood outside "computer arithmetic", is outside the range of values that can be represented. Since the correct answer cannot be represented, JScheme cannot give you that correct answer as the result of the operation.

The people designing SISC, on the other hand, seem to have had a fundamental design goal of fully implementing the R5RS Scheme specification. The specification requires that an implementation not give incorrect arithmetic results, at least not of the sort that are entailed by fixed limitations on the range of representable values. Therefore SISC, or any other fully conforming Scheme implementation, cannot have such predefined limits on the range of values.

But the built-in Java integer types all do have such predefined limits. Therefore the SISC types cannot correspond one-to-one with the Java types. Therefore (and given the fact that two Java methods can have the same name, and be distinguishable only by the types in their parameter lists), if you were allowed simply to pass values of Scheme types as arguments to Java methods, there would necessarily be cases in which the interpreter could not determine which method to invoke. Therefore SISC must require explicit type conversions, in at least some cases, in order to eliminate these ambiguities.

In short, sometimes you can't have it both ways.
Categorie(s) for this post: Scheme.

3:42:59 PM