To my surprise, the following code (Jasmin source) is verifiable and runs without throwing any exceptions (Sun's JRE 1.4.1):

.class public ifmerge3
.super java/lang/Object
.field public static r Ljava/lang/Runnable;

.method public <init>()V
    .limit locals 2
    .limit stack 8
    aload_0
    invokenonvirtual java/lang/Object/<init>()V
    return
.end method

.method public static main([Ljava/lang/String;)V
    .limit stack 5
    .limit locals 4
    ldc ""
    putstatic ifmerge3/r Ljava/lang/Runnable;
    return
.end method
The interesting lines are ldc "" and putstatic, which store a String in a field of type Runnable. It is apparently legal (the VM spec doesn't really talk about this) to use any object reference in any place where an interface reference is expected.
This has an interesting implication for the performance of interfaces in Java. It means that whenever an interface method is invoked, the VM will always need to check whether the referenced object does in fact implement the interface in question. This should make interface invocation slower. To test this theory, I wrote a small benchmark:
class ifperf implements Runnable {
    public void run() {}

    public static void main(String[] args) {
        Runnable r = new ifperf();
        Runnable[] ar = new Runnable[10000];
        for(int i = 0; i < ar.length; i++)
            ar[i] = r;
        long start = System.currentTimeMillis();
        for(int i = 0; i < 1000; i++)
            for(int j = 0; j < ar.length; j++)
                ar[j].run();
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
Sun's J2RE 1.4.1: 300 ms
IKVM (on .NET 1.1 beta): 150 ms
John Lam has written about the implementation of .NET's interface method dispatch here. I haven't been able to find any articles that talk about HotSpot's implementation of interface method dispatch. Let's look at the code that is generated by both JITs.
.NET 1.1 beta code (inner loop only):

07591DCC cmp edi,dword ptr [esi+4]
07591DCF jl 07591DD4
07591DD1 inc ebx
07591DD2 jmp 07591D91
07591DD4 cmp edi,dword ptr [esi+4]
07591DD7 jae 07591DED
07591DD9 mov ecx,dword ptr [esi+edi*4+0Ch]
07591DDD mov eax,dword ptr [ecx]
07591DDF mov eax,dword ptr [eax+0Ch]
07591DE2 mov eax,dword ptr [eax+94h]
07591DE8 call dword ptr [eax]
07591DEA inc edi
07591DEB jmp 07591DCC

07591E80 ret
J2RE 1.4.1 HotSpot code (inner loop only):

00AC7023 mov esi,dword ptr [ebp-18h]
00AC7026 mov edi,dword ptr [ebp-0Ch]
00AC7029 cmp edi,dword ptr [esi+8]
00AC702C jae 00AC7146
00AC7032 mov esi,dword ptr [esi+edi*4+0Ch]
00AC7036 mov dword ptr [esp],esi
00AC7039 mov ecx,esi
00AC703B cmp eax,dword ptr [ecx]
00AC703D mov eax,6B68778h
00AC7042 call 00ABCE25
00AC7047 inc dword ptr [ebp-0Ch]
00AC704A mov esi,dword ptr [ebp-18h]
00AC704D mov esi,dword ptr [esi+8]
00AC7050 mov edi,dword ptr [ebp-0Ch]
00AC7053 es:
00AC7054 cs:
00AC7055 fs:
00AC7056 gs:
00AC7057 nop
00AC7058 cmp edi,esi
00AC705A jl 00AC7023

00ABCE25 mov edx,dword ptr [eax+0Ch]
00ABCE28 mov ebx,dword ptr [eax+8]
00ABCE2B cmp edx,dword ptr [ecx+4]
00ABCE2E jne 00ABB6C0
00ABCE34 cmp dword ptr [ebx+38h],0
00ABCE3B jne 00ABB8C0
00ABCE41 jmp 00A67940

00A67940 ret
The first thing to notice is that HotSpot (in this particular example) is not very good at optimizing register usage compared to the .NET JIT. .NET uses EDI as the loop counter and ESI as the array reference, whereas HotSpot keeps both the counter and the array reference on the stack and reloads them on every iteration. This difference probably accounts for a large part of the performance difference.
The second thing to notice is that neither HotSpot nor .NET eliminates the array bounds check (again, in this particular example).
The code at 00ABCE25 is where it gets really interesting. After running for a while, HotSpot noticed that all calls to Runnable.run() at this site actually resolved to ifperf.run(), so it emitted code that takes advantage of that fact. However, since it cannot prove that this will always be true, the generated code checks that the reference does indeed refer to an ifperf object (the cmp instruction at 00ABCE2B does this). If it doesn't, the code jumps to 00ABB6C0, which does the interface method lookup and patches the call at 00AC7042 to point to a new location; that location does the interface method lookup as well as the profiling that started this optimization in the first place. I'm not sure what the second comparison does, but it's something similar. The final branch at 00ABCE41 jumps to the ifperf.run() implementation (bypassing any dynamic dispatch!).
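To make the shape of that guarded call easier to see, here is a rough model of it in plain Java. This is only a sketch: the class and method names (InlineCacheSketch, callRun, IfPerf, Other) and the two counters are made up for illustration, and the real thing is patched machine code, not a Java method.

```java
// Rough Java model of a guarded (monomorphic) call site.
// All names are hypothetical; HotSpot does this in generated machine code.
public class InlineCacheSketch {
    static int fastPathHits = 0;
    static int slowPathHits = 0;

    static class IfPerf implements Runnable {
        public void run() {}
    }

    static class Other implements Runnable {
        public void run() {}
    }

    // Models the stub at 00ABCE25: compare the receiver's class with the one
    // class seen so far and only fall back to a full lookup on a mismatch.
    static void callRun(Runnable receiver) {
        if (receiver.getClass() == IfPerf.class) {
            fastPathHits++;
            ((IfPerf) receiver).run();   // target is known: no interface lookup
        } else {
            slowPathHits++;
            receiver.run();              // ordinary interface dispatch
        }
    }

    public static void main(String[] args) {
        Runnable[] ar = { new IfPerf(), new IfPerf(), new Other(), new IfPerf() };
        for (Runnable r : ar) {
            callRun(r);
        }
        System.out.println("fast=" + fastPathHits + " slow=" + slowPathHits);
    }
}
```

The point of the guard is that the common case costs one compare and one predictable branch, while the interface lookup only happens when the guess turns out to be wrong.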
Conclusions: All of this complex optimization (and on-the-fly deoptimization!) makes microbenchmarking fairly useless (as many others have previously demonstrated). As an example, storing just a single non-ifperf Runnable in the array slows HotSpot down by about 10%, while in .NET it has no effect. Interleaving two different Runnable object types in the array, however, slows HotSpot down by about 20%, whereas it slows .NET down by about 300% (and I cannot explain that).
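For anyone who wants to repeat the three array layouts, a variant of the benchmark could look roughly like this. It is a sketch under assumptions: the names (IfPerfVariants, A, B, fill, countOf, time) are invented here, and putting the single odd element in slot 0 is a guess, since the position does not matter for the description above. The mode comes from the command line so that each layout can be timed in a fresh VM, because a call site that has already seen two types would probably not behave like a fresh one.

```java
// Sketch of the three array layouts; all names are hypothetical.
public class IfPerfVariants {
    static class A implements Runnable { public void run() {} }
    static class B implements Runnable { public void run() {} }

    // mode 0: only A; mode 1: a single B in slot 0; mode 2: A and B interleaved
    static Runnable[] fill(int size, int mode) {
        Runnable a = new A();
        Runnable b = new B();
        Runnable[] ar = new Runnable[size];
        for (int i = 0; i < size; i++) {
            boolean useB = (mode == 1 && i == 0) || (mode == 2 && i % 2 == 1);
            ar[i] = useB ? b : a;
        }
        return ar;
    }

    // Counts the elements that are instances of the given class.
    static int countOf(Runnable[] ar, Class<?> type) {
        int n = 0;
        for (Runnable r : ar) {
            if (type.isInstance(r)) n++;
        }
        return n;
    }

    // Same timing loop as the original benchmark.
    static long time(Runnable[] ar, int rounds) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < rounds; i++)
            for (int j = 0; j < ar.length; j++)
                ar[j].run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        int mode = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        System.out.println("mode " + mode + ": " + time(fill(10000, mode), 1000) + " ms");
    }
}
```

Run it once per mode (for example with arguments 0, 1 and 2) and compare the printed times across runs rather than within one process.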
Now, the real question is, does IK.VM.NET need to support assigning object references (that do not implement a particular interface) to references of that interface type? The current design (although the implementation isn't quite correct) is to allow the code above to be verified, but when it runs it throws an IncompatibleClassChangeError when the putstatic is executed.
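In Java terms, the behaviour described for the current design amounts to a checked store. The following is only an illustration of that idea, not IKVM's actual code: CheckedStoreSketch and putR are hypothetical names, and the real check would be emitted by the bytecode compiler rather than written as a helper method.

```java
// Illustration of a run-time checked store to an interface-typed field.
// Hypothetical names; not taken from the IKVM sources.
public class CheckedStoreSketch {
    public static Runnable r;

    // The store passes verification, but the value is checked when it executes.
    static void putR(Object value) {
        if (value != null && !(value instanceof Runnable)) {
            throw new IncompatibleClassChangeError(
                value.getClass().getName() + " does not implement java.lang.Runnable");
        }
        r = (Runnable) value;
    }

    public static void main(String[] args) {
        putR(new Thread());   // Thread implements Runnable: accepted
        try {
            putR("");         // the ldc "" / putstatic pair from the Jasmin example
        } catch (IncompatibleClassChangeError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The trade-off is that code which Sun's VM runs silently would fail here at the store, in exchange for not having to re-check the reference at every interface call.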