I've got this stored procedure, a very simple little creature, that does a simple select into XML, and then returns the result as an XmlType. Works fine...most of the time. But for some unknown reason, when moving between machines, the procedure will suddenly begin through an ORA-03113 error, or "end-of-file on communication channel" error. This is somewhat similar to the more drastic ORA-0600 error, in that it basically means "Something screwed up bad enough that I'm bailing out on you."
The annoying thing, though, is that it would work on other machines. And sometimes a recompile, or a reboot of the machine, would fix the problem. So it's not the SQL itself. Even more proof - if I snip the body of the procedure out and run it directly? Works fine.
In our case, at least, we seem to have nailed the problem down to one of trying to return XmlType. One of our Oracle consultants said simply, "CLOB has been around a lot longer than XmlType, I trust it more." So I modified the procedure to return its clob value, instead of XmlType. Since I needed it to be XML, however, I added an XmlElement.createXML(...) wrapper around the final result. Guess what? So far, no more ORA-03113. Not saying we've completely fixed it, just that it's a step in a positive direction. Our consultant feels that maybe the error occurs when the stored proc, which runs in different memory space than just the straight SQL would, is forced to cast its return value into the XmlType. By basically letting it say "Whatever's in here, I don't care, have a CLOB", we're getting around that problem.
Just documenting in case somebody else is in a similar situation and can say "Aha! Let's try that."
UPDATE: Apparently that was NOT the problem, as my QA guy reports that it just started happening again. It looks like the act of recompiling the procedure was enough to make the problem go into hiding again. I just confirmed that by making a non-trivial change to the code (that did not otherwise change what is returned, or the signature of the method), and presto chango, suddenly no 3113 bug anymore. Not sure what that means, but part of our build process does involve potential updates to the stored procedures, so it is possible that there's some sort of "The error may or may not occur after a build" causality going on.