Py4J 0.8.1 Released


This is a minor bugfix release.

This release includes the following bugfixes:

  • Fixed constructors not accepting proxies (Python classes implementing Java interfaces) as arguments.
  • Restored Java 6 compatibility in the compiled jar file.
  • Fixed unit tests for JDK 8.
  • Added a few extra paths to find_jar_path.

The details of the specific issues for Py4J 0.8.1 are available on GitHub.

Installing Py4J is one pip away: pip install py4j

Again, feel free to contact me or to write a feature request on GitHub!


Py4J 0.8 Released

After two years, Py4J 0.8 has just been released!

Although I merged pull requests and fixed bugs during these two years, my Ph.D., wedding, and new job made it difficult to find the time to make a proper release. All happy events that made my busy life busier 🙂

This release includes the following new features:

  • Major fix to the Java byte[] support. Thanks to @agronholm for spotting this subtle but major issue and thanks to @fdinto from The Atlantic for providing a patch!
  • Ability to fail early if the py4j.java_gateway.JavaGateway cannot connect to the JVM.
  • Added support for long primitives, BigDecimal, enum types, and inner classes on the Java side.
  • Set saner log levels.
  • Many small bug fixes and API enhancements (backward compatible).
  • Wrote a section in the FAQ about security concerns and precautions with Py4J.
  • Added support for Travis CI and cleaned up the test suite to remove hardcoded paths.

The specific issues are discussed on GitHub.

Installing Py4J is one pip away: pip install py4j

Again, feel free to contact me or to write a feature request on GitHub!


Py4J Backlog, Bytes, and Open Source

Since my Ph.D. thesis is being printed right now, I thought I could give a status update on Py4J.

One Py4J contributor/user reported a problem with how Py4J handles byte arrays almost a year ago. Because Py4J was treating byte arrays like any other array (i.e., as a reference), access to individual cells in the array was costly (one roundtrip per access). Byte arrays are special beasts because when you go down to the level of bytes, you usually want the raw power and the hanging rope that come with it: you certainly don’t want the programming language or a particular library to stand in your way. Because Py4J uses a String protocol (e.g., newlines are used as separators), transferring raw bytes would require a lot of modifications and would introduce a special case that would need more code than the usual case.

I thus implemented a naive solution that just shifted each byte by 8 bits, to make sure that I could still use my dear newlines. The same person came back to me a few months later though, and introduced me to the concept of UTF-16 surrogates and how Java did not like these special pairs of characters, even in UTF-8, the default encoding for Py4J.

I boosted the priority of this issue, but because I had started a new job and I was trying to finish my thesis during the weekends (advice: this is the fastest way to end up in an asylum), I had neither the time nor the strength to find a solution. Fortunately, a contributor from The Atlantic made a nice Christmas present to Py4J users: he implemented a fix using Base64 and opened a pull request. I merged the pull request in January, but I’m still fighting with some test glitches caused by the differences between Python 2 and Python 3. The Open Source community has been very kind to me and I have been fortunate to receive significant contributions from Py4J users in the past (Python 3 support anyone?). Because I am working for a company that is sympathetic to open source contributions, I will make sure in the near future that the effort behind the various Py4J patches was not in vain.
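To make the trade-off concrete, here is a minimal Python 3 sketch of the two approaches. It is not the actual Py4J code: the function names are made up for illustration, and shift_encode is only one plausible reading of the "shift by 8 bits" idea (one code point per byte). It shows why a shifted byte can land in the surrogate range, and why Base64 text is safe on a newline-delimited String protocol.

```python
import base64


def shift_encode(data: bytes) -> str:
    # Hypothetical reconstruction of the naive approach: map each byte to
    # the code point (byte << 8) so that no byte can turn into '\n'.
    return "".join(chr(b << 8) for b in data)


def base64_encode(data: bytes) -> str:
    # Base64 output only uses ASCII letters, digits, '+', '/' and '=',
    # so it never contains the newline separator.
    return base64.b64encode(data).decode("ascii")


def base64_decode(text: str) -> bytes:
    return base64.b64decode(text)


# Payload with a raw newline byte (0x0A) and a byte (0xD8) that shifts
# to U+D800, the start of the UTF-16 surrogate range.
payload = bytes([0x0A, 0x00, 0xD8, 0xFF])

try:
    shift_encode(payload).encode("utf-8")
    shift_ok = True
except UnicodeEncodeError:
    # A lone surrogate cannot be encoded as UTF-8.
    shift_ok = False

encoded = base64_encode(payload)
roundtrip = base64_decode(encoded)
```

The price of Base64 is size (four characters for every three bytes), which is usually an acceptable cost for keeping a single text-based protocol.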

There are currently 5 open issues that I need to close before releasing 0.8, but all issues have some work in progress so I am confident that I will go through this backlog soon. After that, I will try to come back to a regular release cycle.

Py4J 0.7 Released!

Py4J 0.7 has just been released!

This release includes the following new features:

  • Major refactoring to support Python 3. Thanks to Alex Grönholm for his patch.
  • The build and setup files have been totally changed. Py4J no longer requires Paver to build and everything is done through ant. The setup file only uses distutils.
  • Added support for Java byte[]: byte arrays are passed by value and converted to bytearray or bytes.
  • Py4J package name changed from Py4J to py4j.
  • Bug fixes in the Python callback server and unicode support.

The specific issues are discussed on GitHub.

Installing Py4J is one pip away: pip install py4j

Although I’m still using Py4J every day, I do not need many more features so future development will be mostly driven by feature requests and bug reports. Feel free to contact me or to write a feature request on GitHub!




I shall blog soon about a small and hidden feature that I introduced in the Py4J Eclipse default server this morning and that made it into 0.7…

Py4J 0.6 Released!

Py4J 0.6 has just been released.

This release includes the following new (and great) features:

  • New exception, Py4JJavaError, that enables Python client programs to access the instance of the Java exception thrown in the Java client code.
  • Improved Py4J setup: warnings are no longer displayed when installing Py4J.
  • Bug fixes and API additions.

In case you did not notice, Py4J moved to GitHub so contributing is now easier than ever!

I plan to do at least another release (0.7) before Py4J leaves beta and moves to 1.0.

About Py4J
Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine. Methods are called as if the Java objects resided in the Python interpreter and Java collections can be accessed through standard Python collection methods. Py4J also enables Java programs to call back Python objects. Py4J is distributed under the BSD license.

Py4J and Exceptions

Py4J 0.6 is almost ready to be released, thanks to Jakub L. Gustak who submitted important bug reports, feature requests, and patches. I have been trying to polish Py4J in the latest releases to make the API more consistent and predictable and the biggest “feature” of 0.6 will no doubt be how Py4J treats Exceptions.

Currently, exceptions can be raised in four places: (1) in the Py4J Python code, (2) in the Py4J Java code, (3) in the Java client code, and (4) in the network stack. An exception might be raised in the Py4J code if the client code is not correct, for example, if the client tries to call from Python a Java method that does not exist. Before 0.6, Py4J raised a Py4JError in cases 1, 2, and 3 and a Py4JNetworkError (a subtype of Py4JError) in case 4. Moreover, if the exception was raised on the Java side, the Java stack trace was copied, as a string, into the Py4JError.

There are two issues with this approach. First, the client does not have access to the exception instance on the Java side, and this exception may have some important fields and methods that can help the error recovery. Second, it is very difficult for the client to determine at runtime the source of the error.

Starting from 0.6, Py4J will raise three types of exceptions: Py4JNetworkError in case #4, Py4JJavaError in case #3, and Py4JError in cases #1 and #2. Py4JNetworkError and Py4JJavaError will be subtypes of Py4JError (so a client can implement a catch-all). Py4JJavaError will also have a method that will return the instance of the Java exception, and Py4JError will still display the Java stack trace for case #2.
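From the client's point of view, the hierarchy could be used roughly as follows. This is only a sketch: the three classes below are local stand-ins that mirror the names and subtype relations described above, not the classes shipped with Py4J, and the get_java_exception accessor and the classify helper are hypothetical names chosen for illustration.

```python
class Py4JError(Exception):
    """Stand-in for the base error (cases #1 and #2)."""


class Py4JNetworkError(Py4JError):
    """Stand-in for errors raised in the network stack (case #4)."""


class Py4JJavaError(Py4JError):
    """Stand-in for errors raised in the Java client code (case #3)."""

    def __init__(self, message, java_exception):
        super().__init__(message)
        self._java_exception = java_exception

    def get_java_exception(self):
        # Hypothetical accessor: returns the Java exception instance.
        return self._java_exception


def classify(call):
    """Run call() and report where the error, if any, came from."""
    try:
        call()
    except Py4JJavaError as error:
        # Subtypes must be caught before the base class.
        return "java client code: %s" % error.get_java_exception()
    except Py4JNetworkError:
        return "network"
    except Py4JError:
        # Catch-all for the remaining Py4J errors.
        return "py4j"
    return "ok"


def failing_java_call():
    raise Py4JJavaError("call failed", "java.lang.IllegalStateException")


def failing_network_call():
    raise Py4JNetworkError("connection refused")


def failing_py4j_call():
    raise Py4JError("method does not exist")
```

Because both specialized errors derive from Py4JError, existing code that only catches Py4JError keeps working, while newer code can order its except clauses from most to least specific.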

Stay tuned for 0.6!