2016

Java 9 Process API

In a previous blog post I wrote about one of my favourite features of Java 9, the JShell. In this post I will write about another feature I am excited about: the new Java 9 Process API. I will also present some code showing how powerful and intuitive it is.

The new API adds greater flexibility to spawning, identifying and managing processes. As an example, before Java 9 one would need to do the following in order to retrieve the PID of the running Java process:
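The original snippet is not reproduced here. As a stand-in, a widely used pre-Java 9 workaround looks roughly like the sketch below. The class name `LegacyPid` is purely illustrative, and the `<pid>@<hostname>` format of the runtime MXBean name is a HotSpot convention rather than a guarantee.

```java
import java.lang.management.ManagementFactory;

class LegacyPid {
    // Pre-Java 9 workaround: on HotSpot JVMs the runtime MXBean name
    // conventionally looks like "<pid>@<hostname>". This is an assumption
    // about the JVM vendor, not something the specification promises.
    static long currentPid() {
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        return Long.parseLong(jvmName.substring(0, jvmName.indexOf('@')));
    }

    public static void main(String[] args) {
        System.out.println("PID: " + currentPid());
    }
}
```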

The above is not intuitive and feels like a hack. One would expect the Java process to at least expose its own PID easily.

Moreover, quite a few times I have needed to spawn child processes from inside a Java process and manage them. Doing so is very cumbersome. A reference to the child process has to be kept throughout the program's execution if the developer wishes to destroy that process later. Not to mention that getting the PIDs of the child processes is also a pain.

Fortunately, Java 9 fixes those issues and provides a clean API for interacting with processes. More specifically, two new interfaces have been added to the JDK:

  1. java.lang.ProcessHandle
  2. java.lang.ProcessHandle.Info

The two new interfaces add quite a few methods. The first one provides methods for retrieving a PID, for listing all the processes running on the system and for navigating the relationships between processes. The second one mainly provides meta information about a process.

As one would expect, most of the methods have native, platform-specific implementations. The OpenJDK implementation of ProcessHandle can be found here. The Unix-specific implementation can be seen here.

I have created a very simple program which makes use of most of the features of this new Process API. The program does the following:

  • Can retrieve the running process' PID
  • Can start a long running process
  • Can start a short running process, which terminates about 5 seconds after starting
  • Can list all child processes that were spawned by the parent one
  • Can kill all child processes that were spawned by the parent one
  • Attaches a callback that runs when a child process exits. This is done using the onExit() method of the ProcessHandle

The sample class is provided below. For the entire example please see here:
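The author's class is not reproduced here. The sketch below only illustrates the same ideas with the Java 9 API. The class and method names are hypothetical, and the `main` method assumes a `sleep` binary is available on the PATH (Unix-like systems).

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class ProcessApiSketch {
    // PID of the current JVM, with no string parsing involved.
    static long ownPid() {
        return ProcessHandle.current().pid();
    }

    // Direct children of the current JVM.
    static List<ProcessHandle> children() {
        return ProcessHandle.current().children().collect(Collectors.toList());
    }

    // Asks every direct child to terminate; returns how many requests were accepted.
    static int killChildren() {
        int accepted = 0;
        for (ProcessHandle child : children()) {
            if (child.destroy()) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Own PID: " + ownPid());

        // Assumption: "sleep" exists on the PATH.
        Process shortLived = new ProcessBuilder("sleep", "5").start();

        // onExit() returns a CompletableFuture that completes when the process ends.
        CompletableFuture<Void> exited = shortLived.toHandle().onExit()
                .thenAccept(h -> System.out.println("Process " + h.pid() + " exited"));

        children().forEach(c -> System.out.println(
                "Child " + c.pid() + " -> " + c.info().command().orElse("unknown")));

        System.out.println("Kill requests accepted: " + killChildren());
        exited.join(); // wait for the exit callback before leaving main
    }
}
```

Keeping the lookups in small static methods means no `Process` reference has to be stored just to find or destroy a child later.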

JShell

As of now, the official release date of Java 9 is 27.07.2017. According to the OpenJDK mailing list, the push back was due to the most anticipated feature of Java 9, the modularisation of the JDK, commonly known as Project Jigsaw.

I am not as excited about this feature as I am about the brand new JShell. Many people criticise the language's verbosity and the amount of code that is sometimes required to get things done. I do not disagree that this is true in many cases. But what I was mainly missing from Java was the ability to quickly evaluate an expression, an algorithm or a piece of code.

For example, many times I find myself needing to quickly try something which involves reading a file or reading something from the web and performing some manipulation on it. Or sometimes even testing out a lambda expression to see its behaviour. Up to this point, actions like that were a bit cumbersome, as they involved creating a class with a main method and executing that program.

JShell is introduced to solve problems like that and more. Also known as Project Kulla, JShell is an extremely useful tool. It is a REPL (Read-Evaluate-Print Loop) tool. Similar ones exist in various other languages like Python, Perl and even Scala.

To use JShell one needs to download JDK 9. Then all she/he has to do is navigate to the JDK's /bin directory and execute the jshell command.

Firstly, JShell itself prompts the user to type /help intro.

JShell comes with auto-completion, so the user can press Tab and see a list of commands matching the first letter she/he typed:

A list of the available commands can be printed by typing /help.

As in any Java program, the classes that the user is going to use have to be imported. JShell comes with a pre-defined set of common classes already imported:

A user can import any JDK class, or even her/his own classes by adding them to the classpath:

As one can notice, it is not mandatory to add semicolons at the end of statements. However, they are mandatory inside the body of a class or a method that the user defines.

Each expression the user writes on the console is evaluated and printed on the standard output. If an expression returns a value, that value is automatically assigned to a variable that the shell creates on the fly. Of course, the user can later make use of that variable as normal:

The user can define methods outside of classes. Additionally, classes can be defined and referenced as normal:

A very nice feature is that the user does not need any try{}catch{} blocks when calling methods which declare checked exceptions:

Finally, the user can see the defined methods, types and variables and reset her/his session:

Concluding, I believe JShell will be a nice-to-have tool. By exploring it, I am pretty sure people will come up with some interesting uses for it.

JDK Evolution

I know for a fact that many people (especially in the financial technology industry) are very skeptical when a new version of Java is released. People actually resist updating their Java version (even the JRE version) for many years. There are a lot of places that are still using Java 6! Even though this resistance is valid in some cases, especially in the early stages of a new release, I personally find it wrong. Indeed, upgrading the Java version a piece of software is using is, in most cases, neither simple nor easy. Lots of testing needs to be done to ensure, at the very least, that the application's performance has not degraded. Additionally, more testing is needed when the application is doing something very tailored, like calling native code in-process.

In my opinion, upgrading to the newest version is advisable for many reasons. The one I would like to mention today is the JDK's evolution, meaning that in most cases a software developer gets some free gains without, in principle, doing anything. The code inside the JDK changes in small ways between releases. This is done to fix bugs, to improve performance or, even better, to follow hardware trends. CPUs often introduce a new instruction which solves a problem down at the silicon level, meaning faster processing. The Java engineers, and in particular the people involved in the OpenJDK project, have lots of mechanical sympathy.

A good example is the commonly used java.util.concurrent.atomic.AtomicInteger class. There is a huge difference in the implementation of a couple of methods in this class between Java 7 and Java 8. The difference is presented below:

Java 7

    public final int getAndSet(int newValue) {
        for (;;) {
            int current = get();
            if (compareAndSet(current, newValue))
                return current;
        }
    }

Java 8

    public final int getAndSet(int newValue) {
        return unsafe.getAndSetInt(this, valueOffset, newValue);
    }

There is a very important difference. Java 8 delegates to a method of the Unsafe class, whereas Java 7 performs a busy loop around compareAndSet. The JVM treats unsafe.getAndSetInt as an intrinsic function, which means the call can be replaced with a dedicated atomic CPU instruction where the hardware offers one. Java's intrinsic functions can be found here.
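To make the two strategies tangible, here is a small sketch that re-creates the Java 7 style retry loop on top of the public AtomicInteger API and calls the built-in getAndSet next to it. The class and method names are illustrative only. Both calls return the previous value; the difference lies in how the swap is carried out underneath.

```java
import java.util.concurrent.atomic.AtomicInteger;

class GetAndSetSketch {
    // Java 7 style: read, then retry compare-and-set until no other
    // thread has changed the value in between.
    static int getAndSetWithCasLoop(AtomicInteger value, int newValue) {
        for (;;) {
            int current = value.get();
            if (value.compareAndSet(current, newValue)) {
                return current;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger a = new AtomicInteger(1);
        AtomicInteger b = new AtomicInteger(1);

        // Hand-written retry loop.
        System.out.println(getAndSetWithCasLoop(a, 2)); // prints 1
        // Built-in method, which newer JDKs can map to an intrinsic.
        System.out.println(b.getAndSet(2));             // prints 1
    }
}
```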

This is a very simple but very important reason why one should consider regularly upgrading her/his Java version. Small things like that, spread across the newer implementations, can have a positive impact on every application.

Log4j2 vs Log4j

Log4j2 is the evolution not only of Log4j but also of Logback, as it takes Logback's features one step forward. The main selling point is the improved performance, both in message throughput and in latency, which is reportedly a huge leap forward compared to Log4j and also Logback.

Other interesting Log4j2 features are:

  • Automatic reloading of logging configurations
  • Property support: Log4j2 loads the system properties, and they can be evaluated even at the configuration level
  • Java 8 lambdas and lazy evaluation: it provides an API for wrapping a log message inside a lambda expression, which only gets evaluated if truly needed
  • Garbage free: an interesting architectural feature, as Log4j2 produces no, or (in the case of web apps) very little, garbage. You can read more about that here.
  • Async loggers using the LMAX Disruptor: the Disruptor is a very interesting technology and it is always thought-provoking to examine cases of it being used under strain
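The lazy evaluation idea from the list above can be sketched without any extra dependency. Note the swap: the example below uses the JDK's own java.util.logging, whose `Logger.fine(Supplier<String>)` overload follows the same principle, and not the Log4j2 API itself. The class name and the counter are illustrative only.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyLoggingSketch {
    // Counts how often the "expensive" message is actually built.
    static final AtomicInteger BUILDS = new AtomicInteger();

    static String expensiveMessage() {
        BUILDS.incrementAndGet();
        return "state = " + System.nanoTime();
    }

    // Logs at FINE through a Supplier and reports how many times the
    // message was built for the given logger level.
    static int buildsWhenLoggingFine(Level loggerLevel) {
        Logger logger = Logger.getLogger("lazy-sketch");
        logger.setUseParentHandlers(false); // keep the console quiet
        logger.setLevel(loggerLevel);
        int before = BUILDS.get();
        logger.fine(LazyLoggingSketch::expensiveMessage); // Supplier<String> overload
        return BUILDS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(buildsWhenLoggingFine(Level.INFO)); // prints 0: FINE is disabled
        System.out.println(buildsWhenLoggingFine(Level.FINE)); // prints 1: message is built
    }
}
```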

I played around with Log4j2 and in general I was very happy with its API, its implementation (it actually separates the API from the implementation, even though that means the developer needs to add 2 Maven dependencies), its configuration simplicity and finally its performance.

Even though measuring a logger's performance with JMH is not advisable, I tried to compare its performance (using async and sync loggers) against the old Log4j. The performance (average time and throughput) was indeed better, at the extremes by about 15 ops/ms (roughly 15K ops/s). Having said that, you should take that with a pinch of salt because, as mentioned earlier, JMH is not the right tool for measuring and comparing the performance of the two logging implementations.

For reference, the simple Java program used to perform the various tests can be found on GitHub.

Some indicative results, from 3 runs for each logger, can be seen below.

Log4j2 Async Logger:

Run  Benchmark                     Mode   Cnt   Score     Error   Units
#1   Log4JBenchmarking.logMessage  thrpt   20  84.875 ±   6.383  ops/ms
#1   Log4JBenchmarking.logMessage  avgt    20   0.015 ±   0.001   ms/op
#2   Log4JBenchmarking.logMessage  thrpt   20  87.430 ±   9.362  ops/ms
#2   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.001   ms/op
#3   Log4JBenchmarking.logMessage  thrpt   20  79.753 ±  13.381  ops/ms
#3   Log4JBenchmarking.logMessage  avgt    20   0.013 ±   0.001   ms/op

-----------------------------------------------------------------

Log4j2 Logger:

Run  Benchmark                     Mode   Cnt   Score     Error   Units
#1   Log4JBenchmarking.logMessage  thrpt   20  75.881 ±  10.960  ops/ms
#1   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.002   ms/op
#2   Log4JBenchmarking.logMessage  thrpt   20  79.698 ±  12.290  ops/ms
#2   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.002   ms/op
#3   Log4JBenchmarking.logMessage  thrpt   20  87.428 ±   6.678  ops/ms
#3   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.001   ms/op

-----------------------------------------------------------------

Log4j Logger:

Run  Benchmark                     Mode   Cnt   Score     Error   Units
#1   Log4JBenchmarking.logMessage  thrpt   20  72.490 ±   8.350  ops/ms
#1   Log4JBenchmarking.logMessage  avgt    20   0.014 ±   0.002   ms/op
#2   Log4JBenchmarking.logMessage  thrpt   20  84.169 ±   9.227  ops/ms
#2   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.001   ms/op
#3   Log4JBenchmarking.logMessage  thrpt   20  72.599 ±  10.801  ops/ms
#3   Log4JBenchmarking.logMessage  avgt    20   0.012 ±   0.001   ms/op

Being a polyglot developer

Throughout my professional career I have mainly used object-oriented, statically typed programming languages. The likes of Java and C# are ideal for large-scale projects. Such languages have a massive active community behind them, developing frameworks and tools, writing articles about them and responding to questions in online forums. Hence it is really hard to advocate against using them. It is rare to stumble upon a problem that someone else hasn't solved already. It was not until recently that I had to switch to a different paradigm of language: a dynamically typed language which can be used as an object-oriented one, but which also has functional abilities.

This blog post is not about advocating which style is better, as I believe there is no silver bullet out there. Every problem is different and can be solved in multiple ways.

Rather, this blog post is about the benefits of being a polyglot developer, or of using different styles of programming languages. At the beginning, I found myself struggling to get used to the new language. I tended to try to make it work using the approach I already knew, which was not the right mentality. After a couple of weeks, as I got more familiar with the language, its features and its 'mentality', I found myself approaching problems from a different perspective. I was combining the various programming styles and nice ideas were coming out of that. Even when I was going back and solving problems in the object-oriented way, I found that I could apply new techniques which made the solution much more elegant and fun to program.

I believe that being open to different languages is a great thing for a software engineer. It opens new perspectives, adds new fundamental knowledge and gives agility in how one approaches a problem. Having said that, I was looking for a new programming language to learn, or at least get a bit familiar with. I heard really positive things about Haskell, a functional, statically typed language. Even though I know I might not use such a language for a project at my workplace, I believe that I will benefit a lot by at least spending a couple of months playing with it.

Concluding, I would urge everyone to try a different programming language than the one she/he is used to. The benefits are enormous and it is really fun too.

Hibernate Tools - JPA Entity generation

Recently I was reviewing and trying some examples using the Hibernate Tools. More specifically, I was trying their latest version (5.0.0.CR1) in order to generate some JPA entity POJOs out of a database schema.

Hibernate Tools can be used either programmatically through their Java API or through their pre-defined ANT tasks. The examples below demonstrate the programmatic way and a Mavenized way, which invokes ANT from within Maven.

I used an in-memory HSQL database with two very simple tables: a Users table with an ID and a name, and an Address table with an ID, some fields and a foreign key to the Users table, mimicking a many-to-one relationship.

The code that starts up the HSQL server and creates the tables can be found on GitHub.

As mentioned above, the Hibernate Tools can be invoked programmatically. Initially I found it a bit tricky, as I hadn't realized I needed to invoke the JDBC configuration step before invoking the POJO generation step. Presumably this is needed in order for the tool to read the Hibernate configuration file and identify the database and its schema. The configuration that is needed is actually rather trivial:

  • Set the destination folder
  • Point the tool to the Hibernate configuration file, in order to pick up the database details
  • Invoke the JDBCConfigurationTask in order to identify the database schema
  • Invoke the Hbm2JavaGenerationTask in order to generate the JPA entities out of the above database schema

Sample code that does the above is shown below:

The Java code that is generated for the two database tables is the following:

The whole process can be made part of a Maven compilation step. This is done using the ANT tasks that are provided. The relevant section of the pom.xml file is shown below. Additionally, using the Maven helper plugin, the generated classes can automatically be added to the project's classpath, bulletproofing the application against future changes to the database schema (and automating the tedious task of re-generating the entities).

The complete example can be found on GitHub.

Java Enum as a class

Recently I was asked a fairly simple question: "Can you extend an enum?". My reaction was "Why would you want to do that?". But, on second thought, I realized that I didn't really know the answer. Of course I knew that in Java enums are treated as classes, but I had no clue what they look like inside the JVM, or whether they are made final or not. I could of course try to extend an enum in IntelliJ and see whether the IDE would give me an error or not.

However, the correct way is to inspect what the class looks like after it gets disassembled from its bytecode. This can be done using the javap utility which comes with the JDK. For example, imagine we have the following enum:

Using the javap utility we can disassemble the .class file, which gives us not the source above but the compiled class.

[bash] javap Weekdays.class [/bash]

The class that the JVM knows is:

Finally we have our answer. Enums are indeed represented as classes inside the JVM and those classes are final, hence we cannot extend them.
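The same conclusion can be cross-checked from inside a running program with reflection. This is only a sketch: the `Weekdays` enum below is a hypothetical stand-in for the one discussed in this post, and the class and method names are illustrative.

```java
import java.lang.reflect.Modifier;

class EnumAsClassSketch {
    // Hypothetical stand-in for the enum discussed in the post.
    enum Weekdays { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY }

    // True when the compiled class carries the final modifier.
    static boolean isFinal(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(isFinal(Weekdays.class));                  // prints true
        System.out.println(Weekdays.class.getSuperclass().getName()); // prints java.lang.Enum
    }
}
```

One caveat worth knowing: an enum whose constants declare their own class bodies is not compiled as final, yet it still cannot be extended with `extends`.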

Deadlock

This article will present a deadlock and some tools to examine and identify it.

A deadlock happens when two or more threads are waiting to acquire the object monitor of one or more objects that are already locked by one of the competing threads. Hence, the threads will wait forever if there are no detection and prevention strategies.

The following little code snippet simulates a deadlock between two competing threads.

In the above situation, the thread named 'Left-1' acquires the monitor of the object named 'left'. Then it sleeps for a couple of seconds and tries to acquire the monitor of the object named 'right', but the 'Right-1' thread has already done so and is in turn waiting for 'left'. The two threads have no back-out logic, hence the program will freeze forever.
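The original snippet is not reproduced here, so the following is only a sketch of the same situation with hypothetical names. Two deliberate differences: it uses a CountDownLatch instead of a sleep to guarantee that each thread holds its first lock before asking for the second, and the threads are daemons so that the JVM can still exit.

```java
import java.util.concurrent.CountDownLatch;

class DeadlockSketch {
    private static final Object LEFT = new Object();
    private static final Object RIGHT = new Object();

    // Starts two daemon threads that take LEFT and RIGHT in opposite order.
    static Thread[] startDeadlock() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread left = new Thread(() -> lockInOrder(LEFT, RIGHT, bothHoldFirstLock), "Left-1");
        Thread right = new Thread(() -> lockInOrder(RIGHT, LEFT, bothHoldFirstLock), "Right-1");
        left.setDaemon(true);
        right.setDaemon(true);
        left.start();
        right.start();
        return new Thread[] {left, right};
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) { // never entered: the other thread owns this monitor
                System.out.println(Thread.currentThread().getName() + " got both locks");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = startDeadlock();
        Thread.sleep(500); // give both threads time to block
        for (Thread t : threads) {
            System.out.println(t.getName() + " is " + t.getState());
        }
    }
}
```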

Detecting a deadlock

Although in the above example the program is trivial and we can immediately understand where and why the deadlock is happening, in a real-world application that might be a bit tricky. The easiest way is to get a thread dump and analyze it.

  • Using an IDE

If you are running the application locally from your IDE, chances are that your IDE already has the ability to do so. I am mainly using IntelliJ. You can find that functionality in the 'Run' window as shown below.

IntelliJ_dump_threads

That will dump to your standard output all the threads with their stacks and the states they are in.

[bash highlight="4,5,14,15"]
"Right-1" #13 prio=5 os_prio=31 tid=0x00007f8444219800 nid=0x5503 waiting for monitor entry [0x000070000134f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nikoskatsanos.deadlock.Deadlocked.lambda$start$1(Deadlocked.java:44)
    - waiting to lock <0x00000007970c0328> (a java.lang.Object)
    - locked <0x00000007970c05c0> (a java.lang.Object)
    at com.nikoskatsanos.deadlock.Deadlocked$$Lambda$2/1241276575.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"Left-1" #12 daemon prio=5 os_prio=31 tid=0x00007f8443944800 nid=0x530f waiting for monitor entry [0x000070000124c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nikoskatsanos.deadlock.Deadlocked.lambda$start$0(Deadlocked.java:33)
    - waiting to lock <0x00000007970c05c0> (a java.lang.Object)
    - locked <0x00000007970c0328> (a java.lang.Object)
    at com.nikoskatsanos.deadlock.Deadlocked$$Lambda$1/1022308509.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[/bash]

The above is part of the thread dump created by IntelliJ, which includes the stacks of our two deadlocked threads. By analyzing the above snippet, we can see that both threads are in the 'BLOCKED' state and both of them are waiting to lock an object. If we look closer, the object that each thread is trying to lock is the one already locked by the other thread. This indication, together with a look at our source code confirming that those locks will never be released, is enough for us to come to our conclusion.

  • Using a tool

Another way to analyze and detect a deadlock is to use a more sophisticated tool. There are plenty out there, some of them commercial and some of them shipped with your JDK. Three of the most popular are JConsole, JVisualVM and Java Mission Control.

Those tools are very easy to use and all of them are quite similar. JConsole is probably the simplest. Using it requires launching JConsole and connecting to the process running the application you want to analyze. Once connected, the user can find a tab named 'Threads'. That screen gives the user everything she/he needs, including the ability to examine the existing threads. The information is actually the same as that produced by IntelliJ above, and we will see the reason further below. But most importantly, the user can notice a 'Detect Deadlock' button at the bottom. That button makes it extremely easy to find out whether a deadlock is present in the application. The result looks like the screen below, which indicates that the two threads on the left hand side are in a deadlock.

jconsole_deadlock_screen

  • Using jstack

Finally, in many cases the application might be running on a server and the only way to interact with it is a shell. In such cases the user needs to use the command line utilities provided by the JDK itself, more specifically jstack. The two approaches above present essentially the same thread dump that jstack prints.

In order to do that the user needs to find the process' PID. That can be done either by using an OS-level command or by just using the jps command, which also comes with the JDK. Once the user has the PID, she/he can invoke the jstack command in order to get output similar to that of the above tools.

[bash] jstack -l ${PID} [/bash]
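As a complement to the tools above, and not something covered in the original example, a deadlock can also be detected from inside the application through the JDK's ThreadMXBean. A minimal sketch with illustrative class and method names:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class DeadlockDetectorSketch {
    // Names of the threads that are currently deadlocked on object monitors
    // or ownable synchronizers; empty when no deadlock exists.
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info != null) { // a thread may have ended in the meantime
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + deadlockedThreadNames());
    }
}
```

A check like this could run periodically on a monitoring thread and log a warning when the returned list is not empty.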

The full source code for the example can be found on GitHub.

A ThreadFactory

Since Java 1.5, writing multithreaded code has become much easier compared to prior versions. Lots of logic was encapsulated behind classes that are baked into the JDK. Additionally, the way developers create their threads radically changed.

Making use of Executors and ExecutorServices took away the boilerplate code that was needed in order to create and manage the lifecycle of threads.

But in order to make monitoring and debugging easier, threads should have descriptive names. Most of the above executors make use of the DefaultThreadFactory, which gives a not so descriptive name (e.g. pool-1-thread-1).

Fortunately, the programmer can pass in her/his own implementation of a ThreadFactory.

A sample implementation, which gives each thread a descriptive name and a counter, can be the following:
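The class from the repository is not reproduced here. A minimal sketch of the idea could look like the following, where the name `NamedThreadFactory` and the `price-feed` prefix are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    // Names threads "<prefix>-1", "<prefix>-2", ... in creation order.
    @Override
    public Thread newThread(Runnable task) {
        return new Thread(task, prefix + "-" + counter.incrementAndGet());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2, new NamedThreadFactory("price-feed"));
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> System.out.println("Running on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
    }
}
```

With such a factory a thread dump shows names like price-feed-1 instead of pool-1-thread-1, which makes it much easier to tell which pool a thread belongs to.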

The class, along with some unit tests can be found on GitHub.

Design Patterns, Builder

The Builder is a design pattern which belongs to the family of creational design patterns. It comes in very handy when an object is complicated to create (e.g. it has too many fields) or when the developer needs to control the initialization of an object.

Using the Builder pattern, the construction of a complicated object is encapsulated and controlled in only one place. That way, the developer can modify the creation logic, or even the object, without having to find every place where the object's constructor is called.

Another use case of the Builder pattern is when the value of an object's field needs to be validated prior to initialization (e.g. email validation).

Take for example the below constructor:

Even though at the moment the constructor does not seem too complicated, imagine that in the future the developer needs to add one or more fields. She/he will need to change every line of code that initializes a Person object.

Moreover, what if the developer needs to validate the fields prior to the object's construction? Those things can be solved using a builder. I like having the builder as a static nested class of the object that it builds.

An example of a PersonBuilder, which performs some validation prior to object creation, is the following:
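The original class is not reproduced here. The sketch below shows one possible shape of such a builder. The fields (name, email, age) and the deliberately simple email pattern are assumptions made for illustration, not the fields of the original Person.

```java
class Person {
    private final String name;
    private final String email;
    private final int age;

    // Only the builder can create a Person, so validation cannot be bypassed.
    private Person(Builder builder) {
        this.name = builder.name;
        this.email = builder.email;
        this.age = builder.age;
    }

    String getName() { return name; }
    String getEmail() { return email; }
    int getAge() { return age; }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private String name;
        private String email;
        private int age;

        Builder name(String name) { this.name = name; return this; }
        Builder email(String email) { this.email = email; return this; }
        Builder age(int age) { this.age = age; return this; }

        // All validation lives in one place, right before construction.
        Person build() {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("name is required");
            }
            // Intentionally naive check: something@something.something
            if (email == null || !email.matches("[^@\\s]+@[^@\\s]+\\.[^@\\s]+")) {
                throw new IllegalArgumentException("invalid email: " + email);
            }
            if (age < 0) {
                throw new IllegalArgumentException("age must not be negative");
            }
            return new Person(this);
        }
    }

    public static void main(String[] args) {
        Person jane = Person.builder().name("Jane").email("jane@example.com").age(30).build();
        System.out.println(jane.getName() + " <" + jane.getEmail() + ">");
    }
}
```

Adding a field later only touches the builder and the private constructor; existing call sites that do not set the new field keep compiling.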

The full example, along with some unit tests, can be found on GitHub.