Excelsior Forums
KarelKyovsky

write calls inside JNI interrupted by Linux JET?

Hi,

When I call

int len_written = write(fd, bytes, count);  /* char *bytes, int count */

in a JNI library, where fd is, for example, a serial port (/dev/ttyS0) file descriptor, len_written is not always equal to count.

This behavior doesn't appear when I'm using the Sun JVM.

Perhaps Linux JET is firing some signals that interrupt the write call?

Can you please check the root cause of this difference in behavior?

Thank you,

Karel Kyovsky

Hi,

Could you please give us more information?

1) Which GNU/Linux distribution do you use? Is it 64-bit?

2) Which version of JET do you use? Have you installed any updates?

3) Which version of Sun JVM do you use?

4) Do you have any sample application to reproduce the issue?

Best regards,

Svyatoslav Scherbina.

1)

Linux Ubuntu 8.04 32bit PAE Kernel 2.6.24-32-generic #1 SMP Tue Sep 4 18:28:08 CEST 2012 i686 GNU/Linux

2)

JET 7.60mp5:

java version "1.6.0_37"

Java 2 Runtime Environment, Standard Edition

Excelsior JET 7.60

JET Profile [1.6.0_37 (binary compatibility level 30)]

Runtime: Desktop [sMP: yes, optimizations: enabled]

3)

java version "1.6.0_07"

Java SE Runtime Environment (build 1.6.0_07-b06)

Java HotSpot Server VM (build 10.0-b23, mixed mode)

4)

Currently I don't have an application that is independent of the hardware, so I can't send you one.

I noticed this behavior when using the RXTX-2.2pre2 JNI library while sending large bitmaps to the printer's serial port: write(new byte[70000]) on the serial port OutputStream.

I fixed the RXTX library sources to accommodate the described JET behavior (not all writes are performed fully), but it took me a lot of debugging time, and I would like to know why there is this difference between JET and the Sun JVM.

Just to give more detailed information.

This is the original SerialImpl.c code in RXTX-2.2pre2 that has the issue when running with JET:




       do {
               result=write (fd, (void * ) ((char *) body + total + offset), count - total);
               if(result >0){
                       total += result;
               }
       }  while ( ( total < count ) && (result < 0 && errno==EINTR ) );

And this is my fix that contains JET workaround:

       do {
               result=write (fd, (void * ) ((char *) body + total + offset), count - total);
               if(result >0){
                       total += result;
               }
       }  while ( ( result > 0 && total < count ) || (result < 0 && errno==EINTR ) );
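
For anyone hitting the same problem in other native code, the same idea can be packaged as a small helper. This is only a sketch under some assumptions: write_fully is a hypothetical name (it is not part of RXTX or JET), the descriptor is assumed to be in blocking mode, and a write() that reports 0 bytes is treated as "stop and return the short count" so the loop cannot spin forever.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_fully() is a hypothetical helper, not an RXTX or JET function.
 * It repeats write() until all `count` bytes are written.
 * Returns the number of bytes written (normally == count), a short count
 * if write() reports 0 bytes, or -1 with errno set on a non-EINTR error.
 */
static ssize_t write_fully(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = write(fd, p + total, count - total);
        if (n > 0) {
            total += (size_t)n;   /* partial write: advance and go on */
        } else if (n < 0 && errno == EINTR) {
            continue;             /* interrupted before any byte was written */
        } else if (n < 0) {
            return -1;            /* real error, errno describes it */
        } else {
            break;                /* n == 0: give up instead of spinning */
        }
    }
    return (ssize_t)total;
}
```

A caller would compare the return value with count and treat anything else as a failure, for example: if (write_fully(fd, body, count) != (ssize_t)count) { /* report error */ }.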

Thanks for your investigation!

We have already reproduced the issue.

In the Linux version of the Excelsior JET VM we do use signals for some internal purposes, and these signals may interrupt write calls, producing the issue described in your post.

A more detailed explanation will follow soon, please stay tuned.

Thanks for your post, waiting for more information.

The implementation of threading in the Excelsior JET VM differs from that of the Oracle JVM.

Although we use signal handling in our implementation for Linux, it still complies with the Java specification, which is also confirmed by the fact that Excelsior JET passes the JCK.

On the other hand, a write call that writes only part of the data is expected behavior and is described in the documentation for write.

Sometimes an application or native library works on the Oracle JVM but does not work on Excelsior JET (and on some other VMs too) because it relies on implementation features that the Java specification does not enforce.

You may find other examples of such problems in the pinned topics of "Defect Reports" on the forum.

This is it? This is the answer?

I need more information on this issue than this statement.

Please tell me which signal causes this behavior. SIG35?

Which feature is this signal linked to? GC or something else?

What other calls apart from write can be interrupted? Can you refer me to some documentation?

How do you recommend we fix or work around third-party JNI libraries for which we will never have the source code?

Thank you

What other calls apart from write can be interrupted? Can you refer me to some documentation?

Please start with the section "Interruption of system calls and library functions by signal handlers" at http://man7.org/linux/man-pages/man7/signal.7.html

Note that we use the SA_RESTART flag.
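
As one illustration from that man page section: the sleep interfaces (nanosleep, clock_nanosleep, usleep) are listed there as never being restarted after a signal handler runs, regardless of SA_RESTART, so they fail with EINTR and the caller has to resume them. The sketch below shows the usual retry pattern; sleep_fully is a hypothetical helper name used only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * sleep_fully() is a hypothetical helper. When a signal handler interrupts
 * nanosleep(), the call fails with EINTR and stores the unslept time in its
 * second argument. Passing the same struct as request and remainder lets
 * the loop resume with whatever time is left.
 * Returns 0 once the full time has elapsed, -1 on a non-EINTR error.
 */
static int sleep_fully(struct timespec req)
{
    while (nanosleep(&req, &req) == -1) {
        if (errno != EINTR)
            return -1;   /* e.g. EINVAL for a malformed timespec */
    }
    return 0;
}
```

The same shape (check the result, retry on EINTR, continue from where the call stopped) applies to other calls named in that section.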

How do you recommend we fix or work around third-party JNI libraries for which we will never have the source code?

The assumption that a system call from a native library will never be interrupted is wrong.

It is not enforced by the Java specification, so using a system call without checking whether it was interrupted is a bug.

Even with the source code of Excelsior JET, you could not reliably work around this bug in the library without the library's own source code.

So the only recommendation is to contact the authors of the library and ask them to fix it, because it is much easier to fix a library than to re-implement threading in the VM.

Can you recommend a way to stress test my application's resistance to the SA_RESTART flag?

Can I test it somehow by sending a signal to the JVM process using the kill utility?

Can you recommend a way to stress test my application's resistance to the SA_RESTART flag?

What do you mean by "resistance to SA_RESTART flag"?

What do you mean by "resistance to SA_RESTART flag"?

By "resistance to SA_RESTART flag" I mean that the application will not misbehave (it will write the correct number of bytes) when it receives a signal with the SA_RESTART flag.

I would like to know how I can automatically send a large number of signals to my application to check that it handles write interruption correctly.

By "resistance to SA_RESTART flag" I mean that the application will not misbehave (it will write the correct number of bytes) when it receives a signal with the SA_RESTART flag.

The SA_RESTART flag itself does not make the application misbehave; it actually helps avoid problems caused by incorrect handling of EINTR.

The application misbehaves because it handles interruption incorrectly.

Disabling the SA_RESTART flag would not make the write issue disappear, and some new issues might appear.

I would like to know how I can automatically send a large number of signals to my application to check that it handles write interruption correctly.

Please try to stress test the thread executing this native code with Thread.suspend()/resume().
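
The point about SA_RESTART can also be seen in plain C on Linux, outside any JVM. The following is a sketch with several assumptions: interrupted_write_demo, on_alarm and DEMO_BYTES are hypothetical names, SIGALRM from a periodic timer stands in for whatever signal the VM uses internally, and a pipe that nobody reads stands in for the slow serial port. The single write() blocks once the pipe buffer is full, the handler (installed with SA_RESTART) runs, and write() then returns the number of bytes already transferred rather than restarting.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BYTES (1 << 20)   /* 1 MiB, larger than a default Linux pipe buffer */

static void on_alarm(int signo) { (void)signo; }   /* handler does nothing */

/*
 * Hypothetical demo helper: returns the result of ONE blocking write() of
 * DEMO_BYTES into a pipe that is never read, while a periodic SIGALRM
 * handler installed with SA_RESTART keeps firing. Returns -1 on setup failure.
 */
static ssize_t interrupted_write_demo(void)
{
    static char buf[DEMO_BYTES];
    struct sigaction sa, old_sa;
    struct itimerval tick, off;
    int fds[2];
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;          /* same flag the JET team mentions */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    memset(&tick, 0, sizeof tick);
    tick.it_value.tv_usec = 50000;     /* first signal after 50 ms */
    tick.it_interval.tv_usec = 50000;  /* then every 50 ms */
    memset(&off, 0, sizeof off);

    setitimer(ITIMER_REAL, &tick, NULL);
    n = write(fds[1], buf, sizeof buf);   /* blocks when the pipe is full */
    setitimer(ITIMER_REAL, &off, NULL);   /* stop the timer */

    sigaction(SIGALRM, &old_sa, NULL);    /* restore the previous handler */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

On a typical Linux system the returned value is positive but smaller than DEMO_BYTES, which is exactly the situation the original RXTX loop did not handle. A periodic timer like this is one possible way (my assumption, not something suggested in this thread) to exercise retry logic in a native test harness.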
