9 Java low latency interview questions & answers with lots of diagrams

JDD 2019: No GC coding techniques for low latency Java, Ivan Zvieriev

You will be quizzed on the low latency application you worked on most recently, especially the outcomes in terms of latencies, response times, and throughput, along with the challenges you faced.

Kafka achieves low latencies through its design principles:

  • Sequential I/O over an append-only, totally ordered data structure.
  • "Don't copy": the same binary data format is used across producers, consumers, and brokers without modification, and the zero-copy principle reduces CPU cycles and memory bandwidth.
  • Batching of data to reduce network calls.
  • Compression of whole batches (not individual messages) using LZ4, Snappy, or GZIP codecs.
  • Horizontal scaling by adding more nodes.
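The batching and compression trade-off is exposed directly through producer configuration. The sketch below builds the relevant standard Kafka producer settings; the broker address is a placeholder, and the actual values shown are illustrative, not recommendations.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    public static Properties batchingAndCompression() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        // Wait up to 5 ms to fill a batch, trading a little latency
        // for far fewer network round trips.
        props.put("linger.ms", "5");
        // Cap each per-partition batch at 32 KB.
        props.put("batch.size", "32768");
        // Compress whole batches, not individual messages.
        props.put("compression.type", "lz4");
        return props;
    }

    public static void main(String[] args) {
        Properties p = batchingAndCompression();
        System.out.println("compression = " + p.getProperty("compression.type"));
    }
}
```

These properties would then be passed to a `KafkaProducer` at construction time.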

If you are building a “trade order matching” engine in Java, the “network latency” is the time taken, in say microseconds, for an order matching request to travel from a client app to the engine, plus the time taken for the client app to receive the first byte of the response message from the engine. The “processing latency” is the time elapsed, in microseconds or milliseconds, for the engine to match the order and build the response to be sent back to the client app.
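A minimal way to capture processing latency inside the engine is to timestamp around the matching call with System.nanoTime(); the matchOrder method here is a hypothetical stand-in for real matching logic.

```java
public class LatencySketch {
    // Hypothetical stand-in for real order-matching logic.
    static String matchOrder(String request) {
        return "FILLED:" + request;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();   // monotonic clock, not wall time
        String response = matchOrder("BUY 100 XYZ @ 10.50");
        long processingMicros = (System.nanoTime() - start) / 1_000;
        System.out.println(response + " in " + processingMicros + " us");
    }
}
```

Network latency, by contrast, has to be measured from the client side (request sent to first response byte received), since the engine alone cannot observe the time spent on the wire.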

Executors are worker nodes’ processes in charge of running individual tasks in a given Spark job. Spark Executors are launched at the beginning of a Spark application and typically run for the entire lifetime of an application. Once they have finished running the tasks they send the results to the “Driver Application”. “Spark Executors” also provide in-memory storage for RDDs that are cached.

Latency should be as low as possible, while throughput should be as high as possible. It is difficult to achieve both at the same time, so strive for a balance that meets the SLAs (i.e. Service Level Agreements). It is worth noting that you can sometimes sacrifice latency to gain throughput by batching things together, and conversely improve latency by sending data immediately after it is created, at the cost of throughput.

The most common issues relate to network latency, disk latency, and CPU latency. Network latency can be improved with a faster network connection or by optimizing the network code; disk latency with faster storage or better disk-access patterns; and CPU latency with faster hardware or by optimizing the code itself.

Low Latency is a term used in the financial industry to describe the time it takes for a trade to be executed. Low Latency trading systems are designed to minimize this time, as even a few milliseconds can make a difference in the outcome of a trade. For this reason, Low Latency trading is a highly sought-after skill in the financial industry. In this article, we will review some common Low Latency interview questions that you may encounter during your job search.

There are a few different ways that you could go about monitoring latency in a production environment. One way would be to use a tool like New Relic to track the response times of your application. Another way would be to set up a system where you log the response times of each request made to your application. This would give you a more detailed view of where the bottlenecks are in your system.

Yes, it is possible to predict latency at various levels of an application stack. One way to do this is to use a tool like New Relic to monitor application performance. New Relic can provide insights into where latency is occurring and help to identify potential bottlenecks. Another way to predict latency is to use a tool like JMeter to load test the application. This will help to identify areas of the application that are not able to handle high traffic levels and may need to be optimized.

Latency is the time a request spends in transit and waiting to be processed, while response time is the total time from when a user makes a request to when they receive the response. In other words, response time as experienced by the user includes latency plus the actual processing time.

Separation of business and infrastructure code.

Your business code is the essential complexity of your application: the things you have to do to meet your requirements. How they are actually done should be a separate, replaceable concern.

Your infrastructure code is your enabler code which should be able to be easily replaced without changing what your application does.

If your application doesn’t do what it should, you should be able to change your business logic to fix it. If the application doesn’t work the way it should, you should be able to change your infrastructure with some confidence that your application will still do what it is required to do.
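One common way to express this separation in Java is to have the business code depend only on an interface, with infrastructure supplying the implementation. The names below (OrderStore, MatchingService) are illustrative, not taken from the original text.

```java
import java.util.ArrayList;
import java.util.List;

// Infrastructure concern, hidden behind an interface.
interface OrderStore {
    void save(String order);
    List<String> all();
}

// In-memory implementation; could be swapped for a database or a
// journal file without touching the business logic below.
class InMemoryOrderStore implements OrderStore {
    private final List<String> orders = new ArrayList<>();
    public void save(String order) { orders.add(order); }
    public List<String> all() { return orders; }
}

// Business logic: knows nothing about how orders are persisted.
class MatchingService {
    private final OrderStore store;
    MatchingService(OrderStore store) { this.store = store; }

    String accept(String order) {
        store.save(order);
        return "ACK:" + order;
    }

    public static void main(String[] args) {
        MatchingService svc = new MatchingService(new InMemoryOrderStore());
        System.out.println(svc.accept("BUY 100 XYZ"));
    }
}
```

If persistence misbehaves, you replace the OrderStore implementation; if the business rules are wrong, you change MatchingService. Neither change forces the other.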


Although immutability is good, it is not necessarily going to improve latency. Ensuring low-latency is likely to be platform dependent.

Other than general performance, GC tuning is very important. Reducing memory usage will help the GC. In particular, reduce the number of middle-aged objects that need to be moved about: keep each object either long-lived or short-lived. Also avoid anything that touches the perm gen.

Avoid boxing/unboxing; use primitive variables where possible.
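A quick sketch of why this matters: accumulating into a boxed Long allocates a new object on every iteration of the loop, while the primitive long version allocates nothing.

```java
public class BoxingSketch {
    // Boxed accumulator: each "sum += i" unboxes, adds, and boxes a
    // brand-new Long object -- allocation on every iteration.
    static long boxedSum(int n) {
        Long sum = 0L;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    // Primitive accumulator: no allocation, no unboxing.
    static long primitiveSum(int n) {
        long sum = 0L;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(primitiveSum(1_000)); // 499500
    }
}
```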

Avoid context switching wherever possible on the message-processing path. Consequence: use NIO and a single event-loop thread (the reactor pattern).
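As a rough illustration, a single-threaded reactor built on java.nio looks like the skeleton below. This is a sketch, not a production loop: it binds to an ephemeral port, performs a single non-blocking pass over the selector, and omits the actual read/write handling.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class ReactorSketch {
    // Runs one pass of a single-threaded event loop; returns the bound port.
    static int demoOnePass() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(0)); // any free port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // One thread, one loop: no context switches on the message path.
            // A real reactor would loop until shutdown; one pass shown here.
            if (selector.selectNow() > 0) {
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isAcceptable()) {
                        // accept, then register the client channel for OP_READ...
                    }
                }
                selector.selectedKeys().clear();
            }
            return server.socket().getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bound to port " + demoOnePass());
    }
}
```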

Buy, read, and understand Effective Java. Also available online

Avoid extensive locking and multi-threading so as not to defeat the optimizations in modern processors (and their caches). A single thread can then be pushed to remarkable limits (on the order of 6 million transactions per second) with very low latency.

If you want to see a real world low-latency Java application with enough detail about its architecture, have a look at LMAX.

Measure, measure and measure. Run benchmarks regularly, with data as close to real as possible on hardware as close to production as possible. Low latency applications are often better considered as appliances, so you need to consider the whole box as deployed, not just the particular method/class/package/application/JVM etc. If you do not build realistic benchmarks on production-like settings, you will have surprises in production.

Do not schedule more threads in your application than you have cores on the underlying hardware. Keep in mind that the OS will require thread execution, and other services potentially share the same hardware, so your application may be required to use fewer than the maximum number of cores available.

  • Consider using non-blocking approaches rather than synchronisation.
  • Consider using volatile or atomic variables over blocking data structures and locks.
  • Consider using object pools.
  • Use arrays instead of lists, as they are more cache-friendly.
  • For small tasks, sending data to other cores can take more time than processing on a single core, because of locking and memory and cache access latency. Hence, consider processing a task on a single thread.
  • Decrease the frequency of accessing main memory and try to work with data already in the caches.
  • Consider choosing the server-side C2 JIT compiler, which focuses on performance optimizations, as opposed to C1, which focuses on quick startup time.
  • Make sure you don't have false sharing of object fields, where two fields used by different threads sit on a single cache line.
  • Read https://mechanical-sympathy.blogspot.com/
  • Consider using UDP over TCP.
  • Use StringBuilder instead of String when generating large strings, for example queries.
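Illustrating the volatile/atomic bullet above: an AtomicLong counter stays lock-free on the hot path, where a synchronized block would risk contention and context switches. A minimal sketch:

```java
import java.util.concurrent.atomic.AtomicLong;

public class CounterSketch {
    // Lock-free: backed by a hardware compare-and-swap / fetch-add,
    // no monitor is ever acquired.
    private final AtomicLong processed = new AtomicLong();

    long record() {
        return processed.incrementAndGet();
    }

    public static void main(String[] args) {
        CounterSketch c = new CounterSketch();
        for (int i = 0; i < 5; i++) c.record();
        System.out.println(c.processed.get()); // 5
    }
}
```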

    Another important idea is to get it working first, then measure the performance, then isolate any bottlenecks, then optimize them, then measure again to verify improvement.

    As Knuth said, “premature optimization is the root of all evil”.

    I think “Use mutable objects only where appropriate” is better advice than “Make your objects immutable”. Many very low latency applications have pools of objects they reuse to minimize GC. Immutable objects can't be reused in that way. For example, if you have a Location class:
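The class itself is not reproduced in the original answer; a minimal mutable sketch, with assumed field names, might look like this:

```java
// Hypothetical sketch: a mutable Location that can live in a pool and
// be reset per message, so it never becomes garbage.
public class Location {
    private double latitude;
    private double longitude;

    // Reused in place instead of allocating a new immutable instance.
    public void set(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    public double latitude()  { return latitude; }
    public double longitude() { return longitude; }
}
```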

    You can create some on bootup and use them over and over again so they never cause allocations and the subsequent GC.

    This approach is much trickier than using an immutable location object though, so it should only be used where needed.

    In addition to the developer-level advice already given here, it can also be very beneficial to consider low-pause JIT runtimes, e.g. Azul Zing, and off-heap memory solutions like Terracotta BigMemory or Apache Ignite, to reduce stop-the-world GC pauses. If a GUI is involved, using binary protocols like Hessian or ZeroC ICE instead of web services is very effective.


    FAQ

    What is Java low latency programming?

    Java low latency programming means writing and tuning Java code so that operations complete with minimal, predictable delay. At runtime, the JIT compiler turns hot code paths into optimized native code, so after warm-up those paths run at near-native speed; the developer's job is to keep allocation and GC pauses down so the program runs without significant delays.

    How do you build low latency applications?

    11 Best Practices for Low Latency Systems
    1. Choose the right language. Scripting languages need not apply. …
    2. Keep it all in memory. …
    3. Keep data and processing colocated. …
    4. Keep the system underutilized. …
    5. Keep context switches to a minimum. …
    6. Keep your reads sequential. …
    7. Batch your writes. …
    8. Respect your cache.

    What does latency mean in Java?

    Latency is simply defined as the time taken for one operation to happen. Although “operation” is a rather broad term, what I am referring to here is any behavior of a software system that is worth measuring, where a single run of that type of operation is observed at some point in time.
