Virtual IP and timeouts

Let us say you are writing a client application that connects to a server and maintains a persistent connection. Whenever the user does something, e.g. types some input, the client sends that input to the server and prints back the output the server produces. (Sounds like telnet?)

One thing your client program doesn't know is that the server it connects to uses a virtual IP! Consider the case where your client has initiated a connection and the connection is in the ESTABLISHED state. Right in the middle of the connection, the server drops the virtual IP address, and for some strange reason it never gets reassigned to another host. In that case, neither the client program nor the server program is notified that the IP address has been removed. The server application would still be listening on the removed IP address and would still be holding the ESTABLISHED connections made through it. Likewise, the client would still be holding an ESTABLISHED connection to the server.

Look at the way Linux handles writes on sockets: when the client program writes to the socket, the bytes are copied into the kernel's send buffer and the write call returns immediately, as long as the buffer has room. The kernel then makes a best-effort attempt to deliver the bytes you have written. Remember that once the virtual IP is removed, those packets will never reach the server. The worst part is that the successful write gives the client no hint of this.
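To make the point about writes concrete, here is a minimal Python sketch. It does not simulate a removed virtual IP; it only shows, on loopback, that a send succeeds as soon as the kernel has accepted the bytes, even though the peer application never reads them. The function name `send_to_silent_peer` is made up for this illustration.

```python
import socket


def send_to_silent_peer(payload: bytes) -> int:
    """Send payload to a peer that accepts the connection but never reads."""
    # Listener on an ephemeral loopback port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()  # the peer never calls recv()
    try:
        # send() returns once the bytes are copied into the kernel's
        # send buffer; it says nothing about the peer having read them.
        return client.send(payload)
    finally:
        client.close()
        peer.close()
        listener.close()


if __name__ == "__main__":
    print(send_to_silent_peer(b"hello"))
```

A successful return value here only means "the kernel took the bytes", which is exactly why the client in the story stays unaware.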

Now look at the server side. Some servers, e.g. Apache, have an inactivity timeout after which they close the connection, so the server won't keep these open sockets forever.
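A server-side inactivity timeout can be sketched roughly like this in Python. This is not how Apache implements it; it is just an illustration using `socket.settimeout`, and the handler name `serve_until_idle` and its return strings are invented for the example.

```python
import socket


def serve_until_idle(conn: socket.socket, idle_seconds: float) -> str:
    """Echo data back until the peer closes or stays silent too long."""
    # Every blocking call on conn now gives up after idle_seconds.
    conn.settimeout(idle_seconds)
    try:
        while True:
            data = conn.recv(4096)
            if not data:
                return "peer_closed"   # orderly shutdown by the client
            conn.sendall(data)         # a stalled send also times out
    except socket.timeout:
        return "idle_timeout"          # nothing happened for idle_seconds
    finally:
        conn.close()                   # never keep the socket forever
```

With this in place, a connection whose client has silently vanished is closed after `idle_seconds` instead of lingering in the ESTABLISHED state.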

Hmm ... Unless the client opens another connection to the server, which would time out, there is no way for the client to know that it's unable to reach the server.

So the moral of the story is: if you suspect that the server might be using a virtual IP that could be dropped and never reassigned to another server, build both the client and the server programs with proper timeouts for reads, writes, and inactivity. Yes, keeping a timeout for inactivity is very important.
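On the client side, one possible shape for this is a request helper that refuses to wait forever for a reply. The sketch below is in Python and assumes a simple request/reply exchange over an already connected socket; `request_with_timeout` is a hypothetical name, not part of any library.

```python
import socket
from typing import Optional


def request_with_timeout(
    sock: socket.socket, payload: bytes, timeout_seconds: float
) -> Optional[bytes]:
    """Send payload and wait for a reply; return None if we time out.

    An empty bytes result means the peer closed the connection.
    """
    # The timeout applies to both the write and the read below.
    sock.settimeout(timeout_seconds)
    try:
        sock.sendall(payload)
        return sock.recv(4096)
    except socket.timeout:
        # No reply in time: treat the server as unreachable and let the
        # caller close the socket and reconnect.
        return None
```

When the helper returns `None`, the client should close the socket and reconnect rather than keep trusting the ESTABLISHED connection.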

