Friday, December 14, 2007

Installing modules in Perl

The easiest way to install Perl modules from CPAN is to use the CPAN module. For example, the following command installs the XML::Simple module.
perl -MCPAN -e 'install XML::Simple;'
If you wish to view the readme file for a particular module, try this command (I am viewing the readme file for XML::Simple module):
perl -MCPAN -e 'readme XML::Simple;'
Another useful way to run the CPAN module is interactively, with the following command:
perl -MCPAN -e shell
Once you get the shell prompt "cpan>", try "help" to get to know more useful options. To know more, refer here.

Monday, December 03, 2007

Essential Firefox plugins

Here is a list of three Firefox plugins that make my life a lot easier, safer and faster.
I wrote an article earlier about Adblock; it can improve your browsing experience by getting rid of unnecessary ads.

If you work on multiple machines (your laptop, your desktop, etc.), Foxmarks will help you sync all your bookmarks automatically.

NoScript blocks those junk scripts that slow down your system and at times prove to be insecure. Be prepared for surprises! Sometimes the web pages you visit might not work the way you expect them to ("Why doesn't the link I click take me anywhere?"). The JavaScript that is supposed to kick in when you click the link might be blocked by NoScript.

Friday, November 30, 2007

Difference between StringBuffer and StringBuilder

There is a subtle yet important difference between StringBuffer and StringBuilder. StringBuffer's methods are synchronized whenever they access or modify its contents, whereas StringBuilder's are not. A StringBuilder object is therefore not safe to share across multiple threads.

It is more efficient to use StringBuilder when the object is not shared between multiple threads.

StringBuilder was introduced in JDK 1.5.

Thursday, November 29, 2007

Starting X sessions in Cygwin

Question: "I want to have my Linux desktop in my Windows machine. I have installed Cygwin. How to do this?"

Follow these steps:
  • Start your Cygwin command shell.
  • Give "xinit -- -clipboard" in the command line. You will see a bare X window appear with a command prompt in it. You will also see something like "Cygwin/X - 0:0" at the top left of the window. This tells you the display on which the X server is listening for incoming connections.
  • Give "xhost +" in the command prompt. This lets the server accept all incoming connections. Remember: if you are concerned about security, refer to the man page of xhost on how to give a list of hosts instead of the wildcard "+".
  • Start an ssh connection to your Linux box.
  • Once logged in, set the display variable. As per this example it would be "export DISPLAY=x.x.x.x:0.0" where x.x.x.x is the IP address of your Windows box.
  • Start your Gnome session by giving "gnome-session". Voila! You will see your desktop in your X window.
A couple of troubleshooting tips. If a firewall is blocking connections from your Linux box to your Windows box, you will encounter an error. Use telnet from the Linux box to make sure that the connection works: try "telnet x.x.x.x 6000". It might also be that you skipped the "xhost +" step above, in which case your X server will reject the incoming connections. Make sure you have done this right.
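
If telnet is not installed on the Linux box, the same reachability check can be sketched in a few lines of Python. This is only an illustration: the function name can_connect is made up here, and port 6000 is the port used in the telnet example above for display :0.

```python
import socket


def can_connect(host, port=6000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect("x.x.x.x") with the IP address of your Windows box should return True once the X server is up and no firewall is in the way. Note that it only tests the TCP connection; a missing "xhost +" would still make the X server reject clients later.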

A note on the "-clipboard" option above: if you don't give this option, the X server doesn't share the Windows clipboard with the applications running in it, so you will not be able to copy/paste between your Windows and Linux applications.

Update on 1/28/2008: I found that giving the "-keyhook" option lets you switch between the frames inside Cygwin/X using Alt-TAB. You should give it like:
xinit -- -clipboard -keyhook

Glossary from Altova

I was searching for one good place where I can get a decent list of acronyms that I come across often, without a need to search for them every time in the net. The Altova glossary page looks like a good place (despite their nepotism toward their brainchild XML Spy!).

Monday, November 19, 2007

Finding DLL dependencies in Windows

For quite some time now I have been searching for a tool that would help me find the dependent DLLs for a given EXE/DLL (like ldd in Solaris/Linux). Today I came across an amazing utility called Dependency Walker. It's really neat and cool at finding broken dependencies.

Friday, November 09, 2007

Nuke those Google ads that come in your way

Google ads are good, at times. But I don't remember the last time I clicked on one and visited the target page. If I need something, I mostly use Google search, not the ads. Most of the time I feel that those ads get in my way when I am reading something really serious. It's a waste of bandwidth and time to load these ads when I am in no mood to click on them.

Here is a cool solution for Firefox. All you have to do is download the Adblock plug-in and install it. Then restart Firefox. Go to Tools -> Adblock -> Preferences. Enter the Google syndication URL (http://pagead2.googlesyndication.com/pagead/show_ads.js). Save and you are done. Enjoy surfing free of Google ads :-)

Thursday, November 01, 2007

Puzzle of repeating numbers

My friend Siva asked me this puzzle.
There is a set of N numbers. All numbers in this set repeat even number of times, except two numbers that repeat odd number of times. Develop an algorithm that will find these two numbers in O(N) space & time complexity.

I couldn't find the solution, but the solution Siva gave and another variant of that solution that his friend gave were simply amazing. Not to spoil the fun, I am not giving any of those solutions here :-)

Thursday, August 02, 2007

dsh - Distributed Shell

In one of the Linux Magazine articles, I came across this shell called dsh (Distributed Shell). The name is a misnomer, as it is not actually a shell. It is a wrapper that invokes shell commands using ssh (or any other configured remote shell) on different machines. Nevertheless, I think this tool would be of tremendous help when the same command has to be run on multiple machines, which is what most admins do often. An example given in the Linux Magazine article is to run the last command on multiple machines to see the login/logout activity.

Sunday, June 24, 2007

Comparison of PTHREAD_PROCESS_SHARED in Solaris and FreeBSD

Let us begin with the sample code below (headers omitted for brevity):
int main(int argc, const char* argv[]) {
void* mmap_ptr = mmap (NULL, sizeof (pthread_mutex_t),
PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANON, -1, 0);
if (mmap_ptr == MAP_FAILED) {
perror ("mmap failed");
return -1;
}

fprintf (stderr, "mmaped at: %p\n", mmap_ptr); // %p is the correct conversion for a pointer

pthread_mutex_t* mutp = (pthread_mutex_t*)mmap_ptr;

// initialize the attribute
pthread_mutexattr_t attr;
pthread_mutexattr_init (&attr);
pthread_mutexattr_setpshared (&attr, PTHREAD_PROCESS_SHARED); // this is what we're testing

// initialize the mutex
pthread_mutex_init (mutp, &attr);
pthread_mutexattr_destroy (&attr);

// acquire the lock before fork
pthread_mutex_lock (mutp);

pid_t chld = fork ();
if (chld != 0) { // parent
fprintf (stderr, "parent: going to sleep...\n");
sleep (30);
fprintf (stderr, "parent: unlocking.\n");
pthread_mutex_unlock (mutp);
} else { // child
fprintf (stderr, "child: going to acquire the lock ...\n");
pthread_mutex_lock (mutp);
fprintf (stderr, "child: acquired the lock.\n");
sleep (30);
pthread_mutex_unlock (mutp);
}
return 0;
}

The above code makes use of a mutex to synchronize between two processes. It first acquires an anonymous memory mapped region, which is marked for sharing between processes. Then it initializes a mutex in that region with the attribute PTHREAD_PROCESS_SHARED. Before forking, the mutex is locked by the parent. After the fork, the child process too attempts to lock the mutex.

The expected result is that the child should block until the parent unlocks the mutex.

I tried this piece of code on two Unix flavours: Solaris 9 and FreeBSD 6.2. Only the Solaris implementation gave the expected result.

In the FreeBSD implementation, the types pthread_mutex_t, pthread_cond_t and pthread_rwlock_t are all typedefed as pointers. Hence whenever a variable of one of these types is declared, only a pointer is declared. The actual memory is allocated when the respective *_init function is called. The thread/process calling the init function has no control over where that memory is acquired from (a memory mapped region, shared memory or the heap). So placing a pthread_mutex_t in a shared mapping shares only the pointer; the mutex itself lives in memory private to the process. It might also lead to memory leaks (but how many times does a program initialize a thread, attribute, mutex, etc. :-).

In contrast, Solaris has these types as structures. Hence it is easy to share a mutex between two processes by placing the mutex in a mmaped region or shared memory.

To compile the above program in Solaris, give "CC -mt ". In FreeBSD, give "CC -pthread -lpthread".

Saturday, June 16, 2007

Infinite loop while doing ++map.begin() on empty map

Look at this seemingly simple program:
int main(int argc, const char* argv[]) {
map mymap;
map::iterator it = mymap.begin();
++it;
return 0;
}
What do you think would happen when I compile and run this program? Infinite loop! I tried this program in two different versions of compilers: Sun Forte 6 suite on Solaris and gcc 3.4.6 on FreeBSD.

In most library distributions, maps and sets are implemented using red-black trees. The node behind the iterator above seems to have three links: parent, left and right. For some strange reason, when the map is empty, the node referenced by the iterator returned by begin() (and end() too) has its parent as NULL and its left and right pointing to itself!

(gdb) print it
$1 = {_M_node = 0xbfbfecd4}
(gdb) print *it._M_node
$2 = {_M_color = std::_S_red, _M_parent = 0x0, _M_left = 0xbfbfecd4, _M_right = 0xbfbfecd4}
You can have a look at the _M_increment function to know why this results in an infinite loop.

Now the history. One of our programs running in the test region behaved very weirdly. What should have been processed in a few minutes wasn't processed even after an hour. When I attached a debugger to the program and analyzed it, I figured out this was the issue. I accept that the program had a logical error: it didn't check for an empty container. Still, I think getting into an infinite loop is too big a punishment. A core dump would at least have given a hint that something went wrong.

In his book "Writing Solid Code," Steve Maguire says that code should be written in such a way that every bug is forced to surface. I guess that's what is missing in this piece of library code.

Wednesday, June 13, 2007

Zombies due to pipes in system() function call

Today I solved an interesting problem. One of my fellow developers used system() function in his code to run some command. The code looks like:

while (condition) {
if(system (...) == 0)
dosomething (...);
sleep (...);
}
When we ran the application, I observed that the system was crawling. I verified the IO utilization and found it was normal. I checked the CPU utilization using top and that too was normal. When I did a ps, I found that there were too many defunct processes in the system.

I grabbed a cup of coffee and dug into what could have caused so many defunct processes. There was only one place that I suspected could have caused them: the piece of code given above. So I looked at what was wrong with the argument to the system () call. It goes something like this:
system ("head -1 input.txt | grep pattern")
I modified the command above to the form in which system () would execute it, and ran it through truss to find out whether all the forked processes are reaped using wait () or waitid () calls. The following is the truss output (note the -f argument to truss, which is very important):
$ truss -f /bin/sh -c "head -1 input.txt | grep pattern" 2>&1 | egrep  "fork|exec|wait"
80: execve("/usr/bin/sh", 0xFFBFFB6C, 0xFFBFFB7C) argc = 3
80: fork() = 81
81: fork() (returning as child ...) = 80
81: fork() = 83
83: fork() (returning as child ...) = 81
81: execve("/usr/bin/grep", 0x0003A498, 0x0003A588) argc = 2
83: execve("/usr/bin/head", 0x0003A4B4, 0x0003A5A8) argc = 3
80: waitid(P_PID, 81, 0xFFBFF8D0, WEXITED|WTRAPPED|WNOWAIT) = 0
80: waitid(P_PID, 81, 0xFFBFF8D0, WEXITED|WTRAPPED) = 0
There are three processes: the shell (pid 80), the grep process (pid 81) and the head process (pid 83). But we find waitid () calls for only one process, which is grep. The head process, being first in the pipe, is left to become a zombie. The moral of the story is:
If you have a long pipe of processes, remember that only the last process of the pipe will be reaped using waitid () by the shell. The rest of the processes will become defunct and will be reaped by the init process soon after.
But if the rate at which the defunct processes are created is high (as in the while loop given above), then your system is bound to experience a slowdown.

(The actual code is not as trivial as using a head and grep alone!)
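
One way to avoid depending on how the shell reaps a pipeline is to start the stages yourself and wait on each of them. The following is a minimal Python sketch of that idea, not the fix we applied in the actual code; the function name run_pipeline is made up for illustration.

```python
import subprocess


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer | consumer` without a shell and reap both children.

    Returns ((producer_exit_code, consumer_exit_code), consumer_stdout_bytes).
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the pipe so that the consumer is the only reader.
    producer.stdout.close()
    output, _ = consumer.communicate()  # waits for (and reaps) the last stage
    producer.wait()                     # explicitly reap the first stage too
    return (producer.returncode, consumer.returncode), output
```

With the command from this post, the call would look like run_pipeline(["head", "-1", "input.txt"], ["grep", "pattern"]). Because both children are waited on, neither is left behind as a defunct process.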

Monday, June 11, 2007

How to free memory held by a container?

I have a test program like this:
int main() {
string large_str;
for (int i = 1; i <= 1000; ++i) {
string slice(100*i, 'X');
large_str += slice;
large_str.clear ();
printf ("size: %-5lu, capacity: %-5lu\n", (unsigned long) large_str.size(), (unsigned long) large_str.capacity()); // size_t must not be printed with %d
}
}
The last line of the output is:
 size: 0, capacity: 131043  
The question is:
It is obvious that the string container still holds the memory that it allocated for the longest string it contained. How can this memory be deallocated without destroying the object?
Thanks to James Kanze who posted an answer in this usenet thread, here is an elegant solution for this problem.

template <typename Container>
void reset( Container& c ) {
Container().swap( c ) ;
}

So when you have to free the memory, just call reset(large_str).

Saturday, June 09, 2007

A script to monitor IO activities

This follows my discussion posted earlier regarding iostat. The script given below might be helpful in monitoring a device that has the given directory in it.

#!/bin/ksh

# This script will print IO activity for the partition given in the argument.
# If no partition is given, it will print IO activity for the partition
# that contains the current directory.
#

if [ -z "$1" ] ; then
DIR=`pwd`
else
DIR="$1"
fi

ORIG_DIR=$DIR
while [ "$DIR" != "/" -a "$DIR" != "." ] ; do
MDEV=`mount -p | grep $DIR | nawk '{print $1;}'`
if [ ! -z "$MDEV" ] ; then
MDEV=`basename $MDEV`
break
else
DIR=`dirname $DIR`
fi
done

if [ -z "$MDEV" ] ; then
echo "Unable to find out the mounted device for $ORIG_DIR."
exit 1
fi

echo "Mounted device for $ORIG_DIR is $MDEV"

iostat -x -p -n -z 5 | egrep "$MDEV|device"

This script was tested in SunOS 5.9 running ksh.

Friday, June 08, 2007

Using iostat for monitoring disk activities

There could be a lot of reasons for an application slowdown, and identifying the cause can be a bit tricky. iostat is a tool that helps in monitoring I/O activity in the system, which might be what is causing your application to slow down.

iostat helps to monitor I/O activity on a per-disk and per-partition basis. There are a number of options that might suit your particular need. But I find the ones below to be good enough for my needs:
iostat -x -n -p -z 5 10

-x : Show me extended statistics for each disk.
-n : Don't show cryptic names of devices, if possible show readable names.
-p : Show per device statistics and per partition statistics.
-z : Don't show me the rows that have all zeros in them.

Let us take a sample output and explore.

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.8 0.0 10.8 0.0 0.0 0.0 0.6 0 0 c2t7d3s6

The following is what the 'man iostat' has to say regarding the columns above:

device
name of the disk

r/s reads per second

w/s writes per second

Kr/s kilobytes read per second

Kw/s kilobytes written per second

wait average number of transactions waiting for service
(queue length)

actv average number of transactions actively being serviced
(removed from the queue but not yet completed)

svc_t average service time, in milliseconds

%w percent of time there are transactions waiting for
service (queue non-empty)

%b percent of time the disk is busy (transactions in pro-
gress)

wsvc_t is the average time spent in the wait queue and asvc_t is the average time spent being serviced.
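
When you collect many such samples, it helps to turn each row into named fields. Here is a minimal Python sketch that does so for the sample row above; it assumes exactly the column order printed by "iostat -x -n" in that sample, and the names COLUMNS and parse_iostat_row are made up for illustration.

```python
# Column order as it appears in the sample output above (an assumption of this sketch).
COLUMNS = ["r/s", "w/s", "kr/s", "kw/s", "wait", "actv",
           "wsvc_t", "asvc_t", "%w", "%b", "device"]


def parse_iostat_row(line):
    """Turn one data row of the extended statistics into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d in %r"
                         % (len(COLUMNS), len(fields), line))
    # Every column except the last one (the device name) is numeric.
    row = {name: float(value) for name, value in zip(COLUMNS[:-1], fields[:-1])}
    row["device"] = fields[-1]
    return row


row = parse_iostat_row("0.0 0.8 0.0 10.8 0.0 0.0 0.0 0.6 0 0 c2t7d3s6")
print(row["device"], row["kw/s"], row["asvc_t"])
```

Header lines such as "extended device statistics" do not have eleven fields, so a caller would skip them or catch the ValueError.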

There are a couple of things that are important to us:
  • If your application is performing too many random reads/writes, you will find that the first four columns have high values. (What is high depends on your system! There is no universal number.)
  • As a result, you will find wsvc_t and asvc_t to be high too.
Here comes the tricky part: if these numbers go high, how will you know whether it is due to your application? To a reasonable extent, you can find out.

First, make sure that you are looking at the right device/partition where your application is doing reads/writes. You could use mount, and find out the device which is having the directory you are interested in.

Second, as much as possible you should try to isolate the numbers on a per partition basis, rather than on a per device basis. Per device statistics are aggregated over all the partitions on the device. For example, monitor c2t7d3s6 instead of c2t7d3, as you will get a slightly more accurate picture.

The following are some sample outputs of iostat that would help you to do a comparison.


extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
27.8 181.6 497.8 4748.0 124.0 211.6 592.0 1010.5 72 100 c2t7d3s6

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
216.2 167.0 5534.2 5363.0 0.3 88.7 0.8 231.4 2 100 c2t7d3s6

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.2 0.0 0.2 0.0 0.0 0.0 11.1 0 0 c2t7d3s6

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 2.8 0.0 1.5 0.0 0.0 0.0 8.6 0 0 c2t7d3s6

The first two snapshots were taken when the system had heavy IO activity, and the latter two were taken when there was not much IO activity. Compare the wait times and the bytes read/written. It is interesting to ask the following questions:
  1. Why is the waiting time in the queue much less than the time being serviced? And what would the reverse of this case mean?
  2. Let us assume that the device can serve 2N fixed sized IO requests in the given time interval. Consider two applications making N requests each. How different would this be from one application making 2N requests? Using which parameter could you distinguish these two cases?
Okay. Once you have identified that your application is choking IO, what is the next step?

One strategy that you could follow to make your application better is to design it in such a way that the reads/writes are sequential instead of random. This might not always be possible. At least, try to reduce the spread of random reads/writes by using strategies like always using the lower addressed blocks. I have encountered applications that use a linked list to keep track of free memory blocks on the disk, where I changed the linked list to a min-heap, which improved performance. But using a min-heap too has its downsides :-)
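
The min-heap idea can be sketched in Python as follows. This is an illustration of the technique only, not the code of the application mentioned above; the class name FreeBlockPool and its methods are hypothetical.

```python
import heapq


class FreeBlockPool:
    """Tracks free disk block numbers and always hands out the lowest one."""

    def __init__(self, block_numbers):
        self._heap = list(block_numbers)
        heapq.heapify(self._heap)  # smallest block number ends up at index 0

    def allocate(self):
        """Remove and return the lowest numbered free block."""
        if not self._heap:
            raise RuntimeError("no free blocks left")
        return heapq.heappop(self._heap)

    def release(self, block_number):
        """Return a block to the pool so that it can be handed out again."""
        heapq.heappush(self._heap, block_number)


pool = FreeBlockPool([7, 3, 9, 1])
first = pool.allocate()   # lowest free block
second = pool.allocate()  # next lowest
pool.release(first)       # the freed low block becomes the next candidate again
print(first, second, pool.allocate())
```

Both allocate and release take O(log n) time, and freed low blocks are reused before higher ones, which keeps the blocks in use clustered toward the low addresses.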

Thursday, June 07, 2007

An interesting question on Fibonacci series

I was thinking about an interesting question regarding the Fibonacci series. Here it is.
Assume that we draw the Fibonacci series of n as an inverted tree, where each node has its two additive terms as child nodes. For example, the root node F(n) has F(n-1) and F(n-2) as children, and so on. Give an expression for the number of times node F(i) occurs in the entire tree, where 1 <= i <= n.
Try to solve this problem. There is an interesting pattern to observe here.
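
If you would like to look at the pattern before deriving the expression, a brute-force count is easy to write. The Python sketch below assumes that F(1) and F(2) are the leaves of the tree (so that no F(0) node appears); the function name count_nodes is made up for illustration.

```python
from collections import Counter


def count_nodes(n, counts=None):
    """Count how many times each F(i) appears in the tree rooted at F(n).

    Assumption of this sketch: F(1) and F(2) are leaves and are not expanded.
    """
    if counts is None:
        counts = Counter()
    counts[n] += 1
    if n > 2:
        count_nodes(n - 1, counts)
        count_nodes(n - 2, counts)
    return counts


# Print the occurrence count of every node for a small tree.
for i, occurrences in sorted(count_nodes(6).items(), reverse=True):
    print("F(%d) occurs %d time(s)" % (i, occurrences))
```

Try a few values of n and compare the columns. The running time grows exponentially with n, so keep n small.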

Wednesday, April 25, 2007

Absolutely cool dbx commands

Have you ever used dbx to debug a program? It's really cool. I have been using dbx for more than 6 years now. I hope to write a full-fledged tutorial on using dbx (before it is dead and buried deep :-) Here are a few things that I feel are absolutely essential to any programmer using dbx.

  1. edit and fix
  2. examine
  3. assign
  4. set follow_fork_mode and set follow_fork_inherit commands
Most of the time I find people who are new to dbx exiting the dbx session just to make a small fix. Well, you don't really have to exit, recompile and start a new dbx session. You can use a combination of the edit and fix commands. If you simply give edit, dbx will open the current file in the editor set by the EDITOR environment variable (vi is the default). If you wish to edit a specific file, you can give that file's name as an argument to the edit command. You don't have to give the full path, as dbx is smart enough to find the full path of the files it knows about. (Give the files command to list all the files that dbx can find without a full path.) So you give the edit command and make all your changes. You save the file and exit. You give the command "fix -a". Boom! All the files that were modified are identified by dbx, they are recompiled and their new object files are loaded. However, the changes are temporary and you should rebuild your application when you exit the dbx session.

There is an important thing to remember when you use the fix command. If your program is active and recompiling an object might cause an issue, dbx will warn you. In that case you can stop your program by giving the "kill" command and then give the "fix -a" command.

The examine command is used to examine arbitrary memory regions. If you suspect that your application has written past an array boundary, the best way to confirm that is to examine the memory region with the examine command. For example, "examine (void*)&MyStruct /128c" will print as characters 128 bytes starting at the address of the variable MyStruct.

The assign command comes in handy when you have to change a variable's value. For example, you find that an if block like "if(ShouldProcess) { }" will be executed only when ShouldProcess is non-zero, and its current value is zero. You can set this variable to some non-zero value using assign and get this block executed. But remember one thing: if you use assign without prudence, the integrity of your program's state might be lost!

The set follow_fork_mode and set follow_fork_inherit commands are very handy when your program forks. If follow_fork_mode is not set, dbx will continue with the parent process, whereas your real objective might be to trace the child process. So set it like "set follow_fork_mode ask", and dbx will prompt you every time it encounters a fork. You might have set breakpoints in your parent program. If you don't set follow_fork_inherit to "on", they will be lost in the child. So you can use a combination of these two.

Above everything else: use the "help" command if you are stuck somewhere or want to know what other options you could pass. The dbx online help is pretty decent.

Hopefully these commands help you save some time and make your debugging faster.

Thursday, March 22, 2007

Understanding filter, map and reduce

Most of the computation problems that I have faced in the past could easily be solved using a mix and match of filter, map and reduce operations. These operations could be performed on any set of objects that can be iterated.

Filtering is one of the most fundamental operations, and we perform it more frequently than we think. In simple terms, we define a predicate and check whether it is true for each object in the set, iterating over them one by one. Whenever the predicate is true, we append the object to the output; otherwise we ignore it. Consider the grep tool as an example. The lines in the file(s) are iterable. If the current line is L, then the predicate is the question: "does L contain the pattern XYZ?". The output set has at most as many elements as the input set.

Mapping is the operation of producing an output for each element in the input by applying a function to that element. Unlike filtering, which uses a predicate to check, the map uses a function. The map operation produces exactly the same number of elements as its input set. An example is printing the squares of all the numbers from 1 to N.

Reduction is an iterative operation performed on each element using that element and the partial result obtained in the previous step. So the reduction operation takes two inputs: a function and an initial value with which to perform the computation on the first element. A reduce operation has exactly one output.

Python provides language level support for these three operations. In C++ too you could perform these operations using a combination of remove_copy_if (for filtering), transform (for mapping) and accumulate (for reduction).
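
As a small illustration, here are the three operations in Python, using the grep-like predicate and the squares example described above. The sample data is made up; in Python 3, reduce lives in the functools module.

```python
from functools import reduce

lines = ["alpha XYZ", "beta", "XYZ gamma"]

# Filter: keep only the lines for which the predicate "contains XYZ" is true.
matching = list(filter(lambda line: "XYZ" in line, lines))

# Map: apply a function to every element; the output has as many elements as the input.
squares = list(map(lambda x: x * x, range(1, 6)))

# Reduce: fold the elements into a single value, starting from the initial value 0.
total = reduce(lambda partial, x: partial + x, squares, 0)

print(matching)
print(squares)
print(total)
```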

Here comes the most interesting part of all. I claimed that most of the computation problems could be solved easily by using a combination of these three operations. Let us take an example and explore.

Example of billing a subscriber for his phone calls
Let us assume that we are receiving a huge volume of call records. We would like to calculate the monthly bill amount for user A. How would we do that? In three steps:
  1. Extract all the records for user A. (This is filtering with predicate: "is this call record for user A?")
  2. Find the day and night time tariff rates as per the time of the call. (This is mapping. With a function that returns the call rate depending on the time of the call.) And multiply the rate with the actual duration. (Again a mapping operation.)
  3. Sum up all the charges. (Reduction with initial value of zero or may be his previous month balance)
An important thing to understand is that we could also interleave the mapping and reducing in steps (2) and (3) above. For example, in the above case, finding the rate, calculating the call charge and adding that to the partial sum of charges so far could be done in the same iteration. They still remain map and reduce operations. Such implementation tweaks can always be made to boost performance.
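
The billing example can be sketched in Python as below. Everything specific in it is an assumption made for illustration: the record layout, the tariff values, and the rule that day time runs from 06:00 to 22:00.

```python
from functools import reduce

DAY_RATE = 2    # hypothetical charge units per minute
NIGHT_RATE = 1  # hypothetical charge units per minute


def rate_for(hour):
    """Return the tariff for a call that starts at the given hour (0-23)."""
    return DAY_RATE if 6 <= hour < 22 else NIGHT_RATE


def monthly_bill(records, user, opening_balance=0):
    # Step 1, filter: "is this call record for the given user?"
    own_calls = filter(lambda rec: rec["user"] == user, records)
    # Step 2, map: look up the rate and multiply it by the call duration.
    charges = map(lambda rec: rate_for(rec["hour"]) * rec["minutes"], own_calls)
    # Step 3, reduce: sum up the charges, starting from the opening balance.
    return reduce(lambda total, charge: total + charge, charges, opening_balance)


CALLS = [
    {"user": "A", "hour": 10, "minutes": 5},
    {"user": "B", "hour": 11, "minutes": 30},
    {"user": "A", "hour": 23, "minutes": 12},
    {"user": "A", "hour": 6, "minutes": 1},
]
print(monthly_bill(CALLS, "A"))
```

In Python 3, filter and map return lazy iterators, so each record flows through all three steps in a single pass over the input, which is the interleaving described above.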

Saturday, March 17, 2007

dhclient - Obtaining IP address dynamically

Whenever you want to obtain a dynamic IP address for your Linux/Unix machine from a DHCP server, you can use the dhclient utility.

Both DHCP requests and responses are carried in UDP datagrams. If you use a sniffer to observe the request/response pattern, the following is what you might see. This capture was taken from dhclient running on FreeBSD 6.1.

Len SrcIP SrcMACAddr DestIP DestMACAddr Protocol
342 0.0.0.0 00:0c:29:c1:13:81 255.255.255.255 ff:ff:ff:ff:ff:ff UDP

62 192.168.49.254 00:50:56:ef:75:a8 192.168.49.128 ff:ff:ff:ff:ff:ff ICMP ping request
342 192.168.49.254 00:50:56:ef:75:a8 192.168.49.128 00:0c:29:c1:13:81 UDP
60 192.168.49.128 00:0c:29:c1:13:81 ff:ff:ff:ff:ff:ff ARP request


If you observe the source and destination (IP, MACAddr) patterns, it is easy to appreciate what happens. Here is what happens:

  1. A DHCP request is sent on the network. Both destination MACAddr and IPAddr are broadcast addresses.
  2. The DHCP server chooses an IP address to offer and confirms that the address is not in use anywhere else on the network.
  3. DHCP server offers the IP address to the requester.
  4. The requester sends an ARP request to register its MAC address and IP address in interested hosts in the network.
DHCP is an application layer protocol. That's an important thing to keep in mind.

Tuesday, February 20, 2007

VMWare Server - Free to download

A couple of weeks old news, but still thought worth sharing. For all those VMWare fans, the VMWare Server is now available free of cost for download. Earlier only VMWare player was available for download. Download and try it from here.

You might wish to view all my previous posts on VMWare to know few more tricks.

Thursday, February 08, 2007

Opening control panel from command prompt

How do you go to a tab in control panel directly from command prompt?

Under the system32 directory, you will find a few *.cpl files. Running these files will take you directly to the control panel tab corresponding to them. For example, appwiz.cpl will take you to the "Add/Remove Programs" tab in the control panel.

You can get a complete list of these files from this Microsoft Knowledge Base article.

It saves a lot of time, especially when you are running in a less privileged account and want to switch to an admin account to install/uninstall programs.

Tuesday, February 06, 2007

A python script to monitor a process

Here is a Python script to monitor a process's memory usage in Linux. This tool periodically prints how much heap has been used by the process so far. When you stop monitoring, the script prints out a summary along with a text graph that will help you understand the trend.

You can take a look at the script here. It might be useful and handy.
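
For readers who just want the core idea, here is a minimal sketch. It is not the script linked above. It assumes a Linux /proc file system and uses the VmData field of /proc/<pid>/status (the size of the data segment) as an approximation of heap usage; all function names are made up for illustration.

```python
import time


def parse_vm_kb(status_text, field="VmData"):
    """Extract a 'VmXxx:   1234 kB' value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name == field:
            return int(rest.split()[0])
    return None  # field not present, e.g. for kernel threads


def read_vm_kb(pid, field="VmData"):
    """Read the current value of the field for the given process id."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vm_kb(status_file.read(), field)


def monitor(pid, interval=1.0, samples=5):
    """Take a fixed number of readings, one every `interval` seconds."""
    readings = []
    for _ in range(samples):
        readings.append(read_vm_kb(pid))
        time.sleep(interval)
    return readings
```

Calling monitor(1234) for a hypothetical process id 1234 returns a list of readings in kB that you could print or plot as a text graph.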

Friday, February 02, 2007

Debugging by printing

There are two ways to debug a program. One is to study the source code and try to understand what the program is doing and try to figure out what could have gone wrong. This is a kind of static debugging. Doing a dry run of the program is at times overkill. The other is to let the program run and print its state information. We can reason out what might have gone wrong from the collected information.

Roughly speaking, state of a running program includes everything in the memory of the running program. But to make out what might have gone wrong, we only have to focus on what is really necessary.


When I say "let the program print its state information," it could be the modified program with a lot of print statements. Or it could be a debugger attached to the running process and printing the state information. It definitely helps to learn a few ways to print the state information. Here is an incomplete list of ways to print in various systems.

Language or Tool | Function/Command | Remarks
C | printf, fprintf | You could try fprintf(stderr, ...) and redirect the error output to a file.
C++ | cout, cerr, clog | cerr << "Value of var is: " << var << endl;
Java | System.err.println, System.out.println, java.util.logging.Logger | The Logging API is very handy if you would like to ship your code with a logging facility.
make | $(info ...), $(warning ...), $(error ...) | Very handy when you want to know the value of a variable. Try $(info ${.VARIABLES}) to instantly figure out the list of variables set.
make | -d, --debug, --dry-run | -d and --debug print debugging information. --dry-run will tell you all the actions that would be taken without really taking them.
dbx | print, examine, dump | examine is a very useful command if you would like to figure out any array boundary writes in a core dump. The dump command prints all the local variables in the current stack frame.
Lisp | message, prin1 |
Perl | print | print STDERR ... will print to the standard error.
Python | print | Prints the given argument to the standard output. If you would like to print different types of items, you can give them in a comma separated list.
Shell script | print, echo | print is a built-in command in ksh (and zsh). echo is also a built-in in most modern shells; in a shell where it is not, every echo statement starts a new process.
Shell script | set -x | Cool command. This will print each command before it is executed. Be careful if you are giving passwords as arguments to any commands, as they will be printed in plain text on screen!


One of the most important things while logging any information is to print it along with some useful message, and possible workarounds if any. For example, the following messages are really useless!
if(l == null) System.out.println("null value");

System.out.println("i is invalid");


Such messages are not only useless, but frustrating and misleading at times. Rather the following messages are very useful.

if (l == null)
    System.out.println("Null value encountered while processing " +
                       "INI file lines.\n" +
                       "Please check if all rows are formatted " +
                       "as \"name=value\"");


System.out.println("i has illegal value [" + i + "]." +
                   " Expected i to be in the range 0 to 6 (inclusive)");

Keep in mind: we write the code once, but we debug it every time there is an issue.
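One way to keep such messages consistent is to build them in a small helper instead of typing them at every call site. The following is only a minimal sketch: the class name RangeCheck and its two methods are hypothetical, and the range 0 to 6 is simply the one used in the example above. It reports through java.util.logging.Logger, which was listed in the table earlier.

```java
import java.util.logging.Logger;

// Hypothetical helper for illustration only; the names are assumptions.
class RangeCheck {
    private static final Logger LOG = Logger.getLogger(RangeCheck.class.getName());

    // Builds a message that states the offending value and the expected range.
    static String describe(String name, int value, int min, int max) {
        return name + " has illegal value [" + value + "]."
             + " Expected " + name + " to be in the range "
             + min + " to " + max + " (inclusive)";
    }

    // Returns true if the value is in range; otherwise logs a warning and returns false.
    static boolean check(String name, int value, int min, int max) {
        if (value < min || value > max) {
            LOG.warning(describe(name, value, min, max));
            return false;
        }
        return true;
    }
}
```

A call such as RangeCheck.check("i", i, 0, 6) then either passes silently or leaves a message that tells the reader what was wrong and what was expected.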


Thursday, February 01, 2007

Browsable HotSpot VM Source Code

Now that the JDK source code has been made open source, I wanted to go through it to figure out how things have been done. I found the following two links to be useful.
The OpenGrok link is very good if you would like to search for something in the source code. If you know of any better links, please post them in the comments.

Tuesday, January 30, 2007

Deterministic destruction of Objects in Java - An idea

I love programming in C++, despite the hue and cry against it as a "most programmer unfriendly" language. I have my own reasons to like C++. The top two reasons are:
  • Bjarne was born in 1950, the year my father was born.
  • The initial C++ (C With Classes) was released in 1979, the year I was born. :-)
Nevertheless, there is one aspect of C++ that I find very useful and that definitely helps prevent resource leaks. It is the destructor of an object. In simple terms, whenever we exit a lexical block (a block enclosed within "{" and "}"), all the objects constructed within that block are guaranteed to be destructed. The guarantee holds even when the exit happens because of a thrown exception.

This guarantee is a very strong weapon for any programmer. It makes the destruction of an object a deterministic phenomenon. Objects created on the heap with the new keyword must be destroyed explicitly by the user, and that's not what I am discussing here.

One good programming practice is to acquire and release any resource within the same lexical scope, as much as possible. (I hear you yelling "It's not always possible, fella" and I agree with you!)

If we look at the object life cycle models provided by C++ and Java, C++ implicitly supports the above-mentioned practice. But in Java, every user of the object must remember to invoke the destructor's equivalent (freeResources, resetAll, etc.). The finalize method in Java serves a different purpose altogether, and the language designers themselves strongly discourage using and relying upon it.
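For comparison, Java releases from Java 7 onward provide the try-with-resources statement, which closes any AutoCloseable object when the block exits, whether normally or through an exception. It is not a destructor, since the object must still be declared in the try header, but it does tie cleanup to a lexical scope. Here is a minimal sketch; TrackedResource, ScopedCleanupDemo and the event list are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that records when it is opened and closed.
class TrackedResource implements AutoCloseable {
    static final List<String> EVENTS = new ArrayList<>();
    private final String name;

    TrackedResource(String name) {
        this.name = name;
        EVENTS.add("open " + name);
    }

    @Override
    public void close() {
        EVENTS.add("close " + name);
    }
}

class ScopedCleanupDemo {
    // Opens two resources, then fails; both are closed before the exception propagates.
    static void run() {
        try (TrackedResource a = new TrackedResource("a");
             TrackedResource b = new TrackedResource("b")) {
            throw new IllegalStateException("failure inside the block");
        }
    }

    public static void main(String[] args) {
        try {
            run();
        } catch (IllegalStateException e) {
            TrackedResource.EVENTS.add("caught");
        }
        // Prints: [open a, open b, close b, close a, caught]
        System.out.println(TrackedResource.EVENTS);
    }
}
```

The resources are closed in the reverse order of their creation, which mirrors the order in which C++ destroys local objects.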

I have been chewing on this idea for some time, and this is what I think.
Garbage collection and object destruction are two different things. Whenever a lexical scope exits, the JVM has the list of objects within that scope. The JVM can decide whether an object still has any references to it or whether it is safe to destruct. An object destructed this way can then wait on the heap until GC occurs.
My thought is not deep enough to be incorporated into a production release! But it is certainly worth considering.

Is there a rationale behind not supporting deterministic object destruction in Java?

Monday, January 29, 2007

Notes on ObjectOutputStream.writeObject()

If you write the same object twice into an ObjectOutputStream using the writeObject() method, you would typically expect the size of the stream to increase by approximately the size of the object (and, recursively, of all the fields within it). But that is not what happens.

It is critical to understand how the writeObject() method works. It writes an object only once into a stream. The next time the same object is written, it just records a reference to the copy that is already in the stream.

Let us take an example. We want to write 1000 student records into an ObjectOutputStream. We create only one record object and plan to reuse it within a loop, so that we save time on object creation. We will use setter methods to update the same object with the next student's details. If we use writeObject() to carry out this task, the changes made for all but the first student's record will be lost. (Go ahead and try the program given below.)

To achieve the objective stated above, you must use the writeUnshared() method instead. (Change the writeObject() calls to writeUnshared() and convince yourself.)

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.IOException;
import java.io.Serializable;

class StudentRecord implements Serializable {
    public String name;
    public String major;
}

public class ObjectStreamTest {
    public static void main(String[] argv) throws IOException, java.lang.ClassNotFoundException {
        // Open the Object stream.
        ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("objectfile.bin"));

        // Create the record that will be reused.
        StudentRecord rec = new StudentRecord();

        // Write the records.
        rec.name = "John"; rec.major = "Maths";
        oos.writeObject(rec);
        rec.name = "Ben"; rec.major = "Arts";
        oos.writeObject(rec);
        oos.close();

        // Read the objects back to reconstruct them.
        ObjectInputStream ois = new ObjectInputStream(new FileInputStream("objectfile.bin"));
        rec = (StudentRecord)ois.readObject();
        System.out.println("name: " + rec.name + ", major: " + rec.major);
        rec = (StudentRecord)ois.readObject();
        System.out.println("name: " + rec.name + ", major: " + rec.major);
        ois.close();
    }
}
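If you would rather not create a file, the same comparison can be run entirely in memory. The sketch below is an assumption-laden variant of the program above, not a replacement for it: Pupil and SharedVsUnshared are hypothetical names, and the stream is backed by a ByteArrayOutputStream. It writes one reused object twice and returns the two names that come back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type for this sketch.
class Pupil implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
}

class SharedVsUnshared {
    // Writes one reused Pupil twice, then reads back the two names.
    static List<String> roundTrip(boolean unshared) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                Pupil p = new Pupil();
                for (String name : new String[] {"John", "Ben"}) {
                    p.name = name;
                    if (unshared) {
                        oos.writeUnshared(p);   // serializes the current state every time
                    } else {
                        oos.writeObject(p);     // second call writes only a back-reference
                    }
                }
            }
            List<String> names = new ArrayList<>();
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                names.add(((Pupil) ois.readObject()).name);
                names.add(((Pupil) ois.readObject()).name);
            }
            return names;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("in-memory round trip failed", e);
        }
    }
}
```

With these inputs, roundTrip(false) returns [John, John] and roundTrip(true) returns [John, Ben].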

Thursday, January 25, 2007

Software Development Best Practices Conference 2007

Last Friday I attended the Software Development Best Practices Conference 2007. It was an eventful day. Two presentations made me feel that I got much more in return than what I paid for: "Better Software - No matter what" by Dr. Scott Meyers and "Securing Software Design and Architecture: Uncut and Uncensored" by Dr. Herbert Thompson. In the photo, I am seen with Dr. Scott Meyers. (Thanks to Abhishek Pandey from Intuit for the photo.)

You can see Dr. Scott Meyers's presentation slides on the SD Expo web site.

Other sponsored speakers discussed more about their companies and the products that they were advertising, which is quite understandable.

Dr. Thompson's speech was lively and full of information. He shared three past incidents that convinced him that "bugs are everywhere" and that security is the most critical aspect of any product. Of the three incidents, I loved the Bahamian Adventure of Soda Machines! A couple of his best books can be viewed here.

Bottom line: I am deeply convinced that one can break any software.

Monday, January 15, 2007

Bjarne's Interview in MIT Technology Review

As refreshing and thought provoking as ever. I would strongly urge you to read Bjarne's interview (Part 1) completely. Of all the answers, I like the following one in particular.
Expressing dislike of something you don't know is usually known as prejudice. Also, complainers are always louder and more certain than proponents--reasonable people acknowledge flaws.

I believe this answer is just as true for our personal lives as it is in the context in which it is presented here. The second part of the interview is available here.

Friday, January 12, 2007

What is robots.txt?

Before I explain the why and what of the robots.txt file, let me describe an incident that caught us off guard some time back. (For security reasons, I have left out the names.)

We used to serve real-time/delayed quotes to our customers. One of the customers wanted to provide search functionality for their web site. Hence they bought an indexing/searching application that had a crawler at its core. The customer had a list of the Top 10 Active stocks and their respective delayed quotes on their landing page. Some stocks would have rapid fluctuations in their prices. The crawler (which at that time we called a "stupid crawler," not knowing about the robots.txt file) started indexing the landing page as rapidly as the values changed. For each request, the customer's application server sent ten quote requests to us. Lucky us! Our infrastructure was robust enough that our server didn't come down. But at the end of two days we had thousands and thousands of quote requests, which pushed our service graph to an unprecedented level!

Well, that was the story. We figured out the issue by looking at the logs and informed the customer about it. Last I heard, they had disabled the search facility.

That's where robots.txt comes into the picture to save us.
robots.txt is a rules file that every well-behaved crawler reads before it crawls a particular site.
It lists the paths that crawlers should stay away from, such as directories whose contents change dynamically and hence must not be indexed. It also allows crawler-specific rules. For example, if www.mywebsite.com/robots.txt has the following contents, then Google's crawler would not crawl any page on www.mywebsite.com.
User-agent: googlebot
Disallow: /
You can read much more about robots.txt here.
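To make the rules concrete, here is a deliberately simplified sketch of the check a crawler could perform before fetching a page. It is an illustration under stated assumptions, not how any real crawler is implemented: the class RobotsRules is hypothetical, it understands only User-agent and Disallow lines, it matches paths by plain prefix, and it treats rules for "*" and for the named agent alike.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified sketch: handles only User-agent and Disallow lines with prefix matching.
class RobotsRules {
    // Returns true if the given robots.txt text asks `agent` not to fetch `path`.
    static boolean isDisallowed(String robotsTxt, String agent, String path) {
        List<String> prefixes = new ArrayList<>();
        boolean applies = false;      // does the current group apply to this agent?
        boolean inAgentLines = false; // still reading consecutive User-agent lines?
        for (String raw : robotsTxt.split("\\r?\\n")) {
            String line = raw.replaceFirst("#.*", "").trim();  // drop comments
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;  // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    applies = false;  // a new group of rules starts here
                }
                inAgentLines = true;
                if (value.equals("*") || value.equalsIgnoreCase(agent)) {
                    applies = true;
                }
            } else {
                inAgentLines = false;
                if (field.equals("disallow") && applies && !value.isEmpty()) {
                    prefixes.add(value);
                }
            }
        }
        for (String prefix : prefixes) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

For the two-line file shown above, isDisallowed(file, "googlebot", "/quotes") is true, while the same call for any other agent name is false.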

Monday, January 08, 2007

Learning a system and the use of profiler

Here is a question: When you are given a huge system with source code and asked to learn the system, where will you start? Think for a moment and answer.

My answer goes like this:
  • Run the system under a debugger, which will give you a fair idea of the system (where it starts, which functions are called, etc.)
  • Run the system under truss or strace (or whichever tool is applicable to your platform), which will give you a very good idea of the resources the system uses (INI files, resource files, etc.)
  • Observe which functions are called when different features of the system are used (what happens in my server after I click the "Submit" button, what is the function invocation sequence when I log in, etc.)
If you venture into studying the system by brute force, going through the source code at random points, you might waste time in unnecessary places. The activities mentioned above should at least help you decide which piece of source code to look at. Needless to say, the items above are just the beginning, and you should eventually go through the source code to understand the full system.

Let us take the case of a Java system. How would you find out the sequence of function calls when some functionality is invoked? Stepping through your program under a debugger is a lot more painful. So let that not be your first resort, even if it is not your last. We really need a digest of the function calls, rather than stepping through the execution ourselves.

I came across this excellent tutorial by Andrew Wilcox on building our own call profiler. I believe that this is an excellent starting point. You can find more about how to write your own profiler here. Remember that you will have to slightly modify the source code present in the tutorial to make it log all the function calls, rather than just the function being profiled. That tutorial lists a decent set of references too. You might find most of them very useful, particularly the one on the JNI (Java Native Interface).
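Before building a full profiler, a much cruder way to get a digest of the calls leading to one point is to capture the current stack trace there. The sketch below is not the approach taken in the tutorial; it is only a low-tech alternative, and the names CallTrace and LoginFlow are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CallTrace {
    // Returns "Class.method" for each frame above this helper, innermost first.
    static List<String> here() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        List<String> names = new ArrayList<>();
        for (int i = 1; i < frames.length; i++) {   // frame 0 is here() itself
            names.add(frames[i].getClassName() + "." + frames[i].getMethodName());
        }
        return names;
    }
}

// Hypothetical call chain standing in for "what happens when I log in".
class LoginFlow {
    static List<String> login()        { return authenticate(); }
    static List<String> authenticate() { return loadProfile(); }
    static List<String> loadProfile()  { return CallTrace.here(); }
}
```

Calling LoginFlow.login() returns a list whose first three entries name loadProfile, authenticate and login, in that order. This shows only the path to a single point, whereas a profiler records every call.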

You are most welcome to share your thoughts on the ideas presented. I would be more than glad to hear them.

Friday, January 05, 2007

System.identityHashCode() - What is it?

Today I learnt about a function called System.identityHashCode(). To understand where it is used, let us consider the following program.
//
// What will be the output of toString() if we override hashCode() function?
//
public class HashCodeTest {

    public int hashCode() { return 0xDEADBEEF; }

    public static void main(String[] argv) {
        HashCodeTest o1 = new HashCodeTest();
        HashCodeTest o2 = new HashCodeTest();

        System.out.println("Using default toString():");
        System.out.println("First: " + o1);
        System.out.println("Second: " + o2);

        System.out.println("Using System.identityHashCode():");
        System.out.println("First: " + System.identityHashCode(o1));
        System.out.println("Second: " + System.identityHashCode(o2));
    }
}
This program overrides the hashCode() function, which is perfectly legal. As a result, the default toString() method no longer reveals the real identity of the object, because it prints whatever hashCode() returns. The output turns out to be:
Using default toString():
First: HashCodeTest@deadbeef
Second: HashCodeTest@deadbeef
Using System.identityHashCode():
First: 27553328
Second: 4072869

Sometimes you might wish to print the identity along with your own message when you override the toString() method. In such instances the identityHashCode() function comes in handy. If you look at the second part of the program output, the identity hash codes of the two objects are different (they usually are, although distinct values are not strictly guaranteed).
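As a minimal sketch of that idea, the class below overrides both hashCode() and toString() and still prints an identity. The class name Account and its owner field are hypothetical, chosen only for illustration.

```java
// Hypothetical class: overrides hashCode() and toString(), yet still prints an identity.
class Account {
    private final String owner;

    Account(String owner) { this.owner = owner; }

    @Override
    public int hashCode() { return owner.hashCode(); }

    @Override
    public boolean equals(Object other) {
        return other instanceof Account && ((Account) other).owner.equals(owner);
    }

    @Override
    public String toString() {
        // Same "Class@hex" shape as Object.toString(), but based on the identity hash.
        return getClass().getName() + "@"
             + Integer.toHexString(System.identityHashCode(this))
             + "[owner=" + owner + "]";
    }
}
```

Two Account objects with the same owner are equal and share a hashCode(), yet their toString() output will normally differ in the hexadecimal part, which makes it possible to tell the instances apart in a log.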