Showing posts from 2007

Installing modules in Perl

The easiest way to install Perl modules from CPAN is to use the CPAN module. For example, the following command installs the XML::Simple module:
perl -MCPAN -e 'install XML::Simple;'
If you wish to view the readme file for a particular module, try this command (here I am viewing the readme file for the XML::Simple module):
perl -MCPAN -e 'readme XML::Simple;'
Another useful option is to run the CPAN module interactively with the following command:
perl -MCPAN -e shell
Once you get the shell prompt "cpan>", try "help" to learn about more useful options. To know more, refer here.

Essential Firefox plugins

Here is a list of three Firefox plugins that make my life a lot easier, safer and faster.
Adblock
Foxmarks
NoScript
I have written an article earlier about Adblock; it can make your browsing experience better by getting rid of unnecessary ads.

If you work on multiple machines (your laptop, your desktop, etc.), Foxmarks will help you sync all your bookmarks automatically.

NoScript blocks those junk scripts that slow down your system and at times prove to be insecure. Be prepared for surprises! Sometimes the web pages you visit might not work the way you expect them to ("Why doesn't the link I click take me anywhere?"). The JavaScript that is supposed to kick in when you click the link might be blocked by NoScript.

Difference between StringBuffer and StringBuilder

There is a subtle yet important difference between StringBuffer and StringBuilder. StringBuffer synchronizes the methods that access or modify its contents, whereas StringBuilder does not. A StringBuilder object is therefore not safe to share across multiple threads.

It is much more efficient to use StringBuilder when the object is not shared between multiple threads, because no locking overhead is paid.

StringBuilder was introduced in JDK 1.5.

Starting X sessions in Cygwin

Question: "I want to have my Linux desktop in my Windows machine. I have installed Cygwin. How to do this?"

Follow these steps:
Start your Cygwin command shell.
Give "xinit -- -clipboard" in the command line. You will see a bare X window with a command prompt in it. You will also see something like "Cygwin/X - 0:0" at the top left of the window. This tells you the display on which the X server is listening for incoming connections.
Give "xhost +" in the command prompt. This lets the server accept all incoming connections. Remember: if you are concerned about security, refer to the man page of xhost on how to give a list of hosts instead of the wildcard "+".
Start an ssh connection to your Linux box.
Once logged in, set the display variable. As per this example it would be "export DISPLAY=x.x.x.x:0.0" where x.x.x.x is the IP address of your Windows box.
Start your Gnome session by giving "gnome-session". Voil…

Glossary from Altova

I was searching for one good place with a decent list of the acronyms that I come across often, so that I don't have to search the net for them every time. The Altova glossary page looks like a good place (despite their nepotism toward their brainchild XML Spy!).

Finding DLL dependencies in Windows

For quite some time now I have been searching for a tool that will help me find the dependent DLLs of a given EXE/DLL (like ldd in Solaris/Linux). Today I came across an amazing utility called Dependency Walker. It's really neat and good at finding broken dependencies.

Nuke those Google ads that come in your way

Google ads are good, at times. But I don't remember the last time I clicked on one and visited the target page. If I need something, I mostly use Google search, not the ads. Most of the time I feel that those ads get in my way when I am reading something really serious. It's a waste of bandwidth and time to load these ads when I am in no mood to click on them.

Here is a cool solution for Firefox. All you have to do is download the Adblock plug-in and install it. Then restart Firefox. Go to Tools -> Adblock -> Preferences. Enter the Google syndication URL, save, and you are done. Enjoy surfing free of Google ads :-)

Puzzle of repeating numbers

My friend Siva asked me this puzzle.
There is a set of N numbers. All numbers in this set repeat an even number of times, except two numbers that repeat an odd number of times. Develop an algorithm that will find these two numbers in O(N) space and time complexity.
I couldn't find the solution, but the solution Siva gave and another variant of that solution that his friend gave were simply amazing. Not to spoil the fun, I am not giving any of those solutions here :-)

dsh - Distributed Shell

In one of the Linux Magazine articles, I came across this shell called dsh (Distributed Shell). The name is a misnomer, as it is actually not a shell. It is a wrapper that invokes shell commands on different machines using ssh (or any other configured shell). Nevertheless, I think this tool would be of tremendous help when the same command has to be run on multiple machines, which is what most admins do often. An example given in the Linux Magazine article is to run the last command on multiple machines to see the login/logout activity.

Comparison of PTHREAD_PROCESS_SHARED in Solaris and FreeBSD

Let us begin with the sample code below (headers omitted for brevity):
int main(int argc, const char* argv[]) {
    // NOTE: the remaining mmap arguments were lost from the original listing;
    // a shared anonymous read/write mapping is assumed here.
    void* mmap_ptr = mmap (NULL, sizeof (pthread_mutex_t),
                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, -1, 0);
    if (mmap_ptr == MAP_FAILED) {
        perror ("mmap failed");
        return -1;
    }

    fprintf (stderr, "mmaped at: %p\n", mmap_ptr);

    pthread_mutex_t* mutp = (pthread_mutex_t*)mmap_ptr;

    // initialize the attribute
    pthread_mutexattr_t attr;
    pthread_mutexattr_init (&attr);
    pthread_mutexattr_setpshared (&attr, PTHREAD_PROCESS_SHARED); // this is what we're testing

    // initialize the mutex
    pthread_mutex_init (mutp, &attr);
    pthread_mutexattr_destroy (&attr);

    // acquire the lock before fork
    pthread_mutex_lock (mutp);

    pid_t chld = fork ();
    if (chld != 0) { // parent
        fprintf (stderr, "parent: going to sleep...\n");
        sleep (30);
        fprintf (stderr, "parent: unlocking.\n");
        pthread_mutex_unlock (mutp);
    } else { // child
        fprintf (stder…

Infinite loop while doing ++map.begin() on empty map

Look at this seemingly simple program:
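int main(int argc, const char* argv[]) {
    // The key and value types were stripped from the original listing; int is assumed.
    map<int, int> mymap;
    map<int, int>::iterator it = mymap.begin();
    ++it; // the increment named in the title; this line is missing from the original listing
    return 0;
}
What do you think would happen when I compile and run this program? Infinite loop!
I tried this program with two different compilers: Sun Forte 6 suite on Solaris and gcc 3.4.6 on FreeBSD.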

In most library distributions, maps and sets are implemented using red-black trees. The iterator above seems to have three links: parent, left and right. For some strange reason, when the map is empty, the iterator returned by begin() (and end() too) has parent as NULL and both left and right pointing to itself!

(gdb) print it
$1 = {_M_node = 0xbfbfecd4}
(gdb) print *it._M_node
$2 = {_M_color = std::_S_red, _M_parent = 0x0, _M_left = 0xbfbfecd4, _M_right = 0xbfbfecd4}
You can have a look at the _M_increment function to know why this results in an infinite loop.

Now the history. One of our programs running in the test region behaved very weirdly. What should have …

Zombies due to pipes in system() function call

Today I solved an interesting problem. One of my fellow developers used the system() function in his code to run a command. The code looks like this:

while (condition) {
    if (system (...) == 0)
        dosomething (...);
    sleep (...);
}
When we ran the application, I observed that the system was crawling. I checked the I/O utilization and found it was normal. I checked the CPU utilization using top, and that too was normal. When I ran ps, I found that there were too many defunct processes in the system.

I grabbed a cup of coffee and dug into what could have caused so many defunct processes. There was only one place that I suspected could have caused them: the piece of code given above. So I wondered what was wrong with the argument to the system () call. It goes something like this:
system ("head -1 input.txt | grep pattern")
I modified the command above to match how it would be executed in system (), and ran it through truss to find out if all the forked processes are reaped using wait () or w…

How to free memory held by a container?

I have a test program like this:
int main() {
    string large_str;
    for (int i = 1; i <= 1000; ++i) {
        string slice(100*i, 'X');
        large_str += slice;
        large_str.clear ();
        // size() and capacity() return size_t, so cast them to match %d
        printf ("size: %-5d, capacity: %-5d\n", (int) large_str.size(), (int) large_str.capacity());
    }
}
The last line of the output is:
size: 0, capacity: 131043
The question is this: the string container clearly still holds the memory that it allocated for the longest string it contained. How do we deallocate this memory without destructing the object? Thanks to James Kanze, who posted an answer in this usenet thread, here is an elegant solution to this problem.

template <typename Container>
void reset( Container& c ) {
    // swap with an empty temporary; the temporary takes the old buffer and frees it
    Container().swap( c ) ;
}

So when you have to free the memory, just call reset(large_str).

A script to monitor IO activities

This follows my discussion posted earlier regarding iostat. The script given below might be helpful in monitoring a device that has the given directory in it.


#!/bin/ksh
# This script will print IO activity for the partition given in the argument.
# If no partition is given, it will print IO activity for the partition
# that contains the current directory.
# NOTE: lines marked "assumed" were missing from the original listing.

if [ -z "$1" ] ; then
    DIR=`pwd`        # assumed
else
    DIR=$1           # assumed
fi
ORIG_DIR=$DIR        # assumed

while [ "$DIR" != "/" -a "$DIR" != "." ] ; do
    MDEV=`mount -p | grep $DIR | nawk '{print $1;}'`
    if [ ! -z "$MDEV" ] ; then
        MDEV=`basename $MDEV`
        break        # assumed
    fi
    DIR=`dirname $DIR`
done

if [ -z "$MDEV" ] ; then
    echo "Unable to find out the mounted device for $ORIG_DIR."
    exit 1
fi

echo "Mounted device for $ORIG_DIR is $MDEV"

iostat -x -p -n -z 5 | egrep "$MDEV|device"

This script was tested in SunOS 5.9 running ksh.

Using iostat for monitoring disk activities

There can be many reasons for an application to slow down, and identifying the cause can be a bit tricky. iostat is a tool that helps in monitoring the I/O activity in the system, which might be what is slowing your application down.

iostat helps to monitor I/O activity on a per-disk and per-partition basis. There are a number of options that might suit your particular need, but I find the ones below to be good enough for mine:
iostat -x -n -p -z 5 10
-x : Show me extended statistics for each disk.
-n : Don't show cryptic names of devices, if possible show readable names.
-p : Show per device statistics and per partition statistics.
-z : Don't show me the rows that have all zeros in them.

Let us take a sample output and explore.

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.8    0.0   10.8  0.0  0.0    0.0    0.6   0   0 c2t7d3s6
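
To keep track of which number belongs to which column in a row like the one above, here is a minimal Python sketch that pairs each header name with the value in the same position. The function name parse_iostat_row is made up for illustration; it is not part of iostat or of any script from these posts.

```python
def parse_iostat_row(header_line, data_line):
    """Pair each column name with the value in the same position."""
    names = header_line.split()
    values = data_line.split()
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


# Header and row copied from the sample output above.
header = "r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device"
row = "0.0 0.8 0.0 10.8 0.0 0.0 0.0 0.6 0 0 c2t7d3s6"
stats = parse_iostat_row(header, row)
print(stats["device"], "kw/s =", stats["kw/s"])
```

The values are kept as strings on purpose, since the last column is a device name rather than a number.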

The following is what the 'man iostat' …

An interesting question on Fibonacci series

I was thinking about an interesting question regarding the Fibonacci series. Here it is.
Assume that we draw the Fibonacci series of n as an inverted tree, where each node has its two additive terms as child nodes. For example, the root node F(n) has F(n-1) and F(n-2) as children, and so on. Give an expression for the number of times node i occurs in the entire tree, where 1 <= i <= n. Try to solve this problem. There is an interesting pattern to observe here.
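
If you would like to explore the pattern before working out the expression, here is a small brute-force sketch in Python that simply walks the tree and counts the nodes. It assumes that F(1) and F(2) are leaves, which the question does not state explicitly, and the function name count_nodes is only illustrative. It does not give away the closed form.

```python
from collections import Counter


def count_nodes(n, counts=None):
    """Count how often each F(i) appears in the tree rooted at F(n).

    Assumption: F(1) and F(2) are leaves and have no children.
    """
    if counts is None:
        counts = Counter()
    counts[n] += 1
    if n > 2:
        count_nodes(n - 1, counts)
        count_nodes(n - 2, counts)
    return counts


counts = count_nodes(6)
for i in sorted(counts, reverse=True):
    print("F(%d) occurs %d time(s)" % (i, counts[i]))
```

Running it for a few values of n and comparing the columns is a good way to spot the pattern.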

Absolutely cool dbx commands

Have you ever used dbx to debug a program? It's really cool. I have been using dbx for more than 6 years now. I am hoping to write a full-fledged tutorial on using dbx (before it is dead and buried deep :-). Here are a few things that I feel are absolutely essential to any programmer using dbx.

edit and fix
examine
assign
set follow_fork_mode and set follow_fork_inherit commands
Most of the time I find that people who are new to dbx exit the dbx session just to make a small fix. Well, you don't really have to exit, recompile and start a new dbx session. You can use a combination of the edit and fix commands. If you simply give edit, dbx will open the current file in the editor set by the EDITOR environment variable (vi is the default). If you wish to edit a specific file, you can give that file's name as an argument to the edit command. You don't have to give the full path, as dbx is smart enough to find out the full path of the known files. (Give files command to find out what are…

Understanding filter, map and reduce

Most of the computation problems that I have faced in the past could easily be solved using a mix and match of filter, map and reduce operations. These operations could be performed on any set of objects that can be iterated.

Filtering is one of the most fundamental operations, and we perform it more frequently than we think. In simple terms, we define a predicate and check whether it is true for each object in the set, iterating over them one by one. Whenever the predicate is true, we append the object to the output; otherwise we ignore it. Consider the grep tool as an example. The lines in the file(s) are iterable. If the current line is L, then the predicate is the question: "does L contain the pattern XYZ?". The output set has at most as many elements as the input set.
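
The grep example can be sketched in a few lines of Python using the built-in filter function. This is only an illustration of the idea: the function name grep and the sample lines are made up, and the predicate is a plain substring test rather than a real regular expression match.

```python
def grep(pattern, lines):
    """Keep only the lines for which the predicate 'pattern in line' is true."""
    return list(filter(lambda line: pattern in line, lines))


lines = ["alpha XYZ", "beta", "XYZ gamma", "delta"]
matches = grep("XYZ", lines)
print(matches)
```

Note that the output can never be longer than the input, since each line is either kept or dropped.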

Mapping is the operation of producing an output for each element in the input, by performing a function on that input. Unlike filtering, which used a predicate to check, the map uses a function. The ma…

dhclient - Obtaining IP address dynamically

Whenever you want to obtain a dynamic IP address for your Linux/Unix machine from a DHCP server, you can use the dhclient utility.

Both DHCP requests and responses are UDP packets. If you use a sniffer to observe the request/response pattern, the following is what you might see. This trace was taken from dhclient running on FreeBSD 6.1.

Len SrcIP SrcMACAddr DestIP DestMACAddr Protocol
342 00:0c:29:c1:13:81 ff:ff:ff:ff:ff:ff UDP
62 00:50:56:ef:75:a8 ff:ff:ff:ff:ff:ff ICMP ping request
342 00:50:56:ef:75:a8 00:0c:29:c1:13:81 UDP
60 00:0c:29:c1:13:81 ff:ff:ff:ff:ff:ff ARP request

If you observe the source and destination (IP, MACAddr) patterns, it is easy to appreciate what happens. Here is what happens:

A DHCP request is sent on the network. Both destination MACAddr and IPAddr are broadcast addresses.
The DHCP server chooses an IP address t…

VMWare Server - Free to download

This news is a couple of weeks old, but I still thought it worth sharing. For all those VMWare fans, VMWare Server is now available for download free of cost. Earlier only VMWare Player was available for download. Download and try it from here.

You might wish to view all my previous posts on VMWare to learn a few more tricks.

Opening control panel from command prompt

How do you go to a tab in control panel directly from command prompt?

Under the system32 directory, you will find a few *.cpl files. Running these files will take you directly to the control panel tab corresponding to them. For example, appwiz.cpl will take you to the "Add/Remove Programs" tab in control panel.

You can get a complete list of these files from this Microsoft Knowledge Base article.

It saves a lot of time, especially when you are running in a less privileged account and want to switch to an admin account to install or uninstall programs.

A python script to monitor a process

Here is a python script to monitor a process' memory usage in Linux. This tool periodically prints how much heap the process has used so far. When you stop monitoring, the script prints out a summary along with a text graph that will help you understand the trend.

You can take a look at the script here. It might be useful and handy.
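
For readers who cannot reach the script, here is a rough sketch of the same idea. It is not the linked script: it assumes that the VmData line of /proc/<pid>/status is an acceptable stand-in for heap usage, and the names parse_vm_data_kb, text_graph and monitor are made up for this example.

```python
import time


def parse_vm_data_kb(status_text):
    """Return the VmData value (in kB) from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmData:"):
            return int(line.split()[1])
    return None


def text_graph(samples_kb, width=40):
    """Render one bar of '#' per sample, scaled to the largest sample."""
    peak = max(samples_kb) if samples_kb else 0
    bars = []
    for kb in samples_kb:
        length = (kb * width // peak) if peak else 0
        bars.append("%8d kB %s" % (kb, "#" * length))
    return "\n".join(bars)


def monitor(pid, interval_seconds=1.0, count=10):
    """Sample the process `count` times and return the readings (Linux only)."""
    samples = []
    for _ in range(count):
        with open("/proc/%d/status" % pid) as status_file:
            reading = parse_vm_data_kb(status_file.read())
        if reading is not None:
            samples.append(reading)
            print("VmData: %d kB" % reading)
        time.sleep(interval_seconds)
    return samples
```

A typical use would be to call monitor with the process id of interest and then pass the returned samples to text_graph to print the trend.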

Debugging by printing

There are two ways to debug a program. One is to study the source code and try to understand what the program is doing and try to figure out what could have gone wrong. This is a kind of static debugging. Doing a dry run of the program is at times overkill. The other is to let the program run and print its state information. We can reason out what might have gone wrong from the collected information.

Roughly speaking, state of a running program includes everything in the memory of the running program. But to make out what might have gone wrong, we only have to focus on what is really necessary.

When I say "let the program print its state information," it could be the modified program with a lot of print statements. Or it could be a debugger attached to the running process and printing the state information. It definitely helps to learn a few ways to print the state information. Here is an incomplete list of ways to print in various systems.

Language or Tool

Browsable HotSpot VM Source Code

With the JDK source code having been made open source, I wanted to go through it to figure out how things have been done. I found the following two links to be useful.
OpenGrok - A wicked fast source browser (and it lives up to its word!)
OpenJDK Subversion Trunk
The OpenGrok link is very good if you would like to search for something in the source code. If you have any better links, please do post them in the comments.

Deterministic destruction of Objects in Java - An idea

I love programming in C++, despite the hue and cry against it as a "most programmer unfriendly" language. I have my own reasons to like C++. The top two reasons are:
Bjarne was born in 1950, the year my father was born.
The initial C++ (C With Classes) was released in 1979, the year I was born. :-)
Nevertheless, there is one aspect of C++ that I find very useful and that definitely helps prevent resource leaks. It is the destructor of an object. In simple terms, whenever we exit a lexical block (a block enclosed within "{" and "}"), all the automatic objects constructed within that block are guaranteed to be destructed. The guarantee holds even when the exit happens due to a thrown exception.

This guarantee is a very strong weapon for any programmer. It makes destruction of an object a deterministic phenomenon. Objects created on the heap using the new keyword require the user to destroy them explicitly, and that's not what I am discussing here.

One of the g…

Notes on ObjectOutputStream.writeObject()

If you write the same object twice into an ObjectOutputStream using the writeObject() method, you would typically expect the size of the stream to increase by approximately the size of the object (and, recursively, all the fields within it). But that is not what happens.

It is critical to understand how the writeObject() method works. It writes an object only once into a stream. The next time the same object is written, it just notes down the fact that the object is already available in the same stream.

Let us take an example. We want to write 1000 student records into an ObjectOutputStream. We create only one record object, and plan to reuse the same record within a loop so that we save time on object creation. We will use setter methods to update the same object with the next student's details. If we use writeObject() to carry out this task, the changes made for all but the first student's record will be lost. (Go ahead and try the program given below)

To achieve the objec…

Software Development Best Practices Conference 2007

Last Friday I attended the Software Development Best Practices Conference 2007. It was an eventful day. There were two presentations that made me feel I got much more in return than what I paid for. They are "Better Software - No matter what" by Dr. Scott Meyers and "Securing Software Design and Architecture: Uncut and Uncensored" by Dr. Herbert Thompson. In the photo, I am seen with Dr. Scott Meyers. (Thanks to Abhishek Pandey from Intuit for the photo)

You can see the presentation slides of Dr. Scott Meyers in the SD Expo web site.

Other sponsored speakers discussed more about their companies and the products that they were advertising, which is quite understandable.

Dr. Thompson's speech was lively and full of information. He shared three incidents that happened in the past that drove him mad to believe that "bugs are everywhere" and security is the most critical aspect of any product. Of the three incidents, I loved the Bahamian Adventure …

Bjarne's Interview in MIT Technology Review

As refreshing and thought provoking as ever. I would strongly urge you to read Bjarne's interview (Part 1) completely. Of all the answers, I like the following one in particular.
Expressing dislike of something you don't know is usually known as prejudice. Also, complainers are always louder and more certain than proponents--reasonable people acknowledge flaws.
I believe this answer is true for our personal lives too, just as much as it is in the context in which it is presented here. The second part of the interview is available here.

What is robots.txt?

Before I explain the why and what of the robots.txt file, let me describe an incident that caught us off guard some time back. (For security reasons, I have excluded the names.)

We used to serve real-time/delayed quotes to our customers. One of the customers wanted to provide search functionality for their web site. Hence they bought an indexing/searching application which had a crawler at its core. The customer had a list of Top 10 Active stocks and their respective delayed quotes on their landing page. Some stocks would have rapid fluctuations in their prices. The crawler (which at that time we called a "stupid crawler," without knowing about the robots.txt file) started indexing the landing page as rapidly as the values changed. For each request, the customer's application server started sending ten quote requests to us. Lucky us! Our infrastructure was robust enough that our server didn't come down. But at the end of two days we had thousands and thousands of quote requests, which …

Learning a system and the use of profiler

Here is a question: When you are given a huge system with source code and asked to learn the system, where will you start? Think for a moment and answer.

My answer goes like this:
Run the system through a debugger, which would give you a fair idea about the system (where to start, what functions are called, etc.)
Run the system under truss or strace (or whichever tool is applicable to your platform), which will give you a very good idea of the resources the system is using (INI files, resource files, etc.)
Observe what functions are called while different functionalities are accessed in the system (what happens in my server after I click the "Submit" button, what is the function invocation sequence when I log in, etc.)
If you venture into studying the system by brute force, going through the source code at random points, you might waste time at unnecessary places. The activities mentioned above should help you at least which piece of source code you should look a…

System.identityHashCode() - What is it?

Today I learnt about a function called System.identityHashCode(). To understand where it is used, let us consider the following program.
// What will be the output of toString() if we override hashCode() function?
public class HashCodeTest {

    public int hashCode() { return 0xDEADBEEF; }

    public static void main(String[] argv) {
        HashCodeTest o1 = new HashCodeTest();
        HashCodeTest o2 = new HashCodeTest();

        System.out.println("Using default toString():");
        System.out.println("First: " + o1);
        System.out.println("Second: " + o2);

        System.out.println("Using System.identityHashCode():");
        System.out.println("First: " + System.identityHashCode(o1));
        System.out.println("Second: " + System.identityHashCode(o2));
    }
}
This program overrides the function hashCode(), which is perfectly legal. As a result of this, you cannot find out the real identity of the object as it would be printed in the default toString() method. The output turns out t…