Sunday, 7 November 2010

Has-a relationship: reference, pointer or value? ( 1/2 )



You are going to implement a has-a relationship between two classes. What is the best way to model this relationship: by reference, by pointer or by value?


By Value
In general this is the preferable way to implement a has-a relationship. The main advantage is that the owner has complete ownership of the object. Let's have a look at this small chunk of code:
    class Component
    {
        public:
            /* some stuff */
        private:
            /* some other stuff */
    };
    class Owner
    {
        public:
            // explicit: no implicit Component -> Owner conversion
            explicit Owner(const Component& component)
                :component_(component) {}
        private:
            Component component_; // stored by value: Owner owns its own copy
    };
So far so good. Every time we create a relationship between two objects we tie the behaviour of one object to the interface of the other. I will use this simple example to show how the Owner class has to change depending on the Component interface.
The constructor of Owner takes a const reference to a Component object. The keyword const is important because we don't, and we shouldn't, change the object that we are going to copy and store in our class. Marking the constructor explicit avoids undesired automatic conversions (have a look at the Strongly Type post for more information).

The first drawback of storing the component by value is performance: we are copying an object, so if the Component object is big we spend a lot of time copying it. Please note that an object can be arbitrarily big. For example, a std::vector can be very, very, very big.
    class Component
    {
        private:
            /* some stuff */
            // declared private and never defined: copying is disallowed
            Component(const Component& other);
            Component& operator=(const Component& other);
        public:
            /* some other stuff */
    };
The author of Component is a good developer and knows that copying a very, very, very big object is bad, so he decided to disallow copy semantics (as above).

Now we have our first problem: the Owner class will not compile! So the sad moral of the story is: we cannot implement a has-a relationship by value if the component object has no copy semantics.
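The same restriction can be checked at compile time. The following is only a sketch, assuming a C++11 compiler (newer than the code in this post): it uses `= delete` instead of private declarations, and the two classes are simplified stand-ins for the ones above.

```cpp
#include <type_traits>

// Hypothetical C++11 variant of the non-copyable Component.
class Component
{
    public:
        Component() = default;
        Component(const Component&) = delete;            // no copy construction
        Component& operator=(const Component&) = delete; // no copy assignment
};

// Owner stores a Component by value, so the compiler cannot generate
// copy operations for Owner either.
class Owner
{
    private:
        Component component_;
};

static_assert(!std::is_copy_constructible<Component>::value,
              "Component cannot be copied");
static_assert(!std::is_copy_constructible<Owner>::value,
              "so an Owner holding it by value cannot be copied either");
```

A constructor such as Owner(const Component&) that tries to initialise component_ from its argument would be rejected by the compiler for the same reason.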


This is not the worst-case scenario. Let's take this Component interface as an example:
    class Component
    {
        public:
            /* some stuff */
            // note: the parameters are NOT const
            Component(Component& other);
            Component& operator=(Component& other);
        private:
            /* some other stuff */
    };
This time copying is allowed, but the keyword const has been removed. Why? I don't know: people are different, and so are developers. Removing the const keyword might give you a very small performance advantage (do you really need it?), but it can spoil the day of another developer. How? This example explains it:
    Component comp;
    /* some calculation */
    /* comp is very important! My whole program
       is based on it! */
    Owner o(comp);

    /* Arggggg.... Owner can change it! */
Changing an object that you take as an input parameter is like scratching your friend's car. You shouldn't do that.
So: we can, but we shouldn't, implement a has-a relationship by value if the component object has non-const copy semantics.
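To make the danger concrete, here is a minimal sketch. The Component below is hypothetical: its copy constructor resets the source object, which is one thing a non-const copy constructor is allowed to do (std::auto_ptr behaves in a similar spirit). The helper valueLeftInCaller() is also made up for the illustration.

```cpp
#include <cassert>

// Hypothetical Component whose "copy" modifies the source object.
class Component
{
    public:
        explicit Component(int value) : value_(value) {}
        Component(Component& other) : value_(other.value_)
        {
            other.value_ = 0; // legal, because other is not const
        }
        int value() const { return value_; }
    private:
        int value_;
};

class Owner
{
    public:
        // Forced to take a non-const reference: a const Component
        // cannot be passed to Component(Component&).
        explicit Owner(Component& component) : component_(component) {}
        int value() const { return component_.value(); }
    private:
        Component component_;
};

// Returns what is left in the caller's object after building an Owner.
inline int valueLeftInCaller()
{
    Component comp(42);
    Owner o(comp);
    assert(o.value() == 42); // Owner received the data...
    return comp.value();     // ...and the caller's object was reset to 0
}
```

Nothing in the line `Owner o(comp);` warns the caller that comp is about to change, which is exactly the problem.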


Actually, we haven't finished our analysis yet. There is another pitfall to avoid. It is a well-known feature of C++ that if you don't provide a constructor, destructor, copy constructor or assignment operator, the compiler provides a version for you. You may think: cool, less work. Actually there is a huge drawback. The compiler is smart, but it is not a mind reader. Even the smartest compiler cannot reproduce the copy semantics that you have in mind! You need to code them, otherwise the compiler does the best it can, which is a memberwise copy: every member is copied with its own copy semantics, and for built-in types and raw pointers that is just a bitwise copy of the value. To show all the possible outcomes I have prepared a small program. Even though the program is completely meaningless, its output is exactly what we want:
    #include <iostream>
    #include <string>
    using namespace std;
    class Component
    {
        public:
            /* some stuff */
            Component(int numberOfValue,string myString) :
                i_(new int[numberOfValue]),
                numberOfValue_(numberOfValue),
                myString_(myString)
            {
                changeValue(10);
            }

            // No user-defined copy constructor or assignment operator:
            // the compiler-generated ones copy i_ as a plain pointer.
            ~Component() { delete[] i_; }

            int getNumberOfValue() const { return numberOfValue_; }

            string getMyString() const { return myString_; }

            void printArray() const
            {
                for (int j=0; j < numberOfValue_; j++)
                    cout << i_[ j ] << " " ;
                cout << " Data stored by pointer" << endl;
            }

            void modify(int newNumberOfValue,string anotherString)
            {
                delete[] i_;
                i_ = new int[newNumberOfValue];
                numberOfValue_ = newNumberOfValue;
                myString_ = anotherString;
                changeValue(100);
            }
        private:
            void changeValue(int leverage)
            {
                for (int j=0; j < numberOfValue_; j++)
                    i_[ j ] = leverage*j;
            }
            /* some other stuff */
            int* i_;
            int numberOfValue_;
            string myString_;
    };
    class Owner
    {
        public:
            Owner(Component& component)
                :component_(component)
            {
                // component_.i_ still points to the caller's array here,
                // so modify() deletes memory that comp is still using
                component_.modify(component_.getNumberOfValue(), "Bar");
            }

        private:
            Component component_;
    };
    int main (int argc, char *argv[])
    {
        //numberOfValue equal to 10
        Component comp(10,"Foo");
        cout << comp.getNumberOfValue() << " Built-in type "  << endl;
        cout << comp.getMyString() << " Object with deep-copy semantic" << endl;
        comp.printArray();

        //Make a copy of Component!
        //Using the compiler automatic
        //generated copy constructor
        Owner o(comp);
        cout << comp.getNumberOfValue() << " Built-in type "  << endl;
        cout << comp.getMyString() << " Object with deep-copy semantic" << endl;
        //WARNING: comp.i_ now dangles, this call is undefined behaviour
        comp.printArray();
        return 0;
    }
The output is:
10 Built-in type
Foo Object with deep-copy semantic
0 10 20 30 40 50 60 70 80 90  Data stored by pointer
10 Built-in type
Foo Object with deep-copy semantic
0 100 200 300 400 500 600 700 800 900  Data stored by pointer



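As you can see, the built-in type is copied correctly, and so is the string object. The array is doing something odd: its values have changed after the creation of the Owner object. The reason is that the compiler makes a shallow copy of the pointer: it copies just the pointer and not the data it points to. So you end up with two entities sharing the same data. Sometimes this is what you want, but very often it is not. It is even worse than it looks: modify() deletes the shared array, so comp is left with a dangling pointer and the same memory can be deleted twice at exit. Strictly speaking the behaviour of this program is undefined, and the output above is simply what one run happened to print.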



So: we shouldn't implement a has-a relationship by value if the component object doesn't have deep-copy semantics. If we really want to do that, we need to document our choice carefully, because it is not the behaviour that the clients of our class will expect.
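For comparison, this is a minimal sketch of a Component with deep-copy semantics. It is a simplified stand-in for the class above, not a drop-in replacement: the names data_, size_, fill() and at() are made up for the example, and the assignment operator uses the copy-and-swap technique.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Component
{
    public:
        explicit Component(std::size_t size)
            : data_(new int[size]), size_(size)
        {
            fill(10);
        }
        // Deep copy: allocate a new array and copy every element.
        Component(const Component& other)
            : data_(new int[other.size_]), size_(other.size_)
        {
            std::copy(other.data_, other.data_ + size_, data_);
        }
        // Copy-and-swap: build a deep copy, then exchange contents with it.
        Component& operator=(const Component& other)
        {
            Component tmp(other);
            std::swap(data_, tmp.data_);
            std::swap(size_, tmp.size_);
            return *this; // tmp's destructor releases the old array
        }
        ~Component() { delete[] data_; }

        void fill(int leverage)
        {
            for (std::size_t j = 0; j < size_; ++j)
                data_[j] = leverage * static_cast<int>(j);
        }
        int at(std::size_t j) const { return data_[j]; }
    private:
        int* data_;
        std::size_t size_;
};

// Modifying the copy must leave the original untouched.
inline bool originalSurvivesCopyModification()
{
    Component original(10);   // 0 10 20 ...
    Component copy(original); // owns its own array
    copy.fill(100);           // 0 100 200 ...
    assert(copy.at(1) == 100);
    return original.at(1) == 10;
}
```

With a Component like this, an Owner that stores it by value really owns an independent object, and each array is deleted exactly once.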

Thursday, 4 November 2010

Using RAII idiom for profiling


RAII is a well-known C++ idiom. The idea is simple and great at the same time. Two things in life are certain: birth and death. If you are wondering about taxes, please don't forget that I am Italian and in my country taxes are not mandatory.
In the object-oriented world birth means constructor. The constructor is the first function called when we create an object, so it is a wonderful place to acquire a resource. Death means destructor: when an object goes out of scope its destructor is always called (of course, if you leak the object, for example by never deleting it, the destructor is not called). That's great: the destructor is the perfect place to release the resource.
Check this link for more information on RAII. What I am going to show you is a very simple class that implements this idiom to get profiling information.
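As a minimal sketch of the idea, the hypothetical Guard class below "acquires" in its constructor and "releases" in its destructor. It only records the events in a log, so the order in which scopes are entered and left is easy to see. Guard and runScopes() are names invented for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical guard: stands in for any resource holder.
class Guard
{
    public:
        Guard(const std::string& name, std::vector<std::string>& log)
            : name_(name), log_(log)
        {
            log_.push_back("acquire " + name_); // birth
        }
        ~Guard() { log_.push_back("release " + name_); } // death
    private:
        Guard(const Guard&);            // not copyable
        Guard& operator=(const Guard&);
        std::string name_;
        std::vector<std::string>& log_;
};

inline std::vector<std::string> runScopes()
{
    std::vector<std::string> log;
    {
        Guard outer("outer", log);
        {
            Guard inner("inner", log);
        } // inner is released here
    }     // outer is released here
    assert(log.size() == 4);
    return log;
}
```

The log ends up as: acquire outer, acquire inner, release inner, release outer. Objects are destroyed in the reverse order of their construction, without a single explicit release call.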

Profiling
If you need to make your application go faster, you need to know where it is actually performing worst. DON'T SPECULATE! Developers are not good at guessing where an application is slow; if they were, there wouldn't be slow applications.
Profiling is tedious: you need to put extra debug output everywhere, check the results and, at the end, remove it all from the code (of course you can use a profiling tool, but sometimes the learning curve is so steep that the trivial way is better).
The RAII idiom comes to the rescue. Have a look at this class:
    // PROFILER_H_ rather than _PROFILER_H_: names starting with an
    // underscore and a capital letter are reserved for the implementation
    #ifndef PROFILER_H_
    #define PROFILER_H_
    #include <string>
    #include <iostream>
    #include <sys/time.h>
    using std::string;
    using std::ostream;
    namespace BitingCpp
    {
        class Profiler
        {
            public:
                Profiler(string what,
                         ostream& out = std::cout,
                         bool verbose = false);
                ~Profiler(); /* throw() */
            private:
                // declared but not defined: copies are not allowed
                Profiler(const Profiler& rhs);
                Profiler& operator=(const Profiler& rhs);
                string what_;
                timeval start_;
                timeval end_;
                ostream& out_;
                bool verbose_;
        };
    } //namespace BitingCpp
    #endif

As you can see, the class provides only a constructor and a destructor. Copies are not allowed. There are some private members, and this is the implementation file:

    #include "Profiler.h"
    using std::endl;
    namespace BitingCpp
    {
        Profiler::Profiler(string what,ostream& out,
                                    bool verbose) :
                                    what_(what), start_(),
                                    end_(),out_(out),verbose_(verbose)
        {
            gettimeofday(&start_,0);
            if (verbose_)
                out_ << "Begin of " << what_ << " at " << start_.tv_sec
                     << " s " << start_.tv_usec << " us" << endl;
        }

        Profiler::~Profiler() /* throw() */
        {
            try
            {
                gettimeofday(&end_,0);
                if (verbose_)
                    out_ << "End of " << what_ << " at " << end_.tv_sec
                         << " s " << end_.tv_usec << " us" << endl;

                double timems =
                  ((static_cast<double>(end_.tv_sec - start_.tv_sec))
                  *1000.0 + ((end_.tv_usec - start_.tv_usec)/1000.0));
                out_ << what_ << " running time: "
                     << timems  << " ms " << endl;
            }
            catch(...) {} // to avoid core dump during the stack unwind!
        }
    } //namespace BitingCpp
The constructor stores the creation timestamp (and prints an optional statement); the destructor does a simple calculation and prints the result.

This small program shows how to use this class:


    #include <iostream>
    #include "Profiler.h"
    int main (int argc, char *argv[])
    {
        BitingCpp::Profiler mainProfile(__FUNCTION__);

        BitingCpp::Profiler* justFirstLongLoop =
           new BitingCpp::Profiler("Just First Long Loop");

        // unsigned: i*i wraps around instead of overflowing a signed int
        unsigned int j = 0;
        for (unsigned int i=0; i < 10000000; i++)
            j += i*i;

        delete justFirstLongLoop; // trigger the destructor
        BitingCpp::Profiler anotherBigLoop("Another Big Loop");
        j = 0;
        for (unsigned int i=0; i < 10000000; i++)
            j += i*i;


        return 0; // anotherBigLoop goes out of
    }             // scope first and then mainProfile



The output of this program is something like this:
Just First Long Loop running time: 52.852 ms
Another Big Loop running time: 52.435 ms
main running time: 105.829 ms
Now it is clear why copies are not allowed: what would copying a Profiler object mean? The Profiler keeps track of its date of birth (actually a timeval structure). If I copy a Profiler object, which date of birth should the new object have? The same as the first one, or the timestamp of the copy operation? Too many questions, and furthermore I cannot see the point of copying a Profiler object in the first place.
Note: the code works on UNIX platforms, because gettimeofday and <sys/time.h> come from POSIX. You can easily change the implementation file to use Windows functions. The __FUNCTION__ macro is a compiler extension rather than part of the standard, but it is widely supported; the standard spelling is __func__ (C99 and C++11).
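If a C++11 compiler is available, the platform-specific part can be avoided altogether. The following is only a sketch of that alternative, not the class from this post: ChronoProfiler and profileToString() are hypothetical names, and std::chrono::steady_clock replaces gettimeofday.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <sstream>
#include <string>

class ChronoProfiler
{
    public:
        explicit ChronoProfiler(const std::string& what,
                                std::ostream& out = std::cout)
            : what_(what), out_(out),
              start_(std::chrono::steady_clock::now()) {}

        ~ChronoProfiler()
        {
            try
            {
                // elapsed time converted to fractional milliseconds
                const std::chrono::duration<double, std::milli> elapsed =
                    std::chrono::steady_clock::now() - start_;
                out_ << what_ << " running time: "
                     << elapsed.count() << " ms" << std::endl;
            }
            catch (...) {} // never let an exception escape a destructor
        }

        // copies are not allowed, as for the original Profiler
        ChronoProfiler(const ChronoProfiler&) = delete;
        ChronoProfiler& operator=(const ChronoProfiler&) = delete;
    private:
        std::string what_;
        std::ostream& out_;
        std::chrono::steady_clock::time_point start_;
};

// Profiles an empty scope and returns what the profiler printed.
inline std::string profileToString(const std::string& what)
{
    std::ostringstream out;
    {
        ChronoProfiler p(what, out);
    } // p is destroyed here and writes its report to out
    assert(out.good());
    return out.str();
}
```

steady_clock is monotonic, so the measurement is not disturbed if the system time is adjusted while the profiled code runs.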