Friday 29 October 2010

Strongly Type



Does this program compile?

  #include <iostream>
  #include <string>
  //in a dusty corner of your library
  #define false 1
  #define true 0
  using std::string;
  using std::endl;
  using std::cout;
  void foo(bool aBool,bool bBool,const string& aString,const string& bString)
  {
      cout << "aBool ="   << aBool << endl;
      cout << "bBool ="   << bBool << endl;
      cout << "aString =" << aString << endl;
      cout << "bString =" << bString << endl;
  }
  int main (int argc, char *argv[])
  {
      foo(false,true,"A string","B string");
      foo(false,"A string",true,"B string");
      foo("A String",false,true,"B string");
      return 0;
  }

I have tried different compilers and I haven't got any error or warning. As you can see, a function that takes two bools and two const references to string can be fed with the parameters in any of these orders and the compiler will never complain.


If you run it you will get an output like this:


aBool =1
bBool =0
aString =A string
bString =B string
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_S_construct NULL not valid


Are you surprised? I was when I bumped into this error a few days ago. Let's try to understand what is going on and why this meaningless code compiles fine.
Let's start with bool. A string literal such as "A string" is an array of const char held in static storage; when it is passed as an argument it decays to a const char* pointing at its first character.
C++ defines a standard conversion from any pointer to bool: a null pointer becomes false and every other pointer becomes true. So it is not surprising that a string literal is silently accepted where a bool is expected, and because the address of a literal is never null, the result is always true.


Let's face the hard bit now. The std::string class has these constructors available:
  1. string ( );
  2. string ( const string& str );
  3. string ( const string& str, size_t pos, size_t n = npos );
  4. string ( const char * s, size_t n );
  5. string ( const char * s );
  6. string ( size_t n, char c );


Let's exclude n. 1, 3, 4 and 6 because none of them can be called with exactly one argument, so they are not suitable candidates for the implicit conversion. We need to choose between the copy constructor (const reference to string) and the overload of the constructor that takes a const char*.


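Who is guilty? If you have sharp eyes, you have already noticed that we substituted the string parameter with 0 (remember #define true 0). It cannot be the copy constructor: its parameter is a reference to an existing string, and 0 is not a string, so a string would have to be built from 0 first anyway. The literal 0, however, is a null pointer constant and converts implicitly to const char*. So the compiler picks the const char* constructor and hands it a null pointer, which is exactly what the exception message complains about at run time. We can test this theory with false, which we defined as 1 and which is therefore not a null pointer constant. If we write: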


  foo(false,"A string",false,"B string");


The compiler will spot the error:

StronglyType/main.cpp(0,0): Error: invalid conversion from ‘int’ to ‘const char*’ (StronglyType)

The same problem occurs if the function foo has a different signature that accepts the strings by value instead of by reference.


The moral of the story is: don't feed a function with string literals or constants; define typed variables before calling the function.



This code fails to compile, which is exactly what we want:


  int main (int argc, char *argv[])
  {
      bool false_(false);
      bool true_(true);
      string aString("A string");
      string bString("B string");
      foo(false_,aString,true_,bString);
      return 0;
  }


I know, you are thinking: hang on, you are telling me that I need to type 5 lines of code instead of one!


Well, writing 4 extra lines of code will take 30 seconds of your life. Losing the type-safety checks that the compiler guarantees can cost you hours of nasty debugging. Your choice.



Last but not least, the big question. Does this happen only with string, or with other classes too?


  class Bar
  {
      public:
          // explicit: a const char* is never converted to a Bar behind your back
          explicit Bar(const char* value) : value_(value) {}
          Bar(const Bar& other) : value_(other.getValue()) {}
          string getValue() const { return value_; }
      private:
          string value_;
  };
  void foo(bool aBool,bool bBool,const Bar& aBar,const Bar& bBar)
  {
    // some clever stuff
  }
  int main (int argc, char *argv[])
  {
      foo(false, "A Bar", true, "B Bar");
      return 0;
  }


This code does not compile. The explicit keyword in front of the one-parameter constructor forbids the implicit conversion and causes a compilation error. If you remove the keyword, we get the same behaviour as with std::string.


For more information about the explicit keyword I recommend this link: http://msdn.microsoft.com/en-us/library/h1y7x448.aspx


Note: In order to compile it with gcc version 4.4.3 I added the two #define lines at the top. Other compilers are not so smart and the code will compile even without the two macros (by the way, a good reason not to use macros to define constants). If you run the example with a different compiler, let me know whether the program also compiles without the two #define lines.