C++ is bad: problems with the ternary operator

8 April 2008, 11:54 PM

In today’s installment of “why not to program in C++,” I give you the following quiz, which Dustin, Steve, Tom, and I had to figure out today (Dustin’s code was doing the weirdest things, and we eventually traced it down to this):

Suppose you start out with the following code:

class Argument;
Argument x;
void Foo(const Argument& arg);
bool test;

You can assume that all of these are defined/initialized elsewhere in the code. For each pair of code snippets below, decide whether the two snippets are equivalent to each other.

#	Code Snippet A	Code Snippet B
1	if (test) Foo(x); else Foo(Argument());	Foo(test ? x : Argument());
2	{ /* limit the scope of y */ Argument y; Foo(test ? x : y); }	Foo(test ? x : Argument());
3	Foo(x);	Foo(test ? x : x);
4	Foo(x);	Foo(true ? x : Argument());

Edit: what I meant by the curly braces in Question 2 is that you shouldn’t consider “y is now a defined variable” to be a significant difference between the two snippets.

Don’t read further until you think you have the answers.

Have you decided which are the same? Good.

Only the third pair is equivalent. Are you surprised? I certainly was! Here’s what’s going on:

Foo() takes its Argument by const reference. When you pass it an lvalue (loosely, a named object with an address of its own), the reference refers directly to that object. The lvalues here are x and y, but not Argument(), which is a nameless temporary value; a reference to it has to refer to a temporary object that the compiler creates for the call.

When the ternary operator (the ?: syntax) operates on two lvalues of the same type, the result is another lvalue. However, when either operand is not an lvalue, the result isn’t one, either: it is a temporary value. If the operand that gets selected is an lvalue such as x, the copy constructor is invoked to copy it into that temporary, and Foo() receives a reference to the copy rather than to x itself.

So now, justifications for the answers:

1. If test is true, code snippet A does not call the copy constructor, while snippet B does (since the ternary operator does not return an lvalue here, x has to be copied into a temporary location). If the copy constructor for Argument has side effects, the behavior of the snippets will differ. If the copy constructor does something unusual (for instance, it does not copy a certain member variable, or it resets the value of some internal state in the copy), Foo() will operate on different data in the two snippets (in B, it would operate on the new, uncopied member variable and the reset/reinitialized state, rather than x’s version). Moreover, the location of the object passed into Foo() is different (one is x itself, while the other is a copy of x, stored somewhere else). It’s unlikely that Foo() will change its behavior based on the location of arg, but you never know. Note that if test is false, the copy constructor is not called in either snippet: even though Argument() is not an lvalue, the compiler can construct the new object directly in the temporary location without using the copy constructor.
2. Again, when test is true, the copy constructor is invoked in snippet B but not in A. In snippet A, both operands of the ternary operator are lvalues, so it returns an lvalue, which can be used directly by Foo(). This is not the case in snippet B, where the copy constructor needs to be invoked. This has the same issues as Problem 1. Moreover, if test is true, only snippet A invokes Argument’s default constructor and destructor (which might have side effects of their own; in an extreme case, the constructor could change the value of test itself, so that one snippet passes a newly constructed Argument to Foo() while the other passes x or a copy thereof). Edit: also, if Argument is POD, y will be uninitialized in snippet A, so when test is false snippet A will operate on uninitialized data, while snippet B will operate on data that has been zeroed out, because the empty parentheses in Argument() value-initialize the object. Just as before, the snippets have the same behavior if (edit: Argument is not POD and) test is false (both snippets call the default constructor, both call the destructor, and neither calls the copy constructor).
3. These really are the same. Since both operands of the ternary operator are lvalues, the result is an lvalue, and the copy constructor is not used.
4. Again, we have the same problems with the copy constructor being invoked in snippet B. Note that even in an optimized build, the copy constructor is still used! The test at the start of the ternary operator and the code to call the default constructor if the test turned out false are removed, but the copy constructor is still used in case you’re relying on one of the differences mentioned above.

This is yet another way in which C++ can have weird issues that are really hard to debug. If you are a fan of C++, please consider using a different (read: modern, high-level) language. Both Java and Python only give you objects by reference, so the copy constructor would not be called in any of the above cases, which, for me at least, adheres more closely to the Principle of Least Surprise. The curmudgeons out there will want me to note that Java and Python do pass-by-value (not pass-by-reference, as you may have misinterpreted from my previous sentence), but the values themselves are references to the data stored in the objects, so they’re passing-by-value the references to the data. And yes, Python doesn’t really have a copy constructor, but that’s beside the point.

I realize that sometimes you need the speed available in C++, but there are a lot of times when it’s OK to be 2-3 times slower, and in those times you should use a language like Java (or Python, if you can stand being a bit slower than that). Remember that my Java runs just as fast on a new computer as your C++ does on a 2-year-old computer. It’s not that big a performance hit.

Edit: See the addendum for another unexpected issue with the ternary operator.

Tags: code, computer science, cpp, cpp is bad, software, ternary, ternary operator, work
Category: best of the blog, C++ Is Bad, Computer Science & Coding

14 Comments

macdaddyfrosh says:

9 April 2008 at 9:16 PM

Why not just avoid the ternary operator? It’s already harder to read.

- Alan says:
  
  9 April 2008 at 10:35 PM
  
  > It’s already harder to read.
  
  Really? I think it makes things much easier in the right situations. Consider these two snippets, where log_file is a pointer:
  if (log_file)  // If the log file exists (it isn't NULL)
    WriteErrorToLog(error_message, info_about_state, log_file);
  else
    WriteErrorToLog(error_message, info_about_state, default_log_file);
  
  versus
  WriteErrorToLog(error_message, info_about_state, (log_file ? log_file : default_log_file));
  
  I personally think the second one is much more readable because it gets rid of the duplicated code.
  
  - macdaddyfrosh says:
    
    9 April 2008 at 10:44 PM
    
    I still think the first one is more readable; besides, doesn’t the style guide tell you not to use ternary? ;-)
    
    - Alan says:
      
      10 April 2008 at 6:30 AM
      
      Nope, it’s ternary agnostic. :-)
      
  - dhalps says:
    
    10 April 2008 at 6:44 AM
    
    No, the write answer here is actually a #define or a function that encapsulates that if statement.
    
    - dhalps says:
      
      10 April 2008 at 6:45 AM
      
      write, right. Yeah. Don’t right that code :).
      
      - macdaddyfrosh says:
        
        10 April 2008 at 3:11 PM
        
        Instead, Wrong it.
        
  - jcmdev0 says:
    
    10 April 2008 at 8:03 AM
    
    Thinking about what the compiler would do with the constructor, I expect you’d get something on the order of
```
Argument temp;
if (test)
  temp = x;
else
  temp = Argument();

Foo(temp);
```
    Or something of that nature. I can’t think of what might happen if they are different types, but that seems like it would be asking for trouble.
    
    Also, what is the scope of log_file? If you are going to be writing to the default anyway you could set log_file to default_log_file, and if you wanted to guard some of the writes to not be written if it was the default you could test log_file against default_log_file.
    
    I’m not sure that this qualifies as fodder for a C++ vs other high level languages debate. Fwiw, C++ is getting revved in the near future.
    
  - Anonymous says:
    
    10 April 2008 at 9:29 AM
```
if(!log_file)
    log_file = default_log_file;
WriteErrorToLog(error_message, info_about_state, log_file);
```
janna says:

9 April 2008 at 9:20 PM

I’m just glad 3 is equivalent. If that failed I’d probably have to change careers :)

Anonymous says:

10 April 2008 at 10:34 AM

It’s not the ternary operator. It’s the temporary object!

Like you, I’m a big fan of ?: for its ability to reduce a 4-line if/then/else block to a one-liner. But I think in this case ?: is just complicating the fact that you’re using temporary objects. They’ve confused me in C++ for a while.

What are they? Well, compare the following:

Code 1)
```
if( condition ) {
  Argument y;  // constructor called
  Foo( y );
  z = 1;
}  // destructor called
```
Code 2)
```
if( condition ) {
  Foo(Argument()); // action happens here
  z = 1;
}
```
In Code 1, it’s clear where the constructor for object y is called, and its destructor is called when we exit the block of code controlled by the if(). But in Code 2, an instance of Argument is created and destroyed too, but where? I’d guess it’s constructed after the first closing parenthesis on the indicated line, and destroyed after the semi-colon on the same line. But it’s a slippery object. The compiler creates it and destroys it for you, but you can’t really touch it – it doesn’t even have a name.

It’s even more confusing when a function returns temporary objects. Just go look at this (Q1 & Q2). I still don’t fully understand the line “binding a temporary object to a reference to const on the stack lengthens the lifetime of the temporary to the lifetime of the reference itself”.

To me it feels like the C++ compiler is doing *some* automatic object lifecycle management, but not going all the way. Whenever you only partially implement an idea, there always seem to be gotchas.

leonardo_m says:

10 April 2008 at 11:20 AM

> This is yet another way in which C++ can have weird issues that are really hard to debug. If you are a fan of C++, please consider using a different (read: modern, high level) language. Both Java and Python only give you objects by reference, so the copy constructor would not be called in any of the above cases, which, for me at least, adheres more closely to the Principle of Least Surprise.

Or you can use the D language, which can be used at as low a level as C, and which always gives you objects by reference, like Python.

Anonymous says:

10 April 2008 at 12:04 PM

It’s merely a compiler issue

What compiler did you use with these code snippets? I’ve just checked the first one with mingw, digital mars and VC++ 9 express, and only the first one works the way you’ve explained.

- Alan says:
  
  12 July 2008 at 10:49 PM
  
  Re: It’s merely a compiler issue
  
  I’m using GCC. Are you sure you typed up the example correctly? Everything I described is detailed in Section 5.16 of the C++ Standard.
  
