A Place Where No Dreams Come True...

Current Topic: The C programming language provides a very useful set of 'operators' for manipulating data and program flow based on basic math, logic and assignment operations.

The Assignment Operator...

There is only one way to move data from one location to another. Whether the data represents a single value or a structure of thousands of bytes, the assignment operator '=' is used (plain arrays are the one exception; they must be copied element by element or with a function like memcpy())...

a = b;

From the compiler's perspective the statement translates to... Assign (move) the data from source 'b' to the destination location 'a'. Programmers, however, will commonly read this statement as 'a equals b'.

Math Operators...

The basic math operators '+ - * / %' (add, subtract, multiply, divide, modulus) are all that is provided; that's all there was at the time. Additionally, math operations depend on the source and destination data sizes and types and on whether or not they represent 'signed' values.

A misunderstanding of basic math and/or of the size limits of the data used as input and output terms can lead to overflow or underflow and cause indeterminate program behavior. Good programs account for this: they bounds-check their inputs and handle error results when overflow occurs.

Comparison Operators...

To compare data the operators '< <= > >= == !=' (less than, less than or equal, greater than, greater than or equal, is equal to and not equal to) are provided. Notice here the distinction made for the 'double-equal' as 'is equal to' versus the 'single-equal' as an 'assignment', which is not a comparison. A good compiler will recognize an assignment written where a comparison was likely intended and complain.

The Negation Operator...

Actually... There are two operators used for negation. The '!' is a 'logical' operator while '~' is used for 'bitwise' inversion, or complementing, of each bit in the input term.

There is a critical distinction between these two that is essential for all programmers to understand. Not only does this difference ripple down to the processor and how it implements the logic being executed, but programmers can make use of it to simplify data manipulation. It also applies to all 'logic' operations.

Logically... At the processor level, operations act on all bits of the input terms. The single distinction for 'logical' operators is that each input term is first compared to '0', where '0' means false (NULL) and anything except 0 means true, so the output term is always either '1' (true) or '0' (false). In all other cases the operation is considered 'mathematical' or 'bitwise' and all bits of every term (input and output) contribute to the result.

Consider the binary example...

b = 0101;

a = !b; In this case 'a' equals 0 'logically'.

a = ~b; In this case 'a' equals 1010 'bitwise' (each bit inverted).

Logic Operators...

Logic... That's where it all began. In the old days we used logic elements, built in hardware, to translate input requests to output actions. Processors are still (fundamentally) composed of zillions of these elements. They were called gates. There were 'and', 'or', 'exclusive-or' and 'invert'; invert is commonly referred to as 'not'. Every logic (binary) device is based on this simple 'DNA' and its logical derivatives.

To try and simplify basic logic... An 'and' result is only true when all inputs are true; otherwise it is false. An 'or' result is true when any input is true and false when all inputs are false. Notice the opposite behaviors. The 'exclusive-or' result is false when both inputs are the same (both true or both false) and true for any difference. These combinations, along with the ability to invert any input or output using the 'not' (a.k.a. 'negation') operator, are the primary rules that all digital devices obey. To simplify even further, it actually takes only a single gate type combined with an inverter (a 'nand' or 'nor') to generate all the other derivatives. However, because of the additional propagation steps that would require, most gates (at the circuit level) have been optimized into a common set that still makes up the building blocks processors and software operate on.

The C language presents two methods for a program to consider when operating 'logically' on data. The first, '& | ^', specifies 'bitwise' operations (and, or, exclusive-or). The second, '&& ||', specifies 'logical' operations (and, or).

There is a very subtle difference here, and it is one of those easy mistakes (bugs) to make when coding. Basically, the bitwise operators (&, |, ^) act as if there were n gates, where n is the number of bits in the source terms (8, 16... 128), and the result yields one output bit for each pair of input bits. This method is used to manipulate (select) certain bits within a larger data type, most commonly for setting and checking flag bits and perhaps determining program flow based on their states. For example...

a = 10100101 & 01001101; Result a is 00000101.

a = 10100101 | 01001101; Result a is 11101101.

a = 10100101 ^ 01001101; Result a is 11101000.

The second method, '&& ||', is the more confusing one. The compiler first reduces both input terms to Boolean values; basically, it compares each input term to '0' (yielding a single Boolean result) before the logic operation. For example...

a = b && c;

To the compiler this statement translates to...

a = (b != 0) & (c != 0);

Which reads... If any bit in term 'b' is set and any bit in term 'c' is set then result 'a' is true (1); otherwise result 'a' is false (0). One caveat to that translation: the real '&&' and '||' also short-circuit, meaning the right term is not evaluated at all when the left term alone decides the result.

Shift Operators...

The final logic operators '<< >>' shift data (left-shift and right-shift). Unfortunately... Part of their behavior is defined as 'implementation dependent'. To explain... Originally, processors did not possess the ability to multiply or divide in a single instruction. Even though it was well understood how, there just wasn't enough silicon left to add a feature that requires an algorithm built from the primitive logic available at the time. Hence co-processors. Instead, software was utilized, shifting in conjunction with addition or subtraction to either multiply or divide. It should be noted that the C language provides only one method to shift data where most processors typically provide three derivatives with very subtle but very significant differences. That's where the ambiguity of 'implementation dependent' comes in. As far as I know... The C Reference does not define exactly which derivative 'must' be used, probably because not all processors implement all of them and they can be simulated with proper coding.

Basically... Most processors implement 'arithmetic shift', 'logical shift' and 'rotate' instructions. For arithmetic shifts the sign (most significant) bit is preserved; in other words, negative numbers remain negative during shifting, as positive numbers remain positive. Logical shifts do not preserve the sign bit, so shifting a negative number will have unexpected results. The rotate method allows cascading shift operations for handling data that is larger than the native data-bus width, but it has a processor dependency that cannot be easily abstracted from the hardware, so it is ignored in the C world.

So... C provides a shift operator which may be 'logical' or may be 'arithmetic' (the choice matters only when right-shifting signed values). Additionally it provides for shifting by multiple bits in a single statement. If a given target processor does not implement multi-bit shifting as a single instruction then the compiler will simulate it using a loop.

Modifying The Assignment Operator...

The C language also provides some compound assignment operators, mostly because of the close relationship it maintains with the processor and the poor optimization methods used by early compilers. Most of these 'short-hand' forms can be represented by a single processor instruction (when possible), and early programmers knew that because they commonly viewed the assembly code the compiler generated. That was how code used to be optimized. Other than that, they can all be represented through different expression syntax.

The operators '+= -= *= /= %= &= |= ^= <<= >>=' modify the assignment operator and do exactly what each one suggests. (Note that '!=' is the 'not equal' comparison, not a compound assignment, and C has no '~=' operator.) For each compound operator there is an alternative coding syntax that will achieve the same result but will most likely NOT produce the same machine code. This is all determined by the tuning of a compiler to the target architecture (optimization). Consider...

a <<= 2; As meaning a = a << 2;

a += 4; As meaning a = a + 4;

a &= 0101; As meaning a = a & 0101;

In the Assembly world one might program 'and D0,D1'. Translating... Bitwise 'and' the contents of the 'D0' register with the contents of the 'D1' register and store the result in register 'D1'. Executed as a single processor instruction.

In the old days... If you could use the statement 'a &= b;' and get the above code, you couldn't optimize it any better than that. This is also a good example of the tight coupling of the C compiler to the code that it 'can' generate. A good programmer can use this relationship to produce code that translates well to the underlying target machine code.

Conditional Operators...

All programs need a mechanism to manipulate flow. Conditions regularly arise, like a fork in the road, where a program can choose between multiple paths. Programs know (usually) or determine this based on a previous operation and the resulting 'condition code' the processor produced (=0 !=0 <0 <=0 >0 >=0).

Since the flow of the generated code cannot match the C source code line for line, C abstracts this through the statements 'if else while do for'. In the context it provides, it is actually a pretty good match to what used to be done in Assembly to achieve the same control.
