Current Topic: C compilers provide only a small set of 'built-in' data types. Fortunately they also let programs extend that set, giving them endless flexibility.
The Built-In Data-Types C Provides Closely Match The Hardware Of The Day...
C provides the basic data types that are immediately usable by most processors. This is closely tied to the computers of the era but, as time has proved, they still provide a usable set for almost all programs and target processors. The 'Data Types', roughly in order of bit-width, are...
char short int long float double
There is a math modifier 'unsigned' which tells the compiler how to treat the data for all arithmetic operations. There is also a companion modifier 'signed', but since the integer types are signed by default I have never needed to use it. Maybe I just missed the era where it was useful. The one exception is 'char': whether a plain 'char' is signed or unsigned is left to the compiler, so 'signed char' is the one place where the modifier really matters.
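Here is a minimal sketch of what 'unsigned' changes, assuming an ANSI C (C89 or later) compiler. The function names plain_char_is_signed and unsigned_demo are made up for illustration, and the asserts simply state facts the language guarantees.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when plain 'char' is a signed type on this compiler, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical demo: the same value read as signed and as unsigned. */
void unsigned_demo(void)
{
    unsigned int u = 0;
    signed char sc = -1;
    unsigned char uc = (unsigned char)sc;

    u = u - 1;                 /* unsigned math wraps around instead of going negative */
    assert(u == UINT_MAX);

    assert(sc < 0);            /* the signed variable really holds -1 */
    assert(uc == UCHAR_MAX);   /* -1 converted to unsigned char is the largest value */
}
```

Calling unsigned_demo() from main() runs the checks. plain_char_is_signed() reports which way this particular compiler went for a plain 'char'.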
There is also a type qualifier 'const', which often doubles as a 'memory space' hint. It means 'constant', as in 'read only'. In embedded processing this means that the data may reside in 'Read Only Memory', and trying to write to it can cause a 'BUS ERROR'. Instant program termination. Generally though it is used as a rule that makes the C compiler complain, with a warning or an error, about code that violates it. That gives the programmer the opportunity to correct the violation or override it. TIP: If it was defined correctly, and the compiler complains about the usage, then adjust the program. Don't direct the compiler to ignore it.
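A small sketch of 'const' used as a rule, assuming an ANSI C compiler. The table digit_bits and the functions sum_bytes and sum_digit_bits are hypothetical names, not taken from any real project.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup table; 'const' lets an embedded toolchain place it in ROM. */
static const unsigned char digit_bits[4] = { 0x3F, 0x06, 0x5B, 0x4F };

/* 'const' on the parameter is a promise: this function only reads the table. */
unsigned int sum_bytes(const unsigned char *table, size_t count)
{
    unsigned int total = 0;
    size_t i;

    assert(table != NULL);
    for (i = 0; i < count; i++)
        total += table[i];
    /* table[0] = 0;   <- uncommenting this makes the compiler reject the write */
    return total;
}

/* Convenience wrapper that sums the hypothetical table above. */
unsigned int sum_digit_bits(void)
{
    return sum_bytes(digit_bits, sizeof digit_bits);
}
```

The commented-out line is the kind of violation the TIP above is about: fix the code rather than cast the 'const' away.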
Here's the tricky part though. In my programming experience only three of the data types have been stable over time. This may surprise you. 'char' has always represented an 8-Bit value, and 'float' and 'double' are defined by the ancient 'IEEE' specification (IEEE 754) and so 'have to' be implemented to an outside standard (bugs excluded).
For the remaining types, 'short' has been relatively stable as a 16-Bit value; however, 'int' and 'long' have always been compiler specific. Of course... I've been programming for, probably, 35 years on target processors from 8-Bit to 64-Bit using dozens of different compilers (with hundreds of revisions). So I've seen the best of it and the worst of it. Furthermore, I have always strictly coded to the original (K&R) standard. There are later revisions (standards) that extend the 'data types' and enforce minimum Bit-widths; however, I chose to avoid them. And... Yes, that neglect does in fact still give me a mosquito bite in the backside occasionally. Usually when I upgrade my computer. I don't really care though, since currently I only write C programs for my own private purposes.
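One way to see what a given compiler decided is to ask it. The sketch below assumes only the standard limits.h header; the bits_in_* and width_demo names are made up for illustration. The asserts list the few minimums that every conforming compiler has to honor; everything beyond them is compiler specific.

```c
#include <assert.h>
#include <limits.h>

/* Width in bits of each type, as defined by whichever compiler builds this. */
int bits_in_short(void) { return (int)(sizeof(short) * CHAR_BIT); }
int bits_in_int(void)   { return (int)(sizeof(int)   * CHAR_BIT); }
int bits_in_long(void)  { return (int)(sizeof(long)  * CHAR_BIT); }

/* The few promises that hold on every conforming compiler. */
void width_demo(void)
{
    assert(CHAR_BIT >= 8);
    assert(bits_in_short() >= 16);
    assert(bits_in_int() >= 16);
    assert(bits_in_long() >= 32);
    assert(SHRT_MAX <= INT_MAX);
    assert(INT_MAX <= LONG_MAX);
}
```

Printing the three bits_in_* results on two different machines is a quick way to find out where an upgrade might bite.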
The Final And Most Dangerous Type Is The Pointer...
A 'pointer' in C represents a direct address in memory. When any program reads or writes a variable it is through this address in memory. Normally C makes this transparent when assigning data; however, there are many cases where this is not enough access. Some programs need to get closer to the data. For this C introduces pointer types. Typically... Programs define pointers and assign them to objects or variables, utilizing the facilities of C for manipulating them intuitively. It's also important to point out that a pointer carries the type of the data it points to, and with it the size of that data. In other words... A pointer to an int is not compatible with a pointer to a 'string' (char array).
To define a 'pointer', a '*' precedes the variable name, after the type name.
{ short * ps; }
This statement creates a pointer to a single short variable, referenced as ps. Many functions will require you to pass a pointer of a given type. For instances where the variable is already defined and a pointer to it needs to be passed, you can use the & or 'address-of' operator.
{ short s; get_short(&s); }
This statement creates a short variable s and then passes it by reference (address of) to a function which requires a single value passed to it of type short* (short pointer).
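The text only shows the call to get_short, so the body below is a hypothetical one (it just stores 1234), written as a sketch in ANSI C. read_one and short_stride are also made-up names. The second one shows the point about a pointer carrying the size of its type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical body for get_short(); the text only shows how it is called. */
void get_short(short *ps)
{
    assert(ps != NULL);
    *ps = 1234;          /* write through the pointer into the caller's variable */
}

/* The caller side: define a short, then hand over its address. */
short read_one(void)
{
    short s = 0;

    get_short(&s);       /* pass the address of s, not its value */
    return s;
}

/* Stepping a short pointer by one moves it sizeof(short) bytes in memory. */
size_t short_stride(void)
{
    short pair[2] = { 0, 0 };
    short *p = &pair[0];

    return (size_t)((char *)(p + 1) - (char *)p);
}
```

After read_one() returns, the value came from get_short even though s was never returned by it; the write went through the address.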
C Also Provides Methods To Define Custom Types...
The first convenience type is the 'enum' or enumerator. This defines a data type designed to enforce mutual exclusion between members. In other words... Every value within an enumerated list is guaranteed to be unique, as long as the program leaves the numbering to the compiler. Programs can use this as a method (compiler rule) to help enforce data integrity.
enum _states {none, on, off};
This statement creates a 'compile time' type 'enum _states' that can be used to define a variable, and whose members {none, on, off} are assigned constant values. In this case the three members are guaranteed to be of different values. Additionally... The enumerated members can be used as constants anywhere (within scope) where a numeric value is allowed. This includes all arithmetic and assignment operations. For these operations the compiler treats them as 'of type' int.
A program is also allowed to set the member values of an enum type during its definition.
enum _flags {unknown=0, enable=1, normal=2, select=4, highlight=8};
This statement specifically defines a set of 'named' bit-flag values (one bit each) intended for logic operations, useful when manipulating devices. And a million other things.
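A sketch of how the two enums above might be used, assuming ANSI C. The helpers set_flag, clear_flag, has_flag and the function flags_demo are hypothetical names for illustration only.

```c
#include <assert.h>

/* Hypothetical helpers; flags are kept in a plain int, one bit per flag. */
int set_flag(int flags, int bit)   { return flags | bit; }
int clear_flag(int flags, int bit) { return flags & ~bit; }
int has_flag(int flags, int bit)   { return (flags & bit) != 0; }

int flags_demo(void)
{
    /* Declared locally: 'select' is also a POSIX function name, so keeping
       the enum inside the function avoids a clash with system headers. */
    enum _states {none, on, off};
    enum _flags {unknown=0, enable=1, normal=2, select=4, highlight=8};
    int flags = unknown;

    /* Left alone, the compiler numbers the members 0, 1, 2. */
    assert(none == 0 && on == 1 && off == 2);

    flags = set_flag(flags, enable);
    flags = set_flag(flags, highlight);
    assert(flags == (enable | highlight));
    assert(has_flag(flags, highlight));
    assert(!has_flag(flags, select));

    flags = clear_flag(flags, enable);
    return flags;        /* only 'highlight' is left set */
}
```

Because each flag owns a single bit, they can be combined with | and tested with & without stepping on each other.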
My favorite data type is by far 'struct'. With this mechanism programming, for me, became unlimited. Yeah! Most programmers would say 'So what!'. Well... I say to them: would you want to program the same functionality in Assembly code? All would say No. This simple and unique abstraction is what defines the power of C for me. In implementation it translates well to most underlying processors (address-register indexed indirect) while providing the simplicity to handle tremendous amounts of data in an intuitive fashion.
The struct statement works by defining a 'compile time' type which is actually a block of memory partitioned into sections for which there is a named member associated with each section.
struct _object { short state; int flags; long count; char name[32]; };
This statement defines a type 'struct _object'. Every variable of that type is a data structure sizeof(struct _object) bytes in length. It has members (state, flags, count, name) of different types which the compiler manages individually, and correctly, when manipulating them.
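The 'block of memory partitioned into sections' idea can be checked with sizeof and the standard offsetof macro. This is a sketch, assuming ANSI C; object_member_bytes and layout_demo are made-up names. The exact offsets depend on the compiler, so the asserts only state what is guaranteed.

```c
#include <assert.h>
#include <stddef.h>

struct _object { short state; int flags; long count; char name[32]; };

/* Sum of the member sizes; the real struct may be larger because of padding. */
size_t object_member_bytes(void)
{
    return sizeof(short) + sizeof(int) + sizeof(long) + 32;
}

void layout_demo(void)
{
    /* The first member always sits at the start of the block. */
    assert(offsetof(struct _object, state) == 0);

    /* Members are laid out in declaration order. */
    assert(offsetof(struct _object, flags) > offsetof(struct _object, state));
    assert(offsetof(struct _object, count) > offsetof(struct _object, flags));
    assert(offsetof(struct _object, name)  > offsetof(struct _object, count));

    /* The name section is exactly the 32 chars that were asked for. */
    assert(sizeof(((struct _object *)0)->name) == 32);

    /* The whole block is at least as big as its parts. */
    assert(sizeof(struct _object) >= object_member_bytes());
}
```

The compiler is free to insert padding between members to keep each one aligned, which is why the total is checked with >= and not ==.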
Referencing a member for a struct uses two methods depending on whether the struct has been defined (passed) as a pointer. This is necessary so the compiler can resolve additional references it may encounter while parsing the code.
The first method is for cases where the struct is being referenced by value and uses the '.' (period) index operator.
{ struct _object obj; obj.state = none; obj.flags = unknown; obj.count = -1; obj.name[0] = '\0'; }
For the case where the struct is a pointer (dereferenced) the '->' indirect operator is required.
{ struct _object obj; struct _object *pobj; pobj = &obj; pobj->state = none; pobj->flags = unknown; pobj->count = -1; pobj->name[0] = '\0'; }
Any good compiler will complain on improper use (syntax) which may seem silly because the compiler knows the difference. Actually... It's to ensure the programmer understands the difference and how well the compiler can 'guard' against simple programming errors (out of bounds, wrong type).
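Both access methods usually show up together: a caller owns the struct by value and a function works on it through a pointer. The sketch below assumes ANSI C; object_reset and object_demo are hypothetical helpers, and "lamp" is just a sample name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum _states {none, on, off};
struct _object { short state; int flags; long count; char name[32]; };

/* Hypothetical helper: takes a pointer, so members are reached with '->'. */
void object_reset(struct _object *pobj, const char *name)
{
    assert(pobj != NULL && name != NULL);
    pobj->state = none;
    pobj->flags = 0;
    pobj->count = -1;
    strncpy(pobj->name, name, sizeof pobj->name - 1);
    pobj->name[sizeof pobj->name - 1] = '\0';   /* strncpy may not terminate */
}

/* The caller owns the struct by value, so it uses '.' and passes '&obj'. */
long object_demo(void)
{
    struct _object obj;

    object_reset(&obj, "lamp");
    obj.state = on;
    obj.count = obj.count + 1;   /* -1 from the reset, plus one */
    return obj.count;
}
```

Swapping a '.' for a '->' in either function (or the other way around) is exactly the kind of mistake the compiler refuses to let through.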
Why Is This So Important...