Clever Tricks and Hacks in C

March 29, 2016

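I. Bit manipulation tricks
Finding the offset of a struct member (the result is a size, so cast to size_t rather than int; the standard offsetof macro in <stddef.h> does the same job portably):
#define OFFSET(structure, member) ((size_t) &((structure *)0)->member)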

In C, x[n] is equivalent to *((x)+(n)),
so x[n] and n[x] give the same result.

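Compile-time type check: (void)(&a == &b) checks whether a and b have the same type. If they differ, the comparison of incompatible pointer types draws a compiler diagnostic (a warning by default in GCC, a build failure under -Werror).
Power-of-two test. n & (~n + 1) isolates the lowest set bit, so the comparison holds only when n has a single bit set; the extra (n) != 0 keeps 0 from passing:
#define is_power_of_2(n) ((n) != 0 && (n) == ((n) & (~(n) + 1)))
Compile-time assertions:
1. An assertion whose condition is checked at compile time rather than at run time. Here is an example from the Linux kernel:
  /* Force a compilation error if condition is true */
  #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))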
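Bit tagging (tagged pointers): in dynamic languages, combining pointers with bit operations lets small integers be stored unboxed in the pointer word itself (this one is a bit niche).
```cpp
#include <stdint.h>

enum type { pair, string, vector, ... };

typedef struct value *SCM;

struct value {
    enum type type;
    union {
        struct { SCM car, cdr; } pair;
        struct { int length; char *elts; } string;
        struct { int length; SCM  *elts; } vector;
        ...
    } value;
};

/* Assuming heap objects are 8-byte aligned, a real pointer has its low 3 bits clear.
   intptr_t is used instead of int so the pointer is not truncated on 64-bit targets. */
#define POINTER_P(x) (((intptr_t) (x) & 7) == 0)
#define INTEGER_P(x) (! POINTER_P (x))

/* Small integers live in the upper bits; the low tag bit marks them as non-pointers. */
#define GET_INTEGER(x) ((intptr_t) (x) >> 3)
#define MAKE_INTEGER(x) ((SCM) (((intptr_t) (x) << 3) | 1))
```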
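
To see the tagging scheme in action, here is a minimal self-contained sketch. All names in it (word_t, make_integer, and so on) are made up for illustration and are not part of the code above. It assumes 8-byte-aligned objects and a two's-complement machine where converting an out-of-range unsigned value to intptr_t wraps, which holds on common platforms but is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch. */
typedef uintptr_t word_t;          /* one word: either a pointer or a tagged integer */

#define TAG_BITS 3
#define TAG_MASK ((word_t)7)

int is_pointer(word_t w) { return (w & TAG_MASK) == 0; }
int is_integer(word_t w) { return !is_pointer(w); }

/* Shift the value up by three bits and set the lowest tag bit. */
word_t make_integer(intptr_t v) { return ((word_t)v << TAG_BITS) | 1; }

/* Clear the tag, then divide: this avoids right-shifting a negative value. */
intptr_t get_integer(word_t w) { return (intptr_t)(w & ~TAG_MASK) / 8; }

void tagged_self_check(void)
{
    static _Alignas(8) char cell[8];       /* 8-byte aligned, so the low 3 bits are 0 */
    word_t p = (word_t)(void *)cell;
    word_t i = make_integer(-5);

    assert(is_pointer(p));
    assert(is_integer(i));
    assert(get_integer(i) == -5);
}
```

Calling tagged_self_check() from main exercises both cases. The price of the trick is three bits of integer range.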

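II. Data structures
Anonymous data: anonymous arrays, anonymous structs, and so on (C99 compound literals). Here we use them to build a Lisp-style list.
```cpp
#include <stdio.h>

int main(void) {
    struct mylist {
        int a;
        struct mylist *next;
    };

    /* Each cons() is a one-element anonymous array that decays to a pointer to its node. */
    #define cons(x, y) (struct mylist[]){{x, y}}
    struct mylist *list = cons(1, cons(2, cons(3, NULL)));
    struct mylist *p = list;
    while (p != NULL) {
        printf("%d\n", p->a);
        p = p->next;
    }
    return 0;
}
```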
Designated struct initializers: an example from the Linux kernel. The syntax is seldom seen in everyday code.
```cpp
static struct usb_driver usb_storage_driver = {
    .owner = THIS_MODULE,
    .name = "usb-storage",
    .probe = storage_probe,
    .disconnect = storage_disconnect,
    .id_table = storage_usb_ids,
};
```

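Structs plus pointers: the variations here are endless.
Memory alignment: used for optimization.

III. Functions
setjmp/longjmp: coroutines, exceptions, and similar constructs are all built on them.
Function pointers: they can implement higher-order functions and simulate simple closures.
GCC's C extensions support nested functions, and Clang's C dialect apparently even has closures (blocks).

#include <stdio.h>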

int main() {
   int swap (int *a, int *b) {
       int c;
       c = *a;
       *a = *b;
       *b = c;
       return 0;
    }
    int first = 12, second = 34;
    printf("f is %d and s is %d\n", first, second);
    swap(&first, &second);

    printf("f is %d and s is %d\n", first, second);

    return 0;
}
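
The setjmp/longjmp item above is easier to picture with code. The following is a minimal sketch of exception-style error handling in standard C. The names recover_point, safe_div, and try_div are invented for this example, and real code would also have to think about cleanup of resources acquired between the two calls.

```c
#include <assert.h>
#include <setjmp.h>

jmp_buf recover_point;             /* where a "throw" lands (illustrative name) */

/* "Throws" by jumping back to recover_point when asked to divide by zero. */
int safe_div(int a, int b)
{
    if (b == 0)
        longjmp(recover_point, 1);
    return a / b;
}

/* Returns 0 and stores the quotient in *out, or returns -1 if safe_div "threw". */
int try_div(int a, int b, int *out)
{
    if (setjmp(recover_point) != 0)
        return -1;                 /* we arrived here via longjmp */
    *out = safe_div(a, b);
    return 0;
}

void try_div_self_check(void)
{
    int q = 0;
    int rc = try_div(10, 2, &q);
    assert(rc == 0 && q == 5);

    rc = try_div(1, 0, &q);        /* division by zero is intercepted */
    assert(rc == -1 && q == 5);    /* q keeps its previous value */
}
```

setjmp is used directly as the controlling expression of the if, which is one of the few contexts in which the standard allows it.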

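C does not have many tricks, but most of them are obscure. Like mathematics, they have to be understood step by step, and they require some understanding of both the compiler and the C standard. C also has plenty of minefields, namely the undefined behavior we run into all the time. Before looking at the tricks, let's look at one of the mines: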

a = b + c;

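It looks simple, right? But what does the compiler do if b plus c exceeds the upper limit, say INT_MAX? On typical two's-complement machines you will often see a negative number, but signed integer overflow is undefined behavior, so the standard guarantees nothing and an optimizing compiler may assume it never happens. This kind of undefined behavior is extremely annoying, especially when you are the one who has to debug it. The standard's position, of course, is that checking every addition is pointless because what we want is a fast language. Readers should remember one thing: people who play with standards do not write code. Do not casually ignore these cases, because any of them can cause fatal problems. That is why C is a language only experts can truly handle well. Without a solid understanding of C, do not use tricks lightly: one of them may hide a mine that no amount of debugging will find.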
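
One way to defuse this particular mine is to test for overflow before adding. The function below is a sketch with a made-up name (checked_add), not something from the standard library. GCC and Clang also provide __builtin_add_overflow for the same purpose.

```c
#include <assert.h>
#include <limits.h>

/* Stores a + b in *sum and returns 1, or returns 0 (leaving *sum alone)
   if the mathematical result does not fit in an int. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}

void checked_add_self_check(void)
{
    int s = 0;
    int ok = checked_add(1, 2, &s);
    assert(ok && s == 3);

    ok = checked_add(INT_MAX, 1, &s);      /* would overflow upward */
    assert(!ok && s == 3);

    ok = checked_add(INT_MIN, -1, &s);     /* would overflow downward */
    assert(!ok);
}
```

The comparisons are arranged so that none of them can overflow themselves: INT_MAX - b is only evaluated for positive b, and INT_MIN - b only for negative b.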

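Let's start with a small trick to ease into things:

Branch hints for the compiler

Some compilers do not provide the __builtin_expect builtin, and PGO (profile-guided optimization) can achieve the same effect directly, so it is wrapped in LIKELY and UNLIKELY macros that can be switched off at any time.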

#ifndef __GNUC__
#define __builtin_expect(x, expected_value) (x)
#endif
#define LIKELY(x)    __builtin_expect(!!(x),1)
#define UNLIKELY(x)  __builtin_expect((x)!=0,0)

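The trick is to tell the compiler whether a condition, such as a loop test, is expected to be true or false. It touches on compiler optimization and on the classic analogy of a train choosing a track at a junction (branch prediction). It is very common in the Linux kernel. Give five extra points to anyone who knows it; knowing a bit of kernel code is a good thing.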
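
As a usage sketch, the hint usually goes on an error path that is expected to be rare. The function name sum_nonnegative and the assumption that negative entries are rare are invented for this example. The macros are redefined here so that the block stands alone, and they fall back to a plain test on compilers that are neither GCC nor Clang.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Sums an array, or returns -1 if a negative entry (assumed rare) is found. */
long sum_nonnegative(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0))    /* hint: this branch is rarely taken */
            return -1;
        total += v[i];
    }
    return total;
}

void likely_self_check(void)
{
    int good[] = { 1, 2, 3 };
    int bad[]  = { 1, -2, 3 };
    assert(sum_nonnegative(good, 3) == 6);
    assert(sum_nonnegative(bad, 3) == -1);
}
```

The hint never changes the result, only how the compiler lays out the code, so a wrong guess costs speed rather than correctness.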

Fixed-width types

Next, something we cannot ignore when exact types matter:

stdint.h

#ifndef __int8_t_defined
#define __int8_t_defined
typedef signed char		   int8_t;
typedef short int		     int16_t;
typedef int			         int32_t;
#if __WORDSIZE == 64
typedef long int		     int64_t;
#else
__extension__
typedef long long int		 int64_t;
#endif
#endif

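These types are excellent: their meaning is ten times clearer than char, short, int, and long. They are the best choice for bitmap operations in particular (remember to use the unsigned UINT variants when working with bits).

Yet in project code you will often see

#define INT16 short
#define INT32  long

used to define them. This is actually wrong (long, for example, is 64 bits wide on LP64 platforms such as 64-bit Linux), so it is better to simply #include <stdint.h>.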
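
As a small illustration of why the unsigned fixed-width types suit bit work, here is a sketch of three bitmap helpers on uint32_t. The helper names are invented for this example, and the bit index is assumed to be in the range 0 to 31.

```c
#include <assert.h>
#include <stdint.h>

/* UINT32_C(1) makes the shifted constant unsigned, so shifting into bit 31 is well defined
   (the plain int expression 1 << 31 would overflow a 32-bit int). */
uint32_t bit_set(uint32_t map, unsigned i)   { return map | (UINT32_C(1) << i); }
uint32_t bit_clear(uint32_t map, unsigned i) { return map & ~(UINT32_C(1) << i); }
int      bit_test(uint32_t map, unsigned i)  { return (int)((map >> i) & 1u); }

void bitmap_self_check(void)
{
    uint32_t map = 0;

    map = bit_set(map, 31);
    assert(map == UINT32_C(0x80000000));
    assert(bit_test(map, 31) == 1);
    assert(bit_test(map, 0) == 0);

    map = bit_clear(map, 31);
    assert(map == 0);
}
```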

The comma operator

Using it like this is fine. Note that a comma expression evaluates to the value of its last operand.

for (int i=0; i < 10; i++, doSomethingElse())
{
/* whatever */
}
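
Two more uses, sketched with invented function names: stepping two indices in one for header, and reading the value of a comma expression. Note that the comma in the declaration `size_t i = 0, j = n - 1` is only a separator; the comma operator is the one in `i++, j--`.

```c
#include <assert.h>
#include <stddef.h>

/* Reverses v in place; i and j are advanced together by a comma expression. */
void reverse(int *v, size_t n)
{
    if (n == 0)
        return;
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = v[i];
        v[i] = v[j];
        v[j] = tmp;
    }
}

/* Operands are evaluated left to right; the result is the right-most one. */
int comma_value(void)
{
    int calls = 0;
    int x = (calls++, calls++, calls * 10);   /* calls is 2 by the time the last operand runs */
    return x;
}

void comma_self_check(void)
{
    int v[] = { 1, 2, 3, 4 };
    reverse(v, 4);
    assert(v[0] == 4 && v[1] == 3 && v[2] == 2 && v[3] == 1);
    assert(comma_value() == 20);
}
```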

Struct initialization

Note that this is not all-bits-zero initialization in the usual sense but logical-zero initialization. To quote a Stack Overflow answer:

memset/calloc do “all bytes zero” (i.e. physical zeroes), which is indeed not defined for all types. { 0 } is guaranteed to initialize everything with proper logical zero values. Pointers, for example, are guaranteed to get their proper null values, even if the null-value on the given platform is 0xBAADF00D

struct mystruct a = {0};
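
A short sketch of what that guarantee covers. The members of struct mystruct are invented here, since the original does not show them.

```c
#include <assert.h>
#include <stddef.h>

struct mystruct {                /* hypothetical members for illustration */
    int     count;
    double  ratio;
    char   *name;
    int     history[4];
};

/* Returns 1 if every member of a {0}-initialized struct holds its logical zero. */
int zero_init_ok(void)
{
    struct mystruct a = {0};     /* count = 0 explicitly; everything else implicitly */

    return a.count == 0 &&
           a.ratio == 0.0 &&
           a.name == NULL &&
           a.history[0] == 0 && a.history[3] == 0;
}

void zero_init_self_check(void)
{
    assert(zero_init_ok() == 1);
}
```

Only the first member is initialized explicitly; the language rule that the remaining members are initialized as if they had static storage duration does the rest, including nested arrays.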

Bit fields

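A very handy way to declare fields at bit granularity, especially useful in certain algorithms.

struct cat {
    unsigned int legs:3;  // 3 bits for legs (0-4 fit in 3 bits)
    unsigned int lives:4; // 4 bits for lives (0-9 fit in 4 bits)
    // ...
};

struct cat make_cat(void)
{
    struct cat kitty;   // in C the struct keyword is required here
    kitty.legs = 4;
    kitty.lives = 9;
    return kitty;
}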

Finite state machines

A goto-based finite state machine can be achieved with the following macros:

#define FSM
#define STATE(x)      s_##x :
#define NEXTSTATE(x)  goto s_##x

FSM {
  STATE(x) {
    ...
    NEXTSTATE(y);
  }

  STATE(y) {
    ...
    if (x == 0)
      NEXTSTATE(y);
    else
      NEXTSTATE(x);
  }
}
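
To make the skeleton concrete, here is a small runnable sketch that uses the same three macros to count words in a string. The state names (outside, inside, done) and the function count_words are invented for this example.

```c
#include <assert.h>
#include <ctype.h>

#define FSM
#define STATE(x)      s_##x :
#define NEXTSTATE(x)  goto s_##x

/* Counts whitespace-separated words with a two-state machine plus a final state. */
int count_words(const char *p)
{
    int words = 0;

    FSM {
      STATE(outside) {                       /* between words */
        if (*p == '\0')
          NEXTSTATE(done);
        if (!isspace((unsigned char)*p)) {
          words++;                           /* first character of a new word */
          p++;
          NEXTSTATE(inside);
        }
        p++;
        NEXTSTATE(outside);
      }

      STATE(inside) {                        /* in the middle of a word */
        if (*p == '\0')
          NEXTSTATE(done);
        if (isspace((unsigned char)*p)) {
          p++;
          NEXTSTATE(outside);
        }
        p++;
        NEXTSTATE(inside);
      }

      STATE(done) { }
    }
    return words;
}

void fsm_self_check(void)
{
    assert(count_words("") == 0);
    assert(count_words("a") == 1);
    assert(count_words("  hello  world foo ") == 3);
}
```

Each state is just a label followed by a block, and every transition is a goto, so the machine compiles down to plain jumps with no state variable.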

Interlacing structures

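Interleaved control structures, as in Duff's Device. It is shown here as a memory-to-memory copy, so both pointers advance; the function name is illustrative, chosen to avoid clashing with the standard strncpy:

void duff_copy(char *to, const char *from, int count)
{
    int n;

    if (count <= 0)            /* without this guard, count == 0 would copy 8 bytes */
        return;
    n = (count + 7) / 8;       /* number of passes through the unrolled loop */
    switch (count % 8) {       /* jump into the middle of the loop body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
               } while (--n > 0);
    }
}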

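Designated array initializers

A very handy technique. GCC implemented it early on and it was later adopted into C99, but many other compilers do not implement it.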

#define FOO 16
#define BAR 3

myStructType_t myStuff[] = {

[FOO] = { foo1, foo2, foo3 },
[BAR] = { bar1, bar2, bar3 },
...
};

The same idea works for struct members:

struct foo{
int x;
int y;
char* name;
};

int main(void){
    struct foo f = { .y = 23, .name = "awesome", .x = -38 };
    (void)f;   /* silence the unused-variable warning */
    return 0;
}
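
A typical practical use is a lookup table indexed by an enum. The sketch below uses invented names (enum color, color_name, struct point) and shows both the array and the struct form, including the fact that members left unnamed are zero-initialized.

```c
#include <assert.h>
#include <string.h>

enum color { RED, GREEN, BLUE, COLOR_COUNT };

/* Each entry is tied to its enumerator, so the order written here does not matter. */
static const char *const color_name[COLOR_COUNT] = {
    [BLUE]  = "blue",
    [RED]   = "red",
    [GREEN] = "green",
};

struct point { int x, y, z; };

/* y is not named, so it is initialized to 0. */
static const struct point shifted = { .z = 5, .x = 1 };

const char *name_of(enum color c) { return color_name[c]; }
int point_sum(void) { return shifted.x + shifted.y + shifted.z; }

void designated_self_check(void)
{
    assert(strcmp(name_of(RED), "red") == 0);
    assert(strcmp(name_of(BLUE), "blue") == 0);
    assert(point_sum() == 6);
}
```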

GCC: printf-style format checking for your own functions

int my_printf (void *my_object, const char *my_format, ...)

        __attribute__ ((format (printf, 2, 3)));
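
In the attribute, the two numbers are 1-based parameter positions: the format string and the first variadic argument. The sketch below applies the same attribute to a hypothetical helper, log_format, which is not part of the original; the PRINTF_LIKE wrapper macro is also invented so that the code still compiles elsewhere.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, args_idx) __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_LIKE(fmt_idx, args_idx)
#endif

/* Parameter 3 is the format string, parameter 4 is where the variadic arguments start. */
int log_format(char *buf, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

/* Writes "[log] " followed by the formatted message; returns the total length. */
int log_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n = snprintf(buf, size, "[log] ");

    if (n < 0 || (size_t)n >= size)
        return n;
    va_start(ap, fmt);
    n += vsnprintf(buf + n, size - (size_t)n, fmt, ap);
    va_end(ap);
    return n;
}

void log_format_self_check(void)
{
    char buf[32];
    int n = log_format(buf, sizeof buf, "%d-%s", 7, "x");
    assert(n == 9);
    assert(strcmp(buf, "[log] 7-x") == 0);
}
```

With the attribute in place, a mismatched call such as passing a string for %d should be reported by GCC or Clang when format warnings are enabled (they are part of -Wall).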

Specifying floating-point print precision at run time

#include <stdio.h>

int main() {

int a = 3;
float b = 6.412355;
printf("%.*f\n",a,b);
return 0;

}
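
The `*` in the conversion takes the precision from the argument list, and that argument must be an int. A sketch that wraps this in a small helper (format_fixed is an invented name) and formats into a buffer instead of printing:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats value with `digits` decimal places into buf; returns snprintf's result. */
int format_fixed(char *buf, size_t size, double value, int digits)
{
    return snprintf(buf, size, "%.*f", digits, value);
}

void format_fixed_self_check(void)
{
    char buf[32];
    int n = format_fixed(buf, sizeof buf, 6.412355, 3);
    assert(n == 5);
    assert(strcmp(buf, "6.412") == 0);

    n = format_fixed(buf, sizeof buf, 6.412355, 1);
    assert(n == 3);
    assert(strcmp(buf, "6.4") == 0);
}
```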

Static checks

Performed at compile time rather than at run time

//--- size of static_assertion array is negative if condition is not met
#define STATIC_ASSERT(condition) \
typedef struct { \
    char static_assertion[(condition) ? 1 : -1]; \
} static_assertion_t

//--- ensure structure fits in
STATIC_ASSERT(sizeof(mystruct_t) <= 4096);
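
A typedef-based assertion like the one above declares the same type name on every use, so a second assertion in the same scope fails to compile for the wrong reason. A common variation, sketched here with invented helper names, pastes __LINE__ into the typedef name. C11 also offers _Static_assert as a language feature. The example struct packet is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Two-level paste so that __LINE__ is expanded before ## is applied. */
#define SA_CONCAT_(a, b) a##b
#define SA_CONCAT(a, b)  SA_CONCAT_(a, b)

/* The array size is -1, a compile error, when the condition is false. */
#define STATIC_ASSERT(condition) \
    typedef char SA_CONCAT(static_assertion_at_line_, __LINE__)[(condition) ? 1 : -1]

struct packet {                  /* hypothetical example type */
    unsigned char header[4];
    unsigned char payload[60];
};

STATIC_ASSERT(sizeof(struct packet) == 64);
STATIC_ASSERT(CHAR_BIT == 8);

/* The C11 equivalent, with a readable message. */
_Static_assert(sizeof(struct packet) <= 4096, "packet must fit in 4096 bytes");

int packet_size(void) { return (int)sizeof(struct packet); }
```

If any of the three conditions were false, the file would stop compiling, which is the whole point: nothing here costs anything at run time.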

String literal concatenation

#define PATH "/some/path/"
fd = open(PATH "/file", flags);

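Zero-width bit fields

struct {
int a:3;
int b:2;
int :0;
int c:4;
int d:3;
};

which will give a layout of

000aaabb 0ccccddd

instead of the following, which is what you get without the :0;

0000aaab bccccddd

The zero-width field tells the compiler that the following bit fields should start at the next atomic entity (char). The exact layout is implementation-defined, so the pictures above describe one particular compiler and target.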
Lambda

Lambdas (i.e. anonymous functions) in GCC, built from two GNU extensions, statement expressions and nested functions:

#define lambda(return_type, function_body) \
  ({ return_type fn function_body fn; })

This can be used as:

lambda (int, (int x, int y) { return x > y; })(1, 2)

Which is expanded into:

({ int fn (int x, int y) { return x > y; } fn; })(1, 2)

Picking a character by indexing a string literal

hexDigit = "0123456789abcdef"[someNybble];
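
The same indexing trick, expanded into a small sketch that converts one byte to two hex digits. The function name byte_to_hex is invented, and the code assumes 8-bit bytes.

```c
#include <assert.h>
#include <string.h>

/* Writes the two lowercase hex digits of byte, plus a terminating NUL, into out. */
void byte_to_hex(unsigned char byte, char out[3])
{
    out[0] = "0123456789abcdef"[byte >> 4];      /* high nibble */
    out[1] = "0123456789abcdef"[byte & 0x0F];    /* low nibble */
    out[2] = '\0';
}

void byte_to_hex_self_check(void)
{
    char hex[3];

    byte_to_hex(0x3f, hex);
    assert(strcmp(hex, "3f") == 0);

    byte_to_hex(0xa0, hex);
    assert(strcmp(hex, "a0") == 0);
}
```

A string literal is an array like any other, so subscripting it directly saves declaring a separate lookup table.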

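Initializing arrays with #include or macros (a preprocessor trick)

double normals[][3] = {   /* only the first dimension may be omitted; 3 components per normal is an assumption */
#include "normals.txt"
};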

Printing the call site, handy for tracing and debugging

#define WHERE fprintf(stderr,"[LOG]%s:%d\n",__FILE__,__LINE__);

X macros: powerful, but they take some skill

#include <stdio.h>

/*
 * X Macro() data list
 * Format: Enum, Value, Text
 */
#define X_ERROR \
  X(ERROR_NONE,   1, "Success") \
  X(ERROR_SYNTAX, 5, "Invalid syntax") \
  X(ERROR_RANGE,  8, "Out of range")

/*
 * Build an array of error return values
 * e.g. {1,5,8}
 */
static int ErrorVal[] =
{
#define X(Enum,Val,Text) Val,
X_ERROR
#undef X
};

/*
 * Build an array of error enum names
 * e.g. {"ERROR_NONE","ERROR_SYNTAX","ERROR_RANGE"}
 */
static char * ErrorEnum[] = {
#define X(Enum,Val,Text) #Enum,
X_ERROR
#undef X
};

/*
 * Build an array of error strings
 * e.g. {"Success","Invalid syntax","Out of range"}
 */
static char * ErrorText[] = {
#define X(Enum,Val,Text) Text,
X_ERROR
#undef X
};

/*
 * Create an enumerated list of error indexes
 * e.g. 0,1,2
 */
enum {
#define X(Enum,Val,Text) IDX_##Enum,
X_ERROR
#undef X
IDX_MAX /* Array size */
};

void showErrorInfo(void)
{
    int i;

    /*
     * Access the values
     */
    for (i=0; i<IDX_MAX; i++)
        printf(" %s == %d [%s]\n", ErrorEnum[i], ErrorVal[i], ErrorText[i]);
}

/*
 * Test validity of an error value (a fragment: it belongs inside a
 * function where value is in scope). The expansion produces
 * case 1:
 * case 5:
 * case 8:
 */
switch(value)
{
#define X(Enum,Val,Text) case Val:
X_ERROR
#undef X
     printf("Error %d is ok\n",value);
     break;
  default:
     printf("Invalid error: %d\n",value);
     break;
}
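
The pattern is easier to absorb on a smaller list. In the sketch below (all names invented for illustration), one list macro is expanded twice with different definitions of X, once to produce an enum and once to produce a parallel table of strings, so the two can never drift apart.

```c
#include <assert.h>
#include <string.h>

/* The single source of truth: one X(...) entry per color. */
#define COLOR_LIST \
    X(COLOR_RED,   "red")   \
    X(COLOR_GREEN, "green") \
    X(COLOR_BLUE,  "blue")

/* Expansion 1: the enumerators, followed by a count. */
enum color {
#define X(id, text) id,
    COLOR_LIST
#undef X
    COLOR_COUNT
};

/* Expansion 2: the names, in exactly the same order as the enum. */
static const char *const color_text[] = {
#define X(id, text) text,
    COLOR_LIST
#undef X
};

/* Returns the name of c, or "unknown" for values outside the list. */
const char *color_to_text(enum color c)
{
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_text[c] : "unknown";
}

void x_macro_self_check(void)
{
    assert(COLOR_COUNT == 3);
    assert(strcmp(color_to_text(COLOR_GREEN), "green") == 0);
    assert(strcmp(color_to_text((enum color)99), "unknown") == 0);
}
```

Adding a color means adding one line to COLOR_LIST; the enum, the count, and the string table all follow automatically.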