Sunday, February 03, 2013

TI C6000 SYS/BIOS (3)

  • Debug mode is slow: for example, an LDH (load halfword) is followed by a NOP 4 (no operation), so nothing is pipelined.
  • A function call costs 6 cycles, and the return costs another 6 cycles.
  • Levels of optimization: -o1 (local, single block), -o2 (function, across blocks), -o3 (file, across functions), -pm -o3 (program, across files).
  • uint32_t is a good choice because it has the same width on every system. 'long' and 'int' can have different widths on different systems.
  • Keyword 'restrict': tells the compiler that, within this scope, there is no aliasing for this pointer. (Aliasing: e.g. one memory location with two ways to access it.)
  • It is not good to insert "asm(...)" in C code. If assembly is necessary, write it in a separate asm file and call it from C.
  • Results of the different optimization methods: Debug, no opt, -g (925k cycles); Release, -o2, no -g (33k cycles); opt (20k); opt with MUST_ITERATE (17k); opt with MUST_ITERATE and restrict (7k); DSPLib (8k).
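  • A cache uses Valid + Tag + Index: the index selects a line, the tag identifies which address is stored there, and the valid bit says whether the line holds data. Data in the cache is reusable, so a cache is good for temporal locality and spatial locality, but not for random access.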
  • For y = \sum a_n x_n, suppose the address of a is 8000 and the address of x is 8010. This causes trouble: both addresses end in '0', so they map to the same cache index.
  • A direct-mapped cache (1-way) associates each address within a block with one cache line, so there is exactly one cache index for any address in the memory map. This is good for L1P (level 1 program). A 2-way set-associative cache is good for L1D (level 1 data).
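
As a rough illustration of the 'restrict' and MUST_ITERATE notes above, here is a minimal sketch of the dot product y = \sum a_n x_n in C. The function name dot_product, the 16-bit element type, and the trip-count promise (at least 8 iterations, a multiple of 4) are assumptions made for this example, not the code behind the cycle counts above. The pragma is guarded so that non-TI compilers skip it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example function: dot product of two 16-bit arrays.
 * 'restrict' promises the compiler that a and x do not alias. */
int32_t dot_product(const int16_t *restrict a,
                    const int16_t *restrict x,
                    int n)
{
    int32_t sum = 0;
    int i;

    /* Enforce in debug builds the same promise the pragma makes below
     * (assumed values: at least 8 iterations, a multiple of 4). */
    assert(n >= 8 && n % 4 == 0);

#ifdef __TI_COMPILER_VERSION__
    /* TI compiler only: minimum trip count 8, no stated maximum,
     * trip count is a multiple of 4. */
    #pragma MUST_ITERATE(8, , 4)
#endif
    for (i = 0; i < n; i++) {
        sum += (int32_t)a[i] * x[i];
    }
    return sum;
}
```

The promise made by MUST_ITERATE has to be true for every call. If a caller can pass a shorter or odd-sized array, the pragma values need to be relaxed accordingly.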
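
The 8000 / 8010 conflict in the notes above can be sketched with a toy direct-mapped cache. The geometry here is an assumption chosen to match the "last digit" wording (16 lines, with the index taken from the last hex digit of the address). It is not the real L1P or L1D layout of any C6000 device, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy geometry (assumption for illustration only): 16 cache lines,
 * so the index is the low 4 bits, i.e. the last hex digit. */
#define TOY_INDEX_BITS 4u
#define TOY_NUM_LINES  (1u << TOY_INDEX_BITS)

/* Index: which cache line the address maps to. */
static inline uint32_t toy_cache_index(uint32_t addr)
{
    return addr & (TOY_NUM_LINES - 1u);
}

/* Tag: the remaining upper bits, stored to tell addresses apart. */
static inline uint32_t toy_cache_tag(uint32_t addr)
{
    return addr >> TOY_INDEX_BITS;
}

/* In a direct-mapped cache, two addresses evict each other when they
 * share an index but have different tags. */
static inline bool toy_conflict(uint32_t a, uint32_t b)
{
    return toy_cache_index(a) == toy_cache_index(b) &&
           toy_cache_tag(a) != toy_cache_tag(b);
}
```

With this toy layout, 0x8000 and 0x8010 both have index 0 but different tags, so alternating reads of a[n] and x[n] would keep replacing the same line. A 2-way set-associative cache can hold both lines in the same set.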
