@Kai771: Ubuntu defaults to -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16, which would be okay if targeting the K4/K5, but is completely wrong for the K3.
My best guess is that something's still not picking up the right *FLAGS along the way, and thus ends up using gcc's defaults => SIGILL/SIGSEGV on the K3, because its ARMv6 core with plain VFP can't execute ARMv7 or VFPv3 instructions (armv7 > armv6 & vfp3 > vfp).
You can try with the full march/mtune/mfpu/float-abi/marm stuff in your *FLAGS & env, see if it helps.
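As a rough sketch of what I mean by "in your *FLAGS & env": the flag values are the K3 ones from my ct-ng defaults quoted further down, but the K3_FLAGS variable name is just something I made up for illustration, and which variables your build system actually honors is an assumption you'll have to check.

```shell
# K3_FLAGS is a hypothetical helper variable; the values are the K3 defaults
# quoted from the ct-ng toolchain in this post.
K3_FLAGS="-march=armv6j -mtune=arm1136jf-s -mfloat-abi=softfp -mfpu=vfp -marm"

# Append rather than prepend: for -march/-mtune/-mfpu, gcc uses the last
# occurrence, so the K3 values win over anything already in the variable.
export CFLAGS="${CFLAGS:-} ${K3_FLAGS}"
export CXXFLAGS="${CXXFLAGS:-} ${K3_FLAGS}"
```

Some build systems ignore the environment and only take these on the configure/make command line, so it may need to be passed there too.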
The fact that you ran CS with O3 & Ubuntu without explains the different optimization-related flags (all the ones you flagged only in CS with a 1). The two others are because the Ubuntu TC is more recent & defaults to targeting the Cortex family (mvectorize-with-neon-quad), and defaults to Thumb mode instructions (mthumb).
FWIW, with my ct-ng self-built TC (no args, so default opt level):
Code:
gcc version 4.7.2 20120910 (prerelease) (crosstool-NG hg+default-c79d55b27724)
COLLECT_GCC_OPTIONS='-Q' '-v' '-march=armv6j' '-mtune=arm1136jf-s' '-mfloat-abi=softfp' '-mfpu=vfp' '-mtls-dialect=gnu'
options passed: -v /home/niluje/hello.c -march=armv6j -mtune=arm1136jf-s
-mfloat-abi=softfp -mfpu=vfp -mtls-dialect=gnu
options enabled: -fauto-inc-dec -fbranch-count-reg -fcommon
-fdebug-types-section -fdelete-null-pointer-checks -fdwarf2-cfi-asm
-fearly-inlining -feliminate-unused-debug-types -ffunction-cse -fgcse-lm
-fgnu-runtime -fident -finline-atomics -fira-share-save-slots
-fira-share-spill-slots -fivopts -fkeep-static-consts -fleading-underscore
-fmath-errno -fmerge-debug-strings -fmove-loop-invariants -fpeephole
-fprefetch-loop-arrays -freg-struct-return -fsched-critical-path-heuristic
-fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
-fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
-fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fshow-column
-fsigned-zeros -fsplit-ivs-in-unroller -fstrict-volatile-bitfields
-ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert
-ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
-ftree-parallelize-loops= -ftree-phiprop -ftree-pta -ftree-reassoc
-ftree-scev-cprop -ftree-slp-vectorize -ftree-vect-loop-version
-funit-at-a-time -fvar-tracking -fvar-tracking-assignments
-fzero-initialized-in-bss -marm -mglibc -mlittle-endian -msched-prolog
-munaligned-access -mvectorize-with-neon-quad
Basically follows what I've asked of it: defaults to -march=armv6j -mtune=arm1136jf-s -mfloat-abi=softfp -mfpu=vfp -marm
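If you want to compare toolchains without eyeballing the whole dump, a small filter like this can pull just the target-related switches out of the -Q -v output. It's only a sketch: extract_target_flags is a name I made up, and the list of switches it looks for is limited to the ones discussed in this post.

```shell
# Hypothetical helper: reads gcc -Q -v output on stdin and prints only the
# target-selection switches (arch, tune, fpu, float ABI, ARM/Thumb mode).
extract_target_flags() {
    grep -E -o -e "-m(arch|tune|fpu|float-abi)=[^ ']+|-marm|-mthumb" | sort -u
}

# Usage (the compiler name is an assumption, use your own cross gcc):
#   your-cross-gcc $CFLAGS -Q -v -c hello.c 2>&1 | extract_target_flags
```

Run it once per toolchain with the same *FLAGS; if -march=armv7-a or -mthumb still shows up for a K3 build, something is dropping your flags.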