Tuesday, July 5, 2016

Compiling Cyanogenmod for TF300t

Inspired by JustArchi's optimizations for compiling the Cyanogenmod code, I set out to compile an optimized ROM for my dated Asus TF300t tablet. The official Cyanogenmod build does work, but it just feels a bit laggy.

The idea is simple. Follow the Cyanogenmod how-to guide to set up the build environment and run repo sync to retrieve the code. Then apply the patches from JustArchi's ArchiDroid.

However, the options used by JustArchi are quite "exotic". I am more interested in improving performance by, for example, removing "-g" and turning on NEON auto-vectorization. Besides, the tf300t build has problems with "-O3" when targeting ARM mode (it is OK in THUMB mode though). So I ended up using these parameters instead:

ARCHIDROID_GCC_CFLAGS_ARM := -O2

ARCHIDROID_GCC_CFLAGS := -O2 -funsafe-math-optimizations -ftree-vectorize -mvectorize-with-neon-quad -fgcse-las -fgcse-sm -fipa-pta -fivopts -fomit-frame-pointer -frename-registers -fsection-anchors -ftracer -ftree-loop-im -ftree-loop-ivcanon -funsafe-loop-optimizations -funswitch-loops -fweb -Wno-error=array-bounds -Wno-error=clobbered -Wno-error=maybe-uninitialized -Wno-error=strict-overflow
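As a rough illustration of what "-ftree-vectorize" is after, here is a minimal sketch of the kind of loop it targets. The function name scale_add is made up for this example and is not taken from the Cyanogenmod tree. On 32-bit ARM, GCC's documentation notes that floating-point loops are only auto-vectorized with NEON when "-funsafe-math-optimizations" is also given, because NEON does not fully implement IEEE 754 (denormals are treated as zero). That is why the two flags appear together above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not from the ROM source: a simple loop with
 * independent iterations. "restrict" tells the compiler that dst and src
 * do not overlap, which makes the loop easier to vectorize. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    /* With -ftree-vectorize (plus -funsafe-math-optimizations on ARM NEON),
     * GCC may process several floats per iteration instead of one. */
    for (size_t i = 0; i < n; i++) {
        dst[i] += k * src[i];
    }
}
```

Whether a given loop actually gets vectorized depends on the GCC version and target options, so treat this only as a sketch of the idea.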

Note that with auto-vectorization turned on, the file external/libopus/celt/rate.c fails to compile because of a known compiler bug. The error is as follows:

target thumb C: libopus <= external/libopus/celt/rate.c
external/libopus/celt/rate.c: In function 'compute_allocation':
external/libopus/celt/rate.c:638:1: error: unrecognizable insn:
 }
 ^
(insn 1122 1121 1123 153 (set (reg:V4SI 1012)
        (unspec:V4SI [
                (const_vector:V4SI [
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                    ])
                (reg:V4SI 1008 [ vect_var_.64 ])
                (const_int 1 [0x1])
            ] UNSPEC_VCGE)) external/libopus/celt/rate.c:521 -1
     (nil))
external/libopus/celt/rate.c:638:1: internal compiler error: in extract_insn, at recog.c:2150


So I modified the code to disable auto vectorization for the compute_allocation function in external/libopus/celt/rate.c:

__attribute__((optimize("no-tree-vectorize")))
int compute_allocation(const CELTMode *m, int start, int end, const int *offsets, const int *cap, int alloc_trim, int *intensity, int *dual_stereo,
      opus_int32 total, opus_int32 *balance, int *pulses, int *ebits, int *fine_priority, int C, int LM, ec_ctx *ec, int encode, int prev, int signalBandwidth)
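The same per-function opt-out can be tried in isolation. Below is a minimal sketch, assuming GCC. The macro NO_TREE_VECTORIZE and the function count_at_least are names invented for this example and are not part of libopus. The macro expands to nothing on other compilers, so the code still builds there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: apply the attribute only on real GCC,
 * since other compilers may not understand optimize("..."). */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_TREE_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_TREE_VECTORIZE
#endif

/* Hypothetical function: a compare-and-count loop that the vectorizer
 * would normally try to turn into SIMD code. The attribute asks GCC to
 * skip tree vectorization for this one function only; the rest of the
 * file keeps whatever flags were passed on the command line. */
NO_TREE_VECTORIZE
int count_at_least(const int *values, size_t n, int threshold)
{
    assert(n == 0 || values != NULL);

    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] >= threshold) {
            count++;
        }
    }
    return count;
}
```

The result of the function is unchanged; only the code generation differs. Limiting the workaround to one function keeps auto-vectorization active everywhere else.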


After all the changes, clean the source tree (make clean) and clear the ccache so that nothing built with the old flags is reused. Then use the "breakfast" and "brunch" commands to build the zip. Flash the ROM as usual.

References:

ARM Floating point reference
gcc auto-vectorization
gcc optimization options
gcc ARM options