Run node.js on Older Android Devices

TL;DR: node.js crashes when run on Android API level 15 and below because libuv uses pthread_sigmask, which is broken on older versions of Android. If libuv is patched with the fix for that function, everything works fine.

As part of the journey to try and run node.js everywhere, I recently came across an interesting issue when running node.js on Android devices with API level 15 and below (that is, Android versions 4.0.4 and below, which apparently account for more than 10% of Android’s market share).

The ability to build and run node.js on the Android platform has been around for quite some time now, and given the node.js source code, a Linux machine and a copy of the NDK, it should be pretty straightforward.

However, when trying to run node.js on older Android devices, it seems to immediately crash with the following cryptic error message:


    I/DEBUG﹕ signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr deadbaad
    I/DEBUG﹕ r0 deadbaad  r1 00000001  r2 40000000  r3 00000000
    I/DEBUG﹕ r4 00000000  r5 00000027  r6 0000000a  r7 4aae8bf8
    I/DEBUG﹕ r8 00000004  r9 00000003  10 0000004d  fp 4b51c964
    I/DEBUG﹕ ip ffffffff  sp 4b51c930  lr 4001f121  pc 4001b880  cpsr 60000030
    I/DEBUG﹕ d0  0000000000000000  d1  0000000000000000
    I/DEBUG﹕ d2  0000000000000000  d3  4370000043708000
    I/DEBUG﹕ d4  0000000041c00000  d5  3f80000000000000
    I/DEBUG﹕ d6  0000000000000000  d7  0000000000000000
    I/DEBUG﹕ d8  0000000000000000  d9  0000000000000000
    I/DEBUG﹕ d10 0000000000000000  d11 0000000000000000
    I/DEBUG﹕ d12 0000000000000000  d13 0000000000000000
    I/DEBUG﹕ d14 0000000000000000  d15 0000000000000000
    I/DEBUG﹕ scr 60000012
    I/DEBUG﹕ #00  pc 00017880  /system/lib/libc.so
    I/DEBUG﹕ #01  lr 4001f121  /system/lib/libc.so
    I/DEBUG﹕ code around pc:
    I/DEBUG﹕ 4001b860 4623b15c 2c006824 e026d1fb b12368db
    I/DEBUG﹕ 4001b870 21014a17 6011447a 48124798 24002527
    I/DEBUG﹕ 4001b880 f7f47005 2106ee60 eeeef7f5 460aa901
    I/DEBUG﹕ 4001b890 f04f2006 94015380 94029303 eab8f7f5
    I/DEBUG﹕ 4001b8a0 4622a905 f7f52002 f7f4eac2 2106ee4c
    I/DEBUG﹕ code around lr:
    I/DEBUG﹕ 4001f100 41f0e92d 46804c0c 447c2600 68a56824
    I/DEBUG﹕ 4001f110 e0076867 300cf9b5 dd022b00 47c04628
    I/DEBUG﹕ 4001f120 35544306 37fff117 6824d5f4 d1ee2c00
    I/DEBUG﹕ 4001f130 e8bd4630 bf0081f0 000283da 41f0e92d
    I/DEBUG﹕ 4001f140 fb01b086 9004f602 461f4815 4615460c
    I/DEBUG﹕ stack:
    I/DEBUG﹕ 4b51c8f0  002d8448
    I/DEBUG﹕ 4b51c8f4  4004c568
    I/DEBUG﹕ 4b51c8f8  000000d0
    I/DEBUG﹕ 4b51c8fc  4004c5a8
    I/DEBUG﹕ 4b51c900  4004770c
    I/DEBUG﹕ 4b51c904  4004c85c
    I/DEBUG﹕ 4b51c908  00000000
    I/DEBUG﹕ 4b51c90c  4001f121  /system/lib/libc.so

Unfortunately, the log doesn’t give any information on the source of the error, just a reference to the standard C library (libc), and there’s not a lot we can do with that.
In such cases, there are basically two things I try to do:

  1. Try to debug the thing
  2. Add logs everywhere

Since node.js’s source code is pretty big, the first option seemed more promising.
It took some twisting and turning, but after a day or two I was able to make ndk-gdb work with node.js on Android, which means that I could now set breakpoints and inspect local variable values, among other things.

There is plenty of documentation out there on how to get ndk-gdb working, so we’re not gonna spend any time on this part. The main advice I can give you about running ndk-gdb is to pay close attention to its error messages, and don’t be afraid to change the script to make it work for your specific app.

After spending some time on setting up some breakpoints in various code paths in node, I was able to narrow down the source of the SIGSEGV signal to line 103 in libuv’s signal.c:


.....
static void uv__signal_block_and_lock(sigset_t* saved_sigmask) {
    sigset_t new_mask;
    if (sigfillset(&new_mask))
        abort();
    if (pthread_sigmask(SIG_SETMASK, &new_mask, saved_sigmask))
        abort();  // line 103
    if (uv__signal_lock())
        abort();
}
....

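After inspecting the return value of the call to pthread_sigmask, it turns out that it always fails with a return value of 22, or EINVAL, which causes the second if clause to call abort. On these older Android versions, libc’s abort deliberately writes to the invalid address 0xdeadbaad, which results in the SIGSEGV (fault addr deadbaad) we were seeing earlier.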
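
If you don’t have a debugger attached, the same check can be done with a small helper. The following is only a sketch: try_block_all_signals and restore_signals are hypothetical names that are not part of libuv. It mirrors what uv__signal_block_and_lock does, but reports the error code instead of calling abort. Note that pthread_sigmask returns the error number directly rather than setting errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose pthread_sigmask under strict -std modes */
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libuv): block every signal the same way
 * uv__signal_block_and_lock does, but report a failure instead of aborting.
 * Returns 0 on success, or the error number reported by the failing call. */
int try_block_all_signals(sigset_t *saved_sigmask) {
    sigset_t new_mask;
    int rc;

    /* The caller needs the saved mask to restore it later. */
    assert(saved_sigmask != NULL);

    if (sigfillset(&new_mask) != 0)
        return errno;

    /* pthread_sigmask returns the error number directly (it does not set errno). */
    rc = pthread_sigmask(SIG_SETMASK, &new_mask, saved_sigmask);
    if (rc != 0)
        fprintf(stderr, "pthread_sigmask failed: %d (%s)\n", rc, strerror(rc));
    return rc;
}

/* Restore a mask previously saved by try_block_all_signals. */
int restore_signals(const sigset_t *saved_sigmask) {
    return pthread_sigmask(SIG_SETMASK, saved_sigmask, NULL);
}
```

On a desktop Linux machine both calls should simply return 0. On an affected Android device, I would expect the first call to log the 22 (EINVAL) described above.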

After some more digging, it turns out that pthread_sigmask not working on Android API <= 15 is a known issue!

Looking at the change set that fixed this issue for API level 16, it seems like it’s a rather small change that we can try and incorporate into libuv’s signal.c.

We start by adding the fix from the Android source above, plus a new pthread_sigmask_patched function which first tries to call the system’s pthread_sigmask and, if that fails with EINVAL, calls the fixed version of pthread_sigmask instead.


/* signal.c code here... */
// --- Start of Android platform fix --
/* Despite the fact that our kernel headers define sigset_t explicitly
 * as a 32-bit integer, the kernel system call really expects a 64-bit
 * bitmap for the signal set, or more exactly an array of two-32-bit
 * values (see $KERNEL/arch/$ARCH/include/asm/signal.h for details).
 *
 * Unfortunately, we cannot fix the sigset_t definition without breaking
 * the C library ABI, so perform a little runtime translation here.
*/
typedef union {
    sigset_t   bionic;
    uint32_t   kernel[2];
} kernel_sigset_t;
/* this is a private syscall stub */
extern int __rt_sigprocmask(int, const kernel_sigset_t *, kernel_sigset_t *, size_t);
int pthread_sigmask_android16(int how, const sigset_t *set, sigset_t *oset)
{
    int ret, old_errno = errno;
    /* We must convert *set into a kernel_sigset_t */
    kernel_sigset_t  in_set, *in_set_ptr;
    kernel_sigset_t  out_set;
    in_set.kernel[0]  = in_set.kernel[1]  =  0;
    out_set.kernel[0] = out_set.kernel[1] = 0;
    /* 'in_set_ptr' is the second parameter to __rt_sigprocmask. It must be NULL
     * if 'set' is NULL to ensure correct semantics (which in this case would
     * be to ignore 'how' and return the current signal set into 'oset').
     */
    if (set == NULL) {
        in_set_ptr = NULL;
    } else {
        in_set.bionic = *set;
        in_set_ptr = &in_set;
    }
    ret = __rt_sigprocmask(how, in_set_ptr, &out_set, sizeof(kernel_sigset_t));
    if (ret < 0)
        ret = errno;
    if (oset)
        *oset = out_set.bionic;
    errno = old_errno;
    return ret;
}
// --- End of Android platform fix --
// First try the system's pthread_sigmask; if it fails with EINVAL, retry with the API 16 fix.
int pthread_sigmask_patched(int how, const sigset_t *set, sigset_t *oset) {
    int ret = pthread_sigmask(how, set, oset);
    if (ret == EINVAL) {
        return pthread_sigmask_android16(how, set, oset);
    }
    // Success, or an error other than EINVAL: pass the result through unchanged.
    return ret;
}
/* more signal.c code here... */
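
The fallback path can’t be triggered on a machine where pthread_sigmask works, so here is a rough sketch of the same "try, then retry on EINVAL" logic with the two implementations passed in as function pointers. The names sigmask_fn, sigmask_always_einval and sigmask_with_fallback are made up for this illustration and are not part of libuv or Android. The stub that always returns EINVAL stands in for the broken system function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose pthread_sigmask under strict -std modes */
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Signature shared by pthread_sigmask and any replacement for it. */
typedef int (*sigmask_fn)(int how, const sigset_t *set, sigset_t *oset);

/* Hypothetical stand-in for the broken system function: it always fails
 * with EINVAL, which is the behaviour described for API level 15 and below. */
int sigmask_always_einval(int how, const sigset_t *set, sigset_t *oset) {
    (void)how;
    (void)set;
    (void)oset;
    return EINVAL;
}

/* Call `primary` first and retry with `fallback` only if primary reports
 * EINVAL. Any other result, success included, is returned unchanged. */
int sigmask_with_fallback(sigmask_fn primary, sigmask_fn fallback,
                          int how, const sigset_t *set, sigset_t *oset) {
    int ret;

    assert(primary != NULL);

    ret = primary(how, set, oset);
    if (ret == EINVAL && fallback != NULL)
        return fallback(how, set, oset);
    return ret;
}
```

Calling sigmask_with_fallback(sigmask_always_einval, pthread_sigmask, ...) on a desktop Linux box exercises the retry branch, which is the branch pthread_sigmask_patched takes on an old Android device.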

Additionally, we change the two functions in signal.c that use pthread_sigmask to use the patched version instead:


static void uv__signal_block_and_lock(sigset_t* saved_sigmask) {
    sigset_t new_mask;
    if (sigfillset(&new_mask))
        abort();
    // Code was changed here in order to fix android API <= 15 broken pthread_sigmask issue
    // original code called directly pthread_sigmask
    if (pthread_sigmask_patched(SIG_SETMASK, &new_mask, saved_sigmask))
        abort();
    if (uv__signal_lock())
        abort();
}
static void uv__signal_unlock_and_unblock(sigset_t* saved_sigmask) {
    if (uv__signal_unlock())
        abort();
    // Code was changed here in order to fix android API <= 15 broken pthread_sigmask issue
    // original code called directly pthread_sigmask
    if (pthread_sigmask_patched(SIG_SETMASK, saved_sigmask, NULL))
        abort();
}

Compiling and trying again to run node.js…and guess what? node starts as expected, no crashes, and everything seems to work fine!

Pretty miraculously, this was everything needed in order to make node.js run on older Android versions!

Building Node.js for Android

The good news is that Node.js does run on Android. The bad news is that, at least at the time I’m writing this, the build process requires a few extra steps. Nothing too scary though. See below for details.

Building Node.js for Android

  1. Go find a Linux machine or maybe a Mac.

    These instructions don’t currently work on Windows due to issues with the sh scripts being used. Yes, I did try the scripts in MINGW32, and no, it didn’t work.

  2. Go download the Android NDK.

    Which NDK to download does take a bit of attention. Most Android devices today are 32-bit, so I want the Platform (32-bit target). But my Linux OS (Elementary OS) is 64-bit, so I want Linux 64-bit (x86) under Platform (32-bit target).

  3. After downloading the NDK unzip it.

    Let’s assume you put the NDK into ~/android-ndk-r10b.

  4. Go clone node.

    Let’s assume you put that into ~/node. I am running these instructions off master branch.

  5. Check that you have all of node’s dependencies as listed here

    I believe any modern Linux distro will have all of these already but just in case I decided to include the link.

  6. Go edit ~/node/android-configure and change ’arm-linux-androideabi-4.7’ to ’arm-linux-androideabi-4.8’.

    This is the pull request that added basic Android support to Node. It contains some instructions. The first instruction sets up the build environment for Android. But the setup script is designed for an older version of the Android NDK, so we need to update it. Specifically, 4.7 is apparently not supported by NDK 10, so I switched it to 4.8, which is. I decided to leave platform=android-9 for no particularly good reason.

  7. Run from inside of ~/node directory the command “source ./android-configure ~/android-ndk-r10b”
  8. Now go to ~/node/android-toolchain/bin and issue the command “mv python2.7 oldpython2.7 && ln -s /usr/bin/python2.7 python2.7”

    The NDK appears to ship with its own version of Python 2.7 that doesn’t support a library (bz2) that is needed by files in the NDK. In any sane world this just means that the NDK is broken but I’m sure there is some logic here. This bug was reported to Node (since it breaks Node’s support of Android) but they responded that this is an NDK issue so Google should deal with it. But if we want to build we have to get connected to a version of Python that does support bz2. That’s what we did above. We linked the main version of Python (which any sane Linux distro will use) with the NDK so it will use that and hence support bz2.

  9. Now go to ~/node and issue ’make’

    The actual instructions from the checkin say to run ’make -j8’ which enables parallel capabilities in Make. Apparently the rule of thumb is to set the value after j to 2x the number of hardware threads available on the machine.

Using Node.js on Android via ADB

Eventually I’ll write up an AAR that just wraps all the Node stuff and provides a standard API for launching node and feeding it a script. But that isn’t my current priority so instead I need to just get node onto my device and play with it.

  1. Issue the command “adb push ~/node/out/Release /data/local/tmp/Release”
    • There is a step I’m skipping here. I actually do my development on Windows. So I copy the Release folder from my Linux VM (via Virtualbox) and then use the linked drive to move it to my Windows box. So in fact my adb push command above isn’t from the Linux location but my Windows location.
    • The out/Release folder contains all the build artifacts for Node. Of this mess I suspect only the node executable is actually needed. But for the moment I’m going to play it safe and just move everything over.
    • The reason for putting the node materials into /data/local/tmp/Release is because /data/local/tmp is one of the few areas where we can execute the chmod command in the next step and make Node executable. But when we wrap this thing up in an AAR we can actually use the setExecutable function instead.
  2. Issue “adb shell”. Once in the shell issue “chmod 700 /data/local/tmp/Release/node”
  3. I then issued an ’adb push’ for a simple hello world node program I have that I put inside of /data/local/tmp
    • I used “Hello HTTP” from http://howtonode.org/hello-node
  4. Then I went in via “adb shell” and ran “/data/local/tmp/Release/node helloworld.js”
    • And yes, it worked! I even tested it by going to the browser on the phone and navigating to http://localhost:8000.
  5. To kill things I just hit ctrl-c, which kills the adb shell but also the node app. Good enough for now.

What about NPM?

In theory one should be able to use NPM on the Linux box and then just move the whole thing over to Android and run it there. But this only works if none of the dependencies use an add-on. An add-on requires compiling C code into a form Android can handle. It looks like NPM wants to support making this happen but so far I haven’t found the right voodoo. So I’m still investigating.