
Understanding the Android Boot Process

Preface

The previous article gave a rough overview of how Linux boots. The process generally goes through these stages:

  1. Power-on self-test (POST)
  2. The bootloader starts
  3. The bootloader starts the kernel

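Once the kernel has started, it runs the first user-space program, init. The init process then forks many core system processes, each providing its own service. The zygote process, the one most closely tied to Android, is also forked directly by init, as shown in the figure below:

[Figure: the init process forking the core system processes, including zygote]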

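zygote then spawns system_server, the core Android process. Many core Android services, such as ActivityManagerService and WindowManagerService, live inside the system_server process as threads and serve apps (the clients), as shown in the figure below:
[Figure: system services running as threads inside the system_server process]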

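The two figures show that Linux is the foundation of Android. Without the basic services Linux provides (memory management, process scheduling, file systems, and so on), there would be no Android. In that sense, Android is just an application running on Linux. Of course, Android is based on Linux rather than being Linux: on top of Linux it defines a development "language" entirely unlike that of ordinary Linux applications, letting developers build interesting apps in Java and thereby lowering the barrier to entry (C++ is often criticized for memory leaks and dangling pointers). This "language" includes many concepts, such as Activity, Service, Broadcast, ContentProvider, Window, Surface, and View, which later articles will cover one by one. Today we look at the Android boot process from the code's point of view. The source code in this article is Android 2.3.7_r1.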

Analysis

Creating the zygote process

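The main function in init.c first parses the rc files and stores each service in a list. It later calls service_start, which uses fork to create a new process. After that it enters an infinite loop that keeps polling for events and dispatches each one to its handler, which is how init reacts to property-set requests, keychords, and signals.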

system/core/rootdir/init.rc

on early-init
start ueventd

on init

sysclktz 0

loglevel 3

# setup the global environment
export PATH /sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin
export LD_LIBRARY_PATH /vendor/lib:/system/lib
export ANDROID_BOOTLOGO 1
export ANDROID_ROOT /system
export ANDROID_ASSETS /system/app
export ANDROID_DATA /data
export EXTERNAL_STORAGE /mnt/sdcard
export ASEC_MOUNTPOINT /mnt/asec
export LOOP_MOUNTPOINT /mnt/obb
export BOOTCLASSPATH /system/framework/core.jar:/system/framework/bouncycastle.jar:/system/framework/ext.jar:/system/framework/framework.jar:/system/framework/android.policy.jar:/system/framework/services.jar:/system/framework/core-junit.jar

# (part of the source omitted)

on boot
# basic network init
ifup lo
hostname localhost
domainname localdomain

# (part of the source omitted)
service ril-daemon /system/bin/rild
socket rild stream 660 root radio
socket rild-debug stream 660 radio system
user root
group radio cache inet misc audio sdcard_rw

service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
socket zygote stream 666
onrestart write /sys/android_power/request_state wake
onrestart write /sys/power/state on
onrestart restart media
onrestart restart netd

service media /system/bin/mediaserver
user media
group system audio camera graphics inet net_bt net_bt_admin net_raw
ioprio rt 4
# (part of the source omitted)

system/core/init/init.c

// Start a service; the new process is created with fork
void service_start(struct service *svc, const char *dynamic_args)
{
struct stat s;
pid_t pid;
int needs_console;
int n;
// (part of the source omitted)
NOTICE("starting '%s'\n", svc->name);

pid = fork();

if (pid == 0) {
struct socketinfo *si;
struct svcenvinfo *ei;
char tmp[32];
int fd, sz;

if (properties_inited()) {
get_property_workspace(&fd, &sz);
sprintf(tmp, "%d,%d", dup(fd), sz);
add_environment("ANDROID_PROPERTY_WORKSPACE", tmp);
}

for (ei = svc->envvars; ei; ei = ei->next)
add_environment(ei->name, ei->value);

for (si = svc->sockets; si; si = si->next) {
int socket_type = (
!strcmp(si->type, "stream") ? SOCK_STREAM :
(!strcmp(si->type, "dgram") ? SOCK_DGRAM : SOCK_SEQPACKET));
int s = create_socket(si->name, socket_type,
si->perm, si->uid, si->gid);
if (s >= 0) {
publish_socket(si->name, s);
}
}

if (svc->ioprio_class != IoSchedClass_NONE) {
if (android_set_ioprio(getpid(), svc->ioprio_class, svc->ioprio_pri)) {
ERROR("Failed to set pid %d ioprio = %d,%d: %s\n",
getpid(), svc->ioprio_class, svc->ioprio_pri, strerror(errno));
}
}
// (part of the source omitted)
}

if (pid < 0) {
ERROR("failed to start '%s'\n", svc->name);
svc->pid = 0;
return;
}

svc->time_started = gettime();
svc->pid = pid;
svc->flags |= SVC_RUNNING;

if (properties_inited())
notify_service_state(svc->name, "running");
}

int main(int argc, char **argv)
{
int fd_count = 0;
struct pollfd ufds[4];
char *tmpdev;
char* debuggable;
char tmp[32];
int property_set_fd_init = 0;
int signal_fd_init = 0;
int keychord_fd_init = 0;

if (!strcmp(basename(argv[0]), "ueventd"))
return ueventd_main(argc, argv);

/* clear the umask */
umask(0);

/* Get the basic filesystem setup we need put
* together in the initramdisk on / and then we'll
* let the rc file figure out the rest.
*/
mkdir("/dev", 0755);
mkdir("/proc", 0755);
mkdir("/sys", 0755);

mount("tmpfs", "/dev", "tmpfs", 0, "mode=0755");
mkdir("/dev/pts", 0755);
mkdir("/dev/socket", 0755);
mount("devpts", "/dev/pts", "devpts", 0, NULL);
mount("proc", "/proc", "proc", 0, NULL);
mount("sysfs", "/sys", "sysfs", 0, NULL);

/* We must have some place other than / to create the
* device nodes for kmsg and null, otherwise we won't
* be able to remount / read-only later on.
* Now that tmpfs is mounted on /dev, we can actually
* talk to the outside world.
*/
open_devnull_stdio();
log_init();

INFO("reading config file\n");

// Parsing of init.rc starts here; the implementation is in init_parser.c in the same directory
init_parse_config_file("/init.rc");

/* pull the kernel commandline and ramdisk properties file in */
import_kernel_cmdline(0);

get_hardware_name(hardware, &revision);
snprintf(tmp, sizeof(tmp), "/init.%s.rc", hardware);
init_parse_config_file(tmp);
// (part of the source omitted)
/* execute all the boot actions to get us started */
action_for_each_trigger("early-boot", action_add_queue_tail);
// This stage corresponds to "on boot" in init.rc; it is where the zygote process gets started
action_for_each_trigger("boot", action_add_queue_tail);
/* run all property triggers based on current state of the properties */
queue_builtin_action(queue_property_triggers_action, "queue_propety_triggers");
// (part of the source omitted)
for(;;) {
int nr, i, timeout = -1;
// (part of the source omitted)
nr = poll(ufds, fd_count, timeout);
if (nr <= 0)
continue;

for (i = 0; i < fd_count; i++) {
if (ufds[i].revents == POLLIN) {
if (ufds[i].fd == get_property_set_fd())
handle_property_set_fd();
else if (ufds[i].fd == get_keychord_fd())
handle_keychord();
else if (ufds[i].fd == get_signal_fd())
handle_signal();
}
}
}

return 0;
}

init_parser.c

static void parse_config(const char *fn, char *s)
{
struct parse_state state;
char *args[INIT_PARSER_MAXARGS];
int nargs;

nargs = 0;
state.filename = fn;
state.line = 1;
state.ptr = s;
state.nexttoken = 0;
state.parse_line = parse_line_no_op;
for (;;) {
switch (next_token(&state)) {
case T_EOF:
state.parse_line(&state, 0, 0);
return;
case T_NEWLINE:
if (nargs) {
int kw = lookup_keyword(args[0]);
if (kw_is(kw, SECTION)) {
state.parse_line(&state, 0, 0);
// Let's see what this call does next
parse_new_section(&state, kw, nargs, args);
} else {
state.parse_line(&state, nargs, args);
}
nargs = 0;
}
break;
case T_TEXT:
if (nargs < INIT_PARSER_MAXARGS) {
args[nargs++] = state.text;
}
break;
}
}
}

void parse_new_section(struct parse_state *state, int kw,
int nargs, char **args)
{
printf("[ %s %s ]\n", args[0],
nargs > 1 ? args[1] : "");
switch(kw) {
case K_service:
// Let's continue with what parse_service does
state->context = parse_service(state, nargs, args);
if (state->context) {
state->parse_line = parse_line_service;
return;
}
break;
case K_on:
state->context = parse_action(state, nargs, args);
if (state->context) {
state->parse_line = parse_line_action;
return;
}
break;
}
state->parse_line = parse_line_no_op;
}


// It is quite simple: parse the service and append it to the global service_list
static void *parse_service(struct parse_state *state, int nargs, char **args)
{
struct service *svc;
if (nargs < 3) {
parse_error(state, "services must have a name and a program\n");
return 0;
}
if (!valid_name(args[1])) {
parse_error(state, "invalid service name '%s'\n", args[1]);
return 0;
}

svc = service_find_by_name(args[1]);
if (svc) {
parse_error(state, "ignored duplicate definition of service '%s'\n", args[1]);
return 0;
}

nargs -= 2;
svc = calloc(1, sizeof(*svc) + sizeof(char*) * nargs);
if (!svc) {
parse_error(state, "out of memory\n");
return 0;
}
svc->name = args[1];
svc->classname = "default";
memcpy(svc->args, args + 2, sizeof(char*) * nargs);
svc->args[nargs] = 0;
svc->nargs = nargs;
svc->onrestart.name = "onrestart";
list_init(&svc->onrestart.commands);
list_add_tail(&service_list, &svc->slist);
return svc;
}

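init_parser.c shows that the services declared in init.rc are placed on service_list. So when are they started? The answer is in the main function of init.c: when the boot trigger runs (the on boot stage), the services parsed from init.rc are started.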
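The parsing step can be pictured with a small Java sketch. It is only an illustration of the idea (a "service" line opens a record, the lines that follow are its options, and every record lands on one list); the class InitRcSketch, its Service type, and the simplified whitespace tokenizer are assumptions made for this example and do not mirror init_parser.c line by line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of init's parser: each "service" line opens a new record. */
class InitRcSketch {
    /** Hypothetical stand-in for init's struct service. */
    static final class Service {
        final String name;
        final List<String> args;                           // program path plus its arguments
        final List<String> options = new ArrayList<>();    // e.g. "socket ...", "onrestart ..."

        Service(String name, List<String> args) {
            this.name = name;
            this.args = args;
        }
    }

    /** Collects every service section into one list, playing the role of service_list. */
    static List<Service> parse(String rcText) {
        List<Service> serviceList = new ArrayList<>();
        Service current = null;                            // section currently being filled
        for (String rawLine : rcText.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;                                  // skip blank lines and comments
            }
            String[] tokens = line.split("\\s+");
            if (tokens[0].equals("service")) {
                if (tokens.length < 3) {
                    current = null;                        // a service needs a name and a program
                    continue;
                }
                current = new Service(tokens[1],
                        Arrays.asList(tokens).subList(2, tokens.length));
                serviceList.add(current);
            } else if (tokens[0].equals("on")) {
                current = null;                            // action sections are ignored in this sketch
            } else if (current != null) {
                current.options.add(line);                 // option line of the open service
            }
        }
        return serviceList;
    }

    public static void main(String[] args) {
        String rc = String.join("\n",
                "on boot",
                "ifup lo",
                "service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server",
                "socket zygote stream 666",
                "onrestart restart media",
                "service media /system/bin/mediaserver",
                "user media");
        for (Service s : parse(rc)) {
            System.out.println(s.name + " -> " + s.args + " " + s.options);
        }
    }
}
```

Unlike the real parser, this sketch does not reject duplicate service names or validate them; it only shows how a flat text file becomes a list of service records that can be started later.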

The main function in init.c

/* execute all the boot actions to get us started */
action_for_each_trigger("early-boot", action_add_queue_tail);
action_for_each_trigger("boot", action_add_queue_tail);
/* run all property triggers based on current state of the properties */
queue_builtin_action(queue_property_triggers_action, "queue_propety_triggers");

The init.rc file shows that several of the most important Android services, zygote, servicemanager, and media, are all forked from the init process, and each of them runs as its own process.

From the C world to the C++ world

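As described above, init starts the zygote process with fork and then launches the /system/bin/app_process program through execve (once running, the program renames its own process to zygote via prctl).
The entry point of app_process is frameworks/base/cmds/app_process/app_main.cpp, which we analyze next.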

/*
* Main entry of app process.
*
* Starts the interpreted runtime, then starts up the application.
*
*/

#define LOG_TAG "appproc"

#include <binder/IPCThreadState.h>
#include <binder/ProcessState.h>
#include <utils/Log.h>
#include <cutils/process_name.h>
#include <cutils/memory.h>
#include <android_runtime/AndroidRuntime.h>

#include <stdio.h>
#include <unistd.h>

namespace android {

void app_usage()
{
fprintf(stderr,
"Usage: app_process [java-options] cmd-dir start-class-name [options]\n");
}

status_t app_init(const char* className, int argc, const char* const argv[])
{
LOGV("Entered app_init()!\n");

AndroidRuntime* jr = AndroidRuntime::getRuntime();
// Core logic: invoke the main method of the class named by className
jr->callMain(className, argc, argv);

LOGV("Exiting app_init()!\n");
return NO_ERROR;
}

class AppRuntime : public AndroidRuntime
{
public:
AppRuntime()
: mParentDir(NULL)
, mClassName(NULL)
, mArgC(0)
, mArgV(NULL)
{
}

// (part of the source omitted)
virtual void onStarted()
{
sp<ProcessState> proc = ProcessState::self();
if (proc->supportsProcesses()) {
LOGV("App process: starting thread pool.\n");
proc->startThreadPool();
}

// Core logic: mClassName is the start class stored on this runtime
app_init(mClassName, mArgC, mArgV);

if (ProcessState::self()->supportsProcesses()) {
IPCThreadState::self()->stopProcess();
}
}

// (part of the source omitted)
const char* mParentDir;
const char* mClassName;
int mArgC;
const char* const* mArgV;
};

}

using namespace android;

/*
* sets argv0 to as much of newArgv0 as will fit
*/
static void setArgv0(const char *argv0, const char *newArgv0)
{
strlcpy(const_cast<char *>(argv0), newArgv0, strlen(argv0));
}

int main(int argc, const char* const argv[])
{

// (part of the source omitted)
AppRuntime runtime;
const char *arg;
const char *argv0 = argv[0];
// Everything up to '--' or first non '-' arg goes to the vm
int i = runtime.addVmArguments(argc, argv);
// Next arg is startup classname or "--zygote"
if (i < argc) {
arg = argv[i++];
if (0 == strcmp("--zygote", arg)) {
bool startSystemServer = (i < argc) ?
strcmp(argv[i], "--start-system-server") == 0 : false;
setArgv0(argv0, "zygote");

// Set the process name to zygote
set_process_name("zygote");

// Core logic: start the runtime with ZygoteInit as the start class
runtime.start("com.android.internal.os.ZygoteInit",
startSystemServer);
}
}
// (part of the source omitted)
}

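The code clearly shows that main calls runtime.start to launch the core Java class ZygoteInit, which in turn calls ZygoteInit's main method. At that point the program moves from the C++ world to the Java world.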

From C++ to the Java world

You may wonder what runtime.start actually does to get the program into the Java world. Let's look at it next.

/*
* Start the Android runtime. This involves starting the virtual machine
* and calling the "static void main(String[] args)" method in the class
* named by "className".
*/
void AndroidRuntime::start(const char* className, const bool startSystemServer)
{

// (part of the logic omitted)
// Core step 1: start the virtual machine
/* start the virtual machine */
if (startVm(&mJavaVM, &env) != 0)
goto bail;

/*
* Register android functions.
*/
// Core step 2: register the JNI methods that back many of Android's Java classes
if (startReg(env) < 0) {
LOGE("Unable to register all android natives\n");
goto bail;
}

/*
* We want to call main() with a String array with arguments in it.
* At present we only have one argument, the class name. Create an
* array to hold it.
*/
jclass stringClass;
jobjectArray strArray;
jstring classNameStr;
jstring startSystemServerStr;

// Core step 3: now that env exists, Java classes can be loaded through it
stringClass = env->FindClass("java/lang/String");
assert(stringClass != NULL);
strArray = env->NewObjectArray(2, stringClass, NULL);
assert(strArray != NULL);
classNameStr = env->NewStringUTF(className);
assert(classNameStr != NULL);
env->SetObjectArrayElement(strArray, 0, classNameStr);
startSystemServerStr = env->NewStringUTF(startSystemServer ?
"true" : "false");
env->SetObjectArrayElement(strArray, 1, startSystemServerStr);

/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
jclass startClass;
jmethodID startMeth;

slashClassName = strdup(className);
for (cp = slashClassName; *cp != '\0'; cp++)
if (*cp == '.')
*cp = '/';

startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
LOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
LOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);

#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
// (part of the source omitted)
}

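The start method has two core steps, startVm and startReg, after which it can call the main method of the start class. One point is confusing here: start itself looks up main and calls it once: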

startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");

while AppRuntime.onStarted(), which we analyzed earlier, also ends up calling a main method through app_init:

status_t app_init(const char* className, int argc, const char* const argv[])
{
LOGV("Entered app_init()!\n");

AndroidRuntime* jr = AndroidRuntime::getRuntime();
// Core logic: invoke the main method of the class named by className
jr->callMain(className, argc, argv);

LOGV("Exiting app_init()!\n");
return NO_ERROR;
}

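So which one is called first? Could main be called twice? That is left for you to think about. Once ZygoteInit's main method has been called, the program has moved from the C++ world to the Java world!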
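The lookup that start performs through JNI (FindClass, GetStaticMethodID, CallStaticVoidMethod) has a close analogue in plain Java reflection. The sketch below is only that analogue, not what the runtime executes; StartClassSketch, callMain, and the Demo start class are names invented for this example.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

/** Reflection analogue of the JNI calls used by AndroidRuntime::start. */
class StartClassSketch {
    /** Converts a dotted class name to the slash-separated form that FindClass expects. */
    static String toSlashName(String className) {
        return className.replace('.', '/');
    }

    /** Finds "public static void main(String[])" on the named class and invokes it. */
    static void callMain(String className, String[] args) throws Exception {
        Class<?> startClass = Class.forName(className);                  // ~ FindClass
        Method main = startClass.getMethod("main", String[].class);     // ~ GetStaticMethodID
        main.invoke(null, (Object) args);                                // ~ CallStaticVoidMethod
    }

    /** Hypothetical start class, standing in for ZygoteInit in this demo. */
    public static final class Demo {
        public static void main(String[] args) {
            System.out.println("Demo.main called with " + Arrays.toString(args));
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toSlashName("com.android.internal.os.ZygoteInit"));
        // Same argument layout as strArray in start(): class name, then "true"/"false".
        callMain(Demo.class.getName(),
                new String[] {"com.android.internal.os.ZygoteInit", "true"});
    }
}
```

The cast to Object in main.invoke matters: without it, the String[] would be spread across the varargs parameter instead of being passed as the single array argument that main expects.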

Analyzing ZygoteInit.main

ZygoteInit.main looks simple, but it does a great deal.

public static void main(String argv[]) {
try {
VMRuntime.getRuntime().setMinimumHeapSize(5 * 1024 * 1024);

// Start profiling the zygote initialization.
SamplingProfilerIntegration.start();

// Core step 1: create the zygote server socket
registerZygoteSocket();
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());

// Core step 2: preload Java classes, mainly those in framework.jar, using Class.forName()
preloadClasses();
//cacheRegisterMaps();

// Core step 3: preload resources such as R.string.xxx and R.color.xxx so that apps can use Android's built-in resources
preloadResources();
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());

// Finish profiling the zygote initialization.
SamplingProfilerIntegration.writeZygoteSnapshot();

// Do an initial gc to clean up after startup
gc();

// If requested, start system server directly from Zygote
if (argv.length != 2) {
throw new RuntimeException(argv[0] + USAGE_STRING);
}

if (argv[1].equals("true")) {
// Core step 4: start SystemServer; Context.getSystemService only works once it is running
startSystemServer();
} else if (!argv[1].equals("false")) {
throw new RuntimeException(argv[0] + USAGE_STRING);
}

Log.i(TAG, "Accepting command socket connections");

if (ZYGOTE_FORK_MODE) {
runForkMode();
} else {
// Core step 5: wait for process-creation requests; every new app is started by talking to the zygote process over this socket
runSelectLoopMode();
}

closeServerSocket();
} catch (MethodAndArgsCaller caller) {
caller.run();
} catch (RuntimeException ex) {
Log.e(TAG, "Zygote died with exception", ex);
closeServerSocket();
throw ex;
}
}

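Besides preloading classes, zygote also preloads Android's resources. Because every client process is forked from zygote, each one inherits the preloaded classes and resources, which shortens app startup time. It is a clever piece of design. The main function also starts the SystemServer process: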

/**
* Finish remaining work for the newly forked system server process.
*/
private static void handleSystemServerProcess(
ZygoteConnection.Arguments parsedArgs)
throws ZygoteInit.MethodAndArgsCaller {

closeServerSocket();

/*
* Pass the remaining arguments to SystemServer.
* "--nice-name=system_server com.android.server.SystemServer"
*/
RuntimeInit.zygoteInit(parsedArgs.remainingArgs);
/* should never reach here */
}

/**
* Prepare the arguments and fork for the system server process.
*/
private static boolean startSystemServer()
throws MethodAndArgsCaller, RuntimeException {
/* Hardcoded command line to start the system server */
String args[] = {
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,3001,3002,3003",
"--capabilities=130104352,130104352",
"--runtime-init",
"--nice-name=system_server",
"com.android.server.SystemServer",
};
ZygoteConnection.Arguments parsedArgs = null;

int pid;

try {
parsedArgs = new ZygoteConnection.Arguments(args);

/*
* Enable debugging of the system process if *either* the command line flags
* indicate it should be debuggable or the ro.debuggable system property
* is set to "1"
*/
int debugFlags = parsedArgs.debugFlags;
if ("1".equals(SystemProperties.get("ro.debuggable")))
debugFlags |= Zygote.DEBUG_ENABLE_DEBUGGER;

/* Request to fork the system server process */
pid = Zygote.forkSystemServer(
parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids, debugFlags, null,
parsedArgs.permittedCapabilities,
parsedArgs.effectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}

/* For child process */
if (pid == 0) {
handleSystemServerProcess(parsedArgs);
}

return true;
}

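The zygote process forks the system_server process. system_server first closes the server socket it inherited from zygote, because it is not responsible for accepting socket requests to start new apps. It then calls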

RuntimeInit.zygoteInit(parsedArgs.remainingArgs);

Note that this call to zygoteInit already runs inside the SystemServer process. Let's look at zygoteInit:

public static final void zygoteInit(String[] argv)
throws ZygoteInit.MethodAndArgsCaller {

// (part of the code omitted)
// Remaining arguments are passed to the start class's static main

// Note: the class passed in here is com.android.server.SystemServer
String startClass = argv[curArg++];
String[] startArgs = new String[argv.length - curArg];

System.arraycopy(argv, curArg, startArgs, 0, startArgs.length);
invokeStaticMain(startClass, startArgs);
}

The code shows that this calls SystemServer's main method.
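Both zygoteInit and runOnce declare throws ZygoteInit.MethodAndArgsCaller, and ZygoteInit.main catches that exception and calls caller.run(). The Java sketch below illustrates this "trampoline" idea under the assumption that invokeStaticMain resolves main by reflection and throws instead of calling it; the class names TrampolineSketch, MethodAndArgs, and Target are invented for the example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Minimal model of the exception-based trampoline used to reach a start class's main. */
class TrampolineSketch {
    /** Carries the resolved main method up the stack, in the spirit of MethodAndArgsCaller. */
    static final class MethodAndArgs extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodAndArgs(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                method.invoke(null, (Object) args);        // call static main(String[])
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        }
    }

    /** Resolves main on the start class, then throws instead of calling it. */
    static void invokeStaticMain(String className, String[] args) throws MethodAndArgs {
        Method main;
        try {
            main = Class.forName(className).getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            throw new RuntimeException("cannot resolve main in " + className, e);
        }
        throw new MethodAndArgs(main, args);
    }

    /** Hypothetical start class for the demo. */
    public static final class Target {
        static String lastArg;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : null;
            System.out.println("Target.main running with " + lastArg);
        }
    }

    public static void main(String[] args) {
        try {
            invokeStaticMain(Target.class.getName(), new String[] {"hello"});
        } catch (MethodAndArgs caller) {
            caller.run();   // main runs here, after the setup frames have been unwound
        }
    }
}
```

The point of throwing is that the frames used to set up the process are unwound before the start class's main runs, so main executes from the top-level catch block rather than deep inside the setup call chain.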

public static void main(String[] args) {
// (part of the code omitted)
System.loadLibrary("android_servers");
init1(args);
}

SystemServer.main is also very simple. It loads the libandroid_servers.so library, whose source lives under frameworks/base/services/jni, and then calls init1, which is a native method implemented in com_android_server_SystemServer.cpp.

namespace android {

extern "C" int system_init();

static void android_server_SystemServer_init1(JNIEnv* env, jobject clazz)
{
system_init();
}

/*
* JNI registration.
*/
static JNINativeMethod gMethods[] = {
/* name, signature, funcPtr */
{ "init1", "([Ljava/lang/String;)V", (void*) android_server_SystemServer_init1 },
};

int register_android_server_SystemServer(JNIEnv* env)
{
return jniRegisterNativeMethods(env, "com/android/server/SystemServer",
gMethods, NELEM(gMethods));
}

}; // namespace android

After this detour, the call continues into system_init, implemented in frameworks/base/cmds/system_server/library/system_init.cpp.

extern "C" status_t system_init()
{
LOGI("Entered system_init()");

sp<ProcessState> proc(ProcessState::self());

sp<IServiceManager> sm = defaultServiceManager();
LOGI("ServiceManager: %p\n", sm.get());
// (part of the code omitted)
AndroidRuntime* runtime = AndroidRuntime::getRuntime();

LOGI("System server: starting Android services.\n");
runtime->callStatic("com/android/server/SystemServer", "init2");

// If running in our own process, just go into the thread
// pool. Otherwise, call the initialization finished
// func to let this process continue its initilization.
if (proc->supportsProcesses()) {
LOGI("System server: entering thread pool.\n");
ProcessState::self()->startThreadPool();
IPCThreadState::self()->joinThreadPool();
LOGI("System server: exiting thread pool.\n");
}
return NO_ERROR;
}

After going around once more, the call comes back to Java, in SystemServer's init2 method.

public static final void init2() {
Slog.i(TAG, "Entered the Android system server!");
Thread thr = new ServerThread();
thr.setName("android.server.ServerThread");
thr.start();
}

The code is short, but it is the heart of the system services: ServerThread initializes a large number of them. Let's take a closer look.

@Override
public void run() {
EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN,
SystemClock.uptimeMillis());

Looper.prepare();

// (part of the code omitted)
LightsService lights = null;
PowerManagerService power = null;
BatteryService battery = null;
ConnectivityService connectivity = null;
IPackageManager pm = null;
Context context = null;
WindowManagerService wm = null;
BluetoothService bluetooth = null;
BluetoothA2dpService bluetoothA2dp = null;
HeadsetObserver headset = null;
DockObserver dock = null;
UsbService usb = null;
UiModeManagerService uiMode = null;
RecognitionManagerService recognition = null;
ThrottleService throttle = null;

// Critical services...
try {
Slog.i(TAG, "Entropy Service");
ServiceManager.addService("entropy", new EntropyService());

Slog.i(TAG, "Power Manager");
power = new PowerManagerService();
ServiceManager.addService(Context.POWER_SERVICE, power);

Slog.i(TAG, "Activity Manager");
context = ActivityManagerService.main(factoryTest);

Slog.i(TAG, "Telephony Registry");
ServiceManager.addService("telephony.registry", new TelephonyRegistry(context));

AttributeCache.init(context);

Slog.i(TAG, "Package Manager");
pm = PackageManagerService.main(context,
factoryTest != SystemServer.FACTORY_TEST_OFF);

ActivityManagerService.setSystemProcess();

mContentResolver = context.getContentResolver();

// The AccountManager must come before the ContentService
try {
Slog.i(TAG, "Account Manager");
ServiceManager.addService(Context.ACCOUNT_SERVICE,
new AccountManagerService(context));
} catch (Throwable e) {
Slog.e(TAG, "Failure starting Account Manager", e);
}

Slog.i(TAG, "Content Manager");
ContentService.main(context,
factoryTest == SystemServer.FACTORY_TEST_LOW_LEVEL);

Slog.i(TAG, "System Content Providers");
ActivityManagerService.installSystemProviders();

Slog.i(TAG, "Battery Service");
battery = new BatteryService(context);
ServiceManager.addService("battery", battery);

Slog.i(TAG, "Lights Service");
lights = new LightsService(context);

Slog.i(TAG, "Vibrator Service");
ServiceManager.addService("vibrator", new VibratorService(context));

// only initialize the power service after we have started the
// lights service, content providers and the battery service.
power.init(context, lights, ActivityManagerService.getDefault(), battery);

Slog.i(TAG, "Alarm Manager");
AlarmManagerService alarm = new AlarmManagerService(context);
ServiceManager.addService(Context.ALARM_SERVICE, alarm);

Slog.i(TAG, "Init Watchdog");
Watchdog.getInstance().init(context, battery, power, alarm,
ActivityManagerService.self());

Slog.i(TAG, "Window Manager");
wm = WindowManagerService.main(context, power,
factoryTest != SystemServer.FACTORY_TEST_LOW_LEVEL);
ServiceManager.addService(Context.WINDOW_SERVICE, wm);

((ActivityManagerService)ServiceManager.getService("activity"))
.setWindowManager(wm);

// (part of the code omitted)

} catch (RuntimeException e) {
Slog.e("System", "Failure starting core service", e);
}

DevicePolicyManagerService devicePolicy = null;
StatusBarManagerService statusBar = null;
InputMethodManagerService imm = null;
AppWidgetService appWidget = null;
NotificationManagerService notification = null;
WallpaperManagerService wallpaper = null;
LocationManagerService location = null;

// (part of the code omitted)

Looper.loop();
Slog.d(TAG, "System ServerThread is exiting!");
}

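Almost every important service you can think of, including ActivityManagerService, WindowManagerService, PowerManagerService, PackageManagerService, and BatteryService, is initialized here and registered with the servicemanager process through binder, so that clients can obtain these services through binder. How, then, does a client program get started?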
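The register-then-look-up pattern behind ServiceManager.addService can be modeled with a plain map. This is a deliberately reduced, in-process sketch: the real servicemanager is a separate process reached through binder, and ServiceRegistrySketch with its methods is a hypothetical stand-in rather than the Android API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-process toy registry: services are published by name and looked up by name. */
class ServiceRegistrySketch {
    private static final Map<String, Object> SERVICES = new ConcurrentHashMap<>();

    /** Publishes a service under a well-known name, like the addService calls in ServerThread. */
    static void addService(String name, Object service) {
        SERVICES.put(name, service);
    }

    /** Returns the service registered under the name, or null if none was registered. */
    static Object getService(String name) {
        return SERVICES.get(name);
    }

    /** Hypothetical service used only for the demo. */
    static final class FakeVibrator {
        String vibrate(long millis) {
            return "vibrating for " + millis + " ms";
        }
    }

    public static void main(String[] args) {
        addService("vibrator", new FakeVibrator());                      // server side: register
        FakeVibrator vibrator = (FakeVibrator) getService("vibrator");   // client side: look up
        System.out.println(vibrator.vibrate(200));
    }
}
```

What the sketch leaves out is exactly what binder adds: the lookup crosses a process boundary, and the client receives a proxy object rather than the service instance itself.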

Starting a client

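Context.startActivity always ends up being handled by ActivityManagerService. When the target app's process is not running yet, ActivityManagerService creates it in startProcessLocked: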

private final void startProcessLocked(ProcessRecord app,
String hostingType, String hostingNameStr) {

// (part of the code omitted)
try {
// (part of the code omitted)
// Note: the class passed in is ActivityThread
int pid = Process.start("android.app.ActivityThread",
mSimpleProcessManagement ? app.processName : null, uid, uid,
gids, debugFlags, null);
// (part of the code omitted)

} catch (RuntimeException e) {
// XXX do better error recovery.
app.pid = 0;
Slog.e(TAG, "Failure starting process " + app.processName, e);
}
}

Next, the android.os.Process.start method:

public static final int start(final String processClass,
final String niceName,
int uid, int gid, int[] gids,
int debugFlags,
String[] zygoteArgs){
// (part of the code removed)
return startViaZygote(processClass, niceName, uid, gid, gids,
debugFlags, zygoteArgs);
}


private static int startViaZygote(final String processClass,
final String niceName,
final int uid, final int gid,
final int[] gids,
int debugFlags,
String[] extraArgs)
throws ZygoteStartFailedEx {
int pid;
// (part of the code removed)
synchronized(Process.class) {
ArrayList<String> argsForZygote = new ArrayList<String>();
// --runtime-init, --setuid=, --setgid=,
// and --setgroups= must go first
argsForZygote.add("--runtime-init");
argsForZygote.add("--setuid=" + uid);
argsForZygote.add("--setgid=" + gid);
// Note: the processClass passed in is android.app.ActivityThread
argsForZygote.add(processClass);

if (extraArgs != null) {
for (String arg : extraArgs) {
argsForZygote.add(arg);
}
}

pid = zygoteSendArgsAndGetPid(argsForZygote);
}

return pid;
}

// Open the socket and talk to the zygote process directly
private static int zygoteSendArgsAndGetPid(ArrayList<String> args)
throws ZygoteStartFailedEx {
int pid;
openZygoteSocketIfNeeded();
// (part of the code omitted)
return pid;
}
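The body of zygoteSendArgsAndGetPid is elided above, so how the argument list travels over the socket is not shown. As an illustration only, the sketch below assumes a simple line-based framing: the argument count on the first line, then one argument per line. The class ZygoteWireSketch and its methods are hypothetical and are not the code in Process.java.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Assumed line-based framing for a process-start request: count, then one argument per line. */
class ZygoteWireSketch {
    /** Client side: writes the count followed by each argument on its own line. */
    static void writeArgs(BufferedWriter out, List<String> args) throws IOException {
        for (String arg : args) {
            if (arg.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("embedded newlines would break the framing");
            }
        }
        out.write(Integer.toString(args.size()));
        out.newLine();
        for (String arg : args) {
            out.write(arg);
            out.newLine();
        }
        out.flush();
    }

    /** Server side: reads the count, then exactly that many argument lines. */
    static List<String> readArgs(BufferedReader in) throws IOException {
        String countLine = in.readLine();
        if (countLine == null) {
            throw new IOException("end of stream before the argument count");
        }
        int count = Integer.parseInt(countLine.trim());
        if (count < 0) {
            throw new IOException("negative argument count");
        }
        List<String> args = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String arg = in.readLine();
            if (arg == null) {
                throw new IOException("truncated request");
            }
            args.add(arg);
        }
        return args;
    }

    /** Encodes and decodes in memory; a string stands in for the socket. */
    static List<String> roundTrip(List<String> args) {
        try {
            StringWriter wire = new StringWriter();
            writeArgs(new BufferedWriter(wire), args);
            return readArgs(new BufferedReader(new StringReader(wire.toString())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> request = Arrays.asList(
                "--runtime-init", "--setuid=10001", "--setgid=10001", "android.app.ActivityThread");
        System.out.println(roundTrip(request));
    }
}
```

Sending the count first lets the reader know when a request is complete without closing the connection, which suits a long-lived socket that carries one request after another.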

In the end it opens a socket and communicates with the zygote process:

/**
* Tries to open socket to Zygote process if not already open. If
* already open, does nothing. May block and retry.
*/
private static void openZygoteSocketIfNeeded()
throws ZygoteStartFailedEx {

int retryCount;

if (sPreviousZygoteOpenFailed) {
/*
* If we've failed before, expect that we'll fail again and
* don't pause for retries.
*/
retryCount = 0;
} else {
retryCount = 10;
}

/*
* See bug #811181: Sometimes runtime can make it up before zygote.
* Really, we'd like to do something better to avoid this condition,
* but for now just wait a bit...
*/
for (int retry = 0
; (sZygoteSocket == null) && (retry < (retryCount + 1))
; retry++ ) {

// (part of the code omitted)
sZygoteSocket = new LocalSocket();

// Core logic: connect to the zygote socket; processClass and the other arguments are sent over this connection
sZygoteSocket.connect(new LocalSocketAddress(ZYGOTE_SOCKET,
LocalSocketAddress.Namespace.RESERVED));

sZygoteInputStream
= new DataInputStream(sZygoteSocket.getInputStream());

sZygoteWriter =
new BufferedWriter(
new OutputStreamWriter(
sZygoteSocket.getOutputStream()),
256);

Log.i("Zygote", "Process: zygote socket opened");
}

// (part of the code omitted)
if (sZygoteSocket == null) {
sPreviousZygoteOpenFailed = true;
throw new ZygoteStartFailedEx("connect failed");
}
}

Back in the zygote process, let's see how it handles the message sent by ActivityManagerService:

/**
* Runs the zygote process's select loop. Accepts new connections as
* they happen, and reads commands from connections one spawn-request's
* worth at a time.
*
* @throws MethodAndArgsCaller in a child process when a main() should
* be executed.
*/
private static void runSelectLoopMode() throws MethodAndArgsCaller {
ArrayList<FileDescriptor> fds = new ArrayList();
ArrayList<ZygoteConnection> peers = new ArrayList();
FileDescriptor[] fdArray = new FileDescriptor[4];

fds.add(sServerSocket.getFileDescriptor());
peers.add(null);

int loopCount = GC_LOOP_COUNT;
while (true) {
int index;

/*
* Call gc() before we block in select().
* It's work that has to be done anyway, and it's better
* to avoid making every child do it. It will also
* madvise() any free memory as a side-effect.
*
* Don't call it every time, because walking the entire
* heap is a lot of overhead to free a few hundred bytes.
*/
if (loopCount <= 0) {
gc();
loopCount = GC_LOOP_COUNT;
} else {
loopCount--;
}


try {
fdArray = fds.toArray(fdArray);
index = selectReadable(fdArray);
} catch (IOException ex) {
throw new RuntimeException("Error in select()", ex);
}

if (index < 0) {
throw new RuntimeException("Error in select()");
} else if (index == 0) {
ZygoteConnection newPeer = acceptCommandPeer();
peers.add(newPeer);
fds.add(newPeer.getFileDesciptor());
} else {
boolean done;
// Core logic
done = peers.get(index).runOnce();

if (done) {
peers.remove(index);
fds.remove(index);
}
}
}
}

The core logic is in runOnce, so let's look at that next.

/**
* Reads one start command from the command socket. If successful,
* a child is forked and a {@link ZygoteInit.MethodAndArgsCaller}
* exception is thrown in that child while in the parent process,
* the method returns normally. On failure, the child is not
* spawned and messages are printed to the log and stderr. Returns
* a boolean status value indicating whether an end-of-file on the command
* socket has been encountered.
*
* @return false if command socket should continue to be read from, or
* true if an end-of-file has been encountered.
* @throws ZygoteInit.MethodAndArgsCaller trampoline to invoke main()
* method in child process
*/
boolean runOnce() throws ZygoteInit.MethodAndArgsCaller {

// (some code omitted)
parsedArgs = new Arguments(args);
pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids, parsedArgs.debugFlags, rlimits);

if (pid == 0) {
// in child
handleChildProc(parsedArgs, descriptors, newStderr);
// should never happen
return true;
} else { /* pid != 0 */
// in parent...pid of < 0 means failure
return handleParentProc(pid, descriptors, parsedArgs);
}
}

runOnce mainly forks a child process. Note that args contains android.app.ActivityThread. Further down, the child runs handleChildProc, which calls RuntimeInit.zygoteInit directly. We are almost done!
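How a flat argument list turns into fields such as parsedArgs.uid, parsedArgs.runtimeInit, and parsedArgs.remainingArgs can be sketched as follows. This is a simplified assumption about the parsing rule (recognized leading options are consumed, and everything from the first unrecognized argument onward is kept as the remaining arguments); ZygoteArgsSketch handles only three options and is not the real ZygoteConnection.Arguments class.

```java
import java.util.Arrays;

/** Simplified model of splitting a start request into options and remaining arguments. */
class ZygoteArgsSketch {
    /** Hypothetical, reduced counterpart of ZygoteConnection.Arguments. */
    static final class Arguments {
        int uid = -1;
        int gid = -1;
        boolean runtimeInit;
        String[] remainingArgs = new String[0];
    }

    static Arguments parse(String[] args) {
        Arguments parsed = new Arguments();
        int curArg = 0;
        for (; curArg < args.length; curArg++) {
            String arg = args[curArg];
            if (arg.equals("--runtime-init")) {
                parsed.runtimeInit = true;
            } else if (arg.startsWith("--setuid=")) {
                parsed.uid = Integer.parseInt(arg.substring(arg.indexOf('=') + 1));
            } else if (arg.startsWith("--setgid=")) {
                parsed.gid = Integer.parseInt(arg.substring(arg.indexOf('=') + 1));
            } else {
                break;      // first unrecognized argument: the rest belongs to the start class
            }
        }
        parsed.remainingArgs = Arrays.copyOfRange(args, curArg, args.length);
        return parsed;
    }

    public static void main(String[] args) {
        Arguments parsed = parse(new String[] {
                "--runtime-init", "--setuid=10001", "--setgid=10001",
                "android.app.ActivityThread"});
        System.out.println("uid=" + parsed.uid + " gid=" + parsed.gid
                + " runtimeInit=" + parsed.runtimeInit
                + " remaining=" + Arrays.toString(parsed.remainingArgs));
    }
}
```

With the request built by startViaZygote, the remaining arguments begin with android.app.ActivityThread, which is the value zygoteInit later reads as startClass.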

private void handleChildProc(Arguments parsedArgs,
FileDescriptor[] descriptors, PrintStream newStderr)
throws ZygoteInit.MethodAndArgsCaller {

/*
* Close the socket, unless we're in "peer wait" mode, in which
* case it's used to track the liveness of this process.
*/
if (parsedArgs.peerWait) {
try {
ZygoteInit.setCloseOnExec(mSocket.getFileDescriptor(), true);
sPeerWaitSocket = mSocket;
} catch (IOException ex) {
Log.e(TAG, "Zygote Child: error setting peer wait "
+ "socket to be close-on-exec", ex);
}
} else {
closeSocket();
ZygoteInit.closeServerSocket();
}

// (part of the code omitted)
// Note: runtimeInit is true here, so execution reaches RuntimeInit.zygoteInit
if (parsedArgs.runtimeInit) {
RuntimeInit.zygoteInit(parsedArgs.remainingArgs);
} else {
// (part of the code omitted)
}
}

zygoteInit then calls ActivityThread's main method directly, which is the real entry point of an application!

public static final void zygoteInit(String[] argv)
throws ZygoteInit.MethodAndArgsCaller {

// (part of the code omitted)
String startClass = argv[curArg++];
String[] startArgs = new String[argv.length - curArg];

System.arraycopy(argv, curArg, startArgs, 0, startArgs.length);

// As analyzed earlier, startClass is android.app.ActivityThread
// This calls ActivityThread's main method directly, and we are done
invokeStaticMain(startClass, startArgs);
}

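This completes the analysis. The overall flow is summarized in the figure below (taken from page 88 of 《深入理解Android卷1》):
[Figure: overall Android boot flow]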

Summary

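Put simply, the Android boot process is the Linux boot process followed by the startup of the Java environment. We have followed the whole journey: from power-on to the bootloader, the bootloader starting the kernel, the kernel starting the init process, init starting a C++ program, the C++ program starting a Java program, and finally ZygoteInit.main bringing up the entire Android Java world. This article summarizes what I learned and thought about while studying the Android boot process; I hope it helps.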

References

  • 《深入理解Android卷1》, Chapter 4
  • 《Android内核剖析》
  • 《深入理解Android内核设计思想》