Scalability

To support our approach, proxy servers need to be deployed throughout the access network, which could be a large provincial or national cellular network. At one extreme, one could envision an architecture where each wireless cell provides dedicated proxy servers, resulting in relatively little concurrent use of an individual server but inducing high handover overhead and costs. At the other extreme, we could provide only one or very few proxy servers that support applications in many different wireless cells, reducing the handover overhead but requiring more powerful servers. With potentially many thousands of users executing resource-intensive next-generation mobile applications, the scalability of our approach becomes extremely important. To explore this issue, we started to develop performance prediction models based on Layered Queuing Networks (LQNs).

LQNs model the performance of distributed systems comprising both hardware and software resources [FRA 98, ROL 95]. The basic unit of an LQN is a task, which represents software in execution. An entry represents a service provided by a task; a task that provides multiple services has multiple entries. Entries in different tasks communicate with each other by requesting and providing services: client tasks make requests to proxies, which in turn invoke services provided by the application server task. Requests are either synchronous or asynchronous.
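The task/entry/call structure described above can be sketched as a small in-memory representation. This is only an illustrative data model, not the input format of any particular LQN tool; all names and demand values below are invented.

```python
from dataclasses import dataclass, field

# Hypothetical representation of the LQN concepts described above:
# tasks own entries, and entries call entries of other tasks either
# synchronously or asynchronously.

@dataclass
class Call:
    target: "Entry"        # called entry in another task
    mean_calls: float      # average number of calls per invocation
    synchronous: bool      # True = caller blocks waiting for a reply

@dataclass
class Entry:
    name: str
    demand: float          # mean CPU demand per invocation (seconds)
    calls: list = field(default_factory=list)

@dataclass
class Task:
    name: str
    entries: list

# A client task calling a proxy entry, which in turn calls a server entry
# synchronously (demand values are placeholders):
serve = Entry("se1", demand=0.004)
relay = Entry("pe1", demand=0.001, calls=[Call(serve, 1.0, True)])
request = Entry("req", demand=0.006, calls=[Call(relay, 1.0, True)])
client = Task("Client", [request])
```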

Each task has at least one entry. Each entry has its own execution demand and may interact with entries in other tasks by calling them. Client tasks do not receive requests from other tasks; they are called reference tasks. A reference task usually has a think time, denoted Z, which models the pause between two consecutive operations. The execution demand of an entry is characterized in two phases: phase 1 is the service demand between receiving a request and sending the response (for synchronous requests); phase 2 is the service demand after the response (for synchronous requests) or the total service demand (for asynchronous requests). The LQN analytical tool describes the system by the average behaviour of the entries and solves the performance model with approximate MVA calculations. To study the scalability of our system, we developed a four-layer LQN, extracted parameters from traces collected on an operational WAP-based system [9], and studied the impact of introducing proxy servers. In modelling terms, introducing a proxy reduces the execution demand on the portable devices and increases it on the proxy servers. Since we assume the proxy servers to be more powerful, the increase in proxy load is only a fraction of the load decrease on the portable device.
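The MVA calculation mentioned above can be illustrated with the classic exact MVA recursion for a closed, single-class queueing network with a think-time station. This is a textbook sketch, not the approximate multi-layer solver used by the LQN tool, and the service demands are made-up placeholders rather than the traced values.

```python
def mva(demands, think_time, n_customers):
    """Exact MVA for a closed, single-class network of queueing (FCFS)
    stations. demands[k] is the total service demand at station k;
    returns (system throughput, per-station mean queue lengths)."""
    q = [0.0] * len(demands)              # queue lengths at population 0
    x = 0.0
    for n in range(1, n_customers + 1):
        # residence time: demand inflated by the queue seen on arrival
        r = [d * (1.0 + qk) for d, qk in zip(demands, q)]
        x = n / (think_time + sum(r))     # throughput via Little's law
        q = [x * rk for rk in r]          # updated mean queue lengths
    return x, q

# Placeholder demands for the client, proxy, and server CPUs (seconds):
x, queues = mva([0.006, 0.001, 0.004], think_time=5.0, n_customers=20)
```

With one customer the recursion reduces to x = 1 / (Z + sum of demands), which is a quick sanity check on any implementation.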

The complete model is shown in Figure 1. A parallelogram represents a task entry; cascaded parallelograms indicate multiple instances of a task. The task name is written next to the parallelogram. [Z] in the client task entry models the client think time, and [0, tc] represents the execution demands of the client task entry between requests. The pairs of brackets inside the non-reference task entries have the same meaning as those in the client task entry, and the notation under each pair of brackets is the entry name. An ellipse represents a CPU. An arrow connects a calling entry to the called entry, and a plain line connects a task to the CPU on which it runs. The number in round brackets beside an arrow is the number of calls from the calling entry to the called entry; 'sh' denotes synchronous calls and 'ay' denotes asynchronous calls. Client tasks make requests (indirectly) to an application server and wait for the responses. This server answers some of the requests directly and passes some to other servers on the Internet. Generally, passing a request to another server takes less time than answering it directly. The application server task has four entries.

Figure 1. Layered queuing model

The first entry se1 processes the synchronous requests from client tasks and responds to the clients directly. The second entry se2 is responsible for asynchronous requests from client tasks. The third entry se3 passes synchronous requests from the clients to other servers. The General Server task is used to represent additional servers on the Internet since it is impossible to get information for all the Internet servers and model them individually. The fourth entry se4 is used to represent the idle time between consecutive sessions with the help of an imaginary Idle Server task and CPU4.

We studied the capacity of the system under various conditions and the effect of transferring execution load from handsets to proxies. The capacity of the system is the maximum number of users that the system can serve at the same time without being saturated. The Proxies, Application Server, General Server and Idle Server are multithreaded and can provide as many threads as needed. We define 0.7 as the threshold utilization of CPU2; beyond this, we consider the system saturated. 'MC' denotes the maximum number of clients that the system can sustain in this case. We studied the effect of increasing the percentage of requests serviced directly by the application server and of executing a higher proportion of the client tasks at the proxy servers.
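The capacity search described above can be sketched as follows: grow the client population until the utilization of the proxy CPU (CPU2 in the model), obtained from the utilization law U = X * D, exceeds the 0.7 threshold. The MVA recursion here is the textbook exact version, and the demands and think time are invented placeholders, not the traced WAP parameters.

```python
def throughput(demands, z, n):
    """Exact MVA throughput of a closed single-class network with
    think time z and population n."""
    q = [0.0] * len(demands)
    x = 0.0
    for pop in range(1, n + 1):
        r = [d * (1.0 + qk) for d, qk in zip(demands, q)]
        x = pop / (z + sum(r))
        q = [x * rk for rk in r]
    return x

def max_clients(demands, z, cpu2, threshold=0.7, cap=10000):
    """Largest population whose CPU2 utilization stays at or below
    the threshold (cap bounds the search)."""
    for n in range(1, cap + 1):
        if throughput(demands, z, n) * demands[cpu2] > threshold:
            return n - 1        # last population still below the threshold
    return cap

# Stations: client CPU, proxy CPU (CPU2), application-server CPU.
mc = max_clients([0.010, 0.050, 0.030], z=5.0, cpu2=1)
```

Because MVA throughput grows monotonically with population, a simple linear scan (or a bisection) over n finds MC reliably.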

We traced the WAP-based application for several months; some detailed data is reported in [KUN 00]. The total average number of synchronous requests (the sum of sh1 and sh2) per user session is 11.2, i.e., sh1 + sh2 = 11.2. sh1 = 0 represents the case when all the requests are passed on to other servers. sh1 = 11.2 represents the case when all the requests are directly processed by the Application server. We show the capacity implications of various splits between sh1 and sh2 in Table 3. To derive the base system capacity, we assumed that the proxy layer is essentially non-existent: requests from clients and replies from the Application server are forwarded immediately and no processing happens at the proxy.
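The trend in Table 3 follows directly from the demand split: with sh1 + sh2 = 11.2 requests per session and direct processing costlier than forwarding, the application server's demand per session rises linearly with sh1. The per-request times below are illustrative assumptions, not the traced values.

```python
# Assumed per-request service times (seconds) at the application server:
t_direct, t_forward = 0.200, 0.050   # direct answer vs. pass-through
total = 11.2                         # sh1 + sh2, from the traces

demand_per_session = {}
for sh1 in [0.0, 5.2, 11.2]:
    sh2 = total - sh1
    # server demand per session grows linearly in sh1 since t_direct > t_forward
    demand_per_session[sh1] = sh1 * t_direct + sh2 * t_forward
```

Higher per-session demand at the bottleneck means a lower saturation population, which is why MC in Table 3 shrinks as sh1 grows.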

The base capacity of the system decreases as sh1 increases. That is, the more requests the Application server processes directly, the smaller the system capacity becomes. For the period over which we traced the application, the average maximum number of concurrent sessions for each month is indeed below 8, but during the peak hours of some days, the maximum number of concurrent sessions exceeds 10. This period of potential capacity overload is very short, however, usually lasting less than one minute.

Table 3. Basic system capacity

sh1   0    1.2   2.2   3.2   4.2   5.2   6.2   7.2   8.2   9.2   10.2  11.2
MC    50   36    29    24    20    17    15    13    11    10    9     8

We also investigated the effect of load migration from handsets to the proxies. We assume that the proxy CPU is 25 times faster than the handset CPU. The load migration from handsets to the proxies reduces the service demand at the clients' side. Assuming that each user has access to a dedicated proxy server, this is equivalent, in modeling terms, to replacing a slow user with a faster, more demanding user (more requests per unit time), reducing the overall system capacity. However, since proxy servers are shared between multiple users, this may not necessarily be the case. We varied the service demand on the portable device (tc) from 6 to 0, in steps of 1, with a corresponding (smaller) increase in service demand (pe1) at the proxy. The performance prediction results are shown in Tables 4 and 5.
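The load-migration experiment can be sketched in a few lines: as the handset demand tc drops from 6 to 0, the proxy entry pe1 absorbs the migrated work at 1/25 of its handset cost, reflecting the assumed 25x faster proxy CPU. The starting value pe1_base is invented for illustration.

```python
speedup = 25        # proxy CPU assumed 25x faster than the handset CPU
pe1_base = 0.2      # assumed baseline proxy demand (seconds)

pe1_demand = {}
for tc in range(6, -1, -1):
    migrated = 6 - tc                         # work moved off the handset
    pe1_demand[tc] = pe1_base + migrated / speedup   # extra proxy demand
```

Each unit of demand removed from the handset adds only 1/25 of a unit at the proxy, but because the proxy is shared among all users, even this small per-user increase can shift the bottleneck and reduce overall capacity.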

Table 4. Capacity vs. load migration, all client requests processed by application server

Table 5. Capacity vs. load migration, all client requests forwarded to external servers

We can see that, all else being equal, the capacity decreases with increasing migration of computational load from portable devices to the proxies. This is consistent with other results reported in [ZHO 00], which show that the system can serve more slow users than fast ones.
