In this section, as a warm-up, you will generate a Poisson process with rate λ and simulate the M/M/1 queueing system with service rate μ, based only on the standard uniform random number generator from [0, 1]. (See Appendix on how to draw a random number from a given distribution FX (x) from the uniform random number generator.) Equivalently, you will simulate the birth- death Markov chain X(t) ∈ ? for M/M/1 model based on the state transition diagram as covered in class, where ? = 0, 1, 2, . . . throughout in this project. The arrivals may represent aggregated calls, HTTP requests, or ‘clicks’ on html links, etc. from a large population, which are all known to follow a Poisson process (extension of the law of small numbers!).
Each of the arrival is placed in queue if the server is busy, and served on a first-come-first serve (FCFS) basis. Throughout, we assume that the buffer size is infinite (i.e., an arriving customer/job/packet is never lost). For your actual simulation, you will have to assign a large enough memory space for the buffer to ensure that nothing is lost there. Define ρ = λ/μ < 1 for stability. For simplicity, set μ = 1 and we will vary λ = ρ from λ ∈ {0.7, 0.8, 0.9, 0.95} (total 4 different values). Your simulation program will terminate once N customers have completed their services. For initial conditions, assume that at time t = 0 the system is empty. Draw a random number from exp(λ) to decide when the first arrival will occur, and then start your simulation by locating the first event, then draw another random number to decide the next event, and so on.∗ You may want to keep the following information for each customer entering the system:
• ti, 1 ≤ i ≤ N : the time instant at which the i-th customer arrives to the system.
• wi, 1 ≤ i ≤ N : the total waiting time of the i-th customer in the system (queue + server).
Note that this i-th customer departs the system at ti + wi.
• X(ti), 1 ≤ i ≤ N : the number of customers in the system seen by the i-th arrival. For instance, X(t1) = 0 since the first customer arrives to (initially) empty system.
• Di, 1 ≤ i ≤ N : the number of customers seen (left behind) by the i-th departure. Note that you may have to generate more than N arrivals to ensure that N customers will have departed by the end of your simulation (and some customers are still left in the queue!). Using these values, you will measure the empirical distribution (ccdf) of the number of customers in the system seen by the arrivals and departures based on N observations (events) and compare them with the theoretical results you learned in class. In addition, while you are running the simulations, you will also keep a record of whether or not the system is empty. In other words, you:
∗There should be no library or module in your source code. You can only use the uniformly distributed random number generator over [0, 1] that should be available in any program language. In other words, you will have to write your code from scratch. 1are also measuring Ns(t). To this end, whenever there is an ‘event’ (arrival or departure), update this 1/0-valued process Ns(t) and obtain the duration of idle and busy periods. Let Ii, Bi denote the i-th idle/busy periods. Note that depending on λ values (μ = 1 throughout still), out of N customer departure, the number of idle/busy period samples may be smaller than N . Say, you obtain M samples for Bi (i = 1, 2, . . . , M ) by the end of your simulation (N -th departure). Then, you can obtain your empirical distribution (histogram and then converted into ccdf) for B. Specifically, do the following:
• Set N = 10, 000. For each of λ value (out of 4 choices), repeat your simulations K = 10 times using different initial random seed. This way, you will obtain K = 10 datasets that are presumably independent and identically distributed.
• For each of λ value chosen and for each sample path (out of K independent runs), collect X(ti) samples, 1 ≤ i ≤ N . From this N -dimensional array for the histogram, convert this into CCDF (another N -dimensional array) to estimate the probability that the number of customers seen by an arrival is ≥ n, for different (suitably chosen) values of n. You then repeat this K − 1 more times, obtaining total K such CCDF plots (for the same value n). Take average over K values (for a given n) and plot this estimated CCDF on a semi-log scale (y-axis is log(P{X ≥ n}), x-axis is linear, number n).
• Repeat the above for Di, the number of customers seen by i-th departure.
• For each of λ value and for each run of your simulations (out of K independent runs), you will obtain a number of samples for Bi (B1, B2, . . .). Once you have obtained K such set of samples, compute the sample average, and plot the CCDF on a semi-log scale. You should be able to measure all the quantities and obtain the empirical CCDF out of the same simulation run, instead of running separately for X(ti), Di, Bi each. Then, perform the following simulations (30 points for simulation running and code check) and answer/report the followings (10 points for plots and 10 points for your explanation to other questions):
1. There will be total 8 cases (X(ti) and Di for 4 different values of λ). Plot all these on the same semi-log plot, and overlay another plot for the theoretical result as learned in class, i.e., P{X ≥ k} = ρk, k = 0, 1, 2, . . . where X here is the number of customers in the system in the steady-state (time average). What do you see here? Does PASTA property hold in your simulation? Do your simulation results match well with the theory? If not, why? Discuss, and suggest some ways to reduce the errors from simulations.
2. Plot 4 different cases (for 4 different λ values) of CCDF of the busy-period on the same semi-log scale. Is B exponentially distributed? If not, what can you say about the shape of the distribution (CCDF) of B based on your simulation results? 2 Plot Thickens: Multiple Queues with Scheduling.
In this section, we will attempt to simulate a more realistic scenario with m multiple queues. These m ‘servers + queues’ can model a data center or cloud, which is basically nothing but a collection 2of servers (including buffer/queue) in a cluster (datacenter) or all over the places (cloud). Each arriving customer (your ‘click’ or a file being uploaded to the cloud, etc) will be assigned to one of m queues according to some scheduling policy.