http://www.internetworkflow.com/downloads/ns2leach/

Installing NS-2.27 and LEACH extension on Fedora Core and Ubuntu

 

Step 0: Prepare necessary files for installation

  1. NS-2.27 package: ns-allinone-2.27.tar.gz
  2. Patch for compiling NS-2.27 with GCC 4.1.x: ns-2.27-gcc410.patch
  3. MIT’s LEACH extension: mit.tar.gz
  4. LEACH’s Makefile patch: leach_makefile-2.27.patch

Step 1: Download NS-2.27, apply ns-2.27-gcc410.patch, and install it
Under your home directory (~):

 

  1. apt-get install build-essential autoconf automake libxmu-dev kernel-package libncurses5-dev fakeroot wget bzip2 (kernel setup!)

  2. wget http://www.internetworkflow.com/downloads/ns2leach/ns-allinone-2.27.tar.gz
  3. tar zxvf ns-allinone-2.27.tar.gz
  4. wget http://www.tekno.chalmers.se/~yusheng/reports/ns-2.27-gcc410.patch
  5. patch -p0 < ns-2.27-gcc410.patch (your Ubuntu packages must already be installed and up to date!)
  6. cd ns-allinone-2.27/
  7. ./install
  8. After installation, set up bash_profile and vimrc. (Edit bash_profile with vim to adjust the IP address and so on; note that the file is hidden as .bash_profile.)

Source: http://tmtam.wordpress.com/2007/07/31/installing-ns-227-and-leach-extension-on-ubuntu-704-feisty-fawn/

[Source] Install NS-2.27 and Ubuntu | Author: 울라짱구


Installing LEACH

 

1) Run: from the /home/cjm/ns-allinone-2.27/ns-2.27 directory:

./test

 

(ns tcl/ex/wireless.tcl -sc mit/uAMPS/sims/100nodescen -rp leach -x
1000 -y 1000 -nn 101 -stop 100 -eq_energy 1 -init_energy 2 -filename
leach_file -dirname leach_dir -num_clusters 5 -bs_x 0 -bs_y 0)

 

 

2) Installation steps (check your gcc version; 4.1.2 or later is recommended)

 

Step 1: Download NS-2.27

Download: wget http://www.isi.edu/nsnam/dist/ns-allinone-2.27.tar.gz

Extract: tar zxvf ns-allinone-2.27.tar.gz

 

Step 2: You need to apply a patch to ns-2.27 to make it work with gcc 4.1. You can get the patch here:

Download the patch: wget http://www.tekno.chalmers.se/~yusheng/reports/ns-2.27-gcc410.patch

Location: outside the ns-allinone-2.27 directory

Apply: patch -p0 < ns-2.27-gcc410.patch (in bash)

 

Step 3: Patch tk-8.4 to the latest version

Location: cd ns-allinone-2.27/tk8.4.5

Apply: patch -p0 < tk-8.4-lastevent.patch

 

Step 4: Install ns-allinone-2.27

Location: cd ns-allinone-2.27

Install: ./install

 

Step 5 : Add the environmental variables to /etc/profile

NS=/srv/ns-allinone-2.27   (adjust the path to wherever you extracted ns-allinone-2.27)

export PATH=$PATH:$NS/bin:$NS/tcl8.4.5/unix:$NS/tk8.4.5/unix:$NS/ns-2.27:$NS/nam-1.13

export LD_LIBRARY_PATH=$NS/otcl-1.8:$NS/lib:/usr/X11R6/lib:/usr/local/lib

export TCL_LIBRARY=$NS/tcl8.4.5/library:/usr/lib
source /etc/profile

 

Step 5-1: To run NS2 under Cygwin, run startxwin.bat
Location: Cygwin shell

Run: startxwin.bat

Output: startxwin.bat - Starting on Windows NT/2000/XP/2003

Step 6: Verify ns

Location: cd ns-allinone-2.27/ns-2.27

Run: ns tcl/ex/simple.tcl

 

Step 7: Download the LEACH code and extract it under the ~ns-2.27/ directory.

Location: cd ns-2.27

Download: wget http://www.internetworkflow.com/downloads/ns2leach/mit.tar.gz
Extract: tar zxvf mit.tar.gz

 

Step 8: Add the following line to ~ns-2.27/mac/wireless-phy.cc at line 59, that is, after the line #define max(a,b) (((a)<(b))?(b):(a)):
#define min(a,b) (((a)>(b))?(b):(a))

 

Step 9: Install the LEACH code by following the instructions below:

Step 9.1: Edit the Makefile as follows:
Add -DMIT_uAMPS to the DEFINE list
Add -I./mit/rca -I./mit/uAMPS to the INCLUDE list
Add the following just prior to the line gaf/gaf.o \
mit/rca/energy.o mit/rca/rcagent.o \
mit/rca/rca-ll.o mit/rca/resource.o \
mac/mac-sensor-timers.o mac/mac-sensor.o mit/uAMPS/bsagent.o \

Step 9.2: Add the environment variables to /etc/profile
export RCA_LIBRARY=$NS/ns-2.27/mit/rca
export uAMPS_LIBRARY=$NS/ns-2.27/mit/uAMPS

Step 9.3: source /etc/profile and clean up the previous build
source /etc/profile
make clean

Step 9.4: Rebuild ns2, redirecting the output
make   (or: nohup make 2> error.log > make.log &)

Step 10: Test the default wireless demo and LEACH
./test

 

Step 11: Validate the full installation, redirecting the output
nohup ./validate-full 2> validate.error >validate.log &

[Source] How to install LEACH on NS2 | Author: 둘리




ns2 LEACH extension installing  (9) 2009.10.05
Posted by 나마스떼


XShell Korean Language Support

research 2009.10.05 15:40

Installing EUC-KR on Ubuntu

$ sudo apt-get install language-pack-ko

$ sudo locale-gen ko_KR.EUC-KR

For GNOME Korean (Hangul) settings, see here.

1. Edit the values in /etc/environment

$ sudo vim /etc/environment

2. Change the environment settings

LANG="ko_KR.UTF-8"

LANGUAGE="ko_KR:ko:en_GB:en"

The defaults are set as above. Add EUC-KR:

LANG="ko_KR.UTF-8"

LANG="ko_KR.EUC-KR"

LANGUAGE="ko_KR:ko:en_GB:en"

3. Reboot.


Source: http://blog.byfun.com/351


SVM

research 2009.08.31 20:40
Support Vector Machines (SVMs) are a powerful methodology for solving problems in nonlinear classification, function estimation, and density estimation, and they have also led to many other recent developments in kernel-based methods in general. SVMs were originally introduced within the context of statistical learning theory and structural risk minimization. In these methods one solves convex optimization problems, typically quadratic programs. Least Squares Support Vector Machines (LS-SVMs) are reformulations of standard SVMs that lead to solving linear KKT systems. LS-SVMs are closely related to regularization networks and Gaussian processes, but additionally emphasize and exploit primal-dual interpretations. Links exist between kernel versions of classical pattern-recognition algorithms, such as kernel Fisher discriminant analysis, and extensions to unsupervised learning, recurrent networks, and control. Robustness, sparseness, and weightings can be imposed on LS-SVMs where needed, and a Bayesian framework with three levels of inference has been developed. LS-SVM-like primal-dual formulations have been given for kernel PCA, kernel CCA, and kernel PLS. For very large scale problems and on-line learning, a Fixed Size LS-SVM method has been proposed. The LS-SVMlab toolbox contains Matlab/C implementations of a number of LS-SVM algorithms.
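To make the phrase "solving linear KKT systems" concrete, here is a minimal NumPy sketch of the standard LS-SVM classifier. This is an illustration of the textbook formulation, not the LS-SVMlab toolbox API; the RBF kernel choice and all function and parameter names are my own assumptions.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between row-vector sets A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * sigma**2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM classifier's linear KKT system:
       [ 0        y^T        ] [ b     ]   [ 0 ]
       [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
       where Omega_kl = y_k * y_l * K(x_k, x_l)."""
    n = len(y)
    Omega = np.outer(y, y) * rbf(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(n))))
    return sol[0], sol[1:]          # bias b, dual variables alpha

def lssvm_predict(X, y, b, alpha, Xtest, sigma=1.0):
    """Classify: sign( sum_k alpha_k * y_k * K(x, x_k) + b )."""
    return np.sign(rbf(Xtest, X, sigma) @ (alpha * y) + b)
```

Note that, unlike a standard SVM, no quadratic program is needed: training is one call to a dense linear solver, which is exactly the trade-off the paragraph describes.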

TCP Performance and the Mathis Equation

Good network engineers know about TCP performance over Long, Fat Networks (LFNs - see RFC1323) and how to use bandwidth, delay, and window size to calculate the maximum throughput of a connection.  But it seems that not many people know about the Mathis Equation, which describes how packet loss factors into the throughput calculations.

For those not familiar with it, the buffering required in a receiving system for maximum performance is based on the BW-Delay Product (BDP).  It is the amount of data that can be sent between ACKs.  You multiply the path's minimum bandwidth by the round-trip delay.  The table below shows the buffering requirements for several examples, with the BW converted to Bytes/Sec and the Delay converted to Sec, resulting in the BDP units of Bytes.

BW (Mbps) RT Delay (ms) Bytes/Sec Delay (Sec) BDP (Bytes)
1.533 5 191,625 0.005 958
10 5 1,250,000 0.005 6,250
45 5 5,625,000 0.005 28,125
100 5 12,500,000 0.005 62,500
1000 5 125,000,000 0.005 625,000
1.533 20 191,625 0.020 3,833
10 20 1,250,000 0.020 25,000
45 20 5,625,000 0.020 112,500
100 20 12,500,000 0.020 250,000
1000 20 125,000,000 0.020 2,500,000
1.533 60 191,625 0.060 11,498
10 60 1,250,000 0.060 75,000
45 60 5,625,000 0.060 337,500
100 60 12,500,000 0.060 750,000
1000 60 125,000,000 0.060 7,500,000
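The table's arithmetic is easy to reproduce with a small helper (the function name and units are my own choices; the conversion is exactly the one described above):

```python
def bdp_bytes(bw_mbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    bytes_per_sec = bw_mbps * 1_000_000 / 8   # convert Mbps to Bytes/Sec
    return bytes_per_sec * (rtt_ms / 1000)    # convert ms to Sec

# e.g. a 45 Mbps path with a 60 ms RTT needs about 337,500 bytes of buffering
```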

When the TCP window size is more than the BDP, the path BW is the limiting factor in throughput.  But when the TCP window size is less than the buffering required to keep the pipe filled, we use another equation to calculate the maximum throughput of the path.  The sending system will send a full TCP window worth of data and then wait for an acknowledgement from the receiver.  When the ACKs are received, more data can be sent.  Therefore, the maximum throughput is the window size divided by the time it takes to get back an ACK (i.e., the round trip time).

An example is useful to illustrate what happens here.  Let's say that we want to transfer a large file from a server to our workstation and that the workstation supports a TCP window size of 65535.  The path between the server and the workstation is T3 speeds (45Mbps) and the two systems are separated by a terrestrial link that's 3000 miles long.  One-way delay on terrestrial links is nominally 10ms per 1000 miles, so our link is 30ms long, one-way and 60ms round-trip.

The amount of buffering required is the link speed, converted to bytes, times the delay, in seconds.
Buffering = BW * Delay = 45,000,000/8 Bytes/Sec * .06 Sec = 337,500 Bytes.  That's bigger than our window, so we need to use the equation for TCP throughput.

Throughput = TCPWindow / round-trip-delay = 65535 Bytes / .06 Sec = 1,092,250 Bytes/Sec
                     = 8,738,000 bits/sec

That's considerably less than the T3 bandwidth that a non-technical person might expect.  If the systems support the window scaling option, the window size can be multiplied by a scaling factor in order to fill the pipe.  But many systems don't have window scaling options enabled or enough buffer space allocated to handle the high throughput for modern network paths, so throughput suffers and the people using these systems have lower productivity.
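The window-limited case can be captured in one line; this is a sketch of the calculation above, with `max_throughput_bps` being a name of my choosing:

```python
def max_throughput_bps(window_bytes, rtt_s, bw_bps):
    """TCP throughput in bits/sec: window-limited, but never above path BW."""
    return min(bw_bps, window_bytes * 8 / rtt_s)

# The worked example: 65535-byte window over a 60 ms, 45 Mbps path.
tput = max_throughput_bps(65535, 0.06, 45_000_000)
```

With a window large enough to cover the BDP, the `min` clamps the result to the path bandwidth instead.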

The above calculations all assume a lossless path.  This brings us to the question of what happens when the path experiences packet loss.  The formula that approximates TCP's behavior when the path experiences loss comes from a 1997 paper by Mathis, Semke, Mahdavi & Ott, "The macroscopic behavior of the TCP congestion avoidance algorithm," Computer Communication Review, 27(3), July 1997:

Rate < (MSS/RTT)*(1 / sqrt(p))
where p is the probability of packet loss.

This equation seems to have become known as the Mathis Equation.  Obviously, TCP performance goes down as packet loss increases, but this gives us an idea of how quickly this happens.  Fiber Bit Error Rates (BER) are typically 10E-13.  Some optical gear treats a link as down at a BER of 10E-6 (one bad bit in 1M bits).  Assuming a stream of 1460 byte packets, that's one bad packet approximately every 85 packets.  Mathis, et al., did a thorough review of what happens with high packet loss probability, which I'll not try to duplicate here; see the paper referenced below. 

 What it means

What is important for us as network engineers is to understand the impact on throughput so we can decide when to do something about it.  We should be looking for links that are experiencing BER loss that exceeds 10E-10.  A rough engineering calculation is that a packet nominally contains 10,000 bits (1250 Bytes), or 10E4 bits.  This makes the math easier and allows us to get a gut feel for the figures.  That means a BER of 10E-10 multiplied by 10E4 bits per packet gives a packet loss rate of 10E-6.  Converting from scientific notation to a decimal number, we have .000001, or .0001%.  That's a number that we can use with our network monitoring systems to highlight lossy links and paths.  Statistically, there isn't much difference between this calculation and one in which you assume an average packet size of 300 Bytes.  I prefer the simpler calculation and base my network management thresholds on it. Also note that LAN interfaces should typically have a loss rate that's several orders of magnitude lower than a WAN link or wireless link, so you may want to use different figures for different types of links.
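Both calculations in this section can be sketched in a few lines (the function names are mine; MSS is taken in bytes and the result is in bits/sec):

```python
import math

def mathis_rate_bps(mss_bytes, rtt_s, loss_prob):
    """Mathis et al. upper bound: Rate < (MSS/RTT) * 1/sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) / math.sqrt(loss_prob)

def packet_loss_from_ber(ber, packet_bits=10_000):
    """Rough conversion from bit error rate to packet loss probability,
    using the nominal 10,000-bit packet from the text."""
    return ber * packet_bits

# 1460-byte MSS, 60 ms RTT, packet loss probability 1e-6:
rate = mathis_rate_bps(1460, 0.06, 1e-6)
```

Because of the 1/sqrt(p) term, every 100x increase in packet loss cuts the achievable rate by 10x.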

Additional References

There is a good tutorial on TCP performance, with examples of fiber BER and different delays at
http://www.linuxsa.org.au/meetings/2003-09/tcpperformance.print.pdf

A validation was reported in Modelling TCP throughput: A simple model and its empirical validation by J. Padhye, V. Firoiu, D. Townsley and J. Kurose, in Proc. SIGCOMM Symp. Communications Architectures and Protocols Aug. 1998, pp. 304-314.  While this may be a more accurate analysis, it is certainly beyond the level of detail that most network engineers wish to go.

Pete Welcher also took a look at this and posted his own view and links on the topic at:
http://www.netcraftsmen.net/resources/blogs/tcp-ip-performance-factors.html

I also recommend that you read RFC1323 (May 1992) to learn more about the fundamental mechanisms that have been in place for a long time.  There are also some great references on the Internet on how to tune various TCP stacks for optimum operation over LFNs.  Understanding how to measure performance and what to do about it when it seems slow is valuable knowledge.

  -Terry
 



AMPL, an acronym for "A Mathematical Programming Language", is a high-level programming language, developed at Bell Laboratories, for describing and solving high complexity problems for large scale mathematical computation (i.e. large scale optimization and scheduling type problems). AMPL does not solve those problems directly; instead, it calls appropriate external solvers (such as CPLEX, FortMP, MINOS, IPOPT, SNOPT, KNITRO, and so on) to obtain solutions. AMPL handles linear and nonlinear optimization problems as well as complementarity problems (MPECs), in discrete or continuous variables.

One particular advantage of AMPL is the similarity of its syntax to the mathematical notation of optimization problems. This allows for a very concise and readable definition of problems in the domain of optimization. Many modern solvers available on the NEOS server (hosted at the Argonne National Laboratory) accept AMPL input.

AMPL was created by Robert Fourer, David Gay and Brian Kernighan. It is currently maintained by AMPL Optimization LLC.


KNITRO is a software package for solving large scale mathematical optimization problems. KNITRO is specialized for nonlinear optimization, but also solves linear programming problems, quadratic programming problems, and systems of nonlinear equations. The unknowns in these problems must be continuous variables in continuous functions; however, the functions can be convex or nonconvex. KNITRO computes a numerical solution to the problem; it does not find a symbolic mathematical solution.

Optimization problems must be presented to KNITRO in mathematical form, and should provide a way of computing function derivatives using sparse matrices. Problems may be written in C, C++, Fortran, or Java, in which case KNITRO is called as a software routine to solve the problem. An often easier approach is to develop the optimization problem in an algebraic modeling language (AML) like AIMMS, AMPL, GAMS, Mathematica, etc. The modeling environment computes function derivatives, and KNITRO is called as a "solver" from within the environment.

KNITRO offers three different optimization algorithms for solving optimization problems. Two algorithms are of the interior point type, and one is of the active set type. These algorithms are known to have fundamentally different characteristics; for example, interior point methods follow a path through the interior of the feasible region, while active set methods tend to stay at the boundaries. KNITRO provides both types of algorithm for greater flexibility in solving problems, and allows crossover during the solution process from one algorithm to another. The code also provides a multistart option for promoting the computation of the global minimum.

KNITRO, short for "Nonlinear Interior point Trust Region Optimization" (the "K" is silent) was created primarily by Richard Waltz, Jorge Nocedal, Todd Plantenga and Richard Byrd. It is produced by Ziena Optimization, Inc. KNITRO was introduced in 2001 as a derivative of academic research at Northwestern, and has undergone continual improvement since.


The meaning of Cardinality at Wikipedia


Usage

permasense Project

research 2009.03.18 22:05
A project in the Swiss Alps that uses sensor networks to monitor and mitigate environmental damage caused by global warming. Very cool.
http://www.permasense.ch/

http://www.adaptivebox.net/research/bookmark/psocodes_link.html

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=10548&objectType=file


anneal Minimizes a function with the method of simulated annealing (Kirkpatrick et al., 1983)

ANNEAL takes three input parameters, in this order:

LOSS is a function handle (anonymous function or inline) with a loss function, which may be of any type, and needn't be continuous. It does, however, need to return a single value.

PARENT is a vector with initial guess parameters. You must input an initial guess.

OPTIONS is a structure with settings for the simulated annealing. If no OPTIONS structure is provided, anneal uses a default structure. OPTIONS can contain any or all of the following fields (missing fields are filled with default values):

Verbosity: Controls output to the screen.
0 suppresses all output
1 gives final report only [default]
2 gives temperature changes and final report
Generator: Generates a new solution from an old one. Any function handle that takes a solution as input and gives a valid solution (i.e. some point in the solution space) as output. The default function generates a row vector which slightly differs from the input vector in one element: @(x) (x+(randperm(length(x))==length(x))*randn/100). Other examples of possible solution generators: @(x) (rand(3,1)): Picks a random point in the unit cube. @(x) (ceil([9 5].*rand(2,1))): Picks a point in a 9-by-5 discrete grid.
InitTemp: The initial temperature, can be any positive number. Default is 1.
StopTemp: Temperature at which to stop, can be any positive number smaller than InitTemp. Default is 1e-8.
StopVal: Value at which to stop immediately, can be any output of LOSS that is sufficiently low for you. Default is -Inf.
CoolSched: Generates a new temperature from the previous one. Any function handle that takes a scalar as input and returns a smaller but positive scalar as output. Default is @(T) (.8*T).
MaxConsRej: Maximum number of consecutive rejections, can be any positive number. Default is 1000.
MaxTries: Maximum number of tries within one temperature, can be any positive number. Default is 300.
MaxSuccess: Maximum number of successes within one temperature, can be any positive number. Default is 20.


Usage:
[MINIMUM,FVAL] = ANNEAL(LOSS,NEWSOL,[OPTIONS]);
MINIMUM is the solution which generated the smallest encountered value when input into LOSS.
FVAL is the value of the LOSS function evaluated at MINIMUM.
OPTIONS = ANNEAL();
OPTIONS is the default options structure.


Example:
The so-called six-hump camelback function has several local minima in the range -3<=x<=3 and -2<=y<=2. It has two global minima, namely f(-0.0898,0.7126) = f(0.0898,-0.7126) = -1.0316. We can define and minimise it as follows:
camel = @(x,y)(4-2.1*x.^2+x.^4/3).*x.^2+x.*y+4*(y.^2-1).*y.^2;
loss = @(p)camel(p(1),p(2));
[x f] = anneal(loss,[0 0])
We get output:
Initial temperature: 1
Final temperature: 3.21388e-007
Consecutive rejections: 1027
Number of function calls: 6220
Total final loss: -1.03163
x =
-0.0899 0.7127
f =
-1.0316
Which reasonably approximates the analytical global minimum (note that due to randomness, your results will likely not be exactly the same).

http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html

Global Optimization

Summary

Global optimization has become a buzzword in optical design since Adaptive Simulated Annealing (ASA) was introduced in OSLO SIX in 1992. Several new optimization schemes have been introduced in other software to compete against ASA, but none seem to be designed to find a global optimum, vendor claims notwithstanding.

It's important to consider a few basic points about global optimization. First, a global optimum pertains to a specified error function in a specified region of search. Second, there can be only one global optimum in the region. Third, for continuous variables, the assertion of success is probabilistic. Finally, the search time should be specified for comparative evaluation. A typical result of a global optimization task might be something like "after a one hour search, point P has the highest probability of being the global optimum of error function F in region R". ASA has a solid theoretical basis for making such a claim.

Discussion

[Ed. Note] This article was originally published in 1993 as a Sinclair Optics Design Note titled "Global Optimization in Lens Design", by G.W. Forbes and Andrew E. W. Jones. It has been slightly edited, primarily to accommodate HTML typography and to bring references up to date.

OSLO SIX includes a global optimization algorithm called Adaptive Simulated Annealing (ASA). Although global optimization schemes have attracted substantial interest among lens designers in the past, only recently has affordable computer hardware become available with power sufficient to meet the high demands posed by global optimization problems in lens design.

The rationale for global optimization

The goal in lens design is the determination of the best possible lens that satisfies the given constraints. The performance of a lens system is measured by a merit function and the goal in the optimization stage of design is to determine the configuration of the lens system for which the merit function is smallest over the region of interest defined by the constraints—that is, to locate the global minimum of the merit function.

One simple global optimization scheme is the grid search: the merit function M is evaluated at points on a regular grid aligned to the coordinate axes, and the sample with the lowest value of M is taken as an estimate of the global minimum. Such a simple scheme, however, rapidly becomes unworkably inefficient with increasing dimensionality. Consider a problem in which five samples are required in each coordinate direction to locate the global minimum to the desired accuracy and in which the merit function requires 1 microsecond to evaluate. Thus, the time needed to perform a grid search in N dimensions is 5^N microseconds: for N=10, the search is completed in less than ten seconds, but for N=15 more than eight hours is needed, and in just 20 dimensions, over three years is required. Clearly, even with such sparse sampling and with such unrealistically favorable estimates of evaluation speed, a grid search is impractical; the sampling of the region of interest in the configuration space must be more selective.
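The 5^N blow-up is easy to sanity-check numerically; this small helper (the name is mine) simply restates the arithmetic of the paragraph:

```python
def grid_search_seconds(dims, samples_per_dim=5, eval_us=1.0):
    """Wall-clock time for an exhaustive grid search:
    samples_per_dim ** dims evaluations at eval_us microseconds each."""
    return samples_per_dim ** dims * eval_us / 1e6
```

With the defaults, 10 dimensions takes under ten seconds, 15 dimensions over eight hours, and 20 dimensions over three years, matching the figures in the text.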

The most widely employed optimization scheme consists of the selection of a sample point by the designer, followed by the application of a local optimizer (in OSLO, this is the "Iterate full" command) that selects subsequent samples with the aim of proceeding "downhill"—that is, in a direction of decreasing merit-function value. If the local optimum thus found is not of sufficient quality, a new initial sample is chosen and the process is repeated until a satisfactory local minimum is found. While computationally much less demanding than a grid search, the success of such a scheme is heavily dependent upon the choice of the starting point, and the resulting designs are typically similar in form to the starting point.

OSLO has an alternative optimization command, "Iterate standard", that does not impose the downhill requirement. This command has been used for many years to overcome the so-called "stagnation effect" exhibited by damped-least-squares optimizers. Recently, David Shafer [1] has documented a scheme for using this command to generate multiple solutions to optical design problems. Such solution generators are fundamentally different from the sort of global optimizer that is discussed here, however, in that the results depend heavily on the starting point.

By using a global optimization scheme, we can transcend some of the limitations of local optimizers. Although global optimization is more computationally demanding than local optimization, the rapid and continuing increases in the computer power that is available at a given monetary cost make global optimization ever more practical. Since the performance of global optimizers is essentially independent of the particular starting point, they are especially effective in situations in which suitable starting points are difficult to determine. One attractive global optimization scheme is the simulated annealing algorithm, first applied to lens design in 1984 [2]. Simulated annealing belongs to a class of global optimizers that are called controlled random search methods because the merit-function space is sampled randomly according to a scheme in which the values of several parameters determine the distribution of the random samples. Although these so-called Monte Carlo methods might at first seem as promising as throwing darts whilst blindfolded, there are sound reasons for their implementation (see Why go to Monte Carlo?).

Simulated annealing

The simulated annealing algorithm derives its name from the fact that its behavior is controlled principally by a parameter T, called "temperature", that is analogous to the temperature in the thermal annealing process. We call simulated annealing a "true" global optimization algorithm because each run attempts to search the entire region of interest for the global minimum rather than performing multiple downhill optimization runs in which the selection of the various starting points is automated. On the surface, simulated annealing is quite simple, as depicted by the following flowchart.

Figure 2. Simulated annealing flowchart

In simulated annealing, the optimization process is not required to proceed uniformly downhill, but is allowed to make occasional uphill moves. The typical increase in M that is acceptable in an uphill move is determined by the value of T. At the start of the annealing process, T has a relatively large value compared to the standard deviation of the merit function over the region of interest, and the random walk effectively traverses all of this region. As the random walk progresses, T is lowered, the allowed increases in M are then typically smaller, and the walk is effectively constrained to ever lower valleys. If T is reduced sufficiently slowly, the random walk can escape the higher valleys during its earlier stages, and it can be expected to terminate at the global minimum. If T is lowered more quickly, however, the random walk is more likely to become trapped in one of the higher valleys. This follows the analogy with the thermal annealing process: if a hot solid is cooled slowly, its final state is likely to be one of lower potential energy (usually, a more ordered arrangement of atoms); if it is cooled more quickly, the final state is likely to be one of higher potential energy.
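The loop described above can be sketched in a few lines. This is a generic textbook implementation, not OSLO's ASA: the geometric cooling schedule, fixed Gaussian step size, and iteration counts are illustrative assumptions, precisely the hand-tuned parameters that ASA replaces with adaptive controls.

```python
import math
import random

def anneal(loss, x0, t0=1.0, t_min=1e-4, cool=0.95, tries=100, step=0.5, rng=None):
    """Minimal simulated annealing for a 1-D merit function.

    Uphill moves are accepted with probability exp(-dM/T), so early (hot)
    stages can escape high valleys; the best point seen is tracked separately.
    """
    rng = rng or random.Random(0)              # fixed seed for reproducibility
    x, m = x0, loss(x0)
    best_x, best_m = x, m
    t = t0
    while t > t_min:
        for _ in range(tries):
            cand = x + rng.gauss(0.0, step)    # symmetric random step
            dm = loss(cand) - m
            # Metropolis criterion: always accept downhill, sometimes uphill
            if dm < 0 or rng.random() < math.exp(-dm / t):
                x, m = cand, m + dm
                if m < best_m:
                    best_x, best_m = x, m
        t *= cool                              # geometric cooling schedule
    return best_x, best_m

# Find the minimum of (x - 3)^2 starting far away at x = 10.
x_opt, m_opt = anneal(lambda x: (x - 3.0) ** 2, 10.0)
```

Note how poorly chosen values of `cool`, `step`, or `tries` would trap the walk in a high valley or waste evaluations, which is exactly the tuning problem discussed next.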

There exist numerous variants of simulated annealing, each of which represents an attempt to address the following concerns:

How is the initial value of T chosen, and how is T lowered?

How are the steps generated?

Most variants of simulated annealing require the user to answer these questions by providing the values of a set of parameters that control the course of optimization; for example, for the temperature, the user may be required to specify an initial value, a factor between 0 and 1 by which T is multiplied periodically, and a parameter used to control the length of this period. The parameters for tuning the step generation process are even more important. Determining appropriate values for all of these parameters is often accomplished through trial and error, which is prohibitively expensive for all but the simplest problems. Furthermore, many variants of annealing do not possess the desirable properties of invariance under linear transformations of the coordinates or of the merit function.

Adaptive Simulated Annealing

For these reasons, we have developed adaptive controls for all of the parameters in simulated annealing [3]. The performance of the resulting algorithm, which we call Adaptive Simulated Annealing (or just ASA), is invariant under linear transformations of both the merit function and, more importantly, the coordinates in the configuration space. ASA requires the specification of only a single parameter: the annealing rate, which determines the average rate at which T is lowered. Given a value for this parameter, everything else is automatically tuned to the particular problem at hand. Reducing the annealing rate slows the cooling (and hence increases the run time) and makes the search for the global minimum more thorough.

It follows from the earlier discussion of Monte Carlo algorithms that the effectiveness of any adaptive mechanism here will depend crucially on automatic control of the statistics of the random step generator. First, a fundamental requirement of simulated annealing is that the step distribution must be symmetric: a step s is just as likely as the step –s. We have chosen to use a Gaussian distribution for the steps. In two dimensions, a general Gaussian distribution resembles an elliptical cloud that becomes less dense away from its center; the proportions, orientation, and size of the ellipse are the key parameters in this case. It is clear that, as the value of T is lowered, the region being explored in the configuration space shrinks. It therefore seems reasonable to expect that the optimal Gaussian for the steps should also be modified as the annealing process advances. If the step size is too large for the current value of T, almost all of the attempted moves are rejected and the process stagnates. On the other hand, if the steps are too small, the walk may cover only a fraction of the configuration space that should be explored at the current temperature for optimal efficiency. It also appears reasonable to expect that the relative shape and orientation of the Gaussian cloud will need to be adjusted continually in order to maintain optimal efficiency. We have proposed a simple procedure that automatically adjusts the scale, shape, and orientation of the n-dimensional Gaussian at each stage during annealing.

The basic idea for step control in ASA is based on what is called the central limit theorem. This theorem states that, if you average a collection of independent random numbers, the result is a random number with a distribution that is roughly a (one-dimensional) Gaussian. This holds regardless of the distributions of each of the individual random numbers and it also follows that the variance (i.e., the mean-square spread) in the resulting random number is just the average of the variances of the individual random numbers. As a result, it turns out that a Gaussian multi-dimensional step generator can be realized by taking linear combinations of a collection of random vectors.

In ASA, the idea is to keep a record of the last m steps that have been accepted and to generate new steps by taking random linear combinations of these steps and scaling the result by an appropriate (dynamically adjusted) expansion factor. In this way, the statistics of the generated steps are directly coupled to both the behavior of the merit function and the current value of T. This scheme turns out to be not only effective but simple: without expensive matrix operations, the algorithm automatically adjusts to the multi-dimensional terrain in a way that is invariant under arbitrary linear transformations that scale and stretch the space in a manner that warps rectangular prisms to rotated, elongated parallelepipeds. Within the context of lens design, this invariance is of utmost importance since there is no natural measure of distance in the configuration space. (Recall that the coordinates are normally a mixture of refractive indices, thicknesses, curvatures, aspheric coefficients, etc., and there is no intuitive way to define the distance between two points in such a space.)
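A sketch of this step-generation idea (ours, not the exact published recipe): if each new step is a random linear combination of the last m accepted steps, the central limit theorem makes the result approximately Gaussian with covariance matching the recent history, so scale, shape, and orientation adapt automatically and no matrix operations are needed:

```python
import random

def make_step(accepted_steps, expansion=1.0):
    """Generate a trial step as a random linear combination of recent
    accepted steps (a sketch of the ASA idea, not the published recipe).

    By the central limit theorem, summing independent random multiples
    of the stored steps yields an approximately Gaussian step whose
    covariance mirrors that of the recent accepted steps -- so the
    generator tracks the local terrain without explicit matrix algebra.
    """
    n = len(accepted_steps[0])
    m = len(accepted_steps)
    step = [0.0] * n
    for s in accepted_steps:
        # symmetric coefficients: c and -c are equally likely, so the
        # resulting step distribution is symmetric as annealing requires
        c = random.gauss(0.0, 1.0) / m ** 0.5
        for i in range(n):
            step[i] += c * s[i]
    return [expansion * v for v in step]
```

If the accepted-step history is stretched along one direction, generated steps inherit that stretch, which is the adaptive behavior described above.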

Another important aspect of the guided random walk is the means for dealing with steps that take the system outside the region of interest. That is, if one or more constraints are violated by the proposed system, the step cannot be accepted. Efficiency is enhanced if, instead of simply rejecting such a system and requesting a new random step, the point at which the current step first crosses a constraint boundary is found and the step is then reflected off this constraint back into the region of interest. Without this reflection process, the regions near constraints turn out to be undersampled and this can adversely affect the overall efficiency.

For optimal efficiency it is important that the details of this process of reflection take a form that ensures invariance under linear transformations of the coordinates. Since our intuition is based in a space where there is a natural measure of distance, it seems unambiguous to state that a step should be reflected in a given plane as if it were a mirror. This is not so, however. The difficulty can be appreciated by observing that the idea of a normal to a plane surface is not invariant under linear transformations. To see this, consider two perpendicular lines on a page and imagine a scale transformation that stretches the plane along a direction that is not parallel to either of the lines. As a result, the lines will no longer be perpendicular. It follows that, to complete the specification of the reflection process, it is necessary to introduce a particular measure of distance — i.e., a "metric" — to the coordinate space.

In ASA, there is a natural metric defined by the elliptical Gaussian cloud of the step distribution: the distance between two points can be measured in terms of the number of steps (of mean size in that direction) required to move from one point to the other. Notice that, due to the coupling to the step distribution, the form of this metric evolves during the annealing process. Now, if the space is redrawn by reference to this metric (which amounts to a linear transformation) the elliptical Gaussian cloud takes the form of a sphere and this gives the natural representation in which to bounce off constraints as if they were mirrors. With this, the essential pieces of the crucial step-generation component of ASA are completely determined.
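The reflection rule this metric determines can be written in closed form. A sketch, under the assumption that the metric is represented by the covariance M of the Gaussian step cloud: reflecting specularly in the coordinates where the cloud becomes a sphere is equivalent to the formula below, and the result transforms consistently under linear changes of coordinates:

```python
import numpy as np

def reflect(d, a, M):
    """Reflect a step direction d off the constraint plane a.x = b,
    specularly with respect to the metric M (the step covariance).

    Sketch of the invariance argument in the text: in coordinates where
    the Gaussian step cloud is spherical, this reduces to an ordinary
    mirror reflection, yet no explicit whitening is needed.
    """
    d, a = np.asarray(d, float), np.asarray(a, float)
    Ma = M @ a
    return d - 2.0 * (a @ d) / (a @ Ma) * Ma
```

When M is the identity this is the familiar mirror formula; for a general M the "normal" direction is M·a rather than a, which is exactly why the naive mirror rule is not invariant.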

The details of the temperature control aspects of ASA are more sophisticated, but it is intuitive that, for a given merit function, there will be particular ranges for the value of T during which a relative reduction in the rate of decrease of T will lead to better overall efficiency. This aspect of the adaptiveness in ASA is controlled by monitoring certain statistics of the recent stages of the search. The process is terminated when the typical relative change in the current merit function value falls below a certain level. On any given problem, ASA should be run as many times as is workable and the designs that emerge can then be reoptimized and fine-tuned by using the "Iterate full" or "Iterate standard" commands. There is a range of context-specific matters that must be addressed in applying an algorithm like ASA to lens design problems. One of these issues is the proper choice of non-linear coordinate transformations to enhance efficiency.

The general outline presented here should give some appreciation of the philosophy underlying the design of ASA. It is important not to forget that typical design problems are challenging tasks and the success of an automated algorithm is still dependent upon the designer's guidance. No longer is the designer asked for a starting point; now the designer is asked to specify the region to be explored. We have enjoyed some impressive successes with ASA in lens design [4-6] and look forward to hearing of your experiences with this new algorithm in your own work.

References

1. David Shafer, Laser Focus World, 135 (Jan. 1993).

2. I.O. Bohachevsky et al., Proc. SPIE 485, 104 (1984).

3. A.E.W. Jones and G.W. Forbes, J. Global Optimization 6, 1 (1995).

4. G.W. Forbes and A.E.W. Jones, Proc. SPIE 1354, 144 (1990).

5. G.W. Forbes and A.E.W. Jones, Optics and Photonics News 3, 22 (March 1992).

6. G.W. Forbes and A.E.W. Jones, OSA Proceedings 22, 42 (1994).

Posted by 나마스떼

Important References for Wireless Sensor Networks

[1] Robert Kleinberg.
Geographic routing using hyperbolic space.
In Proc. of 26th IEEE International Conference on Computer Communications (INFOCOM'07), Anchorage, AK, 2007.

[2] S. Subramanian, S. Shakkottai, and P. Gupta.
On optimal geographic routing in wireless networks with holes and non-uniform traffic.
In Proc. of 26th IEEE International Conference on Computer Communications (INFOCOM'07), Anchorage, AK, 2007.

[3] Gang Zhao, Xiangqian Liu, and Min-Te Sun.
Anchor-based geographic routing for sensor networks using projection distance.
In Proc. of the 2nd IEEE International Symposium on Wireless Pervasive Computing (ISWPC'07), San Juan, Puerto Rico, February 2007.

[4] Gang Zhao, Xiangqian Liu, and Min-Te Sun.
Energy-aware geographic routing for sensor networks with randomly shifted anchors.
In Proc. of the IEEE Wireless Communications and Networking Conference (WCNC'07), pages 3454–3459, Hong Kong, China, March 2007.

[5] Noa Arad and Yuval Shavitt.
Minimizing recovery state in geographic ad-hoc routing.
In Proc. of the ACM 2006 MobiHoc, Florence, Italy, May 2006.

[6] Shigang Chen, Guangbin Fan, and Jun-Hong Cui.
Avoid "void" in geographic routing for data aggregation in sensor networks.
International Journal of Ad Hoc and Ubiquitous Computing (IJAHUC), (Special Issue on Wireless Sensor Networks), 1(4):169–178, 2006.

[7] Qing Fang, Jie Gao, and Leonidas J. Guibas.
Locating and bypassing holes in sensor networks.
IEEE Mobile Networks and Applications, 11(2):187–200, April 2006.

[8] Hannes Frey and Ivan Stojmenovic.
On delivery guarantees of face and combined greedy-face routing algorithms in ad hoc and sensor networks.
In Proc. of the 12th ACM Annual International Conference on Mobile Computing and Networking (MOBICOM'06), pages 390–401, Los Angeles, CA, September 2006.

[9] H. Huang.
Adaptive algorithms to mitigate inefficiency in greedy geographical routing.
IEEE Communication Letters, 10(3):150–152, March 2006.

[10] Ben Leong, Barbara Liskov, and Robert Morris.
Geographic routing without planarization.
In Proc. of the 3rd Symposium on Networked Systems Design & Implementation (NSDI'06), 2006.

[11] X. Ma, M.-T. Sun, X. Liu, and G. Zhao.
Improving geographical routing for wireless networks with an efficient path pruning algorithm.
In Proc. of the 2006 IEEE SECON, Reston, VA, September 2006.

[12] N. Abu-Ghazaleh, K. D. Kang, and K. Liu.
Towards resilient geographic routing in wireless sensor networks.
In Proc. of the 1st ACM Workshop on QoS and Security for Wireless and Mobile Networks, Montreal, Canada, October 2005.

[13] L. Blazevic, J.-Y. Le Boudec, and Silvia Giordano.
A location-based routing method for mobile ad hoc networks.
IEEE Transactions on Mobile Computing, 4(2):97–110, March-April 2005.

[14] Jehoshua Bruck, Jie Gao, and Anxiao (Andrew) Jiang.
Localization and routing in sensor networks by local angle information.
In Proc. of the 6th ACM international symposium on Mobile ad hoc networking and computing (MOBIHOC'05), Urbana-Champaign, IL, USA, May 2005.

[15] Rodrigo Fonseca, Sylvia Ratnasamy, Jerry Zhao, Cheng Tien Ee, David Culler, Scott Shenker, and Ion Stoica.
Beacon vector routing: Scalable point-to-point routing in wireless sensornets.
In Proc. of the 2nd USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI'05), 2005.

[16] Tian He, John A. Stankovic, Chenyang Lu, and Tarek F. Abdelzaher.
A spatiotemporal communication protocol for wireless sensor networks.
IEEE Transactions on Parallel and Distributed Systems, 16(10):995–1006, October 2005.

[17] Young-Jin Kim, Ramesh Govindan, Brad Karp, and Scott Shenker.
On the pitfalls of geographic face routing.
In Proc. of the 3rd ACM/SIGMOBILE International Workshop on Foundations of Mobile Computing (DialM-POMC'05), September 2005.

[18] Young-Jin Kim, Ramesh Govindan, Brad Karp, and Scott Shenker.
Geographic routing made practical.
In Proc. of the 2nd Symposium on Networked Systems Design and Implementation (NSDI'05), Boston, MA, 2005.

[19] B. Leong, S. Mitra, and B. Liskov.
Path vector face routing: geographic routing with local face information.
In Proc. of the 13th IEEE International Conference on Network Protocols (ICNP'05), 2005.

[20] C. Li, W. Hsu, B. Krishnamachari, and A. Helmy.
A local metric for geographic routing with power control in wireless networks.
In Proc. of the IEEE SECON'05, 2005.

[21] David Liben-Nowell, Jasmine Novak, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins.
Geographic routing in social networks.
Proceedings of the National Academy of Sciences (PNAS), 102(33):11623–11628, Aug 2005.

[22] Christos H. Papadimitriou and David Ratajczak.
On a conjecture related to geometric routing.
Theoretical Computer Science, (Special Issue on Algorithmic Aspects of Wireless Sensor Networks), 344(1):3–14, November 2005.

[23] L. Savidge, Huang Lee, H. Aghajan, and A. Goldsmith.
QoS-based geographic routing for event-driven image sensor networks.
In Proc. of the 2nd International Conference on Broadband Networks, volume 2, pages 991– 1000, 2005.

[24] R. C. Shah, A. Wolisz, and J. M. Rabaey.
On the performance of geographical routing in the presence of localization errors.
In Proc. of the IEEE ICC'05, 2005.

[25] M. Witt and V. Turau.
BGR: blind geographic routing for sensor networks.
In Proc. of the Third International Workshop on Intelligent Solutions in Embedded Systems (WISES'05), 2005.

[26] Fan Ye, Gary Zhong, Songwu Lu, and Lixia Zhang.
GRAdient Broadcast: A robust data delivery protocol for large scale sensor networks.
ACM Wireless Networks, 11(3):285–298, March 2005.

[27] Qing Fang, Jie Gao, and Leonidas J. Guibas.
Locating and bypassing holes in sensor networks.
In Proc. of IEEE INFOCOM'04, 2004.

[28] S. Fotopoulou-Prigipa and A.B. McDonald.
GCRP: geographic virtual circuit routing protocol for ad hoc networks.
In Proc. of the 1st IEEE International Conference on Mobile ad hoc and Sensor Systems (MASS'04), 2004.

[29] H. Frey.
Scalable geographic routing algorithms for wireless ad hoc networks.
IEEE Network Magazine, 18(4):18–22, July-August 2004.

[30] H. Huang.
Adaptive geographical routing in wireless ad-hoc networks.
In Proc. of the 2004 IEEE VTC, pages 2749–2753, 2004.

[31] T. Melodia, D. Pompili, and A. F. Akyildiz.
Optimal local topology knowledge for energy efficient geographical routing in sensor networks.
In Proc. of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'04), volume 3, pages 1705–1716, Hong Kong, China, March 2004.

[32] Karim Seada, Marco Zuniga, Ahmed Helmy, and Bhaskar Krishnamachari.
Energy-efficient forwarding strategies for geographic routing in lossy wireless sensor networks.
In Proc. of the 2nd international conference on Embedded networked sensor systems (SenSys'04), pages 108–121, Baltimore, MD, USA, 2004.

[33] G. Xing, C. Lu, R. Pless, and Q. Huang.
On greedy geographic routing algorithms in sensing-covered networks.
In Proc. of the 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc'04), Tokyo, Japan, May 2004.

[34] K. Seada, A. Helmy, and R. Govindan.
On the effect of localization errors on geographic face routing in sensor networks.
In Proc. of the 2004 IPSN, pages 71–80, April 2004.

[35] Tian He, John A. Stankovic, Chenyang Lu, and Tarek F. Abdelzaher.
SPEED: A stateless protocol for real-time communication in sensor networks.
In Proc. of international conference on distributed computing systems (ICDCS'03), 2003.

[36] Fabian Kuhn, Roger Wattenhofer, and Aaron Zollinger.
Worst-case optimal and average-case efficient geometric ad-hoc routing.
In Proc. of the 6th ACM international symposium on Mobile ad hoc networking and computing (MOBIHOC'03), Annapolis, Maryland, USA, June 2003.

[37] F. Kuhn, R. Wattenhofer, Y. Zhang, and A. Zollinger.
Geometric ad hoc routing: of theory and practice.
In Proc. of 22nd IEEE Symp. on Principles of Distributed Computing (PODC'03), 2003.

[38] Ananth Rao, Christos Papadimitriou, Scott Shenker, and Ion Stoica.
Geographic routing without location information.
In Proc. of the 9th annual international conference on Mobile computing and networking (MOBICOM'03), San Diego, CA, USA, 2003.

[39] Karim Seada, Ahmed Helmy, and Ramesh Govindan.
POSTER: On the effect of location inaccuracy on geographic face routing in wireless networks.
In Proc. of the 9th annual international conference on Mobile computing and networking (MOBICOM'03), San Diego, CA, USA, 2003.

[40] G. Xing, C. Lu, R. Pless, and Q. Huang.
Greedy geographic routing is good enough in sensing covered networks.
Technical Report WUCSE-03-50, CSE Dept., Washington University, 2003.

[41] J. Newsome and D. Song.
GEM: Graph EMbedding for routing and data-centric storage in sensor networks without geographic information.
In Proc. of the 2003 ACM Sensys, pages 76–88, November 2003.

[42] S. Datta, I. Stojmenovic, and J. Wu.
Internal node and shortcut based routing with guaranteed delivery in wireless networks.
Cluster Computing, 5(2):169–178, April 2002.

[43] Fabian Kuhn, Roger Wattenhofer, and Aaron Zollinger.
Asymptotically optimal geometric mobile ad-hoc routing.
In Proc. of the 6th international workshop on Discrete algorithms and methods for mobile computing and communications (DialM'02), Atlanta, GA, USA, September 2002.

[44] C. Lu, B. Blum, T. Abdelzaher, J. Stankovic, and T. He.
RAP: a real-time communication architecture for large-scale wireless sensor networks.
In Proc. of the 8th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'02), pages 55–66, Los Alamitos, USA, 2002.

[45] Prosenjit Bose, Pat Morin, Ivan Stojmenovic, and Jorge Urrutia.
Routing with guaranteed delivery in ad hoc wireless networks.
Kluwer Wireless Networks, 7(6):609–616, 2001.

[46] Douglas S. J. De Couto and Robert Morris.
Location proxies and intermediate node forwarding for practical geographic forwarding.
Technical Report MIT-LCS-TR824, MIT Laboratory for Computer Science, June 2001.

[47] Jie Gao, Leonidas J. Guibas, John Hershberger, Li Zhang, and An Zhu.
Geometric spanner for routing in mobile networks.
In Proc. of the 2nd ACM international symposium on Mobile ad hoc networking and computing (MOBIHOC'01), Long Beach, CA, USA, October 2001.

[48] M. Mauve, J. Widmer, and H. Hartenstein.
A survey on position-based routing in mobile ad hoc networks.
IEEE Network Magazine, 15(6):30–39, November 2001.

[49] Yan Yu, Ramesh Govindan, and Deborah Estrin.
Geographical and energy aware routing: a recursive data dissemination protocol for wireless sensor networks.
Technical Report UCLA/CSD-TR-01-0023, CS Dept., UCLA, May 2001.

[50] Brad Karp and H. T. Kung.
GPSR: greedy perimeter stateless routing for wireless networks.
In Proc. of the 6th annual international conference on Mobile computing and networking (MOBICOM'00), pages 243–254, Boston, MA, 2000.

[51] E. Kranakis, H. Singh, and J. Urrutia.
Compass routing on geometric networks.
In Proc. of the 11th Canadian Conference on Computational Geometry (CCCG'99), pages 51–54, Vancouver, Canada, August 1999.

[52] Prosenjit Bose, Pat Morin, Ivan Stojmenovic, and Jorge Urrutia.
Routing with guaranteed delivery in ad hoc wireless networks.
In Proc. of 3rd ACM International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications (DIAL-M'99), pages 48–55, Seattle, WA, August 1999.

[53] Y. B. Ko and N. H. Vaidya.
Location-aided routing (LAR) in mobile ad hoc networks.
In Proc. of the ACM international conference on Mobile computing and networking (MOBICOM'98), pages 66–75, Dallas, TX, October 1998.

Relay Node Placement

[1] X. Cheng, D. Du, L. Wang, and B. Xu.
Relay sensor placement in wireless sensor networks.
ACM/Springer WINET.

[2] E. L. Lloyd and G. Xue.
Relay node placement in wireless sensor networks.
IEEE Transactions on Computers, 56:134–138, 2007.

[3] Xiaofeng Han, Xiang Cao, Errol L. Lloyd, and Chien-Chung Shen.
Fault-tolerant relay node placement in heterogeneous wireless sensor networks.
In Proc. of IEEE INFOCOM, 2007.

[4] A. Kashyap, S. Khuller, and M. Shayman.
Relay placement for higher order connectivity in wireless sensor networks.
In Proc. of IEEE INFOCOM, 2006.

[5] J. Tang, B. Hao, and A. Sen.
Relay node placement in large scale wireless sensor networks.
Computer Communications, 29:490–501, 2006.

[6] J. L. Bredin, E. D. Demaine, M. Hajiaghayi, and D. Rus.
Deploying sensor networks with guaranteed capacity and fault tolerance.
In Proc. of ACM MobiHoc, 2005.

[7] D. Chen, D. Du, X. Hu, G. Lin, L. Wang, and G. Xue.
Approximations for Steiner trees with minimum number of Steiner points.
Journal of Global Optimization, 18:17–33, 2000.

[8] G. Lin and G. Xue.
Steiner tree problem with minimum number of Steiner points and bounded edge-length.
Information Processing Letters, 69:53–57, 1999.



Simulated Annealing

Annealing of a metal is the physical process of heating a solid until it melts and then cooling it until it forms a perfectly ordered crystal lattice. During this process the free energy of the solid is minimized. Long experience shows that the cooling must be done carefully and slowly to keep the solidifying material from becoming trapped in a local minimum.

A similar process can be defined for combinatorial optimization problems, which can be formulated as finding the minimum-cost solution among a potentially very large number of candidates. By establishing a correspondence between the cost function and free energy, and between solutions and physical states, we obtain a solution method for combinatorial optimization based on simulating the physical annealing process: this method is simulated annealing.

Simulated annealing was first proposed by Scott Kirkpatrick, Gelatt, and Vecchi [KIR83] and introduced in the context of combinatorial optimization [KIR82, KIR83, KIR94]; it was also studied and published independently by Cerny [CER85] in 1985. These ideas rest on the close relationship between the physical annealing of solids and combinatorial optimization problems with very many configurations.

The method's notable strengths are its wide applicability and its ability to find near-optimal solutions. It also has one substantial drawback: the computation time needed to reach a reasonably good solution can be extremely long. Much of that time can be recovered, however, by using computational models based on massively parallel execution, one of which is the Boltzmann machine.


An Application of the Simulated Annealing Algorithm

EUR100 [AAR89] is a traveling-salesman problem (TSP) over 100 European cities; it is symmetric and uses two-dimensional planar Euclidean distances. The elements of the distance matrix are computed from geographical coordinates given in a table.
Figure 1 [AAR89] shows the stage-by-stage progress of a simulated annealing run on the EUR100 problem. As Figure 1(a) shows, the initial solution is a random ordering of the 100 cities with the control parameter c at 17.85, far from optimal. This solution is highly disordered, has very large entropy, and has a total tour length of no less than 129,965. In Figures 1(b) and (c), c is 4.46 and 1.28 and the total tour length falls to 68,153 and then 33,048. Finally, a nearly optimal solution is obtained as in Figure 1(d), where c is 0.06; here the pattern has no crossings, the entropy is very small, and the total tour length is only 21,456, far shorter than in the earlier stages.

[Figure 1]


The minimum tour length for this problem is 21,134; the corresponding route is shown in Figure 2 [AAR88b], where the straight-line connections indicate the optimal tour. Finding the optimal tour took 59.5 CPU seconds on a CYBER-205 computer.

[Figure 2]
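A miniature version of the EUR100 experiment can be sketched as simulated annealing over tours with 2-opt reversal moves; the problem size and all parameters here are illustrative, not those of [AAR89]:

```python
import math
import random

def tour_length(tour, pts):
    """Length of a closed tour over points pts, including the return leg."""
    return sum(math.dist(pts[tour[i]], pts[tour[i - 1]]) for i in range(len(tour)))

def anneal_tsp(pts, T0=1.0, alpha=0.95, steps_per_T=200, T_min=1e-3):
    """Simulated annealing for the TSP with 2-opt reversal moves
    (illustrative parameter values)."""
    n = len(pts)
    tour = list(range(n))
    random.shuffle(tour)            # start from a random permutation
    length = tour_length(tour, pts)
    T = T0
    while T > T_min:
        for _ in range(steps_per_T):
            i, j = sorted(random.sample(range(n), 2))
            # 2-opt move: reverse the segment between positions i and j
            cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            d = tour_length(cand, pts) - length
            if d <= 0 or random.random() < math.exp(-d / T):
                tour, length = cand, length + d
        T *= alpha
    return tour, length
```

The control parameter c in the figures plays the role of T here: early on almost any reordering is accepted, and as it falls the tour untangles into a low-entropy, crossing-free loop.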

Sources:
http://www.aistudy.com/neural/simul_kim.htm
http://en.wikipedia.org/wiki/Simulated_annealing

The K-means algorithm is simple and direct. In summary:

Step 1. Choose k cluster centroids at random.

Step 2. Assign each vector to the nearest cluster centroid.

Step 3. For each cluster, sum the vectors assigned to it and take their mean: μ_k = (1/|C_k|) Σ_{x_i ∈ C_k} x_i.

Step 4. Replace each cluster centroid with the mean computed in step 3.

Step 5. Repeat steps 2-4 until a convergence threshold is satisfied.
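A minimal pure-Python sketch of the five steps above (function and parameter names are ours; convergence is declared when the centroids stop moving):

```python
import math
import random

def kmeans(vectors, k, max_iter=100, tol=1e-6, seed=0):
    """Basic K-means: random initial centroids, nearest-centroid
    assignment, centroid update as the mean of assigned vectors,
    repeated until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]    # step 1
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in vectors:                                    # step 2
            j = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            clusters[j].append(v)
        moved = 0.0
        for j, members in enumerate(clusters):               # steps 3-4
            if members:
                mean = [sum(x) / len(members) for x in zip(*members)]
                moved = max(moved, math.dist(mean, centroids[j]))
                centroids[j] = mean
        if moved < tol:                                      # step 5
            break
    return centroids, clusters
```

Rerunning with different seeds illustrates the drawback noted below: the result depends on the initial centroids and may be only a local optimum.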

The main drawbacks of K-means are that the result is highly sensitive to the initial cluster centroids and that the algorithm can get stuck in local optima.

Related links
http://fconyx.ncifcrf.gov/~lukeb/kmeans.html
http://fconyx.ncifcrf.gov/~lukeb/kmeanout.html
http://fconyx.ncifcrf.gov/~lukeb/clusdis.html
http://progray.vogle.com/archive/200610?page=5




Bloom filter (Wikipedia)
other materials

Performance analysis is the process of measuring various events on a running computer system to determine when, and where, the system comes under load. Four factors affect the operational load of any computing system: CPU, memory, I/O, and network. Poor I/O performance may be caused by slow devices or a slow I/O bus, and a CPU cache that is too small for the workload can also limit performance. Of the four, the network carries the most potential trouble spots, because routers, switches, bridges, hubs, cables, and adapters may all be connected along the way.

Types of Performance Analysis
1) Mathematical techniques
Analyzing computing-system performance mathematically requires no complicated calculations or formulas; quite simple mathematics suffices for a first-rate analysis. Among these techniques, queueing mathematics is the most important and the simplest.

2) Estimation techniques
Estimation means lining up a series of observed numbers, analyzing the trend in their changes, and projecting the numbers likely to follow. Figure 1 shows the curve of total I/O for one system; as the figure indicates, additional I/O beyond W-8 is estimated so that the trend can be followed. Such a curve is usually produced with a spreadsheet program. The purple bars from W-1 through W-7 are plotted from actual data, while W-8 through W-12 are projections based on the data through W-7, giving estimated growth in the I/O counts.
From this graph we can predict how I/O is changing on the current system and where it is heading. One caution: as with weather forecasting, trend estimates are never 100% accurate. It is also important to treat this trend analysis as one part of a larger picture, and since projections can change, outside conditions should be kept in mind when weighing the estimated data.
One property of I/O is that, as the I/O count rises, the response time of a disk or disk array increases very slowly up to a certain point. A sudden jump in response time means the device's capacity has been exceeded; Figure 2 shows this situation.
The point of sudden increase in Figure 2 shows I/O surging and I/O performance falling off. This is the danger zone for I/O: once it is reached, keep in mind that the system is primed to develop problems.
The curve lets us predict when the danger zone will be reached. To find out when it is actually reached, we can either keep testing or use queueing mathematics to locate the zone.
Few companies are well placed to test hardware continuously to identify this zone, so using queueing mathematics to find the danger zone is usually the better choice in both time and cost.
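The spreadsheet-style projection described here amounts to fitting a straight line to the observed weekly values and extending it. A minimal sketch (function and variable names are illustrative):

```python
def extrapolate(observed, horizon):
    """Least-squares linear trend through the observed weekly values,
    projected `horizon` periods ahead -- the same arithmetic a
    spreadsheet trend line performs for the W-8..W-12 bars."""
    n = len(observed)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(observed) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, observed))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    # project the fitted line past the last observed period
    return [intercept + slope * (n + h) for h in range(horizon)]
```

As the text warns, the projection is only a trend: it cannot anticipate workload changes, so it should be one input among several.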

3) Queueing mathematics and modeling
A queueing system can be a very complex system made up of stations, requests, and queues.
For a very simple example, consider a barbershop or hair salon: the stations are the people cutting hair; the requests are the customers who come in for a haircut; and since customers must wait their turn for service, the waiting line is the queue.
The ways to shorten or avoid the queue are to add stations, increasing the amount of service, or to reduce the amount of requests. Having a stylist rush haircuts to take more customers is not a good approach, and if customers keep arriving for haircuts, the demand cannot be reduced either.
If one stylist takes 20 minutes per customer, two stylists can serve two customers every 20 minutes; with two stations, the best sustainable arrival rate is two customers per 20 minutes. When this arrival rate and the service rate are balanced, no queue forms. If the arrival rate rises to three customers every 20 minutes, however, one customer must wait the 20 minutes while another customer is served; the queue then holds one customer. As the arrival rate grows, the queue keeps lengthening until stations are added.
This is a very simple account of queueing. Queueing theory is in fact a rather difficult branch of mathematics, but understanding what queues mean and applying them to real systems and performance analysis helps a great deal; recognizing the need for queueing analysis is itself meaningful progress.
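The barbershop arithmetic can be made quantitative with the Erlang C formula, a standard M/M/c queueing result that is not derived in the text (function and parameter names are ours):

```python
import math

def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving customer must queue in an M/M/c
    system (Erlang C).  Requires utilization rho = lambda/(c*mu) < 1;
    otherwise the queue grows without bound, as in the
    three-customers-per-20-minutes case above."""
    a = arrival_rate / service_rate           # offered load in Erlangs
    rho = a / servers
    if rho >= 1.0:
        return 1.0                            # saturated: queue grows forever
    top = a ** servers / (math.factorial(servers) * (1.0 - rho))
    bottom = sum(a ** k / math.factorial(k) for k in range(servers)) + top
    return top / bottom
```

For two stylists averaging 20-minute haircuts (service rate 3 per hour each), arrivals at 5 per hour give a waiting probability of about 0.76; at 6 per hour or more the shop is saturated and the queue only grows.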

The Performance Analysis Process
The most basic tasks for an IT staff member doing performance analysis are building an application profile, collecting data, and writing the performance analysis report.

1) Application profile
The first task is to profile the application. With help from the application support team, which holds broad and accurate system data, gather the information needed for the profile and draw the application's configuration precisely in a diagram so the system layout can be taken in at a glance. The diagram should include everything possible about the application: hardware, software, network, and so on.

2) Data collection
One difficulty in measuring computing-system performance is deciding how to analyze the software's performance. Once data collection is finished, discuss the character of the data and the analysis with the application analysts, system operators, and network support engineers, and draw conclusions together. Bear in mind that collecting software data during the busiest hours can itself cause problems worse than those actually occurring on the system.

3) Writing the management report
A report presenting the conclusions drawn from the system diagram and the collected data should contain the following four items.

● Management Summary: Define the problems first, then state recommendations for resolving them. It is usually best to offer more than one recommendation so that a choice can be made and, above all, to describe in detail the future impact of not acting on the recommendations.
● System Description: Include performance graphs for the hardware and the applications, with a detailed explanation of what each graph means. This greatly helps the reader understand the next item, the problem areas.
● Problem Areas: Writing up the problem areas usually surfaces every issue in the data, CPU, and network: whether data kept on a single disk should be spread across several disks, the fact that the CPU is often the likeliest source of trouble, and, although the network rarely causes many problems, whether the database memory buffers should be enlarged.
● Recommendations: The final item should explain how to resolve the problems described in the three items above, stated briefly and clearly, points only. It should also state how much fixing each problem will cost and how it can be done most easily.

4) Final checklist
The following points should be kept in mind when finishing the management report.

▶ Analyze the data accurately: Treat analyzing and examining the data as accurately and clearly as possible as the top priority; conclusions based on guesswork rather than data will not hold up.
▶ Keep the report concise: Avoid long-winded, overly detailed write-ups; the report should be brief and clear.
▶ Prepare for questions in advance: When the report is presented or submitted, executives will ask many questions about the results. Prepare answers to the likely questions ahead of time so they can decide quickly.
▶ Handle differing opinions gracefully: If executives take a different view of the results, do not push back. Today's computing systems are highly complex and tightly interlinked with many other conditions, so outcomes are easily affected.
▶ Analyze cost-effectively: A great deal of time is sometimes spent on elaborate mathematical techniques. They have their place, but performance analysis grounded in long operational experience can produce excellent results with only basic mathematics.

Performance analysis process diagram
Figure 3 shows a performance analysis process that incorporates these steps. As budgets tighten, organizations increasingly look for ways to analyze performance while spending as little as possible, and the general approach of Figure 3 is widely used for that reason.

Changes in storage performance analysis
In today's advanced IT environments, the most important role of a systems practitioner is to raise system utilization through systematic analysis, and performance analysis grew out of that need. Before deciding how to manage systems productively or whether to adopt new ones, it is essential to analyze the current system thoroughly and to produce a meaningful management report on how it will be used and managed.
Performance management as a discipline is only just getting started. Beyond analyzing the computing environment, storage performance analysis raises still more complex issues: storage arrays, NAS, SAN, and storage subsystems.
This article has sketched only a very simple, rough sense of what storage performance analysis is, and that picture will no doubt change greatly within a few years. Existing tools will surely branch out and grow more robust across diverse network architectures, protocols, and storage devices.


http://www.vsix.net/other/guide/iperf/iperf_speed_measure.htm

IETF Working Group - IP Performance Metrics (ippm)

 

 

Framework for IP Performance Metrics (RFC 2330)

 

1. Overview

Defines IP-layer network concepts and explains metrics and measurement methodology.

 

2. Terminology for Paths and Clouds

Definitions of the network components:

Host, Link, Router, Path, subpath, cloud, exchange, cloud subpath, path digest

 

3. Measurement Methodology

- Direct measurement of a performance metric using injected test traffic

- Projection of a metric from lower-level measurement

- Estimation of a constituent metric from a set of more aggregated measurements

- Estimation of a given metric at one time from a set of related metrics at other times

 

4. Metrics

A-Frame Concept

- Propagation time of a link

             Time for a single bit to travel from the output port on one Internet host across a single link to another Internet host

- Bandwidth of a link for packets of size K

             Bits/second

- Route of a path

- Hop count of a route

             The number n of hops along the route

 

5. Sampling

- Singleton metric

             Single instance of bulk throughput capacity from one host to another might be defined as a singleton metric, even though the instance involves measuring the timing of a number of Internet packets.

- Sample metric

             Refers to metrics derived from a given singleton metric by taking a number of distinct instances together

- Statistical metric

             Refers to metrics derived from a given sample metric by computing some statistic of the values defined by the singleton metric on the sample

 

5.1. Methods of Collecting Samples

- Poisson Sampling

             G(t) = 1 - exp(-lambda*t)

- Geometric Sampling

             Closely related to Poisson sampling

- Generating Poisson Sampling Intervals

             Generate intervals Ei = -log(Ui)/lambda, where the Ui are independent random values uniformly distributed over (0, 1]
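The interval-generation recipe above can be sketched in Python; `poisson_intervals` and its parameter names are this sketch's own, not from the RFC:

```python
import math
import random

def poisson_intervals(lam, t0, tf, seed=None):
    """Sampling times in [t0, tf] whose gaps are the exponentially
    distributed intervals Ei = -log(Ui)/lambda described above."""
    rng = random.Random(seed)
    times = []
    t = t0
    while True:
        u = 1.0 - rng.random()      # Ui uniform over (0, 1], so log() is safe
        t += -math.log(u) / lam     # Ei = -log(Ui)/lambda
        if t > tf:
            break
        times.append(t)
    return times

times = poisson_intervals(lam=2.0, t0=0.0, tf=60.0, seed=1)
# with lambda = 2/s, the gaps between sampling times average about 0.5 s
```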

 

5.2 Self-Consistency

A fundamental requirement for a sound measurement methodology is that measurements be made using as few unconfirmed assumptions as possible.

Experience has painfully shown how easy it is to make an assumption that turns out to be incorrect.

 

5.3 Defining Statistical Distributions

 

5.4 Testing for Goodness-of-Fit

For some forms of measurement calibration we need to test whether a set of numbers is consistent with those numbers having been drawn from a particular distribution.

 

 


IPPM Metrics for Measuring Connectivity (RFC 2678)

 

1. Overview

Defines how Internet connectivity is measured.

Describes Instantaneous One-way Connectivity, Instantaneous Two-way Connectivity, One-way Connectivity, Two-way Connectivity, and Two-way Temporal Connectivity.

In particular, look closely at Two-way Temporal Connectivity, the measurement method for TCP connectivity.

 

2. Instantaneous One-Way Connectivity

Type-P-Instantaneous-Unidirectional-Connectivity

 

2.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

 

2.2 Definition

Src has *Type-P-Instantaneous-Unidirectional-Connectivity* to Dst at time T if a type-P packet transmitted from Src to Dst at time T will arrive at Dst

 

3. Instantaneous Two-Way Connectivity

Type-P-Instantaneous-Bidirectional-Connectivity

 

3.1 Parameters

A1, the IP address of a host

A2, the IP address of a host

T, a time

 

3.2 Definition

Addresses A1 and A2 have *Type-P-Instantaneous-Bidirectional-Connectivity* at time T if address A1 has Type-P-Instantaneous-Unidirectional-Connectivity to address A2 at time T and address A2 has Type-P-Instantaneous-Unidirectional-Connectivity to address A1 at time T.

 

4. One-Way Connectivity

Type-P-Interval-Unidirectional-Connectivity

 

4.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

dT, a duration

 

4.2 Definition

Address Src has *Type-P-Interval-Unidirectional-Connectivity* to address Dst during the interval [T, T+dT] if for some T' within [T, T+dT] it has Type-P-instantaneous-connectivity to Dst.

 

5. Two-way Connectivity

Type-P-Interval-Bidirectional-Connectivity

 

5.1 Parameters

A1, the IP address of a host

A2, the IP address of a host

T, a time

dT, a duration

 

5.2 Definitions

Addresses A1 and A2 have *Type-P-Interval-Bidirectional-Connectivity* between them during the interval [T, T+dT] if address A1 has Type-P- Interval-Unidirectional-Connectivity to address A2 during the interval and address A2 has Type-P-Interval-Unidirectional-Connectivity to address A1 during the interval.

 

6. Two-way Temporal Connectivity

Type-P1-P2-Interval-Temporal-Connectivity

 

6.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

dT, a duration

 

6.2 Definitions

Address Src has *Type-P1-P2-Interval-Temporal-Connectivity* to address Dst during the interval [T, T+dT] if there exist times T1 and T2, and time intervals dT1 and dT2, such that:

- T1, T1+dT1, T2, T2+dT2 are all in [T, T+dT].

- T1+dT1 <= T2.

- At time T1, Src has Type-P1 instantaneous connectivity to Dst.

- At time T2, Dst has Type-P2 instantaneous connectivity to Src.

- dT1 is the time taken for a Type-P1 packet sent by Src at time T1 to arrive at Dst.

- dT2 is the time taken for a Type-P2 packet sent by Dst at time T2 to arrive at Src.

 

6.3 Inputs

- Types P1 and P2, addresses A1 and A2, interval [T, T+dT].

- N, the number of packets to send as probes for determining connectivity.

- W, the "waiting time", which bounds for how long it is useful to wait for a reply to a packet.

- Required: W <= 255, dT > W.

 

6.4 Algorithms

- Compute N *sending-times* that are randomly, uniformly distributed over [T, T+dT-W].

- At each sending time, transmit from A1 a well-formed packet of type P1 to A2.

- Inspect incoming network traffic to A1 to determine if a successful reply is received.  The particulars of doing so are dependent on types P1 & P2, discussed below.  If any successful reply is received, the value of the measurement is "true".  At this point, the measurement can terminate.

- If no successful replies are received by time T+dT, the value of the measurement is "false".
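The algorithm above can be sketched as follows; `send_probe` and `await_reply` are hypothetical callbacks standing in for the measurement tool's packet I/O, and sending the probes in time order is a natural (if unstated) choice:

```python
import random

def probe_schedule(T, dT, W, N, seed=None):
    """N sending times drawn uniformly at random over [T, T+dT-W], so
    every probe still has W seconds left to wait for a reply."""
    rng = random.Random(seed)
    return sorted(rng.uniform(T, T + dT - W) for _ in range(N))

def measure_connectivity(send_probe, await_reply, T, dT, W, N):
    """Returns True as soon as any probe gets a successful reply,
    False if none has by the end of the interval."""
    for t in probe_schedule(T, dT, W, N):
        send_probe(t)              # transmit a well-formed Type-P1 packet at time t
        if await_reply(t + W):     # inspect incoming traffic until the deadline
            return True            # any successful reply -> "true", terminate early
    return False                   # no successful reply by T+dT -> "false"
```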

 

6.5 Specific methodology for TCP

A TCP-port-N1-port-N2 methodology sends TCP SYN packets with source port N1 and dest port N2 to address A2.  Network traffic incoming to A1 is interpreted as follows:

+    A SYN-ack packet from A2 to A1 with the proper acknowledgement fields and ports indicates temporal connectivity.  The measurement terminates immediately with a value of "true".  {Comment: if, as a side effect of the methodology, a full TCP connection has been established between A1 and A2 -- that is, if A1's TCP stack acknowledges A2's SYN-ack packet, completing the three-way handshake -- then the connection now established between A1 and A2 is best torn down using the usual FIN handshake, and not using a RST packet, because RST packets are not reliably delivered.  If the three-way handshake is not completed, however, which will occur if the measurement tool on A1 synthesizes its own initial SYN packet rather than going through A1's TCP stack, then A1's TCP stack will automatically terminate the connection in a reliable fashion as A2 continues transmitting the SYN-ack in an attempt to establish the connection.  Finally, we note that using A1's TCP stack to conduct the measurement complicates the methodology in that the stack may retransmit the initial SYN packet, altering the number of probe packets sent.}

+    A RST packet from A2 to A1 with the proper ports indicates temporal connectivity between the addresses (and a *lack* of service connectivity for TCP-port-N1-port-N2 - something that probably should be addressed with another metric).

+    An ICMP port-unreachable from A2 to A1 indicates temporal connectivity between the addresses (and again a *lack* of service connectivity for TCP-port-N1-port-N2).  {Comment: TCP implementations generally do not need to send ICMP port-unreachable messages because a separate mechanism is available (sending a RST).  However, RFC 1122 states that a TCP receiving an ICMP port-unreachable MUST treat it the same as the equivalent transport-level mechanism (for TCP, a RST).}

+    An ICMP host-unreachable or network-unreachable to A1 (not necessarily from A2) with an enclosed IP header matching that sent from A1 to A2 *suggests* a lack of temporal connectivity.  If by time T+dT no evidence of temporal connectivity has been gathered, then the receipt of the ICMP can be used as additional information to the measurement value of "false".


A One-way Delay Metric for IPPM (RFC 2679)

 

1. Overview

One-way rather than round-trip measurement is often what is needed; this document explains how to measure one-way delay efficiently.

 

2. General Issues Regarding Time

synchronization*

         measures the extent to which two clocks agree on what time it is.  For example, the clock on one host might be 5.4 msec ahead of the clock on a second host.  {Comment: A rough ITU-T equivalent is "time error".}

accuracy*

         measures the extent to which a given clock agrees with UTC.  For example, the clock on a host might be 27.1 msec behind UTC.  {Comment: A rough ITU-T equivalent is "time error from UTC".}

resolution*

         measures the precision of a given clock.  For example, the clock on an old Unix host might tick only once every 10 msec, and thus have a resolution of only 10 msec.  {Comment: A very rough ITU-T equivalent is "sampling period".}

skew*

         measures the change of accuracy, or of synchronization, with time.  For example, the clock on a given host might gain 1.3 msec per hour and thus be 27.1 msec behind UTC at one time and only 25.8 msec an hour later.  In this case, we say that the clock of the given host has a skew of 1.3 msec per hour relative to UTC, which threatens accuracy.  We might also speak of the skew of one clock relative to another clock, which threatens synchronization.  {Comment: A rough ITU-T equivalent is "time drift".}

 

3. A Singleton Definition for One-way Delay

Type-P-One-way-Delay

 

3.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

 

3.2 Definition

For a real number dT, >>the *Type-P-One-way-Delay* from Src to Dst at T is dT<< means that Src sent the first bit of a Type-P packet to Dst at wire-time* T and that Dst received the last bit of that packet at wire-time T+dT.

 

>>The *Type-P-One-way-Delay* from Src to Dst at T is undefined (informally, infinite)<< means that Src sent the first bit of a Type-P packet to Dst at wire-time T and that Dst did not receive that packet.

 

3.3 Methodologies

As with other Type-P-* metrics, the detailed methodology will depend on the Type-P (e.g., protocol number, UDP/TCP port number, size, precedence).

Generally, for a given Type-P, the methodology would proceed as follows:

   +  Arrange that Src and Dst are synchronized; that is, that they have clocks that are very closely synchronized with each other and each fairly close to the actual time.

   +  At the Src host, select Src and Dst IP addresses, and form a test packet of Type-P with these addresses.  Any 'padding' portion of the packet needed only to make the test packet a given size should be filled with randomized bits to avoid a situation in which the measured delay is lower than it would otherwise be due to compression techniques along the path.

   +  At the Dst host, arrange to receive the packet.

   +  At the Src host, place a timestamp in the prepared Type-P packet, and send it towards Dst.

   +  If the packet arrives within a reasonable period of time, take a timestamp as soon as possible upon the receipt of the packet.  By subtracting the two timestamps, an estimate of one-way delay can be computed.  Error analysis of a given implementation of the method must take into account the closeness of synchronization between Src and Dst.  If the delay between Src's timestamp and the actual sending of the packet is known, then the estimate could be adjusted by subtracting this amount; uncertainty in this value must be taken into account in error analysis.  Similarly, if the delay between the actual receipt of the packet and Dst's timestamp is known, then the estimate could be adjusted by subtracting this amount; uncertainty in this value must be taken into account in error analysis.  See the next section, "Errors and Uncertainties", for a more detailed discussion.

   +  If the packet fails to arrive within a reasonable period of time, the one-way delay is taken to be undefined (informally, infinite).  Note that the threshold of 'reasonable' is a parameter of the methodology.

 

Issues such as the packet format, the means by which Dst knows when to expect the test packet, and the means by which Src and Dst are synchronized are outside the scope of this document.  {Comment: We plan to document elsewhere our own work in describing such more detailed implementation techniques and we encourage others to as well.}
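The timestamp arithmetic in these steps can be sketched with UDP on a single host, where Src and Dst share one clock and synchronization is trivially exact; the 8-byte payload layout and 2-second 'reasonable' threshold are this sketch's assumptions:

```python
import socket
import struct
import time

def one_way_delay_demo():
    """Send a timestamped UDP test packet to ourselves and compute
    delay = receive_time - send_time."""
    dst = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dst.bind(("127.0.0.1", 0))          # Dst arranges to receive the packet
    dst.settimeout(2.0)                 # the 'reasonable' period of time
    addr = dst.getsockname()

    src = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Src places a timestamp in the test packet and sends it towards Dst
    src.sendto(struct.pack("!d", time.time()), addr)

    try:
        data, _ = dst.recvfrom(64)
        t_recv = time.time()            # timestamp taken upon receipt
        return t_recv - struct.unpack("!d", data)[0]
    except socket.timeout:
        return None                     # undefined (informally, infinite)
    finally:
        src.close()
        dst.close()

delay = one_way_delay_demo()
```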

 

3.4 Reporting the metric

 

3.4.1 Type-P

The value of the metric may depend on the type of IP packets used to make the measurement, or "type-P".  The value of Type-P-One-way-Delay could change if the protocol (UDP or TCP), port number, size, or arrangement for special treatment (e.g., IP precedence or RSVP) changes.  The exact Type-P used to make the measurements MUST be accurately reported.

 

3.4.2 Loss threshold

In addition, the threshold (or methodology to distinguish) between a large finite delay and loss MUST be reported.

 

3.4.3 Calibration results

- If the systematic error can be determined, it SHOULD be removed from the measured values.

- You SHOULD also report the calibration error, e, such that the true value is the reported value plus or minus e, with 95% confidence (see the last section.)

- If possible, the conditions under which a test packet with finite delay is reported as lost due to resource exhaustion on the measurement instrument SHOULD be reported.

 

3.4.4 Path

Finally, the path traversed by the packet SHOULD be reported, if possible.  In general it is impractical to know the precise path a given packet takes through the network.  The precise path may be known for certain Type-P on short or stable paths.  If Type-P includes the record route (or loose-source route) option in the IP header, and the path is short enough, and all routers* on the path support record (or loose-source) route, then the path will be precisely recorded.  This is impractical because the route must be short enough, many routers do not support (or are not configured for) record route, and use of this feature would often artificially worsen the performance observed by removing the packet from common-case processing.  However, partial information is still valuable context. For example, if a host can choose between two links* (and hence two separate routes from Src to Dst), then the initial link used is valuable context.  {Comment: For example, with Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP by either of several different backbone networks.}

 

4. Samples of One-way Delay

Type-P-One-way-Delay-Poisson-Stream

 

4.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T0, a time

Tf, a time

lambda, a rate in reciprocal seconds

 

4.2 Definition

Given T0, Tf, and lambda, we compute a pseudo-random Poisson process beginning at or before T0, with average arrival rate lambda, and ending at or after Tf.  Those time values greater than or equal to T0 and less than or equal to Tf are then selected.  At each of the times in this process, we obtain the value of Type-P-One-way-Delay at this time.  The value of the sample is the sequence made up of the resulting <time, delay> pairs.  If there are no such pairs, the sequence is of length zero and the sample is said to be empty.

 

4.3 Methodology

The methodologies follow directly from:

-the selection of specific times, using the specified Poisson arrival process, and

- the methodologies discussion already given for the singleton Type-P-One-way-Delay metric. Care must, of course, be given to correctly handle out-of-order arrival of test packets; it is possible that the Src could send one test packet at TS[i], then send a second one (later) at TS[i+1], while the Dst could receive the second test packet at TR[i+1], and then receive the first one (later) at TR[i].

 

4.4 Reporting the metric

 

Same as 3.4

 

5 Others

 

5.1. Type-P-One-way-Delay-Percentile

Given a Type-P-One-way-Delay-Poisson-Stream and a percent X between 0% and 100%, the Xth percentile of all the dT values in the Stream. In computing this percentile, undefined values are treated as infinitely large.  Note that this means that the percentile could thus be undefined (informally, infinite).  In addition, the Type-P-One-way-Delay-Percentile is undefined if the sample is empty.

 

   Example: suppose we take a sample and the results are:

      Stream1 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      <T5, 500 msec>
      >

   Then the 50th percentile would be 110 msec, since 90 msec and 100 msec are smaller and 110 msec and 'undefined' are larger.

   Note that if the possibility that a packet with finite delay is reported as lost is significant, then a high percentile (90th or 95th) might be reported as infinite instead of finite.
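A small Python rendering of this definition, treating undefined delays (`None`) as infinitely large; the "smallest value with at least X% of the sample at or below it" rank convention is an assumption, chosen because it reproduces the worked example:

```python
import math

def one_way_delay_percentile(stream, x):
    """Xth percentile of the dT values in a <time, delay> stream.
    None (undefined) sorts as infinitely large; an empty sample,
    or a percentile that lands on an undefined value, is undefined."""
    if not stream:
        return None
    vals = sorted(math.inf if dt is None else dt for _, dt in stream)
    k = max(math.ceil(x / 100.0 * len(vals)), 1)   # rank of the percentile
    v = vals[k - 1]
    return None if v == math.inf else v

# the Stream1 example above, delays in msec
stream1 = [("T1", 100), ("T2", 110), ("T3", None), ("T4", 90), ("T5", 500)]
p50 = one_way_delay_percentile(stream1, 50)   # 110 msec, as in the example
```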

 

5.2. Type-P-One-way-Delay-Median

Given a Type-P-One-way-Delay-Poisson-Stream, the median of all the dT values in the Stream.  In computing the median, undefined values are treated as infinitely large.  As with Type-P-One-way-Delay-Percentile, Type-P-One-way-Delay-Median is undefined if the sample is empty.

As noted in the Framework document, the median differs from the 50th percentile only when the sample contains an even number of values, in which case the mean of the two central values is used.

 

   Example: suppose we take a sample and the results are:

      Stream2 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      >

   Then the median would be 105 msec, the mean of 100 msec and 110 msec, the two central values.

 

5.3. Type-P-One-way-Delay-Minimum

Given a Type-P-One-way-Delay-Poisson-Stream, the minimum of all the dT values in the Stream.  In computing this, undefined values are treated as infinitely large.  Note that this means that the minimum could thus be undefined (informally, infinite) if all the dT values are undefined.  In addition, the Type-P-One-way-Delay-Minimum is undefined if the sample is empty.

In the above example, the minimum would be 90 msec.

 

5.4. Type-P-One-way-Delay-Inverse-Percentile

Given a Type-P-One-way-Delay-Poisson-Stream and a time duration threshold, the fraction of all the dT values in the Stream less than or equal to the threshold.  The result could be as low as 0% (if all the dT values exceed threshold) or as high as 100%.  Type-P-One-way-Delay-Inverse-Percentile is undefined if the sample is empty.

In the above example, the Inverse-Percentile of 103 msec would be 50%.

 


A One-way Packet Loss Metric for IPPM (RFC 2680)

 

1. Overview

How to measure one-way packet loss

 

2. A Singleton Definition for One-way Packet Loss

Type-P-One-way-Packet-Loss

 

2.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

 

2.2 Definition

- >>The *Type-P-One-way-Packet-Loss* from Src to Dst at T is 0<< means that Src sent the first bit of a Type-P packet to Dst at wire-time* T and that Dst received that packet.

- >>The *Type-P-One-way-Packet-Loss* from Src to Dst at T is 1<< means that Src sent the first bit of a Type-P packet to Dst at wire-time T and that Dst did not receive that packet.

 

2.3 Methodologies

As with other Type-P-* metrics, the detailed methodology will depend on the Type-P (e.g., protocol number, UDP/TCP port number, size, precedence).

 

   Generally, for a given Type-P, one possible methodology would proceed as follows:

   +  Arrange that Src and Dst have clocks that are synchronized with each other.  The degree of synchronization is a parameter of the methodology, and depends on the threshold used to determine loss (see below).

   +  At the Src host, select Src and Dst IP addresses, and form a test packet of Type-P with these addresses.

   +  At the Dst host, arrange to receive the packet.

   +  At the Src host, place a timestamp in the prepared Type-P packet, and send it towards Dst.

   +  If the packet arrives within a reasonable period of time, the one-way packet-loss is taken to be zero.

   +  If the packet fails to arrive within a reasonable period of time, the one-way packet-loss is taken to be one.  Note that the threshold of "reasonable" here is a parameter of the methodology.

      {Comment: The definition of reasonable is intentionally vague, and is intended to indicate a value "Th" so large that any value in the closed interval [Th-delta, Th+delta] is an equivalent threshold for loss.  Here, delta encompasses all error in clock synchronization along the measured path.  If there is a single value after which the packet must be counted as lost, then we reintroduce the need for a degree of clock synchronization similar to that needed for one-way delay.  Therefore, if a measure of packet loss parameterized by a specific non-huge "reasonable" time-out value is needed, one can always measure one-way delay and see what percentage of packets from a given stream exceed a given time-out value.}

   Issues such as the packet format, the means by which Dst knows when to expect the test packet, and the means by which Src and Dst are synchronized are outside the scope of this document.  {Comment: We plan to document elsewhere our own work in describing such more detailed implementation techniques and we encourage others to as well.}

 

2.4. Reporting the metric:

 

2.4.1. Type-P

The value of the metric may depend on the type of IP packets used to make the measurement, or "Type-P".  The value of Type-P-One-way-Packet-Loss could change if the protocol (UDP or TCP), port number, size, or arrangement for special treatment (e.g., IP precedence or RSVP) changes.  The exact Type-P used to make the measurements MUST be accurately reported.

 

2.4.2. Loss threshold

The threshold (or methodology to distinguish) between a large finite delay and loss MUST be reported.

 

2.4.3. Calibration results

The degree of synchronization between the Src and Dst clocks MUST be reported.  If possible, the possibility that a test packet that arrives at the Dst network interface is reported as lost due to resource exhaustion on Dst SHOULD be reported.

 

2.4.4. Path

Finally, the path traversed by the packet SHOULD be reported, if possible.  In general it is impractical to know the precise path a given packet takes through the network.  The precise path may be known for certain Type-P on short or stable paths.  If Type-P includes the record route (or loose-source route) option in the IP header, and the path is short enough, and all routers* on the path support record (or loose-source) route, then the path will be precisely recorded.  This is impractical because the route must be short enough, many routers do not support (or are not configured for) record route, and use of this feature would often artificially worsen the performance observed by removing the packet from common-case processing.  However, partial information is still valuable context.

For example, if a host can choose between two links* (and hence two separate routes from Src to Dst), then the initial link used is valuable context.  {Comment: For example, with Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP by either of several different backbone networks.}

 

3. A Definition for Samples of One-way Packet Loss

Type-P-One-way-Packet-Loss-Poisson-Stream

 

3.1 Parameter

Src, the IP address of a host

Dst, the IP address of a host

T0, a time

Tf, a time

lambda, a rate in reciprocal seconds

 

3.2 Definition

Given T0, Tf, and lambda, we compute a pseudo-random Poisson process beginning at or before T0, with average arrival rate lambda, and ending at or after Tf.  Those time values greater than or equal to T0 and less than or equal to Tf are then selected.  At each of the times in this process, we obtain the value of Type-P-One-way-Packet-Loss at this time.  The value of the sample is the sequence made up of the resulting <time, loss> pairs.  If there are no such pairs, the sequence is of length zero and the sample is said to be empty.

 

3.3 Methodology

The methodologies follow directly from:

- the selection of specific times, using the specified Poisson arrival process, and

- the methodologies discussion already given for the singleton Type-P-One-way-Packet-Loss metric.

Care must be given to correctly handle out-of-order arrival of test packets; it is possible that the Src could send one test packet at TS[i], then send a second one (later) at TS[i+1], while the Dst could receive the second test packet at TR[i+1], and then receive the first one (later) at TR[i].

 

3.4 Reporting the metric

Same as 2.4

 

4. Some Statistics Definitions for One-way Packet Loss

Type-P-One-way-Packet-Loss-Average

 

Given a Type-P-One-way-Packet-Loss-Poisson-Stream, the average of all the L values in the Stream.  In addition, the Type-P-One-way-Packet-Loss-Average is undefined if the sample is empty.

 

   Example: suppose we take a sample and the results are:

      Stream1 = <
      <T1, 0>
      <T2, 0>
      <T3, 1>
      <T4, 0>
      <T5, 0>
      >

   Then the average would be 0.2.

 

Note that, since healthy Internet paths should be operating at loss rates below 1% (particularly if high delay-bandwidth products are to be sustained), the sample sizes needed might be larger than one would like.  Thus, for example, if one wants to discriminate between various fractions of 1% over one-minute periods, then several hundred samples per minute might be needed.  This would result in larger values of lambda than one would ordinarily want.

Note that although the loss threshold should be set such that any errors in loss are not significant, if the possibility that a packet which arrived is counted as lost due to resource exhaustion is significant compared to the loss rate of interest, Type-P-One-way-Packet-Loss-Average will be meaningless.
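The average (and the undefined-on-empty rule) is simple enough to state directly in Python; the <time, loss> tuples below mirror the Stream1 example above:

```python
def packet_loss_average(stream):
    """Average of the L values (0 = received, 1 = lost) in a
    <time, loss> stream; undefined (None) for an empty sample."""
    if not stream:
        return None
    return sum(loss for _, loss in stream) / len(stream)

stream1 = [("T1", 0), ("T2", 0), ("T3", 1), ("T4", 0), ("T5", 0)]
avg = packet_loss_average(stream1)   # 0.2, as in the example above
```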

 

 


A Round-trip Delay Metric for IPPM (RFC 2681)

 

1. Overview

Describes the basis for measuring round-trip delay

Similar to the one-way delay measurement method

 

2. A Singleton Definition for Round-Trip Delay

Type-P-Round-trip-Delay

 

2.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T, a time

 

2.2 Definition

For a real number dT,

- >>The *Type-P-Round-trip-Delay* from Src to Dst at T is dT<< means that Src sent the first bit of a Type-P packet to Dst at wire-time* T, that Dst received that packet, then immediately sent a Type-P packet back to Src, and that Src received the last bit of that packet at wire-time T+dT.

- >>The *Type-P-Round-trip-Delay* from Src to Dst at T is undefined (informally, infinite)<< means that Src sent the first bit of a Type-P packet to Dst at wire-time T and that (either Dst did not receive the packet, Dst did not send a Type-P packet in response, or) Src did not receive that response packet.

- >>The *Type-P-Round-trip-Delay* between Src and Dst at T<< means either the *Type-P-Round-trip-Delay* from Src to Dst at T or the *Type-P-Round-trip-Delay* from Dst to Src at T.  When this notion is used, it is understood to be specifically ambiguous which host acts as Src and which as Dst.  {Comment: This ambiguity will usually be a small price to pay for being able to have one measurement, launched from either Src or Dst, rather than having two measurements.}

 

2.3 Methodologies

As with other Type-P-* metrics, the detailed methodology will depend on the Type-P (e.g., protocol number, UDP/TCP port number, size, precedence).

 

Generally, for a given Type-P, the methodology would proceed as follows:

 

   +  At the Src host, select Src and Dst IP addresses, and form a test packet of Type-P with these addresses.  Any 'padding' portion of the packet needed only to make the test packet a given size should be filled with randomized bits to avoid a situation in which the measured delay is lower than it would otherwise be due to compression techniques along the path.  The test packet must have some identifying information so that the response to it can be identified by Src when Src receives the response; one means to do this is by placing the timestamp generated just before sending the test packet in the packet itself.

   +  At the Dst host, arrange to receive and respond to the test packet.  At the Src host, arrange to receive the corresponding response packet.

   +  At the Src host, take the initial timestamp and then send the prepared Type-P packet towards Dst.  Note that the timestamp could be placed inside the packet, or kept separately as long as the packet contains a suitable identifier so the received timestamp can be compared with the send timestamp.

   +  If the packet arrives at Dst, send a corresponding response packet back from Dst to Src as soon as possible.

   +  If the response packet arrives within a reasonable period of time, take the final timestamp as soon as possible upon the receipt of the packet.  By subtracting the two timestamps, an estimate of round-trip delay can be computed.  If the delay between the initial timestamp and the actual sending of the packet is known, then the estimate could be adjusted by subtracting this amount; uncertainty in this value must be taken into account in error analysis.  Similarly, if the delay between the actual receipt of the response packet and final timestamp is known, then the estimate could be adjusted by subtracting this amount; uncertainty in this value must be taken into account in error analysis.  See the next section, "Errors and Uncertainties", for a more detailed discussion.

   +  If the packet fails to arrive within a reasonable period of time, the round-trip delay is taken to be undefined (informally, infinite).  Note that the threshold of 'reasonable' is a parameter of the methodology.

 

Issues such as the packet format and the means by which Dst knows when to expect the test packet are outside the scope of this document.

 

   {Comment: Note that you cannot in general add two Type-P-One-way-Delay values (see [2]) to form a Type-P-Round-trip-Delay value.  In order to form a Type-P-Round-trip-Delay value, the return packet must be triggered by the reception of a packet from Src.}

   {Comment: "ping" would qualify as a round-trip measure under this definition, with a Type-P of ICMP echo request/reply with 60-byte packets.  However, the uncertainties associated with a typical ping program must be analyzed as in the next section, including the type of reflecting point (a router may not handle an ICMP request in the fast path) and effects of load on the reflecting point.}
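A UDP sketch of the steps above, with the initial timestamp carried in the packet as its identifier; it assumes an echo responder is already listening at (dst, port), standing in for Dst's "receive and respond" arrangement, which is not shown:

```python
import socket
import struct
import time

def round_trip_delay(dst, port, timeout=2.0):
    """Type-P here is a UDP packet carrying an 8-byte send timestamp.
    Returns the round-trip delay in seconds, or None if no response
    arrives within 'timeout' (undefined -- informally, infinite)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)                 # the 'reasonable' threshold
    try:
        # initial timestamp goes inside the packet, doubling as its identifier
        s.sendto(struct.pack("!d", time.time()), (dst, port))
        data, _ = s.recvfrom(64)          # the echoed response packet
        # final timestamp minus the echoed initial timestamp
        return time.time() - struct.unpack("!d", data)[0]
    except socket.timeout:
        return None
    finally:
        s.close()
```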

 

2.4 Calibration

 

Generally, the measured values can be decomposed as follows:

 

       measured value = true value + systematic error + random error

 

If the systematic error (the constant bias in measured values) can be determined, it can be compensated for in the reported results.

 

       reported value = measured value - systematic error

 

therefore

 

       reported value = true value + random error

 

The goal of calibration is to determine the systematic and random error generated by the instruments themselves in as much detail as possible.  At a minimum, a bound ("e") should be found such that the reported value is in the range (true value - e) to (true value + e) at least 95 percent of the time.  We call "e" the calibration error for the measurements.  It represents the degree to which the values produced by the measurement instrument are repeatable; that is, how closely an actual delay of 30 ms is reported as 30 ms.  {Comment: 95 percent was chosen because (1) some confidence level is desirable to be able to remove outliers, which will be found in measuring any physical property; and (2) a particular confidence level should be specified so that the results of independent implementations can be compared.}

 

From the discussion in the previous three sections, the error in measurements could be bounded by determining all the individual uncertainties, and adding them together to form

 

       2*Rsource + Hinitial + Hfinal + Hrefl.

 

However, reasonable bounds on both the clock-related uncertainty captured by the first term and the host-related uncertainty captured by the last three terms should be possible by careful design techniques and calibrating the instruments using a known, isolated, network in a lab.

 

The host-related uncertainties, Hinitial + Hfinal + Hrefl, could be bounded by connecting two instruments back-to-back with a high-speed serial link or isolated LAN segment.  In this case, repeated measurements are measuring the same round-trip delay.

 

If the test packets are small, such a network connection has a minimal delay that may be approximated by zero.  The measured delay therefore contains only systematic and random error in the instrumentation.  The "average value" of repeated measurements is the systematic error, and the variation is the random error.

 

One way to compute the systematic error, and the random error to a 95% confidence is to repeat the experiment many times - at least hundreds of tests.  The systematic error would then be the median. The random error could then be found by removing the systematic error from the measured values.  The 95% confidence interval would be the range from the 2.5th percentile to the 97.5th percentile of these deviations from the true value.  The calibration error "e" could then be taken to be the largest absolute value of these two numbers, plus the clock-related uncertainty.  {Comment: as described, this bound is relatively loose since the uncertainties are added, and the absolute value of the largest deviation is used.  As long as the resulting value is not a significant fraction of the measured values, it is a reasonable bound.  If the resulting value is a significant fraction of the measured values, then more exact methods will be needed to compute the calibration error.}
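The computation just described can be sketched in a few lines of Python.  This is an illustrative sketch only: the function name, the list-of-delays input, and the `clock_uncertainty` parameter are assumptions for the example, not part of this document.

```python
import statistics

def calibration_error(measured, clock_uncertainty=0.0):
    """Estimate the systematic error and the calibration error "e"
    from repeated back-to-back measurements, per the scheme above.
    `measured` is a list of delays (e.g. in ms); names are
    illustrative, not normative.
    """
    # Systematic error: the median of the measured values.
    systematic = statistics.median(measured)
    # Deviations after removing the systematic error.
    deviations = sorted(m - systematic for m in measured)
    n = len(deviations)
    # 2.5th and 97.5th percentiles bound the 95% confidence interval.
    lo = deviations[int(0.025 * (n - 1))]
    hi = deviations[int(0.975 * (n - 1))]
    # "e": largest absolute deviation plus the clock-related term.
    e = max(abs(lo), abs(hi)) + clock_uncertainty
    return systematic, e
```

The crude index-based percentiles are adequate for the hundreds of repetitions the text calls for; a production tool would likely use a proper quantile routine.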

 

Note that random error is a function of measurement load.  For example, if many paths will be measured by one instrument, this might increase interrupts, process scheduling, and disk I/O (for example, recording the measurements), all of which may increase the random error in measured singletons.  Therefore, in addition to minimal load measurements to find the systematic error, calibration measurements should be performed with the same measurement load that the instruments will see in the field.

 

We wish to reiterate that this statistical treatment refers to the calibration of the instrument; it is used to "calibrate the meter stick" and say how well the meter stick reflects reality.

 

In addition to calibrating the instruments for finite delay, two checks should be made to ensure that packets reported as losses were really lost.  First, the threshold for loss should be verified.  In particular, ensure the "reasonable" threshold is reasonable: that it is very unlikely a packet will arrive after the threshold value, and therefore the number of packets lost over an interval is not sensitive to the error bound on measurements.  Second, consider the possibility that a packet arrives at the network interface, but is lost due to congestion on that interface or to other resource exhaustion (e.g. buffers) in the instrument.

 

2.5 Reporting Metric

 

2.5.1 Type-P

The value of the metric may depend on the type of IP packets used to make the measurement, or "Type-P".  The value of Type-P-Round-trip-Delay could change if the protocol (UDP or TCP), port number, size, or arrangement for special treatment (e.g., IP precedence or RSVP) changes.  The exact Type-P used to make the measurements MUST be accurately reported.

 

2.5.2 Loss threshold

In addition, the threshold (or methodology to distinguish) between a large finite delay and loss MUST be reported.

 

2.5.3 Calibration Results

- If the systematic error can be determined, it SHOULD be removed from the measured values.

- You SHOULD also report the calibration error, e, such that the true value is the reported value plus or minus e, with 95% confidence (see the previous section).

- If possible, the conditions under which a test packet with finite delay is reported as lost due to resource exhaustion on the measurement instrument SHOULD be reported.

 

2.5.4 Path

Finally, the path traversed by the packet SHOULD be reported, if possible.  In general it is impractical to know the precise path a given packet takes through the network.  The precise path may be known for certain Type-P on short or stable paths.  For example, if Type-P includes the record route (or loose-source route) option in the IP header, and the path is short enough, and all routers on the path support record (or loose-source) route, and the Dst host copies the path from Src to Dst into the corresponding reply packet, then the path will be precisely recorded.  This is impractical because the route must be short enough, many routers do not support (or are not configured for) record route, and use of this feature would often artificially worsen the performance observed by removing the packet from common-case processing.  However, partial information is still valuable context.  For example, if a host can choose between two links (and hence two separate routes from Src to Dst), then the initial link used is valuable context.  {Comment: For example, with Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP by either of several different backbone networks.}

 

3. A Definition for Samples of Round-trip Delay

Type-P-Round-trip-Delay-Poisson-Stream

 

3.1 Parameters

Src, the IP address of a host

Dst, the IP address of a host

T0, a time

Tf, a time

lambda, a rate in reciprocal seconds

 

3.2 Definition

Given T0, Tf, and lambda, we compute a pseudo-random Poisson process beginning at or before T0, with average arrival rate lambda, and ending at or after Tf.  Those time values greater than or equal to T0 and less than or equal to Tf are then selected.  At each of the times in this process, we obtain the value of Type-P-Round-trip-Delay at this time.  The value of the sample is the sequence made up of the resulting <time, delay> pairs.  If there are no such pairs, the sequence is of length zero and the sample is said to be empty.
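Such a Poisson schedule is conventionally generated from exponential inter-arrival gaps with mean 1/lambda.  A minimal sketch, with illustrative names (the function signature and `seed` parameter are assumptions, not from this document):

```python
import random

def poisson_sample_times(t0, tf, lam, seed=None):
    """Sample times per the definition above: a pseudo-random Poisson
    process with average arrival rate `lam` (reciprocal seconds),
    keeping only the arrivals in [t0, tf].  Exponential inter-arrival
    gaps of mean 1/lam yield a Poisson process.
    """
    rng = random.Random(seed)
    times = []
    t = t0  # start the process at T0 ("beginning at or before T0")
    while True:
        t += rng.expovariate(lam)  # next exponential gap
        if t > tf:
            break                  # "ending at or after Tf"
        times.append(t)
    return times
```

At each returned time the instrument would then obtain a Type-P-Round-trip-Delay singleton, yielding the sequence of <time, delay> pairs.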

 

3.3 Methodologies

 

The methodologies follow directly from:

 

- the selection of specific times, using the specified Poisson arrival process, and

- the methodologies discussion already given for the singleton Type-P-Round-trip-Delay metric.

 

Care must, of course, be given to correctly handle out-of-order arrival of test or response packets; it is possible that the Src could send one test packet at TS[i], then send a second test packet (later) at TS[i+1], and it could receive the second response packet at TR[i+1], and then receive the first response packet (later) at TR[i].
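A common way to handle this (not mandated by the text) is to tag each test packet with a sequence number and pair timestamps by tag rather than by arrival order.  The dict-based encoding below is an assumption for illustration:

```python
def match_responses(sends, recvs):
    """Pair each response with its test packet by sequence number,
    so that receiving response i+1 before response i does no harm.
    `sends` and `recvs` map sequence number -> timestamp (an assumed
    format).  A missing response yields None, i.e. an undefined
    (informally infinite) round-trip delay.
    """
    return {seq: (recvs[seq] - t_send) if seq in recvs else None
            for seq, t_send in sends.items()}
```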

 

3.4 Reporting the Metrics

Same as Section 2.5.

 

4 Others

 

4.1 Type-P-Round-trip-Delay-Percentile

 

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a percent X

   between 0% and 100%, the Xth percentile of all the dT values in the

   Stream.  In computing this percentile, undefined values are treated

   as infinitely large.  Note that this means that the percentile could

   thus be undefined (informally, infinite).  In addition, the Type-P-

   Round-trip-Delay-Percentile is undefined if the sample is empty.

 

   Example: suppose we take a sample and the results are:

 

      Stream1 = <

      <T1, 100 msec>

      <T2, 110 msec>

      <T3, undefined>

      <T4, 90 msec>

      <T5, 500 msec>

      >

 

   Then the 50th percentile would be 110 msec, since 90 msec and 100
   msec are smaller and 500 msec and 'undefined' are larger.
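Treating 'undefined' as infinitely large makes the percentile straightforward to compute.  The helper below is an illustrative reading of the definition, not normative; the (time, delay) tuple encoding, with None for undefined, is an assumption:

```python
import math

def rtt_percentile(stream, x):
    """Xth percentile of the dT values in a stream of (time, delay)
    pairs, treating undefined delays (None) as infinitely large.
    Returns math.inf when the percentile is informally 'infinite';
    raises on an empty sample, where the metric is undefined.
    """
    delays = sorted(math.inf if d is None else d for _, d in stream)
    if not delays:
        raise ValueError("undefined for an empty sample")
    # Smallest value with at least x% of the sample at or below it.
    k = math.ceil(x / 100 * len(delays))
    return delays[max(k - 1, 0)]
```

Applied to Stream1 above, this yields 110 msec for the 50th percentile, and an infinite 100th percentile because of the undefined value.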

 

   Note that if the possibility that a packet with finite delay is

   reported as lost is significant, then a high percentile (90th or

   95th) might be reported as infinite instead of finite.

 

4.2. Type-P-Round-trip-Delay-Median

 

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the median of all the

   dT values in the Stream.  In computing the median, undefined values

   are treated as infinitely large.  As with Type-P-Round-trip-Delay-

   Percentile, Type-P-Round-trip-Delay-Median is undefined if the sample

   is empty.

 

   As noted in the Framework document, the median differs from the 50th

   percentile only when the sample contains an even number of values, in

   which case the mean of the two central values is used.

 

   Example: suppose we take a sample and the results are:

 

      Stream2 = <

      <T1, 100 msec>

      <T2, 110 msec>

      <T3, undefined>

      <T4, 90 msec>

      >

 

   Then the median would be 105 msec, the mean of 100 msec and 110 msec,

   the two central values.

 

4.3. Type-P-Round-trip-Delay-Minimum

 

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the minimum of all

   the dT values in the Stream.  In computing this, undefined values are

   treated as infinitely large.  Note that this means that the minimum

   could thus be undefined (informally, infinite) if all the dT values

   are undefined.  In addition, the Type-P-Round-trip-Delay-Minimum is

   undefined if the sample is empty.

 

   In the above example, the minimum would be 90 msec.

 

4.4. Type-P-Round-trip-Delay-Inverse-Percentile

 

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a time duration

   threshold, the fraction of all the dT values in the Stream less than

   or equal to the threshold.  The result could be as low as 0% (if all

   the dT values exceed threshold) or as high as 100%.  Type-P-Round-

   trip-Delay-Inverse-Percentile is undefined if the sample is empty.

 

   In the above example, the Inverse-Percentile of 103 msec would be

   50%.
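This statistic is a simple counting fraction; an illustrative sketch, again assuming (time, delay) pairs with None for undefined delays:

```python
import math

def rtt_inverse_percentile(stream, threshold):
    """Fraction (0.0 .. 1.0) of dT values less than or equal to
    `threshold`, with undefined delays (None) treated as infinitely
    large and hence never counted.  Raises on an empty sample,
    where the metric is undefined.
    """
    delays = [math.inf if d is None else d for _, d in stream]
    if not delays:
        raise ValueError("undefined for an empty sample")
    return sum(d <= threshold for d in delays) / len(delays)
```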

 

Posted by 나마스떼
