Prob Expert

Friday, February 24, 2012

A Categorization: Ten Types of Models


Business models used by financial institutions can be placed in more than ten categories, of course, but here are ten prominent general types of models:

1. Statistical credit scoring models (typically for default)
2. Consumer- or borrower-response models
3. Consumer- or borrower-characteristic prediction models
4. Loss given default (LGD) and exposure at default (EAD) models
5. Optimization tools (these are not models, per se, but mathematical algorithms that often use inputs from models)
6. Loss forecasting and simulation models and value-at-risk (VaR) models
7. Valuation, option pricing, and risk-based pricing models
8. Profitability forecasting and enterprise-cash-flow projection models
9. Macroeconomic forecasting models
10. Financial-risk models that model complex financial instruments and interactions

These categories are not mutually exclusive. Types 8, 9, and 10, for example, are often built up from multiple component models, and types 1 through 3 can be built from either individual-level data (the typical case) or group-level data. No categorization of models is perfect, and this list is not intended to be exhaustive.

Thursday, July 24, 2008

intnx function

The intnx function increments dates by intervals.

intnx(interval, from, n <, alignment>)

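interval is a character (i.e., string) constant or variable that names the interval (for example, 'month'); from is the starting value (either a date or a datetime); n is the number of intervals to increment; and alignment is optional and controls where within the resulting interval the returned value falls. By default the result is aligned to the beginning of the interval.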

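/* Read three dates; id is in column 1 and the date starts in column 3 */
data temp2;
    input id 1 @3 date mmddyy11.;
    cards;
1 11/12/1980
2 10/20/1996
3 12/21/1999
;
run;

proc print data = temp2 noobs;
    format date date9.;
run;

Output:

id date

1 12NOV1980
2 20OCT1996
3 21DEC1999

/* Advance each date by one month; with the default alignment, */
/* intnx returns the first day of the resulting month          */
data temp3;
    set temp2;
    new_month = intnx('month', date, 1);
run;

proc print data = temp3 noobs;
    format date new_month date9.;
run;

Output:

id date new_month

1 12NOV1980 01DEC1980
2 20OCT1996 01NOV1996
3 21DEC1999 01JAN2000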
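
The example above uses the default alignment, so every result falls on the first of the month. The following is a minimal sketch of the optional alignment argument, using the same three dates; the data set name align_demo and the variable names same_day and month_end are made up for illustration.

```sas
/* Hypothetical demo data set: the same three dates as above */
data align_demo;
    do date = '12NOV1980'd, '20OCT1996'd, '21DEC1999'd;
        same_day  = intnx('month', date, 1, 'same'); /* same day of the next month */
        month_end = intnx('month', date, 1, 'end');  /* last day of the next month */
        output;
    end;
run;

proc print data = align_demo noobs;
    format date same_day month_end date9.;
run;
```

With these inputs, same_day should come out as 12DEC1980, 20NOV1996, and 21JAN2000, and month_end as 31DEC1980, 30NOV1996, and 31JAN2000.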

Proc Surveyselect

PROC SURVEYSELECT Statement

PROC SURVEYSELECT options ;

SAMPRATE=SAS-data-set
RATE=SAS-data-set
names a SAS data set that contains sampling rates for the strata. This input data set should contain all the STRATA variables, with the same type and length as in the DATA= data set. The STRATA groups should appear in the same order in the SAMPRATE= data set as in the DATA= data set. The SAMPRATE= data set should have a variable _RATE_ that contains the sampling rate for each stratum.

Each sampling rate value must be a positive number. You can specify each value as a number between 0 and 1. Or you can specify a value in percentage form as a number between 1 and 100, and PROC SURVEYSELECT converts that number to a proportion. The procedure treats the value 1 as 100%, and not the percentage form 1%.

The SAMPRATE= option is available only for equal probability selection methods (METHOD=SRS, METHOD=URS, METHOD=SYS, and METHOD=SEQ). For systematic random sampling (METHOD=SYS), PROC SURVEYSELECT uses the inverse of the stratum sampling rate as the interval for the stratum. See the section "Systematic Random Sampling" for details. For other selection methods, PROC SURVEYSELECT converts the stratum sampling rate to the stratum sample size before selection, multiplying the rate by the number of units in the stratum and rounding up to the nearest integer.
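
As a rough sketch of how a SAMPRATE= data set might be used with simple random sampling, consider the following. The data sets population, rates, and sample, the stratum variable region, and the seed value are all assumptions made for illustration; only the _RATE_ variable name and the options come from the description above.

```sas
/* Hypothetical population: 100 units in stratum A, 50 in stratum B */
data population;
    length region $ 1;
    do unit = 1 to 100; region = 'A'; output; end;
    do unit = 1 to 50;  region = 'B'; output; end;
run;

/* One row per stratum, in the same order as in population.  */
/* region has the same type and length as in population, and */
/* the rate variable must be named _RATE_.                   */
data rates;
    length region $ 1;
    input region $ _rate_;
    cards;
A 0.10
B 0.25
;
run;

/* Stratified simple random sample using the per-stratum rates */
proc surveyselect data = population method = srs
                  samprate = rates seed = 12345 out = sample;
    strata region;
run;
```

Following the rounding rule described above, this would be expected to select 10 units from stratum A (0.10 × 100) and 13 units from stratum B (0.25 × 50 = 12.5, rounded up).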

Tuesday, August 14, 2007

Law of large numbers

The weak law

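The weak law of large numbers states that the sample average $\overline{X}_n = (X_1 + \cdots + X_n)/n$ of independent, identically distributed random variables with expected value $\mu$ converges in probability towards the expected value: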

$\overline{X}_n \, \xrightarrow{P} \, \mu \qquad\textrm{for}\qquad n \to \infty.$

That is to say that for any positive number ε,

$\lim_{n\rightarrow\infty}\operatorname{P}\left(\left|\overline{X}_n-\mu\right|<\varepsilon\right)=1.$


Interpreting the convergence in probability, the weak law essentially states that the average of many observations will, with probability approaching 1, be within any specified nonzero margin of the mean, no matter how small that margin is.

This version is called the weak law because convergence in probability is a weaker mode of convergence than almost sure convergence.

A consequence of the weak LLN is the asymptotic equipartition property.
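
As a rough illustration of the weak law, assume in addition that the observations are independent with a common finite variance $\sigma^2$ (an assumption not made above; the weak law itself does not need it). Chebyshev's inequality then gives an explicit bound, shown here together with a fair-coin example chosen only for illustration:

```latex
\[
\operatorname{P}\left(\left|\overline{X}_n-\mu\right|\ge\varepsilon\right)
\le \frac{\operatorname{Var}(\overline{X}_n)}{\varepsilon^2}
= \frac{\sigma^2}{n\,\varepsilon^2}
\xrightarrow[n\to\infty]{} 0 .
\]
% Example: fair coin flips (mu = 1/2, sigma^2 = 1/4), epsilon = 0.01, n = 10^6
\[
\operatorname{P}\left(\left|\overline{X}_n-\tfrac12\right|\ge 0.01\right)
\le \frac{1/4}{10^{6}\cdot 10^{-4}} = 0.0025 .
\]
```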

The strong law

The strong law of large numbers states that the sample average converges almost surely to the expected value:

$\overline{X}_n \, \xrightarrow{\mathrm{a.s.}} \, \mu \qquad\textrm{for}\qquad n \to \infty .$

That is,

$\operatorname{P}\left(\lim_{n\rightarrow\infty}\overline{X}_n=\mu\right)=1.$

The proof is more complex than that of the weak law. This law justifies the intuitive interpretation of the expected value of a random variable as the "long-term average when sampling repeatedly".

This version is called the strong law because almost sure convergence is a stronger mode of convergence than convergence in probability: the strong law implies the weak law.

Monday, June 11, 2007

Martingale

In probability theory, a martingale is a stochastic process (i.e., a sequence of random variables) such that the conditional expected value of an observation at some time t, given all the observations up to some earlier time s, is equal to the observation at that earlier time s.

A discrete-time martingale is a discrete-time stochastic process (i.e., a sequence of random variables) $X_1, X_2, X_3, \ldots$ that satisfies for all n

$E( \vert X_n \vert )< \infty$

$E(X_{n+1}\mid X_1,\ldots,X_n)=X_n,$

i.e., the conditional expected value of the next observation, given all of the past observations, is equal to the last observation.
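
A standard example that fits this definition is the simple symmetric random walk. The sketch below introduces steps $\xi_i$ that do not appear in the definition above; they are notation assumed here for illustration.

```latex
\[
X_n = \xi_1 + \cdots + \xi_n, \qquad
\xi_i \text{ independent},\quad
P(\xi_i = 1) = P(\xi_i = -1) = \tfrac12 .
\]
% Integrability, then the martingale property
% (xi_{n+1} is independent of X_1, ..., X_n and has mean 0)
\[
E(\vert X_n \vert) \le n < \infty, \qquad
E(X_{n+1}\mid X_1,\ldots,X_n) = X_n + E(\xi_{n+1}) = X_n .
\]
```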

Somewhat more generally, a sequence $Y_1, Y_2, Y_3, \ldots$ is said to be a martingale with respect to another sequence $X_1, X_2, X_3, \ldots$ if for all n

$E(\vert Y_n \vert )< \infty$

$E(Y_{n+1}\mid X_1,\ldots,X_n)=Y_n.$
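
To illustrate the more general definition, take $X_n$ to be a simple symmetric random walk (a sum of $n$ independent steps $\xi_i = \pm 1$, each sign with probability $1/2$; this choice is an assumption made for the example) and set $Y_n = X_n^2 - n$. A short check suggests that $Y_n$ is a martingale with respect to $X_n$, even though $X_n^2$ on its own is not:

```latex
\[
E(\vert Y_n \vert) \le n^2 + n < \infty ,
\]
\[
\begin{aligned}
E(Y_{n+1}\mid X_1,\ldots,X_n)
  &= E\big((X_n + \xi_{n+1})^2 \mid X_1,\ldots,X_n\big) - (n+1) \\
  &= X_n^2 + 2X_n\,E(\xi_{n+1}) + E(\xi_{n+1}^2) - (n+1) \\
  &= X_n^2 + 0 + 1 - (n+1) = X_n^2 - n = Y_n .
\end{aligned}
\]
```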

Similarly, a continuous-time martingale with respect to the stochastic process $X_t$ is a stochastic process $Y_t$ such that for all t

$E( \vert Y_t \vert )<\infty$

$E ( Y_{t} \mid \{ X_{\tau}, \tau \leq s \} ) = Y_s,$ for all $s \leq t.$

Tuesday, May 22, 2007

Cauchy–Schwarz inequality

The statement of the Cauchy–Schwarz inequality is:

For all vectors $x$ and $y$ of a real or complex inner product space $V$ the following inequality holds:

$|\langle x,y\rangle|^2 \leq \langle x,x\rangle \cdot \langle y,y\rangle.$

or, equivalently, by taking the square root of both sides, and referring to the norms of the vectors:

$|\langle x,y\rangle| \leq \|x\| \cdot \|y\|.\,$
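
As a quick numerical check, take two vectors in $\mathbb{R}^2$ with the usual dot product (the particular vectors are chosen only for illustration). The last line states the form the inequality takes for random variables $X$ and $Y$ with finite variance, using the covariance in the role of the inner product.

```latex
\[
x = (1, 2),\quad y = (3, 4):\qquad
\langle x, y\rangle = 1\cdot 3 + 2\cdot 4 = 11,\quad
\langle x, x\rangle = 5,\quad
\langle y, y\rangle = 25,
\]
\[
|\langle x, y\rangle|^2 = 121 \le 125
= \langle x,x\rangle \cdot \langle y,y\rangle .
\]
% Probabilistic form: inner product <X, Y> = Cov(X, Y)
\[
|\operatorname{Cov}(X, Y)|^2 \le \operatorname{Var}(X)\,\operatorname{Var}(Y) .
\]
```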

Sunday, March 04, 2007

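Power series expansions of functions

From the preceding material we have seen that power series are not only simple in form but also share some properties with polynomials. We have also found that some functions can be represented as power series. This raises two questions:

Question 1: Under what conditions can a function $f(x)$ be represented as a power series $f(x) = \sum_{n=0}^{\infty} c_n (x-a)^n$?

Question 2: If $f(x)$ can be represented as a power series of the above form, how are the coefficients $c_n$ $(n = 0, 1, 2, 3, \ldots)$ determined?

We now study these two questions.

Taylor series

We discuss the second question first. Suppose that $f(x)$ can be represented by a power series of this form in a neighborhood of $a$, where $a$ is a constant fixed in advance, and let us see how the coefficients $c_n$ must be related to $f(x)$.

Since $f(x)$ can be represented as a power series, the properties of power series imply that $f(x)$ has derivatives of all orders in a neighborhood of $x = a$. Differentiating both sides of the power series repeatedly, term by term, gives:

$f'(x) = c_1 + 2c_2(x-a) + 3c_3(x-a)^2 + \cdots,$

$f''(x) = 2c_2 + 3\cdot 2\,c_3(x-a) + \cdots,$

$\cdots$

$f^{(n)}(x) = n!\,c_n + (n+1)\,n\cdots 2\,c_{n+1}(x-a) + \cdots,$

$\cdots$

Setting $x = a$ in the power series for $f(x)$ and in each of its derivatives gives:

$c_0 = f(a),\quad c_1 = f'(a),\quad c_2 = \frac{f''(a)}{2!},\quad \ldots,\quad c_n = \frac{f^{(n)}(a)}{n!},\quad \ldots$

Substituting these coefficients back gives:

$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \cdots$

The power series on the right-hand side is called the Taylor series of $f(x)$ at $x = a$.

Questions about the Taylor series

The formula above was derived under the assumption that $f(x)$ can be expanded in a power series of the form $\sum_{n=0}^{\infty} c_n (x-a)^n$. In fact, as long as $f(x)$ has derivatives of all orders at $x = a$, we can write down its Taylor series.

Question: once written down, does the Taylor series converge? Does it converge to $f(x)$?

Whether the Taylor series converges to $f(x)$ depends on whether the difference between $f(x)$ and the partial sum of its Taylor series,

$R_n(x) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k,$

tends to zero as $n \to +\infty$. If $\lim_{n\to\infty} R_n(x) = 0$ on some interval $I$, then the Taylor series of $f(x)$ at $x = a$ converges to $f(x)$ on $I$. In that case we call this Taylor series the Taylor expansion of $f(x)$ on the interval $I$.

Taylor's theorem

Let $f(x)$ be $n+1$ times differentiable in a neighborhood of $x = a$. Then for every $x$ in this neighborhood there is at least one point $c$ between $a$ and $x$ such that:

$f(x) = f(a) + f'(a)(x-a) + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}$

This formula is also called Taylor's formula. (It is not proved here.)

Taking $a = 0$, Taylor's formula becomes:

$f(x) = f(0) + f'(0)\,x + \cdots + \frac{f^{(n)}(0)}{n!}x^n + \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1},$

where $c$ lies between $0$ and $x$. This is called Maclaurin's formula.

The Taylor series of $f(x)$ at $x = 0$ is called the Maclaurin series. When the remainder in Maclaurin's formula tends to zero, we call the corresponding Taylor expansion the Maclaurin expansion. That is:

$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}x^n$

Maclaurin expansions of several elementary functions

1. The exponential function $e^x$:

$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots \qquad (-\infty < x < \infty)$

2. The sine function:

$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n\frac{x^{2n+1}}{(2n+1)!} + \cdots \qquad (-\infty < x < \infty)$

3. The function $(1+x)^m$:

$(1+x)^m = 1 + mx + \frac{m(m-1)}{2!}x^2 + \cdots + \frac{m(m-1)\cdots(m-n+1)}{n!}x^n + \cdots \qquad (-1 < x < 1)$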
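
To see how the remainder criterion works in practice, here is a short check for $e^x$ at $a = 0$, using the remainder term from Taylor's theorem. The symbol $R_n(x)$ for that remainder is notation assumed for this sketch.

```latex
% Every derivative of e^x is e^x, so the remainder term is
\[
R_n(x) = \frac{e^{c}}{(n+1)!}\,x^{n+1},
\qquad c \text{ between } 0 \text{ and } x .
\]
% Since e^c <= e^{|x|} and factorials outgrow powers,
\[
|R_n(x)| \le \frac{e^{|x|}\,|x|^{n+1}}{(n+1)!}
\xrightarrow[n\to\infty]{} 0
\quad \text{for every fixed } x .
\]
```

So the Maclaurin series of $e^x$ converges to $e^x$ for every real $x$.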

Wednesday, February 28, 2007

Chain rule for several variables

The chain rule also works for functions of more than one variable. If $z = f (x, y)$, where $x = g (t)$ and $y = h (t)$ are differentiable functions of $t$, then $z$ is a function of the single variable $t$ and

${dz \over dt}={\partial f \over \partial x}{dx \over dt}+{\partial f \over \partial y}{dy \over dt}$
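
As a small worked example of the single-parameter case (the specific functions are chosen only for illustration), take $z = x^2 y$ with $x = \cos t$ and $y = \sin t$:

```latex
\[
\frac{dz}{dt}
= \frac{\partial f}{\partial x}\frac{dx}{dt}
+ \frac{\partial f}{\partial y}\frac{dy}{dt}
= (2xy)(-\sin t) + (x^2)(\cos t)
= -2\cos t\,\sin^2 t + \cos^3 t .
\]
% Check by substituting first: z = cos^2(t) sin(t)
\[
\frac{d}{dt}\left(\cos^2 t\,\sin t\right)
= -2\cos t\,\sin^2 t + \cos^3 t .
\]
```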

Suppose that $z = f (u, v)$, where $u = h (x, y)$ and $v = g (x, y)$ are functions of two variables, and that all of these functions are differentiable. Then the chain rule reads:

${\partial z \over \partial x}={\partial z \over \partial u}{\partial u \over \partial x}+{\partial z \over \partial v}{\partial v \over \partial x}$

${\partial z \over \partial y}={\partial z \over \partial u}{\partial u \over \partial y}+{\partial z \over \partial v}{\partial v \over \partial y}$
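
A short example, with functions chosen only for illustration: let $z = uv$, $u = x + y$, and $v = x - y$, so that $z = x^2 - y^2$.

```latex
\[
\frac{\partial z}{\partial x}
= \frac{\partial z}{\partial u}\frac{\partial u}{\partial x}
+ \frac{\partial z}{\partial v}\frac{\partial v}{\partial x}
= v\cdot 1 + u\cdot 1 = (x-y) + (x+y) = 2x ,
\]
\[
\frac{\partial z}{\partial y}
= \frac{\partial z}{\partial u}\frac{\partial u}{\partial y}
+ \frac{\partial z}{\partial v}\frac{\partial v}{\partial y}
= v\cdot 1 + u\cdot(-1) = (x-y) - (x+y) = -2y .
\]
```

Both results agree with differentiating $z = x^2 - y^2$ directly.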

Treating $\vec r = (u,v)$ above as a vector-valued function, we can use vector notation to write the same result as the dot product of the gradient of $f$ and the derivative of $\vec r$:

$\frac{\partial f}{\partial x}=\vec \nabla f \cdot \frac{\partial \vec r}{\partial x}$

More generally, for functions of vectors to vectors, the chain rule says that the Jacobian matrix of a composite function is the product of the Jacobian matrices of the two functions:

$\frac{\partial(z_1,\ldots,z_m)}{\partial(x_1,\ldots,x_p)} = \frac{\partial(z_1,\ldots,z_m)}{\partial(y_1,\ldots,y_n)} \frac{\partial(y_1,\ldots,y_n)}{\partial(x_1,\ldots,x_p)}$
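
The Jacobian form can be checked on a small case. In the sketch below, the maps $y = (x_1 + x_2,\; x_1 x_2)$ and $z = y_1 y_2$ are chosen only for illustration, so $m = 1$ and $n = p = 2$.

```latex
\[
\frac{\partial z}{\partial(y_1, y_2)}
= \begin{pmatrix} y_2 & y_1 \end{pmatrix},
\qquad
\frac{\partial(y_1, y_2)}{\partial(x_1, x_2)}
= \begin{pmatrix} 1 & 1 \\ x_2 & x_1 \end{pmatrix},
\]
\[
\frac{\partial z}{\partial(x_1, x_2)}
= \begin{pmatrix} y_2 & y_1 \end{pmatrix}
  \begin{pmatrix} 1 & 1 \\ x_2 & x_1 \end{pmatrix}
= \begin{pmatrix} y_2 + y_1 x_2 & y_2 + y_1 x_1 \end{pmatrix}
= \begin{pmatrix} 2x_1 x_2 + x_2^2 & x_1^2 + 2x_1 x_2 \end{pmatrix}.
\]
```

This matches the partial derivatives of $z = (x_1 + x_2)\,x_1 x_2 = x_1^2 x_2 + x_1 x_2^2$ computed directly.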