2017-03-10

ファインマン・カッツの公式

最近確率的計算法を学んでいます。一つな計算式はPDEの問いをブラウン運動に接続する。 the Feynman-Kac formula is problematic in that it seems to be stated in different forms depending on the source (in some cases u satisfies a PDE like that in Black Scholes, in others the heat equation), the notation for the conditional expectation is unclear, and the proofs are inconsistent, it seems, in their use of different Ito lemmas.
英語版のウィキベディアの解はu(x,t) = E(Q)(INTEGRAL(t,T)(exp(-INTEGRAL(t,r)(V(Xtau,tau)dtau))*f(Xr,r)dr) + exp(-INTEGRAL(t,T)(V(Xtau,tau)dtau))psi(XT)|Xt=x)
This basically indicates that the solution function u is a conditional expectation of psi's value at the terminal time T and the V and f values at times in the interim.
しかしシンガポールの南洋理工大学のLeon教授の説明は理解できました。
In this case, the solution to the PDE, g, was simply E(Q,x,t)(h(X(T))), or the expectation over all t and x of the value of some measurable function h at time T. 証明の由来も分かりました。　Basically, just start with a stochastic X(t) such that
(here, x = X(t))
dX(t) = b(t,x)dt + c(t,x)dW(t). Consider that g is a function of X and t, and apply Ito's lemma to find dg (drop arguments for clarity).
dg = (dg/dt + b*dg/dx + (1/2)*c^2*d^2g/dx^2)dt + c*(dg/dx)dW
However, since g is a Martingale (proven based on definition of g as conditional expectation of h), the dt term must be 0, meaning that (dg/dt + b*dg/dx + (1/2)c^2*d^2g/dx^2) = 0. In addition, substituting T for t into the g expression,
g(T,X(T)) = E(Q,t,x)(h(X(T))) = h(X(T)) = h(x).
The final two statements are the content of Feynman-Kac．
しかし利子の拡散方程式を表すハル・ワイト公式の偏微分方程式をファインマン・カッツで由来すればこの方程式の解の源を知りません。

2017-02-21

動的計画法とゲーム理論

最近コードシェッフと言うシステム開発試合に苦労しました。一つな問いはゲーム木の検索に関わる問題でした。　Using a standard memoization, I encountered no incorrect results, but the performance, even for the intermediate cases, was too slow. I could at least complete the intermediate cases using a dynamic programming approach with a "bestfrom" cache based on start index and move number, but this generated an incorrect answer in the simple case. でも僕が作った入力データは全部正しい出力になりました。やっぱりちょっと意外な動作を考えなければ行けないと解けません。The subtle point I had missed was that, while I was careful to check that the "bestFrom" array, which indicated the best sum possible starting from or after an index in a given move was not overstating the value by caching an index beyond the allowed start range, I forgot that it might underestimate the value by caching a sum from a previous visit when the right bound on the start index was much lower.
I found no general dynamic programming solution. Only a memoization solution which is too slow for the larger cases (beyond easy) and a dynamic-programming approach used only when M==2 (the medium test case constraint). Thus, the hard cases were left as timed-out.

2017-01-29

伊藤等長と二次変動

最近確率的な積分を学んでいます。I have had difficulty in deriving the expression sometimes given for the quadratic variation of the Ito integral. It seems to be based on Ito's isometry, which I have yet to see the derivation of.
一つな前提はB（ｔ）と言うブラウン運動は微分ができない関数です。無作為な値を出力するからです。
Ito's isometry expresses the expectation of the square of a stochastic integral for a function in terms of the standard integral of the square of that function.
E*1

It is unclear how such a formula is derived, but it appears to arise from the definition of covariance for stochastic functions.
Another formula, an unproven identity, relates quadratic variance of an Ito integral to the Ito integral of a function with respect to the quadratic variance of Brownian motion.
[(INT(0,T)(x(t)dB(t)))] = (INT(0,T)(x(t)d[B(t)]))
Here, [B(t)] is the quadratic variation of B.
The use of stochastic integrals seems to be to integrate stochastic differential equations, such as those used in geometric Brownian motion. Ito's lemma does not involve stochastic integrals, but the SDE's it translates between are solved via them. A more difficult concept is quadratic variation. It seems to be a value [M] such that M^2(t)-[M](t) is a Martingale. In addition, the variance of M is E([M]), which makes little sense, at first. However, if [M](0) is defined to be 0 (as it is in many places), then E(M^2(t)-[M](t)) = E(M^2(t))-[M](0) . Since the variance of M is E(M^2(t))-(E(M(t))^2), var(M) = E(M^2(t))-M^2(0), as M is a martingale. As M^2(t)-[M](t) is a Martingale by definition of [M], E(M^2(t))-E([M](t)) = E(M^2(0))-[M](0). As [M](0)=0, E(M^2(t))-E([M](t)) = E(M^2(0)). As E(M(t)) = M(0), E(M^2(t))-E(M^2(0)) = var(M) = E([M]), so the variance of the Martingale process is equal to the expectation of the quadratic variation of the process. This can be extended to demonstrate the variance of the Ito integral. var(I(t)) = E([I](t)), which is defined by another theorem.
派生証券の値を計算する事に役に立ちそう。

credit
https://almostsure.wordpress.com/2010/03/29/quadratic-variations-and-the-ito-isometry/, Wikipedia

*1:INT(0,T)(x(t)dB(t)))^2) = E(INT(0,T)(x^2(t)dt

2016-12-24

離散の空間に道を検索する

最近トップコーダーの図表問題に苦しんでいます。The problem is to determine whether a "path" between two numbers exists, where a new number x is reachable from a start node s, if gcd(s,x)>k. It is clear that the first step from s is either a factor of s greater than k or a multiple of any such factor (less than or equal to the total number of nodes n).ｋより大きい因数を算出していますがｋより小さい因数を倍数と組み合わせる場合に新しい結節点がでるし前の点とのｇｃｄがｋより大きいことがありえそうです。
The initial approach of searching all multiples of all factors greater than k did not work, though, as, for 12345 and 54321 start and end with a k=1000 parameter, one can move 12345->2469*334->1002(divides 2469*334)->954906 (1002 times 953)->54321 (gcd with 954906 is 2859). This path would not be found, as 1002 is neither a factor of 12345 nor a multiple of a factor which is greater than k (it is a multiple of 3).
It seems that the only feasible algorithm is a tree search, which would cost O(n^3*sqrt(n)) it seems, as for each node all reachable nodes must be found (n^2), and reachability is checked via the gcd (sqrt(n)), and then all nodes must be visited on a backtracking search from the start to determine whether the goal is indirectly reachable. (lower bound O(n)). A more general approach of searching all multiples of all factors whereby the multiples exceed k also fails, as it leads to many false positives (e.g. if a large prime like 104279 is multiplied by a factor of 15 like 3 where k=4, the resulting multiple exceeds k, but gcd(3*104279,15)=3).
The only solution (from xiao_dan and dirak) seems to be to utilize dynamic programming, a recursive lookup mechanism or linked list, and equivalence classes. Basically, scan all values in [k+1,n] and for each x, form the equivalence class x, 2x, 3x, .... These will all have gcd of x among them and are connected. Add the class to the graph such that any values in the class which have already been added will cause their earlier classes to point to the new one. This makes sense, as 12 can be added once for 3, and then revisited for 4, 6, etc. In the case of 4, the 12 will cause the 3 class to point to 4. In the case of 6, this is a member of the 3 class already added, and thus cannot start its own class. While the correct solutions used a single array and recursion efficiently, the code is difficult to decipher. A more logical approach would likely be tu use two data structures: one for equivalence classes, and one for class translation (e.g. 3->4 in above case). Then, the algorithm should loop oveelr all equivalence classes in the k+1 to n range. Whenever a starting value i that has not been processed as another equivalence class is found, the value is saved and set as the cluster name for all multiples of this value (members of the equivalence class) and a link to any other classes that have multiples which lie in this new class also is added. Then, to determine the reachability of node x from node y, a lookup of the base class is performed. This lookup loops over the data structures of classes and class links until the classlink[x]=x relationship holds. If x and y have the same base class, x is reachable from y.
以外に難しい中級問題だった。

2016-12-14

人工知能、Maximum a Posteriori Learning via Conjugate Gradient

Multiple methods of computing the value of a variable theta which optimizes the posterior distribution exists.　The conjugate gradient function seems to be an incremental extrema search algorithm analogous to gradient descent.

2016-12-14

信用リスクの計算方法

CreditMetricsは格付遷移行列を基づいて取引先の将来の適切な格付と利率を計算します。この手法はＪＰモーガン社に開発されました。
CreditRisk+はクレディ・スイスに開発されました。　取引先をエクスポージャーによってバンドに配分します。
In the above cases, the method of computing the expected credit loss or credit VaR for a portfolio of obligors is similar. 倒産確率の相関関係は債務者を二社毎計算して共分散行列を計算できます。The estimation is carried out via Monte Carlo simulation whereby the covariance matrix is initially estimated (using a bivariate normal inverse cdf to compute the probability of default for two obligors, and the joint probability and indvidual probabilities together provide a formula for the correlation). Cholesky decomposition can be used to map uniform random variables to correlated variables, enable multiple return scenarios for the set of obligors to be generated. In each scenario, an obligor's asset return can be translated to a new credit rating, which provides the discount rate for cash flows from that obligor. 全ての債務者の資産利益率によって格付が決まる。その格付は利子を指すしこの利子が各債務者の資金流入額を考慮して現在の債権などの値を計算する。
Exposure is presented net of the expected recovery amount (回収額).
CreditPortfolioViewは経済指標を因数として使う方法です。GDP,雇用指数、製造業景況指数などを元にして倒産の確立を算出します。Pjt = 1/(1+exp(-Yjt)), where j is the country, t is the time, and Y is the macroeconomic indicator.
KMVは破産からの距離を計算します。

2016-12-07

数量的手法でペアー・トレード

最近「クオントピアン」と言うホームページのパイソンに書かれているアルゴリズムを学んでいます。目的は相対的な関係を前提にするペアー取引です。The algorithm's use of an ordinary least squares function to compute the hedge ratio between the two stocks, a "zscore" metric to determine the direction for each stock, and a matrix based on numpy "ndarray" objects which grows columnwise each day makes it somewhat difficult to decipher.
最近自分のペアートレードアルゴリズムを開発しました。zスコアではなくてEWMAに頼ります。