/*@@@@@@@@@@@@ Q-TEST OF EXTREMAL WHITE NOISE @@@@@@@@@@@@@@@@@@@@@@@@@@ Author: Jonathan B. Hill: Latest version: Nov. 5, 2006 Dept. of Economics Florida International University Miami, FL jonathan.hill@fiu.edu www.fiu.edu/~hilljona This code accompanies the paper (*) "Gaussian Tests of 'Extremal White Noise' for Dependent, Heterogeneous, Heavy Tailed Stochastic Processes with an Application." The test is applicable for time series with regularly varying tails (e.g. any process with an infinite variance that satisfies a central limit theorem), with extremes that are near-epoch-dependent on the extremes of some uniform or strong mixing process. This includes infinitely many ARFIMA, FIGARCH, bilinear, and extremal threshold processes. The code includes a procedure for the extremal white noise Q-test; a procedure for computing B. Hill's (1975) tail index estimator, and a robust kernal variance estimator of B. Hill's estimator based on my work in the paper "On Tail Index Estimation for dependent, heterogeneous data". All procedures are called from the omnibus procedure "COMPLETE_CO_REL_Q_TEST.prc" GLOBAL variable defauls are set below. The key to the strength of the Q-test is selecting the nuisance tail fractile "m" by ranking the sample co-relations, and utilizing a "suitably" small rank. Small ranks r = [.005n]...[[.05n] work well with tests of two-tailed serial extremal dep. Smaller ranks r = [.005n]...[.01n] work well with tail-specific tests, or difference-in-tails tests. An average of co-relations and their standard errors, and an average of Q-statsitics, over any subset of co-relation ranks is asymptotically valid. The code, below, used a flag that allows for reporting co-relations and a_hat's over a window of ranks, with defaults set to the ranges just given. Not surpisingly, most time series will need to be differenced: undifferenced I(1) data will generate strange/infinite values, or crash the program. The tail index is estimated for a large number of "feasible" sample tail fractiles 'm'. The program also produces an average alpha over a window of m's generated by ranking the first order co-relation: see (*). A far more accurate means to estimate alpha is presented in other code on my web-site. Please consult that source. The main procedure will produce PLOTS of the original time series, the tail index estimator with robust confidence bands, and sample co-relations with confidence bands. Thus, LIBRARY PGRAPH is required. Otherwise, simply disable the plot portions of the code. A sample of size > 1000, with 50 displacements will take about 15 minutes to run. Only 5 displacements: a few minutes. MAIN PROCEDURE: complete_co_rel_q_test: Main procedure that calls all sub-ruitines. Requires a univariate time series "y" SUB-PROCEDURES: Q_test_ewn: Q-test of extremal white noise. co_rel_psi: Computes the co-relation. a_c: Computes the Hill-estimator and the extremal scale. a_hat_kern_var: Computes the Hill-estimator, and the kernel variance estimator of J.B. Hill (2005) a_hat_kern_var_left_right: Computes left/right tail estimators and a kernel asymptotic covariance matrix. The interested programmer can use this ruitine to test tail-shape difference. SIMULATION PROCEDURES (I highly recommend new users to use simulated data first to acquaint yourself with the output format): pareto_rv: Simulations a uinivariate Pareto time series. extremal_wn_setar_sim: Simulates an extremal white noise process (SETAR with threshold --> infinity). arma_sim: Simulates arma(p,q) data. Jonathan B. Hill, 2006 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@*/ new; cls; clear all; library pgraph; /* SIMULATED DATA */ n = 500; y = pareto_rv(2*n,1.5,0,1,1); /* input: n,alpha,skew,left-scale,right-scale */ /*{y} = extremal_wn_setar_sim(n,0|.9,ln(n.^(.2)),y);*/ /* input n, theta, threshold, innovations */ /*{y} = arma_sim(1,0,.9,y);*/ /* input p,q,theta,innovations */ /* GLOBAL SETTINGS */ /* TEST STATISTIC */ tails = 0; /* 0 = two tails; 1 = left tail; 2 = right tail; 12 = diff. in tails */ h = 5; /* number of displaments */ /* The window of co-relation ranks over which averages are taken: [n^rank_min_av_per]...[n^rank_max_av_per] */ rank_min_av_per = .005; rank_max_av_per = .05; kern_band = .20; /* kernel var. bandwidth = [m^kern_band]: max. value is .25 */ /* Feasible tail fractiles m = [n^m_pow]-[n^m_win]...[n^m_pow]+[n^m_win] */ m_pow = .9; m_win = .895; m_min = round(n^m_pow)-round(n^m_win); m_max = round(n^m_pow)+round(n^(m_win*.8)); n_m = m_max-m_min+1; /* Co-relation Q-test */ {co_rel,co_rel_se,q_stat,p_value,a_hat,var} = complete_co_rel_q_test(y,h); end; /* @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ */ /*------------------------------------------------------------------------ ** complete_co_rel_q_test.prc ** ** Purpose: Performs the co-relation based Q-test of J.B. Hill (2005a), computes the ** B. Hill-tail index estimator, cf. Hill (1975), and a kernel variance ** variance estimator of the Hill-estimator. See Hill (2005b). ** ** Usage: {co_rel,co_rel_se,q_stat,p_value,a_hat,var} = complete_co_rel_q_test(y,h) ** ** Input: y nx1 vector time series ** h number of q-test lags ** ** Output: co_rel hx1 vector serial co-relations ** co_rel_se hx1 vector asymptotic standard error estimates ** q_stat hx1 vector q-statistics ** p_value hx1 vector p-values ** a_hat B. Hill's tail index estimator ** var kernel variance of B.Hill's tail estimator: see J.B. Hill ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (6) = complete_co_rel_q_test(y,h); local n,a,v,i,y1,q_stat,p_value,m_star,a_x,co_rel,co_rel_se,a_hat,var; local r_min_av,r_max_av,m_seq,rank_seq,rank,a_hat_1,a_hat_2,m_1,m_2; local a_hat_ranks,var_ranks; local y_axis_min,y_axis_max,y_axis; y1 = packr(lagn(y,seqa(0,1,h+1))); n = rows(y1); /* Q-test of extremal white noise: returns optimal m = m_star based on ranking co-relations */ {q_stat,p_value,m_star,m_1,m_2,a_hat_1,a_hat_2,co_rel,co_rel_se,m_seq} = Q_test_ewn(y,h); rank_seq = seqa(1,1,rows(m_seq)); /* B. Hill's tail index, and J.B. Hill's kernel variance estimator: based on m's from ranked co-relations */ {a_hat_ranks,var_ranks} = a_hat_kern_var(y1[.,1],m_star); /* .... based on all m's */ {a_hat,var} = a_hat_kern_var(y1[.,1],m_seq); r_min_av = round(rank_min_av_per*n); /* for averaging co-relation over ranks */ r_max_av = round(rank_max_av_per*n); "NOTE: All 'rankings' imply the tail fractile 'm', for each displacement, is chosen by ranking the co-relations."; " Avererage co-relations and Q-statistics are averaged over a subset of 'm's' based on a pre-selected set of ranks."; " Average statistics are asymptotically valid as long as a feasible set of fractilse 'm'"; " contains elements that are proportional to each other (e.g. m = n^d+-n^(d/2))."; ""; " For best test results for one-tailed and two-tailed tests a sample size n >= 300 is required to"; " ensure an adequate number of extremes. For difference-in-tails the test works well with n >= 500,"; " but n >=800 is required to ensure an accurate estimate of the co-relation difference itself."; "_______________________________________________________________________________________";""; " CO-RELATIONS AND Q-STAT'S FOR RANK [.01*n] =" round(.01*n) "";""; " LAG Lower Co-Relation Upper Q-stat p-value"; seqa(1,1,h)~(co_rel[round(.01*n),.]' - 1.96*co_rel_se[round(.01*n),.]')~co_rel[round(.01*n),.]'~(co_rel[round(.01*n),.]' + 1.96*co_rel_se[round(.01*n),.]')~q_stat[round(.01*n),.]'~cdfchic(q_stat[round(.01*n),.]',1);""; " Rank [.01*n] AVERAGE CO-RELATION AND Q-STAT. OVER ALL DISPLACEMENTS";""; " Lower Co-Relation Upper Q-stat p-value"; meanc(co_rel[round(.01*n),.]'-1.96*co_rel_se[round(.01*n),.]')~meanc(co_rel[round(.01*n),.]')~meanc(co_rel[round(.01*n),.]' + 1.96*co_rel_se[round(.01*n),.]')~meanc(q_stat[round(.01*n),.]')~cdfchic(meanc(q_stat[round(.01*n),.]'),1);""; "";""; " AVERAGE CO-RELATION and Q-STAT oOVER ALL DISPLACEMENTS FOR RANK [.05*n] =" round(.05*n) "";""; " Lower Co-Relation Upper Q-stat p-value"; meanc(co_rel[round(.05*n),.]'-1.96*co_rel_se[round(.05*n),.]')~meanc(co_rel[round(.05*n),.]')~meanc(co_rel[round(.05*n),.]' + 1.96*co_rel_se[round(.05*n),.]')~meanc(q_stat[round(.05*n),.]')~cdfchic(meanc(q_stat[round(.05*n),.]'),1);""; "";""; " AVERAGE CO-RELATION and Q-STAT OVER ALL DISPLACEMENTS FOR RANK [.10*n] =" round(.10*n) "";""; " Lower Co-Relation Upper Q-stat p-value"; meanc(co_rel[round(.1*n),.]'-1.96*co_rel_se[round(.1*n),.]')~meanc(co_rel[round(.1*n),.]')~meanc(co_rel[round(.1*n),.]' + 1.96*co_rel_se[round(.1*n),.]')~meanc(q_stat[round(.1*n),.]')~cdfchic(meanc(q_stat[round(.1*n),.]'),1);""; "";""; " AVERAGE CO-RELATION and Q-STAT ACROSS RANKS" r_min_av "to" r_max_av; ""; " LAG Lower Co-Relation Upper Q-stat p-value"; seqa(1,1,h)~(meanc(co_rel[r_min_av:r_max_av,.]) - 1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))~meanc(co_rel[r_min_av:r_max_av,.])~(meanc(co_rel[r_min_av:r_max_av,.]) + 1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))~meanc(q_stat[r_min_av:r_max_av,.])~cdfchic(meanc(q_stat[r_min_av:r_max_av,.]),1);""; ""; " AVERAGE OVER RANK WINDOW AND DISPLACEMENTS";""; " Lower Co-Relation Upper Q-stat p-value"; meanc(meanc(co_rel[r_min_av:r_max_av,.]) - 1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))~meanc(meanc(co_rel[r_min_av:r_max_av,.]))~meanc(meanc(co_rel[r_min_av:r_max_av,.]) + 1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))~meanc(meanc(q_stat[r_min_av:r_max_av,.]))~cdfchic(meanc(meanc(q_stat[r_min_av:r_max_av,.])),1);""; ""; "****************************************************************************************";""; " ______________________________________"; " ______________________"; " ______________________________________"; "****************************************************************************************";""; " ALL RANKED CO-RELATIONS";""; rank=1; do until rank > rows(co_rel); "_____________________RANK =" rank "__________________"; " Lower Bound Co-Relation Upper Bound"; (co_rel[rank,.]' - 1.96*co_rel_se[rank,.]')~co_rel[rank,.]'~(co_rel[rank,.]' + 1.96*co_rel_se[rank,.]');""; rank=rank+1; endo; ""; "****************************************************************************************";""; " ______________________________________"; " ______________________"; " ______________________________________"; "****************************************************************************************";""; " Q-TEST P-VALIES";""; " Displacement";""; " RANK" seqa(1,1,h)'; rank_seq~p_value;"";""; "****************************************************************************************";""; " ______________________________________"; " ______________________"; " ______________________________________"; "****************************************************************************************";""; "";""; " AVERAGE TAIL INDEX ESTIMATES AND ROBUST INTERVALS "; " Across subset of ranks" r_min_av "to" r_max_av;""; " Alpha(NW) VAR/m 95%CI: k"; a=meanc(a_hat_ranks'); v=meanc(var_ranks'); meanc(a[r_min_av:r_max_av,.])~meanc(v[r_min_av:r_max_av,.])~1.96*sqrt(meanc(v[r_min_av:r_max_av,.])); "_______________________________________________________________________________________"; "";""; " TAIL INDEX FOR [.01*n] =" round(.1*n) "CO-RELATION RANK"; " Across subset of ranks" r_min_av "to" r_max_av;""; " Alpha(NW) VAR/m 95%CI: k"; a=meanc(a_hat_ranks'); v=meanc(var_ranks'); a[round(.01*n),1]~v[round(.01*n),1]~1.96*sqrt(v[round(.1*n),1]); "_______________________________________________________________________________________"; "";""; " TAIL INDEX per ALL CO-RELATION RANKS ";""; " rank implied m d = ln(m)/ln(n) Alpha VAR/m 95%CI: k"; rank_seq~meanc(m_star')~(ln(meanc(m_star'))./ln(n))~meanc(a_hat_ranks')~meanc(var_ranks')~sqrt(meanc((var_ranks)')).*1.96; ""; "_______________________________________________________________________________________";""; " TAIL INDEX FOR ALL SAMPLE TAIL FRACTILES 'm' ";""; " m Alpha VAR/m 95%CI: k"; m_seq~meanc(a_hat')~meanc(var')~sqrt(meanc((var)')).*1.96; ""; "_______________________________________________________________________________________"; ""; begwind; graphset; window (2,2,1); setwind(1); title ("Y(t)"); _plegctl = 0;_plegstr = ""; xlabel(" ");ylabel(" "); xy(0,y); nextwind; title ("Alpha_hat(m) and Robust Kernel 95% Band"); _plegctl = 0; xlabel("fractile m"); xy(m_seq,a_hat-1.96*sqrt(var)~a_hat~a_hat+1.96*sqrt(var)); nextwind; title ("Co-Relations and Robust 95% Interval\L(averaged over pre-set rank window)"); _plegctl = 1;_plegstr = "-k\000p(h)\000k"; y_axis_min = maxc(1.96*meanc(co_rel_se[r_min_av:r_max_av,.])|abs(minc(meanc(co_rel[r_min_av:r_max_av,.])))); y_axis_max = maxc(1.96*meanc(co_rel_se[r_min_av:r_max_av,.])|abs(maxc(meanc(co_rel[r_min_av:r_max_av,.])))); y_axis = maxc(y_axis_min|y_axis_max); ytics(-round(100*y_axis)/100-.05,round(y_axis*100)/100+.05,.1,.1); xlabel("displacement h"); _pbartyp = {0 0,6 0,0 0}; _pbarwid = -.75; bar(seqa(1,1,h),(-1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))~meanc(co_rel[r_min_av:r_max_av,.])~(1.96*meanc(co_rel_se[r_min_av:r_max_av,.]))); graphset; endwind; retp(co_rel,co_rel_se,q_stat,p_value,a_hat,var); endp; /*------------------------------------------------------------------------ ** Q_test_ewn.prc ** ** Purpose: perform co-relation Q-test ** ** Usage: {q_stat,p_val,m_star,co_rel,co_rel_se,m_seq} = Q_test_ewn(x,h); ** ** Input: x nx1 vector ** h number of q-test lags ** ** Output q_stat hx1 vector of q-stat's for each lag = 1...h ** p_val hx1 vector p-values of q_stat's ** m_star optimally selected tail fractiles "m" based on co-relation ranks ** m1 ** m2 ** a_hat_1 ** a_hat_2 ** co_rel hx1 vector co-relation coefficients ** co_rel_se asymptotic standard errors ** m_seq sequence of m's used for test computation ** ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (10) = Q_test_ewn(x,h); local ww,m,x1,d,n,n1,n2,q_stat,i,j,k,m_star,m_n,var_x_1,a_x_1,var_x_2,a_x_2; local stat_func,psi1,psi,w,m_1,m_2,a_hat,var,m_11,m_22; local sq_m,mse,w_j,n_m,m_min,m_max,m_seq,co_rel,co_rel_se; local a_hat_1,a_hat_2,var_1,var_2,rank,cov,cov_a_inv; local w_med; n = rows(x); m_min = round(n^m_pow)-round(n^m_win); /* set of feasible m's = m_min ... m_max */ m_max = round(n^m_pow)+round(n^(m_win*.8)); n_m = m_max - m_min + 1; /* # of m's */ m_seq = seqa(m_min,1,n_m); x1 = packr(lagn(x,seqa(0,1,h+1))); psi1=zeros(n_m,h); m_11=zeros(n_m,h); m_22=zeros(n_m,h); ww = zeros(n_m,h); m_star = zeros(n_m,h); /* the m's associated with the ranks */ /* compute "psi": if co-relation difference is required, then psi = p(2) - p(1) */ j = 1; m=m_min; do until m > m_max; i=1; do until i > h; {psi1[j,i],m_11[j,i],m_22[j,i],ww[j,i]} = co_rel_psi(x1[.,1],x1[.,i+1],m); i=i+1; endo; j=j+1; m=m+1; endo; psi=zeros(n_m,h); a_hat=zeros(n_m,h); var=zeros(n_m,h); co_rel=zeros(n_m,h); co_rel_se=zeros(n_m,h); q_stat=zeros(n_m,h); w = zeros(n_m,h); a_hat_1=zeros(n_m,h); var_1=zeros(n_m,h); a_hat_2=zeros(n_m,h); var_2=zeros(n_m,h); m_1=zeros(n_m,h); m_2=zeros(n_m,h); rank = 1; do until rank > n_m; /* compute m-selection criterion and select m */ i=1; do until i > h; if tails .eq 12; /* tail difference */ d=seqa(1,1,rows(m_seq))~m_seq~abs(psi1[.,i]); w_med = median(ww); else; /* two, left or right tail */ d=seqa(1,1,rows(m_seq))~m_seq~abs(ones(rows(psi1),1)-psi1[.,i]); w_med = median(ww); endif; d=sortc(d,3); m_star[rank,i] = d[rank,2]; psi[rank,i] = psi1[d[rank,1],i]; m_1[rank,i]= m_11[d[rank,1],i]; m_2[rank,i]= m_22[d[rank,1],i]; w[rank,i] = ww[d[rank,1],i]; {a_hat_1[rank,i],a_hat_2[rank,i],cov,cov_a_inv} = a_hat_kern_var_left_right(x1[.,1],m_star[rank,i]); var_1[rank,i] = cov[1,1]; var_2[rank,i] = cov[2,2]; i=i+1; endo; i=1; do until i > h; if tails .eq 0; /* two-tailed */ co_rel[rank,i] = 1-psi[rank,i]; co_rel_se[rank,i] = w_med[i,1]; q_stat[rank,i] = meanc(((co_rel[rank,1:i].^2)')./(w_med[1:i,1].^2)); else; /* all else */ if tails .eq 12; /* difference in tails: "w" contains all var. components */ co_rel[rank,i] = psi[rank,i]; co_rel_se[rank,i] = w_med[i,1]; q_stat[rank,i] = meanc(((co_rel[rank,1:i].^2)')./(w_med[1:i,1].^2)); endif; if tails .eq 1 or tails .eq 2; /* left or right tail */ co_rel[rank,i] = psi[rank,i]-1; co_rel_se[rank,i] = w[rank,i]; q_stat[rank,i] = meanc(((co_rel[rank,1:i].^2)')./((w[rank,1:i]').^2)); endif; endif; i=i+1; endo; rank=rank+1; endo; retp(q_stat,cdfchic(q_stat,1),m_star,m_1,m_2,a_hat_1,a_hat_2,co_rel,co_rel_se,m_seq); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** co_rel_psi.prc ** ** Purpose: Derive co_relation coefficient 1 - psi(x,y) ** ** Usage: {psi,a_x,w} = co_rel_psi(x,y,m); ** ** Input: x,y nx2 time series ** m order index used for a,c derivation: m = o(n^d) for some 0 < d < 1 ** ** Output psi co_relation coefficient sub-element: co-relation = 1 - psi(x,y) ** a_x index of regular variation based on right-tail information ** w scale for co-relation in Q-test proc ** ** Comment: 1. Asymptotic theory assumes alpha is common to x and y; assumes X and Y have identical scales c(i) ** 2. The co-relation estimator can be easily extended to het. processes {X,Y} ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (4) = co_rel_psi(x,y,m); local A,A1,A2,b,x1,x2,y1,y2,xy1,xy2,n,n1,n2,m1,m2,i,co_rel,psi,xy,w,w1,w2; local chi_x_1,chi_x_2,chi_y_1,chi_y_2,chi_xy_1,chi_xy_2,a_x,var_x,c_x_0,c_x_1,c_x_2; local a_y,var_y,c_y_0,c_y_1,c_y_2,a_xy,var_xy,c_xy_0,c_xy_1,c_xy_2,a_x_1,a_x_2; local a_x1,a_x2,a_y1,a_y2,a_xy1,a_xy2,cov12,cov12_a_inv; local var_x1,var_x2,var_y1,var_y2,var_xy1,var_xy2,d; local nxy1,nxy2,mxy1,mxy2,d1,d2,d3; local ms,a_hat,a_z,cov_z,cov_z_a_inv,var_w; x1 = x.*(x .le 0); x2 = x.*(x .> 0); y1 = y.*(y .le 0); y2 = y.*(y .> 0); if tails .eq 0; /* two-tailed: use X - Y */ xy = x-y; else; /* all other cases: X + Y */ xy = x+y; endif; xy1 = xy.*(xy .le 0); xy2 = xy.*(xy .> 0); n = rows(x); n1 = sumc(x .le 0); n2 = sumc(x .> 0); m1 = round(n1^(ln(m)/ln(n))); /* for left-tail statistic, use no more than the # of negative obs. */ m2 = round(n2^(ln(m)/ln(n))); if tails .eq 0; /* two tails */ {a_x,var_x,c_x_0} = a_c(abs(x),abs(x),m,m); /* a_c(x for a_hat, x for c-base, m) */ {a_xy,var_xy,c_xy_0} = a_c(abs(x),abs(xy),m,m); psi = c_xy_0/(2*c_x_0); /* co-relation base: co_rel = psi-1 */ {a_z,cov_z,cov_z_a_inv,ms} = a_hat_kern_var_z_vec(abs(x)~abs(xy),m); w = (ln(n/minc(ms))/sqrt(minc(ms))).*(a_z[1,1]^(-1)).*(sqrt(abs(cov_z[1,1] + cov_z[2,2] - 2*cov_z[1,2]))); retp(psi,m,m,w); endif; if tails .eq 12; /* difference in tails */ m1=maxc(minc(m1|m2)|10); m2=m1; {a_x1,var_x1,c_x_1} = a_c(-x1,-x1,m1,m1); {a_y1,var_y1,c_y_1} = a_c(-x1,-y1,m1,m1); {a_xy1,var_xy1,c_xy_1} = a_c(-x1,-xy1,m1,m1); {a_x2,var_x2,c_x_2} = a_c(x2,x2,m2,m2); {a_y2,var_y2,c_y_2} = a_c(x2,y2,m2,m2); {a_xy2,var_xy2,c_xy_2} = a_c(x2,xy2,m2,m2); {a_z,cov_z,cov_z_a_inv,ms} = a_hat_kern_var_z_vec(abs(x1)~abs(xy1)~x2~xy2,m); A1 = (c_xy_1/(2*c_x_1))*(-a_z[1,1]^(-1))|((a_z[1,1]^(-1)))|((a_z[3,1]^(-1)))|(-a_z[3,1]^(-1)); A2 = (c_xy_2/(2*c_x_2))*(-a_z[1,1]^(-1))|((a_z[1,1]^(-1)))|((a_z[3,1]^(-1)))|(-a_z[3,1]^(-1)); psi = c_xy_2/(2*c_x_2)-c_xy_1/(2*c_x_1); w1 = (ln(n/minc(ms))/sqrt(minc(ms))).*sqrt(abs(A1'cov_z*A1)); w2 = (ln(n/minc(ms))/sqrt(minc(ms))).*sqrt(abs(A2'cov_z*A2)); w = minc(w1|w2); retp(psi,sqrt(m1*m2),sqrt(m1*m2),w); endif; if tails .eq 1; /* left tail */ {a_x,var_x,c_x_1} = a_c(-x1,-x1,m1,m1); {a_xy,var_xy,c_xy_1} = a_c(-x1,-xy1,m1,m1); psi = c_xy_1/(2*c_x_1); {a_z,cov_z,cov_z_a_inv,ms} = a_hat_kern_var_z_vec(-x1~-xy1,m1); w = (ln(n/minc(ms))/sqrt(minc(ms))).*(a_z[1,1]^(-1)).*(sqrt(abs(cov_z[1,1] + cov_z[2,2] - 2*cov_z[1,2]))); retp(psi,m1,m1,w); endif; if tails .eq 2; /* right tail */ {a_x,var_x,c_x_2} = a_c(x2,x2,m2,m2); {a_xy,var_xy,c_xy_2} = a_c(x2,xy2,m2,m2); psi = c_xy_2/(2*c_x_2); {a_z,cov_z,cov_z_a_inv,ms} = a_hat_kern_var_z_vec(x2~xy2,m2); w = (ln(n/minc(ms))/sqrt(minc(ms))).*(a_z[1,1]^(-1)).*(sqrt(abs(cov_z[1,1] + cov_z[2,2] - 2*cov_z[1,2]))); retp(psi,m2,m2,w); endif; endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** a_c.prc ** ** Purpose: Derive Hill's (1975) tail index estimator a, and stable domain scale c, cf. Hall (1982) ** ** Usage: {a,c} = a_c(x_a,x_c,m); ** ** Input: x_a nx1 time series for a_hat derivation ** x_c nx1 time series for c_hat base derivation: X(m+1) ** m order index used for a,c derivation: m = o(n^d) for some 0 < d < 1 ** ** Output a tail index estimator ** c scale estimator ** ** Jonathan B. Hill (2003) -----------------------------------------------------------------------*/ proc (3) = a_c(x_a,x_c,m_a,m_c); local a_hat,c_hat,var,n,x_ord; local x_ord_mc; x_ord = sortc(abs(packr(miss(x_c,0))),1); n = rows(x_ord); if m_c .ge n; m_c = n-1; endif; {a_hat,var} = a_hat_kern_var(x_a,m_a); c_hat = (m_c/n)*(abs(x_ord[n-m_c,1])^a_hat); retp(a_hat,var,c_hat); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** a_hat_kern_var.prc ** ** Purpose: Derive Hill's (1975) tail index estimator a, and Newey-West variance estimator ** ** Usage: {a_hat,var} = a_hat_kern_var(x,m); ** ** Input: x nx1 time series ** m order index used for a,c derivation: m = o(n^d) for some 0 < d < 1 ** ** Output a_hat tail index estimator ** var kernel variance estimator ** ** Jonathan B. Hill (2003) -----------------------------------------------------------------------*/ proc (2) = a_hat_kern_var(x1,m); local i,j,k,w,a_hat,z,x_m_a,n,x_ord,n_pos; local a_inv,bn,ln_xx,ln_xx_lag,var,var_a; x1=abs(x1); /* a_hat for all x: |x| */ x1=packr(miss(x1,0)); /* effectively removes "0's", and does not change true "alpha" */ n = rows(x1); x_ord = sortc(x1,1); a_hat = zeros(rows(m),cols(m)); var_a = zeros(rows(m),cols(m)); i=1; do until i > rows(m); j=1; do until j > cols(m); if m[i,j] .eq 0; goto skip_a; /* reset m = 1 if ranked m is simply "0" */ endif; a_hat[i,j] = (meanc(ln(abs(x_ord[n-m[i,j]+1:n,1])))-ln(abs(x_ord[n-m[i,j],1])))^(-1); a_inv = 1./a_hat[i,j]; bn = round(m[i,j].^kern_band); ln_xx = (ln(x1[.,1])-ln(abs(x_ord[n-m[i,j],1]))); x_m_a = x_ord[n-m[i,j],1].*a_hat[i,j]; /* X(m+1)^a_hat */ ln_xx = ln_xx.*(ln_xx .> 0) - (m[i,j]/n)*a_inv; /* used to estimate asymptotic variance */ var = sumc(ln_xx[.,1].^2)./m[i,j]; /* asymptotic variance */ k=1; do until k > bn; ln_xx_lag = packr(lagn(ln_xx,(0|k))); w = 1-k/bn; /* Bartlett kernel */ var = var + 2*w*sumc(ln_xx_lag[.,1].*ln_xx_lag[.,2])./m[i,j]; k=k+1; endo; var = abs((var).*a_hat[i,j]^4); var_a[i,j] = var./m[i,j]; /* scaled for interval estimation*/ skip_a: j=j+1; endo; i=i+1; endo; retp(a_hat,var_a); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** a_hat_kern_var_left_right.prc ** ** Purpose: Derive Hill's (1975) tail index estimator for left and right tails ** and a kernel estimator of the covariance matrix, cf. J.B. Hill (2006) ** ** Usage: {a,cov} = a_hat_kern_var_left_right(x,m); ** ** Input: x nx1 time series ** m order index used for a,c derivation ** assumed that m is a scalar ** ** Output a_hat_1 left tail index estimator ** a_hat_2 right tail index estimators ** cov kernel variance estimator ** cov_a_inv kernel variance estimator of 1/a ** ** Jonathan B. Hill (2003) -----------------------------------------------------------------------*/ proc (4) = a_hat_kern_var_left_right(x,m); local x1,x2,n1,n2,n12,m1,m2,m12,i,j,k,w,a_hat_1,a_hat_2,z; local cov,n,x_ord_1,x_ord_2,n_pos; local a_inv_1,a_inv_2,bn,ln_xx_1,ln_xx_2; local ln_xx_lag_1,ln_xx_lag_2,cov_a_inv,tail_flag,var; n = rows(x); x1=abs(x.*(x .< 0)); x2=x.*(x .> 0); n1=sumc(x .< 0); n2=sumc(x .> 0); cov_a_inv = miss(zeros(2,2),0); cov = miss(zeros(2,2),0); if n1 .> 0; tail_flag = 1; if n2 .> 0; tail_flag = 12; m1 = round(n1^(ln(m)/ln(n))); m2 = round(n2^(ln(m)/ln(n))); m12 = minc(m1|m2); /* DOES THIS MATCH OTHER m12 settings? */ n12 = minc(n1|n2); endif; else; if n2 .> 0; tail_flag = 2; else; tail_flag = 0; endif; endif; if tail_flag .eq 0; "Tail Index Estimation ERROR: X(t) = 0 for all t"; retp(0,0,0,0); elseif tail_flag .eq 1; {a_hat_1,var} = a_hat_kern_var(x1,m); a_hat_2 = 0; cov[1,1] = var; cov_a_inv[1,1] = cov[1,1].*(a_hat_1^(-2)); elseif tail_flag .eq 2; {a_hat_2,var} = a_hat_kern_var(x2,m); a_hat_1 = 0; cov[2,2] = var; cov_a_inv[2,2] = cov[2,2].*(a_hat_2^(-2)); else; x_ord_1 = sortc(x1,1); x_ord_1 = x_ord_1[n-n12+1:n,1]; x_ord_2 = sortc(x2,1); x_ord_2 = x_ord_2[n-n12+1:n,1]; a_hat_1 = (meanc(ln(abs(x_ord_1[n12-m12+1:n12,1])))-ln(abs(x_ord_1[n12-m12,1])))^(-1); a_hat_2 = (meanc(ln(abs(x_ord_2[n12-m12+1:n12,1])))-ln(abs(x_ord_2[n12-m12,1])))^(-1); /* compute newey west variance estimator for a_hat^(-1) for het., dep. data */ a_inv_1 = 1./a_hat_1; a_inv_2 = 1./a_hat_2; bn = round(minc(m1|m2).^kern_band); ln_xx_1 = ((ln(x_ord_1[.,1])-ln(x_ord_1[n12-m12,1])).*(ln(x_ord_1[.,1])-ln(x_ord_1[n12-m12,1]) .> 0) - (m12/n12)*a_inv_1); ln_xx_2 = ((ln(x_ord_2[.,1])-ln(x_ord_2[n12-m12,1])).*(ln(x_ord_2[.,1])-ln(x_ord_2[n12-m12,1]) .> 0) - (m12/n12)*a_inv_2); cov_a_inv[1,1] = sumc(ln_xx_1[.,1].^2)./m12; cov_a_inv[1,2] = sumc(ln_xx_1[.,1].*ln_xx_2[.,1])./m12; cov_a_inv[2,1] = sumc(ln_xx_1[.,1].*ln_xx_2[.,1])./m12; cov_a_inv[2,2] = sumc(ln_xx_2[.,1].^2)./m12; k=1; do until k > bn; ln_xx_lag_1 = packr(lagn(ln_xx_1,(0|k))); ln_xx_lag_2 = packr(lagn(ln_xx_2,(0|k))); w = 1-k/bn; /* Bartlett kernel */ cov_a_inv[1,1] = cov_a_inv[1,1] + 2*w*sumc(ln_xx_lag_1[.,1].*ln_xx_lag_1[.,2])./m12; cov_a_inv[1,2] = cov_a_inv[1,2] - 2*w*sumc(ln_xx_lag_1[.,1].*ln_xx_lag_2[.,2])./m12; cov_a_inv[2,1] = cov_a_inv[2,1] - 2*w*sumc(ln_xx_lag_2[.,1].*ln_xx_lag_1[.,2])./m12; cov_a_inv[2,2] = cov_a_inv[2,2] + 2*w*sumc(ln_xx_lag_2[.,1].*ln_xx_lag_2[.,2])./m12; k=k+1; endo; cov = abs((cov_a_inv).*((a_hat_1^2|a_hat_2^2)*(a_hat_1^2|a_hat_2^2)')); /* scale as asymptotic covariances */ cov[1,1]=cov[1,1]/m1;cov[2,2]=cov[2,2]/m2;cov[1,2]=cov[1,2]/sqrt(m1*m2);cov[2,1]=cov[1,2]; cov_a_inv[1,1]=cov_a_inv[1,1]/m1;cov_a_inv[2,2]=cov_a_inv[2,2]/m2;cov_a_inv[1,2]=cov_a_inv[1,2]/sqrt(m1*m2);cov_a_inv[2,1]=cov[1,2]/sqrt(m1*m2); cov[1,2] = (cov[1,2]+cov[2,1])/2;cov[2,1] = cov[1,2]; cov_a_inv[1,2] = (cov_a_inv[1,2]+cov_a_inv[2,1])/2;cov_a_inv[2,1] = cov_a_inv[1,2]; endif; retp(a_hat_1,a_hat_2,cov,cov_a_inv); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** a_hat_kern_var_z_vec.prc ** ** Purpose: Derive Hill's (1975) tail index estimator for a k-vector of processes {z(t)} ** with support on [0,infin), and a kernel estimator of the covariance matrix, ** cf. J.B. Hill (2006) ** ** Usage: {a,cov} = a_hat_kern_var_z_vec(x,m); ** ** Input: z nxk time series: assumed to be positive ** m order index used for a,c derivation: assumed that m is a scalar ** ** Output a_hat_z vecror of tail index estimates ** cov kernel variance estimator ** cov_a_inv kernel variance estimator of 1/a ** mz ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (4) = a_hat_kern_var_z_vec(z,m); local nz,nzz,mx,mz,mzz,i,j,k,w,a_hat_z; local cov,n,z_ord,n_pos,a_inv_z,bn,ln_zz; local ln_zzs,ln_zz_lag_mj,ln_zz_lag_pj,cov_a_inv,tail_flag,var; n = rows(z); k=cols(z); nz=sumc(z .> 0); mz = round(nz^(ln(m)/ln(n))); mzz = minc(mz); nzz = minc(nz); cov_a_inv = miss(zeros(k,k),0); cov = miss(zeros(k,k),0); z = miss(z.*(z .> 0),0); /* converts 0's to missing values */ z_ord = zeros(rows(z),k); i=1; do until i > k; z_ord[.,i] = sortc(z[.,i],1); i=i+1; endo; z_ord=packr(z_ord); a_hat_z = (meanc(ln(abs(z_ord[nzz-mzz+1:nzz,.])))-ln(abs(z_ord[nzz-mzz,.]))')^(-1); /* compute kernel variance estimator for a_hat^(-1) for het., dep. data */ a_inv_z = 1./a_hat_z; bn = round(mzz.^kern_band); ln_zz = ((ln(z_ord)-ln(z_ord[nzz-mzz,.])).*(ln(z_ord)-ln(z_ord[nzz-mzz,.]) .> 0) - (mzz/nzz).*a_inv_z'); cov_a_inv = ln_zz'ln_zz/mzz; j=1; do until j > bn; ln_zz_lag_mj = zeros(rows(ln_zz),k); ln_zz_lag_pj = zeros(rows(ln_zz),k); ln_zzs=ln_zz; i=1; do until i > k; ln_zz_lag_mj[.,i] = lagn(ln_zzs[.,i],j); ln_zz_lag_pj[.,i] = lagn(ln_zzs[.,i],-j); i=i+1; endo; ln_zz_lag_pj=packr(ln_zzs~ln_zz_lag_mj~ln_zz_lag_pj); ln_zzs = ln_zz_lag_pj[.,1:cols(ln_zzs)]; ln_zz_lag_mj = ln_zz_lag_pj[.,cols(ln_zzs)+1:cols(ln_zzs)+cols(ln_zz_lag_mj)]; ln_zz_lag_pj = ln_zz_lag_pj[.,cols(ln_zzs)+cols(ln_zz_lag_mj)+1:cols(ln_zz_lag_pj)]; w = 1-j/bn; /* Bartlett kernel */ cov_a_inv = cov_a_inv + w*(ln_zzs'ln_zz_lag_mj+ln_zz_lag_pj'ln_zzs)./mzz; j=j+1; endo; cov = cov_a_inv.*(a_hat_z.^2.*(a_hat_z.^2)'); retp(a_hat_z,cov,cov_a_inv,mz); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** pareto_rv.prc ** ** Purpose: simulate a-stable r.v. ** ** Usage: {x} = stable_rv(n,alpha,beta,c1,c2); ** ** Input: n sample size ** alpha tail index ** beta skewness ** c1,c2 left/right extreme scales ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc pareto_rv(n1,alpha,beta,c1,c2); local n,i,j,t,rv,z0,z1,w,u; n=n1*2; rv=zeros(n,1); z1=zeros(n,1); z0=rndu(n,1); u=rndu(n,1).*(.999999) + .0000005; rv=(abs(((1-u)./c1).^(-1/alpha))).*(z0 .> .5) + (abs(((u)./c2).^(-1/alpha))).*(z0 .< .5) ; rv=rv-1; z0=2.*(z0 .> .5) - 1; rv=rv.*z0; retp(rv[n1+1:n,.]); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** arma_sim.prc ** ** Purpose: Simulates ARMA(p,q) for given p,q ** ** Usage: {y} = arma_sim(p,q,theta,e) ** ** Inputs: p,q orders ** theta kx1 vector of slopes, includes constant and trend slopes if applicable ** e random shock: nx1 ** ** Output: y nx1 vector time series ** ** Comment: Requires global variables trend = 1 if trend, const = 1 if constant term ** ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (1) = arma_sim(p,q,theta,e); local y,i,t,phi,n; local phi0,delta; n = rows(e); if p+q .eq 0; "Error: Simulated ARMA does not have any orders: need p > 0 and/or q > 0."; retp(error(0)); endif; if p .eq 0; phi = 0; else; phi = theta[1:p,1]; endif; if q .eq 0; theta =0; else; theta = theta[p+1:rows(theta),1]; endif; y=zeros(n,1); y[1,1] = e[1,1]; if p + q .eq 0; /* merely white noise */ y=e; goto skip_arma; endif; if p .< q; if p .eq 0; /* MA(q) */ i=1; do until i > q; y[i+1,1] = theta[1:i,1]'e[i:1,1]+e[i+1,1]; i=i+1; endo; else; /* ARMA(p,q), 0 < p < q */ i=1; do until i > p; y[i+1,1] = phi[1:i,1]'y[i:1,1]+theta[1:i,1]'e[i:1,1]+e[i+1,1]; i=i+1; endo; i=p+1; do until i > q; y[i+1,1] = phi'y[i:i-p+1,1]+theta[1:i,1]'e[i:1,1]+e[i+1,1]; i=i+1; endo; endif; else; if q .eq 0; /* AR(p) */ i=1; do until i > p; y[i+1,1] = phi[1:i,1]'y[i:1,1]+e[i+1,1]; i=i+1; endo; else; /* ARMA(p,q), 0 < q < p */ i=1; do until i > q; y[i+1,1] = phi[1:i,1]'y[i:1,1]+theta[1:i,1]'e[i:1,1]+e[i+1,1]; i=i+1; endo; i=q+1; do until i > p; y[i+1,1] = phi[1:i,1]'y[i:1,1]+theta'e[i:i-q+1,1]+e[i+1,1]; i=i+1; endo; endif; endif; i=maxc(p|q)+1; do until i > n-1; if p .eq 0; y[i+1,1] = theta'e[i:i-q+1,1]+e[i+1,1]; else; if q .eq 0; y[i+1,1] = phi'y[i:i-p+1,1]+e[i+1,1]; else; y[i+1,1] = phi'y[i:i-p+1,1]+theta'e[i:i-q+1,1]+e[i+1,1]; endif; endif; i=i+1; endo; skip_arma: y = y[round(n/2)+1:n,1]; /* trim to remove dependence on start values */ retp(y); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** setar_sim.prc ** ** Purpose: Simulate Self Exciting Treshold AR [SETAR(p1,p2)]; use y(t-d) as threshold trigger ** ** Usage: {y} = setar_sim(n,phi1,phi2,c,d,u) ** ** Input: n sample size ** phi1 1st regime AR coeff. ** phi2 2nd regime AR coeff. ** c tehshold ** d delay parameter for threshold variable y(t-d) ** u 2*n-vector of innovations ** ** Comment: automatically use an intercept ** ** Jonathan B. Hill (2003) -----------------------------------------------------------------------*/ proc (1) = setar_sim(n,phi1,phi2,c,d,u); local i,t,j,y; local p1,p2,I_t; p1=rows(phi1)-1; p2=rows(phi2)-1; y=zeros(2*n,1); t=maxc(p1|p2|d)+1; do until t > 2*n; I_t = {}; I_t = (y[t-d,1] .> c); y[t,1]=((1|y[t-1:t-p1,1])'phi1).*I_t + ((1|y[t-1:t-p2,1])'phi2).*(1-I_t) + u[t,1]; t=t+1; endo; y=y[n+1:2*n,1]; retp(y); endp; /*____________________________________________________________________________________*/ /*------------------------------------------------------------------------- ** extremal_wn_setar_sim.prc ** ** Purpose: Simulate Extremal Self Exciting Treshold AR [SETAR(p1,p2)] ** Use e(t-1) as threshold trigger ** {x(t)} is asymptotically Extremal White Noise ** ** Usage: {y} = setar_sim(n,phi,bn,u) ** ** Input: n sample size ** phi AR coeff. ** bn threshold sequence: bn --> infin as n --> infin ** u 2*n-vector of innovations ** ** Comment: automatically use an intercept ** ** Jonathan B. Hill (2006) -----------------------------------------------------------------------*/ proc (1) = extremal_wn_setar_sim(n,phi,bn,u); local i,t,j,y; local p,I_t; p=rows(phi)-1; y=zeros(2*n,1); t = p+1; do until t > 2*n; I_t = {}; I_t = (abs(u[t-1,1]) .le bn); y[t,1]=((1|y[t-1:t-p,1])'phi).*I_t + u[t,1]; /* if |u(t-1)| <= bn then AR(1); else iid */ t=t+1; endo; y=y[n+1:2*n,1]; retp(y); endp; /*____________________________________________________________________________________*/ /*-------------------------------------------------------------------------- ** day_filter(y,d) ** ** Purpose: filter daily data for day effects ** ** Usage: ** ** Input: y - time series ** d - number of days/ trading week ** first_day - the first day of the data: 1,2,... ** ** Output: ** ** Jonathan Hill (2005) ----------------------------------------------------------------------------*/ proc day_filter(y,d,first_day); local x,i,j,n,y_f,days,days1; n = rows(y); y_f = zeros(n,1); days1 = eye(d); days = eye(d); i=1; do until i > 2*round(n/d); days = days|days1; i=i+1; endo; days = days[first_day:n+first_day-1,.]; x = packr(y~days); y=x[.,1]; x=x[.,2:cols(x)]; y_f = y-x*inv(x'x)*x'y; retp(y_f); endp; /*____________________________________________________________________________________*/ /*-------------------------------------------------------------------------- ** month_filter(y) ** ** Jonathan Hill (2006) ----------------------------------------------------------------------------*/ proc month_filter(y,first_month); local x,i,j,n,y_f,month,month1; n = rows(y); y_f = zeros(n,1); month1 = eye(12); month = eye(12); i=1; do until i > 2*round(n/12); month = month|month1; i=i+1; endo; month = month[first_month:n+first_month-1,.]; x=y~month; y=x[.,1]; x=x[.,2:cols(x)]; y_f = y-x*inv(x'x)*x'y; retp(y_f); endp; /*____________________________________________________________________________________*/ /*-------------------------------------------------------------------------- ** diff.prc ** ** ** Purpose: create difference series ** ** Usage: {W} = diff(W,d); ** ** Input: W: TxK design matrix ** d: difference lag (assumed integer > 0); ** ** Output: W = (T-d)xk matrix of differenced variates ** ** Jonathan Hill (2003) ----------------------------------------------------------------------------*/ proc diff(W,d); local i,W1,W2; if d == 0; retp(W); endif; if round(d) .ne d; "Error: difference lag is not an integer"; retp(error(0)); else; if d .< 0; "Error: difference lag < 0"; retp(error(0)); endif; endif; W1 = zeros(rows(W)-d,cols(W)); W2 = zeros(rows(W),d+1); i=1; do until i > cols(W); W2 = packr(W[.,i]~lagn(W[.,i],d)); W1[.,i] = W2[.,1]-W2[.,2]; i=i+1; endo; retp(W1); endp; /*____________________________________________________________________________________*/