DolphinDB offers a template function pivot, a function unpivot, and a SQL clause pivot by to change the dimensions of the data.
We will use the following table t1 to illustrate the usage of pivot, unpivot, and pivot by.
syms=`BIDU`MSFT`ORCL$symbol
sym=syms[0 0 0 0 0 0 0 1 1 1 1 1 1 1 2 2 2 2 2 2 2]
price=172.12 170.32 172.25 172.55 175.1 174.85 174.5 36.45 36.15 36.3 35.9 36.5 37.15 36.9 40.1 40.2 40.25 40.15 40.1 40.05 39.95
qty=100* 10 3 7 8 25 6 10 4 5 1 2 8 6 10 2 2 5 5 4 4 3
trade_time=09:40:00+1 30 65 90 130 185 195 10 40 90 140 160 190 200 5 45 80 140 170 190 210
t1=table(sym, price, qty, trade_time);
t1;
sym price qty trade_time
---- ------ ---- ----------
BIDU 172.12 1000 09:40:01
BIDU 170.32 300 09:40:30
BIDU 172.25 700 09:41:05
BIDU 172.55 800 09:41:30
BIDU 175.1 2500 09:42:10
BIDU 174.85 600 09:43:05
BIDU 174.5 1000 09:43:15
MSFT 36.45 400 09:40:10
MSFT 36.15 500 09:40:40
MSFT 36.3 100 09:41:30
MSFT 35.9 200 09:42:20
MSFT 36.5 800 09:42:40
MSFT 37.15 600 09:43:10
MSFT 36.9 1000 09:43:20
ORCL 40.1 200 09:40:05
ORCL 40.2 200 09:40:45
ORCL 40.25 500 09:41:20
ORCL 40.15 500 09:42:20
ORCL 40.1 400 09:42:50
ORCL 40.05 400 09:43:10
ORCL 39.95 300 09:43:30
To rearrange a table column as a matrix based on two other columns in the same table. we can use the template function pivot.
To count the number of observations within each minute for each stock:
pivot(count, price, minute(trade_time), sym);
BIDU MSFT ORCL
---- ---- ----
09:40m|2 2 2
09:41m|2 1 1
09:42m|1 2 2
09:43m|2 2 2
To display the last observation within each minute for each stock:
pivot(last, price, minute(trade_time), sym);
BIDU MSFT ORCL
------ ----- -----
09:40m|170.32 36.15 40.2
09:41m|172.55 36.3 40.25
09:42m|175.1 36.5 40.1
09:43m|174.5 36.9 39.95
We can also use the SQL clause pivot by for the example above.
select last(price) from t1 pivot by minute(trade_time), sym;
minute_trade_time BIDU MSFT ORCL
----------------- ------ ----- -----
09:40m 170.32 36.15 40.2
09:41m 172.55 36.3 40.25
09:42m 175.1 36.5 40.1
09:43m 174.5 36.9 39.95
The function unpivot can be useful when we would like to rearrange 2 columns in a table as 1 column. Please note that in the following example, a new column is created for the column names of col1 and col2.
t=table(1..3 as id, 4..6 as col1, 7..9 as col2);
t;
id col1 col2
-- ---- ----
1 4 7
2 5 8
3 6 9
t.unpivot(`id, `col1`col2);
id valueType value
-- --------- -----
1 col1 4
2 col1 5
3 col1 6
1 col2 7
2 col2 8
3 col2 9
Finally, we give an example of calculating average pairwise correlation of daily stock returns for 4 stocks with 5 days of observations with the help of the template function pivot.
x=2017.03.27..2017.03.31
date=take(x,20)
symbol=take(`A`B`C`D, 20).sort()
returns=0.0121 0.0059 -0.0215 -0.004 0.0312 0.0019 0.0004 -0.0321 -0.0098 -0.0001 0.0003 0.0078 -0.0276 -0.0021 0.0219 0.0087 -0.0003 -0.0127 -0.0022 0.0186
t = table(date, symbol, returns);
t;
date symbol returns
---------- ------ -------
2017.03.27 A 0.0121
2017.03.28 A 0.0059
2017.03.29 A -0.0215
2017.03.30 A -0.004
2017.03.31 A 0.0312
2017.03.27 B 0.0019
2017.03.28 B 0.0004
2017.03.29 B -0.0321
2017.03.30 B -0.0098
2017.03.31 B -0.0001
2017.03.27 C 0.0003
2017.03.28 C 0.0078
2017.03.29 C -0.0276
2017.03.30 C -0.0021
2017.03.31 C 0.0219
2017.03.27 D 0.0087
2017.03.28 D -0.0003
2017.03.29 D -0.0127
2017.03.30 D -0.0022
2017.03.31 D 0.0186
We can use pivot to transform the column returns to a matrix with each row for a date and each column for a stock
retMat = pivot(last, t.returns, t.date, t.symbol);
retMat;
A B C D
------- ------- ------- -------
2017.03.27|0.0121 0.0019 0.0003 0.0087
2017.03.28|0.0059 0.0004 0.0078 -0.0003
2017.03.29|-0.0215 -0.0321 -0.0276 -0.0127
2017.03.30|-0.004 -0.0098 -0.0021 -0.0022
2017.03.31|0.0312 -0.0001 0.0219 0.0186
Next, with the help of the template cross (:C), calculate the correlation between any of the two stocks. The result is a square matrix.
corrs = retMat corr :C retMat;
corrs;
A B C D
-------- -------- -------- --------
A|1 0.844171 0.945889 0.98557
B|0.844171 1 0.882214 0.800108
C|0.945889 0.882214 1 0.891941
D|0.98557 0.800108 0.891941 1
Setting all diagonal elements of the correlation matrix to be NULL:
rfor(i in 0:ows(corrs)) corrs[i,i]=NULL;
corrs;
A B C D
-------- -------- -------- --------
A| 0.844171 0.945889 0.98557
B|0.844171 0.882214 0.800108
C|0.945889 0.882214 0.891941
D|0.98557 0.800108 0.891941
Finally, calculate the average pairwise correlation. We need to rearrange the matrix corrs to a vector first.
corrs.flatten().avg();
0.891649