Chapter 8: SQL Statements

This chapter covers how to use SQL to access, retrieve and manipulate data in DolphinDB. DolphinDB’s SQL syntax is very similar to the standard SQL language in a Relational Database Management System (RDBMS) such as MySQL, Oracle, SQL Server, etc.

Syntax

select [top_clause] column_expressions
from table_name | table_expression
[where filtering_conditions]
[grouping_clause [having_clause] | order_clause]

Common features with standard SQL

  • Support select, insert, update and delete statement for retrieving, inserting, updating and deleting records in a table, respectively. Since 1.30.17/2.00.5, DolphinDB supports the create statement to create a database (table), and the alter statement to add columns to a table. Starting in version 1.30.22/2.00.10, all DolphinDB SQL keywords can be written in all-capital letters (e.g., SELECT, FROM, WHERE).

  • Support the where clause.

  • Support group by and order by clauses.

  • Support table join, including inner join, left join, left semijoin, full join.

  • Starting from version 1.30.22, line breaks are supported for SQL statements. Note that:
    • Keywords with multiple words (such as ORDER BY, GROUP BY, UNION ALL, INNER JOIN) cannot be split into two lines.

    • If an alias for a column or table is not specified with keyword as, it must follow the original name without a line break.

What are the differences?

  • Most functions can be directly called in SQL queries.

  • Other differences as listed below.

Standard SQL syntax

DolphinDB syntax

Explanation

where sym=’IBM’

where sym=`IBM (“IBM”) or

where sym==`IBM (“IBM”)

To represent string in DolphinDB, we can use a backtick ` before a single word, or a pair of double

quotes ” ” or single quotes ‘ ‘ around a single word or multiple words with spaces or symbols as

string.

where sym=’IBM’ and qty>2000

where sym==`IBM, qty>2000

where sym==`IBM and qty>2000

where sym=`IBM && qty>2000

All SQL keywords such as “select”, “where”, “and”, “or” must use lower case in DolphinDB.

“,” is preferred to “and” with multiple “where” conditions. Each condition is tested on the

filtered data set with previous conditions.

where sym=”IBM” or qty>2000

where sym==`IBM or qty>2000

where sym==`IBM || qty>2000

where x not in (1,2) and y not in (22,23)

where (not x in [1,2]) and (not y in (22,23))

DolphinDB does not support NOT IN operator.

select avg(price), sym from Dataset group by sym

select avg(price) from Dataset group by sym OR

select avg(price), sym from Dataset group by sym

If a GROUP BY column is not specified in the SELECT clause, it will be automatically included in

the result.

N/A

context by

context by is an innovation of DolphinDB. It simplifies processing time-series data within

each group. With group by, each group returns a scalar; with context by, each group returns

a vector of the same size as the group’s records.

N/A

pivot by

pivot by transforms a vector into a matrix or table.

N/A

cgroup by

Perform cumulative grouping calculations

N/A

map

Execute the SQL statement on each partition separately, then merge the results.

case …. when ….

iif

select column_name(s)

from table1 left join table 2

on table1.column_name=table2.column_name

select column_name(s) from lj(table1, table2, column_name)

DolphinDB’s syntax is more succinct.

LEFT JOIN

left join, lj, left semijoin, lsj

left join (lj) and left semijoin (lsj). Left join returns all records from the left table

and the matched records from the right table. The result is NULL from the right table if there is

no match. If there are more than one matched record in the right table, all the matched records in

the right table are returned. Unlike left join, if there are more than one matched record in the

right table, left semi join only returns the first record.

INNER JOIN

ej, sej, inner join

ej and sej. Return only the rows that have equivalent values for the matching columns.

Unlike ej, sej will sort the table in order by the join column.

OUTER JOIN

fj, full join

Full join. Return all rows from both the left table and the right table, whether they have a

matching row or not.

N/A

aj

Asof join. It takes each record in the left table as a reference and checks if there is

a match in the right table. If there is no match, the most recent observation will be chosen.

If there are more than one match, the last one will be chosen.

N/A

wj, pwj

Window join and prevailing window join. They are a generalization of asof join. For each row

in the left table, window join applies aggregate functions on a window of rows in the right table.

If the right table doesn’t have a matching value for the window, prevailing window join will

fill it with the last value before the window and then apply the aggregate functions.

Compatibility for SQL Dialects

Since version 1.30.22/2.00.10, DolphinDB has enhanced the compatibility for Oracle and MySQL dialects. In addition to supporting standard SQL syntax, DolphinDB deals with the inconsistent behaviors of functions with the same name due to dialect-specific features: You can select a dialect mode in a session and run scripts written in that dialect. Currently, three dialect modes are available: DolphinDB, Oracle and MySQL.

Note:

  • Scripts written in DolphinDB language can be correctly parsed in Oracle or MySQL dialect mode.

  • Only part of the functions or features of Oracle/MySQL are supported:

SQL Dialect

Features

Functions (case insensitive)

Oracle

Comment symbols: --, /**/

Concatenation operator: ||

concat, decode, to_char, to_date, nvl, nvl2, and instr

Note: to_char only accepts numeric, DATE, DATEHOUR, and DATETIME types.

MySQL

sysdate