2. Java SQL Transpiler Library¶
Transpile, Resolve, Lineage – A Database independent, stand-alone SQL Transpiler, Column Resolver and Lineage Tracer written in pure Java, translating various large RDBMS SQL Dialects into a few smaller RDBMS Dialects for Unit Testing.
JSQLTranspiler allows you to develop and test your Big Data SQL at no cost on a local DuckDB instance before deploying and running it in the cloud.
Internal Functions will be rewritten based on the actual meaning and purpose of the function, respecting different function arguments’ count, order and type. Rewrite of Window- and Aggregate-Functions as well.
Download¶
Flavor |
File |
Size |
---|---|---|
Java Library Stable |
(80 kb) |
|
Java Library Snapshot |
(80 kb) |
|
CLI Fat JAR Stable |
(1.3 MB) |
|
CLI Fat JAR Snapshot |
(1.3 MB) |
Like us on the JSQLTranspiler Git Repository
git clone https://github.com/starlake-ai/JSQLTranspiler.git
cd JSQLTranspiler
./gradlew build
<dependency>
<groupId>ai.starlake.jsqltranspiler</groupId>
<artifactId>jsqltranspiler</artifactId>
<version>0.6</version>
</dependency>
<repositories>
<repository>
<id>jsqltranspiler-snapshots</id>
<snapshots>
<enabled>true</enabled>
</snapshots>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>
</repositories>
<dependency>
<groupId>ai.starlake.jsqltranspiler</groupId>
<artifactId>jsqltranspiler</artifactId>
<version>0.7-SNAPSHOT</version>
</dependency>
repositories {
mavenCentral()
}
dependencies {
implementation 'ai.starlake.jsqltranspiler:jsqltranspiler:0.6'
}
repositories {
maven {
url = uri('https://s01.oss.sonatype.org/content/repositories/snapshots/')
}
}
dependencies {
implementation 'ai.starlake.jsqltranspiler:jsqltranspiler:0.7-SNAPSHOT'
}
Examples¶
-- Google BigQuery
SELECT
DATE(2016, 12, 25) AS date_ymd,
DATE(DATETIME '2016-12-25 23:59:59') AS date_dt,
DATE(TIMESTAMP '2016-12-25 05:30:00+07', 'America/Los_Angeles') AS date_tstz;
-- Rewritten DuckDB compliant statement
SELECT
MAKE_DATE(2016, 12, 25) AS date_ymd,
CAST(DATETIME '2016-12-25 23:59:59' AS DATE) AS date_dt,
CAST(TIMESTAMP '2016-12-25 05:30:00+07' AT TIME ZONE 'America/Los_Angeles' AS DATE) AS date_tstz;
-- Same Tally
1
-- Same Result
"date_ymd","date_dt","date_tstz"
"2016-12-15","2016-12-15","2016-12-15"
/* Schema:
Table a: Columns col1, col2, col3, colAA, colBA
Table b: Columns col1, col2, col3, colBA, colBB
*/
-- provided SELECT with STAR Operators
SELECT *
FROM ( ( SELECT *
FROM b ) c
INNER JOIN a
ON c.col1 = a.col1 ) d
;
-- Resolved Columns via JSQLColumnResolver.rewrite(...)
-- Without needing an actual database connection
SELECT d.col1
, d.col2
, d.col3
, d.colBA
, d.colBB
, d.col1_1
, d.col2_1
, d.col3_1
, d.colAA
, d.colAB
FROM ( ( SELECT b.col1
, b.col2
, b.col3
, b.colBA
, b.colBB
FROM b ) c
INNER JOIN a
ON c.col1 = a.col1 ) d
;
/* Schema:
Table a: Columns col1, col2, col3, colAA, colBA
Table b: Columns col1, col2, col3, colBA, colBB
*/
-- provided SELECT with STAR Operator
-- Without needing an actual database connection
SELECT Sum( colBA + colBB ) AS total
, ( SELECT col1 AS test
FROM b ) col2
, CURRENT_TIMESTAMP() AS col3
FROM a
INNER JOIN ( SELECT *
FROM b ) c
ON a.col1 = c.col1
;
<?xml version="1.0" encoding="UTF-8"?>
<ColumnSet>
<Column alias='total' name='Sum'>
<ColumnSet>
<Column name='Addition'>
<ColumnSet>
<!-- scope points on the actual physical column b.colBA -->
<Column name='colBA' table='c' scope='b.colBA' dataType='java.sql.Types.OTHER' typeName='Other' columnSize='0' decimalDigits='0' nullable=''/>
<!-- scope points on the actual physical column b.colBB -->
<Column name='colBB' table='c' scope='b.colBB' dataType='java.sql.Types.OTHER' typeName='Other' columnSize='0' decimalDigits='0' nullable=''/>
</ColumnSet>
</Column>
</ColumnSet>
</Column>
<Column alias='col2' name='col1'>
<ColumnSet>
<Column alias='test' name='col1' table='b' dataType='java.sql.Types.OTHER' typeName='Other' columnSize='0' decimalDigits='0' nullable=''/>
</ColumnSet>
</Column>
<Column alias='col3' name='CURRENT_TIMESTAMP'/>
</ColumnSet>
String sqlStr = "SELECT * FROM (SELECT * FROM A) AS A \n" +
"JOIN B ON A.a = B.a \n" +
"JOIN C ON A.a = C.a;";
Set<String> tables = TablesNamesFinder.findTablesOrOtherSources(sqlStr);
assertThat(tables).containsExactlyInAnyOrder("A", "B", "C");
tables = TablesNamesFinder.findTables(sqlStr);
assertThat(tables).containsExactlyInAnyOrder("B", "C");
SQL Dialects¶
JSQLTranspiler currently understands the following Big RDBMS dialects:
Google BigQuery
Databricks
Snowflake
Amazon Redshift
and rewrites into to the following small RDBMS dialects:
DuckDB
planned: H2
planned: Postgres
Features¶
Comprehensive support for Query and DML statements (INSERT, DELETE, UPDATE, MERGE)
RDBMS specific Functions, Predicates and Operators
RDBMS specific Date and Number formatting parameters
Extensive
ARRAY
,ROW
andSTRUCT
supportDeeply Nested Expressions such as correlated Sub-Selects, CTE’s and
WITH
clausesExplicit and Implicit Cast expressions, e. g.
DATE '2023-12-31'
,'2023-12-31'::Date
andCast('2023-12-31' AS Date)
SQL Named and Ordinal Parameters:
?
,?1
or:parameter
Lateral Table and Sub-Select Functions, e. g.
UNNEST()
,TABLE()
Window and Aggregate Functions
Columns Resolution for
EXCEPT
andREPLACE
filters as well as forUSING
joins (left or right)