Understanding Identifiers and Variables in Snowflake
Chapter 1: Introduction to Snowflake Identifiers and Variables
As a programmer, switching between languages often leads to errors such as mistyping variable prefixes or misapplying SQL bind variable rules. For instance, in a Snowflake stored procedure written in JavaScript, it’s crucial to remember that the declared arguments must be referenced in uppercase inside the JavaScript code.
To make your life easier, I have compiled various use cases associated with identifiers, variables, function arguments, and naming conventions specific to Snowflake and its ecosystem, which includes SnowSQL, environment variables, JSON property names, and stored procedures in multiple programming languages.
Section 1.1: What Are Identifiers in Snowflake?
Identifiers are the user-friendly names that refer to database objects such as databases, schemas, tables, columns, sequences, and functions. They can appear in SQL statements either quoted (enclosed in double quotes) or unquoted. An identifier can be at most 255 characters long. An unquoted identifier must start with a letter or underscore and may contain only letters, digits, underscores, and dollar signs; a name containing any other character, such as a space, must be enclosed in double quotes.
Notably, fully qualified table names that look alike do not always refer to the same object. For example, MyDb.MySchema.MyTable and MYDB.MYSCHEMA.MYTABLE resolve to the same table, whereas "MyDb"."MySchema"."MyTable" is, by default, a different one.
When the session parameter QUOTED_IDENTIFIERS_IGNORE_CASE is set to FALSE (the default), names enclosed in double quotes are stored and resolved exactly as typed, so they are case-sensitive, while unquoted names are stored and resolved in uppercase. If the parameter is set to TRUE, quoted names are converted to uppercase as well.
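The case rules above can be modeled in a few lines of Python. This is only an illustrative sketch of the behavior described here, not Snowflake's own code; the function name stored_name and its keyword argument are made up for the example.

```python
def stored_name(identifier: str, quoted_identifiers_ignore_case: bool = False) -> str:
    """Return the name under which one identifier segment would be stored.

    Hypothetical helper that mimics the rules described in the text.
    """
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Strip the surrounding quotes; a doubled "" stands for one literal quote.
        inner = identifier[1:-1].replace('""', '"')
        # Quoted names keep their case unless the session parameter is TRUE.
        return inner.upper() if quoted_identifiers_ignore_case else inner
    # Unquoted names are always folded to uppercase.
    return identifier.upper()


print(stored_name('MyTable'))          # unquoted: folded to uppercase
print(stored_name('"MyTable"'))        # quoted: case preserved
print(stored_name('"MyTable"', True))  # quoted, parameter TRUE: uppercase
```

With the default setting, MyTable and MYTABLE end up as the same stored name, while "MyTable" does not.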
It's also crucial to note that "MyDb.MySchema.MyTable" is not a fully qualified table name: everything inside one pair of double quotes is a single identifier that happens to contain dots. To quote a qualified name, enclose each segment in its own double quotes, as in "MyDb"."MySchema"."MyTable".
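To see why the placement of the quotes matters, here is a small Python sketch that splits a name on dots only when they fall outside double quotes. The helper split_qualified_name is hypothetical and only illustrates the parsing rule; it does not validate the segments.

```python
def split_qualified_name(name: str) -> list:
    """Split a dotted name into segments, ignoring dots inside double quotes."""
    parts = []
    current = []
    in_quotes = False
    for ch in name:
        if ch == '"':
            # Toggle quoted state; keep the quote so the segment stays quoted.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == '.' and not in_quotes:
            # A dot outside quotes separates database, schema, and object.
            parts.append(''.join(current))
            current = []
        else:
            current.append(ch)
    parts.append(''.join(current))
    return parts


print(split_qualified_name('"MyDb"."MySchema"."MyTable"'))  # three segments
print(split_qualified_name('"MyDb.MySchema.MyTable"'))      # one segment
```

The second call yields a single segment, which is why that form names one oddly named object rather than a table inside a schema inside a database.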
Section 1.2: Best Practices for Naming
I personally prefer typing all SQL statements and object names in lowercase, as uppercase feels too assertive. When I create tables and columns with unquoted lowercase names, they are stored in uppercase, and I can still refer to them in lowercase later on. The exception is when the object name is used as a string literal: there it must match the stored, uppercase form.
This brings us to the IDENTIFIER function, which lets you supply an identifier dynamically, for example within a FROM clause. FROM mytable is equivalent to FROM IDENTIFIER('MYTABLE'), provided the table name was stored in uppercase. If the name is case-sensitive, put the double quotes inside the string literal, e.g., FROM IDENTIFIER('"mytable"').
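As a rough illustration, the following Python sketch builds the string literal that goes inside IDENTIFIER(...) for both cases. The helper names identifier_argument and from_clause are invented for this example, and the code only assembles text; in real client code you would more likely pass the name through a bind variable or session variable than concatenate strings.

```python
def identifier_argument(name: str, case_sensitive: bool = False) -> str:
    """Build the single-quoted literal passed to IDENTIFIER(...).

    Hypothetical helper. Assumes a case-insensitive name is a plain
    unquoted identifier that was stored in uppercase.
    """
    if case_sensitive:
        # Keep the exact case by wrapping the name in double quotes.
        inner = '"' + name.replace('"', '""') + '"'
    else:
        # Unquoted names are stored in uppercase.
        inner = name.upper()
    # Escape single quotes so the result is a valid SQL string literal.
    return "'" + inner.replace("'", "''") + "'"


def from_clause(name: str, case_sensitive: bool = False) -> str:
    """Return a FROM clause that resolves the table through IDENTIFIER."""
    return "FROM IDENTIFIER(" + identifier_argument(name, case_sensitive) + ")"


print(from_clause('mytable'))        # FROM IDENTIFIER('MYTABLE')
print(from_clause('mytable', True))  # FROM IDENTIFIER('"mytable"')
```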
Chapter 2: Working with JSON and Variables
The first video titled "Review: Identifiers and Variables in Snowflake SQL" provides a comprehensive overview of identifiers and variables in Snowflake, highlighting their significance in programming.
In SQL, columns are normally referenced by name, as in SELECT col1..., or all at once with SELECT *. In the GROUP BY and ORDER BY clauses, you can also refer to a column by its ordinal position in the SELECT list, and you can use $1, $2, etc. to reference columns by their position in the table or subquery named in the FROM clause.
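Ordinal references in GROUP BY and ORDER BY can be tried locally. The sketch below uses Python's built-in sqlite3 module as a stand-in, since SQLite accepts the same ordinal syntax; it is not Snowflake, and the $1, $2 notation is not shown because SQLite treats $1 as a bind parameter. The sales table and the function name are made up for the example.

```python
import sqlite3


def totals_by_region(rows):
    """Sum amounts per region, largest total first, using ordinal references."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    # GROUP BY 1 means the first SELECT column (region);
    # ORDER BY 2 means the second SELECT column (the sum).
    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY 1 ORDER BY 2 DESC"
    ).fetchall()
    conn.close()
    return result


print(totals_by_region([("east", 10), ("west", 5), ("east", 7), ("west", 30)]))
```

Ordinals keep the query short when the grouped expression is long, at the cost of breaking silently if the SELECT list is reordered.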
Section 2.1: JSON Properties
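JSON data can be stored in VARIANT columns and consists of JSON objects (dictionaries of name-value pairs), arrays, and scalar values. Much as JavaScript offers both obj["name"] and obj.name, Snowflake SQL lets you reach a property either with bracket notation, v['name'], or with a colon followed by a dot-separated path, v:name. Property names are case-sensitive.
For instance, in a query expression like v:myobj.prop1.prop2['name2'].array1[2]::string, v is the column name or alias, myobj is the top-level JSON object, prop1 is one of its properties, array1[2] is the third element of an array (indexes start at 0), and ::string casts the result to a string.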
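To make the path expression above concrete, here is a Python sketch that walks the same path through a parsed JSON document. The sample document and the helper get_path are invented for illustration; the helper returns None for a missing step, mirroring the NULL that Snowflake returns when a path element does not exist.

```python
import json

# Hypothetical sample document shaped to match the path in the text.
doc = json.loads(
    '{"myobj": {"prop1": {"prop2": {"name2": {"array1": ["a", "b", "c"]}}}}}'
)


def get_path(value, *steps):
    """Follow object keys (str) and array indexes (int); None if a step is missing."""
    for step in steps:
        if isinstance(step, int):
            if not isinstance(value, list) or not 0 <= step < len(value):
                return None
        elif not isinstance(value, dict) or step not in value:
            return None
        value = value[step]
    return value


# Equivalent of the myobj.prop1.prop2 / name2 / array1[2] path: indexes start at 0.
print(get_path(doc, "myobj", "prop1", "prop2", "name2", "array1", 2))
# Property names are case-sensitive, so this lookup finds nothing.
print(get_path(doc, "MyObj", "prop1"))
```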
Section 2.2: Global and Local Variables
Session variables, defined outside any function or procedure, exist only for the duration of your session and have case-insensitive names. They are defined with commands like SET v1 = 'abc' and read in SQL statements with $v1 or GETVARIABLE('V1'). Note that GETVARIABLE expects the name as it is stored, which is uppercase for a variable set this way.
Similarly, local variables can be declared within a Snowflake Scripting block or stored procedure, either in the DECLARE section or with LET, each with an optional data type and an optional default value.
The second video titled "How to use Variables in Snowflake" elaborates on the importance of using variables effectively in your programming tasks.
Conclusions
In summary, always use double-quoted identifier names for case-sensitive names or those with special characters. Remember that unquoted names will default to uppercase in the database. Use the IDENTIFIER or TABLE functions when dynamically passing query identifiers. Additionally, leverage positional column references, and adhere to the appropriate syntax for accessing JSON properties and defining variables within Snowflake.