Huawei GaussDB (for MySQL) is a cloud-based high-performance, high-available relational database that fully supports the syntax and functionality of the open source database MySQL. This chapter introduces GaussDB (for MySQL) data types, system functions and operators to help readers get started with SQL syntax.

After learning this chapter, readers will be able to do the following four things.

  1. (1)

    Describe the definitions and types of SQL statements and identify the categories to which a given statement belongs, including data definition language (DDL), data manipulation language (DML), data control language (DCL), and data query language (DQL).

  2. (2)

    List the available data types and learn to select the correct data type for creating a table. For example, when to choose a character data and when to choose numeric value data. The right data type helps improve the efficiency of storing and querying data.

  3. (3)

    Describe the usage of different system functions and master how to use them correctly in query statements. For example, what numeric processing function should be used in specific numeric processing; what character processing functions should be used in character processing. The correct system functions can improve the use of the database and query efficiency.

  4. (4)

    List the common operators and master the priority and usage of different operators . For example, when to use the logical operator and when to use the comparison operator . The correct operators can also improve the query efficiency and query accuracy.

3.1 Overview of SQL Statements

3.1.1 What is an SQL Statement

Structured query language (SQL) is a purpose-built programming language used to manage relational DBMS or to perform stream processing in relational data stream management systems. SQL is based on relational algebra and tuple relational algorithms, which includes the DDL and DML. The scope of SQL management covers data insertion, query, update and deletion, as well as database schema creation and modification, and data access control . GaussDB (for MySQL) is a kind of relational database. SQL statements include DDL, DML, DCL and DQL.

DDL is used to define or modify objects in the database, where database objects include tables, indexes, views, databases, stored procedures, triggers, custom functions, etc. The following operations are mainly involved.

  1. (1)

    Define databases: create a database (CREATE DATABASE), modify a database attribute (ALTER DATABASE), and delete a database (DROP DATABASE).

  2. (2)

    Define tablespaces: create a tablespace (CREATE TABLESPACE), modify a tablespace (ALTER TABLESPACE), and delete a tablespace (DROP TABLESPACE).

  3. (3)

    Define tables: create a table (CREATE TABLE), modify a table attribute (ALTER TABLE), delete a table (DROP TABLE), and delete all data in a table (TRUNCATE TABLE).

  4. (4)

    Define indexes: create an index (CREATE INDEX), modify an index attribute (ALTER INDex), and delete an index (DROP INDEX).

  5. (5)

    Define roles: create a role (CREATE ROLE), and delete a role (DROP ROLE).

  6. (6)

    Define users: create a user (CREATE USER), modify a user attribute (ALTER USER), and delete a user (DROP USER).

  7. (7)

    Define views: create a view (CREATE VIEW), and delete a view (DROP VIEW).

  8. (8)

    Define events: create an event (CREATE EVENT), modify an event (ALTER EVENT), and delete an event (DROP EVENT).

DML is used to insert, update and delete data in database tables, mainly involving the following operations.

  1. (1)

    Data operations: insert data (INSERT), update data (UPDATE) and delete data (DELETE).

  2. (2)

    Import/Export operations: import (LOAD) and export (DUMP).

  3. (3)

    Other operations: call (CALL), replace (REPLACE), etc.

DCL is used to set or change database transactions, save point operations and permission operations (user or role authorization , permission revoking, role creation, role deletion, etc.), locking tables (two locking modes of shared lock and exclusive lock are supported), locking instances, and shutdown, etc. The following operations are mainly involved.

  1. (1)

    Transaction management: start a transaction (START TRANSACTION/BEGIN), commit a transaction (COMMIT), and roll back a transaction (ROLLBACK).

  2. (2)

    Save point setting: start a save point (SAVEPOINT), roll back a save point (ROLLBACK TO SAVEPOINT), and publish a save point (PUBLISH SAVEPOINT).

  3. (3)

    Authorization operations: grant a permission (GRANT), revoke a permission (REVOKE), create a role (CREATE ROLE), and delete a role (DROP ROLE).

  4. (4)

    Locking table: lock a table (LOCK TABLE), and unlock a table (UNLOCK TABLE).

  5. (5)

    Locking instance (LOCK INSTANCE FOR BACKUP).

  6. (6)

    Shutdown (SHUTDOWN).

DQL is used to query the data in the database, such as single-table query and multi-table query, which mainly involves the following operations.

  1. (1)

    Query data (SELECT).

  2. (2)

    Merge the result sets of multiple SELECT statements.

3.1.2 Comprehensive Application of SQL Statements

The following is a comprehensive application of the operations involved in the above four languages to store a company’s employee information.

  1. (1)

    First of all, you need to create a table to store employee information, which can be realized by the CREATE TABLE statement, i.e. DDL.

  2. (2)

    To insert specific employee information into the table, it can be achieved by the INSERT statement, i.e. DML.

  3. (3)

    When you need to commit the inserted information to make it persistent, you can achieve it through the COMMIT statement, i.e. DCL.

  4. (4)

    It can be done by the SELECT statement, i.e. DQL.

Different SQL statements are applicable to different service scenarios, and the readers have to choose the appropriate SQL statement according to the specific scenario.

3.2 Data Types

Data type is a basic attribute of data. Data are generally divided into common data and uncommon data. Common data types include numeric value, character, date and time, and so on. Uncommon data types include Boolean data, spatial data, JSON data, etc.

Different data types occupy different storage space, and can perform different operations. The data in the database is stored in the data tables. Each column in the data table is defined with the data type. When storing data, the user must comply with the attributes of these data types, otherwise errors may occur.

3.2.1 Common Data Types

  1. 1.

    Numeric Value

    The numeric value types available in GaussDB (for MySQL) database include integer, floating-point number and fixed-point number, which support basic 32-bit integer and 64-bit integer.

    1. (1)

      There are five types of integers, as shown in Table 3.1.

      INTEGER (32-bit signed integer) occupies 4 bytes, with value range from -231 to 231-1, which can be expressed by the keywords INT, INTEGER, BINARY_INTERGER, INT SIGNED, INTEGER SINGNED, SHORT, SMALLINT and TINYINT. BIGINT (64-bit signed integer) occupies 8 bytes, with value range from -263 to 263-1, which can be expressed by the keywords BIGINT, BINARY_BIGINT and BIGINT SIGNED.

    2. (2)

      Floating-point numbers are divided into two types as follows.

      FLOAT: single-precision floating-point number occupying 4 bytes, with 8-bit precision.

      DOUBLE: double-precision floating-point number occupying 8 bytes, with 16-bit precision.

    3. (3)

      The fixed-point numbers occupy 4–24 bytes, with actual length related to the effective number it represents, and with value range from -1.0E128 to 1.0E128, which can be expressed by keywords DECIMAL and NUMERIC. They are in the following syntax format, requiring s ≤ p.

      NUMERIC/DECIMAL、NUMERIC/DECIMAL(p) and NUMERIC /DECIMAL(p,s)

      The bytes occupied by DECIMAL/NUMERIC depend on their precision, among which, “p” takes values ranging from 1 to 65, and “s” from 0 to 30.

      If the values of “p” and “s” are not specified, “p” defaults to 10, meaning that there is no restriction on the value after the decimal point. If the value of “s” is not specified or s = 0, the fixed-point number has no decimal part.

  2. 2.

    Character

    The character types supported by GaussDB (for MySQL) are CHAR, VARCHAR, BINARY, VARBINARY, TEXT, BLOB, ENUM, and SET. Under the default encoding set “utf8mb4”, Chinese characters occupy 3 bytes, numeric and English characters occupy 1 byte, and other characters occupy up to 4 bytes. The characters are divided into fixed-length strings and variable-length strings.

    CHAR(n) is used to store fixed-length bytes or strings, with the n indicating the length of the string, and taking values from 0 to 255. If the length of the input string is less than n, the right end will be made up with spaces. For example, CHAR(4) will occupy 4 bytes no matter how many characters are input.

    VARCHAR(n) is used to store variable-length bytes or strings, with the n indicating the length of the string, and taking values from 0 to 65535. If the length of the input string is less than n, there is no need to make up with spaces. The number of bytes occupied by VARCHAR is the actual number of characters input + 1 byte (n ≤ 255) or 2 bytes (n > 255), so VARCHAR(4) occupies 4 bytes when 3 English characters are input.

    In the string comparison between CHAR and VARCHAR, the case sensitivity and the spaces at the end are ignored.

    BINARY(n) stores binary fixed-length strings, and automatically adds 0x00 bytes to the end of the strings when the length is less than n bytes.

    VARBINARY(n) stores binary variable-length strings, but there is no need to add 0x00 bytes to the end of the strings when the length is less than n bytes.

    TEXT stores variable-length strings of large objects, which can save character data, such as articles and diaries. Its keywords are mainly TINYTEXT (1 byte), TEXT (2 bytes), MEDIUMTEXT (3 bytes), and LONGTEXT (4 bytes).

    BLOB stores binary variable-length strings of large objects, which can save binary data, such as photos. Its keywords are mainly TINYBLOB (1 byte), BLOB (2 bytes), MEDIUMBLOB (3 bytes) and LONGBLOB (4 bytes).

    ENUM refers to single-select enumeration, which can contain up to 65535 different elements.

    SET refers to multi-select enumeration, which can contain up to 64 different elements.

  3. 3.

    Date

    The types of date data are shown in Table 3.2.

    Gaussian database supports two date types: timestamp without time zone (8 bytes) and timestamp with time zone. When storing timestamp data without time zone, you can use DATETIME, DATE and TIMESTAMP types, which can all indicate year, month, day, hour, minute and second information; however, unlike DATE and DATETIME which support up to seconds, TIMESTAMP can support up to microseconds.

    YEAR can also be expressed as a two-digit string “YY”, ranging from 00 to 99, among which, values of 00–69 and 70–99 are converted to YEAR values of 2000–2069 and 1970–1999.

    The value range of DATETIME/DATE is [0001-01-01 00:00:00, 9999-12-31 23:59:59], Expressed as “2019-08-22 17:29:13”.

    TIMESTAMP[(n)] can specify the precision to be saved through the parameter n, ranging from 0 to 6; or takes no parameter, in which case the default precision of decimals after the second is 6. For example, 2019-08-22 17:29:13.263183 (n = 6), 2019-08-22 17:34:36.383 (n = 3). The value range of TIMESTAMP is [0001-01-01 00:00:00.000000, 9999-12-31 23:59:59.999999].

    When storing timestamp data with time zone, TIMESTAMP(n) WITH TIME ZONE and TIMESTAMP(n) WITH LOCAL TIME ZONE can be used. The difference between the two is that TIMESTAMP(n) WITH TIME ZONE holds the time and time zone information and therefore occupies 12 bytes, e.g., 2019-08-22 18:41:30.135428 +08:00. TIMESTAMP(n) WITH LOCAL TIME ZONE uses local data information, which only saves time information, not time zone information. It will be converted to the timestamp of the current time zone of the database when stored, and will be displayed with the information of the local time zone when viewed, so it occupies 8 bytes. For example, when stored, it is displayed as 2019-08-22 18: 41:30.135428; when viewed, it is displayed as 2019-08-22 18:41:30. 135428 +08:00.

Table 3.1 Integer types
Table 3.2 Date types

3.2.2 Uncommon Data Types

Boolean data can be stored by the keywords BOOL and BOOLEAN, occupying 1 byte. For string input, the normal strings TRUE and FALSE are supported, as well as the single characters T and F, and the string values 1 and 0. Boolean data can be converted to and from INT and BIGINT data because Boolean data can be seen as the numbers 0 and 1, so it can be converted to the integers 0 and 1. Integer data can also be converted to Boolean data. The conversion rule is that integer 0 corresponds to the Boolean value FALSE, and other non-zero integers correspond to the Boolean value TRUE. For the output of Boolean data, when Boolean data is displayed, or when converting Boolean data to character data, Gaussian database uniformly outputs 1 as string TRUE and 0 as string FALSE. When the input value is null, the output of the Boolean data is also null.

Spatial data types include GEOMETRY, POINT, LINESTRING, POLYGON, etc.

JSON data (JSON: Javascript Object Notation) support native JSON data, allowing for more efficient storage and management of JSON documents.

3.2.3 Cases of Data Types

To store department information of a company, first create a table with fields for department information. Suppose the department information to be stored includes department number, department level, department name, establishment time, and whether it is an excellent department, etc., we need to determine the data type of the specific information first: if the department number is numeric data, it can be expressed as NUMBER; if the department level is integer data, it can be expressed as INT; if the department name is character data, it can be expressed as VARCHAR; the establishment time can be expressed by date data; whether it is an excellent department can be expressed by Boolean data. This is the CREATE TABLE statements to create a department information table. The code is shown below.

SQL>DROP TABLE IFEXITS T_TEST_CASE; CREATE TABLE T_TEST_CASE( section_id NUMBER(10) PRIMARY KEY, section_grade INT, section_name VARCHAR(100), section_is_excellent BOOLEAN, section_date DATE );

After the table is created successfully, if you want to store the department description information in the table, you can add more columns to the table. Suppose the column name is “section_description”, if the department description information is expressed by a string, the content may be bulky; and if it is defined as BLOB data, it can be achieved by the following statement.

SQL> ALTER TABLE T_TEST_CASE ADD section_description BLOB;

Where, “seciton_description” is the department description field and BLOB is the data type of the field information. The department level is now an integer, and if you want to modify it to a decimal number, recorded as a floating point number, you can do so by modifying the data type of the corresponding column. The “section_grade” field can be modified to DOUBLE type by the following code.

SQL> ALTER TABLE T_TEST_CASE MODIFY section_grade DOUBLE;

3.3 System Functions

A system function is encapsulation for some service logic to accomplish a specific function. System functions can be executed with or without parameters depending on their specific functions, and they return the result after execution.

GaussDB (for MySQL) provides 10 types of system functions. This section introduces five more common system functions: numeric calculation function , character processing function, time and date function, type conversion function, and system information function.

GaussDB (for MySQL) system functions cannot be modified manually.

3.3.1 Numeric Calculation Functions

The numeric calculation functions are responsible to calculate numeric values, such as absolute value calculation function ABS(x), sine function SIN(x), cosine function COS(x), inverse sine function ASIN(i), and inverse cosine function ACOS(x).

The ABS(x) function is used to calculate the absolute value. The input parameter can be a numeric value or a non-numeric value that can be implicitly converted to a numeric value. The type of the return value is the same as that of the input parameter. x must be an expression that can be converted to a numeric value type. ABS(x) eventually returns the absolute value of x (including INT, BIGINT, REAL, NUMBER, and DECIMAL types).

The SIN(x) and COS(x) functions are used to calculate the sine and cosine values, whose input parameter is an expression that can be converted to a numeric value, and the return value is of type NUMBER.

The ASIN(x) and ACOS(x) functions are used to calculate the arc sine and arc cosine values, whose input parameter is an expression that can be converted to a numeric value, with the range of [-1, 1], and the return value is of type NUMBER.

The code is shown below.

mysql> SELECT ABS(-10),COS(0),SIN(0),ACOS(1),ASIN(0) FROM dual; +----------+--------+--------+---------+---------+ | ABS(-10) | COS(0) | SIN(0) | ACOS(1) | ASIN(0) | +----------+--------+--------+---------+---------+ | -- 10 | -- 1 | -- 0 | -- 0 | - 0 | +----------+--------+--------+---------+---------+ 1 row in set (0.00 sec)

ROUND(X,D) can truncate the numeric value X before and after the decimal point according to the value specified by D, and round it to return the truncated value. The value of D is in the range [−30, 30]. If D is ignored, all fractional parts are intercepted and rounded. If D is negative, it means that the left digit from the decimal point is filled with zeros and rounded, and the decimal part is removed. The code is shown below.

mysql> SELECT ROUND(1234.5678,-2),ROUND(1234.5678,2) FROM dual; +---------------------+--------------------+ | ROUND(1234.5678,-2) | ROUND(1234.5678,2) | +---------------------+--------------------+ | 1200 | 1234.57 | +---------------------+--------------------+ 1 row in set (0.00 sec)

POW(X,Y ) is equivalent to POWER(X,Y ), which means calculate the Yth power of X. The code is shown below.

mysql> SELECT POW(3,2),POWER(3,-2) FROM dual; +----------+---------------------+ | POW(3,2) | POWER(3,-2) | +----------+---------------------+ | 9 | 0.1111111111111111 | +----------+---------------------+ 1 row in set (0.00 sec)

The CEIL(X ) function is used to calculate the smallest integer greater than or equal to the specified expression n, whose input parameter is an expression that can be converted to a numeric value, and the return value is an integer. For example, CEIL(15.3) is calculated as 16. Numeric calculation functions are shown in Table 3.3.

Table 3.3 Numeric calculation functions

The SIGN(X ) function is used to take the sign of the numeric value type, which returns 1 if greater than 0, returns -1 if less than 0, and returns 0 if equal to 0. The returned value is of the numeric value type. For example, for SIGN(2*3), 2 × 3 = 6, if greater than 0, the calculation result is 1.

The SQRT(X ) function is used to calculate the square root of a non-negative real number, whose input parameter is an expression that can be converted to a non-negative values, and the return value is of type DECIMAL. For example, SQRT(49) is calculated as 7.

The TRUNCATE (X,D) function is used to intercept the input numeric data in the specified format, without rounding, where X indicates the data to be intercepted, and D for the interception accuracy, and the return value is of type NUMBER. For example, TRUNCATE(15.79,1) is 15.7 after intercepting a decimal to the right; TRUNCATE(15.79,-1) is 10 after intercepting an integer to the left.

The FLOOR(X ) function is used to find the nearest integer less than or equal to the value of the expression, whose input parameter is an expression that can be converted to a numeric value, and the return value is of type NUMBER. For example, FLOOR(12.8) is calculated as 12.

The PI() function is used to return the value of π, with valid number default to 7 digits. For example, PI() returns 3.141593.

The MOD(X,Y) function is used for modulo operations, whose input parameter is an expression that can be converted to a NUMBER data, and the return value is of type NUMBER. For example, MOD (29,3) is calculated as 2.

Other numeric calculation functions include the exponentiation function POWER(), etc.

3.3.2 Character Processing Functions

The character splicing functions CONCAT(str[,...]) and CONCAT_WS(separator,str1,str2,...) are used to splice one or more strings. The CONCAT() function splices the strings generated by each parameter without separating them; the input parameters are strings or expressions that can be converted to strings, separated by commas. The CONCAT_WS() function splices the strings and separate them with commas; the first input parameter is the separator, and the subsequent ones are strings or expressions that can be converted to strings. If the parameter is NULL, CONCAT or CONCAT_WS, the parameter will be ignored. If NULL is enclosed in single quotes, NULL will be treated as a string. The CONCAT() and CONCAT_WS() functions can be nested and support return values of up to 8000 bytes.

mysql> SELECT CONCAT('11','NULL','22'),CONCAT_WS('-','11',NULL,'22') FROM dual; +--------------------------+-------------------------------+ | CONCAT('11','NULL','22') | CONCAT_WS('-','11',NULL,'22') | +--------------------------+-------------------------------+ | 11NULL22 | 11-22 | +--------------------------+-------------------------------+ 1 row in set (0.00 sec)

In the above example, the CONCAT() function splices the strings ‘11’, ‘NULL’ and ‘22’ to return the string 11NULL22, and the CONCAT_WS() function splices ‘11’, NULL and ‘22’ by the separator ‘-’, where NULL means null, to return 11–22.

The HEX (str) function returns a string of hexadecimal value, whose input parameter is of numeric value type or character type, and the return value is of string type. The HEX2BIN (str) and HEXTORAW (str) functions return strings represented as hexadecimal strings. The difference between the two is that the HEX2BIN() function returns the BINARY type, where the input hexadecimal string must be prefixed with 0x, while the HEXTORAW() function returns the RAW type.

mysql> SELECT HEX('ABC'); +------------+ | HEX('ABC') | +------------+ | 414243 | +------------+ 1 row in set (0.00 sec)

In the above example, the HEX(‘ABC’) function returns the hexadecimal string 414243 for ABC. The HEX2BIN(‘0X28’) function returns the string “(” represented by the hexadecimal string 28. The HEXTORAW(‘ABC’) function returns the hexadecimal string ABC of type RAW.

The string insertion function INSERT(str,pos,len,newstr) replaces the string with the length of len with newstr from pos position, and then returns the replaced string. If pos is not within the length of the string str, the original string is returned. If the value of the parameter len is greater than the length of the rest of the strings starting from the parameter pos, then all strings starting from pos are replaced with newstr. Both the input parameters str and newstr are expressions that can be converted to strings, with the maximum value of up to 8000 bytes.

mysql> SELECT INSERT('quadratic',5,2,'what'),REPLACE('123456','45', 'abds') FROM dual; +------------------ - - --------+-------------------------------+ | INSERT('quadratic',5,2,'what') | REPLACE('123456','45','abds') | +-----------------------------+---------------------------- - ---+ | quadwhattic | 123abds6 | +--------------------------------+----------------------------+ 1 row in set (0.00 sec)

INSERT(‘quadratic’,5,2,‘what’) means that replace the two consecutive characters’ of the quadratic string from the fifth character with what, which is equal to REPLACE(‘quadratic’,‘ra’,‘what’).

The REPLACE(str,src,dst) function is to replace the corresponding src substring in the string str with the dst substring. The input parameter str indicates the original string, src indicates the string to be replaced, and dst indicates to replace the string. The return value is of string type.

REPLACE(‘123456’,‘45’,‘abds’) means to replace “45” in the string “123456” with “abds”, equal to INSERT(‘123456’,4,2,‘abds’).

The INSTR(str1,str2) function is a string lookup function that returns the first occurrence of the string to be found in the source string, where str1 indicates the source string, and str2 indicates the string to be found.

mysql> SELECT INSTR(' gaussdb 数据库','库'); +---------------------------------+ | INSTR(' gaussdb 数据库','库') | +---------------------------------+ | 10 | +---------------------------------+ 1 row in set (0.00 sec)

In the example above, the INSTR(‘gaussdb 数据库’,‘库’) function returns the first occurrence of the string “库” in the source string to be found, returning 10, indicating the first occurrence of “库”. Character processing functions are shown in Table 3.4.

Table 3.4 Character processing functions

The LEFT(str, length) function returns the left few characters of the specified string. For example, the result after executing LEFT(‘abcdef’,3) is abc. If length is less than or equal to 0, then a null string is returned. The function of RIGHT(str,length) is opposite to that of LEFT(), which returns the right few characters of the specified string. For example, the result after executing RIGHT(‘abcdef’,3) is def. If length is less than or equal to 0, then a null string is returned.

The LEFT () and RIGHT () functions are described as follows. str is the source string from which the substring is to be extracted. length is a positive integer, specifying the number of characters returned from the left or right. If length is 0 or a negative number, then a null string is returned. If length is greater than the length of the str string, the function returns the entire str string. The client currently supports a maximum string of 32767 bytes, so the function returns a maximum value of 32767 bytes.

The LENGTH(str) function is used to get the length of the string function, for example, the result of executing LENGTH(‘1234大’) is 7. The LENGTH () function returns the number of characters in str, whose input parameter is an expression that can be converted to a string, and the return value is of type INT.

The LOWER(str) function is used to convert a string to the corresponding lowercase form. For example, the result of executing LOWER(‘ABCD’) is abcd, without converting the numeric value type. Corresponding to the LOWER() function, the UPPER(str) function is used to convert a string to the corresponding uppercase form. For example, the result of executing UPPER(‘abcd’) is ABCD, without converting the numeric value type. The LOWER() and UPPER() functions have input parameters that can be converted to string expressions and return values that are of string type.

The SPACE (n) function is to generate n spaces, and the value range of n is [0,4000]. For example, the result of CONCAT(‘123’, SPACE(4),‘abc’) is 123 abc.

The REVERSE(str) function returns the reverse order of the string, only supports the string type. For example, the result of REVERSE(‘abcd’) is dcba.

SUBSTR(str,start,len) is a string interception function. For example, SUBSTR(‘abcdefg’,3,4) indicates that intercept a string of length 4 from the third character, delivering the result cdef. The SUBstr () function intercepts and returns a substring with len characters from start in str, where the input parameter str must be an expression that can be converted into a string, and the input parameters start and len must be expressions that can be converted into INT type. The return value is of string type.

3.3.3 Time and Date Functions

The DATE_FORMAT(date,format) function is a formatted date function, used to convert to the required format according to the parameter format. The value of format includes: % w (Monday - Sunday); %w (1–7); %Y (YYYY: 4-digit year); %m (1–12); %d (00–31).

mysql> SELECT DATE_FORMAT(SYSDATE(),'%W'),DATE_FORMAT(SYSDATE(),'%w'),DATE_ FORMAT(SYSDATE(),'%Y-%m-%d'); +---------------------------+---------------------------+---------------------------------+ |DATE_FORMAT(SYSDATE(),'%W')|DATE_FORMAT(SYSDATE(),'%w')|DATE_FORMAT(SYSDATE(),'%Y-%m-%d')| +---------------------------+---------------------------+----------------------------------+ | Tuesday | 2 | 2020-05-19 | +---------------------------+---------------------------+---------------------------------+ 1 row in set (0.00 sec)

The EXTRACT(field from datetime) function extracts the specified time field “field” from the specified datetime, where the values of the field include year, month, day, hour, minute, and second, and the return value is of the numeric value type. If the field value is SECOND, the return value is of the floating-point number type, where the integer part indicates second, and the decimal part indicates microsecond. This function takes any numeric value or any non-numeric value that can be implicitly converted to a numeric value as an parameter and returns the same data type as the parameter.

mysql> SELECT EXTRACT(month from date '2019-08-23') FROM dual; +----------------------------------------------+ | EXTRACT(month from date '2019-08-23')| +----------------------------------------------+ | 8 | +----------------------------------------------+ 1 row in set (0.00 sec)

The above code extracts the month from “2019-08-23”, and returns the result 8; and intercepts from the system date according to “YY”, and the result is 2019-01-01 00:00:00. Time and date functions are shown in Table 3.5.

Table 3.5 Time and date functions

3.3.4 Type Conversion Functions

IF(cond,p1,p2) function: Cond is taken as the calculation condition, if the condition is true, p1 is returned, otherwise, p2 is returned.

IFNULL(p1,p2) function: P1 is returned if p1 is not NULL, otherwise p2 is returned.

NULLIF(p1,p2) function: If p1 is equal to p2, NULL is returned; otherwise, P1 is returned. It is not supported that both parameters are CLOB type or BLOB type, and the input parameter p1 cannot be NULL, otherwise the verification will report an error.

The specific example is as follows.

mysql> SELECT IF(10>13,10,14),IFNULL(10,12),nullif(10,12); +-----------------+---------------+---------------+ | IF(10>13,10,14) | IFNULL(10,12) | NULLIF(10,12) | +-----------------+---------------+---------------+ | 14 | 10 | 10 | +-----------------+---------------+---------------+ 1 row in set (0.00 sec)

Type conversion functions are shown in Table 3.6.

Table 3.6 Type conversion functions

The ASCII(str) function returns the ASCII value corresponding to the first character of the string str, whose input parameter is a string or a single character, which needs to be enclosed by single quotation marks (''), and the return value is the ASCII value.

CHAR(n) returns characters with ASCII value of n, where the value range of n is [0,127], and the input parameter is an expression that can be converted into the numeric value type.

The CAST(value as type) function converts the column name or value to the specified data type, and the expression can be converted to the same type as itself. When using CAST () function for data type conversion, the following conditions can be met, otherwise an error will be reported.

  1. (1)

    The two expressions can be converted implicitly.

  2. (2)

    The data types must be explicitly converted.

The code is shown below.

mysql> SELECT CAST('125e342.83' AS signed); +------------------------------+ | CAST('125e342.83' AS signed) | +------------------------------+ | 125 | +------------------------------+ 1 row in set,1 warning (0.00 sec)

The function CONVERT(value,type) converts value type into type type, and the value range is all data types except LONGBLOB, BLOB, and IMAGE.

The code is shown below.

mysql> SELECT CONVERT((1/3)*100,UNSIGNED) AS percent FROM dual; +---------+ | percent | +---------+ | 33 | +---------+ 1 row in set (0.00 sec)

3.3.5 System Information Functions

System information functions are used to query the system information of GaussDB (for MySQL). The VERSION() function is used to return the database version number; the CONNECTION_ID() function returns the server connection ID number; the DATABASE() function returns the name of the current database; the SCHEMA() function returns the name of the current Schema; the USER(), SYSTEM_USER(), SESSION _USER(), and CURRENT_USER() functions return the name of the current user; the LAST_INSERT_ID() function returns the value of auto_increment; the CHARSET(str) function returns the character set of the string str;and the COLLATION(str) function returns the character alignment of the string str.

3.4 Operators

An operator can process one or more operands, which may be before, after, or between two operands. It is an important element that makes up an expression, specifying the operation to be performed on the operands. Operators are classified into unary and binary operators depending on the number of operands required. The priority of operators determines the order in which different operators are computed in an expression. Operators of the same priority are computed in left-to-right order.

Common operators can be divided into logical operators , comparison operators , arithmetic operators , test operators , wildcards and other operators according to usage scenarios.

3.4.1 Logical Operators

The logical operators are shown in Table 3.7.

Table 3.7 Logical operators

The operand must be a Boolean value, which can be expressed as three types of value - TRUE, FALSE and NULL, where NULL means unknown.

Logical AND (AND) is used to achieve the logical AND operation between conditions. When all operands are TRUE and not NULL, T is returned; when at least one operand is FALSE, F is returned, otherwise NULL is returned. Logical AND is generally used in query conditions WHERE/ON/HAVING statements.

For Logical OR (OR), when both operands are not NULL, and at least one is TRUE, T is returned, otherwise F is returned; when one operand is NULL, if the other operand is TRUE, then T is returned, otherwise NULL is returned; if both operands are NULL, then NULL is returned. Logical OR is generally used in the query condition WHERE/ON/ HAVING statements.

For logical NOT (NOT), if the operand is TRUE, T is returned, F is returned; if the operand is FALSE, T is returned; if the operand is NULL, NULL is returned. It supports to add the NOT keyword before the conditional expression after the WHERE/HAVING clause to take the inverse of the conditional result, often used together with relational operators , such as NOT IN and NOT EXISTS.

There is a staffs table, which contains information such as employee name, job number, hiredate, salary, etc. To query the information of employees who joined after the year of 2000 and enjoy salary > 5000 from the staffs table, the following statement can be used. Since both conditions need to be satisfied, the two conditions after WHERE should be joined by AND.

SELECT * FROM staffs WHERE hire_date>'2000-01-01 00:00:00' AND salary>5000

If you want to query employees who joined after 2000 or whose salary is >5000 from the staffs table, i.e., if one of the two conditions is required to satisfy, the two conditions after WHERE should be joined by OR.

SELECT * FROM staffs WHERE hire_date>'2000-01-01 00:00:00' OR salary>5000

If you want to query from staffs table for employees who did not join after 2000 and whose salary is >5000, you can add NOT in front of the condition of joining after 2000; at this time, the relationship between hiredate and salary is AND, so the two conditions after WHERE are joined by AND.

SELECT * FROM staffs NOT WHERE hire_date>'2000-01-01 00:00:00' AND salary>5000

3.4.2 Comparison Operators

The comparison operators are shown in Table 3.8.

Table 3.8 Comparison operators

All data types can be compared using the comparison operator and return a value of Boolean type. The comparison operators are all binary operators , and the two piece of data being compared must be of the same data type or of a type that can be implicitly converted. GaussDB database provides six comparison operators , including <, >, <=, >=, =, <> or != (not equal to), which should be selected according to the service scenario.

The comparison operator > is used to query the employees whose salary is greater than 5000 from staffs table.

SELECT *FROM staffs WHERE salary>5000

The comparison operator <> is used to query the employees whose salary is not equal to 5000 from staffs table.

SELECT *FROM staffs WHERE salary<>5000

3.4.3 Arithmetic Operators

The arithmetic operators are shown in Table 3.9.

Table 3.9 Arithmetic operators

The arithmetic operators shown in Table 3.9 are used to perform calculations on numeric operands. GaussDB database provides the following 11 types of arithmetic operators : +, -, *, /, % (modulo operation), || (string concatenation), | (bitwise inclusive OR), & (bitwise AND), ^ (bitwise exclusive OR), << (left shift), and >> (right shift).

Example of arithmetic operator syntax:

SELECT operation AS result FROM sys_dummy;SELECT 2+3 FROM dual。

The operations are in the form of +, −, *, /, etc., and the order of priority is four arithmetic operations > left and right shift > bitwise AND > bitwise exclusive OR > bitwise inclusive OR.

When one of the above bitwise operations is executed, if the input parameter has decimal places, the input parameter will be rounded before the bitwise operation is done. A code example is as follows.

mysql> SELECT 2+3,2*3,3<<1 FROM dual; +-----+-----+------+ | 2+3| 2*3 | 3<<1 | +-----+-----+------+ | 5 | 6 | 6 | +-----+-----+------+ 1 row in set (0.00 sec)

3.4.4 Test Operators

The test operators are shown in Table 3.10.

Table 3.10 Test operators

GaussDB database provides 13 test operators , as shown in Table 3.10. IN and NOT IN operators are used to specify the judgment range of a subquery, where IN means that the element is in the specified set, and NOT IN means that the element is not in the specified set.

The sample code is as follows.

SELECT * FROM T_TEST_OPERATOR WHERE ID IN(1,2);

EXISTS means that an eligible element exists, and NOT EXISTS means that no eligible element exists. The sample code is as follows.

SELECT COUNT(1) FROM dual WHERE EXISTS(SELECT ID FROM T_TEST_OPERATOR WHERE NAME='zhangsan'); SELECT COUNT(1) FROM dual WHERE NOT EXISTS(SELECT ID FROM T_TEST_OPERATOR WHERE NAME='zhangsan');

BETWEEN...AND... means between the two, i.e. a closed interval, e.g. a BETWEEN x AND y is equivalent to y>=a and a>=x; while NOT BETWEEN…and… means not between the two, i.e. an open interval, e.g. a NOT BETWEEN x AND y is equivalent to. The sample code is as follows.

SELECT * FROM T_TEST_OPERATOR WHERE ID BETWEEN 1 AND 2;

IS NULL means the field is equal to NULL; while IS NOT NULL means the field is not equal to NULL. The sample code is as follows.

SELECT * FROM T_TEST_OPERATOR WHERE NAME IS NULL;

ANY means it is enough that one of the values in the subquery satisfies the condition, which matches with each content in one of the following three forms.

  1. (1)

    =ANY: The function is exactly the same as that of the IN operator .

    SELECT * FROM emp WHERE sal IN ( SELECT sal FROM emp WHERE job = ‘MANAGER’);

  2. (2)

    >ANY: Larger than the largest data in the record returned by the subquery.

    SELECT *FROM emp WHERE sal>ANY(SELECT sal FROM emp WHERE job=‘MANAGER’)

  3. (3)

    <ANY: Small than the smallest data in the record returned by the subquery.

    SELECT * FROM emp WHERE sal<ANY(SELECT sal FROM emp WHERE job=‘MANAGER’)

LIKE means matching with the expression; NOT LIKE means no match with the expression. Only character type is supported. The sample code is as follows.

SELECT * FROM T_TEST_OPERATOR WHERE NAME LIKE '%an%';

REGEXP and REFEXP_LIKE indicate that the string matches the regular expression and the expression return value is of Boolean type. The syntax of REGEXP_LIKE: REGEXP_LIKE(str,pattern[,match_param]). The input parameter “str” is the string subject to regular processing, supporting the string type and NUMBER type; the input parameter “pattern” is the regular expression to be matched; the input parameter “match_param” indicates the pattern (‘i’ means case-insensitive search; ‘c’ means case-sensitive search; ‘c’ is set by default). The sample code is as follows.

DROP TABLE IF EXISTS T_TEST_OPERATOR ; CREATE TABLE T_TEST_OPERATOR (ID INT,NAME VARCHAR(36)); SELECT * FROM T_TEST_OPERATOR WHERE NAME REGEXP'[a-z]*'; SELECT * FROM T_TEST_OPERATOR WHERE REGEXP_LIKE(NAME ,'[a-z]*');

To find information in a table that indicates a row with an ID field of 1 or 2, you can use the ID IN (1, 2) condition after WHERE to perform a conditional query.

SELECT * FROM T_TEST_OPERATOR WHERE ID IN (1,2);

Require the system to return 1 when there is a string equal to “zhangsan” in the NAME field in the table, and the EXISTS operator can be used for conditional query.

SELECT COUNT(1) FROM SYS_DUMMY WHERE EXISTS (SELECT ID FROM T_TEST_OPERATOR WHERE NAME='zhangsan');

To find out the information in the table with ID fields between 1 and 2, you can use BETWEEN 1 AND 2 for conditional query.

SELECT * FROM T_TEST_OPERATOR WHERE ID BETWEEN 1 AND 2;

To query the information of the rows in the table whose NAME field is NULL, you can use the IS NULL operator for conditional query.

SELECT * FROM T_TEST_OPERATOR WHERE NAME IS NULL;

To query the information of the rows in the table whose ID field is 1, 3 and 5, you can use the ANY operator for conditional query.

SELECT * FROM T_TEST_OPERATOR WHERE ID= ANY(1,3,5);

To find the information of rows with “an” string in the NAME field, use the LIKE operator with the wildcard %.

SELECT * FROM T_TEST_OPERATOR WHERE NAME LIKE '%an%';

3.4.5 Other Operators

Wildcard and other operators are shown in Tables 3.11 and 3.12. % indicates any number of characters, including no character. _ indicates an exact unknown character. These two characters are often used in LIKE and NOT LIKE statements to achieve string matching.

Table 3.11 Wildcards
Table 3.12 Other operators

Single quotes (') are used to indicate the string type. If a single quotation mark is included in the string text, then two single quotation marks must be used The sample code is as follows.

INSERT INTO tt1 values('''');

Double quotes (") and back quotes (`) are used to indicate the name of an object such as a table, field, index, etc. or an alias. They are case-sensitive and support keywords as names or aliases. If the object name is included in double quotes or back quotes, GaussDB database takes case-insensitive treatment and treats both upper and lower cases as upper case.

3.5 Summary

This chapter is about the data types, system functions, operators and SQL statements involved in Huawei GaussDB (for MySQL) to help readers get a preliminary understanding of GaussDB (for MySQL) and lay a good foundation for the next step of learning.

3.6 Exercises

  1. 1.

    [True or False] The BIGINT type occupies 4 bytes. ( )

    1. A.

      True

    2. B.

      False

  2. 2.

    [True or False] the BLOB type is used to store the binary data for large objects with variable length. ( )

    1. A.

      True

    2. B.

      False

  3. 3.

    [Single Choice] Run

    CREATE TABLE aaa (name CHAR(5)); INSERT INTO aaa values(‘TEST’); SELECT name=‘test’ FROM aaa;

    and you will get the result ( ).

    1. A.

      1

    2. B.

      0

  4. 4.

    [Multiple Choice] Which of the following are numeric calculation functions ? ( )

    1. A.

      LENGTH(str)

    2. B.

      SIN(D)

    3. C.

      TRUNC(X,D)

    4. D.

      HEX(p1)

  5. 5.

    [Multiple Choice] GaussDB (for MySQL) takes the UNIX timestamp by the function ( ).

    1. A.

      UNIX_TIMESTAMP()

    2. B.

      UNIX_TIMESTAMP(datetime)

    3. C.

      UNIX_TIMESTAMP(datetime_string)

    4. D.

      FROM_UNIXTIME(unix_timestamp)

  6. 6.

    [Single Choice] When the function if(cond,exp1,exp2) is false in the cond condition, ( ) is returned.

    1. A.

      exp1

    2. B.

      exp2

  7. 7.

    [Multiple Choice] Which of the following are logical operators ? ( )

    1. A.

      AND

    2. B.

      OR

    3. C.

      NOT

    4. D.

      NOT OR

  8. 8.

    [True or False] Wildcards are used in LIKE and NOT LIKE statements. ( )

    1. A.

      True

    2. B.

      False

  9. 9.

    [True or False] The arithmetic operator with the lowest priority is ^. ( )

    1. A.

      True

    2. B.

      False