1995

Handling Persistence

  • Flat files
  • fseek() and ftell()
  • mmap()
Introduction To Algorithms * Red Black Tree * B-Tree

BASIC SQL

Help * help contents * help data definition * help data manipulation * help functions

Crash Course in SQL

  • display results (like a print statement)
  • create a table
  • insert into a table
  • delete from a table
  • select rows from a table
  • drop a table

Displaying results

```sql mysql> select 'hello world'; ```
```sql +-------------+ | hello world | +-------------+ | hello world | +-------------+ 1 row in set (0.00 sec) ```
```sql mysql> select 'hello world' result; +-------------+ | result | +-------------+ | hello world | +-------------+ 1 row in set (0.00 sec) ```
```sql mysql> select 'hello world' as `result`; +-------------+ | result | +-------------+ | hello world | +-------------+ 1 row in set (0.00 sec) ```
Using `SELECT` ```sql mysql> select 1 + 2 as result, database(), version(), user(); +--------+------------+-----------+----------------+ | result | database() | version() | user() | +--------+------------+-----------+----------------+ | 3 | ckim_db | 5.6.22 | ckim@localhost | +--------+------------+-----------+----------------+ 1 row in set (0.00 sec) ```
Creating person table
```sql mysql> create table person( -> name varchar(100), -> title varchar(100) -> ); Query OK, 0 rows affected (0.02 sec) ```
```sql mysql> insert into person values ('chong', 'programmer'); Query OK, 1 row affected (0.03 sec) ```
```sql mysql> insert into person(name, title) values ('jim', 'cto'); Query OK, 1 row affected (0.00 sec) ```
Viewing person table
```sql mysql> select * from person; ```
```sql +-------+------------+ | name | title | +-------+------------+ | chong | programmer | | jim | cto | +-------+------------+ 2 rows in set (0.00 sec) ```
```sql mysql> select * from person where name='chong'; +-------+------------+ | name | title | +-------+------------+ | chong | programmer | +-------+------------+ 1 row in set (0.00 sec) ```
INSERT using SELECT
```sql mysql> insert into person(name,title) -> select name, 'CEO' from person limit 1; Query OK, 1 row affected (0.00 sec) Records: 1 Duplicates: 0 Warnings: 0 ```
```sql mysql> select * from person; +-------+------------+ | name | title | +-------+------------+ | chong | programmer | | jim | cto | | chong | CEO | +-------+------------+ ```
DELETE FROM a table
```sql mysql> delete from person where title='CEO'; Query OK, 1 row affected (0.00 sec) mysql> select * from person; +-------+------------+ | name | title | +-------+------------+ | chong | programmer | | jim | cto | +-------+------------+ 2 rows in set (0.00 sec) ```
Dropping a table
```sql mysql> create table address (id bigint); -- oops Query OK, 0 rows affected (0.01 sec) ```
```sql mysql> drop table address; Query OK, 0 rows affected (0.01 sec) ```
```sql mysql> create table address( -> id bigint primary key auto_increment, -> street varchar(50), -> city varchar(40), -> state varchar(2) -> person_id bigint); Query OK, 0 rows affected (0.02 sec) ```
What we have so far ```sql SELECT .. [ FROM tbl [ WHERE ... ] ] CREATE TABLE tbl ( ... ) INSERT INTO tbl VALUES (...) INSERT INTO tbl SELECT ... DELETE FROM tbl WHERE ... DROP TABLE tbl ```

Making changes

Altering person table * Want to add column, `id` * `id` should show up first

Help on `alter table`

```sql mysql> help alter table ```
``` Name: 'ALTER TABLE' Description: Syntax: ALTER [IGNORE] TABLE tbl_name [alter_specification [, alter_specification] ...] [partition_options] alter_specification: table_options | ADD [COLUMN] col_name column_definition [FIRST | AFTER col_name ] | ADD [COLUMN] (col_name column_definition,...) | ADD {INDEX|KEY} [index_name] [index_type] (index_col_name,...) [index_option] ... | ADD [CONSTRAINT [symbol]] PRIMARY KEY [index_type] (index_col_name,...) [index_option] ... | ADD [CONSTRAINT [symbol]] UNIQUE [INDEX|KEY] [index_name] [index_type] (index_col_name,...) [index_option] ... | ADD FULLTEXT [INDEX|KEY] [index_name] (index_col_name,...) [index_option] ... | ADD SPATIAL [INDEX|KEY] [index_name] (index_col_name,...) [index_option] ... | ADD [CONSTRAINT [symbol]] FOREIGN KEY [index_name] (index_col_name,...) reference_definition | ALGORITHM [=] {DEFAULT|INPLACE|COPY} | ALTER [COLUMN] col_name {SET DEFAULT literal | DROP DEFAULT} | CHANGE [COLUMN] old_col_name new_col_name column_definition [FIRST|AFTER col_name] | LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE} | MODIFY [COLUMN] col_name column_definition [FIRST | AFTER col_name] | DROP [COLUMN] col_name | DROP PRIMARY KEY | DROP {INDEX|KEY} index_name | DROP FOREIGN KEY fk_symbol | DISABLE KEYS | ENABLE KEYS | RENAME [TO|AS] new_tbl_name | ORDER BY col_name [, col_name] ... | CONVERT TO CHARACTER SET charset_name [COLLATE collation_name] | [DEFAULT] CHARACTER SET [=] charset_name [COLLATE [=] collation_name] | DISCARD TABLESPACE | IMPORT TABLESPACE | FORCE | ADD PARTITION (partition_definition) | DROP PARTITION partition_names | TRUNCATE PARTITION {partition_names | ALL} | COALESCE PARTITION number | REORGANIZE PARTITION partition_names INTO (partition_definitions) | EXCHANGE PARTITION partition_name WITH TABLE tbl_name | ANALYZE PARTITION {partition_names | ALL} | CHECK PARTITION {partition_names | ALL} | OPTIMIZE PARTITION {partition_names | ALL} | REBUILD PARTITION {partition_names | ALL} | REPAIR PARTITION {partition_names | ALL} | REMOVE PARTITIONING index_col_name: col_name [(length)] [ASC | DESC] index_type: USING {BTREE | HASH} index_option: KEY_BLOCK_SIZE [=] value | index_type | WITH PARSER parser_name | COMMENT 'string' table_options: table_option [[,] table_option] ... (see CREATE TABLE options) partition_options: (see CREATE TABLE options) ALTER TABLE changes the structure of a table. For example, you can add or delete columns, create or destroy indexes, change the type of existing columns, or rename columns or the table itself. You can also change characteristics such as the storage engine used for the table or the table comment. Following the table name, specify the alterations to be made. If none are given, ALTER TABLE does nothing. The syntax for many of the permissible alterations is similar to clauses of the CREATE TABLE statement. See [HELP CREATE TABLE], for more information. table_options signifies table options of the kind that can be used in the CREATE TABLE statement, such as ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, MAX_ROWS, or ROW_FORMAT. For a list of all table options and a description of each, see [HELP CREATE TABLE]. However, ALTER TABLE ignores the DATA DIRECTORY and INDEX DIRECTORY table options. partition_options signifies options that can be used with partitioned tables for repartitioning, for adding, dropping, merging, and splitting partitions, and for performing partitioning maintenance. It is possible for an ALTER TABLE statement to contain a PARTITION BY or REMOVE PARTITIONING clause in an addition to other alter specifications, but the PARTITION BY or REMOVE PARTITIONING clause must be specified last after any other specifications. The ADD PARTITION, DROP PARTITION, COALESCE PARTITION, REORGANIZE PARTITION, EXCHANGE PARTITION, ANALYZE PARTITION, CHECK PARTITION, and REPAIR PARTITION options cannot be combined with other alter specifications in a single ALTER TABLE, since the options just listed act on individual partitions. For more information about partition options, see [HELP CREATE TABLE], and http://dev.mysql.com/doc/refman/5.6/en/alter-table-partition-operations .html. For information about and examples of ALTER TABLE ... EXCHANGE PARTITION statements, see http://dev.mysql.com/doc/refman/5.6/en/partitioning-management-range-li st.html. Some operations may result in warnings if attempted on a table for which the storage engine does not support the operation. These warnings can be displayed with SHOW WARNINGS. See [HELP SHOW WARNINGS]. For information on troubleshooting ALTER TABLE, see http://dev.mysql.com/doc/refman/5.6/en/alter-table-problems.html. URL: http://dev.mysql.com/doc/refman/5.6/en/alter-table.html ```
Altering person
```sql mysql> alter table person add id bigint primary key auto_increment first; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 ```
```sql mysql> select * from person; +----+-------+------------+ | id | name | title | +----+-------+------------+ | 1 | chong | programmer | | 2 | jim | cto | +----+-------+------------+ 2 rows in set (0.00 sec) ```

Discovering your schema

What does your database look like?
```sql show databases show tables select database(); use dbname ```
How is the address table defined
```sql mysql> select * from address; ```
```sql Empty set (0.00 sec) ```
```sql mysql> show create table address; ```
```sql +---------+----------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ----+ | Table | Create Table | +---------+----------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ----+ | address | CREATE TABLE `address` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `street` varchar(50) DEFAULT NULL, `city` varchar(40) DEFAULT NULL, `state` varchar(2) DEFAULT NULL, `person_id` bigint(20) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +---------+----------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ---------------------------------------------------------------------- ----+ 1 row in set (0.00 sec) ```
```sql mysql> show create table address\G *************************** 1. row *************************** Table: address Create Table: CREATE TABLE `address` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `street` varchar(50) DEFAULT NULL, `city` varchar(40) DEFAULT NULL, `state` varchar(2) DEFAULT NULL, `person_id` bigint(20) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) ```
```sql mysql> describe address; +-----------+-------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-----------+-------------+------+-----+---------+----------------+ | id | bigint(20) | NO | PRI | NULL | auto_increment | | street | varchar(50) | YES | | NULL |   | | city | varchar(40) | YES | | NULL |   | | state | varchar(2) | YES | | NULL |   | | person_id | bigint(20) | YES | | NULL |   | +-----------+-------------+------+-----+---------+----------------+ 5 rows in set (0.00 sec) ```
Inserting into address
```sql mysql> select * from person; +----+-------+------------+ | id | name | title | +----+-------+------------+ | 1 | chong | programmer | | 2 | jim | cto | +----+-------+------------+ 2 rows in set (0.00 sec) ```
```sql mysql> insert into address(street, city, state, person_id) -> values ('123 Sesame St', 'Tampa', 'FL', 1); Query OK, 1 row affected (0.00 sec) ```
```sql mysql> select * from address; +----+---------------+-------+-------+-----------+ | id | street | city | state | person_id | +----+---------------+-------+-------+-----------+ | 1 | 123 Sesame St | Tampa | FL | 1 | +----+---------------+-------+-------+-----------+ 1 row in set (0.00 sec) ```

select from multiple tables

Joins

```sql mysql> select name, title, street, city, state -> from person join address on person.id = address.person_id; ```
```sql +-------+------------+---------------+-------+-------+ | name | title | street | city | state | +-------+------------+---------------+-------+-------+ | chong | programmer | 123 Sesame St | Tampa | FL | +-------+------------+---------------+-------+-------+ 1 row in set (0.00 sec) ```
Inserting another address for 'chong'
```sql mysql> insert into address values -> (null, '1313 Mockingbird Ln', 'Tampa', 'FL', 1); Query OK, 1 row affected (0.00 sec) mysql> select name, title, street, city, state -> from person join address on person.id = address.person_id; +-------+------------+---------------------+-------+-------+ | name | title | street   | city | state | +-------+------------+---------------------+-------+-------+ | chong | programmer | 123 Sesame St | Tampa | FL | | chong | programmer | 1313 Mockingbird Ln | Tampa | FL | +-------+------------+---------------------+-------+-------+ 2 rows in set (0.00 sec) ```
Warning: Old style
mysql> select name, title, street, city, state
    -> from person p, address a
    -> where p.id = a.person_id;
+-------+------------+-----------------------+-------+-------+
| name  | title      | street                | city  | state |
+-------+------------+-----------------------+-------+-------+
| chong | programmer | 123 Sesame St         | Tampa | FL    |
| chong | programmer | 1313 Mockingbird Lane | Tampa | FL    |
+-------+------------+-----------------------+-------+-------+
2 rows in set (0.00 sec)
            

Developer's Guide to SQL

Python
# creating "tables"

class Person(object):
    def __init__(self, name, title):
        self.name = name
        self.title = title

class Address(object):
    def __init__(self, street, city, state, person):
        self.street = street
        self.city = city
        self.state = state
        self.person = person
            
Python
# start with empty "tables"
persons = []
addresses = []

# inserting
persons.append( Person('chong', 'programmer') )
persons.append( Person('jim', 'cto') )

addresses.append( Address('123 Sesame St', 'Tampa', 'FL', persons[0]) )
addresses.append( Address('1313 Mockingbird Ln', 'Tampa', 'FL', persons[0]) )

# selecting
print [ {'name': p.name, 'title': p.title, 'street': a.street, 'city': a.city, 'state': a.state}
        for p in persons
        for a in addresses
        if p == a.person ]
Python
 # output
[
  {
    'city': 'Tampa',
    'state': 'FL',
    'street': '123 Sesame St',
    'name': 'chong',
    'title': 'programmer'
  },
  {
    'city': 'Tampa',
    'state': 'FL',
    'street': '1313 Mockingbird Ln',
    'name': 'chong',
    'title': 'programmer'
  }
]
Ruby

Person = Struct.new(:name, :title)
Address = Struct.new(:street, :city, :state, :person)

persons = []
addresses = []

persons << Person.new('chong', 'programmer')
persons << Person.new('jim', 'cto')

addresses << Address.new('123 Sesame St', 'Tampa', 'FL', persons.first)
addresses << Address.new('1313 Mockingbird Ln', 'Tampa', 'FL', persons.first)

result = []
persons.each do |p|
  addresses.each do |a|
    result << { name: p.name, title: p.title, street: a.street,
                city: a.city, state: a.state } if a.person == p
  end
end

puts result.inspect
          
Status: Junior Level

Designing a schema

Normalizing * Don't repeat yourself * Keep relevant info together * Keep non-relevant info apart
Bad ```sql create table person( id bigint primary key auto_increment, phone1 varchar(20), phone2 varchar(20), cubicle_location varchar(100), cubicle_type enum('small','medium','large'), name varchar(100) ); ```
Good ```sql create table person( id bigint primary key auto_increment, name varchar(100), cubicle_id bigint ); create table phone( id bigint primary key auto_increment, person_id bigint, ph_num varchar(20) ); create table cubicle( id bigint primary key auto_increment, location varchar(100), type enum('small','medium','large') ); ```
Data Warehousing * ETL = Extract, Transform, Load * Denormalize tables * Read-only * Store huge amounts of data * Do mostly group calculations: `sum`, `avg`, `max`

Problem

+----+------------+-------+
| id | date       | total |
+----+------------+-------+
|  1 | 2015-01-02 |    12 |
|  2 | 2015-01-03 |     8 |
|  3 | 2015-01-04 |     8 |
|  4 | 2015-01-06 |     3 |
|  5 | 2015-01-10 |     1 |
+----+------------+-------+
            
  • There are days where there's no data
  • Write a query that will include all dates given a range
create dates table
```sql -- plant the seed create table dates (dt date primary key); insert into dates values ('1900-01-01'); -- run this multiple times as needed insert into dates select adddate(dt, tmp.n) from dates, (select count(*) as n from dates) as tmp; -- trim the fat delete from dates where dt < '1960-01-01' or dt > '2100-01-01'; ```
Use outer joins
```sql mysql> select dates.dt, ds.total -> from dates -> left join day_summary ds on dates.dt = ds.date -> where dates.dt between '2015-01-01' and '2015-01-06' -> order by 1; +------------+-------+ | dt | total | +------------+-------+ | 2015-01-01 | NULL | | 2015-01-02 | 12 | | 2015-01-03 | 8 | | 2015-01-04 | 8 | | 2015-01-04 | 10 | | 2015-01-05 | NULL | | 2015-01-06 | 3 | +------------+-------+ 6 rows in set (0.00 sec) ```

Problem

What table has person_id

Information Schema
```sql mysql> select table_name -> from information_schema.columns -> where table_schema=database() -> and column_name='person_id'; +------------+ | table_name | +------------+ | address | +------------+ 1 row in set (0.00 sec) ```
# Performance Tuning * `explain` * `analyze table` * `optimize table` * `create index`
explain

mysql> explain select name, street from person join address on person.id = address.person_id;
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows | Extra                                              |
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
|  1 | SIMPLE      | address | ALL  | NULL          | NULL | NULL    | NULL |    2 | NULL                                               |
|  1 | SIMPLE      | person  | ALL  | PRIMARY       | NULL | NULL    | NULL |    2 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
2 rows in set (0.00 sec)
              
create an index
mysql> alter table address add index(person_id);
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select name, street from person join address on person.id = address.person_id;
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows | Extra                                              |
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
|  1 | SIMPLE      | address | ALL  | person_id     | NULL | NULL    | NULL |    2 | NULL                                               |
|  1 | SIMPLE      | person  | ALL  | PRIMARY       | NULL | NULL    | NULL |    2 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------------+
2 rows in set (0.00 sec)
mysql> insert into address(person_id) values (3); -- 7 more times...
Query OK, 1 row affected (0.00 sec)

mysql> explain select name, street from person join address on person.id = address.person_id;
+----+-------------+---------+------+---------------+-----------+---------+-------------------+------+-------+
| id | select_type | table   | type | possible_keys | key       | key_len | ref               | rows | Extra |
+----+-------------+---------+------+---------------+-----------+---------+-------------------+------+-------+
|  1 | SIMPLE      | person  | ALL  | PRIMARY       | NULL      | NULL    | NULL              |    2 | NULL  |
|  1 | SIMPLE      | address | ref  | person_id     | person_id | 9       | ckim_db.person.id |    1 | NULL  |
+----+-------------+---------+------+---------------+-----------+---------+-------------------+------+-------+
2 rows in set (0.00 sec)
            
Cardinality
mysql> show index from address where column_name ='person_id';
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| address |          1 | person_id |            1 | person_id   | A         |          10 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.00 sec)
mysql> select cardinality
    -> from information_schema.statistics
    -> where table_schema=database()
    -> and table_name='address'
    -> and column_name='person_id';
+-------------+
| cardinality |
+-------------+
|          10 |
+-------------+
1 row in set (0.00 sec)
Index maintenance
```sql analyze table tblname; -- recalculate statistics ```
```sql optimize table tblname; -- defrag ```
```sql alter table address engine=myisam ```
Other tools * show processlist; * slow logs ** show variables like 'slow%log';
# PROBLEM ![hierarchical data](img/hierarchy.png)
One solution
CREATE TABLE category(
  category_id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(20) NOT NULL,
  parent INT DEFAULT NULL
);

SELECT * FROM category ORDER BY category_id;
+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+
10 rows in set (0.00 sec)
Problems * How many nodes exist under a category * What is the depth of this node * Give the ancestor path
Hierarchical data http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
# Engines
* myisam * innodb * blackhole * csv
MyISAM vs InnoDB
Blackhole
CSV
# Problem Who touched my table?
Triggers ```sql create table info(n int); insert into info values (1),(2),(3); create table log(v varchar(100)); ``` ```sql create trigger whodoneit before delete on info for each row insert into log values (concat(user(), ' removed ', old.n)); ```
SQLite
Other topics * set theory * UNION ALL * "MINUS" * intersect * varchar vs binary * self joins * join tables * \P less -S * show variables (datadir) * stored procedure * replication (@@sql_log_bin)
Other topics * ci and collation * partition * hashing row entries to checksum blocks of rows * my.cnf * batch processing * ORM * composite indexes * recovery * aggregate functions * multiple servers
Other topics * subqueries * one row * multiple rows * in FROM * group by * count(*) or count(field)