PostgreSQL Comprehensive Notes
Table of Contents
- What is PostgreSQL?
- Why Use PostgreSQL?
- PostgreSQL vs MySQL
- PostgreSQL Features
- PostgreSQL Commands by Category
- Common psql Commands
- Best Practices
What is PostgreSQL?
PostgreSQL (often called "Postgres") is a powerful, open-source object-relational database management system (ORDBMS) that has been in active development for over 30 years. It's known for its reliability, feature robustness, and performance.
Key Characteristics
- ACID Compliant: Ensures data integrity through Atomicity, Consistency, Isolation, and Durability
- Open Source: Free to use with a permissive license
- Cross-Platform: Runs on Linux, Windows, macOS, and other Unix-like systems
- Extensible: Supports custom data types, functions, and extensions
- Standards Compliant: Follows SQL standards more closely than most databases
Why Use PostgreSQL?
1. Advanced Data Types
- JSON and JSONB for document storage
- Arrays, ranges, and geometric types
- Custom data types
- Full-text search capabilities
2. Performance & Scalability
- Multi-version concurrency control (MVCC)
- Advanced indexing (B-tree, Hash, GiST, SP-GiST, GIN, BRIN)
- Table partitioning
- Parallel queries
- Write-ahead logging (WAL)
3. Reliability & Data Integrity
- ACID compliance
- Foreign key constraints
- Check constraints
- Transactions with savepoints
- Point-in-time recovery
4. Extensibility
- Custom functions in multiple languages (PL/pgSQL, Python, JavaScript, etc.)
- Extensions ecosystem
- Custom operators and data types
- Procedural languages
5. Security Features
- Row-level security
- SSL connections
- Authentication methods (MD5, SCRAM, Kerberos, LDAP)
- Role-based access control
PostgreSQL vs MySQL
Feature | PostgreSQL | MySQL |
---|---|---|
ACID Compliance | Full ACID compliance | ACID with InnoDB engine |
Data Types | Rich set including JSON, arrays, ranges | Basic types, limited JSON support |
Concurrency | MVCC (better read performance) | Locking-based |
Standards Compliance | Highly SQL standard compliant | Less compliant |
Replication | Streaming, logical replication | Master-slave, master-master |
Full-text Search | Built-in advanced search | Basic FULLTEXT indexes |
Extensibility | Highly extensible | Limited extensibility |
Performance | Better for complex queries | Better for simple read-heavy workloads |
Learning Curve | Steeper | Gentler |
Community | Strong academic/enterprise focus | Web development focus |
When to Choose PostgreSQL over MySQL
Choose PostgreSQL when you need:
- Complex queries and data analysis
- Data integrity is critical
- Advanced data types (JSON, arrays, etc.)
- Full-text search capabilities
- Geographic data (PostGIS extension)
- Extensibility and customization
- Strong consistency requirements
- ACID compliance across all operations
Choose MySQL when you need:
- Simple web applications
- Read-heavy workloads
- Faster development cycle
- Simpler administration
- Better performance for basic operations
PostgreSQL Features
1. Data Types
- Numeric: INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, DOUBLE PRECISION
- Character: CHAR, VARCHAR, TEXT
- Date/Time: DATE, TIME, TIMESTAMP, INTERVAL
- Boolean: BOOLEAN
- Binary: BYTEA
- Network: INET, CIDR, MACADDR
- Geometric: POINT, LINE, CIRCLE, POLYGON
- JSON: JSON, JSONB
- Arrays: Any data type can be an array
- Ranges: INT4RANGE, TSRANGE, etc.
- Custom Types: User-defined types
2. Advanced Features
- Window Functions: ROW_NUMBER(), RANK(), LAG(), LEAD()
- Common Table Expressions (CTEs): WITH queries
- Recursive Queries: Hierarchical data processing
- Table Inheritance: Object-oriented database features
- Materialized Views: Pre-computed query results
- Triggers: Row and statement-level triggers
- Stored Procedures: Functions in multiple languages
3. Indexing
- B-tree: Default index type
- Hash: For equality operations
- GiST: Generalized search trees
- SP-GiST: Space-partitioned GiST
- GIN: Generalized inverted indexes
- BRIN: Block range indexes
- Partial Indexes: Indexes on subsets of data
- Functional Indexes: Indexes on expressions
4. Replication & High Availability
- Streaming Replication: Real-time data streaming
- Logical Replication: Publication/subscription model
- Hot Standby: Read queries on standby servers
- Point-in-Time Recovery: Restore to any point in time
- Connection Pooling: Built-in connection management
PostgreSQL Commands by Category
Database Operations
Creating and Managing Databases
-- Create database
CREATE DATABASE mydatabase;
CREATE DATABASE mydatabase WITH ENCODING 'UTF8';
-- Drop database
DROP DATABASE mydatabase;
-- List databases
\l
-- Connect to database
\c mydatabase
-- Show current database
SELECT current_database();
Table Operations
Creating Tables
-- Basic table creation
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Table with constraints
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
amount DECIMAL(10,2) CHECK (amount > 0),
status VARCHAR(20) DEFAULT 'pending'
);
-- Table with arrays
CREATE TABLE tags (
id SERIAL PRIMARY KEY,
name VARCHAR(50),
categories TEXT[]
);
Altering Tables
-- Add column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Drop column
ALTER TABLE users DROP COLUMN phone;
-- Rename column
ALTER TABLE users RENAME COLUMN username TO user_name;
-- Change column type
ALTER TABLE users ALTER COLUMN email TYPE TEXT;
-- Add constraint
ALTER TABLE users ADD CONSTRAINT unique_email UNIQUE (email);
-- Drop constraint
ALTER TABLE users DROP CONSTRAINT unique_email;
Table Information
-- List tables
\dt
-- Describe table structure
\d users
-- Show table with details
\d+ users
-- Show table constraints
\d+ users
Data Manipulation
INSERT Operations
-- Basic insert
INSERT INTO users (username, email) VALUES ('john_doe', 'john@example.com');
-- Multiple inserts
INSERT INTO users (username, email) VALUES
('jane_doe', 'jane@example.com'),
('bob_smith', 'bob@example.com');
-- Insert with returning
INSERT INTO users (username, email)
VALUES ('alice', 'alice@example.com')
RETURNING id, created_at;
-- Insert from another table
INSERT INTO backup_users SELECT * FROM users WHERE created_at < '2023-01-01';
SELECT Operations
-- Basic select
SELECT * FROM users;
SELECT username, email FROM users;
-- With conditions
SELECT * FROM users WHERE created_at > '2023-01-01';
SELECT * FROM users WHERE username LIKE 'john%';
-- Ordering and limiting
SELECT * FROM users ORDER BY created_at DESC LIMIT 10;
SELECT * FROM users ORDER BY username OFFSET 5 LIMIT 10;
-- Aggregations
SELECT COUNT(*) FROM users;
SELECT COUNT(*), AVG(amount) FROM orders;
SELECT status, COUNT(*) FROM orders GROUP BY status;
-- Joins
SELECT u.username, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id;
-- Window functions
SELECT username,
ROW_NUMBER() OVER (ORDER BY created_at) as row_num,
RANK() OVER (ORDER BY created_at) as rank
FROM users;
UPDATE Operations
-- Basic update
UPDATE users SET email = 'newemail@example.com' WHERE id = 1;
-- Update multiple columns
UPDATE users SET
username = 'new_username',
email = 'new@example.com'
WHERE id = 1;
-- Update with returning
UPDATE users SET username = 'updated_user'
WHERE id = 1
RETURNING username, email;
-- Update from another table
UPDATE orders SET status = 'completed'
FROM users
WHERE orders.user_id = users.id AND users.username = 'john_doe';
DELETE Operations
-- Basic delete
DELETE FROM users WHERE id = 1;
-- Delete with condition
DELETE FROM users WHERE created_at < '2023-01-01';
-- Delete with returning
DELETE FROM users WHERE username = 'old_user'
RETURNING id, username;
-- Truncate table (faster for all rows)
TRUNCATE TABLE users;
TRUNCATE TABLE users RESTART IDENTITY; -- Reset sequences
Index Operations
-- Create index
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);
-- Partial index
CREATE INDEX idx_active_users ON users(username) WHERE active = true;
-- Functional index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- Composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- List indexes
\di
-- Drop index
DROP INDEX idx_users_email;
-- Reindex
REINDEX INDEX idx_users_email;
REINDEX TABLE users;
View Operations
-- Create view
CREATE VIEW active_users AS
SELECT * FROM users WHERE active = true;
-- Materialized view
CREATE MATERIALIZED VIEW user_stats AS
SELECT
DATE_TRUNC('month', created_at) as month,
COUNT(*) as user_count
FROM users
GROUP BY DATE_TRUNC('month', created_at);
-- Refresh materialized view
REFRESH MATERIALIZED VIEW user_stats;
-- Drop view
DROP VIEW active_users;
DROP MATERIALIZED VIEW user_stats;
-- List views
\dv
Function and Procedure Operations
-- Create function
CREATE OR REPLACE FUNCTION get_user_count()
RETURNS INTEGER AS $$
BEGIN
RETURN (SELECT COUNT(*) FROM users);
END;
$$ LANGUAGE plpgsql;
-- Function with parameters
CREATE OR REPLACE FUNCTION get_user_by_id(user_id INTEGER)
RETURNS TABLE(username VARCHAR, email VARCHAR) AS $$
BEGIN
RETURN QUERY
SELECT u.username, u.email
FROM users u
WHERE u.id = user_id;
END;
$$ LANGUAGE plpgsql;
-- Call function
SELECT get_user_count();
SELECT * FROM get_user_by_id(1);
-- Drop function
DROP FUNCTION get_user_count();
-- List functions
\df
User and Permission Operations
-- Create user
CREATE USER myuser WITH PASSWORD 'mypassword';
-- Create role
CREATE ROLE myrole;
-- Grant permissions
GRANT SELECT, INSERT ON users TO myuser;
GRANT ALL PRIVILEGES ON DATABASE mydatabase TO myuser;
-- Revoke permissions
REVOKE INSERT ON users FROM myuser;
-- List users
\du
-- List permissions
\dp users
Transaction Operations
-- Begin transaction
BEGIN;
-- Commit transaction
COMMIT;
-- Rollback transaction
ROLLBACK;
-- Savepoints
BEGIN;
SAVEPOINT sp1;
-- Some operations
ROLLBACK TO sp1;
COMMIT;
-- Transaction isolation levels
BEGIN ISOLATION LEVEL READ COMMITTED;
BEGIN ISOLATION LEVEL REPEATABLE READ;
BEGIN ISOLATION LEVEL SERIALIZABLE;
Advanced Query Operations
JSON Operations
-- JSON data type
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
data JSON
);
-- Insert JSON data
INSERT INTO documents (data) VALUES
('{"name": "John", "age": 30, "city": "New York"}');
-- Query JSON
SELECT data->>'name' as name FROM documents;
SELECT * FROM documents WHERE data->>'age' = '30';
-- JSONB operations
CREATE TABLE documents_b (
id SERIAL PRIMARY KEY,
data JSONB
);
-- JSONB indexing
CREATE INDEX idx_documents_data ON documents_b USING GIN (data);
Array Operations
-- Create table with arrays
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title VARCHAR(100),
tags TEXT[]
);
-- Insert array data
INSERT INTO posts (title, tags) VALUES
('First Post', ARRAY['tech', 'programming', 'sql']);
-- Query arrays
SELECT * FROM posts WHERE 'tech' = ANY(tags);
SELECT * FROM posts WHERE tags @> ARRAY['tech'];
SELECT unnest(tags) as tag FROM posts;
Full-Text Search
-- Basic full-text search
SELECT * FROM documents WHERE to_tsvector(content) @@ to_tsquery('postgresql');
-- Create text search index
CREATE INDEX idx_documents_fts ON documents USING GIN (to_tsvector('english', content));
-- Ranking results
SELECT title, ts_rank(to_tsvector(content), to_tsquery('postgresql')) as rank
FROM documents
WHERE to_tsvector(content) @@ to_tsquery('postgresql')
ORDER BY rank DESC;
System and Maintenance Operations
Database Information
-- Show version
SELECT version();
-- Show current user
SELECT current_user;
-- Show database size
SELECT pg_size_pretty(pg_database_size('mydatabase'));
-- Show table sizes
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Show running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
Backup and Restore
# Dump database
pg_dump mydatabase > mydatabase_backup.sql
# Dump with compression
pg_dump -Fc mydatabase > mydatabase_backup.dump
# Restore database
psql mydatabase < mydatabase_backup.sql
# Restore from custom format
pg_restore -d mydatabase mydatabase_backup.dump
# Dump specific table
pg_dump -t users mydatabase > users_backup.sql
Performance and Statistics
-- Update table statistics
ANALYZE users;
ANALYZE; -- All tables
-- Vacuum table
VACUUM users;
VACUUM FULL users; -- More thorough but locks table
-- Auto vacuum settings
SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';
-- Index usage statistics
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes;
-- Table statistics
SELECT schemaname, tablename, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables;
Common psql Commands
# Connection
psql -h hostname -U username -d database
# psql meta-commands
\? # Help on psql commands
\h # Help on SQL commands
\h SELECT # Help on specific SQL command
# Database and table info
\l # List databases
\c database # Connect to database
\dt # List tables
\di # List indexes
\dv # List views
\df # List functions
\du # List users
\dp table # List table permissions
# Table structure
\d table # Describe table
\d+ table # Describe table with details
# Settings
\set # Show all variables
\timing on # Show query execution time
\x # Toggle expanded display
# File operations
\i filename # Execute commands from file
\o filename # Send output to file
\o # Send output back to stdout
# Quit
\q # Quit psql
Best Practices
Performance Optimization
- Use appropriate data types - Don't use TEXT when VARCHAR(50) suffices
- Create proper indexes - On frequently queried columns
- Use EXPLAIN ANALYZE - To understand query performance
- Regular maintenance - VACUUM and ANALYZE regularly
- Connection pooling - Use tools like PgBouncer for high-traffic applications
Security
- Use roles and permissions - Principle of least privilege
- Enable SSL - For all connections
- Regular updates - Keep PostgreSQL updated
- Row-level security - For multi-tenant applications
- Audit logging - Track database access and changes
Data Integrity
- Use constraints - Foreign keys, check constraints, not null
- Transactions - Wrap related operations in transactions
- Backup strategy - Regular backups with tested restore procedures
- Monitoring - Set up alerts for disk space, connections, performance
Development
- Use prepared statements - Prevent SQL injection
- Connection management - Properly close connections
- Error handling - Handle database errors gracefully
- Testing - Test database operations thoroughly
- Migration scripts - Version control database schema changes
PostgreSQL is a powerful and feature-rich database system that excels in complex applications requiring data integrity, advanced queries, and extensibility. While it may have a steeper learning curve than MySQL, its robustness and capabilities make it an excellent choice for serious applications.