Top ChatGPT Prompts for Data Analysts: Master SQL Generation
Stop wrestling with complex SQL syntax and start getting insights faster. This curated collection of ChatGPT prompts for data analysts is designed to streamline your SQL generation workflow. From simple joins to advanced window functions, use these prompts to write accurate, efficient queries in seconds, freeing you up for higher-level analysis.
Efficiently construct a standard SQL SELECT statement to retrieve specific columns with filtering conditions.
Act as a senior data analyst. Construct a SQL query to select the columns `[column1]`, `[column2]`, and `[column3]` from the table `[table_name]` where `[column4]` is equal to `[value]`. The SQL dialect is [PostgreSQL/MySQL/SQL Server].
INNER JOIN Query Development
Develop an INNER JOIN query to combine rows from two tables based on shared key columns.
Develop a SQL query that joins the `[table1]` with `[table2]` on the key `[table1.id] = [table2.foreign_id]`. Select `[table1.columnA]`, `[table2.columnB]`, and `[table1.columnC]`. Filter the results where `[table1.date_column]` is after '2023-01-01'. Use the [PostgreSQL] dialect.
Aggregate Calculations with GROUP BY
Formulate a query to group data and calculate aggregate values (COUNT, SUM, AVG) for distinct groups.
Produce a SQL query for [MySQL] that counts the number of `[items]` and calculates the average `[price]` from the `[sales]` table. Group the results by `[category]` and `[region]`. Only include groups having more than `[10]` items.
LEFT JOIN Query Construction
Construct a LEFT JOIN query to retrieve all records from the left table, including matched data from the right.
Construct a SQL query to select all customers from the `[customers]` table and their corresponding order dates from the `[orders]` table. The join key is `[customers.customer_id] = [orders.customer_id]`. If a customer has no orders, the order date should be NULL. The dialect is [SQL Server].
Common Table Expression (CTE) for Subqueries
Define a temporary, named result set using a CTE to improve readability and structure complex queries.
Using a Common Table Expression (CTE) in [PostgreSQL], first select all employees from the `[employees]` table who are in the `[Sales]` department. Then, from the CTE result, select the employees who have a `[hire_date]` before '2022-01-01'. Name the CTE `SalesEmployees`.
Ranking Data with Window Functions
Utilize the RANK() window function to assign ordered ranks to rows within specific data partitions.
Develop a [SQL Server] query to rank employees by their `[salary]` within each `[department]`. The result should include `[employee_name]`, `[department]`, `[salary]`, and the calculated rank. Highest salary should get rank 1.
Identify Duplicate Records
Formulate a query to precisely identify and count duplicate rows based on specified column combinations.
Construct a SQL query to find duplicate records in the `[contacts]` table based on the `[email]` and `[full_name]` columns. The output should show the email, full name, and the count of occurrences for each duplicate.
Subquery for WHERE Clause Filtering
Develop a query that employs a subquery within its WHERE clause for targeted data filtering.
Develop a SQL query to select all products from the `[products]` table whose `[category_id]` is in the set of categories from the `[categories]` table where the `[is_active]` flag is true. Use a subquery in the WHERE clause.
Conditional Logic with CASE Statement
Construct a query utilizing a CASE statement to apply conditional logic for data categorization or bucketing.
Formulate a [PostgreSQL] query on the `[orders]` table. Select the `[order_id]` and create a new column `[order_size]` based on the `[amount]`. If amount < 100, 'Small'. If amount between 100 and 500, 'Medium'. Otherwise, 'Large'.
Moving Average Calculation via Window Functions
Apply a window function to compute a moving average (e.g., 7-day) for a specified metric over time.
Produce a [SQL Server] query to calculate the 7-day moving average of `[daily_revenue]` from the `[daily_sales]` table. The table has `[sale_date]` and `[daily_revenue]` columns. The result should be ordered by date.
COALESCE for NULL Value Replacement
Formulate a query that effectively replaces NULL values in a designated column with a specified default.
From the `[products]` table, select the `[product_name]` and `[description]`. If the `[description]` is NULL, replace it with the string 'No description available'. Use the COALESCE function.
Demystify SQL Query Execution Plans
Obtain a clear, human-readable explanation of a complex SQL query's execution plan, identifying bottlenecks.
Act as a database administrator. I have the following [PostgreSQL] query. Explain its execution plan in simple terms, highlighting potential bottlenecks. The query is: [Paste your complex SQL query here]. The execution plan is: [Paste your EXPLAIN ANALYZE output here].
Optimize Query Performance with Index Suggestions
Receive targeted index recommendations based on a specific query and its underlying table schema to boost performance.
Given the following [MySQL] query and table schema, suggest one or more indexes to improve its performance. Explain why each suggested index would be beneficial.
Query: [Paste your slow SQL query here]...
Multi-Table JOIN Query
Construct a query that seamlessly integrates data from three distinct tables using multiple JOIN clauses.
Develop a [SQL Server] query to join `[customers]`, `[orders]`, and `[products]`.
- Join `[customers]` and `[orders]` on `[customer_id]`.
- Join `[orders]` and `[products]` on `[product_id]`....
Date Difference Calculation in SQL
Formulate a query to accurately calculate the duration in days between two specified date columns.
In [PostgreSQL], formulate a query to calculate the number of days between `[order_date]` and `[shipped_date]` in the `[orders]` table. Name the resulting column `[processing_days]`.
Analyze Sequential Data with LEAD/LAG
Effortlessly retrieve data from preceding or succeeding rows within a result set using LEAD/LAG window functions.
Develop a [SQL Server] query on the `[monthly_sales]` table to show each month's `[revenue]` alongside the previous month's revenue. Use the LAG() window function, partitioned by `[product_id]` and ordered by `[month]`.
Pivot Table for Data Transformation
Transform row-oriented data into a summarized columnar (pivot) format for easier analysis.
I have a `[sales]` table with columns `[year]`, `[quarter]`, and `[amount]`. Construct a [SQL Server] query using PIVOT to transform this data so that each `[year]` is a row and the columns are `[Q1]`, `[Q2]`, `[Q3]`, `[Q4]`, containing the sum of `[amount]` for each quarter.
Construct a CREATE TABLE Statement
Generate a precise DDL `CREATE TABLE` statement based on a detailed specification of columns and data types.
Construct a `CREATE TABLE` statement for a [PostgreSQL] table named `[users]`. It should have the following columns:
- `[user_id]` (integer, primary key, auto-incrementing)
- `[username]` (text, unique, not null)...
JSON Column Data Extraction
Develop a query to efficiently extract a specific key's value from a JSON-formatted column.
I have a table `[events]` in [PostgreSQL] with a JSONB column named `[properties]`. Develop a query to select the `[event_id]` and extract the value of the `[user_id]` key from the `[properties]` column. The `[user_id]` is a nested key: `{'user': {'user_id': 123}}`.
UPDATE Statement with SELECT Subquery
Construct an `UPDATE` statement that modifies records in one table using values derived from another via a `SELECT` subquery.
Construct a [SQL Server] `UPDATE` statement to set the `[total_spent]` column in the `[customers]` table. The value should be the sum of `[amount]` from the `[orders]` table for each customer. Join the tables on `[customer_id]`.
Regex Filtering for Text Data
Formulate a query that applies regular expressions to effectively filter text data based on complex patterns.
Develop a [PostgreSQL] query to find all users in the `[users]` table whose `[email]` address is from a `[gmail.com]` or `[yahoo.com]` domain. Use a regular expression for the filtering condition.
Refactor Nested Subqueries to CTEs
Enhance query readability and maintainability by refactoring complex nested subqueries into clean CTEs.
Act as a senior data engineer. Refactor the following [SQL Server] query to use a Common Table Expression (CTE) instead of the nested subquery in the FROM clause. Explain the benefits of the refactored version.
Query: [Paste your query with a nested subquery here]
Year-over-Year Growth Calculation
Formulate a query to accurately compute year-over-year (YoY) growth percentages for key metrics.
Using the `[annual_summary]` table which has `[year]` and `[total_revenue]`, formulate a [PostgreSQL] query to calculate the year-over-year revenue growth percentage. Use the LAG() window function to get the previous year's revenue.
Recursive CTE for Hierarchical Data
Develop a query using a recursive CTE to effectively traverse and explore hierarchical datasets.
I have an `[employees]` table with `[employee_id]`, `[name]`, and `[manager_id]`. Construct a recursive CTE in [SQL Server] to find the entire management chain for the employee with `[employee_id] = 10`.
Extract First/Last Value in a Group
Employ `FIRST_VALUE()` or `LAST_VALUE()` window functions to retrieve specific values from ordered partitions.
Develop a [PostgreSQL] query on a `[transactions]` table. For each `[customer_id]`, find the amount of their very first transaction. The table has `[customer_id]`, `[transaction_date]`, and `[amount]`. The result should show each `[customer_id]` and their `[first_transaction_amount]`.
Construct an INSERT Statement
Generate a precise SQL `INSERT` statement to add a new record to a table using specified values.
Act as a database assistant. Construct a SQL INSERT statement for the `[employees]` table. The table has columns `[first_name]`, `[last_name]`, `[department_id]`, and `[hire_date]`. The values to insert are: 'John', 'Doe', 101, and '2024-01-15'. The SQL dialect is [MySQL].
Turn these prompts into a reusable workspace
Save your favourite prompts once, reuse them with Alt+P, keep a live Table of Contents of long chats, and export conversations when you're done.