From aeda9c975015d644aa84d2c3fba9b73b71e20acc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A9ter=20Kir=C3=A1ly?=
Date: Fri, 8 Mar 2024 14:49:21 +0100
Subject: [PATCH] Fixing coding style issues

---
 episodes/01-introduction.md            |  4 ++--
 episodes/03-filtering.md               |  6 +++---
 episodes/04-ordering-commenting.md     |  6 ++++--
 episodes/05-aggregating-calculating.md |  8 ++++----
 episodes/06-joins-aliases.md           | 16 ++++++++++------
 episodes/09-create.md                  | 10 +++++-----
 episodes/11-extra-challenges.md        | 19 ++++++++++---------
 episodes/Bonus_GoodStyle.md            |  3 ++-
 8 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/episodes/01-introduction.md b/episodes/01-introduction.md
index e64c424c..04a5069a 100644
--- a/episodes/01-introduction.md
+++ b/episodes/01-introduction.md
@@ -166,9 +166,9 @@ The main data types that are used in doaj-article-sample database are `INTEGER`
 Different database software/platforms have different names and sometimes different definitions of data types, so you'll need to understand the data types for any platform you are using. The following table explains some of the common data types and how they are represented in SQLite; [more details available on the SQLite website](https://www.sqlite.org/datatype3.html).
 
 | Data type | Details | Name in SQLite |
-| :--------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :-------------------------------------------------------------------------------------------------------------------- |
+| :--------------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| :-------------------------------------------------------------------------------------------------------------------- |
 | boolean or binary | this variable type is often used to represent variables that can only have two values: yes or no, true or false. | doesn't exist - need to use integer data type and values of 0 or 1. |
-| integer | sometimes called whole numbers or counting numbers. Can be 1,2,3, etc., as well as 0 and negative whole numbers: -1,-2,-3, etc. | INTEGER |
+| integer | sometimes called whole numbers or counting numbers. Can be 1, 2, 3, etc., as well as 0 and negative whole numbers: -1, -2, -3, etc. | INTEGER |
 | float, real, or double | a decimal number or a floating point value. The largest possible size of the number may be specified. | REAL |
 | text or string | and combination of numbers, letters, symbols. Platforms may have different data types: one for variables with a set number of characters - e.g., a zip code or postal code, and one for variables with an open number of characters, e.g., an address or description variable. | TEXT |
 | date or datetime | depending on the platform, may represent the date and time or the number of days since a specified date.
This field often has a specified format, e.g., YYYY-MM-DD | doesn't exist - need to use built-in date and time functions and store dates in real, integer, or text formats. See [Section 2.2 of SQLite documentation](https://www.sqlite.org/datatype3.html#date_and_time_datatype) for more details. |
diff --git a/episodes/03-filtering.md b/episodes/03-filtering.md
index 1cba9a17..39e916f2 100644
--- a/episodes/03-filtering.md
+++ b/episodes/03-filtering.md
@@ -25,7 +25,7 @@ SQL is a powerful tool for filtering data in databases based on a set of conditi
 ```sql
 SELECT *
 FROM articles
-WHERE ISSNs='2056-9890';
+WHERE ISSNs = '2056-9890';
 ```
 
 We can add additional conditions by using `AND`, `OR`, and/or `NOT`. For example, suppose we want the data on *Acta Crystallographica* published after October:
@@ -33,7 +33,7 @@ We can add additional conditions by using `AND`, `OR`, and/or `NOT`. For example
 ```sql
 SELECT *
 FROM articles
-WHERE (ISSNs='2056-9890') AND (Month > 10);
+WHERE (ISSNs = '2056-9890') AND (Month > 10);
 ```
 
 Parentheses are used merely for readability in this case but can be required by the SQL interpreter in order to disambiguate formulas.
@@ -44,7 +44,7 @@ ISSNs codes "2076-0787" and "2077-1444", we can combine the tests using OR:
 ```sql
 SELECT *
 FROM articles
-WHERE (issns = '2076-0787') OR (issns = '2077-1444');
+WHERE (ISSNs = '2076-0787') OR (ISSNs = '2077-1444');
 ```
 
 When you do not know the entire value you are searching for, you can use comparison keywords such as `LIKE`, `IN`, `BETWEEN...AND`, `IS NULL`. For instance, we can use `LIKE` in combination with `WHERE` to search for data that matches a pattern.
diff --git a/episodes/04-ordering-commenting.md b/episodes/04-ordering-commenting.md
index db0c03fb..db4603dd 100644
--- a/episodes/04-ordering-commenting.md
+++ b/episodes/04-ordering-commenting.md
@@ -56,7 +56,8 @@ Consider the following query:
 ```sql
 SELECT *
 FROM articles
-WHERE (ISSNs = '2076-0787') OR (ISSNs = '2077-1444') OR (ISSNs = '2067-2764|2247-6202');
+WHERE (ISSNs = '2076-0787') OR (ISSNs = '2077-1444')
+ OR (ISSNs = '2067-2764|2247-6202');
 ```
 
 SQL offers the flexibility of iteratively adding new conditions but you may reach a point where the query is difficult to read and inefficient. For instance, we can use `IN` to improve the query and make it more readable:
@@ -79,7 +80,8 @@ join multiple tables because they represent a good example of using comments in
 SQL to explain more complex queries.*/
 
 -- First we mention all the fields we want to display
-SELECT articles.Title, articles.First_Author, journals.Journal_Title, publishers.Publisher
+SELECT articles.Title, articles.First_Author, journals.Journal_Title,
+ publishers.Publisher
 -- from the first table
 FROM articles
 -- and join it with the second table.
diff --git a/episodes/05-aggregating-calculating.md b/episodes/05-aggregating-calculating.md
index cd24b1b7..5020e9ba 100644
--- a/episodes/05-aggregating-calculating.md
+++ b/episodes/05-aggregating-calculating.md
@@ -72,7 +72,7 @@ For example, we can adapt the last request we wrote to only return information a
 SELECT ISSNs, COUNT(*)
 FROM articles
 GROUP BY ISSNs
-HAVING count(Title) >= 10;
+HAVING COUNT(Title) >= 10;
 ```
 
 The `HAVING` keyword works exactly like the `WHERE` keyword, but uses aggregate functions instead of database fields. When you want to filter based on an aggregation like `MAX, MIN, AVG, COUNT, SUM`, use `HAVING`; to filter based on the individual values in a database field, use `WHERE`.
@@ -94,7 +94,7 @@ but only for the journals with 5 or more citations on average.
 SELECT ISSNs, AVG(Citation_Count)
 FROM articles
 GROUP BY ISSNs
-HAVING AVG(Citation_Count)>=5;
+HAVING AVG(Citation_Count) >= 5;
 ```
 
 :::::::::::::::::::::::::
@@ -106,9 +106,9 @@ HAVING AVG(Citation_Count)>=5;
 In SQL, we can also perform calculations as we query the database. Also known as computed columns, we can use expressions on a column or multiple columns to get new values during our query. For example, what if we wanted to calculate a new column called `CoAuthor_Count`:
 
 ```sql
-SELECT Title, ISSNs, Author_Count -1 as CoAuthor_Count
+SELECT Title, ISSNs, Author_Count - 1 as CoAuthor_Count
 FROM articles
-ORDER BY Author_Count -1 DESC;
+ORDER BY Author_Count - 1 DESC;
 ```
 
 In section [6\. Joins and aliases](06-joins-aliases.md) we are going to learn more about the SQL keyword `AS` and how to make use of aliases - in this example we simply used the calculation and `AS` to represent that the new column is different from the original SQL table data.
diff --git a/episodes/06-joins-aliases.md b/episodes/06-joins-aliases.md
index 9bd30538..71f302f8 100644
--- a/episodes/06-joins-aliases.md
+++ b/episodes/06-joins-aliases.md
@@ -54,7 +54,8 @@ We will cover [relational database design](08-database-design.md) in the next ep
 When joining tables, you can specify the columns you want by using `table.colname` instead of selecting all the columns using `*`. For example:
 
 ```sql
-SELECT articles.ISSNs, journals.Journal_Title, articles.Title, articles.First_Author, articles.Month, articles.Year
+SELECT articles.ISSNs, journals.Journal_Title, articles.Title,
+ articles.First_Author, articles.Month, articles.Year
 FROM articles
 JOIN journals
 ON articles.ISSNs = journals.ISSNs;
@@ -63,7 +64,8 @@ ON articles.ISSNs = journals.ISSNs;
 Joins can be combined with sorting, filtering, and aggregation.
So, if we wanted the average number of authors for articles on each journal, we can use the following query:
 
 ```sql
-SELECT articles.ISSNs, journals.Journal_Title, ROUND(AVG(articles.Author_Count), 2)
+SELECT articles.ISSNs, journals.Journal_Title,
+ ROUND(AVG(articles.Author_Count), 2)
 FROM articles
 JOIN journals
 ON articles.ISSNs = journals.ISSNs
@@ -83,7 +85,7 @@ Write a query that `JOINS` the `articles` and `journals` tables and that returns
 ## Solution
 
 ```sql
-SELECT journals.Journal_Title, count(*), avg(articles.Citation_Count)
+SELECT journals.Journal_Title, COUNT(*), AVG(articles.Citation_Count)
 FROM articles
 JOIN journals
 ON articles.ISSNs = journals.ISSNs
@@ -142,16 +144,18 @@ We can alias both table names:
 ```sql
 SELECT ar.Title, ar.First_Author, jo.Journal_Title
 FROM articles AS ar
-JOIN journals AS jo
+JOIN journals AS jo
 ON ar.ISSNs = jo.ISSNs;
 ```
 
 And column names:
 
 ```sql
-SELECT ar.title AS title, ar.first_author AS author, jo.journal_title AS journal
+SELECT ar.title AS title,
+ ar.first_author AS author,
+ jo.journal_title AS journal
 FROM articles AS ar
-JOIN journals AS jo
+JOIN journals AS jo
 ON ar.issns = jo.issns;
 ```
 
diff --git a/episodes/09-create.md b/episodes/09-create.md
index ffe7da04..47ecd95a 100644
--- a/episodes/09-create.md
+++ b/episodes/09-create.md
@@ -55,10 +55,10 @@ a better definition for the `journals` table would be:
 
 ```sql
 CREATE TABLE "journals" (
- "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
- "ISSN-L" TEXT,
- "ISSNs" TEXT,
- "PublisherId" INTEGER,
+ "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
+ "ISSN-L" TEXT,
+ "ISSNs" TEXT,
+ "PublisherId" INTEGER,
  "Journal_Title" TEXT,
  CONSTRAINT "PublisherId" FOREIGN KEY("PublisherId") REFERENCES "publishers"("id")
 );
@@ -85,7 +85,7 @@ INSERT INTO "journals" VALUES (3,'2076-2616','2076-2616',2,'Animals');
 We can also insert values into one table directly from another:
 
 ```sql
-CREATE TABLE "myjournals"(Journal_Title text, ISSNs text);
+CREATE TABLE "myjournals" (Journal_Title
text, ISSNs text);
 INSERT INTO "myjournals" SELECT Journal_Title, ISSNs FROM journals;
 ```
 
diff --git a/episodes/11-extra-challenges.md b/episodes/11-extra-challenges.md
index 4d9178a4..160dc9b8 100644
--- a/episodes/11-extra-challenges.md
+++ b/episodes/11-extra-challenges.md
@@ -35,7 +35,7 @@ How many `articles` are there from each `First_author`? Can you make an alias fo
 ## Solution 1
 
 ```sql
-SELECT First_Author, COUNT( * ) AS n_articles
+SELECT First_Author, COUNT(*) AS n_articles
 FROM articles
 GROUP BY First_Author
 ORDER BY n_articles DESC;
@@ -56,7 +56,7 @@ How many papers have a single author? How many have 2 authors? How many 3? etc?
 ## Solution 2
 
 ```sql
-SELECT Author_Count, COUNT( * )
+SELECT Author_Count, COUNT(*)
 FROM articles
 GROUP BY Author_Count;
 ```
@@ -77,10 +77,10 @@ language is unknown.
 ## Solution 3
 
 ```sql
-SELECT Language, COUNT( * )
+SELECT Language, COUNT(*)
 FROM articles
 JOIN languages
-ON articles.LanguageId=languages.id
+ON articles.LanguageId = languages.id
 WHERE Language != ''
 GROUP BY Language;
 ```
@@ -101,10 +101,10 @@ number of citations for that `Licence` type?
 ## Solution 4
 
 ```sql
-SELECT Licence, AVG( Citation_Count ), COUNT( * )
+SELECT Licence, AVG(Citation_Count), COUNT(*)
 FROM articles
 JOIN licences
-ON articles.LicenceId=licences.id
+ON articles.LicenceId = licences.id
 WHERE Licence != ''
 GROUP BY Licence;
 ```
@@ -124,12 +124,13 @@ Write a query that returns `Title, First_Author, Author_Count, Citation_Count, M
 ## Solution 5
 
 ```sql
-SELECT Title, First_Author, Author_Count, Citation_Count, Month, Year, Journal_Title, Publisher
+SELECT Title, First_Author, Author_Count, Citation_Count,
+ Month, Year, Journal_Title, Publisher
 FROM articles
 JOIN journals
-ON articles.issns=journals.ISSNs
+ON articles.issns = journals.ISSNs
 JOIN publishers
-ON publishers.id=journals.PublisherId;
+ON publishers.id = journals.PublisherId;
 ```
 
 :::::::::::::::::::::::::
diff --git a/episodes/Bonus_GoodStyle.md b/episodes/Bonus_GoodStyle.md
index 762d5269..ad3752cb 100644
--- a/episodes/Bonus_GoodStyle.md
+++ b/episodes/Bonus_GoodStyle.md
@@ -40,7 +40,8 @@ SELECT articles.Title, articles.First_Author, journals.Journal_Title, publishers
 Into something that looks like this:
 
 ```sql
-SELECT articles.Title, articles.First_Author, journals.Journal_Title, publishers.Publisher
+SELECT articles.Title, articles.First_Author, journals.Journal_Title,
+ publishers.Publisher
 FROM articles
 JOIN journals
 ON articles.ISSNs = journals.ISSNs