```{R}
library(tidyverse)
library(HumanitiesDataAnalysis)
```

(Optional) Change a SQL code block to count something other than title.

```{R}
SELECT title, COUNT(*) AS N
FROM names GROUP BY title
ORDER BY N DESC
LIMIT 8

```

The following code in SQL does a simple select.

```{R}
SELECT title, COUNT(*) AS N
FROM names GROUP BY pid
ORDER BY N DESC
LIMIT 8;
```

Rewrite it in R. The code below takes part of preparing the table itself
for work, which corresponds to “FROM names” above; the rest, you’ll have
to do yourself and put into a sensible order.

<div class="cell"
hash="Merges_and_joins_cache/json/unnamed-chunk-32_b8efa5f19a1c2482710de95d8b8d3a24">

```{R}
pleiades = pleiades_db()

pleiades_db() |> 
  tbl("names") |>
  collect() |>
  KEEP_GOING()
```


```{R}
pleiades = pleiades_db()

pleiades_db() |> 
  tbl("names") |>
  collect() |>
  KEEP_GOING()
```

</div>

We can use joins and nests to make datasets much larger. We’ve looked a
bit before at the “subjects” field in our books dataset. Now we’ll split
it up using an unnest operation.

```{R}

all_subjects = books |>
  mutate(subject = str_split(subjects, " -- |///")) |>
  select(lccn, subject)

all_subjects |> head(3)

all_subjects |> head(10) |> unnest(subject)

```

There are standard census divisions in the US for representing states as
part of *regions*. You can find these easily online: the code below
reads one in.

Using an inner join and a summarize function, plot the urban population
by *region* or *division*. For example, you might have four lines:
North, South, East, and West.

```{R}

divisions = read_csv("https://raw.githubusercontent.com/cphalpert/census-regions/master/us%20census%20bureau%20regions%20and%20divisions.csv")

long_cities = CESTA |>
  pivot_longer(`1790`:`2010`, names_to = "year", values_to = "population")

```


```{R}

divisions = read_csv("https://raw.githubusercontent.com/cphalpert/census-regions/master/us%20census%20bureau%20regions%20and%20divisions.csv")

long_cities = CESTA |>
  pivot_longer(`1790`:`2010`, names_to = "year", values_to = "population")

```