For our work with databases, we’ll be using three tools:
Database design/theory is a much larger topic. In fact it is an entire course (INST 327)! Here are a just a few database systems you may encounter:
Basic database operations are called “CRUD”:
# Next, to view the data, use a SELECT query
sq = '''SELECT title FROM books'''
# Execute the query as before, appending fetchall()
# which assigns results to 'books' variable
books = cursor.execute(sq).fetchall()
print(books)
[('2001: A Space Odyssey',),
('I, Robot',),
('The Martian',),
('The Left Hand Of Darkness',)]
The data in this CSV file (books.csv) consists of a list of titles, authors, and dates of important works of fiction. The data are similar to the data used in the above examples.
The first task is to create a program that can read the data in the attached file and load it into a database.
title | author | year |
---|---|---|
Things Fall Apart | Chinua Achebe | 1958 |
Chimera | John Barth | 1972 |
The Sot-Weed Factor | John Barth | 1960 |
Under the Volcano | Malcolm Lowery | 1947 |
id | title | author | year |
---|---|---|---|
1 | Things Fall Apart | Chinua Achebe | 1958 |
2 | Chimera | John Barth | 1972 |
3 | The Sot-Weed Factor | John Barth | 1960 |
4 | Under the Volcano | Malcolm Lowry | 1947 |
id | title | author_id | year |
---|---|---|---|
1 | Things Fall Apart | 1 | 1958 |
2 | Chimera | 2 | 1972 |
3 | The Sot-Weed Factor | 2 | 1960 |
4 | Under the Volcano | 3 | 1947 |
id | name |
---|---|
1 | Chinua Achebe |
2 | John Barth |
3 | Malcolm Lowry |
Consider again the bibliographic database, note that there are multiple titles in the attached file written by a single author. In order to normalize this data, the author names should be moved into their own table and related to the book data through a relationship.
How can the authors data be related to the book titles? Can you create a program that will manage the normalization process at load time?