Keith VanderLinden
Calvin University
Database Management Systems (DBMS) provide:
Data scientists benefit from being able to manage data stored in a variety of data systems:
Relational
Non-Relational
This course will focus on non-relational database systems.
A MongoDB database is a set of collections, each of which is a set of documents. Here’s a collection of three documents.
[
{
"name": "Halley's Comet",
"officialName": "1P/Halley",
"specs": {"orbitalPeriod": 75, "radius": 3.4175, "mass": 2.2e14 }
},
{
"name": "Wild2",
"officialName": "81P/Wild",
"specs": {"orbitalPeriod": 6.41, "radius": 1.5534, "mass": 2.3e13 }
},
{
"name": "Comet Hyakutake",
"officialName": "C/1996 B2",
"specs": {"orbitalPeriod": 17000, "radius": 0.77671, "mass": 8.8e12 }
}
]
Aggregation pipelines are a more flexible way to query data.
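To make the idea of a pipeline concrete, here is a minimal plain-Python sketch that passes the three comet documents through two stages in sequence. It does not use MongoDB at all: `match_stage` and `project_stage` are hypothetical helper names that only approximate what `$match` and `$project` stages would do, and the threshold of 100 years is an arbitrary choice for illustration.

```python
# Plain-Python stand-in for a two-stage aggregation pipeline.
# No MongoDB server is needed; the helpers below are illustrative only.
comets = [
    {"name": "Halley's Comet", "officialName": "1P/Halley",
     "specs": {"orbitalPeriod": 75, "radius": 3.4175, "mass": 2.2e14}},
    {"name": "Wild2", "officialName": "81P/Wild",
     "specs": {"orbitalPeriod": 6.41, "radius": 1.5534, "mass": 2.3e13}},
    {"name": "Comet Hyakutake", "officialName": "C/1996 B2",
     "specs": {"orbitalPeriod": 17000, "radius": 0.77671, "mass": 8.8e12}},
]


def match_stage(docs, max_period):
    """Keep documents whose nested specs.orbitalPeriod is below max_period.

    Rough analogue of {"$match": {"specs.orbitalPeriod": {"$lt": max_period}}}.
    """
    return [d for d in docs if d["specs"]["orbitalPeriod"] < max_period]


def project_stage(docs):
    """Reshape each document to a name and a flattened mass field.

    Rough analogue of {"$project": {"_id": 0, "name": 1, "mass": "$specs.mass"}}.
    """
    return [{"name": d["name"], "mass": d["specs"]["mass"]} for d in docs]


# The output of one stage is the input of the next.
short_period = project_stage(match_stage(comets, 100))
for doc in short_period:
    print(doc)
```

Against a real collection (assumed here to be a PyMongo collection named `comets`), the two stages in the docstrings would be passed as a list to `comets.aggregate(...)`, in the same way the `$lookup` stage is passed to `orders.aggregate` in these slides.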
orders.insert_many( [
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }
] )
inventory.insert_many( [
{ "_id" : 1, "sku" : "almonds", "in-stock" : 120 },
{ "_id" : 3, "sku" : "cashews", "in-stock" : 60 },
{ "_id" : 4, "sku" : "pecans", "in-stock" : 70 }
] )
orders.aggregate([
{ "$lookup": {
"from": "inventory",
"localField": "item",
"foreignField": "sku",
"as": "inventory_documents"
}
} ])
   _id     item  price  quantity  inventory_documents
0    1  almonds     12         2  [{'_id': 1, 'sku': 'almonds', 'in-stock': 120}]
1    2   pecans     20         1  [{'_id': 4, 'sku': 'pecans', 'in-stock': 70}]
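The `$lookup` stage above behaves like a left outer join: every order is kept, and the matching inventory documents are attached as a list. The following plain-Python sketch reproduces that behavior on the same data without a MongoDB server. The function `lookup` and its parameter names are hypothetical, chosen only to mirror the stage's options; this is an approximation of the semantics, not MongoDB's implementation.

```python
# Plain-Python approximation of the $lookup stage shown above.
orders = [
    {"_id": 1, "item": "almonds", "price": 12, "quantity": 2},
    {"_id": 2, "item": "pecans", "price": 20, "quantity": 1},
]
inventory = [
    {"_id": 1, "sku": "almonds", "in-stock": 120},
    {"_id": 3, "sku": "cashews", "in-stock": 60},
    {"_id": 4, "sku": "pecans", "in-stock": 70},
]


def lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Attach to each local document the list of matching foreign documents."""
    joined = []
    for doc in local_docs:
        matches = [
            f for f in foreign_docs
            if f.get(foreign_field) == doc.get(local_field)
        ]
        # Every local document is kept, even when matches is empty.
        joined.append({**doc, as_field: matches})
    return joined


result = lookup(orders, inventory, "item", "sku", "inventory_documents")
for row in result:
    print(row)
```

Note that the cashews inventory document appears in no row, because no order refers to it, while an order for an item missing from the inventory would still appear, with an empty `inventory_documents` list.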