qop's notes: Using MongoDB's find() and aggregate()

2020-04-26

Using MongoDB's find() and aggregate()

The official documentation for db.collection.find() is at https://docs.mongodb.com/manual/reference/method/db.collection.find/.

First, import books.json from the datasets directory of https://github.com/ozlerhakan/mongodb-json-files into the test database:

$ mongoimport --db test --collection books --file ~/downloads/books.json
2020-04-26T08:21:27.742+0800 connected to: mongodb://localhost/
2020-04-26T08:21:29.561+0800 431 document(s) imported successfully. 0 document(s) failed to import.

Start with filtering. The following commands are run in the mongo shell:

> use test

> db.books.findOne()  // fetch one document to get a rough idea of the structure
{
 "_id" : 3,
 "title" : "Specification by Example",
 "isbn" : "1617290084",
 "pageCount" : 0,
 "publishedDate" : ISODate("2011-06-03T07:00:00Z"),
 "thumbnailUrl" : "https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.book-thumb-images/adzic.jpg",
 "status" : "PUBLISH",
 "authors" : [
  "Gojko Adzic"
 ],
 "categories" : [
  "Software Engineering"
 ]
}

> db.books.find({isbn: 1617290084})  // no match: isbn is stored as a string, not a number
> db.books.find({isbn: '1617290084'})  // matches the document shown above

// categories is an array; filtering with a string matches every document whose array contains that string
> db.books.find({categories: 'Software Engineering'})

// Count the matching documents
> db.books.find({categories: 'Software Engineering'}).count()

// Return only the first two documents
> db.books.find({categories: 'Software Engineering'}).limit(2)

// Skip 3 documents, then take 5
> db.books.find({categories: 'Software Engineering'}).skip(3).limit(5)

// categories is an array; filtering with an array requires an exact match (same elements, in the same order)
> db.books.find({categories: ['Software Engineering']})

// These two queries return different results
> db.books.find({categories: ['Java', 'Software Engineering']})
> db.books.find({categories: ['Software Engineering', 'Java']})
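
The difference between filtering an array field by a string and by an array can be modeled in plain JavaScript. This is only a rough sketch of the behavior described above, not how MongoDB implements it, and it ignores nested arrays. The function name matchesCategories, the helper ids, and the sample docs are made up for illustration.

```javascript
// Simplified model of an equality filter on the categories array field.
// Not MongoDB's implementation; nested arrays are ignored.
function matchesCategories(doc, query) {
  const field = doc.categories;
  if (Array.isArray(query)) {
    // Array query: same length, same elements, same order.
    return field.length === query.length &&
      field.every((value, i) => value === query[i]);
  }
  // String query: one matching element is enough.
  return field.includes(query);
}

// Hypothetical sample data, not taken from books.json.
const docs = [
  { _id: 1, categories: ['Java', 'Software Engineering'] },
  { _id: 2, categories: ['Software Engineering', 'Java'] },
  { _id: 3, categories: ['Software Engineering'] },
];

// Returns the _id of every document that matches the query.
const ids = (query) =>
  docs.filter((doc) => matchesCategories(doc, query)).map((doc) => doc._id);

console.log(ids('Software Engineering'));           // ids 1, 2, 3
console.log(ids(['Java', 'Software Engineering'])); // id 1 only
console.log(ids(['Software Engineering', 'Java'])); // id 2 only
console.log(ids(['Software Engineering']));         // id 3 only
```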

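// Match the value at a specific array position (zero-based, so categories.1 is the second element)
> db.books.find({'categories.1': 'Java'})
> db.books.find({'categories.1': 'Software Engineering'})
// Embedded documents use the same dot notation, e.g. {'name.first': 'Bill'}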

// $all: the array contains all of the values (AND, order does not matter)
> db.books.find({categories: { $all: ['Software Engineering', 'Java']}})

// $in: the array contains any of the values (OR)
> db.books.find({categories: { $in: ['Software Engineering', 'XML']}})

// $nin: the array contains none of the values (documents without the field also match)
> db.books.find({categories: {$nin: ['Java']}}).count()

// Contains 'Software Engineering' but not 'Java'
> db.books.find({categories: { $in: ['Software Engineering'], $nin: ['Java']}})
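
As a way to reason about $all, $in and $nin, here is a small plain-JavaScript model of how they behave on an array field. It is a sketch under simplifying assumptions (string values only, no regular expressions, no empty operand arrays), not MongoDB's actual matching code. The names operators, matches, idsFor and the sample data are hypothetical.

```javascript
// Simplified model of $all, $in and $nin against the categories array field.
const operators = {
  $all: (field, values) => values.every((v) => field.includes(v)),
  $in: (field, values) => values.some((v) => field.includes(v)),
  $nin: (field, values) => !values.some((v) => field.includes(v)),
};

// All operators in one condition object must hold (implicit AND).
function matches(doc, condition) {
  // A missing field is treated as an empty array, so $nin still matches it.
  const field = doc.categories || [];
  return Object.entries(condition).every(([op, values]) =>
    operators[op](field, values));
}

// Hypothetical sample data, not taken from books.json.
const sample = [
  { _id: 1, categories: ['Software Engineering', 'Java'] },
  { _id: 2, categories: ['Software Engineering'] },
  { _id: 3, categories: ['XML'] },
  { _id: 4 }, // no categories field at all
];

const idsFor = (condition) =>
  sample.filter((doc) => matches(doc, condition)).map((doc) => doc._id);

console.log(idsFor({ $all: ['Software Engineering', 'Java'] }));     // id 1
console.log(idsFor({ $in: ['Software Engineering', 'XML'] }));       // ids 1, 2, 3
console.log(idsFor({ $nin: ['Java'] }));                             // ids 2, 3, 4
console.log(idsFor({ $in: ['Software Engineering'], $nin: ['Java'] })); // id 2
```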

// Using a regular expression
> db.books.find({categories: /Software/})  // array field: matches if any element matches
> db.books.find({title: /Practice/})  // string field

// Documents that do not have a given field
> db.books.find({publishedDate: {$exists: false} }).count()
// Documents that have a given field
> db.books.find({publishedDate: {$exists: true} }).count()

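// Compare a field of an embedded document: greater than, equal to, less than, not equal to,
// greater than or equal to, less than or equal to, and a range.
// These examples query a separate inventory collection with a size.h field, not the imported books collection.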
> db.inventory.find({'size.h': {$gt: 10}})
> db.inventory.find({'size.h': {$eq: 10}})
> db.inventory.find({'size.h': {$lt: 10}})
> db.inventory.find({'size.h': {$ne: 10}})
> db.inventory.find({'size.h': {$gte: 10}})
> db.inventory.find({'size.h': {$lte: 10}})
> db.inventory.find({'size.h': {$gt: 9, $lt: 15}})

// Dates
> db.books.find({publishedDate: {$gte: new Date('2014-01-01')} })
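
One detail worth checking with dates is what new Date('2014-01-01') actually means. In JavaScript a date-only string in this form is parsed as midnight UTC, which is the boundary the $gte comparison uses. The snippet below is a plain JavaScript sketch (it does not talk to MongoDB); isOnOrAfter is a hypothetical helper, and the published value is the publishedDate of the document shown by findOne() above.

```javascript
// A date-only ISO string is parsed as midnight UTC.
const cutoff = new Date('2014-01-01');
console.log(cutoff.toISOString()); // 2014-01-01T00:00:00.000Z

// publishedDate of the sample document returned by findOne().
const published = new Date('2011-06-03T07:00:00Z');

// Hypothetical helper mirroring what {$gte: cutoff} checks for one value.
const isOnOrAfter = (date, bound) => date.getTime() >= bound.getTime();

console.log(isOnOrAfter(published, cutoff)); // false, so this book is filtered out
```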

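Projection selects which fields are returned: give a field 1 to include it and 0 to exclude it. Either pass the projection as the second argument of find(), or use aggregate().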

// Show only title; _id is included by default, so it is excluded explicitly
// Using the second argument of find()
> db.books.find({}, {title:1, _id:0})
> db.books.find({publishedDate: {$gte: new Date('2014-01-01')} }, {title:1, publishedDate:1})
// Using aggregate()
> db.books.aggregate([{$project: {title:1, _id:0}}])
> db.books.aggregate([{
        $match: {
            publishedDate: { $gte: new Date('2014-01-01') }
        }
    },{
        $project: {title:1, publishedDate:1}
    }])

// Hide the listed fields, using the second argument of find()
> db.books.find({}, {shortDescription: 0, longDescription: 0}).limit(5)
// Hide the listed fields, using aggregate()
> db.books.aggregate([{$project: {shortDescription: 0, longDescription: 0}}, {$limit: 5}])
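
The include/exclude rules can be summarized with a small model in plain JavaScript. It is a sketch of the behavior for top-level fields only, not MongoDB's implementation: a projection is either an inclusion list or an exclusion list, the two cannot be mixed, and _id is the one field that may be switched off in an inclusion list. The function project and the shortDescription text are made up for illustration.

```javascript
// Simplified model of a projection on top-level fields.
function project(doc, spec) {
  const { _id: idFlag = 1, ...rest } = spec; // _id is included by default
  const flags = Object.values(rest);
  if (flags.includes(1) && flags.includes(0)) {
    throw new Error('cannot mix inclusion and exclusion (except for _id)');
  }
  const inclusion = flags.includes(1);
  const result = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key === '_id') {
      if (idFlag) result._id = value;
    } else if (inclusion ? rest[key] === 1 : rest[key] !== 0) {
      result[key] = value;
    }
  }
  return result;
}

// Based on the document shown earlier; shortDescription is a placeholder value.
const book = {
  _id: 3,
  title: 'Specification by Example',
  isbn: '1617290084',
  shortDescription: 'placeholder text',
};

console.log(project(book, { title: 1, _id: 0 }));     // only title
console.log(project(book, { title: 1 }));             // _id and title
console.log(project(book, { shortDescription: 0 }));  // everything except shortDescription
```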

Sorting is specified per field: 1 for ascending, -1 for descending.

> db.books.aggregate([{
        $match: {
            categories: 'XML'
        }
    }, {
        $project: {title:1, publishedDate:1}
    }, {
        $sort: {publishedDate: -1}
    }])
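
To make the order of the stages concrete, the same match, project, sort sequence can be imitated on an in-memory array in plain JavaScript. This is only an analogy for how documents flow from one stage to the next, not how the aggregation pipeline is executed. The array books and its contents are hypothetical sample data.

```javascript
// Hypothetical sample data, not taken from books.json.
const books = [
  { _id: 1, title: 'Book A', categories: ['XML'], publishedDate: new Date('2003-05-01') },
  { _id: 2, title: 'Book B', categories: ['Java'], publishedDate: new Date('2010-01-15') },
  { _id: 3, title: 'Book C', categories: ['XML', 'Java'], publishedDate: new Date('2008-09-30') },
];

const result = books
  // $match: keep documents whose categories array contains 'XML'
  .filter((book) => book.categories.includes('XML'))
  // $project: keep title and publishedDate (_id stays by default)
  .map(({ _id, title, publishedDate }) => ({ _id, title, publishedDate }))
  // $sort with -1: newest publishedDate first
  .sort((a, b) => b.publishedDate.getTime() - a.publishedDate.getTime());

console.log(result.map((book) => book.title)); // Book C, then Book A
```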

More complex operations are all handled by aggregate().
