MongoDB: An Introduction
By — SOURABH MISHRA
MongoDB, one of the most popular NoSQL databases, is an open-source, document-oriented database. The term ‘NoSQL’ means ‘non-relational’: MongoDB isn’t based on the table-like structure of relational databases but provides an altogether different mechanism for storing and retrieving data. Data is stored as documents in a format called BSON, a binary format similar to JSON.
A simple MongoDB document structure:
by: 'Sourabh Mishra',
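As a rough illustration of what a complete document might look like, here is a minimal Python sketch. Only the by field comes from the fragment above; every other field name and value is a hypothetical example, not the original document.

```python
import json

# Only "by" comes from the article; all other fields are hypothetical.
post = {
    "title": "MongoDB: An Introduction",
    "by": "Sourabh Mishra",
    "tags": ["mongodb", "nosql", "database"],  # arrays are allowed as values
    "likes": 100,
    "comments": [  # a document can nest other documents
        {"user": "reader1", "message": "Nice overview", "likes": 2},
    ],
}

# Round-trip through JSON text to show the document is plain key-value data.
as_json = json.dumps(post, indent=2)
restored = json.loads(as_json)
print(as_json)
```

BSON can also hold types that plain JSON lacks, such as dates and binary data, and MongoDB adds an _id field to a document if one is not supplied.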
Features of MongoDB:
- Document Oriented
- Replication and High Availability
Day 1 summary —
- Indexing in MongoDB speeds up searches by building a separate, sorted structure over the field (or fields) that queries filter on most often. Because the index is kept in sorted order, MongoDB can locate matching documents without examining every document, which reduces execution time.
- Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large datasets and high-throughput operations.
- In a MongoDB cluster we usually keep copies of the same data on several servers, so that if one server goes down the data is still available. These servers are called replicas, and together they form a replica set.
- COLLSCAN (collection scan) is the plan MongoDB uses when no index supports the query, so every document is examined. Once a suitable index exists, the query planner can use IXSCAN (index scan) instead.
- A compound index in MongoDB is a single index built on more than one field.
- In MongoDB, aggregation is done with an aggregation pipeline: a sequence of stages in which each stage processes the documents it receives and passes its output to the next stage, and the final stage returns the aggregated result.
- The Mongo router (mongos) is a program that sits between clients and a sharded cluster and routes each read or write to the shard or shards that hold the relevant data, somewhat like a load balancer.
- In MongoDB, the term cluster refers to either a replica set or a sharded cluster. A sharded cluster is a form of horizontal scaling, where data is distributed across many servers. Its main purpose is to scale reads and writes across multiple shards.
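The indexing and aggregation points from Day 1 can be sketched with pymongo-style calls. This is a minimal sketch under stated assumptions, not a definitive implementation: the field names author and likes are hypothetical, the functions accept any pymongo-style collection object (so nothing here connects to a server on its own), and the exact shape of the explain output varies between server versions.

```python
def add_indexes(collection):
    """Create a single-field index and a compound index.

    Field names are hypothetical. 1 means ascending, -1 means descending.
    """
    single = collection.create_index([("author", 1)])
    compound = collection.create_index([("author", 1), ("likes", -1)])
    return single, compound


def leaf_stage(winning_plan):
    """Follow inputStage links down to the stage that actually reads data."""
    # Assumption: some newer servers nest the plan under "queryPlan".
    plan = winning_plan.get("queryPlan", winning_plan)
    while "inputStage" in plan:
        plan = plan["inputStage"]
    return plan["stage"]


def scan_type(collection, query):
    """Return the leaf stage name, e.g. 'COLLSCAN' or 'IXSCAN', for a query."""
    explain = collection.find(query).explain()
    return leaf_stage(explain["queryPlanner"]["winningPlan"])


def likes_per_author_pipeline(min_likes):
    """Build a three-stage aggregation pipeline (filter, group, sort)."""
    return [
        {"$match": {"likes": {"$gte": min_likes}}},
        {"$group": {"_id": "$author", "total_likes": {"$sum": "$likes"}}},
        {"$sort": {"total_likes": -1}},
    ]


def likes_per_author(collection, min_likes=0):
    """Run the pipeline and collect the aggregated documents."""
    return list(collection.aggregate(likes_per_author_pipeline(min_likes)))
```

Calling scan_type before and after add_indexes on a real collection is one way to watch the plan change from a collection scan to an index scan, provided the query filters on an indexed field.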
Day 2 summary —
- A filesystem is a way of storing files and folders on a storage device such as a hard disk.
- A data model is a way of organizing data: a plan or structure for storing data so that I/O operations perform well.
- When a user performs I/O operations on storage, performance must be fast, otherwise the user experience suffers. As more and more data is written to storage, I/O performance degrades and latency becomes a challenge.
- Latency can be reduced by managing and planning how the data is organized: with the same storage and the same amount of data, a better organization gives better I/O performance. This organization is known as the data model.
- A SQL database organizes data in the form of tables, while a NoSQL database uses a data model chosen according to the requirement and use case.
- There are different types of database servers for different kinds of data. If the data is structured and every record has the same columns or features, we use a SQL database; for unstructured data, where each record may have its own columns, we can use a NoSQL database with no fixed schema.
- To configure the MongoDB server we first need to install it and then add the directory containing its binaries to the PATH, so that we can run its commands from any directory.
- CRUD operations in MongoDB are done with the insert (insertOne/insertMany), find, update (updateOne/updateMany), and delete (deleteOne/deleteMany) functions, respectively.
- To configure Compass, we first install it if it was not already installed with the MongoDB server. Then we give it the URL (connection string) of the server to connect to.
- MongoDB can be accessed in 3 ways:
  - the CLI, with commands,
  - the GUI, through Compass, and
  - the API, from any programming language.
- In this session we installed MongoDB, worked with the CLI and with Compass, and integrated MongoDB with the Python programming language.
- A document-oriented database is one in which the term document is used instead of record, and all operations work on documents. Data is stored only in the form of documents.
- To integrate MongoDB with Python, we first install the pymongo library and then use the MongoClient class from the pymongo module to connect to the database.
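The Python integration and the CRUD operations from Day 2 can be sketched as follows. This is a minimal sketch, assuming a MongoDB server listening on the default local address mongodb://localhost:27017; the database name blog, the collection name posts, and the document fields are hypothetical. pymongo is imported inside the connection helper so the rest of the code can be read and exercised without a server.

```python
def get_collection(uri="mongodb://localhost:27017",
                   db_name="blog", coll_name="posts"):
    """Connect with MongoClient and return (client, collection).

    The URI, database name, and collection name are assumptions.
    """
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(uri)
    return client, client[db_name][coll_name]


def crud_round_trip(posts):
    """Create, update, read, and delete one document in `posts`."""
    # Create: insert_one returns a result carrying the new document's _id.
    created = posts.insert_one(
        {"title": "Hello MongoDB", "by": "Sourabh Mishra", "likes": 0}
    )
    doc_id = created.inserted_id

    # Update: $set changes only the named field.
    posts.update_one({"_id": doc_id}, {"$set": {"likes": 1}})

    # Read: find_one returns the matching document as a dict (or None).
    found = posts.find_one({"_id": doc_id})

    # Delete: delete_one reports how many documents were removed.
    removed = posts.delete_one({"_id": doc_id})
    return found, removed.deleted_count


# Usage against a running server (not executed here):
#   client, posts = get_collection()
#   try:
#       print(crud_round_trip(posts))
#   finally:
#       client.close()
```

Passing the collection in as an argument keeps the CRUD steps separate from the connection details, so the same function works against a local server or a remote connection string.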