1200字范文 > 如何确定MongoDB是否适合您

如何确定MongoDB是否适合您

时间：2023-03-31 09:12:35

相关推荐

如何确定MongoDB是否适合您

by Luc Claustres

卢克·克劳斯特雷斯(Luc Claustres)

如何确定MongoDB是否适合您 (How to decide if MongoDB is right for you)

For the past couple of years, I built web applications around MongoDB. In this short article I would like to answer some of the recurrent questions or misunderstanding most developers have when evaluating it:

在过去的几年中，我围绕MongoDB构建了Web应用程序。在这篇简短的文章中，我想回答大多数开发人员在评估它时经常遇到的问题或误解：

What is the licensing?什么是许可？ What does it mean MongoDB is a NoSQL database?MongoDB是NoSQL数据库是什么意思？ What about MongoDB performances?MongoDB的性能如何？

发牌 (Licensing)

Yes, MongoDB is licensed under Free Software Foundation’sGNU AGPL v3.0. Practically, this means that enhancements you make to MongoDB must be released to the community. The source code of any derived work has to be distributed as well.

是的，MongoDB已获得Free Software Foundation的GNU AGPL v3.0的许可。实际上，这意味着必须将对MongoDB的增强功能发布给社区。任何派生作品的源代码也必须分发。

You might wonder if your application is a derived work. I must confess I never found a simple definition of such a term. However, in the specific case of MongoDB, they simply recognize that applications using their database are a separate work. Moreover, their supported drivers are released under Apache License v2.0. This is a permissive license. It does not require you to publish your source code, and your application usually only talks to MongoDB using a driver.

您可能想知道您的应用程序是否是派生作品。我必须承认，我从未找到过这个术语的简单定义。但是，在MongoDB的特定情况下，他们只是意识到使用数据库的应用程序是一项单独的工作。此外，它们受支持的驱动程序是根据Apache License v2.0发布的。这是许可的许可证。它不需要您发布源代码，并且您的应用程序通常仅使用驱动程序与MongoDB对话。

As a consequence, you don’t need to be concerned with the licensing of MongoDB to build your app around it. They even send signed letters asserting the promise to legal departments if there are questions. They also provide commercial licenses if the signed letter isn’t enough.

因此，您无需担心MongoDB围绕其构建应用程序的许可。如果有疑问，他们甚至会向法律部门发送已签署承诺的承诺书。如果签署的信还不够，他们还提供商业许可证。

Note: although a great deal of experience make me trust this analysis, I am not a lawyer. The view presented here is my personal understanding and is not an official one.

注意：尽管有很多经验使我相信此分析，但我不是律师。这里提出的观点是我个人的理解，不是官方的观点。

NoSQL (NoSQL)

Yes, MongoDB is a NoSQL database. This word can be pretty confusing. I will try to analyze the most common ideas with a focus on how this applies to MongoDB.

是的，MongoDB是一个NoSQL数据库。这个词可能会令人困惑。我将尝试分析最常见的想法，重点是如何将其应用于MongoDB。

面向文件 (Document-oriented)

In traditional SQL databases, data is arranged in the form of tables and rows. Each row has a fixed number of columns that can only store data of a specific type (e.g., Integer, Text, Datetime). This defines theschemaof your data.

在传统SQL数据库中，数据以表和行的形式排列。每行具有固定数量的列，这些列只能存储特定类型的数据(例如，整数，文本，日期时间)。这定义了数据的架构。

In MongoDB, data is stored in the form of BSON objects organized intocollections.Data isusually handled in the form of JSON objects. This makes mapping objects into the database a simple task,normally eliminating anything similar to anobject-relational mapping.

在MongoDB中，数据以BSON对象的形式存储到集合中。数据是通常以JSON对象的形式处理。这使得将对象映射到数据库成为一项简单的任务，通常消除了类似于对象关系映射的任何操作。

交易性 (Transactional)

Prior to v4, MongoDB provided only document-wide transactions. Writes were never partially applied to an inserted or updated document. The operation was atomic in the sense that it either fails or succeeds. For the document in its entirety, it was said to be ACID at the document level. As a consequence, there was no possibility of atomic changes that span multiple documents. You had to emulate the required database transactions (e.g. using 2 phase commit).

在v4之前，MongoDB仅提供文档范围的交易。写入从未部分应用于插入或更新的文档。从失败或成功的意义上讲，该操作是原子的。对于整个文档，据说在文档级别是ACID 。结果，不可能跨越多个文档进行原子更改。您必须模拟所需的数据库事务(例如，使用2期commit )。

Since v4, MongoDB supports multi-document ACID transactions, making it the only open source database to combine the document model with ACID guarantees.

从v4开始，MongoDB支持多文档ACID事务，使其成为唯一将文档模型与ACID保证相结合的开源数据库。

无模式(真的吗？) (Schema-less (really?))

This means you don’t have to tell the database the structure of your data and the primitive types to be used before being able to manage it. This also means you can mix documents that have different structures in the same collection of data.

这意味着在能够管理数据库之前，不必告诉数据库数据的结构和要使用的原始类型。这也意味着您可以在同一数据集合中混合使用具有不同结构的文档。

One of the great benefits is thatschema migrations become easier(most of the adjustments to the database are transparent and automatic). Rollback is unlikely to cause problems. Another advantage is thatdynamically extending existing data models with custom attributes at runtime is straightforward.

最大的好处之一就是架构迁移变得更加容易(对数据库的大多数调整都是透明且自动的)。回滚不太可能引起问题。另一个优点是，在运行时动态扩展具有自定义属性的现有数据模型非常简单。

Butall this does not mean you don’t have any schema at all. If it is notexplicitlydeclared, it shinesimplicitlyfrom your application logic. It might be declared in other ways to handle form/data validation. Anyway, you still have to explicitly tell the database how to create indices to ensure good performance.

但这一切并不意味着您根本没有任何架构。如果未显式声明，则它在应用程序逻辑中隐式闪耀。可以通过其他方式声明它来处理表单/数据验证。无论如何，您仍然必须明确告诉数据库如何创建索引以确保良好的性能。

Indeed, schema design is the cornerstone of making awesome databases, whether SQL or not. If you do not understand your data and the limitations of hardware and software, you can not effectively design a schema.

实际上，无论是否使用SQL，架构设计都是制作出色数据库的基石。如果您不了解数据以及硬件和软件的限制，则无法有效地设计架构。

非关系性(真的吗？) (Non-relational (really?))

This means that you don’t have to always create a relation between two documents to handle aggregated data structures.

这意味着您不必总是在两个文档之间创建关系来处理聚合的数据结构。

Indeed, in relational databases, the SQL JOIN clause allows you to combine rows from two or more tables using a common field between them. Document-oriented databases such as MongoDB are designed to storedenormalizeddata. Ideally, there should be no relationship between collections: if the same data is required in two or more documents, it must be repeated. One of the great benefits is that asingle read operationis required to get all data.

实际上，在关系数据库中，SQL JOIN子句允许您使用两个或多个表之间的公共字段来合并它们之间的行。面向文档的数据库(例如MongoDB)旨在存储非规范化数据。理想情况下，集合之间应该没有关系：如果两个或多个文档中需要相同的数据，则必须重复该数据。一大好处是，只需一次读取操作即可获取所有数据。

But you can still create relations and refer to another document if you’d like or have the need:

但是，如果您愿意或需要，您仍然可以创建关系并引用另一个文档：

by ID, then you can “populate” it manually with a second query or using DBRefs

按ID，然后您可以通过第二个查询或使用DBRef手动“填充”它

by any other field, then you can use the$lookupoperator

通过其他任何字段，则可以使用$lookup运算符

This makes MongoDB really flexible and allows you to choose how to handle the relations between your objectson a case-by-case basis.

这使得MongoDB的确实灵活，并允许您选择如何处理对案件逐案基础上的对象之间的关系。

性能 (Performance)

读/写 (Read/Write)

Yes, MongoDB like any other “true” database is made to handle a huge volume of data.In a nutshell, hundreds or thousands of objects is nothing for a database,so you don’t have to worry if you have such numbers. You can find a lot of benchmarks around. Here is a simple one to give you some rough order of magnitude. The documents stored are really simple and typically represent a time-stamped measurement:

是的，MongoDB像其他“真正的”数据库一样，可以处理大量数据。简而言之，成百上千的对象对于数据库来说是没有用的，因此您不必担心是否有这样的数字。您可以找到很多基准。这是一个简单的方法，可以为您提供大致的数量级。存储的文档非常简单，通常代表带时间戳的度量：

{ value: random(0,100), timestamp: date}

Because of the way MongoDB delegates memory management to the operating system, having more complex documents (typically containing tens of attributes) does not affect results significantly

由于MongoDB将内存管理委托给操作系统的方式，拥有更复杂的文档(通常包含数十个属性)不会显着影响结果

Both attributes have been indexed. MongoDB automatically adds and indexes the document unique ID. I tested three requests:

这两个属性均已建立索引。 MongoDB自动添加并为文档唯一ID编制索引。我测试了三个请求：

find the maximum value of the collection using the aggregation framework

使用聚合框架找到集合的最大值

find the 100 greatest values greater than 99.9找到大于99.9的100个最大值 get a single document by ID通过ID获取单个文档

The “maximum request” is not benefitting from indexes because of the aggregation, while the “greater than” and “by ID” requests can use it. You will see how this is important for performance.

由于聚合，“最大请求”无法从索引中受益，而“大于”和“按ID”请求可以使用它。您将看到这对性能至关重要。

The test configuration was MongoDB 3.4.1 64 bits — OS Windows 7 Pro SP1 — CPU Core i7–4712HQ 2.3GHz — 16Go RAM—SSD HD, and the test results were the following:

测试配置为MongoDB 3.4.1 64位— OS Windows 7 Pro SP1 — CPU Core i7–4712HQ 2.3GHz — 16Go RAM —SSD HD，测试结果如下：

So if you build the correct indices querying a billion documents, it is still performant enough for most applications on a single server. If needed, you can increase performance using sharding.

因此，如果您构建查询十亿个文档的正确索引，则对于单个服务器上的大多数应用程序而言，它仍然具有足够的性能。如果需要，您可以使用分片来提高性能。

Here are the scripts used to create/query the database for this test:

以下是用于为此测试创建/查询数据库的脚本：

And the run commands:

和运行命令：

// Launch server./mongod --dbpath "C:\Program Files\MongoDB\Server\3.4\data" --port 27018// Insertion exemple for 10e7./mongo --port 27018 --eval "var arg1=10000000" create_collection.js// Requests./mongo --port 27018 --eval "" query_collection.js

记忆 (Memory)

Yes, MongoDB often looks like it uses all available RAM. It actually relies on different storage engines. WiredTiger is the default starting in MongoDB 3.2, and MMAPv1 is the default for MongoDB versions before 3.2. However, they work pretty similarly. Via the file system cache, theyautomatically use all free memory that is not used by the engine cache or by other processes. And this is coherent if you’d like to have maximum performances.

是的，MongoDB通常看起来会使用所有可用的RAM。它实际上依赖于不同的存储引擎。 WiredTiger是默认MongoDB中3.2出发， MMAPv1是3.2之前的MongoDB版本中的默认。但是，它们的工作原理非常相似。通过文件系统缓存，它们自动使用引擎缓存或其他进程未使用的所有可用内存。如果您想获得最佳性能，这是一致的。

So system resource monitors often show that MongoDB uses a lot of memory,but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.

因此，系统资源监视器通常显示MongoDB使用大量内存，但是其使用情况是动态的。如果另一个进程突然需要服务器一半的RAM，则MongoDB会将缓存的内存提供给另一个进程。

As a consequence, the single parameter you can tune to optimize memory usage is the engine cache size. For example, by default, the WiredTiger engine uses 50% of RAM minus 1 GB, which can be pretty large on servers with a lot of memory. This can even cause some trouble if you use containers with limited memory, so simply find out the right balance for your use case.

因此，您可以调整以优化内存使用的单个参数是引擎缓存大小。例如，默认情况下，WiredTiger引擎使用50％的RAM减去1 GB，这在具有大量内存的服务器上可能很大。如果使用内存有限的容器，这甚至可能会造成一些麻烦，因此只需为您的用例找到合适的平衡。

结论 (Conclusion)

I hope you know have a more precise idea of the benefits provided by MongoDB if it suits your needs. Recently, MongoDB has started a Database as a Service offer called MongoDB Atlas that might be useful for you to test out.

我希望您对MongoDB所提供的好处有一个更精确的了解(如果它适合您的需求)。最近，MongoDB已启动名为MongoDB Atlas的数据库即服务产品，可能对您进行测试很有用。

If you liked this article feel free to have a look at our Open Source solutions, the Kalisio team !

如果您喜欢本文，请随时查看我们的开源解决方案 Kalisio团队！