MongoDB

From ArchWiki

MongoDB (from humongous) is a source-available document-oriented database system developed and supported by MongoDB Inc. (formerly 10gen). It is part of the NoSQL family of database systems. Instead of storing data in tables as a "classical" relational database does, MongoDB stores structured data as JSON-like documents with dynamic schemas (MongoDB calls the format BSON), which makes integrating data in certain types of applications easier and faster.

Installation

MongoDB has been removed from the official repositories because of its relicensing under the Server Side Public License (SSPL) [1].

PKGBUILDs are provided in the AUR:

  • mongodb (AUR) - builds from source, requires 180GB+ of free disk space, and may take several hours to build (e.g. 6.5 hours on an Intel i7, 1 hour on 32 Xeon cores with high-end NVMe).
  • mongodb-bin (AUR) - prebuilt MongoDB binary extracted from the official MongoDB Ubuntu repository packages. The compilation options used are unknown.

Install the tools (mongoimport, mongoexport, mongodump, mongorestore, among others) using the AUR PKGBUILD that corresponds to the main PKGBUILD you chose.

Usage

Start/Enable the mongodb.service daemon.

Note: During its first startup, MongoDB preallocates space by creating large files (for its journal and other data). This step may take a while, and the MongoDB shell is unavailable until it finishes.
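
A script that needs the shell right after the first start could poll until the server answers. The following is a minimal sketch, not something the packages provide: wait_until is a hypothetical helper name, and the commented-out usage assumes that mongosh exits with a non-zero status while it cannot connect.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds, sleeping DELAY
# seconds after each failed try. Returns 0 on the first success and
# 1 if all ATTEMPTS tries failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (commented out so that sourcing this file has no side effects):
# try for up to 60 * 5 seconds to ping the local server.
# wait_until 60 5 mongosh --quiet --eval 'db.runCommand({ ping: 1 })'
```

The attempt count and delay are arbitrary; a large database may need a much longer budget.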

To access the MongoDB shell [2]:

$ mongosh

Or, if authentication is configured:

$ mongosh -u userName

Configuration

File Format

MongoDB uses a YAML-based configuration file format. See https://docs.mongodb.com/manual/reference/configuration-options/ for available configuration options.

/etc/mongodb.conf
systemLog:
   destination: file
   path: "/var/log/mongodb/mongod.log"
   logAppend: true
storage:
   journal:
      enabled: true
processManagement:
   fork: true
net:
   bindIp: 127.0.0.1
   port: 27017
setParameter:
   enableLocalhostAuthBypass: false
..

Requiring Authentication

Warning: By default, MongoDB does not require any authentication. Although MongoDB only listens on the localhost interface by default, this still allows any local user to connect without authenticating and may expose the database(s). It is recommended to enable access control to prevent unwanted access.

To create a MongoDB user account with administrator access [3]:

$ mongosh
use admin
db.createUser(
  {
    user: "myUserAdmin",
    pwd: passwordPrompt(), // prompt for the password instead of typing it in clear text
    roles: [ { role: "userAdminAnyDatabase", db: "admin" }, "readWriteAnyDatabase" ]
  }
)

Append the following to your /etc/mongodb.conf.

/etc/mongodb.conf
security:
  authorization: "enabled"

Restart mongodb.service.

NUMA

Running MongoDB on a system with Non-Uniform Memory Access (NUMA) can significantly degrade performance. [4]

To see if your system uses NUMA:

# dmesg | grep -i numa

Also, /var/log/mongodb/mongod.log will show warnings if NUMA is in use and MongoDB is not started through numactl. (The MongoDB shell will also show this, but only if you do not have authentication enabled.)
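
The dmesg check only works while the relevant boot messages are still in the kernel ring buffer. As an alternative, here is a minimal sketch that counts the NUMA nodes the kernel exposes in sysfs. The function name numa_node_count and its optional directory argument are illustrative assumptions, not part of any package.

```shell
#!/bin/sh
# numa_node_count [NODE_DIR]
# Print how many NUMA nodes are listed under NODE_DIR. The argument only
# exists to make the function testable; it defaults to the sysfs location.
numa_node_count() {
    dir=${1:-/sys/devices/system/node}
    count=0
    for node in "$dir"/node[0-9]*; do
        # The glob stays unexpanded when nothing matches, so check each entry.
        if [ -d "$node" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# More than one node means the system uses NUMA:
# if [ "$(numa_node_count)" -gt 1 ]; then echo "NUMA system"; fi
```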

If your system uses NUMA, to improve performance, you should make MongoDB start through numactl.

Edit mongodb.service according to the package you installed.

If using mongodb (AUR), change it from:

ExecStart=/usr/bin/mongod $OPTIONS

To:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod $OPTIONS

If using mongodb-bin (AUR), change it from:

ExecStart=/usr/bin/mongod --quiet --config /etc/mongodb.conf

To:

ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --quiet --config /etc/mongodb.conf

Zone reclaim also needs to be disabled, but on Arch Linux, /proc/sys/vm/zone_reclaim_mode defaults to 0.

Reenable and restart mongodb.service as needed.

Clean Start and Stop

By default, systemd kills a service that has not finished starting or stopping within 90 seconds of being asked to do so.

The mongodb (AUR) package makes systemd wait as long as it takes for MongoDB to start, but mongodb-bin (AUR) does not. Both packages allow systemd to kill MongoDB if it has not finished stopping within 90 seconds of being asked to stop.

Large MongoDB databases can take a considerable amount of time to cleanly shut down, especially if swap is being used. (An active 450GB database on a top of the line NVMe with 64GB RAM and 16GB swap can take an hour to shut down.)

By default, MongoDB uses journaling. [5] With journaling, an unclean shutdown should not pose a risk of data loss. However, if they are not shut down cleanly, large MongoDB databases can take a considerable amount of time to start back up. In this case, requiring a clean shutdown is a trade-off between a slower shutdown and a slower startup. [6]

Warning: If you disable journaling, failing to require a clean shutdown severely risks data loss, so you really need to require a clean shutdown. [7]

To prevent systemd from killing MongoDB after 90 seconds, edit mongodb.service.

To allow MongoDB to shut down cleanly, append the following to the [Service] section. On large databases, this may substantially increase your system shutdown time, but it speeds up the next MongoDB start:

TimeoutStopSec=infinity

If MongoDB needs a long time to start back up, it can be very problematic for systemd to keep killing and restarting it every 90 seconds [8], so mongodb (AUR) prevents this. If using mongodb-bin (AUR), to make systemd wait as long as it takes for MongoDB to start, append to the [Service] section:

TimeoutStartSec=infinity
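
Changes made directly to a packaged unit file can be overwritten on upgrade, so one option is to keep both timeout settings in a systemd drop-in instead. The following is a sketch under assumptions: the function name write_timeout_dropin is hypothetical, and override.conf is used only because it is the file name that systemctl edit creates.

```shell
#!/bin/sh
# write_timeout_dropin [UNIT_DIR]
# Write a drop-in for mongodb.service that lifts the start and stop
# timeouts. UNIT_DIR defaults to /etc/systemd/system; passing another
# directory is only meant for testing.
write_timeout_dropin() {
    unit_dir=${1:-/etc/systemd/system}
    dropin_dir=$unit_dir/mongodb.service.d
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
TimeoutStartSec=infinity
TimeoutStopSec=infinity
EOF
}

# As root, then reload so that systemd picks up the new file:
# write_timeout_dropin && systemctl daemon-reload
```

Running systemctl edit mongodb.service and typing the same three lines gives an equivalent result interactively.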

Troubleshooting

MongoDB will not start

If MongoDB will not start and you just upgraded to mongodb (AUR) 4.0.6-2 or later, you probably have a custom /etc/mongodb.conf. When MongoDB was in the official repositories, it used an Arch-specific configuration file and a systemd service of type simple. The package now supplies upstream's systemd service and configuration files, which instead use the service type forking. Pacman automatically upgrades the systemd service file, but it only upgrades /etc/mongodb.conf if you never modified it. If you did modify it, systemd expects mongod to fork while the configuration file tells it not to. You need to do one of the following:

  • Switch to the new configuration file installed at /etc/mongodb.conf.pacnew and carry over the changes from the old file that you still need, keeping in mind that the new file is in the YAML format and the old one is probably in the MongoDB 2.4 format.
  • Modify your existing configuration file to enable forking. (To continue using the old 2.4 file format instead of YAML, adding fork: true should be what is needed.)

Check that mongodb.service is configured to use the correct database location.

Add --dbpath /var/lib/mongodb to the ExecStart line:

ExecStart=/usr/bin/numactl --interleave=all mongod --quiet --config /etc/mongodb.conf --dbpath /var/lib/mongodb

Check that at least 3GB of space is available for the journal files; otherwise MongoDB can fail to start (without issuing a message to the user):

$ df -h /var/lib/mongodb/
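
The same check can be scripted. This is a minimal sketch that compares the space available on a file system with a threshold; has_free_space is a hypothetical helper name, and the 3 GiB figure in the usage comment is taken from the text above.

```shell
#!/bin/sh
# has_free_space PATH MIN_KIB
# Succeed if the file system holding PATH has at least MIN_KIB kibibytes
# available, fail otherwise (including when df cannot read PATH).
has_free_space() {
    # -P forces one line per file system, -k reports 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# 3 GiB = 3 * 1024 * 1024 KiB = 3145728 KiB
# has_free_space /var/lib/mongodb 3145728 || echo "less than 3 GiB free"
```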

Check if the mongod.lock lock file is empty or not:

# ls -lisa /var/lib/mongodb
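
Instead of reading the size from the listing, the test can be done directly. A minimal sketch follows; the function name mongod_lock_nonempty is an assumption made for illustration. Note that a running mongod keeps its PID in this file, so the result is only meaningful while the service is stopped.

```shell
#!/bin/sh
# mongod_lock_nonempty [DBPATH]
# Succeed if DBPATH/mongod.lock exists and is not empty.
mongod_lock_nonempty() {
    dbpath=${1:-/var/lib/mongodb}
    # -s is true only for an existing file whose size is greater than zero.
    [ -s "$dbpath/mongod.lock" ]
}

# As root, with mongodb.service stopped:
# mongod_lock_nonempty && echo "mongod.lock is not empty"
```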

If it is not empty, stop mongodb.service. Then run a repair on the database, specifying the dbpath (/var/lib/mongodb/ is the default --dbpath in Arch Linux):

# mongod --dbpath /var/lib/mongodb/ --repair

Upon completion, the dbpath should contain the repaired data files and an empty mongod.lock file.

Warning: In dire situations, you can remove the file, start the database using the possibly corrupt files, and attempt to recover data from the database. However, it is impossible to predict the state of the database in these situations. See the upstream documentation for details.

After running the repair as root, the files will be owned by the root user, whereas Arch Linux runs MongoDB as a different user. Use chown to change the ownership of the files back to the correct user. See the following link for further details: Further reference

# chown -R mongodb: /var/{log,lib}/mongodb/

Warning about Transparent Huge Pages (THP)

To disable this feature permanently, use a tmpfiles.d configuration file:

/etc/tmpfiles.d/mongodb.conf
w /sys/kernel/mm/transparent_hugepage/enabled - - - - never
w /sys/kernel/mm/transparent_hugepage/defrag - - - - never

To disable THP at runtime, write to the sysfs files directly:

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
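
To confirm which setting is active, the same sysfs files can be read back: the kernel lists every choice and puts the active one in brackets, for example "always madvise [never]". The following sketch extracts the bracketed value; the function name thp_setting is hypothetical.

```shell
#!/bin/sh
# thp_setting [FILE]
# Print the active (bracketed) value from a THP sysfs file.
# FILE defaults to the "enabled" control file.
thp_setting() {
    file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    # Keep only the text between the square brackets.
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}

# thp_setting
# thp_setting /sys/kernel/mm/transparent_hugepage/defrag
```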

Warning about Soft rlimits too low

If you are using the systemd service, edit the unit file:

[Service]
# Other directives omitted
# (file size)
LimitFSIZE=infinity
# (cpu time)
LimitCPU=infinity
# (virtual memory size)
LimitAS=infinity
# (locked-in-memory size)
LimitMEMLOCK=infinity
# (open files)
LimitNOFILE=64000
# (processes/threads)
LimitNPROC=64000

See the following link for further details: Further reference
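
After restarting the service, it can be useful to confirm that the new limits reached the running process. This is a minimal sketch that reads the soft open-files limit from the Linux-specific /proc/PID/limits table; soft_nofile is a hypothetical helper name, and the usage comment assumes the unit is named mongodb.service as elsewhere on this page.

```shell
#!/bin/sh
# soft_nofile PID
# Print the soft "Max open files" limit of the process PID.
# In /proc/PID/limits the name takes three fields, so the soft limit is
# the fourth field and the hard limit the fifth.
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Look up the main PID of the service and print its limit:
# soft_nofile "$(systemctl show -p MainPID --value mongodb.service)"
```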