Updated on Mar 13, 2019
We outlined that the size of the database is relative to the number of rows its tables contain. However, the question of how these entries got there at first place still remains unanswered. In the next lines of this tutorial, we will cover some of the most common reasons why your database increase in size.
If you have never heard of what a spambot is - it is basically a computer program that connects to websites for spam purposes.
But why those are causing the space to your database to grow?
All the modern websites are storing their data (Like blog posts, articles, comments, etc.) in a database. This means that any data added to a website no matter by an Admin or Regular User will also go to the database. Combine that with an insecure registration form allowing registrations without human verification (captcha challenge), and the equation gets completed.
It is pretty simple - the spambots are registering accounts on a website which grants them with regular user privileges like posting comments and sometimes even creating pages (mainly on social platform type of websites). From there they can inject as many comments and content as they are configured to. Since all the data is being stored in a database as we already mentioned then the database will get filled with thousands of meaningless rows causing for its size to grow needlessly.
How to protect your database from spambots?
As you want only legitimate users to be able to post content on your website, you will have to configure:
How to clear your database from spam content?
Unfortunately, there is no universal approach for clearing your database from any user-posted content. The reason for that is mainly because the added spam content is mixed with all the legitimate content on your website. The best option here would be to contact a developer who can review the entries one by one and remove those that are spam related. Of course, the cleanup should be performed only after you are sure that the door is closed for new content posting.
Another thing that some websites do is placing their cache in the databases they are using. Why? Well, the database is answering requests way faster than the storage, which means that it will deliver cached content faster. Although that is no longer a common practice, it is still a thing for some custom scripts.
How to clear your database from cache entries?
You can identify the cache table and "Empty" it out. Emptying that table should not on theory cause issues with your website since any missing cache will be regenerated. However, that should be done only after the database cache is stopped or at least changed to File System cache. Of course, the alternatives for caching are quite a lot like APC or even Memcached for example.
We are combining these two under the same topic because they are generated basically the same way.
There are websites that are using their databases for storing logs for example for:
Those are important indeed, however storing those in the same database as the one your website actually utilizes is pretty much destructive for your visitors. The logs will not only increase the size of your database but if multiple logs are constantly updated this can cause uncontrolled and exponential growth. Often such logs are the Access or Registration logs which are recording, for example, each login attempt or a registration. However, if the website Login/Registration forms are not protected, and a spambot hits those, the log tables will surely be filled with quite a number of rows.
The analytics data is the very same - it logs information for your users' activity on your website similarly to the logs. However, since it aims for that information to be graphically presented to the admin of the website it often relies on the database of the website when storing the data.
How to prevent logging in the database?
For that, you will need to either contact the developer of your application or check with the author of the Open Source script you are using. Often the logging is defined in some sort of configuration file which you will need to edit or even sometimes defined in configuration tables stored in the same database.
Writing user sessions to the database is a good idea indeed, but the issue with that is if those are not cleared. Typically every user logging on your website is creating a session which is then used by the cookie added to the browser of the visitor. On theory every time a user logs out the cookie gets destroyed, and the session should be too, however in practice sometimes for the developers is enough to destroy only the cookie from the client's browser and leave the session on the server. If the sessions are preserved in the database, then those will fill it up eventually.
How to prevent sessions filling the database?
For that, you will have to change the way how sessions are being stored - from a Database to File and that will be sufficient. Usually, that configuration is kept either in a configuration file or directly in the database. This will allow for the sessions to be actually written directly on the file system allowing for manual deletion either by the server admin or by the website administrator.
MySQL databases typically allow for storing even image files. However, there is a conceptual issue with that since the modern images are quite big. If your website uses the main database for storing let's say any user uploaded images like for example profile pictures or any custom uploaded images this will cause for the database size to grow. In that case, even if you identify the table is storing those images, you will be unable to clear it since you are clearing user uploaded data.
How to prevent images being added to the database?
Again here the only solution is to contact the developer of your website or the authors of the open source application you are using because only changing a column type in MySQL will only cause for certain functionalities on your website to break. Therefore, this needs to be addressed by a developer since all of the upload forms have to be rewritten.
Sometimes the size of your database is not caused by a few large tables but by lots of small ones. This is a typical result from too many plugins being installed on the website or if for example, and additional database import is being performed on a wrong database.
How to reduce the number of Tables?
First and foremost, you will have to uninstall any plugins/modules that are not used on your website. Then you will have to look if there is a patter since the tables can also be created by a regular script. This means that your website might be creating database tables with different purposes. Ideally, you would want to strip all the needless tables, and in case you need developer assistance you should definitely get some.