Database Architecture Design

The following is an excerpt from Chapter 8 of the book Mining New Gold: Managing Your Business Data, by Penny, Jeffrey, and Gillian Garbus, which can be purchased here.

In previous blog posts, we’ve discussed the importance of determining the priority order of your data and establishing processes for storage, maintenance, and backup. But what about database architecture? Why is it so critical?

Database architecture focuses on the design, maintenance, and administration of the software that manages the actual data underlying your application. Correct architecture decisions will have a positive effect on database performance, as well as on your ability to scale up.

How to Design a Maintenance Plan and Manage Your Data

A few years ago, a database management system vendor paid for a Super Bowl commercial in which a balloon floated around a server room while a party went on in the next room, the implication being that nobody needed to tend the system. Forget for a moment that once the balloon loses a bit of air, it will likely get sucked into a fan’s intake and cause cooling problems for the server. That commercial was the single biggest disservice a vendor has ever done the industry, because every system needs maintenance.

The same way you wouldn’t drive your car too long without a tune-up, you can’t let your DBMS run without preventive maintenance. And just as with an inexpensive commuter car versus an Italian sports car, the harder and faster your DBMS must run, the more important it is to make sure the servers are maintained.

In addition, if databases aren’t maintained, you can expect performance problems. At first they may be intermittent; then they’ll begin snowballing. Frequently, this is the result of letting inexperienced database administrators (or network administrators or developers who don’t understand database administration) take care of your systems.

To prevent this from happening, at a minimum, you need to:

1. Make sure backups are running and being transferred off-site, and that a restore is tested periodically.

This ensures that, in case of a disaster, you have a way to get back to a specific point in time. This includes monitoring database transaction logs to make sure they don’t grow unnecessarily.

Regardless of what your technical people tell you, decisions about retention are business decisions, not technical ones, and they are generally driven by regulatory requirements. In general, you want to push requirements down to your technical staff, not the other way around. And if you don’t test your restore capability, you won’t know whether you are really backed up.
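
To make that concrete, here is a minimal sketch of a backup-and-verify cycle in Python, using SQLite from the standard library as a stand-in for a production DBMS. The file paths and database names are hypothetical; a real deployment would use your own DBMS’s native backup tooling.

```python
import shutil
import sqlite3

SOURCE_DB = "orders.db"                          # hypothetical production database
BACKUP_DB = "orders_backup.db"                   # local backup target
OFFSITE_COPY = "/mnt/offsite/orders_backup.db"   # hypothetical off-site mount

def run_backup():
    """Take a consistent online backup, then ship a copy off-site."""
    src = sqlite3.connect(SOURCE_DB)
    dst = sqlite3.connect(BACKUP_DB)
    try:
        src.backup(dst)   # page-level online copy (Python 3.7+)
    finally:
        src.close()
        dst.close()
    shutil.copy2(BACKUP_DB, OFFSITE_COPY)

def test_restore():
    """Open the backup and check it: an untested backup is a hope,
    not a recovery plan."""
    conn = sqlite3.connect(BACKUP_DB)
    try:
        (status,) = conn.execute("PRAGMA integrity_check").fetchone()
        return status == "ok"
    finally:
        conn.close()

if __name__ == "__main__":
    run_backup()
    if not test_restore():
        raise SystemExit("Backup failed its restore test; investigate now")
```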

2. Consider staging backups.

A monthly backup is appropriate for data that is historical, isn’t touched by many users, or can be lost and replaced quickly.

Data that is integral to your business should be backed up at least daily, with incremental backups every 15 minutes or every hour, depending on the space you have available and your performance needs. This is also the best way to secure your data against an outside attack.

If you are the victim of a ransomware attack, these backups can be used to recover your data. If you have staged backups and the attack hit last night just before the backup ran, then last night’s backup will be bad and unusable. But if you have the prior night’s backup, you will lose only one day’s worth of data, which in most cases causes angst but leaves your business able to recover.

If you cannot lose even one day’s worth of data, then you must ensure you are backed up continuously, and possibly replicating your systems to a warm standby that mirrors the applications, code, security, and data access of the original server, so you can restore to the warm standby.
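
As an illustration of staging, here is a sketch of a simple retention rule in Python: keep every backup inside a recent daily window, and one backup per month beyond that. The two-week window is an assumption for the example, not a recommendation; set yours from your business’s retention requirements.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today, daily_window_days=14):
    """Given the dates of existing backups, return the set worth keeping:
    every backup inside the daily window, plus the newest backup of each
    older month."""
    keep = set()
    months_seen = set()
    for d in sorted(backup_dates, reverse=True):   # newest first
        if (today - d).days <= daily_window_days:
            keep.add(d)                            # recent: keep them all
        elif (d.year, d.month) not in months_seen:
            keep.add(d)                            # older: one per month
            months_seen.add((d.year, d.month))
    return keep

# Example: four months of nightly backups collapse to the recent
# dailies plus one backup per older month.
today = date(2024, 6, 30)
nightly = [today - timedelta(days=n) for n in range(120)]
print(sorted(backups_to_keep(nightly, today)))
```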

3. Validate the consistency of your databases.

Apologies for the industry jargon: it means you verify the absence of corruption in your data, indexes, and catalogs. This check should be performed before backups are aged out, so that corruption is caught while a clean backup still exists.
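
A minimal sketch of that rule, again using SQLite’s built-in check as a stand-in for your own DBMS’s consistency checker (SQL Server’s DBCC CHECKDB, for instance, plays the same role):

```python
import sqlite3

def is_consistent(db_path):
    """Return True if the database reports no corruption in its data,
    indexes, or internal catalogs."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
        return rows == [("ok",)]
    finally:
        conn.close()

# Only age out an old backup once a newer one has passed the check;
# prune_older_backups() is a hypothetical helper in your retention job.
# if is_consistent("orders_backup.db"):
#     prune_older_backups()
```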

4. Rebuild indexes.

Indexes are a critical path to performance. Over time, as data changes and grows, indexes become unbalanced in a variety of ways. Rebalancing (rebuilding or reorganizing) your index structures has a direct impact on performance.
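
For example, here is a hedged sketch of a scheduled rebuild, using SQLite’s REINDEX command as a stand-in; SQL Server’s ALTER INDEX ... REBUILD and REORGANIZE serve the same purpose there. The database file name is hypothetical.

```python
import sqlite3

def rebuild_indexes(db_path):
    """Rebuild every user-created index in the database from scratch."""
    conn = sqlite3.connect(db_path)
    try:
        # Auto-generated indexes have NULL sql in the catalog; skip them.
        names = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND sql IS NOT NULL"
        ).fetchall()
        for (name,) in names:
            conn.execute(f'REINDEX "{name}"')
        conn.commit()
    finally:
        conn.close()

rebuild_indexes("orders.db")   # hypothetical database file
```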

5. Update statistics.

Modern DBMSs have cost-based optimizers. In plain English, when you request information from the database, the DBMS decides the most efficient way of retrieving the data, based upon histograms it keeps of how the data is distributed. In a nutshell, when those histograms get out of date, the approach the server takes to retrieving your data becomes erratic at best.
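
As a concrete example, SQLite’s ANALYZE command refreshes the statistics its cost-based planner relies on; UPDATE STATISTICS on SQL Server and ANALYZE on PostgreSQL do the same job. A minimal sketch, with a hypothetical database file:

```python
import sqlite3

def update_statistics(db_path):
    """Refresh the planner statistics for every table and index."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("ANALYZE")   # rebuilds the stats the optimizer reads
        conn.commit()
    finally:
        conn.close()

update_statistics("orders.db")   # hypothetical database file
```
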
If you’re not certain all of these steps are being performed, it’s time to call a consultant.

Key Takeaways

1. There are specific data management jobs that need to be run. The jobs themselves are slightly different between database types, but they all have three things in common:

A. The maintenance jobs need to run.
B. They have to be successful in completing their work.
C. They should be able to report back details to the IT or management team to provide insight into the database architecture.

2. The jobs should be verifiable, and successes and failures should be documented so that someone can review the job information and determine whether action is needed.

3. Many DBAs and IT managers use scripts that email them status information: verifying that jobs succeeded, alerting on job failures, or flagging conditions a job itself detects. A minimal sketch of this pattern follows the list.

4. There are tools that can monitor these scripts and give you a quick, simple look into the health of your environment. If you are using the script-to-email tracking approach, we suggest looking into a tool that can surface environment health information faster and provide analysis that’s easier to understand.
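
Here is the sketch promised in takeaway 3: a job-status email in Python using the standard library’s smtplib. The SMTP host, addresses, and job name are hypothetical; substitute your own mail settings.

```python
import smtplib
from email.message import EmailMessage

def send_job_alert(job_name, succeeded, detail):
    """Email the team the outcome of a maintenance job."""
    msg = EmailMessage()
    msg["From"] = "dbms-jobs@example.com"
    msg["To"] = "dba-team@example.com"
    msg["Subject"] = f"[{'OK' if succeeded else 'FAILED'}] {job_name}"
    msg.set_content(detail)
    with smtplib.SMTP("mail.example.com") as smtp:   # hypothetical SMTP host
        smtp.send_message(msg)

send_job_alert("nightly-backup", False,
               "Restore test failed the integrity check; see job log.")
```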

For More Information on Database Optimization, Sign Up for One of Our Webinars