Common mistakes by DBAs



Here is a list of common mistakes that every DBA makes at least once in their career. A few are typical of beginners, but "good, experienced DBAs" make them too, sometimes even more often.

I call experienced DBAs good on purpose: they do make mistakes, but they have the courage to recover and learn from them :)


1: Attempting to start a DB which is already running!

Yeah! I knew it, you have done that, and so have I.

This mostly happens during a planned activity: the OS admin asks you to start the DB, but it is already up, started by the cluster or by init services.



The DBA missed checking the status of the DB or its processes before attempting the startup.
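
A minimal sketch of that pre-check in Python. It assumes an Oracle instance on Unix, where the process monitor appears in the process list as ora_pmon_<SID>. The function names and sample SIDs are hypothetical, and the input is the text output of `ps -ef`.

```python
def running_instances(ps_output: str) -> set[str]:
    """Return the SIDs that have an ora_pmon_<SID> process in `ps -ef` output."""
    prefix = "ora_pmon_"
    sids = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # For the pmon background process the command is the last field.
        command = fields[-1]
        if command.startswith(prefix):
            sids.add(command[len(prefix):])
    return sids


def safe_to_start(ps_output: str, sid: str) -> bool:
    """True only when no pmon process exists for the given SID."""
    return sid not in running_instances(ps_output)
```

Feed it the captured output of `ps -ef` and go for the startup only when safe_to_start returns True for your SID.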


2: No issue in the database


A common reply from a DBA to application teams' questions about slowness or errors.


DBAs tend to rely on their monitoring and performance alerts: no alert means no issue in the DB.

Yes! That is correct in 98% of cases, but what if the alert was never received because of a monitoring problem, or the issue is a new kind that is not configured, or not configurable, in monitoring?

The DBA skipped a basic health check and, overconfident, handed the problem back to the application team.



3: Shutting down the wrong database





Yeah, panic is the immediate reaction once you realize that it's gone!

The DBA has shut down the production DB while working on a development assignment.


The DBA missed using a color code for access terminals: yellow for lower environments and green for prod (my preference).
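
One way to sketch the color coding is a prompt string wrapped in ANSI escape codes, following the yellow/green preference above. The function names and the prompt layout are assumptions made for illustration, and the terminal must understand ANSI colors.

```python
# ANSI foreground color codes; the mapping follows the preference stated above.
ANSI_COLORS = {"yellow": "\033[33m", "green": "\033[32m"}
ANSI_RESET = "\033[0m"


def env_color(env: str) -> str:
    """Green for prod, yellow for every lower environment."""
    return "green" if env.lower() in ("prod", "production") else "yellow"


def colored_prompt(db_name: str, env: str) -> str:
    """Build a prompt such as '[PROD:ORCL] > ' wrapped in the environment color."""
    color = ANSI_COLORS[env_color(env)]
    return f"{color}[{env.upper()}:{db_name}] > {ANSI_RESET}"
```

For example, print(colored_prompt("ORCL", "prod")) shows a green prompt, so a glance at the terminal tells you where you are before you type a shutdown.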


4: All tablespace free space OK, but ..


DBAs always check the free space in user tablespaces and overlook system-level tablespaces such as SYSAUX, TEMP, UNDO and SYSTEM.

On many occasions, inadequate space in the undo/temp tablespaces is ignored because they get recycled, and that can leave application sessions hung, waiting their turn until space is freed automatically.
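
A small sketch of a space check that treats every tablespace by the same rule, so SYSTEM, SYSAUX, UNDO and TEMP cannot be skipped. The 85% threshold, the TablespaceUsage record and the sample names are assumptions; how you collect the numbers from the database is left out.

```python
from typing import NamedTuple


class TablespaceUsage(NamedTuple):
    name: str
    used_mb: float
    max_mb: float


def tablespaces_over(usages, threshold_pct=85.0):
    """Return (name, pct_used) for every tablespace at or above the threshold.

    No tablespace is excluded by name, so system, undo and temp tablespaces
    are judged exactly like the application ones.
    """
    flagged = []
    for ts in usages:
        if ts.max_mb <= 0:
            continue  # skip rows with no usable size information
        pct = round(100.0 * ts.used_mb / ts.max_mb, 1)
        if pct >= threshold_pct:
            flagged.append((ts.name, pct))
    # Fullest tablespace first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```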



5: Wrong table/partition dropped





Although there are features available to recover dropped tables/partitions quickly, DBAs do make the mistake of dropping or truncating the wrong table during a planned maintenance activity.


The DBA failed to draw up an action plan, or used improvised commands during the activity instead of following the planned ones.
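
Following the plan can be made mechanical. The sketch below only accepts a statement if it matches a step of the approved plan after trivial normalization. The helper names and the sample table are hypothetical, and uppercasing everything is a simplification that would not suit quoted, case-sensitive identifiers.

```python
def normalize(statement: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and uppercase."""
    return " ".join(statement.strip().rstrip(";").split()).upper()


def is_planned(statement: str, approved_plan: list[str]) -> bool:
    """True only if the statement matches one of the approved plan steps."""
    planned = {normalize(step) for step in approved_plan}
    return normalize(statement) in planned
```

With a plan containing "ALTER TABLE sales DROP PARTITION p2019;", the same statement typed in lower case passes, while the same command against p2020 is rejected.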


6: Taking App/Dev for a ride ....








Most DBAs, just to buy some time, misguide their app support team or developers.


App/Dev have no one but the DBAs to turn to in a crisis; they listen to DBAs in good faith and do as they are told ...


I have seen this mostly in performance tuning, network connectivity to the DB, and app/DB compatibility checks.


7: Delayed response creates havoc


DBAs are always busy (and if there is no restriction on social sites, far too busy :)), and miss a critical notification or alert that later becomes the root cause of an unplanned outage.


The DBA failed to judge the impact of not acting at an early stage.


8: Copy paste


What can go wrong with copy-paste?


You copied something like "shu" and pasted it into an Oracle database SQL*Plus session with a right click. See the magic: how fast the DB and its sessions go down.


Toggling between two sessions, copying a command from one session to another, and typing OS commands at DB prompts (or vice versa) are common habits that can lead to a tense situation if any word you pasted is interpreted as a command!
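
As a rough illustration, pasted text can be screened before it reaches a prompt. This sketch flags lines whose first word is a prefix of "shutdown" (like the "shu" above) or appears in a short list of risky words. The list and the function name are assumptions, and it is nowhere near a complete safeguard.

```python
# Hypothetical list of first words worth a second look before pasting.
RISKY_WORDS = {"startup", "drop", "truncate", "delete", "rm", "kill"}


def risky_lines(pasted_text: str) -> list[str]:
    """Return the pasted lines whose first word looks like a dangerous command."""
    flagged = []
    for line in pasted_text.splitlines():
        words = line.split()
        if not words:
            continue
        first = words[0].lower().rstrip(";")
        # Treat "shu", "shut", ... as abbreviations of "shutdown".
        is_shutdown = len(first) >= 3 and "shutdown".startswith(first)
        if is_shutdown or first in RISKY_WORDS:
            flagged.append(line.strip())
    return flagged
```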


9: User expiry

An application-critical user was created on the DB with a good tablespace, temp space, quota, permissions, etc.


But you forgot to assign an appropriate profile with unlimited or well-defined password expiry. One fine weekend, while you are enjoying a sunbath, your application guy will disturb you with "application is down as password has expired", and you will have to set the old password back as it was and deliver a root cause analysis by Monday!
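
A sketch of an early warning for this, assuming you can pull (username, expiry date) pairs from the database, for example from Oracle's DBA_USERS view. The 30-day warning window, the function name and the sample accounts are assumptions.

```python
from datetime import date, timedelta


def expiring_accounts(accounts, today, warn_days=30):
    """Return (username, expiry_date) pairs that expire within warn_days.

    `accounts` is an iterable of (username, expiry_date) where expiry_date
    is None for accounts whose password never expires. Accounts that have
    already expired are included as well.
    """
    cutoff = today + timedelta(days=warn_days)
    due = [
        (user, expiry)
        for user, expiry in accounts
        if expiry is not None and expiry <= cutoff
    ]
    # Most urgent first.
    return sorted(due, key=lambda item: item[1])
```

Run it on a schedule and the Monday root cause analysis turns into a password change planned on a weekday.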


10: Unrecoverable backups

Last but not least: "useless backups..."


You have been backing up the DB for a long time and telling the client there is no need to worry, we can recover to any point in time ...


But the backups were designed with a very short archive log retention policy, and in a crisis that backfired!
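
The mismatch can be checked with simple date arithmetic. This sketch assumes that archived logs older than the retention period are already deleted, that the remaining logs are continuous up to now, and that a backup is only usable for roll-forward if it is no older than the oldest retained log. The function name and sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def recoverable_from(backup_times, archive_retention, now):
    """Return the earliest backup time that is still covered by archived logs.

    Returns None when no backup falls inside the archive log retention
    window, which means no point-in-time recovery is possible.
    """
    oldest_log = now - archive_retention
    usable = [t for t in backup_times if t >= oldest_log]
    return min(usable) if usable else None
```

With backups on 3 March and 9 March and a 12-hour archive retention checked on 10 March at noon, the function returns None: plenty of backups, none of them recoverable to a point in time.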



Hope you agree with this. Do let me know: which one did you do?

2 comments:

  1. Hi
    If the base backup expires, can't we change it to a "forever backup"?
    I mean, change it back to a usable backup?

    1. You need to restore a valid control file from around the backup time to make use of those backups, taking the control file's record keep retention into account.
