Perl for DBAs

Perl for DBAs

 

Perl for DBAs

 

Perl for DBA’s

Perl for DBAs

 

By now, you have no doubt at least heard of Perl. It's been around for quite a few years, having been first developed in the 1980's, and came into widespread use in the 1990's.

 

Many people's exposure to Perl has been limited to its use as a web scripting language, where it gained a great deal of popularity. Perl is a great language for web use, but its functionality extends far beyond what is needed for use with web applications.

 

There's a standard list of tools that most DBAs have at their disposal and know how to use. On Unix platforms this list of tools might include the Korn and Bourne shells, as well as the Bash shell on Linux systems, sed, grep, awk, tr, cut, paste, bc and numerous others.

 

This list may be somewhat limited if you are a Win32 platform, though there are sources for Unix utilities on Win32 platforms. See the list of resources at the end of the article.

 

This toolset is of course rounded out by the tools that come with the Oracle database, SQL*Plus, svrmgrl, or even sqldba. This list will vary dependant on the version of Oracle that you are using.

 

Though this may seem quite enough tools for anyone to manage databases, my goal is to convince you make room in your toolbox for one more.

A Little History

Perl was designed from the beginning for manipulating data, originally created by Larry Wall to take the best features of C, awk, sed, sh and other Unix utilities, and combining them into one powerful language.

 

Though originally designed with systems administration work in mind, Perl's innate ability to work with data made it a natural for database use, and so Oraperl was created by Kevin Stock in 1990.

 

Oraperl allowed connections to Oracle databases to be made directly from within Perl scripts, and direct manipulation of the data in the database via Perl. This was quite refinement over the redirection that was frequently used in shell scripts when DBA’s needed to access data outside of the database. See Example 1.

 

Example 1.

 

#!/bin/ksh

 

export ORACLE_SID=ts01

export ORAENV_ASK=NO

unset SQLPATH

 

. oraenv $ORACLE_SID

 

sqlplus -s <<EOF

system/manager

 

set feed off term off pause off head off

spool users.txt

 

select distinct s.username username

from v\$session s

where s.username is not null

/

EOF

 

while read user

do

if [ ! -z $user ]; then

echo "$user: Please log out!" | mailx -s "please log out of database!" $user@yourdomain.com

fi

done < users.txt

 

 

Oraperl was a library that had to be built and linked into the Perl binary. Since this required rebuilding and relinking the Perl binaries, this could be quite an undertaking on some platforms.

 

This method also required that you have a separate Perl binary for connecting to different databases. If you had both Sybase and Oracle databases, you would need to have a different binary for each database.

 

With the advent of Perl version 5 in 1994, a new modular Perl architecture made it possible to create modules for Perl that did not require relinking the Perl binaries.

 

At this time Tim Bunce introduced the DBI, or DataBase Interface Module. The Perl DBI provided a consistent interface to for writing database applications, regardless of the database. Each database would need a corresponding DBD, or DataBase Driver module, written to DBI specification. The DBD::Oracle module was also made available at this time.

 

Since then, there have been DBD modules created for most popular relational databases including Oracle, Sybase, DB2, Teradata, mysql, PostgreSQL and several others. There are also DBD drivers for ODBC and JDBC.

 

Another advantage Perl has is its ability to create several simultaneous connections to different databases. This makes it simple to move data between disparate databases without using database links or Oracle Heterogeneous Services.

Getting Perl

Installing Perl on your system is beyond the scope of this article. I’m going to assume that it’s already installed, which is usually the case on most Unix systems. Should you need to install it on your system, see the Resources section at the end of this article.

 

Though Perl may not be installed by default on Win32 Systems, it is easily acquired and installed by visiting the web site for ActiveState Perl. The binaries are already compiled and ready to go. See the Resources section at the end of this article for details.

Getting DBI and DBD::Oracle

Installing DBI and DBD::Oracle on Win32 systems is a very simple process. There are two sources for pre-built versions of the DBI and DBD::Oracle modules. The first is from ActiveState. To install these modules, simply open a COMMAND window, and use the ActiveState Perl Package manager as seen in Example 2. Be sure to set the HTTP_PROXY environment variable if needed at your site.

 

Example 2.

 

C:> set HTTP_PROXY=http://myproxy.mydomain.com:8080

C:> ppm

PPM interactive shell

PPM> install DBI

PPM> install DBD::Oracle

PPM> install DBD::ODBC

PPM> quit

 

Another source for DBI and DBD::Orale is from the ftp://ftp.esoftmatic.com/outgoing/DBI site. These prebuilt modules provided by Jeff Urlwin are the preferred source of these database modules, as they are usually more up to date than those found at ActiveState, and are built with later versions of the Oracle libraries. The DBD::Oracle module at ActiveState is built with Oracle 7 libraries at the time this was written. See Example 3 for details.

 

Example 3.

 

C:> set HTTP_PROXY=http://myproxy.mydomain.com:8080

C:> ppm

PPM interactive shell

PPM> install http://www.xmlproj.com/PPM/DBI.ppd

PPM> install ftp://ftp.esoftmatic.com/outgoing/DBI/DBD-Oracle.ppd

PPM> install ftp://ftp.esoftmatic.com/outgoing/DBI/DBD-ODBC.ppd

PPM> quit

 

Assuming that Oracle is already installed, the Perl DBI, DBD::Oracle and DBD::ODBC modules are installed and ready to go.

 

On Unix systems this is a little more work, but still not too difficult. You will probably need to be logged in as root to do this, but check with your system administrator to be sure.

 

First download the source for DBI and DBD::Oracle. Both can be found at http://search.cpan.org. This needs to be done on a server that already has Perl 5 and Oracle installed.

 

Now decompress the source archive and untar it into a working directory:

 

gunzip -c DBI-1.30.tar.gz | tar xvf -

 

Create the makefile, run the make and install:

 

perl Makefile.PL

make test

make install

 

The steps for installing the DBD::Oracle module are similar. You will need to set an environment variable for the test phase to succeed. The ORACLE_USERID variable needs to be a valid username/password combination.

 

gunzip -c DBI-1.30.tar.gz | tar xvf -

export ORACLE_USERID=system/manager

perl Makefile.PL

make test

make install

Using Perl

By now, you're no doubt asking yourself, OK, this is all well and good, but what can I DO with it?

I'm glad you asked, because we're going to consider a few real life situations where Perl turned out to be the best tool for the job.

 

I was recently asked to help one of my employer's financial analysts to provide some data in for external use. The catch was that the data needed to be provided in MS Excel workbooks. This might not normally be a problem if only a few hundred rows needed to be provided. Dump the data to a comma delimited text file, load into into Excel and be done with it right?

 

Wrong. That method works for small data sets, but is somewhat unwieldy when the data to be dumped is more than one million rows worth. There had to be a better way.

 

A few minutes search at the Comprehensive Perl Archive Network or CPAN, were rewarded with several modules that might make this task easier.

The DBI Template Script

Before delving into those details though, let's first examine a barebones Perl script that uses the DBI and DBD::Oracle modules. We'll briefly outline the function of different parts of the script that setup the environment, make the database connection and prepare the SQL, and then spend a little more time on the details of the working section.

 

Don't worry about completely understanding all of the Perl syntax, it's not important at this point.

What is important is seeing what Perl can do for you.

 

Lines 1-7: These load the required Perl Modules.

1: #!/usr/bin/perl -w

2:

3: use warnings;

4: use strict;

5: use DBI;

6: use Getopt::Long;

7:

 

Lines 8-38 setup the command line interface via the Getopt::Long module, retrieve command line arguments and cause the script to exit with an error if the expected arguments are not available.

 

8: my %optctl = ();

9:

10: Getopt::Long::GetOptions(

11: \%optctl,

12: "database=s",

13: "username=s",

14: "password=s",

15: "sysdba!",

16: "sysoper!",

17: "z","h","help");

18:

19: my($db, $username, $password, $connectionMode);

20:

21: $connectionMode = 0;

22: if ( $optctl{sysoper} ) { $connectionMode = 4 }

23: if ( $optctl{sysdba} ) { $connectionMode = 2 }

24:

25: if ( ! defined($optctl{database}) ) {

26: Usage();

27: die "database required\n";

28: }

29: $db=$optctl{database};

30:

31: if ( ! defined($optctl{username}) ) {

32: Usage();

33: die "username required\n";

34: }

35:

36: $username=$optctl{username};

37: $password = $optctl{password};

38:

 

Lines 39- 47 make the connection to the database, and line 49 causes the script to exit if the connection failed.

 

39: my $dbh = DBI->connect(

40: 'dbi:Oracle:' . $db,

41:

$username, $password,

42: {

43: RaiseError => 1,

44: AutoCommit => 0,

45: ora_session_mode => $connectionMode

46:

}

47:

);

48:

49: die "Connect to $db failed \n" unless $dbh;

 

Line 51 sets the number of rows to be retrieved from the database in each call, much like SET ARRAYSIZE in sqlplus.

 

50:

51: $dbh->{RowCacheSize} = 100;

52:

 

Lines 53-57 setup the SQL, prepare the SQL and request the database to execute it. If you have ever used the DBMS_SQL package, this may look somewhat familiar.

 

53: my $sql=q{select * from dual};

54:

55: my $sth = $dbh->prepare($sql);

56:

57: $sth->execute;

58:

 

Lines 59-61 are the working portion of this code. As each row is retrieved, it is placed into an array of name array in line 59, and the zeroth element of the array is printed in line 60. This section doesn't do much now, but we'll add to it later.

 

59: while( my @array = $sth->fetchrow_array ) {

60: print "$array[0]\n";

61: }

62:

Line 63 disconnects from the database. Since there is no more code to execute below this section, the Perl will exit the script

 

63: $dbh->disconnect;

64:

 

Lines 65-70 are a usage() subroutine that is called when the help option is present on the command line, or if expected command line arguments are not present.

 

65: sub Usage {

66: print "\n";

67: print "usage: DBI_template.pl\n";

68: print " DBI_template.pl -database dv07 -username scott -password tiger [-sysdba || -sysoper]\n";

69: print "\n";

70: }

 

Running this script should prove rather uneventful, as all it does is connect to a database and print the output from SELECT * FROM DUAL.

Creating Excel Workbooks In One Easy Step

Now let's get back to dumping a few million rows of data to MS Excel.

 

To make this example easy, we'll work with the well known SCOTT/TIGER account and the EMP table, which only contains a very few rows. The concepts will be the same, only the data is different.

 

After searching CPAN website, the module that appeared to fit my needs was Spreadsheet::WriteExcel.

A little experimentation confirmed that to be the case, and only a few modifications were required to turn the DBI template script into a script that would create multiple 60,000 line Excel workbooks.

 

These lines were added at the top of the script:

 

use Spreadsheet::WriteExcel;

use Spreadsheet::WriteExcel::Big;

use constant LINES_PER_BOOK =>

60001;

 

The Spreadsheet::WriteExcel::Big module was required to create workbooks over seven megabytes in size, which these would be.

 

The LINES_PER_BOOK constant was set to track the maximum number of rows that would be created in each workbook.

 

The SQL statement at line 53 was changed to select * from scott.emp.

 

At the very end of the script, a short function was added to create a new file name for each successive workbook:

{

my $workBookNumber = 0;

sub newWorkBookName {

return "C:/temp/emp_dump_" . ++$workBookNumber .

".xls";

}

}

 

Lines 59-62 were replaced with:

 

59: my $workbook =

Spreadsheet::WriteExcel::Big->new(newWorkBookName());

60: die "unable to create workbook - $!\n" unless $workbook;

61: $workbook->set_tempdir('C:/temp');

62: my $worksheet = $workbook->addworksheet();

63:

64: my $colNames = $sth->{NAME_uc};

65:

66: my $rowCount=0;

67: my $lineCount=0;

68: $worksheet->write_row($lineCount,0,$colNames);

69:

70: while( my @empData = $sth->fetchrow_array ) {

71:

72: print "." unless $rowCount++%1000;

73:

74: if ( ++$lineCount >= LINES_PER_BOOK ) {

75: $workbook->close;

76: my $workBookName = newWorkBookName();

77: $workbook =

Spreadsheet::WriteExcel::Big->new($workBookName);

78: die "unable to create workbook - $!\n" unless $workbook;

79: $worksheet = $workbook->addworksheet();

80: $lineCount=0;

81: $worksheet->write_row($lineCount,0,$colNames);

82: $lineCount=1;

83: print "\nNew Workbook: $workBookName\n";

84: }

85: $worksheet->write_row($lineCount,0,\@empData);

86: }

87: $workbook->close;

 

These few lines are responsible for dumping an arbitrarily large SQL query into an arbitrarily large number of Excel workbooks. Though this example is using the SCOTT.EMP table and will only create a single workbook with a few rows, it will happily keep churning out workbooks for millions of rows, sequentially numbering each as it goes.

 

This was good news to the analyst that thought she would have to manually import this data into a great many spreadsheets.

 

As written for this article this script is setup to run on a Win32 machine. It could just as easily have been run from a Unix database server, and in fact that is how the script originally worked. The Excel workbooks were created directly on a NetApp filer that was mounted on a Linux server via SAMBA. This allowed the script to be run by a machine with a great deal more power than my desktop machine, while eliminating any ftp copies that might be necessary to make the files accessible to the analyst.

 

Monitoring Database Connectivity

 

One of a DBA's many responsibilities is ensuring that databases are up and accepting connections. The only sure way to test database connectivity is by making a connection from a remote machine. This ensures that the listener is up, the database is up, and that it's accepting connections.

 

There are any number of methods in use for doing this, but many of them have one fatal flaw. When a database connection hangs without returning an error message, the monitor does not move on to the next database to be checked.

A simple example of this is found in Example 4. This script will work fine until a connection hangs, and then it will fail to move on to the next database that needs to be checked. There could be other databases that have problems as well, but this script will not inform you under those conditions.

 

Example 4.

 

#!/usr/bin/ksh

export ORACLE_SID=ts01

. oraenv $ORACLE_SID

 

while :

do

for db in ts01 ts02 ts03

rm -f /tmp/connect.txt

sqlplus scott/tiger@$db <<EOF

set head off term off echo off pause off pagesize 0

spool /tmp/connect.txt

select 'DB OK' from dual;

EOF

[ grep 'DB OK' /tmp/connect.txt 2>/dev/null] || {

echo "db $db is down" | mailx -s "database $db is down" \

dba@somewhere.com

}

done

done

 

This is clearly a job for Perl! By using the alarm() function, we can prevent hanging connections from also hanging the connectivity monitor.

 

This is best implemented on Unix or Unix like system (Linux) or others that supply the alarm() or POSIX SIGALRM functionality. Although ActiveState Perl has an alarm() function in version 5.8 it does not exist in prior versions of ActiveState Perl. I have not yet tested this new functionality on Win32 platforms.

 

This simplified Perl script is making the assumption that the Oracle environment is already setup (ORACLE_HOME specifically), and that the databases to be checked all have a SCOTT/TIGER account. This could certainly be more robust, but has been kept simple for clarity. Please see Example 5 for the complete listing.

 

One new module has been used: Mail::Sendmail at line 6. This may be installed using the same methods previously detailed for installing DBI and DBD::Oracle.

 

Example 5.

 

1: #!/usr/bin/perl -w

2:

3: use warnings;

4: use strict;

5: use DBI;

6: use Mail::Sendmail qw(sendmail);

7:

8: my $timeOut = 60; # wait 60 seconds for connection

9: my $interval = 300; # 5 minutes between connectivity checks

10:

11: my @databases = ('ts01','ts02','ts03');

12: my ($username, $password) = ('scott','tiger');

13:

14: while(1) { # loop forever

15: foreach my $db ( @databases ) {

16: print "checking $db\n";

17: my $dbh='';

18: eval {

19: # set alarm to timeout current operation

20: local $SIG{ALRM} = sub {die "connection timeout\n"};

21: alarm $timeOut;

22:

23: $dbh = DBI->connect(

24: 'dbi:Oracle:' . $db,

25: $username, $password,

26: {RaiseError => 1}

27: );

28: };

29:

30: alarm 0; # reset the alarm

31:

32: if ($dbh) { # success

33: print "Connection succeeded for $db\n";

34: $dbh->disconnect;

35: } else { # failure

36: print "Error connecting to $db\n";

37: my %mail = (

38: To => 'thedba@somewhere.com',

39: From => 'oracle@somewhere.com',

40: Subject => "Database $db is down!",

41: );

42: unless (sendmail %mail) { print "Error sending mail: $Mail::Sendmail::error \n" }

43: }

44: }

45:

46: sleep $interval; # wait for next db check

47: }

 

The lines of interest here are 18-28 and 32-44. Lines 18 enters an ‘eval’ block. This is code that Perl is to evaluate and execute at runtime, much like EXECUTE IMMEDIATE in PL/SQL. Lines 20 and 21 setup the timeout functionality. If the connection to the database at line 23 is not made before 60 seconds have passed as specified by the $timeOut variable, the alarm will cause the eval block to be exited.

 

Line 32 checks to see if the connection attempt was successful. If not, the $dbh variable will be an empty string as assigned in line 17.

In that case lines 36-42 are responsible for sending an email notifying the DBA that the database is down.

 

Line 46 causes the script to wait $interval seconds before doing it all again.

This simple script serves to illustrate why many DBA's are finding a place in their toolbox for Perl. I hope this brief introduction will provide incentive for you to learn more.


Resources:

 

Recommended books to learn Perl:

 

Learning Perl, 3rd Edition

Randal L. Schwartz, Tom Phoenix

O'Reilly 2001

0-596-00132-0

http://www.oreilly.com/catalog/lperl3/

 

Learning Perl on Win32 Systems

Randal L. Schwartz, Erik Olson, Tom Christiansen

O'Reilly, 1st Edition August 1997 
1-56592-324-3

http://www.oreilly.com/catalog/lperlwin/

 

Programming the Perl DBI

Alligator Descartes, Tim Bunce

O'Reilly 2000

1-56592-699-4

http://www.oreilly.com/catalog/perldbi/

 

Perl for Oracle DBAs

Andy Duncan, Jared Still

O'Reilly 2002

0-596-00210-6

http://www.oreilly.com/catalog/oracleperl/

 

Perl Source, tutorials, and more

http://www.perl.com/

 

Perl Mailing Lists

http://lists.perl.org/

 

Facts About Perl

http://www.perl.org/press/fast_facts.html

 

Perl History

http://www.perl.org/press/history.html

 

Perl binaries for Win32

http://www.activestate.com/Products/ActivePerl/

 

Cygwin – Unix tools for Win32

http://sources.redhat.com/cygwin/

 

Source for DBD modules

http://www.cpan.org/modules/by-module/DBD/

 

jeff Urlwin's prebuilt modules for DBI and DBD::Oracle

ftp://ftp.esoftmatic.com/outgoing/DBI

 

O'Reilly's Perl books

http://perl.oreilly.com/

 

Template script

/downloads/templates/dbi_template.html

 

Perl for DBAs in an MS Word Document