package Lire::Program;

use strict;

use vars qw( $VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS
	     $LR_TAG $LR_ID $PROG $LR_SUPERSERVICE $LR_SERVICE $start 
	     $has_file_temp $init_called );

use File::Basename;
#use File::Spec;
use Fcntl;
use Exporter;
use Symbol;

use Lire::DlfSchema;

=pod

=head1 NAME

Lire::Program - Lire's programs common infrastructure and DLF converter API.

=head1 SYNOPSIS

Any program:

    use Lire::Program qw/ :msg /;

    lr_info( "doing stuff " );
    lr_debug( "debugging information" );
    lr_warn( "encountered unexpected value: ", $value );
    open( CFGFILE, "my.conf" ) or lr_err( "open error: ", $! );

DLF converters:

    use Lire::Program qw/ :dlf /;

    init_dlf_converter( "email" );
    while ( <> ) {
	lire_chomp(); # Optional
	# Convert logfile to DLF
    }
    end_dlf_converter( $lines, $dlflines, $errorlines );

=head1 DESCRIPTION

This module should be used by all Lire programs (at least the perl ones :-).

It includes:

=over

=item 1.

Common behavior for good integration in the Lire suite. (Output of
performance information and other stuff).

=item 2.

Functions for portability.

=item 3.

Logging functions

=item 4.

The perl DLF converter API.

=back

=head1 COMMON BEHAVIOR FOR LIRE PROGRAMS

When you use the Lire::Program module, you make sure that your program
will behave correctly in the Lire tool chain. This module will install
BEGIN and END blocks to comply with Lire's policy.

=over

=item 1.

Only message in the proper logging format

    I<superservice> I<service> I<lr_id> I<program> I<level> I<msg>

should be output. This module will install a __WARN__ signal
handler that make sure that all module that use warn to output message
are rewritten in the proper format.

=item 2.

All programs should start by a C<info> message which logs their
arguments. This module takes care of this.

=item 3.

All programs should end by printing C<info> messages with performance
statistics and the message 'ended'. Using this module takes care of
this. At the end of your program the following will be output
(stripped of the common information):

    memory stats: vsize=10688K rss=9380K majflt=406
    elapsed time in seconds real=9 user=8.72 system=0.06
    stopped

The memory profiling information will only be output on platform
running the Linux kernel.

=back

=head2 COMMON VARIABLES

As a convenience, you can import in your namespace using the :env tag
some variables common to all Lire programs. (Note that you can also
use those variables without importing them by prefixing the variables
with the Lire::Program:: namespace.)

=over

=item $PROG

The name of your program (that's the value of $0 without the directory path).

=item $LR_SUPERSERVICE

The superservice which you are processing, or C<all> if this
information isn't available.

=item $LR_SERVICE

The service which you are processing, or C<all> if this information
isn't specified.

=item $LR_ID

That's the job identifier. It should be shared by all commands in a
Lire job. This is the value UNSET when none was specified.

=item $LR_TAG

That's the prefix to all log messages. Should correspond to

    $LR_SUPERSERVICE $LR_SERVICE $LR_ID $PROG

=back

=cut

BEGIN {
    ($VERSION)	= '$Revision: 1.18 $' =~ m!Revision: ([.\d]+)!;

    ($PROG) = basename $0;
    $LR_SUPERSERVICE    = $ENV{LR_SUPERSERVICE}   || 'all';
    $LR_SERVICE	     = $ENV{LR_SERVICE}	    || 'all';
    $LR_ID  	     = $ENV{LR_ID}	    || "UNSET";
    $LR_TAG	     = "$LR_SUPERSERVICE $LR_SERVICE $LR_ID $PROG";

    $SIG{__WARN__} = sub {
	lr_warn( @_ );
    };

    @ISA = qw( Exporter );

    @EXPORT = qw();

    @EXPORT_OK = qw( tempfile tempdir );

    %EXPORT_TAGS = (
		    dlf => [ qw/ init_dlf_converter end_dlf_converter lire_chomp / ],
		    msg => [qw( lr_emerg lr_crit lr_err lr_warn
				lr_notice lr_info lr_debug) ],
		    env	=> [ qw/ $PROG $LR_SUPERSERVICE $LR_SERVICE $LR_TAG $LR_ID / ],
		   );

    foreach my $tag ( keys %EXPORT_TAGS ) {
	Exporter::export_ok_tags( $tag );
    }
    eval "use File::Temp";
    $has_file_temp = ! $@;
}

# Load File::Spec only once the warn handler is set
# to prevent
# bogus message: Subroutine load redefined at /usr/lib/perl5/5.00503/File/Spec.pm line 24.
# under perl 5.00503
use File::Spec;

=pod

=head1 PORTABILITY FUNCTIONS

For portability across perl versions, the Lire::Program interface
offers some functions that are usually found in the latest version of
perl but that may not be present in some old ones (the oldest version
of perl we support is 5.00503).

To use these functions, you need to import them explicitely or use the
full name (with the namespace).

=head2 tempfile()

    my $fh = tempfile();
    my $fh = tempfile( $template, [ SUFFIX => ".txt" ] )
    my ($fh, $name ) = tempfile( $template, [ SUFFIX => ".dlf" ] );

This is a wrapper around the File::Temp::tempfile perl function when
available, and it offers a home grown version which should be safe
when itsn't available. The only difference is that the file will
always be created in the directory specified in $ENV{TMPDIR} or
F</tmp> when unset.

The first argument to the function should be a template name
containing at least 6 X (i.e. tempXXXXXX) which we replaced to
generate a random name. When no arguments are passed, a default
template of tempfileXXXXXX will be use.

Other options can be passed to the function by using key => value
pair. The only option understood by the home grown version is SUFFIX
which will be appended to the filename. (The perl version understands
more options, but you shouldn't use them for compatibility.)

The function takes precautions against symlink attacks (and create the
file with readwrite permission for the owner only). It will die(), if
it fails to create a temporary file after 10 attempts. (This shouldn't
happen unless someone is seriously trying to race with us.)

The function will return in scalar context a anonymous file handle
opened on the temporary file. The temporary file was unlinked after
creation and will thus be deleted automatically when you close the
file handle.

When used in an array context, the function will return a file handle
and the path to the temporary file (this can be useful for debugging
purpose or when you can't passed the file by file handle to another
process). In this case, the file should be deleted manually.

=cut

my @chars = ( 'a' .. 'z', 'A' .. 'Z', 0 .. 9, );
sub tempfile {
    my $tmpdir;
    if ( File::Spec->can( "tmpdir" ) ) {
	$tmpdir = File::Spec->tmpdir;
    } else {
	$tmpdir = $ENV{TMPDIR} || '/tmp';
    }
    if ( $has_file_temp ) {
	if (@_) {
	    return File::Temp::tempfile( @_, DIR => $tmpdir );
	} else {
	    return File::Temp::tempfile( DIR => $tmpdir );
	}
    }
    # Poor's man tempfile, File::Temp is only part of perl 5.6.1
    my ($tmpl, %args);
    if (@_) {
	($tmpl, %args) = @_;
	$tmpl = $tmpdir . "/" . basename( $tmpl );
    } else {
	$tmpl = $tmpdir . "/tempfileXXXXXX";
    }
    # Try 10 times to open a file
    my $fh = gensym;
    for (0..10) {
	$tmpl =~ s/X/$chars[rand @chars]/ge;
	$tmpl .= $args{SUFFIX} if defined $args{SUFFIX};
	if ( sysopen( $fh, $tmpl, O_RDWR|O_CREAT|O_EXCL, 0600 ) ) {
	    binmode $fh;
	    unlink $tmpl unless wantarray;
	    return wantarray ? ($fh, $tmpl): $fh;
	}
	sleep 1;
    }
    die "Lire::Program::tempfile somebody is trying to race with us !\n";
}

=head2 tempdir()

    my $dir = tempdir();
    my $dir = tempdir( $template )

This is a wrapper around the File::Temp::tempdir perl function when
available, and it offers a home grown version which should be safe
when itsn't available. The only difference is that the directory will
always be created in the directory specified in $ENV{TMPDIR} or
F</tmp> when unset.

The first argument to the function should be a template name
containing at least 6 X (i.e. tempXXXXXX) which we replaced to
generate a random name. When no arguments are passed, a default
template of tempdirXXXXXX will be use.

Other options can be passed to the function by using key => value
pair. The only option understood by the home grown version is DIR
which is specify wehre the directory will be created (The perl version
understands more options, but you shouldn't use them for
compatibility.)

The function takes precautions against symlink attacks (and create the
file with readwrite permission for the owner only). It will die(), if
it fails to create a temporary directory after 10 attempts. (This
shouldn't happen unless someone is seriously trying to race with us.)

The function will return the name of the directory that was created.

=cut

sub tempdir {
    my $tmpdir;
    if ( File::Spec->can( "tmpdir" ) ) {
	$tmpdir = File::Spec->tmpdir;
    } else {
	$tmpdir = $ENV{TMPDIR} || '/tmp';
    }
    if ( $has_file_temp ) {
	if (@_) {
	    return File::Temp::tempdir( @_, DIR => $tmpdir );
	} else {
	    return File::Temp::tempdir( DIR => $tmpdir );
	}
    }
    # Poor's man tempdir, File::Temp is only part of perl 5.6.1
    my ($tmpl, %args);
    if (@_) {
	($tmpl, %args) = @_;
	$tmpl = $tmpdir . "/" . basename( $tmpl );
    } else {
	$tmpl = $tmpdir . "/tempdirXXXXXX";
    }
    # Try 10 times to create a directory
    for (0..10) {
	$tmpl =~ s/X/$chars[rand @chars]/ge;
	if ( mkdir $tmpl, 0700 ) {
	    return $tmpl;
	}
	sleep 1;
    }
    die "Lire::Program::tempdir somebody is trying to race with us !\n";
}

=pod

=head1 LOGGING FUNCTIONS

Programs that need to log messages to the user should import the
logging functions using the :msg tag. Using the proper lr_I<level>()
function will make sure that messages are output in the proper
format.

All logging functions take any number of parameters that will be
joined together to form the message (like print(), die(), warn() and
friends).

=cut

sub label_msg {
    my ( $label, @msg ) = @_;

    # Maybe remove trailing newline of last part of the message.
    chomp $msg[$#msg];

    # Add $LR_TAG err in front of all lines
    my $msg = join "", @msg, "\n";
    $msg =~ s/^/$LR_TAG $label /mg;

    $msg;
}

=pod

=head2 lr_emerg()

This logs a message at the C<emerg> level and aborts your program. This
probably shouldn't be used.

=cut

sub lr_emerg {
    my $msg = label_msg( "emerg", @_ );
    die $msg;
}

=pod

=head2 lr_crit()

This logs a message at the C<crit> level and aborts your program. This
should only be used when something is really broken in the Lire
program or the environment. It is used a few places in Lire when
assertion that should really never fail (like a DLF file with the
wrong number of fields).

=cut

sub lr_crit {
    my $msg = label_msg( "crit", @_ );
    die $msg;
}

=pod

=head2 lr_err()

This logs a message at the C<err> level and usually aborts your
program. (It is the equivalent of a die().) The program won't abort if
it's used in an eval block. Uses this for error condition.

=cut

sub lr_err {
    my $msg = label_msg( "err", @_ );
    die $msg;
}

=pod

=head2 lr_warn()

This logs a message at the C<warning> level. Perl's builtin warn is
mapped to this function. Use this for non-fatal errors.

=cut

sub lr_warn {
    my $msg = label_msg( "warning", @_ );
    print STDERR $msg;
}

=pod

=head2 lr_notice()

This logs a message at the C<notice> level. This should be used for
significant informational messages that the user should see. (By
default, the user will only see messages at level C<notice> or
higher.)

=cut

sub lr_notice {
    my $msg = label_msg( "notice", @_ );
    print STDERR $msg;
}

=pod

=head2 lr_info()

This logs a message at the C<info> level. Use this for general
informational messages.

=cut

sub lr_info {
    my $msg = label_msg( "info", @_ );
    print STDERR $msg;
}

=pod

=head2 lr_debug()

This logs a message at the C<debug> level. Use this for debugging messages.

=cut

sub lr_debug {
    my $msg = label_msg( "debug", @_ );
    print STDERR $msg;
}

=pod

=head1 DLF CONVERTERS FUNCTION

All DLF converters should import the DLF converters API by using the
:dlf import tag. Consult also the Lire::DlfSchema(3pm) man page for
other important information.

=head2 init_dlf_converter()

    init_dlf_converter( $superservice );

This should be called by the DLF converter after its initialization
and before starting to output DLF records. The only parameter is the
name of the superservice for which this converter is outputting DLF
records.

=cut

sub init_dlf_converter {
    my ( $superservice ) = @_;

    # Empty for now
    $init_called = $superservice;
}

=pod

=head2 end_dlf_converter()

    end_dlf_converter( $lines, $dlflines, $errorlines );

This should be called at the end of your program (when it has finished
converting the log). This function logs the conversion statistics in
the standard format and will store in the Lire database the
appropriate information associated to the current job.

The parameters are:

=over

=item $lines

The number of log lines read.

=item $dlflines

The number of DLF record output.

=item $errorlines

The number of errors encountered.

=back

=cut

sub end_dlf_converter {
    my ( $lines, $dlflines, $errorlines ) = @_;

    $errorlines ||= 0;

    lr_err( "Lire::Program::init_dlf_converter wasn't called" )
      unless $init_called;

    lr_info( "read $lines lines; output $dlflines DLF lines; $errorlines errors" );
    
    # Save the values
    system( "lr_db_store", $LR_ID, "loglines",	$lines );
    lr_err "lr_db_store failed" if ( $? != 0 );
    system( "lr_db_store", $LR_ID, "dlflines",	$dlflines );
    lr_err "lr_db_store failed" if ( $? != 0 );
    system( "lr_db_store", $LR_ID, "errorlines", $errorlines );
    lr_err "lr_db_store failed" if ( $? != 0 );

    $init_called = undef;
}

=pod

=head2 lire_chomp( [$line])

DLF converters should use that function instead of the chomp builtin.
This function will remove UNIX or DOS line ending whereas the native
chomp will only remove the native end-of-line marker. Since Lire
process log files that weren't necessarly produce on the same
platform, this function is more tailored to Lire.

The lire_chomp() will remove the end-of-line marker fromt the $line
parameter or from $_ if that parameter is omitted.


=cut

sub lire_chomp(;$) {
    if ( @_ == 1 ) {
	$_[0] =~ s/\r?\n?$//;
    } else {
	$_ =~ s/\r?\n?$//;
    }
}


# This is executed after the function were compiled
BEGIN {
    $start  = time;
    if ( @ARGV ) {
	lr_info ( "started with ", join " ", @ARGV );
    } else {
	lr_info ( "started with no argument" );
    }
}

# FIXME: Make this portable to other OS then Linux 2.X
sub print_linux_memory_stats {
    open PROC, "/proc/self/stat"
      or die "can't open /proc/self/stat";
    my @stats = split /\s+/, <PROC>;
    close PROC;

    # Stat layout is in linux, from proc(5)
    #  0 pid	    10 cminflt	    20 itrealvalue  30 blocked
    #  1 comm	    11 majflt	    21 starttime    31 sigignore
    #  2 state	    12 cmajflt	    22 vsize	    32 sigcatch
    #  3 ppid	    13 utime	    23 rss	    33 wchan
    #  4 pgrp	    14 stime	    24 rlim
    #  5 session    15 cutime	    25 startcode
    #  6 tty	    16 cstime	    26 endcode
    #  7 tpgid	    17 counter	    27 startstack
    #  8 flags	    18 priority	    28 kstkesp
    #  9 minflt	    19 timeout	    29 signal

    # Sanity check to see if we read the good structure
    die "layout of /proc/self/stat doesn't match we expect"
      unless $stats[0] == $$ &&
	$stats[3] == getppid &&
	  $stats[4] == getpgrp 0;

    my $vsize = $stats[22] / 1024;
    my $rss   = $stats[23] * 4;
    my $majflt = $stats[11];
    lr_info( "memory stats: vsize=${vsize}K rss=${rss}K majflt=$majflt" );
}

END {
    eval {print_linux_memory_stats()}
      if -e "/proc/self/stat";

    my $real = time - $start;
    my ($user, $system ) = times;

    # Print performance data
    lr_info( "elapsed time in seconds real=$real user=$user system=$system" );
    lr_info( "stopped" );
}

1;

__END__

=pod

=head1 SEE ALSO

Lire::DlfSchema(3pm)

=head1 VERSION

$Id: Program.pm,v 1.18 2002/07/09 15:26:05 flacoste Exp $

=head1 COPYRIGHT

Copyright (C) 2001 Stichting LogReport Foundation LogReport@LogReport.org

This file is part of Lire.

Lire is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program (see COPYING); if not, check with
http://www.gnu.org/copyleft/gpl.html or write to the Free Software 
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111, USA.

=head1 AUTHOR

Francis J. Lacoste <flacoste@logreport.org>

=cut

