
NAME

Astro::STSDAS::Table::Binary - access a STSDAS binary format file

SYNOPSIS

  use Astro::STSDAS::Table::Binary;
  my $tbl = Astro::STSDAS::Table::Binary->new;
  $tbl->open( $file ) or die( "unable to open $file\n");

  if ( $tbl->is_row_order ) { ... }
  if ( $tbl->is_col_order ) { ... }

  # read an entire table:
  $rows = $tbl->read_rows_hash;
  # or
  $rows = $tbl->read_rows_array;
  # or
  $cols = $tbl->read_cols_hash;
  # or
  $cols = $tbl->read_cols_array;

  # read the next column from a column ordered table:
  $col = $tbl->read_col_col_array;
  # or
  $tbl->read_col_row_hash( \@rows );
  # or
  $tbl->read_col_row_array( \@rows );

  # read the next row from a row ordered table:
  $row = $tbl->read_row_row_array;
  # or
  $row = $tbl->read_row_row_hash;

DESCRIPTION

Astro::STSDAS::Table::Binary provides access to STSDAS binary format tables.

STSDAS binary tables have some special properties:

  • They may be in row (each "record" is a row) or column (each "record" is a column) order. This is handled by having different data read routines for the different orders. They are not entirely symmetric.

    The easy way to deal with this is to simply read the entire table into memory (provided it's small) with one of the read_rows_... or read_cols_... routines.

  • Data elements may be vectors. Vectors are represented in the data as references to lists.

  • Data values may be undefined. Undefined values are converted to the Perl undefined value.
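The last two properties mean that any element of a row or column may be a plain scalar, an array reference (a vector), or the undefined value. The following is a minimal sketch of handling all three cases; the sample row and the format_element helper are hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Hypothetical row, shaped like the data described above:
# a scalar, a vector (array reference), and an undefined value.
my @row = ( 1.5, [ 10, 20, 30 ], undef );

# Render one element as text, whatever its kind.
# (format_element is an illustrative helper, not a module method.)
sub format_element {
    my ($elem) = @_;
    return 'undef' unless defined $elem;
    return '[' . join( ', ', map { defined $_ ? $_ : 'undef' } @$elem ) . ']'
      if ref($elem) eq 'ARRAY';
    return $elem;
}

my @text = map { format_element($_) } @row;
print join( ' | ', @text ), "\n";    # 1.5 | [10, 20, 30] | undef
```

Checking defined before ref keeps undefined values from being mistaken for empty scalars.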

METHODS

Astro::STSDAS::Table::Binary is derived from Astro::STSDAS::Table::Base, and thus inherits all of its methods. Inherited methods are not necessarily documented below.

new
  $self = Astro::STSDAS::Table::Binary->new;

The new method is the class constructor, and must be called before any other methods are invoked.

open
  $tbl->open( file or filehandle [, mode] );

open connects to a file (if it is passed a scalar) or to an existing file handle (if it is passed a reference to a glob). If mode is not specified, the file is opened read-only; otherwise it is opened in the specified mode. Modes are the standard Perl ones (see the Perl open function). If the mode is read-only or read/write, open reads and parses the table header. It returns the undefined value upon error.

close

Explicitly closes the table. This usually need not be called, as the file will be closed when the object is destroyed.

read_rows_hash
  $rows = $tbl->read_rows_hash;

Digest the entire table. This is called after open. The table is stored as an array of hashes, one hash per row. The hash elements are keyed off of the (lower cased) column names.

Vector elements are stored as references to arrays containing the data.

For example, to access the value of column time in row 3,

        $rows->[2]{time}

read_rows_array
  $rows = $tbl->read_rows_array;
  $rows = $tbl->read_rows_array( \%attr );

Digest the entire table. This is called after open. The table is stored as a list of arrays, one array per row.

Vector elements are normally stored as references to arrays containing the data, e.g., if there are three columns, where the second column is a vector of length 3, $rows may look like this:

     $rows->[0] = [ e00, [ e01_0, e01_1, e01_2 ], e02 ]
     $rows->[1] = [ e10, [ e11_0, e11_1, e11_2 ], e12 ]

and

     $rows->[0][2]

extracts row 0, column 2.
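As a rough illustration of working with this layout, the sketch below pulls a whole column, and one component of a vector column, out of an array of arrays. The data are a hand-built, hypothetical stand-in for what read_rows_array is described as returning; nothing here calls the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the structure returned by read_rows_array:
# three columns, the second a vector of length 3.
my $rows = [
    [ 'a0', [ 1, 2, 3 ], 'c0' ],
    [ 'a1', [ 4, 5, 6 ], 'c1' ],
];

# Collect column 2 (zero-based) from every row.
my @col2 = map { $_->[2] } @$rows;

# Collect element 1 of the vector in column 1 from every row.
my @vec_mid = map { $_->[1][1] } @$rows;

print "@col2\n";       # c0 c1
print "@vec_mid\n";    # 2 5
```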

However, if the VecSplit attribute is set to zero, vectors are left inlined in the data, and

 $tbl->read_rows_array( { VecSplit => 0 } )

results in:

     $rows->[0] = [ e00, e01_0, e01_1, e01_2, e02 ]
     $rows->[1] = [ e10, e11_0, e11_1, e11_2, e12 ]

read_cols_hash
  $cols = $tbl->read_cols_hash;

Digest the entire table. This is called after open. The table is stored as a hash, each element of which is a reference to an array containing data for a column. The hash keys are the (lower cased) column names. Vector elements are stored as references to arrays containing the data.
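To make the relationship between the row-wise and column-wise layouts concrete, here is a small sketch that converts an array of hashes (the shape described for read_rows_hash) into a hash of arrays (the shape described here). The sample data and column names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical row-wise data, shaped like the result of read_rows_hash.
my $rows = [
    { time => 0.25, flux => 12 },
    { time => 0.5,  flux => 15 },
    { time => 0.75, flux => 11 },
];

# Build the column-wise form: one array per column name.
my %cols;
for my $row (@$rows) {
    push @{ $cols{$_} }, $row->{$_} for keys %$row;
}

print "time in row 3: $cols{time}[2]\n";    # time in row 3: 0.75
```

The row order is preserved within each column because the rows are visited in sequence.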

For example, to access the value of column time in row 3,

        $cols->{time}[2]

read_cols_array
  $cols = $tbl->read_cols_array;

Digest the entire table. This is called after open. The table is stored as an array, each element of which is a reference to an array containing data for a column. Vector elements are stored as references to arrays containing the data.

For example, to access the value of column 9 in row 3 (both counted from zero),

        $cols->[9][3]

is_row_order

This method returns true if the table is stored in row order.

is_col_order

This method returns true if the table is stored in column order.

read_col_col_array
  $col = $tbl->read_col_col_array;
  $col = $tbl->read_col_col_array( \%attr );

This reads the next column from a column ordered table into an array. It returns a reference to the array containing the data.

Vector elements are normally stored as references to arrays containing the data, e.g.:

     $col->[0] = [ e00, e01, e02 ]
     $col->[1] = [ e10, e11, e12 ]
     $col->[2] = [ e20, e21, e22 ]

However, if the VecSplit attribute is set to zero, they are left inlined in the data:

 $tbl->read_col_col_array( { VecSplit => 0 } )

results in:

    $col->[0] = e00
    $col->[1] = e01
    $col->[2] = e02
    $col->[3] = e10
    ...

This is faster, as the data are originally stored in this format.

The method returns the undefined value if it has reached the end of the data.
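If a column has been read in the inlined form, it can be regrouped into per-row vectors afterwards. The sketch below assumes the caller already knows the vector length; the $vec_len variable and the sample data are hypothetical and do not come from the module.

```perl
use strict;
use warnings;

# Hypothetical flat column, as might result from VecSplit => 0,
# holding three rows of a vector column of length 3.
my $col     = [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ];
my $vec_len = 3;    # assumed to be known by the caller

# Regroup the flat data into one array reference per row.
my @flat = @$col;   # work on a copy so the original is left intact
my @nested;
push @nested, [ splice( @flat, 0, $vec_len ) ] while @flat;

print scalar(@nested), " rows, first: @{ $nested[0] }\n";    # 3 rows, first: 1 2 3
```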

read_row_row_array
  $row = $tbl->read_row_row_array;
  $row = $tbl->read_row_row_array( \%attr );

  $tbl->read_row_row_array( \@row );
  $tbl->read_row_row_array( \@row, \%attr );

This reads the next row from a row-ordered table into an array, in the same order as that of the columns in the table.

It returns the undefined value if there are no more data.

By default it reads the data into an array which is reused for each row. The caller may optionally pass in a reference to an array to be filled.
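Because the default array is reused, storing the returned reference does not preserve a row's values. The sketch below demonstrates the effect with a stand-in reader; make_reader is a hypothetical closure written for this illustration and is not part of the module.

```perl
use strict;
use warnings;

# Stand-in for a reader that, like the default behaviour described
# above, fills and returns the same array on every call.
# make_reader is hypothetical and not part of the module.
sub make_reader {
    my @data = @_;
    my @buffer;
    return sub {
        return unless @data;            # undef in scalar context at end of data
        @buffer = @{ shift @data };
        return \@buffer;
    };
}

my $next = make_reader( [ 1, 'a' ], [ 2, 'b' ], [ 3, 'c' ] );

my ( @aliased, @copied );
while ( defined( my $row = $next->() ) ) {
    push @aliased, $row;       # every entry points at the one buffer
    push @copied,  [@$row];    # shallow copy keeps this row's values
}

print "aliased: ", join( ' ', map { $_->[0] } @aliased ), "\n";    # aliased: 3 3 3
print "copied:  ", join( ' ', map { $_->[0] } @copied ),  "\n";    # copied:  1 2 3
```

The copy is shallow, so it is sufficient for scalar columns; whether nested vector arrays are also reused between rows is not stated here, so a deeper copy may be the safer choice for vector columns.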

Vector elements are normally stored as references to arrays containing the data, e.g.:

     $row->[0] = e0
     $row->[1] = [ e10, e11, e12 ]
     $row->[2] = e2

However, if the VecSplit attribute is set to zero, they are left inlined in the data:

 $tbl->read_row_row_array( { VecSplit => 0 } )

results in:

    $row->[0] = e0
    $row->[1] = e10
    $row->[2] = e11
    $row->[3] = e12
    $row->[4] = e2
    ...

This is faster, as the data are originally stored in this format.
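The two layouts carry the same values. As a small sketch of how they relate, the following turns a hypothetical row in the default nested form into the inlined form in plain Perl, without calling the module.

```perl
use strict;
use warnings;

# Hypothetical row in the default (nested) form.
my $row = [ 'e0', [ 'e10', 'e11', 'e12' ], 'e2' ];

# Expand each vector in place; leave scalars (and undef) alone.
my @inlined = map { ref($_) eq 'ARRAY' ? @$_ : $_ } @$row;

print "@inlined\n";    # e0 e10 e11 e12 e2
```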

read_row_row_hash
  $row = $tbl->read_row_row_hash;
  $tbl->read_row_row_hash( \%row );

This reads the next row from a row-ordered table into a hash, keyed off of the column names. Vector elements are stored as references to arrays containing the data.

It returns the undefined value if there are no more data.

By default it reads the data into a hash which is reused for each row. The caller may optionally pass in a reference to a hash to be filled.

read_col_row_hash
  $tbl->read_col_row_hash( \@rows );

This reads the next column from a column ordered table. The data are stored in an array of hashes, one hash per row, keyed off of the column name. The passed array ref is to that array of hashes. Vector elements are stored as references to arrays containing the data.

It returns undefined if it has reached the end of the data.

This routine is seldom (if ever) called by an application.
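The way an array of row hashes grows one column at a time can be sketched in plain Perl. The column names and values below are hypothetical, and the loop only mimics the accumulation described above; it is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical column data, in the order the columns would be read.
my @columns = (
    [ 'time', [ 0.0, 0.5, 1.0 ] ],
    [ 'flux', [ 12,  15,  11 ] ],
);

my @rows;
for my $column (@columns) {
    my ( $name, $values ) = @$column;

    # Add this column's value to each row's hash, creating rows as needed.
    $rows[$_]{$name} = $values->[$_] for 0 .. $#$values;
}

print "row 1: time=$rows[1]{time} flux=$rows[1]{flux}\n";    # row 1: time=0.5 flux=15
```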

read_col_row_array
  $tbl->read_col_row_array( \@rows );
  $tbl->read_col_row_array( \@rows, \%attr );

This reads the next column from a column ordered table. The data are stored in an array of arrays, one array per row. The passed array ref is to that array of arrays.

Vector elements are normally stored as references to arrays containing the data, e.g., after reading in three columns, @rows might look like this:

     $rows[0] = [ e00, [ e01_0, e01_1, e01_2 ], e02 ]
     $rows[1] = [ e10, [ e11_0, e11_1, e11_2 ], e12 ]

where the second column is a vector of length 3. If the VecSplit attribute is set to zero, vectors are left inlined in the data:

 $tbl->read_col_row_array(\@rows, { VecSplit => 0 } )

results in:

     $rows[0] = [ e00, e01_0, e01_1, e01_2, e02 ]
     $rows[1] = [ e10, e11_0, e11_1, e11_2, e12 ]

This is probably a bit faster, as this is how the data are read in.

It returns undefined if it has reached the end of the data.

This routine is seldom (if ever) called by an application.

CAVEATS

  • This class can only read, not write, tables.

  • Reading of column-ordered tables is untested.

  • Reading of tables with vector elements is untested.

  • Do not delete or add columns by manipulating the table's cols attribute. This will only confuse the table reader, as it assumes a one-to-one mapping between what's in the list of columns and what's in the table.

LICENSE

This software is released under the GNU General Public License. You may find a copy at

   http://www.fsf.org/copyleft/gpl.html

AUTHOR

Diab Jerius ( djerius@cpan.org )