CoreLib: Portable Core Library


Introduction
Application Frameworks
User Interface Elements
Configuration Files
Error Processing
Files and Directories
Memory Management
Byte Stores
String Functions
ValNode Functions
Math Functions
Miscellaneous Utilities
Portability Issues


 Introduction

NCBI has defined a series of header files, basic utility routines, and programming guidelines for the C programming language intended to encourage good programming practice in general and to facilitate the creation of code which will compile and run without change on a variety of hardware platforms under a variety of operating systems and user interfaces, both command line and windowing.  We have developed and tested the system on Intel 80386 and 80486 machines under MS-DOS and Microsoft Windows 3.1, on various Macintosh II machines under Mac-OS, on many different machines under UNIX, on an IBM 3090 running AIX and on VMS VAX. A complete list of systems is given in the README file for the NCBI software toolkit release.

A large number of applications have been written using this set of core tools, and they compile and run without change on all of the above systems.  While there is clearly no perfect or all-inclusive system, we find this one works remarkably well.  This system is not meant to be a universal panacea.  It is meant only to allow the creation of portable code for most of the types of things scientists might want to do on a computer.  It is not expected to support extremely interactive or graphically oriented programs, nor is it meant to support applications with extremely tight computational resource limits.  It is meant to make an average application portable and robust.

Application Frameworks

In C, there is a common programming model that we are all accustomed to, in which command-line arguments are made available to the program's main function and the stdin and stdout streams are used for input and output data.  However, with some of the modern graphical user interfaces, this may not be convenient or desired.  Furthermore, graphical interfaces generally require substantial initialization before any application code runs and may require that specific steps be taken before the application exits.  The exact steps required vary widely from one platform to the next.

To simplify the process of writing programs that run in all of these situations, let us introduce the notion of an application framework, which takes care of whatever initialization and termination steps may be required and provides a uniform mechanism for obtaining program arguments.  The NCBI Toolkit provides two application frameworks.  The first is part of CoreLib and is extremely simple, but useful for "quick-and-dirty" tool development.  The second is provided by Vibrant and supports the full look-and-feel of the target graphical interface (described elsewhere in this manual).

It is important to note that the use of these frameworks is purely optional.  Any NCBI Toolbox function, with the single exception of GetArgs, may be called from any application, whether it is a simple UNIX filter program or a full-blown Macintosh application.

Main Entry Point

To use this simple framework, you should write a function called Main, which you can think of as the entry point to your program.  In fact, the true entry point (main or WinMain or whatever) is a function within CoreLib, which will perform whatever initialization is required for the platform and then call Main.  From Main you will call functions to perform the task for which the program is designed, and then Main should return zero on success or non-zero on failure, as in the following example.

#include <ncbi.h>

Int2 Main()
{
    if (!DoSearch("swissprot","query.aa",12,0.1,"blast.out"))
        return 1;  /* failure */

    return 0;   /* success */
}

You should include the C header file ncbi.h to get the function declaration (prototype) for Main, as well as all other functions, types, and constants described in this chapter.  It in turn includes various other headers, such as one that contains platform-specific definitions (ncbilcl.h) and others that define the interfaces to the various modules (for example, the memory management functions are defined in ncbimem.h).  Since the order of inclusion may be important in certain instances, it is safest to simply include ncbi.h.

Getting Program Arguments

Notice that Main takes no parameters.  How, then, will your program get access to arguments that may be supplied by the user?  Early in the program, you should make a single call to a function called GetArgs to obtain the program's run-time arguments.  Actually, GetArgs does more than just get arguments; it may also prompt the user for input, validate the arguments against allowed ranges, and convert them to the appropriate integer or floating point types.  The Arg structure contains all of the information required to do this and contains storage for the values returned by the function.

Boolean GetArgs (CharPtr progname, Int2 argcount, Arg *arglist)

Gets arguments for the program named progname.  The argument specifications are stored in arglist, an array of Arg structures containing argcount elements.  The Arg structure has the following definition:

typedef struct {
    char    *prompt;        /* Visible prompt for user */
    char    *defaultvalue;  /* Default value */
    char    *from;          /* Low value in allowed range */
    char    *to;            /* High value in allowed range */
    Boolean  optional;      /* Is this argument optional? */
    char     tag;           /* Command-line switch */
    Int1     type;          /* Data type */
    FloatHi  floatvalue;    /* Returned floating point value */
    Int4     intvalue;      /* Returned integer value */
    CharPtr  strvalue;      /* Returned string value */
} Arg, *ArgPtr;

The arguments on the command line are expected to consist of the dash (-) character, a single letter tag, and finally the argument value.  There may be a space between the tag and the value.  For example, "-F fname" and "-Ffname" are equivalent.  The field called type determines how the argument is interpreted as well as where it is stored in the Arg structure.

Datatype Symbol     Data Type Description     Storage Field in Arg

ARG_BOOLEAN         TRUE/FALSE value          intvalue

ARG_INT             Integer value             intvalue

ARG_FLOAT           Floating point value      floatvalue

ARG_STRING          String value              strvalue

ARG_FILE_IN         Name of input file        strvalue

ARG_FILE_OUT        Name of output file       strvalue

ARG_DATA_IN         Datalink in               strvalue  [[ VERIFY ]]

ARG_DATA_OUT        Datalink out              strvalue  [[ VERIFY ]]

Arguments are considered to be required unless the optional field is TRUE.  The user will be prompted for all non-optional arguments not given on the command line using the string supplied in prompt.  Optional arguments not supplied by the user are assigned defaultvalue (which may be NULL).  All numerical arguments are converted from strings to either integer or floating point and validated against the range defined by the from and to fields (either may be NULL for no validation).  If all required arguments have been supplied and validated, TRUE is returned.  If not, or if the only argument given is "-", the program usage is shown and the function returns FALSE.

The earlier Main example may be extended as follows.

#include <ncbi.h>

Arg arg[] = {
  {"Database","nr",NULL,NULL,FALSE,'D',ARG_STRING,0.0,0,NULL},
  {"Query file","query.aa",NULL,NULL,FALSE,'Q',ARG_FILE_IN,0.0,0,NULL},
  /* Threshold is an integer with a range, so it is ARG_INT (read from intvalue) */
  {"Threshold","13","5","25",TRUE,'T',ARG_INT,0.0,0,NULL},
  {"Expect","0.1","0.01","10",TRUE,'E',ARG_FLOAT,0.0,0,NULL},
  {"Output file","blast.out",NULL,NULL,FALSE,'O',ARG_FILE_OUT,0.0,0,NULL}};

Int2 Main()
{
    if (!GetArgs("demo",DIM(arg),arg))
        return 1;  /* failure */

    if (!DoSearch(arg[0].strvalue,arg[1].strvalue,
               arg[2].intvalue,arg[3].floatvalue,arg[4].strvalue))
        return 1;  /* failure */

    return 0;   /* success */
}

The Vibrant version of GetArgs produces a dialog box containing the prompt strings and edit fields into which the user may enter the values.  If all required arguments are supplied, TRUE is returned.
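The command-line convention and range checking described in this section can be illustrated in plain standard C.  The following is only a sketch under stated assumptions, not the GetArgs implementation: the helper names find_switch_value and parse_int_in_range are hypothetical, and the sketch ignores prompting and usage output.  It shows that "-F fname" and "-Ffname" resolve to the same value, and that the from and to limits are themselves strings that may be NULL.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of CoreLib: finds the value that follows
 * the switch -<tag> in argv.  Accepts both "-Ffname" and "-F fname".
 * Returns NULL when the switch is absent or has no value. */
static const char *find_switch_value(int argc, char **argv, char tag)
{
    int i;
    for (i = 1; i < argc; i++) {
        const char *a = argv[i];
        if (a[0] != '-' || a[1] != tag)
            continue;
        if (a[2] != '\0')
            return a + 2;               /* attached form: -Ffname */
        if (i + 1 < argc)
            return argv[i + 1];         /* separated form: -F fname */
        return NULL;
    }
    return NULL;
}

/* Hypothetical helper: converts text to a long and checks it against an
 * optional range given as strings (NULL means no limit on that side).
 * Returns 1 and stores the value on success, 0 otherwise. */
static int parse_int_in_range(const char *text, const char *from,
                              const char *to, long *out)
{
    char *end;
    long v;

    if (text == NULL || *text == '\0')
        return 0;
    v = strtol(text, &end, 10);
    if (*end != '\0')
        return 0;                       /* trailing characters: not a number */
    if (from != NULL && v < strtol(from, NULL, 10))
        return 0;                       /* below the allowed range */
    if (to != NULL && v > strtol(to, NULL, 10))
        return 0;                       /* above the allowed range */
    *out = v;
    return 1;
}
```

In this sketch a value that happens to begin with a dash is not distinguished from the next switch; a real parser would need a rule for that case.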

User Interface Elements

We find it useful to include a minimal set of functions in the core library to provide feedback to the user for such purposes as displaying messages (alerts), providing audible feedback (beeps), and indicating the progress of lengthy operations (monitors).  However, we recognize that a significant amount of customization is needed to suit the tastes and requirements of individual applications programmers using this Toolkit.  Indeed, every single user interface element described below may be replaced by one of your own design.  This is done by registering hook functions with the library that will be called to generate the desired effects.  Without this, your program will get the default functionality provided by CoreLib, which is extremely simple and uses primarily console I/O.  Programs featuring a graphical interface will almost certainly want to install hook functions to provide something more elegant.  For example, the Vibrant application framework installs hooks for all user interface elements prior to calling your Main function.

Alerts

Alerts are used to show a message to the user, which in some cases may be in the form of a question with a small number of possible answers. 

MsgAnswer MsgAlert (MsgKey key, ErrSev sev, const char *capt,
const char *fmt, ...)

Generates a message string using the format string fmt and a variable number of arguments.  The key parameter is used to specify the list of possible user responses and may be any of the following constants.

Symbol          Description

KEY_NONE        No response required (console) or OK button (graphical)

KEY_OK          OK button

KEY_OKC         OK and Cancel buttons

KEY_YN          Yes and No buttons

KEY_YNC         Yes, No and Cancel buttons

KEY_RC          Retry and Cancel buttons

KEY_ARI         Abort, Retry and Ignore buttons

Two additional parameters, a caption string capt and a severity code sev, may be supplied if desired.  Although they  are ignored in the default MsgAlert processing provided by CoreLib, these two arguments are passed through to the message hook function (if any) for use in graphical alerts.  The caption string is intended for use in the caption bar of the alert window (if it has one) and is normally the name of the application.  The severity code is for use in selecting an icon to appear in the content area of the window beside the message text.  Any of the severity constants listed for the ErrPostEx function (described later in this chapter) may be used.

MsgAnswer MsgAlertStr (MsgKey key, ErrSev sev,
const char *caption, const char *str)

Same as MsgAlert except that the message str is a single string instead of a format string and argument list.

MsgAnswer Message (Int2 option, const char *fmt, ...)

Displays a message to the user that is generated from the format specification string fmt and a variable list of arguments.  The option argument modifies the behavior of the function and may be any one of the following.

Symbol          Description

MSG_ERROR       Beep, show the message, and wait for an acknowledgement from the user before continuing.

MSG_FATAL       Beep, show the message, then halt the program by calling the AbnormalExit function.

MSG_OK          Show the message and wait for an acknowledgement from the user before continuing (press the OK button or wait for a keypress).

MSG_OKC         Show the message and prompt for OK/Cancel.

MSG_YN          Show the message and prompt for Yes/No.

MSG_YNC         Show the message and prompt for Yes/No/Cancel.

MSG_RC          Show the message and prompt for Retry/Cancel.

MSG_ARI         Show the message and prompt for Abort/Retry/Ignore.

MSG_POST        Show the message and continue (in graphical interfaces, the alert must generally be dismissed by explicit action of the user).

MSG_POSTERR     Beep, show the message and continue.

Message calls MsgAlert to actually display the message.  If an application property has been installed (see SetAppProperty) with the key "AppName", it is used as the caption when calling MsgAlert.  Otherwise, there is no caption.  The function result is an enumerated type and may be any of the following values.

typedef enum MsgAnswer
{
    ANS_NONE,
    ANS_OK,
    ANS_CANCEL,
    ANS_ABORT,
    ANS_RETRY,
    ANS_IGNORE,
    ANS_YES,
    ANS_NO
} MsgAnswer;

ANS_NONE is returned for the options that do not require any user response (MSG_POST and MSG_POSTERR).


MsgHook SetMessageHook (MsgHook hook)

Installs hook as the function to be called for showing messages.  A pointer to the previous hook function (if any) is returned, so that it is possible to later restore it.  The message hook function should have the following form.

MsgAnswer LIBCALLBACK MyMessageHook (MsgKey key, ErrSev sev,
               const char *caption, const char *message)
{
    MsgAnswer answer = ANS_OK;  /* placeholder until the dialog sets it */

    /* Create a dialog box using caption as the title.  Within the
    *  content region, show message and an icon selected using
    *  sev.  Place buttons on the dialog based on the value of
    *  key.  Wait for the user to press a button, then destroy
    *  the dialog window and return the appropriate answer code. */

    return answer;
}
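The reason a hook-setting function returns the previous hook can be shown in a few lines of standard C.  This is a minimal sketch of the general install-and-restore pattern, not CoreLib code: DemoHook, demo_set_hook, demo_show, and demo_custom_hook are hypothetical names, and the hook signature is simplified to a single string argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook type: receives a message, returns an answer code. */
typedef int (*DemoHook)(const char *message);

static DemoHook current_hook = NULL;    /* no hook installed initially */

/* Installs hook and returns the hook it replaced (NULL if there was none),
 * so the caller can put the old one back later. */
static DemoHook demo_set_hook(DemoHook hook)
{
    DemoHook previous = current_hook;
    current_hook = hook;
    return previous;
}

/* Dispatches to the installed hook, or falls back to default handling. */
static int demo_show(const char *message)
{
    if (current_hook != NULL)
        return current_hook(message);
    return 0;                           /* default handling would go here */
}

/* Example replacement hook: returns the message length as its answer. */
static int demo_custom_hook(const char *message)
{
    return (int)strlen(message);
}
```

A caller would typically save the value returned by demo_set_hook, do its work with the custom hook in place, and then pass the saved value back to demo_set_hook to restore the earlier behavior.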

Beeps

void Beep ()

Sounds an audible beep.

BeepHook SetBeepHook (BeepHook hook)

Installs hook as the function to be called for sounding beeps.  The hook function takes no arguments and has no return value.  SetBeepHook returns a pointer to the previous BeepHook.

void LIBCALLBACK MyBeepHook ()
{
    PlaySoundFile("beep.snd");
}

Int2 Main ()
{
    SetBeepHook(MyBeepHook);

         ...etc...

    return 0;
}

Monitors

Monitors are user interface elements used to indicate the status or progress of potentially lengthy operations.  There are two general types of monitors.  The first is the string monitor, which displays a series of strings, one after the other.  The second is the integer range monitor, which indicates progress of an operation as an integer value within some defined range.  Both kinds must be created before they can be used, with MonitorStrNew or MonitorIntNew respectively, and destroyed with MonitorFree when they are no longer needed.

In addition to these, there is the notion of a default progress monitor, which may be used by calling ProgMon even though it may have been initialized in a completely different code module or not initialized at all.  Normally, the default monitor is created in the top-level application code and registered with the system by calling SetProgMon.

The monitor functionality provided by CoreLib is appropriate for use in situations where console I/O is used.  For applications having a graphical interface or for console-style programs in which customized monitor behavior is desired, you can write your own monitor hook function to implement the user interface and install it by calling SetMonitorHook.  Programs using Vibrant need not do this as the application framework takes care of installing hook functions for all user interface elements.

MonitorPtr MonitorStrNew (const char *title, Int2 len)

Creates a new string monitor with the caption title and returns a pointer to it.  The maximum length of any string value is supplied as the len argument.  NULL is returned on failure.

Boolean MonitorStrValue (MonitorPtr mon, const char *sval)

Sets the value of the string monitor mon to the string sval.  The return value indicates success or failure.

MonitorPtr MonitorIntNew (const char *title, Int4 n1, Int4 n2)

Creates a new integer monitor with the caption title whose extent is from n1 to n2 and returns a pointer to it.  NULL is returned on failure.

Boolean MonitorIntValue (MonitorPtr mon, Int4 ival)

Sets the value of the integer monitor mon to ival.  The return value indicates success or failure.

MonitorPtr MonitorFree (MonitorPtr mon)

Frees the monitor mon, which may be either of the integer or string monitor class.  The return value is always NULL.


MonitorHook SetMonitorHook (MonitorHook hook)

Installs hook as the function to be called to carry out monitor activities.  The value of the previous hook function is returned.  The hook function should have the following form.

int LIBCALLBACK MyMonitorHook (Monitor *mon, MonCode code)
{
    switch (code)
    {
        case MonCode_Create :
            /* allocate memory & create interface elements here */
            if (failure)    /* placeholder for your own error test */
                return FALSE;
            break;

        case MonCode_Destroy :
            /* free memory & destroy interface elements here */
            break;

        case MonCode_IntValue :
            /* update the display with the new integer value here */
            break;

        case MonCode_StrValue :
            /* update the display with the new string value here */
            break;

        default :
            return FALSE;
    }
    return TRUE;
}

Boolean SetProgMon (ProgMonFunc hook, VoidPtr data)

Installs hook as the function to be called for default progress monitor handling.  A pointer to an arbitrary data block data and a string are passed to the hook function when it is called.  In the example below, a normal string monitor is used for default processing.

Boolean LIBCALLBACK MyProgMonHook (void *data, const char *str);

Monitor *defProgMon;

Int2 Main ()
{
    defProgMon = MonitorStrNew("Progress Messages",80);
    SetProgMon(MyProgMonHook,(void*)defProgMon);

         ... do stuff ...

    MonitorFree(defProgMon);
    return 0;
}

Boolean LIBCALLBACK MyProgMonHook (void *data, const char *str)
{
    return MonitorStrValue((Monitor*)data,str);
}

Boolean ProgMon (CharPtr str)

Passes the string str to the default progress monitor.  If no default monitor has been installed with SetProgMon, calling this function has no effect.  The return value is whatever was returned by the default monitor hook function.

 

Configuration Files

A scheme for storing and modifying persistent system and application configuration options is provided.  It is modeled on services provided in the Microsoft Windows environment and has been extended to work on all of the platforms that we support.

File Names

Since each platform may have its own convention for naming configuration files, we have opted to use a common basename from which the actual filename can be derived as appropriate for the system.  This is described in the table below, where xxx represents the basename.

Platform      File Name    Locations searched

UNIX          .xxxrc       1. Path from NCBI environment variable
                           2. User's home directory
                           3. Current working directory

VMS           .xxxrc       1. Path from NCBI environment variable
                           2. User's home directory
                           3. Current working directory

Macintosh     xxx.cnf      1. System Folder:Preferences
                           2. System Folder

MS-DOS        xxx.cfg      1. Path from NCBI environment variable
                           2. Current directory

MS-Windows    xxx.ini      1. Windows directory

File Format

Configuration files are plain ASCII text files that may be edited by the user.  They are divided into sections, each of which is headed by the section name enclosed in square brackets.  Below each section heading is a series of key=value strings, somewhat analogous to the environment variables that are used on many platforms.  Any line that begins with a semi-colon is considered a comment.  The following lines serve as an example of what may appear in a settings file:

[General]
AsnLoad = c:\ncbi\asnload
AsnData = c:\ncbi\asndata

[CD-ROM]
path = E:\

[NetService]
; Note: set USERNAME = ? to be prompted for your username
username=?
host=dispatcher@ncbi.nlm.nih.gov
timeout=30
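To make the file format concrete, the following sketch looks up a key within a named section of configuration text held in memory.  It is written in standard C and is not the CoreLib implementation: trim and config_lookup are hypothetical names, matching is case-sensitive (an assumption of this sketch), and lines longer than the internal buffer are truncated.

```c
#include <ctype.h>
#include <string.h>

/* Cuts trailing white space from s and returns a pointer to its first
 * non-blank character. */
static char *trim(char *s)
{
    char *end;
    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Hypothetical lookup: scans configuration text for key in section sect
 * and copies its value into buf (at most buflen-1 characters).
 * Returns 1 if the key was found, 0 otherwise. */
static int config_lookup(const char *text, const char *sect,
                         const char *key, char *buf, size_t buflen)
{
    char line[256];
    int in_section = 0;
    const char *p = text;

    while (*p != '\0') {
        size_t n = strcspn(p, "\n");                    /* length of this line */
        size_t len = n < sizeof line - 1 ? n : sizeof line - 1;
        char *s, *eq;

        memcpy(line, p, len);
        line[len] = '\0';
        p += n;
        if (*p == '\n')
            p++;

        s = trim(line);
        if (*s == '\0' || *s == ';')
            continue;                                   /* blank line or comment */
        if (*s == '[') {                                /* section heading */
            char *close = strchr(s, ']');
            if (close != NULL)
                *close = '\0';
            in_section = strcmp(s + 1, sect) == 0;
            continue;
        }
        if (!in_section || (eq = strchr(s, '=')) == NULL)
            continue;
        *eq = '\0';                                     /* split key from value */
        if (strcmp(trim(s), key) == 0) {
            if (buflen == 0)
                return 0;
            strncpy(buf, trim(eq + 1), buflen - 1);
            buf[buflen - 1] = '\0';
            return 1;
        }
    }
    return 0;
}
```

White space around the equals sign is discarded, so "AsnLoad = c:\ncbi\asnload" and "timeout=30" are both handled by the same code path.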

 

Configuration File Functions

Boolean SetAppParam (const char *filebase, const char *sect, const char *key, const char *val)

Sets the value of key to val in section sect of the configuration file specified by filebase.  The return value indicates success or failure.

Boolean TransientSetAppParam (const char *filebase,
const char *sect, const char *key, const char *val)

Sets a configuration value like SetAppParam, except that the setting exists only in memory and is not written to the configuration file.

int GetAppParam (const char *filebase, const char *sect,
const char *key, const char *dflt, char *buf, int buflen)

Searches section sect of the configuration file specified by filebase for key and returns its value in the buffer buf.  If key is not found, the default value dflt is copied to buf.  The return value is the number of characters copied to buf, which may be up to buflen-1.

Boolean FindPath (const char *filebase, const char *sect,
const char *key, char *buf, int buflen)

Gets a configuration setting by passing the supplied arguments to GetAppParam (with NULL as the default) and then ensures that the returned string is of the proper form for a filesystem path on the particular platform.

Error Processing

The core library includes functions for posting, reporting, logging, and handling whatever error conditions may be encountered during program execution.  An important concept is that indicating that an error occurred, or posting an error, can be functionally decoupled from the handling of that error.  The function ErrPostEx is provided for posting an error along with an indication of its severity.  If no special provisions have been made, default processing of the error will occur, which may include (depending on the severity) displaying the error to the user and halting the program.  However, there are a number of ways to customize this behavior.  The simplest is to adjust the severity level that will be displayed to the users or that will result in a fatal program exit using ErrSetMessageLevel and ErrSetFatalLevel, respectively.  For maximal control, you can use ErrSetHandler to install your own function that will be called whenever an error is posted.

The software toolkit provides the ability to keep a log of all posted errors, which we have found quite useful as an aid to debugging or for producing reports on large data processing runs.  Error logging is performed at the time an error is posted, regardless of how or when the error is ever handled.  Logging is disabled by default; to enable it, use ErrSetOptFlags with EO_LOGTO_USRFILE as the argument.  The name of the file can be modified with the ErrSetLogfile function.

When interpreting an error message, it is sometimes useful to know something about the context in which the error was posted.  For example, knowing that an error is from the ASN.1 function library as opposed to the network services library might be of assistance in diagnosing problems with a client program that retrieves ASN.1 data from a network service.  In the past we have used defined integer context codes for this purpose.  However, for a variety of reasons, we now prefer to use a string to indicate the context, or module, in which the error occurred.  At a finer granularity, you might want to know the filename and line number in the C source file in which the error was posted, but that is mainly of interest to programmers and not shown by default.  In order to allow some context information to be captured with minimal effort, we make use of two macros, THIS_MODULE and THIS_FILE, which you can (but are not required to) define once at the top of each source file.  Both represent strings and may be defined as NULL if they are to be ignored.  If you do not define them at all, you will inherit the default definitions from ncbierr.h:

#ifndef THIS_MODULE
#define THIS_MODULE  NULL
#endif

#ifndef THIS_FILE
#define THIS_FILE  __FILE__
#endif

ErrPostEx is actually implemented as a macro, which passes these two strings, along with the line number, to the toolbox functions (this obviates the need for several additional arguments).  Since not all linkers will merge duplicate strings, it is usually best to instantiate string variables for the module and filename and define the macros as aliases.  Without doing this it is possible to end up with one copy of each string for each expansion of the ErrPostEx macro. 

A typical example would be:

static char *this_module = "MyModule";
#define THIS_MODULE  this_module

static char *this_file = __FILE__;
#define THIS_FILE this_file

#include <ncbi.h>

 

If you wish to include the ncbi.h header file first, then you can undefine the symbols prior to redefining them. 
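The idea of a macro that silently supplies module, file, and line information can be sketched as follows.  This is an illustration only and makes several assumptions: demo_post and DEMO_POST are hypothetical names, the output is written to a caller-supplied buffer rather than posted anywhere, and the sketch relies on C99 variadic macros, which is not necessarily how the Toolkit itself achieves the effect.

```c
#include <stdarg.h>
#include <stdio.h>

/* Context strings instantiated once per source file, as recommended above. */
static const char *this_module = "MyModule";
#define THIS_MODULE this_module
static const char *this_file = __FILE__;
#define THIS_FILE this_file

/* Hypothetical function: formats "[module] file, line N: message" into out.
 * Returns 0 on success, -1 if the prefix does not fit. */
static int demo_post(char *out, size_t outlen, const char *module,
                     const char *file, int line, const char *fmt, ...)
{
    va_list args;
    int n = snprintf(out, outlen, "[%s] %s, line %d: ",
                     module != NULL ? module : "?", file, line);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    va_start(args, fmt);
    vsnprintf(out + n, outlen - (size_t)n, fmt, args);
    va_end(args);
    return 0;
}

/* The caller writes only the buffer, format, and arguments; the macro adds
 * the context.  (C99 variadic macro: an assumption of this sketch.) */
#define DEMO_POST(out, outlen, ...) \
    demo_post((out), (outlen), THIS_MODULE, THIS_FILE, __LINE__, __VA_ARGS__)
```

Because THIS_MODULE and THIS_FILE expand to variables rather than string literals, each expansion of DEMO_POST refers to the same two strings instead of creating new copies.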

A recent enhancement to the error processing code is support for error message files.  These files may contain information allowing you to (1) convert integer error codes to a mnemonic string on output, (2) provide a verbose explanatory message to be appended to the standard error message, and (3) specify the severity level to be used for any error.  The files are plain ASCII text and fairly easy to edit, so they may be used to customize error reporting according to the preferences of individual users.

Posting An Error

void ErrPost (int context, int errcode, const char *fmt, ...)

Posts a fatal error that is defined by errcode and described to the user by means of a string that is generated from the format string fmt and a variable number of arguments.  The context argument is effectively the equivalent of the module, but it is only displayed if THIS_MODULE has not been defined to anything (i.e., if it is defined to NULL, as it is in ncbierr.h).

NOTE: This is an old function that has been retained for compatibility purposes.  New code should use ErrPostEx instead.

int ErrPostEx (ErrSev sev, int errcode, int subcode,
const char *fmt, ...)

Posts an error of severity sev.  The error is defined by errcode and subcode and described to the user by means of a string that is generated from the format string fmt and a variable number of arguments.  The return value is the same as that returned by ErrPostStr (see below).  The possible severity codes are:

Symbol          Description

SEV_INFO        Purely an informational message, not a true error.

SEV_WARNING     Warning of a possible error condition.

SEV_ERROR       An error has occurred but execution can continue.

SEV_FATAL       A fatal (non-continuable) error has occurred.

int ErrPostStr (ErrSev sev, int errcode, int subcode,
const char *str)

Posts an error as described for the ErrPostEx function except that str contains the descriptive error text as a single string instead of a format string plus variable argument list (hence, it can be called from programs that are written in languages other than C or C++). 

Both ErrPost and ErrPostEx call ErrPostStr after formatting the string and this is where the real work takes place.  First, an internal ErrDesc structure is populated with all of the information describing the error that occurred.  If logging is enabled and sev is greater than or equal to the current LogLevel, this information is then logged according to whatever style flags have been set using the ErrSetOptFlags function.  The user-supplied error handler function is given the first opportunity to handle the error.  If it returns zero or if there is no such function, default processing takes place.  If sev is greater than or equal to the current MessageLevel, Message is called to display the error to the user.  If sev is greater than or equal to the current FatalLevel, the program is halted by calling AbnormalExit.

The return value is one of the "answer codes" (e.g. ANS_OK) that may be returned by the MsgAlert function.  If the programmer has installed an error handler function, it should return one of these codes if it handles the error or zero otherwise.  If MsgAlert was called as a result of default error processing, its result value is returned to the caller.  If neither of these is true, zero is returned.
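The order of operations just described (log, then handler, then display, then fatal exit) can be modeled in a small amount of standard C.  The sketch below is not the ErrPostStr implementation: every name beginning with Demo or demo_ is hypothetical, the severity and answer values are chosen only for the sketch, and the logging, display, and halt steps are recorded in flags instead of being carried out.

```c
#include <stddef.h>

typedef enum { DEMO_INFO = 1, DEMO_WARNING, DEMO_ERROR, DEMO_FATAL } DemoSev;

#define DEMO_ANS_OK 1                   /* stand-in for an answer code */

typedef int (*DemoHandler)(DemoSev sev, const char *text);

typedef struct {
    DemoSev     log_level;              /* minimum severity that is logged    */
    DemoSev     message_level;          /* minimum severity that is displayed */
    DemoSev     fatal_level;            /* minimum severity that halts        */
    int         logging_enabled;
    DemoHandler handler;                /* user handler, or NULL              */
    int         logged;                 /* what the last post actually did    */
    int         displayed;
    int         halted;
} DemoErrState;

/* Models the decision flow: returns the handler's answer if it handled the
 * error, the display answer if the message was shown, and zero otherwise. */
static int demo_post_str(DemoErrState *st, DemoSev sev, const char *text)
{
    int answer = 0;

    st->logged = st->displayed = st->halted = 0;

    if (st->logging_enabled && sev >= st->log_level)
        st->logged = 1;                 /* logging precedes any handling */

    if (st->handler != NULL) {
        answer = st->handler(sev, text);
        if (answer != 0)
            return answer;              /* handler took care of the error */
    }

    if (sev >= st->message_level) {
        st->displayed = 1;              /* stands in for calling Message */
        answer = DEMO_ANS_OK;
    }
    if (sev >= st->fatal_level)
        st->halted = 1;                 /* stands in for AbnormalExit */
    return answer;
}

/* Example handler: swallows warnings, declines everything else. */
static int demo_quiet_warnings(DemoSev sev, const char *text)
{
    (void)text;
    return sev == DEMO_WARNING ? DEMO_ANS_OK : 0;
}
```

With demo_quiet_warnings installed, a warning is still logged (when logging is enabled) but never displayed, which mirrors the point that logging happens regardless of how the error is later handled.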

User Error Strings

One thing we have found to be quite useful is the ability to include in the error messages additional strings defined by the user (meaning the programmer in this case) in order to provide additional information about the context in which the error occurred.  For example, imagine that you have a program that streams through every record in a sequence database performing some sort of analysis or calculation.  Before processing each record you could call ErrUserInstall, supplying (say) its accession number as the string.  Then, if any error occurs during the run, you would know which record was being processed at the time because its accession number would be part of the error message.

ErrStrId ErrUserInstall (const char *msg, ErrStrId id)

If id is zero, the string msg is added to the list of user-defined error strings and a unique id value for that string is returned.  Otherwise, the text of an existing entry in the list identified by id is replaced with msg.

Boolean ErrUserDelete (ErrStrId id)

Deletes the user-defined error string identified by id (returned by ErrUserInstall).

void ErrUserClear ()

Clears the entire list of user-defined error strings.
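The add-or-replace behavior of the id argument can be modeled with a small fixed-size table.  This is a sketch under stated assumptions rather than the CoreLib data structure: the demo_user_* names, the integer id type, and the table and string size limits are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_STRINGS 8
#define DEMO_MAX_LEN     64

static char demo_strings[DEMO_MAX_STRINGS][DEMO_MAX_LEN];
static int  demo_used[DEMO_MAX_STRINGS];

/* id == 0: stores msg in a free slot and returns its new nonzero id.
 * id != 0: replaces the text of the existing entry with that id.
 * Returns 0 if the table is full or id does not name an entry. */
static int demo_user_install(const char *msg, int id)
{
    int i;

    if (id == 0) {
        for (i = 0; i < DEMO_MAX_STRINGS && demo_used[i]; i++)
            ;                           /* find the first free slot */
        if (i == DEMO_MAX_STRINGS)
            return 0;
        demo_used[i] = 1;
        id = i + 1;
    } else if (id < 1 || id > DEMO_MAX_STRINGS || !demo_used[id - 1]) {
        return 0;
    }
    strncpy(demo_strings[id - 1], msg, DEMO_MAX_LEN - 1);
    demo_strings[id - 1][DEMO_MAX_LEN - 1] = '\0';
    return id;
}

/* Removes the entry; returns 1 if it existed, 0 otherwise. */
static int demo_user_delete(int id)
{
    if (id < 1 || id > DEMO_MAX_STRINGS || !demo_used[id - 1])
        return 0;
    demo_used[id - 1] = 0;
    return 1;
}

/* Returns the stored text for id, or NULL if there is no such entry. */
static const char *demo_user_get(int id)
{
    if (id < 1 || id > DEMO_MAX_STRINGS || !demo_used[id - 1])
        return NULL;
    return demo_strings[id - 1];
}
```

In a record-processing loop, the first call would pass zero to obtain an id, and each later iteration would pass that id back to overwrite the text with the current record's identifier.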

Customization

int ErrSetFatalLevel (ErrSev level)

Sets the minimum severity that will result in a fatal exit to level.  The return value is the previous setting.  The default value is SEV_FATAL, but changing it to SEV_MAX will prevent the application from aborting.

int ErrGetFatalLevel ()

Returns the current FatalLevel value.

int ErrSetMessageLevel (ErrSev level)

Sets the minimum severity that will be displayed to the user (via the Message function) to level.  The return value is the previous setting.  The default value is SEV_WARNING, but setting it to SEV_MAX will disable all error reporting.

int ErrGetMessageLevel ()

Returns the current MessageLevel setting.

int ErrSetLogLevel (ErrSev level)

Sets the minimum severity that will be logged to level.  The return value is the previous setting.  The default value is SEV_INFO.  Note that one of the log output channels (logfile, stderr, or trace) must be enabled before any logging will occur.

int ErrGetLogLevel ()

Returns the current LogLevel setting.

int ErrSetLogfile (const char *filename, unsigned long flags)

Sets the name of the error log file to filename (from the default name "error.log").  Note that the EO_LOGTO_USRFILE flag must be set (see ErrSetOptFlags below) to actually enable logging to the named file.  The flags may be any of the following, which may be combined with the bitwise-OR operator.

Symbol                                    Description

ELOG_BANNER               Writes a banner line with the current time and date.

ELOG_APPEND               Appends to an existing file (if there is one).

ELOG_NOCREATE         Do not attempt to create the file at this time (wait until the first error is posted).  Ignored if ELOG_BANNER given.

unsigned long ErrSetOptFlags (unsigned long flags)

Sets one or more bit-flags, which should be combined with the bitwise-OR operator into the flags argument.  The flags may be any of the following [default state in brackets]:

Symbol                                    Description

EO_LOG_SEVERITY       Log an indication of the severity (e.g. "WARNING") to the file [yes]

EO_MSG_SEVERITY       Show an indication of the severity (e.g. "WARNING") to the user [yes]

EO_SHOW_SEVERITY   EO_LOG_SEVERITY | EO_MSG_SEVERITY

EO_LOG_CODES             Log the module name, error code, and subcode [yes]

EO_MSG_CODES             Show the module name, error code, and subcode to the user [yes]

EO_SHOW_CODES         EO_LOG_CODES | EO_MSG_CODES

EO_LOG_FILELINE         Log the source file and line number at which ErrPostEx was called [yes]

EO_MSG_FILELINE        Show the source file and line number to the user [no]

EO_SHOW_FILELINE    EO_LOG_FILELINE | EO_MSG_FILELINE

EO_LOG_USERSTR         Log programmer-defined error strings [yes]

EO_MSG_USERSTR        Show programmer-defined error strings to the user [yes]

EO_SHOW_USERSTR    EO_LOG_USERSTR | EO_MSG_USERSTR

EO_LOG_ERRTEXT        Log the error message [yes]

EO_MSG_ERRTEXT        Show the error message to the user [yes]

EO_SHOW_ERRTEXT    EO_LOG_ERRTEXT | EO_MSG_ERRTEXT

EO_LOG_MSGTEXT       Retrieve and log the verbose explanatory text from a message file [no]

EO_MSG_MSGTEXT       Retrieve the verbose explanatory text from a message file and present it to the user [no]

EO_SHOW_MSGTEXT   EO_LOG_MSGTEXT | EO_MSG_MSGTEXT

EO_XLATE_CODES        Translate the integer error code and subcode into the mnemonic strings defined in an error message file.  (If the file cannot be found, the integer values are displayed as if this flag were not set.)

EO_BEEP                             Produce an audible beep when displaying an error to the user [no]

EO_WAIT_KEYPRESS    Wait for the user to press a key or button before continuing [no]

EO_PROMPT_ABORT    Prompt the user as to whether to abort [no]

EO_LOGTO_USRFILE    Log to the error log file [no]

EO_LOGTO_STDOUT    Log to stdout [no]

EO_LOGTO_STDERR     Log to stderr [no]

EO_LOGTO_TRACE       Log to the "trace device" (see TRACE, below) [no]

unsigned long ErrClearOptFlags (unsigned long flags)

Clears one or more bit-flags (see above), which may be combined with the bitwise-OR operator into the flags argument.

unsigned long ErrTestOptFlags (unsigned long flags)

Tests one or more bit-flags (see above), which may be combined with the bitwise-OR operator into the flags argument.
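
The set, clear, and test operations follow the usual bit-mask idiom.  The sketch below models that idiom with a private flag word.  The DEMO_ names and bit values are hypothetical stand-ins rather than the toolkit's definitions, and returning the previous flag word from the set and clear functions is an assumption, since the text above does not say what these functions return.

```c
/* Hypothetical flag bits; the toolkit's real EO_ values are not shown here. */
#define DEMO_BEEP           0x01UL
#define DEMO_LOG_FILELINE   0x02UL
#define DEMO_MSG_FILELINE   0x04UL
#define DEMO_SHOW_FILELINE  (DEMO_LOG_FILELINE | DEMO_MSG_FILELINE)

static unsigned long demo_flags = 0;

/* Turn on every bit given in flags; returns the previous flag word
   (the return convention is an assumption). */
unsigned long demo_set_flags(unsigned long flags)
{
    unsigned long prev = demo_flags;
    demo_flags |= flags;
    return prev;
}

/* Turn off every bit given in flags; returns the previous flag word. */
unsigned long demo_clear_flags(unsigned long flags)
{
    unsigned long prev = demo_flags;
    demo_flags &= ~flags;
    return prev;
}

/* Returns the subset of flags that is currently set (zero if none). */
unsigned long demo_test_flags(unsigned long flags)
{
    return demo_flags & flags;
}
```

Because a combined symbol such as DEMO_SHOW_FILELINE is just the OR of two bits, setting it and then clearing one half leaves the other half set.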

void ErrSaveOptions (ErrOpts *erropt)

Copies all error option settings to the local buffer pointed to by erropt.  This will include severity levels for logging, displaying, and aborting as well as all option flags.  This should be done prior to changing any settings if you intend to later restore the state.

void ErrRestoreOptions (const ErrOpts *erropt)

Restores the error options state using the information that was previously captured using ErrSaveOptions in the buffer pointed to by erropt.

Configuration File Settings

The main configuration file for the NCBI toolkit (variously called .ncbirc, ncbi.ini, or ncbi.cfg, etc., depending on the platform) may contain settings within the "ErrorProcessing" section to provide additional runtime customization.  Here are some example settings:

[ErrorProcessing]

 

;  need to tell the system where the message files are kept

MsgPath=/sun/ncbi/errmsg

 

SEV_INFO    = "==> note    "

SEV_WARNING = "==> WARNING "

SEV_ERROR   = "==> ERROR   "

SEV_FATAL   = "==> FATAL   "

 

;  override a few of the option flags

EO_SHOW_MSGTEXT = 1   ;always show me everything...

EO_BEEP = 0           ;...but those beeps drive me nuts

The MsgPath key is used to tell the system where to look for error message files.  If this setting is not present, only the current directory will be examined.  Failure to locate the error message file will not prevent any application from running.  Instead, it will simply not be possible to convert integer error codes to strings or to display verbose error messages.

The strings that are used to indicate the severity of the error ("WARNING", for example) may be modified if desired.  To do so, use the same symbols used to indicate severity in your code (e.g., SEV_WARNING) as the key with the desired string as the value.  In the example above, the strings are quoted, but this is only required if leading or trailing spaces are to be included in the string.

In a similar fashion, each of the option flags may be set or cleared by using the symbol for that flag as the key and either 1 (one) or 0 (zero) as the value (alternatively, you can use YES/NO or TRUE/FALSE).  Note that these settings override anything that the programmer may have chosen to implement.  For example, if the configuration file contained the line EO_BEEP=0, there would be no beeps sounded on an error even if the code explicitly contained the command ErrSetOptFlags(EO_BEEP).


Preparing Error Message Files

You can use your favorite text editor to prepare error message files as they are plain ASCII text.  However, the name of the file is significant; it must be derived from the module name by converting to all lower case characters and appending the ".msg" extension.  For example, the message file for the "CoreLib" module should be called "corelib.msg" (shown below).  The first line of the file should consist of the keyword "MODULE" followed by the name of the module (e.g. "CoreLib" in the example below). 

MODULE CoreLib

 

$$ NoMemory, 1, SEV_FATAL

 

$$ File, 2, SEV_INFO

 

$^   Open, 1

This often indicates that the file simply does not exist.

Alternatively, it may exist but you do not have permission to

access the file in the requested mode.

 

$^   Read, 2

Not sure what would cause this...

 

$^   Write, 3, SEV_FATAL

This may indicate that the filesystem is full.

 

$$ Math, 3

$^   Param, 1

$^   Domain, 2

$^   Range, 3

$^   Iter, 4

 

$$ SGML, 4

$^   Init, 1

$^   Braces, 2

$^   Entity, 3

Lines beginning with "$$" are used to define a main-level error code.  The first two (comma delimited) tokens on the line are the mnemonic string and integer representations of the error.  In the example above, the string "NoMemory" is equated to error code 1.  These two tokens are required, but a third optional token may be supplied to specify a severity level to be used when posting that error.  Note that this overrides the severity used by the programmer and therefore allows for runtime customization of the program.  In the example above, all "NoMemory" errors would be fatal.

In a similar manner, lines beginning with "$^" may be used to define subcodes within the scope of the maincode that appears above it.  In the example above, "Open" is a subcode within "File".  A subcode can inherit a severity from its parent if it does not have one of its own.  For example, SEV_INFO would be used for all "File" errors with the exception of "Write", which would be SEV_FATAL.

Below any maincode or subcode line you may (optionally) enter a block of text to be used as the verbose error message.  This is appended to the actual error message posted by the program and is intended to provide additional explanation.  A common pitfall is to repeat the error message in the explanation.  For example, you would probably not want to begin the explanation for File.Write with "A file write error has occurred" because this would almost certainly be in the original error message. 
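
To make the file format concrete, here is a minimal sketch of how the file name could be derived from a module name and how a single "$$" or "$^" line could be split into its tokens.  The function names are hypothetical and this is not the toolkit's own parser; it ignores the MODULE line, the explanatory text blocks, and severity inheritance.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Build the message file name from a module name: convert to lower case
   and append ".msg".  Returns 0 on success, -1 if out is too small. */
int demo_msg_file_name(const char *module, char *out, size_t outlen)
{
    size_t n = strlen(module);
    size_t i;

    if (n + 5 > outlen)          /* name + ".msg" + terminating null */
        return -1;
    for (i = 0; i < n; i++)
        out[i] = (char) tolower((unsigned char) module[i]);
    strcpy(out + n, ".msg");
    return 0;
}

/* Parse one "$$" (main code) or "$^" (subcode) line.
   Returns 1 for a main code, 2 for a subcode, 0 for any other line.
   sev receives the optional severity token, or "" if none was given.
   name and sev must each have room for 32 characters. */
int demo_parse_code_line(const char *line, char *name, int *code, char *sev)
{
    int level;
    int nfields;

    if (strncmp(line, "$$", 2) == 0)
        level = 1;
    else if (strncmp(line, "$^", 2) == 0)
        level = 2;
    else
        return 0;

    sev[0] = '\0';
    nfields = sscanf(line + 2, " %31[^, \t] , %d , %31s", name, code, sev);
    if (nfields < 2)             /* mnemonic and integer are both required */
        return 0;
    return level;
}
```

A caller would read the file line by line, treat any line for which the parser returns 0 as explanatory text belonging to the most recent code, and fall back to the parent's severity when a subcode leaves sev empty.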

Fetching and Displaying Errors

int ErrPeek ()

Returns a non-zero value if an error is pending (i.e., has been posted but not yet processed).

int ErrCopy (ErrDesc *errdesc)

If an error is pending, information about the error is copied to the local buffer pointed to by errdesc and a non-zero value is returned.  The error is not cleared by this function and may still be displayed by calling ErrShow.  Zero is returned if no error is pending.

void ErrClear ()

Clears a pending error, if there is one.

int ErrFetch (ErrDesc *errdesc)

Copies the description of the pending error, if there is one, to errdesc and then clears the error condition (functionally equivalent to ErrCopy followed by ErrClear).

int ErrShow ()

If an error is pending and its severity is greater than or equal to the MessageLevel setting, it is displayed to the user via a call to the MsgAlert function.  If the EO_PROMPT_ABORT option flag has been set, the message includes the question "Abort, Retry, or Ignore ?".  In this case, if the user responds "Abort", the program aborts (otherwise, execution continues and either ANS_RETRY or ANS_IGNORE is returned so the caller may decide whether or not to distinguish between these alternatives).  If the EO_PROMPT_ABORT bit is cleared (as it is by default), the program will abort if the severity of the error is greater than or equal to the FatalLevel.

Installing Custom Error Handlers

ErrHookProc ErrSetHandler (ErrHookProc hook)

Installs hook as the function to be called when an error is posted.  If error logging is enabled, the error will already have been logged before the hook function is called.  The hook function takes a pointer to an ErrDesc structure and should return a non-zero value if it handled the error (preferably ANS_OK) and zero if it did not.  In the latter case, the system will perform the default error handling, which may involve displaying the error and/or halting the program.


Here's an Example:

#include <ncbi.h>

 

int LIBCALLBACK MyErrorHandler (const ErrDesc *err)

{

    if (strcmp(err->cntxstr,"CoreLib") ==0 &&

               err->errcode == E_NoMemory)

    {

          Beep(); Beep(); Beep(); /* something they'll notice! */

          ReleaseLifeboat(); /* free up memory reserves */

          return ErrShow();

    }

    return 0;  /* zero means we didn't handle the error */

}

 

int main (int argc, char **argv)

{

    ErrSetHandler(MyErrorHandler);

 

    ... do stuff ...

 

    return 0;

}

 

Miscellaneous Utility Functions

void ErrLogPrintf (const char *fmt, ...)

Formats a string using a printf-style format string fmt and a variable-length list of arguments and then writes it to any error logging streams that may have been enabled.

void ErrLogPrintStr (const char *str)

Similar to ErrLogPrintf, except that str is a single string to be written to the error logging streams (for users of programming languages other than C or C++).

void AbnormalExit (int)

Terminates the program immediately.  This function should only be called if an application is not capable of exiting any other way.  Cleanup code (e.g. closing files and sockets) will not generally get called before the program halts.  On some systems, calling this function may also invoke a debugger if one has been installed.

Files and Directories

[[ ...insert text here... ]]

ANSI-Style Functions

NCBI Toolkit        ANSI C        Description

FileOpen            fopen         Opens a file for reading or writing

FileClose           fclose        Closes an open file

FileRead            fread         Reads bytes from an open file

FileWrite           fwrite        Writes bytes to an open file

FileGets            fgets         Reads a string from an open file

FilePuts            fputs         Writes a string to an open file

Directory Management

Boolean CreateDir (char *pathname)

Creates a directory called pathname.  The return value indicates success or failure.

void FileCreate (char *fileName, char *type, char *creator)

[[ ...insert text here... ]]

char* TmpNam (char *tmp)

Generates a unique temporary file name.  If a pointer to a string buffer is supplied as the tmp argument, the name is placed there; otherwise it is formatted in a static buffer within the library.  Either way, the return value points to the generated temporary file name.

Boolean FileRemove (char *fileName)

Removes the file fileName, if it exists.  The return value indicates success or failure.

Boolean FileRename (char *oldFileName, char *newFileName)

Changes the name of a file from oldFileName to newFileName.  The file cannot generally be moved from one directory to another, so no path should be included.  The return value indicates success or failure.

char* FileBuildPath (char *root, char *subpath, char *filename)

[[ ...insert text here... ]]

char* FileNameFind (char *pathname)

Returns a pointer to the filename portion of pathname.  For example, on a Macintosh system, FileNameFind("harddisk:System Folder:Excel Settings") would return a pointer to "Excel Settings".
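
The behaviour described for FileNameFind amounts to locating the last directory delimiter in the path.  A minimal sketch follows; the function name is hypothetical, and the delimiter is passed as an argument here only so that the example does not depend on any platform setting.

```c
#include <string.h>

/* Return a pointer to the part of pathname that follows the last
   delimiter, or pathname itself if it contains no delimiter.
   delim must not be the null character. */
const char *demo_file_name_find(const char *pathname, char delim)
{
    const char *p = strrchr(pathname, delim);
    return (p != NULL) ? p + 1 : pathname;
}
```

Note that the result points into the caller's string; nothing is copied or allocated.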

Int4 FileLength (char *fileName)

Returns the length (in bytes) of the file called fileName.  If the length cannot be determined, zero is returned.

CD-ROM

Boolean EjectCd (char *sVolume, char *deviceName, char *rawDeviceName, char *mountPoint, char *mountCmd)

Ejects a CD-ROM from the device.  This function has no effect on DOS and Microsoft Windows systems.

Parameter           Operating System    Description

sVolume             MacOS               Volume name, e.g. "SEQDATA"

deviceName          UNIX                Name of CD-ROM device, e.g. "/dev/sr0"

rawDeviceName       UNIX                Name of raw device, e.g. "/dev/rsr0"

mountPoint          UNIX, VMS           Filesystem location where CD-ROM data should be mounted, e.g. "/cdrom"

mountCmd            UNIX                A script or program that performs the ejecting and mounting actions.  For many UNIX systems, mounting requires super-user privileges.

Boolean MountCd (char *sVolume, char *deviceName, char *mountPoint, char *mountCmd)

Mounts a CD-ROM.  The parameters are the same as those described for EjectCd, except that the raw device is not needed.  This function has no effect on DOS and Microsoft Windows systems.


Customization

void SetFileOpenHook (FileOpenHook hook)

Installs hook as the function to be called by FileOpen to actually open the file. The arguments of the hook function are the same as those for FileOpen.  In the following example, the hook function looks to see if a full path is given and, if so, first attempts to open a file of the same name in the current directory.  If that fails, a normal file open is performed using the fopen function.

FILE* LIBCALLBACK  MyFileOpenHook (const char *fname, const char *fmode);

 

Int2 Main ()

{

    SetFileOpenHook (MyFileOpenHook);

 

    ... do stuff ...

 

    return 0;
}

 

FILE* LIBCALLBACK  MyFileOpenHook (const char *fname, const char *fmode)

{

    /* Note:  FileOpen checks for NULL arguments, so we don't

          have to do it here */

 

    char *p = strrchr(fname,DIRDELIMCHR);  /* last delimiter, so p+1 is the bare file name */

    if (p != NULL)

    {

          FILE *fd = fopen(p+1,fmode);

          if (fd != NULL)

               return fd;

    }

    return fopen(fname,fmode);
}

 

Memory Management

Services for the dynamic allocation and deallocation of memory differ widely among platforms.  All of them provide standard ANSI functions, such as malloc, which allocate non-relocatable memory blocks that are referenced by pointers.  In addition, the Macintosh and Microsoft Windows environments provide the ability to allocate relocatable memory blocks, which are designed to reduce heap fragmentation.  Relocatable blocks are referenced by handles instead of pointers.  They must be locked before use and a valid pointer is obtained as part of the lock operation.  When they are later unlocked, the pointer becomes invalid.  For most routine uses, we recommend using fixed memory.  Clearly, fixed memory is easier to use (since locking is unnecessary), and ongoing advances in chip architectures and system software on the microcomputer platforms are eliminating the performance advantage that relocatable memory currently offers.

ANSI-Style Functions

NCBI Toolkit        ANSI C        Description

Malloc              malloc        Allocates memory.

Calloc              calloc        Allocates memory and clears it to zeros.

Realloc             realloc       Changes the size of a previously allocated block.

Free                free          Frees a memory block.

MemCopy             memcpy        Copies a range of bytes.

MemMove             memmove       Copies a range of bytes (source and destination may overlap).

MemFill             memset        Sets a range of bytes to a particular value.


Fixed Memory

void* MemGet (size_t size, unsigned int flags)

Allocates a fixed memory block containing size bytes using options encoded in flags and returns a pointer to it. The flags may be any of the following:

Symbol                                    Description

MGET_CLEAR                  Clears the allocated memory to zeros.

MGET_ERRPOST             Posts an error (severity SEV_FATAL) on memory allocation failure.

void* MemNew (size_t bytes)

General-purpose fixed memory allocator that calls MemGet with the MGET_CLEAR and MGET_ERRPOST flags.

void* MemMore (void *ptr, size_t size)

Changes the size of a fixed memory block ptr to size bytes.  On failure, NULL is returned.

void* MemExtend (void *ptr, size_t size, size_t oldsize)

Changes the size of a fixed memory block ptr, the current size of which must be supplied as oldsize, to a new size of size.  If size is greater than oldsize, the additional memory is cleared to zeros.  On failure, NULL is returned.

void* MemDup (const void *ptr, size_t size)

Duplicates a fixed memory block ptr, the size of which must be supplied as the size argument.  On failure, NULL is returned.

void* MemFree (void *ptr)

Frees the fixed memory block ptr (if it is non-NULL).  The return value is always NULL.
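
The semantics described for MemExtend and MemFree can be illustrated with standard ANSI calls.  The sketch below is not the toolkit's implementation; the demo_ functions are hypothetical stand-ins that show why the old size is needed (to know which bytes to clear) and why a free function that always returns NULL is convenient.

```c
#include <stdlib.h>
#include <string.h>

/* Resize ptr from oldsize to size bytes; any added bytes are zeroed.
   Returns NULL on failure, in which case the original block is untouched.
   This sketch assumes size is greater than zero. */
void *demo_mem_extend(void *ptr, size_t size, size_t oldsize)
{
    void *p = realloc(ptr, size);

    if (p == NULL)
        return NULL;
    if (size > oldsize)
        memset((char *) p + oldsize, 0, size - oldsize);
    return p;
}

/* Free ptr if it is non-NULL and always return NULL, so that callers can
   write ptr = demo_mem_free(ptr) and never keep a dangling pointer. */
void *demo_mem_free(void *ptr)
{
    free(ptr);      /* free(NULL) is a no-op in ANSI C */
    return NULL;
}
```

Assigning the result of the free call back to the pointer, as in ptr = demo_mem_free(ptr), makes a later accidental use fail on a NULL pointer rather than on freed memory.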

Relocatable Memory

We provide functions for the manipulation of relocatable memory, but on systems where this is not available, fixed memory is used instead (and the type Handle is equivalent to Pointer).

Handle HandGet (size_t size, Boolean clear)

Allocates a moveable memory block containing size bytes.  On failure, NULL is returned (no error is posted).

Handle HandNew (size_t size)

Allocates a moveable memory block containing size bytes and clears it to zeros.  On failure, an error is posted (SEV_FATAL) and NULL is returned.

Handle HandMore (Handle hnd, size_t size)

Changes the size of moveable block hnd to size bytes.  On success, the return value is the re-sized block, which may or may not be the same as hnd.  On failure, NULL is returned. 

Handle HandFree (Handle hnd)

Frees moveable memory block hnd (if non-NULL).  The return value is always NULL.

void* HandLock (Handle hnd)

Locks moveable memory block hnd and returns a pointer to it.  The pointer remains valid until HandUnlock is called. 

void* HandUnlock (Handle hnd)

Unlocks moveable memory block hnd that was previously locked with HandLock.

NOTE: Do not nest HandLock/HandUnlock calls.  Although some systems will handle this situation properly, others do not.  Notably, on the Macintosh, a block will always be unlocked by HandUnlock regardless of how many times it was locked.

Byte Stores

We have implemented an additional type of dynamic storage called a ByteStore.  It is designed to look and behave much like an unformatted file, but its data exist in memory (however, due to various virtual memory schemes, the data may be in files after all!).  A ByteStore is especially useful for storing large amounts of data that would normally exceed the limits imposed by systems using 16-bit memory addressing.

A ByteStore is created by BSNew and the pointer it returns is a required argument for the remaining ByteStore functions.  At the end of its lifespan, it is deallocated by the BSFree function.

A ByteStore has a logical length, which is returned by BSLen and corresponds to the number of bytes of data it contains.  The physical length of a ByteStore is the actual amount of memory allocated and is often larger than the logical length.  All functions that add data to a ByteStore automatically take care of increasing the physical length to accommodate the logical length.

As with file I/O, a ByteStore uses the notion of a current position that is used for reading and writing data.  The functions BSSeek and BSTell provide a means of setting and querying the current position.  The functions BSRead, BSWrite, BSGetByte, and BSPutByte are analogous to functions used in file I/O.  However, unlike file I/O, a block of data may be inserted or deleted internally using BSInsert and BSDelete.

ByteStorePtr BSNew (long len)

Creates a ByteStore with an initial physical length of len and returns a pointer to it.  If len is zero, a default physical size is used.

ByteStorePtr BSDup (ByteStorePtr bs)

Creates a new ByteStore that is a copy of bs.

void* BSMerge (ByteStorePtr bs, void *buff)

Copies all of the data in ByteStore bs to a single memory buffer buff.  If buff is NULL, the function will allocate a buffer of the correct size.  Otherwise, it is the responsibility of the caller to ensure that the buffer is at least as large as the value returned by BSLen.  The return value is the pointer to the buffer containing the merged data.

ByteStorePtr BSFree (ByteStorePtr bs)

Deallocates ByteStore bs (if bs is NULL, nothing happens).  The return value is always NULL.

long BSLen (ByteStorePtr bs)

Returns the logical length of ByteStore bs.

Int2 BSSeek (ByteStorePtr bs, long offset, int origin)

Moves the current position of ByteStore bs to offset bytes from the point indicated by origin, which may be any of the following:

Symbol                                    New position

SEEK_SET                           offset bytes from the beginning of the ByteStore

SEEK_END                         offset bytes from the end of the ByteStore

SEEK_CUR                          offset bytes from the current position

long BSTell (ByteStorePtr bs)

Returns the current position of ByteStore bs.

long BSWrite (ByteStorePtr bs, void *buff, long len)

Writes len bytes of data from the memory buffer buff to the current position of ByteStore bs.  Following this operation, the current position will be increased by the number of bytes written.  The return value is the same as len if the write was successful or zero if not. 

long BSRead (ByteStorePtr bs, void *buff, long len)

Attempts to read len bytes of data from the current position of ByteStore bs to the buffer buff.  The return value is the number of bytes actually read, which may be less than len if the logical end of data is reached.  Following this operation, the current position will be increased by the number of bytes read.

long BSInsert (ByteStorePtr bs, void *buff, long len)

Inserts len bytes from memory buffer buff before the current position in ByteStore bs.  The current position is then increased by len so that it points to the position just after the inserted range.  The return value is the same as len if the insertion was successful or zero if not. 

long BSInsertFromBS (ByteStorePtr bs, ByteStore *bs2, long len)

Inserts len bytes into ByteStore bs by reading them from a second ByteStore bs2.  The return value is the actual number of bytes transferred, and the current positions of both ByteStores will be increased by this amount.

long BSDelete (ByteStorePtr bs, long len)

Deletes len bytes from ByteStore bs beginning at the current position (which is not changed by the operation).  The return value specifies the actual number of bytes deleted, which may be less than len if the logical end of data was reached.

Int2 BSPutByte (ByteStorePtr bs, int b)

Inserts the byte b at the current position of ByteStore bs and advances the position by one.  If b is equal to the constant EOF, the ByteStore is truncated at the current position.  The return value is b on success or EOF on failure.

Int2 BSGetByte (ByteStorePtr bs)

Returns the byte at the current position of ByteStore bs and advances the position by one.  If the logical end of data has been reached, the constant EOF is returned.
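
The distinction between logical length, physical length, and current position, and the way insertion and deletion treat the current position, can be modelled with a single growable buffer.  The sketch below is only an illustration: DemoStore and its functions are hypothetical, and nothing here is meant to describe how a real ByteStore is laid out in memory.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *data;
    long len;    /* logical length: bytes of data stored */
    long cap;    /* physical length: bytes allocated */
    long pos;    /* current position */
} DemoStore;

/* Make sure at least need bytes are allocated; returns 0 on failure. */
static int demo_reserve(DemoStore *s, long need)
{
    long cap = (s->cap > 0) ? s->cap : 16;
    unsigned char *p;

    if (need <= s->cap)
        return 1;
    while (cap < need)
        cap *= 2;
    p = (unsigned char *) realloc(s->data, (size_t) cap);
    if (p == NULL)
        return 0;
    s->data = p;
    s->cap = cap;
    return 1;
}

/* Insert len bytes before the current position, then advance past them.
   Returns len on success or zero on failure. */
long demo_insert(DemoStore *s, const void *buff, long len)
{
    if (len <= 0 || !demo_reserve(s, s->len + len))
        return 0;
    memmove(s->data + s->pos + len, s->data + s->pos,
            (size_t) (s->len - s->pos));
    memcpy(s->data + s->pos, buff, (size_t) len);
    s->len += len;
    s->pos += len;
    return len;
}

/* Delete up to len bytes starting at the current position, which does
   not move.  Returns the number of bytes actually deleted. */
long demo_delete(DemoStore *s, long len)
{
    long avail = s->len - s->pos;

    if (len > avail)
        len = avail;
    if (len <= 0)
        return 0;
    memmove(s->data + s->pos, s->data + s->pos + len,
            (size_t) (avail - len));
    s->len -= len;
    return len;
}
```

A store starts out as { NULL, 0, 0, 0 }.  The physical length grows by doubling, so it is usually larger than the logical length, which is the relationship the text above describes.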

String Functions

ANSI-Style Functions

NCBI Toolkit        ANSI C        Description

StringLen           strlen        Gets string length

StringCpy           strcpy        Copies a string

StringNCpy          strncpy       Copies a string (n chars)

StringCat           strcat        Catenates strings

StringNCat          strncat       Catenates strings (n chars)

StringCmp           strcmp        Compares strings

StringNCmp          strncmp       Compares strings (n chars)

StringChr           strchr        Searches for a character in a string (from beginning)

StringRChr          strrchr       Searches for a character in a string (from end)

StringPBrk          strpbrk       Searches for the first character in a string that is a member of a specified set

StringStr           strstr        Searches for a substring in a string

StringSpn           strspn        Counts leading characters that are members of a specified set

StringCSpn          strcspn       Counts leading characters that are not members of a specified set

StringSet           strset        Sets all characters of a string to a specified character

StringNSet          strnset       Sets up to n characters of a string to a specified character

StringTok           strtok        Breaks a string into tokens

Refer to ANSI C documentation for details.

Additional String Functions

int StringICmp (const char *a, const char *b)

Compares strings a and b like StringCmp, but ignoring case (assumes ASCII character set). 

int StringNICmp (const char *a, const char *b, size_t n)

Compares up to the first n characters of strings a and b like StringNCmp, but ignoring case (assumes ASCII character set).

char* StringMove (char *dst, const char *src)

Copies the string src to dst and returns a pointer to the null byte that terminates the copied string, so that successive calls can be chained to build a concatenated string.

char* StringSave (const char *str)

Copies str to a dynamically allocated memory block and returns a pointer to that block.

size_t StringCnt (const char *str, const char *list)

Searches the string str for any of the characters of the string list and returns the number of occurrences found.
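
As an illustration of the behaviour described for StringMove and StringCnt, here is a sketch written with plain ANSI C.  The demo_ functions are hypothetical equivalents, not the library source.

```c
#include <string.h>

/* Copy src to dst and return a pointer to the terminating null in dst,
   so the next piece can be appended without rescanning the string. */
char *demo_string_move(char *dst, const char *src)
{
    while ((*dst = *src++) != '\0')
        dst++;
    return dst;
}

/* Count the characters of str that appear anywhere in list. */
size_t demo_string_cnt(const char *str, const char *list)
{
    size_t n = 0;

    for (; *str != '\0'; str++)
        if (strchr(list, *str) != NULL)
            n++;
    return n;
}
```

Returning the end of the copied string is what makes a chain such as p = demo_string_move(p, "core"); p = demo_string_move(p, "lib"); cheaper than repeated calls to strcat, each of which would have to find the end of the destination again.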

Number Strings

Several functions are provided for converting integers to ASCII strings.  The following option flags determine how the string is formatted (may be combined with the bitwise-OR operator).

Symbol                                    Description

MISC_COMMAS               Insert commas only when |value| >= 10,000

MISC_ALLCOMMAS      Insert commas for any |value| >= 1,000

MISC_ANYCOMMAS     Both MISC_COMMAS and MISC_ALLCOMMAS

MISC_PLUSSIGNS           Prepend a plus sign (+) to positive values               

char* Ltostr (long x, int opts)

Converts the integer value x to ASCII using options opts.

int Lwidth (long x, int opts)

Returns the length of the string that would result from the conversion of integer value x to ASCII using options opts.

char* Ultostr (unsigned long x, int opts)

Converts the unsigned integer value x to ASCII using options opts.

int Ulwidth (unsigned long x, int opts)

Returns the length of the string that would result from the conversion of unsigned integer value x to ASCII using options opts.
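
The comma options can be pictured with the following sketch, which formats a long with thousands separators.  It reflects one reading of the option table above (no commas below a chosen magnitude); the function name and the min_commas parameter are hypothetical and are not part of the library.

```c
#include <stdio.h>

/* Format x into buf with thousands separators.  Values whose magnitude
   is below min_commas are left plain: pass 10000 for the behaviour
   described for MISC_COMMAS, or 1000 for that of MISC_ALLCOMMAS.
   buf must hold at least 32 characters. */
char *demo_ltostr_commas(long x, unsigned long min_commas, char *buf)
{
    char digits[24];
    /* take the magnitude in unsigned arithmetic so LONG_MIN is safe */
    unsigned long mag = (x < 0) ? 0UL - (unsigned long) x : (unsigned long) x;
    int n = sprintf(digits, "%lu", mag);
    char *out = buf;
    int i;

    if (x < 0)
        *out++ = '-';
    for (i = 0; i < n; i++) {
        /* a comma goes before every complete group of three digits */
        if (i > 0 && (n - i) % 3 == 0 && mag >= min_commas)
            *out++ = ',';
        *out++ = digits[i];
    }
    *out = '\0';
    return buf;
}
```

With a threshold of 10000, the value 1234 is written as "1234" while 1234567 becomes "1,234,567"; with a threshold of 1000, 1234 becomes "1,234".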

Time Strings

Boolean DayTimeStr (char *buf, Boolean date, Boolean time)

Gets the current calendar time and generates a string representation in buf containing the date and/or time as specified by the date and time arguments.  The buffer should be of sufficient size to hold 24 characters.

SGML Strings

[[ ...insert text here... ]]

char* Sgml2Ascii (const char *sgml, char *ascii, size_t buflen)

Converts the SGML string in sgml into a printable ASCII string and copies it to ascii.  The buflen parameter gives the length of the ascii buffer.  The return value is the same as ascii [[check this]].

size_t Sgml2AsciiLen (const char *sgml)

Returns the length of the printable ASCII string that would result from conversion from SGML text.

ValNode Functions

A ValNode is a simple data structure that allows a mixture of data types to be grouped into a linked list. It contains a "choice" slot, which is used to discriminate the datatype held in the union called "data". ValNodes are used extensively in ASN.1 objects to represent CHOICE, SEQUENCE OF, and SET OF types. They are also used in other NCBI functions where a very flexible linked list is required.

typedef union dataval

{

    VoidPtr ptrvalue;

    Int4 intvalue;

    FloatHi realvalue;

    Boolean boolvalue;

} DataVal, *DataValPtr;

 

typedef struct valnode

{

    Uint1 choice;              /* to pick a choice */

    DataVal data;              /* attached data */

    struct valnode *next;      /* next in linked list */

} ValNode, *ValNodePtr;

ValNodePtr ValNodeNew (ValNodePtr node)

Creates a new ValNode and returns a pointer to it.  If desired, the newly-created node may be attached to the end of a linked list, of which node is the tail element.  Otherwise node should be NULL.

ValNodePtr ValNodeAdd (ValNodePtr *head)

Creates a new ValNode and returns a pointer to it.  The head argument points to a variable that contains the head element of a linked list of ValNodes to which the new node should be appended.  If head contains NULL, it will be initialized with the pointer to the newly-created ValNode.

ValNodePtr ValNodeAddBoolean (ValNodePtr *head, Int2 choice, Boolean bool)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.boolvalue member to bool.  The return value is the new ValNode.

ValNodePtr ValNodeAddInt (ValNodePtr *head, Int2 choice, Int4 value)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.intvalue to value.  The return value is the new ValNode.

ValNodePtr ValNodeAddFloat (ValNodePtr *head, Int2 choice, FloatHi value)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.realvalue to value.  The return value is the new ValNode.

ValNodePtr ValNodeAddStr (ValNodePtr *head, Int2 choice, CharPtr str)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.ptrvalue member to the string str.  The string is not copied to allocated storage.  The return value is the new ValNode.

ValNodePtr ValNodeCopyStr (ValNodePtr *head, Int2 choice, CharPtr str)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.ptrvalue member to a copy of the string str.  The return value is the new ValNode.

ValNodePtr ValNodeAddPointer (ValNodePtr *head, Int2 choice, Pointer ptr)

Creates a new ValNode by calling ValNodeAdd and sets its choice member to choice and its data.ptrvalue to ptr.  The return value is the new ValNode.

ValNodePtr ValNodeLink (ValNodePtr *head, ValNodePtr node)

Adds node to the end of a linked list whose head element is in the variable pointed to by head.  If head contains NULL, it is initialized to the value of node.  The return value is always the head element of the linked list.

ValNodePtr ValNodeFree (ValNodePtr node)

Frees an entire list of ValNode structures of which node is the head element.  Whatever data may be referenced in the data member is not freed. The return value is always NULL.

ValNodePtr ValNodeFreeData (ValNodePtr vn)

Frees a list of ValNode structures like the ValNodeFree function, except that associated data is also freed.  This function should only be used if it is known that the data.ptrvalue member of every node in the list is either NULL or a valid pointer to a single fixed memory block.

ValNodePtr ValNodeExtract (ValNodePtr *head, Int2 choice)

Scans the linked list whose head element is in the variable pointed to by head for the first node whose choice element is equal to choice.  If found, the node is unlinked from the list and returned as the function result.  If it is not found, NULL is returned.

ValNodePtr ValNodeExtractList (ValNodePtr *headptr, Int2 choice)

Scans the linked list whose head element is in the variable pointed to by headptr for all nodes whose choice element is equal to choice.  The return value is the head element of a linked list of all such nodes.

ValNodePtr ValNodeFindNext (ValNodePtr head, ValNodePtr curr, Int2 choice)

Scans a linked list of ValNodes for a node whose choice member is equal to choice and returns a pointer to it.  The search begins with curr, if non-NULL, or head otherwise.  If choice is negative, the next node is returned.
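
The list operations above all rely on the same pointer-to-head idiom.  The sketch below shows append, extract, and free on a simplified node type.  DemoNode is a hypothetical stand-in in which a single long replaces the DataVal union, and the functions are illustrations of the described behaviour rather than the library's code.

```c
#include <stdlib.h>

typedef struct demo_node {
    unsigned char choice;        /* discriminates the kind of data held */
    long intvalue;               /* stands in for the DataVal union */
    struct demo_node *next;
} DemoNode;

/* Append a new node to the list whose head is stored in *head.
   If *head is NULL it is set to the new node. */
DemoNode *demo_node_add(DemoNode **head, int choice, long value)
{
    DemoNode *node = (DemoNode *) calloc(1, sizeof *node);
    DemoNode **link = head;

    if (node == NULL)
        return NULL;
    node->choice = (unsigned char) choice;
    node->intvalue = value;
    while (*link != NULL)            /* walk to the tail's next pointer */
        link = &(*link)->next;
    *link = node;
    return node;
}

/* Unlink and return the first node whose choice matches, or NULL. */
DemoNode *demo_node_extract(DemoNode **head, int choice)
{
    DemoNode **link;

    for (link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->choice == choice) {
            DemoNode *found = *link;
            *link = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}

/* Free every node in the list; the return value is always NULL. */
DemoNode *demo_node_free(DemoNode *node)
{
    while (node != NULL) {
        DemoNode *next = node->next;
        free(node);
        node = next;
    }
    return NULL;
}
```

Passing the address of the head variable lets the same code handle an empty list and removal of the first element without special cases.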

ValNodePtr NodeListNew (void)

[[ ...insert text here... ]]

ValNodePtr NodeListFree (ValNodePtr head)

[[ ...insert text here... ]]

Int2 NodeListLen (ValNodePtr node)

Returns the number of elements in the linked list of which node is the head element.

ValNodePtr NodeListFind (ValNodePtr head, Int2 item, Boolean extend)

[[ ...insert text here... ]]

Boolean NodeListRead (ValNodePtr head, Int2 item, VoidPtr ptr, size_t size)

[[ ...insert text here... ]]

Boolean NodeListWrite (ValNodePtr head, Int2 item, VoidPtr ptr, size_t size)

[[ ...insert text here... ]]

Boolean NodeListAppend (ValNodePtr head, VoidPtr ptr, size_t size)

[[ ...insert text here... ]]

Boolean NodeListInsert (ValNodePtr head, Int2 item, VoidPtr ptr, size_t size)

[[ ...insert text here... ]]

Boolean NodeListReplace (ValNodePtr head, Int2 item, VoidPtr ptr, size_t size)

[[ ...insert text here... ]]

Boolean NodeListDelete (ValNodePtr head, Int2 item)

[[ ...insert text here... ]]

Math Functions

Macros

Macro               Description

LN2                 Natural logarithm of 2

LN10                Natural logarithm of 10

EXP2(x)             Base-2 exponential of x

LOG2(x)             Base-2 logarithm of x

EXP10(x)            Base-10 exponential of x

LOG10(x)            Base-10 logarithm of x

Arithmetic Functions

long Gcd (long a, long b)

Returns the greatest common divisor of a and b.

long Nint (double x)

Returns the nearest integer to x.
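As an illustration of what these two functions compute, here is a small sketch.  DemoGcd and DemoNint are hypothetical names, and the choices made here for negative arguments and for values exactly halfway between two integers are assumptions; the library's own behavior in those cases is not documented above.

```c
/* Greatest common divisor by Euclid's algorithm; the result is non-negative. */
long DemoGcd(long a, long b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Nearest integer, with halfway cases rounded away from zero (an assumption). */
long DemoNint(double x)
{
    /* The cast truncates toward zero, so the offset goes in the direction of x. */
    return (long)(x >= 0.0 ? x + 0.5 : x - 0.5);
}
```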

Transcendental Functions

double Log1p (double x)

Returns log(x+1) for all x > -1.

double Expm1 (double x)

Returns exp(x)-1 for all x.

double Powi (double x, int n)

Returns x raised to the integral power n.

double Factorial (int x)

Returns x! (x factorial).
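An integral power can be computed with far fewer multiplications than a general pow call needs, and a factorial is returned as a double because it exceeds the range of a 32-bit integer beyond 12!.  The sketch below shows one common way to do both.  DemoPowi and DemoFactorial are hypothetical names, and the handling of negative arguments is an assumption rather than documented library behavior.

```c
/* x raised to the power n by repeated squaring; negative n gives the reciprocal. */
double DemoPowi(double x, int n)
{
    double result = 1.0;
    /* Magnitude of n, computed in unsigned arithmetic so INT_MIN is safe. */
    unsigned int e = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    while (e != 0) {
        if (e & 1u)
            result *= x;   /* this bit of the exponent contributes a factor */
        x *= x;            /* square for the next bit */
        e >>= 1;
    }
    return (n < 0) ? 1.0 / result : result;
}

/* n! as a double; returns 0.0 for negative n (an arbitrary choice here). */
double DemoFactorial(int n)
{
    double f = 1.0;
    int i;

    if (n < 0)
        return 0.0;
    for (i = 2; i <= n; i++)
        f *= (double)i;
    return f;
}
```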

Gamma Functions

double Gamma (double x)

Returns gamma(x).

double LnGamma (double x)

Returns log(gamma(x)).

double LnGammaInt (int n)

Returns log(gamma(n)) for integral n.

double DiGamma (double x)

Returns digamma(x), the 1st order derivative of log(gamma(x)).

double TriGamma (double x)

Returns trigamma(x), the 2nd order derivative of log(gamma(x)).

double PolyGamma (double x, int order)

Returns the derivative of log(gamma(x)) of the given order.

void GammaCoeffSet (double *coef, unsigned dimension)

Changes the gamma coefficients.

Advanced Functions

double LogDerivative (int order, double *u)

Nth order derivative of ln(u)

double NRBis (double y, double(*f) (double), double (*df) (double), double p, double x, double q, double tol)

Combined Newton-Raphson and Bisection root solver
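The idea behind combining the two methods is that Newton-Raphson converges quickly near a root but can jump outside the region of interest, while bisection is slow but cannot leave its bracket.  The sketch below illustrates the general technique only.  DemoNRBis is a hypothetical function; its reading of the arguments (solve f(x) = y on the bracket [p, q] starting from the guess x) is an assumption based on the parameter names, and it is not the library's implementation.

```c
/* Absolute value without needing the math library. */
static double demo_abs(double v) { return v < 0.0 ? -v : v; }

/* Solve f(x) = y for x in [p, q].  Assumes p < x < q and that f(p) - y and
   f(q) - y have opposite signs.  Gives up after 100 iterations. */
double DemoNRBis(double y, double (*f)(double), double (*df)(double),
                 double p, double x, double q, double tol)
{
    double fp = f(p) - y;   /* residual at the lower end of the bracket */
    int i;

    for (i = 0; i < 100; i++) {
        double fx = f(x) - y;
        double d, next;

        if (demo_abs(fx) <= tol)
            return x;
        /* Shrink the bracket so the sign change stays inside [p, q]. */
        if ((fx < 0.0) == (fp < 0.0)) { p = x; fp = fx; }
        else                          { q = x; }

        d = df(x);
        next = 0.5 * (p + q);          /* bisection step, used as the fallback */
        if (d != 0.0) {
            double newton = x - fx / d;
            if (newton > p && newton < q)
                next = newton;         /* Newton step accepted: it stays bracketed */
        }
        x = next;
    }
    return x;
}

/* Example: the square root of 2 as the solution of x*x = 2 on [0, 2]. */
static double demo_square(double v) { return v * v; }
static double demo_two_x(double v)  { return 2.0 * v; }

double DemoSqrt2(void)
{
    return DemoNRBis(2.0, demo_square, demo_two_x, 0.0, 1.0, 2.0, 1e-12);
}
```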

double RombergIntegrate (double(*f)(double, VoidPtr), void *fargs, double p, double q, double eps, int epsit, int itmin)

Romberg numerical integrator
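Romberg integration repeatedly halves the step of the trapezoid rule and applies Richardson extrapolation to the successive estimates.  The following sketch shows that scheme in a simplified form.  DemoRomberg and DemoScaledSquare are hypothetical; the sketch keeps the callback shape shown above, where fargs is passed through to f untouched, but it omits the epsit and itmin arguments, whose exact meaning is not described here, and uses a fixed minimum of two refinements instead.

```c
#define DEMO_MAX_LEVELS 16

/* Integrate f over [p, q]; stop when two successive estimates agree within eps. */
double DemoRomberg(double (*f)(double, void *), void *fargs,
                   double p, double q, double eps)
{
    double prev[DEMO_MAX_LEVELS], cur[DEMO_MAX_LEVELS];
    double h = q - p;   /* current panel width */
    long   n = 1;       /* current number of panels */
    int    k, j;

    prev[0] = 0.5 * h * (f(p, fargs) + f(q, fargs));
    for (k = 1; k < DEMO_MAX_LEVELS; k++) {
        double sum = 0.0, factor = 4.0, diff;
        long i;

        /* Refine the trapezoid estimate by adding the panel midpoints. */
        for (i = 0; i < n; i++)
            sum += f(p + ((double)i + 0.5) * h, fargs);
        cur[0] = 0.5 * (prev[0] + h * sum);
        h *= 0.5;
        n *= 2;

        /* Richardson extrapolation along the new row of the tableau. */
        for (j = 1; j <= k; j++) {
            cur[j] = cur[j - 1] + (cur[j - 1] - prev[j - 1]) / (factor - 1.0);
            factor *= 4.0;
        }
        diff = cur[k] - prev[k - 1];
        if (diff < 0.0)
            diff = -diff;
        for (j = 0; j <= k; j++)
            prev[j] = cur[j];
        if (k >= 2 && diff <= eps)
            return prev[k];
    }
    return prev[DEMO_MAX_LEVELS - 1];
}

/* Example integrand: scale * x * x, with scale supplied through fargs. */
double DemoScaledSquare(double x, void *fargs)
{
    double scale = *(double *)fargs;
    return scale * x * x;
}
```

A caller would pass the address of a double holding the scale factor as fargs, which is how extra parameters reach the integrand without global variables.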

Miscellaneous Utilities

Macros

Macro          Description

ABS(a)         Returns the absolute value of a (any numerical type).
SIGN(a)        Returns -1 if a is negative, +1 if it is positive, or 0 if it is zero.
MIN(a,b)       Returns the minimum of a and b (any numerical type).
MAX(a,b)       Returns the maximum of a and b (any numerical type).
ROUNDUP(a,b)   Rounds a up to the nearest multiple of b.
DIM(a)         Returns the dimension (number of elements) of the array a.
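Macros of this kind are conventionally written as shown in the sketch below.  The DEMO_ names are hypothetical and these are not the library's definitions; in particular, DEMO_ROUNDUP as written here assumes positive integer arguments.

```c
#define DEMO_ABS(a)        ((a) >= 0 ? (a) : -(a))
#define DEMO_SIGN(a)       ((a) > 0 ? 1 : ((a) < 0 ? -1 : 0))
#define DEMO_MIN(a,b)      ((a) < (b) ? (a) : (b))   /* the smaller of the two */
#define DEMO_MAX(a,b)      ((a) > (b) ? (a) : (b))   /* the larger of the two */
#define DEMO_ROUNDUP(a,b)  ((((a) + (b) - 1) / (b)) * (b))
#define DEMO_DIM(a)        (sizeof(a) / sizeof((a)[0]))

/* Example: sum an array without hard-coding its length. */
int DemoSumTable(void)
{
    static const int table[] = { 4, 8, 15, 16, 23, 42 };
    int total = 0;
    unsigned i;

    for (i = 0; i < DEMO_DIM(table); i++)
        total += table[i];
    return total;
}
```

Because these are macros, an argument may be evaluated more than once, so arguments with side effects such as i++ should be avoided.  A macro like DEMO_DIM only works on a true array, not on a pointer to its first element.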

Random Numbers

void RandomSeed (long seed)

Sets the seed value of the random number generator to seed.

long RandomNum ()

Returns the next value in the series of pseudo-random numbers.

Sorting

void HeapSort (void *base, size_t nel, size_t size,
int (LIBCALLBACK *cmp)(VoidPtr,VoidPtr))

Sorts an array of elements, which may be of any basic or structured type.  The starting address of the array is base, with nel and size being the number and size of elements in the array.  A pointer to an element comparison function cmp must also be supplied.
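The calling pattern is the same as for the standard qsort function: the caller supplies the array, the element count, the element size, and a comparison callback.  Since the sketch below has to compile without the toolkit headers, it uses qsort as a stand-in for HeapSort.  DemoRecord and the function names are hypothetical, and the assumption that cmp returns a negative, zero, or positive value as qsort expects is not confirmed by the text above.

```c
#include <stdlib.h>

/* A structured element type, to show that any type can be sorted. */
typedef struct DemoRecord {
    int    id;
    double score;
} DemoRecord;

/* Comparison callback: negative, zero, or positive, ordering by score. */
static int DemoCompareScore(const void *a, const void *b)
{
    const DemoRecord *ra = (const DemoRecord *)a;
    const DemoRecord *rb = (const DemoRecord *)b;

    if (ra->score < rb->score) return -1;
    if (ra->score > rb->score) return 1;
    return 0;
}

/* Sort count records in place by ascending score. */
void DemoSortRecords(DemoRecord *recs, size_t count)
{
    /* qsort stands in for HeapSort here: base, nel, size, cmp. */
    qsort(recs, count, sizeof(DemoRecord), DemoCompareScore);
}
```

Comparing with explicit tests rather than subtracting the two scores avoids truncation when the difference is a fraction.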

Time

time_t  GetSecs ()

Returns the current value of a timer that ticks once per second.

Boolean  GetDayTime (struct tm *dtp)

Fills the structure pointed to by dtp with the current time in broken-down format.

Process ID

long  GetAppProcessID ()

Returns a unique number identifying the process.

Application Properties

We will refer to a named block of arbitrary data associated with a single application instance as an application property.  Application properties have two main uses.  First, they allow for isolation of application instance data in certain shared library contexts where the data space would normally be shared by all applications using the library.  Second, they allow for a simple level of communication between code modules without requiring that they “know” anything about each other.  For example, during initialization of your program, you might create a property called “ProgramName” with a string giving the name of your program.  Other code modules might then use this property when generating various messages and reports.  Application properties are identified by a string with case being significant.  They are created or modified with SetAppProperty, retrieved with GetAppProperty, and destroyed with RemoveAppProperty.  If you want to scan through the property list, use EnumAppProperties with a pointer to a callback function to be called once for each property.

void* SetAppProperty (const char *key, void *data)

Installs data, identified by key, as a property of the current application.

void* GetAppProperty (const char *key)

Returns the property data associated with key.  If the property does not exist, NULL is returned.

void* RemoveAppProperty (const char *key)

Removes the property of the current application that is identified by key, if there is one.  The data pointer for this property is returned as the function result, and it is the responsibility of the programmer to release whatever dynamic memory may be involved.

int EnumAppProperties (AppPropEnumProc proc)

Calls the user-supplied function proc once for each property of the current application.
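To make the set, get, and remove cycle concrete, here is a toy property table with the same shape of interface.  Everything in it is hypothetical: the Demo names, the fixed table size, and in particular the choice to have DemoSetProp return the previous data pointer, since the meaning of SetAppProperty's return value is not stated above.  It keeps one global table rather than one per application instance.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PROPS 16

typedef struct {
    const char *key;    /* NULL marks a free slot */
    void       *data;
} DemoProp;

static DemoProp demo_props[DEMO_MAX_PROPS];

/* Find the slot holding key (case-sensitive), or NULL. */
static DemoProp *demo_find(const char *key)
{
    int i;
    for (i = 0; i < DEMO_MAX_PROPS; i++)
        if (demo_props[i].key != NULL && strcmp(demo_props[i].key, key) == 0)
            return &demo_props[i];
    return NULL;
}

/* Install or replace a property; returns the previous data pointer, if any. */
void *DemoSetProp(const char *key, void *data)
{
    DemoProp *p = demo_find(key);
    int i;

    if (p != NULL) {
        void *old = p->data;
        p->data = data;
        return old;
    }
    for (i = 0; i < DEMO_MAX_PROPS; i++) {
        if (demo_props[i].key == NULL) {
            demo_props[i].key = key;    /* the key string must outlive the entry */
            demo_props[i].data = data;
            break;
        }
    }
    return NULL;
}

/* Look up a property; NULL if it does not exist. */
void *DemoGetProp(const char *key)
{
    DemoProp *p = demo_find(key);
    return (p != NULL) ? p->data : NULL;
}

/* Remove a property and hand its data pointer back to the caller to release. */
void *DemoRemoveProp(const char *key)
{
    DemoProp *p = demo_find(key);
    void *old;

    if (p == NULL)
        return NULL;
    old = p->data;
    p->key = NULL;
    p->data = NULL;
    return old;
}
```

Returning the data pointer from the remove function mirrors the ownership rule described for RemoveAppProperty: the table never frees the data itself.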

Debugging Macros

The following macros are designed to assist in debugging during program development (or more precisely to prevent the need for debugging!) and are only enabled if the macro _DEBUG is defined during compilation.  When the time comes to build a "release version" to distribute to end-users, they can be disabled simply by recompiling without _DEBUG defined.

void TRACE (const char *fmt, ...)

Formats a string using fmt as a printf-style format string along with a variable number of arguments and then writes it to the "trace device".  What the trace device actually represents differs with the platform and compiler switches.  Under Microsoft Windows, traced messages go to the debugger console (AUX) if it is running.  Although similar facilities may exist on other platforms, none are supported at present (but we will entertain any suggestions you may have).  For this reason the default “trace device” is “stderr” on UNIX systems and a file called “trace.out” on all other platforms.  This default may be overridden by doing the following prior to including <ncbi.h>:

#define one of these symbols:

                TRACE_TO_STDOUT
                TRACE_TO_STDERR
                TRACE_TO_AUX (Windows only)
                TRACE_TO_FILE  (goes to “trace.out”)

Followed by:

                TRACE_DEVICE_DEFINED  (inhibits redefinition)

 

Note that all the above only makes TRACEing possible, but does not enable the feature. To do so, compile selected files with the symbol _DEBUG defined. This is _not_ done in the default makefiles. When _DEBUG is not defined, TRACE() has no effect.

ASSERT(expression)

If _DEBUG is defined, asserts that expression is TRUE.  If it evaluates to FALSE, a message is displayed giving the expression and the file name and line number where the assertion failed.  After this, the program halts through a call to AbnormalExit.  If _DEBUG is not defined, expression is never evaluated. 

VERIFY(expression)

Similar to ASSERT, except that expression is always evaluated.  This should be used if the expression contains an assignment or function call that should be executed regardless of whether or not _DEBUG is defined.
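The difference between the two macros matters whenever the expression has a side effect.  The sketch below shows one way such a pair could be defined.  DEMO_ASSERT, DEMO_VERIFY, DEMO_DEBUG and DemoAssertFailed are hypothetical names chosen so as not to clash with the real macros, and abort stands in for AbnormalExit; the library's actual definitions may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Report a failed assertion and halt (abort stands in for AbnormalExit). */
void DemoAssertFailed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "Assertion failed: %s, file %s, line %d\n", expr, file, line);
    abort();
}

#ifdef DEMO_DEBUG
#define DEMO_ASSERT(expr) \
    ((expr) ? (void)0 : DemoAssertFailed(#expr, __FILE__, __LINE__))
#define DEMO_VERIFY(expr) DEMO_ASSERT(expr)
#else
#define DEMO_ASSERT(expr) ((void)0)        /* expr is never evaluated */
#define DEMO_VERIFY(expr) ((void)(expr))   /* expr is evaluated, result ignored */
#endif

/* Returns 2 in a debug build and 1 otherwise, exposing the difference. */
int DemoCountCalls(void)
{
    int calls = 0;

    DEMO_ASSERT(++calls > 0);   /* increment vanishes without DEMO_DEBUG */
    DEMO_VERIFY(++calls > 0);   /* increment always happens */
    return calls;
}
```

A call that must always run, such as opening a file, belongs inside the verify form; a pure check on existing state can use the assert form and cost nothing in a release build.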

 

Portability Issues

There are always a variety of factors conspiring to hinder the portability of C code despite the best intentions of the programmer.  These barriers are due to differences in hardware, operating systems, compilers, and filesystems. 

We have attempted to sequester all system-specific definitions into a single header file called ncbilcl.h (which is included by ncbi.h).  It contains defined symbols describing the platform as well as type definitions and often a variety of macros.  The NCBI Toolkit includes a version of this file for each of the supported platforms.

Filename      Hardware        Operating System  Compiler

ncbilcl.370   IBM 370         AIX               System V cc
ncbilcl.acc   Sun SPARC       SunOS             Sun acc
ncbilcl.alf   DEC Alpha-XP    OSF/1             DEC C compiler
ncbilcl.aov   DEC Alpha-XP    OpenVMS           BSD cc
ncbilcl.aux   Macintosh 68K   A/UX              A/UX
ncbilcl.bor   Intel PC        MS-DOS            Borland C/C++
ncbilcl.bwn   Intel PC        Windows DOS       Borland C/C++
ncbilcl.ccr   Sun SPARC       SunOS             CodeCenter
ncbilcl.cpp   Sun SPARC       SunOS             Sun C++
ncbilcl.cra   Cray YMP        Unicos            Cray C compiler
ncbilcl.cvx   CONVEX          UNIX System V     BSD cc
ncbilcl.gcc   Sun SPARC       SunOS             Gnu gcc or g++
ncbilcl.hp    HP PA-RISC      HP-UX             System V cc
ncbilcl.mpw   Macintosh 68K   MacOS             Apple MPW C
ncbilcl.msc   Intel PC        MS-DOS            Microsoft C
ncbilcl.msw   Intel PC        Windows DOS       Microsoft C
ncbilcl.nxt   Next            NextStep          Next C compiler
ncbilcl.r6k   IBM RS 6000     AIX               System V cc
ncbilcl.sgi   SGI MIPS        UNIX System V     System V cc
ncbilcl.sol   Sun SPARC       Sun Solaris       SunPro
ncbilcl.sun   Sun SPARC       SunOS             BSD cc
ncbilcl.thc   Macintosh 68K   MacOS             Symantec C/C++
ncbilcl.ult   DEC MIPS        ULTRIX            System V cc
ncbilcl.vms   VAX             OpenVMS           BSD cc


Portable Types

In C, the sizes of basic types vary with each compiler implementation.  Certain minimum sizes are guaranteed by the ANSI standard, however.  The choice of which type to use in any particular situation may be based on the required precision and the natural word size of the hardware.  Always use the sizeof operator rather than assuming any particular size.

We have defined the following types.

Integral Types

Type      Size (bytes)  Min. Value  Max. Value  Description

Boolean   1             FALSE       TRUE        A TRUE or FALSE value
Byte      1             0           UCHAR_MAX   Smallest unit of storage that a C program can address (unsigned)
Char      1             CHAR_MIN    CHAR_MAX    ASCII character occupying one byte of storage (may be either signed or unsigned)
Uchar     1             UCHAR_MIN   UCHAR_MAX   Unsigned ASCII character
Int1      1             INT1_MIN    INT1_MAX    Signed integer, 1 byte
Uint1     1             0           UINT1_MAX   Unsigned integer, 1 byte
Int2      2             INT2_MIN    INT2_MAX    Signed integer, 2 bytes
Uint2     2             0           UINT2_MAX   Unsigned integer, 2 bytes
Int4      4             INT4_MIN    INT4_MAX    Signed integer, 4 bytes
Uint4     4             0           UINT4_MAX   Unsigned integer, 4 bytes

Floating-point Types

Type      Description                                            Min. Value  Max. Value

FloatLo   Low-precision floating point value (same as float)     FLT_MIN     FLT_MAX
FloatHi   High-precision floating point value (same as double)   DBL_MIN     DBL_MAX

Pointer Types

Type         Description

VoidPtr      Generic pointer (same as Pointer)
BoolPtr      Pointer to Boolean
BytePtr      Pointer to Byte
CharPtr      Pointer to Char
UcharPtr     Pointer to Uchar
Int1Ptr      Pointer to Int1
Uint1Ptr     Pointer to Uint1
Int2Ptr      Pointer to Int2
Uint2Ptr     Pointer to Uint2
Int4Ptr      Pointer to Int4
Uint4Ptr     Pointer to Uint4
FloatHiPtr   Pointer to FloatHi
FloatLoPtr   Pointer to FloatLo
FnPtr        Generic function pointer
Pointer      Generic pointer (same as VoidPtr)
Handle       Generic handle.  Points to a block of memory that is moveable on Macintosh and Windows.  On other platforms it is the same as a Pointer.

Avoiding Name Collisions

The types are first typedeffed with names like Nlm_Int2.  They are then given easier-to-use names with a #define such as #define Int2 Nlm_Int2.  A similar procedure is used in declaring the utility functions.  The typedef is needed because one wishes to treat these as real data types in a program.  However, if a typedeffed name conflicts with one in some other program or header, one cannot "untypedef" it, and it becomes a problem to use the other header.  A #define, by contrast, can be undefined, which resolves the conflict.  We therefore typedef with "Nlm_..." names in the expectation that nothing else will conflict with them, and then #define each to something easier to remember but more likely to conflict, getting the best of both worlds.  The defined types are listed above.
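The two-layer naming scheme can be seen in miniature in the following sketch.  It uses a hypothetical Demo_ prefix in place of Nlm_ and assumes short is a suitable 2-byte type; the second declaration of Int2 stands in for a conflicting definition arriving from some unrelated header.

```c
/* Layer 1: the real type gets a prefixed name that is unlikely to clash. */
typedef short Demo_Int2;

/* Layer 2: a convenient alias is added with the preprocessor. */
#define Int2 Demo_Int2

/* Code written against the short name actually uses the prefixed type. */
Int2 DemoAddOne(Int2 v)
{
    return (Int2)(v + 1);
}

/* If another header also wants the name Int2, the alias can simply be
   removed.  A typedef could not be withdrawn like this. */
#undef Int2
typedef long Int2;   /* stand-in for the conflicting declaration */

/* The prefixed name remains available after the alias is gone. */
Demo_Int2 DemoStillWorks(void)
{
    return DemoAddOne(41);
}
```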

Byte Order

The order of bytes within any integral value of size greater than 1 is defined by the hardware.  The two orderings found on the platforms we support are "big endian" and "little endian".

Although other orderings are possible, none of the platforms we support has such a configuration.  One of the following symbols should be defined in every ncbilcl.xxx.

Symbol             Description

IS_BIG_ENDIAN      The target platform is "big endian", having the most significant byte in the lowest address.
IS_LITTLE_ENDIAN   The target platform is "little endian", having the most significant byte in the highest address.
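The symbols above are fixed at compile time by the platform header.  As a complement, the sketch below shows how byte order can be observed at run time, and how to write a value in a fixed byte order so that data files are the same on every platform.  Both functions are hypothetical illustrations and are not part of the library.

```c
/* Returns 1 on a little-endian machine and 0 otherwise, tested at run time. */
int DemoIsLittleEndian(void)
{
    unsigned int probe = 1u;

    /* Look at the byte stored at the lowest address of the integer. */
    return *(const unsigned char *)&probe == 1u;
}

/* Store the low 32 bits of value most significant byte first,
   whatever the byte order of the host. */
void DemoPutBigEndian4(unsigned long value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}
```

Writing bytes out with shifts, as in DemoPutBigEndian4, avoids any dependence on the host order and so needs no conditional compilation.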

Function Prototypes

A mechanism has also been worked out for declaring functions and prototypes such that compilers which can check function prototypes will check them, and those which cannot do not see them (prototypes are syntax errors on older compilers!).  The trick is to declare the prototype with the PROTO(()) macro (note the double parentheses).  A similar macro, VPROTO(()), is provided for functions with variable argument lists.

Int2 StringCmp PROTO((CharPtr str1, CharPtr str2));

Int2 Message VPROTO((Int2 key, char *fmt, ...));
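The double parentheses are what make the trick work: the whole parameter list, commas included, reaches the macro as a single argument, which the macro can then either keep or discard.  A minimal sketch of such a macro follows.  DEMO_PROTO and the switch DEMO_NO_PROTOTYPES are hypothetical names; how the real PROTO macro is defined and selected is not shown in this document.

```c
#ifdef DEMO_NO_PROTOTYPES
#define DEMO_PROTO(x) ()   /* older compilers: declare with empty parentheses */
#else
#define DEMO_PROTO(x) x    /* ANSI compilers: keep the full parameter list */
#endif

/* Expands to "int DemoMax (int a, int b);" or to "int DemoMax ();". */
int DemoMax DEMO_PROTO((int a, int b));

/* The definition itself is written in ANSI style in this sketch. */
int DemoMax(int a, int b)
{
    return a > b ? a : b;
}
```

With a single pair of parentheses the preprocessor would treat "int a" and "int b" as two separate macro arguments and reject the call, which is why the inner pair is required.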