SECTION 9 -- RDBMS
    
<>    09.001 What processes do what?
    
    dmfacp   archiver process
    
             wakes up at intervals and copies committed transactions
             on journalled tables from the log file to the journal file.
    
    dmfrcp   recovery process (recovery server in OpenIngres)
    
             The dmfrcp process moves log records from log buffers
             in memory and writes these records to the logging
             system file.  In case of server or system failure, the
             dmfrcp recovers pending transactions by reading the
             logging system file and performing appropriate recovery
             actions.  This is done during start-up.  While the recovery
             process is running the database is locked and users attempting
             to connect see the message "The database is currently
             unavailable."
    
    iidbms   INGRES RDBMS server process
    
             the actual data management component.  There may be multiple 
             servers per installation.  OpenIngres allows customer defined
             server classes.
    
    iigcc    communication server process
    
             provides the network communication function of INGRES/Net.
             There may be multiple instances of iigcc in an installation.
            
    iigcn    name server process
    
             monitors and identifies all the INGRES servers in an
             installation (eg iidbms and iigcc).  There will be one per
             installation, on each client and server.
    
    iislave  disc I/O process (Unix machines only)
    
             actually performs disc I/O operations for iidbms.  To
             maximize concurrency, it is advisable to have at least one 
             iislave per disc drive.  (See note 09.002 below.)
    
    
    
<>    09.002 How are extra iislave (disc) processes started?
    
    It is generally advisable to have at least one iislave process per disc.
    The required number of iislaves is requested by the user `ingres' with:
    
      ingsetenv II_NUM_SLAVES=
    
    where  is the number of iislaves to fork when iistartup is
    executed.  This assignment is non-volatile and needs to be made only
    once, or as new discs are added.  If multiple servers are to be started
    the value of II_NUM_SLAVES can be adjusted for each server prior to 
    starting it.
    
    
    
<>    09.003 How do I recover an inconsistent database?
    
    Contact Computer Associates Ingres Technical Support.  This answer is
    not deliberately intended to be facetious, it is just that an
    inconsistent database can be caused by a number of things.  Technical
    Support are best able to determine the cause of the inconsistency and
    advise on the appropriate recovery procedure.
    
    Some situations which cause a database to be marked inconsistent can 
    be pre-empted.  See section 09.019 for more information.
    
    It would be very foolish to do as some contributors suggest and just
    use verifydb to force the database consistent.  That option of verifydb
    is a last-resort salvage tool, not a recovery mechanism.  It fixes 
    nothing and leaves a mess in the database.
    
    
<>    09.004 What are "compressed" table structures?  
    
    In ordinary table structures space is allocated for data rows according
    to the maximum width of the row.  With compressed table structures
    Ingres applies a compression algorithm in order to reduce the amount of
    space required.  The advantage is a reduction in storage requirements;
    the disadvantages are twofold:
    
    Firstly there is a small performance penalty whenever a row is stored
    or retrieved, as it has to be compressed or uncompressed.  Secondly
    when a row in a compressed table is updated, it is unlikely to remain
    the same size as the old row.  If it is bigger then it won't fit in the
    hole left by the old row, and so it is added somewhere else in the
    table.  Thus an UPDATE is effectively turned into a DELETE followed by
    an INSERT, and a hole, ie unused space, is left in the table.
    Eventually these holes occupy more and more of the table until a MODIFY
    is required to reclaim the space.
    
    For these reasons compressed structures are not recommended for tables
    that are updated heavily.  For other operations, ie: SELECT, INSERT and
    DELETE there is a performance penalty because OpenIngres 1.x uses LZW
    compression.  (Ingres 6.4 just eliminates "blank" space at the end of
    strings, which is not very intensive.)
    
    Note that for CISAM and CBTREE structures only the data is compressed.
    Data stored as part of the key is never compressed in the index.
    
    
    
<>    09.005 Why might I configure more than 1 DBMS server process?
    
    The DBMS server is multithreading: it handles SQL requests on behalf of
    many INGRES clients.  Yet INGRES allows me to set up more than 1 DBMS
    server.  What reasons might I have to do this?
    
    There are quite a number of reasons you might decide to do this:
    
      o Multiprocessors.  Although a DBMS server is multithreading, it can
        only run a single thread at a time.  If you have more than 1 CPU it
        may be beneficial to have more than 1 DBMS server.  As the number
        of CPUs increases then you are likely to need more DBMS server
        processes to exploit the extra power.
    
        The optimal number depends on average SQL arrival rate and service
        time.  You could build a model using queueing theory.  Alternatively
        do what everyone else does: try different numbers and measure
        which is best for your workload and configuration.
    
      o Keep DBMS size manageable.  You can get a lot of users on 1 DBMS
        server.  But you need to configure more and more DBMS resources as
        the number of users grows.  There comes a time when the DBMS just
        gets too *enormous* to be manageable.
    
        Splitting the workload across multiple servers allows you to run
        without setting all the Kernel process limits up to ridiculous
        values, negating their original purpose.
    
      o Separate conflicting work types.  Some types of work don't run well
        together in the same DBMS server, e.g.  OLTP and heavy batch, OLTP
        and heavy scans (reports or MIS).  There is no internal
        mechanism for assigning and managing thread priorities within
        the DBMS server, which can lead to undesirable interactions between
        the different parts of the workload.
    
        A very common scenario is to have a separate server for daytime
        batch and reports.  This prevents the batch/reports interfering
        with OLTP throughput and response times.
    
        Another possible scenario is to separate 2 OLTP systems.  This
        separation may help with performance analysis and tuning, and
        prevent interactions between them.  They each have their own set of
        resources.
    
      o Error containment.  When a DBMS server crashes or decides to shut
        down, all users of that server are affected.  It might make sense
        to partition your workload into multiple DBMS servers, so that
        only part of the user population is affected by a DBMS failure.
    
        This can be particularly useful if your workload is unstable, even
        as a temporary measure while you find the real problem.
    
        Note that currently if a DBMS connected to a shared cache fails,
        then all other DBMS servers using that shared cache will fail too.
        That is, the shared cache is the unit of failure.
    
    
    
<>    09.006 How can I direct batch or reports to a particular server?
    
    OpenIngres supports customer-defined server classes.  See section 
    09.018 for more information.
    
    Ingres 6.4 does not offer such an elegant solution as OpenIngres.
    At the moment, you have to bypass the Name Server to do this.
    There are a number of steps:
    
      o Set up a *private* DBMS server, that is: one the Name Server doesn't
        know about.  Do this by setting the DBMS startup flag 'nonames'.
    
        No connections will be routed to this server by the Name Server.
    
      o Store the private DBMS server ID in a Well-Known place, e.g. an
        O/S file, an installation-level environment variable.
    
        Remember this changes when INGRES is restarted.  You need to 
        find out what it is and store it on every restart.
    
      o Arrange that the II_DBMS_SERVER variable is set locally for batch
        and reports which you want to run in the private DBMS server:
    
            $ II_DBMS_SERVER=1234
            $ export II_DBMS_SERVER
        
        Where '1234' is the private server ID remembered earlier.  This
        routes the batch/report to the private server.
    
    You will need to write some simple shell scripts to store and retrieve
    the private DBMS server id.  The iishutdown script looks for private
    servers: you don't need to shut them down manually.
    
    
    
<>    09.007 How does INGRES use sort space?
    
    If you don't do anything, INGRES will take sort space to modify tables
    from the table location, and other sort space from the database home
    location.  For production databases this is probably not what you want.
    
    You can tell INGRES to use sort work locations which can be set up
    anywhere.  See the DBA Guide for details.
    
    There are some things the DBA Guide doesn't tell you:
    
      o The sort workfiles are striped across the available sort locations
        by the sorter.  The size of the stripe is determined by the sorter
        at run time.
    
      o It follows that the input and output files of the sort phases are
        striped across the same disks.  So you're guaranteed disk
        contention--and there's nothing you can do about it.
    
      o Keep the sort locations about the same size.  When one location
        fills that work file is *full*.  There may be Gbs free in the other
        sort locations, that doesn't matter.  The sorter just gives up on
        the sort in progress and returns an error.
    
      o Unlike some other system locations, sort locations can be changed
        by resetting the relevant installation environment variables.  The
        sort locations are dereferenced by the DBMS server when it
        starts up, so restart INGRES or at least the DBMS servers to pick
        up the new sort configuration.
    
      o Check your sort locations have been picked up like this:
    
            $ isql iidbdb
            SET TRACE POINT DM1440;
        
        This gives a list of sort location pathnames known to the
        DBMS server at the moment.
    
      o There must be no gaps in your sequence of sort location numbers.
        When the DBMS server gets to a gap it stops looking for any more.
    
      o It follows that if sort location #1 is not defined then sort space
        is allocated using the default scheme described earlier.
    
    Note: space for temp tables is always taken from the database home 
    location, not from the sort locations.
    
    
    
<>    09.008 When does INGRES switch to a new journal?  
    
    Journals are switched automatically when you take a checkpoint.  But
    sometimes INGRES will switch to a new journal between checkpoints.
    
    The journals are written by the Archiver Process (ACP), dmfacp.  By
    default this writes the journal in 8 kb `pages'.  When it has written
    1024 of these pages it will look to switch journals, i.e. after 8 Mb of
    journal.
    
    The ACP is woken up periodically to copy information from the log file
    to the individual database journals.  When it has completed its log
    file scan and copied all it can to the journals, it then checks if the
    journals need switching.  So in practise the switch happens at 8 Mb + a
    little bit.
    
    In OpenIngres the journal switch point and page size are configurable.
    
    
    
<>    09.009 Should I use mirroring on my data locations?
    
    Maybe.  
    
    There's no standard virtual disk package: each is implemented
    differently.  You need to ask your vendor this question: does your
    virtual disk implementation guarantee that mirrors are in sync after a
    crash or power-off?
    
      o If the answer is yes, then it is probably safe to mirror your
        data locations.  The choice is then a trade-off of resilience
        against performance.
    
      o If the answer is no, then Ingres fast commit recovery will probably
        leave the mirrors out of step. The same SQL query may give different
        answers at different times.  Avoid, or if you want to take the risk
        then invest in an uninterruptable power supply.  This will give you
        some protection against the most common failure case.
    
      o If the answer is fudged, it's probably best to assume a `no'.
    
    If you do mirror your data locations, the disk I/O load on the total
    system will increase.  You need to configure more disks and controllers
    to handle the load.
    
    Note: the problem is that fast commit recovery algorithms have to probe
    the database after a crash to see which updates made it to disk.  If the
    mirrors are out of step, different things will happen depending on which
    mirror returns the data.  There are a number of scenarios, some of which
    are not very nice: duplicate rows on one mirror, missing rows on one
    mirror, corrupt index on one mirror, etc.
    
    
    
<>    09.010 Can I use mirroring as a super-fast checkpoint?
    
    Yes, but ONLY if you do so by modifying the checkpoint template and
    using "ckpdb".  This method will work for either on-line or off-line
    checkpoints, and inherits all the respective limitations.
    
    To implement this, your "mirroring" solution must be controllable from
    the command line.  You need to be able to reliably run a command that:
    
        1) determines if the mirrors are all on-line and synchonized
        2) disengages one of the mirror disks so it is no longer updated
        3) re-engages the mirror disks
    
    The general checkpoint template algorithms are:
    
        1) In the BEGIN phase, assure all mirrors are on-line and 
           synchronized.  If not, fail.
        2) In the WORK phase, disengage one of the mirror disks for each
           data location.
        3) In the END phase, do nothing.
    
    After the "ckpdb" command completes, you have a set of disks containing
    a "copy" of your database.  If you performed an off-line checkpoint,
    this "copy" is perfect.  If you performed an on-line checkpoint, this
    copy is not perfect, but will be made perfect when you recover by
    "rollforwarddb" properly applying the dump file records.
    
    Since you probably want to re-engage your mirror disks to prepare for
    the next checkpoint, you must now copy the data from the off-line
    mirror set to tape.  Then issue the proper command to "resync" the
    mirrors.  Remember this may or may not have adverse performance
    implications.  Some solutions will resync faster if the time spent
    "off-line" is minimized.
    
    If you use this solution, remember that from the time the ckpdb begins
    until you resynchronize the mirrors, you are protected by one less
    mirror set.  You may want to consider a minimum of three mirror sets if
    you choose to implement this strategy.
    
    Make sure you properly modify the checkpoint template so you can
    recover properly.
    
    PLEASE NOTE! Simply using the OS or hardware to "break" a mirror
                 set for later backup to tape WILL NOT WORK unless Ingres
                 has been shut down normally.  If there are any
                 transactions in the log file, or any unflushed cache
                 pages in an active DBMS server, this method will NOT
                 provide a "recoverable" backup.
    
    
<>    09.011 Can I mirror my log file?
    
    In installations using FAST COMMIT (i.e. almost all Ingres
    installations) committed transactions are not written through the DMF
    cache to disc right away.  Committed updates can exist in the cache and
    the transaction log for quite some time without being written to disc.
    The transaction log file is therefore a `critical single point of
    failure' for an Ingres system.  If fast commit is enabled and you lose
    the log file, you WILL lose committed transactions.  A mirrored log
    file could protect against this situation.
    
    OpenIngres supports a dual logging option to mirror the log.  This
    option should be exploited.  See the System Reference Guide for
    instructions.  (Note that dual logging will only be effective if the
    mirror is on an entirely separate drive.)  The rest of this note 
    applies to Ingres 6.4 which does not have this valuable feature.
    
    Ingres 6.4 does not support dual logging as an Ingres feature, but
    mirroring can still be done.  As always, there is a price: performance
    in this case.  If your workload is update- and commit-intensive you
    will notice it most.  On an extreme workload of this type (TP1) a
    reduction in throughput of some 40% was measured.
    
    But you can choose between performance and resilience on this single
    point of failure.  An alternative approach is to disable fast-commit,
    but that will also have a severe performance penalty.
    
    Some virtual disk packages don't support mirrors on raw disk.  Check
    the manuals.
    
    If you don't mirror your transaction log and you do want to run
    fast commit, you can reduce the amount of data that would be lost
    in the event of a log failure by making the archiver run after
    every consistency point.  The default is to run after every fourth
    consistency point.
    
    
<>    09.012 Can I have more than 1 DBMS page cache?
    
    This is quite easy to arrange, but be sure it's really what you want to
    do.
    
    Simply starting multiple servers with the runrundbms utility will, by
    default, start each server with its own private DMF page cache in its
    internal data space.  The Ingres locking system is used to maintain the
    coherency of multiple caches, using a value block locking technique.
    You may therefore need to configure additional lock resources.
    
    If you want to have a number of DBMS servers using fast commit on the
    same database, then you can use the shared_cache option.  This
    creates the DMF page cache in a shared memory space, which is then
    connected to by all of the servers.  There is a system of latches and
    semaphores to safely coordinate their access to the common DMF page
    cache.  (Note that if several servers share a cache and one server
    dies, all the servers attached to the same cache will immediately 
    die too.)
    
    It is also possible to have several shared caches, but you must give
    them different names.  See the I & O Guide for details.
    
    
    
<>    09.013 Can I have more than 1 fast commit server?
    
    Yes.  Set -fast_commit and -shared_cache and DON'T set -sole_server.
    Refer to Chapter 2 of the I&O Guide, "Using a Shared Buffer Cache" for
    more information.
    
    
<>    09.014 What is the Ingres Search Accelerator?
    
    The Ingres Search Accelerator is a hardware device that speeds
    up certain types of database operation and is based on ICL's
    SCAFS technology.  (CAFS stands for Content Addressable File
    Store and was developed by ICL for their VME mainframes.  SCAFS
    is the SCSI bus based version.)  It is available for the ICL DRS6000 
    and DRS3000, the IBM RS/6000, and probably any Fujitsu box that both
    it and Ingres run on.
    
    The system works by passing a query, or part of a query, to an
    intelligent disk controller.  The disk controller applies this
    query to the data that it has been asked to read from the disk
    and only passes back the rows that match.  This results in a
    significant reduction in the amount of data being passed back to
    the main processor giving the following benefits:
        - there is reduced bus activity, thereby allowing other
          system activities to run more quickly
        - the CPU has fewer rows to process and therefore less work
          to do, so queries execute more quickly and more CPU
          cycles are available for other activities
        - there is less data coming back to the CPU so Ingres data
          caches are flushed less frequently
    
    The following operators are supported by the Accelerator and
    queries containing them are eligible to be "accelerated":
    
           Operator              Meaning
    
           =                     equal
           !=, <>                not equal
           <                     less than
           >                     greater than
           <=                    less than or equal, not greater
                                 than
           >=                    greater than or equal, not less
                                 than
           IS NULL               TRUE if the column value is null
           IS NOT NULL           TRUE if the column value is not
                                 null
           LIKE                  TRUE if the column value conforms
                                 to a match string
           NOT LIKE              TRUE if the column value does
                                 not conform to a match string
    
    Note:  A character string containing a character range (eg
           [A-H]) cannot be handled by the Accelerator.
           The Accelerator can handle absolute dates but not date
           intervals.
    
    
    
<>    09.015 When and how do I use the INGRES Search Accelerator?
    
    Use of the Accelerator is completely transparent to the
    application.  The Query Optimizer decides whether or not to
    invoke the Accelerator and there is no way, as with normal
    queries, of hinting at or suggesting a plan to the Optimizer.
    The Accelerator hardware can be fitted "retrospectively" and any
    existing applications will benefit from performance increases
    automatically.
    
    The Accelerator is used to best effect when there are tables
    that would normally be scanned rather than accessed through an
    index.  Typically this occurs in MIS systems where ad hoc
    queries are issued by the users and the application designer
    cannot determine the most effective indexing in advance.
    
    
    
<>    09.016 How can I find out more about the Ingres Search Accelerator?
    
    See the "Ingres Search Accelerator Workload Trials Report"
    which gives detailed performance figures for five different
    real-life applications.  The results presented in the report
    were independently verified by management consultants KPMG, and
    ranged from a worst case of 2.5 times more to a best case of 45
    times more in terms of transaction throughput.
    
    Details on the product can be had by pointing your Web browser to:
    
      http://www.icl.co.uk/cguide/cg500009.html
    
    
<>    09.017 Why am I running out of QSF memory (E_OP0886 and E_QS0001)?
    
    There are two possibilities: the pool is too small for your local
    requirements, or the pool is too fragmented (or equivalently: your 
    queries are too complicated).
    
    Unless otherwise specified, the size of the QSF (Query Storage
    Facility) memory pool is estimated from the number of connected sessions
    declared when the server is started.  By default, the pool size is set
    to 60kb+N*40kb, where N is the number of connected sessions allowed.
    This formula is just a rule-of-thumb for sizing the pool.  In any given
    installation the type of work being done may mean this estimate is too
    small.
    
    The allowed number of connected sessions is defined using CBF in 
    OpenIngres, and in Ingres 6.4 by -connected_sessions in 
    $II_SYSTEM/ingres/files/rundbms.opt.  It is possible to increase the
    size of the QSF pool by adjusting the allowed number of connected 
    sessions upward.
    
    A better solution might be to increase the size of the pool directly by
    specifying a pool size using the -qsf.pool_size variable in the same
    file (this line will usually have to be added to the file).  To find
    out the current pool size:
    
      SET TRACE POINT QS501
    
    After making the necessary changes, shut down and restart the server.
    One suggestion seen on the net is to simply double the pool size;
    in the absence of any better advice that suggestion is repeated here.
    
    Note that increasing the size of the pool may not solve the problem, or
    not for long anyway.  Objects in the pool require contiguous chunks of
    space.  The pool may have enough total free space, but it may be so
    fragmented that no one region is big enough.  If enlarging the pool
    size provides no lasting relief, there are three possible courses of 
    action: (i) review and simplify your searches if they are hand coded;
    (ii) SET TRACE POINT QS506 to clear the pool completely, or (iii) start 
    another server, de-register the first one, and register the new one.
    
    Breaking up a complex search with lots of 'ORs' into unions may help.
    
    The rundbms.opt file and its contents are described in the Installation
    and Operations manual.  The behaviour of the QSF pool is described in
    more detail in Advisor note US-16503.
    
    
    
<>    09.018 How can I assign particular jobs to particular servers?
    
    The goal here is to have two or more database servers eg: 'LIVE_SERVER'
    and 'REPORT_SERVER' running and to be able to select one or the other
    when connecting to the database, eg:
    
       sql report_database/REPORT_SERVER < report.sql & 
       sql live_database/LIVE_SERVER < live.sql &
    
    The REPORT_SERVER could be running at a lower process priority than the
    LIVE_SERVER (NB not a good idea to do this on the SAME database).
    
    Existing server types exist in the following file:
    
       II_SYSTEM/ingres/files/name/iiname.all (Unix)
       II_SYSTEM:[ingres.files.name]iiname.all (VMS)
    
    It contains something like:
    
       INGRES          local   transient 
       COMSVR          local   transient
       STAR            local   transient 
       NODE            global
       LOGIN           global 
       LTICKET         local 
       RTICKET         local
    
    To add your own server type add the lines:
    
       LIVE_SERVER          local   transient 
       REPORT_SERVER        local   transient
    
    Servers must be registrated with the name server manually, eg:
    
       $ runrundbms rundbms_noname.opt
    
       Checking INGRES installation environment...
    
       INGRES installation is setup correctly
    
       Starting a DBMS server (iidbms) ...  
       II_DBMS_SERVER = 2264
    
       $ iinamu add REPORT_SERVER * 2264
    
    It is possible to automate this with some judicious shell/DCL
    programming. After this it will be possible to connect to the new
    server as described above.
    
    This has the distinct advantage over using II_DBMS_SERVER that the name
    server is not bypassed, and so connections to remote hosts are
    possible.
    
    OpenINGRES supports the manipulation of server classes through the new
    CBF (Configuration By Forms) and no manual jiggerypokery is required.
    
    
<>    09.019 Should I set II_DMFRCP_STOP_ON_INCONS_DB?
    
    [This environment variable is apparently documented only in some
    release notes for some platforms, although it is thought that this
    feature is supported by all versions of OpenIngres.  The following is a
    posting taken nearly verbatim from comp.databases.ingres, explaining
    the purpose of this environment variable.  If your release notes do not
    describe it then obviously you use it AT YOUR OWN PERIL.  --Roy Hann]
    
    [The] recovery algorithms are predicated on the notion that at any point
    the RCP can be asked to step in and recover either a single transaction
    or all transactions for a failed server.  For example, if a server runs
    out of resources or otherwise fails to recover a transaction, it can
    ask the RCP to take over the transaction (this is known as a "pass
    abort").  If the RCP fails to recover a transaction, it has nowhere
    else to go for help: it is the court of last resort.  If the RCP
    encounters errors performing recovery, it marks the affected
    database(s) inconsistent.  Once a database has been marked
    inconsistent, recovery from a checkpoint is necessary to ensure data
    integrity.
    
    Therefore it is critically important that we ensure that the RCP has
    sufficient resources to be able to abort any transaction at any time
    and to be able to recover all transactions for a server at any time.
    
    In general, very few additional locking system resources are needed
    during recovery, due to the effects of some of the Ingres recovery
    protocols. Certain rare situations exist where many locks are required
    to perform recovery.  One such is UNDO recovery of a transaction which
    has deleted rows in a btree table with overflow chains on the leaf
    pages.
    
    Currently, the locking system cannot guarantee that resources will be
    available for the RCP during recovery.  In the case of a pass abort, it
    is possible for the RCP to deadlock with other transactions running in
    the server.  Either of these conditions can elicit a variety of error
    messages, depending on the state of the RCP when the condition
    manifests itself.  The problem at Qantek seems to have been a failed
    pass abort.
    
    When a transaction is selected for UNDO recovery, it by definition has
    not committed, and so it still holds all the normal transaction page
    locks for data pages which the transaction updated. Since Before Image
    recovery operates by replacing data pages with their before images from
    the log, the data pages which are updated by Before Image recovery are
    still locked by the transaction. Thus no new lock resources are needed
    to restore data pages from their before images.  The problem arises
    when processing a pass abort which involves changes to btree leaf and
    index pages.  If the pass abort happened when the lock list was very
    nearly full, the additional locks required to recover the leaf and
    index pages may not be available. In fact, if the btree table has a
    highly duplicative key, an arbitrary number of locks may be required to
    lock the leaf page overflow chain.  The attempt to escalate to a table
    level lock may result in deadlock with transactions still running in
    the server.  Deadlock is also possible when attempting to lock the leaf
    or index pages, even if sufficient locks are available. Any of these
    situations will result in the database being marked inconsistent,
    necessitating recovery from a checkpoint.
    
    Recovery from checkpoint may not be acceptable in all situations, so an
    alternate processing scheme which forces the system to shut down rather
    than mark a database inconsistent is provided.  To use this alternate
    processing scheme, the INGRES environment variable
    II_DMFRCP_STOP_ON_INCONS_DB must be set to "YES" with the ingsetenv
    utiltiy.  When INGRES shuts down, the open databases are left in a
    state which will require the RCP to operate in REDO mode on restart.
    
    When a transaction is selected for REDO recovery, the database
    requiring REDO recovery is taken offline and no access is allowed to
    the database except by the RCP (or CSP in the cluster node failure
    case). REDO recovery takes advantage of this by forgoing normal
    database, table, and page locking during REDO recovery.  This
    dramatically reduces the locking resources required to process any
    transactions which must be REDOne or UNDOne.  If the locking system has
    been underconfigured, it is possible to reconfigure it before
    restarting INGRES.
    
    With II_DMFRCP_STOP_ON_INCONS_DB set to "YES", almost all conditions
    which would normally result in a database being marked inconsistent can
    be handled without rolling forward from a checkpoint.  Since the RCP is
    started before the DBMS servers, it has an opportunity to clean up the
    open databases without risk of lock conflict.
    
    Shutting down INGRES is a drastic measure.  It should be avoided, but
    in this case the alternative is worse.  The best that can be done is to
    minimize the frequency.  To do this, avoid lock escalation.  Design
    applications so that table level locks, if required, are taken
    initially, rather than after some number of pages of the table have
    been locked. In particular, avoid escalations due to running out of
    configured locks per transaction.  Either configure the INGRES locking
    system with enough locks per transaction to meet the needs of the
    application, or modify the application to require fewer locks.  Highly
    duplicative keys should also be avoided.
    
    
    
<>    09.020 How do I recover after a fire/flood in the machine room?  
    
    1) Regularly take a valid checkpoint (on or off line) on your
       production machine.  Do this checkpoint to either tape or disk, but
       copy it to tape so that it can be stored off-site.  When disaster
       strikes call tech support.  They will talk you through step 2 and up.
    
    2) Make sure your replacement system has a database by the same name
       with the same locations and disk paths.
    
    3) Shutdown Ingres on the replacement box.
    
    4) Wipe out ALL the data for the replacement machine's default database
       directory.
    
    5) Copy the destroyed database's aaaaaaaa.cnf file from the checkpoint
       tape to the default database location on the replacement system.
    
    6) Make the jnl and dmp directories on the replacement box identical to
       the destroyed machine.
    
    7) Bring Ingres up on the replacement box.
    
    8) Use "rollforward" with or without journals to "recover" the database
       from the checkpoint.
    
    The only thing to worry about is users, groups, and roles, that are
    stored in the iidbdb.