Getting Shorter Garbage Collection Pauses

We noticed that our test PeopleSoft system was occasionally so slow that the load balancer decided it was broken and redirected sessions to our maintenance page.

Since the web server is WebLogic, it runs in a Java Virtual Machine (JVM), so the first thing to check is how long the garbage collection pauses are. Fortunately I had garbage collection logging switched on, and could see that the pauses occasionally exceeded 100 seconds, which is far too long. This is what my garbage collection log parameters were set to (Java 11).
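Separately from those parameters, here is a minimal sketch of how pauses this long could be pulled out of a log. It assumes Java 11 unified logging, where pause lines end in a duration such as `12.345ms`. The function name, the one-second threshold and the two sample lines are invented for illustration.

```python
import re

# Matches unified-logging pause lines such as:
#   [2.345s][info][gc] GC(7) Pause Young (Normal) (...) 24M->4M(256M) 3.456ms
PAUSE_RE = re.compile(r"GC\((\d+)\)\s+(Pause\b.*?)\s+(\d+(?:\.\d+)?)ms\s*$")


def long_pauses(lines, threshold_ms=1000.0):
    """Yield (gc_id, description, pause_ms) for pauses at or above threshold_ms."""
    for line in lines:
        match = PAUSE_RE.search(line)
        if match is None:
            continue  # not a pause line
        pause_ms = float(match.group(3))
        if pause_ms >= threshold_ms:
            yield int(match.group(1)), match.group(2), pause_ms


# Invented sample lines, shaped like Java 11 output.
sample = [
    "[2.345s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 24M->4M(256M) 3.456ms",
    "[901.002s][info][gc] GC(42) Pause Full (Allocation Failure) 2040M->1990M(2048M) 104321.777ms",
]

for gc_id, description, pause_ms in long_pauses(sample):
    print(f"GC({gc_id}) {description}: {pause_ms / 1000:.1f}s")
```

To scan a real file, pass an open file object instead of the sample list, for example `long_pauses(open("gc.log"))`, where `gc.log` stands for whatever path the logging parameters write to.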

The Problem with Ansible on Red Hat

Normally, newer versions of an operating system ship newer packages. Not so with Red Hat when it comes to Ansible. On my workstation:

$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.10 (Ootpa)
$ ansible --version
ansible [core 2.16.3]
  config file = /etc/ansible/ansible.cfg
  ...
  python version = 3.12.6
  jinja version = 3.1.2
  libyaml = True

But on the management server:

$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 9.4 (Plow)
$ ansible --version
ansible [core 2.14.17]
  config file = /etc/ansible/ansible.cfg
  ...
  python version = 3.9.18
  jinja version = 3.1.2
  libyaml = True

Wait, what? The older operating system has a newer version of both Ansible and Python?
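To make this comparison repeatable across hosts, the two outputs above can be reduced to version tuples. This is only a sketch: `parse_versions` is a hypothetical helper, and it assumes the `ansible [core X.Y.Z]` and `python version = X.Y.Z` lines look exactly as they do in the outputs shown here.

```python
import re


def parse_versions(output):
    """Return (ansible_core, python) as tuples of ints from `ansible --version` text."""
    def to_tuple(match):
        if match is None:
            return None
        return tuple(int(part) for part in match.group(1).split("."))

    core = re.search(r"ansible \[core (\d+(?:\.\d+)*)\]", output)
    python = re.search(r"python version = (\d+(?:\.\d+)*)", output)
    return to_tuple(core), to_tuple(python)


# Trimmed copies of the two outputs shown above.
workstation = """ansible [core 2.16.3]
  python version = 3.12.6
"""
server = """ansible [core 2.14.17]
  python version = 3.9.18
"""

rhel8 = parse_versions(workstation)
rhel9 = parse_versions(server)
print("RHEL 8.10:", rhel8)
print("RHEL 9.4: ", rhel9)
print("older OS has newer Ansible:", rhel8[0] > rhel9[0])
```

Comparing tuples of integers rather than strings matters here: as strings, "3.9.18" sorts after "3.12.6", which would hide the problem.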

Refreshing a Test PeopleSoft Campus Environment

We have a number of PeopleSoft test environments. I have written about my automated build process before, but I have not yet mentioned what we do to the database when we refresh.

My approach is to build as much of the environment as possible from scratch, which gives us a consistent build. There are also database fields that need to be changed. Recently a colleague and I reviewed the tables that needed changing and came up with the following.

PeopleSoft Log Parsing with Regular Expressions

As mentioned in my previous post on this topic, we need to configure BindPlane to read our files. I chose to use the file source as there was no prepared parser for PeopleSoft logs.

Configuration again

Application Server and Process Scheduler

On the application server there are three types of files. These are:

Application Logs

I set this up as a file, and added the following regex to split it into fields. This is wrapped for readability - in reality it is all on one line. The spaces are part of the regex, the newlines are not.

Sending Logs to Google Observability Logging

At present our logs are scattered across files in many different places in the operating system, and I would like to see whether we can improve this. As an example of the problems this causes: when a user reports an issue, they normally don’t give a timestamp, so we have to assume the issue occurred up to, say, 20 minutes before the call was raised. We then have to search the logs on the operating system for their user ID. As mentioned, these logs are in various places, and it isn’t easy to limit a search to a time span using standard operating system tools. We also have a redundant architecture, meaning the user’s session could have been on any one of four web servers and four application servers, so the error could have happened on any of these eight VMs.
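To give a feel for the manual search, here is a minimal sketch of filtering log lines by user ID and time window. It assumes every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which is a simplification and not the real PeopleSoft log layout. The function name, host names and user IDs are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout at the start of each line (hypothetical).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def find_user_lines(lines, user_id, call_raised, window=timedelta(minutes=20)):
    """Return lines mentioning user_id within `window` before call_raised."""
    earliest = call_raised - window
    hits = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # line does not start with a timestamp in the assumed layout
        if earliest <= stamp <= call_raised and user_id in line:
            hits.append(line)
    return hits


# Invented sample lines standing in for logs gathered from several VMs.
sample = [
    "2024-05-01 09:30:00 web1 JSMITH login ok",
    "2024-05-01 09:50:12 app3 JSMITH error saving page",
    "2024-05-01 09:51:00 app3 AJONES error saving page",
    "continuation line without a timestamp",
]

call_raised = datetime(2024, 5, 1, 10, 0, 0)
for hit in find_user_lines(sample, "JSMITH", call_raised):
    print(hit)
```

Even this simple version has to be run against every log file on all eight VMs, which is the effort a central logging service is meant to remove.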

Notes on Caching

Just a quick one this time. PeopleSoft has three different types of cache at the application server level, and here are some notes on them.

Unshared Cache

What it is

So called because each application server process creates its own cache under %PS_SERVDIR% which is by default $PS_CFG_HOME/domainname/CACHE/. Under here are directories for each process, e.g. PSAPPSRV_1.

How to use it

This is the default type of cache. It is used if you configure

Oracle Backup Restore Failures

How I Test My Backups

I like to test my backups. It helps me sleep to know I could get my data back if the worst happened and it was scrambled by ransomware, or a bug in our code.

My sleep was rendered less peaceful when the restores suddenly started failing for no reason that I could understand. We use RMAN to back up and restore the data, and the script is fairly simple - it effectively says to restore the database as it was at noon yesterday. Something like this:

Process Scheduler Auto Update

I have been setting up process monitor auto update, and have managed to get it to work - I can see the process status updates in the process monitor screen. Here is what I did.

Oracle Support document ID 2772617.1 explains how to set this up manually. I wanted this to work as part of the automated build, which means supplying the parameters as part of psft_customisations.yml.

A Gotcha!

We have multiple application servers running on the same port (but on different VMs). This means we need different domain IDs for each process scheduler because domain ID and port are used as a key. I would have thought that the hostname should be included to make this a unique identifier, but Oracle have chosen not to do that. Note that domain ID is different from domain name. Oracle documentation suggests using the database name in lower case. The DPK default is APPDOM (which is also the default domain name). If either of these is used, when you set up the inter domain event credentials on the process scheduler and configure the domain (for example by running):