Ever have a batch process which takes too long? A daily job that runs 25 hours? An overnight task that takes 12 hours? Something the users want sooner?
My favorite solution is to use wireless and host on a planet with a slower rotation rate, but there are problems with that approach, starting with lack of rack space.
The traditional solution is Beowulf.
- You say you have no Beowulf cluster for parallel processing? Your desktop support people will not let you build one out of spares?
- You are doing ETL using business languages not supported by parallel processing libraries such as MPICH and other MPI implementations?
Read on - there is another way!
- The basic approach is to write a script which will spawn a number of child processes and then figure out when they have all finished. The child processes will configure themselves to perform their part of the work in isolation from their siblings. Each child process will write a "done" file before terminating so that the main script knows that it is finished.
- This could be written in DOS batch, WSH or PowerShell, but the example below is in UNIX shell. It is an extension of prior work with a fixed number of child processes (four to be precise, named for the Horsemen of the Apocalypse in test and the Horsemen of Notre Dame in production) which used the job scheduler to spawn and collect the herd. The method I show here does not require a job scheduler, and it can be configured to use a different number of children or to process different items without paperwork.
- Let's break the scripts down:
- The main script, with logging removed, starts off like this:
#!/bin/sh
# run script to do any preparation before the children are spawned. this could be
# used to create an empty file to hold the collective output of the child threads
commonInitialization
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
# remove the done file for each child process, one for each file in the horsemen folder
for pgm in `ls -1 horsemen`
do
touch ${pgm}done
rm ${pgm}done
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
done
# spawn off each child process, one for each file in the horsemen folder
for pgm in `ls -1 horsemen`
do
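# note: $? read after & reflects the launch only; a child that fails later
# shows up as a missing done file in the wait loop
./horseman ${pgm} &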
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
done
- Good so far? We have performed the single-threaded up-front processing, removed the done file for each child process, and spawned each child process.
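- One caveat about the spawn loop: in a POSIX shell, `$?` read straight after a command started with `&` is zero whatever the child goes on to do, so a child that fails later is only noticed through its missing done file. The sketch below is an illustration, not part of the original scripts. It uses a hypothetical child function, `failing_child`, that always fails, and collects its real exit status with the shell's `$!` and `wait`.

```shell
#!/bin/sh
# hypothetical child that always fails with status 7
failing_child() {
    return 7
}

failing_child &
spawn_rc=$?          # status of the asynchronous launch, always 0
child_pid=$!         # process id of the background child

wait "$child_pid"    # blocks until that child has ended
child_rc=$?          # the child's real exit status

echo "spawn_rc=$spawn_rc child_rc=$child_rc"
```

- The done-file approach in this article does not need `wait`, but the same `$!` and `wait` pair could be used to log each child's exit status if that is wanted.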
- Now let's take a look at one of the child processes, the script horseman, invoked with the name of the thread as the single parameter:
#!/bin/sh
# this thread is going to process each item listed in the file named by the parameter
for item in `cat horsemen/${1}`
do
# it needs to build a command, a piece of SQL in my instance, which will allow it to
# run in isolation from its siblings, both in terms of data processed and files touched
## first it composes a sed command to modify a template, a master command or sql file,
## to create a specific command or sql file to run for this iteration
## the template contains :::: in place of the thread name and ::: in place of the item name
echo sed -e s/::::/${1}/g -e s/:::/${item}/g \
${MYSQLDIR}SampleMaster.sql | sed "s/_/'/g" > ~/${1}scratch
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
## then it runs the sed command to create the specific command for this iteration
sh ~/${1}scratch > ${MYSQLDIR}Sample${1}.sql
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
## then it runs the specific command to process this item in isolation from the other threads
./exec_sqlplus Sample${1} > ~/Sample${item}.log
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
done
# now that all items in the file named for the thread have been processed, this thread
# needs to write a "done" file so that the main script will know that it has finished
# the done file goes in the current directory, which is where the main script polls for it
touch ${1}done
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
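- The order of the two substitutions in the sed command matters: `:::` also matches inside `::::`, so the four-colon thread marker has to be replaced first. A minimal sketch, assuming a made-up one-line template and the thread and item names Numbers and oneFish (none of this is taken from the real SampleMaster.sql):

```shell
#!/bin/sh
# hypothetical template line: :::: marks the thread, ::: marks the item
template='thread=:::: item=:::'

# thread marker first, then item marker: both are replaced correctly
right=$(printf '%s\n' "$template" | sed -e 's/::::/Numbers/g' -e 's/:::/oneFish/g')

# item marker first: it eats three of the four colons in the thread marker
wrong=$(printf '%s\n' "$template" | sed -e 's/:::/oneFish/g' -e 's/::::/Numbers/g')

echo "right: $right"   # right: thread=Numbers item=oneFish
echo "wrong: $wrong"   # wrong: thread=oneFish: item=oneFish
```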
- Now that the child process is done, let's take a look at the rest of the main script, which makes sure that the children all finish and performs single-threaded processing on what they have produced:
# first make sure that the file that allows the program to wait exists.
# to break an infinite loop, remove this file to exit the inner loop below
touch rm_this_to_stop
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
# wait for each child process to finish in the same order as it started them
for pgm in `ls -1 horsemen`
do
# inner loop waits for a specific child process to write its done file
while [ ! -f ${pgm}done ]
do
sleep 300
# and if one of the child processes does not finish, the main script can be made to exit
if [ ! -f rm_this_to_stop ] ; then
exit 13
fi
done
# run script to perform single threaded processing after a child finishes
# this could be used to append an output file from the newly completed
# thread onto a single merged output file initialized in the beginning
threadCleanup
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
done
# run script to do any collective processing after all the children are done
# this could be used to concatenate the output file from each completed
# thread into a single output file which would not require initialization
commonCleanup
rc=$?
if [ $rc -ne 0 ] ; then
exit 13
fi
- And that's it. Parallel processing in well under 150 lines of shell script.
- So how easy is this thing to configure?
- The directory horsemen contains a file for each thread.
- Each file contains a list of items that the thread will loop through.
- To parody Seuss, there could be two files in horsemen called Numbers and Colors.
- horsemen/Numbers would contain:
oneFish
twoFish
- horsemen/Colors would contain:
redFish
blueFish
- The first thread would be called Numbers and would process oneFish and twoFish in turn before writing Numbersdone.
- The second thread would be called Colors and would process redFish and blueFish in turn before writing Colorsdone.
- The template file could be as simple as:
#!/bin/sh
echo :::
to try to tell the story. The generated sed command will replace the ::: with the item from the file, in this case the camel-cased line from the book. Note that the actual order of execution will jumble the story.
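- As a rough illustration of that template in use, the sketch below fills it in for each item and runs the result. It makes some assumptions: the template is saved as a hypothetical `template.sh` in a scratch directory created with `mktemp -d`, the helper `run_item` is invented for this example, and sed is called directly rather than through the echo-then-sh two-step of the horseman script. The items are run one after another here, so the story comes out in order; with the threads running in parallel the lines would interleave.

```shell
#!/bin/sh
# work in a throwaway directory
work=$(mktemp -d) || exit 13
cd "$work" || exit 13

# the two-line template from the text
printf '#!/bin/sh\necho :::\n' > template.sh

# hypothetical helper: build and run the specific script for one thread ($1) and item ($2)
run_item() {
    sed -e "s/::::/$1/g" -e "s/:::/$2/g" template.sh > "$1.sh" || return 13
    sh "$1.sh"
}

run_item Numbers oneFish > story.txt
run_item Numbers twoFish >> story.txt
run_item Colors redFish >> story.txt
run_item Colors blueFish >> story.txt

cat story.txt
```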
- To reconfigure it, just make different files with different contents.
- Do not allow an item and a file to have the same name.
- Remember not to fill memory driving the machine into swap.
- Remember to keep the child processes running in isolation.
- Remember that contention creates a bottleneck.
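- Putting the pieces together, the following toy run exercises the whole pattern with the Numbers and Colors files described above. It is a sketch under assumptions, not the production scripts: `toy_horseman` is a hypothetical stand-in for the horseman script that just records each item, the work happens in a scratch directory from `mktemp -d`, and the poll interval is one second rather than 300.

```shell
#!/bin/sh
# toy end-to-end run of the pattern in a throwaway directory
work=$(mktemp -d) || exit 13
cd "$work" || exit 13

# configuration: one file per thread, one item per line
mkdir horsemen || exit 13
printf 'oneFish\ntwoFish\n' > horsemen/Numbers
printf 'redFish\nblueFish\n' > horsemen/Colors

# hypothetical child: records each item in its own output file,
# then writes its done file
toy_horseman() {
    for item in $(cat "horsemen/$1")
    do
        echo "$1 processed $item" >> "$1.out" || exit 13
    done
    touch "${1}done"
}

# clear old done files, then spawn one child per file in horsemen
for pgm in $(ls -1 horsemen)
do
    rm -f "${pgm}done"
    toy_horseman "$pgm" &
done

# collect the children in the same order, polling for each done file;
# removing rm_this_to_stop makes the wait loop give up
touch rm_this_to_stop
: > merged.out
for pgm in $(ls -1 horsemen)
do
    while [ ! -f "${pgm}done" ]
    do
        sleep 1
        [ -f rm_this_to_stop ] || exit 13
    done
    # per-thread cleanup: append this thread's output to the merged file
    cat "$pgm.out" >> merged.out
done

cat merged.out
```

- Because each child writes only to its own output file and the parent merges in spawn order, the merged file is grouped by thread (Colors first, then Numbers, following the `ls` ordering) no matter which child finishes first.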
Hopefully this technique can be of some use to you in speeding up one of your processes.