Building a Kiwi Tools GFDB with QSeis

Note

There now exists a new, better tool to create Green’s function databases: the Fomosto tool, which is part of the Pyrocko package. It can use QSeis among other tools for the modelling. It produces Green’s function stores in the new GF Store format but can also convert these to the Kiwi Tools GFDB format.

About QSeis

QSeis is a Fortran77 program for the calculation of synthetic seismograms based on a layered halfspace earth model [Wang1999]. The code is available on the GFZ Section 2.1 downloads web page.

Introduction

In this document it is assumed that QSeis has been compiled and installed on your system and that it can be run by typing qseis in a terminal. A Python module (tunguska.qseis), which uses QSeis to generate a Kiwi Green’s function database, is installed along with the Kiwi Tools. This module aids in the determination of suitable input parameters for QSeis and can execute many QSeis instances in parallel on a multiprocessor machine (filling a large Green’s function database with traces typically requires many runs of QSeis).

We will now write a driver script which uses the Python QSeis Module (tunguska.qseis) to create a regional Green’s function database for a simple crustal model for Taiwan.

Requirements of the target application

A Kiwi Green’s function database contains precalculated Green’s functions on a grid of combinations of source depth z and receiver distance x. How many Green’s functions to calculate in advance is determined by the use case: the depth range is given by the earthquake source depths to be studied, the distance range by the smallest and largest distances between possible source points and receiver positions. How dense the grid has to be depends on whether we later want to interpolate the Green’s functions between neighboring nodes (a requirement for the analysis of extended sources) or not (in which case we must not forget to turn off interpolation in the application). For bilinear interpolation to work properly, the grid must be dense enough that no aliasing effects can occur. This means that the grid spacing dx should be (considerably) smaller than v_min / f_max, where v_min is the slowest apparent velocity of the seismic waves at the surface and f_max is the highest frequency to be analysed. For example, if we want to study waveforms in a frequency range of up to 2 Hz and the slowest horizontal velocities are 2 km/s, we need a grid spacing well below 1 km, so we may try 250 m. The Green’s functions should then be calculated with a temporal sampling rate of at least 4 Hz, preferably more.
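As a quick sanity check, these numbers can be reproduced with a few lines of Python. The factor of 4 between the aliasing limit and the chosen grid spacing is just the choice made in the example above, not a fixed rule:

# Rough estimate of grid spacing and temporal sampling for the GFDB, using
# the example values from the text (regional study up to 2 Hz).

v_min = 2000.0    # slowest apparent velocity at the surface [m/s]
f_max = 2.0       # highest frequency to be analysed [Hz]

dx_limit = v_min / f_max        # aliasing limit for the grid spacing: 1000 m
dx = dx_limit / 4.              # chosen spacing, well below the limit: 250 m
sampling_rate = 2.0 * f_max     # minimum temporal sampling rate: 4 Hz

print('dx limit: %g m, chosen dx: %g m, minimum sampling rate: %g Hz'
      % (dx_limit, dx, sampling_rate))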

Earth model

For this example we will simply use the global CRUST2.0 model by Laske, Masters and Reif, extract a crustal profile for Taiwan and merge it with the PREM model for the mantle. There might be more suitable earth models for Taiwan in the literature; this is just an example.

Here is a quick and dirty script to do that:

#!/usr/bin/env python

from pyrocko import crust2x2
from tunguska import qseis
import numpy as num

prem_string = '''
         0        5.8        3.2        2.6     1456.2        600
        15        5.8        3.2        2.6     1456.2        600
        15        6.8        3.9        2.9    1350.11        600
      24.4        6.8        3.9        2.9    1350.11        600
      24.4    8.10619    4.50391    3.38075    1436.39        600
        80    8.05454    4.48537    3.37471    1430.05        600
        80    8.05454    4.48537    3.37471    193.101         80
       220    7.92451    4.43867    3.35949    190.877         80
       220    8.55895     4.6439    3.43577    362.922        143
       400    8.90524     4.7699    3.54326     372.34        143
       400    9.13392    4.93249    3.72375    366.349        143
       600    10.1578    5.51593    3.97582    362.332        143
       600    10.1578    5.51602    3.97582    362.316        143
       670    10.2662    5.57021    3.99212    362.921        143
       670    10.7513    5.94513    4.38074    759.322        312
       771    11.0656    6.24054     4.4432    730.384        312
       771    11.0656    6.24039     4.4432    730.415        312
      2741    13.6804    7.26593    5.49148    822.173        312
      2741    13.6804    7.26597    5.49148    822.165        312
      2891    13.7166    7.26465    5.56646    826.754        312
      2891    8.06479          0    9.90344      57823         -1
    5149.5    10.3557          0    12.1663      57823         -1
    5149.5    11.0283    3.50431    12.7636    445.809       84.6
      6371    11.2622     3.6678    13.0885    431.356       84.6
'''

prem = qseis.QSeisLayeredModel()
prem.set_model_from_string(prem_string, units='ugly')  # ugly means it's 'km' and 'km/s'
                                                       # and not 'm' and 'm/s'.

# get crustal profile for taiwan coordinates
c2 = crust2x2.Crust2()
prof = c2.get_profile(23.76, 121.)

# get arrays with the crustal parameters
depth_crust, vp_crust, vs_crust, rho_crust =  prof.get_weeded()

# no Q values are given in CRUST2.0: use some standard values
qp_crust, qs_crust = num.ones(depth_crust.size)*1400., num.ones(depth_crust.size)*600.

# merge crustal model with mantle part of PREM
lcut, hcut = 5, -4
depth = num.hstack((depth_crust, prem.get_depth()[lcut:hcut]))
vp    = num.hstack((vp_crust,    prem.get_vp()[lcut:hcut]))
vs    = num.hstack((vs_crust,    prem.get_vs()[lcut:hcut]))
rho   = num.hstack((rho_crust,   prem.get_density()[lcut:hcut]))
qp    = num.hstack((qp_crust,    prem.get_qp()[lcut:hcut]))
qs    = num.hstack((qs_crust,    prem.get_qs()[lcut:hcut]))

merged = qseis.QSeisLayeredModel()
merged.set_model( depth, vp, vs, rho, qp, qs )

# print the model in a format suitable for the QSeis input file
print merged

The output is:

24
1 0 2.5 1.2 2.1 1400 600
2 0.7 2.5 1.2 2.1 1400 600
3 0.7 6 3.5 2.7 1400 600
4 15 6 3.5 2.7 1400 600
5 15 6.6 3.7 2.9 1400 600
6 24 6.6 3.7 2.9 1400 600
7 24 7.2 4 3.05 1400 600
8 35 7.2 4 3.05 1400 600
9 35 8 4.6 3.3 1400 600
10 80 8.05454 4.48537 3.37471 1430.05 600
11 80 8.05454 4.48537 3.37471 193.101 80
12 220 7.92451 4.43867 3.35949 190.877 80
13 220 8.55895 4.6439 3.43577 362.922 143
14 400 8.90524 4.7699 3.54326 372.34 143
15 400 9.13392 4.93249 3.72375 366.349 143
16 600 10.1578 5.51593 3.97582 362.332 143
17 600 10.1578 5.51602 3.97582 362.316 143
18 670 10.2662 5.57021 3.99212 362.921 143
19 670 10.7513 5.94513 4.38074 759.322 312
20 771 11.0656 6.24054 4.4432 730.384 312
21 771 11.0656 6.24039 4.4432 730.415 312
22 2741 13.6804 7.26593 5.49148 822.173 312
23 2741 13.6804 7.26597 5.49148 822.165 312
24 2891 13.7166 7.26465 5.56646 826.754 312

Generating the Database

We can now write a driver script which we will use to create the Green’s function database. Typically there is some trial and error involved in determining a stable and efficient set of parameters for a new setup. The strategy is to first do some trial runs with a sparse grid and/or a lowered sampling rate until we get a feeling for the parameters. The final run for a dense grid may take days or even weeks of computation time, and we don’t want to waste that effort.

taiwan.py:

#!/usr/bin/env python

import tunguska.qseis as qseis
import tunguska.forkmap as forkmap
from pyrocko import util
import gmtpy
import os, sys, logging, math
from os.path import join as pjoin
from tempfile import mkdtemp

km = 1000.

# A label used to mark output traces. When output of Mini-SEED traces is
# enabled, the location attribute of these traces will be set to the value of
# `tag`. This can be helpful when comparing the outputs of several test runs.

tag = 'A'

# Temporary files are kept in this directory (must be an absolute path):

tmp = pjoin(os.getcwd(),'tmp')

# Number of QSeis instances to run in parallel

nworkers = 2

# Number of distances to compute in each QSeis run. The possible maximum depends
# on some header parameters compiled into QSeis.

block_nx = 200

# A minimal time length for the seismograms is calculated from the extents of
# the database and the seismic velocities in the earth model. Typically this
# estimate is too short, so the estimated length of the seismograms is
# increased by this factor. Longer settings require more computational effort.

length_factor = 2.0

# Parameter for the sampling rate of the wavenumber integration (1 = sampled
# at the spatial Nyquist frequency, 2 = sampled at twice the Nyquist
# frequency, and so on). The larger this parameter, the smaller the
# space-domain aliasing effect, but also the higher the computational effort.

wavenumber_sample_rate = 2.5

# Whether to apply the flat earth transform

flat_earth_transform = 1

# Definition of the time windows used to cut the traces before inserting them
# into the Green's function database. This can be set to external files
# (tables) which define the cutting times as a function of source depth and
# distance, to `None` to not cut at all, or to 'from_model', which uses a
# crude guess based on the model velocities (see below in this script).

cutting = 'from_model'
#cutting = (Phase('beginning'), Phase('ending'))
#cutting = None


# First, a separate database is created for each source depth. In a second
# step, the partial databases are combined into one big database. The
# following settings define the naming of the output files:

partial_db_path = pjoin(tmp, 'gfdb_partial/db_z%(depth)i')
output_db_path = 'gfdb/db'

# By setting a directory name here, it is possible to additionally save all
# the generated traces in Mini-SEED format. This may be useful during tuning
# of the input parameters. Set to `None` once done with the tuning, to save
# some storage space.

extra_traces_dir = 'extra_%s' % tag
#extra_traces_dir = None

# Kiwi GFDB configuration (see also the documentation of gfdb_build).
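# Note: the values below define a small, coarse grid which is convenient for
# test runs; for the final, dense database, nx, nz, dx, dz and dt have to be
# adapted to the requirements discussed at the beginning of this document.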

gfdb_config = {
    'nchunks':  1,            # divide data into that many files
    'nx':       5,            # number of distances
    'nz':       2,            # number of source depths
    'ng':      10,            # number of GF components: 8 or 10
    'dt':       1.0,          # sampling interval in seconds
    'dx':     100.0*km,       # grid spacing in receiver distance
    'dz':      10.0*km,       # grid spacing in source depth
    'firstx': 100.0*km,       # distance of first receiver in grid
    'firstz':   1.0*km        # first source depth in grid
}

# Layered earth model (depth, vp, vs, density, Qp, Qs)

model_string = '''
    0.    6.       3.5      2.7      1400.     600
   15.    6.       3.5      2.7      1400.     600
   15.    6.6      3.7      2.9      1400.     600
   24.    6.6      3.7      2.9      1400.     600
   24.    7.2      4.       3.05     1400.     600
   35.    7.2      4.       3.05     1400.     600
   35.    8.       4.6      3.3      1400.     600
   80.    8.05454  4.48537  3.37471  1430.05   600
   80.    8.05454  4.48537  3.37471   193.101   80
  220.    7.92451  4.43867  3.35949   190.877   80
  220.    8.55895  4.6439   3.43577   362.922  143
  400.    8.90524  4.7699   3.54326   372.34   143
  400.    9.13392  4.93249  3.72375   366.349  143
  600.   10.1578   5.51593  3.97582   362.332  143
  600.   10.1578   5.51602  3.97582   362.316  143
  670.   10.2662   5.57021  3.99212   362.921  143
  670.   10.7513   5.94513  4.38074   759.322  312
  771.   11.0656   6.24054  4.4432    730.384  312
  771.   11.0656   6.24039  4.4432    730.415  312
 2741.   13.6804   7.26593  5.49148   822.173  312
 2741.   13.6804   7.26597  5.49148   822.165  312
 2891.   13.7166   7.26465  5.56646   826.754  312
'''

class LinRLogPlot(gmtpy.LinLogPlot):

    def setup_defaults(self):
        self.set_defaults( ymode='min-max' )

    def setup_projection(self, widget, scaler, conf):
        widget['J'] = '-JX%(width)gp/-%(height)gpl'
        scaler['B'] = '-B%(xinc)g:%(xlabel)s:/2:%(ylabel)s:WseN'


def usage():
    sys.exit(('''usage: %s  command [ arguments ... ]

Available commands:

  config-check

    Run autoconfiguration and give some debugging output.


  work [ depth_in_m ]

    Create partial databases for each (or a specific) depth. Depth should be
    given in meters.


  combine

    Combine partial databases into one complete database.


  plot-model outputfile.pdf

    Create a plot of the model.

''') % sys.argv[0] )


logging.basicConfig( level   = logging.INFO,
                     format  = '[%(asctime)s] %(levelname)-8s %(message)s',
                     datefmt = '%Y-%m-%d %H:%M:%S' )



if len(sys.argv) < 2:
    usage()


util.ensuredirs(partial_db_path)
util.ensuredirs(output_db_path)

qseis_config = qseis.QSeisConfig()
qseis_config.layered_model.set_model_from_string(model_string, units='ugly')

qseis_config.autoconf_modelling(gfdb_config, length_factor=length_factor,
                                tlead_in=5.*gfdb_config['dt'])

if cutting == 'from_model':
    vmin = qseis_config.layered_model.get_vs().min()
    vmax = qseis_config.layered_model.get_vp().max()

    cbeg = lambda x,z: math.sqrt(x**2 + z**2)/vmax - 20*gfdb_config['dt']
    cend = lambda x,z: math.sqrt(x**2 + z**2)/vmin*length_factor \
                        + 10*gfdb_config['dt']

    cutting = cbeg, cend

# manually override some variables:
qseis_config.sw_flat_earth_transform = flat_earth_transform
qseis_config.sample_rate = wavenumber_sample_rate

builder = qseis.QSeisGFDBBuilder(partial_db_path, output_db_path, gfdb_config,
                                 block_nx, qseis_config, cutting=cutting,
                                 extra_traces_dir=extra_traces_dir,
                                 tag=tag, tmp=tmp)

command = sys.argv[1]
args = sys.argv[2:]

if command == 'config-check':
    pass

elif command == 'combine':
    builder.combine()

elif command == 'plot-model':

    if len(args) != 1:
        usage()

    margin = [ x*gmtpy.cm for x in [2.5,2.5,2.,2.] ]
    plotter = LinRLogPlot(
        xscaled_unit_factor=0.001,
        yscaled_unit_factor=0.001,
        width=15*gmtpy.cm,
        height=15*gmtpy.cm,
        margins = margin,
        xmode='0-max',
        ylimits=(100.,None),
        xspace=0.05,
        xlabel = 'v@-S@-, v@-P@-',
        xscaled_unit = 'km/s',
        ylabel = 'Depth',
        yscaled_unit = 'km')

    mod = qseis_config.layered_model
    plotter.plot((mod.get_vp(), mod.get_depth()+1.), '-W1p,%s' %
                    gmtpy.color('scarletred3'))
    plotter.plot((mod.get_vs(), mod.get_depth()+1.), '-W1p,%s' %
                    gmtpy.color('skyblue3'))

    plotter.save(args[0])

elif command == 'work':

    if len(args) == 1:
        z = float(args[0])

        builder.work_depth(z)

    elif len(args) == 0:
        zs = builder.all_depths()

        @forkmap.parallelizable(nworkers)
        def work_depth(z):
            builder.work_depth(z)

        forkmap.map(work_depth, zs)

    else:
        usage()
else:
    usage()
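
Once the input parameters are settled, the typical workflow with this script (as listed by its usage message) is to first check the autoconfigured settings, then compute the partial databases for all source depths, and finally combine them into the complete database; the model can optionally be plotted for inspection:

python taiwan.py config-check
python taiwan.py work
python taiwan.py combine
python taiwan.py plot-model outputfile.pdf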
[Wang1999] Wang, R. (1999), A simple orthonormalization method for stable and efficient computation of Green’s functions, Bulletin of the Seismological Society of America, 89(3), 733-741.