
mag_min, fdigit conflicts

pull/6/head archive/aspo_dev
blubbblub 8 months ago
parent commit 2e5adce111
  1. README.md (114 changes)
  2. src/cc.py (15 changes)
  3. src/clustering.py (345 changes)
  4. src/clusty.py (89 changes)
  5. src/config.py (11 changes)
  6. src/util_clusty.py (10 changes)

README.md (114 changes)

@@ -4,7 +4,7 @@ CLUSTY
Clusty is a toolbox for the clustering of earthquakes based on waveform similarity observed across a network of seismic stations.
Application example and reference:
Petersen, Niemz, Cesca, Mouslopoulou, Bocchini (2021): Clusty, the waveform-based network similarity clustering toolbox: concept and application to image complex faulting offshore Zakynthos (Greece), GJI, https://doi.org/10.1093/gji/ggaa568
Petersen, Niemz, Cesca, Mouslopoulou, Bocchini (2021): Clusty, the waveform-based network similarity clustering toolbox: concept and application to image complex faulting offshore Zakynthos (Greece), GJI, Volume 224, Issue 3, Pages 2044–2059, https://doi.org/10.1093/gji/ggaa568
Input data: Seismic waveforms, station metadata, (optional: picks)
@@ -22,6 +22,25 @@ prerequisites:
- plotly: https://plotly.com/python/
Installation:
-------------
(system-wide, from source)
```
git clone https://git.pyrocko.org/clusty/clusty.git
cd clusty
sudo python setup.py install
```
To check if the installation was successful, run
```
clusty
```
in your terminal. A welcome and help message should appear. :-)
Basic commands to run clusty:
-----------------------------
A full run can be started using:
@@ -32,6 +51,7 @@ A full run can be started using:
log_level argument can be: DEBUG, INFO, ERROR, WARNING
However, we recommend running the tool step by step, using the following options:
```--cc``` to compute cross correlations, ```--netsim``` to compute the network similarity, ```--eps_analysis``` to gain insight into dbscan parameter settings, ```--cluster``` to cluster the earthquakes based on the precomputed network similarity, ```--plot_results``` to obtain result plots, ```--merge_freq_results``` to merge clustering results obtained in different frequency ranges, or ```--export_stacked_traces``` to export stacked waveforms for each cluster.
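A step-by-step session might then look like the following sketch (sketch only; how the configuration file is passed depends on your setup, so check the help message printed by ```clusty```):
```
clusty --cc ...            # cross correlations per station and component
clusty --netsim ...        # network similarity from the cc results
clusty --eps_analysis ...  # find a suitable dbscan eps value
clusty --cluster ...       # dbscan clustering
clusty --plot_results ...  # result plots
```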
@@ -43,7 +63,7 @@ Input:
- catalog file in pyrocko format
- station file in pyrocko format
- picks (optional) in pyrocko format or xml
- waveform data - one subdirectory per event with ```NETWORK.STATION``` in the waveform file name (see ```helper_functions/data_download.py``` for an FDSN download example in this format, or contact us for help converting continuous data into this format)
- waveform data - the station code ```STATION``` should be in the waveform file name (see the sketch below and ```helper_functions/data_download.py``` for an FDSN download example, or contact us for help converting continuous data into this format)
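A quick way to check that your data matches this layout (a sketch with hypothetical paths; clusty itself matches traces by searching for the station code in the file name):
```python
import glob
import os

waveform_dir = 'data/waveforms'  # hypothetical: one subdirectory per event
station = 'STA01'                # hypothetical station code

for event_dir in sorted(glob.glob(os.path.join(waveform_dir, '*'))):
    # a trace is usable if the station code appears in its file name
    matches = glob.glob(os.path.join(event_dir, '*%s*' % station))
    if not matches:
        print('no waveforms for %s in %s' % (station, event_dir))
```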
Example configuration file:
@@ -70,40 +90,92 @@ settings:
- !clusty.config.cc_comp_settings
bp: [3.0, 0.05, 0.2] # order, corner highpass [Hz], corner lowpass [Hz]
filtertmin: 120
filtertmax: 600
# define the time window for application of bp filter and downsampling.
# this is not the window for cc computation.
# adjust if you have short trace snippets only...
downsample_to: 0.1 # [s]
#pick_dir: '' # optional
phase: [R, L]
components: [HHZ, HHE, HHN]
compute_arrivals: true
# pick_path: '' # optional, give path to directory or file with picks
# pick_option: # optional, choose 'pyr_file' for a single pyrocko pick
# file, 'pyr_path' in case of one pick file per event or 'xml' in case of
# an event xml file with picks
phase: [R, L] # can also be [P,S] or a single phase
components: [HHZ, HHE, HHN]
# use x as wild cards: [xxZ, XXN, xxE] or [xHZ, xHN, XHE]
compute_arrivals: true
# computes arrivals with velocity models or takes them from picks
vmodel: path/to/velocity-model
use_precalc_ccs: false # boolean value to indicate if cross-correlations are already computed
snr_calc: true # boolean value to indicate whether SNRs should be computed for each trace
# if vmodel is '' and no picks are provided, clusty will try to use the
# crust2x2 model for the source area
use_precalc_ccs: false
# indicate if cross-correlations are already computed
snr_calc: true # indicate whether SNRs should be computed for each trace
snr_thresh: 2.0 # minimum SNR
max_dist: 30.0 # maximum inter event distance [km]
debug_mode: false # opens an interactive waveform browser to check time window and filter settings,
# for testing only
debug_mode_S: false # same for second phase
debug_mode: false
debug_mode_S: false
# opens interactive waveform browser to check time window and filter settings, for testing only
- !clusty.config.network_similarity_settings
get_station_weights: false # should a station weighting be applied? in case of uneven station distribution
method_similarity_computation: trimmed_mean_ccs # other methods: median, mean, max., product, weighted_sum_c_diff (see Petersen & Niemz et al. 2020)
use_precalc_net_sim: false # boolean value to indicate whether network similarity matrix is already computed
trimm_cut: 0.3 # parameter for trimmed mean method - cut off percentage of worst stations
apply_cc_station_thresh: true # should a cross-correlation based threshold be used (see Petersen & Niemz et al. 2020)
cc_thresh: 0.7 # min. cc to be met at ```min_n_stats``` stations covering an azimuthal range of at least ```az_thresh``` deg to consider an event for clustering
get_station_weights: false
# indicate if a station weighting should be applied, e.g. in case of uneven station distribution
method_similarity_computation: trimmed_mean_ccs
# other methods: median_ccs, mean_ccs, max_cc, product, weighted_sum_c_diff
# (see Petersen & Niemz et al. 2020)
use_precalc_net_sim: false
# boolean value to indicate whether network similarity matrix is already computed
trimm_cut: 0.3
# parameter for trimmed mean method - cut off percentage of worst stations
apply_cc_station_thresh: true
# should a cross-correlation based threshold be used (see Petersen & Niemz et al. 2020)
cc_thresh: 0.7
# min. cc to be met at ```min_n_stats``` stations covering an azimuthal
# range of at least ```az_thresh``` deg to consider an event for clustering
min_n_stats: 5
az_thresh: 60 # [deg]
combine_components: true # combine all components (or separate results?)
weights_components: [0.4, 0.3, 0.3] # weightings for components, here HHZ,HHN,HHE
combine_components: true
# combine all components (or separate results?)
weights_components: [0.4, 0.3, 0.3]
# weightings for components, same order as in components list above
- !clusty.config.clustering_settings
method: dbscan
dbscan_eps: [0.08, 0.12, 0.14, 0.16, 0.18, 0.2] # range of eps values to get started, use smaller steps in a smaller range after finding a rough best value
dbscan_eps: [0.08, 0.12, 0.14, 0.16, 0.18, 0.2]
# range of eps values to get started, use smaller steps in a smaller range after finding a rough best value
dbscan_min_pts: 5
# dbscan min pts value
plot_netsim_matrix: False
# plot network similarity matrix
plot_map: True
# plot clustering results on a map
wf_plot: [] # add tuple with (eps, minpts) if you want wf plots
wf_plot_stats: [] # add net.stat if waveform plots should be returned
export_stacked_traces: false # set to true if stacked traces are needed (slow)
wf_cl_snuffle: [] # add tuple with (eps, minpts) if you want to open
# clustered waveforms in snuffler - much nicer than the static wf_plot...
t_buffer_wfplot: 20.
# only needs to be adjusted if your input wf traces are very short.
export_stacked_traces: false
# set to true if stacked traces are needed (slow)
debug_stacking: false
```
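The settings classes referenced by the ```!clusty.config.*``` tags are pyrocko guts ```Object```s (see ```src/config.py``` below), so a file like the one above can in principle be loaded and inspected outside of clusty too. A minimal sketch, assuming clusty is installed and the top-level document is a single tagged object; clusty's own loader in ```src/clusty.py``` is the reference:
```python
from pyrocko.guts import load

import clusty.config  # noqa: importing registers the !clusty.config.* YAML tags

cnf = load(filename='config.yaml')  # hypothetical file name
print(cnf)  # guts objects dump back as YAML-like text
```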
values used in Zakynthos study:

src/cc.py (15 changes)

@@ -738,10 +738,11 @@ def comp_snr(tr, arr, twd):
try:
noise_wdw = tr.chop(tmin=arr-180, tmax=arr-60, inplace=False).ydata
# trace.snuffle([signal_wdw,noise_wdw])
except trace.NoData:
logging.warning('Longer time series recommended for SNR calculation')
print(util.tts(tr.tmin),util.tts(arr),util.tts(tr.tmax))
# print(util.tts(tr.tmin),util.tts(arr),util.tts(tr.tmax))
noise_wdw = tr.chop(tmin=tr.tmin, tmax=arr, inplace=False).ydata
try:
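For orientation: ```comp_snr``` rates a trace by comparing a signal window around the arrival against a pre-arrival noise window (here 180 s to 60 s before the arrival, falling back to the trace start for short traces). A minimal sketch of such a ratio, assuming both windows are already available as sample arrays and an RMS-based definition (the exact formula lives in ```src/cc.py```):
```python
import numpy as num


def snr_sketch(signal_wdw, noise_wdw):
    # RMS ratio of the signal window to the pre-arrival noise window;
    # comp_snr chops both windows from the same trace around arr.
    def rms(a):
        return num.sqrt(num.mean(a.astype(float) ** 2))
    return rms(signal_wdw) / rms(noise_wdw)
```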
@@ -773,8 +774,12 @@ def make_preprocessed_trace_list(i_st, st, catalog, cc_settings,
logging.debug(st.station)
if not '*' in waveform_dir:
p = pile.make_pile(waveform_dir, regex='%s.%s' % (st.network, st.station),
p = pile.make_pile(waveform_dir, regex='%s' % (st.station),
show_progress=False, fileformat='detect')
print(p)
#if not p.tmin:
# p = pile.make_pile(waveform_dir, regex='_%s_' % (st.station),
# show_progress=False, fileformat='detect')
else:
files = glob.glob('%s/*%s.%s*' % (waveform_dir, st.network, st.station))
p = pile.make_pile(files, show_progress=False, fileformat='detect')
@@ -788,6 +793,8 @@ def make_preprocessed_trace_list(i_st, st, catalog, cc_settings,
#hor_channel_list = [chan for chan in channel_list if chan[-1]!='Z']
n_phases = len(phases)
logging.info('pile: %s' % p)
n_failed = 0
trace_list = []
@@ -1145,6 +1152,10 @@ def get_similarity_matrix(station_list, catalog, cc_settings, waveform_dir,
# do not want to use horizontal traces for P phases.
continue
logging.debug('%s %s' % (c,i_ph))
# pile_all = pile.make_pile(waveform_dir, fileformat='detect',
# show_progress=False)
for i_st, st in enumerate(station_list):
if logging.getLogger().isEnabledFor(logging.INFO):
print('Station: %5d/%5d ' % (i_st+1, n_st), end='\r')

src/clustering.py (345 changes)

@@ -166,13 +166,16 @@ def catalog_output(catalog, cluster_labels, outfile, no_dist_evs=False):
return cluster_cat
def mag_scale_log(mag):
#return 2**mag/2
#return 1.5**mag*1.2
return 1.75**mag/1.8
def mag_scale_log(mag, log_tuple):
# #return 2**mag/2
# #return 1.5**mag*1.2
# return 1.75**mag/1.8 #*5 #castor (max 4)
return log_tuple[0]**mag/log_tuple[1]
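This commit makes the magnitude-to-symbol-size scaling configurable: instead of the hard-coded ```1.75**mag/1.8```, ```mag_scale_log``` now reads a ```(base, divisor)``` tuple from the new ```clustering_settings.mag_scale_log``` option (default ```(1.75, 1.8)```, see ```src/config.py``` below). A standalone sketch of the scaling:
```python
def mag_scale_log(mag, log_tuple):
    # symbol size grows exponentially with magnitude: base**mag / divisor
    base, divisor = log_tuple
    return base ** mag / divisor

# with the default (1.75, 1.8), an M 4.0 symbol is about three times
# larger than an M 2.0 symbol: ~5.2 vs. ~1.7
print(mag_scale_log(4.0, (1.75, 1.8)))
```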
def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
def map_plot(catalog, method, cc_cnf, cl_cnf, fn=False, mark_nodata=False):
log_tuple = cl_cnf.mag_scale_log
faults = cl_cnf.map_faults_path
gmtconf = dict(
MAP_TICK_PEN_PRIMARY='1.25p',
MAP_TICK_PEN_SECONDARY='1.25p',
@@ -195,6 +198,9 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
ev_nocl_notcl = []
ev_nocl_butmt = []
mags = [ev.magnitude for ev in catalog]
mag_min = num.min(mags)
mag_max = num.max(mags)
for ev in catalog:
if int(ev.extras['cluster_number']) != -1:
if ev.moment_tensor:
@@ -208,7 +214,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
if not mark_nodata and ev.extras['clustered'] is True:
lats_nocl.append(ev.lat)
lons_nocl.append(ev.lon)
s_nocl.append(mag_scale_log(ev.magnitude))
s_nocl.append(mag_scale_log(ev.magnitude, log_tuple))
else:
ev_nocl_notcl.append(ev)
@@ -227,8 +233,8 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
dist1 = od.distance_accurate50m(od.Loc(lat_m, lon_m), corners[0])
dist2 = od.distance_accurate50m(od.Loc(lat_m, lon_m), corners[1])
radius = max(dist1, dist2)
radius = max(dist1, dist2)*1.1
if not mark_nodata:
eps = method.split('_')[-2]
minpts = method.split('_')[-1]
@@ -272,19 +278,19 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
if len(ev.extras) == 2:
lat_nocl_data.append(ev.lat)
lon_nocl_data.append(ev.lon)
s_nocl_data.append(mag_scale_log(ev.magnitude))
s_nocl_data.append(mag_scale_log(ev.magnitude, log_tuple))
else:
ev_nodatathresh.append(ev)
if mark_nodata:
# this is old
rows_nodatasnr = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude)) for ev in ev_nodatathresh if ev.extras['note'] == 'nodatasnr']
rows_thresh = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude)) for ev in ev_nodatathresh if ev.extras['note'] == 'thresh']
rows_nodatasnr = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude, log_tuple)) for ev in ev_nodatathresh if ev.extras['note'] == 'nodatasnr']
rows_thresh = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude, log_tuple)) for ev in ev_nodatathresh if ev.extras['note'] == 'thresh']
m.gmt.psxy(in_rows=rows_nodatasnr, S='c', *m.jxyr)
m.gmt.psxy(in_rows=rows_thresh, S='d', *m.jxyr)
m.gmt.psxy(in_columns=(lon_nocl_data, lat_nocl_data, s_nocl_data), S='c', G='grey', *m.jxyr)
else:
rows = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude)) for ev in ev_nodatathresh if ev.extras['clustered'] is False]
rows = [(ev.lon, ev.lat, mag_scale_log(ev.magnitude, log_tuple)) for ev in ev_nodatathresh if ev.extras['clustered'] is False]
m.gmt.psxy(in_rows=rows, S='c', W='0.5p,grey', *m.jxyr)
# events without MT and without cluster number but entered DBSCAN
@@ -301,7 +307,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
* pmt.magnitude_to_moment(ev.magnitude)
m6 = pmt.to6(mt)
data = (ev.lon, ev.lat, 10) + tuple(m6) + (1,0,0)
s = 'd%sp' % mag_scale_log(ev.magnitude)
s = 'd%sp' % mag_scale_log(ev.magnitude, log_tuple)
if not cc_cnf and not mark_nodata:
if ev.extras['origins'][0] == True:
@@ -309,7 +315,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
in_rows=[data],
G=g_col,
S=s,
W='0.05p,%s' %g_col,
W='0.05p,%s' %'grey',
M=True,
*m.jxyr)
else:
@@ -318,7 +324,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
in_rows=[data],
G=g_col,
S=s,
W='0.05p,%s' %g_col,
W='0.05p,%s' %'grey',
M=True,
*m.jxyr)
@@ -353,7 +359,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
cluster_color_dict[cl]=g_col
lonsnomt = [ev.lon for ev in pl_ev]
latsnomt = [ev.lat for ev in pl_ev]
sizenomt = [mag_scale_log(ev.magnitude) for ev in pl_ev]
sizenomt = [mag_scale_log(ev.magnitude, log_tuple) for ev in pl_ev]
m.gmt.psxy(in_columns=(lonsnomt, latsnomt, sizenomt),
G=g_col, S=symbol, W='0.5p,%s' % g_col,
*m.jxyr)
@@ -367,7 +373,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
g_col = util_clusty.color2rgb(pl_ev[0].extras['color'])
lonsnomt = [ev.lon for ev in pl_ev]
latsnomt = [ev.lat for ev in pl_ev]
sizenomt = [mag_scale_log(ev.magnitude) for ev in pl_ev]
sizenomt = [mag_scale_log(ev.magnitude, log_tuple) for ev in pl_ev]
m.gmt.psxy(in_columns=(lonsnomt, latsnomt, sizenomt),
G=g_col, S='c', W='0.5p,%s' % g_col,
*m.jxyr)
@@ -383,7 +389,7 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
* pmt.magnitude_to_moment(ev.magnitude)
m6 = pmt.to6(mt)
data = (ev.lon, ev.lat, 10) + tuple(m6) + (1,0,0)
s = 'd%sp' % mag_scale_log(ev.magnitude)
s = 'd%sp' % mag_scale_log(ev.magnitude, log_tuple)
m.gmt.psmeca(
in_rows=[data],
G=g_col,
@@ -393,18 +399,34 @@ def map_plot(catalog, method, cc_cnf, faults, fn=False, mark_nodata=False):
*m.jxyr)
#print(mag_min, mag_max)
lower = num.floor(mag_min)
upper = num.ceil(mag_max)
if upper-lower <= 2:
step = 0.5
else:
step = 1
for i_ma, ma in enumerate(num.arange(lower,upper, step)):
if i_ma == 0:
leg = 'S 0.1i c %sp black 1p 0.3i M %.1f\n' % (mag_scale_log(ma, log_tuple),ma)
else:
leg = leg + 'S 0.1i c %sp black 1p 0.3i M %.1f\n' % (mag_scale_log(ma, log_tuple),ma)
leg = 'S 0.1i c %sp black 1p 0.3i M 3.0\n' % (mag_scale_log(3.)) +\
'S 0.1i c %sp black 1p 0.3i M 4.0\n' % (mag_scale_log(4.)) +\
'S 0.1i c %sp black 1p 0.3i M 5.0\n' % (mag_scale_log(5.)) +\
'S 0.1i c %sp black 1p 0.3i M 6.0\n' % (mag_scale_log(6.)) +\
'D 0 1p\n'
leg = leg + 'D 0 1p\n'
# leg = 'S 0.1i c %sp black 1p 0.3i M 3.0\n' % (mag_scale_log(3.)) +\
# 'S 0.1i c %sp black 1p 0.3i M 4.0\n' % (mag_scale_log(4.)) +\
# 'S 0.1i c %sp black 1p 0.3i M 5.0\n' % (mag_scale_log(5.)) +\
# 'S 0.1i c %sp black 1p 0.3i M 6.0\n' % (mag_scale_log(6.)) +\
# 'D 0 1p\n'
for key in sorted(cluster_color_dict.keys()):
item = cluster_color_dict[key]
leg = leg + 'S 0.1i c %sp %s - 0.3i %s\n' % (mag_scale_log(5.), item, key)
leg = leg + 'S 0.1i c %sp %s - 0.3i %s\n' % (mag_scale_log(num.median(mags), log_tuple), item, key)
with open('leg.txt', 'w') as f:
f.write(leg)
@@ -476,10 +498,19 @@ def plot_cluster_timing(result_catalog, method_string):
one_day = datetime.timedelta(days = 1)
for clust in sorted(list(set(cl_numbers))):
#clcat = [ev if ev.extras['cluster_number'] == clust else None for ev in clustered_events]
clcat = [ev for ev in result_catalog if ev.extras['cluster_number'] == clust]
#moms = [pmt.magnitude_to_moment(ev.magnitude) if ev else 0.0 for ev in clcat]
min_mag = num.min([ev.magnitude for ev in result_catalog if ev.magnitude!= None])
mag_substituted = False
for ev in result_catalog:
if ev.magnitude == None:
ev.magnitude = min_mag/2
mag_substituted = True
if mag_substituted is True:
logging.warning('Not all events in catalog have magnitudes. Missing magnitudes were \
set to min(mag)/2 for plotting.')
moms = [pmt.magnitude_to_moment(ev.magnitude) for ev in clcat]
moms.insert(0,0.)
moms_cum = num.cumsum(moms)
@@ -531,7 +562,7 @@ def plot_cluster_timing(result_catalog, method_string):
#ax[1].plot(xvals_d, cum_mom)
ax[1].set_ylabel('Cum. moment per cluster', fontsize=fs)
ax[1].set_yscale('log')
ax[1].set_ylim(10e13, 10e17)
#ax[1].set_ylim(10e13, 10e17)
#ax[0].set_xscale('log')
#ax[1].set_xscale('log')
plt.tight_layout()
@@ -647,7 +678,7 @@ def get_twd(i_st, i_ev, phases, cc_cnf, arrivals=None):
def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arrivals,
cc_cnf, method_string, representative_events, max_sil_vals,
tshift_array):
tshift_array, cl_cnf):
'''
read result file and plot waveforms of each cluster for chosen stations on
@@ -677,7 +708,7 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
logging.info('Preparing waveform plot for %s.%s' % (st.network, st.station))
if not '*' in waveform_dir:
p = pile.make_pile(waveform_dir, regex='%s.%s' % (st.network, st.station),
p = pile.make_pile(waveform_dir, regex='%s' % (st.station),
show_progress=False)
else:
files = glob.glob('%s/*%s.%s*' % (waveform_dir, st.network, st.station))
@@ -702,7 +733,7 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
# first, fill in all events whose cc with the first event is above the threshold
for ii, (i_ev, ev) in enumerate(cl_events):
if ii == 0:
t_shift_list[0] = - ev.time #- arrT[i_ph, i_st, i_ev]
t_shift_list[0] = arrivals[0, i_st, i_ev] # - ev.time #- arrT[i_ph, i_st, i_ev]
i_rel = i_ev
else:
if num.isnan(tshift_array[0, i_st, i_ev, i_rel]):
@@ -738,13 +769,13 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
plot_last = True
try:
tr_Z = p.all(tmin=arr-300, tmax=arr+900,
tr_Z = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('Z'))[0]
tr_N = p.all(tmin=arr-300, tmax=arr+900,
tr_N = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('E'))[0]
tr_E = p.all(tmin=arr-300, tmax=arr+900,
tr_E = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('N'))[0]
@@ -763,7 +794,7 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
if snr > cc_cnf.snr_thresh:
#print('before %s %s %s ' % (tr_Z.tmin, len(tr_Z.ydata), arr))
t_buffer = 60
t_buffer = cl_cnf.t_buffer_wfplot
#print('evtime', ev.time)
#print('arr', arr)
#print('t_shift_list', num.nanmax(t_shift_list), num.nanmin(t_shift_list))
@@ -787,6 +818,7 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
tr_E.chop(tmin=0.00, tmax=twd[1]+t_buffer)
# print('2nd chop %s %s %s' % (tr_Z.tmin, tr_Z.tmax, len(tr_Z.ydata)))
# print('twd', twd[1])
ntr +=1
else:
continue
@@ -869,6 +901,221 @@ def wf_cluster_plot(resultcat, subset_stations_and_indices, waveform_dir, arriva
(st.network, st.station, method_string))
plt.close()
def wf_cluster_snuffle(resultcat, subset_stations_and_indices, waveform_dir, arrivals,
cc_cnf, method_string, representative_events, max_sil_vals,
tshift_array, cl_cnf):
'''
read result file and plot waveforms of each cluster for chosen stations on
top of each other.
nrows = nclusters
ncolumns = nstations (select a subset or random number)
'''
if tshift_array is None:
logging.info('No tshift array found.')
nstations = len(subset_stations_and_indices)
set_clusters = list(set([ev.extras['cluster_number'] for ev in resultcat
if ev.extras['cluster_number'] != -1]))
set_clusters.sort()
nclusters = len(set_clusters)
phases = cc_cnf.phase
n_phases = len(phases)
result_dict = {}
for cl in set_clusters:
result_dict[cl] = [(i_ev, ev) for i_ev, ev in enumerate(resultcat)
if ev.extras['cluster_number'] == cl]
for i_st, st in subset_stations_and_indices:
logging.info('Preparing waveform plot for %s.%s' % (st.network, st.station))
if not '*' in waveform_dir:
#p = pile.make_pile(waveform_dir, regex='%s.%s' % (st.network, st.station),
# show_progress=False)
p = pile.make_pile(waveform_dir, regex='%s' % (st.station),
show_progress=True, fileformat='detect')
else:
files = glob.glob('%s/*%s.%s*' % (waveform_dir, st.network, st.station))
p = pile.make_pile(files, show_progress=False)
trZs = []
trEs = []
trNs = []
test_comps = [c[-1] for c in cc_cnf.components]
for i_cl, cl in enumerate(set_clusters):
cl_events = result_dict[cl]
nev = len(cl_events)
ntr = 0
plot_last = False
plot_last_list = []
sil_val = max_sil_vals[cl]
if tshift_array is not None:
# get list of time shifts for current cluster:
t_shift_list = num.empty((nev))
t_shift_list.fill(num.nan)
i_ev_list = [clev[0] for clev in cl_events]
# first, fill in all events whose cc with the first event is above the threshold
for ii, (i_ev, ev) in enumerate(cl_events):
if ii == 0:
t_shift_list[0] = -arrivals[0, i_st, i_ev]
i_rel = i_ev
else:
if num.isnan(tshift_array[0, i_st, i_ev, i_rel]):
continue
else:
t_shift_list[ii] = t_shift_list[0] + tshift_array[0, i_st, i_ev, i_rel]
# test if there are still some shift times unset, if so use shift times
# relative to other than first events.
if num.isnan(num.min(t_shift_list)):
for i0, i_ev0 in enumerate(i_ev_list):
if not num.isnan(t_shift_list[i0]):
continue
else:
for i1, i_ev1 in enumerate(i_ev_list):
if i0 == i1:
continue
dt = tshift_array[0, i_st, i_ev1, i_ev0]
#dt2 = tshift_array[i_ph, i_st, i_ev0, i_ev1]
if not num.isnan(dt):
t_shift_list[i0] = t_shift_list[i1] + dt
for ii, (i_ev, ev) in enumerate(cl_events):
arr = arrivals[0, i_st, i_ev]
twd = get_twd(i_st, i_ev, phases, cc_cnf, arrivals)
twd_ = twd
twd = twd[0]
try:
if 'Z' in test_comps:
tr_Z = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('Z'))[0]
print('tr Z found', tr_Z)
if 'N' in test_comps:
tr_N = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('E'))[0]
if 'E' in test_comps:
tr_E = p.all(tmin=arr-cc_cnf.filtertmin, tmax=arr+cc_cnf.filtertmax,
trace_selector=lambda tr: tr.station == st.station and
tr.channel.endswith('N'))[0]
except:
logging.debug('WF cluster plot: %s.%s %s - no data found' %
(st.network, st.station, ev.name))
continue
#if not arr > tr_Z.tmin and not arr < tr_Z.tmax:
# print('arr error')
# continue
if cc_cnf.downsample_to:
if 'Z' in test_comps:
tr_Z.downsample_to(cc_cnf.downsample_to)
if 'N' in test_comps:
tr_N.downsample_to(cc_cnf.downsample_to)
if 'E' in test_comps:
tr_E.downsample_to(cc_cnf.downsample_to)
if 'Z' in test_comps:
tr_Z.bandpass(int(cc_cnf.bp[0]), cc_cnf.bp[1], cc_cnf.bp[2])
if 'N' in test_comps:
tr_N.bandpass(int(cc_cnf.bp[0]), cc_cnf.bp[1], cc_cnf.bp[2])
if 'E' in test_comps:
tr_E.bandpass(int(cc_cnf.bp[0]), cc_cnf.bp[1], cc_cnf.bp[2])
if cc_cnf.snr_calc:
snr = comp_snr(tr_Z, arr, twd_)
else:
snr = 9999
if snr > cc_cnf.snr_thresh:
#print('before %s %s %s ' % (tr_Z.tmin, len(tr_Z.ydata), arr))
t_buffer = cl_cnf.t_buffer_wfplot # 20
# print('evtime', ev.time)
# print('arr', arr)
#print('t_shift_list', num.nanmax(t_shift_list), num.nanmin(t_shift_list))
#input()
#t_buffer = num.nanmax(t_shift_list), num.nanmin(t_shift_list)
try:
if 'Z' in test_comps:
tr_Z.chop(tmin=arr - t_buffer, tmax=arr + twd[1]+t_buffer)
logging.info('first chop %s %s %s ' % (tr_Z.tmin, tr_Z.tmax, len(tr_Z.ydata)))
if 'N' in test_comps:
tr_N.chop(tmin=arr - t_buffer, tmax=arr + twd[1]+t_buffer)
if 'E' in test_comps:
tr_E.chop(tmin=arr - t_buffer, tmax=arr + twd[1]+t_buffer)
except trace.NoData:
logging.info('Trace too short for cutting')
continue
if tshift_array is not None:
dt = t_shift_list[ii]
if not num.isnan(dt):
start = arr - ev.time - dt
ntr +=1
if 'Z' in test_comps:
tr_Z.shift(dt)
trZ_ = tr_Z.copy()
trZ_.set_codes(location=cl)
trZ_.ydata = trZ_.ydata/ num.max(trZ_.ydata)
trZs.append(trZ_)
if 'N' in test_comps:
tr_N.shift(dt)
trN_ = tr_N.copy()
trN_.set_codes(location=cl)
trN_.ydata = trN_.ydata/ num.max(trN_.ydata)
trNs.append(trN_)
if 'E' in test_comps:
tr_E.shift(dt)
trE_ = tr_E.copy()
trE_.set_codes(location=cl)
trE_.ydata = trE_.ydata/ num.max(trE_.ydata)
trEs.append(trE_)
#tr_Z.chop(tmin=t_buffer/2., tmax=twd[1]+t_buffer/2)
#tr_N.chop(tmin=t_buffer/2., tmax=twd[1]+t_buffer/2)
#tr_E.chop(tmin=t_buffer/2., tmax=twd[1]+t_buffer/2)
#print('2nd chop %s %s %s' % (util.tts(tr_Z.tmin), util.tts(tr_Z.tmax), len(tr_Z.ydata)))
# print('twd', twd[1])
else:
continue
else:
continue
else:
continue
if 'Z' in test_comps:
trace.snuffle(trZs)
#io.save(trZs, './results/%s_traces_Z_aligned.yaff' % st.station,
# format='yaff')
if 'N' in test_comps:
trace.snuffle(trNs)
#io.save(trNs, './results/%s_traces_N_aligned.yaff' % st.station,
# format='yaff')
if 'E' in test_comps:
trace.snuffle(trEs)
#io.save(trEs, './results/%s_traces_E_aligned.yaff' % st.station,
# format='yaff')
def export_stacked_traces(resultcat, waveform_dir, arrivals, coef_array,
tshift_array, station_list,
@@ -1258,10 +1505,15 @@ def fruchtermann(cat, dists, eps, pts_min, method_string, cc_cnf, cl_cnf):
sim_connect =[(u, v) for (u, v, d) in G.edges(data=True) if d['weight'] > float(eps)]
min_mag = np.min([ev.magnitude for ev in cat if ev.magnitude!= None])
min_mag = num.min([ev.magnitude for ev in cat if ev.magnitude!= None])
mag_substituted = False
for ev in cat:
if ev.magnitude == None:
ev.magnitude = min_mag/2
mag_substituted = True
if mag_substituted is True:
logging.warning('Not all events in catalog have magnitudes. Missing magnitudes were\
set to min(mag)/2 for plotting.')
node_colors = [cat[int(idx)].extras['color'] for idx in G.nodes()]
@@ -1413,6 +1665,7 @@ def sankey_multi(file_list, basic_cnf, fn=False, debug=False, single_freq=True,
clabels_clean.append(clabel_clean)
mycolors = util_clusty.get_colormap()
start_source =0
for i_file in range(len(file_list)-1):
if debug:
@@ -1767,7 +2020,9 @@ def calc_adjusted_rand(cat_list):
plt.close()
def map_plot_freqmerge(catalog, method, faults):
def map_plot_freqmerge(catalog, method, cl_cnf):
faults = cl_cnf.map_faults_path
log_tuple = cl_cnf.mag_scale_log
gmtconf = dict(
MAP_TICK_PEN_PRIMARY='1.25p',
MAP_TICK_PEN_SECONDARY='1.25p',
@@ -1801,7 +2056,7 @@ def map_plot_freqmerge(catalog, method, faults):
else:
lats_nocl.append(ev.lat)
lons_nocl.append(ev.lon)
s_nocl.append(mag_scale_log(ev.magnitude))
s_nocl.append(mag_scale_log(ev.magnitude, log_tuple))
# map dimensions:
lats_all = sorted([ev.lat for ev in catalog]) # if ev.extras['cluster_number'] != -1]
@@ -1862,7 +2117,7 @@ def map_plot_freqmerge(catalog, method, faults):
* pmt.magnitude_to_moment(ev.magnitude)
m6 = pmt.to6(mt)
data = (ev.lon, ev.lat, 10) + tuple(m6) + (1,0,0)
s = 'd%sp' % mag_scale_log(ev.magnitude)
s = 'd%sp' % mag_scale_log(ev.magnitude, log_tuple)
m.gmt.psmeca(
in_rows=[data],
G=g_col,
@@ -1879,7 +2134,7 @@ def map_plot_freqmerge(catalog, method, faults):
g_col = util_clusty.color2rgb(ev.extras['color'])
for i in range(origins):
if ev.extras['origins'][i] == True:
s = '%s%sp' % (symbol_list[i], mag_scale_log(ev.magnitude))
s = '%s%sp' % (symbol_list[i], mag_scale_log(ev.magnitude, log_tuple))
#elif ev.extras['origins'][1] == True:
# s = '%s%sp' % (symbol_list[], mag_scale_log(ev.magnitude))
#elif ev.extras['origins'][2] == True:
@@ -1900,7 +2155,7 @@ def map_plot_freqmerge(catalog, method, faults):
* pmt.magnitude_to_moment(ev.magnitude)
m6 = pmt.to6(mt)
data = (ev.lon, ev.lat, 10) + tuple(m6) + (1,0,0)
s = 'd%sp' % mag_scale_log(ev.magnitude)
s = 'd%sp' % mag_scale_log(ev.magnitude, log_tuple)
if ev.extras['origins'][0] == True:
m.gmt.psmeca(
in_rows=[data],
@@ -1920,10 +2175,10 @@ def map_plot_freqmerge(catalog, method, faults):
M=True,
*m.jxyr)
leg = 'S 0.1i c %sp black 1p 0.5i M 3.0\n' % (mag_scale_log(3.)) +\
'S 0.1i c %sp black 1p 0.5i M 4.0\n' % (mag_scale_log(4.)) +\
'S 0.1i c %sp black 1p 0.5i M 5.0\n' % (mag_scale_log(5.)) +\
'S 0.1i c %sp black 1p 0.5i M 6.0\n' % (mag_scale_log(6.))
leg = 'S 0.1i c %sp black 1p 0.5i M 3.0\n' % (mag_scale_log(3., log_tuple)) +\
'S 0.1i c %sp black 1p 0.5i M 4.0\n' % (mag_scale_log(4., log_tuple)) +\
'S 0.1i c %sp black 1p 0.5i M 5.0\n' % (mag_scale_log(5., log_tuple)) +\
'S 0.1i c %sp black 1p 0.5i M 6.0\n' % (mag_scale_log(6., log_tuple))
with open('leg.txt', 'w') as f:
f.write(leg)

src/clusty.py (89 changes)

@@ -188,6 +188,7 @@ def main():
else:
pass
if station_list == []:
logging.warning('No stations found. Please use stationxml with file '
+ 'extension ''xml'''
@@ -195,6 +196,7 @@ def main():
+ ' ''.pf'' or ''.yaml''.')
if basic_cnf.st_allowlist != []:
logging.info('st allow list: %s' % basic_cnf.st_allowlist)
station_list = [st for st in station_list if '%s.%s' %
(st.network, st.station)
in basic_cnf.st_allowlist]
@@ -210,7 +212,7 @@ def main():
logging.info('Number of events: %s' % len(cat))
logging.info('Number of stations: %s' %len(station_list))
logging.info('Number of stations: %s %s' %(len(station_list),[st.station for st in station_list]))
if station_list == []:
logging.warning('No stations found.')
@@ -293,7 +295,7 @@ def main():
num.save(file='./precalc_arrays/arrivals_array', arr=arrivals_array)
elif cc_cnf.pick_dir != '' and cc_cnf.pick_xml is False:
elif cc_cnf.pick_path != '' and cc_cnf.pick_option == 'pyr_path':
for i_ph, phase_name in enumerate(cc_cnf.phase):
for i_st, st in enumerate(station_list):
for i_ev, ev in enumerate(cat):
@@ -310,7 +312,8 @@ def main():
try:
arr = [pick.tmin for pick in picks if
(pick.match_nsl((st.network, st.station, st.location)) and pick._phasename == phase_name)][0]
(pick.match_nsl((st.network, st.station, st.location)))][0]
# and pick._phasename == phase_name)][0]
arrivals_array[i_ph, i_st, i_ev] = arr
except IndexError: # if the given phase has no pick
@@ -319,10 +322,31 @@ def main():
num.save(file='%s/precalc_arrays/arrivals_array' % basic_cnf.work_dir, arr=arrivals_array)
elif cc_cnf.pick_dir != '' and cc_cnf.pick_xml is True:
elif cc_cnf.pick_path != '' and cc_cnf.pick_option == 'pyr_file':
picks = marker.load_markers(cc_cnf.pick_path)
for i_ph, phase_name in enumerate(cc_cnf.phase):
for i_st, st in enumerate(station_list):
for i_ev, ev in enumerate(cat):
try:
arr = [pick.tmin for pick in picks if
(pick.match_nsl((st.network, st.station, st.location))
and pick._phasename == phase_name
and pick._event_time == ev.time)][0]
arrivals_array[i_ph, i_st, i_ev] = arr
print(arr)
except IndexError: # if the given phase has no pick
arrivals_array[i_ph, i_st, i_ev] = num.nan
print('error')
num.save(file='%s/precalc_arrays/arrivals_array' % basic_cnf.work_dir, arr=arrivals_array)
elif cc_cnf.pick_path != '' and cc_cnf.pick_option == 'xml':
logging.info('Generating arrivals array from picks in xml format.')
search_me = os.path.join(cc_cnf.pick_dir, '*.xml')
search_me = os.path.join(cc_cnf.pick_path, '*.xml')
pick_files = glob.glob(search_me) # find all files
logging.info('found pick files: %s' % pick_files)
@@ -744,7 +768,8 @@ def main():
continue
method_string = os.path.split(f)[-1][4:-5]
comp_name = method_string.split('_')[-3]
cl.map_plot(result_catalog, method_string, cc_cnf, cl_cnf.map_faults_path)
if cl_cnf.plot_map:
cl.map_plot(result_catalog, method_string, cc_cnf, cl_cnf)
cl.plot_cluster_timing(result_catalog, method_string) # maybe move this later?! final plots...
if cl_cnf.plot_hist:
@@ -785,14 +810,56 @@ def main():
result_catalog = model.load_events(os.path.join('results/', 'cat_' + method_string + '.yaml'))
except:
logging.error('Catalog for chosen eps and minpts for wf plot not found.')
logging.info(method_string)
sys.exit()
representative_events, max_sil_vals = cl.get_repr_ev(result_catalog, dists)
cl.wf_cluster_plot(result_catalog, subset_stations_and_indices,
basic_cnf.waveform_dir, arrivals_array, cc_cnf, method_string,
representative_events, max_sil_vals, tshifts_array)
representative_events, max_sil_vals, tshifts_array, cl_cnf)
if cl_cnf.wf_cl_snuffle:
eps = cl_cnf.wf_cl_snuffle[0]
minpts = cl_cnf.wf_cl_snuffle[1]
logging.info('Opening waveform snuffle for eps %s and minpts %s.' % (eps, minpts))
if cl_cnf.wf_plot_stats == []:
subset_stations_and_indices = [(i_st, st) for i_st, st in enumerate(station_list)]
else:
subset_stations_and_indices = [(i_st, st) for i_st, st in enumerate(station_list)
if '%s.%s' % (st.network, st.station) in cl_cnf.wf_plot_stats]
arrivals_array = num.load('./precalc_arrays/arrivals_array.npy')
try:
tshifts_array = num.load('./precalc_arrays/cctshifts_array.npy')
except:
tshifts_array = None
for i_comp in range(n_comps):
#phase_name = get_phasename(cc_cnf, ns_cnf, i_ph)
if n_comps == 1 and ns_cnf.combine_components:
comp_name = 'comb'
else:
comp_name = cc_cnf.components[i_comp]
dists = num.load('./precalc_arrays/dists_%s.npy' % comp_name)
try:
method_string = [os.path.split(f)[-1][4:-5] for f in result_catalogs
if float((os.path.split(f)[-1][4:-5]).split('_')[-2]) == eps
and float((os.path.split(f)[-1][4:-5]).split('_')[-1]) == minpts
and (os.path.split(f)[-1][4:-5]).split('_')[-3] == comp_name][0]
result_catalog = model.load_events(os.path.join('results/', 'cat_' + method_string + '.yaml'))
except:
logging.error('Catalog for chosen eps and minpts for wf plot not found.')
sys.exit()
representative_events, max_sil_vals = cl.get_repr_ev(result_catalog, dists)
cl.wf_cluster_snuffle(result_catalog, subset_stations_and_indices,
basic_cnf.waveform_dir, arrivals_array, cc_cnf, method_string,
representative_events, max_sil_vals, tshifts_array, cl_cnf)
logging.info('Finished result plots.')
@@ -874,13 +941,13 @@ def main():
cat_merged = merge_freqs.merge_harmonized_catalogs(result_catalogs)
model.dump_events(cat_merged, '%s/results/cat_merged_%s.pdf' % (work_dir,method),
model.dump_events(cat_merged, '%s/results/cat_merged_%s.yaml' % (work_dir,method),
format='yaml')
fn = '%s/results/cluster_map_%s.pdf' % (work_dir,method)
cl.map_plot(cat_merged, method_string, faults=faults_path,
cc_cnf=False, fn=fn)
if cl_cnf.plot_map:
cl.map_plot(cat_merged, method_string, cl_cnf,
cc_cnf=False, fn=fn)
if args.export_stacked_traces:
logging.info('Preparing waveform stacking, reading prepared arrays.')

src/config.py (11 changes)

@@ -22,10 +22,12 @@ class GeneralSettings(Object):
class cc_comp_settings(Object):
bp = List.T()
filtertmin = Float.T(default=120, help='starttime before arrival for filtering')
filtertmax = Float.T(default=600, help='endtime after filtertmin for filtering')
downsample_to = Float.T(optional=True)
twd = List.T(default=None)
pick_dir = String.T(default='')
pick_xml = Bool.T(default=False)
pick_path = String.T(default='')
pick_option = String.T(default='pyr_file') # pyr_file, pyr_path or xml
phase = List.T(default=[])
components = List.T(default=[])
use_precalc_arrivals = Bool.T(default=False)
@@ -67,8 +69,10 @@ class clustering_settings(Object):
plot_hist = Bool.T(default=False)
plot_netsim_matrix = Bool.T(default=False)
repr_ev = Tuple.T(default=())
plot_map = Bool.T(default=True)
wf_plot = Tuple.T(default=())
wf_plot_stats = List.T(optional=True)
wf_cl_snuffle = Tuple.T(default=())
frucht_k_factor = Float.T(default=2.5)
frucht_plot_lib = String.T(default='plotly')
frucht_show = Bool.T(default=False)
@@ -77,6 +81,8 @@ class clustering_settings(Object):
export_stacked_traces = Bool.T(default=False)
prepare_for_grond = Bool.T(default=False)
debug_stacking = Bool.T(default=False)
mag_scale_log = Tuple.T(default=(1.75, 1.8))
t_buffer_wfplot = Float.T(default=20.)
class merge_cat_settings(Object):
cat_config_list = List.T(default=[])
@@ -85,3 +91,4 @@ class merge_cat_settings(Object):
export_stacked_traces = Bool.T(default=False)
frucht_show = Bool.T(default=False)
sankey_show = Bool.T(default=False)

src/util_clusty.py (10 changes)

@@ -1,5 +1,6 @@
import matplotlib as mpl
from matplotlib.colors import LinearSegmentedColormap
import random
def hex_to_rgb(hex_color, out='std', alpha = 0.1):
hex_color = hex_color.strip("#")
@@ -23,15 +24,16 @@ def get_colormap(cmap_name='default'):
# (mint green removed due to lack of visibility on map, #436B6B replaced a near-black color)
# #B68E00 substituted by #be6400, #FB00D1 by #755b71
# exchanged 2 and 4
mycolors = ['#000000', '#FD3216', '#FED4C4', '#72f283', '#328a3e',
'#FE00CE', '#0DF9FF', '#F6F926', '#FF9616', '#6A76FC',
mycolors = ['#000000', '#00A08B', '#FD3216', '#6A76FC','#FF9616',
'#620042', '#328a3e','#FE00CE', '#0DF9FF', '#F6F926',
'#72f283',
'#EEA6FB', '#DC587D', '#D626FF', '#6E899C', '#00B5F7',
'#B68E00', '#FF0092', '#22FFA7', '#E3EE9E',
'#86CE00', '#BC7196', '#7E7DCD', '#FC6955', '#E48F72',
'#86CE00', '#BC7196', '#7E7DCD', '#862A16', '#E48F72',
'#DB7093', '#2E91E5', '#E15F99', '#1CA71C', '#FB0D0D',
'#DA16FF', '#436B6B', '#Be6400', '#750D86', '#EB663B',
'#511CFB', '#00A08B', '#755b71', '#FC0080', '#B2828D',
'#6C7C32', '#778AAE', '#862A16', '#A777F1', '#620042',
'#6C7C32', '#778AAE', '#FED4C4', '#A777F1', '#FC6955',
'#1616A7', '#DA60CA', '#6C4516', '#0D2A63', '#AF0038']
elif cmap_name == 'seiscloud_comp':
