Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSEED #17

Open
dylanmikesell opened this issue Sep 21, 2022 · 5 comments
Open

MSEED #17

dylanmikesell opened this issue Sep 21, 2022 · 5 comments

Comments

@dylanmikesell
Copy link

Hi @paudetseis, is there any reason on your end to add MSEED read functionality. It seems to me like it should be able to handle both. I have mseed day volumes stored locally. I looked through io.py last night. Seems like a bit of work to fix things to get mseed working. Has anyone else made this request? I can have a go at it, but if no one else has been asking about this functionally, maybe I will just convert my data to SAC.

@paudetseis
Copy link
Member

Hi @dylanmikesell you are right - ideally the code would handle both. It should in fact be straightforward to modify the code to allow loading MSEED files. Most lines referring to SAC (e.g., lines 73, 75, 78, 144, 152, etc.) use it as a string to search for the the corresponding SAC extension. You could add an argument to the functions download_data(), parse_localdata_for_comp() and list_local_data_stn(), for instance ext='SAC', which would load SAC files by default, but accept other file extensions (like MSEED - note this will need to be a string e.g., 'MSEED'). Then all you need is to modify those lines above to use the new variable ext in the string match. E.g. for lines 142-147:

        lclfiles = list(filter(
            stdata,
            '*/{0:4s}.{1:3s}.{2:s}.{3:s}.*.{4:2s}{5:1s}.{s}'.format(
                styr, stjd, sta.network.upper(
                ), sta.station.upper(), sta.channel.upper()[0:2],
                comp.upper(), ext)))

Then on lines 324 and 327, you change st1 = read(sacf1, format='SAC') to st1 = read(sacf1, format=ext).

The only place where this gets tricky is on lines 333 and 334, where there is a check for no data in stats.sac['user9']. I'm not sure how to deal with this with mseed files that don't have the stats.sac info.

I hope this helps.

@dylanmikesell
Copy link
Author

Okay. Thanks for the tips. I will have a look and work on this.

One more question: can we adapt the filename structure requirements? I just converted some mseed to sac using IRIS' mseed2sac function. An example filename looks like YG.SE3.00.HH1.D.2021.200.000001.SAC. io.py cannot find these data even now that they are in SAC format. Are these format 1 and format 2 coming from your own channel naming? Would it be useful to provide different formats, e.g., like msnoise?

data_structure['SDS'] = "YEAR/NET/STA/CHAN.TYPE/NET.STA.LOC.CHAN.TYPE.YEAR.DAY"
data_structure['BUD'] = "NET/STA/STA.NET.LOC.CHAN.YEAR.DAY"
data_structure['IDDS'] = "YEAR/NET/STA/CHAN.TYPE/DAY/NET.STA.LOC.CHAN.TYPE.YEAR.DAY.HOUR"
data_structure['PDF'] = "YEAR/STA/CHAN.TYPE/NET.STA.LOC.CHAN.TYPE.YEAR.DAY"

My miniseed do not follow your formats either, e.g., SE3.YG.00.HHZ.2021.183. So I need to figure something out.

@dylanmikesell
Copy link
Author

@paudetseis, can you tell me what stdata is meant to be in the following lines?

lclfiles = list(filter(
stdata,
'/{0:4s}.{1:3s}.{2:s}.{3:s}..{4:2s}{5:1s}.{s}'.format(
styr, stjd, sta.network.upper(
), sta.station.upper(), sta.channel.upper()[0:2],
comp.upper(), ext)))

If you look at what stdata actually is, it is the path given by the --local-data option PATH. Thus, I can never get lclfiles to populate. If you look in the doc of the functions in io.py it says the following.

  stdata : List
      Station list

Thus, I think there might be a bug in the local-data code. Does it work for your own tests?

@paudetseis
Copy link
Member

@dylanmikesell there might well be a bug in the local-data code. To be honest with you I have never tested this part - it's inherited from other codes written by a collaborator. But yes, the stdata variable holds the path to your local data folder.

I also like your suggestion to have different naming conventions for the data files. Ultimately I think the scripts need a major overhaul but I don't have the time to spend on this unfortunately. Feel free to have a stab at it and make them work for your application. If your changes end up making it more broadly accessible, can you please generate a PR so I can merge your changes with the main branch`?

Thanks

@dylanmikesell
Copy link
Author

Hi @paudetseis. I finally got back to this. I have the local SAC data partially figured out and will submit a PR when done. First, we were misssing a call to list_local_data_stn() to actually build the file list from the local data path. The function where there, it was just never called in the appropriate place. That is fixed now.

My SAC data do not have a user9 header word...and as you stated before, MSEED will not have this either. I am curious, where does user9 come from as a check for no data in your code? If that header set in your own local data? To me it seems like we should just have a check on the st.data length. Here is your code.

# Check for NoData and convert to NaN
                stnd = st[0].stats.sac['user9']
                eddt = False
                if (not stnd == 0.0) and (not stnd == -12345.0):
                    st[0].data[st[0].data == stnd] = ndval
                    eddt = True

If you can provide me a little more insight into user9 I can get this fixed. I noticed a few other issues have popped up about this local directory issue, so I will try to wrap it up quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants