Dictionary with information on experimental data #27
Comments
@TristanHehnen Thanks. I have been working on building a module for the gas phase group. It can be found here. I suggest you start a similar module for matl-db called "matl.py" and add your classes to this module. Then, this module can live in a Utilities directory of the repo and be called using something similar to this (this is just a temporary script I'm using to build the module). |
I'll have a look at it. |
I would like to introduce the initial prototype for parsing the readme files. I would appreciate some feedback, so that I do not spend a lot of time creating something that is disliked or thrown away in the end. For now, the prototype consists of a script containing the individual functions and a Jupyter notebook that serves as a brief demo of the functionality. It doesn't follow Randy's module proposal (i.e. "matl.py") yet, but it could certainly be moved in this direction.
As an example, only the TGA data from UMET is processed in the demo. This is primarily due to the overhead that comes with adjusting the readme files. Once we have reached an agreement on how they should look, it would be easier to adjust the format automatically with the script developed here. The readme files need to be unified because different laboratories provided different parameters to describe their experimental campaigns (e.g. different temperature programs, lids on the crucibles or ways to describe the crucibles, etc.). In my view, all these items should be used consistently in all experiment descriptions. Items that are not relevant for the particular experiment in question should contain "None", as they do in the dictionary later on. The goal of having the laboratories fill in "None" consciously is to reduce the chance of forgetting data.
Furthermore, I suggest having the heating rates and initial sample masses written only in the "Test Condition Summary" table. They do not seem to be too useful in the text section.
Since Isaac started to unify the data file names, I would like to ask whether the label of the individual experiment is supposed to be the same as the data file name in general. This would help to reduce the footprint of the summary table, and I don't really see why there should be a difference in naming.
As for the structure of the dictionary, it is ordered by experiment --> institute --> repetition (rep. label/data file name) --> parameters (e.g. heating rate or initial sample mass), see the demo.
As further steps, I would set up functionality that translates the dictionary into the readme file format and saves it as |
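The nesting described in this comment (experiment --> institute --> repetition --> parameters) could look roughly like the sketch below. It is only an illustration: the institute name, repetition labels, parameter keys and values are hypothetical placeholders, not entries from the actual repository or the demo notebook. It also shows why a deliberate "None" is useful: a key set to `None` can be told apart from a key that was simply forgotten.

```python
# Hypothetical illustration of the nesting:
# experiment -> institute -> repetition label -> parameters.
# All names and values below are made-up placeholders.
experiments = {
    "TGA": {
        "ExampleLab": {
            "ExampleLab_TGA_N2_10K_1": {
                "heating_rate_K_per_min": 10.0,
                "initial_mass_mg": 5.0,
                "crucible_lid": None,  # consciously marked as not relevant
            },
            "ExampleLab_TGA_N2_10K_2": {
                "heating_rate_K_per_min": 10.0,
                "initial_mass_mg": 5.1,
                # "crucible_lid" is absent here, i.e. it was forgotten
            },
        },
    },
}

# Items that every repetition is expected to state (assumed list).
REQUIRED = ("heating_rate_K_per_min", "initial_mass_mg", "crucible_lid")


def forgotten_items(data, required=REQUIRED):
    """List (experiment, institute, label, item) for every required item
    that is missing entirely, as opposed to being set to None."""
    missing = []
    for experiment, institutes in data.items():
        for institute, repetitions in institutes.items():
            for label, parameters in repetitions.items():
                for name in required:
                    if name not in parameters:
                        missing.append((experiment, institute, label, name))
    return missing


# Direct access by human-readable keys.
rate = experiments["TGA"]["ExampleLab"]["ExampleLab_TGA_N2_10K_1"]["heating_rate_K_per_min"]
```

Calling `forgotten_items(experiments)` on this toy data reports only the second repetition's missing lid entry; the first repetition passes because its lid entry is present, even though its value is `None`.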
@TristanHehnen Thanks for all your work on this! I think it is headed in the right direction. But I am not the keeper for matl-db (only a maintainer), so I think we need to get consensus from Isaac (@leventon) and Morgan (@mcb1). As long as very simple instructions can be put together for the participants, I am in favor. I suggest you use Isaac or Morgan as a test case and see if they can follow the instructions. |
I'll take a look at this more closely tomorrow. To be honest, my python
experience is severely limited but I'll see what I can make of things. For
now, I'll [comment] on some specific things that you wrote in order:
…------
As an example, only the TGA data from UMET is processed in the demo. This
is primarily due to the overhead that comes with adjusting the readme
files. And if we have reached an agreement as to how they should look like,
it would be easier to adjust the format automatically with the here
developed script.
[Makes sense to me; As I've started working with data, these issues become
more apparent. Even as things are currently... it took a surprising amount
of manual effort to rename + reorganize files and edit READMEs into some
level of consistency, as we have now.]
------
That means, the readme files should be unified in that different
laboratories provided different parameters to describe their experimental
campaigns (e.g. different temperature programs, lids on the crucibles or
ways to describe the crucibles, etc.). In my view, all these items should
be used in all experiment descriptions consistently. Items that are not
relevant for the particular experiment in question should contain "None",
as they do in the dictionary later on. The goal of having the laboratories
fill out "None" consciously, is to reduce the chance to forget data.
[Improving on this template would help. Standardizing things also makes
sense. Thankfully, we don't expect too much new data to come in. It would
be great if README files were submitted consistently. Some labs were
wonderful with that. Others not so much. One wrote a great, thorough,
multi-page description of their tests: although this was great for
understanding, it wasn't when it came time to write the README (and makes
automated analysis more difficult)]
------
Furthermore, I suggest to have the heating rates and initial sample masses
written only in the "Test Condition Summary" table. They seem not to be too
useful in the text section.
[This is fine, *but* is it necessary/does it hurt as is? Effectively, the
READMEs come from my edits of test descriptions provided by labs. Often, I
would pull data from their written text to populate the summary tables but
I wouldn't go back to delete the written summary included above each
section. Keeping that info in the written summary maintains the
flow/continuity of some group's descriptions as you read them]
------
Since Isaac started to unify the data file names, I would like to ask if
the label of the individual experiment is supposed to by the same as the
data file name in general. This would be helpful to reduce the footprint of
the summary table and I don't really see why there should be a difference
in naming.
[Yes, this should definitely be the case. Now, it's easier to catch which
tests look different. When I started, I did my best to guess what info we
would need in that file name so names evolved as more data came in. I made
some incorrect guesses at the start. That said, I think filenames are all
now set; editing READMEs for consistency with those filenames, as you
suggest, is the right move and should be doable (without having to repeat
the exercise again for future changes)]
On Tue, Jun 16, 2020 at 7:26 AM Randy McDermott ***@***.***> wrote:
@TristanHehnen <https://github.com/TristanHehnen> Thanks for all your
work on this! I think it is headed in the right direction. But I am not the
keeper for matl-db (only a maintainer), so I think we need to get
consensus from Isaac ***@***.*** <https://github.com/leventon>) and Morgan
***@***.*** <https://github.com/mcb1>).
As long as very simple instructions can be put together for the
participants, I am in favor. I suggest you use Isaac or Morgan as a test
case and see if they can follow the instructions.
|
Guys, what about the idea of using a Google Form (or something) to submit the README data, which would get converted to a csv and then the scripts would generate the README.md file. In that way, you could have dropdowns where you want only specific answers. |
That's probably a good idea. We might get a new data set from Chile in the next few months - that'd be a good trial run for the form, otherwise it'll at least be set for our next material. Either way, @TristanHehnen - after we settle on a final format of the readme, do you want to set up a test case with Google forms to work on a script to build a proper README file from there? Forms can output data as .csv files; parsing them should be straightforward but there might be tweaks needed to do that with your script given the different format of those files vs. our current README.md files. Ideally, we'd also upload measurement data through google forms, but it looks like Forms requires users to have a google account to do that, so I'll likely just include a text notification reminding visitors to [submit data by email as a .zip file to [email protected]] when they submit the form. As for your script - honestly, I'm out of my element here so we should wait for proper feedback from Morgan. The general concept / flow of what you have here makes sense but I can't really comment on the functionality, writing, or design of the code itself. As for general conceptual comments... Default options: Test Conditions Table: Test Label and File Name: Calibration type: Initial Mass: I really like that we have some ability to visualize data so the Plotting section is great, but I'll hold off on comments there until we can sort through the README first |
I would not worry about the "None". If you use a form, then whatever you have in the dropdown can be converted to None as needed. None is commonly used in Python script arguments, so it is handy that way. @TristanHehnen I'd say press forward with your processing scripts. We are in a similar situation on the gas phase where really only I know how the scripts work. To some degree, this is unavoidable. The fact that you are taking charge and making things happen means you are in control of this aspect of the project. It is welcome from my point of view. @leventon I would argue that "ideally" measurement data would come from a pull request to GitHub. In lieu of that, emailing a zip that we push to GitHub is the best option. Usually I have to massage the column headers, etc. But let's give the form idea a try just amongst ourselves. Create a simple toy form and send it to me and Tristan and we can build from there. |
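The form idea discussed here (form answers exported as .csv, dropdown answers converted to None, a script that writes the README entries) could be sketched as follows. This is a minimal sketch under stated assumptions: the column names, the set of answers treated as "None" and the `- field: [value]` output style are hypothetical and would need to match whatever form and README format are finally agreed on.

```python
import csv
import io

# Answers that should be treated as "not relevant" (assumed set).
NONE_ANSWERS = {"", "none", "n/a", "not applicable"}


def normalise(answer):
    """Map empty or 'not applicable' style answers to None."""
    text = (answer or "").strip()
    return None if text.lower() in NONE_ANSWERS else text


def rows_from_form_csv(csv_text):
    """Read the exported form answers: one dictionary per submission."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{field: normalise(value) for field, value in row.items()}
            for row in reader]


def readme_lines(row):
    """Render one submission as README list entries."""
    return [f"- {field}: [{value}]" for field, value in row.items()]


# Toy export with hypothetical column names.
example_csv = (
    "Institute,Heating rate (K/min),Crucible lid\n"
    "ExampleLab,10,Not applicable\n"
)
submissions = rows_from_form_csv(example_csv)
lines = readme_lines(submissions[0])
```

Reading from a string keeps the sketch self-contained; a real export would be opened as a file and passed to `csv.DictReader` in the same way.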
Sounds good. Will do.
|
Thank you @leventon and @rmcdermo for your time and comments!
For the next steps, I would like to wrap up the TGA experiment processing functionality, using the UMET data set as an example. Once we have agreed that this is how it should look, I would propagate the necessary changes to all other TGA experiments within the repo. Afterwards, I would add the functionality for the next experiment with a single example case, e.g. Cone Calorimeter, have this discussed, propagate the changes, and so forth. |
I like the Wiki; it's a good addition to start building/adding reference material there.
As for the hope that we'll get people to submit data through GitHub PRs vs. email: long term, I hope we get there, but we just haven't seen any willingness from our participants yet for that. It's a learning curve / barrier to entry that we likely won't get past with all participants. I am no longer wholly incompetent with GitHub, but it took a surprising amount of effort to get to this level (just for adding/editing data and files in our repo). I'm not sure I would want to go through that just to submit files (and it looks like most contributors wouldn't either).
As for the TGA/DSC template that's there, I'm still not sure how well it will work out if we rely on that vs. trying to create a form that needs to be filled out in a certain way. I mention that because we had already included that info in the guidelines that were emailed to everyone AND templates were available on the repo when most labs submitted data, but we still got quite a spread in what was submitted. So long as they can edit fields (e.g., [none]) when they write their own files, we'll likely get a lot of variation (not all labs, but most). As a trial run for that vs. just following the templates you provide for TGA data, using the Chile group as a test case might be a good idea. I suspect they'll submit cone and TGA data. How would you feel about getting your template up and available on the repo and making a Google Form as a second option? Hopefully they can provide feedback on which was easier to work with. Until then, fields like calibration types that need (or would benefit from) suggested items can likely be best defined with a drop-down menu in a form.
TGA naming: I'm right there with you on those file names getting long / out of hand. In fact, heating rate wasn't even included on some of the earlier data sets. As I started analyzing that data, though, it became apparent that heating rate and gaseous environment were needed (or at least very helpful) to include.
TGA initial masses: What's happening here is likely how the experiment is run. From my experience with the test, you have a range of options for defining that mass (m0). In a number of cases, separately measuring m0 before you start your test gives you the most accurate measurement. The balance is hypersensitive, so you can see shifts in that signal at the start/end of the test. Let's say the true mass is 5.0 mg. It's not uncommon for the initial steady-state TGA mass (at 20C-80C) to read higher or lower (though be stable). In that case, I'd use the time-resolved mass loss but renormalize the initial mass to match m0 as measured independently. Steps like that are clear to the experimentalist (it's why UMD submitted their own averaging / uncertainty analysis), but they can be hard to automate.
*This is something that will require further discussion. I think I shared with you (by email) a copy of the outline/next steps that was sent to the condensed phase committee a couple of weeks ago. In ~2 weeks from now, I'll need to prepare a summary of data for the participants. That will include preliminary analysis of all test data (hopefully; I'm working through that in MATLAB now). When that report is shared with the committee, and then with participants, we are requesting feedback on how we want to do that analysis (e.g., how to define smoothing, test averaging, uncertainty analysis, key data point identification…). There is not necessarily one best approach; one of our goals was to come to a consensus as a community on how to do that.
*** Of all files to work with for TGA, please avoid UMET for now. That set is messed up. I'm aware of some challenges; I have different notes on what I want to do there, and I'll edit it eventually when I can, but just for now, please choose a friendlier set. I think SANDIA TGA data was okay (and that gives you a range of test conditions to play with too). *** |
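The renormalization of the initial mass mentioned in this comment could be done in more than one way, and the thread leaves the exact procedure open. The sketch below shows one possible reading, assumed here purely for illustration: the recorded trace is scaled so that the average of its first few stable readings matches the independently measured m0. The function name, the number of readings averaged and the example values are hypothetical; an offset correction would be another option.

```python
def renormalise_mass(mass_readings, m0_independent, n_initial=3):
    """Scale a TGA mass trace so that its initial plateau matches m0.

    mass_readings  : recorded sample masses over time (e.g. in mg)
    m0_independent : initial mass measured separately before the test
    n_initial      : number of leading readings averaged as the plateau
                     (assumed value; depends on the sampling rate)
    """
    if n_initial < 1 or n_initial > len(mass_readings):
        raise ValueError("n_initial must be between 1 and the trace length")
    plateau = sum(mass_readings[:n_initial]) / n_initial
    scale = m0_independent / plateau
    return [m * scale for m in mass_readings]


# Toy example: the balance reads 5.2 mg at the start, while the separately
# measured initial mass is 5.0 mg.
recorded = [5.2, 5.2, 5.2, 4.16, 2.6]
corrected = renormalise_mass(recorded, m0_independent=5.0)
```

Scaling preserves the relative mass loss at every point of the trace, which is why it is used in this sketch; whether that is the appropriate choice is exactly the kind of question the comment suggests putting to the community.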
Oops. Please forgive the formatting of that last message. Larger font is not meant to indicate emphasis, I don't know what happened there. |
Markdown thinks the As usual, I disagree with the comment about automation. These things are not difficult. Just get the data into a simple column format and we can do pretty much anything. |
Hello everyone, my apologies for the radio silence recently. I've now implemented some improvements for the processing of the README files. The individual steps are now better implemented into functions. These functions contain inline comments and docstrings, in an effort to make things more accessible for users, or rather developers. For developers and maintainers: For users: Now, the question is if the layout/format of the README files, at least for TGA experiments, is settled (e.g. my proposal in the UMET README in my fork). Then I would have another pass over the implemented functions to ensure they work with said format and unify the remaining README files. What are your thoughts about this? |
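To make the README processing discussed here more concrete, the sketch below parses a pipe-delimited summary table into a dictionary keyed by test label. It is not the implementation from the fork: the function names, the column headers and the example rows are hypothetical, and it assumes the "Test Condition Summary" is written as a plain Markdown table.

```python
def parse_markdown_table(lines):
    """Turn the lines of a Markdown pipe table into a list of row dicts."""
    header = None
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # not part of the table
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        if header is None:
            header = cells
            continue
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # separator row such as |---|:--:|
        rows.append(dict(zip(header, cells)))
    return rows


def table_to_dictionary(rows, label_column="Test Label"):
    """Key the rows by test label; the text 'None' or an empty cell becomes None."""
    result = {}
    for row in rows:
        label = row[label_column]
        result[label] = {
            key: (None if value in ("", "None") else value)
            for key, value in row.items()
            if key != label_column
        }
    return result


# Hypothetical README excerpt.
readme_excerpt = """
### Test Condition Summary
| Test Label | Heating Rate (K/min) | Initial Mass (mg) | Lid |
|---|---|---|---|
| ExampleLab_TGA_N2_10K_1 | 10 | 5.1 | None |
| ExampleLab_TGA_N2_10K_2 | 10 | 4.9 | None |
""".splitlines()

summary = table_to_dictionary(parse_markdown_table(readme_excerpt))
```

Values are kept as strings here; converting them to numbers with units would be a separate step once the README format is settled.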
@TristanHehnen I am very much in favor of moving forward with your Python scripts. I have just spent the last few days going through the current Matlab scripts and, while these were necessary to get started, they need an overhaul. What would be very helpful, and I am not sure how far you are from having this, is if you could create a master Python script that would process the exp data and create all the plots needed for Isaac's document. Isaac is going to email me his personal copy and then I will push the pdf up to the Releases page. You can then use that document as a basis for your scripts. If that document is not sufficient, then I think it means it needs work. So, this will be an excellent exercise. |
@rmcdermo I can certainly help translate the Matlab functionality into Python scripts. However, if it is not too urgent, I would like to focus first on the foundation: processing all the README files. For translating the Matlab functionality, I would open a new issue to keep both tasks clearly separate. |
The translation of the Matlab functionality to Python now has its own issue, see issue #80 |
@leventon |
So, just as a heads-up: the TGA data can now be processed and the respective information is already stored in the Python file containing the dictionary. I would now proceed to the cone calorimeter. EDIT: Typo |
Looks great, thanks! |
Update: DSC data can now be processed. Construction notebook and dictionary are updated accordingly. EDIT: |
I've now unified most of the README files concerning the cone calorimeter data. Based on this I've created a template. I would like to ask you to check said template for consistency and completeness. Specifically look at the sample holder and retainer frame dimensions. Across various README files values for both were provided and are thus replicated in the template. Main questions here are:
Furthermore, there are some significant differences in the volume around the sample and heater (sample chamber). Some apparatuses have some kind of box around them (glass walls at the sides), while others can seal this part off and control the atmosphere. I'm not sure how to deal with this, and I've just provided a relatively basic approach to collect this data. Would this be sufficient, or am I missing something here? Would we need some flow rates here as well, specifically for the controlled-atmosphere ones? For the backing, my idea is to address each material as an individual layer. The provided lines would need to be copied and the individual entries numbered accordingly. With the thermocouples, there are different ways their locations are reported. Some are marked "front" and some "back". I'm now thinking that there could be two coordinate systems, one starting at the centre of the front face of the sample and the other on the top face of the backing (back of the sample). Then negative z-values would denote locations within the sample and backing, respectively; positive z-values point towards the heater for both systems. The summary table could be extended with a column for the flame-out time and a column for the residual mass after the test. Tristan |
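The two thermocouple coordinate systems proposed here differ only by the sample thickness, so positions can be converted between them. The sketch below assumes a known, constant initial sample thickness and uses hypothetical function names and example values (in mm); it ignores any change in thickness during the test.

```python
def front_to_back(z_front, sample_thickness):
    """Convert a z-position from the 'front' system (origin at the front
    face of the sample) to the 'back' system (origin at the top face of
    the backing). Positive z points towards the heater in both systems."""
    return z_front + sample_thickness


def back_to_front(z_back, sample_thickness):
    """Inverse conversion: 'back' system to 'front' system."""
    return z_back - sample_thickness


thickness = 6.0  # mm, hypothetical initial sample thickness

# Thermocouple 2 mm below the exposed surface, reported as "front":
z_in_sample_back = front_to_back(-2.0, thickness)   # 4 mm above the backing face

# Thermocouple 3 mm inside the backing, reported as "back":
z_in_backing_front = back_to_front(-3.0, thickness)  # 9 mm below the front face
```

With this convention, a negative value in the "front" system lies below the exposed surface, and a negative value in the "back" system lies within the backing, matching the description in the comment.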
Why not also require a detailed drawing of the system? Modelers usually need this sort of thing. |
Hi Tristan,
There's a lot here; it looks comprehensive. Good work.
I added a new README that lists some of the most important features of
tests/setup that we'd want to know (prepared w/ committee input, following
a call 1-2 weeks ago). I think your template is pretty exhaustive but
please review that to see if anything is missing (e.g., baseline
corrections, calibration information):
https://github.com/MaCFP/matl-db/tree/master/Non-charring/PMMA
In general, I'd try to avoid repeated data (e.g., heat fluxes or sample mass
at the top of the list are also in the table at the end). The table should be the one
source for data that varies between tests (specific values that you might
read in programmatically); above should be general info for someone browsing
the data. Could we add a descriptor in the above section noting that these are
more general values (e.g., Heat flux(es): 25-65 kW/m2 || Initial Sample
Mass ~70 g)?
Frame/holder info:
Key is *exposed* sample surface area
Similarly, other sample area/mass values should be noted as 'initial'
Backing materials
Some of these values will be temperature-dependent. It's not clear to me what the best
way to report that is. At the very least, we'll need a field for the
"temperature at which that property is evaluated", if not the ability to
list multiple.
For completeness/good modeling, this info is key. We're asking for a lot
here so either a link to a source or a separate formatted text file with
this info might be needed.
Thermocouples
Bead diameter shouldn't really affect measurements; we're not in the gas
phase. Do we need this (given how much other info is here)?
- Calibration type (description + frequency): [None]
- Instrument
- Type--> Manufacturer:
- Model Number/apparatus
- Note:
|
Hi @leventon ! First off: Yes, there are a lot of items in the template. I would like to emphasise that they are all collected from the README files that have been provided by the institutes. I only added two minor details: the ignition time column and the bead diameter.
I put the rest into the description part in the beginning, to keep the information on the sample holder that was already provided by the institutes.
|
To 3.: I move the nominal exposed surface and the diameter/edge length to the sample itself which reduces the amount of items for the retainer frame even more. |
Hi @TristanHehnen , lots of good work here, thanks for the update.
We should confirm the group's calibration procedure (type / frequency / matls) and whether or not results have been corrected for drifts in their baseline (TGA, DSC, MCC, and Cone HRR are all often adjusted in such a way). This calibration and baseline correction should be done by the experimentalist, not the modeler, and the process should be described (not asked to be reproduced). Are these the same as a "correction curve"? Maybe, but that's ambiguous wording to me. For clarity, I'd refer you to each of the reference texts suggested in the preliminary summary (they discuss the principles/practices needed), but that's not the most supportive of replies. A more friendly, immediately useful response might be to have a ~30 min call where we can go through each of these types of corrections/calibrations and setup/processing steps rather than trade messages.
|
Small clarification: The thermocouple diameter was introduced by Edinburgh and I changed it to bead diameter. |
Got it, thanks.
Can we keep that one as a "note" but not request it of all labs?
…On Fri, Nov 6, 2020, 05:05 FireTristan ***@***.***> wrote:
Small clarification: The thermocouple diameter was introduced by Edinburgh
<https://github.com/MaCFP/matl-db/tree/master/Non-charring/PMMA/Edinburgh>
and I changed it bead diameter.
|
Sure. |
As mentioned in #26, here is the link to the first iteration of the ExperimentalDataInfo.py. It basically contains the information that is provided via the README.md files and knows the location of the CSV files containing the data from the different experiments. I like this approach because it allows me to access all the information from within Python scripts or Jupyter notebooks. I find that the human-readable keys used to access the different items reduce errors. Also, dictionaries can easily be transformed into Pandas DataFrames, which allows for nice rendering of tables in the Jupyter notebooks. Furthermore, I can easily pass the information on to scripts that build FDS and optimisation input files.
It could be located in the root directory of the MaCFP Git repo (obviously, file paths need to be adjusted).
Since it aims to mirror the structure of the README.md files, it might be relatively simple to set up scripts to automatically screen the repository and add information from new data sets.
If this script is considered a useful addition to the MaCFP project, we can add it in.
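As a rough illustration of how such a dictionary can feed tables, the sketch below flattens a nested dictionary into one record per repetition. The dictionary layout, key names, file paths and values are hypothetical placeholders and not the content of ExperimentalDataInfo.py. A list of records like this is a shape that `pandas.DataFrame` accepts directly, but the sketch itself uses only the standard library.

```python
# Hypothetical stand-in for the information dictionary: all names, paths
# and values are placeholders.
info = {
    "TGA": {
        "ExampleLab": {
            "ExampleLab_TGA_N2_10K_1": {
                "heating_rate_K_per_min": 10.0,
                "initial_mass_mg": 5.0,
                "data_file": "ExampleLab/ExampleLab_TGA_N2_10K_1.csv",
            },
            "ExampleLab_TGA_N2_10K_2": {
                "heating_rate_K_per_min": 10.0,
                "initial_mass_mg": 5.1,
                "data_file": "ExampleLab/ExampleLab_TGA_N2_10K_2.csv",
            },
        },
    },
}


def flatten(data):
    """Return one flat record per repetition, suitable for tabular display."""
    records = []
    for experiment, institutes in data.items():
        for institute, repetitions in institutes.items():
            for label, parameters in repetitions.items():
                record = {
                    "experiment": experiment,
                    "institute": institute,
                    "label": label,
                }
                record.update(parameters)
                records.append(record)
    return records


records = flatten(info)
# With pandas installed, pandas.DataFrame(records) would turn these records
# into a table for rendering in a Jupyter notebook.
```

Keeping the file path as an ordinary parameter means the same records can also be handed to scripts that locate and read the CSV data.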