Re: Calculation of mean/median values in box & whisker plots. [message #602 is a reply to message #601]
Thu, 25 July 2019 10:30
nbehrnd
Hello Tim,
without a minimal working example I can only guess what you refer to. In recent
work of my own with DW, however, I also dealt with columns lacking some entries.
The solution that worked «good enough» for me, aiming for whisker plots and
their statistics, was to start with a table in which every cell of the column
in question is already filled with the placeholder «N/A», to be replaced by
real data only once these are at hand. (The placeholder may be entered prior to
DW, e.g. with conditional formatting in a spreadsheet, or by manual edit per
cell in DW.) Thankfully, DW seems to recognize this kind of «other entry type»,
known from other statistical programs (e.g., R).
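The pre-filling step mentioned above could also be scripted before the table is opened in DW. The following is only a minimal sketch under stated assumptions: the data are kept as tab-delimited text, and the helper names `fill_missing` and `fill_missing_tsv` are hypothetical, not part of DataWarrior.

```python
import csv
import io

# Placeholder string as used in the post; assumed to be what DW recognizes.
PLACEHOLDER = "N/A"

def fill_missing(rows, placeholder=PLACEHOLDER):
    """Yield each row with empty or whitespace-only cells replaced."""
    for row in rows:
        yield [cell if cell.strip() else placeholder for cell in row]

def fill_missing_tsv(text, placeholder=PLACEHOLDER):
    """Return tab-delimited text with blank cells set to the placeholder."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(fill_missing(reader, placeholder))
    return out.getvalue()

# Hypothetical two-row table; the octane weight is not yet known.
table = "name\tMW\nmethane\t16.0428\noctane\t\n"
print(fill_missing_tsv(table))
```

Rows that are shorter than the header are left as they are in this sketch; padding them would be an extra step.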
As an example, I populated a table with the first ten alkanes and let DW
determine their molecular weight, eventually displayed as a whisker plot (cf.
alkanes_complete.dwar). In a copy of this file (alkanes_except_octaneMW.dwar)
the molecular weight of octane was replaced by the «N/A» placeholder. Now the
dot for the missing entry (or, in the parlance of R, «[data] not available») is
set below the others and is no longer considered for the statistics, both on
screen and in the plot's statistics.
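To illustrate why the numbers shift once an entry is marked as missing, here is a small Python sketch. It is not DataWarrior's own calculation: the atomic weights are assumed standard values (so the results differ slightly from DW's, e.g. 16.0428 for methane), and `alkane_mw`, `parse_cell` and `summarize` are hypothetical helper names.

```python
import statistics

# Assumed standard atomic weights (g/mol); DW's own values may differ slightly.
C_MASS = 12.011
H_MASS = 1.008

def alkane_mw(n_carbon):
    """Molecular weight of the linear alkane C_n H_(2n+2)."""
    return n_carbon * C_MASS + (2 * n_carbon + 2) * H_MASS

def parse_cell(text):
    """Return the cell as float, or None for an empty cell or 'N/A'."""
    text = text.strip()
    return None if text in ("", "N/A") else float(text)

def summarize(cells):
    """Count, mean and median over the cells that hold a number."""
    values = [v for v in map(parse_cell, cells) if v is not None]
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
    }

# Ten alkanes, methane (n=1) to decane (n=10), stored as text cells.
complete = [f"{alkane_mw(n):.3f}" for n in range(1, 11)]
# Same column with the octane entry (n=8, index 7) set to the placeholder.
octane_missing = complete[:7] + ["N/A"] + complete[8:]

print(summarize(complete))
print(summarize(octane_missing))
```

With these assumed masses, dropping octane reduces the count from 10 to 9 and moves the median from about 79.16 to 72.15, which is the kind of change one would expect to see in the plot's statistics.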
This approach was used with DataWarrior (stable release 5.0.0) running in Linux
Xubuntu (18.04.2 LTS, 64 bit).
Norwid
[Updated on: Thu, 25 July 2019 10:32]
Re: Calculation of mean/median values in box & whisker plots. [message #617 is a reply to message #613]
Sun, 25 August 2019 20:13
nbehrnd
Hi Thomas, hi Tim,
in my observation, all four statistical values provided in the whisker
plot do change when the cell entry for the molecular mass is manually set
to the string "N/A" (without the quotation marks). Processing the data
twice allows me to reproduce the changes in the whisker plot and its
statistical data, too. Here I share my approach to the task with DW
(Linux version 5.0.0), using the test file alkanes_complete.dwar from above:
Reading the file as such, which contains 10 complete entries:

Editing the cell for methane, the value "16.0428" is replaced by "=N/A".
As expected, DW flags this as an invalid entry. Of course, no whisker
plot is provided now, but the dot is still present in the plot.

Next step: replacing "=N/A" by "N/A". DW accepts this and now provides a
box-and-whisker plot. The dot without an associated value is sorted out
and the statistical data are updated.

Two additional observations:
If "N/A" is entered for the first time, the line vanishes completely,
shortening the list of 10 alkanes to 9. This removal may also affect
columns other than the one currently worked on. This contradicts the
aim of preserving the complete line, which merely lacks an entry in
this one cell.
Say there is a second entry with no value available. Then a direct
input of "N/A" via DW's edit cell function is possible without the
danger of losing the complete line. E.g., direct access to

Again for documentation, the file used here is attached below. Maybe
there is a better way -- if so, I'm curious to learn it.
Norwid
Attachment: step_0.png
Attachment: step_1.png
Attachment: step_2.png
Attachment: step_3.png
Attachment: test_file.dwar